I reproduce a hang by rebooting Kyu, starting the "wangdoodle" server on port 114 using the "t 5" command and then running the python "try" script. Here is what the threads look like after starting the server, but before starting the try script:
Thread: name ( &tp ) state pc sp pri * Thread: shell (400a7998) READY I 4000460c 40568000 11 Thread: net (400a75d8) SEM I 400108fc 40578000 12 Thread: net_slow (400a74e8) REPEAT C 40015a64 4057c000 13 Thread: tcp-bsd (400a77b8) SEM J 40010494 40570000 14 Thread: tcp-timer (400a76c8) READY I 400108f8 40574000 15 Thread: wangdoodle (400a7308) SEM I 400108f8 40584000 30 Thread: idle (400a78a8) READY C 40010950 4056c000 1234 Thread Cur : (400a7998)
After only 244 transfers, we are hung. An odd thing to be pleased with. I know we are hung because the python script is stuck waiting for a transfer to complete. Because I now run the shell at priority 11, it is not being starved by the odd loop that "wang_thr" is in, and I am able to examine the system:
Kyu, ready> l Thread: name ( &tp ) state pc sp pri * Thread: shell (400a7998) READY J 40010494 40568000 11 Thread: net (400a75d8) SEM J 40010494 40578000 12 Thread: net_slow (400a74e8) REPEAT C 40015a64 4057c000 13 Thread: tcp-bsd (400a77b8) SEM I 400108f8 40570000 14 Thread: tcp-timer (400a76c8) READY C 40023c34 40574000 15 Thread: wangdoodle (400a7308) SEM J 40010494 40584000 30 Thread: wang_thr (400a73f8) READY I 400104f0 40580000 31 Thread: idle (400a78a8) READY C 40010950 4056c000 1234I can investigate what the "wang_thr" is doing via the "u" command:
Kyu, ready> u wang_thr Thread wang_thr, dump (-) Thread: wang_thr (400a73f8) READY I 400104f0 40580000 31 state: READY stack: 40580000 mode: INT CPU: 0 pc : [<400104f0>] lr : [<40010494>] sp : 40583ea0 ip : 00000001 fp : 40583ebc r10: 00000002 r9 : 5bf42ee8 r8 : 5ffb4b70 r7 : 5bf452a0 r6 : 40000000 r5 : 00000002 r4 : 4005c100 r3 : 40037d64 r2 : 00000000 r1 : 00000000 r0 : 00000000 cpsr: 60000013 Flags: nZCv IRQs on FIQs on Mode SVC_32 PC = 400104f0 ( change_thread+1f0 ) Called from thr_unblock+178 -- 4000fd44 Called from sem_unblock+70 -- 40010c0c Called from user_unlock+1c -- 40023a28 Called from tcp_send+88 -- 4002ee5c Called from big_wangdoodle+88 -- 40032190 Called from one_wangdoodle+fc -- 400322e8 Called from run_wangdoodle+78 -- 40032388 Called from thr_exit -- 4000f7a0Remember that the "wang_thr" is launched anew for every connection. Aso note that the traceback ends with thr_exit(). This is true for every thread (and not particularly important here). It is because Kyu arranges things so that if a thread ever returns, it will exit.
I will note that developing improved debugging tools is most definitely worth the time and effort. So, I will take the time right now to do this. This involves making changes in thread.c, particularly to thr_show().
This was a simple change, but it does make the "l" command more interesting. Before I start the test, I see the prior thread is always tcp-timer, but when the test is running, I see alternately "net" and "wang_thr" (mostly wang_thr) and occasionally "tcp-bsd". This all seems as it should be. When things are running happily, I see the following (which is just a random sample while the test is running).:
Kyu, ready> l Thread: name ( &tp ) state pc sp pri Thread: shell (400a799c) READY J 400104ac 40568000 11 Thread: net (400a75dc) SEM J 400104ac 40578000 12 Thread: net_slow (400a74ec) REPEAT C 40015a94 4057c000 13 Thread: tcp-bsd (400a77bc) SEM J 400104ac 40570000 14 Thread: tcp-timer (400a76cc) REPEAT C 40023c64 40574000 15 Thread: wangdoodle (400a730c) SEM J 400104ac 40584000 30 * Thread: wang_thr (400a73fc) READY I 40006218 40580000 31 Thread: idle (400a78ac) READY C 40010980 4056c000 1234
An important rule to follow (and I just broke it) is to not make code changes while the long test is running. The reason is that if I start investigating code addresses in detail, I want to be able to use the disassembly in "kyu.dump". So I restarted the test with the new build and will restrain my fingers in the meanwhile.
This time I get a hang after only 99 transfers and the thread display is as follows:
Kyu, ready> l Thread: name ( &tp ) state pc sp pri Thread: shell (400a799c) READY J 40010464 40568000 11 Thread: net (400a75dc) SEM J 40010464 40578000 12 Thread: net_slow (400a74ec) REPEAT C 40015a4c 4057c000 13 * Thread: tcp-bsd (400a77bc) SEM I 400108e4 40570000 14 Thread: tcp-timer (400a76cc) READY C 40023c1c 40574000 15 Thread: wangdoodle (400a730c) SEM J 40010464 40584000 30 Thread: wang_thr (400a73fc) READY I 400104c8 40580000 31 Thread: idle (400a78ac) READY C 40010938 4056c000 1234What I find interesting and surprising here is that the "prior" thread pointer is now pointing to "tcp-bsd". Previously I didn't have that information and looked at the above listing and saw the "wang_thr" as the only ready thread. This list has other surprises as well, namely that "tcp_timer" is also ready to run. The above is just one random sample. Another gives the following:
Kyu, ready> l Thread: name ( &tp ) state pc sp pri Thread: shell (400a799c) READY J 40010464 40568000 11 Thread: net (400a75dc) SEM J 40010464 40578000 12 Thread: net_slow (400a74ec) REPEAT C 40015a4c 4057c000 13 * Thread: tcp-bsd (400a77bc) SEM I 400108e4 40570000 14 Thread: tcp-timer (400a76cc) REPEAT C 40023c1c 40574000 15 Thread: wangdoodle (400a730c) SEM J 40010464 40584000 30 Thread: wang_thr (400a73fc) READY I 400104c8 40580000 31 Thread: idle (400a78ac) READY C 40010938 4056c000 1234Here we see "tcp-timer" has changed state to REPEAT, which indicates that it does run or something is happening.
Here is what "t 1" shows us with respect to TCP data structures:
INPCB: 40168780 Listen -- local, foreign: 0.0.0.0 111 .. 0.0.0.0 0 INPCB: 40168348 5 -- local, foreign: 192.168.0.73 114 .. 192.168.0.5 34346 INPCB: 401680a8 Listen -- local, foreign: 0.0.0.0 114 .. 0.0.0.0 0 Socket: 40168018 ACTIVE pcb = 401680a8, state = 4000 Socket: 401682b8 ACTIVE pcb = 40168348, state = 4022 CONNECTED CANTRCV Socket: 401686f0 ACTIVE pcb = 40168780, state = 4000 Socket: 40168880 -inactive- pcb = 00000000, state = 0000Status 5 is "close wait", also note the CANTRCV bit set in the socket, but this has been sitting for a while and perhaps the linux side has timed out and taken some action to close the connection. It is time to get a "fresh hang" to see. And indeed, a fresh hang shows the same thing, hung in close-wait and CANTRCV. This time the thread display looks like this:
Thread: name ( &tp ) state pc sp pri Thread: shell (400a799c) READY J 40010464 40568000 11 Thread: net (400a75dc) SEM J 40010464 40578000 12 Thread: net_slow (400a74ec) REPEAT C 40015a4c 4057c000 13 Thread: tcp-bsd (400a77bc) SEM J 40010464 40570000 14 * Thread: tcp-timer (400a76cc) READY I 4000f7bc 40574000 15 Thread: wangdoodle (400a730c) SEM J 40010464 40584000 30 Thread: wang_thr (400a73fc) DEAD I 400117a4 40580000 31 Thread: wang_thr (400a721c) READY I 400104d4 40588000 31 Thread: idle (400a78ac) READY C 40010938 4056c000 1234The odd thing here is the wang_thr marked "DEAD", I have not seen that before. A thread gets marked "DEAD" in thr_exit() and should just get cleaned up and removed. Why didn't this one?
I try "t 7" which makes a client connection from Kyu to a time server on my linux machine.
This works.
I try "t 2" which starts a server on port 111, then I use telnet to connect to it.
It responds (as it should) with "Dead men tell no tales", as it should.
This works.
In other words, the bulk of Kyu (scheduling, semaphores, timers) is working just fine. And in fact the bulk of TCP code seems to be working just fine.
I use Ctrl-C to kill the hung python script, then run it again -- and it works! There is a hung connection handling thread, but that doesn't get in the way of a new connection being made and transactions being handled.
Kyu, ready> t 1 Kyu, ready> INPCB: 40168348 Close-Wait -- local, foreign: 192.168.0.73 114 .. 192.168.0.5 58760 INPCB: 401680a8 Listen -- local, foreign: 0.0.0.0 114 .. 0.0.0.0 0 locker count: 0 Input thread: idle Timer thread: idle User lock: busy mbuf: alloc = 1, free = 12, max = 13 mbufcl: alloc = 0, free = 256, max = 9 sock: alloc = 2, free = 9, max = 11 inpcb: alloc = 2, free = 0, max = 2 Socket: 40168018 ACTIVE pcb = 401680a8, state = 4000 Socket: 401682b8 ACTIVE pcb = 40168348, state = 4022 CONNECTED CANTRCV t 9 Kyu, ready> Wang state, count = 12, 44I now have wireshark running, and there is nothing remarkable. The Kyu side has sent data packet "PK167", and the linux side has sent the expected ACK. The info from the "t 9" command indicates that the test is in the midst of the transfer, having sent piece 44 of 98. Nothing special about that, and it corresponds to the markings in the wireshark packet. The pieces are 2000 bytes in size, but for whatever reason they go out in 500 byte pieces. 44*4 is 176, which is in the ballpark of 167.
The big question is why the connection is marked as "CLOSE_WAIT" as well as CANTRCV. It is as though the tcp protocol thinks that the linux side has sent a FIN, but it hasn't.
Using "u wang_thr" to see what the thread is doing, I see:
PC = 40010498 ( change_thread+1c8 ) Called from thr_unblock+178 -- 4000fd14 Called from sem_unblock+70 -- 40010bf4 Called from user_unlock+1c -- 40023a10 Called from tcp_send+88 -- 4002ee44This is exactly where I always see it (stuck in thr_unblock()) which itself is odd. This would be where it is releasing the lock. This is the last line of code in tcp_send and should never get stuck.
So which came first? The transition to close/wait or the hang? Oddly enough, I see the socket in this state all of the time, even when the test is running as it should. However, when I make a telnet connection and peek at it, I see:
Kyu, ready> t 1 INPCB: 40168348 Establshed -- local, foreign: 192.168.0.73 114 .. 192.168.0.5 43324 INPCB: 40168130 Listen -- local, foreign: 0.0.0.0 114 .. 0.0.0.0 0 Socket: 401680a0 ACTIVE pcb = 40168130, state = 4000 Socket: 401682b8 ACTIVE pcb = 40168348, state = 4002 CONNECTED
The above is exactly what one would expect. So why do I always see the "half closed" thing when I peek at it while doing a big transfer?
Is the thread just being starved somehow? It doesn't look like it with no thread at higher priority marked as READY, but an experiment would confirm this. I can run some other thread at a priority just above the thread in question and find a way to monitor it. I do this and that thread (at priority 32) runs just fine Oddly enough, I see wang_thr in a different state now:
PC = 400108e4 ( resched+288 ) Called from thr_block+7c -- 4000fa40 Called from sem_block_cpu+4c -- 40010e10 Called from sem_block+24 -- 40010e3c Called from sbwait+38 -- 40030f78 Called from sosend+310 -- 4002f164 Called from tcp_send+80 -- 4002ee3c Called from big_wangdoodle+d4 -- 40032218The above was a "one time" event, but it has me wondering. Do I ever block in sosend? And do I handle it properly when I do. I have it print a message when it does block and then run the test again. It hums along for 50 or so tests without ever indicating that it needed to block in sosend(). In fact it runs all the way to a lockup without ever indicating that it needs to block.
I find this surprising, but maybe it really is fast enough to clear the buffer and keep things moving. The socket buffers themselves are only 8K (8*1024) bytes in size. What provides flow control for all of this?
I added a switch to let me examine what change_thread was doing after a lockup. It looks like this:
Switch wang_thr --> tcp-bsd Switch tcp-bsd --> tcp-timer Switch tcp-timer --> wang_thr Switch wang_thr --> tcp-bsd Switch tcp-bsd --> tcp-timer Switch tcp-timer --> wang_thr
Now I am focusing on the business of sosend() calling sbwait(). This seems to rarely happen and would be the occasion where tcp_send would block. I added a printf message when it does decide to block, then printout of which connection is currently being processed on the Kyu side. Just having the message that it blocked is not enough, I need to know that it blocked on the latest connection and that led to the hang.
The fact that I always see the connection in close-wait makes it look like the entire 200K of data has been processed and buffered and that we have shut down the protocol from our side and are waiting for buffered data to be processed. I can't see how this could possibly be the case, but that is the appearance things have.
It actually looks like there are two situations that can yield a hang. One is the hang in sbwait() and in thr_block(). This looks we we decided to block due to lack of space in the socket buffer and never get awakened.
The second situation is the one where we are in user_unlock() and thr_unblock(). Here we did not choose to block and it is unclear what is going on. This is where we see this cycle endlessly repeating:
Switch wang_thr --> tcp-bsd Switch tcp-bsd --> tcp-timer Switch tcp-timer --> wang_thrWhat is particularly unclear is what "wang_thr" is doing when control returns to it. Some preliminary study of this made me think we had a Kyu bug of some kind -- and maybe we do.
PC = 400104c8 ( change_thread+1f8 ) Called from thr_unblock+178 -- 4000fd14 Called from sem_unblock+70 -- 40010c0cThe code here is:
400104c4: e3a02000 mov r2, #0 400104c8: e58320ec str r2, [r3, #236] ; 0xec 400104cc: e10f0000 mrs r0, CPSR 400104d0: e38000c0 orr r0, r0, #192 ; 0xc0 400104d4: e129f000 msr CPSR_fc, r0In the C code this is:
/* Cancel any idea of a thread that yielded * to us. */ cur_thread->yield = (struct thread *) 0; /* * Remember, tp->mode tells how the thread should be resumed, * not how it should be saved. */ /* We need to lock before setting cur_thread or * we may take an interrupt in the wrong thread * structure. */ INT_lock; wang_hook ( cur_thread, new_tp ); prior_thread = cur_thread; /* new 12-29-2022 */ cur_thread = new_tp;
With the Kyu console responsive, we can dump memory using a command like:
d 40583e00 32The count is a count in words.
My first experiment was to put some delay into the routine that grinds out the "big" data. This made no difference and I still see close-wait.
Interestingly, if I telnet to port 114 and type "big", it works fine (all the data is ASCII and telnet does not seem to mind at all). When I use a "t 1" command I see "Established" for the socket state for this connection. So I am still confused. What is different between using telnet and using netcat? (My Python script is just a supervisor that runs netcat to do all the network IO).
Aha! I know what is going on with netcat. Netcat has sent a FIN, even though lots of data has yet to be received. It sent the 4 characters "big\n" and since that is all it had to send, it concluded that with a FIN. All this is entirely proper. The TCP protocol knows that has happened and when tcp_close() is called, it will shut down the whole game. Note that the flag CANTRCV has nothing at all to do with all the ACK packets that linux needs to send to deal with the data coming its way. As has been pointed out, a TCP connection consists of two independent data streams moving in opposite directions. The CANTRCV applies to data flowing from the linux side to the Kyu side. The ACK packets are part of the stream in the other direction.
Kyu / tom@mmto.org