January 6, 2023

BSD TCP/IP for Kyu - Questionable queue counts

I have now run for over 24 hours and have made 84,000 back to back transfers of 200,000 bytes without a complaint.

I have noticed one thing of concern. When I do a "t 1" command to get TCP info and statistics, along with the "l" command to list Kyu threads, I see this:

Kyu, ready> INPCB: 40168348      Close-Wait -- local, foreign: 192.168.0.73 114 .. 192.168.0.5 54504
INPCB: 401680a8          Listen -- local, foreign: 0.0.0.0 114 .. 0.0.0.0 0
locker count: 0
Input thread: idle
Timer thread: idle
User lock: busy
  mbuf: alloc =  3, free =   6, max =  9
mbufcl: alloc =  2, free = 254, max =  4
  sock: alloc =  2, free =   0, max =  2
 inpcb: alloc =  2, free =   0, max =  2
Socket: 40168018   ACTIVE   pcb = 401680a8, state = 4000, rcv =     0, snd =     0
Socket: 401682b8   ACTIVE   pcb = 40168348, state = 4022, rcv =     0, snd =  1000 CONNECTED CANTRCV
Kyu input queue size: 0
TCP input queue size: -589
Kyu output queue size: 0
Netbuf head: 406b2610
511 netbuf available
511 netbuf on free list
512 netbuf configured
Clock: 75379

  Thread:       name (  &tp   )    state     sem       pc       sp     pri
  Thread:      shell (400a7a20)   READY J          400105cc 40568000   11
  Thread:     net-in (400a7650)   READY J          400105cc 40570000   20
  Thread:    net-out (400a755c)     SEM J net-outq 400105cc 40572000   21
  Thread:   net-slow (400a7468)  REPEAT C          40015e20 40574000   22
  Thread:  tcp-input (400a7838)     SEM J tcp-main 400105cc 4056c000   24
  Thread:  tcp-timer (400a7744)  REPEAT C          40024364 4056e000   25
  Thread: wangdoodle (400a7280)     SEM J   socket 400105cc 40578000   31
* Thread:   wang_thr (400a7374)   READY I          40016908 40576000   35
  Thread:       idle (400a792c)   READY C          40010aec 4056a000 1234
The item of concern is this:
TCP input queue size: -589
This is the value of the variable "tcp_inq_count". It is incremented in one place (within the queue locked section) and decremented in one place (just after we release the queue lock!). So that answer would seem to be simple. Move the decrement back one line and decrement before releasing the lock. It just goes to show that a statement as simple as this can have a race:
tcp_inq_count--;
And indeed it can. The value needs to be fetched, decremented, and then stored. This situation is entirely "cosmetic", although highly interesting. I make the change and restart the test. We see this roughly once every 142 transfers. Each transfer has 800 packets of 500 bytes, so we see this once every 100,000 queue accesses (more or less).

Yep, that was it. Yet another lesson about race conditions and locking.


Have any comments? Questions? Drop me a line!

Kyu / tom@mmto.org