The following block of code, in `sys/sys/buf_ring.h': /* * If there are other enqueues in progress * that preceeded us, we need to wait for them * to complete */ while (br->br_prod_tail != prod_head) cpu_spinwait(); br->br_prod_bufs++; br->br_prod_bytes += nbytes; br->br_prod_tail = prod_next; critical_exit(); can be seen at runtime, memory-wise as: while (br->br_prod_tail != prod_head) cpu_spinwait(); br->br_prod_tail = prod_next; br->br_prod_bufs++; br->br_prod_bytes += nbytes; critical_exit(); That is, there is no memory barrier to enforce completion of the load/increment/store/load/load/addition/store operations before updating what other thread spin on. Even if `br_prod_tail' is marked `volatile', there is no guarantee that it will not be re-ordered wrt. non-volatile write (to `br_prod_bufs' and `br_prod_bytes'). Confirmed by Kip Macy (kmacy@) in http://lists.freebsd.org/pipermail/freebsd-hackers/2011-September/036454.html. How-To-Repeat: code review.
On Sat, 24 Sep 2011, Arnaud Lacombe wrote: >> Description: > The following block of code, in `sys/sys/buf_ring.h': > > > /* > * If there are other enqueues in progress > * that preceeded us, we need to wait for them > * to complete > */ > while (br->br_prod_tail != prod_head) > cpu_spinwait(); > br->br_prod_bufs++; > br->br_prod_bytes += nbytes; > br->br_prod_tail = prod_next; > critical_exit(); > > > can be seen at runtime, memory-wise as: > > while (br->br_prod_tail != prod_head) > cpu_spinwait(); > br->br_prod_tail = prod_next; > br->br_prod_bufs++; > br->br_prod_bytes += nbytes; > critical_exit(); > > That is, there is no memory barrier to enforce completion of the > load/increment/store/load/load/addition/store operations before > updating what other thread spin on. The counters are 64 bits, so it also does non-atomic increments of them no 32-bit arches. > Even if `br_prod_tail' is marked `volatile', there is no guarantee > that it will not be re-ordered wrt. non-volatile write (to `br_prod_bufs' > and `br_prod_bytes'). Using volatile is generally bogus. Here it seems to mainly give pessimizations and more opportunities for bad memory orders. The i386 code for incrementing a 64-bit volatile x is: movl x, %eax movl x+4, %edx addl $1, %eax adcl $0, %edx movl %eax, x movl %edx, x+4 while for a 64-bit non-volatile it is: addl $1, x adcl $0, x+4 so volatile gives more caching in registers instead of less. The following are some of the bad memory orders possible: with volatile: lo = br->br_prod_bytes.lo; hi = br->br_prod_bytes.hi; br->br_prod_tail = prod_next; br->br_prod_bufs++; lo += nbytes; hi += carry; br->br_prod_bytes.hi = hi; br->br_prod_bytes.lo = lo; without volatile: br->br_prod_bytes.lo += nbytes; br->br_prod_tail = prod_next; br->br_prod_bufs++; br->br_prod_bytes.hi += carry; I think the token method would make the nonatomic accesses to the counters sufficiently atomic if it worked. The necessary memory barriers would probably have a memory clobber which effectively makes all memory variables transiently volatile (where volatile actually means non-volatile with respect to them changing -- holding the token prevents them changing -- but actually means volatile with respect to their memory accesses) so the effect of declaring the counters permanently volatile would be reduced to a pessimization. Even reads of them in sysctls must hold the token to get a consistent snapshot. Bruce
For bugs matching the following criteria: Status: In Progress Changed: (is less than) 2014-06-01 Reset to default assignee and clear in-progress tags. Mail being skipped