Bug 246475 - Unread entries potentially lost in buf_ring after ABA condition
Summary: Unread entries potentially lost in buf_ring after ABA condition
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.1-STABLE
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-05-15 04:29 UTC by Martin Balao
Modified: 2021-08-14 10:47 UTC (History)
8 users (show)

See Also:


Attachments
bug_ring memory corruption proof-of-concept and execution trace (6.12 KB, application/gzip)
2020-05-15 04:29 UTC, Martin Balao
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Balao 2020-05-15 04:29:13 UTC
Created attachment 214511 [details]
bug_ring memory corruption proof-of-concept and execution trace

We [1] found a potential memory corruption bug in the implementation of FreeBSD's ring-buffer. Under specific scheduling conditions, and assuming a multiple-producers scenario, it might be possible to break the 'one entry always empty' invariant and overwrite unread values. Looks to us that release 12.1.0 [2] is affected, and previous releases may be affected as well.

The bug appears to be an ABA problem in the buf_ring_enqueue function. If br->br_prod_head wraps-around the ring-buffer to have the same value than what was originally read by a Producer, the CAS will succeed but the 'unnoticed' activity may have moved cons_tail. In particular, it may have moved cons_tail to the point that the previous "prod_next == cons_tail" check result is now invalid. The Producer will move forward instead of checking again. As a result, a Producer may write the entry supposed to be empty. With no-stopper empty entry, following producers will start overwriting unread values.

We believe that short ring-buffers in highly concurrent scenarios are the most likely circumstances in which this could happen. That is because of the easier prod_head wrap-around.

In regards to fixing this issue, a double-word or wide CAS that includes a counter as part of the condition should be enough to make the ABA unfeasible. This would require some support from the architecture, though. Other ring-buffer implementations such as DPDK [3] take a slightly different approach, which should provide an equivalent level of security and is easier to implement on any architecture. Extrapolated to FreeBSD's buf_ring, this would be to let prod_head take values across the whole uint32 range and use the mask only when reading or writing entries. For the ABA to occur, the whole uint32 range would need to be wrapped-around -which looks unrealistic-.

Attached to this bug report you will find our proof-of-concept. It is a Linux user-space port of FreeBSD's ring-buffer. We instrumented buf_ring.h to generate proper scheduling conditions, and implemented all the scaffolding around. We tried to mock structures and functions with reasonable Linux x86_64 equivalents. Please note that our proof-of-concept has -for illustration purposes- more Producers than what is actually needed: 2 Producers and 1 Consumer should be enough. Attached to this bug report you will also find a trace of execution.

It has not been under the scope of our investigation to trigger this bug from FreeBSD user-space. It well be the case that the bug is there but there is no current way of triggering it due to the constraints imposed by the context. However, we suggest fixing it on any case to avoid a show up in the future.

--
[1] - Andres Lopez, Martin Di Paola and Martin Balao
[2] -
https://svnweb.freebsd.org/base/release/12.1.0/sys/sys/buf_ring.h?view=co
[3] - http://git.dpdk.org/dpdk/tree/lib/librte_ring/rte_ring_c11_mem.h