| Summary: | BPF_MTAP/bpf_mtap are not threadsafe and cause panics on SMP systems | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Base System | Reporter: | Mark Gooderum <mark> | ||||||||
| Component: | kern | Assignee: | Christian S.J. Peron <csjp> | ||||||||
| Status: | Closed FIXED | ||||||||||
| Severity: | Affects Only Me | ||||||||||
| Priority: | Normal | ||||||||||
| Version: | 5.3-RELEASE | ||||||||||
| Hardware: | Any | ||||||||||
| OS: | Any | ||||||||||
| Attachments: |
|
||||||||||
Responsible Changed From-To: freebsd-bugs->rwatson Grab ownership of this PR Responsible Changed From-To: rwatson->csjp Assign this bug to csjp, he has recently taken on fixing locking in BPF, and may have fixed it in his recent changes. State Changed From-To: open->patched This issue should be patched in -CURRENT, once it is tested sufficiently enough, I will MFC it. State Changed From-To: patched->closed The fix has been MFCed |
BPF_MTAP/BPF_MTAP2 do a non-atomic test and invoke based on the value of the if_bpf field. The problem is that if the last bpf of on the interface is deleted, the if_bpf field is set to NULL by bpf_detachd(). If another thread (such as a userland process) deletes the last bpf in the (admittedly small) window between the test and invocation then bpf_mtap() is invoked with a NULL bp parameter which cases a fault on the LIST_EMPTY check #define BPF_MTAP(_ifp,_m) do { \ if ((_ifp)->if_bpf) { \ M_ASSERTVALID(_m); \ bpf_mtap((_ifp)->if_bpf, (_m)); \ } \ } while (0) This happens becase on i386 at least the initial NULL check does _NOT_ fetch the variable: Line 3359 of "../../../dev/bge/if_bge.c" starts at address 0xc045e7a5 <bge_start_locked+53> and ends at 0xc045e7c0 <bge_start_locked+80>. 0xc045e7a5 <bge_start_locked+53>: cmpl $0x0,0x2074(%edi,%eax,4) 0xc045e7ad <bge_start_locked+61>: jne 0xc045ea71 <bge_start_locked+769> 0xc045e7b3 <bge_start_locked+67>: lea 0xfc(%esi),%eax 0xc045e7b9 <bge_start_locked+73>: mov %eax,0xffffffec(%ebp) 0xc045e7bc <bge_start_locked+76>: lea 0x0(%esi),%es It just does an optimized test for NULLness, the value isn't fetched until later when setting up the call: 0xc045ea46 <bge_start_locked+726>: mov %ebx,0x4(%esp) 0xc045ea4a <bge_start_locked+730>: mov 0x3c(%esi),%eax 0xc045ea4d <bge_start_locked+733>: mov %eax,(%esp) 0xc045ea50 <bge_start_locked+736>: call 0xc05897d0 <bpf_mtap> 0xc045ea55 <bge_start_locked+741>: lea 0x0(%esi),%esi 0xc045ea59 <bge_start_locked+745>: lea 0x0(%edi),%edi 0xc045ea60 <bge_start_locked+752>: mov 0xfffffff0(%ebp),%eax 0xc045ea63 <bge_start_locked+755>: cmpl $0x0,0x2074(%edi,%eax,4) 0xc045ea6b <bge_start_locked+763>: je 0xc045e7c0 <bge_start_locked+80> So the window is small but real. Our field experience is with a box doing a large amount of packet processing while running frequent nessus scans (nessus adds and removes BPF filters on the fly as needed for certain tests). Fix: Either modify bpf_mtap()/bpf_mtap2() to check for a NULL parameter (the test/modify race doesn't apply once we are into the function because the bpf_if lasts as long as the interface, only the pointer to it in the struct ifnet comes and goes and by the time we're in bpf_mtap() we're looking a copy of the variable on the stack, not the actual ifnet field. Most interfaces have per-interface locks that could also be used but those mutexes are currently private to the drivers. /* + * We can sometimes be invoked w/NULL bp due to a small race in + * BPF_MTAP(), see PR#xxxxx. + */ + if (!bp) + return; + + /* * Lockless read to avoid cost of locking the interface if there are * no descriptors attached. */ if (LIST_EMPTY(&bp->bif_dlist)) return; pktlen = m_length(m, NULL); if (pktlen == m->m_len) { bpf_tap(bp, mtod(m, u_char *), pktlen); return; @@ -1245,20 +1252,27 @@ void bpf_mtap2(bp, data, dlen, m) struct bpf_if *bp; void *data; u_int dlen; struct mbuf *m; { struct mbuf mb; struct bpf_d *d; u_int pktlen, slen; + + /* + * We can sometimes be invoked w/NULL bp due to a small race in + * BPF_MTAP2(), see PR#xxxxx. + */ + if (!bp) + return; /* * Lockless read to avoid cost of locking the interface if there are * no descriptors attached. */ if (LIST_EMPTY(&bp->bif_dlist)) return; pktlen = m_length(m, NULL); /*--302rT8RfQU5xlkFKELbrBulMjLgPcVyMcutrwB5yHuXbBGZH Content-Type: text/plain; name="file.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="file.diff" --- /tmp/tmp.97907.0 Thu Oct 6 15:57:52 2005 +++ sys/net/bpf.c Thu Oct 6 15:57:45 2005 @@ -1201,20 +1201,27 @@ */ void bpf_mtap(bp, m) struct bpf_if *bp; struct mbuf *m; { struct bpf_d *d; u_int pktlen, slen; How-To-Repeat: Have lots of bpf filter add/deletes happening on a system under a heavy packet load. I will attach a simple test program that adds/deletes filters on an interface at a high rate if desired. This window also affects BPF_MTAP2/bpf_mtap2()