Bug 87014

Summary: BPF_MTAP/bpf_mtap are not threadsafe and cause panics on SMP systems
Product: Base System Reporter: Mark Gooderum <mark>
Component: kernAssignee: Christian S.J. Peron <csjp>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 5.3-RELEASE   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
Makefile
none
bpfspin.c
none
BPFMTAP.difftxt none

Description Mark Gooderum 2005-10-06 22:10:12 UTC
BPF_MTAP/BPF_MTAP2 do a non-atomic test and invoke based on the value of the if_bpf field.    The problem is that if the last bpf of on the interface is deleted, the if_bpf field is set to NULL by bpf_detachd(). If another thread (such as  a userland process) deletes the last bpf in the (admittedly small) window between the test and invocation then bpf_mtap() is invoked with a NULL bp parameter which cases a fault on the LIST_EMPTY check

#define    BPF_MTAP(_ifp,_m) do {                    \
   if ((_ifp)->if_bpf) {                    \
       M_ASSERTVALID(_m);                \
       bpf_mtap((_ifp)->if_bpf, (_m));            \
   }                            \
} while (0) 

This happens becase on i386 at least the initial NULL check does _NOT_ fetch the variable:

Line 3359 of "../../../dev/bge/if_bge.c"
  starts at address 0xc045e7a5 <bge_start_locked+53>
  and ends at 0xc045e7c0 <bge_start_locked+80>.

0xc045e7a5 <bge_start_locked+53>:       cmpl   $0x0,0x2074(%edi,%eax,4)
0xc045e7ad <bge_start_locked+61>:       jne    0xc045ea71 <bge_start_locked+769>
0xc045e7b3 <bge_start_locked+67>:       lea    0xfc(%esi),%eax
0xc045e7b9 <bge_start_locked+73>:       mov    %eax,0xffffffec(%ebp)
0xc045e7bc <bge_start_locked+76>:       lea    0x0(%esi),%es 

It just does an optimized test for NULLness, the value isn't fetched until later when setting up the call:


0xc045ea46 <bge_start_locked+726>:      mov    %ebx,0x4(%esp)
0xc045ea4a <bge_start_locked+730>:      mov    0x3c(%esi),%eax
0xc045ea4d <bge_start_locked+733>:      mov    %eax,(%esp)
0xc045ea50 <bge_start_locked+736>:      call   0xc05897d0 <bpf_mtap>
0xc045ea55 <bge_start_locked+741>:      lea    0x0(%esi),%esi
0xc045ea59 <bge_start_locked+745>:      lea    0x0(%edi),%edi
0xc045ea60 <bge_start_locked+752>:      mov    0xfffffff0(%ebp),%eax
0xc045ea63 <bge_start_locked+755>:      cmpl   $0x0,0x2074(%edi,%eax,4)
0xc045ea6b <bge_start_locked+763>:      je     0xc045e7c0 <bge_start_locked+80>

So the window is small but real.  Our field experience is with a box doing a large amount of packet processing while running frequent nessus scans (nessus adds and removes BPF filters on the fly as needed for certain tests).

Fix: Either modify bpf_mtap()/bpf_mtap2() to check for a NULL parameter (the test/modify race doesn't apply once we are into the function because the bpf_if lasts as long as the interface, only the pointer to it in the struct ifnet comes and goes and by the time we're in bpf_mtap() we're looking a copy of the variable on the stack, not the actual ifnet field.

Most interfaces have per-interface locks that could also be used but those mutexes are currently private to the drivers.



/*
+        * We can sometimes be invoked w/NULL bp due to a small race in
+        * BPF_MTAP(), see PR#xxxxx.
+        */
+       if (!bp)
+               return;
+
+       /*
         * Lockless read to avoid cost of locking the interface if there are
         * no descriptors attached.
         */
        if (LIST_EMPTY(&bp->bif_dlist))
                return;

        pktlen = m_length(m, NULL);
        if (pktlen == m->m_len) {
                bpf_tap(bp, mtod(m, u_char *), pktlen);
                return;
@@ -1245,20 +1252,27 @@
 void
 bpf_mtap2(bp, data, dlen, m)
        struct bpf_if *bp;
        void *data;
        u_int dlen;
        struct mbuf *m;
 {
        struct mbuf mb;
        struct bpf_d *d;
        u_int pktlen, slen;
+
+       /*
+        * We can sometimes be invoked w/NULL bp due to a small race in
+        * BPF_MTAP2(), see PR#xxxxx.
+        */
+       if (!bp)
+               return;

        /*
         * Lockless read to avoid cost of locking the interface if there are
         * no descriptors attached.
         */
        if (LIST_EMPTY(&bp->bif_dlist))
                return;

        pktlen = m_length(m, NULL);
        /*--302rT8RfQU5xlkFKELbrBulMjLgPcVyMcutrwB5yHuXbBGZH
Content-Type: text/plain; name="file.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="file.diff"

--- /tmp/tmp.97907.0    Thu Oct  6 15:57:52 2005
+++ sys/net/bpf.c  Thu Oct  6 15:57:45 2005
@@ -1201,20 +1201,27 @@
  */
 void
 bpf_mtap(bp, m)
        struct bpf_if *bp;
        struct mbuf *m;
 {
        struct bpf_d *d;
        u_int pktlen, slen;
How-To-Repeat: Have lots of bpf filter add/deletes happening on a system under a heavy packet load.  I will attach a simple test program that adds/deletes filters on an interface at a high rate if desired.

This window also affects BPF_MTAP2/bpf_mtap2()
Comment 1 Robert Watson freebsd_committer freebsd_triage 2005-10-08 18:44:12 UTC
Responsible Changed
From-To: freebsd-bugs->rwatson

Grab ownership of this PR
Comment 2 Robert Watson freebsd_committer freebsd_triage 2006-06-12 12:41:06 UTC
Responsible Changed
From-To: rwatson->csjp

Assign this bug to csjp, he has recently taken on fixing locking in 
BPF, and may have fixed it in his recent changes.
Comment 3 Christian S.J. Peron freebsd_committer freebsd_triage 2006-06-12 14:40:22 UTC
State Changed
From-To: open->patched

This issue should be patched in -CURRENT, once it is tested 
sufficiently enough, I will MFC it.
Comment 4 Christian S.J. Peron freebsd_committer freebsd_triage 2007-01-21 02:58:04 UTC
State Changed
From-To: patched->closed

The fix has been MFCed