Bug 194550

Summary: panic: race condition with epair and atmconfig
Product: Base System
Component: kern
Status: Closed FIXED
Severity: Affects Some People
Priority: ---
Version: CURRENT
Hardware: Any
OS: Any
Reporter: Oleg Ginzburg <olevole>
Assignee: Andrey V. Elsukov <ae>
CC: ae, olevole

Description Oleg Ginzburg 2014-10-23 09:23:17 UTC
When epair interfaces are created quickly enough, the system panics while atmconfig is running (atmconfig is invoked by afexists() in /etc/network.subr, which devd executes).

Work-around: kill the devd process, or prevent atmconfig from running by adding an early return to the atm case of afexists() in /etc/network.subr:
-----
        atm)
++              return 1
                if [ -x /sbin/atmconfig ]; then
                        /sbin/atmconfig diag list > /dev/null 2>&1
                else
                        return 1
                fi
-----


The panic is easy to reproduce with the following script:
-----
#!/bin/sh
for i in $( seq 0 300 ); do
     echo $i
     /sbin/ifconfig epair${i} create
     [ $? -ne 0 ] && exit 1
done
-----


Running the following loop in parallel makes the panic appear much sooner:
-----
#!/bin/sh
while [ 1 ];do
     /sbin/atmconfig diag list
done
-----
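
The two loops above can also be driven from a single script. The following is a minimal sketch and not part of the original report: the function names create_epairs, poll_atmconfig and run_repro, and the IFCONFIG/ATMCONFIG variables, are hypothetical and exist only so the commands can be overridden. On a kernel without the fix it is expected to panic the machine, so it should only be run on a disposable test system.

```sh
#!/bin/sh
# Sketch only: combines the two reproduction loops from this report.
# IFCONFIG/ATMCONFIG and all function names are illustrative assumptions.
: "${IFCONFIG:=/sbin/ifconfig}"
: "${ATMCONFIG:=/sbin/atmconfig}"

# Create epair0 .. epair<max>; stop at the first failure.
create_epairs() {
    max=$1
    i=0
    while [ "$i" -le "$max" ]; do
        echo "$i"
        "$IFCONFIG" "epair${i}" create || return 1
        i=$((i + 1))
    done
}

# Query the ATM diag list in a loop until the caller kills it.
poll_atmconfig() {
    while :; do
        "$ATMCONFIG" diag list > /dev/null 2>&1
    done
}

# Run the poller in the background while epairs are created,
# then stop the poller and return the status of the creation loop.
run_repro() {
    poll_atmconfig &
    poller=$!
    create_epairs "$1"
    rc=$?
    kill "$poller" 2> /dev/null
    wait "$poller" 2> /dev/null
    return "$rc"
}
```

With the functions loaded, running run_repro 300 as root corresponds to the two scripts above running side by side.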


KGDB Backtrace:
--
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor read instruction, page not present
instruction pointer     = 0x20:0x0
stack pointer           = 0x28:0xfffffe007b5857b0
frame pointer           = 0x28:0xfffffe007b5857e0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 3008 (atmconfig)
Uptime: 1m47s
Dumping 145 out of 2021 MB:..12%..23%..34%..45%..56%..67%..78%..89%..100%

Reading symbols from /boot/kernel/pf.ko.symbols...done.
Loaded symbols for /boot/kernel/pf.ko.symbols
Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
Loaded symbols for /boot/kernel/nullfs.ko.symbols
Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
Loaded symbols for /boot/kernel/fdescfs.ko.symbols
Reading symbols from /boot/kernel/if_epair.ko.symbols...done.
Loaded symbols for /boot/kernel/if_epair.ko.symbols
#0  doadump (textdump=1) at pcpu.h:219
219             __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) list
214     static __inline __pure2 struct thread *
215     __curthread(void)
216     {
217             struct thread *td;
218
219             __asm("movq %%gs:%1,%0" : "=r" (td)
220                 : "m" (*(char *)OFFSETOF_CURTHREAD));
221             return (td);
222     }
223     #ifdef __clang__
Current language:  auto; currently minimal
(kgdb)
--
Comment 1 commit-hook freebsd_committer freebsd_triage 2014-10-23 14:30:17 UTC
A commit references this bug:

Author: ae
Date: Thu Oct 23 14:29:53 UTC 2014
New revision: 273547
URL: https://svnweb.freebsd.org/changeset/base/273547

Log:
  Move if_get_counter initialization from if_attach into if_alloc.
  Also, initialize all counters before ifnet will become available in the system.
  This fixes possible access to uninitialized ifnet fields.

  PR:		194550

Changes:
  head/sys/net/if.c
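
To check whether a head kernel already contains this change, its revision can be compared with 273547. This is a rough sketch under stated assumptions: it assumes the kernel version string (for example the output of uname -v) embeds the Subversion revision as "r<number>", which is not guaranteed for every build, and the helper names extract_rev and has_fix are hypothetical. The comparison is only meaningful for kernels built from head, since other branches use different revision numbers.

```sh
#!/bin/sh
# Sketch only: FIX_REV is the revision from the commit above; the helper
# names and the "r<revision>" version-string format are assumptions.
FIX_REV=273547

# Print the revision from the last " r<digits>" token in a version string.
# Prints nothing if no such token is present.
extract_rev() {
    printf '%s\n' "$1" | sed -n 's/.*[^A-Za-z0-9]r\([0-9][0-9]*\).*/\1/p'
}

# Succeed if the version string names a revision at or after FIX_REV.
has_fix() {
    rev=$(extract_rev "$1")
    [ -n "$rev" ] && [ "$rev" -ge "$FIX_REV" ]
}
```

For example, has_fix "$(uname -v)" && echo "kernel is at or after r273547" reports a match only when a revision can be parsed from the version string.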