I run a pretty busy mpd-5.6 based PPPoE access server (about 1700 simultaneous connections during peak hours). It uses dummynet extensively: each connecting user gets its own dynamic dummynet pipes. Some time ago this server crashed. The crash dump points to dummynet-related code:

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
0 (dummynet)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff801adf1a = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0xffffffff80330827 = kdb_backtrace+0x37
panic() at 0xffffffff802fd48e = panic+0x1ce
trap_fatal() at 0xffffffff804f3f80 = trap_fatal+0x290
trap_pfault() at 0xffffffff804f430e = trap_pfault+0x23e
trap() at 0xffffffff804f47de = trap+0x3ce
calltrap() at 0xffffffff804dac04 = calltrap+0x8
--- trap 0xc, rip = 0x1, rsp = 0xffffff8122a9ea20, rbp = 0xffffff8122a9ea40 ---
uart_z8530_class() at 0x1
uma_zfree_arg() at 0xffffffff804b53da = uma_zfree_arg+0x3a
m_freem() at 0xffffffff8035c4a7 = m_freem+0x37
dummynet_send() at 0xffffffff8040017d = dummynet_send+0x2d
dummynet_task() at 0xffffffff80400496 = dummynet_task+0x1c6
taskqueue_run_locked() at 0xffffffff8033cdf5 = taskqueue_run_locked+0x85
taskqueue_thread_loop() at 0xffffffff8033cf8e = taskqueue_thread_loop+0x4e
fork_exit() at 0xffffffff802d13cf = fork_exit+0x11f
fork_trampoline() at 0xffffffff804db14e = fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff8122a9ecf0, rbp = 0 ---
Uptime: 14d8h5m59s
Dumping 754 out of 4079 MB:..3%..11%..22%..32%..41%..51%..62%..73%..81%..92%

Reading symbols from /boot/kernel/ipmi.ko...done.
Loaded symbols for /boot/kernel/ipmi.ko
#0 doadump () at /home/src/sys/kern/kern_shutdown.c:268
268 if (textdump_pending)
(kgdb) bt
#0 doadump () at /home/src/sys/kern/kern_shutdown.c:268
#1 0xffffffff802fcf8a in boot (howto=260) at /home/src/sys/kern/kern_shutdown.c:448
#2 0xffffffff802fd467 in panic (fmt=0x1 <Address 0x1 out of bounds>) at /home/src/sys/kern/kern_shutdown.c:639
#3 0xffffffff804f3f80 in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
) at /home/src/sys/amd64/amd64/trap.c:848
#4 0xffffffff804f430e in trap_pfault (frame=0xffffff8122a9e970, usermode=0) at /home/src/sys/amd64/amd64/trap.c:764
#5 0xffffffff804f47de in trap (frame=0xffffff8122a9e970) at /home/src/sys/amd64/amd64/trap.c:457
#6 0xffffffff804dac04 in calltrap () at /home/src/sys/amd64/amd64/exception.S:228
#7 0x0000000000000001 in ?? ()
#8 0xffffffff802eb967 in mb_dtor_pack (mem=Variable "mem" is not available.
) at /home/src/sys/kern/kern_mbuf.c:453
#9 0xffffffff804b53da in uma_zfree_arg (zone=0xffffff00df773780, item=0xffffff0054691b00, udata=0x0) at /home/src/sys/vm/uma_core.c:2543
#10 0xffffffff8035c4a7 in m_freem (mb=0x0) at mbuf.h:562
#11 0xffffffff8040017d in dummynet_send (m=0xffffff0054691b00) at /home/src/sys/netinet/ipfw/ip_dn_io.c:705
#12 0xffffffff80400496 in dummynet_task (context=Variable "context" is not available.
) at /home/src/sys/netinet/ipfw/ip_dn_io.c:615
#13 0xffffffff8033cdf5 in taskqueue_run_locked (queue=0xffffff0003a09380) at /home/src/sys/kern/subr_taskqueue.c:250
#14 0xffffffff8033cf8e in taskqueue_thread_loop (arg=Variable "arg" is not available.
) at /home/src/sys/kern/subr_taskqueue.c:387
#15 0xffffffff802d13cf in fork_exit (callout=0xffffffff8033cf40 <taskqueue_thread_loop>, arg=0xffffffff80769e80, frame=0xffffff8122a9ec40) at /home/src/sys/kern/kern_fork.c:876
#16 0xffffffff804db14e in fork_trampoline () at /home/src/sys/amd64/amd64/exception.S:602
#17 0x0000000000000000 in ?? ()
#18 0x0000000000000000 in ?? ()
#19 0x0000000000000000 in ?? ()
#20 0x0000000000000000 in ?? ()
#21 0x0000000000000000 in ?? ()
#22 0x0000000000000000 in ?? ()
#23 0x0000000000000000 in ?? ()
#24 0x0000000000000000 in ?? ()
#25 0x0000000000000000 in ?? ()
#26 0x0000000000000000 in ?? ()
#27 0x0000000000000000 in ?? ()
#28 0x0000000000000000 in ?? ()
#29 0x0000000000000000 in ?? ()
#30 0x0000000000000000 in ?? ()
#31 0x0000000000000000 in ?? ()
#32 0x0000000000000000 in ?? ()
#33 0x0000000000000000 in ?? ()
#34 0x0000000000000000 in ?? ()
#35 0x0000000000000000 in ?? ()
#36 0x0000000000000000 in ?? ()
#37 0x0000000000000000 in ?? ()
#38 0x0000000000000000 in ?? ()
#39 0x0000000000000000 in ?? ()
#40 0x0000000000000000 in ?? ()
#41 0xffffffff80763f00 in sleepq_chains ()
#42 0xffffff0003a12d18 in ?? ()
#43 0x0000000000000000 in ?? ()
#44 0xffffff0003a128e0 in ?? ()
---Type <return> to continue, or q <return> to quit---
#45 0xffffff8122a9eaf0 in ?? ()
#46 0xffffff8122a9ea98 in ?? ()
#47 0xffffff0001c0e470 in ?? ()
#48 0xffffffff80323972 in sched_switch (td=0xffffffff8033cf40, newtd=0xffffffff80769e80, flags=Variable "flags" is not available.
) at /home/src/sys/kern/sched_ule.c:1886
Previous frame inner to this frame (corrupt stack?)

Here is its kernel configuration file:

cpu HAMMER
ident PPPOE

# To statically compile in device wiring instead of /boot/device.hints
#hints "GENERIC.hints" # Default places to look for devices.

# Use the following to compile in values accessible to the kernel
# through getenv() (or kenv(1) in userland).
# The format of the file is 'variable=value', see kenv(1)
#
# env "GENERIC.env"

makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols

options SCHED_ULE # ULE scheduler
options PREEMPTION # Enable kernel thread preemption
options INET # InterNETworking
options INET6 # IPv6 communications protocols
#options SCTP # Stream Control Transmission Protocol
options FFS # Berkeley Fast Filesystem
options SOFTUPDATES # Enable FFS soft updates support
options UFS_ACL # Support for access control lists
options UFS_DIRHASH # Improve performance on big directories
options UFS_GJOURNAL # Enable gjournal-based UFS journaling
options MD_ROOT # MD is a potential root device
#options NFSCLIENT # Network Filesystem Client
#options NFSSERVER # Network Filesystem Server
#options NFSLOCKD # Network Lock Manager
options NFS_ROOT # NFS usable as /, requires NFSCLIENT
#options MSDOSFS # MSDOS Filesystem
#options CD9660 # ISO 9660 Filesystem
#options PROCFS # Process filesystem (requires PSEUDOFS)
options PSEUDOFS # Pseudo-filesystem framework
options GEOM_PART_GPT # GUID Partition Tables.
options GEOM_LABEL # Provides labelization
#options GEOM_JOURNAL
options COMPAT_43TTY # BSD 4.3 TTY compat (sgtty)
options COMPAT_FREEBSD32 # Compatible with i386 binaries
#options COMPAT_FREEBSD4 # Compatible with FreeBSD4
#options COMPAT_FREEBSD5 # Compatible with FreeBSD5
#options COMPAT_FREEBSD6 # Compatible with FreeBSD6
#options COMPAT_FREEBSD7 # Compatible with FreeBSD7
#options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
options KTRACE # ktrace(1) support
options STACK # stack(9) support
options SYSVSHM # SYSV-style shared memory
options SYSVMSG # SYSV-style message queues
options SYSVSEM # SYSV-style semaphores
options P1003_1B_SEMAPHORES # POSIX-style semaphores
options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options PRINTF_BUFR_SIZE=512 # Prevent printf output being interspersed.
options KBD_INSTALL_CDEV # install a CDEV entry in /dev
options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4)
options AUDIT # Security event auditing
options MAC # TrustedBSD MAC Framework
#options FLOWTABLE # per-cpu routing cache
#options KDTRACE_FRAME # Ensure frames are compiled in
#options KDTRACE_HOOKS # Kernel DTrace hooks
options INCLUDE_CONFIG_FILE # Include this file in kernel

# Make an SMP-capable kernel by default
options SMP # Symmetric MultiProcessor Kernel

# CPU frequency control
device cpufreq

# Bus support.
device acpi
device pci

# Floppy drives
#device fdc

# ATA and ATAPI devices
device ata
device atadisk # ATA disk drives
device atapicd # ATAPI CDROM drives

# SCSI peripherals
device scbus # SCSI bus (required for SCSI)
device da # Direct Access (disks)
device cd # CD
device pass # Passthrough device (direct SCSI access)

# atkbdc0 controls both the keyboard and the PS/2 mouse
device atkbdc # AT keyboard controller
device atkbd # AT keyboard
device psm # PS/2 mouse

device kbdmux # keyboard multiplexer

device vga # VGA video card driver

# syscons is the default console driver, resembling an SCO console
device sc

# Serial (COM) ports
device uart # Generic UART driver

# PCI Ethernet NICs.
device em # Intel PRO/1000 Gigabit Ethernet Family
device igb

# Pseudo devices.
device loop # Network loopback
device random # Entropy device
device ether # Ethernet support
device vlan # 802.1Q VLAN support
device pty # BSD-style compatibility pseudo ttys
device md # Memory "disks"
device gif # IPv6 and IPv4 tunneling
device faith # IPv6-to-IPv4 relaying (translation)
device firmware # firmware assist module
device snp
device bpf # Berkeley packet filter

# USB support
#options USB_DEBUG # enable debug msgs
#options USB_VERBOSE
device uhci # UHCI PCI->USB interface
device ehci # EHCI PCI->USB interface (USB 2.0)
device usb # USB Bus (required)
device ukbd # Keyboard
device umass # Disks/Mass storage - Requires scbus and da
device ums # Mouse
device ucom
# USB support for Prolific PL-2303 serial adapters
device uplcom
# USB support for Silicon Laboratories CP2101/CP2102 based USB serial adapters
device uslcom

#options IPSEC
#device crypto

options NETGRAPH
options NETGRAPH_ETHER
options NETGRAPH_IFACE
options NETGRAPH_MPPC_ENCRYPTION
options NETGRAPH_PPP
options NETGRAPH_PPPOE
options NETGRAPH_SOCKET
options NETGRAPH_TCPMSS
options NETGRAPH_TEE
options NETGRAPH_VJC

options IPFIREWALL
options IPFIREWALL_FORWARD
options DUMMYNET

options VFS_AIO

device smbus
device smb
device ichsmb
device iicbus
device iicbb
device ic
device iic
device iicsmb
device coretemp
device ichwd
device nvram
device lagg

options KDB
options KDB_TRACE
options KDB_UNATTENDED
options DDB
options DDB_NUMSYM
#options NETGRAPH_DEBUG
#options INVARIANT_SUPPORT
#options INVARIANTS
#options DEBUG_MEMGUARD
#options BREAK_TO_DEBUGGER
options ALT_BREAK_TO_DEBUGGER
device bridge

Fix: Unknown to me.

How-To-Repeat: Run a busy router with lots of dummynet dynamic pipes, lots of traffic inside the pipes, and a high rate of pipe creation/expiration.
Responsible Changed From-To: freebsd-bugs->freebsd-net Over to maintainer(s).
Hi,

We have the very same issue here on FreeBSD 9.2-RELEASE on different hardware (HP DL360 G6 and HP Microserver G7). I suggest someone change Importance to something other than "Normal Affects Only me" because we have at least 8 such servers with the same issue. It happens more often under higher load (customers and/or traffic). Don't know how it scales exactly (linear, quadratic or exponential) nor if it scales with traffic or the amount of pppoe clients but it sure if one of them of a mix of them.

Here's the kernel dump:

FreeBSD hiden_host 9.2-RELEASE FreeBSD 9.2-RELEASE #0 r255898: Thu Sep 26 22:50:31 UTC 2013 root@bake.isc.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64

panic: page fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
current process = 0 (dummynet)
trap number = 12
panic: page fault
cpuid = 7
KDB: stack backtrace:
#0 0xffffffff80947986 at kdb_backtrace+0x66
#1 0xffffffff8090d9ae at panic+0x1ce
#2 0xffffffff80cf20d0 at trap_fatal+0x290
#3 0xffffffff80cf2431 at trap_pfault+0x211
#4 0xffffffff80cf29e4 at trap+0x344
#5 0xffffffff80cdbd13 at calltrap+0x8
#6 0xffffffff809c5959 at bpf_mtap2+0x89
#7 0xffffffff8188e11a at ng_iface_bpftap+0x2a
#8 0xffffffff8188eb11 at ng_iface_output+0xf1
#9 0xffffffff80a3a104 at ip_output+0xd74
#10 0xffffffff81864edc at dummynet_send+0x13c
#11 0xffffffff81865467 at dummynet_task+0x1b7
#12 0xffffffff80954554 at taskqueue_run_locked+0x74
#13 0xffffffff80955506 at taskqueue_thread_loop+0x46
#14 0xffffffff808db67f at fork_exit+0x11f
#15 0xffffffff80cdc23e at fork_trampoline+0xe
Uptime: 21d19h53m27s
Dumping 1450 out of 16359 MB:..2%..12%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/if_carp.ko...Reading symbols from /boot/kernel/if_carp.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/if_carp.ko
Reading symbols from /boot/kernel/pf.ko...Reading symbols from /boot/kernel/pf.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/pf.ko
Reading symbols from /boot/kernel/ipfw.ko...Reading symbols from /boot/kernel/ipfw.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ipfw.ko
Reading symbols from /boot/kernel/dummynet.ko...Reading symbols from /boot/kernel/dummynet.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/dummynet.ko
Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from /boot/kernel/ng_socket.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_socket.ko
Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from /boot/kernel/netgraph.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/netgraph.ko
Reading symbols from /boot/kernel/ng_mppc.ko...Reading symbols from /boot/kernel/ng_mppc.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_mppc.ko
Reading symbols from /boot/kernel/rc4.ko...Reading symbols from /boot/kernel/rc4.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/rc4.ko
Reading symbols from /boot/kernel/ng_ether.ko...Reading symbols from /boot/kernel/ng_ether.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_ether.ko
Reading symbols from /boot/kernel/ng_pppoe.ko...Reading symbols from /boot/kernel/ng_pppoe.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_pppoe.ko
Reading symbols from /boot/kernel/ng_tee.ko...Reading symbols from /boot/kernel/ng_tee.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_tee.ko
Reading symbols from /boot/kernel/ng_iface.ko...Reading symbols from /boot/kernel/ng_iface.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_iface.ko
Reading symbols from /boot/kernel/ng_ppp.ko...Reading symbols from /boot/kernel/ng_ppp.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_ppp.ko
#0 doadump (textdump=<value optimized out>) at pcpu.h:234
234 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=<value optimized out>) at pcpu.h:234
#1 0xffffffff8090d486 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:449
#2 0xffffffff8090d987 in panic (fmt=0x1 <Address 0x1 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:637
#3 0xffffffff80cf20d0 in trap_fatal (frame=0xc, eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:879
#4 0xffffffff80cf2431 in trap_pfault (frame=0xffffff8882642700, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:795
#5 0xffffffff80cf29e4 in trap (frame=0xffffff8882642700) at /usr/src/sys/amd64/amd64/trap.c:463
#6 0xffffffff80cdbd13 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:232
#7 0xffffffff8090adb0 in _rw_rlock (rw=0xfffffe013aade5a8, file=0x0, line=485069968) at /usr/src/sys/kern/kern_rwlock.c:382
#8 0xffffffff809c5959 in bpf_mtap2 (bp=0xfffffe013aade580, data=0xffffff88826429bc, dlen=4, m=0xfffffe0300f46700) at /usr/src/sys/net/bpf.c:2197
#9 0xffffffff8188e11a in ng_iface_bpftap (ifp=<value optimized out>, m=0x0, family=144 '\220') at /usr/src/sys/modules/netgraph/iface/../../../netgraph/ng_iface.c:444
#10 0xffffffff8188eb11 in ng_iface_output (ifp=0xfffffe014566a000, m=0xfffffe0300f46700, dst=0xffffff8882642aac, ro=<value optimized out>) at /usr/src/sys/modules/netgraph/iface/../../../netgraph/ng_iface.c:394
#11 0xffffffff80a3a104 in ip_output (m=0xfffffe0300f46700, opt=<value optimized out>, ro=0xffffff8882642a90, flags=<value optimized out>, imo=0x0, inp=0x0) at /usr/src/sys/netinet/ip_output.c:631
#12 0xffffffff81864edc in dummynet_send (m=0xfffffe0300f46700) at /usr/src/sys/modules/dummynet/../../netpfil/ipfw/ip_dn_io.c:655
#13 0xffffffff81865467 in dummynet_task (context=<value optimized out>, pending=<value optimized out>) at /usr/src/sys/modules/dummynet/../../netpfil/ipfw/ip_dn_io.c:618
#14 0xffffffff80954554 in taskqueue_run_locked (queue=0xfffffe000d2d1a80) at /usr/src/sys/kern/subr_taskqueue.c:312
#15 0xffffffff80955506 in taskqueue_thread_loop (arg=<value optimized out>) at /usr/src/sys/kern/subr_taskqueue.c:501
#16 0xffffffff808db67f in fork_exit (callout=0xffffffff809554c0 <taskqueue_thread_loop>, arg=0xffffffff81869be0, frame=0xffffff8882642c40) at /usr/src/sys/kern/kern_fork.c:992
#17 0xffffffff80cdc23e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:606
#18 0x0000000000000000 in ?? ()
(kgdb)
Sorry, some errors in my previous post: "Don't know how it scales exactly (linear, quadratic or exponential) nor if it scales with traffic or the amount of pppoe clients but it sure if one of them of a mix of them." should be: "I don't know how it scales exactly (linear, quadratic or exponential) nor if it scales with traffic or the amount of pppoe clients but it sure is one of them or a mix of them."
(In reply to dblais from comment #2)

I cannot look into this right now, but I noticed that what you are reporting looks different from the original post. The original looks like a double free, whereas what you are seeing looks like a locking issue in bpf. I don't know the code very well, but that's my understanding.

One more suggestion: if you have many servers with the exact same panic, what is the frequency of the panics? Can you provide any more info? It would be really helpful if you could run -current or stable10/release10 on one of those machines, as those branches are actively being worked on and there is a higher chance that what you are seeing _might_ have been fixed there.

My 2 cents.
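For anyone who wants to compare the two panics side by side, here is a minimal Python sketch that pulls the function names out of the "KDB: stack backtrace" lines and lists what sits between the fault and dummynet_send. The helper names (`kdb_frames`, `frames_above`) are made up for this illustration, and the two sample strings are short excerpts copied from the traces posted in this PR; the regular expression only covers the two frame formats seen here.

```python
import re

# Two KDB frame formats appear in this PR:
#   older:  "dummynet_send() at 0xffffffff8040017d = dummynet_send+0x2d"
#   newer:  "#10 0xffffffff81864edc at dummynet_send+0x13c"
# In both, the symbol+offset follows either "=" or "at".
FRAME_RE = re.compile(r"(?:=|\bat)\s+([A-Za-z_]\w*)\+0x[0-9a-f]+")


def kdb_frames(text):
    """Return function names in the order they appear in a KDB backtrace."""
    return FRAME_RE.findall(text)


def frames_above(frames, anchor):
    """Return the frames listed before `anchor`, i.e. nearer to the fault."""
    if anchor not in frames:
        return []
    return frames[:frames.index(anchor)]


# Excerpt from the original report (frames after the trap marker).
ORIGINAL = (
    "uart_z8530_class() at 0x1 "
    "uma_zfree_arg() at 0xffffffff804b53da = uma_zfree_arg+0x3a "
    "m_freem() at 0xffffffff8035c4a7 = m_freem+0x37 "
    "dummynet_send() at 0xffffffff8040017d = dummynet_send+0x2d "
    "dummynet_task() at 0xffffffff80400496 = dummynet_task+0x1c6"
)

# Excerpt from comment #2 (frames after calltrap).
LATER = (
    "#6 0xffffffff809c5959 at bpf_mtap2+0x89 "
    "#7 0xffffffff8188e11a at ng_iface_bpftap+0x2a "
    "#8 0xffffffff8188eb11 at ng_iface_output+0xf1 "
    "#9 0xffffffff80a3a104 at ip_output+0xd74 "
    "#10 0xffffffff81864edc at dummynet_send+0x13c "
    "#11 0xffffffff81865467 at dummynet_task+0x1b7"
)

if __name__ == "__main__":
    for label, trace in (("original", ORIGINAL), ("comment #2", LATER)):
        print(label, frames_above(kdb_frames(trace), "dummynet_send"))
```

Both traces share dummynet_task and dummynet_send, but the original goes through the mbuf free path while the later one goes through ip_output into ng_iface and bpf, which is the difference pointed out above.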
(In reply to Hiren Panchasara from comment #4)

This PR is probably a duplicate of my later PR 149513, which contains more details.
(In reply to Hiren Panchasara from comment #4)

This PR is probably a duplicate of my later PR *195102*, which contains more details.
(In reply to dblais from comment #2)
> #6 0xffffffff80cdbd13 in calltrap ()
> at /usr/src/sys/amd64/amd64/exception.S:232
> #7 0xffffffff8090adb0 in _rw_rlock (rw=0xfffffe013aade5a8, file=0x0,
> line=485069968) at /usr/src/sys/kern/kern_rwlock.c:382
> #8 0xffffffff809c5959 in bpf_mtap2 (bp=0xfffffe013aade580,
> data=0xffffff88826429bc, dlen=4, m=0xfffffe0300f46700)
> at /usr/src/sys/net/bpf.c:2197
> #9 0xffffffff8188e11a in ng_iface_bpftap (ifp=<value optimized out>, m=0x0,
> family=144 '\220')
> at /usr/src/sys/modules/netgraph/iface/../../../netgraph/ng_iface.c:444
> #10 0xffffffff8188eb11 in ng_iface_output (ifp=0xfffffe014566a000,
> m=0xfffffe0300f46700, dst=0xffffff8882642aac, ro=<value optimized out>)
> at /usr/src/sys/modules/netgraph/iface/../../../netgraph/ng_iface.c:394

This panic looks different. Probably an interface has gone away and BPF's interface departure handler has already destroyed bif_lock.
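To make the suspected ordering concrete, here is a toy userland model in Python. It is not the kernel code: `ToyRWLock`, `ToyBpfIf` and `replay` are hypothetical names invented for this sketch, and raising an exception merely stands in for the page fault. It only shows that a packet still queued when the interface departs ends up taking a lock that has already been destroyed.

```python
class ToyRWLock:
    """Stand-in for a kernel rwlock; raises instead of page-faulting."""

    def __init__(self):
        self.destroyed = False

    def destroy(self):
        self.destroyed = True

    def rlock(self):
        if self.destroyed:
            raise RuntimeError("rlock on destroyed lock")


class ToyBpfIf:
    """Stand-in for a per-interface BPF descriptor that owns a lock."""

    def __init__(self):
        self.lock = ToyRWLock()
        self.tapped = 0

    def mtap(self, packet):
        # Corresponds to the read-lock step that faults in the quoted trace.
        self.lock.rlock()
        self.tapped += 1


def replay(events):
    """Apply ("send", pkt) and ("detach",) events in order to one interface.

    Returns the index of the first event that touches a destroyed lock,
    or None if every event completes.
    """
    bpf = ToyBpfIf()
    for i, event in enumerate(events):
        try:
            if event[0] == "detach":
                bpf.lock.destroy()
            elif event[0] == "send":
                bpf.mtap(event[1])
        except RuntimeError:
            return i
    return None


if __name__ == "__main__":
    # A delayed packet is delivered after the interface has gone away.
    print(replay([("send", "a"), ("detach",), ("send", "b")]))
```

In this model a send that happens before the detach completes normally, and only a send delivered afterwards trips over the destroyed lock, which would fit a panic that shows up mainly with a high rate of session churn.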
I just opened a new bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199096 since my error is similar but different.
A known problem in ng_iface(4), described in PR 220076, is believed to be the source of this problem too.

*** This bug has been marked as a duplicate of bug 220076 ***