PCI-X dual-port Intel NICs crash the box (see crash dump below). The box is running as a firewall. The crashes do not occur with different NICs in the same setup (pf enabled, IPv4/IPv6 enabled, gif0 enabled).

----------------------------------------
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0xffff804039eb8170
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff808c01b9
stack pointer = 0x28:0xffffff802c9fba40
frame pointer = 0x28:0xffffff00174c1100
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (em0 taskq)
trap number = 12
panic: page fault
cpuid = 1
Uptime: 49m0s
Physical memory: 2033 MB
Dumping 1359 MB:panic: bufwrite: buffer is not busy???
cpuid = 1
 1344 1328 1312 1296 1280 1264 1248 1232

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0xffffffffffff80ff
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffff0001a94ab0
stack pointer = 0x28:0xffffff80000d1b30
frame pointer = 0x28:0x0
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (irq22: em3 atapci0)
trap number = 12
panic: page fault
cpuid = 1
 1216 1200 1184 1168 1152 1136 1120 1104 1088 1072 1056 1040 1024 1008 992 976
 960 944 928 912 896 880 864 848 832 816 800 784 768 752 736 720 704 688 672 656
 640 624 608 592 576 560 544 528 512 496 480 464 448 432 416 400 384 368 352 336
 320 304 288 272 256 240 224 208 192 176 160 144 128 112 96 80 64 48 32 16

#0 doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
 in pcpu.h
(kgdb) where
#0 doadump () at pcpu.h:223
#1 0x0000000000000004 in ?? ()
#2 0xffffffff805bbc29 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
#3 0xffffffff805bc022 in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:579
#4 0xffffffff808c8600 in trap_fatal (frame=0xffffff0001a94ab0, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:852
#5 0xffffffff808c89d5 in trap_pfault (frame=0xffffff802c9fb990, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:768
#6 0xffffffff808c92cc in trap (frame=0xffffff802c9fb990) at /usr/src/sys/amd64/amd64/trap.c:494
#7 0xffffffff808af9b3 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#8 0xffffffff808c01b9 in pmap_kextract (va=254734608629831) at /usr/src/sys/amd64/amd64/pmap.c:1048
#9 0xffffffff808ae5c8 in bus_dmamap_load_mbuf_sg (dmat=0xffffff0001ad0a00, map=0xffffffff80ca6700, m0=Variable "m0" is not available.
) at /usr/src/sys/amd64/amd64/busdma_machdep.c:653
#10 0xffffffff8036b582 in em_get_buf (adapter=0xffffff80003b8000, i=845) at /usr/src/sys/dev/e1000/if_em.c:4041
#11 0xffffffff8036ea2d in em_rxeof (adapter=0xffffff80003b8000, count=99) at /usr/src/sys/dev/e1000/if_em.c:4439
#12 0xffffffff80373f5e in em_handle_rxtx (context=Variable "context" is not available.
) at /usr/src/sys/dev/e1000/if_em.c:1660
#13 0xffffffff805f665b in taskqueue_run (queue=0xffffff0001c22580) at /usr/src/sys/kern/subr_taskqueue.c:239
#14 0xffffffff805f68c5 in taskqueue_thread_loop (arg=Variable "arg" is not available.
) at /usr/src/sys/kern/subr_taskqueue.c:360
#15 0xffffffff805939cd in fork_exit (callout=0xffffffff805f6880 <taskqueue_thread_loop>, arg=0xffffff80003bc7d0, frame=0xffffff802c9fbc80) at /usr/src/sys/kern/kern_fork.c:843
#16 0xffffffff808afe0e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:561
#17 0x0000000000000000 in ?? ()
#18 0x0000000000000000 in ?? ()
#19 0x0000000000000000 in ?? ()
#20 0x0000000000000000 in ?? ()
#21 0x0000000000000000 in ?? ()
#22 0x0000000000000000 in ?? ()
#23 0x0000000000000000 in ?? ()
#24 0x0000000000000000 in ?? ()
#25 0x0000000000000000 in ?? ()
#26 0x0000000000000000 in ?? ()
#27 0x0000000000000000 in ?? ()
#28 0x0000000000000000 in ?? ()
#29 0x0000000000000000 in ?? ()
#30 0x0000000000000000 in ?? ()
#31 0x0000000000000000 in ?? ()
#32 0x0000000000000000 in ?? ()
#33 0x0000000000000000 in ?? ()
#34 0x0000000000000000 in ?? ()
#35 0x0000000000000000 in ?? ()
#36 0x0000000000000000 in ?? ()
#37 0x0000000000000000 in ?? ()
#38 0x0000000000000000 in ?? ()
#39 0x0000000000000000 in ?? ()
#40 0x0000000000000000 in ?? ()
#41 0x0000000000ed9000 in ?? ()
#42 0xffffffff80c6c600 in tdq_cpu ()
#43 0x0000000000000001 in ?? ()
#44 0xffffff0001a94ab0 in ?? ()
#45 0xffffffff80c6b980 in affinity ()
#46 0xffffffff80c6b980 in affinity ()
#47 0xffffff802c9fb5f8 in ?? ()
#48 0xffffff0001a94ab0 in ?? ()
#49 0xffffffff805df020 in sched_switch (td=0xffffff80003bc7d0, newtd=Variable "newtd" is not available.
) at /usr/src/sys/kern/sched_ule.c:1858
#50 0x0000000000000000 in ?? ()
#51 0x0000000000000000 in ?? ()
---Type <return> to continue, or q <return> to quit---q
Quit
(kgdb) quit

How-To-Repeat: Happens intermittently; no known steps to reproduce.
Responsible Changed
From-To: freebsd-bugs->freebsd-net

Over to maintainer(s).
Additional note: after the first crash, the kernel's /usr/src/sys/dev/e1000 directory was updated to CURRENT (as of Feb 2 2010), but the box kept crashing at the same spot.
This looks like these:
http://forums.freebsd.org/archive/index.php/t-10475.html
http://lists.freebsd.org/pipermail/freebsd-stable/2010-January/054369.html

How about this?
http://www.mail-archive.com/commits@crater.dragonflybsd.org/msg03494.html

--
cheers
mars

-----
Joan Crawford - "I, Joan Crawford, I believe in the dollar. Everything I earn, I spend." - http://www.brainyquote.com/quotes/authors/j/joan_crawford.html
On 7 February 2010 00:59, Mars G Miro <spry@anarchy.in.the.ph> wrote:
> This looks like these:
> http://forums.freebsd.org/archive/index.php/t-10475.html
> http://lists.freebsd.org/pipermail/freebsd-stable/2010-January/054369.html
>
> How about this?
> http://www.mail-archive.com/commits@crater.dragonflybsd.org/msg03494.html
>

I don't know - from the looks of the traces it seems like all of these point to a different bug...
It also crashes at the same spot using igb(4):

------------------------------------------------------------------------------------------------------
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
cessor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (irq260: igb1)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 7m49s
Physical memory: 2031 MB
Dumping 1284 MB: (CTRL-C to abort)
 1269 1253 1237 1221 1205 1189 1173 1157 1141 1125 1109 1093 1077 1061 1045 1029
 1013 997 981 965 949 933 917 901 885 869 853 837 821 805 789 773 757 741 725 709
 693 677 661 645 629 613 597 581 565 549 533 517 501 485 469 453 437 421 405 389
 373 357 341 325 309 293 277 261 245 229 213 197 181 165 149 133 117 101 85 69 53
 37 21 5

Reading symbols from /boot/kernel/if_igb.ko...Reading symbols from /boot/kernel/if_igb.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/if_igb.ko
#0 doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
 in pcpu.h
(kgdb) where
#0 doadump () at pcpu.h:223
#1 0x0000000000000004 in ?? ()
#2 0xffffffff805ba099 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
#3 0xffffffff805ba492 in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:579
#4 0xffffffff808c6a70 in trap_fatal (frame=0xffffff000142fab0, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:852
#5 0xffffffff808c6e45 in trap_pfault (frame=0xffffff80000be910, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:768
#6 0xffffffff808c773c in trap (frame=0xffffff80000be910) at /usr/src/sys/amd64/amd64/trap.c:494
#7 0xffffffff808ade23 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#8 0xffffffff808be629 in pmap_kextract (va=18446465905857658880) at /usr/src/sys/amd64/amd64/pmap.c:1048
#9 0xffffffff808aca38 in bus_dmamap_load_mbuf_sg (dmat=0xffffff00014a9780, map=0xffffffff80ca4880, m0=Variable "m0" is not available.
) at /usr/src/sys/amd64/amd64/busdma_machdep.c:653
#10 0xffffffff80eb44bc in igb_get_buf (rxr=0xffffff0001488800, i=Variable "i" is not available.
) at /usr/src/sys/modules/igb/../../dev/e1000/if_igb.c:3467
#11 0xffffffff80eb4bd5 in igb_rxeof (rxr=0xffffff0001488800, count=99) at /usr/src/sys/modules/igb/../../dev/e1000/if_igb.c:4134
#12 0xffffffff80eb4f75 in igb_msix_rx (arg=Variable "arg" is not available.
) at /usr/src/sys/modules/igb/../../dev/e1000/if_igb.c:1455
#13 0xffffffff80593e90 in intr_event_execute_handlers (p=Variable "p" is not available.
) at /usr/src/sys/kern/kern_intr.c:1165
#14 0xffffffff805953c7 in ithread_loop (arg=0xffffff0001486b20) at /usr/src/sys/kern/kern_intr.c:1178
#15 0xffffffff80591e3d in fork_exit (callout=0xffffffff80595340 <ithread_loop>, arg=0xffffff0001486b20, frame=0xffffff80000bec80) at /usr/src/sys/kern/kern_fork.c:843
#16 0xffffffff808ae27e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:561
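
A side note on frame #8 of the em(4) and igb(4) traces: kgdb prints the va argument of pmap_kextract() in decimal, which hides what the address looks like. The sketch below converts both values to hex and checks them against the amd64 canonical-address rule. It assumes 48-bit virtual addressing (bits 63..47 must all be equal); the helper names to_hex64 and is_canonical are made up for this note and are not part of kgdb or FreeBSD.

```python
# Hypothetical helpers, not part of kgdb or FreeBSD: convert the decimal
# `va` arguments from frame #8 of each backtrace to hex and test whether
# they are canonical amd64 addresses.
# Assumption: 48-bit virtual addressing, so bits 63..47 must all be equal.

MASK64 = 0xFFFFFFFFFFFFFFFF


def to_hex64(va: int) -> str:
    """Format a value as a zero-padded 64-bit hex string."""
    return f"0x{va & MASK64:016x}"


def is_canonical(va: int) -> bool:
    """True if bits 63..47 are all zeros or all ones."""
    top = (va & MASK64) >> 47  # the 17 uppermost bits
    return top == 0 or top == 0x1FFFF


# va values copied verbatim from the two traces in this report.
TRACE_VAS = {
    "em": 254734608629831,
    "igb": 18446465905857658880,
}

if __name__ == "__main__":
    for driver, va in TRACE_VAS.items():
        state = "canonical" if is_canonical(va) else "non-canonical"
        print(f"{driver}: va={to_hex64(va)} ({state})")
```

The two values come out as 0x0000e7ae05dc0047 (em) and 0xffff030200000000 (igb), and neither is canonical. That would be consistent with pmap_kextract() being handed a bogus buffer address in both drivers, but it is only a reading of the traces, not a confirmed diagnosis.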
Responsible Changed
From-To: freebsd-net->jfv

Over to maintainer.
The em(4) function referenced in the stack trace, em_handle_rxtx(), was retired, and its functionality has been split across multiple handlers in stable/10 and head. If this is still a problem on stable/10, reopen this ticket and we'll take a deeper look. lem(4) still has a lem_handle_rxtx(), which makes me think this issue might still apply there.