Bug 208649

Summary: 10.3 release crashing in ipfw or intel drivers
Product: Base System Reporter: h-k
Component: kern Assignee: Mark Linimon <linimon>
Status: Closed Overcome By Events    
Severity: Affects Some People CC: grahamperrin, sbruno
Priority: ---    
Version: 10.3-BETA2   
Hardware: amd64   
OS: Any   

Description h-k 2016-04-09 10:30:28 UTC
We have a bridge with dummynet shaping:

root@pipe2:~ # ifconfig
igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO>
        ether 00:1b:21:36:5d:9e
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
igb1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO>
        ether 00:1b:21:36:5d:9f
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWTSO>
        ether 00:1b:21:46:67:69
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
        ether 00:07:e9:17:71:d9
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:50:90:8e:8f:00
        inet 192.168.8.2 netmask 0xffffff80 broadcast 192.168.8.127
        nd6 options=9<PERFORMNUD,IFDISABLED>
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: igb1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 2 priority 128 path cost 2000000
        member: igb0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 1 priority 128 path cost 2000000
bridge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 00:1b:21:46:67:69
        nd6 options=9<PERFORMNUD,IFDISABLED>
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: em1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 4 priority 128 path cost 2000000
        member: em0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 3 priority 128 path cost 2000000


root@pipe2:~ # ipfw show
00005      2        92 allow ip from any to any mac-type 0x0806 layer2
00006      0         0 deny ip from any to 217.117.112.144/28,217.117.125.152/29 layer2 // NAT IPs vo izbejanie ziklov
00010 371789  22100077 allow ip from table(1) to any layer2
00010 632790 932188391 allow ip from any to table(1) layer2
00020      0         0 deny ip from any to 192.168.0.0/16 out via igb1 layer2
00500      0         0 deny ip from table(5) to any dst-port 25 layer2 // Denys from TestServer
00501      0         0 deny ip from 192.168.0.0/16 to any dst-port 25 layer2 // maill forr fake IP
00502      0         0 deny ip from table(6) to any layer2 // Denys from TestServer
00602      0         0 skipto 825 ip from table(52) to any dst-port 53,80,443 layer2
00604      0         0 skipto 825 ip from any 53,80,443 to table(52) layer2
00822      0         0 deny ip from table(52) to not table(2) layer2 // Tabl52-vse blokirov IP tabl2-open hosts
00849      0         0 allow udp from not 217.117.112.0/20,192.168.128.0/20 123 to 217.117.112.0/20,192.168.128.0/20 layer2
00850      0         0 deny udp from not 217.117.112.0/26 123 to any not dst-port 123 layer2
05502      0         0 deny udp from table(111) to any not dst-port 3658,5730-5739,27005-27095,5060 layer2 // zashita ot uTP
05504      0         0 deny udp from any not 3658,5730-5739,27005-27095,5060 to table(111) layer2 // zashita ot uTP
55010      0         0 pipe 30 ip from table(30) to any layer2 // 50 MBit/sec
55020    787     63185 pipe 80 ip from any to table(30) layer2 // 50 MBit/sec
55030      0         0 pipe 37 ip from table(37) to any layer2 // 75 MBit/sec
60000      8       772 deny ip from any to any layer2
65535      0         0 allow ip from any to any


The OS crashes at random with the following stack:

Apr  9 10:42:03 pipe2 kernel: Fatal trap 12: page fault while in kernel mode
Apr  9 10:42:03 pipe2 kernel: cpuid = 0; apic id = 00
Apr  9 10:42:03 pipe2 kernel: fault virtual address     = 0x188
Apr  9 10:42:03 pipe2 kernel: fault code                = supervisor read data, page not present
Apr  9 10:42:03 pipe2 kernel: instruction pointer       = 0x20:0xffffffff80a1b31f
Apr  9 10:42:03 pipe2 kernel: stack pointer             = 0x28:0xfffffe0090ff6620
Apr  9 10:42:03 pipe2 kernel: frame pointer             = 0x28:0xfffffe0090ff6640
Apr  9 10:42:03 pipe2 kernel: code segment              = base rx0, limit 0xfffff, type 0x1b
Apr  9 10:42:03 pipe2 kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Apr  9 10:42:03 pipe2 kernel: processor eflags  = interrupt enabled, resume, IOPL = 0
Apr  9 10:42:03 pipe2 kernel: current process           = 12 (irq256: igb0:que 0)
Apr  9 10:42:03 pipe2 kernel: trap number               = 12
Apr  9 10:42:03 pipe2 kernel: panic: page fault
Apr  9 10:42:03 pipe2 kernel: cpuid = 0
Apr  9 10:42:03 pipe2 kernel: KDB: stack backtrace:
Apr  9 10:42:03 pipe2 kernel: #0 0xffffffff8098e390 at kdb_backtrace+0x60
Apr  9 10:42:03 pipe2 kernel: #1 0xffffffff80951066 at vpanic+0x126
Apr  9 10:42:03 pipe2 kernel: #2 0xffffffff80950f33 at panic+0x43
Apr  9 10:42:03 pipe2 kernel: #3 0xffffffff80d55f7b at trap_fatal+0x36b
Apr  9 10:42:03 pipe2 kernel: #4 0xffffffff80d5627d at trap_pfault+0x2ed
Apr  9 10:42:03 pipe2 kernel: #5 0xffffffff80d558fa at trap+0x47a
Apr  9 10:42:03 pipe2 kernel: #6 0xffffffff80d3b8d2 at calltrap+0x8
Apr  9 10:42:03 pipe2 kernel: #7 0xffffffff819e3f05 at dummynet_send+0x95
Apr  9 10:42:03 pipe2 kernel: #8 0xffffffff819e4307 at dummynet_io+0x357
Apr  9 10:42:03 pipe2 kernel: #9 0xffffffff819c13ae at ipfw_check_frame+0x23e
Apr  9 10:42:03 pipe2 kernel: #10 0xffffffff80a24ef4 at pfil_run_hooks+0x84
Apr  9 10:42:03 pipe2 kernel: #11 0xffffffff80a1b370 at ether_demux+0x40
Apr  9 10:42:03 pipe2 kernel: #12 0xffffffff80a1c0fe at ether_nh_input+0x35e
Apr  9 10:42:03 pipe2 kernel: #13 0xffffffff80a24092 at netisr_dispatch_src+0x62
Apr  9 10:42:03 pipe2 kernel: #14 0xffffffff804f859c at igb_rxeof+0x60c
Apr  9 10:42:03 pipe2 kernel: #15 0xffffffff804f8c41 at igb_msix_que+0x121
Apr  9 10:42:03 pipe2 kernel: #16 0xffffffff8091c99b at intr_event_execute_handlers+0xab
Apr  9 10:42:03 pipe2 kernel: #17 0xffffffff8091cde6 at ithread_loop+0x96
Apr  9 10:42:03 pipe2 kernel: Uptime: 4h58m8s
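
The addresses in the trap message and backtrace above can be cross-checked with a little arithmetic. The following is only a sketch for reading the log, not a diagnosis: it subtracts each frame's offset from its address to get the start of the function, then compares the faulting instruction pointer against those starts. The frame lines and the two fault values are copied from the log; the helper name `parse_frames` and the variable names are made up for illustration.

```python
import re

# Frame lines copied from the KDB backtrace above (syslog prefix removed).
BACKTRACE = """\
#7 0xffffffff819e3f05 at dummynet_send+0x95
#8 0xffffffff819e4307 at dummynet_io+0x357
#9 0xffffffff819c13ae at ipfw_check_frame+0x23e
#10 0xffffffff80a24ef4 at pfil_run_hooks+0x84
#11 0xffffffff80a1b370 at ether_demux+0x40
#12 0xffffffff80a1c0fe at ether_nh_input+0x35e
"""

FAULT_IP = 0xffffffff80a1b31f  # "instruction pointer" from the trap message
FAULT_VA = 0x188               # "fault virtual address" from the trap message

FRAME_RE = re.compile(r"#(\d+) (0x[0-9a-f]+) at (\w+)\+(0x[0-9a-f]+)")


def parse_frames(text):
    """Return (frame number, function, address, function start) tuples."""
    frames = []
    for m in FRAME_RE.finditer(text):
        addr = int(m.group(2), 16)
        offset = int(m.group(4), 16)
        # function start = printed address - offset within the function
        frames.append((int(m.group(1)), m.group(3), addr, addr - offset))
    return frames


frames = parse_frames(BACKTRACE)
for num, func, addr, start in frames:
    print(f"#{num:<2} {func:<18} addr {addr:#x}  function start {start:#x}")

starts = {func: start for _, func, _, start in frames}
gap = starts["ether_demux"] - FAULT_IP
print(f"faulting IP is {gap:#x} bytes below the start of ether_demux")
print(f"fault address {FAULT_VA:#x} lies in the first 4 KiB page: {FAULT_VA < 0x1000}")
```

With the values from this report, the faulting instruction pointer comes out 0x11 bytes below the start of ether_demux, so it appears to lie in whichever function precedes ether_demux in the kernel text rather than in the dummynet frames at 0xffffffff819e.... The log alone does not say which function that is. A fault address as small as 0x188 is consistent with a field access through a NULL structure pointer, but that is an assumption that only the crash dumps could confirm.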

9.3 boxes work fine with the same ipfw ruleset and otherwise identical configuration.

root@pipe2:~ # pciconf -lv
hostb0@pci0:0:0:0:      class=0x060000 card=0x836d1043 chip=0x2e308086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '4 Series Chipset DRAM Controller'
    class      = bridge
    subclass   = HOST-PCI
pcib1@pci0:0:1:0:       class=0x060400 card=0x836d1043 chip=0x2e318086 rev=0x03 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '4 Series Chipset PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI
vgapci0@pci0:0:2:0:     class=0x030000 card=0x836d1043 chip=0x2e328086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '4 Series Chipset Integrated Graphics Controller'
    class      = display
    subclass   = VGA
hdac0@pci0:0:27:0:      class=0x040300 card=0x83f31043 chip=0x27d88086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'NM10/ICH7 Family High Definition Audio Controller'
    class      = multimedia
    subclass   = HDA
pcib2@pci0:0:28:0:      class=0x060400 card=0x81791043 chip=0x27d08086 rev=0x01 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = 'NM10/ICH7 Family PCI Express Port 1'
    class      = bridge
    subclass   = PCI-PCI
uhci0@pci0:0:29:0:      class=0x0c0300 card=0x81791043 chip=0x27c88086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'NM10/ICH7 Family USB UHCI Controller'
    class      = serial bus
    subclass   = USB
uhci1@pci0:0:29:1:      class=0x0c0300 card=0x81791043 chip=0x27c98086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'NM10/ICH7 Family USB UHCI Controller'
    class      = serial bus
    subclass   = USB
uhci2@pci0:0:29:2:      class=0x0c0300 card=0x81791043 chip=0x27ca8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'NM10/ICH7 Family USB UHCI Controller'
    class      = serial bus
    subclass   = USB
uhci3@pci0:0:29:3:      class=0x0c0300 card=0x81791043 chip=0x27cb8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'NM10/ICH7 Family USB UHCI Controller'
    class      = serial bus
    subclass   = USB
ehci0@pci0:0:29:7:      class=0x0c0320 card=0x81791043 chip=0x27cc8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'NM10/ICH7 Family USB2 EHCI Controller'
    class      = serial bus
    subclass   = USB
pcib3@pci0:0:30:0:      class=0x060401 card=0x81791043 chip=0x244e8086 rev=0xe1 hdr=0x01
    vendor     = 'Intel Corporation'
    device     = '82801 PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
isab0@pci0:0:31:0:      class=0x060100 card=0x81791043 chip=0x27b88086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801GB/GR (ICH7 Family) LPC Interface Bridge'
    class      = bridge
    subclass   = PCI-ISA
atapci0@pci0:0:31:1:    class=0x01018a card=0x81791043 chip=0x27df8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801G (ICH7 Family) IDE Controller'
    class      = mass storage
    subclass   = ATA
atapci1@pci0:0:31:2:    class=0x01018f card=0x81791043 chip=0x27c08086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'NM10/ICH7 Family SATA Controller [IDE mode]'
    class      = mass storage
    subclass   = ATA
igb0@pci0:1:0:0:        class=0x020000 card=0xa03c8086 chip=0x10c98086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82576 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
igb1@pci0:1:0:1:        class=0x020000 card=0xa03c8086 chip=0x10c98086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82576 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
em0@pci0:2:0:0: class=0x020000 card=0xa01f8086 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82574L Gigabit Network Connection'
    class      = network
    subclass   = ethernet
em1@pci0:3:0:0: class=0x020000 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82540EM Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet


Three kernel dumps can be found here: https://mail.proc.ru/temp/xz.zip
Comment 1 Hiren Panchasara freebsd_committer freebsd_triage 2016-04-11 17:46:52 UTC
I am not sure what could cause this. I don't see many changes to that code except r272089 (from Sean, cc'd), but I don't think that can be the cause of your crash.

I am unsure whether any change in igb could cause this.

You are reporting that this worked fine on 9.3. Is it possible for you to bisect this further? Could you try 10.1 or 10.2 and see whether the crash occurs there?
Comment 2 h-k 2016-04-11 17:51:52 UTC
Yes, I can try 10.1 or 10.2, but not right away; probably this weekend (16-17 Apr).
Comment 3 Sean Bruno freebsd_committer freebsd_triage 2016-04-11 18:18:23 UTC
(In reply to h-k from comment #0)
The crash dumps at the URL referred to don't seem to exist. :-(
Comment 4 h-k 2016-04-12 05:38:39 UTC
(In reply to Sean Bruno from comment #3)

My fault: the file was wiped by a cron job.
The correct URL is now https://mail.proc.ru/xz.tar.bz
Comment 5 Graham Perrin freebsd_committer freebsd_triage 2021-10-11 01:30:29 UTC
With 10.3 and 11.4 end of life, is this still an issue?
Comment 6 Mark Linimon freebsd_committer freebsd_triage 2023-05-19 10:32:46 UTC
Feedback timeout.

To submitter: I'm sorry that this PR was allowed to get stale.