pf seems to have a problem reassembling large UDP packets. The problem was announced on the pf mailing list by rpaulo@:
http://lists.freebsd.org/pipermail/freebsd-pf/2013-December/007265.html

How-To-Repeat:
I've managed to reproduce the problem, but with pf in transparent mode. A full explanation of how to reproduce it is here:
http://lists.freebsd.org/pipermail/freebsd-pf/2014-January/007277.html
Reproducing this problem on 10.1-RELEASE-p4 has a bigger impact: the system crashes.

[root@bridge-firewall]~# [zone: mbuf] kern.ipc.nmbufs limit reached

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x1d
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff81814f07
stack pointer = 0x28:0xfffffe00003f2810
frame pointer = 0x28:0xfffffe00003f2890
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (vtnet0 rxq 0)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff807d6e30 at kdb_backtrace+0x60
#1 0xffffffff8079f735 at panic+0x155
#2 0xffffffff80b263bf at trap_fatal+0x38f
#3 0xffffffff80b266d8 at trap_pfault+0x308
#4 0xffffffff80b25d3a at trap+0x47a
#5 0xffffffff80b0bdf2 at calltrap+0x8
#6 0xffffffff818154b5 at bridge_forward+0x2d5
#7 0xffffffff81813c55 at bridge_input+0x555
#8 0xffffffff8085ac35 at ether_nh_input+0x2a5
#9 0xffffffff80862732 at netisr_dispatch_src+0x62
#10 0xffffffff80bf85a3 at vtnet_rxq_eof+0x793
#11 0xffffffff80bf888a at vtnet_rxq_tq_intr+0x5a
#12 0xffffffff807e52a5 at taskqueue_run_locked+0xe5
#13 0xffffffff807e5d38 at taskqueue_thread_loop+0xa8
#14 0xffffffff807732ca at fork_exit+0x9a
#15 0xffffffff80b0c32e at fork_trampoline+0xe
Uptime: 6m17s
Consoles: userboot
Same problem on -current r282520:
- Corrupted reassembled packet leaving the bridge
- Crash

As an example, a simple large ping:

ping -c 1 -s 1500 10.0.0.3

produces this tcpdump output on the INCOMING pf-bridge interface:

[root@R2]~# tcpdump -pni em0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:03:41.790409 IP 10.0.0.1 > 10.0.0.3: ICMP echo request, id 62723, seq 0, length 1480
11:03:41.790434 IP 10.0.0.1 > 10.0.0.3: ip-proto-1

But it produces this tcpdump output on the OUTGOING pf-bridge interface:

[root@R2]~# tcpdump -pni em1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:03:54.863303 20:00:40:01:df:91 > 45:00:05:dc:61:8c, ethertype Unknown (0x0a00), length 1500:
0x0000: 0001 0a00 0003 0800 3b06 f703 0000 5549 ........;.....UI
0x0010: f51b 0001 c0ed 0809 0a0b 0c0d 0e0f 1011 ................
0x0020: 1213 1415 1617 1819 1a1b 1c1d 1e1f 2021 ...............!
0x0030: 2223 2425 2627 2829 2a2b 2c2d 2e2f 3031 "#$%&'()*+,-./01
0x0040: 3233 3435 3637 3839 3a3b 3c3d 3e3f 4041 23456789:;<=>?@A
0x0050: 4243 4445 4647 4849 4a4b 4c4d 4e4f 5051 BCDEFGHIJKLMNOPQ
0x0060: 5253 5455 5657 5859 5a5b 5c5d 5e5f 6061 RSTUVWXYZ[\]^_`a
0x0070: 6263 6465 6667 6869 6a6b 6c6d 6e6f 7071 bcdefghijklmnopq
0x0080: 7273 7475 7677 7879 7a7b 7c7d 7e7f 8081 rstuvwxyz{|}~...
0x0090: 8283 8485 8687 8889 8a8b 8c8d 8e8f 9091 ................
0x00a0: 9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1 ................
0x00b0: a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1 ................
0x00c0: b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1 ................
0x00d0: c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1 ................
0x00e0: d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1 ................
0x00f0: e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1 ................
0x0100: f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001 ................
0x0110: 0203 0405 0607 0809 0a0b 0c0d 0e0f 1011 ................ 0x0120: 1213 1415 1617 1819 1a1b 1c1d 1e1f 2021 ...............! 0x0130: 2223 2425 2627 2829 2a2b 2c2d 2e2f 3031 "#$%&'()*+,-./01 0x0140: 3233 3435 3637 3839 3a3b 3c3d 3e3f 4041 23456789:;<=>?@A 0x0150: 4243 4445 4647 4849 4a4b 4c4d 4e4f 5051 BCDEFGHIJKLMNOPQ 0x0160: 5253 5455 5657 5859 5a5b 5c5d 5e5f 6061 RSTUVWXYZ[\]^_`a 0x0170: 6263 6465 6667 6869 6a6b 6c6d 6e6f 7071 bcdefghijklmnopq 0x0180: 7273 7475 7677 7879 7a7b 7c7d 7e7f 8081 rstuvwxyz{|}~... 0x0190: 8283 8485 8687 8889 8a8b 8c8d 8e8f 9091 ................ 0x01a0: 9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1 ................ 0x01b0: a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1 ................ 0x01c0: b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1 ................ 0x01d0: c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1 ................ 0x01e0: d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1 ................ 0x01f0: e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1 ................ 0x0200: f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001 ................ 0x0210: 0203 0405 0607 0809 0a0b 0c0d 0e0f 1011 ................ 0x0220: 1213 1415 1617 1819 1a1b 1c1d 1e1f 2021 ...............! 0x0230: 2223 2425 2627 2829 2a2b 2c2d 2e2f 3031 "#$%&'()*+,-./01 0x0240: 3233 3435 3637 3839 3a3b 3c3d 3e3f 4041 23456789:;<=>?@A 0x0250: 4243 4445 4647 4849 4a4b 4c4d 4e4f 5051 BCDEFGHIJKLMNOPQ 0x0260: 5253 5455 5657 5859 5a5b 5c5d 5e5f 6061 RSTUVWXYZ[\]^_`a 0x0270: 6263 6465 6667 6869 6a6b 6c6d 6e6f 7071 bcdefghijklmnopq 0x0280: 7273 7475 7677 7879 7a7b 7c7d 7e7f 8081 rstuvwxyz{|}~... 0x0290: 8283 8485 8687 8889 8a8b 8c8d 8e8f 9091 ................ 0x02a0: 9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1 ................ 0x02b0: a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1 ................ 0x02c0: b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1 ................ 0x02d0: c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1 ................ 0x02e0: d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1 ................ 
0x02f0: e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1 ................ 0x0300: f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001 ................ 0x0310: 0203 0405 0607 0809 0a0b 0c0d 0e0f 1011 ................ 0x0320: 1213 1415 1617 1819 1a1b 1c1d 1e1f 2021 ...............! 0x0330: 2223 2425 2627 2829 2a2b 2c2d 2e2f 3031 "#$%&'()*+,-./01 0x0340: 3233 3435 3637 3839 3a3b 3c3d 3e3f 4041 23456789:;<=>?@A 0x0350: 4243 4445 4647 4849 4a4b 4c4d 4e4f 5051 BCDEFGHIJKLMNOPQ 0x0360: 5253 5455 5657 5859 5a5b 5c5d 5e5f 6061 RSTUVWXYZ[\]^_`a 0x0370: 6263 6465 6667 6869 6a6b 6c6d 6e6f 7071 bcdefghijklmnopq 0x0380: 7273 7475 7677 7879 7a7b 7c7d 7e7f 8081 rstuvwxyz{|}~... 0x0390: 8283 8485 8687 8889 8a8b 8c8d 8e8f 9091 ................ 0x03a0: 9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1 ................ 0x03b0: a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1 ................ 0x03c0: b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1 ................ 0x03d0: c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1 ................ 0x03e0: d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1 ................ 0x03f0: e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1 ................ 0x0400: f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001 ................ 0x0410: 0203 0405 0607 0809 0a0b 0c0d 0e0f 1011 ................ 0x0420: 1213 1415 1617 1819 1a1b 1c1d 1e1f 2021 ...............! 0x0430: 2223 2425 2627 2829 2a2b 2c2d 2e2f 3031 "#$%&'()*+,-./01 0x0440: 3233 3435 3637 3839 3a3b 3c3d 3e3f 4041 23456789:;<=>?@A 0x0450: 4243 4445 4647 4849 4a4b 4c4d 4e4f 5051 BCDEFGHIJKLMNOPQ 0x0460: 5253 5455 5657 5859 5a5b 5c5d 5e5f 6061 RSTUVWXYZ[\]^_`a 0x0470: 6263 6465 6667 6869 6a6b 6c6d 6e6f 7071 bcdefghijklmnopq 0x0480: 7273 7475 7677 7879 7a7b 7c7d 7e7f 8081 rstuvwxyz{|}~... 0x0490: 8283 8485 8687 8889 8a8b 8c8d 8e8f 9091 ................ 0x04a0: 9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1 ................ 0x04b0: a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1 ................ 0x04c0: b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1 ................ 
0x04d0: c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1 ................ 0x04e0: d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1 ................ 0x04f0: e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1 ................ 0x0500: f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001 ................ 0x0510: 0203 0405 0607 0809 0a0b 0c0d 0e0f 1011 ................ 0x0520: 1213 1415 1617 1819 1a1b 1c1d 1e1f 2021 ...............! 0x0530: 2223 2425 2627 2829 2a2b 2c2d 2e2f 3031 "#$%&'()*+,-./01 0x0540: 3233 3435 3637 3839 3a3b 3c3d 3e3f 4041 23456789:;<=>?@A 0x0550: 4243 4445 4647 4849 4a4b 4c4d 4e4f 5051 BCDEFGHIJKLMNOPQ 0x0560: 5253 5455 5657 5859 5a5b 5c5d 5e5f 6061 RSTUVWXYZ[\]^_`a 0x0570: 6263 6465 6667 6869 6a6b 6c6d 6e6f 7071 bcdefghijklmnopq 0x0580: 7273 7475 7677 7879 7a7b 7c7d 7e7f 8081 rstuvwxyz{|}~... 0x0590: 8283 8485 8687 8889 8a8b 8c8d 8e8f 9091 ................ 0x05a0: 9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1 ................ 0x05b0: a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1 ................ 0x05c0: b2b3 b4b5 b6b7 b8b9 babb bcbd bebf .............. 11:03:54.863318 00:b9:40:01:04:85 > 45:00:00:30:61:8c, ethertype Unknown (0x0a00), length 48: 0x0000: 0001 0a00 0003 c0c1 c2c3 c4c5 c6c7 c8c9 ................ 0x0010: cacb cccd cecf d0d1 d2d3 d4d5 d6d7 d8d9 ................ 0x0020: dadb .. 
And when pushing multiple fragmented packets, it crashes:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x1c
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff81a14b13
stack pointer = 0x28:0xfffffe00003857f0
frame pointer = 0x28:0xfffffe0000385860
code segment = base rx0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (em0 taskq)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff808582c7 at kdb_backtrace+0x67
#1 0xffffffff808188a9 at vpanic+0x189
#2 0xffffffff80818713 at panic+0x43
#3 0xffffffff80be93d9 at trap_fatal+0x379
#4 0xffffffff80be970e at trap_pfault+0x31e
#5 0xffffffff80be8d51 at trap+0x461
#6 0xffffffff80bcc7b2 at calltrap+0x8
#7 0xffffffff81a150e7 at bridge_forward+0x2f7
#8 0xffffffff81a137cc at bridge_input+0x5dc
#9 0xffffffff809073b3 at ether_nh_input+0x2d3
#10 0xffffffff80910231 at netisr_dispatch_src+0x61
#11 0xffffffff80906ab6 at ether_input+0x26
#12 0xffffffff80902cda at if_input+0xa
#13 0xffffffff804734d0 at lem_rxeof+0x4c0
#14 0xffffffff80473b54 at lem_handle_rxtx+0x34
#15 0xffffffff8086b519 at taskqueue_run_locked+0x139
#16 0xffffffff8086c318 at taskqueue_thread_loop+0xc8
#17 0xffffffff807df92a at fork_exit+0x9a
Uptime: 6m18s
Hi,

We really need this bug to be fixed. It prevents us from deploying new projects, as we are heavy users of transparent mode.

Thanks!
(In reply to Jerome Toutee from comment #3) Hi Jerome, I'm not able to reproduce this on CURRENT. Can you confirm that you can still reproduce it there?
Let me restart my virtual lab on -current (same version for all VMs):

root@VM2:~ # uname -a
FreeBSD 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r304964M: Sun Aug 28 21:49:48 CEST 2016 olivier@lame4.bsdrp.net:/usr/obj/BSDRP12.amd64/usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64 amd64

Simple lab diagram:

VM 1 (vtnet0)------(vtnet0) VM2 (vtnet1) -------- (vtnet1) VM 3

VM 1 setup:

sysrc ifconfig_vtnet0="inet 10.0.0.1/24"
service netif restart

VM 2 setup:

sysrc ifconfig_vtnet0="up"
sysrc ifconfig_vtnet1="up"
sysrc cloned_interfaces="bridge0"
sysrc ifconfig_bridge0="addm vtnet0 addm vtnet1 up"
sysrc pf_enable=yes
cat > /etc/pf.conf <<EOF
set skip on lo0
scrub
pass
EOF
service netif restart
service pf start

VM 3 setup:

sysrc ifconfig_vtnet1="inet 10.0.0.3/24"
service netif restart

Now a standard ping works, but a fragmented one does not (same problem with UDP). Example from VM1 to VM3:

root@VM1:~ # ping -c 1 10.0.0.3
PING 10.0.0.3 (10.0.0.3): 56 data bytes
64 bytes from 10.0.0.3: icmp_seq=0 ttl=64 time=0.258 ms

--- 10.0.0.3 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.258/0.258/0.258/0.000 ms

=> Works with "standard size" (non-fragmented) ICMP ping.

root@:~ # ping -c 1 -s 1500 10.0.0.3
PING 10.0.0.3 (10.0.0.3): 1500 data bytes

--- 10.0.0.3 ping statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss

=> But not with fragmented ICMP.

A tcpdump on VM2 or VM3 shows the same "corrupted" IP packet being generated:

root@VM2:~ # tcpdump -vv -pnei vtnet1
tcpdump: listening on vtnet1, link-type EN10MB (Ethernet), capture size 262144 bytes
09:39:59.656215 20:00:40:01:33:fa > 45:00:05:dc:0d:24, ethertype Unknown (0x0a00), length 1500:
0x0000: 0001 0a00 0003 0800 12d1 b907 0000 57c4 ..............W.
0x0010: 02ef 000a 16c8 0809 0a0b 0c0d 0e0f 1011 ................
0x0020: 1213 1415 1617 1819 1a1b 1c1d 1e1f 2021 ...............!
0x0030: 2223 2425 2627 2829 2a2b 2c2d 2e2f 3031 "#$%&'()*+,-./01
0x0040: 3233 3435 3637 3839 3a3b 3c3d 3e3f 4041 23456789:;<=>?@A
0x0050: 4243 4445 4647 4849 4a4b 4c4d 4e4f 5051 BCDEFGHIJKLMNOPQ
0x0060: 5253 5455 5657 5859 5a5b 5c5d 5e5f 6061 RSTUVWXYZ[\]^_`a
0x0070: 6263 6465 6667 6869 6a6b 6c6d 6e6f 7071 bcdefghijklmnopq
0x0080: 7273 7475 7677 7879 7a7b 7c7d 7e7f 8081 rstuvwxyz{|}~...
0x0090: 8283 8485 8687 8889 8a8b 8c8d 8e8f 9091 ................
0x00a0: 9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1 ................
0x00b0: a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1 ................
0x00c0: b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1 ................
0x00d0: c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1 ................
0x00e0: d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1 ................
0x00f0: e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1 ................
0x0100: f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001 ................
0x0110: 0203 0405 0607 0809 0a0b 0c0d 0e0f 1011 ................
0x0120: 1213 1415 1617 1819 1a1b 1c1d 1e1f 2021 ...............!
0x0130: 2223 2425 2627 2829 2a2b 2c2d 2e2f 3031 "#$%&'()*+,-./01
0x0140: 3233 3435 3637 3839 3a3b 3c3d 3e3f 4041 23456789:;<=>?@A
0x0150: 4243 4445 4647 4849 4a4b 4c4d 4e4f 5051 BCDEFGHIJKLMNOPQ
0x0160: 5253 5455 5657 5859 5a5b 5c5d 5e5f 6061 RSTUVWXYZ[\]^_`a
0x0170: 6263 6465 6667 6869 6a6b 6c6d 6e6f 7071 bcdefghijklmnopq
0x0180: 7273 7475 7677 7879 7a7b 7c7d 7e7f 8081 rstuvwxyz{|}~...
(etc.)

If I remove the "scrub" pf feature, the problem disappears.
I've generated a core dump and start kgdb on it: There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"... Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x1c fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff8221c218 stack pointer = 0x28:0xfffffe000dff36c0 frame pointer = 0x28:0xfffffe000dff3730 code segment = base rx0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 11 (irq267: virtio_pci1) trap number = 12 panic: page fault cpuid = 0 KDB: stack backtrace: #0 0xffffffff809590b7 at kdb_backtrace+0x67 #1 0xffffffff80911f32 at vpanic+0x182 #2 0xffffffff80911da3 at panic+0x43 #3 0xffffffff80d36c11 at trap_fatal+0x351 #4 0xffffffff80d36e03 at trap_pfault+0x1e3 #5 0xffffffff80d3638c at trap+0x26c #6 0xffffffff80d19e71 at calltrap+0x8 #7 0xffffffff8221dd74 at bridge_forward+0x304 #8 0xffffffff8221d0ce at bridge_input+0x5de #9 0xffffffff80a1a290 at ether_nh_input+0x2a0 #10 0xffffffff80a30c05 at netisr_dispatch_src+0xa5 #11 0xffffffff80a19936 at ether_input+0x26 #12 0xffffffff807f0c6c at vtnet_rxq_eof+0x84c #13 0xffffffff807f1be3 at vtnet_rx_vq_intr+0x93 #14 0xffffffff808d68ef at intr_event_execute_handlers+0x20f #15 0xffffffff808d6b56 at ithread_loop+0xc6 #16 0xffffffff808d3535 at fork_exit+0x85 #17 0xffffffff80d1a3ae at fork_trampoline+0xe Uptime: 2m55s Dumping 113 out of 224 MB:..15%..29%..43%..57%..71%..85%..99% Reading symbols from /data/debug/boot/kernel/if_bridge.ko.debug...done. Loaded symbols for /data/debug/boot/kernel/if_bridge.ko.debug Reading symbols from /boot/kernel/bridgestp.ko...done. Loaded symbols for /boot/kernel/bridgestp.ko Reading symbols from /boot/kernel/pf.ko...done. 
Loaded symbols for /boot/kernel/pf.ko #0 doadump (textdump=<value optimized out>) at pcpu.h:221 221 pcpu.h: No such file or directory. in pcpu.h (kgdb) bt #0 doadump (textdump=<value optimized out>) at pcpu.h:221 #1 0xffffffff809119b9 in kern_reboot (howto=260) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_shutdown.c:366 #2 0xffffffff80911f6b in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_shutdown.c:759 #3 0xffffffff80911da3 in panic (fmt=0x0) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_shutdown.c:690 #4 0xffffffff80d36c11 in trap_fatal (frame=0xfffffe000dff3610, eva=28) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/trap.c:841 #5 0xffffffff80d36e03 in trap_pfault (frame=0xfffffe000dff3610, usermode=0) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/trap.c:691 #6 0xffffffff80d3638c in trap (frame=0xfffffe000dff3610) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/trap.c:442 #7 0xffffffff80d19e71 in calltrap () at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/exception.S:236 #8 0xffffffff8221c218 in bridge_pfil (mp=<value optimized out>, bifp=<value optimized out>, ifp=0xfffff8000329f000, dir=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:3511 #9 0xffffffff8221dd74 in bridge_forward (sc=<value optimized out>, sbif=<value optimized out>, m=0x0) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:2265 #10 0xffffffff8221d0ce in bridge_input (ifp=<value optimized out>, m=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:2475 #11 0xffffffff80a1a290 in ether_nh_input (m=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/if_ethersubr.c:602 #12 0xffffffff80a30c05 in netisr_dispatch_src (proto=5, source=<value optimized out>, m=0x0) at 
/usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/netisr.c:1120 #13 0xffffffff80a19936 in ether_input (ifp=<value optimized out>, m=0x0) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/if_ethersubr.c:757 #14 0xffffffff807f0c6c in vtnet_rxq_eof (rxq=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:1745 #15 0xffffffff807f1be3 in vtnet_rx_vq_intr (xrxq=0xfffff800032b8c00) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:1876 #16 0xffffffff808d68ef in intr_event_execute_handlers ( p=<value optimized out>, ie=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_intr.c:1262 #17 0xffffffff808d6b56 in ithread_loop (arg=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_intr.c:1275 #18 0xffffffff808d3535 in fork_exit ( callout=0xffffffff808d6a90 <ithread_loop>, arg=0xfffff800032b2f80, frame=0xfffffe000dff3ac0) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_fork.c:1038 #19 0xffffffff80d1a3ae in fork_trampoline () at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/exception.S:611 #20 0x0000000000000000 in ?? () Current language: auto; currently minimal => Displaying code at instruction pointer creating the problem: (kgdb) list *0xffffffff8221c218 0xffffffff8221c218 is in bridge_pfil (/usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:3511). 3506 /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c: No such file or directory. 
in /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c
(kgdb) frame 8
#8 0xffffffff8221c218 in bridge_pfil (mp=<value optimized out>, bifp=<value optimized out>, ifp=0xfffff8000329f000, dir=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:3511
3511 in /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c

=====

I didn't have the source code (just debug symbols) on this machine, so I looked at line 3511 of if_bridge.c. It is in the bridge_fragment() function (called by bridge_pfil()):

3481 static int
3482 bridge_fragment(struct ifnet *ifp, struct mbuf *m, struct ether_header *eh,
3483     int snap, struct llc *llc)
3484 {
3485     struct mbuf *m0;
3486     struct ip *ip;
3487     int error = -1;
3488
3489     if (m->m_len < sizeof(struct ip) &&
3490         (m = m_pullup(m, sizeof(struct ip))) == NULL)
3491         goto out;
3492     ip = mtod(m, struct ip *);
3493
3494     m->m_pkthdr.csum_flags |= CSUM_IP;
3495     error = ip_fragment(ip, &m, ifp->if_mtu, ifp->if_hwassist);
3496     if (error)
3497         goto out;
3498
3499     /* walk the chain and re-add the Ethernet header */
3500     for (m0 = m; m0; m0 = m0->m_nextpkt) {
3501         if (error == 0) {
3502             if (snap) {
3503                 M_PREPEND(m0, sizeof(struct llc), M_NOWAIT);
3504                 if (m0 == NULL) {
3505                     error = ENOBUFS;
3506                     continue;
3507                 }
3508                 bcopy(llc, mtod(m0, caddr_t),
3509                     sizeof(struct llc));
3510             }
3511             M_PREPEND(m0, ETHER_HDR_LEN, M_NOWAIT);
3512             if (m0 == NULL) {
3513                 error = ENOBUFS;
3514                 continue;
3515             }
3516             bcopy(eh, mtod(m0, caddr_t), ETHER_HDR_LEN);
3517         } else
3518             m_freem(m);
3519     }
3520
3521     if (error == 0)
3522         KMOD_IPSTAT_INC(ips_fragmented);
3523
3524     return (error);
3525
3526 out:
3527     if (m != NULL)
3528         m_freem(m);
3529     return (error);
3530 }

=> The line that causes the problem should be:

M_PREPEND(m0, ETHER_HDR_LEN, M_NOWAIT);

Right? But how can I display the m0 variable?
It seems I can only see "ifp" variable: (kgdb) p *ifp $3 = {if_link = {tqe_next = 0xfffff80003385800, tqe_prev = 0xfffff8000329f800}, if_clones = {le_next = 0x0, le_prev = 0x0}, if_groups = {tqh_first = 0xfffff800032b2420, tqh_last = 0xfffff800032b2428}, if_alloctype = 6 '\006', if_softc = 0xfffff800031e7000, if_llsoftc = 0x0, if_l2com = 0x0, if_dname = 0xfffff80003176a58 "vtnet", if_dunit = 1, if_index = 2, if_index_reserved = 0, if_xname = 0xfffff8000329f060 "vtnet1", if_description = 0x0, if_flags = 35075, if_drv_flags = 64, if_capabilities = 1572904, if_capenable = 524328, if_linkmib = 0x0, if_linkmiblen = 0, if_refcount = 1, if_type = 6 '\006', if_addrlen = 6 '\006', if_hdrlen = 18 '\022', if_link_state = 2 '\002', if_mtu = 1500, if_metric = 0, if_baudrate = 10000000000, if_hwassist = 0, if_epoch = 1, if_lastchange = {tv_sec = 1472470495, tv_usec = 912458}, if_snd = {ifq_head = 0x0, ifq_tail = 0x0, ifq_len = 0, ifq_maxlen = 10240, ifq_mtx = {lock_object = {lo_name = 0xfffff8000329f060 "vtnet1", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, ifq_drv_head = 0x0, ifq_drv_tail = 0x0, ifq_drv_len = 0, ifq_drv_maxlen = 0, altq_type = 0, altq_flags = 0, altq_disc = 0x0, altq_ifp = 0xfffff8000329f000, altq_enqueue = 0, altq_dequeue = 0, altq_request = 0, altq_clfier = 0x0, altq_classify = 0, altq_tbr = 0x0, altq_cdnr = 0x0}, if_linktask = {ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0, ta_func = 0xffffffff80a0d610 <do_link_state_change>, ta_context = 0xfffff8000329f000}, if_addr_lock = {lock_object = { lo_name = 0xffffffff81232f6f "if_addr_lock", lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, if_addrhead = { tqh_first = 0xfffff800032b7900, tqh_last = 0xfffff8000368c028}, if_multiaddrs = {tqh_first = 0xfffff800033c6b80, tqh_last = 0xfffff800033c6e80}, if_amcount = 0, if_addr = 0xfffff800032b7900, if_broadcastaddr = 0xffffffff81233490 "▒▒▒▒▒▒", if_afdata_lock = { lock_object = {lo_name = 0xffffffff81232f7c 
"if_afdata", lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, if_afdata = 0xfffff8000329f208, if_afdata_initialized = 2, if_fib = 0, if_vnet = 0x0, if_home_vnet = 0x0, if_vlantrunk = 0x0, if_bpf = 0xfffff800032c6a80, if_pcount = 1, if_bridge = 0xfffff8000368de00, if_lagg = 0x0, if_pf_kif = 0xfffff8000341fd00, if_carp = 0x0, if_label = 0x0, if_netmap = 0xfffff800032f7400, if_output = 0xffffffff80a18d60 <ether_output>, if_input = 0xffffffff80a19910 <ether_input>, if_start = 0, if_ioctl = 0xffffffff807f20e0 <vtnet_ioctl>, if_init = 0xffffffff807f1f90 <vtnet_init>, if_resolvemulti = 0xffffffff80a19950 <ether_resolvemulti>, if_qflush = 0xffffffff807f2900 <vtnet_qflush>, if_transmit = 0xffffffff807f27f0 <vtnet_txq_mq_start>, if_reassign = 0, if_get_counter = 0xffffffff807f2780 <vtnet_get_counter>, if_requestencap = 0xffffffff80a19a70 <ether_requestencap>, if_counters = 0xfffff8000329f410, if_hw_tsomax = 65518, if_hw_tsomaxsegcount = 35, if_hw_tsomaxsegsize = 2048, if_pspare = 0xfffff8000329f480, if_ispare = 0xfffff8000329f4a0} (kgdb) Regards,
Created attachment 174240
wireshark analysis

Here is my Wireshark analysis comparing a trace with scrub and a trace without scrub.
Created attachment 174241
pcaps file

I've attached these 2 tcpdump capture files (taken on real hardware):
- A first standard ping is sent from 10.0.0.1 to 10.0.0.3
- A second ping with size 1500 is generated
- There is a little IPv6 noise in these pcaps: you can ignore it.
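As a cross-check when reading these captures, the expected fragment sizes can be worked out by hand. The sketch below is only an illustration, assuming a 20-byte IPv4 header without options and an MTU of 1500; the names ipv4_fragments and struct frag are made up for this example and are not kernel APIs. For "ping -s 1500" the IP payload is 1508 bytes (1500 data bytes plus the 8-byte ICMP header), which gives two fragments carrying 1480 and 28 payload bytes, i.e. IP total lengths of 1500 and 48. These are the same lengths tcpdump reports in the traces earlier in this report.

```c
#include <assert.h>
#include <stddef.h>

/* One fragment of an IPv4 datagram (illustrative type, not a kernel struct). */
struct frag {
    unsigned offset;     /* byte offset of this fragment's data in the payload */
    unsigned data_len;   /* payload bytes carried by this fragment */
    unsigned total_len;  /* IP total length: header + data_len */
    int more;            /* 1 if the MF (more fragments) flag would be set */
};

/*
 * Split payload_len bytes of IP payload for the given MTU.
 * Returns the number of fragments written to out[], or 0 if the MTU is
 * too small or the caller's array cannot hold them all.
 */
static size_t
ipv4_fragments(unsigned payload_len, unsigned mtu, struct frag *out, size_t max)
{
    const unsigned hdr = 20;        /* assumption: IPv4 header without options */
    unsigned per, off = 0;
    size_t n = 0;

    if (mtu < hdr + 8)
        return 0;
    per = ((mtu - hdr) / 8) * 8;    /* non-final fragments carry a multiple of 8 bytes */
    while (off < payload_len) {
        unsigned len = payload_len - off;

        if (len > per)
            len = per;
        if (n == max)
            return 0;
        out[n].offset = off;
        out[n].data_len = len;
        out[n].total_len = hdr + len;
        out[n].more = (off + len < payload_len);
        n++;
        off += len;
    }
    return n;
}
```

Calling ipv4_fragments(1508, 1500, out, 4) fills two entries. The second one starts at offset 1480, which is 185 units of 8 bytes, the value to look for in the fragment-offset field of the second packet.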
I've reproduced the problem under VirtualBox (with em interfaces) and on a real hardware lab (with igb interfaces).

I've also compared the tcpdump captures of pf-bridge-scrub vs pf-bridge-without_scrub: once scrub is enabled, the IP packet is put on the wire where an Ethernet frame is expected, because the step that adds the Ethernet header is missing.

I've attached the pcap files and a screenshot of my Wireshark analysis.
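To make this observation concrete, the first 20 bytes of the two frames captured on em1 earlier in this report (the "destination MAC", "source MAC", "ethertype" and the first 6 payload bytes as printed by tcpdump) can be decoded as an IPv4 header. The sketch below is a user-space illustration only; decode_ipv4 and struct ipv4_view are names invented for this example, and the checksum test assumes a 20-byte header without options. Both byte strings decode as valid IPv4 headers from 10.0.0.1 to 10.0.0.3 with protocol 1 (ICMP), which supports the reading that the frames left the bridge without an Ethernet header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of a 20-byte IPv4 header (illustrative type). */
struct ipv4_view {
    unsigned version, ihl, total_len, id;
    unsigned more_frags, frag_off;   /* frag_off is in bytes */
    unsigned ttl, proto;
    uint8_t src[4], dst[4];
    int csum_ok;                     /* 1 if the header checksum verifies */
};

/* Bytes as printed by tcpdump on em1: dst MAC, src MAC, ethertype, 6 payload bytes. */
static const uint8_t frame1[20] = {
    0x45, 0x00, 0x05, 0xdc, 0x61, 0x8c, 0x20, 0x00, 0x40, 0x01,
    0xdf, 0x91, 0x0a, 0x00, 0x00, 0x01, 0x0a, 0x00, 0x00, 0x03
};
static const uint8_t frame2[20] = {
    0x45, 0x00, 0x00, 0x30, 0x61, 0x8c, 0x00, 0xb9, 0x40, 0x01,
    0x04, 0x85, 0x0a, 0x00, 0x00, 0x01, 0x0a, 0x00, 0x00, 0x03
};

/* Interpret the first 20 bytes of b as an IPv4 header. Returns 0 on success. */
static int
decode_ipv4(const uint8_t *b, size_t len, struct ipv4_view *v)
{
    uint32_t sum = 0;
    unsigned ff;
    size_t i;

    if (len < 20)
        return -1;
    v->version = b[0] >> 4;
    v->ihl = b[0] & 0x0f;
    v->total_len = ((unsigned)b[2] << 8) | b[3];
    v->id = ((unsigned)b[4] << 8) | b[5];
    ff = ((unsigned)b[6] << 8) | b[7];
    v->more_frags = (ff & 0x2000) != 0;
    v->frag_off = (ff & 0x1fff) * 8;
    v->ttl = b[8];
    v->proto = b[9];
    for (i = 0; i < 4; i++) {
        v->src[i] = b[12 + i];
        v->dst[i] = b[16 + i];
    }
    /* One's-complement sum over the header; a valid header sums to 0xffff. */
    for (i = 0; i < 20; i += 2)
        sum += ((uint32_t)b[i] << 8) | b[i + 1];
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    v->csum_ok = (sum == 0xffff);
    return 0;
}
```

Decoding frame1 gives total length 1500 with the MF flag set and offset 0; frame2 gives total length 48 at offset 1480 with MF clear. Both share IP id 0x618c, so they are the two fragments of one datagram.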
I've rebuilt a kernel with all DEBUG options enabled. Sending just one fragmented ICMP packet (ping -c 1 -s 1500 10.0.0.3) triggers this KASSERT panic:

[root@router]~# panic: vtnet_txq_encap: no mbuf packet header!
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00003ab530
vpanic() at vpanic+0x182/frame 0xfffffe00003ab5b0
kassert_panic() at kassert_panic+0x126/frame 0xfffffe00003ab620
vtnet_txq_mq_start_locked() at vtnet_txq_mq_start_locked+0x635/frame 0xfffffe00003ab6e0
vtnet_txq_mq_start() at vtnet_txq_mq_start+0x6f/frame 0xfffffe00003ab720
bridge_enqueue() at bridge_enqueue+0x9a/frame 0xfffffe00003ab760
bridge_forward() at bridge_forward+0x322/frame 0xfffffe00003ab7c0
bridge_input() at bridge_input+0x5f4/frame 0xfffffe00003ab830
ether_nh_input() at ether_nh_input+0x2ab/frame 0xfffffe00003ab870
netisr_dispatch_src() at netisr_dispatch_src+0x80/frame 0xfffffe00003ab8d0
ether_input() at ether_input+0x62/frame 0xfffffe00003ab900
vtnet_rxq_eof() at vtnet_rxq_eof+0x835/frame 0xfffffe00003ab9b0
vtnet_rx_vq_intr() at vtnet_rx_vq_intr+0x4e/frame 0xfffffe00003ab9e0
intr_event_execute_handlers() at intr_event_execute_handlers+0x96/frame 0xfffffe00003aba20
ithread_loop() at ithread_loop+0xa6/frame 0xfffffe00003aba70
fork_exit() at fork_exit+0x84/frame 0xfffffe00003abab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00003abab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 11 tid 100025 ]
Stopped at kdb_enter+0x3b: movq $0,kdb_why
I've generated a core dump (with a DEBUG kernel) and looked into it: Unread portion of the kernel message buffer: panic: vtnet_txq_encap: no mbuf packet header! cpuid = 0 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00003ab530 vpanic() at vpanic+0x182/frame 0xfffffe00003ab5b0 kassert_panic() at kassert_panic+0x126/frame 0xfffffe00003ab620 vtnet_txq_mq_start_locked() at vtnet_txq_mq_start_locked+0x635/frame 0xfffffe00003ab6e0 vtnet_txq_mq_start() at vtnet_txq_mq_start+0x6f/frame 0xfffffe00003ab720 bridge_enqueue() at bridge_enqueue+0x9a/frame 0xfffffe00003ab760 bridge_forward() at bridge_forward+0x322/frame 0xfffffe00003ab7c0 bridge_input() at bridge_input+0x5f4/frame 0xfffffe00003ab830 ether_nh_input() at ether_nh_input+0x2ab/frame 0xfffffe00003ab870 netisr_dispatch_src() at netisr_dispatch_src+0x80/frame 0xfffffe00003ab8d0 ether_input() at ether_input+0x62/frame 0xfffffe00003ab900 vtnet_rxq_eof() at vtnet_rxq_eof+0x835/frame 0xfffffe00003ab9b0 vtnet_rx_vq_intr() at vtnet_rx_vq_intr+0x4e/frame 0xfffffe00003ab9e0 intr_event_execute_handlers() at intr_event_execute_handlers+0x96/frame 0xfffffe00003aba20 ithread_loop() at ithread_loop+0xa6/frame 0xfffffe00003aba70 fork_exit() at fork_exit+0x84/frame 0xfffffe00003abab0 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00003abab0 --- trap 0, rip = 0, rsp = 0, rbp = 0 --- KDB: enter: panic Reading symbols from /data/debug/boot/kernel/if_bridge.ko.debug...done. Loaded symbols for /data/debug/boot/kernel/if_bridge.ko.debug Reading symbols from /boot/kernel/bridgestp.ko...done. Loaded symbols for /boot/kernel/bridgestp.ko Reading symbols from /boot/kernel/pf.ko...done. Loaded symbols for /boot/kernel/pf.ko #0 doadump (textdump=0) at pcpu.h:221 221 pcpu.h: No such file or directory. 
in pcpu.h (kgdb) bt #0 doadump (textdump=0) at pcpu.h:221 #1 0xffffffff8035512b in db_dump (dummy=<value optimized out>, dummy2=false, dummy3=0, dummy4=0x0) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/ddb/db_command.c:546 #2 0xffffffff80354f29 in db_command (cmd_table=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/ddb/db_command.c:453 #3 0xffffffff80354c84 in db_command_loop () at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/ddb/db_command.c:506 #4 0xffffffff80357d2b in db_trap (type=<value optimized out>, code=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/ddb/db_main.c:251 #5 0xffffffff808fe593 in kdb_trap (type=<value optimized out>, code=<value optimized out>, tf=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/subr_kdb.c:654 #6 0xffffffff80c9993d in trap (frame=0xfffffe00003ab460) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/trap.c:556 #7 0xffffffff80c7a2d1 in calltrap () at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/exception.S:236 #8 0xffffffff808fdc3b in kdb_enter (why=0xffffffff8118cc44 "panic", msg=0x80 <Address 0x80 out of bounds>) at cpufunc.h:63 #9 0xffffffff808c05ff in vpanic (fmt=<value optimized out>, ap=0xfffffe00003ab5f0) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_shutdown.c:752 #10 0xffffffff808c0456 in kassert_panic (fmt=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_shutdown.c:649 #11 0xffffffff807bc0d5 in vtnet_txq_mq_start_locked (txq=0xfffff80003698b00, m=0xfffff80003e25700) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:2185 #12 0xffffffff807bce3f in vtnet_txq_mq_start (ifp=0xfffff800036d3800, m=0xfffff80003e25700) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:2381 #13 0xffffffff8221b72a in bridge_enqueue (sc=0xfffff8000369d200, dst_ifp=<value optimized out>, m=<value optimized out>) at 
/usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:1920
#14 0xffffffff8221e2c2 in bridge_forward (sc=<value optimized out>, sbif=<value optimized out>, m=0xfffffe00003ab410) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:2271
#15 0xffffffff8221d564 in bridge_input (ifp=<value optimized out>, m=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:2475
#16 0xffffffff809afc4b in ether_nh_input (m=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/if_ethersubr.c:602
#17 0xffffffff809c4cb0 in netisr_dispatch_src (proto=5, source=0, m=0xfffff80003e25600) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/netisr.c:1120
#18 0xffffffff809af252 in ether_input (ifp=<value optimized out>, m=0x0) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/if_ethersubr.c:757
#19 0xffffffff807bb675 in vtnet_rxq_eof (rxq=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:1745
#20 0xffffffff807bc69e in vtnet_rx_vq_intr (xrxq=0xfffff80003698e00) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:1876
#21 0xffffffff8088dde6 in intr_event_execute_handlers (p=<value optimized out>, ie=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_intr.c:1262
#22 0xffffffff8088e466 in ithread_loop (arg=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_intr.c:1275
#23 0xffffffff8088b4f4 in fork_exit (callout=0xffffffff8088e3c0 <ithread_loop>, arg=0xfffff800034c1ee0, frame=0xfffffe00003abac0) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_fork.c:1038
#24 0xffffffff80c7a80e in fork_trampoline () at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/exception.S:611
#25 0x0000000000000000 in ?? ()
Current language: auto; currently minimal

=> It seems that bridge_enqueue() is sending a bad or nonexistent mbuf to the interface.
(kgdb) frame 13
#13 0xffffffff8221b72a in bridge_enqueue (sc=0xfffff8000369d200, dst_ifp=<value optimized out>, m=<value optimized out>) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:1920
1920 if ((err = dst_ifp->if_transmit(dst_ifp, m))) {

=> kgdb can't display the value of m (the mbuf pointer) here, but it can display it in the previous frame:

(kgdb) frame 14
#14 0xffffffff8221e2c2 in bridge_forward (sc=<value optimized out>, sbif=<value optimized out>, m=0xfffffe00003ab410) at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:2271
2271 bridge_enqueue(sc, dst_if, m);
(kgdb) print m
$1 = (struct mbuf *) 0xfffffe00003ab410

On my VMs, which use vtnet interfaces, vtnet has neither VLANTAG nor VLAN_HWTAGGING:

[root@router]~# ifconfig vtnet1
vtnet1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=80028<VLAN_MTU,JUMBO_MTU,LINKSTATE>

So bridge_enqueue() should reach this part of the code:

/*
 * If underlying interface can not do VLAN tag insertion itself
 * then attach a packet tag that holds it.
 */
if ((m->m_flags & M_VLANTAG) &&
    (dst_ifp->if_capenable & IFCAP_VLAN_HWTAGGING) == 0) {

I believe there is something wrong here. So I inserted a:

M_ASSERTPKTHDR(m);

just before line 1920:

if ((err = dst_ifp->if_transmit(dst_ifp, m)))

and this new assertion is triggered:

[root@router]~# panic: bridge_enqueue: no mbuf packet header!
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00003ab630
vpanic() at vpanic+0x182/frame 0xfffffe00003ab6b0
kassert_panic() at kassert_panic+0x126/frame 0xfffffe00003ab720
bridge_enqueue() at bridge_enqueue+0x11a/frame 0xfffffe00003ab760
bridge_forward() at bridge_forward+0x322/frame 0xfffffe00003ab7c0
bridge_input() at bridge_input+0x5f4/frame 0xfffffe00003ab830
ether_nh_input() at ether_nh_input+0x2ab/frame 0xfffffe00003ab870
netisr_dispatch_src() at netisr_dispatch_src+0x80/frame 0xfffffe00003ab8d0
ether_input() at ether_input+0x62/frame 0xfffffe00003ab900
vtnet_rxq_eof() at vtnet_rxq_eof+0x835/frame 0xfffffe00003ab9b0
vtnet_rx_vq_intr() at vtnet_rx_vq_intr+0x4e/frame 0xfffffe00003ab9e0
intr_event_execute_handlers() at intr_event_execute_handlers+0x96/frame 0xfffffe00003aba20
ithread_loop() at ithread_loop+0xa6/frame 0xfffffe00003aba70
fork_exit() at fork_exit+0x84/frame 0xfffffe00003abab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00003abab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 11 tid 100025 ]
Stopped at kdb_enter+0x3b: movq $0,kdb_why
I've added some lines like:

if_printf(ifp,"[DEBUG] bridge_fragment() exiting, m_len: %d\n",m->m_len);

to the sys/net/if_bridge.c code.

Here is the behavior with pf in bridge mode, but WITHOUT scrub, when I generate a "ping -c 1 -s 1500":

bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() exit, dir: 2(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() exit, dir: 2(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() exit, dir: 2(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() exit, dir: 2(IN:1/OUT:2), m_len: 62

=> Each packet received is transmitted as-is.

Now, here is the behavior with pf in bridge mode WITH scrub:

bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 1514
pf_normalize_ip: DEBUG branch frag: 0xfffff80003e73300(m_pkthrd.len:1500)
pf_normalize_ip: reass frag 45306 @ 0-1480
pf_fillup_fragment: reass frag 45306 @ 0-1480
bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 62
pf_normalize_ip: DEBUG branch frag: 0xfffff80003e73200(m_pkthrd.len:48)
pf_normalize_ip: reass frag 45306 @ 1480-1508
pf_fillup_fragment: reass frag 45306 @ 1480-1508
pf_isfull_fragment: 1508 < 1508?
pf_reassemble: complete: 0xfffff80003e73300(m_pkthrd.len:1528, p_len: 1528)
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), m_len: 1542
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), m_len: 1542
vtnet1: [DEBUG] bridge_fragment() entering, m_len: 1528
vtnet1: [DEBUG] bridge_fragment() exiting, m_len: 1500
panic: bridge_enqueue: no mbuf packet header!

=> Two new functions are called: pf_normalize and bridge_fragment.

Here is my interpretation of the scrub-and-bridge mode:
1. bridge_pfil (IN) receives the first fragment (mbuf length of a maximum-size Ethernet frame at MTU 1500 = 1514 bytes).
2. pf_normalize (scrub) detects a fragment and waits for the next fragment.
3. bridge_pfil (IN) receives the second fragment (mbuf length of a 62-byte Ethernet frame).
4. pf_normalize reassembles these 2 mbufs into one big mbuf of 1528 bytes (= 20 bytes of IP header + 1508 bytes of ICMP header + data).
5. bridge_pfil (IN) re-adds the 14 bytes of Ethernet header to this mbuf (m_len = 1542 bytes).
6. bridge_pfil (OUT) takes this mbuf (m_len = 1542), removes the Ethernet header (m_len - 14 = 1528) and calls bridge_fragment() because the packet is bigger than the MTU.
7. bridge_fragment seems to have a bug, because it reduces m_len to 1500 and tries to forward the mbuf to the NIC (it should be at least 1514, not 1500!).
8. The ASSERT I added is triggered: we can't send an mbuf without an Ethernet header to the NIC.
Funny: after lots of printf() debugging, it seems that the first function I suspected as the source of the panic is the one corrupting my mbuf/packet. In bridge_fragment():

M_PREPEND(m0, ETHER_HDR_LEN, M_NOWAIT);

Here is the new debug output:

bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), frag :0xfffff8000386e800(m_len: 1514)
pf_normalize_ip: DEBUG branch frag: 0xfffff8000386e800(m_pkthrd.len:1500)
pf_normalize_ip: reass frag 44538 @ 0-1480
pf_fillup_fragment: reass frag 44538 @ 0-1480
bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), frag :0xfffff8000386e700(m_len: 62)
pf_normalize_ip: DEBUG branch frag: 0xfffff8000386e700(m_pkthrd.len:48)
pf_normalize_ip: reass frag 44538 @ 1480-1508
pf_fillup_fragment: reass frag 44538 @ 1480-1508
pf_isfull_fragment: 1508 < 1508?
pf_reassemble: complete: 0xfffff8000386e800(m_pkthrd.len:1528, p_len: 1528)
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), frag: 0xfffff8000386e800(m_len: 1542)
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), frag :0xfffff8000386e800(m_len: 1542)
vtnet1: [DEBUG] bridge_fragment() entering, frag:0xfffff8000386e800(m_len: 1528), ether_dhost : 58:9c:fc:02:03:03
vtnet1: [DEBUG] bridge_fragment() after ip_fragment, first mbuf in chain is frag:0xfffff8000386e800(m_len: 1500), second is 0xfffff80003796c00(m_len: 20)
vtnet1: [DEBUG] bridge_fragment() walking chain, frag m0:0xfffff8000386e800(m_len: 1500), frag m:0xfffff8000386e800(m_len: 1500)
vtnet1: [DEBUG] bridge_fragment() walking chain after M_PREPEND, frag m0:0xfffff80003796d00(m_len: 14), frag m:0xfffff8000386e800(m_len: 1500)
vtnet1: [DEBUG] bridge_fragment() walking chain after bcopy, frag m0:0xfffff80003796d00(m_len: 14), frag m:0xfffff8000386e800(m_len: 1500)
vtnet1: [DEBUG] bridge_fragment() exiting, m_len: 1500
panic: bridge_enqueue: no mbuf packet header!
=> Before calling M_PREPEND, there is an mbuf chain:
- the first element is 1500 bytes long
- the second element is 20 bytes long

Then we need to add ETHER_HDR_LEN bytes to the beginning of the first element. After M_PREPEND, the 1500-byte mbuf should be 1514 bytes long… but we obtain a 14-byte mbuf!
Patch proposed here: https://reviews.freebsd.org/D7780
A commit references this bug:

Author: kp
Date: Sat Sep 24 07:09:43 UTC 2016
New revision: 306289
URL: https://svnweb.freebsd.org/changeset/base/306289

Log:
bridge: Fix fragment handling and memory leak

Fragmented UDP and ICMP packets were corrupted if a firewall with reassembling feature (like pf'scrub) is enabled on the bridge. This patch fixes corrupted packet problem and the panic (triggered easly with low RAM) as explain in PR 185633.

bridge_pfil and bridge_fragment relationship:

bridge_pfil() receive (IN direction) packets and sent it to the firewall
The firewall can be configured for reassembling fragmented packet (like pf'scrubing) in one mbuf chain
when bridge_pfil() need to send this reassembled packet to the outgoing interface, it needs to re-fragment it by using bridge_fragment()
bridge_fragment() had to split this mbuf (using ip_fragment) first then had to M_PREPEND each packet in the mbuf chain for adding Ethernet header.

But M_PREPEND can sometime create a new mbuf on the begining of the mbuf chain, then the "main" pointer of this mbuf chain should be updated and this case is tottaly forgotten. The original bridge_fragment code (Revision 158140, 2006 April 29) came from OpenBSD, and the call to bridge_enqueue was embedded. But on FreeBSD, bridge_enqueue() is done after bridge_fragment(), then the original OpenBSD code can't work as-it of FreeBSD.

PR: 185633
Submitted by: Olivier Cochard-Labbé
Differential Revision: https://reviews.freebsd.org/D7780

Changes:
head/sys/net/if_bridge.c
A commit references this bug:

Author: kp
Date: Sun Oct 2 21:06:55 UTC 2016
New revision: 306593
URL: https://svnweb.freebsd.org/changeset/base/306593

Log:
MFC r306289:

bridge: Fix fragment handling and memory leak

Fragmented UDP and ICMP packets were corrupted if a firewall with reassembling feature (like pf'scrub) is enabled on the bridge. This patch fixes corrupted packet problem and the panic (triggered easly with low RAM) as explain in PR 185633.

bridge_pfil and bridge_fragment relationship:

bridge_pfil() receive (IN direction) packets and sent it to the firewall
The firewall can be configured for reassembling fragmented packet (like pf'scrubing) in one mbuf chain
when bridge_pfil() need to send this reassembled packet to the outgoing interface, it needs to re-fragment it by using bridge_fragment()
bridge_fragment() had to split this mbuf (using ip_fragment) first then had to M_PREPEND each packet in the mbuf chain for adding Ethernet header.

But M_PREPEND can sometime create a new mbuf on the begining of the mbuf chain, then the "main" pointer of this mbuf chain should be updated and this case is tottaly forgotten. The original bridge_fragment code (Revision 158140, 2006 April 29) came from OpenBSD, and the call to bridge_enqueue was embedded. But on FreeBSD, bridge_enqueue() is done after bridge_fragment(), then the original OpenBSD code can't work as-it of FreeBSD.

PR: 185633
Submitted by: Olivier Cochard-Labbé

Changes:
_U stable/11/
stable/11/sys/net/if_bridge.c
A commit references this bug:

Author: kp
Date: Sun Oct 2 21:11:25 UTC 2016
New revision: 306594
URL: https://svnweb.freebsd.org/changeset/base/306594

Log:
MFC r306289:

bridge: Fix fragment handling and memory leak

Fragmented UDP and ICMP packets were corrupted if a firewall with reassembling feature (like pf'scrub) is enabled on the bridge. This patch fixes corrupted packet problem and the panic (triggered easly with low RAM) as explain in PR 185633.

bridge_pfil and bridge_fragment relationship:

bridge_pfil() receive (IN direction) packets and sent it to the firewall
The firewall can be configured for reassembling fragmented packet (like pf'scrubing) in one mbuf chain
when bridge_pfil() need to send this reassembled packet to the outgoing interface, it needs to re-fragment it by using bridge_fragment()
bridge_fragment() had to split this mbuf (using ip_fragment) first then had to M_PREPEND each packet in the mbuf chain for adding Ethernet header.

But M_PREPEND can sometime create a new mbuf on the begining of the mbuf chain, then the "main" pointer of this mbuf chain should be updated and this case is tottaly forgotten. The original bridge_fragment code (Revision 158140, 2006 April 29) came from OpenBSD, and the call to bridge_enqueue was embedded. But on FreeBSD, bridge_enqueue() is done after bridge_fragment(), then the original OpenBSD code can't work as-it of FreeBSD.

PR: 185633
Submitted by: Olivier Cochard-Labbé

Changes:
_U stable/10/
stable/10/sys/net/if_bridge.c