Bug 185633 - [pf] scrubbing bug in transparent mode with UDP packets bigger than the MTU
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-pf mailing list
Reported: 2014-01-10 08:40 UTC by olivier
Modified: 2017-03-09 02:40 UTC

Attachments
wireshark analysis (351.45 KB, image/png)
2016-08-31 06:13 UTC, Olivier Cochard
pcaps file (9.00 KB, application/x-tar)
2016-08-31 06:16 UTC, Olivier Cochard

Description olivier 2014-01-10 08:40:00 UTC
pf seems to have a problem reassembling large UDP packets. The problem was announced on the pf mailing list by rpaulo@ here:
http://lists.freebsd.org/pipermail/freebsd-pf/2013-December/007265.html

How-To-Repeat: I managed to reproduce the problem, but with pf in transparent mode.
A full explanation of how to reproduce it is here:
http://lists.freebsd.org/pipermail/freebsd-pf/2014-January/007277.html
Comment 1 olivier 2015-01-21 09:07:46 UTC
Reproducing this problem on 10.1-RELEASE-p4 has a bigger impact: the system crashes.

[root@bridge-firewall]~# [zone: mbuf] kern.ipc.nmbufs limit reached

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x1d
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff81814f07
stack pointer           = 0x28:0xfffffe00003f2810
frame pointer           = 0x28:0xfffffe00003f2890
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (vtnet0 rxq 0)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff807d6e30 at kdb_backtrace+0x60
#1 0xffffffff8079f735 at panic+0x155
#2 0xffffffff80b263bf at trap_fatal+0x38f
#3 0xffffffff80b266d8 at trap_pfault+0x308
#4 0xffffffff80b25d3a at trap+0x47a
#5 0xffffffff80b0bdf2 at calltrap+0x8
#6 0xffffffff818154b5 at bridge_forward+0x2d5
#7 0xffffffff81813c55 at bridge_input+0x555
#8 0xffffffff8085ac35 at ether_nh_input+0x2a5
#9 0xffffffff80862732 at netisr_dispatch_src+0x62
#10 0xffffffff80bf85a3 at vtnet_rxq_eof+0x793
#11 0xffffffff80bf888a at vtnet_rxq_tq_intr+0x5a
#12 0xffffffff807e52a5 at taskqueue_run_locked+0xe5
#13 0xffffffff807e5d38 at taskqueue_thread_loop+0xa8
#14 0xffffffff807732ca at fork_exit+0x9a
#15 0xffffffff80b0c32e at fork_trampoline+0xe
Uptime: 6m17s
Consoles: userboot
Comment 2 olivier 2015-05-06 09:08:52 UTC
Same problem on -current r282520:
- Corrupted reassembled packet leaving the bridge
- Crash


For example, a single large ping:
ping -c 1 -s 1500 10.0.0.3

produces this tcpdump output on the INCOMING PF-bridge interface:

[root@R2]~# tcpdump -pni em0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:03:41.790409 IP 10.0.0.1 > 10.0.0.3: ICMP echo request, id 62723, seq 0, length 1480
11:03:41.790434 IP 10.0.0.1 > 10.0.0.3: ip-proto-1


but produces this tcpdump output on the OUTGOING PF-bridge interface:

[root@R2]~# tcpdump -pni em1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:03:54.863303 20:00:40:01:df:91 > 45:00:05:dc:61:8c, ethertype Unknown (0x0a00), length 1500:
        0x0000:  0001 0a00 0003 0800 3b06 f703 0000 5549  ........;.....UI
        0x0010:  f51b 0001 c0ed 0809 0a0b 0c0d 0e0f 1011  ................
        0x0020:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021  ...............!
        0x0030:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
        0x0040:  3233 3435 3637 3839 3a3b 3c3d 3e3f 4041  23456789:;<=>?@A
        0x0050:  4243 4445 4647 4849 4a4b 4c4d 4e4f 5051  BCDEFGHIJKLMNOPQ
        0x0060:  5253 5455 5657 5859 5a5b 5c5d 5e5f 6061  RSTUVWXYZ[\]^_`a
        0x0070:  6263 6465 6667 6869 6a6b 6c6d 6e6f 7071  bcdefghijklmnopq
        0x0080:  7273 7475 7677 7879 7a7b 7c7d 7e7f 8081  rstuvwxyz{|}~...
        0x0090:  8283 8485 8687 8889 8a8b 8c8d 8e8f 9091  ................
        0x00a0:  9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1  ................
        0x00b0:  a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1  ................
        0x00c0:  b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1  ................
        0x00d0:  c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1  ................
        0x00e0:  d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1  ................
        0x00f0:  e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1  ................
        0x0100:  f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001  ................
        0x0110:  0203 0405 0607 0809 0a0b 0c0d 0e0f 1011  ................
        0x0120:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021  ...............!
        0x0130:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
        0x0140:  3233 3435 3637 3839 3a3b 3c3d 3e3f 4041  23456789:;<=>?@A
        0x0150:  4243 4445 4647 4849 4a4b 4c4d 4e4f 5051  BCDEFGHIJKLMNOPQ
        0x0160:  5253 5455 5657 5859 5a5b 5c5d 5e5f 6061  RSTUVWXYZ[\]^_`a
        0x0170:  6263 6465 6667 6869 6a6b 6c6d 6e6f 7071  bcdefghijklmnopq
        0x0180:  7273 7475 7677 7879 7a7b 7c7d 7e7f 8081  rstuvwxyz{|}~...
        0x0190:  8283 8485 8687 8889 8a8b 8c8d 8e8f 9091  ................
        0x01a0:  9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1  ................
        0x01b0:  a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1  ................
        0x01c0:  b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1  ................
        0x01d0:  c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1  ................
        0x01e0:  d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1  ................
        0x01f0:  e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1  ................
        0x0200:  f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001  ................
        0x0210:  0203 0405 0607 0809 0a0b 0c0d 0e0f 1011  ................
        0x0220:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021  ...............!
        0x0230:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
        0x0240:  3233 3435 3637 3839 3a3b 3c3d 3e3f 4041  23456789:;<=>?@A
        0x0250:  4243 4445 4647 4849 4a4b 4c4d 4e4f 5051  BCDEFGHIJKLMNOPQ
        0x0260:  5253 5455 5657 5859 5a5b 5c5d 5e5f 6061  RSTUVWXYZ[\]^_`a
        0x0270:  6263 6465 6667 6869 6a6b 6c6d 6e6f 7071  bcdefghijklmnopq
        0x0280:  7273 7475 7677 7879 7a7b 7c7d 7e7f 8081  rstuvwxyz{|}~...
        0x0290:  8283 8485 8687 8889 8a8b 8c8d 8e8f 9091  ................
        0x02a0:  9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1  ................
        0x02b0:  a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1  ................
        0x02c0:  b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1  ................
        0x02d0:  c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1  ................
        0x02e0:  d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1  ................
        0x02f0:  e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1  ................
        0x0300:  f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001  ................
        0x0310:  0203 0405 0607 0809 0a0b 0c0d 0e0f 1011  ................
        0x0320:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021  ...............!
        0x0330:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
        0x0340:  3233 3435 3637 3839 3a3b 3c3d 3e3f 4041  23456789:;<=>?@A
        0x0350:  4243 4445 4647 4849 4a4b 4c4d 4e4f 5051  BCDEFGHIJKLMNOPQ
        0x0360:  5253 5455 5657 5859 5a5b 5c5d 5e5f 6061  RSTUVWXYZ[\]^_`a
        0x0370:  6263 6465 6667 6869 6a6b 6c6d 6e6f 7071  bcdefghijklmnopq
        0x0380:  7273 7475 7677 7879 7a7b 7c7d 7e7f 8081  rstuvwxyz{|}~...
        0x0390:  8283 8485 8687 8889 8a8b 8c8d 8e8f 9091  ................
        0x03a0:  9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1  ................
        0x03b0:  a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1  ................
        0x03c0:  b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1  ................
        0x03d0:  c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1  ................
        0x03e0:  d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1  ................
        0x03f0:  e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1  ................
        0x0400:  f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001  ................
        0x0410:  0203 0405 0607 0809 0a0b 0c0d 0e0f 1011  ................
        0x0420:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021  ...............!
        0x0430:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
        0x0440:  3233 3435 3637 3839 3a3b 3c3d 3e3f 4041  23456789:;<=>?@A
        0x0450:  4243 4445 4647 4849 4a4b 4c4d 4e4f 5051  BCDEFGHIJKLMNOPQ
        0x0460:  5253 5455 5657 5859 5a5b 5c5d 5e5f 6061  RSTUVWXYZ[\]^_`a
        0x0470:  6263 6465 6667 6869 6a6b 6c6d 6e6f 7071  bcdefghijklmnopq
        0x0480:  7273 7475 7677 7879 7a7b 7c7d 7e7f 8081  rstuvwxyz{|}~...
        0x0490:  8283 8485 8687 8889 8a8b 8c8d 8e8f 9091  ................
        0x04a0:  9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1  ................
        0x04b0:  a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1  ................
        0x04c0:  b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1  ................
        0x04d0:  c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1  ................
        0x04e0:  d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1  ................
        0x04f0:  e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1  ................
        0x0500:  f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001  ................
        0x0510:  0203 0405 0607 0809 0a0b 0c0d 0e0f 1011  ................
        0x0520:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021  ...............!
        0x0530:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
        0x0540:  3233 3435 3637 3839 3a3b 3c3d 3e3f 4041  23456789:;<=>?@A
        0x0550:  4243 4445 4647 4849 4a4b 4c4d 4e4f 5051  BCDEFGHIJKLMNOPQ
        0x0560:  5253 5455 5657 5859 5a5b 5c5d 5e5f 6061  RSTUVWXYZ[\]^_`a
        0x0570:  6263 6465 6667 6869 6a6b 6c6d 6e6f 7071  bcdefghijklmnopq
        0x0580:  7273 7475 7677 7879 7a7b 7c7d 7e7f 8081  rstuvwxyz{|}~...
        0x0590:  8283 8485 8687 8889 8a8b 8c8d 8e8f 9091  ................
        0x05a0:  9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1  ................
        0x05b0:  a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1  ................
        0x05c0:  b2b3 b4b5 b6b7 b8b9 babb bcbd bebf       ..............
11:03:54.863318 00:b9:40:01:04:85 > 45:00:00:30:61:8c, ethertype Unknown (0x0a00), length 48:
        0x0000:  0001 0a00 0003 c0c1 c2c3 c4c5 c6c7 c8c9  ................
        0x0010:  cacb cccd cecf d0d1 d2d3 d4d5 d6d7 d8d9  ................
        0x0020:  dadb                                     ..


And when pushing multiple fragmented packets, it crashes:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x1c
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff81a14b13
stack pointer           = 0x28:0xfffffe00003857f0
frame pointer           = 0x28:0xfffffe0000385860
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (em0 taskq)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff808582c7 at kdb_backtrace+0x67
#1 0xffffffff808188a9 at vpanic+0x189
#2 0xffffffff80818713 at panic+0x43
#3 0xffffffff80be93d9 at trap_fatal+0x379
#4 0xffffffff80be970e at trap_pfault+0x31e
#5 0xffffffff80be8d51 at trap+0x461
#6 0xffffffff80bcc7b2 at calltrap+0x8
#7 0xffffffff81a150e7 at bridge_forward+0x2f7
#8 0xffffffff81a137cc at bridge_input+0x5dc
#9 0xffffffff809073b3 at ether_nh_input+0x2d3
#10 0xffffffff80910231 at netisr_dispatch_src+0x61
#11 0xffffffff80906ab6 at ether_input+0x26
#12 0xffffffff80902cda at if_input+0xa
#13 0xffffffff804734d0 at lem_rxeof+0x4c0
#14 0xffffffff80473b54 at lem_handle_rxtx+0x34
#15 0xffffffff8086b519 at taskqueue_run_locked+0x139
#16 0xffffffff8086c318 at taskqueue_thread_loop+0xc8
#17 0xffffffff807df92a at fork_exit+0x9a
Uptime: 6m18s
Comment 3 Jerome Toutee 2016-08-26 11:49:21 UTC
Hi,
We really need this bug to be fixed. It prevents us from deploying new projects, as we are heavy users of transparent mode.
Thanks!
Comment 4 Kristof Provost freebsd_committer 2016-08-28 16:45:06 UTC
(In reply to Jerome Toutee from comment #3)
Hi Jerome,

I'm not able to reproduce this on CURRENT. Can you confirm that you can still reproduce it there?
Comment 5 Olivier Cochard freebsd_committer 2016-08-29 07:42:32 UTC
Let me restart my virtual-lab on -current (same version for all VMs):

root@VM2:~ # uname -a
FreeBSD  12.0-CURRENT FreeBSD 12.0-CURRENT #0 r304964M: Sun Aug 28 21:49:48 CEST 2016     olivier@lame4.bsdrp.net:/usr/obj/BSDRP12.amd64/usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64  amd64

Simple lab diagram:

VM 1 (vtnet0)------(vtnet0) VM2 (vtnet1) -------- (vtnet1) VM 3

VM1 setup:
sysrc ifconfig_vtnet0="inet 10.0.0.1/24"
service netif restart

VM 2 setup:
sysrc ifconfig_vtnet0="up"
sysrc ifconfig_vtnet1="up"
sysrc cloned_interfaces="bridge0"
sysrc ifconfig_bridge0="addm vtnet0 addm vtnet1 up"
sysrc pf_enable=yes
cat > /etc/pf.conf <<EOF
set skip on lo0
scrub
pass
EOF
service netif restart
service pf start

VM 3 setup:
sysrc ifconfig_vtnet1="inet 10.0.0.3/24"
service netif restart

Now a standard ping works, but a fragmented one does not (same problem with UDP).
Example from VM1 to VM3:

root@VM1:~ # ping -c 1 10.0.0.3
PING 10.0.0.3 (10.0.0.3): 56 data bytes
64 bytes from 10.0.0.3: icmp_seq=0 ttl=64 time=0.258 ms

--- 10.0.0.3 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.258/0.258/0.258/0.000 ms

=> Works with "standard size" (non-fragmented) ICMP ping.

root@:~ # ping -c 1 -s 1500 10.0.0.3
PING 10.0.0.3 (10.0.0.3): 1500 data bytes

--- 10.0.0.3 ping statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss

=> But not with fragmented ICMP
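
As a sanity check on what the bridge should forward, here is a rough sketch of the fragmentation arithmetic for this ping. It assumes a 1500-byte MTU and a 20-byte IPv4 header without options; the names ip_fragments and struct frag are made up for the example and this is not kernel code. "ping -s 1500" carries 1500 data bytes plus an 8-byte ICMP header, so 1508 bytes have to be split:

```c
#include <assert.h>
#include <stddef.h>

#define IP_HDR_LEN   20u  /* IPv4 header without options (assumption) */
#define ICMP_HDR_LEN  8u  /* ICMP echo header */

/* One IPv4 fragment, described by sizes only (hypothetical helper type). */
struct frag {
    unsigned offset;       /* byte offset of this piece in the original payload */
    unsigned payload_len;  /* bytes of payload carried by this fragment */
    unsigned total_len;    /* payload plus IPv4 header */
    int more;              /* 1 if further fragments follow */
};

/*
 * Split an IP payload of payload_len bytes for a link with the given MTU.
 * Writes at most max entries to out and returns how many were written.
 */
static size_t ip_fragments(unsigned payload_len, unsigned mtu,
    struct frag *out, size_t max)
{
    unsigned chunk, off = 0;
    size_t n = 0;

    assert(out != NULL);
    if (mtu < IP_HDR_LEN + 8)
        return 0;
    /* Every fragment except the last carries a multiple of 8 bytes. */
    chunk = (mtu - IP_HDR_LEN) & ~7u;

    while (off < payload_len && n < max) {
        unsigned len = payload_len - off;

        if (len > chunk)
            len = chunk;
        out[n].offset = off;
        out[n].payload_len = len;
        out[n].total_len = len + IP_HDR_LEN;
        out[n].more = (off + len < payload_len);
        off += len;
        n++;
    }
    return n;
}
```

Calling ip_fragments(1500 + ICMP_HDR_LEN, 1500, ...) gives two fragments: 1480 bytes of payload at offset 0 (a 1500-byte IP packet) and 28 bytes at offset 1480 (a 48-byte IP packet). Those sizes agree with the "length 1480" first fragment and the 48-byte second frame in the tcpdump output of comment 2.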

A tcpdump on VM2 or VM3 shows the same "corrupted" IP packet:

root@VM2:~ # tcpdump -vv -pnei vtnet1
tcpdump: listening on vtnet1, link-type EN10MB (Ethernet), capture size 262144 bytes
09:39:59.656215 20:00:40:01:33:fa > 45:00:05:dc:0d:24, ethertype Unknown (0x0a00), length 1500:
        0x0000:  0001 0a00 0003 0800 12d1 b907 0000 57c4  ..............W.
        0x0010:  02ef 000a 16c8 0809 0a0b 0c0d 0e0f 1011  ................
        0x0020:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021  ...............!
        0x0030:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
        0x0040:  3233 3435 3637 3839 3a3b 3c3d 3e3f 4041  23456789:;<=>?@A
        0x0050:  4243 4445 4647 4849 4a4b 4c4d 4e4f 5051  BCDEFGHIJKLMNOPQ
        0x0060:  5253 5455 5657 5859 5a5b 5c5d 5e5f 6061  RSTUVWXYZ[\]^_`a
        0x0070:  6263 6465 6667 6869 6a6b 6c6d 6e6f 7071  bcdefghijklmnopq
        0x0080:  7273 7475 7677 7879 7a7b 7c7d 7e7f 8081  rstuvwxyz{|}~...
        0x0090:  8283 8485 8687 8889 8a8b 8c8d 8e8f 9091  ................
        0x00a0:  9293 9495 9697 9899 9a9b 9c9d 9e9f a0a1  ................
        0x00b0:  a2a3 a4a5 a6a7 a8a9 aaab acad aeaf b0b1  ................
        0x00c0:  b2b3 b4b5 b6b7 b8b9 babb bcbd bebf c0c1  ................
        0x00d0:  c2c3 c4c5 c6c7 c8c9 cacb cccd cecf d0d1  ................
        0x00e0:  d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf e0e1  ................
        0x00f0:  e2e3 e4e5 e6e7 e8e9 eaeb eced eeef f0f1  ................
        0x0100:  f2f3 f4f5 f6f7 f8f9 fafb fcfd feff 0001  ................
        0x0110:  0203 0405 0607 0809 0a0b 0c0d 0e0f 1011  ................
        0x0120:  1213 1415 1617 1819 1a1b 1c1d 1e1f 2021  ...............!
        0x0130:  2223 2425 2627 2829 2a2b 2c2d 2e2f 3031  "#$%&'()*+,-./01
        0x0140:  3233 3435 3637 3839 3a3b 3c3d 3e3f 4041  23456789:;<=>?@A
        0x0150:  4243 4445 4647 4849 4a4b 4c4d 4e4f 5051  BCDEFGHIJKLMNOPQ
        0x0160:  5253 5455 5657 5859 5a5b 5c5d 5e5f 6061  RSTUVWXYZ[\]^_`a
        0x0170:  6263 6465 6667 6869 6a6b 6c6d 6e6f 7071  bcdefghijklmnopq
        0x0180:  7273 7475 7677 7879 7a7b 7c7d 7e7f 8081  rstuvwxyz{|}~...
(etc.)


If I remove the "scrub" pf feature, the problem disappears.
Comment 6 Olivier Cochard freebsd_committer 2016-08-29 12:21:09 UTC
I've generated a core dump and started kgdb on it:

There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x1c
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8221c218
stack pointer           = 0x28:0xfffffe000dff36c0
frame pointer           = 0x28:0xfffffe000dff3730
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 11 (irq267: virtio_pci1)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff809590b7 at kdb_backtrace+0x67
#1 0xffffffff80911f32 at vpanic+0x182
#2 0xffffffff80911da3 at panic+0x43
#3 0xffffffff80d36c11 at trap_fatal+0x351
#4 0xffffffff80d36e03 at trap_pfault+0x1e3
#5 0xffffffff80d3638c at trap+0x26c
#6 0xffffffff80d19e71 at calltrap+0x8
#7 0xffffffff8221dd74 at bridge_forward+0x304
#8 0xffffffff8221d0ce at bridge_input+0x5de
#9 0xffffffff80a1a290 at ether_nh_input+0x2a0
#10 0xffffffff80a30c05 at netisr_dispatch_src+0xa5
#11 0xffffffff80a19936 at ether_input+0x26
#12 0xffffffff807f0c6c at vtnet_rxq_eof+0x84c
#13 0xffffffff807f1be3 at vtnet_rx_vq_intr+0x93
#14 0xffffffff808d68ef at intr_event_execute_handlers+0x20f
#15 0xffffffff808d6b56 at ithread_loop+0xc6
#16 0xffffffff808d3535 at fork_exit+0x85
#17 0xffffffff80d1a3ae at fork_trampoline+0xe
Uptime: 2m55s
Dumping 113 out of 224 MB:..15%..29%..43%..57%..71%..85%..99%

Reading symbols from /data/debug/boot/kernel/if_bridge.ko.debug...done.
Loaded symbols for /data/debug/boot/kernel/if_bridge.ko.debug
Reading symbols from /boot/kernel/bridgestp.ko...done.
Loaded symbols for /boot/kernel/bridgestp.ko
Reading symbols from /boot/kernel/pf.ko...done.
Loaded symbols for /boot/kernel/pf.ko
#0  doadump (textdump=<value optimized out>) at pcpu.h:221
221     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump (textdump=<value optimized out>) at pcpu.h:221
#1  0xffffffff809119b9 in kern_reboot (howto=260)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80911f6b in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80911da3 in panic (fmt=0x0)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff80d36c11 in trap_fatal (frame=0xfffffe000dff3610, eva=28)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/trap.c:841
#5  0xffffffff80d36e03 in trap_pfault (frame=0xfffffe000dff3610, usermode=0)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/trap.c:691
#6  0xffffffff80d3638c in trap (frame=0xfffffe000dff3610)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/trap.c:442
#7  0xffffffff80d19e71 in calltrap ()
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff8221c218 in bridge_pfil (mp=<value optimized out>,
    bifp=<value optimized out>, ifp=0xfffff8000329f000,
    dir=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:3511
#9  0xffffffff8221dd74 in bridge_forward (sc=<value optimized out>,
    sbif=<value optimized out>, m=0x0)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:2265
#10 0xffffffff8221d0ce in bridge_input (ifp=<value optimized out>,
    m=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:2475
#11 0xffffffff80a1a290 in ether_nh_input (m=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/if_ethersubr.c:602
#12 0xffffffff80a30c05 in netisr_dispatch_src (proto=5,
    source=<value optimized out>, m=0x0)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/netisr.c:1120
#13 0xffffffff80a19936 in ether_input (ifp=<value optimized out>, m=0x0)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/if_ethersubr.c:757
#14 0xffffffff807f0c6c in vtnet_rxq_eof (rxq=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:1745
#15 0xffffffff807f1be3 in vtnet_rx_vq_intr (xrxq=0xfffff800032b8c00)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:1876
#16 0xffffffff808d68ef in intr_event_execute_handlers (
    p=<value optimized out>, ie=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_intr.c:1262
#17 0xffffffff808d6b56 in ithread_loop (arg=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_intr.c:1275
#18 0xffffffff808d3535 in fork_exit (
    callout=0xffffffff808d6a90 <ithread_loop>, arg=0xfffff800032b2f80,
    frame=0xfffffe000dff3ac0)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_fork.c:1038
#19 0xffffffff80d1a3ae in fork_trampoline ()
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/exception.S:611
#20 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal

=> Displaying the code at the faulting instruction pointer:

(kgdb) list *0xffffffff8221c218
0xffffffff8221c218 is in bridge_pfil (/usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:3511).
3506    /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c: No such file or directory.
        in /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c

(kgdb) frame 8
#8  0xffffffff8221c218 in bridge_pfil (mp=<value optimized out>,
    bifp=<value optimized out>, ifp=0xfffff8000329f000,
    dir=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:3511
3511    in /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c

		
I didn't have the source code (just debug symbols) on this machine, so I looked up line 3511 of if_bridge.c separately: it is in the bridge_fragment() function (called by bridge_pfil):

3481 static int
3482 bridge_fragment(struct ifnet *ifp, struct mbuf *m, struct ether_header *eh,
3483     int snap, struct llc *llc)
3484 {
3485     struct mbuf *m0;
3486     struct ip *ip;
3487     int error = -1;
3488
3489     if (m->m_len < sizeof(struct ip) &&
3490         (m = m_pullup(m, sizeof(struct ip))) == NULL)
3491         goto out;
3492     ip = mtod(m, struct ip *);
3493
3494     m->m_pkthdr.csum_flags |= CSUM_IP;
3495     error = ip_fragment(ip, &m, ifp->if_mtu, ifp->if_hwassist);
3496     if (error)
3497         goto out;
3498
3499     /* walk the chain and re-add the Ethernet header */
3500     for (m0 = m; m0; m0 = m0->m_nextpkt) {
3501         if (error == 0) {
3502             if (snap) {
3503                 M_PREPEND(m0, sizeof(struct llc), M_NOWAIT);
3504                 if (m0 == NULL) {
3505                     error = ENOBUFS;
3506                     continue;
3507                 }
3508                 bcopy(llc, mtod(m0, caddr_t),
3509                     sizeof(struct llc));
3510             }
3511             M_PREPEND(m0, ETHER_HDR_LEN, M_NOWAIT);
3512             if (m0 == NULL) {
3513                 error = ENOBUFS;
3514                 continue;
3515             }
3516             bcopy(eh, mtod(m0, caddr_t), ETHER_HDR_LEN);
3517         } else
3518             m_freem(m);
3519     }
3520
3521     if (error == 0)
3522         KMOD_IPSTAT_INC(ips_fragmented);
3523
3524     return (error);
3525
3526 out:
3527     if (m != NULL)
3528         m_freem(m);
3529     return (error);
3530 }


=> The line that triggers the fault should be:
M_PREPEND(m0, ETHER_HDR_LEN, M_NOWAIT);

Right?
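
Two hazards can be read off the listing itself. Whether either one is what actually triggers this panic is not established here. First, when M_PREPEND() fails it leaves m0 set to NULL, and the "continue" then runs the loop step "m0 = m0->m_nextpkt" on that NULL pointer. Second, M_PREPEND() may replace m0 with a newly allocated leading mbuf, but the previous packet's m_nextpkt link is never updated to point at it, and since m is passed by value the caller cannot see a new head either. Below is a minimal user-space model of a walk that avoids both. struct pkt, pkt_prepend() and chain_prepend_all() are hypothetical stand-ins, not kernel APIs, and the model assumes a freshly allocated head starts without a nextpkt link:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for an mbuf; not the kernel struct. */
struct pkt {
    struct pkt *next;     /* next buffer of the same packet */
    struct pkt *nextpkt;  /* next packet in the chain */
    int headroom;         /* free bytes in front of the data */
    int len;              /* data bytes in this buffer */
};

static struct pkt *pkt_new(int headroom, int len)
{
    struct pkt *p = calloc(1, sizeof(*p));

    if (p != NULL) {
        p->headroom = headroom;
        p->len = len;
    }
    return p;
}

/* Free all buffers of one packet. */
static void pkt_free(struct pkt *p)
{
    while (p != NULL) {
        struct pkt *n = p->next;
        free(p);
        p = n;
    }
}

/* Free every packet of a chain linked through nextpkt. */
static void chain_free(struct pkt *c)
{
    while (c != NULL) {
        struct pkt *n = c->nextpkt;
        pkt_free(c);
        c = n;
    }
}

/*
 * Model of a prepend: use headroom if there is enough, otherwise put a new
 * buffer in front. Returns the (possibly new) head, or NULL after freeing
 * the packet when the allocation fails.
 */
static struct pkt *pkt_prepend(struct pkt *p, int n)
{
    struct pkt *h;

    if (p->headroom >= n) {
        p->headroom -= n;
        p->len += n;
        return p;
    }
    h = pkt_new(0, n);
    if (h == NULL) {
        pkt_free(p);
        return NULL;
    }
    h->next = p;
    return h;
}

/*
 * Prepend n bytes to every packet of the chain at *headp. The walk tracks
 * the link that points at the current packet, so a replaced head is stored
 * back into the chain. Returns 0 on success; on failure returns -1 with the
 * whole chain freed and *headp set to NULL.
 */
static int chain_prepend_all(struct pkt **headp, int n)
{
    struct pkt **link = headp;
    struct pkt *cur, *rest;

    assert(headp != NULL && n >= 0);
    while ((cur = *link) != NULL) {
        rest = cur->nextpkt;   /* save before cur is replaced or freed */
        cur->nextpkt = NULL;
        cur = pkt_prepend(cur, n);
        if (cur == NULL) {
            *link = NULL;      /* detach the freed packet */
            chain_free(rest);
            chain_free(*headp);
            *headp = NULL;
            return -1;
        }
        cur->nextpkt = rest;
        *link = cur;           /* re-link the possibly new head */
        link = &cur->nextpkt;
    }
    return 0;
}
```

The point of the pointer-to-link walk is that the saved "rest" is read before the prepend, never after a failure, and that the caller's head pointer is updated through headp when the first packet gets a new leading buffer.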

But how can I display the m0 variable? It seems I can only see the "ifp" variable:

(kgdb) p *ifp
$3 = {if_link = {tqe_next = 0xfffff80003385800,
    tqe_prev = 0xfffff8000329f800}, if_clones = {le_next = 0x0,
    le_prev = 0x0}, if_groups = {tqh_first = 0xfffff800032b2420,
    tqh_last = 0xfffff800032b2428}, if_alloctype = 6 '\006',
  if_softc = 0xfffff800031e7000, if_llsoftc = 0x0, if_l2com = 0x0,
  if_dname = 0xfffff80003176a58 "vtnet", if_dunit = 1, if_index = 2,
  if_index_reserved = 0, if_xname = 0xfffff8000329f060 "vtnet1",
  if_description = 0x0, if_flags = 35075, if_drv_flags = 64,
  if_capabilities = 1572904, if_capenable = 524328, if_linkmib = 0x0,
  if_linkmiblen = 0, if_refcount = 1, if_type = 6 '\006',
  if_addrlen = 6 '\006', if_hdrlen = 18 '\022', if_link_state = 2 '\002',
  if_mtu = 1500, if_metric = 0, if_baudrate = 10000000000, if_hwassist = 0,
  if_epoch = 1, if_lastchange = {tv_sec = 1472470495, tv_usec = 912458},
  if_snd = {ifq_head = 0x0, ifq_tail = 0x0, ifq_len = 0, ifq_maxlen = 10240,
    ifq_mtx = {lock_object = {lo_name = 0xfffff8000329f060 "vtnet1",
        lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4},
    ifq_drv_head = 0x0, ifq_drv_tail = 0x0, ifq_drv_len = 0,
    ifq_drv_maxlen = 0, altq_type = 0, altq_flags = 0, altq_disc = 0x0,
    altq_ifp = 0xfffff8000329f000, altq_enqueue = 0, altq_dequeue = 0,
    altq_request = 0, altq_clfier = 0x0, altq_classify = 0, altq_tbr = 0x0,
    altq_cdnr = 0x0}, if_linktask = {ta_link = {stqe_next = 0x0},
    ta_pending = 0, ta_priority = 0,
    ta_func = 0xffffffff80a0d610 <do_link_state_change>,
    ta_context = 0xfffff8000329f000}, if_addr_lock = {lock_object = {
      lo_name = 0xffffffff81232f6f "if_addr_lock", lo_flags = 86179840,
      lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, if_addrhead = {
    tqh_first = 0xfffff800032b7900, tqh_last = 0xfffff8000368c028},
  if_multiaddrs = {tqh_first = 0xfffff800033c6b80,
    tqh_last = 0xfffff800033c6e80}, if_amcount = 0,
  if_addr = 0xfffff800032b7900,
  if_broadcastaddr = 0xffffffff81233490 "\377\377\377\377\377\377", if_afdata_lock = {
    lock_object = {lo_name = 0xffffffff81232f7c "if_afdata",
      lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1},
  if_afdata = 0xfffff8000329f208, if_afdata_initialized = 2, if_fib = 0,
  if_vnet = 0x0, if_home_vnet = 0x0, if_vlantrunk = 0x0,
  if_bpf = 0xfffff800032c6a80, if_pcount = 1, if_bridge = 0xfffff8000368de00,
  if_lagg = 0x0, if_pf_kif = 0xfffff8000341fd00, if_carp = 0x0,
  if_label = 0x0, if_netmap = 0xfffff800032f7400,
  if_output = 0xffffffff80a18d60 <ether_output>,
  if_input = 0xffffffff80a19910 <ether_input>, if_start = 0,
  if_ioctl = 0xffffffff807f20e0 <vtnet_ioctl>,
  if_init = 0xffffffff807f1f90 <vtnet_init>,
  if_resolvemulti = 0xffffffff80a19950 <ether_resolvemulti>,
  if_qflush = 0xffffffff807f2900 <vtnet_qflush>,
  if_transmit = 0xffffffff807f27f0 <vtnet_txq_mq_start>, if_reassign = 0,
  if_get_counter = 0xffffffff807f2780 <vtnet_get_counter>,
  if_requestencap = 0xffffffff80a19a70 <ether_requestencap>,
  if_counters = 0xfffff8000329f410, if_hw_tsomax = 65518,
  if_hw_tsomaxsegcount = 35, if_hw_tsomaxsegsize = 2048,
  if_pspare = 0xfffff8000329f480, if_ispare = 0xfffff8000329f4a0}
(kgdb)

Regards,
Comment 7 Olivier Cochard freebsd_committer 2016-08-31 06:13:57 UTC
Created attachment 174240
wireshark analysis

Here is my Wireshark comparison of a trace with scrub and a trace without scrub.
Comment 8 Olivier Cochard freebsd_committer 2016-08-31 06:16:36 UTC
Created attachment 174241
pcaps file

I've attached these 2 tcpdump files (captured on real hardware):
- A first standard ping is sent from 10.0.0.1 to 10.0.0.3
- A second ping of size 1500 is then sent
- There is a little IPv6 noise in these pcaps: you can ignore it.
Comment 9 Olivier Cochard freebsd_committer 2016-08-31 06:18:04 UTC
I've reproduced the problem under VirtualBox (with em interfaces) and on a real hardware lab (with igb interfaces).

I've also compared the tcpdump captures of the pf bridge with scrub and without scrub:
once scrub is enabled, the IP packet is put on the wire as if it were the whole Ethernet frame. The Ethernet header that should be added in front of it is missing.
I've attached the pcap files and a screenshot of my Wireshark analysis.
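
The missing Ethernet header can be checked against the capture in comment 2 without Wireshark. The sketch below takes the first 20 bytes of each of the two frames as they appear on the wire (what tcpdump printed as destination MAC, source MAC, ethertype and the first 6 payload bytes) and decodes them as an IPv4 header. struct ipv4_view and ipv4_decode() are names invented for this example, and the decoder assumes a header without options:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Selected IPv4 header fields (hypothetical helper type for this example). */
struct ipv4_view {
    unsigned version, ihl_bytes, total_len, id;
    unsigned more_frags, frag_off_bytes, ttl, proto;
    uint32_t src, dst;
    int checksum_ok;
};

static unsigned be16(const uint8_t *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

/* Decode 20 bytes as an option-less IPv4 header and verify its checksum. */
static struct ipv4_view ipv4_decode(const uint8_t *b)
{
    struct ipv4_view v;
    uint32_t sum = 0;
    size_t i;

    assert(b != NULL);
    v.version = b[0] >> 4;
    v.ihl_bytes = (b[0] & 0x0fu) * 4;
    v.total_len = be16(b + 2);
    v.id = be16(b + 4);
    v.more_frags = (be16(b + 6) & 0x2000u) != 0;
    v.frag_off_bytes = (be16(b + 6) & 0x1fffu) * 8;
    v.ttl = b[8];
    v.proto = b[9];
    v.src = ((uint32_t)be16(b + 12) << 16) | be16(b + 14);
    v.dst = ((uint32_t)be16(b + 16) << 16) | be16(b + 18);

    /* The ones' complement sum over a valid header folds to 0xffff. */
    for (i = 0; i < 20; i += 2)
        sum += be16(b + i);
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    v.checksum_ok = (sum == 0xffffu);
    return v;
}

/* First frame: 20:00:40:01:df:91 > 45:00:05:dc:61:8c, ethertype 0x0a00. */
static const uint8_t frame1[20] = {
    0x45, 0x00, 0x05, 0xdc, 0x61, 0x8c,  /* printed as destination MAC */
    0x20, 0x00, 0x40, 0x01, 0xdf, 0x91,  /* printed as source MAC */
    0x0a, 0x00,                          /* printed as ethertype */
    0x00, 0x01, 0x0a, 0x00, 0x00, 0x03   /* first 6 "payload" bytes */
};

/* Second frame: 00:b9:40:01:04:85 > 45:00:00:30:61:8c, ethertype 0x0a00. */
static const uint8_t frame2[20] = {
    0x45, 0x00, 0x00, 0x30, 0x61, 0x8c,
    0x00, 0xb9, 0x40, 0x01, 0x04, 0x85,
    0x0a, 0x00,
    0x00, 0x01, 0x0a, 0x00, 0x00, 0x03
};
```

Decoded this way, both frames are well-formed IPv4 headers with a valid checksum: ICMP (protocol 1) from 10.0.0.1 to 10.0.0.3, TTL 64, same IP id 0x618c, the first with total length 1500 and the more-fragments flag set, the second with total length 48 at fragment offset 1480. So the bytes tcpdump shows as MAC addresses are really the start of the IP header.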
Comment 10 Olivier Cochard freebsd_committer 2016-09-01 04:27:14 UTC
I've rebuilt a kernel with all DEBUG options enabled.
Sending just one fragmented ICMP packet (ping -c 1 -s 1500 10.0.0.3) triggers this KASSERT panic:

[root@router]~# panic: vtnet_txq_encap: no mbuf packet header!
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00003ab530
vpanic() at vpanic+0x182/frame 0xfffffe00003ab5b0
kassert_panic() at kassert_panic+0x126/frame 0xfffffe00003ab620
vtnet_txq_mq_start_locked() at vtnet_txq_mq_start_locked+0x635/frame 0xfffffe00003ab6e0
vtnet_txq_mq_start() at vtnet_txq_mq_start+0x6f/frame 0xfffffe00003ab720
bridge_enqueue() at bridge_enqueue+0x9a/frame 0xfffffe00003ab760
bridge_forward() at bridge_forward+0x322/frame 0xfffffe00003ab7c0
bridge_input() at bridge_input+0x5f4/frame 0xfffffe00003ab830
ether_nh_input() at ether_nh_input+0x2ab/frame 0xfffffe00003ab870
netisr_dispatch_src() at netisr_dispatch_src+0x80/frame 0xfffffe00003ab8d0
ether_input() at ether_input+0x62/frame 0xfffffe00003ab900
vtnet_rxq_eof() at vtnet_rxq_eof+0x835/frame 0xfffffe00003ab9b0
vtnet_rx_vq_intr() at vtnet_rx_vq_intr+0x4e/frame 0xfffffe00003ab9e0
intr_event_execute_handlers() at intr_event_execute_handlers+0x96/frame 0xfffffe00003aba20
ithread_loop() at ithread_loop+0xa6/frame 0xfffffe00003aba70
fork_exit() at fork_exit+0x84/frame 0xfffffe00003abab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00003abab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 11 tid 100025 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
Comment 11 Olivier Cochard freebsd_committer 2016-09-01 05:29:12 UTC
I've generated a core dump (with a DEBUG kernel) and looked into it:

Unread portion of the kernel message buffer:
panic: vtnet_txq_encap: no mbuf packet header!
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00003ab530
vpanic() at vpanic+0x182/frame 0xfffffe00003ab5b0
kassert_panic() at kassert_panic+0x126/frame 0xfffffe00003ab620
vtnet_txq_mq_start_locked() at vtnet_txq_mq_start_locked+0x635/frame 0xfffffe00003ab6e0
vtnet_txq_mq_start() at vtnet_txq_mq_start+0x6f/frame 0xfffffe00003ab720
bridge_enqueue() at bridge_enqueue+0x9a/frame 0xfffffe00003ab760
bridge_forward() at bridge_forward+0x322/frame 0xfffffe00003ab7c0
bridge_input() at bridge_input+0x5f4/frame 0xfffffe00003ab830
ether_nh_input() at ether_nh_input+0x2ab/frame 0xfffffe00003ab870
netisr_dispatch_src() at netisr_dispatch_src+0x80/frame 0xfffffe00003ab8d0
ether_input() at ether_input+0x62/frame 0xfffffe00003ab900
vtnet_rxq_eof() at vtnet_rxq_eof+0x835/frame 0xfffffe00003ab9b0
vtnet_rx_vq_intr() at vtnet_rx_vq_intr+0x4e/frame 0xfffffe00003ab9e0
intr_event_execute_handlers() at intr_event_execute_handlers+0x96/frame 0xfffffe00003aba20
ithread_loop() at ithread_loop+0xa6/frame 0xfffffe00003aba70
fork_exit() at fork_exit+0x84/frame 0xfffffe00003abab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00003abab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic

Reading symbols from /data/debug/boot/kernel/if_bridge.ko.debug...done.
Loaded symbols for /data/debug/boot/kernel/if_bridge.ko.debug
Reading symbols from /boot/kernel/bridgestp.ko...done.
Loaded symbols for /boot/kernel/bridgestp.ko
Reading symbols from /boot/kernel/pf.ko...done.
Loaded symbols for /boot/kernel/pf.ko
#0  doadump (textdump=0) at pcpu.h:221
221     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump (textdump=0) at pcpu.h:221
#1  0xffffffff8035512b in db_dump (dummy=<value optimized out>, dummy2=false,
    dummy3=0, dummy4=0x0)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/ddb/db_command.c:546
#2  0xffffffff80354f29 in db_command (cmd_table=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/ddb/db_command.c:453
#3  0xffffffff80354c84 in db_command_loop ()
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/ddb/db_command.c:506
#4  0xffffffff80357d2b in db_trap (type=<value optimized out>,
    code=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/ddb/db_main.c:251
#5  0xffffffff808fe593 in kdb_trap (type=<value optimized out>,
    code=<value optimized out>, tf=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/subr_kdb.c:654
#6  0xffffffff80c9993d in trap (frame=0xfffffe00003ab460)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/trap.c:556
#7  0xffffffff80c7a2d1 in calltrap ()
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff808fdc3b in kdb_enter (why=0xffffffff8118cc44 "panic",
    msg=0x80 <Address 0x80 out of bounds>) at cpufunc.h:63
#9  0xffffffff808c05ff in vpanic (fmt=<value optimized out>,
    ap=0xfffffe00003ab5f0)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_shutdown.c:752
#10 0xffffffff808c0456 in kassert_panic (fmt=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_shutdown.c:649
#11 0xffffffff807bc0d5 in vtnet_txq_mq_start_locked (txq=0xfffff80003698b00,
    m=0xfffff80003e25700)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:2185
#12 0xffffffff807bce3f in vtnet_txq_mq_start (ifp=0xfffff800036d3800,
    m=0xfffff80003e25700)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:2381
#13 0xffffffff8221b72a in bridge_enqueue (sc=0xfffff8000369d200,
    dst_ifp=<value optimized out>, m=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:1920
#14 0xffffffff8221e2c2 in bridge_forward (sc=<value optimized out>,
    sbif=<value optimized out>, m=0xfffffe00003ab410)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:2271
#15 0xffffffff8221d564 in bridge_input (ifp=<value optimized out>,
    m=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:2475
#16 0xffffffff809afc4b in ether_nh_input (m=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/if_ethersubr.c:602
#17 0xffffffff809c4cb0 in netisr_dispatch_src (proto=5, source=0,
    m=0xfffff80003e25600)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/netisr.c:1120
#18 0xffffffff809af252 in ether_input (ifp=<value optimized out>, m=0x0)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/net/if_ethersubr.c:757
#19 0xffffffff807bb675 in vtnet_rxq_eof (rxq=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:1745
#20 0xffffffff807bc69e in vtnet_rx_vq_intr (xrxq=0xfffff80003698e00)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/dev/virtio/network/if_vtnet.c:1876
#21 0xffffffff8088dde6 in intr_event_execute_handlers (
    p=<value optimized out>, ie=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_intr.c:1262
#22 0xffffffff8088e466 in ithread_loop (arg=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_intr.c:1275
#23 0xffffffff8088b4f4 in fork_exit (
    callout=0xffffffff8088e3c0 <ithread_loop>, arg=0xfffff800034c1ee0,
    frame=0xfffffe00003abac0)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/kern/kern_fork.c:1038
#24 0xffffffff80c7a80e in fork_trampoline ()
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/amd64/amd64/exception.S:611
#25 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal

=> It seems that bridge_enqueue() is handing a bad or nonexistent mbuf to the interface.

(kgdb) frame 13
#13 0xffffffff8221b72a in bridge_enqueue (sc=0xfffff8000369d200,
    dst_ifp=<value optimized out>, m=<value optimized out>)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:1920
1920                    if ((err = dst_ifp->if_transmit(dst_ifp, m))) {

=> kgdb can't display the value of m (the mbuf pointer) in this frame, but it can in the calling frame:

(kgdb) frame 14
#14 0xffffffff8221e2c2 in bridge_forward (sc=<value optimized out>,
    sbif=<value optimized out>, m=0xfffffe00003ab410)
    at /usr/local/BSDRP/BSDRP12/FreeBSD/src/sys/modules/if_bridge/../../net/if_bridge.c:2271
2271            bridge_enqueue(sc, dst_if, m);
(kgdb) print m
$1 = (struct mbuf *) 0xfffffe00003ab410

On my VMs, which use vtnet interfaces, vtnet has neither VLANTAG nor VLAN_HWTAGGING enabled:

[root@router]~# ifconfig vtnet1
vtnet1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80028<VLAN_MTU,JUMBO_MTU,LINKSTATE>

		
So bridge_enqueue() should go through this part of the code:

        /*
         * If underlying interface can not do VLAN tag insertion itself
         * then attach a packet tag that holds it.
         */
        if ((m->m_flags & M_VLANTAG) &&
            (dst_ifp->if_capenable & IFCAP_VLAN_HWTAGGING) == 0) {


I believe something is wrong here.
So I inserted an M_ASSERTPKTHDR(m);
just before line 1920: if ((err = dst_ifp->if_transmit(dst_ifp, m)))

and this new assertion is triggered:

[root@router]~# panic: bridge_enqueue: no mbuf packet header!
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00003ab630
vpanic() at vpanic+0x182/frame 0xfffffe00003ab6b0
kassert_panic() at kassert_panic+0x126/frame 0xfffffe00003ab720
bridge_enqueue() at bridge_enqueue+0x11a/frame 0xfffffe00003ab760
bridge_forward() at bridge_forward+0x322/frame 0xfffffe00003ab7c0
bridge_input() at bridge_input+0x5f4/frame 0xfffffe00003ab830
ether_nh_input() at ether_nh_input+0x2ab/frame 0xfffffe00003ab870
netisr_dispatch_src() at netisr_dispatch_src+0x80/frame 0xfffffe00003ab8d0
ether_input() at ether_input+0x62/frame 0xfffffe00003ab900
vtnet_rxq_eof() at vtnet_rxq_eof+0x835/frame 0xfffffe00003ab9b0
vtnet_rx_vq_intr() at vtnet_rx_vq_intr+0x4e/frame 0xfffffe00003ab9e0
intr_event_execute_handlers() at intr_event_execute_handlers+0x96/frame 0xfffffe00003aba20
ithread_loop() at ithread_loop+0xa6/frame 0xfffffe00003aba70
fork_exit() at fork_exit+0x84/frame 0xfffffe00003abab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00003abab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 11 tid 100025 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
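
For readers unfamiliar with the check that fires above: the transmit path expects the first mbuf of a packet to carry the packet-header flag. The following is a minimal userspace sketch of that idea only. The sk_ names, the struct layout and the flag constant are stand-ins invented for this illustration and are not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag for this sketch (hypothetical, not taken from the kernel headers). */
#define SK_M_PKTHDR 0x0002

/* Heavily simplified model of an mbuf. */
struct sk_mbuf {
    struct sk_mbuf *m_next;   /* next buffer of the same packet */
    int             m_flags;  /* SK_M_PKTHDR is set only on the packet's first buffer */
    int             m_len;    /* bytes of data held in this buffer */
};

/* True when m looks like the start of a packet. */
static bool
sk_has_pkthdr(const struct sk_mbuf *m)
{
    return (m != NULL && (m->m_flags & SK_M_PKTHDR) != 0);
}

/*
 * Model of a transmit entry point guarded the same way as the added
 * assertion: it refuses anything that is not the head of a packet.
 * Returns the length of the first buffer, just to have a visible result.
 */
static int
sk_transmit(const struct sk_mbuf *m)
{
    assert(sk_has_pkthdr(m));
    return (m->m_len);
}
```

Passing the second buffer of a packet to sk_transmit() in this model aborts, which mirrors what the new assertion reports in bridge_enqueue().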
Comment 12 Olivier Cochard freebsd_committer 2016-09-01 23:59:37 UTC
I've added some lines like:
if_printf(ifp,"[DEBUG] bridge_fragment() exiting, m_len: %d\n",m->m_len);

in the sys/net/if_bridge.c code.

Now, here is the behavior with pf-in-bridge-mode, BUT without scrub, when I generate a "ping -c 1 -s 1500":

bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() exit, dir: 2(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() exit, dir: 2(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() exit, dir: 2(IN:1/OUT:2), m_len: 1514
bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), m_len: 62
bridge0: [DEBUG] bridge_pfil() exit, dir: 2(IN:1/OUT:2), m_len: 62

=> Each packet received is transmitted as-is.


Now, here is the behavior with pf-in-bridge-mode WITH scrub:

bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 1514
pf_normalize_ip: DEBUG branch frag: 0xfffff80003e73300(m_pkthrd.len:1500)
pf_normalize_ip: reass frag 45306 @ 0-1480
pf_fillup_fragment: reass frag 45306 @ 0-1480
bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), m_len: 62
pf_normalize_ip: DEBUG branch frag: 0xfffff80003e73200(m_pkthrd.len:48)
pf_normalize_ip: reass frag 45306 @ 1480-1508
pf_fillup_fragment: reass frag 45306 @ 1480-1508
pf_isfull_fragment: 1508 < 1508?
pf_reassemble: complete: 0xfffff80003e73300(m_pkthrd.len:1528, p_len: 1528)
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), m_len: 1542
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), m_len: 1542
vtnet1: [DEBUG] bridge_fragment() entering, m_len: 1528
vtnet1: [DEBUG] bridge_fragment() exiting, m_len: 1500
panic: bridge_enqueue: no mbuf packet header!

=> Two new functions are called: pf_normalize and bridge_fragment.

Here is my interpretation of the scrub-and-bridge-mode case:
1. bridge_pfil (IN) receives the first fragment (m_len is a full-size Ethernet frame at MTU 1500 = 1514 bytes).
2. pf_normalize (scrub) detects a fragment and waits for the next one.
3. bridge_pfil (IN) receives the second fragment (m_len of a 62-byte Ethernet frame).
4. pf_normalize reassembles these 2 mbufs into one big mbuf of 1528 bytes (= 20 bytes of IP header + 1508 bytes of ICMP header + data).
5. bridge_pfil (IN) re-adds the 14 bytes of Ethernet header to this mbuf (m_len = 1542 bytes).
6. bridge_pfil (OUT) takes this mbuf (m_len = 1542), removes the Ethernet header (m_len - 14 = 1528) and calls bridge_fragment() because it is bigger than the MTU.
7. bridge_fragment must have a bug, because it reduces m_len to 1500 and tries to forward it to the NIC (it should be at least 1514, not 1500!).
8. The ASSERT I added is triggered: we can't send an mbuf without an Ethernet header to the NIC.
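
The sizes quoted in the steps above can be cross-checked with a little arithmetic. The sketch below assumes a 20-byte IP header without options, an 8-byte ICMP header and a 14-byte Ethernet header. The function and constant names are made up for this illustration.

```c
#include <assert.h>

enum {
    ETH_HDR  = 14,   /* Ethernet header */
    IP_HDR   = 20,   /* IPv4 header without options (assumption) */
    ICMP_HDR = 8     /* ICMP echo header */
};

/* Largest IP payload one fragment can carry: fits the MTU, multiple of 8. */
static int
frag_payload_max(int mtu)
{
    assert(mtu > IP_HDR);
    return (((mtu - IP_HDR) / 8) * 8);
}

/* Number of fragments needed to carry ip_payload bytes. */
static int
frag_count(int ip_payload, int mtu)
{
    int per = frag_payload_max(mtu);

    return ((ip_payload + per - 1) / per);
}

/* Ethernet frame length of fragment number i (0-based). */
static int
frag_frame_len(int ip_payload, int mtu, int i)
{
    int per = frag_payload_max(mtu);
    int remaining = ip_payload - i * per;
    int chunk = remaining < per ? remaining : per;

    return (ETH_HDR + IP_HDR + chunk);
}
```

With "ping -s 1500" the IP payload is 1500 + 8 = 1508 bytes, so at MTU 1500 this gives two fragments of 1514 and 62 bytes on the wire, a reassembled IP packet of 20 + 1508 = 1528 bytes, and 1542 bytes once the Ethernet header is put back. These are the values that show up in the debug output.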
Comment 13 Olivier Cochard freebsd_committer 2016-09-02 11:57:45 UTC
Funny: after lots of printf() debugging, it turns out that the first function I suspected as the source of the panic is the one corrupting my mbuf/packet:
in bridge_fragment(): M_PREPEND(m0, ETHER_HDR_LEN, M_NOWAIT);

Here is the new debug output:

bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), frag :0xfffff8000386e800(m_len: 1514)
pf_normalize_ip: DEBUG branch frag: 0xfffff8000386e800(m_pkthrd.len:1500)
pf_normalize_ip: reass frag 44538 @ 0-1480
pf_fillup_fragment: reass frag 44538 @ 0-1480
bridge0: [DEBUG] bridge_pfil() enter, dir: 1(IN:1/OUT:2), frag :0xfffff8000386e700(m_len: 62)
pf_normalize_ip: DEBUG branch frag: 0xfffff8000386e700(m_pkthrd.len:48)
pf_normalize_ip: reass frag 44538 @ 1480-1508
pf_fillup_fragment: reass frag 44538 @ 1480-1508
pf_isfull_fragment: 1508 < 1508?
pf_reassemble: complete: 0xfffff8000386e800(m_pkthrd.len:1528, p_len: 1528)
bridge0: [DEBUG] bridge_pfil() exit, dir: 1(IN:1/OUT:2), frag: 0xfffff8000386e800(m_len: 1542)
bridge0: [DEBUG] bridge_pfil() enter, dir: 2(IN:1/OUT:2), frag :0xfffff8000386e800(m_len: 1542)
vtnet1: [DEBUG] bridge_fragment() entering, frag:0xfffff8000386e800(m_len: 1528), ether_dhost : 58:9c:fc:02:03:03
vtnet1: [DEBUG] bridge_fragment() after ip_fragment, first mbuf in chain is frag:0xfffff8000386e800(m_len: 1500), second is 0xfffff80003796c00(m_len: 20)
vtnet1: [DEBUG] bridge_fragment() walking chain, frag m0:0xfffff8000386e800(m_len: 1500), frag m:0xfffff8000386e800(m_len: 1500)
vtnet1: [DEBUG] bridge_fragment() walking chain after M_PREPEND, frag m0:0xfffff80003796d00(m_len: 14), frag m:0xfffff8000386e800(m_len: 1500)
vtnet1: [DEBUG] bridge_fragment() walking chain after bcopy, frag m0:0xfffff80003796d00(m_len: 14), frag m:0xfffff8000386e800(m_len: 1500)
vtnet1: [DEBUG] bridge_fragment() exiting, m_len: 1500
panic: bridge_enqueue: no mbuf packet header!


=> Before calling M_PREPEND, there is an mbuf chain:
- the first element is 1500 bytes long
- the second element is 20 bytes long
Then we need to add ETHER_HDR_LEN to the beginning of the first element:
after M_PREPEND, the 1500-byte mbuf should be 1514 bytes long, but we get a 14-byte mbuf instead!
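
The 14-byte result is what a prepend does when the first buffer has no free room in front of its data: a new buffer is allocated, it becomes the head of the chain, and the old buffer is left untouched behind it. Here is a minimal userspace model of that behaviour. The sk_ names and the struct are invented for this sketch, and the real macro works on kernel mbufs, not on this structure.

```c
#include <assert.h>
#include <stdlib.h>

/* Heavily simplified model of an mbuf (illustrative only). */
struct sk_mbuf {
    struct sk_mbuf *m_next;      /* next buffer of the same packet */
    int             m_len;       /* bytes of data in this buffer */
    int             leading;     /* free bytes in front of the data */
    int             has_pkthdr;  /* 1 on the first buffer of a packet */
};

/*
 * Model of a prepend: grow the buffer in place when there is room,
 * otherwise allocate a new head buffer that holds only the new bytes.
 * Returns the (possibly new) head, or NULL on allocation failure.
 */
static struct sk_mbuf *
sk_prepend(struct sk_mbuf *m, int len)
{
    struct sk_mbuf *mn;

    assert(m != NULL && len > 0);
    if (m->leading >= len) {
        m->leading -= len;
        m->m_len += len;
        return (m);
    }
    mn = calloc(1, sizeof(*mn));
    if (mn == NULL)
        return (NULL);
    mn->m_next = m;
    mn->m_len = len;
    /* The packet header moves to the new first buffer. */
    mn->has_pkthdr = m->has_pkthdr;
    m->has_pkthdr = 0;
    return (mn);
}

/* Total data length of one packet (sum over its buffers). */
static int
sk_chain_len(const struct sk_mbuf *m)
{
    int total = 0;

    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return (total);
}
```

In this model the chain as a whole is 1514 bytes long after the prepend, but only when it is reached through the returned pointer. A caller that keeps using the old pointer still sees a 1500-byte buffer that no longer has a packet header, which is consistent with the "exiting, m_len: 1500" line followed by the panic.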
Comment 14 Olivier Cochard freebsd_committer 2016-09-04 14:36:11 UTC
Patch proposed here:
https://reviews.freebsd.org/D7780
Comment 15 commit-hook freebsd_committer 2016-09-24 07:10:29 UTC
A commit references this bug:

Author: kp
Date: Sat Sep 24 07:09:43 UTC 2016
New revision: 306289
URL: https://svnweb.freebsd.org/changeset/base/306289

Log:
  bridge: Fix fragment handling and memory leak

  Fragmented UDP and ICMP packets were corrupted if a firewall with a reassembling
  feature (like pf's scrub) is enabled on the bridge.  This patch fixes the corrupted
  packet problem and the panic (triggered easily with low RAM) as explained in PR
  185633.

  bridge_pfil and bridge_fragment relationship:

  bridge_pfil() receives (IN direction) packets and sends them to the firewall.
  The firewall can be configured to reassemble fragmented packets (like pf's
  scrubbing) into one mbuf chain.  When bridge_pfil() needs to send this
  reassembled packet to the outgoing interface, it has to re-fragment it using
  bridge_fragment().  bridge_fragment() has to split this mbuf (using
  ip_fragment) first, then M_PREPEND each packet in the mbuf chain to add the
  Ethernet header.

  But M_PREPEND can sometimes create a new mbuf at the beginning of the mbuf
  chain, in which case the "main" pointer of this mbuf chain must be updated,
  and this case was totally forgotten.  The original bridge_fragment code
  (Revision 158140, 2006 April 29) came from OpenBSD, where the call to
  bridge_enqueue was embedded.  But on FreeBSD, bridge_enqueue() is done after
  bridge_fragment(), so the original OpenBSD code can't work as-is on FreeBSD.

  PR:		185633
  Submitted by:	Olivier Cochard-Labbé
  Differential Revision:	https://reviews.freebsd.org/D7780

Changes:
  head/sys/net/if_bridge.c
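
To illustrate the pointer bookkeeping the log describes, here is a userspace sketch of prepending a header to every packet in a fragment list while keeping the list head and the per-packet links valid. It is not the committed patch. The sk_ names and the struct are invented for this example, and in this model a newly allocated head buffer starts with no packet link, so the caller relinks it.

```c
#include <assert.h>
#include <stdlib.h>

/* Heavily simplified model of an mbuf (illustrative only). */
struct sk_mbuf {
    struct sk_mbuf *m_next;     /* next buffer of the same packet */
    struct sk_mbuf *m_nextpkt;  /* next packet (fragment) in the list */
    int             m_len;      /* bytes of data in this buffer */
    int             leading;    /* free bytes in front of the data */
};

/* Model of a prepend: in place if there is room, else a new head buffer. */
static struct sk_mbuf *
sk_prepend(struct sk_mbuf *m, int len)
{
    struct sk_mbuf *mn;

    if (m->leading >= len) {
        m->leading -= len;
        m->m_len += len;
        return (m);
    }
    mn = calloc(1, sizeof(*mn));
    if (mn == NULL)
        return (NULL);
    mn->m_next = m;
    mn->m_len = len;
    return (mn);
}

/*
 * Prepend len bytes to every packet of the list *headp.  Each time the
 * prepend returns a different buffer, the link that pointed at the old
 * one (the list head or the previous packet's m_nextpkt) is rewritten.
 * Returns 0 on success, -1 on allocation failure (this model does not
 * free anything in that case).
 */
static int
sk_prepend_all(struct sk_mbuf **headp, int len)
{
    struct sk_mbuf **linkp = headp;
    struct sk_mbuf *cur, *next, *fixed;

    assert(headp != NULL);
    for (cur = *headp; cur != NULL; cur = next) {
        next = cur->m_nextpkt;          /* remember the following packet */
        fixed = sk_prepend(cur, len);
        if (fixed == NULL)
            return (-1);
        if (fixed != cur)
            cur->m_nextpkt = NULL;      /* old head is now an inner buffer */
        fixed->m_nextpkt = next;
        *linkp = fixed;                 /* update head or previous link */
        linkp = &fixed->m_nextpkt;
    }
    return (0);
}
```

The pointer-to-pointer (linkp) is one way to treat the list head and the previous packet's link uniformly. An explicit "previous packet" variable would work just as well.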
Comment 16 commit-hook freebsd_committer 2016-10-02 21:07:33 UTC
A commit references this bug:

Author: kp
Date: Sun Oct  2 21:06:55 UTC 2016
New revision: 306593
URL: https://svnweb.freebsd.org/changeset/base/306593

Log:
  MFC r306289:

  bridge: Fix fragment handling and memory leak

  PR:             185633
  Submitted by:   Olivier Cochard-Labbé

Changes:
_U  stable/11/
  stable/11/sys/net/if_bridge.c
Comment 17 commit-hook freebsd_committer 2016-10-02 21:11:36 UTC
A commit references this bug:

Author: kp
Date: Sun Oct  2 21:11:25 UTC 2016
New revision: 306594
URL: https://svnweb.freebsd.org/changeset/base/306594

Log:
  MFC r306289:

  bridge: Fix fragment handling and memory leak

  PR:             185633
  Submitted by:   Olivier Cochard-Labbé

Changes:
_U  stable/10/
  stable/10/sys/net/if_bridge.c