Bug 200330 - panic: pf_addr_cmp: unknown address family 0 when scrub fragment drop-ovl is used
Summary: panic: pf_addr_cmp: unknown address family 0 when scrub fragment drop-ovl is ...
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: Kristof Provost
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-05-19 14:58 UTC by Danilo Egea Gondolfo
Modified: 2015-06-18 21:24 UTC (History)
2 users (show)

See Also:


Attachments
core.txt (224.91 KB, text/plain)
2015-05-19 14:58 UTC, Danilo Egea Gondolfo
no flags Details
kernel conf (7.49 KB, text/plain)
2015-05-21 15:02 UTC, Danilo Egea Gondolfo
no flags Details
core2.txt (255.77 KB, text/plain)
2015-05-22 22:46 UTC, Danilo Egea Gondolfo
no flags Details
Possible patch (451 bytes, patch)
2015-06-07 03:37 UTC, Kristof Provost
no flags Details | Diff
new core dump (243.11 KB, text/plain)
2015-06-09 21:15 UTC, Danilo Egea Gondolfo
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Danilo Egea Gondolfo freebsd_committer freebsd_triage 2015-05-19 14:58:30 UTC
Created attachment 156944 [details]
core.txt

After few seconds/minutes using the rule "scrub in all fragment drop-ovl" the system is crashing.
Comment 1 Kristof Provost freebsd_committer freebsd_triage 2015-05-21 07:52:07 UTC
This is going to be related to the IPv6 fragment handling work done recently.

I've had a quick try at reproducing this and so far have had no success. Can you post your kernel config?

Do you happen to know if you've got anything unusual going on traffic-wise?
Comment 2 Danilo Egea Gondolfo freebsd_committer freebsd_triage 2015-05-21 15:01:44 UTC
I was looking for strange traffic, but until now it's normal. If you look at the netstat section in the core.txt there is the line:
"3 packets for unknown/unsupported protocol"

Maybe pf is processing these packets...

I'm tcpdumping my NIC for non ip packets. I'm using a cheap USB wifi adapter, maybe it's defective...
Comment 3 Danilo Egea Gondolfo freebsd_committer freebsd_triage 2015-05-21 15:02:34 UTC
Created attachment 157012 [details]
kernel conf
Comment 4 Danilo Egea Gondolfo freebsd_committer freebsd_triage 2015-05-22 22:44:42 UTC
After 7 hours using the system, with an ethernet cable this time, I've got the panic again.
Comment 5 Danilo Egea Gondolfo freebsd_committer freebsd_triage 2015-05-22 22:46:06 UTC
Created attachment 157060 [details]
core2.txt
Comment 6 Danilo Egea Gondolfo freebsd_committer freebsd_triage 2015-05-22 23:10:16 UTC
I've isolated the mbuf data. Parsing the IP header we have:
Version: 4
Hlen: 5
ToS: 0
Len: 173
Id: 54533
Off: 185
TTL: 52
P: 17
Sum: f4ad
Src: 8.8.8.8
Dst: 172.16.255.179

Curiously, the fragment in the previous panic had the same Length and, I think, the same Offset.
Comment 7 Kristof Provost freebsd_committer freebsd_triage 2015-06-07 03:37:19 UTC
Created attachment 157492 [details]
Possible patch

I've had a look, and I think I know why you're seeing this panic.

Can you try this patch?

Basically the issue is that if drop-ovl is set we take a different path, which leads through pf_fragcache(). In pf_fragcache() we create a pf_frag but fail to set the address family, which leads to the panic you see when we try to insert it (and compare with pf_addr_cmp()).
Comment 8 Danilo Egea Gondolfo freebsd_committer freebsd_triage 2015-06-07 18:31:07 UTC
I'm testing. Thanks!
Comment 9 Danilo Egea Gondolfo freebsd_committer freebsd_triage 2015-06-09 21:14:10 UTC
Panic again, but due a page fault now.
Comment 10 Danilo Egea Gondolfo freebsd_committer freebsd_triage 2015-06-09 21:15:59 UTC
Created attachment 157587 [details]
new core dump
Comment 11 Kristof Provost freebsd_committer freebsd_triage 2015-06-09 21:20:44 UTC
Thanks for testing. I'll already commit the first fix, because it clearly fixes part of the problem you're seeing.

I'll also see if I can reproduce the problem and/or figure out what's happening now.
Comment 12 commit-hook freebsd_committer freebsd_triage 2015-06-10 13:44:18 UTC
A commit references this bug:

Author: kp
Date: Wed Jun 10 13:44:04 UTC 2015
New revision: 284222
URL: https://svnweb.freebsd.org/changeset/base/284222

Log:
  pf: address family must be set when creating a pf_fragment

  Fix a panic when handling fragmented ip4 packets with 'drop-ovl' set.
  In that scenario we take a different branch in pf_normalize_ip(), taking us to
  pf_fragcache() (rather than pf_reassemble()). In pf_fragcache() we create a
  pf_fragment, but do not set the address family. This leads to a panic when we
  try to insert that into pf_frag_tree because pf_addr_cmp(), which is used to
  compare the pf_fragments doesn't know what to do if the address family is not
  set.

  Simply ensure that the address family is set correctly (always AF_INET in this
  path).

  PR:			200330
  Differential Revision:	https://reviews.freebsd.org/D2769
  Approved by:		philip (mentor), gnn (mentor)

Changes:
  head/sys/netpfil/pf/pf_norm.c
Comment 13 Kristof Provost freebsd_committer freebsd_triage 2015-06-11 17:01:22 UTC
I've not yet been able to reproduce your second panic. As of r284260 things appear to be working for me. Is this something you can easily reproduce?

Can you also explain your use case for 'fragment drop-ovl'? 

Looking at what it actually does (which is different from what unsuspecting users - or developers - might expect it do do), I'm very tempted to just propose removing it alltogether.
Comment 14 Danilo Egea Gondolfo freebsd_committer freebsd_triage 2015-06-11 17:29:20 UTC
I was using this option in a low traffic network. The last panic happened after some hours of uptime. Nothing special. My use case for "drop-ovl" is just "testing", I don't need to use this option actually, I was just testing when I discovered the problem. Does the "fragment reassemble" option do the same work? After the last panic I've deactivated the option, I'm using it again to try to reproduce the panic. Thanks for your work!
Comment 15 Kristof Provost freebsd_committer freebsd_triage 2015-06-11 17:50:32 UTC
The problem I have with 'drop-ovl' (and the same applies to 'crop') is that it's a bit misleading.
They don't actually reassemble the fragmented packet. 

That means a far lower system load, but you can't actually filter in-depth. For example, with 'scrub all fragment reassemble' you can do 'pass in inet proto icmp all icmp-type echoreq keep state', but with 'scrub all fragment drop-ovl' you'll end up just dropping a fragmented ICMP echo request. An unfragmented ICMP echo request will just get through.
That goes against what I'd expect.

I think you (and really all pf users) want to be using 'scrub in all fragment reassemble'.

(That said, if you can break it, please keep using drop-ovl, so I can debug it ;))
Comment 16 Danilo Egea Gondolfo freebsd_committer freebsd_triage 2015-06-11 17:59:16 UTC
Yes, I was expecting that drop-ovl just drops packets with fragments that has the field "fragment offset" overlapping other fragments. My question is: Can (or should) "fragment reassemble" drop these packets too?

I'll keep using it to see what happens. Thanks!
Comment 17 Kristof Provost freebsd_committer freebsd_triage 2015-06-11 18:08:03 UTC
'reassemble' does the right thing, in that it will fully reassemble the packet. It handles overlaps, by discarding the (parts of) packets it's already seen.

Processing continues with the full packet, not on a fragment-per-fragment basis.

When filtering input packets that's where it ends, because the host would have to reassemble anyway.

When forwarding we also just continue with the full packet, and fragment again when transmitting. This implies that it's possible that we'll receive 5 packets of 80 bytes, but we transmit one packet of 400 bytes. 

(Note that this is slightly different for IPv6. There we'll always refragment to the size of the largest fragment we received so we don't break path MTU.)
Comment 18 Kristof Provost freebsd_committer freebsd_triage 2015-06-12 16:59:29 UTC
This issue should be fixed as of r284260, so I'm closing this bug.

If you see any other problems please reopen (or submit a new bug).
Comment 19 commit-hook freebsd_committer freebsd_triage 2015-06-18 21:24:35 UTC
A commit references this bug:

Author: kp
Date: Thu Jun 18 21:23:42 UTC 2015
New revision: 284580
URL: https://svnweb.freebsd.org/changeset/base/284580

Log:
  Merge r284222, r284260

  pf: address family must be set when creating a pf_fragment

  Fix a panic when handling fragmented ip4 packets with 'drop-ovl' set.
  In that scenario we take a different branch in pf_normalize_ip(), taking us to
  pf_fragcache() (rather than pf_reassemble()). In pf_fragcache() we create a
  pf_fragment, but do not set the address family. This leads to a panic when we
  try to insert that into pf_frag_tree because pf_addr_cmp(), which is used to
  compare the pf_fragments doesn't know what to do if the address family is not
  set.

  Simply ensure that the address family is set correctly (always AF_INET in this
  path).

  When we try to look up a pf_fragment with pf_find_fragment() we compare (see
  pf_frag_compare()) addresses (and family), but also protocol.  We failed to
  save the protocol to the pf_fragment in pf_fragcache(), resulting in failing
  reassembly.

  PR:		200330
  Differential Revision:	https://reviews.freebsd.org/D2824
  Reviewed by:	gnn

Changes:
_U  stable/10/
  stable/10/sys/netpfil/pf/pf_norm.c