Bug 201590 - Zerowindow packets escape stateful in-kernel NAT
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.1-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-net mailing list
Reported: 2015-07-15 15:49 UTC by g_amanakis
Modified: 2018-07-08 14:43 UTC

Description g_amanakis 2015-07-15 15:49:34 UTC
According to the NAT example in the handbook (https://www.freebsd.org/doc/handbook/firewalls-ipfw.html), the inbound NAT rule should be placed first (rule 00400 below), followed by the outbound NAT rule (rule 24000 below):

-------8<--------
ipfw nat 123 config ip xxx.xxx.xxx.xxx same_ports reset

00100 reass ip from any to any in
00200 allow ip from any to any via lo0
00300 allow ip from any to any via em1
00400 nat 123 ip from any to any in recv em0
00500 check-state
00600 skipto 24000 ip from any to me dst-port 80,443,22,500,4500,1194,993,8112 in recv em0 keep-state
00700 skipto 24000 ip from any to any out xmit em0 keep-state
00800 deny log ip from any to any
24000 nat 123 ip from any to any out xmit em0
24100 allow ip from any to any
-------8<--------

However, this lets some packets (TCP ZeroWindow packets) escape NAT, for reasons I do not understand, so the IPs of LAN hosts (behind NAT) are exposed on the external interface (where NAT is performed).

When the NAT rules are placed in the opposite order (i.e. the outbound rule first and then the inbound rule), the problem disappears.

-------8<--------
    ipfw nat 123 config ip xxx.xxx.xxx.xxx same_ports reset

    00100 reass ip from any to any in
    00200 allow ip from any to any via lo0
    00300 allow ip from any to any via em1
    00400 nat 123 ip from any to any out xmit em0
    00500 check-state
    00600 skipto 24000 ip from any to me dst-port 80,443,22,500,4500,1194,993,8112 in recv em0 keep-state
    00700 skipto 24000 ip from any to any out xmit em0 keep-state
    00800 deny log ip from any to any
    24000 nat 123 ip from any to any in recv em0
    24100 allow ip from any to any
-------8<--------

The bug is the unexpected behaviour: TCP ZeroWindow packets should not escape NAT in the first case.

See https://forums.freebsd.org/threads/ipfw-keep-state-and-in-kernel-nat-exposes-local-ip-on-external-interface.52134/

See https://forums.freebsd.org/threads/some-ip-frames-not-nated-with-ipfw-natd.51015/
Comment 1 g_amanakis 2015-07-21 14:45:30 UTC
I forgot to mention that the following sysctl is set:
net.inet.ip.fw.one_pass=0
Comment 2 Ben Woods freebsd_committer 2015-07-22 15:42:36 UTC
I can confirm I am also seeing some local network addresses escape out to the Internet when using IPFW with in-kernel NAT. Indeed it appears to be the ZeroWindow packets.

# tcpdump -n -e -ttt -i tun0 src net 192.168.0.0/16
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tun0, link-type NULL (BSD loopback), capture size 262144 bytes
00:00:00.000000 AF IPv4 (2), length 44: 192.168.1.103.53186 > 216.58.220.142.443: Flags [.], ack 922876993, win 0, length 0

I am using FreeBSD 11-current r285792, which is current as of today.

My IPFW rules also have the inbound NAT rule before the outbound NAT rule as per the examples in the handbook.
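
For anyone trying to reproduce this, here is a minimal sketch of a filter that picks such leaks out of tcpdump text output. The function name find_zerowindow_leaks is hypothetical, and the sketch assumes the one-line output format shown above (source address directly before ">", window printed as "win N,"). It only looks at RFC 1918 source addresses, so adjust it if your LAN uses something else.

```shell
#!/bin/sh
# Hypothetical helper: read tcpdump text output on stdin and print only the
# lines that are zero-window TCP packets with an RFC 1918 source address.
find_zerowindow_leaks() {
    awk '
    / win 0,/ {
        # The source "addr.port" token is the field right before ">".
        for (i = 2; i <= NF; i++) {
            if ($i == ">") {
                src = $(i - 1)
                if (src ~ /^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)/)
                    print
                break
            }
        }
    }'
}
```

It could be fed from a live capture on the external interface, e.g. `tcpdump -l -n -e -ttt -i tun0 | find_zerowindow_leaks` (tun0 is the interface from the capture above; -l keeps the output line-buffered).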
Comment 3 Ben Woods freebsd_committer 2015-07-22 15:44:11 UTC
I also have the following in my /etc/sysctl.conf to allow packets to have more than 1 pass through the firewall (for in-kernel NAT):

net.inet.ip.fw.one_pass=0
Comment 4 g_amanakis 2015-07-22 22:32:18 UTC
I think it has to do with the keepalives generated in ip_fw_dynamic.c.
The packets go through ip_output(), and this may be the reason they are not NATed. This is just my impression from skimming through the code.
Comment 5 g_amanakis 2015-07-22 23:08:39 UTC
Setting net.inet.ip.fw.dyn_keepalive=0 resolves the problem.
However, the bug remains, as the keepalive packets should be NATed in the first place.
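
A sketch of how the workaround could be made persistent, for anyone who wants it to survive a reboot. The helper name set_conf_value is hypothetical; it rewrites a sysctl.conf-style file so that it contains exactly one assignment for the given key. The file path defaults to /etc/sysctl.conf, which is an assumption about where you keep your settings.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a sysctl.conf-style file, replacing
# any existing assignment of KEY. Usage: set_conf_value KEY VALUE [FILE]
set_conf_value() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        # Keep every line that does not start with "KEY=" (literal match).
        awk -v k="${key}=" 'index($0, k) != 1' "$file" > "$tmp"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

At runtime the setting is applied with `sysctl net.inet.ip.fw.dyn_keepalive=0`; running `set_conf_value net.inet.ip.fw.dyn_keepalive 0` as root would then record it for the next boot.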
Comment 6 smithi 2015-07-23 11:46:25 UTC
 > 00100 reass ip from any to any in
 > 00200 allow ip from any to any via lo0
 > 00300 allow ip from any to any via em1
 > 00400 nat 123 ip from any to any in recv em0
 > 00500 check-state

I think the problem here is with rule 300; this is (yet another) statement
in the (now ancient) handbook ipfw examples that makes no sense, despite
some good work towards cleaning it up over the last year or so.

Assuming em1 is the internal interface, and that's where your keepalive
packets originate, then they are allowed to pass (before NAT) on the way
in.  That's ok in one way, as NAT only needs to be done on the way out.
The kernel routes these, then ipfw is again invoked on their way out.

Because of the use of 'via' here, meaning that the receive iface is
em1 on the way in, and is STILL the receive iface when on the way out,
and 'via iface' is true on outbound packets if EITHER the recv OR xmit
iface matches, once again these packets are allowed to pass; before NAT,
and also before check-state.  Hence they appear on the outside interface
with their original (private) source addresses, and statelessly as well.

Personally, I can't see the use for such a rule in any ruleset.  The
(better) examples in /etc/rc.firewall (here from 'client') are:

        # Allow any traffic to or from my own net.
        ${fwcmd} add pass all from me to ${net}
        ${fwcmd} add pass all from ${net} to me

.. but these only refer to traffic between this host and internal net.
Meanwhile, 'simple' is a better and far more thorough small-net ruleset.
Alternatively, consider using explicit 'recv' and 'xmit' (or both!) on
rules so it's always clear; refer to ipfw(8) "recv | xmit | via" section.

It's a shame we don't have any good examples of a ruleset like 'simple'
that include at least some stateful rules, to better show a) where NAT
should be done and b) where check-state should be first used, especially
where both are used together.  No, I'm not sure about that either .. but
it seems clear rule 300 is avoiding most of the ruleset, in and outbound.

So yes, keepalive packets should be NAT'd .. so don't pass them before NAT!
Comment 7 g_amanakis 2015-07-23 15:38:46 UTC
(In reply to smithi from comment #6)
I think this has nothing to do with the local interface, simply because the keepalive packets are generated by the *gateway* itself in ipfw_dyn_send_ka(). Commenting out the function resolves the symptoms. The actual sending takes place in check_dyn_rules() through ip_output().

The keepalive seems to be generated by the gateway on the basis of the dynamic rule, and the dynamic rule is created before the outgoing NAT takes place, i.e. with the LAN IP.
Comment 8 g_amanakis 2015-07-23 15:43:58 UTC
Perhaps the culprit is ipfw_send_pkt(), the helper used by ipfw_dyn_send_ka().
There the following happens:
	m->m_flags |= M_SKIP_FIREWALL;
I will try commenting out the line and see if this resolves it.
Comment 9 g_amanakis 2015-07-23 15:56:53 UTC
(In reply to g_amanakis from comment #8)
This poses another problem. Commenting out the line will probably lead to these packets being rejected from the LAN, as they originated at the gateway. This raises the question of whether net.inet.ip.fw.dyn_keepalive should be enabled on a gateway in the first place.