Bug 219803 - [patch] PF: implement RFC 4787 REQ 1 and 3 (full cone NAT)
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Many People
Assignee: freebsd-pf mailing list
Keywords: patch
Reported: 2017-06-05 17:48 UTC by Damjan Jovanovic
Modified: 2017-06-16 14:28 UTC (History)

Attachments
pf RFC 4787 req 1 and 3 implementation (18.40 KB, patch)
2017-06-05 17:48 UTC, Damjan Jovanovic
pf RFC 4787 req 1 and 3 implementation, version 2 (24.54 KB, patch)
2017-06-16 04:44 UTC, Damjan Jovanovic
pf RFC 4787 req 1 and 3 implementation, version 3 (18.97 KB, patch)
2017-06-16 14:28 UTC, Damjan Jovanovic

Description Damjan Jovanovic 2017-06-05 17:48:30 UTC
Created attachment 183243
pf RFC 4787 req 1 and 3 implementation

This patch implements RFC 4787 requirements 1 and 3, changing PF's allocation of NAT mappings for UDP from the current "symmetric" NAT to an "endpoint-independent mapping" NAT, a.k.a. "full cone" NAT. All UDP packets from the internal IP:port X:x go through the same external Y:y no matter the destination Z:z, and nothing but X:x uses Y:y.

Internal             External
X:x  -----> NAT Y:y ----> Z:z

The implementation is relatively straightforward. Each pf_state for a UDP connection now references a pf_udp_mapping, which is reference counted and kept alive as long as at least 1 pf_state references it. Every new NAT mapping first tries to find an existing pf_udp_mapping by its source X:x endpoint and reuses its external Y:y; failing that, it creates a new one through an unused Y:y. Only the allocation of NAT mappings is changed. Each X:x <-> Z:z still has its own distinct connection state (struct pf_state) and behaves the same as before.
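
The lookup-or-create step described above can be sketched roughly as follows. This is not code from the patch: the struct layout, the fixed-size table, FIRST_PORT and the names mapping_acquire/mapping_release are all hypothetical, and locking, timeouts and pf's real hash tables are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types; not the structures used by the patch. */
struct endpoint {
	uint32_t addr;	/* IPv4 address */
	uint16_t port;
};

struct udp_mapping {
	struct endpoint internal;	/* X:x */
	struct endpoint external;	/* Y:y */
	unsigned refs;			/* states using this mapping; 0 means free slot */
};

#define MAX_MAPPINGS	64
#define FIRST_PORT	50001	/* assumed start of the external port range */

static struct udp_mapping table[MAX_MAPPINGS];

/* Is Y:y already taken by a live mapping? */
static int
port_in_use(uint32_t ext_addr, uint16_t port)
{
	for (size_t i = 0; i < MAX_MAPPINGS; i++)
		if (table[i].refs > 0 && table[i].external.addr == ext_addr &&
		    table[i].external.port == port)
			return (1);
	return (0);
}

/* Find the mapping for X:x and take a reference, or create a new one. */
struct udp_mapping *
mapping_acquire(struct endpoint src, uint32_t ext_addr)
{
	struct udp_mapping *free_slot = NULL;

	for (size_t i = 0; i < MAX_MAPPINGS; i++) {
		struct udp_mapping *m = &table[i];

		if (m->refs == 0) {
			if (free_slot == NULL)
				free_slot = m;
			continue;
		}
		if (m->internal.addr == src.addr &&
		    m->internal.port == src.port) {
			m->refs++;	/* reuse the existing Y:y */
			return (m);
		}
	}
	if (free_slot == NULL)
		return (NULL);
	/* No mapping for X:x yet: pick an external port nobody else uses. */
	for (uint32_t p = FIRST_PORT; p <= 65535; p++) {
		if (port_in_use(ext_addr, (uint16_t)p))
			continue;
		free_slot->internal = src;
		free_slot->external.addr = ext_addr;
		free_slot->external.port = (uint16_t)p;
		free_slot->refs = 1;
		return (free_slot);
	}
	return (NULL);
}

/* Drop one reference; the slot becomes reusable when the count hits 0. */
void
mapping_release(struct udp_mapping *m)
{
	assert(m != NULL && m->refs > 0);
	m->refs--;
}
```

In this sketch, two states created for the same X:x share one mapping and therefore one Y:y, while a different internal endpoint gets a different external port.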

Currently, a Z:z can transmit back to X:x through Y:y only if X:x previously transmitted to it, i.e. it behaves as a "port-restricted cone" NAT (or endpoint-independent mapping NAT with address- and port-dependent filtering, as per RFC 4787), but I am working on that too.

This should fix STUN and vastly improve UDP applications using PF's NATing such as gaming, VoIP, WebRTC, peer-to-peer applications, etc.

Are the NAT implementations in our other firewalls also "symmetric" NATs?
Comment 1 Kristof Provost freebsd_committer 2017-06-06 03:07:48 UTC
Can you go into a bit more detail about what problem you're trying to solve with this patch, in its current state?

I think I can see the use case for the extended patch (where you allow packets from any Z:z back to X:x), but I'd certainly object to that, as it's a significant behaviour change. It removes the accidental stateful firewall aspect of the NAT implementation.
Comment 2 Damjan Jovanovic 2017-06-06 04:30:12 UTC
In its current state, the patch provides applications with a NAT hole-punching capability. Unlike with a symmetric NAT, behind any cone-type NAT an internal UDP application can negotiate to receive packets from a known peer. It uses STUN to create an external IP:port for its UDP socket and discover what they are, communicates them to its peer, and learns what external IP:port its peer is using. Even if it's behind the most restrictive "port-restricted cone" NAT, it can then just send 1 packet to its peer's IP:port to create a connection and allow that peer to send packets back.

This works even if both peers are NATed, as long as at least 1 (the server) is not behind a symmetric NAT.
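
To make the difference concrete, here is a toy model (not PF code) of one NAT seen from the inside host. All names are hypothetical; next_port is an assumed allocator, and filtering is always address- and port-dependent, as in the current patch. With endpoint-independent mapping, the port learned via STUN is the same port used toward the peer, so 1 outbound packet opens the way back; with a symmetric NAT the peer would be replying to a port that was only ever used toward the STUN server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct endpoint {
	uint32_t addr;
	uint16_t port;
};

enum nat_kind { NAT_SYMMETRIC, NAT_ENDPOINT_INDEPENDENT };

/* One translated flow X:x -> Z:z leaving through external port y. */
struct binding {
	struct endpoint src, dst;
	uint16_t ext_port;
};

#define MAX_BINDINGS	16

struct nat {
	enum nat_kind kind;
	struct binding bindings[MAX_BINDINGS];
	size_t count;
	uint16_t next_port;	/* next unused external port (toy allocator) */
};

static bool
ep_eq(struct endpoint a, struct endpoint b)
{
	return (a.addr == b.addr && a.port == b.port);
}

/* Translate an outbound packet; return the external port it leaves from. */
uint16_t
nat_send(struct nat *nat, struct endpoint src, struct endpoint dst)
{
	uint16_t reuse = 0;

	for (size_t i = 0; i < nat->count; i++) {
		const struct binding *b = &nat->bindings[i];

		if (!ep_eq(b->src, src))
			continue;
		if (ep_eq(b->dst, dst))
			return (b->ext_port);	/* flow already exists */
		if (nat->kind == NAT_ENDPOINT_INDEPENDENT)
			reuse = b->ext_port;	/* same X:x, so same Y:y */
	}
	assert(nat->count < MAX_BINDINGS);
	struct binding *nb = &nat->bindings[nat->count++];
	nb->src = src;
	nb->dst = dst;
	nb->ext_port = (reuse != 0) ? reuse : nat->next_port++;
	return (nb->ext_port);
}

/* Accept inbound traffic only from a Z:z that X:x already sent to. */
bool
nat_receive(const struct nat *nat, struct endpoint from, uint16_t ext_port)
{
	for (size_t i = 0; i < nat->count; i++)
		if (nat->bindings[i].ext_port == ext_port &&
		    ep_eq(nat->bindings[i].dst, from))
			return (true);
	return (false);
}
```

A client would call nat_send() toward the STUN server to learn its external port, then nat_send() once toward the peer; only in the NAT_ENDPOINT_INDEPENDENT case do both calls return the same port, after which nat_receive() accepts the peer's replies on it.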
Comment 3 Kristof Provost freebsd_committer 2017-06-07 18:14:09 UTC
I see. I think that makes sense, but I'll need a bit of time to review this.
It probably wouldn't hurt if you could explain this on freebsd-pf@ so others can take a look too.
Comment 4 Kristof Provost freebsd_committer 2017-06-10 17:26:41 UTC
I've created a Phabricator entry for this, https://reviews.freebsd.org/D11137, because that makes reviewing easier.
Comment 5 Damjan Jovanovic 2017-06-11 07:59:40 UTC
Thank you.

I've developed a patch for the same feature in LibAlias (tested with IPFW but presumably applies to natd/pppd too) on bug 219918, which you might want to look at first, as it's much shorter and simpler than this one, only about 200 lines long.

Also, my tests show IPFILTER already does endpoint-independent mapping, as does "iptables" in Linux.

I've also emailed freebsd-net@ with an explanation:
https://lists.freebsd.org/pipermail/freebsd-net/2017-June/048135.html
Comment 6 Kristof Provost freebsd_committer 2017-06-15 20:14:21 UTC
With this patch my gateway box (pf and vimage jails) panics pretty quickly during boot.

#0  doadump (textdump=0) at pcpu.h:232
#1  0xffffffff803a4c2b in db_dump (dummy=<value optimized out>, dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value optimized out>)
    at /usr/src/sys/ddb/db_command.c:546
#2  0xffffffff803a4a1f in db_command (cmd_table=<value optimized out>) at /usr/src/sys/ddb/db_command.c:453
#3  0xffffffff803a4754 in db_command_loop () at /usr/src/sys/ddb/db_command.c:506
#4  0xffffffff803a781f in db_trap (type=<value optimized out>, code=<value optimized out>) at /usr/src/sys/ddb/db_main.c:248
#5  0xffffffff80a9bd33 in kdb_trap (type=12, code=0, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
#6  0xffffffff80efb4f2 in trap_fatal (frame=0xfffffe022fefaf50, eva=48) at /usr/src/sys/amd64/amd64/trap.c:796
#7  0xffffffff80efb5a2 in trap_pfault (frame=0xfffffe022fefaf50, usermode=0) at pcpu.h:232
#8  0xffffffff80efad3d in trap (frame=0xfffffe022fefaf50) at /usr/src/sys/amd64/amd64/trap.c:421
#9  0xffffffff80edcf31 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#10 0xffffffff8267409a in pf_addrcpy (dst=0x30, src=0xfffff8002d09f590, af=2 '\002') at pcpu.h:231
#11 0xffffffff82689ead in pf_get_translation (pd=0xfffffe022fefc351, m=<value optimized out>, off=<value optimized out>, direction=2, kif=<value optimized out>,
    sn=0xfffffe022fefb438, skp=<value optimized out>, nkp=<value optimized out>, saddr=<value optimized out>, daddr=<value optimized out>, sport=<value optimized out>,
    dport=<value optimized out>, anchor_stack=<value optimized out>) at /usr/src/sys/netpfil/pf/pf_lb.c:262
#12 0xffffffff8267dd08 in pf_test_rule (rm=0xfffffe022fefb6d0, sm=0xfffffe022fefb6e0, direction=2, kif=0xfffff80006dddb00, m=0xfffff8002d23f000, off=20,
    pd=<value optimized out>, am=0xfffffe022fefb6a0, inp=<value optimized out>) at /usr/src/sys/netpfil/pf/pf.c:3336
#13 0xffffffff8267af11 in pf_test (dir=<value optimized out>, ifp=<value optimized out>, m0=<value optimized out>, inp=0x0) at /usr/src/sys/netpfil/pf/pf.c:6088
#14 0xffffffff8268cd9d in pf_check_out (arg=<value optimized out>, m=0xfffffe022fefb7c0, ifp=<value optimized out>, dir=<value optimized out>, inp=<value optimized out>)
    at /usr/src/sys/netpfil/pf/pf_ioctl.c:3582
#15 0xffffffff80b74314 in pfil_run_hooks (ph=0xfffffe0000de7a18, mp=0xfffffe022fefb818, ifp=0xfffff80006e1d800, dir=2, inp=0x0) at /usr/src/sys/net/pfil.c:108
#16 0xffffffff80bdbf80 in ip_tryforward (m=0xfffff8002d23f000) at /usr/src/sys/netinet/ip_fastfwd.c:306
#17 0xffffffff80bde9f1 in ip_input (m=0xfffff8002d23f000) at /usr/src/sys/netinet/ip_input.c:570
#18 0xffffffff80b731bf in netisr_dispatch_src (proto=1, source=0, m=0xfffff8002d23f000) at /usr/src/sys/net/netisr.c:1120
#19 0xffffffff80b593be in ether_demux (ifp=0xfffff80006e1c000, m=<value optimized out>) at /usr/src/sys/net/if_ethersubr.c:848
#20 0xffffffff80b5a3f2 in ether_nh_input (m=<value optimized out>) at /usr/src/sys/net/if_ethersubr.c:637
#21 0xffffffff80b731bf in netisr_dispatch_src (proto=5, source=0, m=0xfffff8002d23f000) at /usr/src/sys/net/netisr.c:1120
#22 0xffffffff80b5977f in ether_input (ifp=0xfffff80006e1c000, m=0x0) at /usr/src/sys/net/if_ethersubr.c:757
#23 0xffffffff80b54d6a in if_input (ifp=<value optimized out>, sendmp=<value optimized out>) at /usr/src/sys/net/if.c:3993
#24 0xffffffff804ff9cc in bge_rxeof () at /usr/src/sys/dev/bge/if_bge.c:4424
#25 0xffffffff804fd0d2 in bge_intr_task (arg=0xfffffe0000fe5000, pending=<value optimized out>) at /usr/src/sys/dev/bge/if_bge.c:4654
#26 0xffffffff80aae22d in taskqueue_run_locked (queue=0xfffff80005637400) at /usr/src/sys/kern/subr_taskqueue.c:454
#27 0xffffffff80aaefe8 in taskqueue_thread_loop (arg=<value optimized out>) at /usr/src/sys/kern/subr_taskqueue.c:746
#28 0xffffffff80a1ab44 in fork_exit (callout=0xffffffff80aaef60 <taskqueue_thread_loop>, arg=0xfffffe0000fec568, frame=0xfffffe022fefbc00) at /usr/src/sys/kern/kern_fork.c:1038
#29 0xffffffff80edd46e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
#30 0x0000000000000000 in ?? ()

...

#11 0xffffffff82689ead in pf_get_translation (pd=0xfffffe022fefc351, m=<value optimized out>, off=<value optimized out>, direction=2, kif=<value optimized out>,
    sn=0xfffffe022fefb438, skp=<value optimized out>, nkp=<value optimized out>, saddr=<value optimized out>, daddr=<value optimized out>, sport=<value optimized out>,
    dport=<value optimized out>, anchor_stack=<value optimized out>) at /usr/src/sys/netpfil/pf/pf_lb.c:262
262			PF_ACPY(&(*udp_mapping)->endpoints[1].addr, naddr, af);
(kgdb) p udp_mapping
Cannot access memory at address 0x0
(kgdb)

I'm not quite sure how that happens, but it's easy to reproduce.

My pf.conf is a pretty typical gateway config. A nat rule and a couple of rdr rules (including for UDP).
Comment 7 Damjan Jovanovic 2017-06-16 04:44:39 UTC
Created attachment 183512
pf RFC 4787 req 1 and 3 implementation, version 2

Sorry about that. pf_lb.c:262 expected (*udp_mapping) to be set, which is only true for UDP (I didn't test with TCP). This new patch only writes to it if it's not NULL.
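
The shape of that guard, as a standalone sketch rather than the actual diff: the types below are hypothetical stand-ins for the pf structures in the backtrace, record_translation is an invented helper name, and memcpy stands in for PF_ACPY.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the pf types seen in the backtrace. */
struct addr {
	uint8_t bytes[16];
};

struct endpoint {
	struct addr addr;
	uint16_t port;
};

struct udp_mapping {
	struct endpoint endpoints[2];
};

/*
 * Record the chosen translation address in the mapping, if there is one.
 * Returns 1 if the mapping was updated and 0 if the write was skipped.
 */
int
record_translation(struct udp_mapping **udp_mapping, const struct addr *naddr)
{
	assert(naddr != NULL);
	/* Non-UDP traffic has no mapping, so there is nothing to write to. */
	if (udp_mapping == NULL || *udp_mapping == NULL)
		return (0);
	memcpy(&(*udp_mapping)->endpoints[1].addr, naddr, sizeof(*naddr));
	return (1);
}
```

Without the check, a NULL *udp_mapping turns the destination into a small offset from address 0, which is consistent with the dst=0x30 passed to pf_addrcpy in frame #10.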
Comment 8 Kristof Provost freebsd_committer 2017-06-16 07:48:18 UTC
Thanks. I'll start testing this patch.

Can you also take a look at the style(9) remarks in the review (https://reviews.freebsd.org/D11137)?
Comment 9 Damjan Jovanovic 2017-06-16 14:28:31 UTC
Created attachment 183534
pf RFC 4787 req 1 and 3 implementation, version 3

Thank you. Here is version 3, with the style changes, and without the extraneous file that was added in version 2.