I occasionally observe temporary failures on my public UDP ports. The problem seems to be with libalias when used for UDP port_redirect: ipfw nat 3 config log same_ports unreg_only ip <public-ip> \ redirect_port udp 192.168.xx.xx:5007 5006 \ redirect_port tcp 192.168.xx.xx:5007 5006 02420 0 0 count log proto ip4 src-port 64000-64010 02425 0 0 nat 3 proto ip4 02430 0 0 count log proto ip4 src-port 64000-64010 Then randomly # nc -u4p 64000 <public-ip> 5006 # nc -u4p 64001 <public-ip> 5006 # nc -u4p 64002 <public-ip> 5006 Feb 23 03:51:45 <security.info> edge kernel: [75643] ipfw-oper: 2420 Count UDP 91.12.117.156:64000 <public-ip>:5006 in via tun3 Feb 23 03:51:45 <security.info> edge kernel: [75643] ipfw-oper: 2430 Count UDP 91.12.117.156:64000 192.168.xx.xx:5007 in via tun3 Feb 23 03:52:08 <security.info> edge kernel: [75666] ipfw-oper: 2420 Count UDP 91.12.117.156:64001 <public-ip>:5006 in via tun3 Feb 23 03:52:08 <security.info> edge kernel: [75666] ipfw-oper: 2430 Count UDP 91.12.117.156:64001 192.168.xx.xx:5007 in via tun3 Feb 23 03:52:29 <security.info> edge kernel: [75687] ipfw-oper: 2420 Count UDP 91.12.117.156:64001 <public-ip>:5006 in via tun3 Feb 23 03:52:29 <security.info> edge kernel: [75687] ipfw-oper: 2430 Count UDP 91.12.117.156:64001 192.168.xx.xx:5007 in via tun3 Feb 23 03:52:36 <security.info> edge kernel: [75694] ipfw-oper: 2420 Count UDP 91.12.117.156:64001 51.158.21.23:5006 in via tun3 Feb 23 03:52:36 <security.info> edge kernel: [75694] ipfw-oper: 2430 Count UDP 91.12.117.156:64001 192.168.98.18:5007 in via tun3 Feb 23 03:52:56 <security.info> edge kernel: [75714] ipfw-oper: 2420 Count UDP 91.12.117.156:64002 <public-ip>:5006 in via tun3 Feb 23 03:52:56 <security.info> edge kernel: [75714] ipfw-oper: 2430 Count UDP 91.12.117.156:64002 192.168.xx.xx:5007 in via tun3 Feb 23 03:53:06 <security.info> edge kernel: [75724] ipfw-oper: 2420 Count UDP 91.12.117.156:64000 <public-ip>:5006 in via tun3 Feb 23 03:53:06 <security.info> edge kernel: [75724] ipfw-oper: 2430 Count UDP 91.12.117.156:64000 <public-ip>:5006 in via tun3 Feb 23 03:53:06 <security.info> edge kernel: [75724] ipfw-oper: 2685 Unreach 13 UDP 91.12.117.156:64000 <public-ip>:5006 in via tun3 In the beginning 64000 and 64001 get through. Now only using 64001 for some time, and then starting to use 64002, suddenly 64000 is no longer translated. (Not always happens in that exact sequence.) # ipfw nat show log nat 2: icmp=32, udp=1, tcp=40, sctp=0, pptp=0, proto=0, frag_id=0 frag_ptr=0 / tot=73 nat 3: icmp=0, udp=2, tcp=72, sctp=0, pptp=0, proto=0, frag_id=0 frag_ptr=0 / tot=74 These udp= counters are changing meanwhile, but I don't get a clue from it, and didn't find a way to log details. Platform 13.2-BETA2 (but the problem is older)
Created attachment 248673 [details] DTRACE cmdline for general amusement and bookkeping - I mean, to observe what libalias adds and removes from it's database
I managed to analyze this a bit further: With an UDP service (e.g. VPN) portforwarded behind NAT there is no notion of a connection, and apparently it can happen that libalias creates TWO distinct records for inbound and outbound traffic, with the inbound record lacking the portforward (and the received packets then dumped as undeliverable on the NATing node). The problem always resolves itself after either 12 or 14 minutes, apparently due to internal timings. Observed with dtrace (cmdline is attached): 203.0.113.1 is my public address where NAT aliases to 198.51.100.1 is the address of the connecting client from outside 192.168.1.12 is the internal address of the VPN server (where portforward goes to) -> At this time we have a packet client->server, two server->client and another client->server, all successful 18 17832 DeleteLink:entry openvpn 1708555176 icmp=5 udp=2 tcp=108 type=17 (src_addr 192.168.1.12) (dst_addr 0.0.0.0) (alias_addr 0.0.0.0) (proxy_addr 0.0.0.0) (src_port 5007) (dst_port 0) (alias_port 5006) (proxy_port 0) -> For whatever reason libalias tries to delete some UDP, but the next line shows that nothing was actually deleted. So this would apparently be the permanent portforward, there is no other codepath 18 49947 AddLink:entry openvpn 1708555180 icmp=5 udp=2 tcp=108 -> At this time we get a packet client->server, and it is the first failing (it gets processed by libalias, but NOT portforwarded (in the next second we also get packets in the other direction, so without fractional seconds logged we do not know when exactly what happened) 18 17832 DeleteLink:entry openvpn 1708555186 icmp=5 udp=2 tcp=111 type=17 (src_addr 192.168.1.12) (dst_addr 198.51.100.1) (alias_addr 0.0.0.0) (proxy_addr 0.0.0.0) (src_port 5007) (dst_port 8211) (alias_port 5006) (proxy_port 0) -> Here the portforward got deleted (but WHY? It is continuousley used, pings every 10 seconds), and udp count goes to 1. 18 49947 AddLink:entry openvpn 1708555186 icmp=5 udp=1 tcp=111 (src_addr 203.0.113.1) (dst_addr 91.62.14.110) (alias_addr 203.0.113.1) (src_port 5006) 18 49948 AddLink:return openvpn 1708555186 icmp=5 udp=2 tcp=111 type=17 (src_addr 203.0.113.1) (dst_addr 198.51.100.1) (alias_addr 203.0.113.1) (proxy_addr 0.0.0.0) (src_port 5006) (dst_port 8211) (alias_port 5006) (proxy_port 0) -> And immediately added again, but now WITHOUT the portforward!! 19 17832 DeleteLink:entry suricata 1708555186 icmp=5 udp=2 tcp=111 type=17 (src_addr 192.168.1.12) (dst_addr 0.0.0.0) (alias_addr 0.0.0.0) (proxy_addr 0.0.0.0) (src_port 5007) (dst_port 0) (alias_port 5006) (proxy_port 0) -> Apparently the housekeeping is too aggressive... 19 49947 AddLink:entry suricata 1708555186 icmp=5 udp=2 tcp=111 (src_addr 192.168.1.12) (dst_addr 198.51.100.1) (alias_addr 0.0.0.0) (src_port 5007) 19 49948 AddLink:return suricata 1708555186 icmp=5 udp=3 tcp=111 type=17 (src_addr 192.168.1.12) (dst_addr 198.51.100.1) (alias_addr 0.0.0.0) (proxy_addr 0.0.0.0) (src_port 5007) (dst_port 8211) (alias_port 5006) (proxy_port 0) -> And now we get the portforward as a *SEPARATE* record (udp went to 3!) 19 17832 DeleteLink:entry suricata 1708555186 icmp=5 udp=3 tcp=111 type=17 (src_addr 192.168.1.12) (dst_addr 0.0.0.0) (alias_addr 0.0.0.0) (proxy_addr 0.0.0.0) (src_port 5007) (dst_port 0) (alias_port 5006) (proxy_port 0) -> groundhog day again