As discovered on 11.1-RELEASE-p9 and present on -p10, reassembled packets over 4k are silently dropped by in-kernel NAT. Patch based on suggestion of Andrey V. Elsukov supplied. Cause identified by Andrey V. Elsukov on the freebsd-net and freebsd-ipfw lists on 2018-06-13 as being due to buffer allocation limits in the in-kernel implementation of libalias. "The kernel version of libalias uses m_megapullup() function to make single contiguous buffer. m_megapullup() uses m_get2() function to allocate mbuf of appropriate size. If size of packet greater than 4k it will fail. So, if you use MTU greater than 4k or if after fragments reassembly you get a packet with length greater than 4k, ipfw_nat() function will drop this packet." Additional communication on those lists by Andrey suggested a patch might resolve this issue. The following is his code, I take no credit for it. Tested and "works for me" on kernel sources from 11.1-RELEASE-p10 and GENERIC kernconf. /usr/src/sys/netinet/libalias]$ diff -u alias.c.orig alias.c 8< --- alias.c.orig 2017-07-20 16:42:02.000000000 -0700 +++ alias.c 2018-06-13 15:41:46.862121000 -0700 @@ -1758,7 +1758,14 @@ if (m->m_next == NULL && M_WRITABLE(m)) return (m); - mcl = m_get2(len, M_NOWAIT, MT_DATA, M_PKTHDR); + if (len <= MJUMPAGESIZE) + mcl = m_get2(len, M_NOWAIT, MT_DATA, M_PKTHDR); + else if (len <= MJUM9BYTES) + mcl = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, MJUM9BYTES); + else if (len <= MJUM16BYTES) + mcl = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, MJUM16BYTES); + else + goto bad; if (mcl == NULL) goto bad; m_align(mcl, len); >8 Additional details on the situation that highlighted this can be found at https://forums.freebsd.org/threads/in-kernel-nat-dropping-large-udp-return-packets.66262/
Relates to documentation issue #229003
Created attachment 194242 [details] Possible patch from suggestion of Andrey V. Elsukov
A commit references this bug: Author: ae Date: Thu Jun 14 11:15:40 UTC 2018 New revision: 335133 URL: https://svnweb.freebsd.org/changeset/base/335133 Log: In m_megapullup() use m_getjcl() to allocate 9k or 16k mbuf when requested. It is better to try allocate a big mbuf, than just silently drop a big packet. A better solution could be reworking of libalias modules to be able use m_copydata()/m_copyback() instead of requiring the single contiguous buffer. PR: 229006 MFC after: 1 week Changes: head/sys/netinet/libalias/alias.c
Some historical context: https://lists.freebsd.org/pipermail/svn-src-head/2013-March/045721.html Due to discussion thread after this commit: https://lists.freebsd.org/pipermail/svn-src-head/2013-March/045643.html
A commit references this bug: Author: ae Date: Thu Jun 21 10:51:25 UTC 2018 New revision: 335473 URL: https://svnweb.freebsd.org/changeset/base/335473 Log: MFC r335133: In m_megapullup() use m_getjcl() to allocate 9k or 16k mbuf when requested. It is better to try allocate a big mbuf, than just silently drop a big packet. A better solution could be reworking of libalias modules to be able use m_copydata()/m_copyback() instead of requiring the single contiguous buffer. PR: 229006 Changes: _U stable/11/ stable/11/sys/netinet/libalias/alias.c
A commit references this bug: Author: ae Date: Thu Jun 21 11:24:21 UTC 2018 New revision: 335474 URL: https://svnweb.freebsd.org/changeset/base/335474 Log: MFC r335133: In m_megapullup() use m_getjcl() to allocate 9k or 16k mbuf when requested. It is better to try allocate a big mbuf, than just silently drop a big packet. A better solution could be reworking of libalias modules to be able use m_copydata()/m_copyback() instead of requiring the single contiguous buffer. PR: 229006 Changes: _U stable/10/ stable/10/sys/netinet/libalias/alias.c
Fixed in head/, stable/11 and stable/10. Thanks!
Tested on a patched kernel sourced from FreeBSD 11.1-p11. Can confirm there is a significant improvement in network speeds and no packets are dropped. I'd like to request an Errata Notice so to include it in the next security update.