The code in sys/dev/ixgbe/ixgbe.h assumes that MLEN is always > 160, and doesn't exceed the size of the mbuf. MSIZE is set to 256, so if MHLEN = MSIZE - sizeof(struct m_hdr) - sizeof(struct pkthdr) < 160, ixgbe will scribble over the header information in mbufs. Similarly, if IXGBE_RX_COPY_LEN is greater than the size of the mbuf, it will scribble over other memory, potentially in the same mbuf chain, or elsewhere. This optimization needs better bounds checking/handling. 155 /* 156 * Used for optimizing small rx mbufs. Effort is made to keep the copy 157 * small and aligned for the CPU L1 cache. 158 * 159 * MHLEN is typically 168 bytes, giving us 8-byte alignment. Getting 160 * 32 byte alignment needed for the fast bcopy results in 8 bytes being 161 * wasted. Getting 64 byte alignment, which _should_ be ideal for 162 * modern Intel CPUs, results in 40 bytes wasted and a significant drop 163 * in observed efficiency of the optimization, 97.9% -> 81.8%. 164 */ 165 #define IXGBE_RX_COPY_LEN 160 166 #define IXGBE_RX_COPY_ALIGN (MHLEN - IXGBE_RX_COPY_LEN) 60 * MLEN is data length in a normal mbuf. 61 * MHLEN is data length in an mbuf with pktheader. 62 * MINCLSIZE is a smallest amount of data that should be put into cluster. 63 */ 64 #define MLEN ((int)(MSIZE - sizeof(struct m_hdr))) 65 #define MHLEN ((int)(MLEN - sizeof(struct pkthdr))) 66 #define MINCLSIZE (MHLEN + 1)
Credit goes to Anton Rang @ Isilon for finding the bug.
Hi! Does he/Isilon have a patch to fix/address this? -adrian
(In reply to Adrian Chadd from comment #2) > Hi! > > Does he/Isilon have a patch to fix/address this? We don't yet. This was a recently discovered issue on a driver that isn't available in HEAD yet...
(In reply to Garrett Cooper from comment #3) > We don't yet. This was a recently discovered issue on a driver that isn't > available in HEAD yet... Note: This bug is only harmful if you modify any of the structures that comprise `struct mbuf` as the observation hasn't been violated on FreeBSD CURRENT [yet]. Isilon runs into this issue because we add additional fields to track mbufs. Status update: Isilon is waiting for a fix from upstream (Intel); we will work with Intel to get the fix tested and pushed into FreeBSD.
(In reply to Garrett Cooper from comment #4) > `struct mbuf` as the observation hasn't been violated on FreeBSD CURRENT "as the observation hasn't" -> "; the conditions for the optimization to work haven't"
Does that mean only the sizes of struct m_hdr and struct m_pkthdr change, or do you change MSIZE as well?
(In reply to Eric Joyner from comment #6) > Does that mean only the sizes of struct m_hdr and struct m_pkthdr change, or > do you change MSIZE as well? The size of struct m_ext and struct m_hdr is larger than stock FreeBSD, with certain compile-time options. However, you can technically tweak MSIZE though so it's less than 160, which would also reproduce the situation. That scenario seems kind of silly though.
So it sounds like we may just need to change IXGBE_RX_COPY_LEN from a fixed value of 160 to something that's calculated based on the length of struct m_hdr and struct m_pkthdr combined. I think they got 160 by adding together the size of the two header structs (88 in FreeBSD amd64), adding some number (8) to make that size divisible by 32 (96 % 32 == 0), and then subtracted that from MSIZE (256) to get IXGBE_RX_COPY_LEN (160), while keeping the padding added as IXGBE_RX_COPY_ALIGN.
You could try replacing the two lines in ixgbe.h with these: #define IXGBE_RX_COPY_HDR (sizeof(struct m_hdr) + sizeof(struct pkthdr)) #define IXGBE_RX_COPY_HDR_PADDED ((((IXGBE_RX_COPY_HDR - 1) / 32) + 1) * 32) #define IXGBE_RX_COPY_LEN (MSIZE - IXGBE_RX_COPY_HDR_PADDED) #define IXGBE_RX_COPY_ALIGN (IXGBE_RX_COPY_HDR_PADDED - IXGBE_RX_COPY_HDR)
(In reply to Eric Joyner from comment #8) > So it sounds like we may just need to change IXGBE_RX_COPY_LEN from a fixed > value of 160 to something that's calculated based on the length of struct > m_hdr and struct m_pkthdr combined. > > I think they got 160 by adding together the size of the two header structs > (88 in FreeBSD amd64), adding some number (8) to make that size divisible by > 32 (96 % 32 == 0), and then subtracted that from MSIZE (256) to get > IXGBE_RX_COPY_LEN (160), while keeping the padding added as > IXGBE_RX_COPY_ALIGN. A couple questions then: - What architectures is this driver supported on? - Is this optimization only valid for amd64? - Why 32?
(In reply to Garrett Cooper from comment #10) ... > - What architectures is this driver supported on? What I was really driving at with this question is "What architectures does Intel support this driver on?"
(In reply to Garrett Cooper from comment #10) > (In reply to Eric Joyner from comment #8) > > So it sounds like we may just need to change IXGBE_RX_COPY_LEN from a fixed > > value of 160 to something that's calculated based on the length of struct > > m_hdr and struct m_pkthdr combined. > > > > I think they got 160 by adding together the size of the two header structs > > (88 in FreeBSD amd64), adding some number (8) to make that size divisible by > > 32 (96 % 32 == 0), and then subtracted that from MSIZE (256) to get > > IXGBE_RX_COPY_LEN (160), while keeping the padding added as > > IXGBE_RX_COPY_ALIGN. > > A couple questions then: > > - What architectures is this driver supported on? > - Is this optimization only valid for amd64? > - Why 32? - At least i386 and amd64, but it should theoretically work on anything with a PCI-E slot - This optimization specifically was contributed by scottl/luigi, looking at the commit logs. You should ask either of those two for details. - Doesn't the comment in the code you posted answer that?
Has this error gone away with the code change, or did something else happen?
(In reply to Eric Joyner from comment #13) I haven't tested the change yet, but thanks for the reminder (I'll verify it this week). If it works, do I have Intel's blessing for committing the fix, or should jfv do the commit?
(In reply to Garrett Cooper,425-314-3911 from comment #14) Hi Eric! I confirmed that the fix you provided worked for our environment. Unfortunately struct mbuf lacks some of the fields we have in it so it doesn't work with FreeBSD proper. I believe this patch would work for CURRENT, based on my basic C program: $ clang -Wall -Wno-c++1z-extensions -std=c11 -o ~/test_ixgbe_defines ~/test_ixgbe_defines.c $ cat ~/test_ixgbe_defines.c #include <sys/cdefs.h> #include <sys/param.h> #include <sys/mbuf.h> #include <assert.h> #ifndef offsetof #define offsetof __offsetof #endif #define IXGBE_RX_COPY_HDR_PADDED ((((MPKTHSIZE - 1) / 32) + 1) * 32) #define IXGBE_RX_COPY_LEN (MSIZE - IXGBE_RX_COPY_HDR_PADDED) #define IXGBE_RX_COPY_ALIGN (IXGBE_RX_COPY_HDR_PADDED - MPKTHSIZE) static_assert(IXGBE_RX_COPY_LEN == 160); static_assert(IXGBE_RX_COPY_ALIGN == (MHLEN - IXGBE_RX_COPY_LEN)); int main(void) { return (0); } I'm going to try this out internally, but could you and Adrian (or Randall, because Netflix runs CURRENT-ish and I believe uses ixgbe in some cases) please look at this as well to make sure that it works accordingly? diff --git a/sys/dev/ixgbe/ixgbe.h b/sys/dev/ixgbe/ixgbe.h index f7e9efa..1ddeebb 100644 --- a/sys/dev/ixgbe/ixgbe.h +++ b/sys/dev/ixgbe/ixgbe.h @@ -162,8 +162,10 @@ * modern Intel CPUs, results in 40 bytes wasted and a significant drop * in observed efficiency of the optimization, 97.9% -> 81.8%. */ -#define IXGBE_RX_COPY_LEN 160 -#define IXGBE_RX_COPY_ALIGN (MHLEN - IXGBE_RX_COPY_LEN) + +#define IXGBE_RX_COPY_HDR_PADDED ((((MPKTHSIZE - 1) / 32) + 1) * 32) +#define IXGBE_RX_COPY_LEN (MSIZE - IXGBE_RX_COPY_HDR_PADDED) +#define IXGBE_RX_COPY_ALIGN (IXGBE_RX_COPY_HDR_PADDED - MPKTHSIZE) /* Keep older OS drivers building... */ #if !defined(SYSCTL_ADD_UQUAD)
Ping.
Ping? I'll post the CR on phabricator if need be so it gets reviewed/committed to head.
A commit references this bug: Author: ngie Date: Sat Feb 28 14:57:58 UTC 2015 New revision: 279393 URL: https://svnweb.freebsd.org/changeset/base/279393 Log: Pad RX copy alignment calculation to avoid illegal memory accesses The optimization made in r239940 is valid for struct mbuf's current structure and size in FreeBSD, but hardcodes assumptions about sizes of struct mbuf, which are unfortunately broken if additional data is added to the beginning of struct mbuf X-MFC note (discussed with rwatson): This change requires the MPKTHSIZE definition, which is only available after head@r277203 and will not be MFCed as it breaks mbuf(9) KPI. A direct commit to stable/10 and merges to other branches to add the necessary definitions to work with the code as-is will be done to facilitate this MFC PR: 194314 MFC after: 2 weeks Approved/Reviewed by: erj, jfv Sponsored by: EMC / Isilon Storage Division Changes: head/sys/dev/ixgbe/ixgbe.h
A commit references this bug: Author: ngie Date: Fri Apr 24 22:18:51 UTC 2015 New revision: 281954 URL: https://svnweb.freebsd.org/changeset/base/281954 Log: MFC r279393: Pad RX copy alignment calculation to avoid illegal memory accesses The optimization made in r239940 is valid for struct mbuf's current structure and size in FreeBSD, but hardcodes assumptions about sizes of struct mbuf, which are unfortunately broken if additional data is added to the beginning of struct mbuf X-MFC note (discussed with rwatson): This change requires the MPKTHSIZE definition, which is only available after head@r277203 and will not be MFCed as it breaks mbuf(9) KPI. A direct commit to stable/10 and merges to other branches to add the necessary definitions to work with the code as-is will be done to facilitate this MFC PR: 194314 Approved/Reviewed by: erj, jfv Sponsored by: EMC / Isilon Storage Division Changes: _U stable/10/ stable/10/sys/dev/ixgbe/ixgbe.h
A commit references this bug: Author: ngie Date: Wed May 13 10:43:50 UTC 2015 New revision: 282847 URL: https://svnweb.freebsd.org/changeset/base/282847 Log: MFstable/10 r281951,r281954: r281951: Backport MHSIZE/MPKTHSIZE equivalents from head These macros are equivalent to the ones on head, except they are only exposed when _KERNEL is defined, i.e. to kernel code, whereas the code on head is exposed to userland as well This is for improved forwards compatibility with mbuf(9) macros in head@r277203+, and is required for a clean MFC of r279393 This is a direct commit to stable/10 Differential Revision: https://reviews.freebsd.org/D2126 Reviewed by: glebius, rwatson Sponsored by: EMC / Isilon Storage Division r281954: MFC r279393: Pad RX copy alignment calculation to avoid illegal memory accesses The optimization made in r239940 is valid for struct mbuf's current structure and size in FreeBSD, but hardcodes assumptions about sizes of struct mbuf, which are unfortunately broken if additional data is added to the beginning of struct mbuf X-MFC note (discussed with rwatson): This change requires the MPKTHSIZE definition, which is only available after head@r277203 and will not be MFCed as it breaks mbuf(9) KPI. A direct commit to stable/10 and merges to other branches to add the necessary definitions to work with the code as-is will be done to facilitate this MFC PR: 194314 Approved/Reviewed by: erj, jfv Sponsored by: EMC / Isilon Storage Division Changes: _U stable/9/ _U stable/9/sys/ _U stable/9/sys/dev/ _U stable/9/sys/dev/ixgbe/ stable/9/sys/dev/ixgbe/ixgbe.h _U stable/9/sys/sys/ stable/9/sys/sys/mbuf.h
Is this ticket closeable now? I'm confused on the status of patches in flight and commits.
(In reply to Sean Bruno from comment #21) I meant to MFC it to stable/8, but the rest of the driver hasn't been backported back that far. Just going to close this bug. Thanks for the followup :)!