Bug 291359 - x86_64 strncpy/stpncpy reads extra bytes from the source string.
Summary: x86_64 strncpy/stpncpy reads extra bytes from the source string.
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 15.0-RELEASE
Hardware: amd64 Any
: --- Affects Some People
Assignee: Robert Clausecker
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-12-03 02:47 UTC by Collin Funk
Modified: 2025-12-03 16:33 UTC (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Collin Funk 2025-12-03 02:47:38 UTC
Note that this bug should be filed under the amd64 component. However, the bugzilla seems bugged and does not show the option when filing a bug.

Using this test program to map two pages and then make the second inaccessible, we can see the strncpy and stpncpy reads bytes that it shouldn't:

```
$ cat main.c 
#include <unistd.h>
#include <string.h>
#include <sys/mman.h>
#ifndef STRNCPY
# define STRNCPY strncpy
#endif
int
main (void)
{
  char *fence = NULL;
  long int pagesize = sysconf (_SC_PAGESIZE);
  /* Map two pages.  */
  char *two_pages =
    (char *) mmap (NULL, 2 * pagesize, PROT_READ | PROT_WRITE,
                   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
  /* Make the second page inaccessible.  */
  if (two_pages != (char *)(-1)
      && mprotect (two_pages + pagesize, pagesize, PROT_NONE) == 0)
    fence = two_pages + pagesize;
  if (fence)
    {
      char dest[8];
      dest[0] = 'a';
      dest[1] = 'b';
      dest[2] = 'c';
      dest[3] = 'd';
      dest[4] = 'e';
      dest[5] = 'f';
      dest[6] = 'g';

      *(fence - 3) = '7';
      *(fence - 2) = '2';
      *(fence - 1) = '9';

      if (STRNCPY (dest + 1, fence - 3, 3) != dest + 1)
        return 1;
      if (dest[0] != 'a')
        return 2;
      if (dest[1] != '7' || dest[2] != '2' || dest[3] != '9')
        return 3;
      if (dest[4] != 'e')
        return 4;
    }
  return 0;
}
$ cc main.c && ./a.out 
Segmentation fault         (core dumped) ./a.out
$ cc -DSTRNCPY=stpncpy main.c && ./a.out 
Segmentation fault         (core dumped) ./a.out
$ gdb ./a.out ./a.out.core
[...]
Reading symbols from ./a.out...
[New LWP 100560]
Core was generated by `./a.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
Invalid permissions for mapped object.
#0  ?? () at /usr/src/lib/libc/amd64/string/stpncpy.S:230 from /lib/libc.so.7
230		movdqa		16(%rsi), %xmm0		# load second chunk of input
```

Originally found in Gnulib and reported by Bruno Haible here: https://lists.gnu.org/archive/html/bug-gnulib/2025-12/msg00023.html
Comment 1 Robert Clausecker freebsd_committer freebsd_triage 2025-12-03 11:23:43 UTC
Thank you for the report.  This is my code and I'll go investigate.
Comment 2 Robert Clausecker freebsd_committer freebsd_triage 2025-12-03 12:26:12 UTC
As far as I can see, this is an off-by-one error in processing the runt (1--32 byte input) case.  There may be additional failure conditions, such as when a longer buffer ends right at a page boundary.  Conditions necessary to trigger this:

 - input of 1--16 bytes without a NUL terminator
 - input ends right at a page boundary

The code first assembles in r8d a mask of NUL bytes encountered in the head of the source (that is, the 16 bytes read from src&~15) and in r9d a bit mask of those bits in r8d that refer to bytes inside the string (that is, -1<<(src&15)).  In r10d we have the number of characters in the string, including those added by rounding down the pointer (that is, len + (src&15)).

We then check if the string is 32 bytes or less and go into the runt case.  Then the code does this check to see if we can read another 16 bytes:

        /* 1--32 bytes to copy, bounce through the stack */
.Lrunt: movdqa  %xmm1, bounce+16(%rsp)  # clear out rest of on-stack copy
        bts     %r10d, %r8d             # treat end of buffer as end of string
        and     %r9w, %r8w              # end of string within first buffer?
        jnz     0f                      # if yes, do not inspect second buffer

The code first sets the bit corresponding to the end of the buffer in r8d, pretending the buffer is after all NUL terminated.
Then it checks if a NUL terminator has been found in the head of the string (that is r8w&r9w, note the word-sized and!). If yes, we skip the logic to load and check the second 16 byte chunk.

The problem with this code is that r10d points not to the last character in the buffer, but rather to one character after.  So if the string ends at a 16 byte boundary, the nul byte is injected at 0x10000 and the 16-bit AND fails to find it.  We then read another 16 bytes of input, causing a crash.

Note that contrary to what Jeffrey Walton said in the gnulib bug report, this is not a matter of movdqa vs. movdqu.  The read is guaranteed to be aligned, it just goes too far out of bounds.

We do have a unit test for this function (lib/libc/tests/string/stpncpy_test.c), which I had extended in base 6fa9e7d8737548ef93c573387ce62402c368d486 to cover exactly this sort of issue, but it seems like my extension was incomplete, as it was for memccpy() (see base 61ed5748e4e9c7397fcb2638b442f46ac5c9e7c5).

Getz didn't get around to port this one to AArch64 yet, so that is unaffected.

Will start to work on a fix immediately.
Comment 3 Collin Funk 2025-12-03 16:20:35 UTC
Thanks for the quick investigation!
Comment 4 Getz Mikalsen 2025-12-03 16:33:05 UTC
You can try working around it until the patch is ready by setting the ARCHLEVEL flag to scalar documented in simd(7).