When looking into an SDCC regression test failing for test-host, I found the following issue: the mbrtoc16 and mbrtoc32 functions return a wrong value for my test case. I compiled the following code on a Raspi 4 running FreeBSD 13 via "cc test.c". When executing the resulting binary, the last assertion fails.

#include <limits.h>
#include <assert.h>
#include <uchar.h>

int main(void)
{
	static mbstate_t ps;
	char16_t c16[3];
	char c[MB_LEN_MAX] = "C";

	assert(mbrtoc16(c16, c, 1, &ps) == 1);
	// Writes a null wide character and thus puts ps into the initial conversion state (C2X section 7.30.1.3)
	assert(mbrtoc16(c16 + 1, c + 1, 1, &ps) == 0);
	assert(c16[0] == (u"C")[0]);
	assert(c16rtomb(c, c16[0], &ps) == 1);
	return(0);
}

I do not have any non-aarch64 FreeBSD 13.1 systems to test on, but the test does not fail on Debian GNU/Linux on aarch64 or amd64.
Clearing the mbstate_t argument before calling c16rtomb causes the test to pass:

memset(&ps, 0, sizeof ps);
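For illustration, the workaround can be sketched as a small helper that resets the state object before switching conversion direction. The name roundtrip_with_reset and its interface are hypothetical, made up for this sketch; only mbrtoc16, c16rtomb and memset come from the report. It assumes a locale in which the input is a single-byte character, such as the default "C" locale.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical helper (not from the report): convert one single-byte
 * character to char16_t and back, zeroing the state before each direction.
 * Returns the value returned by c16rtomb, or (size_t)-1 if the first
 * conversion did not consume exactly one byte (this includes in == '\0').
 */
size_t roundtrip_with_reset(char in, char *out)
{
    mbstate_t ps;
    char16_t c16;
    char buf[MB_LEN_MAX];
    size_t n;

    assert(out != NULL);

    /* Start the multibyte -> char16_t direction from the initial state. */
    memset(&ps, 0, sizeof ps);
    if (mbrtoc16(&c16, &in, 1, &ps) != 1)
        return (size_t)-1;

    /* Reset again before converting in the other direction. */
    memset(&ps, 0, sizeof ps);
    n = c16rtomb(buf, c16, &ps);
    if (n == 1)
        *out = buf[0];
    return n;
}
```

Calling roundtrip_with_reset('C', &out) is expected to return 1 and store 'C' in out on a conforming implementation, since neither conversion depends on state left behind by the other.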
I think numerics@ is not the right assignee, since that is mostly for math-related problems (i.e. mostly lib/msun). mbrtowc, mbrtoc16, and mbrtoc32 are character conversion functions.
(In reply to John F. Carr from comment #1)
Yes, setting the mbstate_t to zero is what should be done. Quoting C11 7.29.6:
> The initial conversion state corresponds, for a conversion in either direction, to the beginning of a new multibyte character in the initial shift state. A zero-valued mbstate_t object is (at least) one way to describe an initial conversion state. A zero-valued mbstate_t object can be used to initiate conversion involving any multibyte character sequence, in any LC_CTYPE category setting. If an mbstate_t object has been altered by any of the functions described in this subclause, and is then used with a different multibyte character sequence, or in the other conversion direction, or with a different LC_CTYPE category setting than on earlier function calls, the behavior is undefined.
Ugh, to make that more readable:

> The initial conversion state corresponds, for a conversion in either
> direction, to the beginning of a new multibyte character in the
> initial shift state. A zero-valued mbstate_t object is (at least) one
> way to describe an initial conversion state. A zero-valued mbstate_t
> object can be used to initiate conversion involving any multibyte
> character sequence, in any LC_CTYPE category setting. If an mbstate_t
> object has been altered by any of the functions described in this
> subclause, and is then used with a different multibyte character
> sequence, or in the other conversion direction, or with a different
> LC_CTYPE category setting than on earlier function calls, the behavior
> is undefined.