Created attachment 211760
patch to add locale support to syslogd when converting strings to "safe" sequences

syslogd has code to convert all logged messages to "safe" strings. At the moment, the code converts control characters to "^x" sequences and ALL 8-bit characters to "M-x" sequences. This means that printable characters in character sets other than ASCII are also converted, so they do not display as expected when viewing the logs.

This patch adds LC_CTYPE locale support to syslogd and changes the "safe" conversion code to decode the logged characters using mbrtoc32() and to use iswgraph() to test whether a character needs converting to a safe sequence. It also uses vis() to do the conversion, which is similar to what OpenBSD does. One consequence is that control chars now become \C-x and non-graphical 8-bit chars become \M-x.
A patch to either libexec/rc/rc.d/syslogd specifically or to the rc mechanism in general will also be needed to ensure that LC_CTYPE is set before syslogd is started.
(In reply to J.R. Oldroyd from comment #1) Ah, I remembered that ${name_env} can be set in rc.conf and the rc mechanism already looks for that to set the environment. So, no changes are needed to rc.d/syslogd.
If you want to test this code, the simplest way is this:

1. Add to /etc/rc.conf:
   syslogd_env="LC_CTYPE=C.UTF-8"
2. Restart syslogd.
3. Run:
   echo '\xe0\xb8\xaa\xe0\xb8\xa7\xe0\xb8\xb1\xe0\xb8\xaa\xe0\xb8\x94\xe0\xb8\xb5' | logger
4. Look at /var/log/messages. You should see "สวัสดี" (hello in Thai).
Created attachment 211894
patch to add locale support to syslogd when converting strings to "safe" sequences

Updated patch to do better output buffer space checking and to use the MB_LEN_MAX constant as the input size limit passed to the multibyte decoder. In addition, I switched from iswgraph() to iswprint(), because locale-specific space characters are printable but not graphical, so they can be copied as they are.
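For anyone following along without reading the diff, here is a minimal sketch of the kind of conversion loop described above. It is not the code from the attachment: safe_copy() is a hypothetical name, and it writes plain \ooo octal escapes instead of calling vis(), so that it builds on systems without <vis.h>. It also assumes wchar_t uses the same code points as char32_t, so that the result of mbrtoc32() can be passed to iswprint(). The caller is assumed to have already called setlocale(LC_CTYPE, "") so that the locale from the environment is in effect.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not taken from the patch): copy src into dst,
 * keeping characters that are printable in the current LC_CTYPE locale
 * and escaping every other byte as \ooo. Stops early rather than
 * overflow dst. Returns the bytes written, excluding the final NUL.
 */
size_t
safe_copy(char *dst, size_t dstsize, const char *src)
{
	mbstate_t mbs;
	size_t in = strlen(src);
	size_t out = 0;

	assert(dst != NULL && dstsize > 0);
	memset(&mbs, 0, sizeof(mbs));

	while (in > 0) {
		char32_t c32;
		/* Never hand the decoder more than one character's worth. */
		size_t limit = in < (size_t)MB_LEN_MAX ? in : (size_t)MB_LEN_MAX;
		size_t n = mbrtoc32(&c32, src, limit, &mbs);
		int printable;

		if (n == 0 || n > limit) {
			/* Invalid or incomplete sequence: escape one byte. */
			memset(&mbs, 0, sizeof(mbs));
			n = 1;
			printable = 0;
		} else {
			/* Assumption: wchar_t and char32_t share code points. */
			printable = iswprint((wint_t)c32) != 0;
		}

		if (printable) {
			if (out + n >= dstsize)
				break;		/* no room for char + NUL */
			memcpy(dst + out, src, n);
			out += n;
		} else {
			if (out + 4 * n >= dstsize)
				break;		/* no room for escapes + NUL */
			for (size_t i = 0; i < n; i++) {
				snprintf(dst + out, 5, "\\%03o",
				    (unsigned char)src[i]);
				out += 4;
			}
		}
		src += n;
		in -= n;
	}
	dst[out] = '\0';
	return (out);
}
```

With LC_CTYPE set to a UTF-8 locale, the Thai test string from the steps above should pass through this loop unchanged, while in the default "C" locale a tab or other control character comes out as an octal escape. Escaping byte by byte keeps the space check simple, at the cost of less readable output than vis() would give.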
https://reviews.freebsd.org/D26456