Bug 244226

Summary: [patch] syslogd converts all 8-bit chars to M-x sequences
Product: Base System Reporter: J.R. Oldroyd <fbsd>
Component: bin Assignee: freebsd-bugs (Nobody) <bugs>
Status: New ---    
Severity: Affects Some People CC: emaste, markj
Priority: --- Keywords: patch
Version: CURRENT   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
patch to add locale support to syslogd when converting strings to "safe" sequences none
patch to add locale support to syslogd when converting strings to "safe" sequences none

Description J.R. Oldroyd 2020-02-19 11:50:20 UTC
Created attachment 211760
patch to add locale support to syslogd when converting strings to "safe" sequences

syslogd has code to convert all logged messages to "safe" strings.

At the moment, the code converts control characters to "^x" sequences and ALL 8-bit characters to "M-x" sequences.  This means that printable characters in character sets other than ASCII are converted and so do not display as expected when viewing the logs.

This patch adds LC_CTYPE locale support to syslogd and changes the "safe" conversion code to examine the logged characters using mbrtoc32() and to use iswgraph() to test if a character needs converting to safe sequences.
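A minimal sketch of the classification step described above, not the code from the attachment. The helper names init_ctype() and count_unsafe() are made up for illustration, and the cast from char32_t to wint_t assumes that wide characters in the active locale are Unicode code points (true for UTF-8 locales on FreeBSD and glibc, not guaranteed by the C standard):

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: pick up LC_CTYPE from the environment, as a daemon
 * would do once at startup. Returns non-zero on success.
 */
static int
init_ctype(void)
{
	return (setlocale(LC_CTYPE, "") != NULL);
}

/*
 * Hypothetical helper: count the characters in a NUL-terminated string
 * that would need converting to a "safe" sequence under the current
 * LC_CTYPE. Invalid or truncated multibyte sequences are counted one
 * byte at a time.
 */
static size_t
count_unsafe(const char *s)
{
	mbstate_t mbs;
	size_t left, unsafe = 0;

	assert(s != NULL);
	left = strlen(s);
	memset(&mbs, 0, sizeof(mbs));
	while (left > 0) {
		char32_t c32;
		size_t n = mbrtoc32(&c32, s, left, &mbs);

		if (n == 0 || n > left) {
			/* (size_t)-1 or -2: bad input, resync on next byte. */
			memset(&mbs, 0, sizeof(mbs));
			unsafe++;
			n = 1;
		} else if (!iswgraph((wint_t)c32)) {
			/* Decoded fine, but not a graphic character. */
			unsafe++;
		}
		s += n;
		left -= n;
	}
	return (unsafe);
}
```

Note that iswgraph() is false for a plain space, so code built this way has to let spaces through separately.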

It also uses vis() to do the conversion, which is similar to OpenBSD. As a result, control chars become \C-x and non-graphical 8-bit chars become \M-x.
Comment 1 J.R. Oldroyd 2020-02-19 12:00:18 UTC
A patch to either libexec/rc/rc.d/syslogd specifically or to the rc mechanism in general will also be needed to ensure that LC_CTYPE is set before syslogd is started.
Comment 2 J.R. Oldroyd 2020-02-19 14:42:19 UTC
(In reply to J.R. Oldroyd from comment #1)

Ah, I remembered that ${name}_env can be set in rc.conf and the rc mechanism already looks for that to set the environment.  So, no changes are needed to rc.d/syslogd.
Comment 3 J.R. Oldroyd 2020-02-19 15:47:02 UTC
If you want to test this code, the simplest way is this:

1.  Add to /etc/rc.conf:
        syslogd_env="LC_CTYPE=C.UTF-8"

2.  Restart syslogd.

3.  Run:
        echo '\xe0\xb8\xaa\xe0\xb8\xa7\xe0\xb8\xb1\xe0\xb8\xaa\xe0\xb8\x94\xe0\xb8\xb5' | logger

4.  Look at /var/log/messages.  You should see "สวัสดี" (hello in Thai).
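Whether echo expands the \x escapes in step 3 depends on the shell. As a shell-independent alternative, the same bytes can be sent through syslog(3) from C. This is only a sketch: the names thai_hello and log_thai_hello() and the "utf8test" tag are made up for illustration, and user.notice is chosen to match logger's default priority.

```c
#include <assert.h>
#include <syslog.h>

/* UTF-8 encoding of the six Thai code points used in step 3. */
static const char thai_hello[] =
    "\xe0\xb8\xaa" "\xe0\xb8\xa7" "\xe0\xb8\xb1"
    "\xe0\xb8\xaa" "\xe0\xb8\x94" "\xe0\xb8\xb5";

/* Six code points, three bytes each, checked at compile time. */
static_assert(sizeof(thai_hello) - 1 == 18, "unexpected byte count");

/*
 * Hypothetical helper: log the test string at user.notice. The bytes go
 * through "%s" so that they are never interpreted as a format string.
 */
static void
log_thai_hello(void)
{
	openlog("utf8test", 0, LOG_USER);
	syslog(LOG_NOTICE, "%s", thai_hello);
	closelog();
}
```

Calling log_thai_hello() from a small main() replaces step 3; step 4 is unchanged.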
Comment 4 J.R. Oldroyd 2020-02-24 13:10:19 UTC
Created attachment 211894
patch to add locale support to syslogd when converting strings to "safe" sequences

Updated patch to do better output buffer space checking and also to use the MB_LEN_MAX constant for the input size limit.  In addition, I switched from using iswgraph() to iswprint() because locale-specific space chars can also be copied as they are.
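For illustration, the three points above (output space checks, MB_LEN_MAX as the input limit, iswprint() as the test) might fit together roughly as follows. This is a sketch, not the code from the attachment: sanitize() is a made-up name, and a plain octal escape stands in for vis() so that the example builds on systems without <vis.h>. The cast from char32_t to wint_t again assumes that wide characters are Unicode code points.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: copy src into dst, keeping printable characters as
 * they are and replacing every other byte with a 4-byte "\ooo" escape.
 * Output is truncated at a character boundary and always NUL-terminated.
 * Returns the number of bytes written, not counting the NUL.
 */
static size_t
sanitize(char *dst, size_t dstsize, const char *src)
{
	mbstate_t mbs;
	size_t left, used = 0;

	assert(dst != NULL && src != NULL);
	if (dstsize == 0)
		return (0);
	left = strlen(src);
	memset(&mbs, 0, sizeof(mbs));
	while (left > 0) {
		char32_t c32;
		/* Never hand the decoder more than one character's worth. */
		size_t in = left < (size_t)MB_LEN_MAX ? left : (size_t)MB_LEN_MAX;
		size_t n = mbrtoc32(&c32, src, in, &mbs);

		if (n != 0 && n <= in && iswprint((wint_t)c32)) {
			/* Need room for n bytes plus the terminating NUL. */
			if (n >= dstsize - used)
				break;
			memcpy(dst + used, src, n);
			used += n;
		} else {
			/* Need room for "\ooo" plus the terminating NUL. */
			if (4 >= dstsize - used)
				break;
			memset(&mbs, 0, sizeof(mbs));
			snprintf(dst + used, 5, "\\%03o",
			    (unsigned int)(unsigned char)*src);
			used += 4;
			n = 1;
		}
		src += n;
		left -= n;
	}
	dst[used] = '\0';
	return (used);
}
```

With iswprint() a plain space is copied unchanged, whereas iswgraph() would have treated it as needing an escape.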
Comment 5 Mark Johnston freebsd_committer 2020-09-19 15:58:12 UTC
https://reviews.freebsd.org/D26456