Bug 244226 - [patch] syslogd converts all 8-bit chars to M-x sequences
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2020-02-19 11:50 UTC by J.R. Oldroyd
Modified: 2024-01-13 02:46 UTC
CC List: 3 users

See Also:


Attachments
patch to add locale support to syslogd when converting strings to "safe" sequences (1.69 KB, patch)
2020-02-19 11:50 UTC, J.R. Oldroyd
patch to add locale support to syslogd when converting strings to "safe" sequences (1.82 KB, patch)
2020-02-24 13:10 UTC, J.R. Oldroyd

Description J.R. Oldroyd 2020-02-19 11:50:20 UTC
Created attachment 211760
patch to add locale support to syslogd when converting strings to "safe" sequences

syslogd has code to convert all logged messages to "safe" strings.

At the moment, the code converts control characters to "^x" sequences and ALL 8-bit characters to "M-x" sequences.  As a result, printable characters outside ASCII are converted too, and so do not display as expected when viewing the logs.
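
To make the effect concrete, here is a minimal sketch of the conversion as described above. It is a simplified illustration, not syslogd's actual source; the function name legacy_safe() and the buffer handling are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from syslogd): escape a string the way
 * the description above puts it.  Bytes with the high bit set become
 * "M-" plus the low 7 bits, and control characters become "^x".
 * Returns the number of bytes written, not counting the NUL.
 */
size_t
legacy_safe(const char *src, char *dst, size_t dstsize)
{
	size_t n = 0;

	assert(dst != NULL && dstsize > 0);
	for (; *src != '\0'; src++) {
		unsigned char c = (unsigned char)*src;

		/* Worst case for one input byte is "M-^x" plus the NUL. */
		if (dstsize - n < 5)
			break;
		if (c & 0x80) {
			dst[n++] = 'M';
			dst[n++] = '-';
			c &= 0x7f;
		}
		if (c < 0x20 || c == 0x7f) {
			dst[n++] = '^';
			c ^= 0x40;	/* 0x01 -> 'A', 0x7f -> '?' */
		}
		dst[n++] = (char)c;
	}
	dst[n] = '\0';
	return (n);
}
```

With this sketch, the three UTF-8 bytes of the Thai letter "ส" (e0 b8 aa) come out as "M-`M-8M-*", which is why such logs are unreadable.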

This patch adds LC_CTYPE locale support to syslogd and changes the "safe" conversion code to decode the logged characters using mbrtoc32() and to use iswgraph() to test whether a character needs converting to a safe sequence.

It also uses vis() to do the conversion, which is similar to what OpenBSD does.  This means that control chars become \C-x and non-graphical 8-bit chars become \M-x.
Comment 1 J.R. Oldroyd 2020-02-19 12:00:18 UTC
A patch to either libexec/rc/rc.d/syslogd specifically or to the rc mechanism in general will also be needed to ensure that LC_CTYPE is set before syslogd is started.
Comment 2 J.R. Oldroyd 2020-02-19 14:42:19 UTC
(In reply to J.R. Oldroyd from comment #1)

Ah, I remembered that ${name_env} can be set in rc.conf and the rc mechanism already looks for that to set the environment.  So, no changes are needed to rc.d/syslogd.
Comment 3 J.R. Oldroyd 2020-02-19 15:47:02 UTC
If you want to test this code, the simplest way is this:

1.  Add to /etc/rc.conf:
        syslogd_env="LC_CTYPE=C.UTF-8"

2.  Restart syslogd.

3.  Run:
        echo '\xe0\xb8\xaa\xe0\xb8\xa7\xe0\xb8\xb1\xe0\xb8\xaa\xe0\xb8\x94\xe0\xb8\xb5' | logger

4.  Look at /var/log/messages.  You should see "สวัสดี" (hello in Thai).
Comment 4 J.R. Oldroyd 2020-02-24 13:10:19 UTC
Created attachment 211894
patch to add locale support to syslogd when converting strings to "safe" sequences

Updated patch to do better output buffer space checking and also to use the MB_LEN_MAX constant for the input size limit.  In addition, I switched from using iswgraph() to iswprint() because locale-specific space chars can also be copied as they are.
Comment 5 Mark Johnston freebsd_committer freebsd_triage 2020-09-19 15:58:12 UTC
https://reviews.freebsd.org/D26456