Bug 235100 - Setting LANG=zh_TW.Big5 expands `/` to "-\---/"
Summary: Setting LANG=zh_TW.Big5 expands `/` to "-\---/"
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-bugs mailing list
Reported: 2019-01-21 08:32 UTC by Li-Wen Hsu
Modified: 2019-01-21 17:11 UTC
Description Li-Wen Hsu freebsd_committer 2019-01-21 08:32:03 UTC
$ env | grep LC_
$ env | grep LANG
$ env LANG=zh_TW.Big5 ps axuwww
root       1251   0.0  0.0   10988    2576 v0  Is+  14:36     0:00.00 -\---/usr-\---/libexec-\---/getty Pc ttyv0
(...)
$ env LANG=zh_TW.Big5 LC_CTYPE=C ps axuwww
root       1251   0.0  0.0   10988    2576 v0  Is+  14:36     0:00.00 /usr/libexec/getty Pc ttyv0
$ env LANG=zh_TW.UTF-8 ps axuwww
root       1251   0.0  0.0   10988    2576 v0  Is+  14:36     0:00.00 /usr/libexec/getty Pc ttyv0
(...)
$ env LANG=C ps axuwww | grep getty
root       1251   0.0  0.0   10988    2576 v0  Is+  14:36     0:00.00 /usr/libexec/getty Pc ttyv0
(...)
$ env LANG=en_US.UTF-8 ps axuwww | grep getty
root       1251   0.0  0.0   10988    2576 v0  Is+  14:36     0:00.00 /usr/libexec/getty Pc ttyv0
(...)
$ env LANG=en_US.ISO8859-1 ps axuwww | grep getty
root       1251   0.0  0.0   10988    2576 v0  Is+  14:36     0:00.00 /usr/libexec/getty Pc ttyv0
(...)
Comment 1 Conrad Meyer freebsd_committer 2019-01-21 16:32:53 UTC
I suspect this is libxo-related.
Comment 2 Conrad Meyer freebsd_committer 2019-01-21 16:37:17 UTC
(ISO 8859-1, UTF-8, and ASCII all share the same single-byte encoding of '/'. ps calls setlocale(), while libxo assumes all input is UTF-8. When a non-UTF-8 encoding is in use, ps just passes those strings through to libxo, which probably attempts to encode them again as Big5 or something like that.)
Comment 3 Chen-Yu Tsai 2019-01-21 17:04:18 UTC
Big5 is 7-bit ASCII compatible, so there should be no reason to encode '/' as anything other than just '/'.
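The ASCII-compatibility claim can be checked outside of ps. The following is a minimal sketch using Python's built-in codecs, which are assumed here to agree with the FreeBSD locale data for these characters; the helper name `encoded_bytes` is made up for illustration. It prints the byte encoding of the three characters mentioned in this report under each charset the reporter tried.

```python
# Charsets corresponding to the locales tried in the description.
# Python codec names are used; they are assumed to match the system
# locale encodings for plain ASCII characters.
CHARSETS = ["ascii", "iso-8859-1", "utf-8", "big5"]

# Characters reported as mangled in the ps output.
CHARS = "/-,"


def encoded_bytes(ch, charset):
    """Return the bytes that represent `ch` in `charset`."""
    return ch.encode(charset)


for charset in CHARSETS:
    row = " ".join(
        f"{ch!r}={encoded_bytes(ch, charset).hex()}" for ch in CHARS
    )
    print(f"{charset:<12}{row}")
```

If every row shows the same single byte per character, the extra "-\---" text is not something the Big5 encoding itself would produce for these characters.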
Comment 4 Conrad Meyer freebsd_committer 2019-01-21 17:06:27 UTC
(In reply to Chen-Yu Tsai from comment #3)
If you run 'LANG=zh_TW.Big5 ls / | hd', is '/' encoded as just the 7-bit ASCII '/'?
Comment 5 Conrad Meyer freebsd_committer 2019-01-21 17:07:33 UTC
Hm, 'ls' seems to encode as the usual 0x2f.  I still think this is something xo-related :-).
Comment 6 Conrad Meyer freebsd_committer 2019-01-21 17:09:44 UTC
One other detail: '-' also seems to get butchered, becoming '-\----'.
Comment 7 Conrad Meyer freebsd_committer 2019-01-21 17:11:35 UTC
',' also gets prefixed with the same string ('-\---').  It suggests to me some kind of escape sequence that is then getting converted at least one more time.
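The "same prefix" observation can be checked against the output in the description. This is only a sketch for confirming the pattern, not a fix and not a statement about what libxo does internally; `MARKER` and `strip_marker` are names invented for this example. It removes the literal string `-\---` from the mangled command column and compares the result with the command shown under LC_CTYPE=C.

```python
# Literal prefix observed in front of '/', '-' and ',' in the report.
MARKER = "-\\---"

# Command column copied from the zh_TW.Big5 run in the description.
mangled = r"-\---/usr-\---/libexec-\---/getty Pc ttyv0"

# Command column from the runs that print correctly.
expected = "/usr/libexec/getty Pc ttyv0"


def strip_marker(text, marker=MARKER):
    """Remove every occurrence of the observed prefix from `text`."""
    return text.replace(marker, "")


# True means the mangled output is the correct output with the
# marker inserted before each affected character, and nothing else.
print(strip_marker(mangled) == expected)
```

For the sample line this prints True, which is consistent with a fixed string being inserted before each affected character rather than the characters themselves being re-encoded.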