Bug 234010 - Lack of Unicode support in strfmon breaks monetary formatting
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: misc
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Conrad Meyer
URL: https://reviews.freebsd.org/D18605
Reported: 2018-12-14 11:31 UTC by Jon Tejnung
Modified: 2018-12-19 23:04 UTC



Attachments
tentative patch to fix the issue (3.21 KB, patch)
2018-12-15 10:27 UTC, Conrad Meyer

Description Jon Tejnung 2018-12-14 11:31:13 UTC
This bug showed up when using the PHP function money_format with LC_MONETARY set to sv_SE.UTF-8 to format a currency amount. Instead of a space as the thousands separator, the output contained a ?-symbol. I'm not very good at this, but this is what I have come up with:

* FreeBSD takes its locale data from http://cldr.unicode.org
* In at least sv_SE.UTF-8, the value that ends up as the monetary thousands separator is a multi-byte UTF-8 sequence (C2 A0, i.e. U+00A0 NO-BREAK SPACE).
* localeconv fetches the data from /usr/share/locale/sv_SE.UTF-8/LC_MONETARY and returns a pointer to it.
* __format_grouped_double (with the description "convert double to ASCII") calls localeconv to fetch the thousands separator and seems to use only its first byte (C2).
* strfmon calls __format_grouped_double to group the digits of the string, so it returns bad strings for locales whose separator is a multi-byte character.
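
To illustrate the difference between copying one byte and copying the whole separator, here is a minimal sketch. It is not the libc code and not the committed fix: group_digits is a hypothetical helper, and it assumes the separator is available as a NUL-terminated byte string, the way localeconv exposes mon_thousands_sep.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only (not from strfmon.c).
 * Copies `digits` into `dst`, inserting `sep` between groups of three
 * digits counted from the right. `sep` is copied in full, so a
 * multi-byte separator such as "\xc2\xa0" stays intact.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * `dst` is too small.
 */
static int
group_digits(char *dst, size_t dstsz, const char *digits, const char *sep)
{
        size_t ndigits, seplen, i, out = 0;

        assert(dst != NULL && digits != NULL && sep != NULL);
        if (dstsz == 0)
                return (-1);
        ndigits = strlen(digits);
        seplen = strlen(sep);

        for (i = 0; i < ndigits; i++) {
                /* A group boundary sits where the remaining digit count is a multiple of 3. */
                if (i > 0 && (ndigits - i) % 3 == 0) {
                        if (out + seplen >= dstsz)
                                return (-1);
                        /* Copy every byte of the separator, not just sep[0]. */
                        memcpy(dst + out, sep, seplen);
                        out += seplen;
                }
                if (out + 1 >= dstsz)
                        return (-1);
                dst[out++] = digits[i];
        }
        dst[out] = '\0';
        return ((int)out);
}
```

Called with the digits "123456" and the separator "\xc2\xa0", this writes 31 32 33 c2 a0 34 35 36. Copying only sep[0] would leave a lone c2 in the middle of the digits, which is not valid UTF-8 on its own.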


How to reproduce:

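char buf[80];
double money = 123456.78;
size_t i;

setlocale(LC_MONETARY, "sv_SE.UTF-8");
strfmon(buf, sizeof(buf), "%i", money);
printf("%s\n", buf);
/* Dump each byte of the result to expose the truncated separator. */
for (i = 0; buf[i] != '\0'; i++) {
        printf("%2zu - %i\n", i, buf[i]);
}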
Comment 1 Conrad Meyer freebsd_committer freebsd_triage 2018-12-15 07:51:15 UTC
Present on CURRENT as well.  Full reproducer program:

#include <locale.h>
#include <monetary.h>
#include <stdio.h>

int
main(void) {
        char buf[80];
        double money = 123456.78;

        setlocale(LC_MONETARY, "sv_SE.UTF-8");
        strfmon(buf, sizeof(buf), "%i", money);
        printf("'%s'\n", buf);

        return 0;
}

$ ./a.exe | hd
00000000  27 31 32 33 c2 34 35 36  2c 37 38 20 53 45 4b 20  |'123.456,78 SEK |
//            1  2  3 ^^  4  5  6  ...
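
The lone c2 in the dump above is a UTF-8 lead byte with no continuation byte after it, which would explain the ?-symbol in the original report. A minimal sketch of a check for that condition follows. utf8_well_formed is a hypothetical helper, not part of libc or of the patch, and it is deliberately simplified: it only verifies that each lead byte is followed by the right number of continuation bytes, and does not reject overlong encodings or surrogates.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper for illustration only.
 * Returns 1 if every lead byte in s[0..len) is followed by the number
 * of continuation bytes it announces, 0 otherwise.
 * Simplified: overlong forms and surrogate code points are not rejected.
 */
static int
utf8_well_formed(const unsigned char *s, size_t len)
{
        size_t i = 0, need;

        assert(s != NULL || len == 0);
        while (i < len) {
                unsigned char c = s[i++];

                if (c < 0x80)
                        need = 0;               /* ASCII */
                else if ((c & 0xe0) == 0xc0)
                        need = 1;               /* 2-byte sequence */
                else if ((c & 0xf0) == 0xe0)
                        need = 2;               /* 3-byte sequence */
                else if ((c & 0xf8) == 0xf0)
                        need = 3;               /* 4-byte sequence */
                else
                        return (0);             /* stray continuation or invalid lead byte */

                for (; need > 0; need--) {
                        if (i >= len || (s[i] & 0xc0) != 0x80)
                                return (0);     /* sequence cut short */
                        i++;
                }
        }
        return (1);
}
```

Fed the bytes 27 31 32 33 c2 34 35 36 from the dump, the check fails at the byte following c2. With c2 a0 in that position it passes.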
Comment 2 Conrad Meyer freebsd_committer freebsd_triage 2018-12-15 10:27:08 UTC
Created attachment 200130
tentative patch to fix the issue

I'd like to add some testing as well.

LD_PRELOAD=$(make -C lib/libc -V .OBJDIR)/libc.so.7.full ~/a.exe | xxd
00000000: 2731 3233 c2a0 3435 362c 3738 2053 454b  '123..456,78 SEK
                    ^^^^
Comment 3 Conrad Meyer freebsd_committer freebsd_triage 2018-12-19 02:30:57 UTC
Posted patch, plus testcase, on phabricator: https://reviews.freebsd.org/D18605
Comment 4 commit-hook freebsd_committer freebsd_triage 2018-12-19 22:58:24 UTC
A commit references this bug:

Author: cem
Date: Wed Dec 19 22:57:48 UTC 2018
New revision: 342260
URL: https://svnweb.freebsd.org/changeset/base/342260

Log:
  Allow multi-byte thousands separators in strfmon(3)

  PR:	234010
  Reported by:	Jon Tejnung <jon AT herrskogen.se>
  Reviewed by:	yuripv
  Differential Revision:	https://reviews.freebsd.org/D18605

Changes:
  head/lib/libc/stdlib/strfmon.c
  head/lib/libc/tests/stdlib/Makefile
  head/lib/libc/tests/stdlib/strfmon_test.c