Summary: | libc++: std::wcout does not use global locale set via setlocale() | | |
---|---|---|---|
Product: | Base System | Reporter: | Yuri Victorovich <yuri> |
Component: | bin | Assignee: | freebsd-bugs (Nobody) <bugs> |
Status: | New --- | | |
Severity: | Affects Only Me | CC: | dim, emaste, jokehuang91, sechanyang3210, yuripv |
Priority: | --- | | |
Version: | CURRENT | | |
Hardware: | Any | | |
OS: | Any | | |
URL: | https://bugs.llvm.org/show_bug.cgi?id=48444 | | |
Description
Yuri Victorovich
Reproduced this on current as well. Note that it seems to be specific to clang (libc++?): compiling with gcc9 from ports shows correct behavior.

Searching the web a bit for this got me the following, which apparently "fixes" the issue:

    std::locale mylocale("");
    std::wcout.imbue(mylocale);

As I know next to nothing about C++, I wonder why the requirements differ between clang (libc++?) and gcc (libstdc++?).

(In reply to Yuri Pankov from comment #2)

I don't think "std::wcout.imbue(mylocale);" should be required. std::wcout should be initialized with the currently chosen locale.

This works with clang-10:

> #include <iostream>
> #include <locale>
>
> int main() {
>   std::locale mylocale("");
>   std::wcout.imbue(mylocale);
>   std::wcout << L'>' << L'◯' << L'<' << std::endl;
> }

but with gcc-9 and gcc-10 it fails:

> $ ./a.out
> terminate called after throwing an instance of 'std::runtime_error'
>   what():  locale::facet::_S_create_c_locale name not valid
> Abort trap

(In reply to Yuri Victorovich from comment #4)

So it's full of wonders. For clang you need:

    std::wcout.imbue(std::locale(""));

...and for gcc you need:

    setlocale(LC_ALL, "");

Reproduced the same with clang/libc++ 10/11 on Debian, so it does not seem to be FreeBSD specific.

With both clang and gcc this line
> std::cout << std::wcout.getloc().name() << std::endl;
shows the locale in std::wcout defaults to "C" when it should default to the current user's locale.
Without this, std::wcout isn't usable from libraries: a library has to use std::wcout in its default state, and that state does not correspond to the user's locale unless the top-level program sets it on std::wcout.
(In reply to Yuri Victorovich from comment #6)

Everything (well, almost) defaults to the C locale, including printf(). For example, the following will fail without a setlocale() call:

    printf("printf=%C\n", L'◯');

And it looks like the problem is that libc++'s wcout does NOT use the global locale set via that call, while libstdc++'s does. Whether it is a bug or a deliberate choice, I have no idea. Dimitry, any thoughts?

See e.g.: https://gcc.gnu.org/onlinedocs/libstdc++/manual/localization.html#locale.impl.c which says:

> From Josuttis, p. 697-698, which says, that "there is only *one*
> relation (of the C++ locale mechanism) to the C locale mechanism: the
> global C locale is modified if a named C++ locale object is set as
> the global locale" (emphasis Paolo), that is:
>
>   std::locale::global(std::locale(""));
>
> affects the C functions as if the following call was made:
>
>   std::setlocale(LC_ALL, "");
>
> On the other hand, there is *no* vice versa, that is, calling
> setlocale has *no* [effect] whatsoever on the C++ locale mechanism, in
> particular on the working of locale(""), which constructs the locale
> object from the environment of the running program, that is, in
> practice, the set of LC_ALL, LANG, etc. variable of the shell.

The above wording is also found in e.g. the C++11 standard, in [locale.statics]:

> static locale global(const locale& loc);
>
> 1. Sets the global locale to its argument.
>
> 2. Effects: Causes future calls to the constructor locale() to return
>    a copy of the argument. If the argument has a name, does
>
>      std::setlocale(LC_ALL, loc.name().c_str());
>
>    otherwise, the effect on the C locale, if any, is
>    implementation-defined. No library function other than
>    locale::global() shall affect the value returned by locale().
>    [Note: See 22.6 for data race considerations when setlocale is
>    invoked.]
>
> 3. Returns: The previous value of locale().
(In reply to Dimitry Andric from comment #8)

So libstdc++'s wcout being affected by setlocale() call is just an implementation choice, the one that libc++ didn't make?

I asked a similar question in the libc++ bugtracker. Maybe they would have some insight about std::wcout's locale default.

(In reply to Yuri Pankov from comment #9)
> So libstdc++'s wcout being affected by setlocale() call is just an
> implementation choice, the one that libc++ didn't make?

Apparently, although that documentation link from libstdc++ that I pasted doesn't really tell anything about it, except maybe the part:

> Locale initialization: at what point does _S_classic, _S_global get
> initialized? Can named locales assume this initialization has already taken
> place?

but it seems this doc article is very old.

Looking at libstdc++'s implementation, it appears they initialize a default locale() object here: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/src/c%2B%2B98/ios_locale.cc#l44

    // Called only by basic_ios<>::init.
    void
    ios_base::_M_init() throw()
    {
      // NB: May be called more than once
      _M_precision = 6;
      _M_width = 0;
      _M_flags = skipws | dec;
      _M_ios_locale = locale();
    }

This default locale() object is constructed in https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/src/c%2B%2B98/locale_init.cc, but it seems like a separate copy of a C-like locale by default, e.g. it has:

    void
    locale::_S_initialize_once() throw()
    {
      // 2 references.
      // One reference for _S_classic, one for _S_global
      _S_classic = new (&c_locale_impl) _Impl(2);
      _S_global = _S_classic;
      new (&c_locale) locale(_S_classic);
    }

and the _Impl constructor is:

    // Construct "C" _Impl.
    locale::_Impl::
    _Impl(size_t __refs) throw()
    : _M_refcount(__refs), _M_facets(0), _M_facets_size(num_facets),
      _M_caches(0), _M_names(0)
    {
      _M_facets = new (&facet_vec) const facet*[_M_facets_size]();
      _M_caches = new (&cache_vec) const facet*[_M_facets_size]();

      // Name the categories.
      _M_names = new (&name_vec) char*[_S_categories_size]();
      _M_names[0] = new (&name_c[0]) char[2];
      std::memcpy(_M_names[0], locale::facet::_S_get_c_name(), 2);

      // This is needed as presently the C++ version of "C" locales
      // != data in the underlying locale model for __timepunct,
      // numpunct, and moneypunct. Also, the "C" locales must be
      // constructed in a way such that they are pre-allocated.
      // NB: Set locale::facets(ref) count to one so that each individual
      // facet is not destroyed when the locale (and thus locale::_Impl) is
      // destroyed.
      _M_init_facet(new (&ctype_c) std::ctype<char>(0, false, 1));
      _M_init_facet(new (&codecvt_c) codecvt<char, char, mbstate_t>(1));
      ... much more of this ...
    }

So I think what you're seeing with libstdc++ is intentional, in the sense that they have a default locale which is sort-of the same as the default C locale (or even C.UTF-8). The only call to setlocale() in that .cc file is when you call std::locale::global(), as indicated in the docs.