| Summary: | devel/icu: Multibyte character is included in DateTimePatterns for en locale in release 72 | | |
|---|---|---|---|
| Product: | Ports & Packages | Reporter: | Tatsuki Makino <tatsuki_makino> |
| Component: | Individual Port(s) | Assignee: | FreeBSD Office Team <office> |
| Status: | Closed Overcome By Events | | |
| Severity: | Affects Only Me | CC: | vishwin |
| Priority: | --- | Flags: | bugzilla: maintainer-feedback? (office) |
| Version: | Latest | | |
| Hardware: | Any | | |
| OS: | Any | | |
| See Also: | https://bugzilla.mozilla.org/show_bug.cgi?id=1806042 | | |

Description
Tatsuki Makino
2023-01-24 03:45:23 UTC
Created attachment 240299 [details]
Ports only for use in overlays, etc.
This creates an icudt*.dat in which some multibyte characters in the en locale are replaced.
Replacing ${LOCALBASE}/share/icu/72.1/icudt72l.dat with this will eliminate the above problem.
For example, it is easier to see the weather forecast at different times of the day and the extent of rainfall :)

Has this finding been reported to ICU upstream? (I actually just hit this "Invalid Date" problem myself.)

(In reply to Charlie Li from comment #2)
No, I have not done that yet.
As for my thoughts on this issue...
This is not a problem for people using en, en-US, or en-* locales; for them it is correct.
This does not seem to be a problem on the Linux side, which appears to use ICU in the same way.
At least it is not a problem on Android (with the MS Edge browser).
This is more of a problem on the website authors' side.
Feeding the time string produced by the new feature called LocaleString into the old-fashioned Date.parse function is a strange approach.
Therefore, shouldn't we report this to the websites where the problem occurs?
Anyway, I posted here because it didn't seem to be much of an issue on the Linux side, but has Linux outside of Android disappeared? :)

Some application consumers like Mozilla bundle libraries like ICU, which may not be the latest version. I've been hitting this with the en locales myself.

For now, the space character has changed to a multibyte character due to this commit:
https://github.com/unicode-org/cldr/commit/a83026ab8c8fa6ed88f1047c4d0c6089f88b7e5d
This is where it was reflected in ICU:
https://github.com/unicode-org/icu/commit/64b35481263ac4df37a28a9c549553ecc9710db2

(In reply to Charlie Li from comment #4)
Chromium 110 and Firefox 110 bundle ICU 72.

Created attachment 240577 [details]
Experimental patch for devel/icu
It does not use the bundled icudt72l.dat, but rebuilds it instead.
It can be toggled by an option.
To begin with, it may not be usable as is in a big-endian environment.

Created attachment 240578 [details]
experimental patch for devel/icu
It just builds a version that allows the use of the ICU_DATA environment variable.
Running it as env ICU_DATA=/usr/local/share/icudt seamonkey will use different data, such as that from the port in attachment 240299 [details].
This may mean that there is a risk similar to LD_PRELOAD.

It seems that the browser has built in a behavior to convert whitespace characters, but would that still be the case if we took action here?

The problem for me seemed to have disappeared starting with Firefox 110; 109 exhibited the issue.

Chromium also had no more problems as of chromium-110, I think.
I don't know what kind of fix it is, but it may be that a bug like the one below was introduced deliberately :)
Mar/14/2023 10:49 PM

ICU 73.2 seems to have changed this for compatibility reasons.
https://github.com/unicode-org/icu/releases/tag/release-73-2
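
For anyone reading this later, here is a minimal sketch of what "multibyte character" means in this thread and of the kind of whitespace conversion mentioned above. It assumes the character in question is U+202F NARROW NO-BREAK SPACE, which CLDR 42 (shipped with ICU 72) places before AM/PM in en patterns. The helper name normalize_time_spaces and the sample string are hypothetical, and Python is used only for illustration; this is not the fix that any browser or ICU release actually applied.

```python
# Assumption: the multibyte character is U+202F NARROW NO-BREAK SPACE.
NNBSP = "\u202f"


def normalize_time_spaces(text: str) -> str:
    """Hypothetical helper: replace U+202F with a plain ASCII space."""
    return text.replace(NNBSP, " ")


# Hypothetical formatted string, shaped like the example in this thread.
formatted = f"Mar/14/2023 10:49{NNBSP}PM"

# U+202F takes 3 bytes in UTF-8, hence "multibyte".
print(len(NNBSP.encode("utf-8")))        # 3
# The formatted string is no longer pure ASCII.
print(formatted.isascii())               # False
# After normalization, only ASCII spaces remain.
print(normalize_time_spaces(formatted))  # Mar/14/2023 10:49 PM
```

A site that really has to pass formatted output back into a strict parser could apply something like this first, although not round-tripping localized strings through a parser at all is the more robust option.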