The iconv converter from GB18030 to UTF-8 is broken: It maps only 63486 characters. It should map 1112064 characters. All valid Unicode code points (U+0000..U+D7FF, U+E000..U+10FFFF) are representable in GB18030. See https://en.wikipedia.org/wiki/GB_18030#Mapping for details.

How to reproduce:
$ cc -Wall -o table-from table-from.c
$ ./table-from GB18030 > GB18030.TXT

Actual output: see actual-GB18030.TXT

Expected output: one of expected-GB18030-2005.TXT (for a GB18030:2005 compliant converter) or expected-GB18030-2022.TXT (for a GB18030:2022 compliant converter).
Created attachment 243269: mapping table extractor
Created attachment 243270: actual GB18030.TXT (compressed)
Created attachment 243271: expected GB18030.TXT for 2005 version (compressed)
Created attachment 243272: expected GB18030.TXT for 2022 version (compressed)