How to reproduce:

```
#!/bin/sh
rm -Rf d e
mkdir d
touch d/`printf '\306'`
mkdir e
tar -c -f - d | tar -C e -x -f -
```

Doing this with empty $LANG leads to

```
# sh test.sh
: Can't translate pathname 'd/Ж' to UTF-8
```

However, directory `d` is properly copied into `e`. This error message disappears with `LANG=en_US.ISO8859-1`.

I'm not exactly sure what this error message means, but in any case it is very unclear and may be interpreted as "the file was not archived". Also, I don't know why tar even tries to do charset translation. It should be binary-safe with respect to filenames by default.
This started happening in FreeBSD 10. Before that, tar never tried to do any charset translation.
It's a non-fatal warning that changes the exit status to non-zero but, as you note, does not prevent a correct copy. libarchive switches the copy mode from encoding-aware (UTF-8 by default, I guess) to binary mode when it prints that text.
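One way to observe the exit-status effect is to run the two tar steps separately, since a shell pipeline only reports the status of its last command. This is a minimal sketch, assuming the warning comes from the archiving side; the scratch directory and the file name `a.tar` are made up for illustration and are not part of the original report:

```sh
#!/bin/sh
# Sketch: run create and extract as separate commands so each exit status is visible.
# The scratch directory and a.tar are illustrative names only.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
mkdir d e
touch "d/$(printf '\306')"      # file name is the single byte 0xC6

tar -c -f a.tar d
create_status=$?
tar -C e -x -f a.tar
extract_status=$?

echo "create: $create_status, extract: $extract_status"

# Dump the names byte by byte so the comparison does not depend on the locale.
ls d | od -c
ls e/d | od -c
```

Running it once with `LANG` unset and once with `LANG=en_US.ISO8859-1` should show whether only the status and the message change while the copied name stays the same.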
This is specified by POSIX's pax: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html "If there is a hdrcharset extended header in effect for a file, the value field for any gname, linkpath, path, and uname extended header records shall be encoded using the character set specified by the hdrcharset extended header record; otherwise, the value field shall be encoded using UTF-8. The value field for all other keywords specified by POSIX.1-2017 shall be encoded using UTF-8."
(Prior to FreeBSD 10, the default tar format was likely the older "ustar" instead of "pax".)
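To check that guess, the same copy can be forced to the older format. This is only a sketch, assuming the tar in use accepts `--format ustar` (bsdtar documents that option); whether the warning actually goes away is something to observe, not something this snippet guarantees:

```sh
#!/bin/sh
# Sketch: same copy as in the report, but with the archive format forced to ustar.
# The scratch directory is an illustrative choice, not part of the original report.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
mkdir d e
touch "d/$(printf '\306')"      # file name is the single byte 0xC6

tar -c --format ustar -f - d | tar -C e -x -f -
echo "pipeline exit status (last tar only): $?"

# Dump the extracted name byte by byte, independent of the locale.
ls e/d | od -c
```

Note that ustar has its own limits (for example on path length), so this is a diagnostic rather than a general fix.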
The filesystem has no internal charset, so it is weird to translate from no charset (= BINARY) to any explicit charset. It is also not good that the resulting archive depends on the environment variable $LANG, which is intended for run-time localization and not for abstract data processing. Finally, this behavior is not documented in the tar manpage.