pathchk -p ignores codepoints >127 (and all unportable characters behind them). This error seems to be in the latest version in the repository as well. portable() returns the invalid character. Since this is treated as signed, the check >=0 misses unportable characters, which are not in us-ascii but in ISO8859-15 or UTF-8. Fix: Patch attached with submission follows: How-To-Repeat: - Use e.g. ISO8859-15 or UTF-8 as input charset. - execute: pathchk -p "/homeä/öön/foo/öbaör" - observe: no error is returned
Responsible Changed From-To: freebsd-bugs->eadler I'll take it.
Responsible Changed From-To: eadler->freebsd-bugs I won't be dealing with this PR for some time, so give it back to the pool
In PR bin/165988, you wrote: > pathchk -p ignores codepoints >127 (and all unportable characters > behind them). This error seems to be in the latest version in the > repository as well. > portable() returns the invalid character. Since this is treated as > signed, the check >=0 misses unportable characters, which are not in > us-ascii but in ISO8859-15 or UTF-8. Your diagnosis is correct. However, in the case of UTF-8 the message contains a partial character. The easiest way to fix this is to remove the first invalid character from the message as below. POSIX does not require it to be in the message. With more work, it is possible to use mbrtowc() to extract the (wide) character and print it, also taking into account the case where the encoding is invalid. Index: usr.bin/pathchk/pathchk.c =================================================================== --- usr.bin/pathchk/pathchk.c (revision 255496) +++ usr.bin/pathchk/pathchk.c (working copy) @@ -98,7 +98,7 @@ { struct stat sb; long complen, namemax, pathmax, svnamemax; - int badch, last; + int last; char *end, *p, *pathd; if ((pathd = strdup(path)) == NULL) @@ -142,9 +142,9 @@ goto bad; } - if (pflag && (badch = portable(p)) >= 0) { + if (pflag && !portable(p)) { warnx("%s: %s: component contains non-portable " - "character `%c'", path, p, badch); + "character", path, p); goto bad; } @@ -183,8 +183,7 @@ } /* - * Check whether a path component contains only portable characters. Return - * the first non-portable character found. + * Check whether a path component contains only portable characters. */ static int portable(const char *path) @@ -197,7 +196,7 @@ s = strspn(path, charset); if (path[s] != '\0') - return (path[s]); + return (0); - return (-1); + return (1); } -- Jilles Tjoelker
Author: jilles Date: Sun Oct 20 20:10:31 2013 New Revision: 256800 URL: http://svnweb.freebsd.org/changeset/base/256800 Log: pathchk: Ensure bytes >= 128 are considered non-portable characters. This was not broken on architectures such as ARM where char is unsigned. Also, remove the first non-portable character from the output. POSIX does not require this, and printing the first byte may yield an invalid byte sequence with UTF-8. PR: bin/165988 Reported by: Nicolas Rachinsky Modified: head/usr.bin/pathchk/pathchk.c Modified: head/usr.bin/pathchk/pathchk.c ============================================================================== --- head/usr.bin/pathchk/pathchk.c Sun Oct 20 18:40:55 2013 (r256799) +++ head/usr.bin/pathchk/pathchk.c Sun Oct 20 20:10:31 2013 (r256800) @@ -98,7 +98,7 @@ check(const char *path) { struct stat sb; long complen, namemax, pathmax, svnamemax; - int badch, last; + int last; char *end, *p, *pathd; if ((pathd = strdup(path)) == NULL) @@ -142,9 +142,9 @@ check(const char *path) goto bad; } - if (pflag && (badch = portable(p)) >= 0) { + if (pflag && !portable(p)) { warnx("%s: %s: component contains non-portable " - "character `%c'", path, p, badch); + "character", path, p); goto bad; } @@ -183,8 +183,7 @@ bad: free(pathd); } /* - * Check whether a path component contains only portable characters. Return - * the first non-portable character found. + * Check whether a path component contains only portable characters. */ static int portable(const char *path) @@ -197,7 +196,7 @@ portable(const char *path) s = strspn(path, charset); if (path[s] != '\0') - return (path[s]); + return (0); - return (-1); + return (1); } _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
State Changed From-To: open->patched Fixed in 11-current. Thanks for the report.
Responsible Changed From-To: freebsd-bugs->jilles I committed the change.
A commit references this bug: Author: jilles Date: Sun Aug 31 20:34:07 UTC 2014 New revision: 270890 URL: http://svnweb.freebsd.org/changeset/base/270890 Log: MFC r256800: pathchk: Ensure bytes >= 128 are considered non-portable characters. This was not broken on architectures such as ARM where char is unsigned. Also, remove the first non-portable character from the output. POSIX does not require this, and printing the first byte may yield an invalid byte sequence with UTF-8. PR: 165988 Reported by: Nicolas Rachinsky Relnotes: yes Changes: _U stable/10/ stable/10/usr.bin/pathchk/pathchk.c
A commit references this bug: Author: jilles Date: Sat Sep 20 12:59:28 UTC 2014 New revision: 271902 URL: http://svnweb.freebsd.org/changeset/base/271902 Log: MFC r256800: pathchk: Ensure bytes >= 128 are considered non-portable characters. This was not broken on architectures such as ARM where char is unsigned. Also, remove the first non-portable character from the output. POSIX does not require this, and printing the first byte may yield an invalid byte sequence with UTF-8. PR: 165988 Reported by: Nicolas Rachinsky Relnotes: yes Changes: _U stable/9/usr.bin/pathchk/ stable/9/usr.bin/pathchk/pathchk.c
Fixed in stable/9 and newer branches. No MFC to older branches is planned.