Summary: | egrep bug with trailing backslash (\) | ||
---|---|---|---|
Product: | Base System | Reporter: | fernando.valle |
Component: | bin | Assignee: | Kyle Evans <kevans> |
Status: | Closed FIXED | ||
Severity: | Affects Many People | CC: | alfredo, bugs, emaste, kevans, olivier |
Priority: | --- | Flags: | kevans: mfc-stable12+, kevans: mfc-stable11- |
Version: | CURRENT | ||
Hardware: | Any | ||
OS: | Any | ||
URL: | https://reviews.freebsd.org/D27983 |
Description
fernando.valle
2021-01-05 19:48:36 UTC
(In reply to fernando.valle from comment #0)

Hi,

Thanks! I received a report just earlier today about this, too; what's going on here is that bsdgrep doesn't know that \t means tab and neither does the underlying regex engine (this is correct by the spec). The underlying regex engine rejects it now because it doesn't have any special meaning.

Interestingly enough, this only worked by coincidence with gnugrep. gnugrep *also* doesn't understand \t => tab but instead opted to just silently interpret it as a 't'. The reason it appears to succeed is that the pattern argument breaks down like so (with \t translated to t):

<<EOF
[0-9]+ttestdir/A/B
testdir/A
testdir/C
testdir
EOF

Note that this is four (4) distinct patterns; the first one never matches, while the latter three do. You can confirm this with gnugrep -o (I manually ran the test here):

<<EOF
root@viper:/usr/tests/usr.bin/du# /usr/local/bin/grep -E "[0-9]+\t$(echo "testdir/A/B testdir/A testdir/C testdir" | tr ' ' "\n[0-9]+\t")\n" -o du.out
testdir/A
testdir/A
testdir/C
EOF

I will fix the test.

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4832d2e8ae1df6f907ac00275764f8135722cb7e

commit 4832d2e8ae1df6f907ac00275764f8135722cb7e
Author: Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2021-01-05 21:33:06 +0000
Commit: Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2021-01-07 22:36:31 +0000

du: tests: fix the H_flag test (primarily grep usage)

This test attempts to use \t (tab intended) in a grep expression. With the former /usr/bin/grep (i.e. gnugrep), this was interpreted as a literal 't'. The expression would work anyway because the tr(1) usage would ultimately replace all of the spaces with a single newline, and they would match the paths whether they were correctly formatted or not.

Current /usr/bin/grep (i.e. bsdgrep) is less tolerant of ordinary-escapes, a property of the underlying regex(3) engine, to make it easier to identify when stuff like this happens.
In fact, this expression broke after the switch happened.

This revision does the bare basics to fix the usage by using a printf to get a literal tab character to insert into the expression. It also swaps out the manual insertion of the line prefix into the grep expression by pulling that part out of $sep and reusing it for the leading path.

The secondary issue was the tr(1) usage, since tr would only replace the first character of string1 with the first character of string2. This has instead been replaced by a sed expression, which similarly understands \n to be a newline on all supported versions of FreeBSD. Each path now gets prefixed with the appropriate context that should be there (i.e. numeric sequence followed by a tab).

PR: 252446
Reviewed by: emaste, ngie
Differential Revision: https://reviews.freebsd.org/D27983

usr.bin/du/tests/du_test.sh | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4b74a4d4e26788ae8e47ec10172ac80ce435dbb5

commit 4b74a4d4e26788ae8e47ec10172ac80ce435dbb5
Author: Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2021-01-05 21:33:06 +0000
Commit: Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2021-01-24 04:04:55 +0000

du: tests: fix the H_flag test (primarily grep usage)

This test attempts to use \t (tab intended) in a grep expression. With the former /usr/bin/grep (i.e. gnugrep), this was interpreted as a literal 't'. The expression would work anyway because the tr(1) usage would ultimately replace all of the spaces with a single newline, and they would match the paths whether they were correctly formatted or not.

Current /usr/bin/grep (i.e. bsdgrep) is less tolerant of ordinary-escapes, a property of the underlying regex(3) engine, to make it easier to identify when stuff like this happens. In fact, this expression broke after the switch happened.

This revision does the bare basics to fix the usage by using a printf to get a literal tab character to insert into the expression. It also swaps out the manual insertion of the line prefix into the grep expression by pulling that part out of $sep and reusing it for the leading path.

The secondary issue was the tr(1) usage, since tr would only replace the first character of string1 with the first character of string2. This has instead been replaced by a sed expression, which similarly understands \n to be a newline on all supported versions of FreeBSD. Each path now gets prefixed with the appropriate context that should be there (i.e. numeric sequence followed by a tab).

PR: 252446
(cherry picked from commit 4832d2e8ae1df6f907ac00275764f8135722cb7e)

usr.bin/du/tests/du_test.sh | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

Fixed, thanks! |
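
For anyone hitting the same thing in their own scripts, here is a rough sketch of the two problems described in the commit message and one way around them. It is not the committed change to du_test.sh: the sample data and all variable names (sample, paths, prefix, pattern and so on) are made up for illustration, and it builds the pattern list with a shell loop rather than sed so that it does not depend on how a given sed treats \n in a replacement.

```shell
#!/bin/sh
# Hypothetical stand-in for du output: "<blocks><TAB><path>", one per line.
tab=$(printf '\t')
sample=$(printf '8\ttestdir/A/B\n12\ttestdir/A\n4\ttestdir/C\n20\ttestdir\n')
paths='testdir/A/B testdir/A testdir/C testdir'

# Problem: tr maps characters one-to-one, so the single space in string1 can
# only become the first character of string2 (the newline). The rest of
# string2, including the intended "[0-9]+<TAB>" prefix, is never used.
broken=$(echo "$paths" | tr ' ' '\n[0-9]+\t')

# One possible fix: take a literal tab from printf and give every path its
# own anchored pattern. grep treats newline-separated patterns as alternatives.
prefix="^[0-9]+${tab}"
nl='
'
pattern=''
for p in $paths; do
    pattern="${pattern:+${pattern}${nl}}${prefix}${p}\$"
done

# The anchored patterns match every well-formed sample line.
strict_count=$(printf '%s\n' "$sample" | grep -E -c "$pattern" || true)

# A line with no size and no tab is accepted by the bare path list that tr
# produced, but rejected by the anchored patterns.
malformed='testdir/A'
loose_bad=$(printf '%s\n' "$malformed" | grep -E -c "$broken" || true)
strict_bad=$(printf '%s\n' "$malformed" | grep -E -c "$pattern" || true)

echo "well-formed lines matched by anchored patterns: $strict_count"
echo "malformed line matched by tr-built patterns:    $loose_bad"
echo "malformed line matched by anchored patterns:    $strict_bad"
```

The loop is a deliberate simplification; the sed-based approach from the commit message should behave the same way on systems where sed accepts \n in the replacement text.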