Bug 252446

Summary: egrep bug with trailing backslash (\)
Product: Base System Reporter: fernando.valle
Component: bin Assignee: Kyle Evans <kevans>
Status: Closed FIXED    
Severity: Affects Many People CC: alfredo, bugs, emaste, kevans, olivier
Priority: --- Flags: kevans: mfc-stable12+, kevans: mfc-stable11-
Version: CURRENT   
Hardware: Any   
OS: Any   
URL: https://reviews.freebsd.org/D27983

Description fernando.valle 2021-01-05 19:48:36 UTC
Running the test /usr/tests/usr.bin/du/du_test:H_flag produces the following failure:

stderr:
egrep: trailing backslash (\)

The expression that fails the test is:
egrep -q "[0-9]+\t$(echo $paths1 | tr ' ' "$sep")\n" du.out

I ran the test on amd64 and powerpc64 (main-c255460-g282381aa5); the same error occurs on both.

It seems that grep is currently experiencing some problem with a trailing backslash.
Comment 1 Kyle Evans freebsd_committer freebsd_triage 2021-01-05 20:30:53 UTC
(In reply to fernando.valle from comment #0)

Hi,

Thanks! I received a report just earlier today about this, too; what's going on here is that bsdgrep doesn't know that \t means tab and neither does the underlying regex engine (this is correct by the spec). The underlying regex engine rejects it now because it doesn't have any special meaning.

Interestingly enough, this only worked by coincidence with gnugrep. gnugrep *also* doesn't understand \t => tab but instead opted to just silently interpret it as a 't'. The reason it appears to succeed is that the pattern argument breaks down like so (with \t translated to t):

<<EOF
[0-9]+ttestdir/A/B
testdir/A
testdir/C
testdir
EOF

Note that this is four (4) distinct patterns; the first one never matches, while the latter three do. You can confirm this with gnugrep -o (I manually ran the test here):

<<EOF
root@viper:/usr/tests/usr.bin/du# /usr/local/bin/grep -E "[0-9]+\t$(echo "testdir/A/B testdir/A testdir/C testdir" | tr ' ' "\n[0-9]+\t")\n" -o du.out
testdir/A
testdir/A
testdir/C
EOF

I will fix the test.
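
To see the breakdown described above without involving grep at all, the pattern can be printed exactly as the shell builds it. This is only a sketch: the variable names (paths, sep, pattern) are chosen for illustration, and the values are taken from the command quoted above.

```shell
# Same paths and separator as the gnugrep command quoted above.
paths="testdir/A/B testdir/A testdir/C testdir"
sep="\n[0-9]+\t"

# tr maps only the first character of string1 (space) to the first
# character of string2 (newline); the rest of $sep is never emitted.
pattern="[0-9]+\t$(echo "$paths" | tr ' ' "$sep")\n"

# Print the pattern exactly as grep receives it, one sub-pattern per line.
printf '%s\n' "$pattern"
```

The output has four lines, and only the first carries the "[0-9]+\t" prefix. The backslash sequences in it are still literal backslashes, since the shell leaves \t and \n alone inside double quotes.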
Comment 2 commit-hook freebsd_committer freebsd_triage 2021-01-07 22:37:41 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4832d2e8ae1df6f907ac00275764f8135722cb7e

commit 4832d2e8ae1df6f907ac00275764f8135722cb7e
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2021-01-05 21:33:06 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2021-01-07 22:36:31 +0000

    du: tests: fix the H_flag test (primarily grep usage)

    This test attempts to use \t (tab intended) in a grep expression.  With the
    former /usr/bin/grep (i.e. gnugrep), this was interpreted as a literal 't'.
    The expression would work anyways because the tr(1) usage would ultimately
    replace all of the spaces with a single newline, and they would match the
    paths whether they were correctly formatted or not.

    Current /usr/bin/grep (i.e. bsdgrep) is less tolerant of ordinary-escapes, a
    property of the underlying regex(3) engine, to make it easier to identify
    when stuff like this happens. In fact, this expression broke after the
    switch happened.

    This revision does the bare basics to fix the usage by using a printf to get
    a literal tab character to insert into the expression. It also swaps out the
    manual insertion of the line prefix into the grep expression by pulling
    that part out of $sep and reusing it for the leading path.

    The secondary issue was the tr(1) usage, since tr would only replace the
    first character of string1 with the first character of string2.  This has
    instead been replaced by a sed expression, which similarly understands \n to
    be a newline on all supported versions of FreeBSD.  Each path now gets
    prefixed with the appropriate context that should be there (i.e. numeric
    sequence followed by a tab).

    PR:             252446
    Reviewed by:    emaste, ngie
    Differential Revision:  https://reviews.freebsd.org/D27983

 usr.bin/du/tests/du_test.sh | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
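
The approach described in the commit message can be sketched roughly as follows. This is not the committed change to du_test.sh: the variable names (tab, prefix, pattern, out) and the sample sizes are made up for illustration, and it assumes a sed that treats \n in the replacement as a newline, as the commit message says FreeBSD's does.

```shell
# Hypothetical names (tab, prefix, pattern, out) used for illustration only.
tab=$(printf '\t')                 # a real tab character, not backslash-t
prefix="^[0-9]+${tab}"             # numeric size column followed by a tab
paths="testdir/A/B testdir/A testdir/C testdir"

# sed replaces every space with: end-of-line anchor, newline, line prefix.
pattern="${prefix}$(echo "$paths" | sed -e "s| |\$\\n${prefix}|g")\$"

# Sample data shaped like du output (sizes are made up).
out=$(mktemp)
printf '4\ttestdir/A/B\n8\ttestdir/A\n4\ttestdir/C\n16\ttestdir\n' > "$out"

# Count the lines matched by any of the four newline-separated patterns.
matches=$(grep -E -c "$pattern" "$out")
rm -f "$out"
echo "$matches"
```

Because every path now carries the numeric prefix and both anchors, a line such as "testdir/A" with no size column no longer matches, which is the context the old tr-based pattern failed to enforce.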
Comment 3 commit-hook freebsd_committer freebsd_triage 2021-01-24 04:06:07 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4b74a4d4e26788ae8e47ec10172ac80ce435dbb5

commit 4b74a4d4e26788ae8e47ec10172ac80ce435dbb5
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2021-01-05 21:33:06 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2021-01-24 04:04:55 +0000

    du: tests: fix the H_flag test (primarily grep usage)

    This test attempts to use \t (tab intended) in a grep expression.  With the
    former /usr/bin/grep (i.e. gnugrep), this was interpreted as a literal 't'.
    The expression would work anyways because the tr(1) usage would ultimately
    replace all of the spaces with a single newline, and they would match the
    paths whether they were correctly formatted or not.

    Current /usr/bin/grep (i.e. bsdgrep) is less tolerant of ordinary-escapes, a
    property of the underlying regex(3) engine, to make it easier to identify
    when stuff like this happens. In fact, this expression broke after the
    switch happened.

    This revision does the bare basics to fix the usage by using a printf to get
    a literal tab character to insert into the expression. It also swaps out the
    manual insertion of the line prefix into the grep expression by pulling
    that part out of $sep and reusing it for the leading path.

    The secondary issue was the tr(1) usage, since tr would only replace the
    first character of string1 with the first character of string2.  This has
    instead been replaced by a sed expression, which similarly understands \n to
    be a newline on all supported versions of FreeBSD.  Each path now gets
    prefixed with the appropriate context that should be there (i.e. numeric
    sequence followed by a tab).

    PR:             252446

    (cherry picked from commit 4832d2e8ae1df6f907ac00275764f8135722cb7e)

 usr.bin/du/tests/du_test.sh | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
Comment 4 Kyle Evans freebsd_committer freebsd_triage 2021-01-24 04:06:38 UTC
Fixed, thanks!