Bug 252446 - egrep bug with trailing backslash (\)
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Many People
Assignee: Kyle Evans
URL: https://reviews.freebsd.org/D27983
Depends on:
Reported: 2021-01-05 19:48 UTC by fernando.valle
Modified: 2021-01-07 22:37 UTC
kevans: mfc-stable12?
kevans: mfc-stable11?

Description fernando.valle 2021-01-05 19:48:36 UTC
Running the test /usr/tests/usr.bin/du/du_test:H_flag fails with the following error:

egrep: trailing backslash (\)

The expression that fails the test is:
egrep -q "[0-9]+\t$(echo $paths1 | tr ' ' "$sep")\n" du.out

I ran the test on amd64 and powerpc64 (main-c255460-g282381aa5); the same error occurs on both.

It seems that grep currently has a problem handling a trailing backslash.
Comment 1 Kyle Evans freebsd_committer 2021-01-05 20:30:53 UTC
(In reply to fernando.valle from comment #0)


Thanks! I received a report about this earlier today, too. What's going on here is that bsdgrep doesn't know that \t means tab, and neither does the underlying regex engine (which is correct per the spec). The underlying regex engine now rejects \t because the escape has no special meaning.
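
As an illustration of the point above, here is a minimal sketch of one portable way to get a tab into a grep -E pattern: let printf(1) produce the character and interpolate it, rather than writing \t in the regex. The variable names and the sample line are made up for this example and are not taken from du_test.sh.

```shell
#!/bin/sh
# Command substitution strips trailing newlines, not tabs, so $tab holds
# exactly one TAB character.
tab=$(printf '\t')

# Hypothetical sample line in the "<number><TAB><path>" shape of du output.
line=$(printf '12\ttestdir')

# The pattern contains a real tab character instead of the two-character
# sequence backslash-t, so no regex escape handling is involved.
if printf '%s\n' "$line" | grep -E -q "^[0-9]+${tab}testdir\$"; then
    matched=yes
else
    matched=no
fi
echo "literal tab pattern matched: $matched"
```

Because the tab is expanded by the shell before grep sees the pattern, this does not depend on how a particular grep or regex(3) implementation treats an undefined escape.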

Interestingly enough, this only worked by coincidence with gnugrep. gnugrep *also* doesn't understand \t => tab but instead opted to just silently interpret it as a 't'. The reason it appears to succeed is that the pattern argument breaks down like so (with \t translated to t):


Note that this is four (4) distinct patterns; the first one never matches, while the latter three do. You can confirm this with gnugrep -o (I manually ran the test here):

root@viper:/usr/tests/usr.bin/du# /usr/local/bin/grep -E "[0-9]+\t$(echo "testdir/A/B testdir/A testdir/C testdir" | tr ' ' "\n[0-9]+\t")\n" -o du.out

I will fix the test.
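
For readers who want to see why the expression ends up as several separate patterns, the following sketch shows what tr(1) does with the separator. The value of sep is assumed from the command shown above (the test's actual $sep is not quoted in this report), and the path list is the one used in that command.

```shell
#!/bin/sh
paths='testdir/A/B testdir/A testdir/C testdir'
# Assumed value of $sep, copied from the tr argument in the command above.
sep='\n[0-9]+\t'

# tr maps character to character: the single space in string1 is paired
# only with the first character of string2, which tr itself decodes from
# \n to a newline. The rest of string2 is never used.
out=$(echo "$paths" | tr ' ' "$sep")
printf '%s\n' "$out"

# Count the resulting lines; grep treats each line of a pattern argument
# as its own pattern.
lines=$(printf '%s\n' "$out" | wc -l | tr -d ' ')
echo "number of patterns: $lines"
```

The "[0-9]+\t" context is therefore never inserted in front of the later paths; only the first path keeps the prefix that was written by hand in the grep expression.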
Comment 2 commit-hook freebsd_committer 2021-01-07 22:37:41 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4832d2e8ae1df6f907ac00275764f8135722cb7e

commit 4832d2e8ae1df6f907ac00275764f8135722cb7e
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2021-01-05 21:33:06 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2021-01-07 22:36:31 +0000

    du: tests: fix the H_flag test (primarily grep usage)

    This test attempts to use \t (tab intended) in a grep expression.  With the
    former /usr/bin/grep (i.e. gnugrep), this was interpreted as a literal 't'.
    The expression would work anyways because the tr(1) usage would ultimately
    replace all of the spaces with a single newline, and they would match the
    paths whether they were correctly formatted or not.

    Current /usr/bin/grep (i.e. bsdgrep) is less tolerant of ordinary escapes, a
    property of the underlying regex(3) engine, to make it easier to identify
    when stuff like this happens. In fact, this expression broke after the
    switch happened.

    This revision does the bare basics to fix the usage by using a printf to get
    a literal tab character to insert into the expression. It also swaps out the
    manual insertion of the line prefix into the grep expression by pulling
    that part out of $sep and reusing it for the leading path.

    The secondary issue was the tr(1) usage, since tr would only replace the
    first character of string1 with the first character of string2.  This has
    instead been replaced by a sed expression, which similarly understands \n to
    be a newline on all supported versions of FreeBSD.  Each path now gets
    prefixed with the appropriate context that should be there (i.e. numeric
    sequence followed by a tab).

    PR:             252446
    Reviewed by:    emaste, ngie
    Differential Revision:  https://reviews.freebsd.org/D27983

 usr.bin/du/tests/du_test.sh | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)