FreeBSD Bugzilla – Attachment 222170 Details for Bug 253209: grep -v -f some-empty-file -- does the wrong thing
[patch] git(1) diff against base
0001-grep-fix-null-pattern-and-empty-pattern-file-behavio.patch (text/plain), 4.55 KB, created by Kyle Evans on 2021-02-04 21:39:07 UTC
From eac1c5305bd1600a13fffb5057401149c18140d1 Mon Sep 17 00:00:00 2001
From: Kyle Evans <kevans@FreeBSD.org>
Date: Thu, 4 Feb 2021 15:26:45 -0600
Subject: [PATCH 1/2] grep: fix null pattern and empty pattern file behavior

The null pattern semantics were terrible because I tried to match gnugrep,
but I got it wrong. Let's unwind that:

- The null pattern should match every line if neither -w nor -x.
- The null pattern should match empty lines if -x.
- The null pattern should not match any lines if -w.

The first two will stop processing (shortcut) even if additional patterns
are specified. In any other case, we will continue processing other
patterns. If no other patterns are specified beside a null pattern, then
we match if neither -w nor -x are set and do not match if either of those
are specified.

The justification for -w is that it should match on a whole word, but the
null pattern does not have a whole word to match on.

Empty pattern files should never match anything, and more importantly, -v
should cause everything to be written.

Signed-off-by: Kyle Evans <kevans@FreeBSD.org>
---
 contrib/netbsd-tests/usr.bin/grep/t_grep.sh | 10 ++----
 usr.bin/grep/grep.c                         | 11 -------
 usr.bin/grep/util.c                         | 35 ++++++++++-----------
 3 files changed, 18 insertions(+), 38 deletions(-)

diff --git a/contrib/netbsd-tests/usr.bin/grep/t_grep.sh b/contrib/netbsd-tests/usr.bin/grep/t_grep.sh
index e094b15c6d6..065a802d13d 100755
--- a/contrib/netbsd-tests/usr.bin/grep/t_grep.sh
+++ b/contrib/netbsd-tests/usr.bin/grep/t_grep.sh
@@ -483,17 +483,11 @@ wflag_emptypat_head()
 wflag_emptypat_body()
 {
 	printf "" > test1
-	printf "\n" > test2
-	printf "qaz" > test3
-	printf " qaz\n" > test4
+	printf "qaz" > test2
 
 	atf_check -s exit:1 -o empty grep -w -e "" test1
 
-	atf_check -o file:test2 grep -w -e "" test2
-
-	atf_check -s exit:1 -o empty grep -w -e "" test3
-
-	atf_check -o file:test4 grep -w -e "" test4
+	atf_check -s exit:1 -o empty grep -w -e "" test2
 }
 
 atf_test_case xflag_emptypat
diff --git a/usr.bin/grep/grep.c b/usr.bin/grep/grep.c
index 307a91353b6..33541e4fe73 100644
--- a/usr.bin/grep/grep.c
+++ b/usr.bin/grep/grep.c
@@ -69,13 +69,6 @@ const char *errstr[] = {
 int cflags = REG_NOSUB | REG_NEWLINE;
 int eflags = REG_STARTEND;
 
-/* XXX TODO: Get rid of this flag.
- * matchall is a gross hack that means that an empty pattern was passed to us.
- * It is a necessary evil at the moment because our regex(3) implementation
- * does not allow for empty patterns, as supported by POSIX's definition of
- * grammar for BREs/EREs. When libregex becomes available, it would be wise
- * to remove this and let regex(3) handle the dirty details of empty patterns.
- */
 bool matchall;
 
 /* Searching patterns */
@@ -637,10 +630,6 @@ main(int argc, char *argv[])
 	aargc -= optind;
 	aargv += optind;
 
-	/* Empty pattern file matches nothing */
-	if (!needpattern && (patterns == 0) && !matchall)
-		exit(1);
-
 	/* Fail if we don't have any pattern */
 	if (aargc == 0 && needpattern)
 		usage();
diff --git a/usr.bin/grep/util.c b/usr.bin/grep/util.c
index e517e4eaee6..f22b7abd79e 100644
--- a/usr.bin/grep/util.c
+++ b/usr.bin/grep/util.c
@@ -471,31 +471,28 @@ procline(struct parsec *pc)
 
 	matchidx = pc->matchidx;
 
-	/*
-	 * With matchall (empty pattern), we can try to take some shortcuts.
-	 * Emtpy patterns trivially match every line except in the -w and -x
-	 * cases. For -w (whole-word) cases, we only match if the first
-	 * character isn't a word-character. For -x (whole-line) cases, we only
-	 * match if the line is empty.
-	 */
+	/* Null pattern shortcuts. */
 	if (matchall) {
-		if (pc->ln.len == 0)
+		if (xflag && pc->ln.len == 0) {
+			/* Matches empty lines (-x). */
 			return (true);
-		if (wflag) {
-			wend = L' ';
-			if (sscanf(&pc->ln.dat[0], "%lc", &wend) == 1 &&
-			    !iswword(wend))
-				return (true);
-		} else if (!xflag)
+		} else if (!wflag && !xflag) {
+			/* Matches every line (no -w or -x). */
 			return (true);
+		}
 
 		/*
-		 * If we don't have any other patterns, we really don't match.
-		 * If we do have other patterns, we must fall through and check
-		 * them.
+		 * If we only have the NULL pattern, whether we match or not
+		 * depends on if we got here with -w or -x. If either is set,
+		 * the answer is no. If we have other patterns, we'll defer
+		 * to them.
 		 */
-		if (patterns == 0)
-			return (false);
+		if (patterns == 0) {
+			return (!(wflag || xflag));
+		}
+	} else if (patterns == 0) {
+		/* Pattern file with no patterns. */
+		return (false);
 	}
 
 	matched = false;
-- 
2.30.0
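
For review purposes, the decision the patched procline() makes before it reaches the regular pattern loop can be restated as a small pure function. This is only a sketch of the logic in the util.c hunk above, not code from the patch: struct null_pat_state, enum verdict, and null_pattern_verdict() are hypothetical names, and line_len is assumed to be the line length without its trailing newline. In grep itself these inputs are the globals matchall, patterns, wflag, and xflag plus pc->ln.len.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state procline() consults. */
struct null_pat_state {
	bool matchall;		/* a null ("") pattern was supplied */
	unsigned int patterns;	/* number of other, non-null patterns */
	bool wflag;		/* -w: whole-word matching */
	bool xflag;		/* -x: whole-line matching */
	size_t line_len;	/* assumed: length without the newline */
};

/* DEFER means "fall through and try the remaining patterns". */
enum verdict { NO_MATCH, MATCH, DEFER };

enum verdict
null_pattern_verdict(const struct null_pat_state *s)
{
	if (s->matchall) {
		if (s->xflag && s->line_len == 0) {
			/* Null pattern matches empty lines under -x. */
			return (MATCH);
		} else if (!s->wflag && !s->xflag) {
			/* Null pattern matches every line without -w/-x. */
			return (MATCH);
		}
		if (s->patterns == 0) {
			/* Only the null pattern: -w or -x means no match. */
			return ((s->wflag || s->xflag) ? NO_MATCH : MATCH);
		}
	} else if (s->patterns == 0) {
		/* Pattern file with no patterns never matches. */
		return (NO_MATCH);
	}
	return (DEFER);
}
```

Read this way, the empty pattern file case yields NO_MATCH for every line, so under -v every line is selected, which is the behavior the commit message asks for.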