/* $FreeBSD: src/usr.bin/grep/grep.c,v 126.96.36.199 2011/10/20 16:08:11 gabor Exp $
According to the POSIX-2008 standard, "^" and "$" should be ordinary characters in BREs (basic regexes) when they're not in anchoring positions (in contrast to EREs, where they should always be anchors). Hence:
$ printf 'a^b$c' | grep -o 'a^b'
should match, and it does with GNU grep (on Linux) and with BusyBox grep (also on Linux, built against uClibc). But it doesn't with the described version of FreeBSD grep. Curiously, though:
$ printf 'a^b$c' | grep -o '[a]^b'
will match, and so will the pattern 'b$c'.
One can't portably rely on '\^' here to specify the literal '^', because POSIX-2008 says that '^' in non-anchoring positions is not special in BREs, and that the combination of '\' and a non-special character is undefined. Of course, neither can one use '[^]'.
How-To-Repeat: See above.
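The anchoring rule described above can be modelled outside of grep. The following is only a sketch in Python, not code from any grep implementation: the helper name bre_anchors_to_python is made up for this illustration, and it assumes the simplest reading of the rule, in which '^' is an anchor only as the first character of the whole pattern and '$' only as the last. It ignores the case where an implementation also anchors at the start or end of a \( \) subexpression, and it does not translate any other BRE syntax.

```python
import re

def bre_anchors_to_python(bre):
    """Escape '^' and '$' wherever this sketch treats them as ordinary BRE characters."""
    out = []
    i, n = 0, len(bre)
    while i < n:
        ch = bre[i]
        if ch == "\\" and i + 1 < n:
            out.append(bre[i:i + 2])      # copy escaped pairs unchanged
            i += 2
        elif ch == "[":
            j = i + 1                     # copy a bracket expression unchanged
            if j < n and bre[j] == "^":
                j += 1
            if j < n and bre[j] == "]":
                j += 1                    # a leading ']' is a literal member
            while j < n and bre[j] != "]":
                j += 1
            out.append(bre[i:j + 1])
            i = j + 1
        elif ch == "^" and i != 0:
            out.append(r"\^")             # not first in the pattern: ordinary character
            i += 1
        elif ch == "$" and i != n - 1:
            out.append(r"\$")             # not last in the pattern: ordinary character
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

subject = "a^b$c"
for bre in ("a^b", "[a]^b", "b$c"):
    print(bre, re.findall(bre_anchors_to_python(bre), subject))
```

Under these assumptions each of the three patterns from the report finds exactly one match in 'a^b$c', which is the behavior the report expects from grep -o.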
I've noticed some more issues with the same version of grep. I don't
know whether they're related, but I'll append them here for now.
$ printf abc | grep -o '^[a-c]'
should just print 'a', but instead gives three hits, one for each letter
of the incoming text. The same issue occurs when handling multiline input:
$ printf 'abc\ndef' | grep -o --null '^[a-f]'
incorrectly matches 6 times.
$ printf 'abc\ndef' | grep -o --null '[a-f]$'
correctly only matches 'c' and 'f'.
$ printf 'abc\ndef' | grep -o --null '\`[a-f]'
has the same issue as ^, whereas:
$ printf 'abc\ndef' | grep -o --null '[a-f]\'\'
matches 'c' and 'f'. To fix \` in a way that matches the behavior of \',
it should only match the 'a' and 'd'. In fact, though, both of these
should only match against a single character: 'a' for \` and 'f' for \'.
That's the specified behavior of these GNU extensions, and how they
behave in the GNU grep and BusyBox grep implementations I'm testing
against. If that behavior isn't going to be provided, wouldn't it be
better for these extensions not to pretend to be present at all, and
just match against a literal ` or '?
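The distinction drawn above between line anchors and whole-buffer anchors can be shown with Python's re module. This is only an analogy: it assumes that Python's \A and \Z are reasonable stand-ins for the \` and \' extensions, and that re.MULTILINE stands in for per-line matching. It models the results the report expects, not how any grep processes its input.

```python
import re

text = "abc\ndef"

# Line anchors: match at the start or end of every line.
line_start = re.findall(r"^[a-f]", text, re.MULTILINE)
line_end = re.findall(r"[a-f]$", text, re.MULTILINE)

# Buffer anchors: match only at the very start or very end of the input.
buffer_start = re.findall(r"\A[a-f]", text, re.MULTILINE)
buffer_end = re.findall(r"[a-f]\Z", text, re.MULTILINE)

print(line_start, line_end, buffer_start, buffer_end)
```

In this model the line anchors give 'a', 'd' and 'c', 'f', while the buffer anchors give only 'a' and only 'f', the single-character results the report describes for \` and \'.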
On Wed, Apr 11, 2012, at 03:21 PM, Jim Pryor wrote:
> I've noticed some more issues with the same version of grep. I don't
> know whether they're related, but I'll append them here for now.
> $ printf abc | grep -o '^[a-c]'
Some more observations that seem related:
$ printf 'abc def' | grep -o '^[a-z]'
will match against each of the letters in 'abc', but not against any of
the letters in 'def'.
On the other hand:
$ printf 'abc def' | grep -o '\b[a-z]'
$ printf 'abc def' | grep -o '\<[a-z]'
will each match against all six of the letters.
Matching against the patterns:
gives correct results.
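One possible explanation for the results above is sketched below. Nothing in this thread confirms it; it is a guess, not an analysis of grep.c. It assumes the -o loop rescans the unmatched tail of the line as if that tail were a fresh line, so that '^' and word-boundary operators lose the context of what came before. Both function names are made up for this illustration, and Python's \b stands in for \< because Python's re has no \< operator.

```python
import re

def only_matching_naive(pattern, line):
    """Hypothetical buggy -o loop: rescans the tail as if it were a new line."""
    rx = re.compile(pattern)
    hits, rest = [], line
    while rest:
        m = rx.search(rest)
        if m is None or not m.group():
            break
        hits.append(m.group())
        rest = rest[m.end():]   # context before the tail is thrown away here
    return hits

def only_matching_offset(pattern, line):
    """-o loop that keeps the whole line and only advances a start offset."""
    rx = re.compile(pattern)
    hits, pos = [], 0
    while pos < len(line):
        m = rx.search(line, pos)  # '^' still refers to the real start of the line
        if m is None or not m.group():
            break
        hits.append(m.group())
        pos = m.end()
    return hits

for pat, line in ((r"^[a-c]", "abc"), (r"^[a-z]", "abc def"), (r"\b[a-z]", "abc def")):
    print(pat, only_matching_naive(pat, line), only_matching_offset(pat, line))
```

The naive loop reproduces the reported pattern: three hits for '^[a-c]' on 'abc', hits on 'a', 'b', 'c' but not 'd', 'e', 'f' for '^[a-z]' on 'abc def', and all six letters for the word-boundary pattern. The offset-based loop gives 'a', 'a', and 'a', 'd' respectively. In POSIX regexec terms, the '^' part of this is roughly what the REG_NOTBOL flag is for.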
emaste@ - I think this one can just be closed. If I run all of these on an unsalted 11.0 machine, all of the examples in the above three posts yield the expected results rather than the observed results.
I do not have access to anything on 10.x or stable/10 to test it on and haven't built up the motivation to sort through commits and figure out why it seems to be working now, although I suppose it doesn't matter if we can see that it works on 10.x.
It looks like at least some of these issues are reproducible with the GNU grep in FreeBSD 10 - for example:
% grep --version
grep (GNU grep) 2.5.1-FreeBSD
Copyright 1988, 1992-1999, 2000, 2001 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
% printf abc | grep -o '^[a-c]'
I was not able to reproduce any of the failures with bsdgrep in FreeBSD 10. I have updated the title to refer to non-BSD grep.
This is good to know. =) This one may be closed when bsdgrep becomes /usr/bin/grep.
For bugs matching the following conditions:
- Status == In Progress
- Assignee == "bugs@FreeBSD.org"
- Last Modified Year <= 2017
- Set Status to "Open"