|Summary:||bsdgrep(1) doesn't match a regular expression containing "|" against UTF-16 file [regression]|
|Component:||bin||Assignee:||freebsd-bugs (Nobody) <bugs>|
|Status:||Closed Overcome By Events|
|Severity:||Affects Only Me||CC:||emaste, kevans|
Description toomas.aas 2013-12-12 20:00:00 UTC
$ egrep -V
egrep (BSD grep) 2.5.1-FreeBSD
$ echo abc > testfile
$ iconv -f ASCII -t UTF-16LE testfile > utftestfile
$ egrep -c "a.b" /tmp/utftestfile
1
$ egrep -c "a.b|d" /tmp/utftestfile
0

The expected result is that the second "egrep" command should also return 1. This works as expected when using GNU grep 2.15 installed from ports. It also works as expected with "bsdgrep -E" on a FreeBSD 9.1 i386 system.

How-To-Repeat: See "Full Description"
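For readers wondering why "a.b" is expected to match a UTF-16LE file at all: each ASCII character is followed by a NUL byte, so "." consumes the NUL between "a" and "b". The sketch below illustrates that with Python's `re` module on raw bytes. It is only a stand-in for a byte-oriented matcher, not bsdgrep's or gnuregex's actual code path, and `count_matching_lines` is a hypothetical helper that roughly mimics `grep -c`.

```python
import re

# Same bytes the report produces with: echo abc | iconv -f ASCII -t UTF-16LE
data = "abc\n".encode("utf-16-le")


def count_matching_lines(pattern: bytes, blob: bytes) -> int:
    """Count newline-delimited byte lines in which pattern matches.

    Hypothetical helper, roughly what `grep -c` reports when the input
    is treated as raw bytes split on the 0x0a byte.
    """
    regex = re.compile(pattern)
    lines = blob.split(b"\n")
    # An empty piece after a final newline byte is not a line of its own.
    if lines and lines[-1] == b"":
        lines.pop()
    return sum(1 for line in lines if regex.search(line))


# Show the raw bytes: every ASCII character is followed by a NUL byte.
print(data)

# "." matches the NUL byte between "a" and "b", so both patterns
# should select the first line.
print(count_matching_lines(rb"a.b", data))
print(count_matching_lines(rb"a.b|d", data))
```

Under these assumptions both patterns select the same single line, which is the behaviour the report expects from the second egrep command. Adding the "|d" alternative cannot reduce the set of matching lines.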
Comment 1 Kyle Evans 2017-01-21 03:45:23 UTC
A couple of notes here, as of right now: `egrep -c` and `bsdgrep -Ec` now behave consistently on this one. Also, I've gotten as far as isolating it to a problem somewhere in the GNU compatibility bits. Enabling WITHOUT_GNU_GREP_COMPAT in /etc/src.conf and rebuilding bsdgrep makes it Just Work (TM). At this point, I'm not sure how to proceed. I did verify that we're setting the cflags right (in accordance with /usr/include/gnu/regex.h); other than that, nothing else sticks out as blatantly wrong.
Comment 2 Kyle Evans 2017-01-21 04:00:23 UTC
(In reply to Kyle Evans from comment #1)
Also worth noting: this equivalent test on a relatively recent Debian machine:

> grep (GNU grep) 2.27
$ echo abc > testfile
$ iconv -f ASCII -t UTF-16LE testfile > utftestfile
$ egrep -c "a.b" /tmp/utftestfile
0
$ egrep -c "a.b|d" /tmp/utftestfile
0
Comment 3 Eitan Adler 2018-05-20 23:51:57 UTC
For bugs matching the following conditions:
- Status == In Progress
- Assignee == "bugs@FreeBSD.org"
- Last Modified Year <= 2017

Do
- Set Status to "Open"
Comment 4 Kyle Evans 2018-08-03 16:15:33 UTC
Adding this to tracking PR; will mark fixed/overcome by events once bsdgrep loses the bits that allow it to be linked against gnuregex.
Comment 5 Kyle Evans 2020-12-05 03:41:49 UTC
This is mostly OBE as bsdgrep will now use libregex by default rather than libgnuregex. 11.4 still links against it, but I would tend to recommend not using bsdgrep on 11.x.