Bug 184733

Summary: bsdgrep(1) doesn't match a regular expression containing "|" against UTF-16 file [regression]
Product: Base System Reporter: toomas.aas
Component: bin Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed Overcome By Events    
Severity: Affects Only Me CC: emaste, kevans
Priority: Normal    
Version: 9.2-STABLE   
Hardware: Any   
OS: Any   
Bug Depends on:    
Bug Blocks: 230332    

Description toomas.aas 2013-12-12 20:00:00 UTC
$ egrep -V
egrep (BSD grep) 2.5.1-FreeBSD

$ echo abc > testfile
$ iconv -f ASCII -t UTF-16LE testfile > utftestfile

$ egrep -c "a.b" /tmp/utftestfile
$ egrep -c "a.b|d" /tmp/utftestfile

The expected result is that the second "egrep" command should also
return 1. This works as expected when using GNU grep 2.15 installed
from ports. Also this works as expected with "bsdgrep -E" on FreeBSD
9.1 i386 system.
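
For reference, a minimal sketch of what the test file contains at the byte level. It is not part of the original report: the helper name utf16le_abc is hypothetical, and the bytes are written with printf instead of iconv so the example does not depend on iconv being installed. In UTF-16LE each ASCII character is followed by a NUL byte, so a byte-oriented regex engine can presumably satisfy "a.b" with 0x61, 0x00, 0x62.

```shell
# Hypothetical helper: emit "abc\n" encoded as UTF-16LE (no BOM),
# i.e. what `iconv -f ASCII -t UTF-16LE` produces for this input.
utf16le_abc() {
    printf 'a\000b\000c\000\n\000'
}

# Dump the bytes as hex and strip od's spacing so the result is one string.
bytes=$(utf16le_abc | od -An -tx1 | tr -d ' \n')
echo "$bytes"
```

The dump shows "a" (61) and "b" (62) separated by a single 00 byte, which is the byte the "." in the pattern would have to consume.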

How-To-Repeat: See "Full Description"
Comment 1 Kyle Evans freebsd_committer 2017-01-21 03:45:23 UTC
A couple of notes here, as of right now:

`egrep -c` and `bsdgrep -Ec` seem to be behaving consistently on this one now. Also, I've gotten as far as isolating it to a problem somewhere in the GNU compatibility bits. Enabling WITHOUT_GNU_GREP_COMPAT in /etc/src.conf and rebuilding bsdgrep makes it Just Work (TM).

At this point, I'm not sure how to proceed. I did verify that we're setting the cflags right (in accordance with /usr/include/gnu/regex.h); other than that, nothing sticks out as blatantly wrong.
Comment 2 Kyle Evans freebsd_committer 2017-01-21 04:00:23 UTC
(In reply to Kyle Evans from comment #1)

Also worth noting: this equivalent test on a relatively recent Debian machine:

> grep (GNU grep) 2.27

$ echo abc > testfile
$ iconv -f ASCII -t UTF-16LE testfile > utftestfile
$ egrep -c "a.b" /tmp/utftestfile
$ egrep -c "a.b|d" /tmp/utftestfile
Comment 3 Eitan Adler freebsd_committer freebsd_triage 2018-05-20 23:51:57 UTC
For bugs matching the following conditions:
- Status == In Progress
- Assignee == "bugs@FreeBSD.org"
- Last Modified Year <= 2017

- Set Status to "Open"
Comment 4 Kyle Evans freebsd_committer 2018-08-03 16:15:33 UTC
Adding this to tracking PR; will mark fixed/overcome by events once bsdgrep loses the bits that allow it to be linked against gnuregex.
Comment 5 Kyle Evans freebsd_committer 2020-12-05 03:41:49 UTC
This is mostly OBE as bsdgrep will now use libregex by default rather than libgnuregex. 11.4 still links against it, but I would tend to recommend not using bsdgrep on 11.x.