Bug 243881 - libc++: regex uses truncated pattern if normal character is escaped
Status: Closed Not Accepted
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 12.1-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-toolchain (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-02-04 18:53 UTC by Benjamin Lutz
Modified: 2020-02-04 22:11 UTC (History)

See Also:


Attachments

Description Benjamin Lutz 2020-02-04 18:53:12 UTC
If a pattern escapes a normal character (e.g. "a\bc"), the libc++ that ships with FreeBSD appears to use only the part of the pattern up to that character.

Test program:

===== regextest.cpp BEGIN =====
#include <iostream>
#include <regex>
#include <string>
#include <vector>

using namespace std;

int main() {
	vector<string> patterns = {
	    R"(abc)",
	    R"(a\bc)",
	    R"(a\bx)",
	    R"(a\xc)",
	    R"(x\bc)",
	};
	
	for (const string &pattern : patterns) {
		cout << pattern << ": ";
		try {
			regex r(pattern, regex::extended);
			bool match = regex_search("abc", r);
			cout << (match ? "match" : "no match") << endl;
		} catch (const std::regex_error &e) {
			cout << "regex error: " << e.what() << endl;
		}
	}
	
	return 0;
}
===== regextest.cpp END =====

Expected output:
abc: match
a\bc: match
a\bx: no match
a\xc: no match
x\bc: no match

Incorrect output on FreeBSD 12.1 with system c++ compiler (clang 8.0.1):
abc: match
a\bc: match
a\bx: match
a\xc: match
x\bc: no match

On FreeBSD, gcc9 produces the expected output, and so does clang-8 on Ubuntu, which makes me think this is specific to the FreeBSD system compiler.
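
One way to check the truncation theory is to print what was actually matched, not just whether a match occurred. The following is a minimal sketch of such a diagnostic. The helper name describe_match is hypothetical and not part of the test program above. If the pattern really is cut off at the backslash, a pattern such as a\bx would presumably report a match on just "a", but that is an assumption to be verified on an affected system.

```cpp
#include <regex>
#include <string>

// Hypothetical diagnostic helper (not part of the original test program):
// reports which part of `subject` the compiled pattern actually matched.
std::string describe_match(const std::string &pattern, const std::string &subject) {
	try {
		// Same grammar flag as the test program above.
		std::regex r(pattern, std::regex::extended);
		std::smatch m;
		if (!std::regex_search(subject, m, r))
			return "no match";
		// m.str(0) is the full matched text, m.position(0) its offset in subject.
		return "matched \"" + m.str(0) + "\" at offset " + std::to_string(m.position(0));
	} catch (const std::regex_error &e) {
		return std::string("regex error: ") + e.what();
	}
}
```

Calling describe_match(pattern, "abc") for each entry in the patterns vector, in place of the plain match/no match output, shows how much of "abc" each pattern consumed.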
Comment 1 Dimitry Andric freebsd_committer freebsd_triage 2020-02-04 22:11:40 UTC
Could you please submit this upstream instead?  This has to do with libc++, not specifically with FreeBSD, as it gives precisely the same output on macOS.
I think you see different output on Ubuntu with clang because clang uses libstdc++ there rather than libc++.
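
To confirm which standard library a given build actually uses, the implementation-specific version macros can be checked. This is a minimal sketch, assuming the usual macros: libc++ defines _LIBCPP_VERSION and libstdc++ defines __GLIBCXX__ once any standard header has been included. The helper name stdlib_name is hypothetical.

```cpp
#include <string>

// Hypothetical helper: names the C++ standard library this translation
// unit was compiled against, based on implementation-defined macros.
std::string stdlib_name() {
#if defined(_LIBCPP_VERSION)
	return "libc++";      // LLVM libc++ (FreeBSD base system, macOS)
#elif defined(__GLIBCXX__)
	return "libstdc++";   // GNU libstdc++ (default for gcc, and for clang on most Linux distributions)
#else
	return "unknown";     // some other implementation
#endif
}
```

Printing stdlib_name() at the start of the test program makes it clear whether a differing result comes from the compiler or from the library it links against.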