I get an error when I try to use an empty regex for the field separator:

$ echo hello | awk -F '' '{print $2}'
awk: field separator FS is empty

but awk has no issues splitting things on an empty regex:

$ awk 'BEGIN{s="hello"; split(s, a, ""); print a[1]}'
h

Over on gawk, I get the expected behavior:

$ echo hello | awk -F '' '{print $1}'
h

This is somewhat similar to #226112
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226112

I get that awk uses EREs and `man re_format` says that "A (modern [Extended]) RE is one or more non-empty branches, separated by '|'", but

1) that's not what split() does

2) it's not what gawk's -F parameter does

3) permitting an empty regex for splitting already seems supported in awk code (as the split example shows) and shouldn't break any existing usage

4) as a non-workaround, `man re_format` says that the atom "()" matches the null string, but

$ echo hello | awk -F '()' '{print $1}'

doesn't split the row on the null regular expression (FWIW, gawk gives the same results when using "()" as the split pattern).

In an ideal world, the behavior would match the behavior of gawk & the split() function, splitting the record into each individual character.
The standard states that FS='' is undefined behavior. It also states that -F sepstring and -v FS=sepstring are identical:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html

However, one true awk treats the two forms differently.
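Since the standard leaves an empty FS undefined, a script that has to run on several awk implementations may prefer not to depend on it at all. A minimal sketch of that approach, walking the record with substr() instead of splitting on an empty separator (split_chars is a hypothetical helper name chosen for illustration, not something awk provides):

```shell
#!/bin/sh
# Hypothetical helper: print the characters of "$1" separated by spaces,
# without setting FS to the empty string.
split_chars() {
    printf '%s\n' "$1" | awk '{
        n = length($0)
        for (i = 1; i <= n; i++)
            # Emit a space between characters and a newline after the last one.
            printf "%s%s", substr($0, i, 1), (i < n ? " " : "\n")
    }'
}

split_chars hello    # prints: h e l l o
```

This only uses length() and substr(), so it does not care whether -F '' or FS="" is accepted by the awk in use.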
I've filed https://github.com/onetrueawk/awk/issues/127 upstream. The current behavior seems inconsistent, especially since FS="" has well-documented behavior in awk(1) from upstream.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=a2e3e1187309f9404940b61ca49a93bd0536559d

commit a2e3e1187309f9404940b61ca49a93bd0536559d
Author:     Warner Losh <imp@FreeBSD.org>
AuthorDate: 2021-07-20 04:47:30 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2021-07-24 15:08:16 +0000

    awk: Make -F '' and -v FS="" behave the same

    IEEE Std 1003.1-2008 mandates that -F str be treated the same as
    -v FS=str. For a null string, this was not the case. Since awk(1)
    documents that a null string for FS has a specific behavior, make
    -F '' behave consistently with -v FS="".

    PR:                     241441
    Upstream issue:         https://github.com/onetrueawk/awk/issues/127
    Upstream pull request:  https://github.com/onetrueawk/awk/pull/128
    MFC After:              2 weeks
    Sponsored by:           Netflix

 contrib/one-true-awk/main.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
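One way to check whether a given awk treats the two forms consistently is to run both on the same input and compare the results. The following is only a sketch: fs_forms_agree is a hypothetical helper name, and it captures stderr as well so that an error such as "field separator FS is empty" shows up as a difference rather than being lost.

```shell
#!/bin/sh
# Hypothetical helper: report whether -F '' and -v FS="" give the same
# first field for the input "$1" on whatever awk is first in PATH.
fs_forms_agree() {
    via_flag=$(printf '%s\n' "$1" | awk -F '' '{ print $1 }' 2>&1)
    via_var=$(printf '%s\n' "$1" | awk -v FS="" '{ print $1 }' 2>&1)
    if [ "$via_flag" = "$via_var" ]; then
        echo "same: $via_flag"
    else
        echo "differ: -F gave '$via_flag', -v FS gave '$via_var'"
    fi
}

fs_forms_agree hello
```

On an awk without this change, the -F '' form should produce the error from the original report while the -v form does not, so the helper would print a "differ:" line. What an awk that agrees prints after "same:" depends on how that implementation handles an empty FS.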
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f4ed53c6f5254edcc28c34cbe67d698bd93cb05e

commit f4ed53c6f5254edcc28c34cbe67d698bd93cb05e
Author:     Warner Losh <imp@FreeBSD.org>
AuthorDate: 2021-07-20 04:47:30 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2021-07-30 23:02:13 +0000

    awk: Make -F '' and -v FS="" behave the same

    IEEE Std 1003.1-2008 mandates that -F str be treated the same as
    -v FS=str. For a null string, this was not the case. Since awk(1)
    documents that a null string for FS has a specific behavior, make
    -F '' behave consistently with -v FS="".

    PR:                     241441
    Upstream issue:         https://github.com/onetrueawk/awk/issues/127
    Upstream pull request:  https://github.com/onetrueawk/awk/pull/128
    MFC After:              2 weeks
    Sponsored by:           Netflix

    (cherry picked from commit a2e3e1187309f9404940b61ca49a93bd0536559d)

 contrib/one-true-awk/main.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ab1dedd4946098fe7202e825d299a2cbec81dae0

commit ab1dedd4946098fe7202e825d299a2cbec81dae0
Author:     Warner Losh <imp@FreeBSD.org>
AuthorDate: 2021-07-20 04:47:30 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2021-07-31 00:02:51 +0000

    awk: Make -F '' and -v FS="" behave the same

    IEEE Std 1003.1-2008 mandates that -F str be treated the same as
    -v FS=str. For a null string, this was not the case. Since awk(1)
    documents that a null string for FS has a specific behavior, make
    -F '' behave consistently with -v FS="".

    PR:                     241441
    Upstream issue:         https://github.com/onetrueawk/awk/issues/127
    Upstream pull request:  https://github.com/onetrueawk/awk/pull/128
    MFC After:              2 weeks
    Sponsored by:           Netflix

    (cherry picked from commit a2e3e1187309f9404940b61ca49a93bd0536559d)

 contrib/one-true-awk/main.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)