To reproduce:

printf '%s\n' a b c d e f | sed '
/a/,/b/c\
x
$!N
'

Expected output:

x
c
d
e
f

Actual output:

(no output)
I see no way to justify your "expected" output from the specification. (I also can't justify the actual output, but it deviates less from the spec than your "expected" output.) In particular, the line "b" is read using N and deleted without ever being seen by the /a/,/b/c command, and therefore the replacement "x" should never be emitted. My reading of the spec is that the "b", "d", "f" lines should be output, but I see no reading of the spec that allows the output of "c" and "e".
(In reply to Andrew "RhodiumToad" Gierth from comment #1) The expected output matches GNU sed. I'm not sure why you mention deletion, I'm not using any delete command. The first command changes the text in the range from a to b, and replaces it with x. The second command appends the next line, and since we're not doing anything fancy with \n in the pattern space, it should be a no-op. Therefore, the output should be identical to that of the same script without $!N.
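For reference, a minimal sketch of the baseline this comment compares against: the same range change with the trailing $!N removed. The function name change_only is made up for illustration and does not come from the report.

```sh
# Same range change as the reproducer, but without the second command ($!N).
# "change_only" is a hypothetical helper name used only for this sketch.
change_only() {
    printf '%s\n' a b c d e f | sed '/a/,/b/c\
x'
}

change_only
```

This form should print x, c, d, e, f (one per line): "a" is deleted silently, "b" ends the range and emits the replacement text, and the remaining lines pass through unchanged.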
(In reply to Mohamed Akram from comment #2) All of the pattern space (embedded newlines and all) is deleted at the end of each cycle (after being output if appropriate). The fact that GNU sed violates the spec is not our concern. Your description of the "c" command is not what the spec says. The spec says that with 2 addresses, "c" deletes the pattern space if the line is in the addressed range, and emits the replacement text if and only if the last line of the range is addressed. Since the "b" line is consumed by N and is never in the pattern space at the start of the cycle, the /a/,/b/ range never sees it, so the range extends to the last line of the file; but since you also read the last line of the file by doing N on the second-last line, the last line is also never processed by the "c" command so the replacement is never output. (You seem to be assuming that the use of N does not affect addresses; that's not what the spec says.)
(In reply to Andrew "RhodiumToad" Gierth from comment #3)

Per the spec:

> The sed utility shall then apply in sequence all commands whose addresses select that pattern space, until a command starts the next cycle or quits.

For the c command:

> Delete the pattern space. With a 0 or 1 address or at the end of a 2-address range, place text on the output and start the next cycle.

So, once the c command is executed, the next cycle is started and N is not executed.
(In reply to Mohamed Akram from comment #4)

This is the sequence of events according to the spec, as I read it:

1. Read line "a" into the pattern space.
2. Execute the first command:
   /a/ matches, so we begin an addressed range
   "c" deletes the pattern space (but does not emit anything and does not start the next cycle)
3. Execute the second command:
   $ does not match
   ! inverts the match
   N is executed, which appends "\nb" to the (deleted) pattern space
4. By my reading of the spec, the "\nb" should be output at this point. For whatever reason, BSD sed does not do that.
5. The pattern space is deleted (as this is the end of the cycle).
6. Read line "c" into the pattern space.
7. Execute the first command:
   /b/ does not match, so we are still in an addressed range
   "c" deletes the pattern space (but does not emit anything and does not start the next cycle)
8. Execute the second command:
   $ does not match
   ! inverts the match
   N is executed, which appends "\nd" to the (deleted) pattern space
9. As 4.
10. The pattern space is deleted (as this is the end of the cycle).
11. Read line "e" into the pattern space.
12. Execute the first command:
    /b/ does not match, so we are still in an addressed range
    "c" deletes the pattern space (but does not emit anything and does not start the next cycle)
13. Execute the second command:
    $ does not match
    ! inverts the match
    N is executed, which appends "\nf" to the (deleted) pattern space
14. As 4.
15. The pattern space is deleted (as this is the end of the cycle).
16. There are no more lines so the process ends.

Note that neither the last line of input, nor any line containing /b/, was ever processed by the "c" command, so it never has a chance to emit the replacement text.

You seem to be hung up on /a/,/b/ representing some block of input lines. This is NOT WHAT IT MEANS; it means "start a range when you see a _pattern space_ matching /a/, and end it when you see a pattern space matching /b/".
By using N to process some input lines, you prevent them from being seen in the pattern space at the start of the script, which affects how the first command determines its range. Alternatively, you (or GNU sed) may be assuming that "c" starts a new cycle (rather than executing the rest of the script) for every row of a 2-address range, not just the last one. This isn't what the spec actually says (as you quoted yourself), though it might be considered to be more useful or consistent. (I looked for applicable defect reports against the spec, didn't find any.)
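The two readings discussed in this comment can be compared with a toy model of the sed cycle, written in plain sh. This is only a sketch under stated assumptions: it hard-codes the two commands of the reproducer, matches /a/ and /b/ with shell patterns, and the names walk_cycles, literal and new-cycle are invented for illustration. It is not the code of any real sed.

```sh
# Toy model of the script   /a/,/b/c\ x   followed by   $!N
#   literal   : "c" inside the range deletes the pattern space and the
#               rest of the script still runs (the step-by-step reading)
#   new-cycle : "c" starts the next cycle for every line of the range
# All names here are hypothetical; this is not real sed code.
walk_cycles() {
    mode=$1
    in_range=0
    while IFS= read -r ps; do
        selected=0
        last=0
        if [ "$in_range" -eq 1 ]; then
            selected=1
            case $ps in *b*) last=1; in_range=0 ;; esac
        else
            case $ps in *a*) selected=1; in_range=1 ;; esac
        fi
        if [ "$selected" -eq 1 ]; then
            ps=                          # "c" deletes the pattern space
            if [ "$last" -eq 1 ]; then
                echo x                   # end of range: emit the text
                continue                 # and start the next cycle
            fi
            if [ "$mode" = new-cycle ]; then
                continue                 # skip the rest of the script
            fi
        fi
        # $!N: append the next input line, if there is one
        if IFS= read -r next; then
            ps="$ps
$next"
        fi
        printf '%s\n' "$ps"              # end of cycle: print pattern space
    done
}

printf '%s\n' a b c d e f | walk_cycles literal
printf '%s\n' a b c d e f | walk_cycles new-cycle
```

In this model the literal mode prints an empty line before each of b, d and f (the "\nb" of step 4) and never prints x, while the new-cycle mode prints x, c, d, e, f.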
(In reply to Andrew "RhodiumToad" Gierth from comment #5)

Thank you for this.

> Alternatively, you (or GNU sed) may be assuming that "c" starts a new cycle (rather
> than executing the rest of the script) for every row of a 2-address range, not just
> the last one. This isn't what the spec actually says (as you quoted yourself),
> though it might be considered to be more useful or consistent. (I looked for
> applicable defect reports against the spec, didn't find any.)

I now wonder if other implementations output bdf as expected. Might be worth opening a ticket against the spec to clarify this ambiguity.
(In reply to Mohamed Akram from comment #6) For what it's worth, the failure to output the "b","d","f" lines is because our sed has a "pattern deleted" flag which is set by "c" (and not reset by "N"), which suppresses the output of the pattern space at the end of the cycle. (I haven't looked at GNU sed's logic.)
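A rough sketch of the flag behaviour described in this comment, again as a toy sh model rather than the actual process.c logic. The names flag_model and deleted are assumptions made for illustration.

```sh
# Toy model: "c" sets a "deleted" flag, N appends without clearing it,
# and the end-of-cycle print is skipped while the flag is set.
# Hypothetical names; only the two commands of the reproducer are modelled.
flag_model() {
    in_range=0
    while IFS= read -r ps; do
        deleted=0
        if [ "$in_range" -eq 1 ]; then
            # End of range: emit the text and start the next cycle.
            case $ps in *b*) in_range=0; echo x; continue ;; esac
            ps=; deleted=1               # "c" inside the range
        else
            case $ps in *a*) in_range=1; ps=; deleted=1 ;; esac
        fi
        # $!N: append the next line; the flag is left untouched
        if IFS= read -r next; then
            ps="$ps
$next"
        fi
        # End of cycle: output is suppressed when the flag is set
        [ "$deleted" -eq 1 ] || printf '%s\n' "$ps"
    done
}

printf '%s\n' a b c d e f | flag_model
```

With the six-line input from the report, this model prints nothing: every cycle sets the flag before N pulls in the following line.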
Tried this with:

NetBSD 9.3: same output as GNU
OpenBSD 7.3: same output as FreeBSD
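To repeat this comparison on another system, something like the following sketch could be used. It runs the reproducer against whatever sed is first in PATH and labels the result; the function name classify_sed and the labels are invented here, and the mapping of labels to implementations is only my reading of this thread.

```sh
# Run the reproducer from this PR and classify the local sed by its output.
# "classify_sed" and the labels are hypothetical names for this sketch.
classify_sed() {
    out=$(printf '%s\n' a b c d e f | sed '/a/,/b/c\
x
$!N' | tr '\n' ' ')
    case $out in
        'x c d e f ') echo new-cycle ;;    # GNU and NetBSD 9.3, per this thread
        '')           echo suppressed ;;   # no output at all
        *)            echo "other: $out" ;;
    esac
}

classify_sed
```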
(In reply to Mohamed Akram from comment #8)

NetBSD reference: https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=45981

That claims that the v7 manual said:

Delete the pattern space. With 0 or 1 address or at the end of a 2-address range, place text on the output. Start the next cycle.

... which disagrees with what the spec now says. (As far as I can tell from a quick look, given that it's not the most readable code in the world, the v7 sed does in fact behave as documented in this case.)

So, I think this can definitely be argued to be a defect in the spec; you should certainly take it up with them if you care about it.
should we assign this to standards@?
(In reply to Mina Galić from comment #10) Can't think of anything better to do with it.
(In reply to Andrew "RhodiumToad" Gierth from comment #9) Thank you very much for this. I've opened an issue with the Austin Group: https://austingroupbugs.net/view.php?id=1767
Created attachment 244157
sed: fix 'c' command

Patch to apply Austin Group resolution to code and manpage.
(In reply to Andrew "RhodiumToad" Gierth from comment #13) Thanks for the patch. Could someone merge it? The new version of the standard is out with the adjusted wording.
This is now fixed in OpenBSD sed as well. https://github.com/openbsd/src/commit/8a7444b3f20f89375418387d55d72ff94189313f
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=a2d78713171cf138b5ae50d61f82df1af7574c95

commit a2d78713171cf138b5ae50d61f82df1af7574c95
Author:     Valeriy Ushakov <uwe@netbsd.org>
AuthorDate: 2024-12-17 22:27:01 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2024-12-17 22:34:06 +0000

    sed: The change ("c") command should start a new cycle.

    The "c" command should start the next cycle as clarified in POSIX
    2024. This is also consistent with historical and gnu sed behavior.

    This patch is from OpenBSD by way of NetBSD with a tweak to the man page
    date by me. Confirmed the test case in the bug now works.

    PR: 271817
    Obtained from: NetBSD (1.39 uwe), OpenBSD (1.39 millert)
    Sponsored by: Netflix

 usr.bin/sed/process.c | 2 +-
 usr.bin/sed/sed.1     | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)
Forgot the MFC tag, so MFC after 2 weeks.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=003818aca4cdda47adef808a56d48003aa514029

commit 003818aca4cdda47adef808a56d48003aa514029
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2024-12-23 19:06:11 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2024-12-23 19:08:15 +0000

    sed tests: Add a regression test for the c function

    Based on the test case from PR 271817 by Mohamed Akram.

    PR: 271817
    MFC after: 2 weeks

 usr.bin/sed/tests/sed2_test.sh | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6d34cce6068401e5736c05b3c130c0583af1f2e9

commit 6d34cce6068401e5736c05b3c130c0583af1f2e9
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2024-12-23 19:06:11 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-01-17 18:48:15 +0000

    sed tests: Add a regression test for the c function

    Based on the test case from PR 271817 by Mohamed Akram.

    PR: 271817
    MFC after: 2 weeks

    (cherry picked from commit 003818aca4cdda47adef808a56d48003aa514029)

 usr.bin/sed/tests/sed2_test.sh | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=3e22772769757b31d2b9383b5f510d4e43afaa8b

commit 3e22772769757b31d2b9383b5f510d4e43afaa8b
Author:     Valeriy Ushakov <uwe@netbsd.org>
AuthorDate: 2024-12-17 22:27:01 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-01-17 18:48:07 +0000

    sed: The change ("c") command should start a new cycle.

    The "c" command should start the next cycle as clarified in POSIX
    2024. This is also consistent with historical and gnu sed behavior.

    This patch is from OpenBSD by way of NetBSD with a tweak to the man page
    date by me. Confirmed the test case in the bug now works.

    PR: 271817
    Obtained from: NetBSD (1.39 uwe), OpenBSD (1.39 millert)
    Sponsored by: Netflix

    (cherry picked from commit a2d78713171cf138b5ae50d61f82df1af7574c95)

 usr.bin/sed/process.c | 2 +-
 usr.bin/sed/sed.1     | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=9c348f73a8568769b1a746efd9ccbca2f4ef7252

commit 9c348f73a8568769b1a746efd9ccbca2f4ef7252
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2024-12-23 19:06:11 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-01-17 18:48:19 +0000

    sed tests: Add a regression test for the c function

    Based on the test case from PR 271817 by Mohamed Akram.

    PR: 271817
    MFC after: 2 weeks

    (cherry picked from commit 003818aca4cdda47adef808a56d48003aa514029)

 usr.bin/sed/tests/sed2_test.sh | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=5ea64bfc9a6f8582b952580c7cdf754e7ab4a078

commit 5ea64bfc9a6f8582b952580c7cdf754e7ab4a078
Author:     Valeriy Ushakov <uwe@netbsd.org>
AuthorDate: 2024-12-17 22:27:01 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-01-17 18:48:01 +0000

    sed: The change ("c") command should start a new cycle.

    The "c" command should start the next cycle as clarified in POSIX
    2024. This is also consistent with historical and gnu sed behavior.

    This patch is from OpenBSD by way of NetBSD with a tweak to the man page
    date by me. Confirmed the test case in the bug now works.

    PR: 271817
    Obtained from: NetBSD (1.39 uwe), OpenBSD (1.39 millert)
    Sponsored by: Netflix

    (cherry picked from commit a2d78713171cf138b5ae50d61f82df1af7574c95)

 usr.bin/sed/process.c | 2 +-
 usr.bin/sed/sed.1     | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)