I am generating diffs of multiple files into a single output stream which consists of multiple sections looking like this: --- ./accessibility/jovie/Makefile.ORIG 2016-08-26 17:51:07.219458267 +0200 +++ ./accessibility/jovie/Makefile 2016-08-26 17:48:41.413457178 +0200 @@ -1,9 +1,16 @@ # $FreeBSD: head/accessibility/jovie/Makefile 420774 2016-08-24 08:20:31Z tcberner $ ... --- ./accessibility/kaccessible/Makefile.ORIG 2016-08-26 17:51:07.268463953 +0200 +++ ./accessibility/kaccessible/Makefile 2016-08-26 17:48:41.505435723 +0200 @@ -1,9 +1,16 @@ # $FreeBSD: head/accessibility/kaccessible/Makefile 420774 2016-08-24 08:20:31Z tcberner $ ... I now want to split this stream using csplit: csplit -k -n 3 - '/^[-*][-*][-*] .*\.ORIG/' '{1000}' This produces files xx000 to xx998 of size 0, and file xx999 with the following contents (one line only): --- ./accessibility/jovie/Makefile.ORIG 2016-08-26 17:51:07.219458267 +0200 This worked correctly in 10.3.
This is only a "me too". I'm using FreeBSD 11.0-RELEASE-p2 #0: Mon Oct 24 06:55:27 UTC 2016 A simple example is not working for me: echo "uno\ndos\nxxx\ntres\ncuatro" | csplit - '/xxx/' I would expect two files created, one with: uno dos the other one with: tres cuatro Instead, I have: xx00 which is empty xx01: uno Cheers, Fernando
I have experienced the same issue and an additional issue with csplit on 11.0-RELEASE-p3. To replicate create a file called 'bigfile' with the contents: BEGINS Section One Text for section one. Section Two Text for section two. Section Three Text for section three. ENDS Run csplit: csplit bigfile '/Section/' '{1}' Expected result: 3 files: xx00, xx01 and xx02 each containing text for each section with a trailing newline; for example xx00: BEGINS Section One Text for section one. ENDS Actual result: 2 files: xx00 and xx01. xx00 is empty and xx01 contains the single line "Section One". Splitting by line number is also broken. Run csplit: csplit bigfile 4 7 Expected result: 3 files: xx00, xx01 and xx02 each containing text for each section with a trailing newline. Actual result: 3 files: xx00 containing the single line "Section One", xx01 containing the single line "Text here from section one" and xx02 containing a newline character. A potential workaround is instead using the split utility, for example: split -p 'Section' bigfile However, split does not offer the complexity of csplit and is not suitable as a direct substitute.
Same failure confirmed here. Fortunately, "split -p patt" can be used as a work-around in my case.
I can reproduce this and I suspect it has to do with changes to the return value of fputs().
Created attachment 182200 [details] [patch] csplit.c Patch fixes the fputs() return value checks, and also fixes a logic error in truncating the last file. Thanks, Jilles, for pointing out the change to fputs(). There was also a logic error, repeatable like this: csplit -k - '/patt/' '{99}' Then, give input containing < 99 lines matching /patt/. You'll observe that the last line containing /patt/ was written both at the start of file N and also left at the end of file N-1. This was due to the fact that toomuch() was not being called in the error case when a match N+1 was not found. Also fixed.
A commit references this bug: Author: jilles Date: Tue May 2 21:56:20 UTC 2017 New revision: 317709 URL: https://svnweb.freebsd.org/changeset/base/317709 Log: csplit: Fix check of fputs() return value, making csplit work again. As of r295638, fputs() returns the number of bytes written (if not more than INT_MAX). This broke csplit completely, since csplit assumed only success only for the return value 0. PR: 213510 Submitted by: J.R. Oldroyd MFC after: 1 week Relnotes: yes Changes: head/usr.bin/csplit/csplit.c
(In reply to J.R. Oldroyd from comment #5) I committed the fputs() return value check part, which should return csplit's functionality to what it was in 10.3. I plan to MFC this to stable/11 before release 11.1. Thanks for the other bug fix, but I would like to see some automated tests before committing new bug fixes to csplit. See usr.bin/tail/tests/ for an example of tests using kyua/atf in the modern way (some other tests were originally written for a different framework and should not be used as a reference). You may need to adjust mtree files and buildworld and installworld with WITH_TESTS=YES in /etc/src.conf. Regression tests for this bug can go into this bug report but other bugs should go into new bug reports or Phabricator reviews. Another bug that could be fixed is the {LINE_MAX} limit. This should be easy using getline(3).
There is no src/usr.bin/tail/tests/ on my 11.0 system. I will file a separate PR for that bug with a repeat-by and the patch.
(In reply to J.R. Oldroyd from comment #8) Yes, usr.bin/tail/tests did not exist at the time of 11.0-release. It is available in head and in stable/11. Alternatively, usr.bin/truncate/tests has similar tests and is available in 11.0-release.
Please see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219024 regarding the left-over matching line at the end of file N-1.
A commit references this bug: Author: jilles Date: Fri May 12 14:38:09 UTC 2017 New revision: 318238 URL: https://svnweb.freebsd.org/changeset/base/318238 Log: MFC r317709: csplit: Fix check of fputs() return value, making csplit work again. As of r295638, fputs() returns the number of bytes written (if not more than INT_MAX). This broke csplit completely, since csplit assumed success only for the return value 0. PR: 213510 Relnotes: yes Changes: _U stable/11/ stable/11/usr.bin/csplit/csplit.c
Fixed in head and stable/11, thanks!