Summary: | csplit leaves last matching pattern at end of file N-1 as well as in file N | ||||||
---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | J.R. Oldroyd <fbsd> | ||||
Component: | bin | Assignee: | Conrad Meyer <cem> | ||||
Status: | Closed FIXED | ||||||
Severity: | Affects Many People | CC: | cem, jilles, scott | ||||
Priority: | --- | ||||||
Version: | CURRENT | ||||||
Hardware: | Any | ||||||
OS: | Any | ||||||
Attachments: |
|
Sorry, I messed up the filename numbers in the above example. The actual filenames are as they should be: xx00, xx01, xx02 - the error line is left at the end of file xx01. Comments from Jilles Tjoelker, originally posted in PR#213510: I would like to see some automated tests before committing new bug fixes to csplit. See usr.bin/tail/tests/ for an example of tests using kyua/atf in the modern way (some other tests were originally written for a different framework and should not be used as a reference). You may need to adjust mtree files and buildworld and installworld with WITH_TESTS=YES in /etc/src.conf. ... and ... Yes, usr.bin/tail/tests did not exist at the time of 11.0-release. It is available in head and in stable/11. Alternatively, usr.bin/truncate/tests has similar tests and is available in 11.0-release. It's a great idea to have automated testing of utilities, but the atf framework is new to me and, right now, not something I have time to study. A quick look at the tests you mentioned for truncate(1) show an 11k test script. Sorry, but figuring something like that out for csplit is just not on my ToDo list at the moment! I just don't have time. I'd point out that this is a very obvious error, with a one-line patch to a part of the code that is used only with those given flags. I.e., applying this patch should not break other usage. Hopefully someone else can find time to produce some test scripts. J.R., Don't worry, I'll take care of it. A commit references this bug: Author: cem Date: Mon May 8 15:51:29 UTC 2017 New revision: 317942 URL: https://svnweb.freebsd.org/changeset/base/317942 Log: csplit(1): Fix extraneous output in edge case When the input to csplit contains fewer lines than the number of matches specified, extra output was mistakenly included in some output files. Fix the bug and add a simple ATF regression test. PR: 219024 Submitted by: J.R. Oldroyd <fbsd at opal.com> Changes: head/usr.bin/csplit/Makefile head/usr.bin/csplit/csplit.c head/usr.bin/csplit/tests/ head/usr.bin/csplit/tests/Makefile head/usr.bin/csplit/tests/csplit_test.sh ^Triage: Apologies for noise, closing a still open duplicate, finalising metadata while im here. *** Bug 163062 has been marked as a duplicate of this bug. *** |
Created attachment 182255 [details] [patch] csplit.c When using: csplit -k - '/patt/' '{count}' with an input containing less than count lines matching patt, a logic error causes the final pattern that matches to be left both at the start of file N and also at the end of file N-1. This is due to the fact that when there are less than count matches, csplit will error when looking for another match. In that error condition, csplit simply prints a message and exits, rather than calling its toomuch() function which is what truncates the previous output file. Repeat by: echo "one\ntwo\nxxx 1\nthree\nfour\nxxx 2\nfive\nsix" | csplit -k - '/xxx/' '{10}' Expected output: xx00: one two xx01: xxx 1 three four xx02: xxx 2 five six Actual output: xx01: one two xx02: xxx 1 three four xxx 2 xx03: xxx 2 five six The "xxx 2" at the end of file xx02 is incorrect. The attached patch fixes this by making sure that toomuch() is called in the error case when a third match is not found.