Created attachment 150515 [details] Process -e optarg verbatim When encountering an -e argument, sed currently mallocs a string to COPY the optarg -- with '\n' appended. The appendage does not seem necessary -- indeed, the same call to add_compunit processing the sole command (given without -e) passes the *argv verbatim: without making a copy, and without appending newline. The patch eliminates the seemingly redundant malloc/string-copying.
Created attachment 150516 [details] Add consts, where appropriate This patch is independent of the main one -- consts are added for great justice.
Created attachment 150518 [details] Constify, remove command-length limits This patch purports to eliminate the use of static buffers (of size _POSIX2_LINE_MAX -- 2048). They are replaced by either dynamically-allocated memory (when reading from file using getline() instead fgets()) and by using the already-recorded strings, when those are given on command-line (-e). This saves run-time memory -- because most commands are much shorter than 2048, while also lifting the hard-limit for those cases, when a command is longer... The latter case was handled badly before -- the command was silently truncated leading to cryptic error messages at best (see Bug 177018) or to a possible corruption at worst. Tested (under valgrind) against math.sed and hanoi.sed, as well as the very long -e argument.
Created attachment 150536 [details] Constify, remove command-length limits This version passes the regression-tests, that come with the gsed port as well -- those of them, which the unmodified sed(1) passes, that is.
Created attachment 150558 [details] Constify, remove command-length limits The restores an accidentally-omitted line. Detected, while building firefox. All other tests continue to pass.
Created attachment 150663 [details] Properly process multi-line -e commands too This patch fixes things to properly handle the multiline -e arguments, which are used heavily by our own regression-tests. Our entire test-suit succeeds now, with the exception of 7.8, which the unmodified sed fails too.
Created attachment 152797 [details] Latest installment of sed cleanup The patch adds some minor cleanups and solves the following actual problems: * When parsing a regexp in an address (rather than command), only "I" (capital) was treated as indication, nocase regexp was desired. Man-page documented the "i" (lower case) letter as the flag. GNU sed seems to allow only the upper case, but this patch now accepts both upper and lower case "i" in the address for consistency with treatment of regexps in the s-command * Do not compile s-command's regexps twice
Created attachment 153006 [details] Valentine's Day installment This version goes back to only allowing capital I as case-ignoring flag after address. Treating lower-case i as such broke some configure-scripts. Our man-page still needs fixing, however, because it (incorrectly) identifies the lower-case i as the flag.
The patches look interesting but the code has changed a lot so they don't apply cleanly.
Take .. I am playing with them.
A commit references this bug: Author: pfg Date: Mon May 9 18:53:46 UTC 2016 New revision: 299279 URL: https://svnweb.freebsd.org/changeset/base/299279 Log: Simplify redundant malloc'ing in sed -e. When encountering an -e argument, sed currently mallocs a string to COPY the optarg -- with '\n' appended. The appendage does not seem necessary -- indeed, the same call to add_compunit processing the sole command (given without -e) passes the *argv verbatim: without making a copy, and without appending newline. This matches what is done in other BSDs. Submitted by: Mikhail T. PR: 195929 MFC after: 2 weeks Changes: head/usr.bin/sed/main.c
A commit references this bug: Author: pfg Date: Tue May 10 02:02:50 UTC 2016 New revision: 299294 URL: https://svnweb.freebsd.org/changeset/base/299294 Log: Revert r299279: Simplify redundant malloc'ing in sed -e. It is causing havoc in the ports tree: ===> Configuring for wxsvg-1.5.7 sed: 1: "/gcc_dir=\\`/s/gcc /$CC /": bad flag in substitute command: '/' *** Error code 1 ===> Patching for vips-8.3.1 sed: 1: "1s|^#![[:space:]]*/usr/ ...": bad flag in substitute command: 's' *** Error code 1 PR: 195929 Reported by: danilo Changes: head/usr.bin/sed/main.c
Yes, I recall, that change had to be coupled with the one, that recognized '\0' in addition to '\n' as the line-ender. I only tested the patches together. Sorry for the confusion...
(In reply to Mikhail T. from comment #12) It is curious that the patch passed the regression tests: perhpas it shows we need more tests. The second patch doesn't apply cleanly to -current; which is where commits happen. I don't really have time to decouple the patches and with the FreeBSD 11 freeze coming I feel uncomfortable doing changes in sed. I will drop this for now, but may re-take it after FreeBSD 11.
(In reply to Pedro F. Giffuni from comment #13) > The second patch doesn't apply cleanly to -current I can rework the patch to apply to 11, but I need some reassurance, it is not going to be left here to bitrot again for over a year. > with the FreeBSD 11 freeze coming I feel uncomfortable doing changes in sed. Can you, at least, fix the man-page? It is the last hunk of the larger patch...
A commit references this bug: Author: pfg Date: Tue May 10 17:01:52 UTC 2016 New revision: 299377 URL: https://svnweb.freebsd.org/changeset/base/299377 Log: sed.1: Correction for the case insensitive case. Use the capital I instead of the lowercase. Submitted by: Mikhail T. PR: 195929 MFC after: 2 weeks Changes: head/usr.bin/sed/sed.1
(In reply to Mikhail T. from comment #14) OK, I updated the manpage as the bug was rather serious, thanks! I cannot assure anything: my absolute priority is getting the fix from Bug 209352 before 11-Release. If you wait a couple of weeks though, I will be merging to stable some of the changes that are breaking your patchset.
Created attachment 171898 [details] Updates for recent changes This corrects one old problem and reconciles the changes with the recent modifications.
Created attachment 171900 [details] Script for using GNU sed's tests to verify ours This script use GNU sed's test-suit to verify our sed. Almost all of gsed's tests succeed, obviously -- those, where we differ (and differed before) are excluded. To use: make -C /usr/ports/textproc/gsed extract sh test.sh After the script completes, valgrind's logs can be checked with: grep 'ERROR SUMMARY' *.log If you do not valgrind installed, edit the script to use the newly-built sed directly.
Created attachment 172273 [details] Update to match sed on FreeBSD 11-BETA 11-BETA (and current) still have more changes and match better GNU sed. Please review this update. Also note that attempting to make const the parameter in findlabel() causes this problem: /usr/src/usr.bin/sed/compile.c:913:28: warning: cast from 'const char *' to 'unsigned char *' drops const qualifier [-Wcast-qual] for (h = 0, p = (u_char *)name; (c = *p) != 0; p++) ^
Created attachment 172402 [details] Add come more const (In reply to Pedro F. Giffuni from comment #19) > Please review this update. Seems fine... I added some more const-poison, though -- this time to the regex_t-pointers. > /usr/src/usr.bin/sed/compile.c:913:28: warning: cast from 'const char *' to > 'unsigned char *' drops const qualifier [-Wcast-qual] Fixed -- by adding const to casting (and p's declaration) as well. It now compiles cleanly here with clang-3.4.1 (base), as well as clang-3.8.0 and gcc-6.0. Thanks!
Created attachment 172403 [details] Add some more const And correct a fat-fingering defect, which removed initialization of le in substitute().
Created attachment 172409 [details] Add some more const This installment adds some more const-poisoning to process.c
Comment on attachment 172409 [details] Add some more const In compile.c, the hunk after line 325, you removed an occurence of EATSPACE(), is that correct?
(In reply to Pedro F. Giffuni from comment #23) > In compile.c, the hunk after line 325, you removed an occurence of EATSPACE(), > is that correct? It seemed like compile_flags would eat all the space there may be. But the macro is practically free -- let's put it back...
(In reply to Mikhail Teterin from comment #24) The patch doesn't apply cleanly to 11/12 :(
Created attachment 172442 [details] Updated patch Unfortunately, according to our testsuite the patch causes breakage: legacy_test:main -> failed: 3 tests of 26 failed [0.219s] multi_test:main -> failed: 2 tests of 130 failed [0.364s] inplace_race_test:main -> passed [0.159s] The regular tests only cause one failure in 11-beta: /usr/tests/usr.bin/sed % kyua test -k Kyuafile legacy_test:main -> passed [0.153s] multi_test:main -> failed: 1 tests of 130 failed [0.350s] inplace_race_test:main -> passed [0.150s]
(In reply to Pedro F. Giffuni from comment #26) > Unfortunately, according to our testsuite the patch causes breakage Was that with or without the EATSPACE() you mentioned earlier? BTW, how do you run the test-suite? I have to admit, I could never quite figure it out -- and ended up adopting gsed's test-suit for my development efforts... Could you help, please? Thank you!
(In reply to Mikhail Teterin from comment #27) 1) Yes, the patch includes the EATSPACE. 2) The best reference to the testsuite is the wiki. https://wiki.freebsd.org/TestSuite For starters: - install kyua. - Add WITH_TEST=yes to /etc/src.conf - cd /usr/src/usr.bin/sed; make ; make install, etc - kyua test -k /usr/tests/usr.bin/sed/Kyuafile
(In reply to Pedro F. Giffuni from comment #28) Actually, it's WITH_TESTS (sorry about that).
Created attachment 172615 [details] Updated patch Ok, yes, there was a bug in the parsing lines like s/FOO/BAR/4. I fixed it and the multi_test now passes fully here. The legacy_test chokes on the missing regress.m4 -- maybe, I need to rebuild world again having set WITH_TESTS to yes (this ought to be simpler). But legacy_test was working for you before -- hopefully, it still does. Interestingly, one of gsed's tests started to fail now. But it fails the same way with our unmodified sed. The recently-introduced bug must be in the process.c somewhere: % echo z | gsed -n -e 's/^a*/b/2p' # empty -- correct % echo z | sed -n -e 's/^a*/b/2p' z # incorrect
Created attachment 172616 [details] Updated patch (In reply to Mikhail Teterin from comment #30) > The recently-introduced bug must be in the process.c somewhere: Yes, it would seem, the bug was introduced in base r300555. The check for finding the right number of re-occurences is off by one. With this hunk added, things seem to work: /* Did not find the requested number of matches. */ - if (n > 1) + if (n > 0) return (0);
A commit references this bug: Author: pfg Date: Sun Jul 17 21:49:54 UTC 2016 New revision: 302973 URL: https://svnweb.freebsd.org/changeset/base/302973 Log: sed(1): Fix off by one introduced in r299211. Detected by running the gsed tests. Submitted by: Mikhail Teterin PR: 195929 MFC after: 3 days Changes: head/usr.bin/sed/process.c
Created attachment 172618 [details] Updated patch for 12-current The patch still required some changes for 12-current and there are sufficient differences it may not be fun to merge to 10-stable. The off-by-one (thanks so much) is a candidate for merging into 11-stable.
(In reply to Pedro F. Giffuni from comment #33) > The patch still required some changes for 12-current and there are sufficient > differences it may not be fun to merge to 10-stable. Why are there such differences between branches? Sed is a purely user-space program and depends neither on anything unusual in the kernel, nor on any libraries, that may only be present in newer releases. I'd say, the version in 11 and 12 should be the same. There may be an argument for staying conservative with 10, but, as long as all the tests pass, maybe even that should be no different. > Detected by running the gsed tests. Maybe, the triggered test should be added to our suit?
(In reply to Mikhail Teterin from comment #34) >> The patch still required some changes for 12-current and there >> are sufficient differences it may not be fun to merge to 10-stable. > Why are there such differences between branches? Sed is a purely > user-space program and depends neither on anything unusual in the > kernel, nor on any libraries, that may only be present in newer > releases. There are changes affecting backwards compatibility. Anything that means you may have to rewrite you scripts during a minor upgrade is not good. Other committers take decisions about their own changes: in my case, r297602 which is really minor is something that I am not considering to MFC. > > Detected by running the gsed tests. > Maybe, the triggered test should be added to our suit? Yes but consider, within reason, there would be no license issues (IANAL).
A commit references this bug: Author: pfg Date: Tue Jul 19 22:56:41 UTC 2016 New revision: 303047 URL: https://svnweb.freebsd.org/changeset/base/303047 Log: sed(1): Assorted cleanups and simplifications. Const-ify several variables, make it build cleanly with WARNS level 5. Submitted by: mi PR: 195929 MFC after: 1 month Changes: head/usr.bin/sed/Makefile head/usr.bin/sed/compile.c head/usr.bin/sed/defs.h head/usr.bin/sed/extern.h head/usr.bin/sed/main.c head/usr.bin/sed/misc.c head/usr.bin/sed/process.c
Take
A commit references this bug: Author: pfg Date: Wed Jul 20 04:46:00 UTC 2016 New revision: 303064 URL: https://svnweb.freebsd.org/changeset/base/303064 Log: MFC r302973: sed(1): Fix off by one introduced in r299211. Detected by running the gsed tests. Submitted by: Mikhail Teterin PR: 195929 Approved by: re (gjb) Changes: _U stable/11/ stable/11/usr.bin/sed/process.c
A commit references this bug: Author: pfg Date: Wed Jul 20 04:49:02 UTC 2016 New revision: 303065 URL: https://svnweb.freebsd.org/changeset/base/303065 Log: MFC r302973: sed(1): Fix off by one introduced in r299211. Detected by running the gsed tests. Submitted by: Mikhail Teterin PR: 195929 Changes: _U stable/10/ stable/10/usr.bin/sed/process.c
A commit references this bug: Author: ngie Date: Sun Jul 31 05:31:09 UTC 2016 New revision: 303572 URL: https://svnweb.freebsd.org/changeset/base/303572 Log: Fix regression with /i caused by r303047 '\n' was specifically added to -e arguments prior to r303047. Restore historical behavior which in turn fixes usr.sbin/etcupdate/preworld_test:main . The fix is being committed to address the issue in the short term and may be iterated upon as noted in bug 211399 Discussed with: mi, pfg Differential Revision: https://reviews.freebsd.org/D7368 PR: 195929, 211399 [*] MFC after: 18 days X-MFC with: r303047 Reported by: Jenkins Sponsored by: EMC / Isilon Storage Division Changes: head/usr.bin/sed/main.c
A commit references this bug: Author: pfg Date: Tue Aug 2 15:35:54 UTC 2016 New revision: 303662 URL: https://svnweb.freebsd.org/changeset/base/303662 Log: sed(1): Revert r303047 "cleanup" and therefore r303572. While big, the change was meant to have no effect on behavior and instead so far we have found two regressions: one in the etcupdate tests and another one in the games/openttd port[1]. Revert to a known working state. We will likely have to split the patch in functional parts before bringing back the changes. PR: 195929 Reported by: danfe, madpilot [1] Changes: head/usr.bin/sed/Makefile head/usr.bin/sed/compile.c head/usr.bin/sed/defs.h head/usr.bin/sed/extern.h head/usr.bin/sed/main.c head/usr.bin/sed/misc.c head/usr.bin/sed/process.c
batch change of PRs untouched in 2018 marked "in progress" back to open.
back to the pool
^Triage: patch was reversed in 2016 and there has been no subsequent action since.