Bug 195929 - usr.bin/sed -- constify, remove line-length limits
Summary: usr.bin/sed -- constify, remove line-length limits
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 10.1-RELEASE
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-bugs mailing list
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2014-12-12 21:55 UTC by Mikhail T.
Modified: 2018-09-20 14:23 UTC (History)
3 users (show)

See Also:


Attachments
Process -e optarg verbatim (573 bytes, patch)
2014-12-12 21:55 UTC, Mikhail T.
no flags Details | Diff
Add consts, where appropriate (2.09 KB, patch)
2014-12-12 21:56 UTC, Mikhail T.
no flags Details | Diff
Constify, remove command-length limits (10.03 KB, patch)
2014-12-13 00:07 UTC, Mikhail T.
no flags Details | Diff
Constify, remove command-length limits (10.61 KB, patch)
2014-12-13 06:24 UTC, Mikhail T.
no flags Details | Diff
Constify, remove command-length limits (10.63 KB, patch)
2014-12-14 02:59 UTC, Mikhail T.
no flags Details | Diff
Properly process multi-line -e commands too (20.54 KB, patch)
2014-12-17 02:04 UTC, Mikhail T.
no flags Details | Diff
Latest installment of sed cleanup (23.54 KB, patch)
2015-02-09 13:25 UTC, Mikhail Teterin
no flags Details | Diff
Valentine's Day installment (22.81 KB, patch)
2015-02-15 16:26 UTC, Mikhail Teterin
no flags Details | Diff
Updates for recent changes (22.96 KB, patch)
2016-06-28 03:26 UTC, Mikhail Teterin
no flags Details | Diff
Script for using GNU sed's tests to verify ours (645 bytes, text/plain)
2016-06-28 03:47 UTC, Mikhail Teterin
no flags Details
Update to match sed on FreeBSD 11-BETA (22.40 KB, patch)
2016-07-09 03:21 UTC, Pedro F. Giffuni
no flags Details | Diff
Add come more const (22.90 KB, patch)
2016-07-12 05:26 UTC, Mikhail Teterin
no flags Details | Diff
Add some more const (22.90 KB, patch)
2016-07-12 05:31 UTC, Mikhail Teterin
no flags Details | Diff
Add some more const (23.94 KB, patch)
2016-07-12 13:07 UTC, Mikhail Teterin
no flags Details | Diff
Updated patch (23.90 KB, patch)
2016-07-12 23:03 UTC, Pedro F. Giffuni
no flags Details | Diff
Updated patch (24.49 KB, patch)
2016-07-17 20:46 UTC, Mikhail Teterin
no flags Details | Diff
Updated patch (24.70 KB, patch)
2016-07-17 21:17 UTC, Mikhail Teterin
no flags Details | Diff
Updated patch for 12-current (23.97 KB, patch)
2016-07-17 21:56 UTC, Pedro F. Giffuni
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mikhail T. 2014-12-12 21:55:14 UTC
Created attachment 150515 [details]
Process -e optarg verbatim

When encountering an -e argument, sed currently mallocs a string to COPY the optarg -- with '\n' appended. The appendage does not seem necessary -- indeed, the same call to add_compunit processing the sole command (given without -e) passes the *argv verbatim: without making a copy, and without appending newline.

The patch eliminates the seemingly redundant malloc/string-copying.
Comment 1 Mikhail T. 2014-12-12 21:56:35 UTC
Created attachment 150516 [details]
Add consts, where appropriate

This patch is independent of the main one -- consts are added for great justice.
Comment 2 Mikhail T. 2014-12-13 00:07:04 UTC
Created attachment 150518 [details]
Constify, remove command-length limits

This patch purports to eliminate the use of static buffers (of size _POSIX2_LINE_MAX -- 2048).

They are replaced by either dynamically-allocated memory (when reading from file using getline() instead fgets()) and by using the already-recorded strings, when those are given on command-line (-e).

This saves run-time memory -- because most commands are much shorter than 2048, while also lifting the hard-limit for those cases, when a command is longer...

The latter case was handled badly before -- the command was silently truncated leading to cryptic error messages at best (see Bug 177018) or to a possible corruption at worst.

Tested (under valgrind) against math.sed and hanoi.sed, as well as the very long -e argument.
Comment 3 Mikhail T. 2014-12-13 06:24:37 UTC
Created attachment 150536 [details]
Constify, remove command-length limits

This version passes the regression-tests, that come with the gsed port as well -- those of them, which the unmodified sed(1) passes, that is.
Comment 4 Mikhail T. 2014-12-14 02:59:38 UTC
Created attachment 150558 [details]
Constify, remove command-length limits

The restores an accidentally-omitted line. Detected, while building firefox. All other tests continue to pass.
Comment 5 Mikhail T. 2014-12-17 02:04:17 UTC
Created attachment 150663 [details]
Properly process multi-line -e commands too

This patch fixes things to properly handle the multiline -e arguments, which are used heavily by our own regression-tests.

Our entire test-suit succeeds now, with the exception of 7.8, which the unmodified sed fails too.
Comment 6 Mikhail Teterin freebsd_committer 2015-02-09 13:25:32 UTC
Created attachment 152797 [details]
Latest installment of sed cleanup

The patch adds some minor cleanups and solves the following actual problems:

* When parsing a regexp in an address (rather than command), only "I" (capital)
  was treated as indication, nocase regexp was desired. Man-page documented the
  "i" (lower case) letter as the flag. GNU sed seems to allow only the upper
  case, but this patch now accepts both upper and lower case "i" in the address
  for consistency with treatment of regexps in the s-command
* Do not compile s-command's regexps twice
Comment 7 Mikhail Teterin freebsd_committer 2015-02-15 16:26:44 UTC
Created attachment 153006 [details]
Valentine's Day installment

This version goes back to only allowing capital I as case-ignoring flag after address. Treating lower-case i as such broke some configure-scripts.

Our man-page still needs fixing, however, because it (incorrectly) identifies the lower-case i as the flag.
Comment 8 Pedro F. Giffuni freebsd_committer 2016-05-09 15:52:05 UTC
The patches look interesting but the code has changed a lot so they don't apply cleanly.
Comment 9 Pedro F. Giffuni freebsd_committer 2016-05-09 15:55:58 UTC
Take .. I am playing with them.
Comment 10 commit-hook freebsd_committer 2016-05-09 18:54:45 UTC
A commit references this bug:

Author: pfg
Date: Mon May  9 18:53:46 UTC 2016
New revision: 299279
URL: https://svnweb.freebsd.org/changeset/base/299279

Log:
  Simplify redundant malloc'ing in sed -e.

  When encountering an -e argument, sed currently mallocs a string to COPY
  the optarg -- with '\n' appended. The appendage does not seem necessary --
  indeed, the same call to add_compunit processing the sole command (given
  without -e) passes the *argv verbatim: without making a copy, and without
  appending newline.

  This matches what is done in other BSDs.

  Submitted by:	Mikhail T.
  PR:		195929
  MFC after:	2 weeks

Changes:
  head/usr.bin/sed/main.c
Comment 11 commit-hook freebsd_committer 2016-05-10 02:03:26 UTC
A commit references this bug:

Author: pfg
Date: Tue May 10 02:02:50 UTC 2016
New revision: 299294
URL: https://svnweb.freebsd.org/changeset/base/299294

Log:
  Revert r299279:
  Simplify redundant malloc'ing in sed -e.

  It is causing havoc in the ports tree:

  ===>  Configuring for wxsvg-1.5.7
  sed: 1: "/gcc_dir=\\`/s/gcc /$CC /": bad flag in substitute command: '/'
  *** Error code 1

  ===>  Patching for vips-8.3.1
  sed: 1: "1s|^#![[:space:]]*/usr/ ...": bad flag in substitute command: 's'
  *** Error code 1

  PR:		195929
  Reported by:	danilo

Changes:
  head/usr.bin/sed/main.c
Comment 12 Mikhail T. 2016-05-10 13:46:08 UTC
Yes, I recall, that change had to be coupled with the one, that recognized '\0' in addition to '\n' as the line-ender.

I only tested the patches together. Sorry for the confusion...
Comment 13 Pedro F. Giffuni freebsd_committer 2016-05-10 14:19:56 UTC
(In reply to Mikhail T. from comment #12)

It is curious that the patch passed the regression tests: perhpas it shows we need more tests.

The second patch doesn't apply cleanly to -current; which is where commits happen. I don't really have time to decouple the patches and with the FreeBSD 11 freeze coming I feel uncomfortable doing changes in sed.

I will drop this for now, but may re-take it after FreeBSD 11.
Comment 14 Mikhail T. 2016-05-10 14:33:16 UTC
(In reply to Pedro F. Giffuni from comment #13)
> The second patch doesn't apply cleanly to -current

I can rework the patch to apply to 11, but I need some reassurance, it is not going to be left here to bitrot again for over a year.

> with the FreeBSD 11 freeze coming I feel uncomfortable doing changes in sed.

Can you, at least, fix the man-page? It is the last hunk of the larger patch...
Comment 15 commit-hook freebsd_committer 2016-05-10 17:02:36 UTC
A commit references this bug:

Author: pfg
Date: Tue May 10 17:01:52 UTC 2016
New revision: 299377
URL: https://svnweb.freebsd.org/changeset/base/299377

Log:
  sed.1: Correction for the case insensitive case.

  Use the capital I instead of the lowercase.

  Submitted by:	Mikhail T.
  PR:		195929
  MFC after:	2 weeks

Changes:
  head/usr.bin/sed/sed.1
Comment 16 Pedro F. Giffuni freebsd_committer 2016-05-10 17:09:45 UTC
(In reply to Mikhail T. from comment #14)
OK, I updated the manpage as the bug was rather serious, thanks!

I cannot assure anything: my absolute priority is getting the fix from Bug 209352 before 11-Release.

If you wait a couple of weeks though, I will be merging to stable some of the changes that are breaking your patchset.
Comment 17 Mikhail Teterin freebsd_committer 2016-06-28 03:26:51 UTC
Created attachment 171898 [details]
Updates for recent changes

This corrects one old problem and reconciles the changes with the recent modifications.
Comment 18 Mikhail Teterin freebsd_committer 2016-06-28 03:47:53 UTC
Created attachment 171900 [details]
Script for using GNU sed's tests to verify ours

This script use GNU sed's test-suit to verify our sed. Almost all of gsed's tests succeed, obviously -- those, where we differ (and differed before) are excluded.

To use:
  make -C /usr/ports/textproc/gsed extract
  sh test.sh

After the script completes, valgrind's logs can be checked with:
  grep 'ERROR SUMMARY' *.log

If you do not valgrind installed, edit the script to use the newly-built sed directly.
Comment 19 Pedro F. Giffuni freebsd_committer 2016-07-09 03:21:17 UTC
Created attachment 172273 [details]
Update to match sed on FreeBSD 11-BETA

11-BETA (and current) still have more changes and match better GNU sed.
Please review this update. Also note that attempting to make const the parameter in findlabel() causes this problem:

/usr/src/usr.bin/sed/compile.c:913:28: warning: cast from 'const char *' to
      'unsigned char *' drops const qualifier [-Wcast-qual]
        for (h = 0, p = (u_char *)name; (c = *p) != 0; p++)
                                  ^
Comment 20 Mikhail Teterin freebsd_committer 2016-07-12 05:26:33 UTC
Created attachment 172402 [details]
Add come more const

(In reply to Pedro F. Giffuni from comment #19)
> Please review this update.

Seems fine... I added some more const-poison, though -- this time to the regex_t-pointers.

> /usr/src/usr.bin/sed/compile.c:913:28: warning: cast from 'const char *' to
>      'unsigned char *' drops const qualifier [-Wcast-qual]

Fixed -- by adding const to casting (and p's declaration) as well.

It now compiles cleanly here with clang-3.4.1 (base), as well as clang-3.8.0 and gcc-6.0. Thanks!
Comment 21 Mikhail Teterin freebsd_committer 2016-07-12 05:31:13 UTC
Created attachment 172403 [details]
Add some more const

And correct a fat-fingering defect, which removed initialization of le in substitute().
Comment 22 Mikhail Teterin freebsd_committer 2016-07-12 13:07:10 UTC
Created attachment 172409 [details]
Add some more const

This installment adds some more const-poisoning to process.c
Comment 23 Pedro F. Giffuni freebsd_committer 2016-07-12 18:01:54 UTC
Comment on attachment 172409 [details]
Add some more const

In compile.c, the hunk after line 325, you removed an occurence of EATSPACE(), is that correct?
Comment 24 Mikhail Teterin freebsd_committer 2016-07-12 18:14:08 UTC
(In reply to Pedro F. Giffuni from comment #23)
> In compile.c, the hunk after line 325, you removed an occurence of EATSPACE(), > is that correct?

It seemed like compile_flags would eat all the space there may be. But the macro is practically free -- let's put it back...
Comment 25 Pedro F. Giffuni freebsd_committer 2016-07-12 22:27:43 UTC
(In reply to Mikhail Teterin from comment #24)

The patch doesn't apply cleanly to 11/12 :(
Comment 26 Pedro F. Giffuni freebsd_committer 2016-07-12 23:03:54 UTC
Created attachment 172442 [details]
Updated patch

Unfortunately, according to our testsuite the patch causes breakage:

legacy_test:main  ->  failed: 3 tests of 26 failed  [0.219s]
multi_test:main  ->  failed: 2 tests of 130 failed  [0.364s]
inplace_race_test:main  ->  passed  [0.159s]

The regular tests only cause one failure in 11-beta:

/usr/tests/usr.bin/sed % kyua test -k Kyuafile
legacy_test:main  ->  passed  [0.153s]
multi_test:main  ->  failed: 1 tests of 130 failed  [0.350s]
inplace_race_test:main  ->  passed  [0.150s]
Comment 27 Mikhail Teterin freebsd_committer 2016-07-13 04:18:30 UTC
(In reply to Pedro F. Giffuni from comment #26)
> Unfortunately, according to our testsuite the patch causes breakage

Was that with or without the EATSPACE() you mentioned earlier?

BTW, how do you run the test-suite? I have to admit, I could never quite figure it out -- and ended up adopting gsed's test-suit for my development efforts...

Could you help, please? Thank you!
Comment 28 Pedro F. Giffuni freebsd_committer 2016-07-13 18:30:16 UTC
(In reply to Mikhail Teterin from comment #27)
1) Yes, the patch includes the EATSPACE.
2) The best reference to the testsuite is the wiki.
https://wiki.freebsd.org/TestSuite

For starters:
- install kyua.
- Add WITH_TEST=yes to /etc/src.conf
- cd /usr/src/usr.bin/sed; make ; make install, etc
- kyua test -k /usr/tests/usr.bin/sed/Kyuafile
Comment 29 Pedro F. Giffuni freebsd_committer 2016-07-17 14:46:01 UTC
(In reply to Pedro F. Giffuni from comment #28)

Actually, it's WITH_TESTS (sorry about that).
Comment 30 Mikhail Teterin freebsd_committer 2016-07-17 20:46:16 UTC
Created attachment 172615 [details]
Updated patch

Ok, yes, there was a bug in the parsing lines like s/FOO/BAR/4. I fixed it and the multi_test now passes fully here. The legacy_test chokes on the missing regress.m4 -- maybe, I need to rebuild world again having set WITH_TESTS to yes (this ought to be simpler).

But legacy_test was working for you before -- hopefully, it still does.

Interestingly, one of gsed's tests started to fail now. But it fails the same way with our unmodified sed. The recently-introduced bug must be in the process.c somewhere:

% echo z | gsed -n -e 's/^a*/b/2p'
# empty -- correct
% echo z |  sed -n -e 's/^a*/b/2p'
z # incorrect
Comment 31 Mikhail Teterin freebsd_committer 2016-07-17 21:17:01 UTC
Created attachment 172616 [details]
Updated patch

(In reply to Mikhail Teterin from comment #30)
> The recently-introduced bug must be in the process.c somewhere:

Yes, it would seem, the bug was introduced in base r300555. The check for finding the right number of re-occurences is off by one. With this hunk added, things seem to work:

 	/* Did not find the requested number of matches. */
-	if (n > 1)
+	if (n > 0)
		return (0);
Comment 32 commit-hook freebsd_committer 2016-07-17 21:50:26 UTC
A commit references this bug:

Author: pfg
Date: Sun Jul 17 21:49:54 UTC 2016
New revision: 302973
URL: https://svnweb.freebsd.org/changeset/base/302973

Log:
  sed(1): Fix off by one introduced in r299211.

  Detected by running the gsed tests.

  Submitted by:	Mikhail Teterin
  PR:		195929
  MFC after:	3 days

Changes:
  head/usr.bin/sed/process.c
Comment 33 Pedro F. Giffuni freebsd_committer 2016-07-17 21:56:21 UTC
Created attachment 172618 [details]
Updated patch for 12-current

The patch still required some changes for 12-current and there are sufficient differences it may not be fun to merge to 10-stable.

The off-by-one (thanks so much) is a candidate for merging into 11-stable.
Comment 34 Mikhail Teterin freebsd_committer 2016-07-17 22:14:04 UTC
(In reply to Pedro F. Giffuni from comment #33)
> The patch still required some changes for 12-current and there are sufficient > differences it may not be fun to merge to 10-stable.

Why are there such differences between branches? Sed is a purely user-space program and depends neither on anything unusual in the kernel, nor on any libraries, that may only be present in newer releases.

I'd say, the version in 11 and 12 should be the same. There may be an argument for staying conservative with 10, but, as long as all the tests pass, maybe even that should be no different.

> Detected by running the gsed tests.

Maybe, the triggered test should be added to our suit?
Comment 35 Pedro F. Giffuni freebsd_committer 2016-07-18 00:26:41 UTC
(In reply to Mikhail Teterin from comment #34)

>> The patch still required some changes for 12-current and there
>> are sufficient differences it may not be fun to merge to 10-stable.
> Why are there such differences between branches? Sed is a purely
> user-space program and depends neither on anything unusual in the
> kernel, nor on any libraries, that may only be present in newer
> releases.

There are changes affecting backwards compatibility. Anything that means
you may have to rewrite you scripts during a minor upgrade is not good.
Other committers take decisions about their own changes: in my case,
r297602 which is really minor is something that I am not considering
to MFC.

> > Detected by running the gsed tests.
> Maybe, the triggered test should be added to our suit?

Yes but consider, within reason, there would be no license issues (IANAL).
Comment 36 commit-hook freebsd_committer 2016-07-19 22:57:18 UTC
A commit references this bug:

Author: pfg
Date: Tue Jul 19 22:56:41 UTC 2016
New revision: 303047
URL: https://svnweb.freebsd.org/changeset/base/303047

Log:
  sed(1):	Assorted cleanups and simplifications.

  Const-ify several variables, make it build cleanly with WARNS level 5.

  Submitted by:	mi
  PR:		195929
  MFC after:	1 month

Changes:
  head/usr.bin/sed/Makefile
  head/usr.bin/sed/compile.c
  head/usr.bin/sed/defs.h
  head/usr.bin/sed/extern.h
  head/usr.bin/sed/main.c
  head/usr.bin/sed/misc.c
  head/usr.bin/sed/process.c
Comment 37 Pedro F. Giffuni freebsd_committer 2016-07-19 23:05:20 UTC
Take
Comment 38 commit-hook freebsd_committer 2016-07-20 04:46:36 UTC
A commit references this bug:

Author: pfg
Date: Wed Jul 20 04:46:00 UTC 2016
New revision: 303064
URL: https://svnweb.freebsd.org/changeset/base/303064

Log:
  MFC r302973:
  sed(1): Fix off by one introduced in r299211.

  Detected by running the gsed tests.

  Submitted by:	Mikhail Teterin
  PR:		195929
  Approved by:	re (gjb)

Changes:
_U  stable/11/
  stable/11/usr.bin/sed/process.c
Comment 39 commit-hook freebsd_committer 2016-07-20 04:49:39 UTC
A commit references this bug:

Author: pfg
Date: Wed Jul 20 04:49:02 UTC 2016
New revision: 303065
URL: https://svnweb.freebsd.org/changeset/base/303065

Log:
  MFC r302973:
  sed(1): Fix off by one introduced in r299211.

  Detected by running the gsed tests.

  Submitted by:	Mikhail Teterin
  PR:		195929

Changes:
_U  stable/10/
  stable/10/usr.bin/sed/process.c
Comment 40 commit-hook freebsd_committer 2016-07-31 05:31:26 UTC
A commit references this bug:

Author: ngie
Date: Sun Jul 31 05:31:09 UTC 2016
New revision: 303572
URL: https://svnweb.freebsd.org/changeset/base/303572

Log:
  Fix regression with /i caused by r303047

  '\n' was specifically added to -e arguments prior to r303047. Restore
  historical behavior which in turn fixes usr.sbin/etcupdate/preworld_test:main .

  The fix is being committed to address the issue in the short term and may be
  iterated upon as noted in bug 211399

  Discussed with:		mi, pfg
  Differential Revision:	https://reviews.freebsd.org/D7368
  PR:			195929, 211399 [*]
  MFC after:		18 days
  X-MFC with:		r303047
  Reported by:		Jenkins
  Sponsored by:		EMC / Isilon Storage Division

Changes:
  head/usr.bin/sed/main.c
Comment 41 commit-hook freebsd_committer 2016-08-02 15:36:04 UTC
A commit references this bug:

Author: pfg
Date: Tue Aug  2 15:35:54 UTC 2016
New revision: 303662
URL: https://svnweb.freebsd.org/changeset/base/303662

Log:
  sed(1): Revert r303047 "cleanup" and therefore r303572.

  While big, the change was meant to have no effect on behavior and instead
  so far we have found two regressions: one in the etcupdate tests and
  another one in the games/openttd port[1].

  Revert to a known working state. We will likely have to split the patch in
  functional parts before bringing back the changes.

  PR:		195929
  Reported by:	danfe, madpilot [1]

Changes:
  head/usr.bin/sed/Makefile
  head/usr.bin/sed/compile.c
  head/usr.bin/sed/defs.h
  head/usr.bin/sed/extern.h
  head/usr.bin/sed/main.c
  head/usr.bin/sed/misc.c
  head/usr.bin/sed/process.c
Comment 42 Eitan Adler freebsd_committer freebsd_triage 2018-05-23 10:27:42 UTC
batch change of PRs untouched in 2018 marked "in progress" back to open.
Comment 43 Pedro F. Giffuni freebsd_committer 2018-09-20 14:23:14 UTC
back to the pool