Bug 41159

Summary: [patch] new sed(1) -c option to allow ; as a separator for b, t and : functions
Product: Base System
Reporter: Cyrille Lefevre <cyrille.lefevre>
Component: bin
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Open ---
Severity: Affects Only Me
CC: tjr
Priority: Normal
Keywords: patch
Version: 4.6-STABLE
Hardware: Any
OS: Any
Attachments:
Description Flags
file.diff none

Description Cyrille Lefevre 2002-07-30 13:50:01 UTC
	The current sed implementation can't handle the ; separator for
	the b, t and : functions. This patch set adds a new -c (compat) option
	to allow sed to parse such constructions. Maybe -C is better than -c?

How-To-Repeat: 	fetch http://queen.rett.polimi.it/~paolob/seders/scripts/sokoban.sed
	sed -f sokoban.sed
	sed: 2266: /root/sokoban.sed: unexpected EOF (pending }'s)
	sed -cf sokoban.sed
	(well, it doesn't work yet, but at least it can be parsed :)
Comment 1 Giorgos Keramidas freebsd_committer freebsd_triage 2002-08-20 22:04:10 UTC
Adding to audit trail:
:
: Message-Id: <20020801122653.A4231@dilbert.robbins.dropbear.id.au>
: Date: Thu, 1 Aug 2002 12:26:53 +1000
: From: Tim Robbins <tjr@FreeBSD.ORG>
: Subject: Re: new sed -c option to allow ; as a separator for b, t and : functions
:
: On Tue, Jul 30, 2002 at 02:40:33PM +0200, Cyrille Lefevre wrote:
: > 	the current sed implementation can't handle the ; separator for
: > 	b, t and : functions. this patch set add a new -c (compat) option
: > 	to allow sed to parse such constructions. maybe -C is better then -c ?
:
: If I understand SUSv3 correctly, file names or label names can contain
: semicolons:
:
: 32406	Command verbs other than {, a, b, c, i, r, t, w, :, and # can be followed by a semicolon, optional
: 32407	<blank>s, and another command verb. However, when the s command verb is used with the w
: 32408	flag, following it with another command in this manner produces undefined results.
:
: GNU sed, which by default accepts semicolons after (at least) the : and t
: commands, does not strictly conform. I'd rather that our sed was by default
: as close as possible to what the standard requires, so we probably do need
: a command line option like you suggested.
:
: I think -g would be a better choice (similar to m4's -g option) to
: emphasise the fact that accepting semicolons after these commands is a GNU
: extension, and possibly adding more GNU compatibility features if
: they're needed/useful. BSD, 7th Edition and System V all behave the same
: way, and treat semicolons as part of the label/filename.
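
To make the difference between the two behaviours concrete, here is a minimal sketch in C. It is not the code from the patch: dup_label() and its gnu_compat parameter are hypothetical names used only for illustration. With gnu_compat set to 0 the label runs to the end of the line, so a semicolon is part of the label; with gnu_compat non-zero the label also stops at the first semicolon.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of sed: copy the label starting at *sp.
 * gnu_compat == 0: the label runs to '\n' or '\0', so ';' belongs to it.
 * gnu_compat != 0: the label also ends at the first ';'.
 * On return *sp points at the character that ended the label.
 * Returns NULL for an empty label or if malloc fails; caller frees.
 */
char *
dup_label(const char **sp, int gnu_compat)
{
	const char *start, *s;
	char *copy;
	size_t len;

	assert(sp != NULL && *sp != NULL);
	start = *sp;
	for (s = start; *s != '\0' && *s != '\n'; s++)
		if (gnu_compat && *s == ';')
			break;
	*sp = s;			/* tell the caller where parsing stopped */
	len = (size_t)(s - start);
	if (len == 0)
		return (NULL);
	if ((copy = malloc(len + 1)) == NULL)
		return (NULL);
	memcpy(copy, start, len);
	copy[len] = '\0';
	return (copy);
}
```

Given the text "loop;s/a/b/", the strict mode yields the single label "loop;s/a/b/", while the compatibility mode yields "loop" and leaves the pointer on the ';' so that the caller can go on to parse the next command.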
Comment 2 Cyrille Lefevre 2004-03-04 16:58:50 UTC
Hi Giorgos,

your wish has been granted: -c has been replaced by -g :)

feature added: the "unexpected EOF (pending }'s)" error now reports the line
number of the opening { instead of the line number at EOF

so, could you commit this PR ?

thanks in advance.

cvs diff against -current (FreeBSD 5.2-CURRENT #1: Sat Jan 31 15:17:05 CET 2004)

Index: compile.c
===================================================================
RCS file: /home/ncvs/src/usr.bin/sed/compile.c,v
retrieving revision 1.24
diff -u -I$Id.*$ -I$.+BSD.*$ -r1.24 compile.c
--- compile.c	4 Nov 2003 12:16:47 -0000	1.24
+++ compile.c	4 Mar 2004 16:48:40 -0000
@@ -76,7 +76,7 @@
 static char	 *compile_tr(char *, char **);
 static struct s_command
 		**compile_stream(struct s_command **);
-static char	 *duptoeol(char *, const char *);
+static char	 *duptoeol(char **, const char *, int);
 static void	  enterlabel(struct s_command *);
 static struct s_command
 		 *findlabel(char *);
@@ -164,9 +164,12 @@
 	stack = 0;
 	for (;;) {
 		if ((p = cu_fgets(lbuf, sizeof(lbuf), NULL)) == NULL) {
-			if (stack != 0)
+			if (stack != 0) {
+				for (cmd = stack; cmd->next; cmd = cmd->next)
+					/* nothing */ ;
 				errx(1, "%lu: %s: unexpected EOF (pending }'s)",
-							linenum, fname);
+							cmd->linenum, fname);
+			}
 			return (link);
 		}
 
@@ -231,6 +234,7 @@
 			p++;
 			EATSPACE();
 			cmd->next = stack;
+			cmd->linenum = linenum;
 			stack = cmd;
 			link = &cmd->u.c;
 			if (*p)
@@ -279,39 +283,43 @@
 		case WFILE:			/* w */
 			p++;
 			EATSPACE();
-			if (*p == '\0')
+			cmd->t = duptoeol(&p, "w command", 0);
+			if (cmd->t == NULL)
 				errx(1, "%lu: %s: filename expected", linenum, fname);
-			cmd->t = duptoeol(p, "w command");
 			if (aflag)
 				cmd->u.fd = -1;
-			else if ((cmd->u.fd = open(p,
+			else if ((cmd->u.fd = open(cmd->t, 
 			    O_WRONLY|O_APPEND|O_CREAT|O_TRUNC,
 			    DEFFILEMODE)) == -1)
-				err(1, "%s", p);
+				err(1, "%s", cmd->t);
 			break;
 		case RFILE:			/* r */
 			p++;
 			EATSPACE();
-			if (*p == '\0')
+			cmd->t = duptoeol(&p, "read command", 0);
+			if (cmd->t == NULL)
 				errx(1, "%lu: %s: filename expected", linenum, fname);
-			else
-				cmd->t = duptoeol(p, "read command");
 			break;
 		case BRANCH:			/* b t */
 			p++;
 			EATSPACE();
-			if (*p == '\0')
-				cmd->t = NULL;
-			else
-				cmd->t = duptoeol(p, "branch");
+			cmd->t = duptoeol(&p, "branch", 1);
+			if (*p == ';') {
+				p++;
+				goto semicolon;
+			}
 			break;
 		case LABEL:			/* : */
 			p++;
 			EATSPACE();
-			cmd->t = duptoeol(p, "label");
-			if (strlen(p) == 0)
+			cmd->t = duptoeol(&p, "label", 1);
+			if (cmd->t == NULL)
 				errx(1, "%lu: %s: empty label", linenum, fname);
 			enterlabel(cmd);
+			if (*p == ';') {
+				p++;
+				goto semicolon;
+			}
 			break;
 		case SUBST:			/* s */
 			p++;
@@ -730,25 +738,33 @@
 
 /*
  * duptoeol --
- *	Return a copy of all the characters up to \n or \0.
+ *	Return a copy of all the characters up to \n or \0 and maybe `;'.
  */
 static char *
-duptoeol(char *s, const char *ctype)
+duptoeol(char **sp, const char *ctype, int semi)
 {
 	size_t len;
 	int ws;
-	char *p, *start;
+	char *p, *start, *s;
+	char c;
 
+	c = semi && gflag ? ';' : '\0';
 	ws = 0;
-	for (start = s; *s != '\0' && *s != '\n'; ++s)
+	for (start = s = *sp; *s != '\0' && *s != '\n' && *s != c; ++s)
 		ws = isspace((unsigned char)*s);
-	*s = '\0';
+	*sp = s;
+	if (*s != c)
+		*s = '\0';
+	if (start == s)
+		return (NULL);
 	if (ws)
 		warnx("%lu: %s: whitespace after %s", linenum, fname, ctype);
 	len = s - start + 1;
 	if ((p = malloc(len)) == NULL)
 		err(1, "malloc");
-	return (memmove(p, start, len));
+	s = memmove(p, start, len);
+	s [len-1] = '\0';
+	return (s);
 }
 
 /*
Index: defs.h
===================================================================
RCS file: /home/ncvs/src/usr.bin/sed/defs.h,v
retrieving revision 1.3
diff -u -I$Id.*$ -I$.+BSD.*$ -r1.3 defs.h
--- defs.h	11 Aug 1997 07:21:00 -0000	1.3
+++ defs.h	31 Jul 2002 01:23:53 -0000
@@ -90,6 +90,7 @@
 	char code;				/* Command code */
 	u_int nonsel:1;				/* True if ! */
 	u_int inrange:1;			/* True if in range */
+	u_long linenum;				/* Line number of the opening { */
 };
 
 /*
Index: extern.h
===================================================================
RCS file: /home/ncvs/src/usr.bin/sed/extern.h,v
retrieving revision 1.12
diff -u -I$Id.*$ -I$.+BSD.*$ -r1.12 extern.h
--- extern.h	4 Nov 2003 13:09:16 -0000	1.12
+++ extern.h	4 Mar 2004 16:48:43 -0000
@@ -45,6 +45,7 @@
 extern u_long linenum;
 extern int appendnum;
 extern int aflag, eflag, nflag;
+extern int gflag;
 extern const char *fname, *outfname;
 extern FILE *infile, *outfile;
 extern int rflags;	/* regex flags to use */
Index: main.c
===================================================================
RCS file: /home/ncvs/src/usr.bin/sed/main.c,v
retrieving revision 1.31
diff -u -I$Id.*$ -I$.+BSD.*$ -r1.31 main.c
--- main.c	4 Nov 2003 22:39:25 -0000	1.31
+++ main.c	4 Mar 2004 16:48:27 -0000
@@ -100,6 +100,7 @@
 FILE *outfile;			/* Current output file */
 
 int aflag, eflag, nflag;
+int gflag;			/* allow ; to behave as \n for b and t */
 int rflags = 0;
 static int rval;		/* Exit status */
 
@@ -130,7 +131,7 @@
 	fflag = 0;
 	inplace = NULL;
 
-	while ((c = getopt(argc, argv, "Eae:f:i:n")) != -1)
+	while ((c = getopt(argc, argv, "Eae:f:gi:n")) != -1)
 		switch (c) {
 		case 'E':
 			rflags = REG_EXTENDED;
@@ -150,6 +151,9 @@
 			fflag = 1;
 			add_compunit(CU_FILE, optarg);
 			break;
+		case 'g':
+			gflag = 1;
+			break;
 		case 'i':
 			inplace = optarg;
 			break;
@@ -188,8 +192,8 @@
 usage(void)
 {
 	(void)fprintf(stderr, "%s\n%s\n",
-		"usage: sed script [-Ean] [-i extension] [file ...]",
-		"       sed [-an] [-i extension] [-e script] ... [-f script_file] ... [file ...]");
+		"usage: sed script [-Eagn] [-i extension] [file ...]",
+		"       sed [-acn] [-i extension] [-e script] ... [-f script_file] ... [file ...]");
 	exit(1);
 }
 
Index: sed.1
===================================================================
RCS file: /home/ncvs/src/usr.bin/sed/sed.1,v
retrieving revision 1.31
diff -u -I$Id.*$ -I$.+BSD.*$ -r1.31 sed.1
--- sed.1	4 Jan 2004 15:33:06 -0000	1.31
+++ sed.1	4 Mar 2004 16:47:21 -0000
@@ -43,11 +43,11 @@
 .Nd stream editor
 .Sh SYNOPSIS
 .Nm
-.Op Fl Ean
+.Op Fl Eagn
 .Ar command
 .Op Ar
 .Nm
-.Op Fl Ean
+.Op Fl Eagn
 .Op Fl e Ar command
 .Op Fl f Ar command_file
 .Op Fl i Ar extension
@@ -99,6 +99,17 @@
 .Ar command_file
 to the list of commands.
 The editing commands should each be listed on a separate line.
+.It Fl g
+Activate the GNU-sed compatible mode which allows the
+.Dq \&;
+command separator for
+.Dq b ,
+.Dq t
+and
+.Dq \&:
+functions instead of reading the
+.Em label
+until the end of line.
 .It Fl i Ar extension
 Edit files in-place, saving backups with the specified
 .Ar extension .

Cyrille Lefevre
-- 
mailto:cyrille.lefevre@laposte.net
Comment 3 Cyrille Lefevre 2004-03-04 17:01:38 UTC
oops, I forgot to change the second usage string at line 196 in main.c

should be :
sed [-agn] [-i extension] [-e script] ...

instead of :
sed [-acn] [-i extension] [-e script] ...

sorry.

Cyrille Lefevre
-- 
mailto:cyrille.lefevre@laposte.net
Comment 4 Cyrille Lefevre 2004-03-04 17:04:28 UTC
well, in fact, the added feature already has its own PR, which is bin/41190,
titled: in sed, report the { linenum instead of EOF linenum on pending }

forgot about that, sorry again.
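
For readers who do not want to dig through the diff, the idea can be sketched as follows. This is only an illustration under assumed names (struct open_brace, brace_push(), brace_pop() and brace_unclosed_line() are hypothetical and do not exist in sed): each { is pushed together with its line number, each } pops one entry, and at EOF the bottom of the stack gives the line of the outermost unclosed {.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record for one unclosed '{'; these names are not from sed. */
struct open_brace {
	unsigned long linenum;		/* script line of the '{' */
	struct open_brace *next;	/* next outer unclosed '{' */
};

/* Push a '{' seen at linenum; returns the new top, or NULL if malloc fails. */
struct open_brace *
brace_push(struct open_brace *top, unsigned long linenum)
{
	struct open_brace *b;

	if ((b = malloc(sizeof(*b))) == NULL)
		return (NULL);
	b->linenum = linenum;
	b->next = top;
	return (b);
}

/* Pop the innermost '{' when its '}' is seen; returns the new top. */
struct open_brace *
brace_pop(struct open_brace *top)
{
	struct open_brace *next;

	assert(top != NULL);
	next = top->next;
	free(top);
	return (next);
}

/*
 * At EOF: line of the outermost unclosed '{' (bottom of the stack),
 * or 0 if every group was closed.
 */
unsigned long
brace_unclosed_line(const struct open_brace *top)
{
	if (top == NULL)
		return (0);
	while (top->next != NULL)
		top = top->next;
	return (top->linenum);
}
```

Reporting the bottom of the stack rather than the top follows what the loop in the diff does; reporting the innermost { instead would be an equally defensible choice.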

Cyrille Lefevre
-- 
mailto:cyrille.lefevre@laposte.net
Comment 5 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 07:59:08 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 6 Graham Perrin freebsd_committer freebsd_triage 2022-10-17 12:35:22 UTC
Keyword: 

    patch
or  patch-ready

– in lieu of summary line prefix: 

    [patch]

* bulk change for the keyword
* summary lines may be edited manually (not in bulk). 

Keyword descriptions and search interface: 

    <https://bugs.freebsd.org/bugzilla/describekeywords.cgi>