Bug 16393

Summary: /bin/sh doesn't strip comments on shebang line
Product: Base System Reporter: ryand <ryand>
Component: bin Assignee: Garance A Drosehn <gad>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 3.3-STABLE   
Hardware: Any   
OS: Any   

Description ryand 2000-01-27 02:40:01 UTC
Basically, if I follow the suggestions in the perl book to make 
portably executable scripts, I must use a shebang hack where the 
perl script starts being executed as a sh script, and sh then passes 
it off to perl. Currently sh chokes on the # after --, treating it as 
the name of the script to run.

[503] x.pl 
#: Can't open #

What should happen is that sh strips the # and everything after it.
With no args, the file should be executed in the same sh. Then
the eval/exec will transfer responsibility to perl. This works on
DEC UNIX, Linux, and several others.

How-To-Repeat: Run the following script:

#!/bin/sh -- # -*- perl -*-

eval "exec perl $0 -S ${1+'$@'}"
  if 0;

print "1+1=", (1+1), "\n";
Comment 1 malachai 2000-02-10 21:00:30 UTC
I've run into this, too.  The problem seems to have two parts.

First, the kernel parses the shebang line into white-space-separated
tokens without any regard to the presence of a '#' character (which, in
the case we're interested in, denotes a comment).  The first patch
makes the parsing slurp up everything from the '#' to the end-of-line
and store it as a single word.  This is necessary so that /bin/sh
knows where the comment ends, because when the interpreter is started,
the name of the script is tacked on as the last argument.  Currently
/bin/sh receives the '#' and any comment words as separate arguments
and cannot tell where the comment stops and the script name begins.
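
To make the intended tokenization concrete, here is a minimal user-space
sketch of the idea, not the kernel code itself.  The function name
split_shebang_args and its in-place splitting are assumptions made for
illustration only.  It follows the intent stated in the patch comment:
a '#' at the start of a token turns the rest of the line into one token.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of imgact_shell.c): split the argument
 * part of a shebang line into tokens, modifying the buffer in place.
 * A token that starts with '#' swallows the rest of the line.
 * Returns the number of tokens stored in tokens[].
 */
size_t
split_shebang_args(char *line, char *tokens[], size_t max_tokens)
{
	size_t n = 0;
	char *p = line;

	assert(line != NULL && tokens != NULL);

	while (*p != '\0' && n < max_tokens) {
		/* Skip blanks between tokens. */
		while (*p == ' ' || *p == '\t')
			p++;
		if (*p == '\0')
			break;

		tokens[n++] = p;

		/* Comment: keep everything up to end-of-line as one token. */
		if (*p == '#')
			break;

		/* Ordinary token: runs until the next blank. */
		while (*p != '\0' && *p != ' ' && *p != '\t')
			p++;
		if (*p != '\0')
			*p++ = '\0';
	}
	return (n);
}
```

Given the argument string "-- # -*- perl -*-" from the original report,
this sketch yields two tokens: "--" and "# -*- perl -*-".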

Second, /bin/sh will take the first non-option as a file name
(according to sh(1)), which means it starts looking for a file named
the first word of the comment on the shebang line.  So, I modified
(see second patch) /bin/sh to ignore any command line words that begin
with '#' when searching for a file to interpret.  This continues to
allow things like:

    sh -c '# this is a nop'

and also preserves the original command line for the interpreter.
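
The skipping logic can be sketched in isolation as follows.  This is
only an illustration of the approach, assuming a NULL-terminated
argument vector that has already had its options consumed; the name
find_script_name is hypothetical and does not exist in /bin/sh.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of sh's options.c): return the first
 * remaining argument that does not start with '#', or NULL if there is
 * none.  The vector must be NULL-terminated.
 */
const char *
find_script_name(char *const argv[])
{
	assert(argv != NULL);

	/* Skip comment words handed over from the shebang line. */
	while (*argv != NULL && **argv == '#')
		argv++;

	return (*argv);
}
```

With the remaining arguments { "# -*- perl -*-", "x.pl", NULL } the
sketch returns "x.pl".  An argument such as "-#" or "a#b" is not
skipped, since only the first character is examined.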

I'm not sure if this is the best way to fix things, but it appears to
be consistent with current behavior and address the problem.

Patches below.


-- 
Shawn Halpenny |        Maniacal@I Ache, Ohm    |  "Universal Danger!"
               +- - - - - - - - - - - - - - - - + - - - - - - - - - - - - - - \
               | vi:G3kfM~lxfAPXh~l~2x2FirllpfcxlrifaprmfOX~Xp2hr.lrcelyl2p
- - - - - - - -|    fU~X~refsPprnlxppri2lxlpr,pFrpprrfaPlpfiprgllxp~3Xlpfndw



--- /usr/src/sys/kern/imgact_shell.c~	Wed Feb  9 17:14:09 2000
+++ /usr/src/sys/kern/imgact_shell.c	Wed Feb  9 17:14:13 2000
@@ -51,6 +51,7 @@
 exec_shell_imgact(imgp)
 	struct image_params *imgp;
 {
+	const char *comment = NULL;
 	const char *image_header = imgp->image_header;
 	const char *ihp, *line_endp;
 	char *interp;
@@ -112,7 +113,15 @@
 			 *	because this is at the front of the string buffer
 			 *	and the maximum shell command length is tiny.
 			 */
-			while ((ihp < line_endp) && (*ihp != ' ') && (*ihp != '\t')) {
+			while ((ihp < line_endp) &&
+			    ((*ihp != ' ') && (*ihp != '\t') || comment)) {
+
+				/* Shell comment characters at the start of a token cause
+				 *	everything to EOL to be one token.
+				 */
+				if (*ihp == '#')
+					comment = ihp;
+
 				*imgp->stringp++ = *ihp++;
 				imgp->stringspace--;
 			}


--- /usr/src/bin/sh/options.c~	Thu Feb 10 11:02:38 2000
+++ /usr/src/bin/sh/options.c	Thu Feb 10 13:41:10 2000
@@ -108,6 +108,15 @@
 			optlist[i].val = 0;
 	arg0 = argv[0];
 	if (sflag == 0 && minusc == NULL) {
+		/* Skip any arguments that start with shell-comment character
+		 *	since it is unlikely the filename of a script given on
+		 *	the command line will start with one.
+		 */
+		while (*argptr && **argptr == '#')
+		{
+			argptr++;
+		}
+
 		commandname = arg0 = *argptr++;
 		setinputfile(commandname, 0);
 	}
Comment 2 Martin Cracauer freebsd_committer freebsd_triage 2000-02-15 08:50:06 UTC
State Changed
From-To: open->closed

Fixed for 4.0. 

Will be merged into 3.x after some time. 

Thanks for the bug report 

Comment 3 dancy 2002-02-19 16:26:26 UTC
We ran into a problem today related to this issue (we used the #
character as a switch to our program).  I surveyed various other
operating systems, and found that FreeBSD hosts that have the
modifications suggested by bin/16393 fall short.  Here are the results
of my study:

Given a file called '/tmp/x2' with shebang line:
#!/tmp/interp -a -b -c #dee eee

If /tmp/x2 is exec'd, the operating system runs /tmp/interp w/ the
following arguments:

Solaris 8:
args: "/tmp/interp" "-a" "/tmp/x2"

Tru64 4.0:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"

FreeBSD 2.2.7:
args: "/tmp/interp" "-a" "-b" "-c" "#dee" "eee" "/tmp/x2"

FreeBSD 4.0:
args: "/tmp/interp" "-a" "-b" "-c" "/tmp/x2"

Linux 2.4.12:
args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"

Linux 2.2.19:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"

Irix 6.5:
args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"

HPUX 11.00:
args: "/tmp/x2" "-a -b -c #dee eee" "/tmp/x2"

AIX 4.3:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"

Mac OS X:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"


The most common behavior is:
argv[0]: full path of interpreter
argv[1]: all remaining args, coalesced into one string
argv[2]: The file exec'd.

FreeBSD's behavior is way out there.  No other system treats "#" in any special 
way.
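
For reference, the coalescing behavior described above can be modeled
with a short user-space sketch.  This is not taken from any of the
kernels listed; the name build_coalesced_argv is hypothetical, and the
trimming of trailing blanks is an assumption of this sketch rather
than something the results above establish.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model: build the argument vector for a shebang line
 * (the text after "#!"), modifying the buffer in place.
 *   out[0]: interpreter path
 *   out[1]: all remaining text as ONE string (omitted if empty)
 *   last:   the script that was exec'd
 * '#' gets no special treatment.  Returns the number of entries used.
 */
size_t
build_coalesced_argv(char *shebang, char *script, char *out[3])
{
	size_t n = 0;
	char *p = shebang;
	char *end;

	assert(shebang != NULL && script != NULL && out != NULL);

	/* Interpreter path: first blank-delimited word. */
	while (*p == ' ' || *p == '\t')
		p++;
	out[n++] = p;
	while (*p != '\0' && *p != ' ' && *p != '\t')
		p++;

	if (*p != '\0') {
		*p++ = '\0';
		while (*p == ' ' || *p == '\t')
			p++;
		/* Assumption: trailing blanks are dropped. */
		end = p + strlen(p);
		while (end > p && (end[-1] == ' ' || end[-1] == '\t'))
			end--;
		*end = '\0';
		if (*p != '\0')
			out[n++] = p;	/* everything else, uncut */
	}

	out[n++] = script;
	return (n);
}
```

For the line "/tmp/interp -a -b -c #dee eee" and the script "/tmp/x2",
the sketch produces "/tmp/interp", "-a -b -c #dee eee", "/tmp/x2",
which matches the Linux 2.4.12 and Irix 6.5 results.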
Comment 4 Garance A Drosehn freebsd_committer freebsd_triage 2005-03-01 23:50:34 UTC
Responsible Changed
From-To: freebsd-bugs->gad

A fix for the "doesn't strip comments" problem was committed in 2000, 
but that caused trouble for other people (as documented in this PR). 
A fix for those problems was made to kern/imgact_shell.c and committed 
to 5.3-stable in late 2004, but that change broke the "strip-comments" 
processing that perl expects. 

See the thread on "Bug in #! processing - One More Time" in freebsd-arch 
for more details.  I intend to fix this for real with another set of 
changes, but those changes aren't going to be ready for 5.4-release. 


Comment 5 Garance A Drosehn freebsd_committer freebsd_triage 2005-03-01 23:50:34 UTC
State Changed
From-To: closed->analyzed

Comment 6 Garance A Drosehn freebsd_committer freebsd_triage 2005-05-30 23:11:46 UTC
State Changed
From-To: analyzed->closed

A change has been made to sys/kern/imgact_shell.c which will probably be 
the final fix for this issue.  This has been committed to the 6.x-current 
branch, but it is an incompatible change and thus will probably not be 
MFC'ed into 5.x-stable.