Bug 31627

Summary: /bin/sh's handling of some characters is wrong - loss of data
Product: Base System Reporter: Eugene Grosbein <ports>
Component: bin Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 4.4-STABLE   
Hardware: Any   
OS: Any   

Description Eugene Grosbein 2001-10-30 05:40:00 UTC
	/bin/sh 'eats' some characters resulting in loss of data

Fix: 

Unknown to me.
How-To-Repeat: 	
	run this script using sh -x:

	#!/bin/sh -x
	string=`printf "test\201string"`
	echo $string | hd

	You will see that the character '\201' (dec 129, hex 0x81, oct 0201)
	is missing from echo's argument, and hd confirms this.

	As a result, a shell script cannot process a file whose name
	contains this character if another program created it.
Comment 1 Thomas Quinot 2001-11-06 17:18:34 UTC
On 2001-11-06, Eugene Grosbein wrote:

> #!/bin/sh
> string=`printf "\21"`
> echo $string | hd
 
> Replace 21 with 201 and rerun. You see:
> 00000000  0a                                                |.|
> 00000001

Can't reproduce here for the value \201, but for the other values
you mention it looks like perfectly normal and expected behaviour
from sh(1). It is not surprising at all that some characters "disappear"
here: since $string appears unquoted, any character which is whitespace
w.r.t. shell parsing rules won't be passed to echo.
Try to quote your string:
  echo "$string" | hd
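
The difference between the two forms can be seen with a value that contains ordinary IFS whitespace. This is only a sketch: the test value is made up, and od -An -tx1 stands in for hd so that it also runs on systems without hd.

```sh
#!/bin/sh
# Hypothetical test value: a tab, the letter "x", and a trailing space.
string=$(printf '\tx ')

# Unquoted: the expansion is field-split on IFS whitespace,
# so echo receives the single argument "x".
unquoted=$(echo $string | od -An -tx1 | tr -d ' \n')

# Quoted: every byte stored in the variable reaches echo.
quoted=$(echo "$string" | od -An -tx1 | tr -d ' \n')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

The unquoted case should show 780a (just "x" and the newline added by echo), while the quoted case should show 0978200a.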

In your other example, you use the 'read' builtin to get characters
from jot, but read is /also/ defined to apply shell field splitting
rules. 
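
A small sketch of the same effect with read, assuming a POSIX sh that supports read -r; the input line is a made-up example. Setting IFS to the empty string for the duration of the read turns the splitting off:

```sh
#!/bin/sh
# Hypothetical input line with leading and trailing blanks.
input='  padded  '

# Default IFS: read strips leading and trailing IFS whitespace.
trimmed=$(printf '%s\n' "$input" | { read -r line; printf '[%s]' "$line"; })

# Empty IFS: no field splitting is applied, so the blanks survive.
kept=$(printf '%s\n' "$input" | { IFS= read -r line; printf '[%s]' "$line"; })

echo "default IFS: $trimmed"
echo "empty IFS:   $kept"
```

With the default IFS the line comes back as [padded]; with the empty IFS the surrounding blanks are kept.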

A correct version of your test follows:

#!/bin/sh -x

for n in `jot 256 0`
do
  c="`jot -c 1 $n`"
  echo "$c" | wc -c | grep -v 2 && echo "$n"
done

which correctly produces the following output:

       1
0
       1
10

because a shell variable cannot contain a null character (which is
a string end marker), and backquote expansion is defined to remove
trailing newlines.
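
The newline half of this can be illustrated as follows. It is a sketch only; the sentinel character "x" is an arbitrary choice, and the null character case is left out because a variable cannot hold it at all.

```sh
#!/bin/sh
# Command substitution drops all trailing newlines but keeps inner ones.
nl=$(printf 'a\nb\n\n\n')
nl_len=$(printf '%s' "$nl" | wc -c | tr -d ' ')

# Common workaround: emit a sentinel after the data, then strip it,
# so the newline is no longer "trailing" during the substitution.
kept=$(printf 'a\n'; printf x)
kept=${kept%x}
kept_len=$(printf '%s' "$kept" | wc -c | tr -d ' ')

echo "without sentinel: $nl_len bytes"
echo "with sentinel:    $kept_len bytes"
```

The first value ends up 3 bytes long ("a", newline, "b"); the second keeps its final newline and is 2 bytes long.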

This is legal and expected behaviour, not a bug.

Thomas.

-- 
    Thomas.Quinot@Cuivre.FR.EU.ORG
Comment 2 Eugene Grosbein 2001-11-06 18:00:20 UTC
On Tue, Nov 06, 2001 at 06:18:34PM +0100, Thomas Quinot wrote:

> > #!/bin/sh
> > string=`printf "\21"`
> > echo $string | hd
>  
> > Replace 21 with 201 and rerun. You see:
> > 00000000  0a                                                |.|
> > 00000001
> 
> Can't reproduce here for the value \201, but for the other values
> you mention it looks like perfectly normal and expected behaviour
> from sh(1). It is not surprising at all that some characters "disappear"
> here: since $string appears unquoted, any character which is whitespace
> w.r.t. shell parsing rules won't be passed to echo.
> Try to quote your string:
>   echo "$string" | hd

I still get unexpected results:

#!/bin/sh
string=`printf "\210"`
echo "$string" | hd

gives me:
00000000  0a                                                |.|
00000001

The same happens with \12 and \201. Other codes are OK; thank you for the
explanation. I see that \12 is removed by the backquotes, but I wonder what
happens to \201 and \210.

Eugene Grosbein
Comment 3 Thomas Quinot 2001-11-06 19:51:13 UTC
On 2001-11-06, Eugene Grosbein wrote:

> I still get unexpected results:

You are absolutely right. My tests succeeded because I tried your
script on -CURRENT, where this bug was fixed a few weeks ago.
The fix to -STABLE was MFC'd last week:

Revision 1.31.2.3
Branch: RELENG_4

MFC: BASESYNTAX, DQSYNTAX, SQSYNTAX and ARISYNTAX handles negative
indexes.
     Allow those to be used to properly quote characters in the shell
     control character range.

PR:		31627

so updating your /bin/sh with the latest -STABLE version should resolve
your problem.

Thomas.

-- 
    Thomas.Quinot@Cuivre.FR.EU.ORG
Comment 4 dwmalone freebsd_committer freebsd_triage 2001-11-06 19:53:51 UTC
State Changed
From-To: open->closed

Fixed by tegge in -current and RELENG_4.
Comment 5 Eugene Grosbein 2001-11-07 04:22:02 UTC
On Tue, Nov 06, 2001 at 08:51:13PM +0100, Thomas Quinot wrote:

> > I still get unexpected results:
> You are absolutely right. My tests succeeded because I tried your
> script on -CURRENT, where this bug was fixed a few weeks ago.
> The fix to -STABLE was MFC'd last week:
> 
> Revision 1.31.2.3
> Branch: RELENG_4
> 
> MFC: BASESYNTAX, DQSYNTAX, SQSYNTAX and ARISYNTAX handles negative
> indexes.
>      Allow those to be used to properly quote characters in the shell
>      control character range.
> 
> PR:		31627
> 
> so updating your /bin/sh with the latest -STABLE version should resolve
> your problem.

I've updated to -STABLE and this now works as expected.
Thank you very much. The PR can be closed now.
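
For anyone who wants to check their own /bin/sh, here is a rough round-trip test. It is a sketch written for this report, not part of the committed fix, and it uses printf octal escapes instead of jot so that it does not depend on BSD tools. Bytes 0 and 10 are skipped, since losing them is expected behaviour.

```sh
#!/bin/sh
# Collect every byte value 1..255 (except newline) that does not survive
# a round trip through command substitution and a quoted expansion.
lost=''
n=1
while [ "$n" -le 255 ]; do
  if [ "$n" -ne 10 ]; then
    oct=$(printf '%03o' "$n")       # byte value as three octal digits
    c=$(printf "\\$oct")            # produce the byte via an octal escape
    len=$(printf '%s' "$c" | wc -c | tr -d ' ')
    [ "$len" -eq 1 ] || lost="$lost $n"
  fi
  n=$((n + 1))
done
echo "lost bytes:${lost:- none}"
```

On a shell with the fix, the list of lost bytes should be empty; on an affected shell, values such as 129 (\201) and 136 (\210) would be expected to show up.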

Eugene Grosbein