Bug 118723 - [patch] od(1)/hexdump(1) truncates last partial repeat line
Summary: [patch] od(1)/hexdump(1) truncates last partial repeat line
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: unspecified
Hardware: Any Any
: Normal Affects Only Me
Assignee: freebsd-bugs mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-12-15 18:20 UTC by Mark Adler
Modified: 2017-12-31 22:29 UTC (History)
0 users

See Also:


Attachments
file.diff (986 bytes, patch)
2007-12-15 18:20 UTC, Mark Adler
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Adler 2007-12-15 18:20:01 UTC
od and hexdump truncates the last line of the dump when that line
partially repeats the previous line.

Fix: Patch relative to latest version (1.22):
How-To-Repeat: Create this 40-byte file (including five newline characters):

abcdefg 
0123456 
abcdefg 
0123456 
abcdefg 

(Just the first three lines, 24 bytes, shows the same problem, but I
included five lines to make the structure more evident.)

od on that file incorrectly gives:
0000000    060542  061544  062546  063412  030061  031063  032065 033012
*
0000040

which leaves off the last line "abcdefg\n", and incorrectly shows the
final address as 40 octal (32 decimal).

Similarly, hexdump gives the following with the same issues:

0000000 6162 6364 6566 670a 3031 3233 3435 360a
*
0000020

With the patch below, od then correctly displays:

0000000    060542  061544  062546  063412  030061  031063  032065 033012
*
0000040    060542  061544  062546 063412
0000050

and hexdump then correctly displays:

0000000 6162 6364 6566 670a 3031 3233 3435 360a
*
0000020 6162 6364 6566 670a
0000028
Comment 1 Gavin Atkinson freebsd_committer freebsd_triage 2008-06-23 18:52:43 UTC
Responsible Changed
From-To: freebsd-bugs->gcooper

gcooper is looking into this
Comment 2 Enji Cooper freebsd_committer 2008-12-27 09:41:12 UTC
Can't reproduce on 7.1rc2 / amd64 nor OSX 10.5.5 / x86. May be ppc or  
an endiam specific issue.

Will verify on a G5 with OSX 10.4.11 when I get back to SJC.

-Garrett
Comment 3 Mark Adler 2008-12-27 17:40:46 UTC
On Dec 27, 2008, at 1:41 AM, Garrett Cooper wrote:
> Can't reproduce on 7.1rc2 / amd64 nor OSX 10.5.5 / x86. May be ppc  
> or an endiam specific issue.
> Will verify on a G5 with OSX 10.4.11 when I get back to SJC.

Garrett,

I see the bug on my x86 with OS X 10.5.6.  See below.  (The version of  
od in my ~/bin directory has my patch.)  It is not a PPC-unique or  
endian issue.

Make sure that your test file is exactly as below, with no trailing  
blanks on the lines and only a single newline character per line.  To  
see the bug, the file must repeat every 16 bytes, and terminate in the  
middle of the last 16 bytes.  The bug is that the last partial repeat  
is not displayed.

Mark


gromit% cat xx
abcdefg
0123456
abcdefg
0123456
abcdefg
gromit% /usr/bin/od xx
0000000    061141  062143  063145  005147  030460  031462  032464   
005066
*
0000040
gromit% ~/bin/od xx
0000000    061141  062143  063145  005147  030460  031462  032464   
005066
*
0000040    061141  062143  063145  005147
0000050
gromit% /usr/bin/od -x xx
0000000      6261    6463    6665    0a67    3130    3332    3534     
0a36
*
0000040
gromit% ~/bin/od -x xx
0000000      6261    6463    6665    0a67    3130    3332    3534     
0a36
*
0000040      6261    6463    6665    0a67
0000050
gromit% uname -a
Darwin gromit 9.6.0 Darwin Kernel Version 9.6.0: Mon Nov 24 17:37:00  
PST 2008; root:xnu-1228.9.59~1/RELEASE_I386 i386
Comment 4 Garrett Cooper 2010-02-08 04:25:10 UTC
Ok... I've gotten back to this bug and here are my comments on the
issue at hand:

1. The item you're seeing is a valid *cosmetic* bug -- the address
isn't being incremented properly, so it gives you the impression that
the data being represented is invalid. See the following for a valid
representation of the text file with %_u:

[root@optimus /scratch/src/head/usr.bin/hexdump]# hexdump -e '"%_u\n"'
-v test_118723.txt
a
b
c
d
e
f
g
lf
0
1
2
3
4
5
6
lf
a
b
c
d
e
f
g
lf
0
1
2
3
4
5
6
lf
a
b
c
d
e
f
g
lf

versus the non- -v output:

[root@optimus /scratch/src/head/usr.bin/hexdump]# hexdump -e '"%_u\n"'
test_118723.txt
a
b
c
d
e
f
g
lf
0
1
2
3
4
5
6
lf
a
b
c
d
e
f
g
lf
0
1
2
3
4
5
6
lf
a
b
c
d
e
f
g
lf

Please note that the data represented in both forms is equivalent.
2. The patch you provide breaks -v output, which isn't desirable...

Here's what should be done to fix the cosmetic issue:

Index: display.c
===================================================================
--- display.c	(revision 203332)
+++ display.c	(working copy)
@@ -264,18 +264,20 @@
 			if (need == blocksize)
 				return((u_char *)NULL);
 			/*
-			 * XXX bcmp() is not quite right in the presence
-			 * of multibyte characters.
+			 * We need to explicitly bump the address displayed even
+			 * if -v isn't specified, or the address doesn't
+			 * increment with repeated output...
+			 * Please see bin/118723 for more details.
 			 */
+			eaddress = address + nread;
 			if (vflag != ALL &&
 			    valid_save &&
-			    bcmp(curp, savp, nread) == 0) {
+			    memcmp(curp, savp, nread) == 0) {
 				if (vflag != DUP)
 					(void)printf("*\n");
 				return((u_char *)NULL);
 			}
 			bzero((char *)curp + nread, need);
-			eaddress = address + nread;
 			return(curp);
 		}
 		n = fread((char *)curp + nread, sizeof(u_char),
@@ -290,13 +292,9 @@
 		if (length != -1)
 			length -= n;
 		if (!(need -= n)) {
-			/*
-			 * XXX bcmp() is not quite right in the presence
-			 * of multibyte characters.
-			 */
 			if (vflag == ALL || vflag == FIRST ||
 			    valid_save == 0 ||
-			    bcmp(curp, savp, blocksize) != 0) {
+			    memcmp(curp, savp, blocksize) != 0) {
 				if (vflag == DUP || vflag == FIRST)
 					vflag = WAIT;
 				return(curp);

Please let me know if you agree to the change.

Thanks,
-Garrett
Comment 5 Mark Adler 2010-02-08 05:29:51 UTC
Garrett,

Thanks for looking into the bug.  I had to fire up the way-back machine =
to remember what you were talking about!

On Feb 7, 2010, at 8:25 PM, Garrett Cooper wrote:
> 1. The item you're seeing is a valid *cosmetic* bug --

I don't know what you mean by "cosmetic", but without the patch you can =
have two different input files of different lengths producing exactly =
the same output.  I'd call that simply a bug.

> Please let me know if you agree to the change.

Yes, I agree to the change.  Moving the "eaddress =3D address + nread;" =
before the "if" is the salient portion of the patch and does the trick.  =
The final address now correctly indicates the length of the input, which =
allows the original input to be reconstructed from the dump.

I don't actually have your current source (I noticed some differences in =
the unchanged portion of the diff from the version I have).  However I =
suspect that those changes would not affect this behavior.

Mark
Comment 6 Garrett Cooper 2010-02-08 06:33:47 UTC
On Sun, Feb 7, 2010 at 9:29 PM, Mark Adler <madler@alumni.caltech.edu> wrot=
e:
> Garrett,
>
> Thanks for looking into the bug. =A0I had to fire up the way-back machine=
 to remember what you were talking about!

:)...

> On Feb 7, 2010, at 8:25 PM, Garrett Cooper wrote:
>> 1. The item you're seeing is a valid *cosmetic* bug --
>
> I don't know what you mean by "cosmetic", but without the patch you can h=
ave two different input files of different lengths producing exactly the sa=
me output. =A0I'd call that simply a bug.

Sorry it's a Cisco-ism -- cosmetic bugs are bugs which are
functionally correct, but the data is misrepresented in some manner.
This might be considered a `minor' bug as well.

>> Please let me know if you agree to the change.
>
> Yes, I agree to the change. =A0Moving the "eaddress =3D address + nread;"=
 before the "if" is the salient portion of the patch and does the trick. =
=A0The final address now correctly indicates the length of the input, which=
 allows the original input to be reconstructed from the dump.

Brilliant -- that's what I figured it was after thinking through the
process a bit more thoroughly.

> I don't actually have your current source (I noticed some differences in =
the unchanged portion of the diff from the version I have). =A0However I su=
spect that those changes would not affect this behavior.

Right. I have a few items in the pipeline that I'm working on for this
utility that I'm trying to close out in one commit. The other items in
this file are benign as they affect different pieces of hexdump.

Thanks for checking and I'll add a regression test for this as well in
the final commit :).

Cheers!
-Garrett
Comment 7 Chris Rees freebsd_committer 2012-07-02 21:45:22 UTC
Responsible Changed
From-To: gcooper->freebsd-bugs

gcooper is not a developer
Comment 8 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 08:00:52 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped