| Summary: | [patch] od(1)/hexdump(1) truncates last partial repeat line | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Base System | Reporter: | Mark Adler <madler> | ||||
| Component: | bin | Assignee: | Xin LI <delphij> | ||||
| Status: | Closed FIXED | ||||||
| Severity: | Affects Only Me | CC: | delphij | ||||
| Priority: | Normal | Keywords: | patch | ||||
| Version: | Unspecified | ||||||
| Hardware: | Any | ||||||
| OS: | Any | ||||||
| URL: | https://reviews.freebsd.org/D40471 | ||||||
| Attachments: |
|
||||||
|
Description
Mark Adler
2007-12-15 18:20:01 UTC
Responsible Changed From-To: freebsd-bugs->gcooper gcooper is looking into this Can't reproduce on 7.1rc2 / amd64 nor OSX 10.5.5 / x86. May be ppc or an endiam specific issue. Will verify on a G5 with OSX 10.4.11 when I get back to SJC. -Garrett On Dec 27, 2008, at 1:41 AM, Garrett Cooper wrote:
> Can't reproduce on 7.1rc2 / amd64 nor OSX 10.5.5 / x86. May be ppc
> or an endiam specific issue.
> Will verify on a G5 with OSX 10.4.11 when I get back to SJC.
Garrett,
I see the bug on my x86 with OS X 10.5.6. See below. (The version of
od in my ~/bin directory has my patch.) It is not a PPC-unique or
endian issue.
Make sure that your test file is exactly as below, with no trailing
blanks on the lines and only a single newline character per line. To
see the bug, the file must repeat every 16 bytes, and terminate in the
middle of the last 16 bytes. The bug is that the last partial repeat
is not displayed.
Mark
gromit% cat xx
abcdefg
0123456
abcdefg
0123456
abcdefg
gromit% /usr/bin/od xx
0000000 061141 062143 063145 005147 030460 031462 032464
005066
*
0000040
gromit% ~/bin/od xx
0000000 061141 062143 063145 005147 030460 031462 032464
005066
*
0000040 061141 062143 063145 005147
0000050
gromit% /usr/bin/od -x xx
0000000 6261 6463 6665 0a67 3130 3332 3534
0a36
*
0000040
gromit% ~/bin/od -x xx
0000000 6261 6463 6665 0a67 3130 3332 3534
0a36
*
0000040 6261 6463 6665 0a67
0000050
gromit% uname -a
Darwin gromit 9.6.0 Darwin Kernel Version 9.6.0: Mon Nov 24 17:37:00
PST 2008; root:xnu-1228.9.59~1/RELEASE_I386 i386
Ok... I've gotten back to this bug and here are my comments on the
issue at hand:
1. The item you're seeing is a valid *cosmetic* bug -- the address
isn't being incremented properly, so it gives you the impression that
the data being represented is invalid. See the following for a valid
representation of the text file with %_u:
[root@optimus /scratch/src/head/usr.bin/hexdump]# hexdump -e '"%_u\n"'
-v test_118723.txt
a
b
c
d
e
f
g
lf
0
1
2
3
4
5
6
lf
a
b
c
d
e
f
g
lf
0
1
2
3
4
5
6
lf
a
b
c
d
e
f
g
lf
versus the non- -v output:
[root@optimus /scratch/src/head/usr.bin/hexdump]# hexdump -e '"%_u\n"'
test_118723.txt
a
b
c
d
e
f
g
lf
0
1
2
3
4
5
6
lf
a
b
c
d
e
f
g
lf
0
1
2
3
4
5
6
lf
a
b
c
d
e
f
g
lf
Please note that the data represented in both forms is equivalent.
2. The patch you provide breaks -v output, which isn't desirable...
Here's what should be done to fix the cosmetic issue:
Index: display.c
===================================================================
--- display.c (revision 203332)
+++ display.c (working copy)
@@ -264,18 +264,20 @@
if (need == blocksize)
return((u_char *)NULL);
/*
- * XXX bcmp() is not quite right in the presence
- * of multibyte characters.
+ * We need to explicitly bump the address displayed even
+ * if -v isn't specified, or the address doesn't
+ * increment with repeated output...
+ * Please see bin/118723 for more details.
*/
+ eaddress = address + nread;
if (vflag != ALL &&
valid_save &&
- bcmp(curp, savp, nread) == 0) {
+ memcmp(curp, savp, nread) == 0) {
if (vflag != DUP)
(void)printf("*\n");
return((u_char *)NULL);
}
bzero((char *)curp + nread, need);
- eaddress = address + nread;
return(curp);
}
n = fread((char *)curp + nread, sizeof(u_char),
@@ -290,13 +292,9 @@
if (length != -1)
length -= n;
if (!(need -= n)) {
- /*
- * XXX bcmp() is not quite right in the presence
- * of multibyte characters.
- */
if (vflag == ALL || vflag == FIRST ||
valid_save == 0 ||
- bcmp(curp, savp, blocksize) != 0) {
+ memcmp(curp, savp, blocksize) != 0) {
if (vflag == DUP || vflag == FIRST)
vflag = WAIT;
return(curp);
Please let me know if you agree to the change.
Thanks,
-Garrett
Garrett, Thanks for looking into the bug. I had to fire up the way-back machine = to remember what you were talking about! On Feb 7, 2010, at 8:25 PM, Garrett Cooper wrote: > 1. The item you're seeing is a valid *cosmetic* bug -- I don't know what you mean by "cosmetic", but without the patch you can = have two different input files of different lengths producing exactly = the same output. I'd call that simply a bug. > Please let me know if you agree to the change. Yes, I agree to the change. Moving the "eaddress =3D address + nread;" = before the "if" is the salient portion of the patch and does the trick. = The final address now correctly indicates the length of the input, which = allows the original input to be reconstructed from the dump. I don't actually have your current source (I noticed some differences in = the unchanged portion of the diff from the version I have). However I = suspect that those changes would not affect this behavior. Mark On Sun, Feb 7, 2010 at 9:29 PM, Mark Adler <madler@alumni.caltech.edu> wrot= e: > Garrett, > > Thanks for looking into the bug. =A0I had to fire up the way-back machine= to remember what you were talking about! :)... > On Feb 7, 2010, at 8:25 PM, Garrett Cooper wrote: >> 1. The item you're seeing is a valid *cosmetic* bug -- > > I don't know what you mean by "cosmetic", but without the patch you can h= ave two different input files of different lengths producing exactly the sa= me output. =A0I'd call that simply a bug. Sorry it's a Cisco-ism -- cosmetic bugs are bugs which are functionally correct, but the data is misrepresented in some manner. This might be considered a `minor' bug as well. >> Please let me know if you agree to the change. > > Yes, I agree to the change. =A0Moving the "eaddress =3D address + nread;"= before the "if" is the salient portion of the patch and does the trick. = =A0The final address now correctly indicates the length of the input, which= allows the original input to be reconstructed from the dump. Brilliant -- that's what I figured it was after thinking through the process a bit more thoroughly. > I don't actually have your current source (I noticed some differences in = the unchanged portion of the diff from the version I have). =A0However I su= spect that those changes would not affect this behavior. Right. I have a few items in the pipeline that I'm working on for this utility that I'm trying to close out in one commit. The other items in this file are benign as they affect different pieces of hexdump. Thanks for checking and I'll add a regression test for this as well in the final commit :). Cheers! -Garrett Responsible Changed From-To: gcooper->freebsd-bugs gcooper is not a developer For bugs matching the following criteria: Status: In Progress Changed: (is less than) 2014-06-01 Reset to default assignee and clear in-progress tags. Mail being skipped Keyword:
patch
or patch-ready
– in lieu of summary line prefix:
[patch]
* bulk change for the keyword
* summary lines may be edited manually (not in bulk).
Keyword descriptions and search interface:
<https://bugs.freebsd.org/bugzilla/describekeywords.cgi>
I think the partial line should never be considered as a match of earlier line. So it seems that we should just check if "need" is a non-zero value:
diff --git a/usr.bin/hexdump/display.c b/usr.bin/hexdump/display.c
index 36306ededfc6..0bf1f1878eda 100644
--- a/usr.bin/hexdump/display.c
+++ b/usr.bin/hexdump/display.c
@@ -271,7 +271,7 @@ get(void)
* XXX bcmp() is not quite right in the presence
* of multibyte characters.
*/
- if (vflag != ALL &&
+ if (!need && vflag != ALL &&
valid_save &&
bcmp(curp, savp, nread) == 0) {
if (vflag != DUP) {
I've also created some test cases to verify the behavior for hexdump -C and hexdump -Cv.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=d1016568185ddacbf1148acbe26b27258981b4f0 commit d1016568185ddacbf1148acbe26b27258981b4f0 Author: Xin LI <delphij@FreeBSD.org> AuthorDate: 2023-06-09 01:38:47 +0000 Commit: Xin LI <delphij@FreeBSD.org> CommitDate: 2023-06-09 01:39:05 +0000 hexdump: Partial lines cannot be repetitions of earlier lines. When checking for repetitions of earlier lines, we compare the first nread bytes of the line against the saved line. However, when we read a partial line, it should never be treated as a repetition of an earlier line, even if the first bytes match. This change fixes a bug where a partial line could be incorrectly identified as a repetition of an earlier line. Reported-by: Mark Adler <madler@alumni.caltech.edu> PR: 118723 Reviewed-by: emaste MFC-after: 2 weeks Differential Revision: https://reviews.freebsd.org/D40471 usr.bin/hexdump/display.c | 2 +- usr.bin/hexdump/tests/Makefile | 3 +++ usr.bin/hexdump/tests/d_hexdump_UCflag_bug118723.out (new) | 4 ++++ usr.bin/hexdump/tests/d_hexdump_UCvflag_bug118723.out (new) | 4 ++++ usr.bin/hexdump/tests/d_hexdump_bug118723.in (new) | 5 +++++ usr.bin/hexdump/tests/hexdump_test.sh | 8 ++++++++ 6 files changed, 25 insertions(+), 1 deletion(-) |