Bug 46787

Summary: compress(1) manpage missing BUGS; other cleanup.
Product: Documentation Reporter: Gary W. Swearingen <swear>
Component: Books & ArticlesAssignee: Gary W. Swearingen <garys>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Latest   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
file.diff none

Description Gary W. Swearingen 2003-01-05 18:20:04 UTC
The compress(1) manpage has several problems:

1) It doesn't explain that the command will delete a file under certain
circumstances (see BUGS in patch).

2) It doesn't mention that ".Z" can be assumed by "uncompress" and
prevent compression by "compress".

3) The list of options lists only flags. (One flag is missing its argument).

4) The -b description doesn't give the valid values or default.

5) Misc. other poor organization, missing helpful info, etc.

================

Fix: The bug should be fixed in code, but I don't want to do it now.
The patch fixes the manpage until the code gets fixed.
I'll write a no-patch PR on the code after this PR is closed 
with the code still unfixed.

Patch $FreeBSD: src/usr.bin/compress/compress.1,v 1.17 2002/05/29 18:12:21 ru Exp $
How-To-Repeat: n/a
================
Comment 1 Giorgos Keramidas freebsd_committer freebsd_triage 2003-01-05 22:50:20 UTC
On 2003-01-05 10:18, "Gary W. Swearingen" <swear@attbi.com> wrote:
> --- /tmp/compress..orig.1	Sun Jan  5 06:33:35 2003
> +++ /tmp/compress.1	Sun Jan  5 10:02:14 2003
> @@ -54,17 +54,17 @@
>  .Sh DESCRIPTION
>  The
>  .Nm
> -utility reduces the size of the named files using adaptive Lempel-Ziv coding.
> +utility attempts to reduce the size of the 
> +named files using adaptive Lempel-Ziv coding.

This is not ok.  It breaks a line that is not over 80 characters, and
introduces whitespace at eol (first added line).

>  Each
>  .Ar file
>  is renamed to the same name plus the extension
>  .Dq .Z .
> -As many of the modification time, access time, file flags, file mode,
> -user ID, and group ID as allowed by permissions are retained in the
> -new file.

Does this really need to be moved further below?  I think it was put
where it is not because the first paragraph describes the change to
the file name and the author refers to the rest of the file attributes
to keep them close to the file -> file.Z change.

> -If compression would not reduce the size of a
> +If attempted compression would not reduce the size of a

Well, "attempted" sounds just a tiny bit too verbose here.  I think
it's more or less implied.

>  .Ar file ,
> -the file is ignored.
> +or if the filename ends with
> +.Dq .Z ,
> +the file is not compressed.  Exceptions are discussed below.

"already ends with .Z" perhaps?  I think this is a good & useful change.

>  .Pp
>  The
>  .Nm uncompress
> @@ -72,38 +72,53 @@
>  files by deleting the
>  .Dq .Z
>  extension.
> +If a specified filename doesn't end with
> +.Dq .Z ,
> +one will be assumed unless a file with the extended filename does not
> +exist, in which case the named file will be removed (because of a bug).

I am not sure I can understand the sentence above at all :(

>  .Pp
>  If no files are specified or a
>  .Ar file
>  argument is a single dash
>  .Pq Sq Fl ,
>  the standard input is compressed or uncompressed to the standard output.
> -If either the input and output files are not regular files, the checks for
> +If either the input and output are not regular files, the checks for

I think that removing "files" above is ok, but it probably sounds
better without "the" when "files" is gone:

	+If either input and output are not regular files, ...

>  reduction in size and file overwriting are not performed, the input file is
> -not removed, and the attributes of the input file are not retained.
> +not removed, and the attributes of the input file
> +are not retained in the output.

I'm not sure I like this for some reasons.  It wraps the line to a
very short length.  Content changes shouldn't be mixed with whitespace
changes if at all possible.  The addition of "in the output" is ok
though and I like that part of the change.

>  .It Fl c
>  Compressed or uncompressed output is written to the standard output.
> -No files are modified.
> +Compression is attempted even if the results will be larger than the
> +original.
> +No files are modified.  The

The line containing "No files are modified." is unnecessarily touched
in this part of the diff.  The new sentence should start on a new
line, and "The" be added on a line of its own.

> +Files are overwritten without prompting for confirmation.
> +For compression, proceed even if the
> +results will be larger than the original.

New text should be wrapped at about 70-80 characters if possible.
Wrapping at too short line lengths isn't very nice either, though.

>  .It Fl v
> -Print the percentage reduction of each file.
> +The size reduction of each file is printed as a percentage to
> +the standard output.  Ignored for uncompressiong or if the
> +.Fl c
> +flag is also used.

Sentences should start on a new line.
Also s/uncompressiong/uncompressing/ above.

> @@ -187,3 +195,9 @@
>  .Nm
>  command appeared in
>  .Bx 4.3 .
> +.Sh BUGS
> +For uncompression, if a specified filename doesn't end with
> +.Dq .Z
> +and a file with the filename extended with
> +.Dq .Z 
> +does not exist, the named file will be removed.

Hmmm, not very clear.
Let's ignore the whitespace at eol for a while....

Can you elaborate a bit, perhaps provide an example?
How can I reproduce this?

- Giorgos
Comment 2 Gary W. Swearingen 2003-01-06 02:33:37 UTC
Giorgos Keramidas <keramida@freebsd.org> writes:

> On 2003-01-05 10:18, "Gary W. Swearingen" <swear@attbi.com> wrote:
> > --- /tmp/compress..orig.1	Sun Jan  5 06:33:35 2003
> > +++ /tmp/compress.1	Sun Jan  5 10:02:14 2003
> > @@ -54,17 +54,17 @@
> >  .Sh DESCRIPTION
> >  The
> >  .Nm
> > -utility reduces the size of the named files using adaptive Lempel-Ziv coding.
> > +utility attempts to reduce the size of the 
> > +named files using adaptive Lempel-Ziv coding.
> 
> This is not ok.  It breaks a line that is not over 80 characters, and
> introduces whitespace at eol (first added line).

The line changed ("attempts to reduce") making it over 80 car; so I just
broke it in the new middle to make equal room in each for later changes.

I'll need to learn to check for trailing spaces, I guess.  I'll fix.

> >  Each
> >  .Ar file
> >  is renamed to the same name plus the extension
> >  .Dq .Z .
> > -As many of the modification time, access time, file flags, file mode,
> > -user ID, and group ID as allowed by permissions are retained in the
> > -new file.
> 
> Does this really need to be moved further below?  I think it was put
> where it is not because the first paragraph describes the change to
> the file name and the author refers to the rest of the file attributes
> to keep them close to the file -> file.Z change.

I don't fully understand that, but I moved it out of the "compress"
paragraph to a general paragragh because it affects "uncompress" too.

> > -If compression would not reduce the size of a
> > +If attempted compression would not reduce the size of a
> 
> Well, "attempted" sounds just a tiny bit too verbose here.  I think
> it's more or less implied.

Quite right.  Very much implied.  I'll fix.

> >  .Ar file ,
> > -the file is ignored.
> > +or if the filename ends with
> > +.Dq .Z ,
> > +the file is not compressed.  Exceptions are discussed below.
> 
> "already ends with .Z" perhaps?  I think this is a good & useful change.

Sure, OK.  Verbose, but to good effect.  I'll fix.

> >  .Pp
> >  The
> >  .Nm uncompress
> > @@ -72,38 +72,53 @@
> >  files by deleting the
> >  .Dq .Z
> >  extension.
> > +If a specified filename doesn't end with
> > +.Dq .Z ,
> > +one will be assumed unless a file with the extended filename does not
> > +exist, in which case the named file will be removed (because of a bug).
> 
> I am not sure I can understand the sentence above at all :(

Whoops.  "If a [command-line-]specified filename doesn't end
with a ".Z", one will be assumed [(by the command) to exist at the end
of the actual file's filename,] unless a file with the extended filename
[eg, xxx.Z] does not exist [on the disk], in which case the named file
[on the disk without any ".Z" at the end of its filename] will be
removed [as in gone, vanished, where'd my precious file go?].  :)

How's this:

If a specified filename does not end with ".Z", "compress" will assume
that the filename ends with ".Z".  See BUGS section.

> >  .Pp
> >  If no files are specified or a
> >  .Ar file
> >  argument is a single dash
> >  .Pq Sq Fl ,
> >  the standard input is compressed or uncompressed to the standard output.
> > -If either the input and output files are not regular files, the checks for
> > +If either the input and output are not regular files, the checks for
> 
> I think that removing "files" above is ok, but it probably sounds
> better without "the" when "files" is gone:
> 
> 	+If either input and output are not regular files, ...

Well, I thought about replacing "files" with "streams", but I'm not sure
everyone used "streams" that way.  It sounds quite wrong to me as you
have it, but mostly for other reasons.  I think it was meant to be
    +If either or both of the input and output files are not regular files,
but I like
    +If the input and/or output is not a regular file,
but losts of people dislike "and/or" so I propose this common cheat:
    +If the input or output is not a regular file,
(because the "and" case is likely to be assumed).  If you want to leave
out "the" upon second thought, I won't object; it'll bother few.
Without the "the" it might help to add "from":
    +If input or output is not from a regular file,
    
> >  reduction in size and file overwriting are not performed, the input file is
> > -not removed, and the attributes of the input file are not retained.
> > +not removed, and the attributes of the input file
> > +are not retained in the output.
> 
> I'm not sure I like this for some reasons.  It wraps the line to a
> very short length.  Content changes shouldn't be mixed with whitespace
> changes if at all possible.  The addition of "in the output" is ok
> though and I like that part of the change.

If I'd noticed it, I would have made the two "+" lines of equal length.
It seems to me that once you've made a change to a line, you've forced
the reviewers to examine that line in full.  The damage has been done.
Now you ought to prepare for the next change so it is less likely to
involve two lines because the old line is too long, etc., so you do two
things:  If the changed line is longer than 80, you break it into two
lines of about equal length to provide equal room for changes.  If the
line has parts of two sentences or significant phrases, you break
between those so a change to one won't force reviewers to review both.

Whatcha think?  Maybe the second step is wrong; there's a lot of text
with periods in the middle of lines already.

> >  .It Fl c
> >  Compressed or uncompressed output is written to the standard output.
> > -No files are modified.
> > +Compression is attempted even if the results will be larger than the
> > +original.
> > +No files are modified.  The
> 
> The line containing "No files are modified." is unnecessarily touched
> in this part of the diff.  The new sentence should start on a new
> line, and "The" be added on a line of its own.

Hmmm.  Right you are.  Lost track of some back-and-forth editing.
I'll fix; not that it really needs it, but as punishment.

> > +Files are overwritten without prompting for confirmation.
> > +For compression, proceed even if the
> > +results will be larger than the original.
> 
> New text should be wrapped at about 70-80 characters if possible.
> Wrapping at too short line lengths isn't very nice either, though.

I like to start new sentences on new lines (if it doesn't cause an
untouched line to be touched, of course; but I think added lines are
OK, no?).  The second sentence is 78 characters (and was likely longer
at some point in my editing).
 
> >  .It Fl v
> > -Print the percentage reduction of each file.
> > +The size reduction of each file is printed as a percentage to
> > +the standard output.  Ignored for uncompressiong or if the
> > +.Fl c
> > +flag is also used.
> 
> Sentences should start on a new line.
> Also s/uncompressiong/uncompressing/ above.

I'll fix both.

> > @@ -187,3 +195,9 @@
> >  .Nm
> >  command appeared in
> >  .Bx 4.3 .
> > +.Sh BUGS
> > +For uncompression, if a specified filename doesn't end with
> > +.Dq .Z
> > +and a file with the filename extended with
> > +.Dq .Z 
> > +does not exist, the named file will be removed.
> 
> Hmmm, not very clear.
> Let's ignore the whitespace at eol for a while....
> 
> Can you elaborate a bit, perhaps provide an example?
> How can I reproduce this?

Hmmm.  I've since discovered another case to slip in.  Try this rewrite:

    If "uncompress" is given a filename which doesn't end with ".Z" and
    a file of that name exists and there is no uncompressable file with
    the same name plus a trailing ".Z", then the named file will be
    removed.

For easiest example is to copy a plain text file to "xxx" and do
"ls xxx*; uncompress -f xxx"; ls xxx*.  Gone.  Try it again and also
copy a different size file to "xxx.Z".  Gone.  Try it once more but
ensure that "xxx.Z" is a real compressed file.  Not gone, just
overwritten, as the manpage has always said.

Thanks for your comments.
Comment 3 Giorgos Keramidas freebsd_committer freebsd_triage 2003-01-06 03:13:51 UTC
On 2003-01-05 18:33, "Gary W. Swearingen" <swear@attbi.com> wrote:
> Giorgos Keramidas <keramida@freebsd.org> writes:
> > > @@ -72,38 +72,53 @@
> > >  files by deleting the
> > >  .Dq .Z
> > >  extension.
> > > +If a specified filename doesn't end with
> > > +.Dq .Z ,
> > > +one will be assumed unless a file with the extended filename does not
> > > +exist, in which case the named file will be removed (because of a bug).
> >
> > I am not sure I can understand the sentence above at all :(
>
> Whoops.  "If a [command-line-]specified filename doesn't end
> with a ".Z", one will be assumed [(by the command) to exist at the end
> of the actual file's filename,] unless a file with the extended filename
> [eg, xxx.Z] does not exist [on the disk], in which case the named file
> [on the disk without any ".Z" at the end of its filename] will be
> removed [as in gone, vanished, where'd my precious file go?].  :)
>
> How's this:
>
> If a specified filename does not end with ".Z", "compress" will assume
> that the filename ends with ".Z".  See BUGS section.

Ah, much better!  Now I can reproduce the bug, but it's not compress
that is buggy.  It's uncompress, and only if -f option is used:

: giorgos@gothmog[04:57]/home/giorgos$ cd /tmp/
: giorgos@gothmog[04:57]/tmp$ mkdir foo
: giorgos@gothmog[04:57]/tmp$ cd foo
: giorgos@gothmog[04:57]/tmp/foo$ touch bar
: giorgos@gothmog[04:57]/tmp/foo$ ls -l
: total 0
: -rw-rw-r--  1 giorgos  wheel  - 0 Jan  6 04:57 bar
: giorgos@gothmog[04:57]/tmp/foo$ compress bar
: giorgos@gothmog[04:57]/tmp/foo$ ls -l
: total 0
: -rw-rw-r--  1 giorgos  wheel  - 0 Jan  6 04:57 bar
: giorgos@gothmog[04:57]/tmp/foo$ uncompress bar
: overwrite bar? n
: giorgos@gothmog[04:57]/tmp/foo$ ls -l
: total 0
: -rw-rw-r--  1 giorgos  wheel  - 0 Jan  6 04:57 bar
: giorgos@gothmog[04:57]/tmp/foo$ uncompress -f bar
: uncompress: bar.Z: No such file or directory
: giorgos@gothmog[04:57]/tmp/foo$ ls -l
: giorgos@gothmog[04:57]/tmp/foo$

How about something like this?

    +.Sh BUGS
    +When the
    +.Fl f
    +option is used and a specified
    +.Ar file
    +filename does not end with
    +.Dq .Z
    +but matches an existing file, the
    +.Nm uncompress
    +utility will assume that a compressed file with the
    +.Dq .Z
    +extension exists, and promptly overwrite the existing file.

> > >  reduction in size and file overwriting are not performed, the input file is
> > > -not removed, and the attributes of the input file are not retained.
> > > +not removed, and the attributes of the input file
> > > +are not retained in the output.
> >
> > I'm not sure I like this for some reasons.  It wraps the line to a
> > very short length.  Content changes shouldn't be mixed with whitespace
> > changes if at all possible.  The addition of "in the output" is ok
> > though and I like that part of the change.
>
> If I'd noticed it, I would have made the two "+" lines of equal length.
> It seems to me that once you've made a change to a line, you've forced
> the reviewers to examine that line in full.  The damage has been done.
> Now you ought to prepare for the next change so it is less likely to
> involve two lines because the old line is too long, etc., so you do two
> things:  If the changed line is longer than 80, you break it into two
> lines of about equal length to provide equal room for changes.

Hmmm, this doesn't sound too bad, but I always prefer to keep the
diffs minimal and as close to the original as possible when making
small changes.  This makes it very easy on the translators to quickly
spot the precise change that takes place.

> For easiest example is to copy a plain text file to "xxx" and do
> "ls xxx*; uncompress -f xxx"; ls xxx*.  Gone.  Try it again and also
> copy a different size file to "xxx.Z".  Gone.  Try it once more but
> ensure that "xxx.Z" is a real compressed file.  Not gone, just
> overwritten, as the manpage has always said.
>
> Thanks for your comments.

So it's two bugs?  Uhm.  These are accidents waiting to happen!!!

I think I should devote some time to try and find a fix for these,
instead of documenting their existence :-(

- Giorgos
Comment 4 Gary W. Swearingen 2003-01-06 04:06:22 UTC
Giorgos Keramidas <keramida@freebsd.org> writes:

> > If a specified filename does not end with ".Z", "compress" will assume
> > that the filename ends with ".Z".  See BUGS section.
> 
> Ah, much better!  Now I can reproduce the bug, but it's not compress
> that is buggy.  It's uncompress, and only if -f option is used:

It's in the "uncompress" para; it better not be "compress".  I'll fix.

As for "-f", you answered "overwrite foo? " the wrong way.  A "y"
will delete the file instead of overwriting it (even if your "foo"
was an real uncompressable file which CAN be uncompressed with
"cat foo | uncompress >foo.bar").

> How about something like this?
...
>     +extension exists, and promptly overwrite the existing file.

it doesn't overwrite it (as it should with uncompressed data); it
deletes it!

> So it's two bugs?  Uhm.  These are accidents waiting to happen!!!

I'm not sure if it's the same bug in two guises, or what.  There's
like to be other wierdness.  Like with zero-length files, & whatnot.
 
> I think I should devote some time to try and find a fix for these,
> instead of documenting their existence :-(

I was thinking that a compressed file without ".Z" is quite rare, but I
hadn't thought of "uncompress -f *" which seems fairly likely and seems
like it could do some serious damage.  If you'd like I'll write up a
"bin" PR and we can see if anyone else wants to beat you to it.
Comment 5 Giorgos Keramidas freebsd_committer freebsd_triage 2003-01-07 06:59:09 UTC
Responsible Changed
From-To: freebsd-doc->keramida

I'm looking at ways of fixing the bugs instead of documenting them. 
Some of the rest of the cleanups in this PR are cool, so I'll take 
care of this.
Comment 6 Tom Rhodes freebsd_committer freebsd_triage 2003-06-14 14:41:49 UTC
State Changed
From-To: open->patched

I have committed a fix to compress.c.  Manual fixes will be applied 
later.  Thanks! 


Comment 7 Tom Rhodes freebsd_committer freebsd_triage 2003-06-14 14:41:49 UTC
Responsible Changed
From-To: keramida->trhodes

Over to me, I'll handle this for keramida.
Comment 8 Gary W. Swearingen freebsd_committer freebsd_triage 2005-09-04 17:49:25 UTC
State Changed
From-To: patched->analyzed

PR should never have been put in patched state. 


Comment 9 Gary W. Swearingen freebsd_committer freebsd_triage 2005-09-04 17:49:25 UTC
Responsible Changed
From-To: trhodes->garys

I'll take this stale PR, per PR Handling Guidelines.
Comment 10 Gary W. Swearingen freebsd_committer freebsd_triage 2005-09-07 19:47:09 UTC
State Changed
From-To: analyzed->patched

Patched HEAD.
Comment 11 Gary W. Swearingen freebsd_committer freebsd_triage 2005-09-17 19:09:31 UTC
State Changed
From-To: patched->closed

MFC'd to RELENG_6 and RELENG_5.