Bug 17619

Summary: pax cannot read all tar files created by tar.
Product: Base System
Reporter: Marc Olzheim <marcolz>
Component: bin
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 3.4-RELEASE   
Hardware: Any   
OS: Any   

Description Marc Olzheim 2000-03-27 14:00:01 UTC
	When trying to read a tar file created by GNU-tar or Solaris's tar,
	pax asks for another archive. It seems to misinterpret the tar-header.
	Tar files of some sizes are handled correctly; others are not.

Fix: 

A workaround for this specific example is to append one 512-byte block
of zeroes:
	prompt> { cat bar.tar ; dd if=/dev/zero bs=512 count=1 ; } | pax -v

In general it seems like a good idea to append a complete <blocksize>
block of zeroes, i.e. 10k:
	prompt> { cat bar.tar ; dd if=/dev/zero bs=10k count=1 ; } | pax -v

How-To-Repeat: 
	prompt> dd if=/dev/zero of=foo bs=1759830 count=1
	prompt> tar cf bar.tar foo
	prompt> pax -vf bar.tar
	-rw-r--r--  1 marcolz          wheel    1759830 Mar 27 14:52 foo
	pax: End of archive volume 1 reached
	pax: ustar vol 1, 1 files, 1761280 bytes read, 0 bytes written.

	ATTENTION! pax archive volume change required.
	Ready for archive volume: 2
	Input archive name or "." to quit pax.
Comment 1 Sheldon Hearn 2000-03-29 11:41:38 UTC
On Mon, 27 Mar 2000 14:57:39 +0200, marcolz@stack.nl wrote:

> 	When trying to read a tar file created by GNU-tar or Solaris's tar,
> 	pax asks for another archive. It seems to misinterpret the tar-header.
> 	Some sizes of tar's will be handled just right, others don't.

Unfortunately, pax doesn't look like one of those parts of FreeBSD that
anyone takes an active interest in.  This may not be fixed any time soon
without patches. :-(

Ciao,
Sheldon.
Comment 2 Marc Olzheim 2000-03-29 12:08:09 UTC
> On Mon, 27 Mar 2000 14:57:39 +0200, marcolz@stack.nl wrote:
> 
> > 	When trying to read a tar file created by GNU-tar or Solaris's tar,
> > 	pax asks for another archive. It seems to misinterpret the tar-header.
> > 	Some sizes of tar's will be handled just right, others don't.
> 
> Unfortunately, pax doesn't look like one of those parts of FreeBSD that
> anyone takes an active interest in.  This may not be fixed any time soon
> without patches. :-(

OK, then I'll look into it as soon as I've got the time.

Zlo
Comment 3 mellon 2000-04-02 11:13:31 UTC
On Wed, Mar 29, 2000 at 02:50:03AM -0800, Sheldon Hearn wrote:
> The following reply was made to PR bin/17619; it has been noted by GNATS.
> 
> From: Sheldon Hearn <sheldonh@uunet.co.za>
> To: marcolz@stack.nl
> Cc: FreeBSD-gnats-submit@FreeBSD.ORG
> Subject: Re: bin/17619: pax cannot read all tar files created by tar. 
> Date: Wed, 29 Mar 2000 12:41:38 +0200
> 
>  On Mon, 27 Mar 2000 14:57:39 +0200, marcolz@stack.nl wrote:
>  
>  > 	When trying to read a tar file created by GNU-tar or Solaris's tar,
>  > 	pax asks for another archive. It seems to misinterpret the tar-header.
>  > 	Some sizes of tar's will be handled just right, others don't.

I looked into this. pax thinks tar archives should end with at least
2 blocks of zero (i.e. 1024 zeroed bytes *after* the file ends). 
In the example provided, GNU tar creates only one such block, and 
pax thinks it must read another one which is not there, so it asks
for the next volume. I don't know which one of them is right. In fact,
I tried to look into GNU tar's code for creating archives, and couldn't
understand where it creates even one block and what governs its
behavior. The code there is extremely unreadable.

pax can be made happy by this:

--- tar.h.orig  Sun Apr  2 10:10:11 2000
+++ tar.h       Sun Apr  2 10:10:19 2000
@@ -44,7 +44,7 @@
 #define CHK_LEN                8               /* length of checksum field */
 #define TNMSZ          100             /* size of name field */
 #ifdef _PAX_
-#define NULLCNT                2               /* number of null blocks in trailer */
+#define NULLCNT                1               /* number of null blocks in trailer */
 #define CHK_OFFSET     148             /* start of checksum field */
 #define BLNKSUM                256L            /* sum of checksum field using ' ' */
 #endif /* _PAX_ */

This shouldn't create any adverse effects, and appears to work on
all archives created by GNU tar; however, pax will also add one
trailer block rather than two to its archives after the patch. 
The relevant code which uses this constant is in the tar_trail()
function in tar.c.
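
The trailer check described above can be sketched in C. This is only an
illustration of the idea, not pax's code: it walks an in-memory archive
image, skips each member using the octal size field at offset 124 of the
header (standard tar layout), and counts the zero blocks left at the end.
The names is_zero_block(), parse_octal(), trailer_blocks() and
wants_next_volume() are hypothetical helpers made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define TAR_BLOCK	512u	/* tar block size in bytes */
#define SIZE_OFFSET	124u	/* start of the octal size field */
#define SIZE_LEN	12u	/* length of the size field */

/* Return 1 if the TAR_BLOCK bytes at p are all zero, else 0. */
static int
is_zero_block(const unsigned char *p)
{
	size_t i;

	for (i = 0; i < TAR_BLOCK; i++)
		if (p[i] != 0)
			return (0);
	return (1);
}

/* Parse an octal field; stops at the first non-octal character. */
static unsigned long long
parse_octal(const unsigned char *p, size_t len)
{
	unsigned long long v = 0;
	size_t i = 0;

	while (i < len && p[i] == ' ')
		i++;
	for (; i < len && p[i] >= '0' && p[i] <= '7'; i++)
		v = v * 8 + (unsigned long long)(p[i] - '0');
	return (v);
}

/*
 * Walk the members of an archive image and return how many consecutive
 * zero blocks follow the last member before the image ends.
 */
static size_t
trailer_blocks(const unsigned char *buf, size_t len)
{
	size_t off = 0, zeros = 0;
	unsigned long long data;

	assert(buf != NULL);
	while (len - off >= TAR_BLOCK) {
		if (is_zero_block(buf + off)) {
			zeros++;
			off += TAR_BLOCK;
			continue;
		}
		zeros = 0;	/* a real header restarts the count */
		data = parse_octal(buf + off + SIZE_OFFSET, SIZE_LEN);
		data = (data + TAR_BLOCK - 1) / TAR_BLOCK * TAR_BLOCK;
		if (data > len - off - TAR_BLOCK)
			break;	/* member data is truncated */
		off += TAR_BLOCK + (size_t)data;
	}
	return (zeros);
}

/* A reader insisting on `required` zero blocks asks for another volume. */
static int
wants_next_volume(const unsigned char *buf, size_t len, size_t required)
{
	return (trailer_blocks(buf, len) < required);
}
```

Skipping member data by its recorded size matters for the example in
this PR: foo itself consists of zeroes, so simply counting zero blocks
backwards from the end of bar.tar would count file data as trailer. For
an archive with one trailer block, wants_next_volume() is true with
required set to 2 (the old NULLCNT) and false with required set to 1.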

-- 
Anatoly Vorobey,
mellon@pobox.com http://pobox.com/~mellon/
"Angels can fly because they take themselves lightly" - G.K.Chesterton
Comment 4 mellon 2000-04-02 11:38:22 UTC
On Sun, Apr 02, 2000 at 12:20:02AM -0800, Anatoly Vorobey wrote:

>  I looked into this. pax thinks tar archives should end with at least
>  2 blocks of zero (i.e. 1024 zeroed bytes *after* the file ends). 
>  In the example provided, GNU tar creates only one such block, and 
>  pax thinks it must read another one which is not there, so it asks
>  for the next volume. 

I forgot to demonstrate this on the example given by Marc:

prompt> dd if=/dev/zero of=foo bs=1759830 count=1
prompt> tar cf bar.tar foo
prompt> ls -l bar.tar
-rw-r--r--  1 mellon  wheel  1761280 Apr  2 07:19 bar.tar

Now the 1759830 bytes of data are padded to a 512-byte boundary when
written out, which results in 1760256 bytes; together with 512 bytes of
header and 512 bytes of one trailer block, 512 + 1760256 + 512 = 1761280.
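
The same arithmetic as a small C sketch, for checking other file sizes.
pad_to_block() and single_member_archive_size() are hypothetical names
used only here; the sketch assumes an archive with exactly one member
and one header block, as in the example.

```c
#include <assert.h>

#define TAR_BLOCK 512L	/* tar block size in bytes */

/* Round size up to the next multiple of TAR_BLOCK. */
static long
pad_to_block(long size)
{
	assert(size >= 0);
	return ((size + TAR_BLOCK - 1) / TAR_BLOCK * TAR_BLOCK);
}

/* Header block + padded file data + the given number of trailer blocks. */
static long
single_member_archive_size(long file_size, long trailer_blocks)
{
	return (TAR_BLOCK + pad_to_block(file_size) +
	    trailer_blocks * TAR_BLOCK);
}
```

single_member_archive_size(1759830, 1) gives the observed 1761280, while
two trailer blocks would give 1761792. One observation not made in this
thread: 1761280 is also an exact multiple of 10240, the 10k block size
from the workaround, which would be consistent with the trailer being
cut short at a 10k boundary. That is an inference, not something
confirmed here.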

FWIW, the page at www.paranoia.com/~vax/tar_format.html , currently
unavailable but cached at
http://www.google.com/search?q=cache:www.paranoia.com/~vax/tar_format.html ,
says that there must be two trailer blocks. 

-- 
Anatoly Vorobey,
mellon@pobox.com http://pobox.com/~mellon/
"Angels can fly because they take themselves lightly" - G.K.Chesterton
Comment 5 mellon 2000-04-06 09:46:40 UTC
On Mon, Apr 03, 2000 at 10:20:54AM +0200, Marc Olzheim wrote:
> > Bah. So what should be done in this case (assuming something needs
> > to be done?). I can patch pax to accept one-block archives and
> > yet produce correct archives - maybe that's the way to go? Given that 
> > the GNU tar maintainer had more than 8 years to think about it, he's
> > obviously emotionally attached to one-trailing-block files and won't
> > let them go ;)
> 
> That seems like a good idea to me.

See the attached patch. Tested, works fine. Maybe someone shall
review/commit it?

Index: tar.c
===================================================================
RCS file: /freebsd/cvs/src/bin/pax/tar.c,v
retrieving revision 1.13
diff -u -r1.13 tar.c
--- tar.c	1999/08/27 23:14:47	1.13
+++ tar.c	2000/04/06 08:36:55
@@ -86,7 +86,7 @@
 tar_endwr()
 #endif
 {
-	return(wr_skip((off_t)(NULLCNT*BLKMULT)));
+	return(wr_skip((off_t)(NULL_PUT*BLKMULT)));
 }
 
 /*
@@ -104,7 +104,7 @@
 tar_endrd()
 #endif
 {
-	return((off_t)(NULLCNT*BLKMULT));
+	return((off_t)(NULL_PUT*BLKMULT));
 }
 
 /*
@@ -152,8 +152,14 @@
 	 * NOT try to id a trailer during resync mode. During resync mode we
 	 * might as well throw this block out since a valid header can NEVER be
 	 * a block of all 0 (we must have a valid file name).
+         *
+         * A bug in GNU tar causes it to sometimes produce trailers with 
+         * just one zero block. To handle this, we will put NULL_PUT
+         * blocks in our archives, but will expect NULL_EXPECT on reads.
+         * If there are actually two zero blocks, the second one will be
+         * skipped on the next next_head() call.   
 	 */
-	if (!in_resync && (++*cnt >= NULLCNT))
+	if (!in_resync && (++*cnt >= NULL_EXPECT))
 		return(0);
 	return(1);
 }
Index: tar.h
===================================================================
RCS file: /freebsd/cvs/src/bin/pax/tar.h,v
retrieving revision 1.6
diff -u -r1.6 tar.h
--- tar.h	1999/08/27 23:14:47	1.6
+++ tar.h	2000/04/06 08:32:28
@@ -44,7 +44,16 @@
 #define CHK_LEN		8		/* length of checksum field */
 #define TNMSZ		100		/* size of name field */
 #ifdef _PAX_
-#define NULLCNT		2		/* number of null blocks in trailer */
+
+/* 
+ * The USTAR format requires two null blocks to trail every file, yet
+ * GNU tar will sometimes produce tar files with only one. Hence the 
+ * double standard.
+ */
+
+#define NULL_PUT	2	        /* num of blocks to put in trailer */
+#define NULL_EXPECT	1               /* num of blocks to expect there */
+	
 #define CHK_OFFSET	148		/* start of checksum field */
 #define BLNKSUM		256L		/* sum of checksum field using ' ' */
 #endif /* _PAX_ */
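
A standalone sketch of the write-two/expect-one policy from the patch,
so the effect can be tried outside pax. It reuses the condition from the
patched check, but zero_block_seen(), blocks_until_end() and
trailer_bytes_written() are hypothetical names, and BLKMULT being 512 is
an assumption taken from the 512-byte blocks discussed in this PR.

```c
#include <assert.h>
#include <stddef.h>

#define BLKMULT		512	/* assumed block size in bytes */
#define NULL_PUT	2	/* zero blocks written in the trailer */
#define NULL_EXPECT	1	/* zero blocks required when reading */

/*
 * Called once per zero block read. Returns 0 when enough zero blocks
 * have been seen to end the archive, 1 to keep reading.
 */
static int
zero_block_seen(int in_resync, int *cnt, int expect)
{
	assert(cnt != NULL);
	if (!in_resync && (++*cnt >= expect))
		return (0);
	return (1);
}

/*
 * Feed `avail` zero blocks to the check. Returns how many were consumed
 * before end of archive was declared, or -1 if the input ran out first
 * (the case where pax asks for the next volume).
 */
static int
blocks_until_end(int avail, int expect)
{
	int cnt = 0, i;

	for (i = 1; i <= avail; i++)
		if (zero_block_seen(0, &cnt, expect) == 0)
			return (i);
	return (-1);
}

/* Size of the trailer written to new archives. */
static long
trailer_bytes_written(void)
{
	return ((long)NULL_PUT * BLKMULT);
}
```

With one zero block available, blocks_until_end(1, 2) returns -1 (the
behaviour reported in this PR), while blocks_until_end(1, NULL_EXPECT)
returns 1. Archives written under this policy still carry a 1024-byte
trailer.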

-- 
Anatoly Vorobey,
mellon@pobox.com http://pobox.com/~mellon/
"Angels can fly because they take themselves lightly" - G.K.Chesterton
Comment 6 Marc Olzheim 2002-06-23 17:09:21 UTC
> See the attached patch. Tested, works fine. Maybe someone shall
> review/commit it?

Hmmm, the patch has worked fine for me for quite some time now. Is
anyone still maintaining pax?

By the way, OpenBSD has somehow fixed it in a different way...

Zlo
Comment 7 Kris Kennaway 2003-07-13 02:09:09 UTC
State Changed
From-To: open->closed

This problem appears to be resolved.