Bug 225639 - FreeBSD's tar produces .tgz files that cannot be read by other tar implementations (e.g. Windows 7-Zip)
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 10.3-RELEASE
Hardware: Any Any
Importance: --- Affects Many People
Assignee: freebsd-bugs (Nobody)
Reported: 2018-02-03 02:27 UTC by cwf-ml
Modified: 2018-02-05 18:47 UTC
Description cwf-ml 2018-02-03 02:27:11 UTC
We have a number of script-based processes that, at some point, all do something like:

      tar -czf - /some/data | uuencode somedata.tgz | mailx -s "your data somedata.tgz" windows.user@my.organization.example

Users who run Outlook with 7-Zip or WinRAR cannot use the resulting attachments: 7-Zip brings up error messages and hangs.

The reason can be found in tar's man page: 

     All archive output is written in correctly-sized blocks, even if the out-
     put is being compressed.(...) 
     For tar and cpio formats, the last block of output is padded to
     a full block size if the output is being written to standard output or to
     a character or block device such as a tape drive.  (...) Many com-
     pressors, including gzip(1) and bzip2(1), complain about the null padding
     when decompressing an archive created by tar, although they still extract
     it correctly.
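
To illustrate the behavior the man page describes, here is a minimal Python sketch (not tar's or libarchive's actual code) that pads an already-compressed gzip stream with NUL bytes and shows how a decompressor sees the padding as trailing data after the end of the stream. The 512-byte block size and the helper names `pad_to_block` and `split_gzip_stream` are assumptions chosen for illustration only.

```python
import gzip
import zlib

BLOCK_SIZE = 512  # assumed padding granularity, for illustration only


def pad_to_block(data, block_size=BLOCK_SIZE):
    """Append NUL bytes until len(data) is a multiple of block_size."""
    remainder = len(data) % block_size
    if remainder == 0:
        return data
    return data + b"\0" * (block_size - remainder)


def split_gzip_stream(data):
    """Decompress the first gzip member; return (payload, trailing bytes)."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    payload = decomp.decompress(data) + decomp.flush()
    # Anything after the gzip trailer ends up in unused_data.
    return payload, decomp.unused_data


original = b"example tar payload\n" * 10
compressed = gzip.compress(original)
padded = pad_to_block(compressed)

payload, trailing = split_gzip_stream(padded)
print(len(compressed), len(padded), len(trailing))
```

The payload is recovered intact, but the decompressor is left with a run of NUL bytes that are not part of any gzip member. Lenient tools ignore or warn about them; stricter tools may treat them as an error.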

This approach is outdated and needs to be fixed. It is not what GNU tar does (the only other major Unix tar implementation that handles .tgz files), nor does it match what we get when we use

      tar -cf - /some/data | gzip | uuencode somedata.tgz | mailx -s "your data somedata.tgz" windows.user@my.organization.example

Frankly, hardly anybody uses tar to interface directly with the (rapidly dying) breed of tape drives any more, while just about everybody uses tar to shuttle files back and forth between different platforms. Tar should do the right thing and produce standards-compliant files as best it can, and it should be interchangeable with Linux and other OS tars as far as commonly used options are concerned.

FreeBSD tar's behavior is unexpected and leads to errors. Yes, there are ways to work around it, but people or code coming from Linux or Solaris environments neither expect this issue nor know that it exists. Furthermore, on target platforms like Windows and in use cases like data interchange with Windows users, a target tool throwing errors is effectively a failure; we cannot just gloss over these errors and assume the user will somehow understand that they can work around them.

Please change tar's behavior to match that of GNU tar. Create a special option to force zero padding where required. Do not zero-pad after gzip compression to stdout by default.
Comment 1 Conrad Meyer freebsd_committer 2018-02-03 03:18:11 UTC
Agreed.  Tar could just be smarter about this -- simply don't pad when writing to stdout.  It's probably still reasonable / harmless to pad when writing to block devices.

Tar in FreeBSD is mostly a thin shim around libarchive.  Perhaps the right thing to do is to file a bug in upstream libarchive and follow it here.
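
As a rough sketch of the policy suggested above (pad only when the output is a device, not when it is a pipe or a regular file), the decision could look like the following in Python. This is an illustration under stated assumptions, not libarchive's logic: the function name `should_pad_last_block` is hypothetical, and the check relies only on `os.fstat` and the `stat` module.

```python
import os
import stat


def should_pad_last_block(fd):
    """Return True only if fd refers to a character or block device.

    Hypothetical policy: pipes and regular files are left unpadded,
    so a stream such as `tar ... | uuencode` would carry no trailing NULs.
    """
    mode = os.fstat(fd).st_mode
    return stat.S_ISCHR(mode) or stat.S_ISBLK(mode)


# A pipe stands in for stdout feeding another process.
read_fd, write_fd = os.pipe()
pipe_needs_padding = should_pad_last_block(write_fd)
os.close(read_fd)
os.close(write_fd)

print(pipe_needs_padding)
```

Basing the decision on the file type rather than on whether the descriptor is stdout would keep padding for the case where stdout is redirected to a tape or other device.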
Comment 2 Ed Maste freebsd_committer 2018-02-05 18:47:06 UTC
Upstream can be found at https://github.com/libarchive/libarchive