Bug 187315 - unzip(1): base unzip does not recognize *.zip archives from dropbox.com
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 10.0-STABLE
Hardware: Any Any
Importance: Normal Affects Many People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-03-06 12:20 UTC by jakub_lach
Modified: 2017-10-25 13:57 UTC

See Also:


Description jakub_lach 2014-03-06 12:20:00 UTC
Please see

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=594563+0+archive/2014/freebsd-current/20140223.freebsd-current

for details.

$ /usr/bin/unzip  file.zip                             
Archive:  file.zip
unzip: skipping non-regular entry ''
unzip: skipping non-regular entry 'A B C D.pdf'

The archivers/unzip port handles this case, though:

$ /usr/local/bin/unzip  file.zip                       
Archive:  file.zip
warning:  stripped absolute path spec from /
mapname:  conversion of  failed
inflating: A B C D.pdf

How-To-Repeat: unzip a *.zip from dropbox.com (download directory as zipped archive)
Comment 1 Ross McKelvie 2014-09-18 09:16:01 UTC
I have also seen this behaviour on FreeBSD 10.0-RELEASE-p7 with http://downloads.sourceforge.net/project/edk2/OVMF/OVMF-IA32-r15214.zip and http://downloads.sourceforge.net/project/edk2/OVMF/OVMF-X64-r15214.zip; distfiles for proposed ports submitted in bug 192012.
Comment 2 oliver 2014-09-18 18:22:37 UTC
Some time ago I looked into this bug and tried to contact the author, but did not get a response. Maybe he did not receive it. Here is a copy of the mail:

---

Hello des,

I am contacting you because you are the main author of
the /usr/src/usr.bin/unzip utility, if I understand correctly.

I took a look at PR bin/187315 and could use some
advice.

unzip(1) uses libarchive(3) for working with the archives. 

To determine the filetype, libarchive provides a function called
"archive_entry_filetype()". As this function uses the file acl.mode as
input, it cannot determine a type if an entry has no file mode, and
returns a filetype of 0x0.

As the implementation of unzip expects a filetype of either a regular
file or a directory, it checks for that. The sanity check for S_ISREG
and S_ISDIR therefore fails and the program skips the entry.

unzip.c 

        /* I don't think this can happen in a zipfile.. */
        if (!S_ISDIR(filetype) && !S_ISREG(filetype)) {
                warningx("skipping non-regular entry '%s'", pathname);
                ac(archive_read_data_skip(a));
                free(pathname);
                return;
        }
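
The effect of that check can be illustrated outside unzip. The following Python snippet is only a sketch, not code from unzip or libarchive: `would_skip` is a hypothetical helper that mirrors the condition above using the standard `stat` module, applied to the three mode values quoted further down in this mail.

```python
import stat

def would_skip(mode):
    """Mirror of the S_ISDIR/S_ISREG test quoted above (hypothetical helper)."""
    return not stat.S_ISDIR(mode) and not stat.S_ISREG(mode)

for mode in (0o100744, 0o040744, 0o000600):
    # S_IFMT() isolates the file-type bits that the check depends on
    print("%06o type=%06o skip=%s" % (mode, stat.S_IFMT(mode), would_skip(mode)))
```

Only the mode without any file-type bits fails both tests, which is the case reported here.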



The cause may be that Dropbox creates the zipfile on the fly, streaming
it out of a database directly into a zipfile. In this special
circumstance, where there is no file on disk and the data comes from
stdin, the ZIP file format specification allows the external file
attributes to be left at 0x0 (see [1], 4.4.15 external file
attributes). As I understand it, the libarchive code uses this field
for the filetype check.

I think that is what happens here (at least in the Dropbox file, the
filetype is returned as zero for all files and directories). I can
reproduce the error like this:

$ echo "testtext" | python -c "import sys
import zipfile
z = zipfile.ZipFile(sys.argv[1],'w')
z.writestr(sys.argv[2],sys.stdin.read())
z.close()
" test.zip testfile1
$ unzip -l test.zip
Archive:  test.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
        9  03-16-14 00:47   testfile1
$ unzip test.zip
Archive:  test.zip
unzip: skipping non-regular entry 'testfile1'
$ /usr/local/bin/unzip test.zip
Archive:  test.zip
 extracting: testfile1               
$ cat testfile1
testtext
$ 


For a correct archive, zipinfo shows (example):
  Unix file attributes (100744 octal):            -rwxr--r--
  Unix file attributes (040744 octal):            drwxr--r--

For the Dropbox archive or the example above:
  Unix file attributes (000600 octal):            ?rw-------

Note the question mark where the filetype should be (=0x00).
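
The same bits can be read without zipinfo. This is a minimal sketch, assuming Python 3 and its standard `zipfile` module. It builds an archive in memory the same way as the reproduction above, then decodes the external file attributes of the entry. The variable names are illustrative only.

```python
import io
import stat
import zipfile

# Build an archive in memory the same way as the reproduction above:
# writestr() with a plain name, so no file on disk supplies a mode.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("testfile1", "testtext\n")

with zipfile.ZipFile(buf) as z:
    info = z.getinfo("testfile1")

# For Unix-style entries the upper 16 bits of the external file
# attributes carry the st_mode value.
mode = info.external_attr >> 16
file_type = stat.S_IFMT(mode)
print("mode=%06o type bits=%06o" % (mode, file_type))
```

If the type bits print as zero, the entry has permission bits but no filetype, which is the condition the unzip check rejects.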

The extraction seems to work correctly if we remove that sanity check
for S_ISDIR and S_ISREG. But since the program uses this information
for control flow, that may be a problem.
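
One middle ground between keeping and removing the check would be to fall back on the entry name only when the type bits are absent. The Python sketch below illustrates that idea. `classify` is a hypothetical helper, and nothing here is meant to describe what unzip or libarchive actually does.

```python
import stat

def classify(mode, pathname):
    """Hypothetical fallback: decide how to treat an entry.

    Returns "dir", "file", or "skip".
    """
    if stat.S_ISDIR(mode):
        return "dir"
    if stat.S_ISREG(mode):
        return "file"
    if stat.S_IFMT(mode) == 0:
        # No type bits at all: guess from the name, since ZIP directory
        # entries conventionally end with a slash.
        return "dir" if pathname.endswith("/") else "file"
    # Any other type (symlink, device, ...) is still skipped.
    return "skip"

print(classify(0o000600, "testfile1"))
```

With this approach, entries that carry an explicit but unsupported type are still skipped, so the control flow keeps a defined filetype for every entry it extracts.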

As more and more archives are generated on the fly, maybe that issue
will get more serious. 

Maybe you can give me a hint if it's okay to remove that sanity check
or if you want to keep it.

[1] https://www.pkware.com/documents/casestudies/APPNOTE.TXT
Comment 3 Adam Mackler 2015-09-04 06:30:25 UTC
I am suffering from this same bug when trying to extract an application archive that was created using the Scala sbt-native-packager:

    http://www.scala-sbt.org/sbt-native-packager/

This is not surprising, since sbt-native-packager apparently compiles and archives an application's files without intermediately saving those files to disk.

The unzip on Debian Linux uncompresses said archive successfully.

I would be eager to learn of a workaround.  Although tar claims to be extracting this archive, the resulting files are corrupt, whether tar is invoked with or without the -z option.
Comment 4 Randy Westlund 2015-10-09 00:17:57 UTC
I just ran into this with zip files from Dropbox on 10.2-RELEASE-p2.  archivers/unzip works for me with no warnings.
Comment 5 Dag-Erling Smørgrav freebsd_committer 2015-10-09 13:07:45 UTC
This is an issue in libarchive, see https://github.com/libarchive/libarchive
Comment 6 Andriy Gapon freebsd_committer 2017-10-25 13:16:06 UTC
Seems like I cannot reproduce this issue any longer, on head.
Comment 7 jakub_lach 2017-10-25 13:29:59 UTC
I can unzip OVMF-IA32-r15214.zip on 11.1-STABLE #0 r324952 too.