Bug 283655 - archivers/pbzip2: broken decompress mode
Summary: archivers/pbzip2: broken decompress mode
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Some People
Assignee: Babak Farrokhi
URL:
Keywords:
Depends on: 282528
Blocks:
  Show dependency treegraph
 
Reported: 2024-12-27 14:02 UTC by Wolfram Schneider
Modified: 2025-01-18 20:39 UTC (History)
4 users (show)

See Also:
bugzilla: maintainer-feedback? (farrokhi)


Attachments
no null byte and compare to end of header (988 bytes, text/plain)
2025-01-18 19:23 UTC, paparodeo
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Wolfram Schneider freebsd_committer freebsd_triage 2024-12-27 14:02:13 UTC
I have compiled pbzip2 from source and can no longer decompress bz2 files.

How to repeat:

cd archivers/pbzip2
make clean all
date | ./work/stage/usr/local/bin/pbzip2  |  ./work/stage/usr/local/bin/pbzip2  -dc
pbzip2: producer_decompress: *ERROR: when reading bzip2 input stream
Terminator thread: premature exit requested - quitting...

Workaround: limit the number of CPUs in decompress mode to one:

date | ./work/stage/usr/local/bin/pbzip2  |  ./work/stage/usr/local/bin/pbzip2  -dc -p1
Fri Dec 27 13:55:15 UTC 2024

but this is rather pointless given that you want to use all CPUs on your machine.
Comment 1 Wolfram Schneider freebsd_committer freebsd_triage 2024-12-27 14:04:19 UTC
I guess this is related to PR: 282528

commit 29c80f114ea6cc60be39502339572af8c35ac440
Author: Dimitry Andric <dim@FreeBSD.org>
Date:   Mon Nov 4 00:24:58 2024 +0100

    archivers/pbzip2: fix build with libc++ 19
Comment 2 commit-hook freebsd_committer freebsd_triage 2024-12-27 21:25:48 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=89f872ec2ccf488f24cd9daca2e0d1f80e7ee429

commit 89f872ec2ccf488f24cd9daca2e0d1f80e7ee429
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2024-12-27 21:15:48 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2024-12-27 21:24:55 +0000

    archivers/pbzip2: fix stream decompression after libc++ 19 fix

    The patch in bug 283655 to fix the pbzip2 build against libc++ 19 (which
    finally removed std::char_traits<unsigned char>) had a problem where it
    did not correctly check the bzip2 header. This only occurred when
    decompressing from standard input, in the 'BZ2StreamScanner' mode, and
    would then throw an unexpected "invalid file format" error.

    PR:             283655
    Reported by:    wosch
    MFH:            2024Q4

 archivers/pbzip2/files/patch-BZ2StreamScanner.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Comment 3 commit-hook freebsd_committer freebsd_triage 2024-12-27 21:25:49 UTC
A commit in branch 2024Q4 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=92112892d8323a5261869ff5726adc2e5d44a389

commit 92112892d8323a5261869ff5726adc2e5d44a389
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2024-12-27 21:15:48 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2024-12-27 21:25:30 +0000

    archivers/pbzip2: fix stream decompression after libc++ 19 fix

    The patch in bug 283655 to fix the pbzip2 build against libc++ 19 (which
    finally removed std::char_traits<unsigned char>) had a problem where it
    did not correctly check the bzip2 header. This only occurred when
    decompressing from standard input, in the 'BZ2StreamScanner' mode, and
    would then throw an unexpected "invalid file format" error.

    PR:             283655
    Reported by:    wosch
    MFH:            2024Q4

    (cherry picked from commit 89f872ec2ccf488f24cd9daca2e0d1f80e7ee429)

 archivers/pbzip2/files/patch-BZ2StreamScanner.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Comment 4 Dimitry Andric freebsd_committer freebsd_triage 2024-12-27 21:37:21 UTC
Sorry about that, this was entirely my fault! I've fixed the patch now, thanks for reporting.
Comment 5 paparodeo 2025-01-18 19:23:54 UTC
Created attachment 256779 [details]
no null byte and compare to end of header

there are a couple more things that need to be fixed. the vector shouldn't contain the terminating null byte and the 2nd equals comparison needs to compare to the end of the vector.
Comment 6 Dimitry Andric freebsd_committer freebsd_triage 2025-01-18 20:17:35 UTC
(In reply to paparodeo from comment #5)
> the vector shouldn't contain the terminating null byte

It doesn't seem to matter for the functionality, but it looks better indeed. I think it is cleaner to not NUL-terminate the CharType arrays instead.


> and the 2nd equals comparison needs to compare to the end of the vector.

The original pbzip2 code has:

    // compare the remaining part of magic header
    int cmpres = pHdr->compare( hsp, pHdr->size() - hsp,
                    getInBuffSearchPtr() + hsp, pHdr->size() - hsp );

so it compares at most `pHdr->size() - hsp` characters. In the replaced code:

    bool cmpres = equal( pHdr->begin() + hsp, pHdr->begin() + pHdr->size() - hsp,
                    getInBuffSearchPtr() + hsp );

I now see that it should have been (to make it functionally equivalent
to the original):

    bool cmpres = equal( pHdr->begin() + hsp, pHdr->begin() + hsp + pHdr->size() - hsp,
                    getInBuffSearchPtr() + hsp );

which effectively becomes:

    bool cmpres = equal( pHdr->begin() + hsp, pHdr->begin() + pHdr->size(),
                    getInBuffSearchPtr() + hsp );

or indeed, just pHdr->end(). I think this one is good too.
Comment 7 commit-hook freebsd_committer freebsd_triage 2025-01-18 20:29:17 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=974d3ff054965d2bd2ab884a0579ed06c5a08b07

commit 974d3ff054965d2bd2ab884a0579ed06c5a08b07
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2025-01-18 20:26:02 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2025-01-18 20:28:01 +0000

    archivers/pbzip2: fix more issues after libc++ 19 patches

    * Ensure the _bz2Header and _bz2HeaderZero members of BZ2StreamScanner
      do not contain an additional NUL byte.
    * When comparing the magic header, compare from pHdr->begin() + hsp to
      pHdr->end(), making the code equivalent to what it was doing using
      std::basic_string::compare().

    Submitted by:   paparodeo@proton.me
    PR:             283655
    MFH:            2025Q1

 archivers/pbzip2/files/patch-BZ2StreamScanner.cpp | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
Comment 8 commit-hook freebsd_committer freebsd_triage 2025-01-18 20:39:19 UTC
A commit in branch 2025Q1 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=71fc928d82c4b8378229e096bb5b327d2106acb6

commit 71fc928d82c4b8378229e096bb5b327d2106acb6
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2025-01-18 20:26:02 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2025-01-18 20:38:17 +0000

    archivers/pbzip2: fix more issues after libc++ 19 patches

    * Ensure the _bz2Header and _bz2HeaderZero members of BZ2StreamScanner
      do not contain an additional NUL byte.
    * When comparing the magic header, compare from pHdr->begin() + hsp to
      pHdr->end(), making the code equivalent to what it was doing using
      std::basic_string::compare().

    Submitted by:   paparodeo@proton.me
    PR:             283655
    MFH:            2025Q1

    (cherry picked from commit 974d3ff054965d2bd2ab884a0579ed06c5a08b07)

 archivers/pbzip2/files/patch-BZ2StreamScanner.cpp | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)