Bug 127912

Summary: In theory, uncompress(1) may crash and SEGV
Product: Base System Reporter: David Jones <drj>
Component: binAssignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me CC: emaste, imp, ngie
Priority: Normal    
Version: 1.0-CURRENT   
Hardware: Any   
OS: Any   

Description David Jones 2008-10-07 10:20:01 UTC
I have found a problem with the uncompress utility, but not by running it, but by examining the source code.  There seems to be a possibility that uncompress will SEGV if calloc places an allocated block in a very unfortunate place.

Follows is a backgrounder of the source code, then the problem.
Byte is assumed to be 8-bits in the following discussion.

uncompress is a standard Unix utility.  On BSD systems the command line utility is a wrapper around the zopen compressed file IO interface.  The core code is in zopen.c.

That code is available here:
http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/compress/zopen.c?rev=1.12
I use that version for reference when mentioning line numbers.

We will be examining the routines responsible for compressed input.  These are zread on line 453 and getcode on line 568.  The input consists of a series of codes.  Each code is a number stored in a certain number of bits.  Initially codes are 9 bits wide, but as the table used for uncompressing grows the codes get wider too.  The variable "n_bits" specifies the number of bits used by codes at the current point in the input file.  It is between 9 and 16.  In the disk file codes are packed into bytes, the getcode routine is responsible for unpacking them, effectively converting bytes to codes.

getcode proceeds by reading into the buffer gbuf the next n_bits bytes of the input (see line 592).  Note that this buffer contains exactly 8 codes of length n_bits.  Near the beginning of getcode (after the buffer has been refilled if necessary), at line 599, the variable roffset contains the number of bits of buffer already consumed by previous calls to getcode; it is necessarily a multiple of n_bits up to 7*n_bits.  Starting at line 600 and throughout the rest of the routine the variable bits contains the number of bits yet to be consumed in constructing the next code.  A code is between 9 and 16 bits wide, it therefore requires either 2 or 3 bytes to be read to extract a code.

getcode uses the variable gcode to construct the code in either 2 or 3 parts.  The bytes from the buffer gbuf are read on lines 607, 613, 619.  The read on line 613 is conditional on there being at least 8 more bits of code that need to be read.  The 3 reads are as follows: first, read sufficient bits to take us up to the next byte boundary (between 1 and 8); second, if there are at least 8 more bits required, read the entire next byte; third, read the remaining bits (between 0 and 7).  Note the third read on line 619, it occurs even when bits is 0 and no more bits are required; in this case the gcode variable does not change (because rmask[0] is 0), but a byte is read nonetheless.

The problem

Consider the case when n_bits is 16; this is its maximum permitted value and we would expect it to be achieved for any sufficiently large file.  In this case the codes in the gbuf buffer are always byte aligned (roffset is a multiple of 16 and hence a multiple of 8).  Line 607 will read 8 bits and advance the bp pointer; line 613 will read 8 bits (because there are at least 8 bits needed) and advance the bp pointer; line 619 will read the byte at the bp pointer but throw it away.  Nonetheless the byte at the bp pointer is read.  Where does this pointer point?  For the first 7 codes read from the gbuf buffer it will simply point at the first byte of the next code.  When reading the last code from the gbuf buffer, bp will point one past the end of the buffer.  Reading one past the end of the buffer is undefined behaviour in C; however, we all know that what will happen is that simply the next byte will read and its value ignored.

Now it just so happens that this buffer, gbuf, is at the end of an object of type struct s_zstate, see lines 101 through to 143 (the buffer is declared on line 140, gbuf is aliased to it via a macro defined on line 177).  Such an object is allocated from dynamic memory on line 696 with a call to calloc.  If calloc places its returned memory block at the high-end of a page and with no mapped memory on the next page in memory then reading on past the end of the buffer will SEGV.

Conclusion

On some systems in some circumstances most invocations of compress will SEGV.  It is necessary that a particular block of memory obtained from calloc be placed so that the next byte of memory is unmapped; for example, at the highest address possible on a page with the following page left unmapped.

Fix: 

Prefix line 619 with "if(bits)"
How-To-Repeat: Cannot cause it to fail once, never mind repeatedly. ;)

Is this a problem?  Is it exploitable?  I have no idea.
Comment 1 Enji Cooper freebsd_committer freebsd_triage 2015-11-15 09:09:42 UTC
This is the line that the OP is suggesting have  a conditional added to it: https://svnweb.freebsd.org/base/head/usr.bin/compress/zopen.c?revision=146336&view=markup&pathrev=146336#l619 .
Comment 2 Eitan Adler freebsd_committer freebsd_triage 2018-05-20 23:53:11 UTC
For bugs matching the following conditions:
- Status == In Progress
- Assignee == "bugs@FreeBSD.org"
- Last Modified Year <= 2017

Do
- Set Status to "Open"
Comment 3 commit-hook freebsd_committer freebsd_triage 2024-10-11 15:55:39 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=818c7b769a4f7d3c8fecc4cf491f4e22ef816eba

commit 818c7b769a4f7d3c8fecc4cf491f4e22ef816eba
Author:     David Jones <drj@ravenbrook.com>
AuthorDate: 2024-10-11 15:49:17 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2024-10-11 15:50:09 +0000

    uncompress: Avoid reading an extra byte

    When reading the next code in a stream, avoid reading an extra byte if
    we're going to throw it away. When there's no more bits to extract from
    the stream, bits will be 0 and we'll mask the read byte with 0 anyway.
    At worst, this will avoid reading one past the end of gbuf array (which
    is not possible in well formed streams).

    PR: 127912
    Reviewed by:    emaste
    Differential Revision:  https://reviews.freebsd.org/D47041

 usr.bin/compress/zopen.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
Comment 4 commit-hook freebsd_committer freebsd_triage 2024-10-11 22:11:45 UTC
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=03b4f232354ce68fe1cb462038654f6527f63579

commit 03b4f232354ce68fe1cb462038654f6527f63579
Author:     David Jones <drj@ravenbrook.com>
AuthorDate: 2024-10-11 15:49:17 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2024-10-11 22:11:02 +0000

    uncompress: Avoid reading an extra byte

    When reading the next code in a stream, avoid reading an extra byte if
    we're going to throw it away. When there's no more bits to extract from
    the stream, bits will be 0 and we'll mask the read byte with 0 anyway.
    At worst, this will avoid reading one past the end of gbuf array (which
    is not possible in well formed streams).

    PR: 127912
    Reviewed by:    emaste
    Differential Revision:  https://reviews.freebsd.org/D47041

    (cherry picked from commit 818c7b769a4f7d3c8fecc4cf491f4e22ef816eba)

 usr.bin/compress/zopen.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
Comment 5 commit-hook freebsd_committer freebsd_triage 2024-10-11 22:13:46 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7e229794ca37d25265b0472bce26e8fd2b9bee2c

commit 7e229794ca37d25265b0472bce26e8fd2b9bee2c
Author:     David Jones <drj@ravenbrook.com>
AuthorDate: 2024-10-11 15:49:17 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2024-10-11 22:12:41 +0000

    uncompress: Avoid reading an extra byte

    When reading the next code in a stream, avoid reading an extra byte if
    we're going to throw it away. When there's no more bits to extract from
    the stream, bits will be 0 and we'll mask the read byte with 0 anyway.
    At worst, this will avoid reading one past the end of gbuf array (which
    is not possible in well formed streams).

    PR: 127912
    Reviewed by:    emaste
    Differential Revision:  https://reviews.freebsd.org/D47041

    (cherry picked from commit 818c7b769a4f7d3c8fecc4cf491f4e22ef816eba)

 usr.bin/compress/zopen.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)