Bug 257368 - ggatec misbehaving
Summary: ggatec misbehaving
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 12.2-STABLE
Hardware: amd64 Any
: --- Affects Only Me
Assignee: freebsd-geom (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-07-23 23:29 UTC by jo
Modified: 2021-08-25 21:12 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description jo 2021-07-23 23:29:46 UTC
Looks like there's a 32 bit integer overflow or underflow glitch in ggated or ggatec (not sure yet which).

Relevant lines from the log.
ggated first:

[...snipped...]
debug: Received hdr packet.
debug: recv_thread: offset=53686845440 length=114688
debug: disk_thread: offset=53686845440 length=114688
debug: send_thread: offset=53686845440 length=114688
debug: Sent hdr packet.
debug: Sent 114688 bytes (offset=53686845440, size=114688).
debug: Received hdr packet.
debug: recv_thread: offset=16384 length=2147467264
debug: disk_thread: offset=16384 length=2147467264
debug: send_thread: offset=16384 length=2147467264
debug: Sent hdr packet.
debug: Process 2048 exiting.

Notice the suspicious length paramter:
2147467264 = 0x7FFFC000
    114688 = 0x   1C000


Corresponding log from ggatec:

[...snipped...]
debug: Received 114688 bytes (offset=53686845440, size=114688).
debug: Sent hdr packet.
debug: Received hdr packet.
debug: Received data packet.
error: Lost connection 4.
debug: recv_thread: Died.
debug: send_thread: Died.



And ggated promptly allocates an additional 2GB of memory. Line from top:

 PID USERNAME THR PRI NICE   SIZE    RES STATE   C TIME  WCPU COMMAND
2048 root     1   20     0  2077M  1217M spa_na  1 0:05  0.00% ggated
Comment 2 jo 2021-07-23 23:55:21 UTC
ggatec connection lost message is coming from here https://github.com/freebsd/freebsd-src/blob/stable/12/sbin/ggate/ggatec/ggatec.c#L229

Looks like the code is expecting to be able to send/receive pretty huge packets as is, instead of breaking it up into smaller chunks.
Comment 3 jo 2021-07-24 00:19:48 UTC
$ truss zpool create testvolpool /dev/ggate0

[...snipped...]
openat(AT_FDCWD,"/dev/ggate0",O_RDONLY,00)       = 7 (0x7)
fstat(7,{ mode=crw-r----- ,inode=76,size=0,blksize=4096 }) = 0 (0x0)
ioctl(7,DIOCGMEDIASIZE,0x7fffff281960)           = 0 (0x0)
mmap(0x0,266240,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 42446731808768 (0x269ae65f1000)
pread(7,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,262144,0x0) = 262144 (0x40000)
pread(7,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,262144,0x40000) = 262144 (0x40000)
pread(7,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,262144,0xc7ff80000) = 262144 (0x40000)
pread(7,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,262144,0xc7ffc0000) = 262144 (0x40000)
close(7)                                         = 0 (0x0)



Looks fine to me... and shows no weird 2 GB read.
Comment 4 jo 2021-07-24 01:47:00 UTC
> Looks like the code is expecting to be able to send/receive pretty huge packets as is, instead of breaking it up into smaller chunks.

Ignore that, found the helper function that breaks it up.


Here's a truss of ggated:


debug: Received hdr packet.
 2008: write(1,"debug: Received hdr packet.\n",28) = 28 (0x1c)
debug: recv_thread: offset=16384 length=2147467264
 2008: write(1,"debug: recv_thread: offset=16384"...,51) = 51 (0x33)
 2008: mmap(0x0,2147487744,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 56795097997312 (0x33a7a3a42000)
 2008: mmap(0x0,6291456,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 56797253623808 (0x33a824206000)
 2008: munmap(0x33a824206000,6291456)            = 0 (0x0)
 2008: mmap(0x0,8384512,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 56797253255168 (0x33a8241ac000)
 2008: munmap(0x33a8241ac000,344064)             = 0 (0x0)
 2008: munmap(0x33a824800000,1748992)            = 0 (0x0)
 2008: _umtx_op(0x33a783f9f1b8,UMTX_OP_NWAKE_PRIVATE,0x1,0x0,0x0) = 0 (0x0)
 2008: _umtx_op(0x33a77e55dfd0,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
debug: disk_thread: offset=16384 length=2147467264
 2008: write(1,"debug: disk_thread: offset=16384"...,51) = 51 (0x33)
 2008: pread(7,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,2147467264,0x4000) = 2147467264 (0x7fffc000)
 2008: _umtx_op(0x33a783f9e2b8,UMTX_OP_NWAKE_PRIVATE,0x1,0x0,0x0) = 0 (0x0)
 2008: _umtx_op(0x33a794e63008,UMTX_OP_WAIT_UINT_PRIVATE,0x0,0x0,0x0) = 0 (0x0)
debug: send_thread: offset=16384 length=2147467264
 2008: write(1,"debug: send_thread: offset=16384"...,51) = 51 (0x33)
 2008: sendto(6,"\0\0\0\0\0\0\0@\0\^?\M^?\M-@\0\0"...,23,0,NULL,0) = 23 (0x17)
debug: Sent hdr packet.
 2008: write(1,"debug: Sent hdr packet.\n",24)   = 24 (0x18)
 2008: sendto(6,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,131072,0,NULL,0) = 131072 (0x20000)
 2008: sendto(6,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,131072,0,NULL,0) = 131072 (0x20000)
 2008: sendto(6,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,131072,0,NULL,0) = 131072 (0x20000)
 2008: recvfrom(8,0x33a796ff97b0,23,MSG_WAITALL,NULL,0x0) = 0 (0x0)
 2008: getpid()                                  = 2008 (0x7d8)
debug: Process 2008 exiting.
 2008: write(1,"debug: Process 2008 exiting.\n",29) = 29 (0x1d)
 2008: sendto(6,"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,131072,0,NULL,0) = 131072 (0x20000)



We can see that ggated allocs a big chunk of mem and successfully reads data into it and then tries to send it off, in chunks of 128kB. So far so good.
But then the receive thread notices that it did not read a full header and thus exits and the whole process dies only having sent 4x128kB.

Smells like a race condition?
Comment 5 jo 2021-07-24 02:06:11 UTC
Truss for ggatec:

debug: Received hdr packet.
write(1,"debug: Received hdr packet.\n",28)      = 28 (0x1c)
recvfrom(4,0x7fffdffddf40,2147467264,MSG_WAITALL,0x0,0x0) ERR#14 'Bad address'
debug: Received data packet.
write(1,"debug: Received data packet.\n",29)     = 29 (0x1d)
error: Lost connection 4.


Notice the bad address!
It's supposed to point to a stack buffer of size 128kB.
But here we are trying to dump 2 GB into it. Oops. Stack buffer overflow.

See near https://github.com/freebsd/freebsd-src/blob/stable/12/sbin/ggate/ggatec/ggatec.c#L223
Comment 6 jo 2021-07-27 00:31:53 UTC
That 2 GB request tripping up ggatec is a BIO_FLUSH command. But ggatec only handles BIO_READ and BIO_WRITE without failing any of the other ones.

See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213479
Comment 7 jo 2021-07-27 00:40:20 UTC
Or BIO_DELETE, don't know which one. Doesn't matter. Needs to explicitly fail unhandled requests.
Comment 8 jo 2021-07-27 01:03:21 UTC
Here's a fix: https://reviews.freebsd.org/D31318