Created attachment 222539
Patch to the port
Paul Saab found that devel/glib20 was failing to build in a poudriere jail on one of his machines. We tracked this down to the alloca() in our local patch that adds fdwalk2(): the allocation failed and overflowed the stack. The reason is that the fdwalk2() implementation uses the kern.file sysctl, which fetches a _global_ table of all open files across all processes, so the amount of memory required scales with the number of processes in the system. fdwalk2() should instead use kinfo_getfile() from libutil, which fetches only the open files for the current process and uses malloc() for its buffer, avoiding stack limit issues. While here, I also noticed that fdwalk2() failed to propagate the callback's return value to the caller: it always returned 0 because *ret was never assigned a value.
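For illustration, here is a rough sketch of the approach described above. It is not the attached patch: fdwalk_sketch(), fd_cb, count_fd() and stop_at_first() are hypothetical names made up for this example. The FreeBSD branch uses kinfo_getfile() from libutil (link with -lutil) and passes the callback's return value back to the caller. The non-FreeBSD branch is only a stand-in that probes a bounded range of descriptors so the sketch compiles elsewhere.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#ifdef __FreeBSD__
#include <sys/user.h>
#include <libutil.h>	/* kinfo_getfile(); link with -lutil */
#endif

/* Hypothetical callback type: return non-zero to stop the walk. */
typedef int (*fd_cb)(void *data, int fd);

/*
 * Hypothetical sketch: call cb for every open descriptor of the current
 * process and return the first non-zero callback result, or 0 if the
 * walk completed.  Returns -1 if the descriptor list cannot be fetched.
 */
int
fdwalk_sketch(fd_cb cb, void *data)
{
	int res = 0;
#ifdef __FreeBSD__
	struct kinfo_file *files;
	int cnt, i;

	/* Per-process list, allocated on the heap by libutil. */
	files = kinfo_getfile(getpid(), &cnt);
	if (files == NULL)
		return (-1);
	for (i = 0; i < cnt; i++) {
		/* Negative kf_fd values are special entries (cwd, root, ...). */
		if (files[i].kf_fd < 0)
			continue;
		res = cb(data, files[i].kf_fd);
		if (res != 0)
			break;
	}
	free(files);
#else
	/* Assumption: stand-in for other systems, limited to fds 0..1023. */
	int fd;

	for (fd = 0; fd < 1024; fd++) {
		if (fcntl(fd, F_GETFD) == -1)
			continue;	/* not an open descriptor */
		res = cb(data, fd);
		if (res != 0)
			break;
	}
#endif
	return (res);	/* propagate the callback's result */
}

/* Example callback: count descriptors and keep walking. */
int
count_fd(void *data, int fd)
{
	(void)fd;
	(*(int *)data)++;
	return (0);
}

/* Example callback: record the first descriptor seen and stop. */
int
stop_at_first(void *data, int fd)
{
	*(int *)data = fd;
	return (1);
}
```

With this shape, a caller passing stop_at_first() gets 1 back from fdwalk_sketch() rather than an unconditional 0, which is the return-value behaviour described above.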
The attached patch is untested, but should fix both of these issues. I wasn't sure how to get the glib20 port to link against libutil explicitly (e.g. via LDFLAGS), so I'm hoping a port maintainer can add that missing bit and finish the patch.
Created attachment 222576
build with libutil
Created attachment 222645
build with libutil
Full patch now up for review at https://reviews.freebsd.org/D28904.
A commit references this bug:
Date: Fri Feb 26 19:31:05 UTC 2021
New revision: 566632
Use kinfo_getfile() to implement fdwalk().
Previously, the kern.file sysctl (which queries the global file table)
was queried and the results saved in an on-stack buffer. With a
sufficiently active system the sysctl's output could overflow the
stack's available space. Instead, switch to kinfo_getfile() from
libutil. This uses a sysctl which queries only the open files for the
current process, and it uses heap space instead of the stack to store
the sysctl output.
Submitted by: ps (build glue patches)
Reported by: ps
Reviewed by: bapt
Differential Revision: https://reviews.freebsd.org/D28904
As a data point, I also had glib failing to build in poudriere on one machine, and with the committed changes it now works.
Thanks for fixing this.