FreeBSD core dumps write out every mapped region in full, even when most of its pages have never been touched. Modern allocators often reserve very large regions of VA, e.g. with 1GiB-100GiB mmaps, rather than making repeated calls to mmap. When such a process dumps core the whole mapping gets dumped even though nearly all of it is pages that were never touched. Linux breaks such maps into a set of PT_LOADs where unbacked anonymous pages aren't present in the file. We ran into this on a very slow riscv (100MHz FPGA with a very slow disk), but we believe it affects everyone.
So is the idea to identify runs of alternating faulted and non-faulted ranges in each mapped area, and emit a phdr for each run (or each pair of runs)? Then, the phdr for a non-faulted range would have p_filesz == 0 (or it would get tacked on to the phdr for the preceding faulted range). Or, does Linux completely exclude non-faulted ranges from the core file? I'd be surprised if so, but it does make things a bit simpler for the kernel. I noticed recently that DPDK has a FreeBSD-specific workaround for this exact issue, so I'd like to work on it.
It looks like Linux handles core dump data phdrs the same way we do. The trick they use is to create a hole in the core file whenever a never-faulted virtual page is written. So a program that maps 512MB, faults a single page, and dumps core gives:

# ./a.out
addr is 0x7fc076b57000
Aborted (core dumped)
# du -h core
116K core
# du -h --apparent-size core
513M core
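For reference, a minimal sketch of a test program of that kind. This is not the exact a.out used above: the helper name map_and_touch and the REPRO_MAIN macro are made up for illustration, and the output format is only modelled on the transcript.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory and fault in
 * only the first page.  Every other page stays untouched, so it has no
 * backing physical page.  Returns NULL if the mapping fails.
 */
char *
map_and_touch(size_t len)
{
	char *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (p == MAP_FAILED)
		return (NULL);
	p[0] = 1;		/* fault exactly one page */
	return (p);
}

#ifdef REPRO_MAIN
/* Build with -DREPRO_MAIN to get a standalone program that dumps core. */
int
main(void)
{
	char *p = map_and_touch((size_t)512 << 20);	/* 512MB */

	if (p == NULL) {
		perror("mmap");
		return (1);
	}
	printf("addr is %p\n", (void *)p);
	fflush(stdout);
	abort();		/* SIGABRT, which dumps core if enabled */
}
#endif
```

Built with -DREPRO_MAIN and run with core dumps enabled, the resulting core can be compared with du -h and du -h --apparent-size as above (the latter flag is GNU du; on FreeBSD, du -A reports the apparent size).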
https://reviews.freebsd.org/D26590
A commit references this bug:

Author: markj
Date: Fri Oct 2 17:50:23 UTC 2020
New revision: 366368
URL: https://svnweb.freebsd.org/changeset/base/366368

Log:
  Implement sparse core dumps

  Currently we allocate and map zero-filled anonymous pages when dumping
  core. This can result in lots of needless disk I/O and page
  allocations. This change tries to make the core dumper more clever and
  represent unbacked ranges of virtual memory by holes in the core dump
  file.

  Add a new page fault type, VM_FAULT_NOFILL, which causes vm_fault() to
  clean up and return an error when it would otherwise map a zero-filled
  page. Then, in the core dumper code, prefault all user pages and handle
  errors by simply extending the size of the core file. This also fixes a
  bug related to the fact that vn_io_fault1() does not attempt partial I/O
  in the face of errors from vm_fault_quick_hold_pages(): if a truncated
  file is mapped into a user process, an attempt to dump beyond the end of
  the file results in an error, but this means that valid pages
  immediately preceding the end of the file might not have been dumped
  either.

  The change reduces the core dump size of trivial programs by a factor of
  ten simply by excluding unaccessed libc.so pages.

  PR: 249067
  Reviewed by: kib
  Tested by: pho
  MFC after: 1 month
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D26590

Changes:
  head/sys/kern/imgact_elf.c
  head/sys/vm/vm_fault.c
  head/sys/vm/vm_map.h
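To illustrate the "write backed pages, extend the file over the rest" idea from the log, here is a rough userspace analogue. It is only a sketch under stated assumptions and not the kernel code: the commit detects unbacked pages with VM_FAULT_NOFILL inside vm_fault(), whereas this sketch asks mincore(2) which pages are resident. The function name dump_sparse is hypothetical, and whether the skipped ranges really become holes on disk depends on the filesystem.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the page-aligned region [base, base + len)
 * to fd starting at file offset 0, writing only the pages mincore(2)
 * reports as resident.  Non-resident pages are skipped, and the final
 * ftruncate() extends the file to its full length, so on filesystems
 * that support sparse files the skipped ranges end up as holes.
 * Returns the number of pages written, or -1 on error.
 */
long
dump_sparse(int fd, char *base, size_t len)
{
	size_t pgsz = (size_t)sysconf(_SC_PAGESIZE);
	size_t i, npages = len / pgsz;
	unsigned char *vec;
	long written = 0;

	assert(len > 0 && len % pgsz == 0);
	vec = malloc(npages);
	if (vec == NULL)
		return (-1);
	/* The vec argument is char * on FreeBSD, unsigned char * on Linux. */
	if (mincore(base, len, (void *)vec) != 0) {
		free(vec);
		return (-1);
	}
	for (i = 0; i < npages; i++) {
		if ((vec[i] & 1) == 0)
			continue;	/* not resident: leave a hole */
		if (pwrite(fd, base + i * pgsz, pgsz,
		    (off_t)(i * pgsz)) != (ssize_t)pgsz) {
			free(vec);
			return (-1);
		}
		written++;
	}
	free(vec);
	/* Extend the file over any trailing hole. */
	if (ftruncate(fd, (off_t)len) != 0)
		return (-1);
	return (written);
}
```

Residency is not the same thing as "never faulted" (a page that was touched and later swapped out is not resident), which is one reason the kernel change hooks the fault path instead of doing something like this.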
I think I will hold off on a merge to stable/12 unless someone asks for it. The merge isn't trivial.