Bug 249067 - coredumps include whole maps
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Mark Johnston
Reported: 2020-09-02 16:43 UTC by Brooks Davis
Modified: 2020-11-02 14:02 UTC
Description Brooks Davis freebsd_committer freebsd_triage 2020-09-02 16:43:51 UTC
FreeBSD coredumps dump whole maps even when most pages haven't been touched.  Modern allocators often reserve very large regions of VA, e.g., with 1GiB-100GiB mmaps, rather than making repeated calls to mmap.  When such a process dumps core, the whole map gets dumped even though nearly all of its pages are untouched.  Linux breaks such maps into a set of PT_LOADs where unbacked anonymous pages aren't present in the file.

We ran into this on a very slow riscv (100MHz FPGA with a very slow disk), but we believe it affects everyone.
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2020-09-08 15:24:36 UTC
So is the idea to identify runs of alternating faulted and non-faulted ranges in each mapped area, and emit a phdr for each run (or each pair of runs)?  Then, the phdr for a non-faulted range would have p_filesz == 0 (or it would get tacked on to the phdr for the preceding faulted range).  Or, does Linux completely exclude non-faulted ranges from the core file?  I'd be surprised if so, but it does make things a bit simpler for the kernel.

I noticed recently that DPDK has a FreeBSD-specific workaround for this exact issue, so I'd like to work on it.
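
For illustration only, the per-run idea floated in this comment could be sketched in userspace as follows. This is not kernel code and not what was eventually committed; struct page_run, split_runs() and print_runs() are hypothetical names, and the per-page "faulted" bitmap is assumed to be supplied by the caller.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical description of one run of pages; not a kernel structure. */
struct page_run {
	size_t first_page;	/* index of the first page in the run */
	size_t npages;		/* number of pages in the run */
	int faulted;		/* nonzero if the pages were ever touched */
};

/*
 * Split a per-page "faulted" bitmap into maximal runs of equal state.
 * Returns the total number of runs found; at most max are stored in out.
 */
size_t
split_runs(const unsigned char *faulted, size_t npages,
    struct page_run *out, size_t max)
{
	size_t nruns = 0;

	for (size_t i = 0; i < npages; ) {
		size_t start = i;
		int state = faulted[i] != 0;

		while (i < npages && (faulted[i] != 0) == state)
			i++;
		if (nruns < max) {
			out[nruns].first_page = start;
			out[nruns].npages = i - start;
			out[nruns].faulted = state;
		}
		nruns++;
	}
	return (nruns);
}

/* Print the sizes a per-run program header would carry. */
void
print_runs(const struct page_run *runs, size_t nruns, size_t pagesize)
{
	for (size_t i = 0; i < nruns; i++) {
		size_t memsz = runs[i].npages * pagesize;
		/* A never-faulted run occupies no bytes in the file. */
		size_t filesz = runs[i].faulted ? memsz : 0;

		printf("page %zu: p_memsz=%zu p_filesz=%zu\n",
		    runs[i].first_page, memsz, filesz);
	}
}
```

With a bitmap of { 1, 1, 0, 0, 0, 1 }, split_runs() reports three runs, and only the first and last would contribute bytes to the file under this scheme.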
Comment 2 Mark Johnston freebsd_committer freebsd_triage 2020-09-08 16:06:57 UTC
It looks like Linux handles core dump data phdrs the same way we do.  The trick they use is to create a hole in the core file whenever a never-faulted virtual page is written.  So a program that maps 512MB, faults a single page, and dumps core gives:

# ./a.out 
addr is 0x7fc076b57000
Aborted (core dumped)
# du -h core
116K    core
# du -h --apparent-size core
513M    core
Comment 3 Mark Johnston freebsd_committer freebsd_triage 2020-09-29 18:43:12 UTC
https://reviews.freebsd.org/D26590
Comment 4 commit-hook freebsd_committer freebsd_triage 2020-10-02 17:51:00 UTC
A commit references this bug:

Author: markj
Date: Fri Oct  2 17:50:23 UTC 2020
New revision: 366368
URL: https://svnweb.freebsd.org/changeset/base/366368

Log:
  Implement sparse core dumps

  Currently we allocate and map zero-filled anonymous pages when dumping
  core.  This can result in lots of needless disk I/O and page
  allocations.  This change tries to make the core dumper more clever and
  represent unbacked ranges of virtual memory by holes in the core dump
  file.

  Add a new page fault type, VM_FAULT_NOFILL, which causes vm_fault() to
  clean up and return an error when it would otherwise map a zero-filled
  page.  Then, in the core dumper code, prefault all user pages and handle
  errors by simply extending the size of the core file.  This also fixes a
  bug related to the fact that vn_io_fault1() does not attempt partial I/O
  in the face of errors from vm_fault_quick_hold_pages(): if a truncated
  file is mapped into a user process, an attempt to dump beyond the end of
  the file results in an error, but this means that valid pages
  immediately preceding the end of the file might not have been dumped
  either.

  The change reduces the core dump size of trivial programs by a factor of
  ten simply by excluding unaccessed libc.so pages.

  PR:		249067
  Reviewed by:	kib
  Tested by:	pho
  MFC after:	1 month
  Sponsored by:	The FreeBSD Foundation
  Differential Revision:	https://reviews.freebsd.org/D26590

Changes:
  head/sys/kern/imgact_elf.c
  head/sys/vm/vm_fault.c
  head/sys/vm/vm_map.h
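
As a rough userspace analogy to the log above (not the kernel implementation in imgact_elf.c), the "skip unbacked pages and just extend the file" approach can be sketched like this. write_sparse(), file_sizes() and the backed[] bitmap are hypothetical; whether skipped ranges really become holes depends on the filesystem.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write npages pages to fd, skipping pages whose backed[] entry is zero.
 * Skipped pages become holes on filesystems that support sparse files.
 * Returns 0 on success, -1 on error.
 */
int
write_sparse(int fd, const char *mem, const unsigned char *backed,
    size_t npages, size_t pagesize)
{
	for (size_t i = 0; i < npages; i++) {
		off_t off = (off_t)(i * pagesize);

		if (!backed[i])
			continue;	/* leave a hole for this page */
		if (pwrite(fd, mem + i * pagesize, pagesize, off) !=
		    (ssize_t)pagesize)
			return (-1);
	}
	/* Extend the file so trailing holes count toward its size. */
	return (ftruncate(fd, (off_t)(npages * pagesize)));
}

/*
 * Report the apparent size and the allocated bytes of fd.
 * Assumes st_blocks is in 512-byte units, as on FreeBSD and Linux.
 */
int
file_sizes(int fd, off_t *apparent, off_t *allocated)
{
	struct stat sb;

	if (fstat(fd, &sb) != 0)
		return (-1);
	*apparent = sb.st_size;
	*allocated = (off_t)sb.st_blocks * 512;
	return (0);
}
```

On a filesystem with sparse-file support, writing one backed page out of many should give an apparent size covering every page while the allocated size stays close to a single page, which is the same gap the du output in comment 2 shows.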
Comment 5 Mark Johnston freebsd_committer freebsd_triage 2020-11-02 14:02:40 UTC
I think I will hold off on a merge to stable/12 unless someone asks for it.  The merge isn't trivial.