Bug 231794

Summary: zfs: Panic due to ARC related KVA memory exhaustion: in pmap_growkernel() > vm_map_insert
Product: Base System
Reporter: Dave Robison <davewrobison>
Component: kern
Assignee: freebsd-fs (Nobody) <fs>
Status: Open ---
Severity: Affects Some People
CC: dch, grahamperrin, jgitlin+freebsd, pi, rainer, sigsys
Priority: ---
Keywords: crash, needs-qa
Version: 11.2-RELEASE
Flags: koobs: mfc-stable13?, koobs: mfc-stable12?
Hardware: amd64
OS: Any
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231296
Attachments:
Description          Flags
Photo of backtrace   none

Description Dave Robison 2018-09-29 00:30:16 UTC
Created attachment 197582
Photo of backtrace

We are evaluating two servers based on the HP DL360 G10 (16 GB RAM) and HP DL380 G10 (32 GB RAM) motherboards. We can routinely panic these machines by putting them under load while running ZFS. Running six instances of bonnie++ and six instances of memtester (testing 2 GB) is enough to panic the DL360 in around 15 minutes and the DL380 in 10-13 hours.

Reducing ARC dramatically using vfs.zfs.arc_min and vfs.zfs.arc_max seems to mitigate this problem, at least after a day of testing under 12.0-ALPHA7. We are now testing on 11.2-RELEASE, which we will use in production.
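
For illustration only, ARC limits of this kind are typically set as boot-time tunables in /boot/loader.conf. The fragment below is a minimal sketch; the byte values are assumptions chosen for a 16 GB machine and are not the values used in this report.

```
# /boot/loader.conf (hypothetical values, not taken from this report)
# Cap the ZFS ARC at 4 GiB.
vfs.zfs.arc_max="4294967296"
# Keep the ARC floor at 1 GiB.
vfs.zfs.arc_min="1073741824"
```

After a reboot, the effective limits can be checked with `sysctl vfs.zfs.arc_max vfs.zfs.arc_min`.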

Daiichi from Japan was here to help diagnose this problem, and has been in contact with core team members who requested this bugzilla submission.

More panic photos available on request.
Comment 1 rainer 2018-10-04 06:59:54 UTC
See also:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231296

There's something seriously wrong with the default settings.
Comment 2 Andriy Gapon freebsd_committer freebsd_triage 2018-10-04 14:39:37 UTC
(In reply to Dave Robison from comment #0)
I just want to note that the problem here is not with exhausting the physical memory, but rather with exhausting the kernel virtual address space (KVA).
There could be many reasons for that, such as incorrect tuning, bugs, KVA fragmentation, etc.
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2022-11-01 23:53:09 UTC
^Triage: Report is for EoL 11.2 (12.0-ALPHA7). Needs reproduction against currently supported versions.