I have a machine with 256 GiB of RAM (249 GiB managed) that serves files over plain HTTP with nginx & AIO. After system startup the ARC grows to its maximum size of ~233 GiB (so about ten GiB always stay free), then drops slightly to ~228 GiB. Only after that, if I start certain processes:
1. They immediately hang in the "D" state;
2. pagedaemon/uma enters the clearing state;
3. The ARC starts evicting down to its minimum size;
4. When the ARC reaches its minimum, the entire system becomes unresponsive after a delay (anywhere from 5 minutes to 8 hours).
Some examples of hanging processes:
1. conftest when building devel/m4 (PR in "See Also");
2. tar on any directory, e.g. tar cvf /dev/null /usr/ports.
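For reference, the ARC limits involved can be inspected and pinned with the standard tunables; the numbers below are illustrative examples, not the exact values on this machine:

# Current ARC size and configured limits (bytes):
sysctl kstat.zfs.misc.arcstats.size vfs.zfs.arc_max vfs.zfs.arc_min

# Example caps in /boot/loader.conf (take effect at the next boot):
vfs.zfs.arc_max="214748364800"   # 200 GiB, illustrative
vfs.zfs.arc_min="34359738368"    # 32 GiB, illustrative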
Created attachment 182804: output of "zfs-stats -a" when ARC reaches minimum
Created attachment 182805: output of "vmstat -z" when ARC reaches minimum
Created attachment 182806: output of "procstat -kka" when ARC reaches minimum
Created attachment 182807: output of "truss tar cvf /dev/null /usr/ports" that starts ARC eviction
(In reply to Anton Sayetsky from comment #4)
It's better to duplicate the last lines from the truss output & procstat output here:

===== truss output =====
clock_gettime(13,{1495402129.000000000 }) = 0 (0x0)
openat(0xffffff9c,0x80245b0a0,0x100601,0x1b6,0x7fffffffd580,0x801d13b20) = 3 (0x3)
fcntl(3,F_GETFD,) = 1 (0x1)
fstat(3,{ mode=crw-rw-rw- ,inode=8,size=0,blksize=4096 }) = 0 (0x0)
openat(0xffffff9c,0x8008bc804,0x100000,0x0,0xffff80080245c7d7,0x0) = 4 (0x4)
fcntl(4,F_GETFD,)

===== procstat output related to tar =====
75044 101901 bsdtar - mi_switch+0xbe sleepq_wait+0x3a _cv_wait+0x14d vmem_xalloc+0x568 vmem_alloc+0x3d kmem_malloc+0x33 uma_large_malloc+0x46 malloc+0x40 fdgrowtable+0x5b fdalloc+0x6c do_dup+0x18f kern_fcntl+0x6dc kern_fcntl_freebsd+0xae amd64_syscall+0x307 Xfast_syscall+0xfb
The procstat output suggests that you might be using geli for the swap device. This is known to cause deadlocks under memory pressure: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209759

You could reduce vfs.zfs.deadman_synctime_ms to more quickly get a panic when the system becomes unresponsive.

It would probably help to see the counters in vm_cnt.
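Something along these lines (the deadman value is just an example; the vm_cnt counters are exported under the vm.stats sysctl tree):

# Lower the ZFS deadman timer (milliseconds):
sysctl vfs.zfs.deadman_synctime_ms=60000

# Dump the vm_cnt counters:
sysctl vm.stats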
There have been a number of changes made to the ZFS code since 10.3-RELEASE; there is a version of a patch that I have been running which *should* apply against 10.3 in the following bug thread (I'm currently on 11, with the version for it running in production here): https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594
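If you want to try it, the usual source-patch routine applies; this is only a sketch, and the patch filename and -p level depend on the attachment you grab from that PR:

# Assuming the patch was saved as /tmp/pr187594.patch (name is illustrative):
cd /usr/src
patch -p0 < /tmp/pr187594.patch    # adjust -p to match the patch's paths
make buildkernel && make installkernel
shutdown -r now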
Created attachment 182943: output of "zfs-stats -a" when ARC reaches minimum (w/o swap)
Created attachment 182944: output of "vmstat -z" when ARC reaches minimum (w/o swap)
Created attachment 182945: output of "procstat -kka" when ARC reaches minimum (w/o swap)
Created attachment 182946: output of "sysctl vm" when ARC reaches minimum (w/o swap)
(In reply to Fabian Keil from comment #6)
> The procstat output suggests that you might be using geli for the swap device.
Yes, you're right. I'm using GELI (AES-256-XTS/SHA256/onetime) over a gmirror of 2 GPT partitions.
> This is known to cause deadlocks under memory pressure:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209759
Disabled GELI swap & stopped the relevant gmirror -- still got ARC eviction after running tar...
> You could reduce vfs.zfs.deadman_synctime_ms to more quickly get a panic when the system becomes unresponsive.
Unfortunately, I cannot see any panics (and thus, stack traces). The system just hangs without any output to the logs or console, and all I can do is reset or power-cycle it through the IPMI interface. I'm thinking about compiling a kernel with KDB/DDB and collecting a core dump with an NMI.
> It would probably help to see the counters in vm_cnt.
Attached the relevant sysctl output & the same diagnostics as before, but w/o swap.
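The plan, roughly (a sketch; option names as in the stock NOTES, sysctl names as on 10.x/11.x amd64, values illustrative):

# Kernel config additions over GENERIC:
options KDB        # kernel debugger framework
options DDB        # interactive in-kernel debugger
options KDB_TRACE  # print a stack trace on panic

# Have an IPMI-injected NMI turn into a panic and therefore a crash dump:
sysctl machdep.panic_on_nmi=1
sysctl debug.debugger_on_panic=0   # go straight to the dump instead of a DDB prompt

# A dump device is still needed; with GELI swap disabled the raw swap
# partition can serve, e.g. in /etc/rc.conf: dumpdev="AUTO"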
Anton, I suspect that you could be running into a bug in the fdalloc / fdgrowtable code that causes an attempt to allocate an insane amount of memory. The ARC is just the first victim. Could you please try to use kgdb (preferably from devel/gdb) and check the arguments and local variables in the relevant stack frames?
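For example, something along these lines against the live kernel (a sketch; the thread ID is the one procstat showed for bsdtar, and the frame number has to be read off the actual backtrace):

# Attach kgdb to the running kernel:
kgdb /boot/kernel/kernel /dev/mem

# Inside kgdb:
(kgdb) tid 101901      # switch to the hung thread (TID from procstat -kka)
(kgdb) bt
(kgdb) frame 8         # pick the fdgrowtable frame from the backtrace
(kgdb) info args
(kgdb) info locals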
(In reply to Andriy Gapon from comment #14)
Running devel/gdb is possible, but I need some instructions because I have almost no experience with it.
Looks like I can't reproduce this anymore after updating to releng/11.1. I can still observe the problem with ARC eviction down to the minimum size, but at least the system doesn't hang now. So it's time to try the patches from #187594 & D7538 again.