Hello,

I have a system with only one pool, 6G:

root@varnish:~ # zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zroot  5.97G  3.76G  2.21G         -    73%    62%  1.00x  ONLINE  -

However, about 4-5 days after the system was powered on, the ARC cache grew bigger than the pool and it stays like this:

up 20+20:50:56  10:44:15
34 processes:  1 running, 33 sleeping
CPU:  % user,  % nice,  % system,  % interrupt,  % idle
Mem: 18M Active, 601M Inact, 13G Wired, 1801M Free
ARC: 12G Total, 1920M MFU, 10G MRU, 419K Anon, 40M Header, 377M Other
Swap: 2048M Total, 2048M Free

Note that the pool is 6G and it is not even 100% full. The MFU alone is larger than the entire pool; there is no way that data is not duplicated there. Here is the output of zfs-stats:

root@varnish:~ # zfs-stats -A

------------------------------------------------------------------------
ZFS Subsystem Report                            Sun Jan 22 10:45:47 2017
------------------------------------------------------------------------

ARC Summary: (HEALTHY)
        Memory Throttle Count:                  0

ARC Misc:
        Deleted:                                65.44k
        Recycle Misses:                         0
        Mutex Misses:                           4
        Evict Skips:                            1

ARC Size:                               84.86%  12.33   GiB
        Target Size: (Adaptive)         85.07%  12.36   GiB
        Min Size (Hard Limit):          12.50%  1.82    GiB
        Max Size (High Water):          8:1     14.53   GiB

ARC Size Breakdown:
        Recently Used Cache Size:       81.34%  10.06   GiB
        Frequently Used Cache Size:     18.66%  2.31    GiB

ARC Hash Breakdown:
        Elements Max:                           171.10k
        Elements Current:               87.74%  150.12k
        Collisions:                             1.69m
        Chain Max:                              4
        Chains:                                 5.03k

------------------------------------------------------------------------
Maybe this is related to bug #216178.
Hello, I agree, it does look a lot like it's related.
I was able to reproduce the symptoms where the reported ARC size could grow beyond the RAM size and the pool size. I think I see an accounting leak in the code. I will offer a patch later.
(In reply to k_georgiev from comment #0) Just to clarify, does your pool contain a cache (l2arc) device? There could be more than one problem.
No cache device and no log device. It is a VMware virtual machine, if that matters.
(In reply to k_georgiev from comment #5) Another question: do you have any filesystems with compression enabled in the pool? Could you please capture the output of sysctl kstat.zfs.misc.arcstats the next time you see this large ARC size?
Hello,

Here are the compression stats:

root@varnish:~ # for i in `zfs list | awk -F ' ' '{print $1}' | grep -v NAME`; do zfs get compression $i | tail -n 1; zfs get compressratio $i | tail -n 1; echo '----------'; done
zroot              compression    lz4    local
zroot              compressratio  3.74x  -
----------
zroot/ROOT         compression    lz4    inherited from zroot
zroot/ROOT         compressratio  1.89x  -
----------
zroot/ROOT/default compression    lz4    inherited from zroot
zroot/ROOT/default compressratio  1.89x  -
----------
zroot/tmp          compression    lz4    inherited from zroot
zroot/tmp          compressratio  4.57x  -
----------
zroot/usr          compression    lz4    inherited from zroot
zroot/usr          compressratio  2.37x  -
----------
zroot/usr/home     compression    lz4    inherited from zroot
zroot/usr/home     compressratio  1.01x  -
----------
zroot/usr/ports    compression    lz4    inherited from zroot
zroot/usr/ports    compressratio  2.23x  -
----------
zroot/usr/src      compression    lz4    inherited from zroot
zroot/usr/src      compressratio  2.46x  -
----------
zroot/var          compression    lz4    inherited from zroot
zroot/var          compressratio  6.53x  -
----------
zroot/var/audit    compression    lz4    inherited from zroot
zroot/var/audit    compressratio  1.00x  -
----------
zroot/var/crash    compression    lz4    inherited from zroot
zroot/var/crash    compressratio  1.05x  -
----------
zroot/var/log      compression    lz4    inherited from zroot
zroot/var/log      compressratio  6.64x  -
----------
zroot/var/mail     compression    lz4    inherited from zroot
zroot/var/mail     compressratio  1.00x  -
----------
zroot/var/tmp      compression    lz4    inherited from zroot
zroot/var/tmp      compressratio  1.50x  -
----------

Currently the cache size is not larger than the pool size, but it is still much larger than the allocated data, so I don't know whether this will do the job for you.
This is the current situation:

root@varnish:~ # zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zroot  5.97G  2.87G  3.10G         -    73%    48%  1.00x  ONLINE  -

ARC: 5730M Total, 75M MFU, 5242M MRU, 1060K Anon, 22M Header, 391M Other

And the statistics you requested:

root@varnish:~ # sysctl kstat.zfs.misc.arcstats
kstat.zfs.misc.arcstats.demand_hit_predictive_prefetch: 1139
kstat.zfs.misc.arcstats.sync_wait_for_async: 97
kstat.zfs.misc.arcstats.arc_meta_min: 975210240
kstat.zfs.misc.arcstats.arc_meta_max: 719384968
kstat.zfs.misc.arcstats.arc_meta_limit: 3900840960
kstat.zfs.misc.arcstats.arc_meta_used: 719067792
kstat.zfs.misc.arcstats.duplicate_reads: 2495
kstat.zfs.misc.arcstats.duplicate_buffers_size: 0
kstat.zfs.misc.arcstats.duplicate_buffers: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
kstat.zfs.misc.arcstats.l2_write_pios: 0
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
kstat.zfs.misc.arcstats.l2_write_full: 0
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 1
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
kstat.zfs.misc.arcstats.l2_write_in_l2: 0
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
kstat.zfs.misc.arcstats.l2_padding_needed: 0
kstat.zfs.misc.arcstats.l2_compress_failures: 0
kstat.zfs.misc.arcstats.l2_compress_zeros: 0
kstat.zfs.misc.arcstats.l2_compress_successes: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.l2_asize: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cdata_free_on_write: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_evict_l1cached: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_writes_lock_retry: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_write_bytes: 0
kstat.zfs.misc.arcstats.l2_read_bytes: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.mfu_ghost_evictable_metadata: 0
kstat.zfs.misc.arcstats.mfu_ghost_evictable_data: 0
kstat.zfs.misc.arcstats.mfu_ghost_size: 0
kstat.zfs.misc.arcstats.mfu_evictable_metadata: 11820032
kstat.zfs.misc.arcstats.mfu_evictable_data: 66081280
kstat.zfs.misc.arcstats.mfu_size: 78476800
kstat.zfs.misc.arcstats.mru_ghost_evictable_metadata: 0
kstat.zfs.misc.arcstats.mru_ghost_evictable_data: 0
kstat.zfs.misc.arcstats.mru_ghost_size: 0
kstat.zfs.misc.arcstats.mru_evictable_metadata: 95819264
kstat.zfs.misc.arcstats.mru_evictable_data: 5226756608
kstat.zfs.misc.arcstats.mru_size: 5500123136
kstat.zfs.misc.arcstats.anon_evictable_metadata: 0
kstat.zfs.misc.arcstats.anon_evictable_data: 0
kstat.zfs.misc.arcstats.anon_size: 1070080
kstat.zfs.misc.arcstats.other_size: 409599856
kstat.zfs.misc.arcstats.metadata_size: 286570496
kstat.zfs.misc.arcstats.data_size: 5293100032
kstat.zfs.misc.arcstats.hdr_size: 22897440
kstat.zfs.misc.arcstats.size: 6012167824
kstat.zfs.misc.arcstats.c_max: 15603363840
kstat.zfs.misc.arcstats.c_min: 1950420480
kstat.zfs.misc.arcstats.c: 15603363840
kstat.zfs.misc.arcstats.p: 7801681920
kstat.zfs.misc.arcstats.hash_chain_max: 3
kstat.zfs.misc.arcstats.hash_chains: 1290
kstat.zfs.misc.arcstats.hash_collisions: 156194
kstat.zfs.misc.arcstats.hash_elements_max: 79495
kstat.zfs.misc.arcstats.hash_elements: 79489
kstat.zfs.misc.arcstats.evict_l2_skip: 0
kstat.zfs.misc.arcstats.evict_l2_ineligible: 2048
kstat.zfs.misc.arcstats.evict_l2_eligible: 67584
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_not_enough: 0
kstat.zfs.misc.arcstats.evict_skip: 1
kstat.zfs.misc.arcstats.mutex_miss: 0
kstat.zfs.misc.arcstats.deleted: 8
kstat.zfs.misc.arcstats.allocated: 9883033
kstat.zfs.misc.arcstats.mfu_ghost_hits: 0
kstat.zfs.misc.arcstats.mfu_hits: 6223489
kstat.zfs.misc.arcstats.mru_ghost_hits: 0
kstat.zfs.misc.arcstats.mru_hits: 1746472
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 9869
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 13974
kstat.zfs.misc.arcstats.prefetch_data_misses: 89
kstat.zfs.misc.arcstats.prefetch_data_hits: 1
kstat.zfs.misc.arcstats.demand_metadata_misses: 3949342
kstat.zfs.misc.arcstats.demand_metadata_hits: 7185451
kstat.zfs.misc.arcstats.demand_data_misses: 123344
kstat.zfs.misc.arcstats.demand_data_hits: 784508
kstat.zfs.misc.arcstats.misses: 4082644
kstat.zfs.misc.arcstats.hits: 7983934

Hope this helps.
Thanks
Hello again,

In case it is useful: the cache size is now larger than the pool. This is the current status:

ARC: 7713M Total, 1950M MFU, 5343M MRU, 746K Anon, 26M Header, 394M Other

And again the statistics you requested:

root@varnish:/tmp # sysctl kstat.zfs.misc.arcstats
kstat.zfs.misc.arcstats.demand_hit_predictive_prefetch: 1139
kstat.zfs.misc.arcstats.sync_wait_for_async: 97
kstat.zfs.misc.arcstats.arc_meta_min: 975210240
kstat.zfs.misc.arcstats.arc_meta_max: 728642168
kstat.zfs.misc.arcstats.arc_meta_limit: 3900840960
kstat.zfs.misc.arcstats.arc_meta_used: 717771648
kstat.zfs.misc.arcstats.duplicate_reads: 2495
kstat.zfs.misc.arcstats.duplicate_buffers_size: 0
kstat.zfs.misc.arcstats.duplicate_buffers: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
kstat.zfs.misc.arcstats.l2_write_pios: 0
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
kstat.zfs.misc.arcstats.l2_write_full: 0
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 1
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
kstat.zfs.misc.arcstats.l2_write_in_l2: 0
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
kstat.zfs.misc.arcstats.l2_padding_needed: 0
kstat.zfs.misc.arcstats.l2_compress_failures: 0
kstat.zfs.misc.arcstats.l2_compress_zeros: 0
kstat.zfs.misc.arcstats.l2_compress_successes: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.l2_asize: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cdata_free_on_write: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_evict_l1cached: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_writes_lock_retry: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_write_bytes: 0
kstat.zfs.misc.arcstats.l2_read_bytes: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.mfu_ghost_evictable_metadata: 0
kstat.zfs.misc.arcstats.mfu_ghost_evictable_data: 0
kstat.zfs.misc.arcstats.mfu_ghost_size: 0
kstat.zfs.misc.arcstats.mfu_evictable_metadata: 12150272
kstat.zfs.misc.arcstats.mfu_evictable_data: 65606144
kstat.zfs.misc.arcstats.mfu_size: 78327808
kstat.zfs.misc.arcstats.mru_ghost_evictable_metadata: 0
kstat.zfs.misc.arcstats.mru_ghost_evictable_data: 0
kstat.zfs.misc.arcstats.mru_ghost_size: 0
kstat.zfs.misc.arcstats.mru_evictable_metadata: 94340096
kstat.zfs.misc.arcstats.mru_evictable_data: 5329829888
kstat.zfs.misc.arcstats.mru_size: 5601754112
kstat.zfs.misc.arcstats.anon_evictable_metadata: 0
kstat.zfs.misc.arcstats.anon_evictable_data: 0
kstat.zfs.misc.arcstats.anon_size: 939520
kstat.zfs.misc.arcstats.other_size: 409772864
kstat.zfs.misc.arcstats.metadata_size: 284930560
kstat.zfs.misc.arcstats.data_size: 5396353536
kstat.zfs.misc.arcstats.hdr_size: 23068224
kstat.zfs.misc.arcstats.size: 6114125184
kstat.zfs.misc.arcstats.c_max: 15603363840
kstat.zfs.misc.arcstats.c_min: 1950420480
kstat.zfs.misc.arcstats.c: 15603363840
kstat.zfs.misc.arcstats.p: 7801681920
kstat.zfs.misc.arcstats.hash_chain_max: 3
kstat.zfs.misc.arcstats.hash_chains: 1313
kstat.zfs.misc.arcstats.hash_collisions: 159594
kstat.zfs.misc.arcstats.hash_elements_max: 95175
kstat.zfs.misc.arcstats.hash_elements: 80080
kstat.zfs.misc.arcstats.evict_l2_skip: 0
kstat.zfs.misc.arcstats.evict_l2_ineligible: 2048
kstat.zfs.misc.arcstats.evict_l2_eligible: 67584
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_not_enough: 0
kstat.zfs.misc.arcstats.evict_skip: 1
kstat.zfs.misc.arcstats.mutex_miss: 0
kstat.zfs.misc.arcstats.deleted: 8
kstat.zfs.misc.arcstats.allocated: 10092961
kstat.zfs.misc.arcstats.mfu_ghost_hits: 0
kstat.zfs.misc.arcstats.mfu_hits: 6988925
kstat.zfs.misc.arcstats.mru_ghost_hits: 0
kstat.zfs.misc.arcstats.mru_hits: 1778668
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 9879
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 13990
kstat.zfs.misc.arcstats.prefetch_data_misses: 89
kstat.zfs.misc.arcstats.prefetch_data_hits: 1
kstat.zfs.misc.arcstats.demand_metadata_misses: 3994929
kstat.zfs.misc.arcstats.demand_metadata_hits: 7237192
kstat.zfs.misc.arcstats.demand_data_misses: 124103
kstat.zfs.misc.arcstats.demand_data_hits: 1530399
kstat.zfs.misc.arcstats.misses: 4129000
kstat.zfs.misc.arcstats.hits: 8781582
(In reply to k_georgiev from comment #7) I haven't done any calculations, but it's quite possible that you don't actually have a problem. You are comparing the size of the data on disk, where some of it is rather well compressed, with the size of the data in the ARC, where the data is uncompressed (and after compressed ARC support was added, a cached buffer can be compressed, uncompressed, or both). Check what zfs get logicalused zroot gives you.
root@varnish:/tmp # zfs get logicalused zroot
NAME   PROPERTY     VALUE  SOURCE
zroot  logicalused  9.52G  -
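[Editorial note: the following arithmetic sketch is an addition to this thread, not part of the original report; the figures are copied from the outputs above.] It shows why an ARC larger than the pool's allocated space can be consistent: the ARC caches logically uncompressed data, so its size should be compared against logicalused, not against the compressed on-disk allocation.

```python
# Figures taken from the outputs earlier in this report (GiB, approximate).
pool_size   = 5.97                # zpool list: SIZE
allocated   = 2.87                # zpool list: ALLOC (compressed, on-disk)
logicalused = 9.52                # zfs get logicalused zroot (uncompressed)
arc_size    = 6114125184 / 2**30  # kstat.zfs.misc.arcstats.size, ~5.69 GiB

# Effective pool-wide compression ratio of the data actually stored:
ratio = logicalused / allocated
print(f"effective compression ratio: {ratio:.2f}x")  # 3.32x

# The ARC holds uncompressed buffers, so up to `logicalused` bytes of
# cacheable data exist even though the pool is only ~6 GiB physically.
print(arc_size <= logicalused)  # True: the reported ARC size is plausible
```

Under these numbers the cache contents fit within the ~9.5 GiB of logical data, even though they exceed the 2.87 GiB allocated on disk.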
Hmm, looking at the compression ratios, none is higher than about 3x except /var/log at about 6x. I'm not sure whether this "x" means a multiplier or a percentage. If it's a percentage, then it's very low and almost no data is compressed. If it's a multiplier, then it may well be that everything is fine. However, this is a fairly fresh FreeBSD install with only Varnish and a few other packages that can't amount to more than a few megabytes; no user data or databases are stored there. I find it hard to believe that a fresh install takes close to 10G.
(In reply to k_georgiev from comment #11) It's a ratio, not a percentage. As hard as it is for you to believe that your system uses that much space, all the stats indicate that it actually does. You can use du, df, zfs get space, etc. to drill down into where the space is used. I am closing this bug report now.
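[Editorial note: a short sketch added for illustration, using approximate figures from this thread.] compressratio is a multiplier: a value of 3.32x means the logical data occupies 3.32 times its on-disk footprint, not that 3.32% of it is compressed.

```python
# compressratio = logical size / physical size, i.e. a multiplier, not a percentage.
physical_gib  = 2.87   # on-disk allocation (zpool list: ALLOC)
compressratio = 3.32   # approximate pool-wide ratio derived from the thread

logical_gib = physical_gib * compressratio
print(f"{logical_gib:.2f} GiB of logical data")  # 9.53 GiB, matching logicalused
```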