Bug 216364 - ZFS ARC cache is duplicating data? The cache size gets bigger than the pool.
Summary: ZFS ARC cache is duplicating data? The cache size gets bigger than the pool.
Status: Closed Not A Bug
Alias: None
Product: Base System
Classification: Unclassified
Component: misc
Version: 11.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: freebsd-fs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-01-22 08:49 UTC by k_georgiev
Modified: 2017-03-07 09:08 UTC
CC List: 5 users

See Also:


Description k_georgiev 2017-01-22 08:49:46 UTC
Hello,

I have a system with only one pool of 6G:

root@varnish:~ # zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zroot  5.97G  3.76G  2.21G         -    73%    62%  1.00x  ONLINE  -

However, about 4-5 days after the system was powered on, the ARC cache grew bigger than the pool, and it has stayed like this:

                                                                                    up 20+20:50:56  10:44:15
34 processes:  1 running, 33 sleeping
CPU:     % user,     % nice,     % system,     % interrupt,     % idle
Mem: 18M Active, 601M Inact, 13G Wired, 1801M Free
ARC: 12G Total, 1920M MFU, 10G MRU, 419K Anon, 40M Header, 377M Other
Swap: 2048M Total, 2048M Free


And note that while the pool is 6G, it's not even 100% full.

The MRU alone is larger than the entire pool. There's no way the data is not duplicated there.

Here is the output of zfs-stats:


root@varnish:~ # zfs-stats -A

------------------------------------------------------------------------
ZFS Subsystem Report                            Sun Jan 22 10:45:47 2017
------------------------------------------------------------------------

ARC Summary: (HEALTHY)
        Memory Throttle Count:                  0

ARC Misc:
        Deleted:                                65.44k
        Recycle Misses:                         0
        Mutex Misses:                           4
        Evict Skips:                            1

ARC Size:                               84.86%  12.33   GiB
        Target Size: (Adaptive)         85.07%  12.36   GiB
        Min Size (Hard Limit):          12.50%  1.82    GiB
        Max Size (High Water):          8:1     14.53   GiB

ARC Size Breakdown:
        Recently Used Cache Size:       81.34%  10.06   GiB
        Frequently Used Cache Size:     18.66%  2.31    GiB

ARC Hash Breakdown:
        Elements Max:                           171.10k
        Elements Current:               87.74%  150.12k
        Collisions:                             1.69m
        Chain Max:                              4
        Chains:                                 5.03k

------------------------------------------------------------------------
Comment 1 Lev A. Serebryakov freebsd_committer freebsd_triage 2017-01-23 14:27:00 UTC
Maybe related to bug #216178.
Comment 2 k_georgiev 2017-01-29 13:13:14 UTC
Hello, 

I agree, it looks a lot like it's related.
Comment 3 Andriy Gapon freebsd_committer freebsd_triage 2017-02-07 22:04:14 UTC
I was able to reproduce the symptoms where the reported ARC size could grow beyond the RAM size and the pool size.  I think I see an accounting leak in the code.  I will offer a patch later.
Comment 4 Andriy Gapon freebsd_committer freebsd_triage 2017-02-07 22:06:41 UTC
(In reply to k_georgiev from comment #0)
Just to clarify, does your pool contain a cache (l2arc) device?
There could be more than one problem.
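A cache or log vdev, if one were configured, would show up in its own "cache" or "logs" section of the pool status, e.g.:

root@varnish:~ # zpool status zroot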
Comment 5 k_georgiev 2017-02-08 09:19:40 UTC
No, 

No cache device, no log device.

It is a VMware virtual machine, if that matters.
Comment 6 Andriy Gapon freebsd_committer freebsd_triage 2017-02-08 09:50:29 UTC
(In reply to k_georgiev from comment #5)
Another question: do you have any filesystems with compression enabled in the pool?
Could you please capture the output of sysctl kstat.zfs.misc.arcstats the next time you see this large ARC size?
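For example, the compression settings and ratios for all datasets in the pool can be listed in one go with:

root@varnish:~ # zfs get -r compression,compressratio zroot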
Comment 7 k_georgiev 2017-02-08 10:28:13 UTC
Hello, 

Here are the compression stats:

 for i in `zfs list |awk -F ' ' '{print $1}' |grep -v NAME`; do zfs get compression $i |tail -n 1; zfs get compressratio $i |tail -n 1;  echo '----------' ; done;
zroot  compression  lz4       local
zroot  compressratio  3.74x  -
----------
zroot/ROOT  compression  lz4       inherited from zroot
zroot/ROOT  compressratio  1.89x  -
----------
zroot/ROOT/default  compression  lz4       inherited from zroot
zroot/ROOT/default  compressratio  1.89x  -
----------
zroot/tmp  compression  lz4       inherited from zroot
zroot/tmp  compressratio  4.57x  -
----------
zroot/usr  compression  lz4       inherited from zroot
zroot/usr  compressratio  2.37x  -
----------
zroot/usr/home  compression  lz4       inherited from zroot
zroot/usr/home  compressratio  1.01x  -
----------
zroot/usr/ports  compression  lz4       inherited from zroot
zroot/usr/ports  compressratio  2.23x  -
----------
zroot/usr/src  compression  lz4       inherited from zroot
zroot/usr/src  compressratio  2.46x  -
----------
zroot/var  compression  lz4       inherited from zroot
zroot/var  compressratio  6.53x  -
----------
zroot/var/audit  compression  lz4       inherited from zroot
zroot/var/audit  compressratio  1.00x  -
----------
zroot/var/crash  compression  lz4       inherited from zroot
zroot/var/crash  compressratio  1.05x  -
----------
zroot/var/log  compression  lz4       inherited from zroot
zroot/var/log  compressratio  6.64x  -
----------
zroot/var/mail  compression  lz4       inherited from zroot
zroot/var/mail  compressratio  1.00x  -
----------
zroot/var/tmp  compression  lz4       inherited from zroot
zroot/var/tmp  compressratio  1.50x  -
----------


Currently the cache size is not larger than the pool size, but it is still a lot larger than the allocated data, so I don't know whether this will be useful for you. This is the current situation:

root@varnish:~ # zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zroot  5.97G  2.87G  3.10G         -    73%    48%  1.00x  ONLINE  -

ARC: 5730M Total, 75M MFU, 5242M MRU, 1060K Anon, 22M Header, 391M Other


And the statistics you requested:

root@varnish:~ # sysctl kstat.zfs.misc.arcstats 
kstat.zfs.misc.arcstats.demand_hit_predictive_prefetch: 1139
kstat.zfs.misc.arcstats.sync_wait_for_async: 97
kstat.zfs.misc.arcstats.arc_meta_min: 975210240
kstat.zfs.misc.arcstats.arc_meta_max: 719384968
kstat.zfs.misc.arcstats.arc_meta_limit: 3900840960
kstat.zfs.misc.arcstats.arc_meta_used: 719067792
kstat.zfs.misc.arcstats.duplicate_reads: 2495
kstat.zfs.misc.arcstats.duplicate_buffers_size: 0
kstat.zfs.misc.arcstats.duplicate_buffers: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
kstat.zfs.misc.arcstats.l2_write_pios: 0
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
kstat.zfs.misc.arcstats.l2_write_full: 0
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 1
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
kstat.zfs.misc.arcstats.l2_write_in_l2: 0
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
kstat.zfs.misc.arcstats.l2_padding_needed: 0
kstat.zfs.misc.arcstats.l2_compress_failures: 0
kstat.zfs.misc.arcstats.l2_compress_zeros: 0
kstat.zfs.misc.arcstats.l2_compress_successes: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.l2_asize: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cdata_free_on_write: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_evict_l1cached: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_writes_lock_retry: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_write_bytes: 0
kstat.zfs.misc.arcstats.l2_read_bytes: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.mfu_ghost_evictable_metadata: 0
kstat.zfs.misc.arcstats.mfu_ghost_evictable_data: 0
kstat.zfs.misc.arcstats.mfu_ghost_size: 0
kstat.zfs.misc.arcstats.mfu_evictable_metadata: 11820032
kstat.zfs.misc.arcstats.mfu_evictable_data: 66081280
kstat.zfs.misc.arcstats.mfu_size: 78476800
kstat.zfs.misc.arcstats.mru_ghost_evictable_metadata: 0
kstat.zfs.misc.arcstats.mru_ghost_evictable_data: 0
kstat.zfs.misc.arcstats.mru_ghost_size: 0
kstat.zfs.misc.arcstats.mru_evictable_metadata: 95819264
kstat.zfs.misc.arcstats.mru_evictable_data: 5226756608
kstat.zfs.misc.arcstats.mru_size: 5500123136
kstat.zfs.misc.arcstats.anon_evictable_metadata: 0
kstat.zfs.misc.arcstats.anon_evictable_data: 0
kstat.zfs.misc.arcstats.anon_size: 1070080
kstat.zfs.misc.arcstats.other_size: 409599856
kstat.zfs.misc.arcstats.metadata_size: 286570496
kstat.zfs.misc.arcstats.data_size: 5293100032
kstat.zfs.misc.arcstats.hdr_size: 22897440
kstat.zfs.misc.arcstats.size: 6012167824
kstat.zfs.misc.arcstats.c_max: 15603363840
kstat.zfs.misc.arcstats.c_min: 1950420480
kstat.zfs.misc.arcstats.c: 15603363840
kstat.zfs.misc.arcstats.p: 7801681920
kstat.zfs.misc.arcstats.hash_chain_max: 3
kstat.zfs.misc.arcstats.hash_chains: 1290
kstat.zfs.misc.arcstats.hash_collisions: 156194
kstat.zfs.misc.arcstats.hash_elements_max: 79495
kstat.zfs.misc.arcstats.hash_elements: 79489
kstat.zfs.misc.arcstats.evict_l2_skip: 0
kstat.zfs.misc.arcstats.evict_l2_ineligible: 2048
kstat.zfs.misc.arcstats.evict_l2_eligible: 67584
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_not_enough: 0
kstat.zfs.misc.arcstats.evict_skip: 1
kstat.zfs.misc.arcstats.mutex_miss: 0
kstat.zfs.misc.arcstats.deleted: 8
kstat.zfs.misc.arcstats.allocated: 9883033
kstat.zfs.misc.arcstats.mfu_ghost_hits: 0
kstat.zfs.misc.arcstats.mfu_hits: 6223489
kstat.zfs.misc.arcstats.mru_ghost_hits: 0
kstat.zfs.misc.arcstats.mru_hits: 1746472
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 9869
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 13974
kstat.zfs.misc.arcstats.prefetch_data_misses: 89
kstat.zfs.misc.arcstats.prefetch_data_hits: 1
kstat.zfs.misc.arcstats.demand_metadata_misses: 3949342
kstat.zfs.misc.arcstats.demand_metadata_hits: 7185451
kstat.zfs.misc.arcstats.demand_data_misses: 123344
kstat.zfs.misc.arcstats.demand_data_hits: 784508
kstat.zfs.misc.arcstats.misses: 4082644
kstat.zfs.misc.arcstats.hits: 7983934

Hope this helps
Thanks
Comment 8 k_georgiev 2017-02-08 11:10:59 UTC
Hello again, 

In case it is useful, the cache size is now larger than the pool again. This is the current status:

ARC: 7713M Total, 1950M MFU, 5343M MRU, 746K Anon, 26M Header, 394M Other


And again the statistics you requested:


root@varnish:/tmp # sysctl kstat.zfs.misc.arcstats
kstat.zfs.misc.arcstats.demand_hit_predictive_prefetch: 1139
kstat.zfs.misc.arcstats.sync_wait_for_async: 97
kstat.zfs.misc.arcstats.arc_meta_min: 975210240
kstat.zfs.misc.arcstats.arc_meta_max: 728642168
kstat.zfs.misc.arcstats.arc_meta_limit: 3900840960
kstat.zfs.misc.arcstats.arc_meta_used: 717771648
kstat.zfs.misc.arcstats.duplicate_reads: 2495
kstat.zfs.misc.arcstats.duplicate_buffers_size: 0
kstat.zfs.misc.arcstats.duplicate_buffers: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
kstat.zfs.misc.arcstats.l2_write_pios: 0
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
kstat.zfs.misc.arcstats.l2_write_full: 0
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 1
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
kstat.zfs.misc.arcstats.l2_write_in_l2: 0
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
kstat.zfs.misc.arcstats.l2_padding_needed: 0
kstat.zfs.misc.arcstats.l2_compress_failures: 0
kstat.zfs.misc.arcstats.l2_compress_zeros: 0
kstat.zfs.misc.arcstats.l2_compress_successes: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.l2_asize: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cdata_free_on_write: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_evict_l1cached: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_writes_lock_retry: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_write_bytes: 0
kstat.zfs.misc.arcstats.l2_read_bytes: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.mfu_ghost_evictable_metadata: 0
kstat.zfs.misc.arcstats.mfu_ghost_evictable_data: 0
kstat.zfs.misc.arcstats.mfu_ghost_size: 0
kstat.zfs.misc.arcstats.mfu_evictable_metadata: 12150272
kstat.zfs.misc.arcstats.mfu_evictable_data: 65606144
kstat.zfs.misc.arcstats.mfu_size: 78327808
kstat.zfs.misc.arcstats.mru_ghost_evictable_metadata: 0
kstat.zfs.misc.arcstats.mru_ghost_evictable_data: 0
kstat.zfs.misc.arcstats.mru_ghost_size: 0
kstat.zfs.misc.arcstats.mru_evictable_metadata: 94340096
kstat.zfs.misc.arcstats.mru_evictable_data: 5329829888
kstat.zfs.misc.arcstats.mru_size: 5601754112
kstat.zfs.misc.arcstats.anon_evictable_metadata: 0
kstat.zfs.misc.arcstats.anon_evictable_data: 0
kstat.zfs.misc.arcstats.anon_size: 939520
kstat.zfs.misc.arcstats.other_size: 409772864
kstat.zfs.misc.arcstats.metadata_size: 284930560
kstat.zfs.misc.arcstats.data_size: 5396353536
kstat.zfs.misc.arcstats.hdr_size: 23068224
kstat.zfs.misc.arcstats.size: 6114125184
kstat.zfs.misc.arcstats.c_max: 15603363840
kstat.zfs.misc.arcstats.c_min: 1950420480
kstat.zfs.misc.arcstats.c: 15603363840
kstat.zfs.misc.arcstats.p: 7801681920
kstat.zfs.misc.arcstats.hash_chain_max: 3
kstat.zfs.misc.arcstats.hash_chains: 1313
kstat.zfs.misc.arcstats.hash_collisions: 159594
kstat.zfs.misc.arcstats.hash_elements_max: 95175
kstat.zfs.misc.arcstats.hash_elements: 80080
kstat.zfs.misc.arcstats.evict_l2_skip: 0
kstat.zfs.misc.arcstats.evict_l2_ineligible: 2048
kstat.zfs.misc.arcstats.evict_l2_eligible: 67584
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_not_enough: 0
kstat.zfs.misc.arcstats.evict_skip: 1
kstat.zfs.misc.arcstats.mutex_miss: 0
kstat.zfs.misc.arcstats.deleted: 8
kstat.zfs.misc.arcstats.allocated: 10092961
kstat.zfs.misc.arcstats.mfu_ghost_hits: 0
kstat.zfs.misc.arcstats.mfu_hits: 6988925
kstat.zfs.misc.arcstats.mru_ghost_hits: 0
kstat.zfs.misc.arcstats.mru_hits: 1778668
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 9879
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 13990
kstat.zfs.misc.arcstats.prefetch_data_misses: 89
kstat.zfs.misc.arcstats.prefetch_data_hits: 1
kstat.zfs.misc.arcstats.demand_metadata_misses: 3994929
kstat.zfs.misc.arcstats.demand_metadata_hits: 7237192
kstat.zfs.misc.arcstats.demand_data_misses: 124103
kstat.zfs.misc.arcstats.demand_data_hits: 1530399
kstat.zfs.misc.arcstats.misses: 4129000
kstat.zfs.misc.arcstats.hits: 8781582
Comment 9 Andriy Gapon freebsd_committer freebsd_triage 2017-02-08 11:12:29 UTC
(In reply to k_georgiev from comment #7)
I haven't done any calculations, but it's quite possible that you don't have a problem at all.  You are comparing the size of the data on disk, where some of it is rather well compressed, with the size of the data in the ARC, where the data is uncompressed (and since compressed ARC support was added, a cached buffer can be compressed, uncompressed, or both).

Check what zfs get logicalused zroot gives you.
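For example, the logical (uncompressed) size, the on-disk size and the current ARC size can be compared side by side with something like this (just a sketch, the exact figures will vary; kstat.zfs.misc.arcstats.size is the total ARC size in bytes):

root@varnish:~ # zfs get used,logicalused,compressratio zroot
root@varnish:~ # sysctl -n kstat.zfs.misc.arcstats.size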
Comment 10 k_georgiev 2017-02-08 11:20:02 UTC
root@varnish:/tmp # zfs get logicalused zroot 
NAME   PROPERTY     VALUE  SOURCE
zroot  logicalused  9.52G  -
Comment 11 k_georgiev 2017-02-08 13:42:59 UTC
Hmm,

As I look at the compression ratios, none is more than about 3x, except for /var/log where it's over 6x.

I'm not sure if this "x" is a multiplier or a percentage. If it's a percentage then it's very low and almost no data is compressed.
If it's a multiplier then it may well be that everything is fine; however, this is quite a fresh FreeBSD install with only Varnish and a few more packages installed, which can't amount to more than a few megabytes. No user data or databases are stored there. I can't believe that a fresh install takes close to 10G.
Comment 12 Andriy Gapon freebsd_committer freebsd_triage 2017-03-07 09:08:21 UTC
(In reply to k_georgiev from comment #11)
It's a ratio, not a percentage.
As hard as it is for you to believe that your system uses that much space, all the stats indicate that it actually does: the pool reports 9.52G of logical (uncompressed) data, which compresses down to under 3G on disk at the roughly 3x ratios you listed.  You can use du, df, zfs get space, etc. to drill down into where the space is used.
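For example (only a sketch, the directories are illustrative):

root@varnish:~ # zfs list -r -o space zroot
root@varnish:~ # df -h
root@varnish:~ # du -xsh /usr /var /tmp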
I am closing this bug report now.