I'm using FreeBSD-11.1R amd64 with a ZFS root filesystem on my 8 GB RAM PC. I think the ZFS arcstats miss count shows the wrong value. For example, I checked out the FreeBSD ports tree from the svn repository:
```
$ svn co svn://svn.freebsd.org/ports/head ports-head
```
and then executed the following Ruby script to stat(2) all files:
```
Dir.glob("ports-head/**/*").each do |file|
  File.stat(file)
end
```
After running it 3-5 times, I expect all metadata to be cached in the ARC. But `zfs-mon -a` shows the following:
```
ZFS real-time cache activity monitor
Seconds elapsed: 9

Cache hits and misses:
                                  1s   10s   60s   tot
ARC hits:                          0     0     0     0
ARC misses:                    28294  7716  7716  7716
ARC demand data hits:              0     0     0     0
ARC demand data misses:           28     7     7     7
ARC demand metadata hits:          0     0     0     0
ARC demand metadata misses:    28267  7709  7709  7709
ARC prefetch data hits:            0     0     0     0
ARC prefetch data misses:          0     0     0     0
ARC prefetch metadata hits:        0     0     0     0
ARC prefetch metadata misses:      0     0     0     0
ZFETCH hits:                      25     7     7     7
ZFETCH misses:                 51529 14062 14062 14062

Cache efficiency percentage:
                                10s   60s   tot
ARC:                           0.00  0.00  0.00
ARC demand data:               0.00  0.00  0.00
ARC demand metadata:           0.00  0.00  0.00
ARC prefetch data:             0.00  0.00  0.00
ARC prefetch metadata:         0.00  0.00  0.00
ZFETCH:                        0.05  0.05  0.05
```
I think `ARC demand metadata misses` is showing the hit count.
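For reference, the same counters can also be sampled directly from sysctl around a single pass of the script. This is only a minimal sketch (not part of the original report), assuming the standard FreeBSD sysctl names kstat.zfs.misc.arcstats.hits and kstat.zfs.misc.arcstats.misses and the "ports-head" checkout from above:
```
# Minimal sketch: sample the raw ARC hit/miss counters before and after
# one stat(2) pass over the checkout. Assumes the standard FreeBSD
# sysctls kstat.zfs.misc.arcstats.hits / .misses.
def arcstat(name)
  Integer(`sysctl -n kstat.zfs.misc.arcstats.#{name}`.strip)
end

before = { hits: arcstat("hits"), misses: arcstat("misses") }

Dir.glob("ports-head/**/*").each { |file| File.stat(file) }

after = { hits: arcstat("hits"), misses: arcstat("misses") }

puts "ARC hits delta:   #{after[:hits] - before[:hits]}"
puts "ARC misses delta: #{after[:misses] - before[:misses]}"
```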
As discussed at BSDTW, please do the following:
```
zfs-stats -a
<your script>
zfs-stats -a
```
OK, I will do it when I return to Japan, which will be 2 days from now. I'm traveling in Taiwan at the moment.
I changed my script as follows. The File.stat() loop now runs 5 times, and the output of `zfs-stats -a` is printed at the start and at the end of each loop.
```
puts "#{Time.now} start"
puts `zfs-stats -a`
5.times do |c|
  Dir.glob("ports-head/**/*").each do |file|
    File.stat(file)
  end
  puts "#{Time.now} loop count : #{c}"
  puts `zfs-stats -a`
end
```
The result is shown in the attached 'zfs-stat.log' file. The attached 'top.log' shows the `top` header from before and after running this script.
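As a small variation (a sketch only, not what was actually run for the attached logs), the zfs-stats output can be filtered down to its hit/miss summary lines so that the per-loop change in the Cache Hit Ratio is easier to see. This assumes zfs-stats from the sysutils/zfs-stats port is installed:
```
# Hedged sketch: keep only the summary lines mentioning hits, misses or
# the hit ratio, instead of printing the full zfs-stats report each loop.
5.times do |c|
  Dir.glob("ports-head/**/*").each { |file| File.stat(file) }
  summary = `zfs-stats -a`.lines.grep(/hit ratio|hits|misses/i)
  puts "#{Time.now} loop count : #{c}"
  puts summary.join
end
```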
Created attachment 188004 [details] zfs-stat.log
Created attachment 188005 [details] top.log
(In reply to naito.yuichiro from comment #4) This log shows an interesting pattern: the 'Cache Hit Ratio' goes up during the first run (reaching 58.16k total hits, up from 8.73k), but does not increase on the successive runs. I will investigate this further.
I have confirmed the behaviour you are seeing, but also confirmed that it is not a major bug. If you run:
```
zpool iostat 1
```
you'll see that running your Ruby script does not actually result in any reads from the disk.

There is a small issue with the stats accounting in ZFS: if the metadata being read happens to be stored in an "Embedded Block Pointer" (so instead of being stored as a data block, the data is embedded in the parent block, to save an entire sector, and to save an I/O to read that sector), then it is incorrectly counted as a miss. This is because, to read the embedded block pointer, ZFS has to create a read zio and go through most of the process of doing a read, but then ends up copying the data out of the block pointer instead of reading it from disk.

Anyway, I am investigating a quick fix to account for this as a cache hit correctly, instead of as a cache miss. I am also looking to see whether it would be relatively simple to optimize this case and return the data more directly in arc_read(), instead of creating a zio and going through the current, more complicated path. This path mostly exists because it makes it possible for other functions to not need to know about the embedded block pointer feature.
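(Not part of the planned fix, just an illustration: whether a pool can store small blocks in embedded block pointers at all depends on the embedded_data feature flag, which can be checked as sketched below. The pool name "zroot" is only a placeholder.)
```
# Hedged sketch: check whether the embedded_data pool feature is active,
# i.e. whether small blocks may be stored inside the block pointer itself.
# "zroot" is a placeholder; substitute the name of your root pool.
pool = "zroot"
puts `zpool get feature@embedded_data #{pool}`
```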
(In reply to Allan Jude from comment #7) Thank you very much for your investigation. My Ruby script doesn't actually read from the disks, which I took to mean that all metadata is cached. I saw no performance problem.

> Anyway, I am investigating a quick fixes to account for the cache hit
> correctly, instead of as a cache miss.

I look forward to your fix.

> I am also looking to see if it would be relatively simple to optimize the case
> and return the data more directly in arc_read() instead of creating a zio and
> going the currently more complicated path. This path mostly exists because it makes
> it possible for other functions to not need to know about the embedded block
> pointer feature.

I'm not familiar with the ZFS source code, but if you need to test your patch in my environment, please ask me. I will test it in a new boot environment on my PC. I learned about that feature in your session at BSDTW!
Created attachment 192787 [details] zfs-stat-resolved.log
I have confirmed that r332523 fixed this problem. After upgrading to r332641, I ran the script from comment #3 again. The output is attached as 'zfs-stat-resolved.log'. The Cache Hit Ratio now increases on each loop, as I expected. I will close this PR. Thank you!