I have a server running FreeBSD 10.1-RELEASE (GENERIC kernel, amd64) with two ZFS pools. The server has two Intel 480 GB SSDs, used as ZIL (a 4 GB mirror per pool) and L2ARC (a 75+75 GB stripe per pool). Compression is enabled on some ZFS datasets. After a few days of uptime I noticed wrong L2ARC alloc and free sizes for pool1 in zpool iostat -v, and later I saw the same wrong sizes for pool2. It looks like this:

                                         capacity     operations    bandwidth
pool                                    alloc   free   read  write   read  write
------------------------------------    -----  -----  -----  -----  -----  -----
pool1                                   13,0T  34,3T     45  3,56K  3,93M  51,9M
  raidz3                                13,0T  34,3T     45  3,51K  3,93M  47,1M
    multipath/pd01                          -      -     31     97   311K  4,96M
    multipath/pd02                          -      -     31     97   311K  4,96M
    multipath/pd03                          -      -     31     97   311K  4,96M
    multipath/pd04                          -      -     31     97   311K  4,96M
    multipath/pd05                          -      -     31     97   311K  4,96M
    multipath/pd06                          -      -     31     97   311K  4,96M
    multipath/pd07                          -      -     31     97   311K  4,96M
    multipath/pd08                          -      -     31     97   311K  4,96M
    multipath/pd09                          -      -     31     97   311K  4,96M
    multipath/pd10                          -      -     31     97   311K  4,96M
    multipath/pd11                          -      -     31     97   311K  4,96M
    multipath/pd12                          -      -     31     97   311K  4,96M
    multipath/pd13                          -      -     31     97   311K  4,96M
logs                                        -      -      -      -      -      -
  mirror                                 812K  3,97G      0     45      0  4,83M
    diskid/DISK-CVWL435200Y1480QGNp1        -      -      0     45      4  4,83M
    diskid/DISK-CVWL4353000F480QGNp1        -      -      0     45      4  4,83M
cache                                       -      -      -      -      -      -
  diskid/DISK-CVWL435200Y1480QGNp4       371G  16,0E      4     27   163K  3,16M
  diskid/DISK-CVWL4353000F480QGNp4       441G  16,0E      8     25   145K  2,94M
------------------------------------    -----  -----  -----  -----  -----  -----
pool2                                   10,2T  37,0T     81  1,36K  9,82M  80,2M
  raidz3                                10,2T  37,0T     81    870  9,82M  45,9M
    multipath/pd14                          -      -     21     82   903K  4,67M
    multipath/pd15                          -      -     21     82   903K  4,67M
    multipath/pd16                          -      -     21     82   903K  4,67M
    multipath/pd17                          -      -     21     82   903K  4,67M
    multipath/pd18                          -      -     21     82   904K  4,67M
    multipath/pd19                          -      -     21     82   903K  4,67M
    multipath/pd20                          -      -     21     82   903K  4,67M
    multipath/pd21                          -      -     21     82   903K  4,67M
    multipath/pd22                          -      -     21     82   903K  4,67M
    multipath/pd23                          -      -     21     82   903K  4,67M
    multipath/pd24                          -      -     21     82   903K  4,67M
    multipath/pd25                          -      -     21     82   903K  4,67M
    multipath/pd26                          -      -     21     82   903K  4,67M
logs                                        -      -      -      -      -      -
  mirror                                 238M  3,74G      0    525      0  34,3M
    diskid/DISK-CVWL435200Y1480QGNp2        -      -      0    525      4  34,3M
    diskid/DISK-CVWL4353000F480QGNp2        -      -      0    525      4  34,3M
cache                                       -      -      -      -      -      -
  diskid/DISK-CVWL435200Y1480QGNp5       207G  16,0E      1     21  45,1K  2,56M
  diskid/DISK-CVWL4353000F480QGNp5       203G  16,0E      2     21  94,6K  2,60M

The cache values “371G 16,0E” are abnormal; the real allocated size is 75 GB. After that I looked at zfs-stats -L and saw a DEGRADED L2ARC and a far too large L2ARC size:

L2 ARC Summary: (DEGRADED)
        Passed Headroom:                6.05m
        Tried Lock Failures:            22.36m
        IO In Progress:                 2.75k
        Low Memory Aborts:              2.86k
        Free on Write:                  5.48m
        Writes While Full:              339.48k
        R/W Clashes:                    2.07k
        Bad Checksums:                  211.52k
        IO Errors:                      101.41k
        SPA Mismatch:                   3.16b

L2 ARC Size: (Adaptive)                 1.27    TiB
        Header Size:            1.42%   18.56   GiB

kstat.zfs.misc.arcstats.l2_io_error: 101531
kstat.zfs.misc.arcstats.l2_cksum_bad: 211782

smartctl shows that both SSDs are fine, without any I/O errors. After a reboot there is no problem for some time. I found the same issue described in connection with L2ARC compression:

http://forums.freebsd.org/threads/l2arc-degraded.47540/
http://lists.freebsd.org/pipermail/freebsd-current/2013-October/045088.html

My problem looks like the same bug.
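For what it's worth, the “16,0E” free value itself looks like plain counter wrap-around: the free column for a cache device is its capacity minus the allocated size, so once the allocated counter runs past the real partition size the unsigned 64-bit subtraction wraps to roughly 16 EiB. A quick way to watch the raw counters directly is the standard arcstats sysctls (l2_asize is the allocated, post-compression size and should never exceed the combined size of the cache partitions):

sysctl kstat.zfs.misc.arcstats.l2_size \
       kstat.zfs.misc.arcstats.l2_asize \
       kstat.zfs.misc.arcstats.l2_hdr_size \
       kstat.zfs.misc.arcstats.l2_cksum_bad \
       kstat.zfs.misc.arcstats.l2_io_error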
I have been discussing this issue here: https://forums.freebsd.org/threads/l2arc-degraded.47540/

Checksum and I/O errors appear after an L2ARC device fills completely with cache data on any release of FreeBSD after L2ARC compression (http://wiki.illumos.org/display/illumos/L2ARC+Compression) was enabled (9.3-RELEASE and later, and 10.0-RELEASE and later).

The following is from a 10.1-RELEASE-p5 (svn rev 278678) system running a GENERIC kernel. /etc/make.conf contains only "CPUTYPE?=core2". ZFS pools are v28 (created under 9.2-RELEASE) and have NOT been updated for feature flags.

"zfs-stats -L" shows the L2ARC "DEGRADED" with numerous I/O and checksum errors:

root@cadence:/ # zfs-stats -L

------------------------------------------------------------------------
ZFS Subsystem Report                            Fri Feb 13 10:36:49 2015
------------------------------------------------------------------------

L2 ARC Summary: (DEGRADED)
        Passed Headroom:                30.28m
        Tried Lock Failures:            24.83m
        IO In Progress:                 247
        Low Memory Aborts:              103
        Free on Write:                  54.97k
        Writes While Full:              10.62k
        R/W Clashes:                    562
        Bad Checksums:                  1.29m
        IO Errors:                      128.28k
        SPA Mismatch:                   48.53b

L2 ARC Size: (Adaptive)                 33.51   GiB
        Header Size:            2.24%   768.85  MiB

L2 ARC Evicts:
        Lock Retries:                   18
        Upon Reading:                   0

L2 ARC Breakdown:                       35.47m
        Hit Ratio:              26.64%  9.45m
        Miss Ratio:             73.36%  26.02m
        Feeds:                          568.79k

L2 ARC Buffer:
        Bytes Scanned:                  530.43  TiB
        Buffer Iterations:              568.79k
        List Iterations:                36.17m
        NULL List Iterations:           974.71k

L2 ARC Writes:
        Writes Sent:            100.00% 136.10k

------------------------------------------------------------------------

Kernel variables showing that compression is working on the L2ARC, and that there are I/O and checksum errors:

root@cadence:/ # sysctl kstat.zfs.misc.arcstats.l2_compress_successes kstat.zfs.misc.arcstats.l2_compress_zeros kstat.zfs.misc.arcstats.l2_compress_failures kstat.zfs.misc.arcstats.l2_cksum_bad kstat.zfs.misc.arcstats.l2_io_error
kstat.zfs.misc.arcstats.l2_compress_successes: 1353514
kstat.zfs.misc.arcstats.l2_compress_zeros: 29
kstat.zfs.misc.arcstats.l2_compress_failures: 4985
kstat.zfs.misc.arcstats.l2_cksum_bad: 1290021
kstat.zfs.misc.arcstats.l2_io_error: 128275

Slight variations of this problem have been reported in numerous instances on FreeBSD, FreeNAS and PC-BSD related forums and mailing lists, and are usually dismissed as a hardware problem:

https://bugs.freenas.org/issues/5347
https://forums.freenas.org/index.php?threads/l2-arc-summary-degraded.19256/
https://bugs.pcbsd.org/issues/3418
http://svnweb.freebsd.org/base?view=revision&sortby=file&revision=256889
http://lists.freebsd.org/pipermail/freebsd-current/2013-October/045088.html
http://lists.freebsd.org/pipermail/freebsd-bugs/2014-November/059261.html
http://lists.freebsd.org/pipermail/freebsd-bugs/2014-December/059376.html
http://lists.freebsd.org/pipermail/freebsd-fs/2014-October/020242.html

However, I've been able to easily duplicate this problem on two different sets of high-quality, reliable hardware (Dell PowerEdge, Intel SSD) that otherwise tests perfectly. To duplicate it, simply create a zfs pool with a small L2ARC device and exercise the pool with random I/O until the L2ARC fills.
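Concretely, a minimal reproduction sketch along those lines (the pool/device names and the fio job parameters are placeholders for whatever spare disk and small SSD partition you have; fio is available from ports):

# throwaway pool with a deliberately small cache device
zpool create testpool da1 cache da2p1

# random-read load with a working set larger than RAM, so reads spill into the L2ARC
fio --name=l2arcfill --directory=/testpool --size=32g \
    --rw=randread --bs=8k --numjobs=4 --time_based --runtime=7200

# watch the error counters once the cache device has filled
sysctl kstat.zfs.misc.arcstats.l2_cksum_bad kstat.zfs.misc.arcstats.l2_io_error

Once the L2ARC device fills, the two counters above start climbing and zfs-stats -L flips to DEGRADED, as shown above.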
I have exactly the same problem with 2x Intel DC S3500 Series 600GB:

------------------------------------------------------------------------
ZFS Subsystem Report                            Mon Apr 13 10:56:52 2015
------------------------------------------------------------------------

L2 ARC Summary: (DEGRADED)
        Passed Headroom:                69.29m
        Tried Lock Failures:            316.43k
        IO In Progress:                 86
        Low Memory Aborts:              648
        Free on Write:                  1.01m
        Writes While Full:              202.57k
        R/W Clashes:                    1.32k
        Bad Checksums:                  17.34m
        IO Errors:                      3.23m
        SPA Mismatch:                   49.06m

L2 ARC Size: (Adaptive)                 3.21    TiB
        Header Size:            0.19%   6.21    GiB

L2 ARC Evicts:
        Lock Retries:                   146
        Upon Reading:                   0

L2 ARC Breakdown:                       103.52m
        Hit Ratio:              34.77%  35.99m
        Miss Ratio:             65.23%  67.53m
        Feeds:                          1.28m

L2 ARC Buffer:
        Bytes Scanned:                  116.31  TiB
        Buffer Iterations:              1.28m
        List Iterations:                78.60m
        NULL List Iterations:           2.68m

L2 ARC Writes:
        Writes Sent:            100.00% 620.76k

------------------------------------------------------------------------

        NAME                   STATE     READ WRITE CKSUM
        storage                ONLINE       0     0     0
          raidz2-0             ONLINE       0     0     0
            multipath/disk1    ONLINE       0     0     0
            multipath/disk2    ONLINE       0     0     0
            multipath/disk25   ONLINE       0     0     0
            multipath/disk4    ONLINE       0     0     0
            multipath/disk5    ONLINE       0     0     0
            multipath/disk6    ONLINE       0     0     0
          raidz2-1             ONLINE       0     0     0
            multipath/disk7    ONLINE       0     0     0
            multipath/disk8    ONLINE       0     0     0
            multipath/disk9    ONLINE       0     0     0
            multipath/disk26   ONLINE       0     0     0
            multipath/disk11   ONLINE       0     0     0
            multipath/disk12   ONLINE       0     0     0
          raidz2-2             ONLINE       0     0     0
            multipath/disk13   ONLINE       0     0     0
            multipath/disk14   ONLINE       0     0     0
            multipath/disk15   ONLINE       0     0     0
            multipath/disk16   ONLINE       0     0     0
            multipath/disk17   ONLINE       0     0     0
            multipath/disk18   ONLINE       0     0     0
          raidz2-3             ONLINE       0     0     0
            multipath/disk19   ONLINE       0     0     0
            multipath/disk20   ONLINE       0     0     0
            multipath/disk21   ONLINE       0     0     0
            multipath/disk22   ONLINE       0     0     0
            multipath/disk23   ONLINE       0     0     0
            multipath/disk24   ONLINE       0     0     0
        logs
          mirror-4             ONLINE       0     0     0
            gpt/zil0           ONLINE       0     0     0
            gpt/zil1           ONLINE       0     0     0
        cache
          gpt/cache0           ONLINE       0     0     0
          gpt/cache1           ONLINE       0     0     0
        spares
          multipath/disk3      AVAIL
          multipath/disk27     AVAIL
          multipath/disk28     AVAIL
          multipath/disk10     AVAIL
A commit references this bug:

Author: avg
Date: Mon Aug 24 08:10:53 UTC 2015
New revision: 287099
URL: https://svnweb.freebsd.org/changeset/base/287099

Log:
  account for ashift when gathering buffers to be written to l2arc device

  The change that introduced the L2ARC compression support also introduced
  a bug where the on-disk size of the selected buffers could end up larger
  than the target size if the ashift is greater than 9. This was because
  the buffer selection did not take into account the fact that the on-disk
  size could be larger than the in-memory buffer size due to the alignment
  requirements.

  At the moment b_asize is a misnomer as it does not always represent the
  allocated size: if a buffer is compressed, then the compressed size is
  properly rounded (on FreeBSD), but if the compression fails or it is not
  applied, then the original size is kept and it could be smaller than what
  ashift requires.

  For the same reasons arcstat_l2_asize and the reported used space on the
  cache device could be smaller than the actual allocated size if
  ashift > 9. That problem is not fixed by this change.

  This change only ensures that l2ad_hand is not advanced by more than
  target_sz. Otherwise we would overwrite active (unevicted) L2ARC buffers.
  That problem is manifested as growing l2_cksum_bad and l2_io_error
  counters.

  This change also changes the 'p' prefix to an 'a' prefix in a few places
  where variables represent allocated rather than physical size.

  The resolved problem could also result in the reported allocated size
  being greater than the cache device's capacity, because of the
  overwritten buffers (more than one buffer claiming the same disk space).

  This change is already in ZFS-on-Linux:
  zfsonlinux/zfs@ef56b0780c80ebb0b1e637b8b8c79530a8ab3201

  PR:           198242
  PR:           195746 (possibly related)
  Reviewed by:  mahrens (https://reviews.csiden.org/r/229/)
  Tested by:    gkontos@aicom.gr (most recently)
  MFC after:    15 days
  X-MFC note:   patch does not apply as is at the moment
  Relnotes:     yes
  Sponsored by: ClusterHQ
  Differential Revision: https://reviews.freebsd.org/D2764
  Reviewed by:  noone (@FreeBSD.org)

Changes:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
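To put the ashift point in concrete terms: on a cache device with 4 KiB sectors (ashift=12), a compressed buffer of, say, 1.5 KiB still occupies a full 4 KiB sector on disk, so summing the in-memory buffer sizes under-counts what a write pass really consumes; the write then runs past the region that was evicted for it and clobbers still-valid L2ARC buffers, which is exactly what the growing bad-checksum and I/O-error counters show. A back-of-the-envelope illustration (my own arithmetic, not code from arc.c; the device name is a placeholder):

# a 1536-byte compressed buffer rounded up to the 4096-byte allocation unit
echo $(( ((1536 + 4095) / 4096) * 4096 ))   # prints 4096

# the sector/stripe size reported here is what ZFS normally bases the vdev's ashift on
diskinfo -v /dev/ada1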
Should be fixed now.