| Summary: | [zfs] [panic] ZFS Panic/Solaris Assert/zap.c:479 |
| --- | --- |
| Product: | Base System |
| Reporter: | Larry Rosenman <ler> |
| Component: | kern |
| Assignee: | freebsd-fs (Nobody) <fs> |
| Status: | Closed Overcome By Events |
| Severity: | Affects Only Me |
| CC: | peter |
| Priority: | Normal |
| Version: | CURRENT |
| Hardware: | Any |
| OS: | Any |
| Attachments: | 146572: Latest panic core.txt |
Description
Larry Rosenman
Responsible Changed From-To: freebsd-bugs->freebsd-fs
Over to maintainer(s).

I've wound up commenting out a bunch of these ASSERTs, and haven't seen any negative consequences, HOWEVER, I'd like someone to let me know how I could look at the FS(s)/POOL and see if there is a real issue.

I'm STILL seeing these, and have commented out:

```
===================================================================
--- zap.c	(revision 270765)
+++ zap.c	(working copy)
@@ -476,7 +476,9 @@
 	 * chain.  There should be no chained leafs (as we have removed
 	 * support for them).
 	 */
+#if 0 /*LER: to see what else blows up */
 	ASSERT0(l->l_phys->l_hdr.lh_pad1);
+#endif
 
 	/*
 	 * There should be more hash entries than there can be
@@ -531,7 +533,9 @@
 	ASSERT3U(l->l_blkid, ==, blkid);
 	ASSERT3P(l->l_dbuf, ==, db);
 	ASSERT3P(l->l_phys, ==, l->l_dbuf->db_data);
+#if 0 /* LER */
 	ASSERT3U(l->l_phys->l_hdr.lh_block_type, ==, ZBT_LEAF);
+#endif
 	ASSERT3U(l->l_phys->l_hdr.lh_magic, ==, ZAP_LEAF_MAGIC);
 
 	*lp = l;

borg.lerctr.org /usr/src.old/sys/cddl/contrib/opensolaris/uts/common/fs/zfs $
```

Can I PLEASE get someone to look at these, and tell me what I need to do to fix the on-disk image?

Created attachment 146572:
Latest panic core.txt. I **DO** have the vmcore available
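For anyone with the same question ("is there a real on-disk issue?"), a minimal sketch of the checks that could be run, assuming the pool is the `zroot` pool shown later in this report; exact `zdb` behavior and output vary between FreeBSD versions, so treat this as illustrative rather than authoritative:

```sh
# Scrub the pool and then look for read/write/checksum errors.
# (A scrub repairs what it can; on this particular pool it ends up
# tripping the same assertions, as noted further down.)
zpool scrub zroot
zpool status -v zroot

# Print dedup table (DDT) statistics -- roughly what "zpool status -D"
# shows, but with more detail.
zdb -DD zroot

# Traverse the pool's metadata and verify its checksums.  This can take
# a long time on a large pool and is best run while the pool is quiet.
zdb -c zroot
```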
If you look at the whole comment for one of the assertions you disabled:

```
	/*
	 * lhr_pad was previously used for the next leaf in the leaf
	 * chain.  There should be no chained leafs (as we have removed
	 * support for them).
	 */
	ASSERT0(l->l_phys->l_hdr.lh_pad1);
```

Or looking at the annotation:

```
168404    pjd 	/*
168404    pjd 	 * lhr_pad was previously used for the next leaf in the leaf
168404    pjd 	 * chain.  There should be no chained leafs (as we have removed
168404    pjd 	 * support for them).
168404    pjd 	 */
240415     mm 	ASSERT0(l->l_phys->l_hdr.lh_pad1);
```

One of those is the initial commit:

```
------------------------------------------------------------------------
r168404 | pjd | 2007-04-05 18:09:06 -0700 (Thu, 05 Apr 2007) | 11 lines

Please welcome ZFS - The last word in file systems.

ZFS file system was ported from OpenSolaris operating system. The code
in under CDDL license.
------------------------------------------------------------------------
```

The second is:

```
------------------------------------------------------------------------
r240415 | mm | 2012-09-12 11:05:43 -0700 (Wed, 12 Sep 2012) | 21 lines

Merge recent zfs vendor changes, sync code and adjust userland DEBUG.
------------------------------------------------------------------------
```

The change is just a syntax one:

```
-	ASSERT3U(l->l_phys->l_hdr.lh_pad1, ==, 0);
+	ASSERT0(l->l_phys->l_hdr.lh_pad1);
```

So, ever since ZFS existed in FreeBSD, that field should be zero on disk. You are tripping those asserts because that isn't the case on your machine. This suggests to me that you've either got a very old file system that pre-dates FreeBSD, or you've got some evil on-disk corruption. I suspect the latter.

If this were my machine, I'd be contemplating a backup/restore. I see from the disk sizes you may have a lot of data, but that's what I would be doing.

I also see the stack traces come from the ZFS de-dupe code. There are a great number of people who have a very dim view of that code; even the authors apologized for it. If it were my machine, I would be turning off dedup immediately, if not sooner.

This pool has been around a LONG time -- I'm wondering where I can borrow a terabyte of disk to backup what I care about and rebuild it one more time... (This pool started on FreeBSD 8 (IIRC) on 400G disk, and got upgraded....)

Out of curiosity, what does "zpool status -D poolname" show?

http://serverfault.com/questions/533877/how-large-is-my-zfs-dedupe-table-at-the-moment

You might be burning a significant chunk of ARC metadata space to hold the dedupe table in memory.

If I had to guess at this point, I would guess that there are a couple of corrupt records in the on-disk dedup table for one or more of your data sets. Since this is an old pool, there have been plenty of opportunities over the years for this to accumulate while ZFS support matured.

From your comment, it sounds like you have 1TB of data spread over a pool made of 2TB disks? Is that deduplicated? Is one of the drives a spare? (If so, can you take the spare offline and make a pool on it to facilitate a dump/restore?)
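A hedged sketch of the commands being suggested here, using the pool name `zroot` from this report; note that `zfs set dedup=off` only affects newly written blocks, so it does not rewrite or repair existing DDT entries:

```sh
# Dedup table (DDT) statistics for the pool: entry count plus the
# per-entry on-disk and in-core sizes used to estimate the RAM cost.
zpool status -D zroot

# See which datasets currently have dedup enabled...
zfs get -r dedup zroot

# ...and switch it off at the pool root (child datasets inherit it
# unless they have a local setting).
zfs set dedup=off zroot
```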
```
borg.lerctr.org /home/ler $ zpool status -D zroot
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 5h8m with 0 errors on Sat Aug 30 13:53:49 2014
config:

	NAME                                            STATE     READ WRITE CKSUM
	zroot                                           ONLINE       0     0     0
	  raidz1-0                                      ONLINE       0     0     0
	    gptid/71e84d97-043a-11e2-8cd6-003048f2299c  ONLINE       0     0     0
	    gptid/3540b18c-0db2-11e2-8ef2-003048f2299c  ONLINE       0     0     0
	    gptid/b0a645ed-0db9-11e2-8ef2-003048f2299c  ONLINE       0     0     0
	    gptid/dc133fc1-0dc0-11e2-8ef2-003048f2299c  ONLINE       0     0     0
	    gptid/21bef448-0dc8-11e2-8ef2-003048f2299c  ONLINE       0     0     0
	    gptid/46033e1c-0432-11e2-8cd6-003048f2299c  ONLINE       0     0     0

errors: No known data errors

 dedup: DDT entries 1415219, size 2676 on disk, 540 in core

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    1.26M    116G   66.3G   69.2G    1.26M    116G   66.3G   69.2G
     2    86.1K   6.84G   6.04G   6.20G     174K   13.8G   12.1G   12.4G
     4    3.15K    253M    243M    249M    14.6K   1.00G    985M   1019M
     8      196   2.32M    823K   1.78M    1.96K   24.5M   8.42M   18.4M
    16      109   1.23M   1024K   1.52M    2.50K   31.9M   26.2M   38.2M
    32       33    330K    318K    498K    1.37K   18.1M   17.5M   24.9M
    64        3   1.50K   1.50K   19.2K      249    124K    124K   1.55M
   128        3     28K     28K   38.3K      470   4.96M   4.96M   6.40M
   256        4    162K    162K    173K    1.33K   64.3M   64.3M   67.6M
 Total    1.35M    123G   72.6G   75.7G    1.45M    131G   79.5G   82.8G

borg.lerctr.org /home/ler $
```

I'm actually using 500G on a 6-disk raid-Z1 pool of 2TB disks -- that's the approximate "logical used". I went out and bought a 4TB NAS to use for this, and then for other things.

"dedup: DDT entries 1415219, size 2676 on disk, 540 in core" -> about 728MB of RAM to cache this (1,415,219 entries × 540 bytes in core ≈ 728 MiB). Fortunately that's a relatively small portion of your system RAM. I would still be doing a "zfs set dedup=off" anywhere it's on. While it won't fix the on-disk dedup corruption, at least you won't risk adding to it any more.

Your system is crashing trying to parse the on-disk dedup records; that's what the *_ddt_* functions are about in the traces. I can't imagine this ending well. Does a scrub find the problem, or does the scrub trigger the assertions?

The scrub triggers the assertions..... Currently rsyncing the volatile stuff to the new 4TB NAS mounted via NFS, then I will blow this system up and rebuild it withOUT dedupe....

Rebuilding ports -- no more assertions on a non-modified kernel. Thanks to Peter Wemm for the input on dedup. I think we can close this one out.

Looks like an OLD dedup corruption/mis-feature -- rebuilding the entire system from scratch cleaned it up.
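For completeness, a rough sketch of the dump-and-rebuild route taken here (copy the data off to the NFS-mounted NAS, then recreate the system without dedup). The hostname, export path, and directories below are made up for illustration; only the general shape comes from this report:

```sh
# Mount the NAS export and copy off the data that matters
# (hypothetical names -- substitute the real NAS export and paths).
mkdir -p /mnt/nas
mount -t nfs nas.example.org:/export/backup /mnt/nas
rsync -aHv /home/ /mnt/nas/home/

# After reinstalling and recreating the pool, leave dedup off from the
# start so the DDT is never populated again.
zfs set dedup=off zroot
```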