Bug 216586

Summary: zfs panic: sa.sa_magic == 0x2F505A in zfs_space_delta_cb()
Product: Base System
Component: kern
Reporter: jo
Assignee: freebsd-fs (Nobody) <fs>
Status: Open
Severity: Affects Only Me
Priority: ---
Keywords: crash
CC: ae, avg, dpetrov67, grahamperrin
Version: 10.3-STABLE
Hardware: amd64
OS: Any
See Also: https://github.com/openzfs/zfs/issues/2025
https://github.com/openzfs/zfs/issues/6332
https://github.com/openzfsonosx/openzfs/issues/5

Description jo 2017-01-29 22:57:42 UTC
While rsync'ing some files, the system eventually panics with:

> panic: solaris assert: sa.sa_magic == 0x2F505A (0x51786a88 == 0x2f505a), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c, line: 610

The backtrace is:

> #0 0xffffffff804f23c0 at kdb_backtrace+0x60
> #1 0xffffffff804b3824 at vpanic+0x124
> #2 0xffffffff804b36f3 at panic+0x43
> #3 0xffffffff8109122f at assfail3+0x2f
> #4 0xffffffff80e4d417 at zfs_space_delta_cb+0x107
> #5 0xffffffff80dc156f at dmu_objset_userquota_get_ids+0x36f
> #6 0xffffffff80dc952f at dnode_sync+0x9f
> #7 0xffffffff80dc0aeb at dmu_objset_sync_dnodes+0x2b
> #8 0xffffffff80dc0920 at dmu_objset_sync+0x220
> #9 0xffffffff80dd241a at dsl_dataset_sync+0x1ca
> #10 0xffffffff80ddabca at dsl_pool_sync+0x7a
> #11 0xffffffff80dfa77d at spa_sync+0x57d
> #12 0xffffffff80e03f56 at txg_sync_thread+0x3a6
> #13 0xffffffff8048037a at fork_exit+0x9a
> #14 0xffffffff806ccfde at fork_trampoline+0xe

This happens on:
> FreeBSD fred 10.3-STABLE FreeBSD 10.3-STABLE #1 r308740: Sat Nov 19 21:15:27 GMT 2016 root@fred:/usr/obj/usr/src/sys/FRED amd64

and as well on a newer version:
> FreeBSD fred 10.3-STABLE FreeBSD 10.3-STABLE #0 r312924: Sun Jan 29 01:45:50 GMT 2017     root@fred:/usr/obj/usr/src/sys/FRED  amd64

The machine has ECC RAM, and a zpool scrub did not report any errors. The pool is a two-way mirror.

The actual value of sa.sa_magic changes between panics (though not every time): e.g. 0x51786a88 or 0x5178691e.

The code in question at zfs_vfsops.c, line 610 is:

> static int
> zfs_space_delta_cb(dmu_object_type_t bonustype, void *data,
>     uint64_t *userp, uint64_t *groupp)
> {
> 	/*
> 	 * Is it a valid type of object to track?
> 	 */
> 	if (bonustype != DMU_OT_ZNODE && bonustype != DMU_OT_SA)
> 		return (SET_ERROR(ENOENT));
> 
> 	/*
> 	 * If we have a NULL data pointer
> 	 * then assume the id's aren't changing and
> 	 * return EEXIST to the dmu to let it know to
> 	 * use the same ids
> 	 */
> 	if (data == NULL)
> 		return (SET_ERROR(EEXIST));
> 
> 	if (bonustype == DMU_OT_ZNODE) {
> 		znode_phys_t *znp = data;
> 		*userp = znp->zp_uid;
> 		*groupp = znp->zp_gid;
> 	} else {
> 		int hdrsize;
> 		sa_hdr_phys_t *sap = data;
> 		sa_hdr_phys_t sa = *sap;
> 		boolean_t swap = B_FALSE;
> 
> 		ASSERT(bonustype == DMU_OT_SA);
> 
> 		if (sa.sa_magic == 0) {
> 			/*
> 			 * This should only happen for newly created
> 			 * files that haven't had the znode data filled
> 			 * in yet.
> 			 */
> 			*userp = 0;
> 			*groupp = 0;
> 			return (0);
> 		}
> 		if (sa.sa_magic == BSWAP_32(SA_MAGIC)) {
> 			sa.sa_magic = SA_MAGIC;
> 			sa.sa_layout_info = BSWAP_16(sa.sa_layout_info);
> 			swap = B_TRUE;
> 		} else {
> /*610*/		VERIFY3U(sa.sa_magic, ==, SA_MAGIC);   // FAILS
> 		}
> 
> 		hdrsize = sa_hdrsize(&sa);
> 		VERIFY3U(hdrsize, >=, sizeof (sa_hdr_phys_t));
> 		*userp = *((uint64_t *)((uintptr_t)data + hdrsize +
> 		    SA_UID_OFFSET));
> 		*groupp = *((uint64_t *)((uintptr_t)data + hdrsize +
> 		    SA_GID_OFFSET));
> 		if (swap) {
> 			*userp = BSWAP_64(*userp);
> 			*groupp = BSWAP_64(*groupp);
> 		}
> 	}
> 	return (0);
> }

The stack trace includes dmu_objset_userquota_get_ids, and the target file system (the one rsync is writing into) has a quota enabled:

> NAME                        PROPERTY              VALUE                  SOURCE
> freddata/backup/alexs-imac  type                  filesystem             -
> freddata/backup/alexs-imac  creation              Wed Mar  7 18:01 2012  -
> freddata/backup/alexs-imac  used                  521G                   -
> freddata/backup/alexs-imac  available             29.2G                  -
> freddata/backup/alexs-imac  referenced            213G                   -
> freddata/backup/alexs-imac  compressratio         1.11x                  -
> freddata/backup/alexs-imac  mounted               yes                    -
> freddata/backup/alexs-imac  quota                 550G                   local
> freddata/backup/alexs-imac  reservation           300G                   local
> freddata/backup/alexs-imac  recordsize            128K                   default
> freddata/backup/alexs-imac  mountpoint            /backup/alexs-imac     inherited from freddata/backup
> freddata/backup/alexs-imac  sharenfs              off                    default
> freddata/backup/alexs-imac  checksum              fletcher4              inherited from freddata
> freddata/backup/alexs-imac  compression           lz4                    local
> freddata/backup/alexs-imac  atime                 off                    inherited from freddata/backup
> freddata/backup/alexs-imac  devices               on                     default
> freddata/backup/alexs-imac  exec                  on                     default
> freddata/backup/alexs-imac  setuid                on                     default
> freddata/backup/alexs-imac  readonly              off                    default
> freddata/backup/alexs-imac  jailed                off                    default
> freddata/backup/alexs-imac  snapdir               hidden                 default
> freddata/backup/alexs-imac  aclmode               discard                default
> freddata/backup/alexs-imac  aclinherit            restricted             default
> freddata/backup/alexs-imac  canmount              on                     default
> freddata/backup/alexs-imac  xattr                 off                    temporary
> freddata/backup/alexs-imac  copies                1                      default
> freddata/backup/alexs-imac  version               5                      -
> freddata/backup/alexs-imac  utf8only              off                    -
> freddata/backup/alexs-imac  normalization         none                   -
> freddata/backup/alexs-imac  casesensitivity       sensitive              -
> freddata/backup/alexs-imac  vscan                 off                    default
> freddata/backup/alexs-imac  nbmand                off                    default
> freddata/backup/alexs-imac  sharesmb              off                    default
> freddata/backup/alexs-imac  refquota              none                   default
> freddata/backup/alexs-imac  refreservation        none                   default
> freddata/backup/alexs-imac  primarycache          all                    default
> freddata/backup/alexs-imac  secondarycache        all                    default
> freddata/backup/alexs-imac  usedbysnapshots       308G                   -
> freddata/backup/alexs-imac  usedbydataset         213G                   -
> freddata/backup/alexs-imac  usedbychildren        0                      -
> freddata/backup/alexs-imac  usedbyrefreservation  0                      -
> freddata/backup/alexs-imac  logbias               latency                default
> freddata/backup/alexs-imac  dedup                 off                    default
> freddata/backup/alexs-imac  mlslabel                                     -
> freddata/backup/alexs-imac  sync                  standard               default
> freddata/backup/alexs-imac  refcompressratio      1.06x                  -
> freddata/backup/alexs-imac  written               145G                   -
> freddata/backup/alexs-imac  logicalused           569G                   -
> freddata/backup/alexs-imac  logicalreferenced     221G                   -
> freddata/backup/alexs-imac  volmode               default                default
> freddata/backup/alexs-imac  filesystem_limit      none                   default
> freddata/backup/alexs-imac  snapshot_limit        none                   default
> freddata/backup/alexs-imac  filesystem_count      none                   default
> freddata/backup/alexs-imac  snapshot_count        none                   default
> freddata/backup/alexs-imac  redundant_metadata    all                    default
Comment 1 Andriy Gapon freebsd_committer freebsd_triage 2017-01-30 09:36:28 UTC
Did you do 'zfs upgrade' any time recently?
Comment 2 jo 2017-01-31 00:57:05 UTC
Haven't done any zfs/zpool upgrades in a long time.

zfs version is 5 (for this and all other fs in the pool).

zpool version is:
> NAME      PROPERTY                       VALUE                          SOURCE
> [...snip...]
> freddata  version                        -                              default
> freddata  feature@async_destroy          enabled                        local
> freddata  feature@empty_bpobj            active                         local
> freddata  feature@lz4_compress           active                         local
> freddata  feature@multi_vdev_crash_dump  enabled                        local
> freddata  feature@spacemap_histogram     active                         local
> freddata  feature@enabled_txg            active                         local
> freddata  feature@hole_birth             active                         local
> freddata  feature@extensible_dataset     enabled                        local
> freddata  feature@embedded_data          disabled                       local
> freddata  feature@bookmarks              enabled                        local
> freddata  feature@filesystem_limits      disabled                       local
> freddata  feature@large_blocks           disabled                       local
Comment 3 Andriy Gapon freebsd_committer freebsd_triage 2017-01-31 13:04:48 UTC
Could it be that you have some old files that were created before your last zfs upgrade (even if that was a long time ago)?
Could it be that rsync updates those files?

You might find this issue (closed, but with no associated commit) interesting:
https://github.com/zfsonlinux/zfs/issues/2025

I see that in your case the wrong magic values also look like Unix timestamps, both from 25 April 2013.
Could you please check if you still have more files from that era?

P.S.
Some other similar reports in ZoL repository:
https://github.com/zfsonlinux/zfs/issues/1303
https://github.com/zfsonlinux/zfs/issues/3968
Comment 4 jo 2017-02-03 23:02:17 UTC
Thanks Andriy.

There are indeed lots of (very) old files there; I had an earlier snapshot from a few years ago that references all of them. I deleted all files in the file system and rsync'd everything anew. No panic.

I've also tried rsync both with and without --xattrs (after first rolling back to the old snapshot, with all the old files present). It panics in both cases, but the time until the panic varies a lot, so it looks like a race condition.

The question now is: is there bogus data on disk (old-format data that wasn't properly upgraded on the fly) with a correct checksum, which the current code trips over? Or is the race condition still present when old files are upgraded on the fly?
Comment 5 Andriy Gapon freebsd_committer freebsd_triage 2017-02-04 12:36:05 UTC
(In reply to johannes from comment #4)
I believe that it is the race condition.
https://github.com/zfsonlinux/zfs/issues/2025#issuecomment-40459546
I think that what I wrote there still holds and there hasn't been any fix in this area.
Comment 6 jo 2020-01-11 03:05:21 UTC
Haven't encountered this problem in a long time. Closing it.
Comment 7 Andriy Gapon freebsd_committer freebsd_triage 2020-06-07 09:01:34 UTC
(In reply to johannes from comment #6)
I believe that the problem still exists, but it's very specific (post zfs upgrade) and rare.
Comment 8 Andrey V. Elsukov freebsd_committer freebsd_triage 2021-06-09 12:12:07 UTC
(In reply to Andriy Gapon from comment #7)
We hit this problem twice last week. One machine is based on 12.0, the second on 13.0 (before the ZoL migration). I looked at the ZoL code, and it seems to me that the code hasn't changed here. So I think it is possible to get such a panic on up-to-date OpenZFS too.

I have a core dump from one of the panics. Let me know if you would like to see anything from it.

panic: solaris assert: sa.sa_magic == 0x2F505A (0xb656f0e0 == 0x2f505a), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c, line: 609
cpuid = 4
time = 1623060642
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe3fd9321780
vpanic() at vpanic+0x194/frame 0xfffffe3fd93217e0
panic() at panic+0x43/frame 0xfffffe3fd9321840
assfail3() at assfail3+0x2c/frame 0xfffffe3fd9321860
zfs_space_delta_cb() at zfs_space_delta_cb+0x100/frame 0xfffffe3fd93218a0
dmu_objset_userquota_get_ids() at dmu_objset_userquota_get_ids+0x1b7/frame 0xfffffe3fd93218f0
dnode_sync() at dnode_sync+0xa6/frame 0xfffffe3fd9321980
sync_dnodes_task() at sync_dnodes_task+0x92/frame 0xfffffe3fd93219c0
taskq_run() at taskq_run+0x10/frame 0xfffffe3fd93219e0
taskqueue_run_locked() at taskqueue_run_locked+0x147/frame 0xfffffe3fd9321a40
taskqueue_thread_loop() at taskqueue_thread_loop+0xb8/frame 0xfffffe3fd9321a70
fork_exit() at fork_exit+0x86/frame 0xfffffe3fd9321ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe3fd9321ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 14d8h32m53s
Dumping 14912 out of 261996 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at ./machine/pcpu.h:232
232	./machine/pcpu.h: No such file or directory.
(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:232
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:318
#2  0xffffffff80aa31c3 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:386
#3  0xffffffff80aa36ae in vpanic (fmt=<optimized out>, ap=0xfffffe3fd9321820) at /usr/src/sys/kern/kern_shutdown.c:779
#4  0xffffffff80aa34d3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:710
#5  0xffffffff822df23c in assfail3 (a=<unavailable>, lv=<unavailable>, op=<unavailable>, rv=<unavailable>, f=<unavailable>, l=<optimized out>)
    at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:91
#6  0xffffffff8209e9c0 in zfs_space_delta_cb (bonustype=<optimized out>, data=0xfffff803e64b2f40, userp=0xfffff8052a3b2278, groupp=0xfffff8052a3b2280)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:609
#7  0xffffffff8200d917 in dmu_objset_userquota_get_ids (dn=0xfffff8052a3b2000, before=0, tx=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1592
#8  0xffffffff82015396 in dnode_sync (dn=0xfffff8052a3b2000, tx=0xfffff803ae49e800) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c:570
#9  0xffffffff8200d192 in dmu_objset_sync_dnodes (list=0xfffff801763bb420, tx=0xfffff803ae49e800)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1093
#10 sync_dnodes_task (arg=0xfffff82419cdd260) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1160
#11 0xffffffff81fd3750 in taskq_run (arg=0xfffff801c66d08d0, pending=<unavailable>) at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c:109
#12 0xffffffff80afc0e7 in taskqueue_run_locked (queue=0xfffff8016b01cd00) at /usr/src/sys/kern/subr_taskqueue.c:463
#13 0xffffffff80afd2c8 in taskqueue_thread_loop (arg=<optimized out>) at /usr/src/sys/kern/subr_taskqueue.c:755
#14 0xffffffff80a65796 in fork_exit (callout=0xffffffff80afd210 <taskqueue_thread_loop>, arg=0xfffff801281c3f20, frame=0xfffffe3fd9321ac0)
    at /usr/src/sys/kern/kern_fork.c:1038
#15 <signal handler called>
Comment 9 Andriy Gapon freebsd_committer freebsd_triage 2021-06-16 11:39:47 UTC
(In reply to Andrey V. Elsukov from comment #8)
I think that at this point there is enough information about the problem.
So far I could not come up with a good way to fix it, but to be honest I did not try very hard.
Comment 10 Andriy Gapon freebsd_committer freebsd_triage 2021-06-16 11:43:10 UTC
The problem has not been fixed, so no reason to close this report.
Also see:
- https://github.com/zfsonlinux/zfs/issues/2025 (closed)
- https://github.com/openzfs/zfs/issues/6332 (open)
- https://github.com/openzfsonosx/openzfs/issues/5 (open)