Bug 235683

Summary: [zfs] Panic during data access or scrub on 12.0-STABLE r343904 (blkptr at <addr> DVA 0 has invalid OFFSET)
Product: Base System Reporter: Sergey Anokhin <admin>
Component: kern Assignee: Eugene Grosbein <eugen>
Status: Closed Feedback Timeout    
Severity: Affects Only Me CC: avos, eugen, rgrimes
Priority: --- Keywords: crash
Version: 12.0-STABLE   
Hardware: amd64   
OS: Any   

Description Sergey Anokhin 2019-02-12 08:19:25 UTC
Hi All,

I have a test machine:

# uname -ar
FreeBSD server.5034.ru 12.0-STABLE FreeBSD 12.0-STABLE r343904 SERVER  amd64

This server has a ZFS pool with an error:

# zpool status -v
  pool: zroot
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub canceled on Mon Feb 11 20:50:59 2019
config:

        NAME         STATE     READ WRITE CKSUM
        zroot        ONLINE       0     0     0
          gpt/disk0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        zroot:<0x21008>
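
The `zroot:<0x21008>` entry is not a resolvable path: it names object number 0x21008 (decimal 135176) in the pool's root dataset. A minimal sketch of inspecting that object with zdb from userland (it may fail or print garbage on a corrupted pool, but it cannot panic the kernel):

# zdb -dddd zroot 135176    # 135176 == 0x21008; dumps the object's dnode and attributes, if readable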


It looks like the kernel panics when the OS tries to access the broken data:

(pts/2)[root@server:/usr/obj/usr/src/amd64.amd64/sys/SERVER]# kgdb kernel /var/crash/vmcore.last
GNU gdb (GDB) 8.2.1 [GDB v8.2.1 for FreeBSD]
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from kernel...Reading symbols from /usr/obj/usr/src/amd64.amd64/sys/SERVER/kernel.debug...done.
done.

Unread portion of the kernel message buffer:
panic: Solaris(panic): blkptr at 0xfffffe0059108980 DVA 0 has invalid OFFSET 72057594038013952
cpuid = 2
time = 1549919776
KDB: stack backtrace:
#0 0xffffffff80c531c7 at kdb_backtrace+0x67
#1 0xffffffff80c07143 at vpanic+0x1a3
#2 0xffffffff80c06f93 at panic+0x43
#3 0xffffffff826d418f at vcmn_err+0xcf
#4 0xffffffff827849ca at zfs_panic_recover+0x5a
#5 0xffffffff827c0223 at zfs_blkptr_verify+0x303
#6 0xffffffff827c030c at zio_read+0x2c
#7 0xffffffff8270cc54 at arc_read+0x704
#8 0xffffffff827195ed at dbuf_read+0x72d
#9 0xffffffff8271d16f at __dbuf_hold_impl+0x57f
#10 0xffffffff8271d37f at dbuf_hold+0x7f
#11 0xffffffff827240ec at dmu_buf_hold_noread_by_dnode+0x3c
#12 0xffffffff827242ac at dmu_buf_hold_by_dnode+0x1c
#13 0xffffffff827a69cd at zap_get_leaf_byblk+0x4d
#14 0xffffffff827a404f at fzap_lookup+0xcf
#15 0xffffffff827aac67 at zap_lookup_impl+0x117
#16 0xffffffff827aaad5 at zap_lookup_norm+0xa5
#17 0xffffffff827aaa21 at zap_lookup+0x11
Uptime: 2h5m13s
Dumping 1087 out of 8077 MB:..2%..11%..21%..31%..42%..51%..61%..71%..81%..92%

__curthread () at ./machine/pcpu.h:230
230             __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (OFFSETOF_CURTHREAD));
(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80c06d2b in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:446
#3  0xffffffff80c071a3 in vpanic (fmt=<optimized out>, ap=0xfffffe0068148e10) at /usr/src/sys/kern/kern_shutdown.c:872
#4  0xffffffff80c06f93 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:799
#5  0xffffffff826d418f in vcmn_err (ce=<optimized out>, fmt=0xffffffff82878f37 "blkptr at %p DVA %u has invalid OFFSET %llu", adx=0xfffffe0068148fa0)
    at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:58
#6  0xffffffff827849ca in zfs_panic_recover (fmt=<unavailable>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:1653
#7  0xffffffff827c0223 in zfs_blkptr_verify (spa=0xfffffe004fdba000, bp=0xfffffe0059108980)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:835
#8  0xffffffff827c030c in zio_read (pio=0xfffff8015aeda830, spa=0xfffffe004fdba000, bp=0xfffffe0059108980, data=0xfffff8020ac939c0, size=16384,
    done=0xffffffff8270ddd0 <arc_read_done>, private=0xfffff8013c686700, priority=ZIO_PRIORITY_SYNC_READ, flags=ZIO_FLAG_CANFAIL, zb=0xfffffe0068149120)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:879
#9  0xffffffff8270cc54 in arc_read (pio=0xfffff8015aeda830, spa=0xfffffe004fdba000, bp=0xfffffe0059108980, done=0xfffff8002f39fbc0,
    private=0xfffff8011bff2580, priority=ZIO_PRIORITY_SYNC_READ, zio_flags=128, arc_flags=0xfffffe0068149164, zb=0xfffffe0068149120)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:6080
#10 0xffffffff827195ed in dbuf_read_impl (db=<optimized out>, flags=<optimized out>, zio=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1310
#11 dbuf_read (db=0xfffff8002f5c0960, zio=0xfffff8015aeda830, flags=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1435
#12 0xffffffff8271d16f in dbuf_findbp (dn=<optimized out>, level=<optimized out>, blkid=10632, fail_sparse=<optimized out>, parentp=0xfffff801ad8ac038,
    bpp=<optimized out>, dh=0xfffff801ad8ac000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:2506
#13 __dbuf_hold_impl (dh=0xfffff801ad8ac000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:2931
#14 0xffffffff8271d37f in dbuf_hold_impl (dn=0xfffff8018ab6e000, level=0 '\000', blkid=10632, fail_sparse=0, fail_uncached=0, tag=0x0, dbp=0x0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:3032
#15 dbuf_hold_level (dn=0xfffff8018ab6e000, level=0, blkid=10632, tag=0x0) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:3075
#16 dbuf_hold (dn=0xfffff8018ab6e000, blkid=10632, tag=0x0) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:3068
#17 0xffffffff827240ec in dmu_buf_hold_noread_by_dnode (dn=0xfffff8018ab6e000, offset=<optimized out>, tag=<unavailable>, dbp=0xfffffe00681492c8)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:160
#18 0xffffffff827242ac in dmu_buf_hold_by_dnode (dn=<unavailable>, offset=<unavailable>, tag=0x0, dbp=0xfffffe00681492c8, flags=1)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:208
#19 0xffffffff827a69cd in zap_get_leaf_byblk (zap=0xfffff80111de9200, blkid=10632, tx=0x0, lt=RW_READER, lp=0xfffffe0068149340)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:498
#20 0xffffffff827a404f in zap_deref_leaf (zap=0xfffff80111de9200, h=<optimized out>, tx=<unavailable>, lt=RW_READER, lp=0xfffffe00681493e0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:580
#21 fzap_lookup (zn=0xfffff8011ce98200, integer_size=8, num_integers=1, buf=0xfffffe0068149490, realname=0x0, rn_len=0, ncp=0x0)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c:761
#22 0xffffffff827aac67 in zap_lookup_impl (zap=0xfffff80111de9200, name=<optimized out>, integer_size=8, num_integers=1, buf=0xfffffe0068149490,
    mt=<optimized out>, realname=0x0, rn_len=0, ncp=0x0) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:915
#23 0xffffffff827aaad5 in zap_lookup_norm (os=<optimized out>, zapobj=<optimized out>, name=<optimized out>, integer_size=8, num_integers=1,
    buf=0xfffffe0068149490, mt=(unknown: 0), realname=0x0, rn_len=0, ncp=0x0) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:954
#24 0xffffffff827aaa21 in zap_lookup (os=<unavailable>, zapobj=<unavailable>, name=<unavailable>, integer_size=<unavailable>, num_integers=<unavailable>,
    buf=<unavailable>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:898
#25 0xffffffff827d41f9 in zfs_match_find (zfsvfs=<optimized out>, dzp=<optimized out>, name=<optimized out>, mt=<optimized out>, zoid=<optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:81
#26 zfs_dirent_lookup (dzp=0xfffff8003812a110, name=0xfffffe0068149610 "0_NCiHyz6S.xml", zpp=0xfffffe00681494d8, flag=2)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:177
#27 0xffffffff827d42e7 in zfs_dirlook (dzp=0xfffff8003812a110, name=<unavailable>, zpp=0xfffffe0068149590)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:238
#28 0xffffffff827edacf in zfs_lookup (dvp=<optimized out>, nm=<optimized out>, vpp=<optimized out>, cnp=<optimized out>, nameiop=<optimized out>,
    cr=<optimized out>, td=<optimized out>, flags=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1658
#29 0xffffffff827ee1fe in zfs_freebsd_lookup (ap=0xfffffe0068149778) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4960
#30 0xffffffff81314ba8 in VOP_CACHEDLOOKUP_APV (vop=<optimized out>, a=0xfffffe0068149778) at vnode_if.c:195
#31 0xffffffff80cc0846 in VOP_CACHEDLOOKUP (dvp=<optimized out>, vpp=<optimized out>, cnp=<optimized out>) at ./vnode_if.h:80
#32 vfs_cache_lookup (ap=<optimized out>) at /usr/src/sys/kern/vfs_cache.c:2109
#33 0xffffffff81314a78 in VOP_LOOKUP_APV (vop=<optimized out>, a=0xfffffe0068149840) at vnode_if.c:127
#34 0xffffffff80cc9fa1 in VOP_LOOKUP (vpp=0xfffffe00681499d8, cnp=0xfffffe0068149a00, dvp=<optimized out>) at ./vnode_if.h:54
#35 lookup (ndp=0xfffffe0068149978) at /usr/src/sys/kern/vfs_lookup.c:879
#36 0xffffffff80cc948b in namei (ndp=0xfffffe0068149978) at /usr/src/sys/kern/vfs_lookup.c:444
#37 0xffffffff80ce06a6 in kern_accessat (td=<optimized out>, fd=-100, path=<optimized out>, pathseg=<optimized out>, flag=<optimized out>, amode=0)
    at /usr/src/sys/kern/vfs_syscalls.c:1986
#38 0xffffffff8118e592 in syscallenter (td=<optimized out>) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
#39 amd64_syscall (td=0xfffff8011bff2580, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1154
#40 <signal handler called>
#41 0x00000008003e703a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffd628
(kgdb) frame 9
#9  0xffffffff8270cc54 in arc_read (pio=0xfffff8015aeda830, spa=0xfffffe004fdba000, bp=0xfffffe0059108980, done=0xfffff8002f39fbc0,
    private=0xfffff8011bff2580, priority=ZIO_PRIORITY_SYNC_READ, zio_flags=128, arc_flags=0xfffffe0068149164, zb=0xfffffe0068149120)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:6080
6080                    rzio = zio_read(pio, spa, bp, hdr->b_l1hdr.b_pabd, size,
(kgdb) frame 8
#8  0xffffffff827c030c in zio_read (pio=0xfffff8015aeda830, spa=0xfffffe004fdba000, bp=0xfffffe0059108980, data=0xfffff8020ac939c0, size=16384,
    done=0xffffffff8270ddd0 <arc_read_done>, private=0xfffff8013c686700, priority=ZIO_PRIORITY_SYNC_READ, flags=ZIO_FLAG_CANFAIL, zb=0xfffffe0068149120)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:879
879             zfs_blkptr_verify(spa, bp);
(kgdb) frame 7
#7  0xffffffff827c0223 in zfs_blkptr_verify (spa=0xfffffe004fdba000, bp=0xfffffe0059108980)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:835
835                             zfs_panic_recover("blkptr at %p DVA %u has invalid "
(kgdb) frame 6
#6  0xffffffff827849ca in zfs_panic_recover (fmt=<unavailable>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:1653
1653            vcmn_err(zfs_recover ? CE_WARN : CE_PANIC, fmt, adx);
(kgdb) frame 5
#5  0xffffffff826d418f in vcmn_err (ce=<optimized out>, fmt=0xffffffff82878f37 "blkptr at %p DVA %u has invalid OFFSET %llu", adx=0xfffffe0068148fa0)
    at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:58
58                      panic("%s%s", prefix, buf);
(kgdb) frame 4
#4  0xffffffff80c06f93 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:799
799             vpanic(fmt, ap);
(kgdb)
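
Frame 6 above shows the decision point: vcmn_err(zfs_recover ? CE_WARN : CE_PANIC, fmt, adx). On FreeBSD the zfs_recover flag is exposed as the vfs.zfs.recover sysctl/loader tunable, so a minimal sketch for turning this panic into a logged warning while recovering data (assuming the tunable behaves this way on this exact revision; it repairs nothing on disk):

# sysctl vfs.zfs.recover=1                          # runtime: zfs_panic_recover() warns instead of panicking
# echo 'vfs.zfs.recover="1"' >> /boot/loader.conf   # persists across the reboot loop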


Also, I tried to start a scrub, but the server went into a reboot loop: write dump, reboot, write dump, reboot, and so on. I had to cancel the scrub manually.
Comment 1 Rodney W. Grimes freebsd_committer freebsd_triage 2019-02-13 01:52:59 UTC
Please do not put bugs on stable@, current@, hackers@, etc
Comment 2 Andriy Voskoboinyk freebsd_committer freebsd_triage 2019-02-13 08:45:38 UTC
Strange, I have assigned it to fs@ ...
Comment 3 Andriy Gapon freebsd_committer freebsd_triage 2019-02-13 09:51:20 UTC
(In reply to Sergey Anokhin from comment #0)
I think that your best bet is finding the directory that triggers the panic and then moving your data, excluding that directory, to a new pool.  You can try to find the directory name in a debugger (e.g. by poking around frame 36; see the sketch below) or empirically.
Unfortunately, you have on-disk data corruption and there is no easy way to fix it (the hard way would be to learn the ZFS on-disk format, find the bad bits, and somehow fix or clear them using a disk/hex editor).

There is no obvious software bug, as far as I can tell, so this report is not actionable.
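
A minimal kgdb sketch of that "poking around", against the backtrace in comment 0 (the field names assume FreeBSD's struct nameidata/componentname layout; frame 26 already shows the leaf name "0_NCiHyz6S.xml"):

(kgdb) frame 36
(kgdb) print ndp->ni_cnd.cn_pnbuf    # full pathname buffer of the lookup that tripped the panic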
Comment 4 Sergey Anokhin 2019-02-13 13:50:36 UTC
(In reply to Andriy Gapon from comment #3)

Yes, I agree with you about finding/removing the broken directories, but in any case a kernel panic is bad.
Comment 5 Rodney W. Grimes freebsd_committer freebsd_triage 2019-02-13 14:21:47 UTC
(In reply to Andriy Voskoboinyk from comment #2)
From a review of the history it was placed on stable@ by the submitter when it was created, before you assigned it to fs@.

(In reply to Sergey Anokhin from comment #0)
Please do not add stable@ or current@ to the CC list when creating bugs; these are not lists we want Bugzilla email going to.  Thanks, rgrimes@
Comment 6 Sergey Anokhin 2019-02-13 20:43:49 UTC
(In reply to Rodney W. Grimes from comment #5)

OK. Thank you for the tip. My apologies...
Comment 7 Sergey Anokhin 2019-02-13 21:12:42 UTC
I've noticed some additional info: 

the kernel panic occurs when a Perl script moves files from one directory to another. The destination directory contains many millions of files.

Perhaps it can be helpful.
Comment 8 Eugene Grosbein freebsd_committer freebsd_triage 2019-02-14 04:16:39 UTC
(In reply to Andriy Gapon from comment #3)

The inability to repair file system consistency is a bug. Yes, UFS can panic too when it finds an inconsistency, but it has fsck to bring the file system back to a consistent state. ZFS should have a means to repair consistency too, even at the cost of some user data loss in the case of a non-redundant pool.
Comment 9 Andriy Gapon freebsd_committer freebsd_triage 2019-02-14 09:23:09 UTC
(In reply to Sergey Anokhin from comment #4)
It's bad, but it's not a bug (not an actionable one, in any case); it's data corruption.

(In reply to Eugene Grosbein from comment #8)
I agree. But such a tool will not magically appear.
Comment 10 Eugene Grosbein freebsd_committer freebsd_triage 2019-02-14 09:36:17 UTC
(In reply to Andriy Gapon from comment #9)

ZFS already has "zpool scrub". Isn't it the job of scrub to repair consistency? Now it panics too.
Comment 11 Sergey Anokhin 2019-02-14 10:43:07 UTC
(In reply to Andriy Gapon from comment #9)

Shouldn't scrub eliminate it? Instead, it produces a reboot loop with kernel panics.
Comment 12 Andriy Gapon freebsd_committer freebsd_triage 2019-02-14 16:21:39 UTC
(In reply to Eugene Grosbein from comment #10)
(In reply to Sergey Anokhin from comment #11)

ZFS scrub repairs only blocks with bad checksums or otherwise unreadable blocks.
ZFS scrub does not analyze and thus does not fix any logical inconsistencies.
If bad data has a correct checksum, then at present ZFS cannot fix it. Sometimes it can recognize that the data is bad and report an error; sometimes it has no option but to panic; and sometimes it cannot even tell that the data is bad.
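
Following on from comments 10 and 11, a hedged, untested sketch: with vfs.zfs.recover=1 set as described earlier, a scrub should log blkptr warnings instead of panicking, although it still cannot repair this kind of corruption on a single-vdev pool:

# sysctl vfs.zfs.recover=1
# zpool scrub zroot
# zpool status -v zroot    # watch progress; warnings appear in dmesg rather than as panics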
Comment 13 Andriy Gapon freebsd_committer freebsd_triage 2019-02-14 17:32:29 UTC
If you need to recover your data then the following patch _may_ help you to read the pool and move the data elsewhere: https://people.freebsd.org/~avg/zfs-skip-bad-bp-read.diff

However, the patch cannot fix the on-disk data, so the pool will remain as it is.
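
A minimal sketch of applying it on this machine (KERNCONF=SERVER comes from the kernel path in comment 0; the patch strip level is an assumption and depends on how the diff was generated):

# cd /usr/src
# fetch https://people.freebsd.org/~avg/zfs-skip-bad-bp-read.diff
# patch -p0 < zfs-skip-bad-bp-read.diff
# make buildkernel KERNCONF=SERVER && make installkernel KERNCONF=SERVER
# shutdown -r now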
Comment 14 Sergey Anokhin 2019-02-14 18:40:08 UTC
(In reply to Andriy Gapon from comment #13)

Will the patch help to remove the broken data without a kernel panic?
Comment 15 Andriy Gapon freebsd_committer freebsd_triage 2019-02-15 07:52:13 UTC
(In reply to Sergey Anokhin from comment #14)
Most likely not.
I suggest re-creating the pool.
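
A hedged sketch of the migration suggested in comment 3, assuming a hypothetical spare disk gpt/disk1 and hypothetical mount points; a file-level copy that skips the bad directory avoids walking the corrupt blkptr, whereas zfs send of the affected dataset would likely trip over it:

# zpool create newpool gpt/disk1                                                  # hypothetical target disk
# tar -C /zroot/data -cf - --exclude './bad_dir' . | tar -C /newpool/data -xf -   # paths hypothetical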
Comment 16 Eugene Grosbein freebsd_committer freebsd_triage 2022-10-23 06:08:54 UTC
Feedback timeout.
Comment 17 Graham Perrin freebsd_committer freebsd_triage 2022-10-23 07:47:06 UTC
^Triage: with closure, there should be assignment to a person.