Bug 220691 - ZFS instantly panics on boot from degraded volume or after a drive failure
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Many People
Assignee: Andriy Gapon
URL:
Keywords: crash, regression
Depends on:
Blocks:
 
Reported: 2017-07-12 20:07 UTC by Peter Wemm
Modified: 2017-07-22 08:18 UTC

See Also:


Attachments
proposed patch (719 bytes, patch)
2017-07-17 06:20 UTC, Andriy Gapon

Description Peter Wemm freebsd_committer freebsd_triage 2017-07-12 20:07:20 UTC
r320065: Sun Jun 18 04:22:09 UTC 2017 - works
r320900: Wed Jul 12 03:00:15 UTC 2017 - panics

Sample of boot failure:
<118>Setting hostname: tiny.nyi.freebsd.org.
<118>Setting up harvesting: [UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,NET_ETHER,NET_TUN,MOUSE,KEYBOARD,D
<118>Feeding entropy: .

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x28

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 07
fault virtual address   = 0x28

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 06
apic id = 00
fault virtual address   = 0x28
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, fault virtual address      
= 0x28
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff803aab56
stack pointer           = 0x28:0xfffffe0239fa3a90
fault code              = supervisor read data, page not present
IOPL = 0
current process         = 0 (zio_write_intr_0)
frame pointer           = 0x28:0xfffffe0239fa3aa0

db> where                       
Tracing pid 0 tid 100471 td 0xfffff80005452000
vdev_geom_io_done() at vdev_geom_io_done+0x36/frame 0xfffffe0239f9eaa0
zio_vdev_io_done() at zio_vdev_io_done+0x176/frame 0xfffffe0239f9ead0
zio_execute() at zio_execute+0xac/frame 0xfffffe0239f9eb20
taskqueue_run_locked() at taskqueue_run_locked+0x127/frame 0xfffffe0239f9eb80
taskqueue_thread_loop() at taskqueue_thread_loop+0xc8/frame 0xfffffe0239f9ebb0
fork_exit() at fork_exit+0x85/frame 0xfffffe0239f9ebf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0239f9ebf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---


Sample of panic when a volume degrades:

root@nope.ysv.freebsd.org:/home/peter # zpool offline zroot mfid5p3
Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 04

Fatal trap 12: page fault while in kernel mode
fault virtual address	= 0x28
Fatal trap 12: page fault while in kernel mode

Fatal trap 12: page fault while in kernel mode
Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 07
cpuid = 1; apic id = 01
fault virtual address	= 0x28
fault code		= supervisor read data, page not present
cpuid = 3; cpuid = 5; apic id = 03
Fatal trap 12: page fault while in kernel mode
apic id = 05
fault virtual address	= 0x28
fault virtual address	= 0x28
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff803aab56
stack pointer	        = 0x28:0xfffffe085fb3aa90
instruction pointer	= 0x20:0xffffffff803aab56
fault code		= supervisor read data, page not present
cpuid = 6; fault virtual address	= 0x28
Fatal trap 12: page fault while in kernel mode
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff803aab56
stack pointer	        = 0x28:0xfffffe085fb3fa90
frame pointer	        = 0x28:0xfffffe085fb3aaa0
fault code		= supervisor read data, page not present
cpuid = 2; apic id = 02
apic id = 06
instruction pointer	= 0x20:0xffffffff803aab56
fault virtual address	= 0x28
fault code		= supervisor read data, page not present
stack pointer	        = 0x28:0xfffffe085fb30a90
instruction pointer	= 0x20:0xffffffff803aab56
stack pointer	        = 0x28:0xfffffe085fb35a90
frame pointer	        = 0x28:0xfffffe085fb3faa0
code segment		= base rx0, limit 0xfffff, type 0x1b
stack pointer	        = 0x28:0xfffffe085fb44a90
frame pointer	        = 0x28:0xfffffe085fb44aa0
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
fault virtual address	= 0x28
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, instruction pointer	= 0x20:6
frame pointer	        = 0x28:0xfffffe085fb30aa0
code segment		= base rx0, limit 0xfffff, type 0x1b
code segment		= base rx0, limit 0xfffff, type 0x1b
frame pointer	        = 0x28:0xfffffe085fb35aa0
code segment		= base rx0, limit 0xfffff, type 0x1b
resume, IOPL = 0
stack pointer	        = 0x28:0xfffffe085fb26a90
			= DPL 0, pres 1, long 1, def32 0, gran 1
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= fault code		= supervisor read data, page not
frame pointer	        = 0x28:0xfffffe085fb26aa0
instruction pointer	= 0x20:0xffffffff803aab56
processor eflags	= interrupt enabled, code segment		= base b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 0 (zio_write_intr_2)
[ thread pid 0 tid 100500 ]
Stopped at      vdev_geom_io_done+0x36: movq    0x28(%rbx),%rsi
db> where
Tracing pid 0 tid 100500 td 0xfffff8000aae6000
vdev_geom_io_done() at vdev_geom_io_done+0x36/frame 0xfffffe085fb30aa0
zio_vdev_io_done() at zio_vdev_io_done+0x176/frame 0xfffffe085fb30ad0
zio_execute() at zio_execute+0xac/frame 0xfffffe085fb30b20
taskqueue_run_locked() at taskqueue_run_locked+0x127/frame 0xfffffe085fb30b80
taskqueue_thread_loop() at taskqueue_thread_loop+0xc8/frame 0xfffffe085fb30bb0
fork_exit() at fork_exit+0x85/frame 0xfffffe085fb30bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe085fb30bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db> 

All cores trapped concurrently.

zio in the vdev_geom_io_done() function is null.
Comment 1 Peter Wemm freebsd_committer freebsd_triage 2017-07-12 21:03:34 UTC
Oops, make that:  "zio->io_bio is NULL".
Comment 2 Cy Schubert freebsd_committer freebsd_triage 2017-07-13 01:10:25 UTC
Indeed, it also affects mirrors.

It's panicking at line 1094 of src/svn-current/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:

abd_return_buf_copy(zio->io_abd, bp->bio_data, zio->io_size);

bp is a null pointer.
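
For anyone reading the trap output: the fault virtual address of 0x28 and the faulting instruction `movq 0x28(%rbx),%rsi` are consistent with reading a structure member at offset 0x28 through a NULL base pointer. Here is a minimal sketch of that arithmetic. `struct mock_bio`, its fields and both helper functions are hypothetical and invented for illustration; this is not the real `struct bio` layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a structure like struct bio. The field names
 * and the layout are invented so that the data pointer lands at offset
 * 0x28; this is NOT the real FreeBSD definition.
 */
struct mock_bio {
	uint64_t pad[5];	/* 5 * 8 = 40 = 0x28 bytes of leading fields */
	void	*mock_data;	/* stand-in for bio_data */
};

/*
 * Address the CPU would try to read for base->mock_data, computed
 * arithmetically so that nothing is actually dereferenced.
 */
static uintptr_t
field_address(uintptr_t base)
{
	return (base + offsetof(struct mock_bio, mock_data));
}

/* With a NULL (zero) base, the faulting address equals the field offset. */
static uintptr_t
null_fault_address(void)
{
	assert(offsetof(struct mock_bio, mock_data) == 0x28);
	return (field_address(0));
}
```

A small fault address like this usually points at a NULL structure pointer rather than at a corrupted one, which matches the diagnosis in this comment.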
Comment 3 Andriy Gapon freebsd_committer freebsd_triage 2017-07-17 06:20:45 UTC
Created attachment 184417
proposed patch

Could everyone affected and anyone interested please test this patch?
Thank you!
Comment 4 Cy Schubert freebsd_committer freebsd_triage 2017-07-17 06:34:31 UTC
My patch is similar to your patch. It resolves the issue. I'll test yours as well.
Comment 5 Andriy Gapon freebsd_committer freebsd_triage 2017-07-17 06:37:08 UTC
(In reply to Cy Schubert from comment #4)
Thank you!  Could you please test the kernel with INVARIANTS if possible?
Comment 6 Cy Schubert freebsd_committer freebsd_triage 2017-07-17 19:13:11 UTC
(In reply to Andriy Gapon from comment #5)
No messages to console. Kernel built with:

cwsys# strings /boot/kernel/kernel | grep INVARI
Kernel compiled with INVARIANTS, may affect performance
Support for modules compiled with INVARIANTS option
options	INVARIANT_SUPPORT
options	INVARIANTS
cwsys#
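
For context on the INVARIANTS request: kernel assertions such as KASSERT are compiled in only when the kernel is built with INVARIANTS, so a run without it would not exercise any assertions that a patch adds. The user-space sketch below mimics that behaviour. `MOCK_INVARIANTS`, `MOCK_ASSERT`, `checks_run` and `checked_add` are hypothetical names made up for this illustration, not kernel APIs.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical build option; remove this line to compile the checks out. */
#define MOCK_INVARIANTS 1

/* Counts how many assertion checks were actually evaluated. */
static int checks_run;

#ifdef MOCK_INVARIANTS
/* Checks are active: evaluate the condition and abort on failure. */
#define MOCK_ASSERT(cond) do {						\
	checks_run++;							\
	if (!(cond)) {							\
		fprintf(stderr, "assertion failed: %s\n", #cond);	\
		abort();						\
	}								\
} while (0)
#else
/* Checks are compiled out: the condition is never evaluated. */
#define MOCK_ASSERT(cond) do { } while (0)
#endif

/* Adds two non-negative numbers, checking the precondition if enabled. */
static int
checked_add(int a, int b)
{
	MOCK_ASSERT(a >= 0 && b >= 0);
	return (a + b);
}
```

With `MOCK_INVARIANTS` defined, each call to `checked_add()` increments `checks_run`. Without it, the function still returns the sum but nothing is verified.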
Comment 7 commit-hook freebsd_committer freebsd_triage 2017-07-18 07:42:38 UTC
A commit references this bug:

Author: avg
Date: Tue Jul 18 07:41:39 UTC 2017
New revision: 321111
URL: https://svnweb.freebsd.org/changeset/base/321111

Log:
  fix a regression in r320452, ZFS ABD import

  I overlooked the fact that vdev_op_io_done hook is called even if the
  actual I/O is skipped, for example, in the case of a missing vdev.
  Arguably, this could be considered an issue in the zio pipeline engine,
  but for now I am adding defensive code to check for io_bio being NULL
  along with assertions that that happens only when it can be really
  expected.

  PR:		220691
  Reported by:	peter, cy
  Tested by:	cy
  MFC after:	1 week
  X-MFC with:	r320156, r320452

Changes:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
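
The kind of defensive check the log describes can be sketched roughly as follows. This is not the committed change: `struct mock_bio`, `struct mock_zio` and `mock_io_done()` are hypothetical, heavily reduced stand-ins, and the assumption that a skipped I/O always carries a non-zero error is made only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-ins for struct bio and zio_t. */
struct mock_bio {
	void	*bio_data;
};

struct mock_zio {
	struct mock_bio	*io_bio;	/* NULL when no I/O was ever issued */
	int		 io_error;
};

/*
 * Completion hook modeled on the situation in the commit log: it may be
 * called even though the actual I/O was skipped, so io_bio has to be
 * checked before it is dereferenced.
 * Returns 1 if a bio was released, 0 if there was nothing to do.
 */
static int
mock_io_done(struct mock_zio *zio)
{
	struct mock_bio *bp = zio->io_bio;

	if (bp == NULL) {
		/* Assumption: a skipped I/O is always marked with an error. */
		assert(zio->io_error != 0);
		return (0);
	}

	/* Normal path: the real code would hand bp->bio_data back here. */
	free(bp);
	zio->io_bio = NULL;
	return (1);
}
```

The early return keeps the hook safe for a missing vdev, while the assertion documents when a NULL pointer is expected and trips if that expectation is ever violated.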
Comment 8 Peter Wemm freebsd_committer freebsd_triage 2017-07-19 19:58:53 UTC
This fixes both of the cases that we encountered in the cluster.

Thank you!!
Comment 9 Andriy Gapon freebsd_committer freebsd_triage 2017-07-22 08:18:22 UTC
The issue is fixed in the only branch where it was present.