Bug 209580 - ZFS with INVARIANTS enabled deadlock
Summary: ZFS with INVARIANTS enabled deadlock
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.3-RELEASE
Hardware: Any Any
Importance: --- Affects Many People
Assignee: FreeBSD Release Engineering
Reported: 2016-05-17 14:40 UTC by Shawn Webb
Modified: 2018-05-30 00:19 UTC

Flags: op: mfc-stable10?


Description Shawn Webb 2016-05-17 14:40:35 UTC
On 10-STABLE/amd64 with INVARIANTS enabled, when a user attempts to do a new installation with the ZFS wizard, selecting the option to encrypt the pool causes a ZFS deadlock. The deadlock resolver (DEADLKRES) doesn't resolve this deadlock.

The commit (from github) that I used to generate the new installation image: 8cc4bb22bd5b562b7e64f69904c876eb0146d170

This issue was originally reported to HardenedBSD: https://github.com/HardenedBSD/hardenedBSD/issues/168

Steps to reproduce:
1. Generate a new installation cdrom image with GENERIC compiled with INVARIANTS and INVARIANT_SUPPORT
2. Boot the cdrom image in bhyve
3. Attempt an installation with encrypted ZFS through the installer
4. Notice the deadlock
Comment 1 Allan Jude freebsd_committer 2016-05-17 14:44:52 UTC
Do you have a backtrace?
Comment 2 Shawn Webb 2016-05-17 14:46:33 UTC
(In reply to Allan Jude from comment #1)
I don't. Unfortunately, one is not printed, and remote kernel debugging with bhyve isn't working for me.
Comment 3 Shawn Webb 2016-07-04 12:55:26 UTC
I'm just checking in. Any updates on this?
Comment 4 op 2016-08-04 21:33:34 UTC
The patch from this GitHub pull request fixes the zpool export problem: https://github.com/zfsonlinux/zfs/pull/3137/commits/65cdaa78aff3c2e21ff912b9acfc523f22a3b2c4, but there is still another problem during the zpool import call.
Comment 5 op 2016-10-08 12:37:33 UTC
The patch from julian@ fixes this issue: https://lists.freebsd.org/pipermail/freebsd-stable/2016-October/085821.html
Comment 6 Allan Jude freebsd_committer 2016-10-08 18:21:59 UTC
This issue appears to have nothing to do with GELI or GELIBoot.

The issue is apparently a use-after-free and, as noted, is fixed by Julian's patch.
Comment 7 Andriy Gapon freebsd_committer 2016-10-08 19:19:13 UTC
And the problem seems to be in the taskqueue code, not ZFS.
Comment 8 commit-hook freebsd_committer 2016-10-10 04:58:03 UTC
A commit references this bug:

Author: julian
Date: Mon Oct 10 04:57:33 UTC 2016
New revision: 306935
URL: https://svnweb.freebsd.org/changeset/base/306935

Log:
  While the thread is sleeping in taskqueue_drain_all() it is
  possible that the queue entry it is looking at is removed
  from the queue, but we make no effort to account
  for this. When we wake up we need to check it's still there.

  PR: 209580
  Sponsored by:	Panzura inc
  Differential Revision:	D8160

Changes:
  stable/10/sys/kern/subr_taskqueue.c
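
The race described in the log can be sketched in user-space C. This is only an illustration, not the code from r306935: `struct task`, `struct queue`, `sleep_hook`, `worker_step`, `drain_all` and `demo` are hypothetical names, and the sleep is simulated by a callback so the example stays single-threaded and deterministic. The poison value written by `worker_step` is an assumption standing in for whatever a freed entry might contain. The point is the loop condition in `drain_all`: on every wakeup it first confirms, by pointer identity, that the entry is still linked on the queue, and only then reads its `pending` field.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's task and taskqueue structures. */
struct task {
    struct task *next;
    int pending;            /* nonzero while the task waits on the queue */
};

struct queue {
    struct task *head;
    /* Stand-in for sleeping: runs while the waiter is "asleep". */
    void (*sleep_hook)(struct queue *);
};

/* Return the last entry on the queue, or NULL if the queue is empty. */
static struct task *
queue_last(struct queue *q)
{
    struct task *t = q->head;

    while (t != NULL && t->next != NULL)
        t = t->next;
    return t;
}

/* Compare pointers only; never dereference the entry being looked for. */
static bool
queue_contains(struct queue *q, const struct task *needle)
{
    for (struct task *t = q->head; t != NULL; t = t->next)
        if (t == needle)
            return true;
    return false;
}

/* Hypothetical worker step: unlink the head and poison it, mimicking a free. */
static void
worker_step(struct queue *q)
{
    struct task *t = q->head;

    if (t != NULL) {
        q->head = t->next;
        t->next = NULL;
        t->pending = 0xdead;    /* stale nonzero value; a naive waiter spins on it */
    }
}

/*
 * Wait until the entry that was last on the queue at entry has left it.
 * Returns the number of simulated sleeps.
 */
static int
drain_all(struct queue *q)
{
    struct task *last = queue_last(q);
    int sleeps = 0;

    /* Re-validate membership after every wakeup before touching 'last'. */
    while (last != NULL && queue_contains(q, last) && last->pending != 0) {
        q->sleep_hook(q);
        sleeps++;
    }
    return sleeps;
}

/* Two queued tasks; the simulated worker removes one per sleep. */
static int
demo(void)
{
    struct task a = { NULL, 1 }, b = { NULL, 1 };
    struct queue q = { &a, worker_step };

    a.next = &b;
    return drain_all(&q);
}
```

Under these assumptions `demo()` returns 2: the waiter sleeps once per queued task and exits as soon as its entry is no longer linked. A loop that tested only `last->pending != 0` would keep reading the poisoned field and never terminate, which is the kind of hang the log describes.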
Comment 9 Julian Elischer freebsd_committer 2016-10-10 05:05:07 UTC
(In reply to op from comment #5)
Hi, have you confirmed that this fixes that problem?
I know it fixed our ZFS boot problem when Invariants was on.
Comment 10 Xin LI freebsd_committer 2016-10-10 07:42:53 UTC
Julian has asked for an EN for this; placing it on our radar.
Comment 11 op 2016-10-10 18:47:32 UTC
Yes, I tested the fix multiple times, and it fixed this issue. Thanks!
Comment 12 op 2016-10-10 18:49:18 UTC
Btw, it would be really nice to add at least one test machine with INVARIANTS enabled on the stable branches to FreeBSD's test cluster, too.
Comment 13 Eitan Adler freebsd_committer freebsd_triage 2018-05-28 19:43:44 UTC
batch change:

For bugs that match the following
-  Status Is In progress 
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.


Note:
I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
Comment 14 Xin LI freebsd_committer 2018-05-30 00:19:43 UTC
Fixed by r306935.