On 10-STABLE/amd64 with INVARIANTS enabled, when a user attempts a new installation with the ZFS wizard, selecting the option to encrypt the pool causes a ZFS deadlock. The deadlock resolver (DEADLKRES) doesn't resolve this deadlock.
The commit (from GitHub) that I used to generate the new installation image: 8cc4bb22bd5b562b7e64f69904c876eb0146d170
This issue was originally reported to HardenedBSD: https://github.com/HardenedBSD/hardenedBSD/issues/168
Steps to reproduce:
1. Generate a new installation cdrom image with GENERIC compiled with INVARIANTS and INVARIANT_SUPPORT
2. Boot the cdrom image in bhyve
3. Attempt an installation with encrypted ZFS through the installer
4. Notice the deadlock
Do you have a backtrace?
(In reply to Allan Jude from comment #1) I don't. Unfortunately, one is not printed, and remote kernel debugging with bhyve isn't working for me.
I'm just checking in. Any updates on this?
The patch from this GitHub pull request fixes the zpool export problem: https://github.com/zfsonlinux/zfs/pull/3137/commits/65cdaa78aff3c2e21ff912b9acfc523f22a3b2c4, but there is still another problem during the zpool import call.
The patch from julian@ fixes this issue: https://lists.freebsd.org/pipermail/freebsd-stable/2016-October/085821.html
This issue appears to have nothing to do with GELI or GELIBoot. The issue is apparently a use-after-free and, as noted, is fixed by Julian's patch.
And the problem seems to be in the taskqueue code, not ZFS.
A commit references this bug:
Author: julian
Date: Mon Oct 10 04:57:33 UTC 2016
New revision: 306935
URL: https://svnweb.freebsd.org/changeset/base/306935
Log:
  While the thread is sleeping in taskqueue_drain_all() it is possible that the queue entry it is looking at is removed from the queue, but we make no effort to account for this. When we wake up we need to check it's still there.
  PR: 209580
  Sponsored by: Panzura inc
  Differential Revision: D8160
Changes:
  stable/10/sys/kern/subr_taskqueue.c
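For anyone following along, the pattern named in the log above can be illustrated outside the kernel. This is a minimal single-threaded C sketch and not the code in subr_taskqueue.c: the struct and function names are hypothetical, and the "sleep" is simulated by a callback that stands in for other threads running while the waiter is blocked. It only shows the general idea of looking the entry up again after waking instead of trusting a pointer saved before sleeping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a queue and its entries; not the kernel structs. */
struct entry {
    int id;
    struct entry *next;
};

struct queue {
    struct entry *head;
};

/* Push a new entry at the head; returns NULL on allocation failure. */
static struct entry *queue_push(struct queue *q, int id)
{
    struct entry *e = malloc(sizeof(*e));
    if (e == NULL)
        return NULL;
    e->id = id;
    e->next = q->head;
    q->head = e;
    return e;
}

/* Unlink and free the entry with the given id, if it is present. */
static void queue_remove(struct queue *q, int id)
{
    struct entry **pp = &q->head;
    while (*pp != NULL) {
        if ((*pp)->id == id) {
            struct entry *dead = *pp;
            *pp = dead->next;
            free(dead);
            return;
        }
        pp = &(*pp)->next;
    }
}

/* Return the entry with the given id, or NULL if it is not queued. */
static struct entry *queue_find(struct queue *q, int id)
{
    for (struct entry *it = q->head; it != NULL; it = it->next)
        if (it->id == id)
            return it;
    return NULL;
}

/* Simulated sleeps: what "other threads" do while the waiter is blocked. */
static void sleep_quietly(struct queue *q) { (void)q; }
static void sleep_while_entry_2_is_removed(struct queue *q) { queue_remove(q, 2); }

/*
 * Wait on the entry with the given id. Returns the id if the entry is
 * still queued after waking, or -1 if it was never there or has vanished.
 */
static int wait_on_entry(struct queue *q, int id, void (*do_sleep)(struct queue *))
{
    assert(q != NULL && do_sleep != NULL);

    struct entry *e = queue_find(q, id);
    if (e == NULL)
        return -1;

    do_sleep(q);            /* the queue may change while we are asleep */

    e = queue_find(q, id);  /* re-validate; the old pointer may be freed */
    if (e == NULL)
        return -1;
    return e->id;
}
```

Looking the entry up by id after the simulated sleep, rather than dereferencing the pointer obtained before it, is what keeps the waiter from touching freed memory when the entry is removed in the meantime.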
(In reply to op from comment #5) Hi, have you confirmed that this fixes the problem? I know it fixed our ZFS boot problem when INVARIANTS was on.
Julian has asked for an EN for this; placing it on our radar.
Yes, I tested the fix multiple times, and it fixed this issue. Thanks!
Btw, it would be really nice to add at least one test machine with INVARIANTS enabled on the stable branches to FreeBSD's test cluster too.
batch change:
For bugs that match the following
- Status Is In progress
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation
DO: Reset to open status.
Note: I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
Fixed by r306935.