Bug 248250 - Witness Warning: 'Exclusive Sleep Mutex vtdev locked' warning on arm64 (RPi4b)
Summary: Witness Warning: 'Exclusive Sleep Mutex vtdev locked' warning on arm64 (RPi4b)
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: arm64 Any
: --- Affects Some People
Assignee: Jason A. Harmening
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-07-24 19:44 UTC by Gordon Bergling
Modified: 2021-02-17 17:56 UTC (History)
1 user (show)

See Also:


Attachments
preliminary patch (7.88 KB, patch)
2020-08-05 06:38 UTC, Jason A. Harmening
no flags Details | Diff
dmesg of RPi4 with r363886 and the applied bugfix (17.79 KB, text/plain)
2020-08-05 09:02 UTC, Gordon Bergling
no flags Details
verified dmesg of r363886 and the applied bugfix (17.79 KB, text/plain)
2020-08-05 09:37 UTC, Gordon Bergling
no flags Details
dmesg output with the preliminary patch applied (16.75 KB, text/plain)
2020-08-05 18:59 UTC, Gordon Bergling
no flags Details
proposed final patch (4.51 KB, patch)
2020-08-09 08:34 UTC, Jason A. Harmening
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Gordon Bergling freebsd_committer freebsd_triage 2020-07-24 19:44:01 UTC
While booting the RPi4b I get the following warning:

uma_zalloc_debug: zone "kenv" with the following non-sleepable locks held:
exclusive sleep mutex vtdev (vtdev) r = 0 (0xffff0000009b4280) locked @ /tank/nfs_public/tiny/src/sys/dev/vt/vt_core.c:2818
stack backtrace:
#0 0xffff00000048f2d8 at witness_debugger+0x64
#1 0xffff0000004902e8 at witness_warn+0x3f0
#2 0xffff000000690620 at uma_zalloc_debug+0x2c
#3 0xffff000000690088 at uma_zalloc_arg+0x2c
#4 0xffff0000003d3498 at getenv_string_buffer+0x40
#5 0xffff0000003d3bb4 at getenv_quad+0x14
#6 0xffff0000003d3b84 at getenv_int+0x14
#7 0xffff0000002d40a4 at vt_fb_init+0xbc
#8 0xffff0000002d8f50 at vt_replace_backend+0x10c
#9 0xffff0000002d41dc at vt_fb_attach+0x14
#10 0xffff00000019de28 at fbd_register+0xec
#11 0xffff00000045c850 at device_attach+0x400
#12 0xffff00000045c3b8 at device_probe_and_attach+0x7c
#13 0xffff0000006eefa8 at bcm_fb_attach+0x280
#14 0xffff00000045c850 at device_attach+0x400
#15 0xffff00000045c3b8 at device_probe_and_attach+0x7c
#16 0xffff00000045e59c at bus_generic_new_pass+0xf8
#17 0xffff00000045e54c at bus_generic_new_pass+0xa8
fb0: 1824x984(1824x984@0,0) 32bpp
Comment 1 Jason A. Harmening freebsd_committer freebsd_triage 2020-07-31 19:46:40 UTC
I'll take it.
Comment 2 Jason A. Harmening freebsd_committer freebsd_triage 2020-08-05 06:38:53 UTC
Created attachment 217019 [details]
preliminary patch
Comment 3 Gordon Bergling freebsd_committer freebsd_triage 2020-08-05 09:01:58 UTC
I applied the patch against r363886 but the warning is still printed during boot. I have attached a dmesg output.
Comment 4 Gordon Bergling freebsd_committer freebsd_triage 2020-08-05 09:02:55 UTC
Created attachment 217023 [details]
dmesg of RPi4 with r363886 and the applied bugfix
Comment 5 Jason A. Harmening freebsd_committer freebsd_triage 2020-08-05 09:14:05 UTC
Are you sure the patch applied cleanly?

The patch includes a change to getenv_quad to no longer call getenv_string_buffer(), but it looks like that's still happening in the attached dmesg:

#3 0xffff0000006977d8 at uma_zalloc_arg+0x2c
#4 0xffff0000003d5fdc at getenv_string_buffer+0x40
#5 0xffff0000003d66f8 at getenv_quad+0x14
#6 0xffff0000003d66c8 at getenv_int+0x14
#7 0xffff0000002d5a44 at vt_fb_init+0xbc
Comment 6 Gordon Bergling freebsd_committer freebsd_triage 2020-08-05 09:37:52 UTC
Created attachment 217024 [details]
verified dmesg of r363886 and the applied bugfix
Comment 7 Gordon Bergling freebsd_committer freebsd_triage 2020-08-05 09:43:09 UTC
(In reply to Jason A. Harmening from comment #5)

I have verified the patch was correctly applied and attached a new dmesg output. The patch applied cleanly, but due to the usage of a git mirror its not shown at the top of the dmesg output.

FreeBSD 13.0-CURRENT #42 57cfba0a880-c270348(master): Wed Aug  5 08:55:32 CEST 2020
    root@tiny.0xfce3.net:/tank/nfs_public/tiny/obj/tank/nfs_public/tiny/src/arm64.aarch64/sys/GENERIC-TCP arm64
FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611aa2)
WARNING: WITNESS option enabled, expect reduced performance.

The kernel config is an "include GENERIC" with additional TCP options, if that matters.
Comment 8 Jason A. Harmening freebsd_committer freebsd_triage 2020-08-05 17:35:18 UTC
Hmm, that doesn't make sense then.  The patch may have other problems, but if you look at the diff it definitely removes the call from getenv_quad() to getenv_string_buffer() which is still seen in the WITNESS backtrace.

How are you building the kernel?
Comment 9 Gordon Bergling freebsd_committer freebsd_triage 2020-08-05 18:13:04 UTC
I usual have a script for compile kernel and world, but this time I did the following,

Everything in /usr/src
# patch < /home/gbergling/tmp/pr_248250.patch
# git add sys/kern/kern_environment.c
# git add sys/kern/subr_hints.c
# git add sys/sys/systm.h
# git status
# git commit -m "Apply patch from PR 248250"
# less ~/tmp/pr_248250.patch
# make -s -j 4 buildkernel NO_CLEAN=yes KERNCONF=GENERIC-TCP
# env MAKEOBJDIRPREFIX=/tank/nfs_public/tiny/obj make KERNCONF=GENERIC-TCP KODIR=/boot/kernel.r363886_and_bf installkernel
# nextboot -k kernel.r363886_and_bf

I verified that the correct kernel was loaded by nextboot. I could try a build without NO_CLEAN=yes.
Comment 10 Gordon Bergling freebsd_committer freebsd_triage 2020-08-05 18:15:07 UTC
Argh, I think I missed the MAKEOBJDIRPREFIX during buildkernel step. I have started a build and have a look if that was the mistake.
Comment 11 Gordon Bergling freebsd_committer freebsd_triage 2020-08-05 18:58:47 UTC
OK, that was mistake sorry, after building a new kernel with applied patch and the correct MAKEOBJDIRPREFIX the witness warning is gone. I attach a new dmesg in a second.
Comment 12 Gordon Bergling freebsd_committer freebsd_triage 2020-08-05 18:59:33 UTC
Created attachment 217035 [details]
dmesg output with the preliminary patch applied
Comment 13 Jason A. Harmening freebsd_committer freebsd_triage 2020-08-05 19:39:26 UTC
cool, thanks for testing.  Is there a separate PR for the gpioled warnings?
Comment 14 Gordon Bergling freebsd_committer freebsd_triage 2020-08-05 19:47:12 UTC
(In reply to Jason A. Harmening from comment #13)

No problem, thanks for fixing! :)

There isn't a PR for the gpioled warnings, yet. They were introduced when the Raspberry PI firmware changes were committed a few days ago. Should I create one?
Comment 15 Jason A. Harmening freebsd_committer freebsd_triage 2020-08-05 19:49:25 UTC
They should be tracked somewhere; you can file a PR or reply to the commit that introduced them if they're recent.
Comment 16 Gordon Bergling freebsd_committer freebsd_triage 2020-08-05 20:02:20 UTC
(In reply to Jason A. Harmening from comment #15)

There were a lot of recent commits involved in the RPi4 firmware parts, so I created PR 248490.
Comment 17 Jason A. Harmening freebsd_committer freebsd_triage 2020-08-09 08:34:22 UTC
Created attachment 217107 [details]
proposed final patch

Patch updated with code review feedback.  Gordon, can you retest?
Comment 18 Gordon Bergling freebsd_committer freebsd_triage 2020-08-09 10:34:24 UTC
(In reply to Jason A. Harmening from comment #17)

I tested the new patch and the witness warning is still gone. No others problems seen. Thanks for the fix.
Comment 19 commit-hook freebsd_committer freebsd_triage 2020-08-14 21:38:07 UTC
A commit references this bug:

Author: jah
Date: Fri Aug 14 21:37:38 UTC 2020
New revision: 364239
URL: https://svnweb.freebsd.org/changeset/base/364239

Log:
  kenv: avoid sleepable alloc for integer tunables

  Avoid performing a potentially-blocking malloc for kenv lookups that will only
  perform non-destructive integer conversions on the returned buffer. Instead,
  perform the strtoq() in-place with the kenv lock held.

  While here, factor the logic around kenv_lock acquire and release into
  kenv_acquire() and kenv_release(), and use these functions for some light
  cleanup. Collapse getenv_string_buffer() into kern_getenv(), as the former
  no longer has any other callers and the only additional task performed by
  the latter is a WITNESS check that hasn't been useful since r362231.

  PR:		248250
  Reported by:	gbe
  Reviewed by:	mjg
  Tested by:	gbe
  Differential Revision:	https://reviews.freebsd.org/D26010

Changes:
  head/sys/kern/kern_environment.c
Comment 20 commit-hook freebsd_committer freebsd_triage 2021-02-17 17:56:12 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=eefc7f388ab66a79f99a5f73236e9fbfc5312d4f

commit eefc7f388ab66a79f99a5f73236e9fbfc5312d4f
Author:     Jason A. Harmening <jah@FreeBSD.org>
AuthorDate: 2020-08-14 21:37:38 +0000
Commit:     Alexander Motin <mav@FreeBSD.org>
CommitDate: 2021-02-17 17:52:24 +0000

    kenv: avoid sleepable alloc for integer tunables

    Avoid performing a potentially-blocking malloc for kenv lookups that will only
    perform non-destructive integer conversions on the returned buffer. Instead,
    perform the strtoq() in-place with the kenv lock held.

    While here, factor the logic around kenv_lock acquire and release into
    kenv_acquire() and kenv_release(), and use these functions for some light
    cleanup. Collapse getenv_string_buffer() into kern_getenv(), as the former
    no longer has any other callers and the only additional task performed by
    the latter is a WITNESS check that hasn't been useful since r362231.

    PR:             248250
    Reported by:    gbe
    Reviewed by:    mjg
    Tested by:      gbe
    Differential Revision:  https://reviews.freebsd.org/D26010

 sys/kern/kern_environment.c | 124 ++++++++++++++++++++++++--------------------
 1 file changed, 67 insertions(+), 57 deletions(-)