Bug 248250

Summary: Witness Warning: 'Exclusive Sleep Mutex vtdev locked' warning on arm64 (RPi4b)
Product: Base System Reporter: Gordon Bergling <gbe>
Component: kernAssignee: Jason A. Harmening <jah>
Status: Closed FIXED    
Severity: Affects Some People CC: jah
Priority: ---    
Version: CURRENT   
Hardware: arm64   
OS: Any   
Attachments:
Description Flags
preliminary patch
none
dmesg of RPi4 with r363886 and the applied bugfix
none
verified dmesg of r363886 and the applied bugfix
none
dmesg output with the preliminary patch applied
none
proposed final patch none

Description Gordon Bergling freebsd_committer 2020-07-24 19:44:01 UTC
While booting the RPi4b I get the following warning:

uma_zalloc_debug: zone "kenv" with the following non-sleepable locks held:
exclusive sleep mutex vtdev (vtdev) r = 0 (0xffff0000009b4280) locked @ /tank/nfs_public/tiny/src/sys/dev/vt/vt_core.c:2818
stack backtrace:
#0 0xffff00000048f2d8 at witness_debugger+0x64
#1 0xffff0000004902e8 at witness_warn+0x3f0
#2 0xffff000000690620 at uma_zalloc_debug+0x2c
#3 0xffff000000690088 at uma_zalloc_arg+0x2c
#4 0xffff0000003d3498 at getenv_string_buffer+0x40
#5 0xffff0000003d3bb4 at getenv_quad+0x14
#6 0xffff0000003d3b84 at getenv_int+0x14
#7 0xffff0000002d40a4 at vt_fb_init+0xbc
#8 0xffff0000002d8f50 at vt_replace_backend+0x10c
#9 0xffff0000002d41dc at vt_fb_attach+0x14
#10 0xffff00000019de28 at fbd_register+0xec
#11 0xffff00000045c850 at device_attach+0x400
#12 0xffff00000045c3b8 at device_probe_and_attach+0x7c
#13 0xffff0000006eefa8 at bcm_fb_attach+0x280
#14 0xffff00000045c850 at device_attach+0x400
#15 0xffff00000045c3b8 at device_probe_and_attach+0x7c
#16 0xffff00000045e59c at bus_generic_new_pass+0xf8
#17 0xffff00000045e54c at bus_generic_new_pass+0xa8
fb0: 1824x984(1824x984@0,0) 32bpp
Comment 1 Jason A. Harmening freebsd_committer 2020-07-31 19:46:40 UTC
I'll take it.
Comment 2 Jason A. Harmening freebsd_committer 2020-08-05 06:38:53 UTC
Created attachment 217019 [details]
preliminary patch
Comment 3 Gordon Bergling freebsd_committer 2020-08-05 09:01:58 UTC
I applied the patch against r363886 but the warning is still printed during boot. I have attached a dmesg output.
Comment 4 Gordon Bergling freebsd_committer 2020-08-05 09:02:55 UTC
Created attachment 217023 [details]
dmesg of RPi4 with r363886 and the applied bugfix
Comment 5 Jason A. Harmening freebsd_committer 2020-08-05 09:14:05 UTC
Are you sure the patch applied cleanly?

The patch includes a change to getenv_quad to no longer call getenv_string_buffer(), but it looks like that's still happening in the attached dmesg:

#3 0xffff0000006977d8 at uma_zalloc_arg+0x2c
#4 0xffff0000003d5fdc at getenv_string_buffer+0x40
#5 0xffff0000003d66f8 at getenv_quad+0x14
#6 0xffff0000003d66c8 at getenv_int+0x14
#7 0xffff0000002d5a44 at vt_fb_init+0xbc
Comment 6 Gordon Bergling freebsd_committer 2020-08-05 09:37:52 UTC
Created attachment 217024 [details]
verified dmesg of r363886 and the applied bugfix
Comment 7 Gordon Bergling freebsd_committer 2020-08-05 09:43:09 UTC
(In reply to Jason A. Harmening from comment #5)

I have verified the patch was correctly applied and attached a new dmesg output. The patch applied cleanly, but due to the usage of a git mirror its not shown at the top of the dmesg output.

FreeBSD 13.0-CURRENT #42 57cfba0a880-c270348(master): Wed Aug  5 08:55:32 CEST 2020
    root@tiny.0xfce3.net:/tank/nfs_public/tiny/obj/tank/nfs_public/tiny/src/arm64.aarch64/sys/GENERIC-TCP arm64
FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611aa2)
WARNING: WITNESS option enabled, expect reduced performance.

The kernel config is an "include GENERIC" with additional TCP options, if that matters.
Comment 8 Jason A. Harmening freebsd_committer 2020-08-05 17:35:18 UTC
Hmm, that doesn't make sense then.  The patch may have other problems, but if you look at the diff it definitely removes the call from getenv_quad() to getenv_string_buffer() which is still seen in the WITNESS backtrace.

How are you building the kernel?
Comment 9 Gordon Bergling freebsd_committer 2020-08-05 18:13:04 UTC
I usual have a script for compile kernel and world, but this time I did the following,

Everything in /usr/src
# patch < /home/gbergling/tmp/pr_248250.patch
# git add sys/kern/kern_environment.c
# git add sys/kern/subr_hints.c
# git add sys/sys/systm.h
# git status
# git commit -m "Apply patch from PR 248250"
# less ~/tmp/pr_248250.patch
# make -s -j 4 buildkernel NO_CLEAN=yes KERNCONF=GENERIC-TCP
# env MAKEOBJDIRPREFIX=/tank/nfs_public/tiny/obj make KERNCONF=GENERIC-TCP KODIR=/boot/kernel.r363886_and_bf installkernel
# nextboot -k kernel.r363886_and_bf

I verified that the correct kernel was loaded by nextboot. I could try a build without NO_CLEAN=yes.
Comment 10 Gordon Bergling freebsd_committer 2020-08-05 18:15:07 UTC
Argh, I think I missed the MAKEOBJDIRPREFIX during buildkernel step. I have started a build and have a look if that was the mistake.
Comment 11 Gordon Bergling freebsd_committer 2020-08-05 18:58:47 UTC
OK, that was mistake sorry, after building a new kernel with applied patch and the correct MAKEOBJDIRPREFIX the witness warning is gone. I attach a new dmesg in a second.
Comment 12 Gordon Bergling freebsd_committer 2020-08-05 18:59:33 UTC
Created attachment 217035 [details]
dmesg output with the preliminary patch applied
Comment 13 Jason A. Harmening freebsd_committer 2020-08-05 19:39:26 UTC
cool, thanks for testing.  Is there a separate PR for the gpioled warnings?
Comment 14 Gordon Bergling freebsd_committer 2020-08-05 19:47:12 UTC
(In reply to Jason A. Harmening from comment #13)

No problem, thanks for fixing! :)

There isn't a PR for the gpioled warnings, yet. They were introduced when the Raspberry PI firmware changes were committed a few days ago. Should I create one?
Comment 15 Jason A. Harmening freebsd_committer 2020-08-05 19:49:25 UTC
They should be tracked somewhere; you can file a PR or reply to the commit that introduced them if they're recent.
Comment 16 Gordon Bergling freebsd_committer 2020-08-05 20:02:20 UTC
(In reply to Jason A. Harmening from comment #15)

There were a lot of recent commits involved in the RPi4 firmware parts, so I created PR 248490.
Comment 17 Jason A. Harmening freebsd_committer 2020-08-09 08:34:22 UTC
Created attachment 217107 [details]
proposed final patch

Patch updated with code review feedback.  Gordon, can you retest?
Comment 18 Gordon Bergling freebsd_committer 2020-08-09 10:34:24 UTC
(In reply to Jason A. Harmening from comment #17)

I tested the new patch and the witness warning is still gone. No others problems seen. Thanks for the fix.
Comment 19 commit-hook freebsd_committer 2020-08-14 21:38:07 UTC
A commit references this bug:

Author: jah
Date: Fri Aug 14 21:37:38 UTC 2020
New revision: 364239
URL: https://svnweb.freebsd.org/changeset/base/364239

Log:
  kenv: avoid sleepable alloc for integer tunables

  Avoid performing a potentially-blocking malloc for kenv lookups that will only
  perform non-destructive integer conversions on the returned buffer. Instead,
  perform the strtoq() in-place with the kenv lock held.

  While here, factor the logic around kenv_lock acquire and release into
  kenv_acquire() and kenv_release(), and use these functions for some light
  cleanup. Collapse getenv_string_buffer() into kern_getenv(), as the former
  no longer has any other callers and the only additional task performed by
  the latter is a WITNESS check that hasn't been useful since r362231.

  PR:		248250
  Reported by:	gbe
  Reviewed by:	mjg
  Tested by:	gbe
  Differential Revision:	https://reviews.freebsd.org/D26010

Changes:
  head/sys/kern/kern_environment.c
Comment 20 commit-hook freebsd_committer 2021-02-17 17:56:12 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=eefc7f388ab66a79f99a5f73236e9fbfc5312d4f

commit eefc7f388ab66a79f99a5f73236e9fbfc5312d4f
Author:     Jason A. Harmening <jah@FreeBSD.org>
AuthorDate: 2020-08-14 21:37:38 +0000
Commit:     Alexander Motin <mav@FreeBSD.org>
CommitDate: 2021-02-17 17:52:24 +0000

    kenv: avoid sleepable alloc for integer tunables

    Avoid performing a potentially-blocking malloc for kenv lookups that will only
    perform non-destructive integer conversions on the returned buffer. Instead,
    perform the strtoq() in-place with the kenv lock held.

    While here, factor the logic around kenv_lock acquire and release into
    kenv_acquire() and kenv_release(), and use these functions for some light
    cleanup. Collapse getenv_string_buffer() into kern_getenv(), as the former
    no longer has any other callers and the only additional task performed by
    the latter is a WITNESS check that hasn't been useful since r362231.

    PR:             248250
    Reported by:    gbe
    Reviewed by:    mjg
    Tested by:      gbe
    Differential Revision:  https://reviews.freebsd.org/D26010

 sys/kern/kern_environment.c | 124 ++++++++++++++++++++++++--------------------
 1 file changed, 67 insertions(+), 57 deletions(-)