Bug 209203

Summary: Suspend panics VESA driver
Product: Base System Reporter: Roger Pau Monné <royger>
Component: kernAssignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me CC: emaste
Priority: --- Keywords: patch, vt
Version: CURRENT   
Hardware: Any   
OS: Any   

Description Roger Pau Monné freebsd_committer freebsd_triage 2016-05-02 15:14:05 UTC
I've got the following panic when trying to suspend a VM that's using the VT driver.

login: panic: mtx_lock() of destroyed mutex @ /usr/src/sys/dev/fb/vesa.c:541
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00002464f0
vpanic() at vpanic+0x182/frame 0xfffffe0000246570
kassert_panic() at kassert_panic+0x126/frame 0xfffffe00002465e0
__mtx_lock_flags() at __mtx_lock_flags+0x142/frame 0xfffffe0000246630
vesa_bios_save_restore() at vesa_bios_save_restore+0x78/frame 0xfffffe0000246680
vga_suspend() at vga_suspend+0xa3/frame 0xfffffe00002466b0
isavga_suspend() at isavga_suspend+0x1d/frame 0xfffffe00002466d0
bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame 0xfffffe00002466f0
bus_generic_suspend() at bus_generic_suspend+0x67/frame 0xfffffe0000246730
bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame 0xfffffe0000246750
bus_generic_suspend() at bus_generic_suspend+0x67/frame 0xfffffe0000246790
bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame 0xfffffe00002467b0
pci_suspend_child() at pci_suspend_child+0x2d/frame 0xfffffe00002467d0
bus_generic_suspend() at bus_generic_suspend+0x67/frame 0xfffffe0000246810
bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame 0xfffffe0000246830
bus_generic_suspend() at bus_generic_suspend+0x67/frame 0xfffffe0000246870
bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame 0xfffffe0000246890
bus_generic_suspend() at bus_generic_suspend+0x67/frame 0xfffffe00002468d0
acpi_suspend() at acpi_suspend+0x2f/frame 0xfffffe00002468f0
bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame 0xfffffe0000246910
bus_generic_suspend() at bus_generic_suspend+0x67/frame 0xfffffe0000246950
bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame 0xfffffe0000246970
bus_generic_suspend() at bus_generic_suspend+0x67/frame 0xfffffe00002469b0
xctrl_suspend() at xctrl_suspend+0x1ae/frame 0xfffffe0000246a20
xctrl_on_watch_event() at xctrl_on_watch_event+0x5e/frame 0xfffffe0000246a50
xenwatch_thread() at xenwatch_thread+0x1a2/frame 0xfffffe0000246a70
fork_exit() at fork_exit+0x84/frame 0xfffffe0000246ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000246ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 14 tid 100029 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
db>

AFAICT it seems like the VESA driver failed to attach, but the suspend hook was not properly removed. Here is the full dmesg from boot:

/boot/kernel/kernel text=0x13ee1c0 data=0x135078+0x4e6478 syms=[0x8+0x160998+0x8+0x178e27]
/boot/entropy size=0x1000
Booting...
GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2016 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.0-CURRENT #14 r298919+bbf7e27(hotplug_scripts)-dirty: Mon May  2 13:34:23 CEST 2016
    root@nuc:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 3.7.1 (tags/RELEASE_371/final 255217) 20151225
WARNING: WITNESS option enabled, expect reduced performance.
VT(vga): text 80x25
XEN: Hypervisor version 4.7 detected.
CPU: Intel(R) Core(TM) i3-5010U CPU @ 2.10GHz (2095.20-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306d4  Family=0x6  Model=0x3d  Stepping=4
  Features=0x17c3fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,ACPI,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0xfff83203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0x1c232a<TSCADJ,BMI1,AVX2,BMI2,ERMS,NFPUSG,RDSEED,ADX,SMAP>
  XSAVE Features=0x1<XSAVEOPT>
Hypervisor: Origin = "XenVMMXenVMM"
real memory  = 125829120 (120 MB)
avail memory = 75702272 (72 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: <Xen HVM>
random: unblocking device.
ioapic0: Changing APIC ID to 1
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-47 on motherboard
random: entropy device external interface
kbd1 at kbdmux0
netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0xffffffff80f00c30, 0) error 19
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
vtvga0: <VT VGA driver> on motherboard
cryptosoft0: <software crypto> on motherboard
acpi0: <Xen> on motherboard
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
cpu0: <ACPI CPU> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 62500000 Hz quality 950
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX3 WDMA2 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc200-0xc20f at device 1.1 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata1: <ATA channel> at channel 1 on atapci0
pci0: <bridge> at device 1.3 (no driver attached)
xenpci0: <Xen Platform Device> port 0xc000-0xc0ff mem 0xf2000000-0xf2ffffff irq 24 at device 2.0 on pci0
vgapci0: <VGA-compatible display> mem 0xf0000000-0xf1ffffff,0xf3050000-0xf3050fff at device 3.0 on pci0
vgapci0: Boot video device
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: console (115200,n,8,1)
xenpv0: <Xen PV bus> on motherboard
granttable0: <Xen Grant-table Device> on xenpv0
xen_et0: <Xen PV Clock> on xenpv0
Event timer "XENTIMER" frequency 1000000000 Hz quality 950
Timecounter "XENTIMER" frequency 1000000000 Hz quality 950
xenstore0: <XenStore> on xenpv0
evtchn0: <Xen event channel user-space device> on xenpv0
privcmd0: <Xen privileged interface user-space device> on xenpv0
debug0: <Xen debug handler> on xenpv0
orm0: <ISA Option ROM> at iomem 0xed800-0xeffff on isa0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
fdc0: No FDOUT register!
ppc0: cannot reserve I/O port range
Timecounters tick every 1.000 msec
xenballoon0: <Xen Balloon Device> on xenstore0
xctrl0: <Xen Control Device> on xenstore0
xs_dev0: <Xenstore user-space device> on xenstore0
xenbusb_front0: <Xen Frontend Devices> on xenstore0
xenbusb_add_device: Device device/suspend/event-channel ignored. State 6
xn0: <Virtual Network Interface> at device/vif/0 on xenbusb_front0
xn0: Ethernet address: 00:16:3e:74:3d:76
xenbusb_back0: <Xen Backend Devices> on xenstore0
xn0: backend features: feature-sg
xbd0: 22528MB <Virtual Block Device> at device/vbd/768 on xenbusb_front0
xbd0: attaching as ada0
xbd0: features: flush, write_barrier
xbd0: synchronize cache commands enabled.
Timecounter "TSC" frequency 2095195408 Hz quality 800
WARNING: WITNESS option enabled, expect reduced performance.
Trying to mount root from ufs:/dev/gpt/rootfs [rw]...
GEOM: ada0: the secondary GPT header is not in the last LBA.
WARNING: / was not properly dismounted
Setting hostuuid: 2394be13-2738-e511-a7fe-b8aeed708531.
Setting hostid: 0x773d4a18.
warning: total configured swap (262144 pages) exceeds maximum recommended amount (174304 pages).
warning: increase kern.maxswzone or reduce amount of swap.
Starting file system checks:
** SU+J Recovering /dev/gpt/rootfs
** Reading 33554432 byte journal from inode 8.
** Building recovery table.
** Resolving unreferenced inode list.
** Processing journal entries.
** 104 journal records in 9216 bytes for 36.11% utilization
** Freed 0 inodes (0 dirs) 0 blocks, and 0 frags.

***** FILE SYSTEM MARKED CLEAN *****
Mounting local file systems:.
lock order reversal:
 1st 0xfffff80003785d50 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2498
 2nd 0xfffffe0005949bd0 bufwait (bufwait) @ /usr/src/sys/ufs/ffs/ffs_vnops.c:263
 3rd 0xfffff80006464d50 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2498
stack backtrace:
#0 0xffffffff80a8a820 at witness_debugger+0x70
#1 0xffffffff80a8a721 at witness_checkorder+0xe71
#2 0xffffffff80a0703b at __lockmgr_args+0xd3b
#3 0xffffffff80cdc646 at ffs_lock+0xa6
#4 0xffffffff80fe0090 at VOP_LOCK1_APV+0x100
#5 0xffffffff80afa67a at _vn_lock+0x9a
#6 0xffffffff80aeac43 at vget+0x63
#7 0xffffffff80add53c at vfs_hash_get+0xcc
#8 0xffffffff80cd7c60 at ffs_vgetf+0x40
#9 0xffffffff80ccf5f1 at softdep_sync_buf+0xad1
#10 0xffffffff80cdd226 at ffs_syncvnode+0x256
#11 0xffffffff80cb3ed0 at ffs_truncate+0x8e0
#12 0xffffffff80ce469b at ufs_direnter+0x7bb
#13 0xffffffff80ced873 at ufs_makeinode+0x5f3
#14 0xffffffff80ce941d at ufs_create+0x2d
#15 0xffffffff80fddaf1 at VOP_CREATE_APV+0xf1
#16 0xffffffff80af9ed8 at vn_open_cred+0x2f8
#17 0xffffffff80af326c at kern_openat+0x25c
ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/gcc48
32-bit compatibility ldconfig path: /usr/lib32
Setting up harvesting:[UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,NET_ETHER,NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
Feeding entropy:.
xn0: link state changed to DOWN
xn0: link state changed to UP
Starting Network: lo0 xn0.
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
	inet 127.0.0.1 netmask 0xff000000
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
	groups: lo
xn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=3<RXCSUM,TXCSUM>
	ether 00:16:3e:74:3d:76
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet manual
	status: active
Starting devd.
Starting dhclient.
DHCPREQUEST on xn0 to 255.255.255.255 port 67
DHCPACK from 172.16.1.1
bound to 172.16.1.147 -- renewal in 43200 seconds.
add net fe80::: gateway ::1
add net ff02::: gateway ::1
add net ::ffff:0.0.0.0: gateway ::1
add net ::0.0.0.0: gateway ::1
Creating and/or trimming log files.
Starting syslogd.
No core dumps found.
Starting casperd.
Clearing /tmp (X related).
Cleaning xenstore database.
Starting xenservices: xenstored, xenconsoled.May  2 14:55:17  xenstored: Checking store ...
May  2 14:55:17  xenstored: Checking store complete.
WARNING: Failed to open connection to gnttab
FATAL: Failed to initialize dom0 state: No such file or directory
..............................May  2 14:55:49  xenconsoled: Failed to contact xenstore (No such file or directory).  Is it running?

Setting domain 0 name, domid and JSON config...
libxl: error: libxl.c:122:libxl_ctx_alloc: cannot connect to xenstore: No such file or directory
cannot init libxl context
Updating motd:.
Mounting late file systems:.
Configuring vt: blanktime.
Performing sanity check on sshd configuration.
Starting sshd.
Starting sendmail_submit.
Starting sendmail_msp_queue.
Starting cron.
Starting background file system checks in 60 seconds.

Mon May  2 14:55:52 UTC 2016

FreeBSD/amd64 (Amnesiac) (ttyu0)
Comment 1 Roger Pau Monné freebsd_committer freebsd_triage 2016-07-12 10:47:39 UTC
This is quite critical for Xen, and should be fixed before 11 is released. I've started to debug it.
Comment 2 Roger Pau Monné freebsd_committer freebsd_triage 2016-07-12 14:05:21 UTC
The following patch fixes the issue for me:

https://reviews.freebsd.org/D7196
Comment 3 commit-hook freebsd_committer freebsd_triage 2016-07-20 09:30:40 UTC
A commit references this bug:

Author: royger
Date: Wed Jul 20 09:29:39 UTC 2016
New revision: 303076
URL: https://svnweb.freebsd.org/changeset/base/303076

Log:
  vesa: fix panic on suspend

  Fix the following panic seen when migrating a FreeBSD guest on Xen:

  panic: mtx_lock() of destroyed mutex @ /usr/src/sys/dev/fb/vesa.c:541
  cpuid = 0
  KDB: stack backtrace:
  db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001d2fa4f0
  vpanic() at vpanic+0x182/frame 0xfffffe001d2fa570
  kassert_panic() at kassert_panic+0x126/frame 0xfffffe001d2fa5e0
  __mtx_lock_flags() at __mtx_lock_flags+0x15b/frame 0xfffffe001d2fa630
  vesa_bios_save_restore() at vesa_bios_save_restore+0x78/frame 0xfffffe001d2fa680
  vga_suspend() at vga_suspend+0xa3/frame 0xfffffe001d2fa6b0
  isavga_suspend() at isavga_suspend+0x1d/frame 0xfffffe001d2fa6d0
  bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame
  [...]

  This is caused because vga_sub_configure (which is called if the VGA adapter
  is attached after VESA tried to initialize), points to vesa_configure, which
  doesn't initialize the VESA mutex. In order to fix it, make sure
  vga_sub_configure points to vesa_load, so that all the needed vesa
  components are properly initialized.

  Sponsored by:		Citrix Systems R&D
  MFC after:		3 days
  PR:			209203
  Reviewed by:		dumbbell
  Differential revision:	https://reviews.freebsd.org/D7196

Changes:
  head/sys/dev/fb/vesa.c
Comment 4 Roger Pau Monné freebsd_committer freebsd_triage 2016-07-20 09:48:35 UTC
Fixed by r303076.
Comment 5 commit-hook freebsd_committer freebsd_triage 2016-08-06 23:52:24 UTC
A commit references this bug:

Author: jhb
Date: Sat Aug  6 23:52:09 UTC 2016
New revision: 303807
URL: https://svnweb.freebsd.org/changeset/base/303807

Log:
  MFC 303076,303225: Use MTX_SYSINIT for the VESA lock.

  303076:
  vesa: fix panic on suspend

  Fix the following panic seen when migrating a FreeBSD guest on Xen:

  panic: mtx_lock() of destroyed mutex @ /usr/src/sys/dev/fb/vesa.c:541
  cpuid = 0
  KDB: stack backtrace:
  db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001d2fa4f0
  vpanic() at vpanic+0x182/frame 0xfffffe001d2fa570
  kassert_panic() at kassert_panic+0x126/frame 0xfffffe001d2fa5e0
  __mtx_lock_flags() at __mtx_lock_flags+0x15b/frame 0xfffffe001d2fa630
  vesa_bios_save_restore() at vesa_bios_save_restore+0x78/frame 0xfffffe001d2fa680
  vga_suspend() at vga_suspend+0xa3/frame 0xfffffe001d2fa6b0
  isavga_suspend() at isavga_suspend+0x1d/frame 0xfffffe001d2fa6d0
  bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame
  [...]

  This is caused because vga_sub_configure (which is called if the VGA adapter
  is attached after VESA tried to initialize), points to vesa_configure, which
  doesn't initialize the VESA mutex. In order to fix it, make sure
  vga_sub_configure points to vesa_load, so that all the needed vesa
  components are properly initialized.

  303225:
  Use MTX_SYSINIT for the VESA lock.

  vesa_init_done isn't a reliable guard for the mutex init.  If
  vesa_configure() doesn't find valid VESA info it will not set
  vesa_init_done, but the lock will remain initialized.  Revert r303076
  and use MTX_SYSINIT to deterministically init the lock.

  PR:		209203
  Approved by:	re (gjb)

Changes:
_U  stable/11/
  stable/11/sys/dev/fb/vesa.c
Comment 6 commit-hook freebsd_committer freebsd_triage 2016-08-06 23:54:26 UTC
A commit references this bug:

Author: jhb
Date: Sat Aug  6 23:53:34 UTC 2016
New revision: 303808
URL: https://svnweb.freebsd.org/changeset/base/303808

Log:
  MFC 303076,303225: Use MTX_SYSINIT for the VESA lock.

  303076:
  vesa: fix panic on suspend

  Fix the following panic seen when migrating a FreeBSD guest on Xen:

  panic: mtx_lock() of destroyed mutex @ /usr/src/sys/dev/fb/vesa.c:541
  cpuid = 0
  KDB: stack backtrace:
  db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001d2fa4f0
  vpanic() at vpanic+0x182/frame 0xfffffe001d2fa570
  kassert_panic() at kassert_panic+0x126/frame 0xfffffe001d2fa5e0
  __mtx_lock_flags() at __mtx_lock_flags+0x15b/frame 0xfffffe001d2fa630
  vesa_bios_save_restore() at vesa_bios_save_restore+0x78/frame 0xfffffe001d2fa680
  vga_suspend() at vga_suspend+0xa3/frame 0xfffffe001d2fa6b0
  isavga_suspend() at isavga_suspend+0x1d/frame 0xfffffe001d2fa6d0
  bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame
  [...]

  This is caused because vga_sub_configure (which is called if the VGA adapter
  is attached after VESA tried to initialize), points to vesa_configure, which
  doesn't initialize the VESA mutex. In order to fix it, make sure
  vga_sub_configure points to vesa_load, so that all the needed vesa
  components are properly initialized.

  303225:
  Use MTX_SYSINIT for the VESA lock.

  vesa_init_done isn't a reliable guard for the mutex init.  If
  vesa_configure() doesn't find valid VESA info it will not set
  vesa_init_done, but the lock will remain initialized.  Revert r303076
  and use MTX_SYSINIT to deterministically init the lock.

  PR:		209203

Changes:
_U  stable/10/
  stable/10/sys/dev/fb/vesa.c
_U  stable/9/sys/
_U  stable/9/sys/dev/
  stable/9/sys/dev/fb/vesa.c