Bug 242724

Summary: bhyve: Unkillable processes stuck in 'STOP' state (vmm::vm_handle_suspend())
Product: Base System Reporter: Aleksandr Fedorov <afedorov>
Component: bhyveAssignee: Konstantin Belousov <kib>
Status: Closed FIXED    
Severity: Affects Many People CC: andrey, jhb, kib, markj, rgrimes, virtualization
Priority: ---    
Version: CURRENT   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
procstat -k
none
prockstat threads
none
prockstat signals
none
procstat tsignals none

Description Aleksandr Fedorov freebsd_committer freebsd_triage 2019-12-19 13:04:00 UTC
On one of our servers, there are three unkillable bhyve processes which stuck in 'STOP' state. All has the same symptoms:

# procstat -kk 9277 | sort -k 4
  PID    TID COMM                TDNAME              KSTACK
 9277 102504 bhyve               blk-1:0:0-0         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102514 bhyve               blk-1:0:0-1         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102752 bhyve               blk-1:0:0-2         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102760 bhyve               blk-1:0:0-3         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102761 bhyve               blk-1:0:0-4         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102763 bhyve               blk-1:0:0-5         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102764 bhyve               blk-1:0:0-6         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102766 bhyve               blk-1:0:0-7         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102767 bhyve               blk-3:0-0           mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102768 bhyve               blk-3:0-1           mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102769 bhyve               blk-3:0-2           mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102770 bhyve               blk-3:0-3           mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102771 bhyve               blk-3:0-4           mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102772 bhyve               blk-3:0-5           mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102773 bhyve               blk-3:0-6           mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102774 bhyve               blk-3:0-7           mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102775 bhyve               blk-4:0:0-0         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102780 bhyve               blk-4:0:0-1         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102887 bhyve               blk-4:0:0-2         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102888 bhyve               blk-4:0:0-3         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102889 bhyve               blk-4:0:0-4         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102890 bhyve               blk-4:0:0-5         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102891 bhyve               blk-4:0:0-6         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102892 bhyve               blk-4:0:0-7         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102687 bhyve               mevent              mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102941 bhyve               pci-devstat         mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102983 bhyve               pci-reconfig        mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102937 bhyve               rfb                 mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f
 9277 102991 bhyve               vcpu 0              mi_switch+0xe2 sleepq_timedwait+0x2f msleep_spin_sbt+0xd8 vm_run+0x6c7 vmmdev_ioctl+0x9b2 devfs_ioctl+0xc7 VOP_IOCTL_APV+0x56 vn_ioctl+0x16a devfs_ioctl_f+0x1f kern_ioctl+0x27d sys_ioctl+0x15d amd64_syscall+0x3b0 fast_syscall_common+0x101
 9277 100782 bhyve               vcpu 1              mi_switch+0xe2 thread_suspend_switch+0xd4 thread_single+0x47b sigexit+0x57 postsig+0x2f8 ast+0x327 fast_syscall_common+0x198
 9277 100788 bhyve               vcpu 2              mi_switch+0xe2 sleepq_timedwait+0x2f msleep_spin_sbt+0xd8 vm_run+0x6c7 vmmdev_ioctl+0x9b2 devfs_ioctl+0xc7 VOP_IOCTL_APV+0x56 vn_ioctl+0x16a devfs_ioctl_f+0x1f kern_ioctl+0x27d sys_ioctl+0x15d amd64_syscall+0x3b0 fast_syscall_common+0x101
 9277 101385 bhyve               vcpu 3              mi_switch+0xe2 sleepq_timedwait+0x2f msleep_spin_sbt+0xd8 vm_run+0x6c7 vmmdev_ioctl+0x9b2 devfs_ioctl+0xc7 VOP_IOCTL_APV+0x56 vn_ioctl+0x16a devfs_ioctl_f+0x1f kern_ioctl+0x27d sys_ioctl+0x15d amd64_syscall+0x3b0 fast_syscall_common+0x101
 9277 103402 bhyve               vcpu 4              mi_switch+0xe2 sleepq_timedwait+0x2f msleep_spin_sbt+0xd8 vm_run+0x6c7 vmmdev_ioctl+0x9b2 devfs_ioctl+0xc7 VOP_IOCTL_APV+0x56 vn_ioctl+0x16a devfs_ioctl_f+0x1f kern_ioctl+0x27d sys_ioctl+0x15d amd64_syscall+0x3b0 fast_syscall_common+0x101
 9277 102206 bhyve               vcpu 5              mi_switch+0xe2 sleepq_timedwait+0x2f msleep_spin_sbt+0xd8 vm_run+0x6c7 vmmdev_ioctl+0x9b2 devfs_ioctl+0xc7 VOP_IOCTL_APV+0x56 vn_ioctl+0x16a devfs_ioctl_f+0x1f kern_ioctl+0x27d sys_ioctl+0x15d amd64_syscall+0x3b0 fast_syscall_common+0x101
 9277 102932 bhyve               vtnet-5:0 tx        mi_switch+0xe2 thread_suspend_check+0x297 ast+0x3b9 doreti_ast+0x1f

# procstat threads 9277|grep vcpu
 9277 100782 bhyve               vcpu 1               -1  124 stop    -
 9277 100788 bhyve               vcpu 2               -1  123 stop    vmsusp
 9277 101385 bhyve               vcpu 3               -1  123 stop    vmsusp
 9277 102206 bhyve               vcpu 5               -1  122 stop    vmsusp
 9277 102991 bhyve               vcpu 0               -1  124 stop    vmsusp
 9277 103402 bhyve               vcpu 4               -1  123 stop    vmsusp

As you can see, One of the VCPU  threads exit, the rest are sleeping in kernel mode: https://svnweb.freebsd.org/base/head/sys/amd64/vmm/vmm.c?view=markup#l1515

The main problem is that if the bhyve process is killed, the threads driving those vCPUs will be gone, resulting in the remaining thread waiting in kernelspace forever: https://svnweb.freebsd.org/base/head/sys/amd64/vmm/vmm.c?view=markup#l1507

I found that the same bug was fixed in SmartOS: https://github.com/joyent/illumos-joyent/commit/dce228e4331f185347c3e0325cab8a3af72d6410#diff-721b4b2a86bbb8d11c6e01a2995868e6

Unfortunately the site https://smartos.org/bugview/OS-6888 is not available, so it is impossible to see the detailed description.

Can we apply similar fix to our bhyve code?
Comment 1 Aleksandr Fedorov freebsd_committer freebsd_triage 2019-12-19 13:04:49 UTC
Created attachment 210059 [details]
procstat -k
Comment 2 Aleksandr Fedorov freebsd_committer freebsd_triage 2019-12-19 13:05:15 UTC
Created attachment 210060 [details]
prockstat threads
Comment 3 Aleksandr Fedorov freebsd_committer freebsd_triage 2019-12-19 13:05:55 UTC
Created attachment 210061 [details]
prockstat signals
Comment 4 Aleksandr Fedorov freebsd_committer freebsd_triage 2019-12-19 13:06:12 UTC
Created attachment 210062 [details]
procstat tsignals
Comment 5 Konstantin Belousov freebsd_committer freebsd_triage 2019-12-19 15:30:41 UTC
Show line number for vm_run+0x6c7.

What you wrote about vm_handle_rendezvous() might be reasonable but I am not sure that this is the case.  Anyway, the line number information should confirm or deny.

If indeed the issue is the infinite wait for rendezvous, it seems to be easy to handle it.
Comment 6 Konstantin Belousov freebsd_committer freebsd_triage 2019-12-19 20:57:46 UTC
Try https://reviews.freebsd.org/D22881.
I did not tested that.
Comment 8 Konstantin Belousov freebsd_committer freebsd_triage 2019-12-20 14:37:45 UTC
(In reply to Andrey Fesenko from comment #7)
I have no idea what did you tried to say.  smartos link opens up as a blank page, and the yandex link returns 404.
Comment 9 Andrey Fesenko 2019-12-20 20:02:28 UTC
(In reply to Konstantin Belousov from comment #8)
hm... yandex cach not permanent link :(

original page bug OS-6888 long time not work, now working, save https://web.archive.org/web/20191220195714/https://smartos.org/bugview/OS-6888 if smartos bugview down again.
Comment 10 commit-hook freebsd_committer freebsd_triage 2020-01-02 22:37:38 UTC
A commit references this bug:

Author: kib
Date: Thu Jan  2 22:37:05 UTC 2020
New revision: 356296
URL: https://svnweb.freebsd.org/changeset/base/356296

Log:
  bhyve: terminate waiting loops if thread suspension is requested.

  PR:	242724
  Reviewed by:	markj
  Reported and tested by:	Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
  	 (previous version)
  Sponsored by:	The FreeBSD Foundation
  MFC after:	1 week
  Differential revision:	https://reviews.freebsd.org/D22881

Changes:
  head/sys/amd64/include/vmm.h
  head/sys/amd64/vmm/io/vioapic.c
  head/sys/amd64/vmm/vmm.c
  head/sys/amd64/vmm/vmm_dev.c
Comment 11 commit-hook freebsd_committer freebsd_triage 2020-01-09 08:30:23 UTC
A commit references this bug:

Author: kib
Date: Thu Jan  9 08:29:32 UTC 2020
New revision: 356543
URL: https://svnweb.freebsd.org/changeset/base/356543

Log:
  MFC r356296:
  bhyve: terminate waiting loops if thread suspension is requested.

  PR:	242724

Changes:
_U  stable/12/
  stable/12/sys/amd64/include/vmm.h
  stable/12/sys/amd64/vmm/io/vioapic.c
  stable/12/sys/amd64/vmm/vmm.c
  stable/12/sys/amd64/vmm/vmm_dev.c