Bug 253087 - Timeout of bufdaemon happens at shutdown time with -CURRENT amd64 and VirtualBox VM
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Kyle Evans
URL: https://reviews.freebsd.org/D29132
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-29 22:35 UTC by Yasuhiro Kimura
Modified: 2021-03-14 19:47 UTC

See Also:
kevans: mfc-stable13+
kevans: mfc-stable12-
kevans: mfc-stable11-


Attachments
Screenshot of a 13.0-RC1 guest (819.13 KB, image/png)
2021-03-07 13:51 UTC, Graham Perrin

Description Yasuhiro Kimura freebsd_committer freebsd_triage 2021-01-29 22:35:41 UTC
[Host]
CPU: Intel Core i7 9700 3.00GHz
OS: 64bit Windows 10 20H2
VirtualBox: 6.1.18

[VM]
CPU: 4 core
Mem: 8GB
HDD: 100GB

Since December I've been seeing bufdaemon time out at shutdown with -CURRENT amd64 in a VirtualBox VM under the above conditions. The problem occurs after I log in to the VM and do a certain amount of work. For example, it is reproducible by running `make buildworld`.

The same problem was reported by AMD Ryzen users on the freebsd-current mailing list, and a fix for that CPU has already been committed. But in my case the CPU is Intel, and of course the fix for AMD Ryzen doesn't solve the problem. So I bisected the source tree and found that the following commit is the source of the problem.

----------------------------------------------------------------------
commit 84eaf2ccc6aa05da7b7389991d3023698b756e3f
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Mon Dec 21 19:02:31 2020 +0200

    x86: stop punishing VMs with low priority for TSC timecounter
    
    I suspect that virtualization techniques improved from the time when we
    have to effectively disable TSC use in VM.  For instance, it was reported
    (complained) in https://github.com/JuliaLang/julia/issues/38877 that
    FreeBSD is groundlessly slow on AWS with some loads.
    
    Remove the check and start watching for complaints.
    
    Reviewed by:    emaste, grehan
    Discussed with: cperciva
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D27629
----------------------------------------------------------------------

I confirmed that the problem still happens with f17fc5439f517d06ac8976f53354789cde5a7427, committed on Fri Jan 29 09:34:27 2021 -0500, but reverting the above commit fixes it. Moreover, the same problem happens with 40cb0344eb27e0bb9a112ff50812a7e77816d6be of stable/13, committed on Thu Jan 28 18:59:57 2021 -0500, and reverting the above commit also fixes it.

Cc-ing the committer of 84eaf2ccc6aa05da7b7389991d3023698b756e3f
Comment 1 Yasuhiro Kimura freebsd_committer freebsd_triage 2021-02-16 19:46:48 UTC
The following are the detailed steps to reproduce this problem.

1.  Download FreeBSD-14-CURRENT-20210211-c511a5ab53b-256609-disc1.iso from the FreeBSD.org download server (https://download.freebsd.org/)

2.  Create a new VirtualBox VM with the following settings. -- (*1)

    General:
      Name: FreeBSD
      Operating System: FreeBSD (64-bit)
    System:
      Base Memory: 8192 MB
      Processors: 4
      Boot Order: Hard Disk
      EFI: Enabled
      Acceleration: VT-x/AMD-V, Nested Paging
    Display:
      Video Memory: 16 MB
      Graphics Controller: VMSVGA
      Remote Desktop Server: Disabled
      Recording: Disabled
    Storage:
      Controller: AHCI
        SATA Port 0: FreeBSD.vdi (Normal 100GB)
        SATA Port 1: [Optical Drive] FreeBSD-14-CURRENT-20210211-c511a5ab53b-256609-disc1.iso 
    Audio:
      Disabled
    Network:
      Adapter 1: Intel PRO/1000 MT Desktop (Bridged Adapter, Realtek PCIe GBE Family Controller)
    USB:
      USB Controller: OHCI EHCI
      Device Filters: 0 (0 active)
    Shared folders:
      None

3.  Start VM

4.  Install the OS with the following settings. -- (*2)

    * Use default keymap
    * Install base, kernel and lib32
    * Select 'Auto (ZFS)' as partitioning
    * Change 2 items of ZFS configuration
      - Partition scheme -> GPT (UEFI)
      - Swap Size -> 8g
    * Select 'stripe' as Virtual Device Type
    * Select 'ada0: VBOX HARDDISK'
    * Configure 'em0' as follows
      IPv4: manual configuration
      IPv6: disabled
    * Set Time Zone to 'Asia/Japan'
    * Enable 'sshd', 'ntpdate', 'ntpd' and 'dumpdev'
    * No security hardening options
    * Don't add user accounts
    * Do nothing at final configuration
    * Don't do manual configuration
    * Reboot

5.  Login as root

6.  cd /usr

7.  pkg install git-tiny

8.  git clone https://git.freebsd.org/src.git

9.  cd src

10. make -j 4 buildworld

11. shutdown -h now

Note: (*1),(*2)
Not sure if each setting affects the problem. I just wrote what I did while creating VM and installing OS.
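
Steps 6 to 11 above can be collected into a small script. This is only a sketch: the `run` and `reproduce_steps` helpers and the `DRY_RUN` variable are hypothetical names introduced here, and the `-y` flag for `pkg install` is an addition to skip the confirmation prompt. By default the script only prints the commands, because the last step halts the machine.

```sh
#!/bin/sh
# Sketch of steps 6-11 from the list above.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 inside the guest
# to actually execute the commands, including the final shutdown.

run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"                      # execute the command in the current shell
    else
        printf '%s\n' "$*"        # dry run: just show what would be executed
    fi
}

reproduce_steps() {
    run cd /usr &&
    run pkg install -y git-tiny &&     # -y added here; not in the original steps
    run git clone https://git.freebsd.org/src.git &&
    run cd src &&
    run make -j 4 buildworld &&
    run shutdown -h now
}
```

Calling `reproduce_steps` prints the six commands; calling it as root in the guest with `DRY_RUN=0` set would run them.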
Comment 2 Ed Maste freebsd_committer freebsd_triage 2021-03-05 20:58:59 UTC
Maybe it makes sense to leave non-tsc default on VirtualBox only?
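
Until a kernel-side change lands, the same idea can be approximated by hand from inside a guest. The sketch below assumes that `sysctl -n kern.vm_guest` reports `vbox` under VirtualBox and that the non-TSC counter is named `ACPI-fast`; the function names are hypothetical. It only prints the command it would suggest and changes nothing itself.

```sh
# Decide whether a guest should move off the TSC timecounter.
#   $1: value of `sysctl -n kern.vm_guest` (assumed to be "vbox" on VirtualBox)
#   $2: value of `sysctl -n kern.timecounter.hardware`
should_avoid_tsc() {
    case "$1:$2" in
        vbox:TSC*) return 0 ;;    # VirtualBox guest currently using a TSC variant
        *)         return 1 ;;
    esac
}

# Print the sysctl command that selects ACPI-fast, or nothing if no change is needed.
suggest_timecounter() {
    if should_avoid_tsc "$1" "$2"; then
        printf '%s\n' 'sysctl kern.timecounter.hardware=ACPI-fast'
    fi
}
```

In a guest it could be called as `suggest_timecounter "$(sysctl -n kern.vm_guest)" "$(sysctl -n kern.timecounter.hardware)"`, and the printed command then run as root.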
Comment 3 Graham Perrin freebsd_committer freebsd_triage 2021-03-05 21:36:32 UTC
I have what may be the same symptom with at least one _13.0-BETA4_ guest, hosted by VirtualBox 6.1.8 (not yet in ports) on FreeBSD 14.0-CURRENT.
Comment 4 Graham Perrin freebsd_committer freebsd_triage 2021-03-07 13:51:47 UTC
Created attachment 223058
Screenshot of a 13.0-RC1 guest

Timeouts with a reasonably clean 13.0-RC1 guest (upgraded from a 12.2-RELEASE-p3 that had no package installed).
Comment 5 commit-hook freebsd_committer freebsd_triage 2021-03-08 20:44:10 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=8cc15b0dfc2f3299662e78f18bd6127f83c14ab4

commit 8cc15b0dfc2f3299662e78f18bd6127f83c14ab4
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2021-03-08 20:20:10 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2021-03-08 20:43:06 +0000

    x86: tsc: deprioritize TSC on VirtualBox

    Misbehavior has been observed with TSC under VirtualBox, where threads
    doing small sleeps (~1 second) may miss their wake up and hang around
    in a sleep state indefinitely.  Switching back to ACPI-fast decidedly
    fixes it, so stop using TSC on VirtualBox at least for the time being.

    This partially reverts 84eaf2ccc6aa, applying it only to VirtualBox and
    increasing the quality to 0. Negative qualities can never be chosen and
    cannot be chosen with the tunable recently added. If we do not have a
    timecounter with a higher quality than 0, then TSC does at least leave
    the system mostly usable.

    PR:             253087
    Reviewed by:    emaste, kib
    MFC after:      3 days
    Differential Revision:  https://reviews.freebsd.org/D29132

 sys/x86/x86/tsc.c | 8 ++++++++
 1 file changed, 8 insertions(+)
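
The effect of lowering the quality to 0 can be illustrated with a userland model of quality-based selection. This is a rough sketch, not the kernel's code: `best_timecounter` is a hypothetical helper that takes a string in the style of `kern.timecounter.choice` output and returns the non-negative entry with the highest quality. The quality numbers in the usage note are illustrative and not taken from this report.

```sh
# Pick the counter with the highest non-negative quality from a string such as
# "TSC-low(1000) ACPI-fast(900) i8254(0) dummy(-1000000)".
# Negative qualities are skipped, matching the commit message's statement that
# they can never be chosen. On a tie, the first entry listed wins (a
# simplification made for this sketch).
best_timecounter() {
    printf '%s\n' "$1" | tr ' ' '\n' | awk -F'[()]' '
        NF >= 2 && $2 + 0 >= 0 && (best == "" || $2 + 0 > q) {
            q = $2 + 0
            best = $1
        }
        END { print best }
    '
}
```

With the illustrative input `TSC-low(1000) ACPI-fast(900) i8254(0) dummy(-1000000)` the model picks `TSC-low`; with `TSC-low(0)` in its place, `ACPI-fast` wins, which is the outcome the commit aims for on VirtualBox.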
Comment 6 commit-hook freebsd_committer freebsd_triage 2021-03-12 18:44:26 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ec24f78e5b201ea56a69607c6e4438a2faac25c0

commit ec24f78e5b201ea56a69607c6e4438a2faac25c0
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2021-03-08 20:20:10 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2021-03-12 18:43:43 +0000

    x86: tsc: deprioritize TSC on VirtualBox

    Misbehavior has been observed with TSC under VirtualBox, where threads
    doing small sleeps (~1 second) may miss their wake up and hang around
    in a sleep state indefinitely.  Switching back to ACPI-fast decidedly
    fixes it, so stop using TSC on VirtualBox at least for the time being.

    This partially reverts 84eaf2ccc6aa, applying it only to VirtualBox and
    increasing the quality to 0. Negative qualities can never be chosen and
    cannot be chosen with the tunable recently added. If we do not have a
    timecounter with a higher quality than 0, then TSC does at least leave
    the system mostly usable.

    PR:             253087

    (cherry picked from commit 8cc15b0dfc2f3299662e78f18bd6127f83c14ab4)

 sys/x86/x86/tsc.c | 8 ++++++++
 1 file changed, 8 insertions(+)
Comment 7 Graham Perrin freebsd_committer freebsd_triage 2021-03-13 13:47:16 UTC
Thanks, and as a side note: 

(In reply to commit-hook from comment #6)

> A commit in branch stable/13 references this bug: …

https://cgit.freebsd.org/src/commit/?id=ec24f78e5b201ea56a69607c6e4438a2faac25c0&h=stable%2F13 helps to _not_ misrepresent main as the context.
Comment 8 commit-hook freebsd_committer freebsd_triage 2021-03-14 19:44:41 UTC
A commit in branch releng/13.0 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1a15924593931b91aee31875fa75782a592a7436

commit 1a15924593931b91aee31875fa75782a592a7436
Author:     Kyle Evans <kevans@FreeBSD.org>
AuthorDate: 2021-03-08 20:20:10 +0000
Commit:     Kyle Evans <kevans@FreeBSD.org>
CommitDate: 2021-03-14 19:44:04 +0000

    x86: tsc: deprioritize TSC on VirtualBox

    Misbehavior has been observed with TSC under VirtualBox, where threads
    doing small sleeps (~1 second) may miss their wake up and hang around
    in a sleep state indefinitely.  Switching back to ACPI-fast decidedly
    fixes it, so stop using TSC on VirtualBox at least for the time being.

    This partially reverts 84eaf2ccc6aa, applying it only to VirtualBox and
    increasing the quality to 0. Negative qualities can never be chosen and
    cannot be chosen with the tunable recently added. If we do not have a
    timecounter with a higher quality than 0, then TSC does at least leave
    the system mostly usable.

    PR:             253087
    Approved by:    re (gjb)

    (cherry picked from commit 8cc15b0dfc2f3299662e78f18bd6127f83c14ab4)
    (cherry picked from commit ec24f78e5b201ea56a69607c6e4438a2faac25c0)

 sys/x86/x86/tsc.c | 8 ++++++++
 1 file changed, 8 insertions(+)
Comment 9 Kyle Evans freebsd_committer freebsd_triage 2021-03-14 19:47:19 UTC
This should be fixed for -RC3. Thank you.