Bug 168298 - VirtualBox using AIO on a zvol crashes
Summary: VirtualBox using AIO on a zvol crashes
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.1-RELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-virtualization mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-05-24 11:50 UTC by Pete French
Modified: 2018-07-05 17:42 UTC
11 users (show)

See Also:


Attachments
update postinstall message (688 bytes, patch)
2018-05-12 16:46 UTC, rozhuk.im

Description Pete French 2012-05-24 11:50:03 UTC
	With AIO loaded, VirtualBox will use it to access files. Running
	VirtualBox on top of a zvol as the raw disc crashes. This may be a bug
	in zvol+aio, hence the classification above. VirtualBox produces an
	error message in its logs about AIO before crashing.

Fix: 

Do not load the AIO kernel module. VirtualBox is stable if AIO
	is not being used.
How-To-Repeat: 
	Running VirtualBox over a zvol with AIO and then doing heavy
	disc write activity will provoke the problem in a few minutes. I made
	a posting to stable regarding this here:

	http://lists.freebsd.org/pipermail/freebsd-stable/2012-May/067648.html

	The zvol has compression enabled.
Comment 1 Martin Birgmeier 2014-12-31 14:05:34 UTC
In case this is still interesting: Do/did you have more than one disk attached? See bug #174968.

-- Martin
Comment 2 pete 2014-12-31 14:55:30 UTC
At the time only a single disc was attached. Subsequently I have added more, but do not have AIO enabled. I have moved to FreeBSD 10 these days, and haven't tested since the original bug report, but I don't really have the disc load that I used to on the 10.1 machines.

If I get a chance I will try it again, but it's not likely to be in the next few days.
Comment 3 rozhuk.im 2015-08-14 03:16:21 UTC
To fix this, tune AIO.
Add the following to /etc/sysctl.conf:

# AIO: Async IO management
vfs.aio.target_aio_procs=4		# Preferred number of ready kernel threads for async IO
vfs.aio.max_aio_procs=4			# Maximum number of kernel threads to use for handling async IO
vfs.aio.aiod_lifetime=30000		# Maximum lifetime for idle aiod
vfs.aio.aiod_timeout=10000		# Timeout value for synchronous aio operations
vfs.aio.max_aio_queue=65536		# Maximum number of aio requests to queue, globally
vfs.aio.max_aio_queue_per_proc=65536	# Maximum queued aio requests per process (stored in the process)
vfs.aio.max_aio_per_proc=8192		# Maximum active aio requests per process (stored in the process)
vfs.aio.max_buf_aio=8192		# Maximum buf aio requests per process (stored in the process)



default values:
vfs.aio.max_aio_queue: 1024
vfs.aio.max_aio_queue_per_proc: 256

These defaults are too small: sometimes the queue in vbox exceeds 256, and then vbox fails.
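
The limits above can be compared with what a host currently reports. The following is a minimal Python sketch and not part of the original comment: it assumes the `name: value` lines printed by `sysctl vfs.aio`, takes its targets from the values suggested above, and the names `TARGETS`, `parse_sysctl` and `below_target` are illustrative only.

```python
# Targets copied from the sysctl.conf suggestion in this comment.
TARGETS = {
    "vfs.aio.max_aio_queue": 65536,
    "vfs.aio.max_aio_queue_per_proc": 65536,
    "vfs.aio.max_aio_per_proc": 8192,
    "vfs.aio.max_buf_aio": 8192,
}


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict of integers."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        try:
            values[name.strip()] = int(value.strip())
        except ValueError:
            continue  # skip non-numeric entries
    return values


def below_target(current, targets=TARGETS):
    """Return {name: (current, target)} for each limit lower than its target."""
    return {
        name: (current[name], target)
        for name, target in targets.items()
        if name in current and current[name] < target
    }


if __name__ == "__main__":
    # Sample input using the default values quoted in this comment.
    sample = "vfs.aio.max_aio_queue: 1024\nvfs.aio.max_aio_queue_per_proc: 256\n"
    for name, (cur, want) in below_target(parse_sysctl(sample)).items():
        print(f"{name}: {cur} -> {want}")
```

On a real host the sample string could be replaced by the captured output of `sysctl vfs.aio`; names missing from that output are simply ignored.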
Comment 4 Martin Birgmeier 2015-08-14 15:46:40 UTC
(In reply to rozhuk.im from comment #3)

Interesting... thank you.

Before I try this myself: Does this just make the issue less likely to happen, or is this a genuine fix?

And also, does VBox not have any method to detect when aio operations are rejected in FreeBSD due to resource limits?
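
For context on the question above: POSIX specifies that `aio_read` and `aio_write` fail with `EAGAIN` when a request cannot be queued because of resource limits, so a caller can in principle detect the rejection and retry. Whether VirtualBox does so is not stated in this thread. The following Python sketch only illustrates the retry pattern; `submit_with_retry` and `FakeQueue` are hypothetical names, and the queue is simulated rather than a real AIO call.

```python
import errno
import time


def submit_with_retry(submit, retries=5, delay=0.01):
    """Call submit(); on EAGAIN, back off and try again.

    `submit` is a hypothetical stand-in for an AIO submission call.
    Any other error, or EAGAIN on the final attempt, is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            return submit()
        except OSError as exc:
            if exc.errno != errno.EAGAIN or attempt == retries:
                raise
            time.sleep(delay * (2 ** attempt))  # exponential backoff


class FakeQueue:
    """Simulated queue that rejects the first `busy` submissions with EAGAIN."""

    def __init__(self, busy):
        self.busy = busy
        self.calls = 0

    def submit(self):
        self.calls += 1
        if self.calls <= self.busy:
            raise OSError(errno.EAGAIN, "request not queued")
        return "queued"


if __name__ == "__main__":
    queue = FakeQueue(busy=2)
    print(submit_with_retry(queue.submit), "after", queue.calls, "attempts")
```

Retrying only hides a short burst; if the queue limit is persistently too low for the workload, raising the limit is still needed.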

Do I understand correctly that this might fix my issue https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=174968?

-- Martin
Comment 5 rkoberman 2016-08-23 01:39:41 UTC
Just for the record, this is not ZFS related. The same issue shows up with VB and UFS. I am running 11 where AIO is integrated into the kernel, so I'll try the tuning advice and see what happens.

In my case it produced several crashes in my Windows 7 client while suspending the VM, and many hangs of the VM. The worst case was when the virtual disk required a disk check which, in turn, hung repeatedly, though eventually it did complete and the system is running again.

Can AIO be disabled? I was looking at kern.features.aio.

virtualbox-ose-5.0.26_1
FreeBSD rogue 11.0-BETA4 FreeBSD 11.0-BETA4 #1 r303806: Sat Aug  6 18:50:50 PDT 2016     root@rogue:/usr/obj/usr/src/sys/GENERIC.4BSD  amd64
Comment 6 rkoberman 2016-08-24 23:28:20 UTC
The adjustments in comment 3 seem to work, although vfs.aio.aiod_timeout does not exist in 11.0 and vfs.aio.max_aio_procs defaults to 4, so that setting is a no-op.

Some of the others seem a bit extreme and I suspect tuning them back would be reasonable. The queue depths are being set to the maximum possible. I suspect 4096 and 1024 would be adequate.

I am not really sure why the maximum number of AIO processes is reduced to 4, but it does not seem unreasonable. Likewise the 10x increase in idle time for AIO processes.

The final two, max_aio_per_proc and max_buf_aio, also look a bit extreme. Bumping them from 32 and 16 to 8K is probably overkill. I'll play around with them and see what I find.

Finally, these may require tuning for the number of VMs.

In any case, I can now run my VM without the disk lock-ups.
Comment 7 martin 2017-03-07 14:49:30 UTC
Hi,

I would like to confirm that the aio sysctl settings in comment #3 fix crashes (causing SIGILL in VBoxSDL) and broken guest filesystems on virtualbox-ose-5.1.14_2 (FreeBSD 11.0). I've tried different guest operating systems and all fail mostly with HDD problems during initial installation.

I am using simple VDI files, by the way, not any ZVOL and no compression. My host CPU is a bit slow (AMD Athlon II X3 460); it is only capable of emulating 32-bit guests.

I've used these sysctl settings (probably still overdimensioned):

vfs.aio.max_aio_queue=8192
vfs.aio.max_aio_queue_per_proc=8192
vfs.aio.max_aio_per_proc=4096
vfs.aio.max_buf_aio=4096

Thank you.
Comment 8 rkoberman 2017-03-07 16:49:51 UTC
I have done some experimentation and have been able to modify these settings to less extreme values and still get VB to run without failing.
vfs.aio.max_aio_queue: 8192
vfs.aio.max_aio_queue_per_proc: 1024
vfs.aio.max_aio_per_proc: 128
vfs.aio.max_buf_aio: 64

I will admit that I have not tried tweaking these values for some time and I suspect some are still allowing the consumption of more resources than needed, but these are safer than those I first proposed.
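
Comment 6 notes that these limits may need tuning for the number of VMs. One way to reason about that is plain arithmetic on the global and per-process queue limits. The sketch below is a rough estimate, not a measured result: it assumes each VM runs as a single process that may fill its whole per-process queue, and the function names are illustrative.

```python
def procs_before_global_limit(max_aio_queue, max_aio_queue_per_proc):
    """How many processes can each fill their per-process AIO queue
    before the global queue limit is reached."""
    return max_aio_queue // max_aio_queue_per_proc


def global_queue_for(vm_count, max_aio_queue_per_proc):
    """Global queue size that lets vm_count processes fill their queues."""
    return vm_count * max_aio_queue_per_proc


if __name__ == "__main__":
    # Values from this comment: global 8192, per process 1024.
    print(procs_before_global_limit(8192, 1024))
    # Defaults quoted in comment 3: global 1024, per process 256.
    print(procs_before_global_limit(1024, 256))
```

Under these assumptions the global limit only matters once several heavily loaded VMs run at the same time; other AIO users on the host are not accounted for.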
Comment 9 rozhuk.im 2018-05-10 23:53:13 UTC
Maybe add info about the AIO tunings to the post-install message and close this bug?
Comment 10 rozhuk.im 2018-05-12 16:46:25 UTC
Created attachment 193335
update postinstall message
Comment 11 commit-hook freebsd_committer 2018-05-12 17:12:05 UTC
A commit references this bug:

Author: pi
Date: Sat May 12 17:11:34 UTC 2018
New revision: 469742
URL: https://svnweb.freebsd.org/changeset/ports/469742

Log:
  emulators/virtualbox-ose: add pkg-message about sysctl tuning with AIO

  - New values for several sysctl vfs.aio.max* parameters are suggested

  PR:		168298
  Submitted by:	rozhuk.im@gmail.com
  Reported by:	petefrench@ingresso.co.uk

Changes:
  head/emulators/virtualbox-ose/pkg-message
Comment 12 Kurt Jaeger freebsd_committer 2018-05-12 17:17:39 UTC
Sorry, I forgot to mention the reviews in the commit message. Now we should probably keep this PR open until someone finds a root cause and some safe lower bounds for those values?
Comment 13 rozhuk.im 2018-05-12 17:25:27 UTC
Lower bounds depend on host system load, host system speed, and guest activity.
IMHO finding lower bounds achieves nothing: these are maximum values, and they do not consume resources at idle.