Bug 237429 - bhyve: Performance regression after 12 upgrade
Summary: bhyve: Performance regression after 12 upgrade
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: misc
Version: 12.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-virtualization mailing list
URL:
Keywords: needs-qa, performance
Depends on:
Blocks:
 
Reported: 2019-04-21 00:51 UTC by doctor
Modified: 2019-08-06 04:04 UTC (History)
3 users

See Also:


Attachments
dmesg.boot (19.28 KB, text/plain)
2019-04-22 02:50 UTC, doctor
no flags Details
pciconf -lv (41.53 KB, text/plain)
2019-04-22 02:52 UTC, doctor
no flags Details
/etc/rc.conf as requested (409 bytes, text/plain)
2019-04-22 02:54 UTC, doctor
no flags Details
bhyveguests 3 samples as ran from script (892 bytes, text/plain)
2019-04-22 02:55 UTC, doctor
no flags Details
uname -a as requested (144 bytes, text/plain)
2019-04-22 02:56 UTC, doctor
no flags Details

Description doctor 2019-04-21 00:51:07 UTC
Total surprise.  It looks like bhyve is running very slowly and is taking resources from other services.  I ran bhyve under 11 and 10 with no major issues.  Since upgrading to 12 I am running into problems such as slow server performance, slow virtual machine performance, and CPU errors from virtual servers under bhyve.  I cannot install new bhyve virtual servers due to memory errors.  How can this happen?  I am ready to help debug the issue very cautiously.  Running FreeBSD 12.0 on UFS2 with 6-core 1.70 GHz Xeon CPUs and 16 GB of RAM.
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2019-04-21 04:40:28 UTC
Thank you for the report. Could you please provide more information on the system, in particular:

- Exact freebsd version (uname -a)
- /var/run/dmesg.boot (as an attachment)
- pciconf -lv output (as an attachment)
- list of running processes when the system (or guest) is performing slowly (as an attachment)
- top -t output (as an attachment), during guest slowdown (when the performance issue is apparent)
- /etc/rc.conf contents (as an attachment, sanitized if necessary)
- complete host/guest bhyve vm configurations, including cpu/memory/disk configurations for the guests (as an attachment)

Reduction of the test/reproduction case, and steps to reproduce are going to be critical to progressing this issue.
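
The items above can mostly be collected in one pass; a minimal sketch follows (the script name is hypothetical, and since pciconf and /var/run/dmesg.boot exist only on a FreeBSD host, those steps are skipped when they are not present):

```shell
# collect-info.sh (hypothetical name): gather the diagnostics requested above
# into one directory, ready to attach to the bug.
mkdir -p bug237429

uname -a > bug237429/uname.txt        # exact FreeBSD version
ps auxww > bug237429/ps.txt           # process list; capture during the slowdown
if [ -r /var/run/dmesg.boot ]; then
    cp /var/run/dmesg.boot bug237429/dmesg.boot.txt
fi
if [ -r /etc/rc.conf ]; then
    cp /etc/rc.conf bug237429/rc.conf.txt   # sanitize before attaching
fi
if command -v pciconf >/dev/null 2>&1; then
    pciconf -lv > bug237429/pciconf.txt
fi
```

The top -t capture is best taken separately while a guest is actually slow, so that the snapshot reflects the problem state.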
Comment 2 doctor 2019-04-22 02:50:23 UTC
Created attachment 203880 [details]
dmesg.boot
Comment 3 doctor 2019-04-22 02:52:39 UTC
Created attachment 203881 [details]
pciconf -lv
Comment 4 doctor 2019-04-22 02:54:43 UTC
Created attachment 203882 [details]
/etc/rc.conf as requested
Comment 5 doctor 2019-04-22 02:55:50 UTC
Created attachment 203883 [details]
bhyveguests 3 samples as ran from script
Comment 6 doctor 2019-04-22 02:56:44 UTC
Created attachment 203884 [details]
uname -a as requested
Comment 7 doctor 2019-04-22 02:59:47 UTC
I have the requested complete host/guest bhyve vm configurations, including cpu/memory/disk configurations for the guests, but it is 2M of info.
Comment 8 Rodney W. Grimes freebsd_committer 2019-04-22 15:56:10 UTC
Something that jumps out is:
warning: total configured swap (58982400 pages) exceeds maximum recommended amount (16155552 pages).

You have ~60G of swap space configured, but the system only has 16G of memory;
do you actually expect to overcommit memory by nearly 4x?

On top of this, the 3 VMs you show are 4G each in memory size, so they would
need 12 of the 16G for guests, leaving your host running ZFS with only
4G of memory.  Have you tuned the ARC cache to be less than, say, about 2G?
With just these 3 VMs you are going to start seeing memory pressure.

I see 16 tap devices configured manually; this is normally handled by
vm-bhyve.  That leads me to the question: how many VMs are you trying to
run at once, and what is the total memory footprint?
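
For reference, the page counts in the kernel's swap warning convert to sizes as follows (assuming the standard 4 KiB amd64 page size; the exact figure depends on the page size the kernel actually reports):

```shell
# Convert the swap warning's page counts to GiB, assuming 4096-byte pages.
page_size=4096
gib=1073741824                 # 1024 * 1024 * 1024
configured_pages=58982400      # "total configured swap" from the warning
recommended_pages=16155552     # "maximum recommended amount" from the warning
echo "configured swap:  $(( configured_pages * page_size / gib )) GiB"
echo "recommended max:  $(( recommended_pages * page_size / gib )) GiB"
```

The recommended ceiling works out to roughly 4x the host's 16 GB of physical memory, which is the overcommit ratio the kernel warns about.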
Comment 9 doctor 2019-04-22 17:35:26 UTC
(In reply to Rodney W. Grimes from comment #8)
1) In case I wish to expand to, say, 64G of RAM, I can do so without having to rebuild the server.

2) Running UFS, not ZFS.  Did I forget to mention this?

3) In 11.2 there were no problems.  In 12.0 these problems suddenly manifested.

4) I was able to put up to 8 VMs on 11.2 and below without issue.
Comment 10 Rodney W. Grimes freebsd_committer 2019-04-22 18:57:09 UTC
(In reply to doctor from comment #9)
> 1) In case I wish to expand to say 64G RAM I can do so without trying to rebuild the server.
Ok, valid, thanks for clarifying.

> 2)Running UFS not ZFS.  Did I forget to mention this?
Your system shows it has loaded ZFS:
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
You can say you're not using it, but your dmesg says differently.

> 3) In 11.2 There were not problems.  in 12.0 These problems suddenly manifested.

Are you certain that the only change was from 11.2 to 12.0, or did
something else change?  Perhaps a change not yet considered is the
cause of these issues.  I am simply trying to identify why performance
would be low on 12.0, independent of any prior status.

> 4) I was able to put up to 8 VM in 11.2 and below without issue.
8 VMs with 4G each on 11.2 with 16G of memory: I do not accept this as true;
that is 32G of memory commitment plus the host's memory needs.  It should have been thrashing itself unless those 8 VMs were mostly idle.
Comment 11 doctor 2019-04-22 22:02:08 UTC
(In reply to Rodney W. Grimes from comment #10)
(In reply to doctor from comment #9)
> > 1) In case I wish to expand to, say, 64G of RAM, I can do so without having to rebuild the server.
> Ok, valid, thanks for clarifying.

Well, it is a small server.

> > 2) Running UFS, not ZFS.  Did I forget to mention this?
> Your system shows it has loaded ZFS:
> ZFS filesystem version: 5
> ZFS storage pool version: features support (5000)
> You can say you're not using it, but your dmesg says differently.

I know I have it enabled; I would like to experiment with a hybrid UFS/ZFS system.

> > 3) In 11.2 there were no problems.  In 12.0 these problems suddenly manifested.
> Are you certain that the only change was from 11.2 to 12.0, or did
> something else change?  Perhaps a change not yet considered is the
> cause of these issues.  I am simply trying to identify why performance
> would be low on 12.0, independent of any prior status.

I do have a 2M "ps axww | top -t" file that needs to be added.

> > 4) I was able to put up to 8 VMs on 11.2 and below without issue.
> 8 VMs with 4G each on 11.2 with 16G of memory: I do not accept this as true;
> that is 32G of memory commitment plus the host's memory needs.  It should have been thrashing itself unless those 8 VMs were mostly idle.

They are idle for the most part on 11.2.
Comment 12 amvandemore 2019-04-23 00:52:05 UTC
I don't think this is a reasonable bug report: greatly overcommitted resources, and configuration choices like loading ZFS when the system is already memory-starved.

I expect the output of something like 'top -aSwd 2' would provide much of the clarity missing from this report.

If you do believe this to be an actual bug in 12, can you provide a minimal test case for reproduction?

Also, can you expand on the CPU errors you are seeing in the guest VMs?  I suspect these are spinlock errors which occur because you are overcommitted, not because you are running 12-STABLE.
Comment 13 Rodney W. Grimes freebsd_committer 2019-04-23 00:58:39 UTC
(In reply to doctor from comment #11)
I am concerned that you have loaded ZFS when you have no pool whatsoever present on this machine.  My assumption is that neither did 11.2, so please remove ZFS, just so we know there is no possible way it is having an effect.  I am pretty sure that ZFS grabs a chunk of memory at boot time even if you just load the module and never do a zpool import, but it should not be a very big chunk.

If in fact you do have a pool, even if you are not using it, ZFS is going to import it and use up a bunch of your memory.

I am not so interested in a large chunk of top data;
however, a single snapshot of "ps alwwx" would be of interest if you could get
one during the slowdown phase.

At this stage I am pretty convinced this is additional memory pressure and/or system tuning changes that came with 12.0.  It does have a slightly larger memory footprint, and there were fairly extensive VM-system changes between 11.2 and 12.0.
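
One way to keep the ZFS module out of the picture entirely (a sketch; which file applies depends on how ZFS was enabled on this host) is:

```shell
# /etc/rc.conf -- stop the rc system from loading the module and mounting pools
zfs_enable="NO"

# /boot/loader.conf -- remove or comment out any explicit load, if present
#zfs_load="YES"
```

After a reboot with these settings, kldstat should no longer list zfs.ko, which rules ZFS out as a source of memory pressure.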
Comment 14 doctor 2019-04-23 02:55:04 UTC
(In reply to amvandemore from comment #12)
You might have a point.  I am now trying to install every VM with 512MB to see if that helps.
Comment 15 doctor 2019-04-23 02:56:10 UTC
(In reply to Rodney W. Grimes from comment #13)


Spinlock errors make sense.  What I will do now is build VMs based on 512MB, and the next machine I obtain will be SAS- and ZFS-friendly.
Comment 16 doctor 2019-08-06 04:04:49 UTC
Case closed.