Bug 219913 - emulators/virtualbox-ose-kmod: if the MAXCPU option is not the default for the running kernel, then 'kldload vboxdrv.ko' will result in a kernel panic
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Vladimir Druzenko
URL: https://www.freshports.org/emulators/...
Keywords: crash
Depends on:
Blocks:
 
Reported: 2017-06-10 19:01 UTC by Andriy Voskoboinyk
Modified: 2025-01-07 19:50 UTC (History)
CC List: 8 users

See Also:
vvd: maintainer-feedback+


Attachments
patch (2.37 KB, patch)
2018-04-11 03:35 UTC, Craig Leres
no flags
updated patch (5.49 KB, patch)
2023-08-18 21:11 UTC, Craig Leres
no flags
patch (5.47 KB, patch)
2024-06-28 18:32 UTC, Craig Leres
leres: maintainer-approval? (vbox)

Description Andriy Voskoboinyk freebsd_committer freebsd_triage 2017-06-10 19:01:31 UTC
Due to an ABI difference (vboxdrv passes a cpuset_t parameter, a bitfield with CPU_SETSIZE -> MAXCPU bits, into smp_rendezvous_cpus()), the kernel panics with the "ncpus is 0 with non-zero map" message.

Manually including "opt_global.h" from the current kernel build in src/VBox/Runtime/r0drv/freebsd/mp-r0drv-freebsd.c seems to fix this issue when MAXCPU is overridden; kern.smp.maxcpus may be used instead (in case other 'global' options are not so problematic).
Comment 1 Walter Schwarzenfeld 2018-02-12 14:59:01 UTC
Feedback please!
Comment 2 Craig Leres freebsd_committer freebsd_triage 2018-04-06 02:55:10 UTC
I just started running 11.1-RELEASE on some of my systems and ran into this. At a minimum I think the driver should refuse to load if mp_maxcpus != MAXCPU.

Is there a way to make a cpuset_t at runtime?
Comment 3 Craig Leres freebsd_committer freebsd_triage 2018-04-11 03:35:18 UTC
Created attachment 192427 [details]
patch

The attached patch adds a check and refuses to load vboxdrv when MAXCPU does not match mp_maxcpus.
Comment 4 Graham Perrin freebsd_committer freebsd_triage 2023-08-18 03:59:03 UTC
(In reply to Craig Leres from comment #3)

Thanks, can someone rebase for 6.1.46_1? 

(Assuming that the bug is still reproducible.)
Comment 5 Craig Leres freebsd_committer freebsd_triage 2023-08-18 21:11:19 UTC
Created attachment 244203 [details]
updated patch

Here's an updated patch. It's been nearly 4 years since I've run virtualbox anywhere so I did not go to the trouble of building a custom kernel with MAXCPU != 256 but I did build virtualbox-ose-kmod and test that vboxdrv still loads with the patch applied.

Really the only reason I ran into this was that I had always bumped MAXCPU from its ridiculously low default in the before times and didn't notice when the default changed to the more modern value of 256. At that point I stopped customizing it.
Comment 7 Graham Perrin freebsd_committer freebsd_triage 2023-08-19 05:21:41 UTC
Comment on attachment 244203 [details]
updated patch

Additional eyes on this patch. 

Whilst emulation@ is not the maintainer, it _is_ a specified address for problems; <https://github.com/freebsd/freebsd-ports/commit/afbf09cc33941f6e8015ea2a99665add0df3b03a#diff-64c4a683499abbf4275fc526621380d5ffa8c2f6c58cd5a2acf0080b6d47f377R17>
Comment 8 Vladimir Druzenko freebsd_committer freebsd_triage 2024-06-28 12:56:25 UTC
Is this PR still relevant?
Comment 9 Craig Leres freebsd_committer freebsd_triage 2024-06-28 18:32:17 UTC
Created attachment 251748 [details]
patch

Here's an updated patch tested with 14.1-RELEASE-p1. I built a custom (amd64) kernel with options MAXCPU=512 and verified it still works as intended:

    Jun 28 11:14:16 sea kernel: vboxdrv: MAXCPU != mp_maxcpus (1024 != 512)
    Jun 28 11:14:16 sea syslogd: last message repeated 1 times
    Jun 28 11:14:16 sea kernel: module_register_init: MOD_LOAD (vboxdrv, 0xffffffff8569f4f0, 0) error 22

I also tested the freebsd-built virtualbox-ose-kmod-6.1.50 package and when I attempted to load that version of the module with my 512 cpu kernel it said:

    Jun 28 11:15:59 sea kernel: KLD vboxdrv.ko: depends on kernel - not available or version mismatch

It gave the same error when I tried to load it in (a) a custom kernel with MAXCPU=1024 (the default) and (b) the GENERIC 14.1-RELEASE (p0) kernel. At this point it's not clear to me how to use this kernel module with any system!

Here's a thread that appears relevant:

    https://forums.freebsd.org/threads/virtualbox-kernel-module-fails-to-load-on-freebsd-13-1-release.85191/
Comment 10 Vladimir Druzenko freebsd_committer freebsd_triage 2024-06-28 20:42:04 UTC
(In reply to Craig Leres from comment #9)
Does it work if emulators/virtualbox-ose-kmod was built with the same MAXCPU as the kernel?
For example, by using something like -DMAXCPU=N when building emulators/virtualbox-ose-kmod.
Comment 11 Craig Leres freebsd_committer freebsd_triage 2024-06-28 20:47:53 UTC
(In reply to Vladimir Druzenko from comment #10)
Back when I was using virtualbox (six years ago), I had no issues using a custom kernel so long as mp_maxcpus == MAXCPU.
Comment 12 Vladimir Druzenko freebsd_committer freebsd_triage 2024-06-28 21:13:49 UTC
(In reply to Craig Leres from comment #11)
MAXCPU is defined in /usr/include/machine/param.h; mp_maxcpus comes from the running kernel, set by its MAXCPU option.
If you build emulators/virtualbox-ose-kmod without defining a custom MAXCPU, then it doesn't work, does it?
So the patch needs a way to build the port with a custom MAXCPU, doesn't it?
Or is it possible to define MAXCPU somewhere like /etc/make.conf, without manually editing /usr/include/machine/param.h?

P.S. Sorry for my English…
Comment 13 Craig Leres freebsd_committer freebsd_triage 2024-06-28 21:35:55 UTC
(In reply to Vladimir Druzenko from comment #12)
(Don't sweat your English, I don't even know more than one language!)

To my way of thinking the minimal fix is to prevent the crash that happens when you run a custom kernel that is compiled with a MAXCPU that's different from the default. That way if the user wants to run with a different MAXCPU they'll find out that they need to build a matching custom vbox module.

Consider the case when the user has a working vbox setup but they build a custom kernel to increase MAXCPU. When they reboot, they'll have to mess with the bootloader to avoid loading the problematic module before they can even boot again.

What would be ideal is if the module could make a cpuset_t at runtime by looking at mp_maxcpus instead of being hard-coded to use the MAXCPU define.
Comment 14 Vladimir Druzenko freebsd_committer freebsd_triage 2024-06-29 00:17:20 UTC
(In reply to Craig Leres from comment #13)
Can you please explain: if I build a kernel with a non-default MAXCPU, boot it, and then build virtualbox-ose-kmod without any patches, can it work, or must I rebuild virtualbox-ose-kmod with MAXCPU defined (make -DMAXCPU=N, add something to /etc/make.conf, edit /usr/include/machine/param.h, export MAXCPU=N, or something else)?
Comment 15 Craig Leres freebsd_committer freebsd_triage 2024-06-29 01:00:06 UTC
First, I would not worry too much about supporting users who want to use a non-default MAXCPU; they'll be able to figure it out.

But is it likely that anyone would even want to do this? The current default is 1024 and I'm not so sure FreeBSD even runs on any hardware that has more than 1024 cores.

And it doesn't seem likely to me that someone would want to run a custom kernel with fewer than 1024 cores. The only reason I can think of is to (slightly) reduce the amount of kernel memory used. But would you really need to do that on a system beefy enough to run virtualbox?

But if you wanted to do this, you would add:

    options MAXCPU=2048

to your kernel config, run "config", make depend ..., make ...

Then yes, you'd have to do something custom to build a virtualbox-ose-kmod that would work with it.
Comment 16 Vladimir Druzenko freebsd_committer freebsd_triage 2024-06-29 20:13:25 UTC
So, we have come to the conclusion that now (with MAXCPU=1024 by default) there is very little point in creating any patch for emulators/virtualbox-ose-kmod?
Therefore, I propose to simply close this PR as Overcome By Events.

IMHO, we should think about the next MAXCPU increase before release 15:
- Intel has had a CPU [1] for over a year now that allows building a server with 960 threads (60 cores per socket * 2 threads per core * 8 sockets);
- AMD just announced [2] a CPU with 192 cores per socket * 2 threads per core * 2 sockets = 768 threads.
So even before the end of life of 14, servers with more than 1024 threads may appear.

[1] https://ark.intel.com/content/www/us/en/ark/products/231747/intel-xeon-platinum-8490h-processor-112-5m-cache-1-90-ghz.html
[2] https://www.tomshardware.com/pc-components/cpus/amd-announces-3nm-epyc-turin-launching-with-192-cores-and-384-threads-in-second-half-of-2024-54x-faster-than-intel-xeon-in-ai-workload
Comment 17 Craig Leres freebsd_committer freebsd_triage 2024-06-29 23:36:58 UTC
(In reply to Vladimir Druzenko from comment #16)
I don't like the current "vboxdrv.ko: depends on kernel - not available or version mismatch" error but I'm no longer a consumer of this port and what's there now is very restrictive vs. the booted kernel so maybe I don't care any more.

If you want to close the PR, that's fine with me. (Thanks for digging in on it!)
Comment 18 Vladimir Druzenko freebsd_committer freebsd_triage 2024-06-30 00:14:06 UTC
(In reply to Craig Leres from comment #17)
What do you think about adding an explanation to emulators/virtualbox-ose-kmod/files/pkg-message.in? But for this I need your help.
Comment 19 Craig Leres freebsd_committer freebsd_triage 2024-07-03 17:30:42 UTC
(In reply to Vladimir Druzenko from comment #18)
What's already in pkg-message.in is probably sufficient. And I don't understand how the current mechanism to detect a mismatch works so I can't think of anything to add.
Comment 20 Fernando Apesteguía freebsd_committer freebsd_triage 2025-01-07 12:08:47 UTC
Can we close this one?
Comment 21 Vladimir Druzenko freebsd_committer freebsd_triage 2025-01-07 14:09:39 UTC
I've studied the patch, looked through the includes from base and the port sources - I'll commit this patch after a little testing.
Comment 22 commit-hook freebsd_committer freebsd_triage 2025-01-07 16:14:40 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=8d72823b38b779036014938cc250f859b27fb3f7

commit 8d72823b38b779036014938cc250f859b27fb3f7
Author:     Craig Leres <leres@freebsd.org>
AuthorDate: 2025-01-07 16:10:57 +0000
Commit:     Vladimir Druzenko <vvd@FreeBSD.org>
CommitDate: 2025-01-07 16:10:57 +0000

    emulators/virtualbox-ose-kmod: Add check for MAXCPU and mp_maxcpus before load vboxdrv.ko

    If the MAXCPU option is not the default for the running kernel, then
    'kldload vboxdrv.ko' will result in a kernel panic.
    Due to ABI difference (vboxdrv passes cpuset_t parameter (bitfield with
    CPU_SETSIZE -> MAXCPU bits) into smp_rendezvous_cpus()) kernel panics
    with "ncpus is 0 with non-zero map" message.

    PR:     219913

 emulators/virtualbox-ose-kmod/Makefile             |  1 +
 ...ox_HostDrivers_Support_freebsd_SUPDrv-freebsd.c | 37 +++++++++++++++-------
 2 files changed, 27 insertions(+), 11 deletions(-)
Comment 23 Vladimir Druzenko freebsd_committer freebsd_triage 2025-01-07 16:23:18 UTC
I can try to adapt this patch for emulators/virtualbox-ose-kmod-legacy, but I find it hard to imagine a situation in which someone would use VirtualBox 5.x on hardware with more than 1024 CPUs.

Thanks for the patch!
Comment 24 Craig Leres freebsd_committer freebsd_triage 2025-01-07 18:33:22 UTC
I agree it's unlikely many folks will change MAXCPU given its modern default of 1024 but it seems worth avoiding the panic for this case.

Thanks for giving this PR some love.
Comment 25 commit-hook freebsd_committer freebsd_triage 2025-01-07 19:50:20 UTC
A commit in branch 2025Q1 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=448dfbb743fadbf19299be9b16251108a5e47474

commit 448dfbb743fadbf19299be9b16251108a5e47474
Author:     Craig Leres <leres@freebsd.org>
AuthorDate: 2025-01-07 16:10:57 +0000
Commit:     Vladimir Druzenko <vvd@FreeBSD.org>
CommitDate: 2025-01-07 19:40:38 +0000

    emulators/virtualbox-ose-kmod: Add check for MAXCPU and mp_maxcpus before load vboxdrv.ko

    If the MAXCPU option is not the default for the running kernel, then
    'kldload vboxdrv.ko' will result in a kernel panic.
    Due to ABI difference (vboxdrv passes cpuset_t parameter (bitfield with
    CPU_SETSIZE -> MAXCPU bits) into smp_rendezvous_cpus()) kernel panics
    with "ncpus is 0 with non-zero map" message.

    PR:     219913
    (cherry picked from commit 8d72823b38b779036014938cc250f859b27fb3f7)

 emulators/virtualbox-ose-kmod/Makefile             |  1 +
 ...ox_HostDrivers_Support_freebsd_SUPDrv-freebsd.c | 37 +++++++++++++++-------
 2 files changed, 27 insertions(+), 11 deletions(-)