Bug 259399 - emulators/virtualbox-ose-nox11: no longer able to connect to running VM's
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Virtualbox Team (Nobody)
Reported: 2021-10-24 11:15 UTC by Ralf van der Enden
Modified: 2021-10-26 23:15 UTC
madpilot: maintainer-feedback+


Description Ralf van der Enden 2021-10-24 11:15:28 UTC
After upgrading emulators/virtualbox-ose-kmod and emulators/virtualbox-ose-nox11 from 6.1.26 to 6.1.28 I can no longer connect via TCP to my VMs.

This is how it looks before the upgrade:

[(Sat Oct 23 16:35) root@lan ~]# docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER      ERRORS
chev4     -        virtualbox   Stopped                                       Unknown
default   -        virtualbox   Running   tcp://192.168.99.100:2376           v19.03.12

and this is after:

[(Sun Oct 24 13:10) root@lan ~]# docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER    ERRORS
chev4     -        virtualbox   Stopped                                       Unknown
default   -        virtualbox   Running   tcp://192.168.99.100:2376           Unknown   Unable to query docker version: Get "https://192.168.99.100:2376/v1.15/version": dial tcp 192.168.99.100:2376: connect: permission denied

I've also tried creating a new one, but that shows the same errors as the default docker instance.

If there's anything else I can provide (logs, etc) let me know and I'll attach them to this PR.
Comment 1 Guido Falsi freebsd_committer 2021-10-24 14:15:42 UTC
Hi,

I've just made a quick test:

Host: FreeBSD 14.0 (recent head)
Guest: FreeBSD 13.0-p4 (binary install, binary packages)

Guest is configured with NAT networking, and uses the virtio network card.

I enabled the ssh server in the guest, added a port mapping (guest TCP port 22 to port 2042 on the host) and am able to connect to it.

I also tested with bridged networking, all the rest being the same and I was also able to connect.

So I can't reproduce anything similar.

Could you report the exact virtualbox networking configuration you're using?

I see you use docker. I've never used it and know very little about it, but having a tool mediating your virtualbox usage means a lot of other factors could be weighing in.

One thing you should check is if there is any firewall configuration getting in the way.

I can't think of anything else right away and not sure about logs that could be useful in this case. Maybe the output of "netstat -rn" and "netstat -r", but I don't foresee getting anything decisive from there.
Comment 2 Guido Falsi freebsd_committer 2021-10-24 14:23:05 UTC
(In reply to Guido Falsi from comment #1)


Forgot a question:

What's the guest OS? Are the guest additions installed and up to date? (I have the old 6.1.26 additions in the VM I tested with; additions from older versions usually work fine.)


I mention the firewall because the error you get (connect: permission denied) smells of a local firewall restriction being applied by the kernel to locally generated packets (the inbound leg for locally generated packets in the firewall), but it's just a hypothesis based on partial data.
Comment 3 Ralf van der Enden 2021-10-24 23:03:59 UTC
Additions aren't installed, since those only work for FreeBSD guests if I'm not mistaken.

The guest OS is Linux (Boot2Docker.iso, which is used automatically if you use the virtualbox driver of docker-machine).

My firewall wasn't an issue for 6.1.26, so I'd assume it should also work for 6.1.28.

rc.conf:
# VirtualBox
vboxnet_enable="YES"

sysctl.conf:
# virtualbox (AIO)
vfs.aio.max_buf_aio=8192
vfs.aio.max_aio_queue_per_proc=65536
vfs.aio.max_aio_per_proc=8192
vfs.aio.max_aio_queue=65536

ifconfig vboxnet0:
vboxnet0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 0a:00:27:00:00:00
        inet 192.168.99.1 netmask 0xffffff00 broadcast 192.168.99.255
        inet6 fe80::800:27ff:fe00:0%vboxnet0 prefixlen 64 scopeid 0x4
        media: Ethernet autoselect
        status: active
        nd6 options=61<PERFORMNUD,AUTO_LINKLOCAL,NO_RADR>

Not sure what else I can share.
Comment 4 Ralf van der Enden 2021-10-25 06:48:04 UTC
Boot2Docker added the additions somewhere along the way. After running VBoxService -V I get this: 5.2.34r133893

I forgot to mention I'm running this on 13.0-RELEASE-p4 amd64
Comment 5 Guido Falsi freebsd_committer 2021-10-25 07:44:18 UTC
(In reply to Ralf van der Enden from comment #3)

> Additions aren't installed, since those only work for FreeBSD guests if I'm not mistaken.

Additions are available for Linux, FreeBSD and Windows; each OS/distribution has its preferred way of installing them. Anyway, I don't think they make a difference in this case.

> My firewall wasn't an issue for 6.1.26, so I'd assume it should also work for 6.1.28.

I'm bound to say that "assuming" things is not a good way to diagnose a problem.
As I already said, the error you're getting (permission denied) is the error you'd get if the firewall blocked the connection on its inbound leg to your kernel. So at least verifying that the firewall is not interfering in some way is advisable.

The quickest way to test this would be to briefly disable your firewall.



Inside the virtualbox VM, what network interface are you using? Is it configured with bridged networking? NAT? NAT Network?

Have you tried manually creating a VM in virtualbox and connecting to that? This would at least reduce the number of moving parts.
Comment 6 Ralf van der Enden 2021-10-25 19:10:46 UTC
Hi again,

I think I know why it's not working.

If I start any VM on 6.1.26 the following is started automatically:

/usr/local/lib/virtualbox/VBoxNetDHCP --comment HostInterfaceNetworking-vboxnet0 --config /root/.config/VirtualBox/HostInterfaceNetworking-vboxnet0-Dhcpd.config --l

After upgrading to 6.1.28 the VM comes up, but fails to get an IP address, since VBoxNetDHCP is not running. Am I supposed to start that manually? This was never the case with older versions.

Disabling the firewall didn't help, so that was not it.


And it's behaving like this with both docker-machine and vagrant.
Comment 7 Guido Falsi freebsd_committer 2021-10-25 20:20:55 UTC
(In reply to Ralf van der Enden from comment #6)

I asked more than once exactly how your VM networking was configured, but you did not reply. Can you please provide this information? Which of the following are you using?

NAT
Bridged Adapter
Internal Network
Host-only adapter
NAT Network


(there are also a few others but I am sure you're not using those, because they don't apply in this case)

I need this information to try to reproduce your problem, without it I'm just guessing.

I can guess from what you tell me that you're using NAT Network, which is the most complex of the bunch. It does depend on an external daemon which should start by itself, but has proven difficult. When I was working on updating to 6.1 it was causing hangs and crashes. Maybe something else came up and needs to be fixed.

I will perform some tests with such a setup, it will require time.

But please confirm this is the actual setup you're using.
Comment 8 Guido Falsi freebsd_committer 2021-10-25 20:31:57 UTC
(In reply to Guido Falsi from comment #7)

Forgot one detail. There is no UI to configure the NAT Network backend part, which is what is responsible for the DHCP daemon.

How did you configure the NAT Network? Was it automatically configured by the tools you manage your VMs with? Do those tools have some way of forcing a rebuild of the network configuration? DHCP is optional in that configuration (the DHCP service could be provided by a VM running in the network, which I did some years ago, or all the VMs could have static addresses), and it is also possible that the configuration got corrupted by the upgrade for whatever reason.

I did a quick test here running this command to configure natnetwork:

```
VBoxManage natnetwork add --netname natnet1 --network "192.168.88.0/24" --enable --dhcp on
```

I connected a VM to it and launched it. The DHCP daemon started as expected, and the machine does have an IP and is able to talk to the local network.

This is the configuration reported:

```
> VBoxManage natnetwork list
NAT Networks:

Name:        natnet1
Network:     192.168.88.0/24
Gateway:     192.168.88.1
IPv6:        No
Enabled:     Yes

1 network found
```


What is the output of `VBoxManage natnetwork list` on your machine?
Comment 9 Ralf van der Enden 2021-10-25 21:45:12 UTC
docker-machine (and also vagrant) configured it for me. As far as I know my VMs use host-only interfaces.

Here's the output of several VBoxManage commands (while running 6.1.26):

[(Mon Oct 25 23:37) root@lan ~]# VBoxManage natnetwork list
NAT Networks:

0 networks found


[(Mon Oct 25 23:36) root@lan ~]# VBoxManage list dhcpservers
NetworkName:    HostInterfaceNetworking-vboxnet0
Dhcpd IP:       192.168.99.10
LowerIPAddress: 192.168.99.100
UpperIPAddress: 192.168.99.254
NetworkMask:    255.255.255.0
Enabled:        Yes
Global Configuration:
    minLeaseTime:     default
    defaultLeaseTime: default
    maxLeaseTime:     default
    Forced options:   None
    Suppressed opts.: None
        1/legacy: 255.255.255.0
Groups:               None
Individual Configs:   None

[(Mon Oct 25 23:37) root@lan ~]# VBoxManage list hostonlyifs
Name:            vboxnet0
GUID:            786f6276-656e-4074-8000-0a0027000000
DHCP:            Disabled
IPAddress:       192.168.99.1
NetworkMask:     255.255.255.0
IPV6Address:     fe80::800:27ff:fe00:0
IPV6NetworkMaskPrefixLength: 64
HardwareAddress: 0a:00:27:00:00:00
MediumType:      Ethernet
Wireless:        No
Status:          Up
VBoxNetworkName: HostInterfaceNetworking-vboxnet0


While running 6.1.28 this one is different:

[(Mon Oct 25 23:43) root@lan ~]# VBoxManage list hostonlyifs
Name:            vboxnet0
GUID:            786f6276-656e-4074-8000-0a0027000000
DHCP:            Disabled
IPAddress:       192.168.99.1
NetworkMask:     255.255.255.0
IPV6Address:
IPV6NetworkMaskPrefixLength: 0
HardwareAddress: 0a:00:27:00:00:00
MediumType:      Ethernet
Wireless:        No
Status:          Down
VBoxNetworkName: HostInterfaceNetworking-vboxnet0

After bringing up the VM, 6.1.28 doesn't change the status of vboxnet0 to Up and doesn't assign the gateway IP (192.168.99.1) to it. Both of those do happen with vbox 6.1.26.
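
A small sketch for comparing the interface state before and after an upgrade, assuming the `Name:`/`Status:` field layout shown in the `VBoxManage list hostonlyifs` output above. The helper name `hostonlyif_status` is hypothetical and not part of VirtualBox.

```shell
# Reads `VBoxManage list hostonlyifs` output on stdin and prints
# "<name> <status>" once per interface block.
# Assumption: each block has a "Name:" line before its "Status:" line,
# as in the listings in this report.
hostonlyif_status() {
    awk '
        $1 == "Name:"   { name = $2 }
        $1 == "Status:" { print name, $2 }
    '
}
```

It could be used as `VBoxManage list hostonlyifs | hostonlyif_status`, which for the 6.1.28 listing above would report vboxnet0 as Down.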
Comment 10 Guido Falsi freebsd_committer 2021-10-25 22:39:27 UTC
(In reply to Ralf van der Enden from comment #9)

I'm testing a host-only network. The first thing I notice is that the output of VBoxManage list hostonlyifs always reports DHCP as Disabled, even when I am sure it is enabled. Anyway, here the VBoxNetDHCP daemon is starting up and assigning IPs. The VM on it gets an IP and I'm able to ssh into it from the host.

I never used docker and have experience in vagrant only with simple setups, with machines configured to use NAT as an interface mode.

I can't help with the details of tools I don't know and use, but is there in these tools some command to force reconfigure the networking setup from scratch? Because maybe something changed in the virtualbox config that is not working with the old configuration files.

I understand you say it worked before the upgrade, but since you're not configuring virtualbox directly, but only using it indirectly from other tools there are other parts moving. My suggestion is to use those tools to reset and reconfigure virtualbox networking setup from scratch. Especially wipe out any hostonly network configuration, reboot and recreate them, then reconnect the machines to them.

I don't see any issue in virtualbox itself, but I can tell you network configuration in virtualbox on FreeBSD hosts is somewhat brittle. If I do anything slightly complex I often find I need to reload kernel modules, reboot or reset it and start from scratch.

I'm sorry but I tried to replicate the configurations you exposed but can't see anything wrong in virtualbox itself.

If I can't reproduce the issue I can't really diagnose it.
Comment 11 Guido Falsi freebsd_committer 2021-10-25 22:52:01 UTC
Some things in your report don't add up to me. Are you completely sure the main virtualbox and kmod packages are aligned? Have you performed a full reboot after upgrading?

Are you using binary packages or building them yourself?

If building yourself, could you share the options for the ports? (output of make showconfig)

In both cases, could you try forcing reinstallation of the packages, or rebuilding and reinstalling the ports (both of them), then rebooting and trying again (preferably also rebuilding the networking configuration of virtualbox from scratch)?

Looks like something got out of sync on your system. This sometimes happens.
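
One way to check that the two packages are aligned could be sketched as below. This assumes `pkg query` is asked to print one "name version" pair per line; `versions_aligned` is a hypothetical helper name. It only compares installed package versions and says nothing about which kernel module is currently loaded, so a reboot is still needed after an upgrade.

```shell
# Reads "<name> <version>" lines on stdin and succeeds only if every
# line carries the same version string.
versions_aligned() {
    awk '
        NF >= 2 { seen[$2] = 1 }
        END {
            n = 0
            for (v in seen) n++
            exit (n == 1 ? 0 : 1)
        }
    '
}
```

A possible invocation is `pkg query '%n %v' virtualbox-ose-nox11 virtualbox-ose-kmod | versions_aligned && echo aligned`.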
Comment 12 Ralf van der Enden 2021-10-26 07:38:48 UTC
I'm building myself using poudriere. And yes, I've also rebooted after the installation, but I can try that again in a couple of hours (after work hours).

There was a small change to the virtualbox ports (6.1.28_1), so both have been rebuilt. The only thing I can try is build the regular port (virtualbox-ose) with X11 support and see if that changes anything.

===> The following configuration options are available for virtualbox-ose-nox11-6.1.28_1:
     AIO=on: Enable Asyncronous IO support (check pkg-message)
     PYTHON=off: Python bindings or support
     UDPTUNNEL=off: Build with UDP tunnel support
     VDE=off: Build with VDE support
     VNC=off: Build with VNC support
     WEBSERVICE=off: Build Webservice

===> The following configuration options are available for virtualbox-ose-kmod-6.1.28_1:
     DEBUG=off: Debug symbols, additional logs and assertions
     VIMAGE=on: VIMAGE virtual networking support

And I wanna thank you for investing the time to get this sorted for me. Much appreciated.
Comment 13 Guido Falsi freebsd_committer 2021-10-26 08:02:05 UTC
(In reply to Ralf van der Enden from comment #12)

I made the small change, because I noticed a mistake I made with the update, but I doubt it has any influence on your issue.

Also using the X11 variant can be worth a try, but it is improbable it will fix your issue.

I'm actually running out of ideas. Another thing to try is rollback to 6.1.26 and check if the problem disappears or remains.

But you still have not confirmed whether you can force docker/vagrant to reset and rebuild their networking configuration from scratch. As I said, from my vantage point it looks like the virtualbox network configuration got into some inconsistent state on your machine.

Also having a spare machine where you can test replicating the setup and see if it behaves differently would help.
Comment 14 Ralf van der Enden 2021-10-26 08:52:08 UTC
I found the following:
https://discuss.hashicorp.com/t/vagrant-2-2-18-osx-11-6-cannot-create-private-network/30984/14

This describes the exact issue I'm having (but on MacOS).

I removed all my interfaces, disabled the DHCP server and created a new host-only interface (via VBoxManage hostonlyif create).

If I try to set the IP to 192.168.99.1 I get this error:
[(Tue Oct 26 10:38) root@lan ~]# VBoxManage hostonlyif ipconfig vboxnet0 --ip 192.168.99.1
VBoxManage: error: Code E_ACCESSDENIED (0x80070005) - Access denied (extended info not available)
VBoxManage: error: Context: "EnableStaticIPConfig(Bstr(pszIp).raw(), Bstr(pszNetmask).raw())" at line 242 of file VBoxManageHostonly.cpp

The default IP set when creating a new hostonlyif is 192.168.56.1. After changing my Vagrantfile to use 192.168.56.100 as its IP address, everything worked as expected.

I'm not sure why changing the IP would throw that error, but it's probably a bug in virtualbox.

Thanks a lot for your help. It steered me in the right direction.
Comment 15 Guido Falsi freebsd_committer 2021-10-26 16:51:52 UTC
(In reply to Ralf van der Enden from comment #14)

Happy to have helped.

>[(Tue Oct 26 10:38) root@lan ~]# VBoxManage hostonlyif ipconfig vboxnet0 --ip 192.168.99.1
> VBoxManage: error: Code E_ACCESSDENIED (0x80070005) - Access denied (extended info not available)
> VBoxManage: error: Context: "EnableStaticIPConfig(Bstr(pszIp).raw(), Bstr(pszNetmask).raw())" at line 242 of file VBoxManageHostonly.cpp

This kind of error happens a lot with the virtualbox command line, especially with the networking commands. I got this after creating a natnetwork for a test, when trying to remove it. I rebooted the machine and it worked on the first try.

I agree it would be good to fix these bugs, but it would require weeks or months of full-time development on virtualbox.


If the issue is fixed please close this bug report as solved!
Comment 16 Ralf van der Enden 2021-10-26 23:15:54 UTC
According to the documentation, host-only interfaces can only use addresses in 192.168.56.0/21 unless you create an /etc/vbox/networks.conf listing the ranges you'd like to use: https://www.virtualbox.org/manual/ch06.html#network_hostonly

This ties in with the following from the changelog that comes with 6.1.28:
Network: More administrative control over network ranges, see user manual (which can be found at the link above).
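
As an illustration, a networks.conf that re-allows the 192.168.99.0/24 range used earlier in this report might look like the fragment below. The `* <range>` line syntax follows the example in the manual section linked above. The path is the one given in this comment; whether the FreeBSD port reads the file from a different location is not confirmed in this thread.

```
# /etc/vbox/networks.conf (path as quoted from the manual; an assumption for FreeBSD)
# Allow host-only adapters to use the old docker-machine range.
* 192.168.99.0/24
```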

The full VirtualBox changelog lists the same entry under 6.1.28.

Just documenting it here in case someone else runs into the same issue as I did.
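
To see why 192.168.99.1 was rejected while 192.168.56.1 was accepted, the range check can be sketched in plain sh. This is only an illustration of the CIDR arithmetic, not how VirtualBox implements it; `ip_to_int` and `in_cidr` are hypothetical helper names.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds if address $1 lies inside network $2 with prefix length $3.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

For example, `in_cidr 192.168.99.1 192.168.56.0 21` fails because 192.168.56.0/21 only spans 192.168.56.0 through 192.168.63.255, while `in_cidr 192.168.56.100 192.168.56.0 21` succeeds.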