Bug 278338 - bhyve deletes tap LINK0 flag (regression)
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bhyve
Version: 13.2-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-virtualization (Nobody)
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2024-04-13 02:48 UTC by Peter Much
Modified: 2025-02-10 23:46 UTC
Description Peter Much 2024-04-13 02:48:19 UTC
Change bc6372602a00 is faulty:
bhyve now clears the LINK0 flag on the tap interface at startup, so the IP addresses are removed when bhyve closes the tap.

This means that each time the guest wants to reboot, somebody with access to the host must set LINK0 on the tap interface again. This is not practical.
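
For reference, the manual step described above amounts to re-running ifconfig(8) with the link0 flag. A minimal sketch in Python, assuming the interface is called tap0 and the script runs as root on the host; the helper names are hypothetical and not part of bhyve or any FreeBSD tool:

```python
import subprocess

def link0_command(ifname="tap0"):
    """Build the ifconfig(8) invocation that sets LINK0 on a tap interface."""
    return ["ifconfig", ifname, "link0"]

def set_link0(ifname="tap0"):
    """Re-apply LINK0; needs root on the bhyve host."""
    subprocess.run(link0_command(ifname), check=True)
```

Since bhyve clears the flag again at every startup, this would have to be repeated after each guest start, which is exactly the impractical part.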

Workaround: revert bc6372602a00 and ef161a35012f
Comment 1 crest 2024-04-16 00:44:22 UTC
It's expected that a tap interface goes down if you close it, causing the addresses and routes to be removed. It's annoying for a routed bhyve setup, but can be avoided by using the vmnet cloner instead of the tap cloner to create your interfaces.

You should be able to get the behaviour I assume you want for a routed bhyve deployment from vmnet. I'm not sure what the correct way is to deal with the link0 flag if the tap interface is already configured before bhyve opens it. I can see arguments both ways (trusting the user to really know and intentionally set each bitfield they twiddled with vs. bringing the device into its default configuration to clean up any corrupted state left over from a previous opening of the device by some other software that also used tap0). The only way to cover both cases would probably be to add one more configuration option, similar to noinit for serial ports.
Comment 2 Peter Much 2024-04-16 12:45:54 UTC
(In reply to crest from comment #1)

crest, thank you for explaining the rationale behind this. So this is rather a /cultural/ change, then.


I.)
Yes, it is expected that the tap goes down on closing. And then the addresses and routes are removed. This was a problem here, but then the tap manpage states:

     " and has all of its configured addresses
     deleted unless the device [...] has IFF_LINK0 flag set."

Which is where I found this, and why I implemented it that way. And it worked well all the time, up to 13.3.
Apparently this behaviour is the purpose of the LINK0 flag here - probably the only purpose.
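
To make the rule concrete, here is a small Python model of the close-time behaviour as I read the manpage excerpt. It is a sketch, not the kernel code: the function name is hypothetical, the IFF_LINK0 value is the one from FreeBSD's <net/if.h>, and the vmnet exception is taken from comment #1 rather than from the quoted text.

```python
IFF_LINK0 = 0x1000  # flag bit as defined in FreeBSD <net/if.h>

def addresses_after_close(flags, addresses, is_vmnet=False):
    """Return the addresses left on the interface after the control device is closed."""
    # Addresses survive only for vmnet devices (per comment #1) or when LINK0 is set.
    if is_vmnet or (flags & IFF_LINK0):
        return list(addresses)
    # Otherwise the interface loses every configured address.
    return []
```

With LINK0 cleared at bhyve startup, as this report describes, the second branch is taken on the next close and the list comes back empty.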


II.)
"trusting the user to really know and intentionally set each bitfield they twiddled with". 

That was always the tradition with Unix: assuming that the operator does know what they are doing, and allowing them to shoot themselves in the foot, if so desired. (Because otherwise they would use some other OS where they can pay for being nannied.)

So the actual issue is now a cultural one: we no longer trust the user (to know why they set LINK0).


III.)
Given that, let me walk through the practical impact of this change in trust, as it surfaced here by surprise:

 1. on the desktop, xterms don't open anymore
 2. analysis shows the DNS replies are botched, and therefore the xterms
    cannot correctly resolve the DISPLAY host.
 3. there are no errors in the named log.
 4. BIND was recently updated, but the log shows it was reinstalled at the
    same version (otherwise we would have spent another couple of hours
    analysing possibly incompatible changes there).
 5. detailed analysis of the resolver log shows, queries are hitting the
    wrong BIND view, and therefore get bogus answers (this is a 6-way named:
    lan/public/rootslave each authoritative and caching)
 6. they are hitting the wrong BIND view because apparently they come from
    a bogus IP address
 7. the bogus IP address is found on the tap interface
 8. it is put there by net/dhcpcd
 9. net/dhcpcd also was reinstalled at the same version (luckily)
10. detailed analysis shows that net/dhcpcd is putting that address there
    only when there is none present yet (this is also one of those pieces
    of software that does far too many things at its own discretion)
11. Then finally: the addresses do disappear when a guest is rebooted 
    /from the inside/ (And installing quarterly pkg updates was the
    first time this was done after release of 13.3)

This is not really funny. It is what one gets from surprises.
Or, as I usually put it: /distrust/ just makes things far more complicated - and the price is usually paid by others.
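
A host-side check along the following lines might have shortened that hunt. This is only a sketch in Python: the function names are hypothetical, the sample ifconfig output is made up for illustration (documentation addresses, invented MAC), and the exact output format may differ between FreeBSD versions.

```python
import re

# Illustrative output of `ifconfig tap0`; not captured from a real host.
SAMPLE = """tap0: flags=9843<UP,BROADCAST,RUNNING,SIMPLEX,LINK0,MULTICAST> metric 0 mtu 1500
\tether 58:9c:fc:00:00:01
\tinet 192.0.2.1 netmask 0xffffff00 broadcast 192.0.2.255
"""

def parse_ifconfig(text):
    """Extract the flag names and the IPv4 addresses from ifconfig output."""
    m = re.search(r"flags=[0-9a-fA-F]+<([^>]*)>", text)
    flags = set(m.group(1).split(",")) if m and m.group(1) else set()
    # "inet " with a trailing space, so inet6 lines are not picked up.
    inet = re.findall(r"^\s+inet (\S+)", text, flags=re.MULTILINE)
    return flags, inet

def tap_looks_healthy(text, expected_addr):
    """True if LINK0 is still set and the expected address is still configured."""
    flags, inet = parse_ifconfig(text)
    return "LINK0" in flags and expected_addr in inet
```

Feeding it the real output of ifconfig for the tap after each guest start would flag the missing LINK0 before the addresses vanish on the next reboot.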


IV.)
I don't see a reason for a configuration choice or option. If you want to reset the configuration of a pre-existing tap on bhyve startup, then you could just as well remove all the configured IP addresses along with it - then things would start to fail right away and we would immediately see that there is an issue.

Otherwise, if there are already addresses there, they are probably there for a purpose. And if LINK0 was set alongside these addresses, then that is also probably there for a purpose.
Comment 3 Mark Linimon freebsd_committer freebsd_triage 2024-06-29 20:57:48 UTC
^Triage: fix version number.

Also, make sure that committer of bc6372602a00 was Cc:ed.
Comment 4 Bjoern A. Zeeb freebsd_committer freebsd_triage 2025-02-10 23:46:36 UTC
I am also seeing annoying problems on main these days in that tap loses its IPv6 link-local address on bhyve reboot and needs a further manual ifconfig tap0 down/up cycle to regain it, despite the IPv6 flags being set correctly.