Bug 243554 - multicast packets not seen on PHY bridge member
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.1-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-net (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-01-24 02:01 UTC by James Blachly
Modified: 2023-10-01 20:59 UTC (History)

Attachments
diff against releng/12.1 (328 bytes, patch)
2020-01-24 03:12 UTC, Kyle Evans

Description James Blachly 2020-01-24 02:01:02 UTC
Summary: if_bridge(4) purports in the man page[1] to forward multicast traffic to all members of the bridge. However, this does not appear to be the case.

Extended summary: a bridge with members tap0, tap1, ... (bhyve virtual machines) and igb1 (the host's primary interface) forwards multicast traffic (mDNS specifically) among the taps and OUT the PHY interface (igb1). However, the PHY interface does not receive inbound multicast traffic (on the FreeBSD side). Unicast traffic operates fine.

Details:
I use FreeBSD 12.1 as a VM host and ran into a problem with multicast packets over a bridge not being seen by programs [on the host] listening on the bridge's physical interface constituent (igb1), which I discovered when running avahi-daemon.

Briefly, my setup is as follows:
FreeBSD 11.2 host, bare metal, eth PHY igb1
    bridge0 with members igb1, tap0, tap1
VM linux guest virtio-net to tap0 to bridge on FreeBSD
VM freebsd guest virtio-net to tap1 to bridge on FreeBSD
Mac, ethernet to same switch as FreeBSD

mDNS query/response operates properly between the Mac and any of the
others (both physical and virtual), and all work in the converse
direction with the Mac.  The guests, all of which are constituents of
the bridge, are able to communicate via mDNS with one another.  However,
the guests are _unable_ to communicate with the host via mDNS.

tcpdump shows the query packets appearing on igb1, but truss on avahi-daemon
shows they are not received.

This means multicast packets are forwarded OUT all members of the
bridge, but not IN (at least, to physical interfaces -- they do
go both directions on the taps)
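One way to reproduce the symptom without avahi-daemon is a small listener that joins the mDNS group on one specific local address and reports what it receives. The following is a minimal sketch, not avahi's implementation: the function names are hypothetical, and the address passed in is assumed to be the IPv4 address configured on igb1 (or on the bridge, for comparison).

```python
import socket

MDNS_GROUP = "224.0.0.251"  # well-known IPv4 mDNS group
MDNS_PORT = 5353            # well-known mDNS UDP port


def build_mreq(group: str, local_addr: str) -> bytes:
    """Pack a struct ip_mreq: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(local_addr)


def open_listener(local_addr: str) -> socket.socket:
    """Open a UDP socket joined to the mDNS group via local_addr's interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow sharing port 5353 with a running mDNS responder.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if hasattr(socket, "SO_REUSEPORT"):
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("", MDNS_PORT))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(MDNS_GROUP, local_addr))
    return sock


def listen(local_addr: str, count: int = 5) -> None:
    """Print the size and sender of the next `count` datagrams received."""
    sock = open_listener(local_addr)
    for _ in range(count):
        data, peer = sock.recvfrom(9000)
        print(f"{len(data)} bytes from {peer[0]}")
    sock.close()
```

Calling `listen()` with the igb1 address while a guest sends mDNS queries, and comparing against what tcpdump shows on igb1, should indicate whether the packets reach a socket at all or only the BPF tap.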

If I add an IP address to the bridge, avahi-daemon on the host binds to
the bridge interface directly and then receives incoming packets,
responding with the IP of the bridge. All then operates correctly,
except that the host now has two IPs on the same subnet of course.


Given that if_bridge(4) is described as a virtual switch [1] and

Given that unicast packets originating on one of the bridge's
taps are received by host programs bound to igb1, it seems to me that
anything bound to igb1 should also be receiving the multicast packets.


Is the discrepancy between handling of unicast and multicast packets

* an error specifically related to multicast and bridging, or
* an accident that unicast connections work? [unlikely]
* (or none of the above)

Kind regards and thanks in advance.



[1]      A bridge works like a switch, forwarding traffic from one interface to
     another.  Multicast and broadcast packets are always forwarded to all
     interfaces that are part of the bridge.  For unicast traffic, the bridge
     learns which MAC addresses are associated with which interfaces and will
     forward the traffic selectively.
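The forwarding rule quoted from the man page can be illustrated with a toy model. This is only a sketch of a generic learning switch, not the if_bridge(4) code: the class and method names are made up, and flooding of unknown unicast destinations is an assumption based on how switches usually behave. It also models only which member ports a frame is sent out of, not delivery to the host's own network stack, which is the part in question in this report.

```python
def is_group_address(mac: str) -> bool:
    """True for multicast/broadcast MACs: low bit of the first octet is set."""
    return int(mac.split(":")[0], 16) & 1 == 1


class ToyBridge:
    """Toy learning bridge: floods group traffic, learns unicast sources."""

    def __init__(self, members):
        self.members = list(members)
        self.table = {}  # learned source MAC -> member port

    def forward(self, in_port, src, dst):
        """Return the member ports the frame is sent out of."""
        self.table[src] = in_port  # learn where the sender lives
        others = [p for p in self.members if p != in_port]
        if is_group_address(dst):
            return others          # multicast/broadcast: all other members
        out = self.table.get(dst)
        if out is None:
            return others          # unknown unicast: flood (assumption)
        return [] if out == in_port else [out]


bridge = ToyBridge(["igb1", "tap0", "tap1"])
# An mDNS query from a guest on tap0 goes to every other member.
print(bridge.forward("tap0", "58:9c:fc:00:00:01", "01:00:5e:00:00:fb"))
# A unicast reply to the learned guest MAC goes only to tap0.
print(bridge.forward("igb1", "58:9c:fc:00:00:02", "58:9c:fc:00:00:01"))
```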
Comment 1 James Blachly 2020-01-24 02:06:11 UTC
(In reply to James Blachly from comment #0)

I have to correct the above: I meant to type FreeBSD 12.1-RELEASE as my current version. The problem first manifested for me in version 11.2, and the text was pasted from an unanswered 2018 email to the freebsd-net ML. Sorry for any confusion.
Comment 2 Kyle Evans freebsd_committer freebsd_triage 2020-01-24 02:08:48 UTC
CC'ing kp, since he's done some work on bridge, too
Comment 3 Kyle Evans freebsd_committer freebsd_triage 2020-01-24 03:12:34 UTC
Created attachment 211002 [details]
diff against releng/12.1

It looks similar to some of the other observability problems I've fixed in the past. While the conventional setup is that the bridge alone would get the IP and not igb1, I think being able to observe the packets in question on igb1 is still important for debugging purposes.

There's also an incorrect looking comment in if_bridge.c that I'll dig into a little later; in bridge_forward(), we claim that tapping multicast/broadcast traffic isn't important because it will be reinjected into ether_input. I can't see how this is true. AFAICT these packets will travel bridge_broadcast() -> bridge_enqueue() -> if_transmit OR just bridge_enqueue() -> if_transmit, which will typically not involve ether_input.
Comment 4 James Blachly 2020-01-29 04:02:42 UTC
@Kyle Evans:

Patch did not do the trick. I can try to tcpdump this weekend (perhaps) to prove it, but behavior is same as before.

Also I note that the patch adds the call to ETHER_BPF_MTAP at a different place than in HEAD: in the patch it is added right before bridge_enqueue, but in HEAD (but not in 12.1) the call is higher, outside the if{} block:

https://github.com/freebsd/freebsd/blob/0f0a35a04846fc4f4bdb6caa2852336d7de9447d/sys/net/if_bridge.c#L2058-L2060

I am sure this is intended but just thought I'd mention here for others.

Anyway, thanks for looking at this. If you can think of other diagnostics I can do that would be helpful, let me know.
Comment 5 Kyle Evans freebsd_committer freebsd_triage 2020-01-29 04:16:29 UTC
(In reply to James Blachly from comment #4)

Interesting- re-reading your original post... so the proper packets are appearing via tcpdump, but they're not being tapped off to avahi?

That may be indicative of an architectural problem that we can't actually solve, and would instead require something like switch(4) [that we don't have yet] with a cleaner separation between the interface receiving local packets and the physical interface.
Comment 6 sledz 2021-10-04 13:10:27 UTC
(In reply to James Blachly from comment #0)

Is https://forum.opnsense.org/index.php?topic=24978.0 possibly the same problem?
Comment 7 Patrick M. Hausen 2021-10-04 14:19:50 UTC
> If I add an IP address to the bridge, avahi-daemon on the host binds to
> the bridge interface directly and then receives incoming packets,
> responding with the IP of the bridge. All then operates correctly,
> except that the host now has two IPs on the same subnet of course.

You must not put IP addresses on bridge member interfaces. All addresses must go on the bridge.

This is documented in the handbook page for bridging:
https://docs.freebsd.org/en/books/handbook/advanced-networking/#network-bridging

> If the bridge host needs an IP address, set it on the bridge interface, not on the member interfaces.

I don't know if this already fixes your problem or if there is still a bug preventing multicast from working, but your setup is definitely wrong if you have an IP address on a member interface.

Kristof Provost has confirmed that putting an IP address on a member IF breaks multicast!

Hope that helps,
Patrick
Comment 8 James Blachly 2021-10-05 01:54:05 UTC
(In reply to Patrick M. Hausen from comment #7)

>You must not put IP addresses on bridge member interfaces. All addresses must go on the bridge.

> This is documented in the handbook page for bridging:
> https://docs.freebsd.org/en/books/handbook/advanced-networking/#network-bridging

>> If the bridge host needs an IP address, set it on the bridge interface, not on the member interfaces.

> I don't know if this already fixes your problem or if there is still a bug preventing multicast from working, but your setup is definitely wrong if you have an IP address on a member interface.

So that I am 100% clear, are you suggesting the following:

1. Physical interface, say, igb0 exists and passes traffic on the lan
2. User desires to create bridge for use with bhyve
3. User/bhyve accessory software creates bridge, adding igb0 as member and tap0...N for M Virtual machines
4. User should REMOVE IP from igb0 and ADD IP to the bridge (does not have one by default)

?

> Kristof Provost has confirmed that putting an IP address on a member IF breaks multicast!

Can you point me to this?

Regards
Comment 9 Patrick M. Hausen 2021-10-05 09:35:06 UTC
> 4. User should REMOVE IP from igb0 and ADD IP to the bridge (does not have one by default)

Yes! Yes! Yes!

All IP addresses MUST be on the bridge interface and not on any member. FreeNAS/TrueNAS has been doing it wrong for years. The problem is with dynamically generated bridge interfaces of course.
Most TrueNAS users won't notice, because there is not much in IPv4 that relies on multicast. So it works, most of the time. With IPv6 things get interesting ...

If you plan to use VNET jails or VMs with tap and bridge, best practice is to statically create the bridge at boot time via cloned_interfaces and configure IP accordingly. Then point your VM/jail orchestration tool at the existing bridge instead of having it create a new one.
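As a rough illustration of that layout, the helper below renders rc.conf lines for a statically created bridge that carries the host's address while the member is only brought up. It is a sketch under assumptions: the function name is hypothetical, the interface name and address are placeholders, and combining `inet`, `addm`, and `up` in a single `ifconfig_bridge0` value is one common way to write it rather than the only one.

```python
def render_rc_conf(bridge, members, inet):
    """Render rc.conf lines: IP on the bridge, members added and brought up."""
    addm = " ".join(f"addm {m}" for m in members)
    lines = [
        f'cloned_interfaces="{bridge}"',                 # create bridge at boot
        f'ifconfig_{bridge}="inet {inet} {addm} up"',    # address lives here
    ]
    # Members get no address of their own; they are only brought up.
    lines += [f'ifconfig_{m}="up"' for m in members]
    return "\n".join(lines)


# Placeholder values; substitute the real interface and address.
print(render_rc_conf("bridge0", ["igb1"], "192.0.2.10/24"))
```

The VM or jail tooling would then be pointed at the existing bridge0 and only add its tap or epair interfaces as members.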

> > [ ... statement by Kristof ...]
> Can you point me to this?

Private conversation, but you can of course just ask him.

That single statement in the handbook essentially says it all, but it is not nearly prominent enough, IMHO.

Kind regards,
Patrick
Comment 10 Kristof Provost freebsd_committer freebsd_triage 2021-10-05 09:51:15 UTC
(In reply to Patrick M. Hausen from comment #9)
I confirm what Patrick says here. Set your IP address on the bridge, not on member interfaces.

There's also this bug, and the one comment specifically: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247912#c6
Comment 11 Patrick M. Hausen 2023-09-27 07:53:56 UTC
(In reply to Kristof Provost from comment #10)
Another closing candidate?
Comment 12 Kristof Provost freebsd_committer freebsd_triage 2023-09-27 08:17:56 UTC
(In reply to Patrick M. Hausen from comment #11)
I'm inclined to leave this open. While it's a well-documented configuration issue, there's also a case to be made that this should either just work, or the kernel should object to this configuration.

That's not something I expect to get done any time soon, but that does argue for leaving the bug open.