Bug 270559 - if_bridge: does not forward packets properly for vlan 1
Summary: if_bridge: does not forward packets properly for vlan 1
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-03-31 04:29 UTC by Zhenlei Huang
Modified: 2023-06-01 09:40 UTC (History)
5 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Zhenlei Huang freebsd_committer freebsd_triage 2023-03-31 04:29:38 UTC
A simple setup:

          jail0                         br0                       jail1                                          
                                                                                                                 
+-----------------------+      +------------------+      +------------------------+                              
|                       |      |                  |      |                        |                              
|   +------+    +-----+ |      | +-----+  +-----+ |      |  +-----+    +------+   |                              
|   |vlan1 ------ep0b ------------ ep0a|  |ep1a -------------ep1b ------ vlan1|   |                              
|   +------+    +-----+ |      | +-----+  +-----+ |      |  +-----+    +------+   |                              
|192.168.1.1/24         |      |                  |      |          192.168.1.2/24|                              
+-----------------------+      +------------------+      +------------------------+ 


The test script:
```
#/bin/sh

# vlan id 1 is expected to fail this test
vid=1

ep0a=$( ifconfig epair create )
ep0b=${ep0a%a}b

ep1a=$( ifconfig epair create )
ep1b=${ep1a%a}b

br0=$( ifconfig bridge create )
ifconfig $br0 addm $ep0a addm $ep1a

ifconfig $ep0a up
ifconfig $ep1a up
ifconfig $br0 up

j0=$( jail -ic vnet persist )
j1=$( jail -ic vnet persist )

ifconfig $ep0b vnet $j0
ifconfig $ep1b vnet $j1

# load if_vlan.ko if not built into kernel
kldload -nq if_vlan

jexec $j0 ifconfig $ep0b up
jexec $j1 ifconfig $ep1b up

jexec $j0 ifconfig ${ep0b}.${vid} create
jexec $j1 ifconfig ${ep1b}.${vid} create

# Set ether address of $ep0b (untagged) same with ${ep1b}.${vid}
jexec $j1 ifconfig ${ep1b}.${vid} ether 02:09:a2:78:9a:bc
jexec $j0 ifconfig ${ep0b}        ether 02:09:a2:78:9a:bc
jexec $j0 ifconfig ${ep0b}.${vid} ether 02:09:a2:12:34:56

# Add ip address, will also populate $br0's fowarding table, by ARP announcement
jexec $j0 ifconfig ${ep0b}.${vid} inet 192.168.1.1/24
jexec $j1 ifconfig ${ep1b}.${vid} inet 192.168.1.2/24

sleep 0.5
echo "======== jail $j0 ===================="
jexec $j0 ifconfig
echo "======== jail $j1 ===================="
jexec $j1 ifconfig

echo "======== learned addresses on $br0 ========"
ifconfig $br0 addr

echo "==== check contection ===="
jexec $j0 ping -t5 -c3 192.168.1.2

# This will trigger a mac flap (by ARP announcement)
jexec $j0 ifconfig $ep0b inet 192.168.2.1/24

sleep 0.5

echo "======== learned addresses on $br0 , after mac flap ===="
ifconfig $br0 addr

echo "======== re-check contection ========"
jexec $j0 ping -t5 -c3 192.168.1.2
rval=$?

jail -R $j1
jail -R $j0
ifconfig $br0 destroy
ifconfig $ep1a destroy
ifconfig $ep0a destroy

exit $rval
```
Comment 1 Zhenlei Huang freebsd_committer freebsd_triage 2023-03-31 04:39:15 UTC
Expected behavior: Members in vlan1 should communicate with each other without disturbing.

Actual behavior: Untagged traffic and vlan1 traffic influence each other.
Comment 2 Mina Galić freebsd_triage 2023-03-31 06:48:43 UTC
why do $j1's $ep1b.$vid and $j0's $ep0b get the same MAC address:

jexec $j1 ifconfig ${ep1b}.${vid} ether 02:09:a2:78:9a:bc
jexec $j0 ifconfig ${ep0b}        ether 02:09:a2:78:9a:bc

?
Comment 3 Zhenlei Huang freebsd_committer freebsd_triage 2023-03-31 07:15:39 UTC
(In reply to Mina Galić from comment #2)
> why do $j1's $ep1b.$vid and $j0's $ep0b get the same MAC address:

$ep1b.$vid vlan interface with VID 1, while $ep0b is ordinary interface. They are in different broadcast domain and the traffic should not interleave each other. It is perfect fine for different VLANs to have same MAC address.
Comment 4 Kristof Provost freebsd_committer freebsd_triage 2023-04-01 01:05:46 UTC
I'd expect this to work, because the bridge code does take vlan ID into account when it learns addresses.

I translated your test into an automated test: https://reviews.freebsd.org/D39379 , but that seems to pass. Did I miss something?
Comment 5 Zhenlei Huang freebsd_committer freebsd_triage 2023-04-02 15:44:29 UTC
(In reply to Kristof Provost from comment #4)

> I'd expect this to work, because the bridge code does take vlan ID into account
> when it learns addresses.

After looked into the code, I think the root cause is that if_bridge(4) treat untagged packets as from vlan 1 (the default VID for bridge as 802.1Q-2003 Table 9-2), and untagged packets and that from vlan 1 share the same forwarding lookup table. if_bridge(4) will then treat the two host from different broadcast domain as the same one.

One possible solution could be treat untagged packets as untagged (or vlan 0) and tagged as tagged. Although vlan 0 is not valid as per 802.1Q-2003 but it has no side effect if user treat if_bridge(4) as transparent bridge. Actually if_bridge(4) does not function as a full 802.1q aware bridge (I mean it misses policies such as adding tags for inbound and removing tags on outbound or dropping packets with unknown / un-configured tags).

> I translated your test into an automated test: https://reviews.freebsd.org/D39379 ,
> but that seems to pass. Did I miss something?
I'll comment directly on D39379.
Comment 6 Kristof Provost freebsd_committer freebsd_triage 2023-04-05 20:11:12 UTC
Yes, I believe your analysis is correct, as is your proposed approach for fixing it.
With a bit of poking I got my test case to fail as you described, and with the following patch it works again:

diff --git a/sys/net/if_bridge.c b/sys/net/if_bridge.c
index 60a1683c74ae..5144da190a4f 100644
--- a/sys/net/if_bridge.c
+++ b/sys/net/if_bridge.c
@@ -389,7 +389,7 @@ static void bridge_linkcheck(struct bridge_softc *sc);

 /* The default bridge vlan is 1 (IEEE 802.1Q-2003 Table 9-2) */
 #define        VLANTAGOF(_m)   \
-    (_m->m_flags & M_VLANTAG) ? EVL_VLANOFTAG(_m->m_pkthdr.ether_vtag) : 1
+    (_m->m_flags & M_VLANTAG) ? EVL_VLANOFTAG(_m->m_pkthdr.ether_vtag) : 0

 static struct bstp_cb_ops bridge_ops = {
        .bcb_state = bridge_state_change,
@@ -2783,10 +2783,6 @@ bridge_rtupdate(struct bridge_softc *sc, const uint8_t *dst, uint16_t vlan,
             dst[3] == 0 && dst[4] == 0 && dst[5] == 0) != 0)
                return (EINVAL);

-       /* 802.1p frames map to vlan 1 */
-       if (vlan == 0)
-               vlan = 1;
-
        /*
         * A route for this destination might already exist.  If so,
         * update it, otherwise create a new one.
@@ -3118,7 +3114,7 @@ bridge_rtnode_lookup(struct bridge_softc *sc, const uint8_t *addr, uint16_t vlan
        hash = bridge_rthash(sc, addr);
        CK_LIST_FOREACH(brt, &sc->sc_rthash[hash], brt_hash) {
                dir = bridge_rtnode_addr_cmp(addr, brt->brt_addr);
-               if (dir == 0 && (brt->brt_vlan == vlan || vlan == 0))
+               if (dir == 0 && (brt->brt_vlan == vlan))
                        return (brt);
                if (dir > 0)
                        return (NULL);

I assume you've got something similar? Do you want to post a review or should I post this one? Either way, we should copy philip@, because I remember him working on enhancing the bridge's support for vlans, so he may have useful things to say.
Comment 7 Zhenlei Huang freebsd_committer freebsd_triage 2023-04-10 06:29:09 UTC
(In reply to Kristof Provost from comment #6)
My local WIP fix is basically identical with yours.

A few days after this bug report, I realize that we might should corp with https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240106#c30 .

Anyway if_bridge(4) is software bridge, before it becomes a full **MANAGED SWITCH** maybe it deserves a tunable flag to disobey IEEE 802.1Q-2003 Table 9-2.
Comment 8 Philip Paeps freebsd_committer freebsd_triage 2023-04-10 07:24:17 UTC
Our if_bridge(4) really only implements the learning and forwarding parts of the original 802.1D standard.  It's really only aware of VLANs because if_vlan(4) puts the relevant tags in the mbuf.  It does not try to be a complete implementation of 802.1Q-2014 (which incorporates bridging, VLANs and some other stuff).

The comment introducing the VLANTAGOF(_m) macro is a little confusing if this patch is applied as-is.  I'm also not convinced that this change is correct.  VLAN 0 is not a valid VID but it's perfectly possible for a dot1q tag with VID=0 to appear on the wire: it indicates untagged traffic with a PCP or the DEI bit set.

I wonder if we should use 0xFFF rather than 0 as the magic number for untagged frames.  0xFFF is reserved in 802.1Q-2014 for implementation use.

Did either of you test what happens if a packet with a dot1q header that only contains a PCP and not a VID gets forwarded?
Comment 9 Kristof Provost freebsd_committer freebsd_triage 2023-04-10 08:10:02 UTC
(In reply to Philip Paeps from comment #8)
I have not, but did notice that the patch in #6 failed to account for vlan 0 having a special meaning in the bridge ioctl interface (it means delete this address in all vlans). I'd patched around that differently, but using 0xfff to mean no vlan tag should also work and avoid that issue.

I've been meaning to post the patch to phab before this, but I got distracted by if_gif being broken on aarch64 (and one of the bridge tests uses gif).
I'll update my patch and try to post it today. I'll also see if I can add a test case for vlan id 0.
Comment 10 Zhenlei Huang freebsd_committer freebsd_triage 2023-04-10 09:03:40 UTC
(In reply to Philip Paeps from comment #8)
> Did either of you test what happens if a packet with a dot1q header that only
> contains a PCP and not a VID gets forwarded?

Before that I did some basic PCP tests on re(4), cxgbe(4) and epair(4). Found a bug with epair(4). I'll post it separately.

If you meant `ifconfig bridge pcp 3 or ping -C3 xxx`, then it does (and should) work w/wo the patch in #6.
Comment 11 commit-hook freebsd_committer freebsd_triage 2023-04-14 11:21:04 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b0e38a1373c087e5a55eefcdee69ccfbf12f86ce

commit b0e38a1373c087e5a55eefcdee69ccfbf12f86ce
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2023-04-07 16:00:08 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2023-04-14 11:17:02 +0000

    bridge: distinguish no vlan and vlan 1

    The bridge treated no vlan tag as being equivalent to vlan ID 1, which
    causes confusion if the bridge sees both untagged and vlan 1 tagged
    traffic.

    Use DOT1Q_VID_NULL when there's no tag, and fix up the lookup code by
    using 'DOT1Q_VID_RSVD_IMPL' to mean 'any vlan', rather than vlan 0. Note
    that we have to account for userspace expecting to use 0 as meaning 'any
    vlan'.

    PR:             270559
    Suggested by:   Zhenlei Huang <zlei@FreeBSD.org>
    Reviewed by:    philip, zlei
    Differential Revision:  https://reviews.freebsd.org/D39478

 sys/net/if_bridge.c   | 24 ++++++++++++++----------
 sys/net/if_vlan_var.h |  5 +++++
 2 files changed, 19 insertions(+), 10 deletions(-)
Comment 12 commit-hook freebsd_committer freebsd_triage 2023-04-14 11:21:07 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f97802a35e484ee0f8e11d3956189392ea995b29

commit f97802a35e484ee0f8e11d3956189392ea995b29
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2023-04-07 16:02:35 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2023-04-14 11:17:02 +0000

    bridge tests: test cross-vlan bridging

    Ensure that the bridge takes VLAN IDs into account when learning
    addresses.

    PR:             270559
    Differential Revision:  https://reviews.freebsd.org/D39379

 tests/sys/net/if_bridge_test.sh | 68 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
Comment 13 ben 2023-05-31 06:56:27 UTC
I think a default to VLAN 1 was missed in the bridge_transmit() call to bridge_rtlookup(), so all lookups on frames from e.g., IP forwarding fail and flood.
Comment 14 Kristof Provost freebsd_committer freebsd_triage 2023-05-31 07:07:13 UTC
(In reply to ben from comment #13)
I'm afraid I do not understand your comment.

Can you describe your setup as well as the expected and observed behaviour?
Comment 15 ben 2023-06-01 00:54:10 UTC
(In reply to Kristof Provost from comment #14)

Sure.  My apologies for being overly terse.

Create a bridge with IP address:
# ifconfig bridge create up
bridge0
# ifconfig bridge0 inet 10.0.0.1/30

Add an epair with the other end in a jail so we have something to ping:
# ifconfig epair create up
epair0a
# ifconfig bridge0 addm epair0a
# jail -ic persist vnet=new
1
# ifconfig epair0b vnet 1
# ifconfig -j 1 epair0b inet 10.0.0.2/30 up
# ifconfig -j 1 epair0b | grep ether
	ether 02:9d:22:b7:0a:0b

Start a ping and verify that the bridge learns the destination address:
# ping -q 10.0.0.2 &
# ifconfig bridge0 addr
02:9d:22:b7:0a:0b Vlan0 epair0a 1200 flags=0<>

Add some other interface to the bridge:
# ifconfig tap create up
tap0
# ifconfig bridge0 addm tap0
# cat /dev/tap0 >/dev/null &

See that unicast packets are being unexpectedly flooded to tap0 despite the bridge table entry for the destination MAC:
# tcpdump -eni tap0
00:30:52.296892 58:9c:fc:00:ed:c1 > 02:9d:22:b7:0a:0b, ethertype IPv4 (0x0800), length 98: 10.0.0.1 > 10.0.0.2: ICMP echo request, id 1572, seq 420, length 64

This fixes it for me:

diff --git a/sys/net/if_bridge.c b/sys/net/if_bridge.c
index d93bddf71ccb..9896c7a57191 100644
--- a/sys/net/if_bridge.c
+++ b/sys/net/if_bridge.c
@@ -2279,7 +2279,7 @@ bridge_transmit(struct ifnet *ifp, struct mbuf *m)
        eh = mtod(m, struct ether_header *);
 
        if (((m->m_flags & (M_BCAST|M_MCAST)) == 0) &&
-           (dst_if = bridge_rtlookup(sc, eh->ether_dhost, 1)) != NULL) {
+           (dst_if = bridge_rtlookup(sc, eh->ether_dhost, DOT1Q_VID_NULL)) != NULL) {
                error = bridge_enqueue(sc, dst_if, m);
        } else
                bridge_broadcast(sc, ifp, m, 0);

Thank you!
Comment 16 commit-hook freebsd_committer freebsd_triage 2023-06-01 09:40:18 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=fd7edfcdc3c329cdbd3f5e7a554f7153e805ab04

commit fd7edfcdc3c329cdbd3f5e7a554f7153e805ab04
Author:     Ben Wilber <ben@desync.com>
AuthorDate: 2023-06-01 09:29:36 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2023-06-01 09:31:59 +0000

    bridge: fix lookup for untagged packets in bridge_transmit()

    b0e38a1373 improved if_bridge's ability to cope with different VLANs,
    but it failed to update bridge_transmit() to cope with the new rule that
    untagged packets are treated as having VLAN ID 0 (rather than 1, as used
    to be the case).

    Fix that oversight.

    PR:             270559
    Reviewed by:    kp

 sys/net/if_bridge.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)