Bug 279195 - kernel panic while re-configuring if_vlan(4) vlan id
Summary: kernel panic while re-configuring if_vlan(4) vlan id
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 14.0-RELEASE
Hardware: Any Any
: --- Affects Only Me
Assignee: Kristof Provost
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-05-21 07:57 UTC by Zhenlei Huang
Modified: 2024-05-22 07:41 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Zhenlei Huang freebsd_committer freebsd_triage 2024-05-21 07:57:36 UTC
Spot this while repeating https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279181 .

Steps to repeat:

```
# ifconfig vlan0 create vlan 100 vlandev cxl0
# ifconfig vlan1 create vlan 110 vlandev cxl0
# ifconfig vlan0 vlan 110
ifconfig: SIOCSETVLAN: File exists
# ifconfig vlan0 vlan 110
# ifconfig vlan0 vlan 111

... panic ...
```
Comment 1 Zhenlei Huang freebsd_committer freebsd_triage 2024-05-21 08:01:45 UTC
Backtrace:

```
Unread portion of the kernel message buffer:
panic: vlan_remhash: vlan not found

cpuid = 1
time = 1716278185
KDB: stack backtrace:
#0 0xffffffff80b9009d at kdb_backtrace+0x5d
#1 0xffffffff80b431a2 at vpanic+0x132
#2 0xffffffff80b43063 at panic+0x43
#3 0xffffffff80c72553 at vlan_remhash+0x63
#4 0xffffffff80c7414d at vlan_config+0xdd
#5 0xffffffff80c73bea at vlan_ioctl+0x41a
#6 0xffffffff80c60723 at ifhwioctl+0x883
#7 0xffffffff80c62843 at ifioctl+0x913
#8 0xffffffff80bb15a5 at kern_ioctl+0x255
#9 0xffffffff80bb12e3 at sys_ioctl+0x123
#10 0xffffffff8100d119 at amd64_syscall+0x109
#11 0xffffffff80fe43db at fast_syscall_common+0xf8
Uptime: 47s
Dumping 696 out of 16207 MB:..3%..12%..21%..33%..42%..51%..63%..72%..81%..92%

(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:405
#2  0xffffffff80b42d37 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:526
#3  0xffffffff80b4320f in vpanic (fmt=0xffffffff81150b30 "%s: vlan not found\n", ap=ap@entry=0xfffffe00f9bcd9c0) at /usr/src/sys/kern/kern_shutdown.c:970
#4  0xffffffff80b43063 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:894
#5  0xffffffff80c72553 in vlan_remhash (trunk=<optimized out>, ifv=ifv@entry=0xfffff8032f2c9d80) at /usr/src/sys/net/if_vlan.c:480
#6  0xffffffff80c7414d in vlan_config (ifv=ifv@entry=0xfffff8032f2c9d80, p=p@entry=0xfffff80003ba7800, vid=111, proto=<optimized out>) at /usr/src/sys/net/if_vlan.c:1719
#7  0xffffffff80c73bea in vlan_ioctl (ifp=0xfffff8036582e000, ifp@entry=<error reading variable: value is not available>, cmd=<unavailable>, 
    cmd@entry=<error reading variable: value is not available>, data=<unavailable>, data@entry=<error reading variable: value is not available>)
    at /usr/src/sys/net/if_vlan.c:2264
#8  0xffffffff80c60723 in ifhwioctl (cmd=cmd@entry=2149607737, ifp=0xfffff8036582e000, data=data@entry=0xfffffe00f9bcdd50 "vlan0", td=td@entry=0xfffffe00fae223a0)
    at /usr/src/sys/net/if.c:2757
#9  0xffffffff80c62843 in ifioctl (so=0xfffff8000ae773c0, so@entry=<error reading variable: value is not available>, cmd=2149607737, 
    cmd@entry=<error reading variable: value is not available>, data=0xfffffe00f9bcdd50 "vlan0", data@entry=<error reading variable: value is not available>, 
    td=0xfffffe00fae223a0, td@entry=<error reading variable: value is not available>) at /usr/src/sys/net/if.c:3070
#10 0xffffffff80bb15a5 in fo_ioctl (fp=0xfffff800039b5640, com=2149607737, data=<unavailable>, active_cred=<unavailable>, td=0xfffffe00fae223a0) at /usr/src/sys/sys/file.h:366
#11 kern_ioctl (td=td@entry=0xfffffe00fae223a0, fd=<optimized out>, com=com@entry=2149607737, data=<unavailable>, data@entry=0xfffffe00f9bcdd50 "vlan0")
    at /usr/src/sys/kern/sys_generic.c:805
#12 0xffffffff80bb12e3 in sys_ioctl (td=0xfffffe00fae223a0, uap=0xfffffe00fae227a0) at /usr/src/sys/kern/sys_generic.c:713
#13 0xffffffff8100d119 in syscallenter (td=0xfffffe00fae223a0) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:187
#14 amd64_syscall (td=0xfffffe00fae223a0, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1197
#15 <signal handler called>
#16 0x0000178be8b75e0a in ?? ()
Backtrace stopped: Cannot access memory at address 0x178be401c0a8
(kgdb) 
```
Comment 2 Kristof Provost freebsd_committer freebsd_triage 2024-05-21 08:34:19 UTC
Yeah, that seems to need this:

diff --git a/sys/net/if_vlan.c b/sys/net/if_vlan.c
index b69d8107e30d..b936d4673404 100644
--- a/sys/net/if_vlan.c
+++ b/sys/net/if_vlan.c
@@ -1715,10 +1715,17 @@ vlan_config(struct ifvlan *ifv, struct ifnet *p, uint16_t vid,
                ifv->ifv_proto = proto;

                if (ifv->ifv_vid != vid) {
+                       int oldvid = ifv->ifv_vid;
+
                        /* Re-hash */
                        vlan_remhash(trunk, ifv);
                        ifv->ifv_vid = vid;
                        error = vlan_inshash(trunk, ifv);
+                       if (error) {
+                               ifv->ifv_vid = oldvid;
+                               /* Re-insert back where we found it. */
+                               vlan_inshash(trunk, ifv);
+                       }
                }
                /* Will unlock */
                goto done;

After # ifconfig vlan0 vlan 110 you'll see that vlan0 thinks it's on 111, so when it tries to vlan_remhash() it's not where it expects to find it.

There may be a similar problem in the case where we don't yet have a vlandev.
Comment 3 Kristof Provost freebsd_committer freebsd_triage 2024-05-21 12:26:28 UTC
Patch in https://reviews.freebsd.org/D45285
Comment 4 commit-hook freebsd_committer freebsd_triage 2024-05-22 07:41:47 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=bdd12889eaa64032b3d09ef47e9a6f7081863378

commit bdd12889eaa64032b3d09ef47e9a6f7081863378
Author:     Kristof Provost <kp@FreeBSD.org>
AuthorDate: 2024-05-21 11:31:13 +0000
Commit:     Kristof Provost <kp@FreeBSD.org>
CommitDate: 2024-05-22 07:08:02 +0000

    if_vlan: handle VID conflicts

    If we fail to change the vlan id we have to undo the removal (and vlan id
    change) in the error path. Otherwise we'll have removed the vlan object from the
    hash table, and have the wrong vlan id as well. Subsequent modification attempts
    will then try to remove an entry which doesn't exist, and panic.

    Undo the vlan id modification if the insertion in the hash table fails, and
    re-insert it under the original vlan id.

    PR:             279195
    Reviewed by:    zlei
    MFC atfer:      1 week
    Sponsored by:   Rubicon Communications, LLC ("Netgate")
    Differential Revision:  https://reviews.freebsd.org/D45285

 sys/net/if_vlan.c        | 10 ++++++++++
 tests/sys/net/if_vlan.sh | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+)