Bug 254660

Summary: net/wireguard-kmod: wg0: link state changed to DOWN
Product: Ports & Packages
Reporter: Denis Shaposhnikov <dsh>
Component: Individual Port(s)
Assignee: Bernhard Froehlich <decke>
Status: Closed FIXED
Severity: Affects Only Me
CC: chris, jason, rhurlin
Priority: ---
Flags: bugzilla: maintainer-feedback? (decke)
Version: Latest
Hardware: Any
OS: Any

Description Denis Shaposhnikov 2021-03-30 13:02:27 UTC
Hi!

I'm on 12.2-p5 and trying to use wireguard-kmod-0.0.20210323 instead of wireguard-go. It works, but I've hit a strange issue. After starting or restarting, it works fine for some time:

Mar 29 15:05:45 XXX doas[3839]: XX ran command service wireguard start as root from /usr/home/XX
Mar 29 15:05:45 XXX kernel: wg0: link state changed to UP

I can connect to it from my Android phone and it works. After some time without any connection it stops working and I can no longer use it from my Android phone. In the log I see:

Mar 29 21:44:40 XXX kernel: wg0: link state changed to DOWN

and nothing else about wg. To use it again I need to restart the service, and then, after being idle for some time, it stops working again. I have hit this situation twice since switching to wireguard-kmod.
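
To make the UP/DOWN pattern easier to spot in a longer log, here is a minimal sketch that pulls the link-state lines for one interface out of syslog-style text. It assumes the line layout shown in the excerpts above; the helper name link_events and the year parameter are hypothetical (syslog timestamps carry no year, so the caller has to supply one).

```python
import re
from datetime import datetime

# Matches kernel link-state lines such as:
#   Mar 29 21:44:40 XXX kernel: wg0: link state changed to DOWN
LINK_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"(?P<ifname>\w+): link state changed to (?P<state>UP|DOWN)$"
)


def link_events(lines, ifname="wg0", year=2021):
    """Return (timestamp, state) pairs for one interface.

    The year is an assumption passed in by the caller, because the
    syslog timestamp format used here does not include it.
    """
    events = []
    for line in lines:
        m = LINK_RE.match(line.strip())
        if m and m.group("ifname") == ifname:
            stamp = datetime.strptime(
                f"{year} {m.group('stamp')}", "%Y %b %d %H:%M:%S"
            )
            events.append((stamp, m.group("state")))
    return events


# The three log lines quoted in this report.
sample = [
    "Mar 29 15:05:45 XXX doas[3839]: XX ran command service wireguard start as root from /usr/home/XX",
    "Mar 29 15:05:45 XXX kernel: wg0: link state changed to UP",
    "Mar 29 21:44:40 XXX kernel: wg0: link state changed to DOWN",
]

events = link_events(sample)
for stamp, state in events:
    print(stamp.isoformat(sep=" "), state)
```

Comparing the timestamps of the DOWN events with other activity on the host (for example service or jail start/stop messages) may help narrow down what triggers the link going down.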
Comment 1 Rainer Hurling freebsd_committer freebsd_triage 2021-03-30 19:23:37 UTC
Same here :(
Comment 2 Bernhard Froehlich freebsd_committer freebsd_triage 2021-03-31 04:53:16 UTC
Do you use jails on the machine where wireguard-kmod is running?
Comment 3 Bernhard Froehlich freebsd_committer freebsd_triage 2021-03-31 04:57:34 UTC
Or poudriere or other tools that use jails?
Comment 4 Rainer Hurling freebsd_committer freebsd_triage 2021-03-31 05:13:12 UTC
(In reply to Bernhard Froehlich from comment #3)

I use Poudriere several times almost every day.

But wireguard also shows this behaviour for me when I am not using Poudriere.
Comment 5 Bernhard Froehlich freebsd_committer freebsd_triage 2021-03-31 05:45:28 UTC
This problem (or a similar one) was already fixed upstream and will be included in the next snapshot.

https://git.zx2c4.com/wireguard-freebsd/commit/?id=1f6818b7e4bab23b3ab471d752d6c481453e9622
Comment 6 Rainer Hurling freebsd_committer freebsd_triage 2021-03-31 06:29:00 UTC
(In reply to Bernhard Froehlich from comment #5)

Thanks for the info! :)
Comment 7 Denis Shaposhnikov 2021-03-31 08:42:27 UTC
Yes, several jails are running on that host. And I see now that I got the link down message right after I stopped one of them.

Good to know it's fixed. Thanks!
Comment 8 Christos Chatzaras 2021-04-02 16:03:58 UTC
I use wireguard-kmod and I have multiple jails.

Today it looks like I had a filesystem lockup (I couldn't SSH to the public IP, but I could ping it).

Before the lockup the logs show nothing. After the lockup I see these entries on my remote rsyslog:

----------

Apr  2 18:35:56 192.168.0.42 kernel: [#] route -q -n add -inet 192.168.2.1/32 -interface wg0
Apr  2 18:35:56 192.168.0.42 kernel: [+] Backgrounding route monitor
Apr  2 18:35:56 192.168.0.42 kernel: Starting cron.
Apr  2 18:35:56 192.168.0.42 kernel: Starting jails:
Apr  2 18:35:56 192.168.0.42 devd[32858]: Processing event '!system=VFS subsystem=FS type=MOUNT mount-point="/home/jail/php74/home/www" mount-dev="/home/www" mount-type="nullfs" fsid=0x02ff002929000000 owner=0 flags="local;noatime;nosuid;" opt="fstype=nullfs;fspath=/home$
Apr  2 18:35:56 192.168.0.42 devd[32858]: Pushing table
Apr  2 18:35:56 192.168.0.42 devd[32858]: Processing notify event
Apr  2 18:35:56 192.168.0.42 devd[32858]: Popping table
Apr  2 18:35:56 192.168.0.42 devd[32858]: Processing event '!system=VFS subsystem=FS type=MOUNT mount-point="/home/jail/php74/tmp" mount-dev="/tmp" mount-type="nullfs" fsid=0x03ff002929000000 owner=0 flags="local;noatime;nosuid;" opt="fstype=nullfs;fspath=/home/jail/php7$
Apr  2 18:35:56 192.168.0.42 devd[32858]: Pushing table
Apr  2 18:35:56 192.168.0.42 devd[32858]: Processing notify event
Apr  2 18:35:56 192.168.0.42 devd[32858]: Popping table
Apr  2 18:35:56 192.168.0.42 devd[32858]: Processing event '!system=VFS subsystem=FS type=MOUNT mount-point="/home/jail/php74/tmpfs" mount-dev="/tmpfs" mount-type="nullfs" fsid=0x04ff002929000000 owner=0 flags="local;noatime;nosuid;" opt="fstype=nullfs;fspath=/home/jail/$
Apr  2 18:35:56 192.168.0.42 devd[32858]: Pushing table
Apr  2 18:35:56 192.168.0.42 devd[32858]: Processing notify event
Apr  2 18:35:56 192.168.0.42 devd[32858]: Popping table
Apr  2 18:35:56 192.168.0.42 devd[32858]: Processing event '!system=VFS subsystem=FS type=MOUNT mount-point="/home/jail/php74/dev" mount-dev="devfs" mount-type="devfs" fsid=0x05ff007171000000 owner=0 flags="local;multilabel;" opt="ruleset=4;fstype=devfs;fspath=/home/jail$
Apr  2 18:35:56 192.168.0.42 devd[32858]: Pushing table
Apr  2 18:35:56 192.168.0.42 devd[32858]: Processing notify event
Apr  2 18:35:56 192.168.0.42 devd[32858]: Popping table
Apr  2 18:35:57 192.168.0.42 ntpd[75759]: Listen normally on 7 wg0 192.168.0.42:123
Apr  2 18:35:57 192.168.0.42 ntpd[75759]: Soliciting pool server 136.243.66.91

----------

Could the lockup be related to wireguard?
Comment 9 Christos Chatzaras 2021-04-02 16:17:04 UTC
Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 07
fault virtual address   = 0x18
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c9b7d8
stack pointer           = 0x0:0xfffffe00357a51c0
frame pointer           = 0x0:0xfffffe00357a5230
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (if_io_tqg_7)
trap number             = 12
panic: page fault
cpuid = 7
time = 1617377524
KDB: stack backtrace:
#0 0xffffffff80c57345 at kdb_backtrace+0x65
#1 0xffffffff80c09d21 at vpanic+0x181
#2 0xffffffff80c09b93 at panic+0x43
#3 0xffffffff8108a187 at trap_fatal+0x387
#4 0xffffffff8108a1df at trap_pfault+0x4f
#5 0xffffffff8108983d at trap+0x27d
#6 0xffffffff81061768 at calltrap+0x8
#7 0xffffffff80dc8a33 at tcp_output+0x10b3
#8 0xffffffff80dc0fcb at tcp_do_segment+0x301b
#9 0xffffffff80dbd1ee at tcp_input+0xabe
#10 0xffffffff80dafbe5 at ip_input+0x125
#11 0xffffffff80d3f2ca at netisr_dispatch_src+0xca
#12 0xffffffff80d23a58 at ether_demux+0x148
#13 0xffffffff80d24ddc at ether_nh_input+0x34c
#14 0xffffffff80d3f2ca at netisr_dispatch_src+0xca
#15 0xffffffff80d23ea9 at ether_input+0x69
#16 0xffffffff80dc6a61 at tcp_flush_out_le+0x221
#17 0xffffffff80dc67fd at tcp_lro_flush+0x2ad
Uptime: 2d15h58m1s
Dumping 2453 out of 32505 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
Comment 10 Bernhard Froehlich freebsd_committer freebsd_triage 2021-04-04 19:44:15 UTC
It sounds a lot like everyone is using jails, so the patch should work. I will keep the bug report open until the next snapshot is committed to the ports tree.
Comment 11 Jason A. Donenfeld 2021-04-13 17:53:42 UTC
Fix released in https://lists.zx2c4.com/pipermail/wireguard/2021-April/006612.html
Comment 12 Bernhard Froehlich freebsd_committer freebsd_triage 2021-04-13 19:32:43 UTC
net/wireguard-kmod was updated to 0.0.20210412