Bug 228412 - Kernel panic on shutdown after IPv6 was enabled
Summary: Kernel panic on shutdown after IPv6 was enabled
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 10.4-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-22 02:04 UTC by Victor Sudakov
Modified: 2018-06-14 02:03 UTC

See Also:


Attachments
crashinfo (137.54 KB, text/plain)
2018-05-22 02:04 UTC, Victor Sudakov
kgdb output (5.97 KB, text/plain)
2018-05-22 16:33 UTC, Victor Sudakov
GPF and kernel panic on killing syncthing (161.14 KB, text/plain)
2018-05-27 03:55 UTC, Victor Sudakov

Description Victor Sudakov 2018-05-22 02:04:48 UTC
Created attachment 193599
crashinfo

Kernel panic on every shutdown. 100% reproducible. Attaching /var/crash/core.txt.5; I can provide any additional info on request.

It may be that the problem started after I configured the system as a dual stack IPv6 gateway.
Comment 1 Victor Sudakov 2018-05-22 11:22:36 UTC
ifconfig before shutdown:

re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	description: Outside
	options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
	ether 54:04:a6:b4:9a:66
	hwaddr 54:04:a6:b4:9a:66
	inet 78.140.19.131 netmask 0xffffff00 broadcast 78.140.19.255 
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect (100baseTX <half-duplex>)
	status: active
ath0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 2290
	ether 00:1d:0f:f9:40:c6
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
	media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
	status: running
re1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	description: Inside
	options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
	ether c4:12:f5:33:c9:7c
	hwaddr c4:12:f5:33:c9:7c
	inet 192.168.4.1 netmask 0xffffff00 broadcast 192.168.4.255 
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect (100baseTX <full-duplex>)
	status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128 
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4 
	inet 127.0.0.1 netmask 0xff000000 
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
tap0: flags=8902<BROADCAST,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80000<LINKSTATE>
	ether 00:bd:05:6a:00:00
	hwaddr 00:bd:05:6a:00:00
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: no carrier
tap1: flags=8902<BROADCAST,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80000<LINKSTATE>
	ether 00:bd:09:6a:00:01
	hwaddr 00:bd:09:6a:00:01
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: no carrier
tap2: flags=8902<BROADCAST,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80000<LINKSTATE>
	ether 00:bd:0d:6a:00:02
	hwaddr 00:bd:0d:6a:00:02
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: no carrier
tap3: flags=8902<BROADCAST,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80000<LINKSTATE>
	ether 00:bd:11:6a:00:03
	hwaddr 00:bd:11:6a:00:03
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: no carrier
tap4: flags=8902<BROADCAST,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80000<LINKSTATE>
	ether 00:bd:15:6a:00:04
	hwaddr 00:bd:15:6a:00:04
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: no carrier
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	description: vm-main
	ether 02:2c:d4:74:d0:00
	inet 192.168.3.1 netmask 0xffffff00 broadcast 192.168.3.255 
	inet6 fe80::2c:d4ff:fe74:d000%bridge0 prefixlen 64 scopeid 0xa 
	inet6 2001:470:ecba:2::1 prefixlen 64 
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
	id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
	maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
	root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
	member: tap5 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 14 priority 128 path cost 2000000
	member: tap4 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 9 priority 128 path cost 2000000
	member: tap3 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 8 priority 128 path cost 2000000
	member: tap2 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 7 priority 128 path cost 2000000
	member: tap1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 6 priority 128 path cost 2000000
	member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 5 priority 128 path cost 2000000
gif0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1480
	options=80000<LINKSTATE>
	tunnel inet 78.140.19.131 --> 216.218.221.42
	inet6 2001:470:35:7af::2 --> 2001:470:35:7af::1 prefixlen 128 
	inet6 fe80::5604:a6ff:feb4:9a66%gif0 prefixlen 64 scopeid 0xb 
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	ether 00:1d:0f:f9:40:c6
	hwaddr 00:1d:0f:f9:40:c6
	inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255 
	inet6 fe80::21d:fff:fef9:40c6%wlan0 prefixlen 64 scopeid 0xc 
	inet6 2001:470:ecba:1::1 prefixlen 64 
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
	media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
	status: running
	ssid sudakov channel 7 (2442 MHz 11g) bssid 00:1d:0f:f9:40:c6
	regdomain FCC country US indoor ecm authmode WPA1+WPA2/802.11i
	privacy MIXED deftxkey 2 TKIP 2:128-bit TKIP 3:128-bit txpower 30
	scanvalid 60 protmode CTS wme burst dtimperiod 1 -dfs
bridge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	description: vm-isolated
	ether 02:2c:d4:74:d0:01
	nd6 options=1<PERFORMNUD>
	id 00:00:00:00:00:00 priority 0 hellotime 2 fwddelay 15
	maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
	root id 00:00:00:00:00:00 priority 0 ifcost 0 port 0
tap5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	description: vmnet-fido-0-main
	options=80000<LINKSTATE>
	ether 00:bd:41:d5:00:05
	hwaddr 00:bd:41:d5:00:05
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: active
	Opened by PID 2094
Comment 2 Andrey V. Elsukov freebsd_committer freebsd_triage 2018-05-22 15:30:03 UTC
Your trace looks like some application uses IPv6 multicast before shutdown, what if you kill it first, and then run `shutdown`? 
Also can you run `kgdb /boot/kernel/kernel /var/crash/vmcore.5` and show the output of these commands:

f 7
l
i lo
p *ifp
p mld_mtx
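
For readers unfamiliar with the abbreviations, the commands above are short forms of standard gdb commands. The sketch below saves them, with explanatory comments, to a file so they can be typed or pasted at the (kgdb) prompt. The file name kgdb-cmds.txt is an assumption made for illustration and does not come from this report.

```shell
# Save the requested debugger commands for reference.
# kgdb-cmds.txt is a hypothetical file name chosen for this sketch.
cat > kgdb-cmds.txt <<'EOF'
# "frame 7": select stack frame 7 of the panic backtrace
f 7
# "list": show the source lines around the selected frame (needs kernel sources)
l
# "info locals": print the local variables of that frame
i lo
# "print": dump whatever ifp points to in that frame
p *ifp
# print the mld_mtx variable
p mld_mtx
EOF

# Show only the commands themselves (comment lines start with '#').
grep -v '^#' kgdb-cmds.txt
```

One way to record the interactive session as a typescript, assuming FreeBSD's script(1), would be `script kgdb.typescript kgdb /boot/kernel/kernel /var/crash/vmcore.5`; the output file name is again only an example.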
Comment 3 Victor Sudakov 2018-05-22 16:32:40 UTC
(In reply to Andrey V. Elsukov from comment #2)
> Also can you run `kgdb /boot/kernel/kernel /var/crash/vmcore.5`
> and show the output of these commands:

typescript attached.

> Your trace looks like some application uses IPv6 multicast before shutdown
> what if you kill it first, and then run `shutdown`?

This could be easily rtadvd. I will try before the next shutdown, and report.
Comment 4 Victor Sudakov 2018-05-22 16:33:49 UTC
Created attachment 193617
kgdb output
Comment 5 Andrey V. Elsukov freebsd_committer freebsd_triage 2018-05-23 11:19:18 UTC
(In reply to vas from comment #3)
> (In reply to Andrey V. Elsukov from comment #2)
> > Also can you run `kgdb /boot/kernel/kernel /var/crash/vmcore.5`
> > and show the output of these commands:
> 
> typescript attached.
> 
> > Your trace looks like some application uses IPv6 multicast before shutdown
> > what if you kill it first, and then run `shutdown`?
> 
> This could be easily rtadvd. I will try before the next shutdown, and report.


According to the panic message, it is syncthing.
Since you don't have the source code installed in /usr/src, can you provide the exact version of your kernel (uname -a)?
Comment 6 Victor Sudakov 2018-05-23 16:53:02 UTC
(In reply to Andrey V. Elsukov from comment #5)

> can you provide the exact version of your kernel (uname -a)?

The attached crash was either on 10.4-RELEASE-p8 or on 10.4-RELEASE-p9. I cannot say for sure because I updated to p9 and then rolled it back to p8 as I thought the update was the cause of the crash.

Let's assume it's "FreeBSD vas.sibptus.ru 10.4-RELEASE-p8 FreeBSD 10.4-RELEASE-p8 #0: Tue Apr  3 18:40:50 UTC 2018     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64" 

Do you need a shell account on this box to examine the coredumps and kernels? I can give you one.
Comment 7 Victor Sudakov 2018-05-23 16:54:01 UTC
I can also install the kernel sources if this helps.
Comment 8 Victor Sudakov 2018-05-27 03:54:25 UTC
I have an important update. Syncthing (net/syncthing) is running as a regular unprivileged user. However, when this user runs "killall syncthing", it causes an immediate general protection fault and kernel panic.

I will attach the new crash info ASAP.
Comment 9 Victor Sudakov 2018-05-27 03:55:57 UTC
Created attachment 193747
GPF and kernel panic on killing syncthing

Happens immediately on killing syncthing
Comment 10 Eugene Grosbein freebsd_committer freebsd_triage 2018-05-27 09:19:15 UTC
(In reply to vas from comment #9)

Please make kernel.debug and crashdump available to download, compressed.
Comment 11 Eugene Grosbein freebsd_committer freebsd_triage 2018-05-27 09:41:13 UTC
(In reply to Eugene Grosbein from comment #10)

Please also describe how one can reproduce the problem, e.g. how do you build and/or install syncthing and configure it.
Comment 12 Victor Sudakov 2018-05-27 12:20:53 UTC
(In reply to Eugene Grosbein from comment #10)
> Please make kernel.debug and crashdump available to download, compressed.

http://noc.sibptus.ru/~sudakov/bug228412.tar.gz

> how do you build and/or install syncthing and configure it.

Just 'pkg install syncthing' from the default FreeBSD package repo.
Comment 13 Eugene Grosbein freebsd_committer freebsd_triage 2018-05-29 13:58:04 UTC
I could not reproduce the panic on my 11.1-STABLE/amd64 system, which has working IPv6 support and a similar set of tap interfaces. I started:

/usr/bin/nohup /usr/local/bin/syncthing -no-browser > /var/log/syncthing.log &

The kernel (WITNESS) reported two Lock Order Reversals, but that's all: "killall syncthing" terminated it just fine, no panics.

Can you try switching to 11.1?
Comment 14 Victor Sudakov 2018-06-08 14:31:46 UTC
(In reply to Eugene Grosbein from comment #13)
> Can you try switching to 11.1?

Switching to 11.1 seems to have cured the problem.
Comment 15 Eugene Grosbein freebsd_committer freebsd_triage 2018-06-08 14:50:59 UTC
I guess this was fixed by https://svnweb.freebsd.org/base?view=revision&revision=302054, which was committed to head before stable/11 was branched but never merged to 10.x.

Let's see what Bjoern thinks about the problem.
Comment 16 Bjoern A. Zeeb freebsd_committer freebsd_triage 2018-06-08 15:29:06 UTC
The VIMAGE changes were never intended to be merged to 10.
I am not sure how this changeset relates to this bug.
Comment 17 Eugene Grosbein freebsd_committer freebsd_triage 2018-06-08 15:54:09 UTC
(In reply to Bjoern A. Zeeb from comment #16)

The commit log for r302054 mentions multicast initialisation and teardown, and this PR is about a panic during multicast teardown.

Also, that change is present in stable/11, where the panic does not occur, and absent from stable/10, which panics.

And this change is the only non-comment difference in sys/netinet6/mld6.c (where the panic occurs) between stable/10 and stable/11.
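
A sketch of how that comparison could be repeated, assuming local checkouts of both branches are available. The function name compare_mld6 and the two directory arguments are hypothetical; only the path sys/netinet6/mld6.c comes from this report.

```shell
# compare_mld6 OLD_TREE NEW_TREE
# Print a unified diff of sys/netinet6/mld6.c between two source trees.
# Both tree locations are assumptions supplied by the caller.
compare_mld6() {
    rc=0
    # diff exits with 1 when the files differ, which is the expected case here;
    # only an exit status above 1 (e.g. a missing file) is treated as an error.
    diff -u "$1/sys/netinet6/mld6.c" "$2/sys/netinet6/mld6.c" || rc=$?
    [ "$rc" -le 1 ]
}
```

It could be invoked as `compare_mld6 /path/to/stable-10 /path/to/stable-11`, where both paths are placeholders for wherever the trees were checked out.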
Comment 18 Victor Sudakov 2018-06-14 02:03:27 UTC
I think nobody should waste time fixing the 10 branch, and the workaround is to upgrade to 11.1-RELEASE. Therefore I suggest we close this bug.