Bug 206614

Summary: net/dhcpcd: Crashes the kernel when a VNET jail starts.
Product: Ports & Packages Reporter: g_amanakis
Component: Individual Port(s) Assignee: freebsd-ports-bugs (Nobody) <ports-bugs>
Status: Closed FIXED    
Severity: Affects Only Me CC: roy
Priority: --- Keywords: crash, needs-patch, needs-qa
Version: Latest Flags: roy: maintainer-feedback+
Hardware: amd64   
OS: Any   
Bug Depends on: 206613    
Bug Blocks:    
Attachments:
Description      Flags
rc.conf          none
My dhcpcd.conf   none

Description g_amanakis 2016-01-25 15:39:14 UTC

    
Comment 1 g_amanakis 2016-01-25 15:41:50 UTC
See:
http://roy.marples.name/projects/dhcpcd/tktview/3a1e57157dd01af0fb7ce497850645eb7d49889d

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206613

dhcpcd 6.10.1, and more specifically [6b2a5402c4], causes a kernel panic on FreeBSD 10.2 when starting a VNET iocage jail. The system runs a GENERIC kernel with VIMAGE and IPSEC enabled. Reverting this change resolves the problem.

/var/log/messages:
Jan 24 19:30:42 x3200 kernel: vnet0:1: link state changed to DOWN
Jan 24 19:30:42 x3200 kernel: vnet0: link state changed to DOWN
Jan 24 19:30:42 x3200 kernel: bridge1: link state changed to DOWN
Jan 24 19:30:42 x3200 kernel: ifa_del_loopback_route: deletion failed: 48
Jan 24 19:30:42 x3200 kernel: Freed UMA keg (udp_inpcb) was not empty (60 items).  Lost 6 pages of memory.
Jan 24 19:30:42 x3200 kernel: Freed UMA keg (udpcb) was not empty (668 items).  Lost 4 pages of memory.
Jan 24 19:30:42 x3200 kernel: Freed UMA keg (tcp_inpcb) was not empty (60 items).  Lost 6 pages of memory.
Jan 24 19:30:42 x3200 kernel: Freed UMA keg (tcpcb) was not empty (18 items).  Lost 6 pages of memory.
Jan 24 19:30:42 x3200 kernel: Freed UMA keg (ripcb) was not empty (60 items).  Lost 6 pages of memory.
Jan 24 19:30:42 x3200 kernel: hhook_vnet_uninit: hhook_head type=1, id=1 cleanup required
Jan 24 19:30:42 x3200 kernel: hhook_vnet_uninit: hhook_head type=1, id=0 cleanup required
Jan 24 19:31:05 x3200 devd: Executing '/etc/pccard_ether epair0a start'
Jan 24 19:31:05 x3200 kernel: epair0a:
Jan 24 19:31:05 x3200 kernel:
Jan 24 19:31:05 x3200 kernel: Fatal trap 12: page fault while in kernel mode
Jan 24 19:31:05 x3200 kernel: cpuid = 1; apic id = 02
Jan 24 19:31:05 x3200 kernel: Ethernet address: 02:ff:20:00:09:0a
Jan 24 19:31:05 x3200 kernel: fault virtual address     = 0x0
Jan 24 19:31:05 x3200 kernel: fault code                = supervisor read instruction, page not present
Jan 24 19:31:05 x3200 kernel: instruction pointer       = 0x20:0x0
Jan 24 19:31:05 x3200 kernel: stack pointer             = 0x28:0xfffffe04691ca720
Jan 24 19:31:05 x3200 kernel: frame pointer             = 0x28:0xfffffe04691ca770
Jan 24 19:31:05 x3200 kernel: epair0b: code segment             = base rx0, limit 0xfffff, type 0x1b
Jan 24 19:31:05 x3200 kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Jan 24 19:31:05 x3200 kernel: Ethernet address: 02:ff:70:00:0a:0b
Jan 24 19:31:05 x3200 kernel: processor eflags  = interrupt enabled,
Jan 24 19:31:05 x3200 kernel: epair0a: link state changed to UP
Jan 24 19:33:13 x3200 syslogd: kernel boot file is /boot/kernel/kernel
Jan 24 19:33:13 x3200 kernel: epair0b: link state changed to UP
Jan 24 19:33:13 x3200 kernel: resume, IOPL = 0
Jan 24 19:33:13 x3200 kernel: current process           = 10817 (dhcpcd)
Jan 24 19:33:13 x3200 kernel: trap number               = 12
Jan 24 19:33:13 x3200 kernel: panic: page fault
Jan 24 19:33:13 x3200 kernel: cpuid = 1
Jan 24 19:33:13 x3200 kernel: KDB: stack backtrace:
Jan 24 19:33:13 x3200 kernel: #0 0xffffffff809442a0 at kdb_backtrace+0x60
Jan 24 19:33:13 x3200 kernel: #1 0xffffffff80907a06 at vpanic+0x126
Jan 24 19:33:13 x3200 kernel: #2 0xffffffff809078d3 at panic+0x43
Jan 24 19:33:13 x3200 kernel: #3 0xffffffff80cd178b at trap_fatal+0x36b
Jan 24 19:33:13 x3200 kernel: #4 0xffffffff80cd1a8d at trap_pfault+0x2ed
Jan 24 19:33:13 x3200 kernel: #5 0xffffffff80cd112a at trap+0x47a
Jan 24 19:33:13 x3200 kernel: #6 0xffffffff80cb74a2 at calltrap+0x8
Jan 24 19:33:13 x3200 kernel: #7 0xffffffff809ca1cb at ifioctl+0x11eb
Jan 24 19:33:13 x3200 kernel: #8 0xffffffff8095c195 at kern_ioctl+0x255
Jan 24 19:33:13 x3200 kernel: #9 0xffffffff8095be90 at sys_ioctl+0x140
Jan 24 19:33:13 x3200 kernel: #10 0xffffffff80cd20a7 at amd64_syscall+0x357
Jan 24 19:33:13 x3200 kernel: #11 0xffffffff80cb778b at Xfast_syscall+0xfb
Jan 24 19:33:13 x3200 kernel: Uptime: 30m59s

See http://roy.marples.name/projects/dhcpcd/tktview?name=3a1e57157d.
Expected behaviour: A userland app should not crash the kernel.


iocage creates a vnet0:1 interface and connects it to bridge1. When dhcpcd is stopped, I can start the VNET jail. The ifconfig output shows:

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=4209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWTSO>
	ether 00:xx:xx:xx:xx:xx
	inet6 fe80::xxx:xxxx:xxxx:xxxx%em0 prefixlen 64 scopeid 0x1 
	inet6 2001:xxx:xxxx:xxx:xxxx:xxxx:xxxx:xxxx prefixlen 128 
	inet 69.xxx.xxx.xxx netmask 0xfffffe00 broadcast 255.255.255.255 
	nd6 options=1<PERFORMNUD>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=42098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWTSO>
	ether 00:xx:xx:xx:xx:58
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
enc0: flags=0<> metric 0 mtu 1536
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128 
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4 
	inet 127.0.0.1 netmask 0xff000000 
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	ether 02:xx:xx:xx:xx:00
	inet6 fe80::yy:yyyy:yyyy:3e00%bridge0 prefixlen 64 scopeid 0x5 
	inet zzz.zzz.zz.z netmask 0xffffff00 broadcast 156.168.10.255 
	inet6 2601:xxx:xxxxx:xxxx::1 prefixlen 64 
	nd6 options=41<PERFORMNUD,NO_RADR>
	id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
	maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
	root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
	member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 7 priority 128 path cost 2000000
	member: em1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 2 priority 128 path cost 55
bridge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	ether 02:xx:xx:xx:xx:01
	inet 172.10.0.1 netmask 0xffffff00 broadcast 172.10.0.255 
	nd6 options=49<PERFORMNUD,IFDISABLED,NO_RADR>
	id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
	maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
	root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
	member: vnet0:1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 9 priority 128 path cost 2000
tap0: flags=8902<BROADCAST,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80000<LINKSTATE>
	ether xx:xx:xx:xx:xx:00
	nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
	media: Ethernet autoselect
	status: no carrier
tap1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=80000<LINKSTATE>
	ether xx:xx:xx:xx:xx:01
	nd6 options=69<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL,NO_RADR>
	media: Ethernet autoselect
	status: no carrier
vnet0:1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	description: associated with jail: "jailid goes here"
	options=8<VLAN_MTU>
	ether xx:xx:xx:xx:xx:e1
	inet6 fe80::ff:xxxx:xxxx:xxe1%vnet0:1 prefixlen 64 scopeid 0x9 
	nd6 options=61<PERFORMNUD,AUTO_LINKLOCAL,NO_RADR>
	media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
	status: active
Comment 2 Kubilay Kocak freebsd_committer freebsd_triage 2016-01-25 15:45:48 UTC
Is this a different panic or issue than the one in bug 206613?

Also, for future issues, please include large log outputs, configuration files or settings as attachments instead of comments.

Thanks!
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2016-01-25 15:47:17 UTC
Ignore my last question. This issue depends on bug 206613 (though technically they are duplicates).
Comment 4 roy 2016-01-26 08:48:22 UTC
Running FreeBSD-10.2 RELEASE #0 r286666 and dhcpcd-6.10.1 on amd64 I ran

ifconfig epair create

20 times when running dhcpcd for a single interface and 20 times when running dhcpcd for all interfaces and dhcpcd behaved as designed.

I have no VNET or iocage configs.

From the maintainer's perspective there seems to be no dhcpcd-specific bug here, and any fix is very likely within the kernel.
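
For anyone who wants to repeat the epair test described in this comment, it could be scripted roughly as below. This is only a sketch, not the exact procedure that was run: the function names, the IFCONFIG and EPAIR_DELAY variables, the one-second pause and the cleanup step are all assumptions added for illustration. It relies on "ifconfig epair create" printing the name of the new interface (for example epair0a) and must be run as root while dhcpcd is running.

```sh
#!/bin/sh
# Sketch only: repeat "ifconfig epair create" a number of times while dhcpcd
# is running. IFCONFIG, EPAIR_DELAY and both function names are illustrative
# and do not come from the original report.
IFCONFIG=${IFCONFIG:-ifconfig}

# Create COUNT epair pairs (default 20), printing the name of each "a" side.
epair_stress() {
    count=${1:-20}
    i=1
    while [ "$i" -le "$count" ]; do
        # ifconfig prints the name of the newly created "a" side on stdout.
        ifname=$($IFCONFIG epair create) || return 1
        echo "$ifname"
        # Assumed pause so dhcpcd has time to react to the new interface.
        sleep "${EPAIR_DELAY:-1}"
        i=$((i + 1))
    done
}

# Destroy the interfaces named on stdin, one per line. Destroying the "a"
# side of an epair removes its "b" side as well.
epair_cleanup() {
    while read -r ifname; do
        $IFCONFIG "$ifname" destroy
    done
}
```

A possible invocation, keeping the created names in a file so they can be removed afterwards: "epair_stress 20 > /tmp/epairs.txt" followed by "epair_cleanup < /tmp/epairs.txt".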
Comment 5 roy 2016-01-26 08:51:32 UTC
(In reply to g_amanakis from comment #1)

Could you also attach your dhcpcd.conf and syslog output regarding dhcpcd so we can see what it's doing when you crash please?
Comment 6 g_amanakis 2016-01-26 14:42:59 UTC
Created attachment 166148
rc.conf

My rc.conf.
Comment 7 g_amanakis 2016-01-26 14:44:06 UTC
Created attachment 166149
My dhcpcd.conf

My dhcpcd.conf. I am using IPv6 Prefix Delegation, but not for the epair interfaces.
Comment 8 g_amanakis 2016-01-26 23:30:04 UTC
I tried my dhcpcd.conf configuration on a vanilla USB install of "FreeBSD-10.2-RELEASE-amd64-uefi-memstick.img" and I can reproduce the issue. This was on bare metal hardware.
Comment 9 g_amanakis 2016-01-31 20:19:05 UTC
The issue seems resolved in the 10-STABLE kernel (as of r295091).