Using bhyve with nmdm for console access, I am able to cause a kernel panic. I am using GENERIC FreeBSD 10.0-p7. The panic is repeatable.

Jul 29 19:37:18 jail5 kernel: panic: make_dev_credv: bad si_name (error=17, si_name=nmdm397A)
Jul 29 19:37:18 jail5 kernel: cpuid = 3
Jul 29 19:37:18 jail5 kernel: KDB: stack backtrace:
Jul 29 19:37:18 jail5 kernel: #0 0xffffffff808e7e90 at kdb_backtrace+0x60
Jul 29 19:37:18 jail5 kernel: #1 0xffffffff808af975 at panic+0x155
Jul 29 19:37:18 jail5 kernel: #2 0xffffffff80865c68 at make_dev_credv+0x2e8
Jul 29 19:37:18 jail5 kernel: #3 0xffffffff80865ccd at make_dev_cred+0x5d
Jul 29 19:37:18 jail5 kernel: #4 0xffffffff8090d69b at tty_makedev+0x10b
Jul 29 19:37:18 jail5 kernel: #5 0xffffffff81c001fc at nmdm_clone+0x16c
Jul 29 19:37:18 jail5 kernel: #6 0xffffffff807ac974 at devfs_lookup+0x3f4
Jul 29 19:37:18 jail5 kernel: #7 0xffffffff80d97cd2 at VOP_LOOKUP_APV+0x92
Jul 29 19:37:18 jail5 kernel: #8 0xffffffff8093ff7b at lookup+0x58b
Jul 29 19:37:18 jail5 kernel: #9 0xffffffff8093f704 at namei+0x504
Jul 29 19:37:18 jail5 kernel: #10 0xffffffff80953235 at kern_statat_vnhook+0xa5
Jul 29 19:37:18 jail5 kernel: #11 0xffffffff809530cd at sys_stat+0x2d
Jul 29 19:37:18 jail5 kernel: #12 0xffffffff80c8f127 at amd64_syscall+0x357
Jul 29 19:37:18 jail5 kernel: #13 0xffffffff80c7581b at Xfast_syscall+0xfb

This, of course, stops all bhyve virts and reboots the system. There are nmdm397A and nmdm397B entries in the /dev directory.

Another time:

Jul 29 19:17:57 jail5 kernel: panic: make_dev_credv: bad si_name (error=17, si_name=nmdm397A)
Jul 29 19:17:57 jail5 kernel: cpuid = 4
Jul 29 19:17:57 jail5 kernel: KDB: stack backtrace:
Jul 29 19:17:57 jail5 kernel: #0 0xffffffff808e7e90 at kdb_backtrace+0x60
Jul 29 19:17:57 jail5 kernel: #1 0xffffffff808af975 at panic+0x155
Jul 29 19:17:57 jail5 kernel: #2 0xffffffff80865c68 at make_dev_credv+0x2e8
Jul 29 19:17:57 jail5 kernel: #3 0xffffffff80865ccd at make_dev_cred+0x5d
Jul 29 19:17:57 jail5 kernel: #4 0xffffffff8090d69b at tty_makedev+0x10b
Jul 29 19:17:57 jail5 kernel: #5 0xffffffff81c001fc at nmdm_clone+0x16c
Jul 29 19:17:57 jail5 kernel: #6 0xffffffff807ac974 at devfs_lookup+0x3f4
Jul 29 19:17:57 jail5 kernel: #7 0xffffffff80d97cd2 at VOP_LOOKUP_APV+0x92
Jul 29 19:17:57 jail5 kernel: #8 0xffffffff8093ff7b at lookup+0x58b
Jul 29 19:17:57 jail5 kernel: #9 0xffffffff8093f704 at namei+0x504
Jul 29 19:17:57 jail5 kernel: #10 0xffffffff80957ed2 at vn_open_cred+0x232
Jul 29 19:17:57 jail5 kernel: #11 0xffffffff80951671 at kern_openat+0x261
Jul 29 19:17:57 jail5 kernel: #12 0xffffffff80c8f127 at amd64_syscall+0x357
Jul 29 19:17:57 jail5 kernel: #13 0xffffffff80c7581b at Xfast_syscall+0xfb

And again:

Jul 29 19:04:45 jail5 kernel: panic: make_dev_credv: bad si_name (error=17, si_name=nmdm397A)
Jul 29 19:04:45 jail5 kernel: cpuid = 5
Jul 29 19:04:45 jail5 kernel: KDB: stack backtrace:
Jul 29 19:04:45 jail5 kernel: #0 0xffffffff808e7e90 at kdb_backtrace+0x60
Jul 29 19:04:45 jail5 kernel: #1 0xffffffff808af975 at panic+0x155
Jul 29 19:04:45 jail5 kernel: #2 0xffffffff80865c68 at make_dev_credv+0x2e8
Jul 29 19:04:45 jail5 kernel: #3 0xffffffff80865ccd at make_dev_cred+0x5d
Jul 29 19:04:45 jail5 kernel: #4 0xffffffff8090d69b at tty_makedev+0x10b
Jul 29 19:04:45 jail5 kernel: #5 0xffffffff81c001fc at nmdm_clone+0x16c
Jul 29 19:04:45 jail5 kernel: #6 0xffffffff807ac974 at devfs_lookup+0x3f4
Jul 29 19:04:45 jail5 kernel: #7 0xffffffff80d97cd2 at VOP_LOOKUP_APV+0x92
Jul 29 19:04:45 jail5 kernel: #8 0xffffffff8093ff7b at lookup+0x58b
Jul 29 19:04:45 jail5 kernel: #9 0xffffffff8093f704 at namei+0x504
Jul 29 19:04:45 jail5 kernel: #10 0xffffffff80953235 at kern_statat_vnhook+0xa5
Jul 29 19:04:45 jail5 kernel: #11 0xffffffff809530cd at sys_stat+0x2d
Jul 29 19:04:45 jail5 kernel: #12 0xffffffff80c8f127 at amd64_syscall+0x357
Jul 29 19:04:45 jail5 kernel: #13 0xffffffff80c7581b at Xfast_syscall+0xfb
Assign to kernel not ports
Dave, you should put your "uname -a" information
(In reply to John Marino from comment #2)
> Dave, you should put your "uname -a" information

Here it is.

FreeBSD jail5.johncompanies.com 10.0-RELEASE-p7 FreeBSD 10.0-RELEASE-p7 #0: Tue Jul 8 06:37:44 UTC 2014 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
Looking at this. Dave: would you be able to outline how to repro this? Lots of VMs starting up at the same time?
Hi Peter,

I had 3 VMs running with a slightly modified vmrc. The VMs had names like col02325. When I tried to start a 4th VM, it happened to have a name like col00345. I was using the last 4 digits as the nmdm device number. Apparently a leading zero in the nmdm number caused a problem, and the whole system crashed. I changed the VM name to col05345, and it worked.

To my way of thinking, nothing that can be entered on a command line should be able to cause a kernel crash.

Let me know if you need more information. I even have a couple of crash dumps.

Dave Smith
Thanks - yes, easy repro on 10. Doesn't crash on CURRENT, but the bug is still there in that the leading zeros are stripped when the devfs entry is created. This in turn means that it isn't possible to connect to the 'B' end using the same text string. After a chat with Neel, the proposed solution will be to allow any valid devfs character string between the 'nmdm' and trailing 'A' or 'B'. That way you can even encode the VM name in the nmdm device name :) I'll see if I can get this into CURRENT shortly, and then MFC in time for 10.1.
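To make the stripping concrete, here is a minimal userland sketch in C of the old round trip: the digits after 'nmdm' are parsed as a unit number and the unit is printed back into the name. The helper old_style_name() is hypothetical and written for illustration only; it is not the code in nmdm.c. A mismatch between the requested and created names would be consistent with the error=17 (EEXIST) in the panic message, since a later lookup of the zero-padded name would try to create a node that already exists, but that reading is an assumption.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not the nmdm.c code: mimic the old scheme by
 * parsing the digits after "nmdm" as a unit number and printing the
 * unit back into the name.
 * Returns 0 on success, -1 if the name is not nmdm<digits>A or
 * nmdm<digits>B.
 */
static int
old_style_name(const char *requested, char *created, size_t len)
{
	char *end;
	unsigned long unit;

	if (strncmp(requested, "nmdm", 4) != 0)
		return (-1);
	if (!isdigit((unsigned char)requested[4]))
		return (-1);
	unit = strtoul(requested + 4, &end, 10);
	if ((*end != 'A' && *end != 'B') || end[1] != '\0')
		return (-1);
	/* Leading zeros are lost here: the unit 397 prints as "397". */
	snprintf(created, len, "nmdm%lu%c", unit, *end);
	return (0);
}
```

With this sketch, a request for nmdm0397A yields nmdm397A, so the name that was asked for never appears, while a request for nmdm5345A round-trips unchanged.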
+1 Being able to put the VM name in the nmdm device name is super useful.
A commit references this bug:

Author: grehan
Date: Wed Sep 10 05:44:16 UTC 2014
New revision: 271350
URL: http://svnweb.freebsd.org/changeset/base/271350

Log:
  Fix issue with nmdm and leading zeros in device name.

  The nmdm code enforces a number between the 'nmdm' and 'A|B' portions
  of the device name. This is then used as a unit number, and sprintf'd
  back into the tty name. If leading zeros were used in the name, the
  created device name is different than the string used for the
  clone-open (e.g. /dev/nmdm0001A will result in /dev/nmdm1A).

  Since unit numbers are no longer required with the updated tty code,
  there seems to be no reason to force the string to be a number. The fix
  is to allow an arbitrary string between 'nmdm' and 'A|B', within the
  constraints of devfs names. This allows all existing user of numeric
  strings to continue to work, and also allows more meaningful names to
  be used, such as bhyve VM names.

  Tested on amd64, i386 and ppc64.

  Reported by: Dave Smith
  PR: 192281
  Reviewed by: neel, glebius
  Phabric: D729
  MFC after: 3 days

Changes:
  head/sys/dev/nmdm/nmdm.c
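As a rough illustration of the naming rule described in the log above, the following C sketch splits a name of the form nmdm<id>A or nmdm<id>B and keeps <id> byte for byte, so the requested name and the stored name cannot drift apart. The function split_nmdm_name() is hypothetical and is not the committed nmdm.c change. The only devfs constraint it models is an assumed one (non-empty id, no '/'); the real constraints may differ.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch, not the committed nmdm.c code: split a name of
 * the form nmdm<id>A or nmdm<id>B into <id> and the end letter,
 * keeping <id> exactly as given.
 * Returns 0 on success, -1 if the name does not match.
 */
static int
split_nmdm_name(const char *name, char *id, size_t idlen, char *side)
{
	size_t len, n;

	if (strncmp(name, "nmdm", 4) != 0)
		return (-1);
	len = strlen(name);
	if (len < 6)		/* "nmdm", at least one id byte, A|B */
		return (-1);
	if (name[len - 1] != 'A' && name[len - 1] != 'B')
		return (-1);
	n = len - 5;		/* length of <id> */
	/* Assumed rule: <id> must fit the buffer and contain no '/'. */
	if (n >= idlen || memchr(name + 4, '/', n) != NULL)
		return (-1);
	memcpy(id, name + 4, n);
	id[n] = '\0';
	*side = name[len - 1];
	return (0);
}
```

Under this sketch, nmdm0397A splits into the id "0397" and side 'A', and nmdmcol00345B splits into "col00345" and 'B', so both ends of a pair can be found with the same text string.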
A commit references this bug:

Author: grehan
Date: Thu Sep 18 19:20:09 UTC 2014
New revision: 271800
URL: http://svnweb.freebsd.org/changeset/base/271800

Log:
  MFC nmdm driver changes, r259550 and r271350

  r259550 (glebius):
  Make nmdm(4) destroy devices when both sides of a pair are disconnected.
  This makes it possible to kldunload nmdm.ko when there are no users of it.

  r271350:
  Fix issue with nmdm and leading zeros in device name.

  The nmdm code enforces a number between the 'nmdm' and 'A|B' portions
  of the device name. This is then used as a unit number, and sprintf'd
  back into the tty name. If leading zeros were used in the name, the
  created device name is different than the string used for the
  clone-open (e.g. /dev/nmdm0001A will result in /dev/nmdm1A).

  Since unit numbers are no longer required with the updated tty code,
  there seems to be no reason to force the string to be a number. The fix
  is to allow an arbitrary string between 'nmdm' and 'A|B', within the
  constraints of devfs names. This allows all existing user of numeric
  strings to continue to work, and also allows more meaningful names to
  be used, such as bhyve VM names.

  PR: 192281
  Approved by: re (glebius)

Changes:
_U  stable/10/
  stable/10/sys/dev/nmdm/nmdm.c
Fixed in CURRENT and in 10-stable.