Created attachment 162483 [details] Screenshot of crash I've never been able to get isboot working on 11-CURRENT. Now that 11-CURRENT has 64-bit Linux support and Docker, I'm very interested in getting this working. The author of isboot, Daisuke Aoyama, suggested that the following commit may be the root of the problem: https://svnweb.freebsd.org/base/head/sys/netinet/in.c?r1=257692&r2=257691&pathrev=257692 I'm completely unfamiliar with kernel coding or networking coding, so I'm not going to be of much help in patching, but I can certainly help test patches. First off, is anyone able to run this on 11-CURRENT? If not, does anyone have any patches I can test?
Hi Yonas. I would like isboot (to continue?) to work in -CURRENT but I'm also not a kernel programmer. I've added glebius@ to the CC since he committed the revision you reference. Gleb, any idea what's causing the panic in the screenshot or how to patch isboot?
The fix for this particular panic would be to provide the value of curthread as the last argument in the call to in_control() in isboot code. This is in isboot.c lines 222, 233, 242. Yonas, John, can you please try that? I'm not sure that this will be the last problem. In general in_control() isn't a kernel interface, it expects to be called only via ifioctl() in userland context. P.S. I also sent email to the author of isboot, asking him to subscribe to this bug.
root cause is that 11-current never check td=NULL case at in_aifaddr_ioctl() which is introduced by r257692. previous in_control() will skip td=NULL case like this: > case SIOCALIFADDR: > if (td != NULL) { > error = priv_check(td, PRIV_NET_ADDIFADDR); > if (error) > return (error); > } I don't know this check is needed.
In the kernel there is no code that calls in_control() without specifying thread, and the function is designed to be run in context of a syscall. So, if you want to use the function, you should supply thread argument to it. In this particular case using just curthread should work fine.
Created attachment 162797 [details] dmesg after applying the fix in comment #2 Gleb, that actually worked! The only issue is that a bunch of stack traces were printed in a loop for about 20 seconds before finally completing the boot up. The output of `dmesg` is attached.
I saw "WARNING: WITNESS option enabled, expect reduced performance." in the dmesg output. Is this just verbose debugging info caused by the WITNESS option?
Yes, all these warnings come from WITNESS. And the problem is the same: isboot calls many kernel functions that are designed only for sleepable context in a non-sleepable context.
(In reply to Gleb Smirnoff from comment #4) Using curthread is reasonable. But I don't understand why you remove existing NULL check and leave it in in_difaddr_ioctl().
(In reply to Gleb Smirnoff from comment #7) Yes, I use M_WAITOK but replace with M_NOWAIT now. (also eliminate unsigned warning.)
I agree, having the check in in_difaddr seems incosistent. I will recheck that.
(In reply to Yonas Yanfa from comment #5) I have just uploaded 0.1.12. It sill print: >xpt_release_devq(): requested 1 > present 0 However, it should work on 11-current. http://www.peach.ne.jp/archives/isboot/isboot-0.2.12.tar.gz Thank you.
(In reply to aoyama from comment #11) Thanks, it works great :)
(In reply to aoyama from comment #11) Ok solved. Index: iscsi.c =================================================================== --- iscsi.c (revision 36) +++ iscsi.c (working copy) @@ -2067,6 +2067,7 @@ memset(&ccb, 0, sizeof(ccb)); xpt_setup_ccb(&ccb.ccb_h, path, /* priority */1); ccb.ccb_h.func_code = XPT_REL_SIMQ; + ccb.ccb_h.flags = CAM_DEV_QFREEZE; ccb.crs.release_flags = RELSIM_ADJUST_OPENINGS; if (sess->opt.tags > 1) ccb.crs.openings = sess->opt.tags - 1; Now you can see two lock order reversal at boot time.
(In reply to John Nielsen from comment #1) Hi John, I have uploaded 0.2.13 and committed to primary location: https://sourceforge.net/p/nas4free/code/1995/ Please use 0.2.13 for net/isboot-kmod. Thanks.
(In reply to aoyama from comment #14) Thanks Daisuke!
Thanks all. I have submitted a new version of the port in ports/204313. I believe this PR can be closed.