Bug 204045 - net/isboot-kmod - Not working on 11-CURRENT
Summary: net/isboot-kmod - Not working on 11-CURRENT
Status: Closed Overcome By Events
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Only Me
Assignee: Gleb Smirnoff
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-26 20:49 UTC by Yonas Yanfa
Modified: 2015-11-05 21:47 UTC (History)
3 users (show)

See Also:
bugzilla: maintainer-feedback? (john)


Attachments
Screenshot of crash (26.99 KB, image/png)
2015-10-26 20:49 UTC, Yonas Yanfa
no flags Details
dmesg after applying the fix in comment #2 (86.10 KB, text/plain)
2015-11-05 00:43 UTC, Yonas Yanfa
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Yonas Yanfa 2015-10-26 20:49:34 UTC
Created attachment 162483 [details]
Screenshot of crash

I've never been able to get isboot working on 11-CURRENT. Now that 11-CURRENT has 64-bit Linux support and Docker, I'm very interested in getting this working.

The author of isboot, Daisuke Aoyama, suggested that the following commit may be the root of the problem:

https://svnweb.freebsd.org/base/head/sys/netinet/in.c?r1=257692&r2=257691&pathrev=257692

I'm completely unfamiliar with kernel coding or networking coding, so I'm not going to be of much help in patching, but I can certainly help test patches.

First off, is anyone able to run this on 11-CURRENT?

If not, does anyone have any patches I can test?
Comment 1 John Nielsen 2015-11-04 18:47:34 UTC
Hi Yonas. I would like isboot (to continue?) to work in -CURRENT but I'm also not a kernel programmer.

I've added glebius@ to the CC since he committed the revision you reference. Gleb, any idea what's causing the panic in the screenshot or how to patch isboot?
Comment 2 Gleb Smirnoff freebsd_committer freebsd_triage 2015-11-04 23:31:42 UTC
The fix for this particular panic would be to provide the value of curthread as the last argument in the call to in_control() in isboot code. This is in isboot.c lines 222, 233, 242.

Yonas, John, can you please try that?

I'm not sure that this will be the last problem. In general in_control() isn't a kernel interface, it expects to be called only via ifioctl() in userland context.

P.S. I also sent email to the author of isboot, asking him to subscribe to this bug.
Comment 3 aoyama 2015-11-05 00:06:01 UTC
root cause is that 11-current never check td=NULL case at in_aifaddr_ioctl() which is introduced by r257692.
previous in_control() will skip td=NULL case like this:

>          case SIOCALIFADDR:
>                  if (td != NULL) {
>                          error = priv_check(td, PRIV_NET_ADDIFADDR);
>                          if (error)
>                                  return (error);
>                  } 

I don't know this check is needed.
Comment 4 Gleb Smirnoff freebsd_committer freebsd_triage 2015-11-05 00:15:26 UTC
In the kernel there is no code that calls in_control() without specifying thread, and the function is designed to be run in context of a syscall.

So, if you want to use the function, you should supply thread argument to it. In this particular case using just curthread should work fine.
Comment 5 Yonas Yanfa 2015-11-05 00:43:29 UTC
Created attachment 162797 [details]
dmesg after applying the fix in comment #2

Gleb, that actually worked! The only issue is that a bunch of stack traces were printed in a loop for about 20 seconds before finally completing the boot up.

The output of `dmesg` is attached.
Comment 6 Yonas Yanfa 2015-11-05 00:48:27 UTC
I saw "WARNING: WITNESS option enabled, expect reduced performance." in the dmesg output. Is this just verbose debugging info caused by the WITNESS option?
Comment 7 Gleb Smirnoff freebsd_committer freebsd_triage 2015-11-05 07:35:59 UTC
Yes, all these warnings come from WITNESS. And the problem is the same: isboot calls many kernel functions that are designed only for sleepable context in a non-sleepable context.
Comment 8 aoyama 2015-11-05 14:38:02 UTC
(In reply to Gleb Smirnoff from comment #4)
Using curthread is reasonable.
But I don't understand why you remove existing NULL check and leave it in in_difaddr_ioctl().
Comment 9 aoyama 2015-11-05 14:41:21 UTC
(In reply to Gleb Smirnoff from comment #7)
Yes, I use M_WAITOK but replace with M_NOWAIT now.
(also eliminate unsigned warning.)
Comment 10 Gleb Smirnoff freebsd_committer freebsd_triage 2015-11-05 15:52:01 UTC
I agree, having the check in in_difaddr seems incosistent. I will recheck that.
Comment 11 aoyama 2015-11-05 16:23:00 UTC
(In reply to Yonas Yanfa from comment #5)
I have just uploaded 0.1.12. It sill print:

>xpt_release_devq(): requested 1 > present 0

However, it should work on 11-current.

http://www.peach.ne.jp/archives/isboot/isboot-0.2.12.tar.gz

Thank you.
Comment 12 Yonas Yanfa 2015-11-05 16:37:18 UTC
(In reply to aoyama from comment #11)

Thanks, it works great :)
Comment 13 aoyama 2015-11-05 16:48:40 UTC
(In reply to aoyama from comment #11)

Ok solved.

Index: iscsi.c
===================================================================
--- iscsi.c     (revision 36)
+++ iscsi.c     (working copy)
@@ -2067,6 +2067,7 @@
                memset(&ccb, 0, sizeof(ccb));
                xpt_setup_ccb(&ccb.ccb_h, path, /* priority */1);
                ccb.ccb_h.func_code = XPT_REL_SIMQ;
+               ccb.ccb_h.flags = CAM_DEV_QFREEZE;
                ccb.crs.release_flags = RELSIM_ADJUST_OPENINGS;
                if (sess->opt.tags > 1)
                        ccb.crs.openings = sess->opt.tags - 1;

Now you can see two lock order reversal at boot time.
Comment 14 aoyama 2015-11-05 17:05:30 UTC
(In reply to John Nielsen from comment #1)
Hi John,

I have uploaded 0.2.13 and committed to primary location:

https://sourceforge.net/p/nas4free/code/1995/

Please use 0.2.13 for net/isboot-kmod.

Thanks.
Comment 15 Yonas Yanfa 2015-11-05 18:46:32 UTC
(In reply to aoyama from comment #14)

Thanks Daisuke!
Comment 16 John Nielsen 2015-11-05 19:38:15 UTC
Thanks all. I have submitted a new version of the port in ports/204313. I believe this PR can be closed.