Summary: | (reclassified by bugmeister as honeypot) | | |
---|---|---|---|
Product: | Other | Reporter: | Brendan Molloy <brendan+freebsd> |
Component: | Spam | Assignee: | Bugmeister <bugmeister> |
Status: | Closed FIXED | | |
Severity: | Affects Only Me | Keywords: | honeypot |
Priority: | --- | | |
Version: | unspecified | | |
Hardware: | amd64 | | |
OS: | Any | | |
Bug Depends on: | 205263 | | |
Bug Blocks: | | | |
Attachments: |
|
Created attachment 164035 [details]
Screen at time of hang (interlaced messages showing boot order issue)
Thank you for the detailed report Brendan!

Hi, Try MFC-ing r288265 to 10-stable. https://svnweb.freebsd.org/changeset/base/288265 --HPS

I noticed a way to reproduce this issue while trying to fix some of my firewall rules. Block all connections with a firewall (I used ipfilter). It will cause the hang at /etc/rc.d/mountcritremote when attempting to mount an NFS mount. Once I fixed my rules, it still hangs for me on boot even with firewalls disabled, but eventually recovers when it retries after the network interface becomes available.

(In reply to Hans Petter Selasky from comment #3)
I don't think r288265 will help -- it delays mounting root before init even begins, for diskless operation with an nfs root. (I should still MFC it tho.) Since the system isn't configured for nfsroot, it won't come into play. In this case it's a matter of the network rc scripts running before the usb network interface is available, and I don't think we have any mechanism for waiting for NICs to arrive.

One thing that will work, but it's more of a workaround than a fix, is to set kern.cam.boot_delay=nnnnnn in loader.conf. The delay is in milliseconds, so something like 10000 is probably enough.

OK, so is this a USB issue then?

(In reply to Hans Petter Selasky from comment #6)
I don't think it's a problem in the sense of there being some usb driver code that can be changed to make it work -- it's an rc-scripts problem.

Oh, interesting... I've just discovered the existence of /etc/rc.d/netwait. It looks like it's designed to handle exactly this situation... mountcritremote waits for netwait, and netwait can be configured to use a specific interface to ping an ip address and it doesn't complete until it gets a response or times out. I think you just add netwait_enable=YES, netwait_if=<whatever>, netwait_ip=<ip to ping>.

If the interface is dhcp instead of static, just change it to SYNCDHCP in rc.conf

I tried the netwait method:

Dec 11 11:27:33 aerie kernel: Waiting for ue0 to have link
Dec 11 11:27:33 aerie kernel: /etc/rc: ERROR: ifconfig ue0 failed
Dec 11 11:27:33 aerie kernel: Mounting NFS file systems:[tcp] 10.0.0.4:/nfs/BorgBackups: RPCPROG_NFS: RPC: Port mapper failure - RPC: Unable to send

The interface does not exist at all until after the script had already run, so maybe the script should have an option to be more patient with errors?

Because I love FreeBSD, I am providing further information on this issue at 1am on a Saturday on Hans' and Kubilay's behalf. I was asked to add hw.usb.axe.debug=16 to sysctl.conf, and I rebooted. The only difference in logs was this:

Dec 12 01:01:34 aerie kernel: ue0: <USB Ethernet> on axe0
Dec 12 01:01:34 aerie kernel: ue0: Ethernet address: 00:50:b6:16:30:de
Dec 12 01:01:35 aerie ntpd[601]: ntpd 4.2.8p3-a (1): Starting
Dec 12 01:01:35 aerie kernel: .
Dec 12 01:01:36 aerie kernel: axe_bulk_write_callback: transfer complete
Dec 12 01:02:05 aerie last message repeated 10 times
Dec 12 01:02:53 aerie last message repeated 28 times

(In reply to Brendan Molloy from comment #8)
Oh! I should have realized... the interface doesn't exist until devd comes along and creates it, and that's too late. I think you can fix that by adding to /boot/loader.conf: if_axe_load=YES and you'll probably still need either SYNCDHCP or netwait_enable for static ip.

(In reply to Ian Lepore from comment #10)
I tried this, and you are right, I still needed netwait. The patch I provided in bug #205263 makes netwait more resilient to late loading of the modules, as it doesn't immediately scream about no interface when no iface is yet found. The timeout then gives enough time for the module to load at its own pace, while not requiring any changes to /boot/loader.conf.

If my patch is clean enough, it provides a solution in line with the principle of least astonishment to a problem that may become more prevalent in the future, as many modern laptops no longer include ethernet ports and users rely on USB or Thunderbolt ethernet adapters.

Created attachment 164335 [details]
Updated diff for netwait script
I was updating the comments at the top of the script (seems to be the only "documentation" for netwait), and while describing how you could provide a list of interfaces to wait for, I realized it only allowed 1 interface. I figured someone could have usb wifi and usb wired NIC or multiple NICs or whatever, so I updated the script to handle a list of interfaces.
I can only test the failure path (interface never arrives), it needs a test with real late-arriving hardware.
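The list handling described above can be sketched roughly as follows. This is only an illustration of the idea (poll each named interface until it shows up or a per-interface timeout expires), not the code from the attached diff; the names wait_for_all and iface_exists are hypothetical, and the one-second polling interval is an assumption.

```sh
#!/bin/sh
# Hypothetical helper, NOT the committed netwait code.
# Usage: wait_for_all TIMEOUT PROBE NAME...
# Runs "PROBE NAME" about once per second for each NAME until it succeeds,
# giving up on that NAME after TIMEOUT seconds. Returns 0 only if every
# NAME was eventually seen.
wait_for_all()
{
	timeout=$1
	probe=$2
	shift 2
	rc=0
	for name in "$@"; do
		count=0
		until "$probe" "$name" >/dev/null 2>&1; do
			if [ "$count" -ge "$timeout" ]; then
				echo "Timed out waiting for $name" >&2
				rc=1
				break
			fi
			sleep 1
			count=$((count + 1))
		done
	done
	return $rc
}

# Example probe (assumption): an interface "exists" once ifconfig can
# describe it without error.
iface_exists()
{
	ifconfig "$1"
}
```

Something like `wait_for_all 30 iface_exists ue0` would then keep retrying instead of failing on the first "ifconfig ue0 failed", which is the behaviour the late-arriving hardware case needs.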
I have turned the latest diff into a review at https://reviews.freebsd.org/D4608

A commit references this bug:

Author: ian
Date: Sat Dec 26 18:21:33 UTC 2015
New revision: 292752
URL: https://svnweb.freebsd.org/changeset/base/292752

Log:
  Enhance rc.d/netwait script to wait for late-attaching interfaces such as USB NICs.

  USB network hardware may not be enumerated and available when the rc.d networking scripts run. Eventually the USB attachment completes and devd events cause the network initialization to happen, but by then other rc.d scripts have already failed, because services which depend on NETWORKING (such as mountcritremote) may end up running before the network is actually ready.

  There is an existing netwait script, but because it is dependent on NETWORKING it runs too late to prevent failure of some other rc scripts. This change flips the order so that NETWORKING depends on netwait, and netwait now depends on devd and routing (the former is needed to make interfaces appear, and the latter is needed to run the ping tests in netwait).

  The netwait script used to be oriented primarily towards "as soon as any host is reachable the network is fully functional", so you gave it a list of IPs to try and you could optionally name an interface and it would wait for carrier on that interface. That functionality still works the same, but now you can provide a list of interfaces to wait for and it waits until each one of them is available. The ping logic still completes as soon as the first IP on the list responds.

  These changes were submitted by Brenden Molloy <brendan+freebsd@bbqsrc.net> in PR 205186, and lightly modified by me to allow a list of interfaces instead of just one.

  PR: 205186
  Differential Revision: https://reviews.freebsd.org/D4608 (timeout w/o review)

Changes:
  head/etc/defaults/rc.conf
  head/etc/rc.d/NETWORKING
  head/etc/rc.d/netwait

A commit references this bug:

Author: ian
Date: Sun Jan 24 19:41:32 UTC 2016
New revision: 294680
URL: https://svnweb.freebsd.org/changeset/base/294680

Log:
  MFC r292752: Enhance rc.d/netwait script to wait for late-attaching interfaces such as USB NICs.

  USB network hardware may not be enumerated and available when the rc.d networking scripts run. Eventually the USB attachment completes and devd events cause the network initialization to happen, but by then other rc.d scripts have already failed, because services which depend on NETWORKING (such as mountcritremote) may end up running before the network is actually ready.

  There is an existing netwait script, but because it is dependent on NETWORKING it runs too late to prevent failure of some other rc scripts. This change flips the order so that NETWORKING depends on netwait, and netwait now depends on devd and routing (the former is needed to make interfaces appear, and the latter is needed to run the ping tests in netwait).

  The netwait script used to be oriented primarily towards "as soon as any host is reachable the network is fully functional", so you gave it a list of IPs to try and you could optionally name an interface and it would wait for carrier on that interface. That functionality still works the same, but now you can provide a list of interfaces to wait for and it waits until each one of them is available. The ping logic still completes as soon as the first IP on the list responds.

  These changes were submitted by Brenden Molloy <brendan+freebsd@bbqsrc.net> in PR 205186, and lightly modified by me to allow a list of interfaces instead of just one.

  PR: 205186
  Relnotes: yes

Changes:
  _U stable/10/
  stable/10/etc/defaults/rc.conf
  stable/10/etc/rc.d/NETWORKING
  stable/10/etc/rc.d/netwait

MARKED AS SPAM
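For reference, an rc.conf fragment using the netwait variables named earlier in this PR might look like the following. The values are illustrative: ue0 and 10.0.0.4 come from the logs above, wlan0 is a made-up second interface, and the space-separated list form is an assumption based on the commit log's description of "a list of interfaces".

```sh
# /etc/rc.conf fragment (illustrative values, not a tested configuration)
netwait_enable="YES"
# Interface(s) to wait for. After r292752 this is described as a list;
# a space-separated form such as "ue0 wlan0" is assumed here.
netwait_if="ue0"
# Address(es) to ping; per the commit log, netwait completes as soon as
# the first address on the list responds.
netwait_ip="10.0.0.4"
```

With this in place, the commit's reordering means NETWORKING (and therefore mountcritremote) should not proceed until netwait has either seen the interface and a ping reply or timed out.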
Triage: attempt to de-orbit this PR. Original title: "USB ethernet device with NFS mount causes boot hang (startup order)" Original assignee: hans-petter (RIP) |
Created attachment 164034 [details]
console.log output

Today I've set up an old Inspiron 1525 laptop as a FreeBSD server. The internal NIC had died, so I am using an external NIC connected via USB. To my delight, FreeBSD detected it immediately and it just worked. It was wonderful!

However, troubles began when I added an NFS mount to my /etc/fstab. Upon booting, due to the fact the USB device isn't recognised and connected until after the booting process has finished, booting hangs with an error.

Dec 10 21:16:40 aerie kernel: Mounting NFS file systems:[tcp] 10.0.0.4:/nfs/BorgBackups: RPCPROG_NFS: RPC: Port mapper failure - RPC: Unable to send

Simply waiting, it will repeat this error after a while. I decided to try Ctrl+C, and booting completed. That caused this line in the console.log:

Dec 10 21:16:40 aerie kernel: Script /etc/rc.d/mountcritremote interrupted