Removing a devfs symlink can cause a panic if a program had that symlink open and then exits. I use the comms/conserver port to manage a bunch of serial lines connected to a 32 port RocketPort config. To keep track of what each line is connected to, I create symlinks via "link" lines in devfs.conf. I disovered that if I remove link lines from devfs.conf, update devfs and then kill off a conserver that has one or more of the obsolete links open, the system will panic. How-To-Repeat: Append a line to devfs.conf: link cuaR00 test Update devfs and verify the new symlink: # /etc/rc.d/devfs restart # ls -l /dev/test /dev/cuaR00 crw-rw---- 1 uucp dialer 0, 40 Feb 12 19:45 /dev/cuaR00 lrwxr-xr-x 1 root wheel 6 Feb 12 19:45 /dev/test -> cuaR00 Add a line to conserver.cf: test:/dev/test:9600p:/var/console/test.log:0 Startup conserver. Remove out the link line previously added to devfs.conf. Update devfs again. Note that the /dev/test symlink still exists. Kill conserver. The system panics with: device_unbusy: called for non-busy device rp0 kgdb shows: (kgdb) bt #0 0xc0692c96 in doadump () #1 0xc06931f2 in boot () #2 0xc0693519 in panic () #3 0xc06a9304 in device_unbusy () #4 0xc05ca8e9 in rpclose () #5 0xc06c6617 in ttyclose () #6 0xc066878b in giant_close () #7 0xc0644876 in devfs_close () #8 0xc08ae164 in VOP_CLOSE_APV () #9 0xc06fa8c6 in vn_close () #10 0xc06fb836 in vn_closefile () #11 0xc06448b7 in devfs_close_f () #12 0xc0672a04 in fdrop_locked () #13 0xc0672951 in fdrop () #14 0xc0670eef in closef () #15 0xc066e04d in close () #16 0xc08998f3 in syscall () #17 0xc08853df in Xint0x80_syscall () #18 0x00000033 in ?? () Previous frame inner to this frame (corrupt stack?)
On Tue, Feb 13, 2007 at 09:34:53PM -0800, Craig Leres wrote: > > % uname -a > FreeBSD ee.lbl.gov 6.2-RELEASE FreeBSD 6.2-RELEASE #1: Mon Jan 15 11:31:32 PST 2007 leres@fox.ee.lbl.gov:/usr/src/6.2-RELEASE/sys/i386/compile/LBLSMP i386 > > rp0: <RocketPort PCI> port 0x3000-0x30ff irq 19 at device 7.0 on pci2 > RocketPort0 (Version 3.02) 32 ports. > > >Description: > Removing a devfs symlink can cause a panic if a program had > that symlink open and then exits. > > I use the comms/conserver port to manage a bunch of serial > lines connected to a 32 port RocketPort config. To keep > track of what each line is connected to, I create symlinks > via "link" lines in devfs.conf. I disovered that if I remove > link lines from devfs.conf, update devfs and then kill off > a conserver that has one or more of the obsolete links open, > the system will panic. > Hello, This does not really feel like a bug but rather a configuration failure from the administrator (you in this case). When you have active connections, applications assume that they can have IO with (in this case a symlinked device). When you remove that device the application can no longer access that IO and in this case kills the system. The latter of this might be a problem (though I really feel like "dont do that!"): What we need prior to continueing is a dump of the panic, you can read how to do that on the http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html documentation pages. Thanks in advance ! -- Kind regards, Remko Lodder ** remko@elvandar.org FreeBSD ** remko@FreeBSD.org /* Quis custodiet ipsos custodes */
> This does not really feel like a bug but rather a configuration > failure from the administrator (you in this case). (Not a bug? Seriously?) Here's some additional info. I ran the scenario with a sio port; the system does not crash, /dev/modem does not go away and tip can still communicate with the modem after devfs is updated. (As expected.) I also ran the scenario with an puc port which worked the same as with the sio port. Finally, I reran the scenario with the RocketPort and found it's not necessary to make any changes to devfs.conf to induce a panic. You can simply: tip to a rocketport line run "/etc/rc.d/devfs restart" exit tip (wait for the system to reboot) I think the issue is a side effect of refreshing the devfs layout is that the state is getting reset. The rp driver seems to be the only one using device_busy/device_unbusy in a manner that's sensitive to the dev state. > What we need prior to continueing is a dump of > the panic, I used a crash dump to create the kgdb stack trace. Craig
The problem is a lot more serious than I first reported. Over the last few weeks I've found it pretty easy to panic the system. I did more surfing and found this freebsd-hackers thread: http://groups.google.com/group/mailing.freebsd.hackers/browse_frm/thread/883da63a8c62854d Ultimately, Doug Ambrisko posts a patch against RELENG_6. I wasn't able to recover the patch from his posting but he also committed the changes which are available via the cvs web interface: http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/dev/rp/rp.c?rev=1.67.2.2&content-type=text/plain http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/dev/rp/rp_isa.c?rev=1.7.2.1&content-type=text/plain http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/dev/rp/rp_pci.c?rev=1.11.2.2&content-type=text/plain http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/dev/rp/rpreg.h?rev=1.7.2.1&content-type=text/plain http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/dev/rp/rpvar.h?rev=1.8.2.1&content-type=text/plain It looks good! I've been unable to generate a device_unbusy() panic with the resulting kernel. Craig
I was still able to crash Doug Ambrisko's version of the rp driver. I think the problem is that in some cases rp_handle_port() calls rpclose() which calls device_unbusy(). Later rpclose() is called again and we hit the panic. I looked at the last known good version of the driver I'd used, 1.45.2.2 from 4.10-RELEASE, and found it has code to avoid calling rpclose() twice. I did something similar that seems to work. Note: The appended diffs are against version 1.67.2.2 of rp.c. Craig =================================================================== RCS file: RCS/rp.c,v retrieving revision 1.2 retrieving revision 1.3 diff -c -r1.2 -r1.3 *** rp.c 2007/03/08 04:07:10 1.2 --- rp.c 2007/03/14 02:23:17 1.3 *************** *** 573,578 **** --- 573,579 ---- static void rpbreak(struct tty *, int); static void rpclose(struct tty *tp); + static void rphardclose(struct tty *tp); static int rpmodem(struct tty *, int, int); static int rpparam(struct tty *, struct termios *); static void rpstart(struct tty *); *************** *** 697,703 **** if((tp->t_state & TS_CARR_ON)) { (void)ttyld_modem(tp, 0); if(ttyld_modem(tp, 0) == 0) { ! rpclose(tp); } } } --- 698,704 ---- if((tp->t_state & TS_CARR_ON)) { (void)ttyld_modem(tp, 0); if(ttyld_modem(tp, 0) == 0) { ! rphardclose(tp); } } } *************** *** 936,941 **** --- 937,952 ---- rpclose(struct tty *tp) { struct rp_port *rp; + + rp = tp->t_sc; + rphardclose(tp); + device_unbusy(rp->rp_ctlp->dev); + } + + static void + rphardclose(struct tty *tp) + { + struct rp_port *rp; CHANNEL_t *cp; rp = tp->t_sc; *************** *** 959,965 **** tp->t_actout = FALSE; wakeup(&tp->t_actout); wakeup(TSA_CARR_ON(tp)); - device_unbusy(rp->rp_ctlp->dev); } static void --- 970,975 ----
Craig Leres writes: | I was still able to crash Doug Ambrisko's version of the rp driver. | I think the problem is that in some cases rp_handle_port() calls | rpclose() which calls device_unbusy(). Later rpclose() is called | again and we hit the panic. | | I looked at the last known good version of the driver I'd used, | 1.45.2.2 from 4.10-RELEASE, and found it has code to avoid calling | rpclose() twice. I did something similar that seems to work. | | Note: The appended diffs are against version 1.67.2.2 of rp.c. | | Craig | | =================================================================== | RCS file: RCS/rp.c,v | retrieving revision 1.2 | retrieving revision 1.3 | diff -c -r1.2 -r1.3 | *** rp.c 2007/03/08 04:07:10 1.2 | --- rp.c 2007/03/14 02:23:17 1.3 | *************** | *** 573,578 **** | --- 573,579 ---- | | static void rpbreak(struct tty *, int); | static void rpclose(struct tty *tp); | + static void rphardclose(struct tty *tp); | static int rpmodem(struct tty *, int, int); | static int rpparam(struct tty *, struct termios *); | static void rpstart(struct tty *); | *************** | *** 697,703 **** | if((tp->t_state & TS_CARR_ON)) { | (void)ttyld_modem(tp, 0); | if(ttyld_modem(tp, 0) == 0) { | ! rpclose(tp); | } | } | } | --- 698,704 ---- | if((tp->t_state & TS_CARR_ON)) { | (void)ttyld_modem(tp, 0); | if(ttyld_modem(tp, 0) == 0) { | ! rphardclose(tp); | } | } | } | *************** | *** 936,941 **** | --- 937,952 ---- | rpclose(struct tty *tp) | { | struct rp_port *rp; | + | + rp = tp->t_sc; | + rphardclose(tp); | + device_unbusy(rp->rp_ctlp->dev); | + } | + | + static void | + rphardclose(struct tty *tp) | + { | + struct rp_port *rp; | CHANNEL_t *cp; | | rp = tp->t_sc; | *************** | *** 959,965 **** | tp->t_actout = FALSE; | wakeup(&tp->t_actout); | wakeup(TSA_CARR_ON(tp)); | - device_unbusy(rp->rp_ctlp->dev); | } | | static void | --- 970,975 ---- | I wonder if this should be a reference count? Good find. Doug A.
Responsible Changed From-To: freebsd-bugs->remko There are reports that the information in this ticket resolves the problem, Mark requested a source committer to fix this, make it happen (I will do it)
remko 2007-06-26 13:50:48 UTC FreeBSD src repository Modified files: sys/dev/rp rp.c Log: Fix Rocketport so that it does not crash the system when a device pointer changes for example: (From Craig Leres): tip to a rocketport line run "/etc/rc.d/devfs restart" exit tip (wait for the system to reboot) Thanks to Robert Watson for poking me to fix this. PR: kern/109152 Approved by: imp (mentor) Approved by: re (kensmith) Reviewed by: jhb Submitted by: Craig Leres <leres@ee dot lbl dot gov> Revision Changes Path 1.73 +12 -2 src/sys/dev/rp/rp.c _______________________________________________ cvs-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/cvs-all To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
State Changed From-To: open->patched Patched in -CURRENT, will MFC in a few days.
remko 2007-07-03 20:35:07 UTC FreeBSD src repository Modified files: (Branch: RELENG_6) sys/dev/rp rp.c Log: MFC rp.c rev 1.73 Fix Rocketport so that it does not crash the system when a device pointer changes for example: (From Craig Leres): tip to a rocketport line run "/etc/rc.d/devfs restart" exit tip (wait for the system to reboot) Thanks to Robert Watson for poking me to fix this. PR: kern/109152 Approved by: imp (mentor) Approved by: re (kensmith) Reviewed by: jhb Submitted by: Craig Leres <leres@ee dot lbl dot gov> Approved by: imp (mentor, implicit) Revision Changes Path 1.67.2.3 +12 -2 src/sys/dev/rp/rp.c _______________________________________________ cvs-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/cvs-all To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
State Changed From-To: patched->closed After some more consideration, I wont MFC this to RELENG_5. I think all should run RELENG_6 and in the future RELENG_7 where possible. 5.x wont have any new releases anyway so it does not add up to me personally.