Bug 109152

Summary: [rp] RocketPort panic from device_unbusy()
Product: Base System Reporter: Craig Leres <leres>
Component: kernAssignee: Remko Lodder <remko>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 6.2-RELEASE   
Hardware: Any   
OS: Any   

Description Craig Leres freebsd_committer freebsd_triage 2007-02-14 05:50:04 UTC
	Removing a devfs symlink can cause a panic if a program had
	that symlink open and then exits.

	I use the comms/conserver port to manage a bunch of serial
	lines connected to a 32 port RocketPort config. To keep
	track of what each line is connected to, I create symlinks
	via "link" lines in devfs.conf. I disovered that if I remove
	link lines from devfs.conf, update devfs and then kill off
	a conserver that has one or more of the obsolete links open,
	the system will panic.

How-To-Repeat: 	Append a line to devfs.conf:

	    link cuaR00 test

	Update devfs and verify the new symlink:

	    # /etc/rc.d/devfs restart
	    # ls -l /dev/test /dev/cuaR00
	    crw-rw----  1 uucp  dialer    0,  40 Feb 12 19:45 /dev/cuaR00
	    lrwxr-xr-x  1 root  wheel          6 Feb 12 19:45 /dev/test -> cuaR00

	Add a line to conserver.cf:

	    test:/dev/test:9600p:/var/console/test.log:0

	Startup conserver.
	Remove out the link line previously added to devfs.conf.
	Update devfs again.
	Note that the /dev/test symlink still exists.
	Kill conserver.
	The system panics with:

	    device_unbusy: called for non-busy device rp0

	kgdb shows:

	    (kgdb) bt
	    #0  0xc0692c96 in doadump ()
	    #1  0xc06931f2 in boot ()
	    #2  0xc0693519 in panic ()
	    #3  0xc06a9304 in device_unbusy ()
	    #4  0xc05ca8e9 in rpclose ()
	    #5  0xc06c6617 in ttyclose ()
	    #6  0xc066878b in giant_close ()
	    #7  0xc0644876 in devfs_close ()
	    #8  0xc08ae164 in VOP_CLOSE_APV ()
	    #9  0xc06fa8c6 in vn_close ()
	    #10 0xc06fb836 in vn_closefile ()
	    #11 0xc06448b7 in devfs_close_f ()
	    #12 0xc0672a04 in fdrop_locked ()
	    #13 0xc0672951 in fdrop ()
	    #14 0xc0670eef in closef ()
	    #15 0xc066e04d in close ()
	    #16 0xc08998f3 in syscall ()
	    #17 0xc08853df in Xint0x80_syscall ()
	    #18 0x00000033 in ?? ()
	    Previous frame inner to this frame (corrupt stack?)
Comment 1 Remko Lodder 2007-02-14 06:51:42 UTC
On Tue, Feb 13, 2007 at 09:34:53PM -0800, Craig Leres wrote:
> 
> 	% uname -a
> 	FreeBSD ee.lbl.gov 6.2-RELEASE FreeBSD 6.2-RELEASE #1: Mon Jan 15 11:31:32 PST 2007     leres@fox.ee.lbl.gov:/usr/src/6.2-RELEASE/sys/i386/compile/LBLSMP  i386
> 
> 	rp0: <RocketPort PCI> port 0x3000-0x30ff irq 19 at device 7.0 on pci2
> 	RocketPort0 (Version 3.02) 32 ports.
> 
> >Description:
> 	Removing a devfs symlink can cause a panic if a program had
> 	that symlink open and then exits.
> 
> 	I use the comms/conserver port to manage a bunch of serial
> 	lines connected to a 32 port RocketPort config. To keep
> 	track of what each line is connected to, I create symlinks
> 	via "link" lines in devfs.conf. I disovered that if I remove
> 	link lines from devfs.conf, update devfs and then kill off
> 	a conserver that has one or more of the obsolete links open,
> 	the system will panic.
> 

Hello,

This does not really feel like a bug but rather a configuration
failure from the administrator (you in this case). When you have
active connections, applications assume that they can have IO with
(in this case a symlinked device). When you remove that device
the application can no longer access that IO and in this case
kills the system.

The latter of this might be a problem (though I really feel like
"dont do that!"): What we need prior to continueing is a dump of
the panic, you can read how to do that on the
http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html
documentation pages.

Thanks in advance !

-- 
Kind regards,

     Remko Lodder               ** remko@elvandar.org
     FreeBSD                    ** remko@FreeBSD.org

     /* Quis custodiet ipsos custodes */
Comment 2 Craig Leres freebsd_committer freebsd_triage 2007-02-14 08:34:11 UTC
> This does not really feel like a bug but rather a configuration
> failure from the administrator (you in this case).

(Not a bug? Seriously?)

Here's some additional info.

I ran the scenario with a sio port; the system does not crash,
/dev/modem does not go away and tip can still communicate with the
modem after devfs is updated. (As expected.)

I also ran the scenario with an puc port which worked the same as
with the sio port.

Finally, I reran the scenario with the RocketPort and found it's
not necessary to make any changes to devfs.conf to induce a panic.
You can simply:

    tip to a rocketport line
    run "/etc/rc.d/devfs restart"
    exit tip
    (wait for the system to reboot)

I think the issue is a side effect of refreshing the devfs layout
is that the state is getting reset. The rp driver seems to be the
only one using device_busy/device_unbusy in a manner that's sensitive
to the dev state.

>                   What we need prior to continueing is a dump of
> the panic,

I used a crash dump to create the kgdb stack trace.

		Craig
Comment 3 Craig Leres freebsd_committer freebsd_triage 2007-03-08 04:21:21 UTC
The problem is a lot more serious than I first reported. Over the
last few weeks I've found it pretty easy to panic the system. I
did more surfing and found this freebsd-hackers thread:

    http://groups.google.com/group/mailing.freebsd.hackers/browse_frm/thread/883da63a8c62854d

Ultimately, Doug Ambrisko posts a patch against RELENG_6. I wasn't
able to recover the patch from his posting but he also committed
the changes which are available via the cvs web interface:

    http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/dev/rp/rp.c?rev=1.67.2.2&content-type=text/plain
    http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/dev/rp/rp_isa.c?rev=1.7.2.1&content-type=text/plain
    http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/dev/rp/rp_pci.c?rev=1.11.2.2&content-type=text/plain
    http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/dev/rp/rpreg.h?rev=1.7.2.1&content-type=text/plain
    http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/sys/dev/rp/rpvar.h?rev=1.8.2.1&content-type=text/plain

It looks good! I've been unable to generate a device_unbusy() panic
with the resulting kernel.

		Craig
Comment 4 Craig Leres freebsd_committer freebsd_triage 2007-03-14 02:27:01 UTC
I was still able to crash Doug Ambrisko's version of the rp driver.
I think the problem is that in some cases rp_handle_port() calls
rpclose() which calls device_unbusy(). Later rpclose() is called
again and we hit the panic.

I looked at the last known good version of the driver I'd used,
1.45.2.2 from 4.10-RELEASE, and found it has code to avoid calling
rpclose() twice. I did something similar that seems to work.

Note: The appended diffs are against version 1.67.2.2 of rp.c.

		Craig

===================================================================
RCS file: RCS/rp.c,v
retrieving revision 1.2
retrieving revision 1.3
diff -c -r1.2 -r1.3
*** rp.c	2007/03/08 04:07:10	1.2
--- rp.c	2007/03/14 02:23:17	1.3
***************
*** 573,578 ****
--- 573,579 ----
  
  static	void	rpbreak(struct tty *, int);
  static	void	rpclose(struct tty *tp);
+ static	void	rphardclose(struct tty *tp);
  static	int	rpmodem(struct tty *, int, int);
  static	int	rpparam(struct tty *, struct termios *);
  static	void	rpstart(struct tty *);
***************
*** 697,703 ****
  			if((tp->t_state & TS_CARR_ON)) {
  				(void)ttyld_modem(tp, 0);
  				if(ttyld_modem(tp, 0) == 0) {
! 					rpclose(tp);
  				}
  			}
  		}
--- 698,704 ----
  			if((tp->t_state & TS_CARR_ON)) {
  				(void)ttyld_modem(tp, 0);
  				if(ttyld_modem(tp, 0) == 0) {
! 					rphardclose(tp);
  				}
  			}
  		}
***************
*** 936,941 ****
--- 937,952 ----
  rpclose(struct tty *tp)
  {
  	struct	rp_port	*rp;
+ 
+ 	rp = tp->t_sc;
+ 	rphardclose(tp);
+ 	device_unbusy(rp->rp_ctlp->dev);
+ }
+ 
+ static void
+ rphardclose(struct tty *tp)
+ {
+ 	struct	rp_port	*rp;
  	CHANNEL_t	*cp;
  
  	rp = tp->t_sc;
***************
*** 959,965 ****
  	tp->t_actout = FALSE;
  	wakeup(&tp->t_actout);
  	wakeup(TSA_CARR_ON(tp));
- 	device_unbusy(rp->rp_ctlp->dev);
  }
  
  static void
--- 970,975 ----
Comment 5 Doug Ambrisko 2007-03-15 03:55:48 UTC
Craig Leres writes:
| I was still able to crash Doug Ambrisko's version of the rp driver.
| I think the problem is that in some cases rp_handle_port() calls
| rpclose() which calls device_unbusy(). Later rpclose() is called
| again and we hit the panic.
| 
| I looked at the last known good version of the driver I'd used,
| 1.45.2.2 from 4.10-RELEASE, and found it has code to avoid calling
| rpclose() twice. I did something similar that seems to work.
| 
| Note: The appended diffs are against version 1.67.2.2 of rp.c.
| 
| 		Craig
| 
| ===================================================================
| RCS file: RCS/rp.c,v
| retrieving revision 1.2
| retrieving revision 1.3
| diff -c -r1.2 -r1.3
| *** rp.c	2007/03/08 04:07:10	1.2
| --- rp.c	2007/03/14 02:23:17	1.3
| ***************
| *** 573,578 ****
| --- 573,579 ----
|   
|   static	void	rpbreak(struct tty *, int);
|   static	void	rpclose(struct tty *tp);
| + static	void	rphardclose(struct tty *tp);
|   static	int	rpmodem(struct tty *, int, int);
|   static	int	rpparam(struct tty *, struct termios *);
|   static	void	rpstart(struct tty *);
| ***************
| *** 697,703 ****
|   			if((tp->t_state & TS_CARR_ON)) {
|   				(void)ttyld_modem(tp, 0);
|   				if(ttyld_modem(tp, 0) == 0) {
| ! 					rpclose(tp);
|   				}
|   			}
|   		}
| --- 698,704 ----
|   			if((tp->t_state & TS_CARR_ON)) {
|   				(void)ttyld_modem(tp, 0);
|   				if(ttyld_modem(tp, 0) == 0) {
| ! 					rphardclose(tp);
|   				}
|   			}
|   		}
| ***************
| *** 936,941 ****
| --- 937,952 ----
|   rpclose(struct tty *tp)
|   {
|   	struct	rp_port	*rp;
| + 
| + 	rp = tp->t_sc;
| + 	rphardclose(tp);
| + 	device_unbusy(rp->rp_ctlp->dev);
| + }
| + 
| + static void
| + rphardclose(struct tty *tp)
| + {
| + 	struct	rp_port	*rp;
|   	CHANNEL_t	*cp;
|   
|   	rp = tp->t_sc;
| ***************
| *** 959,965 ****
|   	tp->t_actout = FALSE;
|   	wakeup(&tp->t_actout);
|   	wakeup(TSA_CARR_ON(tp));
| - 	device_unbusy(rp->rp_ctlp->dev);
|   }
|   
|   static void
| --- 970,975 ----
| 

I wonder if this should be a reference count?  Good find.

Doug A.
Comment 6 Remko Lodder freebsd_committer freebsd_triage 2007-04-24 20:43:00 UTC
Responsible Changed
From-To: freebsd-bugs->remko

There are reports that the information in this ticket resolves the 
problem, Mark requested a source committer to fix this, make it happen 
(I will do it)
Comment 7 dfilter service freebsd_committer freebsd_triage 2007-06-26 14:50:54 UTC
remko       2007-06-26 13:50:48 UTC

  FreeBSD src repository

  Modified files:
    sys/dev/rp           rp.c 
  Log:
  Fix Rocketport so that it does not crash the system when a device pointer
  changes for example:
  
  (From Craig Leres):
  
  tip to a rocketport line
  run "/etc/rc.d/devfs restart"
  exit tip
  (wait for the system to reboot)
  
  Thanks to Robert Watson for poking me to fix this.
  
  PR:             kern/109152
  Approved by:    imp (mentor)
  Approved by:    re (kensmith)
  Reviewed by:    jhb
  Submitted by:   Craig Leres <leres@ee dot lbl dot gov>
  
  Revision  Changes    Path
  1.73      +12 -2     src/sys/dev/rp/rp.c
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
Comment 8 Remko Lodder freebsd_committer freebsd_triage 2007-06-26 14:51:20 UTC
State Changed
From-To: open->patched

Patched in -CURRENT, will MFC in a few days.
Comment 9 dfilter service freebsd_committer freebsd_triage 2007-07-03 21:35:13 UTC
remko       2007-07-03 20:35:07 UTC

  FreeBSD src repository

  Modified files:        (Branch: RELENG_6)
    sys/dev/rp           rp.c 
  Log:
  MFC rp.c rev 1.73
  
    Fix Rocketport so that it does not crash the system when a device pointer
    changes for example:
  
    (From Craig Leres):
  
    tip to a rocketport line
    run "/etc/rc.d/devfs restart"
    exit tip
    (wait for the system to reboot)
  
    Thanks to Robert Watson for poking me to fix this.
  
    PR:           kern/109152
    Approved by:  imp (mentor)
    Approved by:  re (kensmith)
    Reviewed by:  jhb
    Submitted by: Craig Leres <leres@ee dot lbl dot gov>
  
  Approved by:    imp (mentor, implicit)
  
  Revision  Changes    Path
  1.67.2.3  +12 -2     src/sys/dev/rp/rp.c
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
Comment 10 Remko Lodder freebsd_committer freebsd_triage 2007-09-04 07:04:35 UTC
State Changed
From-To: patched->closed

After some more consideration, I wont MFC this to RELENG_5. 
I think all should run RELENG_6 and in the future RELENG_7 
where possible. 5.x wont have any new releases anyway so it 
does not add up to me personally.