Bug 29203

Summary: Kernel panics when mounting non-fixated CD
Product: Base System
Reporter: Ed Alley <wea>
Component: kern
Assignee: Søren Schmidt <sos>
Status: Closed FIXED
Severity: Affects Only Me
CC: wea
Priority: Normal    
Version: 4.3-RELEASE   
Hardware: Any   
OS: Any   

Description Ed Alley 2001-07-25 01:05:01 UTC
>Number:         29203
>Category:       kern
>Synopsis:       Kernel panics when mounting non-fixated CD
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jul 24 17:10:00 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     Ed Alley
>Release:        FreeBSD 4.3-RELEASE i386
>Organization:
Lawrence Livermore National Lab.
>Environment:
System: FreeBSD jordan.llnl.gov 4.3-RELEASE FreeBSD 4.3-RELEASE #0: Tue Jul 24 13:38:25 PDT 2001 wea@jordan.llnl.gov:/usr/src/sys/compile/JORDAN.5.ipfw i386

>Description:
	
I am running FreeBSD 4.3 with an IDE HP cd-writer 9500 series.
I have been successfully making CDs using burncd since
I installed it. 

However, I mistakenly tried to mount a CD which I failed to fixate
and I got a kernel panic. I was able to debug the kernel code
and found out where the problem is. I have included a patch
which works for me and would like to hear whether it is
sufficient or what I should do next.

I found out through my investigations into this that the
ATAPI interface isn't followed closely by manufacturers.

For instance, before we installed this HP CDRW we had installed
a Yamaha CDRW which displayed other problems (among them,
it won't fixate using burncd under FreeBSD). In addition,
the CDROM on my home computer, which is running FreeBSD 4.2, doesn't
cause a panic when I try to mount a non-fixated CD; it just refuses
to do it. So ATAPI of one manufacturer is not ATAPI of another.

The problem with what I am doing is that most people (if not everybody)
reading this will not have my hardware configuration to test this
problem on. So I have included part of my gdb session below so
you can see how I came up with my patch.

So here is the panic message that I get when I try to mount the
non-fixated CD; you can see that it is a page fault:

(kgdb) symbol-file kernel.debug
Reading symbols from kernel.debug...done.
(kgdb) exec-file /var/crash.gdb/kernel.0
(kgdb) core-file /var/crash.gdb/vmcore.0
IdlePTD 2711552
initial pcb at 221800
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xc0d96000
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc01b6c2e
stack pointer           = 0x10:0xc0206f10
frame pointer           = 0x10:0xc0206f20
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio 
trap number             = 12
panic: page fault

Here is the trace of the corpse:

(kgdb) where
#0  dumpsys () at ../../kern/kern_shutdown.c:469
#1  0xc01389c3 in boot (howto=256) at ../../kern/kern_shutdown.c:309
#2  0xc0138d40 in poweroff_wait (junk=0xc01ff28f, howto=0)
    at ../../kern/kern_shutdown.c:556
#3  0xc01d68f1 in trap_fatal (frame=0xc0206ed0, eva=3235471360)
    at ../../i386/i386/trap.c:951
#4  0xc01d65c9 in trap_pfault (frame=0xc0206ed0, usermode=0, eva=3235471360)
    at ../../i386/i386/trap.c:844
#5  0xc01d61af in trap (frame={tf_fs = -65520, tf_es = -973537264, 
      tf_ds = 6488080, tf_edi = -1059495936, tf_esi = 32768, 
      tf_ebp = -1071616224, tf_isp = -1071616260, tf_ebx = -1059685120, 
      tf_edx = 368, tf_ecx = 7168, tf_eax = -1060624128, tf_trapno = 12, 
      tf_err = 2, tf_eip = -1071944658, tf_cs = 8, tf_eflags = 66054, 
      tf_esp = -1063045216, tf_ss = -1059685120}) at ../../i386/i386/trap.c:443
#6  0xc01b6c2e in atapi_read (request=0xc0d67d00, length=32768)
    at machine/cpufunc.h:222
#7  0xc01b66cb in atapi_interrupt (request=0xc0d67d00)
    at ../../dev/ata/atapi-all.c:391
#8  0xc01afcee in ata_intr (data=0xc0c82900) at ../../dev/ata/ata-all.c:1154
(kgdb)

The routine atapi_read() is where the error occurred. By poking around
I discovered that the bytecount request was enormous:

print request->bytecount
$1 = 4294934528
(kgdb) x/x &request->bytecount
0xc0d67d18:     0xffff8000
x/d &request->bytecount
0xc0d67d18:     -32768
(kgdb)

So you can see that 32768 was subtracted from an unsigned zero!
If the first request was for bytecount zero, then atapi_read()
will read nothing but subtract size = 32768 from bytecount before
returning. Since bytecount is unsigned, this causes it to roll over
to a big number. The next call then attempts to read a bytecount
of almost 4G.
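
The arithmetic described above can be reproduced outside the kernel.
The following is only a user-space sketch, not driver code: the type
bytecount_t and both function names are hypothetical, and the counter
is assumed to be 32 bits wide, as it is on the i386 machine in this
report.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the driver's byte counter. It is assumed
 * to be a 32-bit unsigned value, matching the kgdb output above.
 */
typedef uint32_t bytecount_t;

/*
 * Subtract a transfer size with no bounds check, which is the behavior
 * the report attributes to atapi_read(). Unsigned arithmetic wraps
 * modulo 2^32 instead of going negative.
 */
static bytecount_t
consume_unchecked(bytecount_t bytecount, uint32_t size)
{
	return bytecount - size;
}

/*
 * Reinterpret the counter as a signed value, the way "x/d" displays it.
 * The conversion is implementation-defined in C, but it yields the
 * two's complement value on common compilers.
 */
static int32_t
as_signed(bytecount_t bytecount)
{
	return (int32_t)bytecount;
}
```

Calling consume_unchecked(0, 32768) gives 0xffff8000, which is
4294934528 when printed as unsigned and -32768 when passed through
as_signed(). These are the same three values that appear in the kgdb
session.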

>How-To-Repeat:
	Unless you have my hardware configuration you can't repeat
	the error. However, I don't doubt that other CDRs will
	do something similar.

>Fix:
	I have included a patch that works for me. I am not fully
	satisfied with it because even though it is simple, it
	limits the bytecount to 2G. Does this mean that a person
	could not read a file bigger than 2G with this patch?

	My patch is very simple:

	In atapi-all.c in routine atapi_interrupt() for case
	ATAPI_P_READ I cast bytecount to a long and check
	for zero or negative. If it is zero or negative I
	write an error message and break out. This avoids
	atapi_read() and returns with an error message.
	However, as noted above, this limits the valid
	byte count to 2G.

	The patch must be installed in /usr/src/sys/dev/ata as:

		patch -p < patch.file

	Here is the patch:

*** atapi-all.c.orig	Tue Jul 24 13:21:03 2001
--- atapi-all.c	Tue Jul 24 13:28:45 2001
***************
*** 382,387 ****
--- 382,393 ----
  	    return ATA_OP_CONTINUES;
  	
  	case ATAPI_P_READ:
+ 	    if ((long)request->bytecount <= 0) {
+ 		printf("%s: %s trying to read with bytecount = %ld\n",
+ 			atp->devname, atapi_cmd2str(atp->cmd),
+ 			(long)request->bytecount);
+ 		break;
+ 	    }
  	    if (!(request->flags & ATPR_F_READ)) {
  		request->result = inb(atp->controller->ioaddr + ATA_ERROR);
  		printf("%s: %s trying to read on write buffer\n",

	End of patch file.
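
The sign test used by the patch, and the 2G limit it implies, can be
checked in isolation. This is a user-space sketch under the assumption
of a 32-bit counter; the function name bytecount_is_plausible() is
hypothetical and does not exist in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when a 32-bit unsigned byte count is strictly positive
 * after being viewed as a signed value. This is the inverse of the
 * patch's "(long)request->bytecount <= 0" test on a machine where long
 * is 32 bits. The cast is implementation-defined in C, but it gives the
 * two's complement value on common compilers.
 */
static bool
bytecount_is_plausible(uint32_t bytecount)
{
	return (int32_t)bytecount > 0;
}
```

With this test, 0 and 0xffff8000 are rejected and 32768 is accepted.
The largest accepted value is 0x7fffffff; 0x80000000 (exactly 2G) and
anything above it are rejected, which is the limit discussed in the
Fix section.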


Thank you in anticipation of your comments. I am a newbie at kernel
debugging, so if I have done anything stupid please go easy on me. :)

>Release-Note:
>Audit-Trail:
>Unformatted:

Comment 1 Ed Alley 2001-07-25 01:10:00 UTC
	
Comment 2 Ed Alley 2001-07-26 01:00:22 UTC
>Submitter-Id:	current-users
>Originator:	Ed Alley
>Organization:	Lawrence Livermore National Lab.
>Confidential:	no
>Synopsis:	kern/29203: Kernel panics when mounting non-fixated CD
>Severity:	non-critical
>Priority:	low
>Category:	kern
>Class:		change-request
>Release:	FreeBSD 4.3-RELEASE i386
>Environment:
System: FreeBSD jordan.llnl.gov 4.3-RELEASE #0: Tue Jul 24 13:38:25 PDT 2001 wea@jordan.llnl.gov:/usr/src/sys/compile/JORDAN.5.ipfw i386

>Description:

	RE: PR kern/29203
	
	I am running FreeBSD 4.3 with an IDE HP cd-writer 9500 series.

	I have previously submitted a patch for the kernel panic that
	resulted after I tried to mount a non-fixated CD that I had
	burned with my CD writer. The previous patch had the problem
	that it tested every ATAPI read for a large bytecount including
	IDE disk reads. The patch therefore also limited those reads
	to a bytecount of 2G. The patch below only tests CD reads,
	hence, only applies the 2G limit to CD reads. However, the
	question I now have is: Are DVDs typed as CDs in the
	ATAPI driver? If so then more work needs to be done here
	before this patch can be accepted.

>How-To-Repeat:
	Unless you have my hardware configuration you can't repeat
	the error. However, I don't doubt that other CDRWs will
	do something similar.

>Fix:
	The patch has been improved: It now only tests the
	bytecount for the case of reading a CD. This eliminates
	the problem of limiting the bytecount to 2G for disk reads.
	It now only does this for CDs. The only remaining
	question is whether a DVD is an ATAPI_TYPE_CDROM
	device type. If so then this patch is still not
	complete, because a DVD can be larger than 2G.

	The patch must be installed in /usr/src/sys/dev/ata as:

		patch -p < patch.file

	Here is the patch:

*** atapi-all.c.orig	Tue Jul 24 13:21:03 2001
--- atapi-all.c	Wed Jul 25 16:03:47 2001
***************
*** 382,387 ****
--- 382,397 ----
  	    return ATA_OP_CONTINUES;
  	
  	case ATAPI_P_READ:
+ 	    if (ATP_PARAM->device_type == ATAPI_TYPE_CDROM) {
+ 		if ((long)request->bytecount < 0) {
+ 		    printf("%s: %s trying to read CD with bytecount = %lu\n",
+ 			atp->devname, atapi_cmd2str(atp->cmd),
+ 			(unsigned long)request->bytecount);
+ 			request->result =
+ 			    ATAPI_E_ILI | ATAPI_SK_ILLEGAL_REQUEST;
+ 			break;
+ 		}
+ 	    }
  	    if (!(request->flags & ATPR_F_READ)) {
  		request->result = inb(atp->controller->ioaddr + ATA_ERROR);
  		printf("%s: %s trying to read on write buffer\n",

	End of patch file.


Thank you in anticipation of your comments.
Comment 3 Ed Alley 2001-07-26 06:51:40 UTC
>Submitter-Id:	current-users
>Originator:	Ed Alley
>Organization:	Lawrence Livermore National Lab.
>Confidential:	no
>Synopsis:	Kernel panics when mounting non-fixated CD
>Severity:	non-critical
>Priority:	low
>Category:	kern
>Class:		change-request
>Release:	FreeBSD 4.3-RELEASE i386
>Environment:
System: FreeBSD jordan.llnl.gov 4.3-RELEASE #0: Tue Jul 24 13:38:25 PDT 2001 wea@llnl.gov: /usr/src/sys/compile/JORDAN.5.ipfw i386

>Description:

	RE: PR kern/29203

	I am running FreeBSD 4.3 with an IDE HP cd-writer 9500 series.

	I apologize if this is a double submit but I never got a
	reply from gnats concerning my second submission. This may
	have been due to a corrupted subject line which didn't have
	the RE: in it. 

	By looking at code I have discovered that the bytecount should
	never exceed the maximum size of a signed integer! This can be
	seen in the file atapi-all.c and the routine atapi_queue_cmd()
	that queues up the atapi requests. The bytecount is passed
	through the argument list of atapi_queue_cmd() as the variable
	count which is typed as an int! Therefore, my concern about
	bytecount being limited to 2G was unfounded. On my machine
	ints and longs are both 32 bits. So limiting bytecount to
	2G is correct.

	The page fault was triggered by atapi_read() when it tried to
	read a bytecount = 0xffff8000 which is larger than 2G. It is
	curious that when we convert the hex to a signed long we
	get -32768 which is -2^(15) where ^ means power. According
	to <machine/limits.h> on my machine this is the value of
	the smallest short. If the hardware returns short integers
	and it thinks that -0 = 0x8000 this may be the problem
	since 0x8000 would get sign extended for int32 so the
	argument count would be equal to -32768 and then bytecount
	would get set to 0xffff8000. (I'm speculating here since
	I don't know how the hardware works.)
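
	The sign-extension path speculated about above can be sketched in
	user-space C. Nothing here is taken from the driver or from the
	hardware: both function names are hypothetical, and the only
	assumption is that a 16-bit pattern is widened to 32 bits.

```c
#include <stdint.h>

/*
 * Widen a 16-bit pattern as a signed short, then store the result in an
 * unsigned 32-bit counter. The first cast is implementation-defined in
 * C, but it gives the two's complement value on common compilers.
 */
static uint32_t
widen_signed16(uint16_t raw)
{
	int16_t as_short = (int16_t)raw;	/* 0x8000 becomes -32768 */
	int32_t count = as_short;		/* sign extension to 32 bits */

	return (uint32_t)count;			/* stored modulo 2^32 */
}

/* Widen the same pattern as unsigned: zero extension, no surprise. */
static uint32_t
widen_unsigned16(uint16_t raw)
{
	return (uint32_t)raw;
}
```

	widen_signed16(0x8000) gives 0xffff8000, while
	widen_unsigned16(0x8000) gives 0x00008000. Subtracting 32768 from an
	unsigned zero also gives 0xffff8000, so the value seen in the crash
	dump cannot by itself tell the two explanations apart.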

	(In view of the above discussion perhaps a better place for a
	 patch is in atapi_queue_cmd(): a simple test on the sign of
	 count would suffice. I'll look at this also.)
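
	A test on the sign of count at queue time might look roughly like
	the sketch below. It is not the real atapi_queue_cmd(): the struct
	sketch_request, the function sketch_queue() and the -1 error return
	are all hypothetical, chosen only to show where the check would sit
	relative to the signed-to-unsigned conversion.

```c
#include <stdint.h>

/* Hypothetical, minimal stand-in for a queued request. */
struct sketch_request {
	uint32_t bytecount;	/* unsigned, as in the crash dump */
};

/*
 * Accept a signed count from the caller and refuse negative values
 * before they are converted to the unsigned counter. Returns 0 on
 * success and -1 on rejection; a rejected request is left untouched.
 */
static int
sketch_queue(struct sketch_request *req, int count)
{
	if (count < 0)
		return -1;

	req->bytecount = (uint32_t)count;
	return 0;
}
```

	Checking at this point would catch a negative count before it is
	ever stored, but it would not catch a counter that wraps later
	during the transfer, so it may complement rather than replace the
	check in atapi_interrupt().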

	I have previously submitted a patch to avoid the panic that
	resulted after I tried to mount a non-fixated iso 9660 CD
	previously burned with my CD writer. The previous patch had
	the problem that it tested every ATAPI read for a large bytecount
	including IDE disk reads. The patch below only tests CD reads,
	and hence, only applies the 2G limit to these.

>How-To-Repeat:
	Unless you have my hardware configuration you can't repeat
	the error. However, I don't doubt that other CDRWs may
	do something similar.

>Fix:
	The patch has been changed: It now tests the bytecount
	only for the case of reading a CD. This eliminates
	the problem of limiting the bytecount to 2G for disk reads.
	It now only does this for CDs.

	Note: I have tried to print out a meaningful message
	and also tried to get the atapi error mechanism to
	say something meaningful, but I was not too successful.
	If someone can improve this I would be grateful.

	Also note: It may be better to do this in atapi_queue_cmd().

	The patch must be installed in /usr/src/sys/dev/ata as:

		patch -p < patch.file

	Here is the patch:

*** atapi-all.c.orig	Tue Jul 24 13:21:03 2001
--- atapi-all.c	Wed Jul 25 16:03:47 2001
***************
*** 382,387 ****
--- 382,397 ----
  	    return ATA_OP_CONTINUES;
  	
  	case ATAPI_P_READ:
+ 	    if (ATP_PARAM->device_type == ATAPI_TYPE_CDROM) {
+ 		if ((long)request->bytecount < 0) {
+ 		    printf("%s: %s trying to read CD with bytecount = %lu\n",
+ 			atp->devname, atapi_cmd2str(atp->cmd),
+ 			(unsigned long)request->bytecount);
+ 			request->result =
+ 			    ATAPI_E_ILI | ATAPI_SK_ILLEGAL_REQUEST;
+ 			break;
+ 		}
+ 	    }
  	    if (!(request->flags & ATPR_F_READ)) {
  		request->result = inb(atp->controller->ioaddr + ATA_ERROR);
  		printf("%s: %s trying to read on write buffer\n",

	End of patch.


Thank you in anticipation of your comments.

	Ed
Comment 4 bill fumerola 2001-07-26 10:55:12 UTC
Responsible Changed
From-To: freebsd-bugs->sos

sos is Mr. ATA
Comment 5 Søren Schmidt 2001-09-06 10:01:19 UTC
State Changed
From-To: open->closed

This is believed to be fixed in -stable (4.4) and -current.