| Summary: | Kernel panics when mounting non-fixated CD | ||
|---|---|---|---|
| Product: | Base System | Reporter: | Ed Alley <wea> |
| Component: | kern | Assignee: | Søren Schmidt <sos> |
| Status: | Closed FIXED | ||
| Severity: | Affects Only Me | CC: | wea |
| Priority: | Normal | ||
| Version: | 4.3-RELEASE | ||
| Hardware: | Any | ||
| OS: | Any | ||
|
Description
Ed Alley
2001-07-25 01:05:01 UTC
I am running FreeBSD 4.3 with an IDE HP cd-writer 9500 series. I have been successfully making CDs using burncd since I installed it. However, I mistakenly tried to mount a CD which I had failed to fixate, and I got a kernel panic. I was able to debug the kernel code and found out where the problem is. I have included a patch which works for me and would like to hear whether it is sufficient or what I should do next.

I found out through my investigations that the ATAPI interface isn't followed closely by manufacturers. For instance, before we installed this HP CDRW we had installed a Yamaha CDRW which displayed other problems (among them, it won't fixate using burncd under FreeBSD). In addition, the CDROM on my home computer, which is running FreeBSD 4.2, doesn't cause a panic when I try to mount a non-fixated CD; it just refuses to do it. So ATAPI of one manufacturer is not ATAPI of another.

The problem with what I am doing is that most (if not all) of those reading this will not have my hardware configuration to test this problem on. So I have included part of my gdb session below so you can see how I came up with my patch.

Here is the panic message that I get when I try to mount the non-fixated CD; you can see that it is a page fault:

(kgdb) symbol-file kernel.debug
Reading symbols from kernel.debug...done.
(kgdb) exec-file /var/crash.gdb/kernel.0
(kgdb) core-file /var/crash.gdb/vmcore.0
IdlePTD 2711552
initial pcb at 221800
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0xc0d96000
fault code = supervisor write, page not present
instruction pointer = 0x8:0xc01b6c2e
stack pointer = 0x10:0xc0206f10
frame pointer = 0x10:0xc0206f20
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = Idle
interrupt mask = bio
trap number = 12
panic: page fault

Here is the trace of the corpse:

(kgdb) where
#0 dumpsys () at ../../kern/kern_shutdown.c:469
#1 0xc01389c3 in boot (howto=256) at ../../kern/kern_shutdown.c:309
#2 0xc0138d40 in poweroff_wait (junk=0xc01ff28f, howto=0) at ../../kern/kern_shutdown.c:556
#3 0xc01d68f1 in trap_fatal (frame=0xc0206ed0, eva=3235471360) at ../../i386/i386/trap.c:951
#4 0xc01d65c9 in trap_pfault (frame=0xc0206ed0, usermode=0, eva=3235471360) at ../../i386/i386/trap.c:844
#5 0xc01d61af in trap (frame={tf_fs = -65520, tf_es = -973537264, tf_ds = 6488080, tf_edi = -1059495936, tf_esi = 32768, tf_ebp = -1071616224, tf_isp = -1071616260, tf_ebx = -1059685120, tf_edx = 368, tf_ecx = 7168, tf_eax = -1060624128, tf_trapno = 12, tf_err = 2, tf_eip = -1071944658, tf_cs = 8, tf_eflags = 66054, tf_esp = -1063045216, tf_ss = -1059685120}) at ../../i386/i386/trap.c:443
#6 0xc01b6c2e in atapi_read (request=0xc0d67d00, length=32768) at machine/cpufunc.h:222
#7 0xc01b66cb in atapi_interrupt (request=0xc0d67d00) at ../../dev/ata/atapi-all.c:391
#8 0xc01afcee in ata_intr (data=0xc0c82900) at ../../dev/ata/ata-all.c:1154
(kgdb)

The routine atapi_read() is where the error occurred.
By poking around I discovered that the bytecount request was enormous:

(kgdb) print request->bytecount
$1 = 4294934528
(kgdb) x/x &request->bytecount
0xc0d67d18: 0xffff8000
(kgdb) x/d &request->bytecount
0xc0d67d18: -32768
(kgdb)

So you can see that 32768 was subtracted from an unsigned zero! If the first request was for a bytecount of zero, then atapi_read() will read nothing but still subtract size = 32768 from bytecount before returning. Since bytecount is unsigned, this causes it to roll over to a big number. The next call then attempts to read a bytecount of almost 4G (4294934528 bytes).

Fix:

I have included a patch that works for me. I am not fully satisfied with it because, even though it is simple, it limits the bytecount to 2G. Does this mean that a person could not read a file bigger than 2G with this patch?

My patch is very simple: in atapi-all.c, in routine atapi_interrupt(), for case ATAPI_P_READ, I cast bytecount to a long and check for zero or negative. If it is zero or negative I write an error message and break out. This avoids atapi_read() and returns with an error message. However, as noted above, this limits the valid byte count to 2G.

The patch must be installed in /usr/src/sys/dev/ata as:

patch -p < patch.file

Here is the patch:

End of patch file.

Thank you in anticipation of your comments. I am a newbie at kernel debugging, so if I have done anything stupid please go easy on me.
:)

Attachment: file.diff

*** atapi-all.c.orig Tue Jul 24 13:21:03 2001
--- atapi-all.c Tue Jul 24 13:28:45 2001
***************
*** 382,387 ****
--- 382,393 ----
          return ATA_OP_CONTINUES;

      case ATAPI_P_READ:
+     if ((long)request->bytecount <= 0) {
+         printf("%s: %s trying to read with bytecount = %d\n",
+                atp->devname, atapi_cmd2str(atp->cmd),
+                (long)request->bytecount);
+         break;
+     }
      if (!(request->flags & ATPR_F_READ)) {
          request->result = inb(atp->controller->ioaddr + ATA_ERROR);
          printf("%s: %s trying to read on write buffer\n",

How-To-Repeat: Unless you have my hardware configuration you can't repeat the error. However, I don't doubt that other CDRs will do something similar.

>Submitter-Id: current-users
>Originator: Ed Alley
>Organization: Lawrence Livermore National Lab.
>Confidential: no
>Synopsis: kern/29203: Kernel panics when mounting non-fixated CD
>Severity: non-critical
>Priority: low
>Category: kern
>Class: change-request
>Release: FreeBSD 4.3-RELEASE i386
>Environment: System: FreeBSD jordan.llnl.gov 4.3-RELEASE #0: Tue Jul 24 13:38:25 PDT 2001 wea@jordan.llnl.gov:/usr/src/sys/compile/JORDAN.5.ipfw i386

>Description: RE: PR kern/29203

I am running FreeBSD 4.3 with an IDE HP cd-writer 9500 series. I have previously submitted a patch for the kernel panic that resulted after I tried to mount a non-fixated CD that I had burned with my CD writer. The previous patch had the problem that it tested every ATAPI read for a large bytecount, including IDE disk reads. The patch therefore also limited those reads to a bytecount of 2G. The patch below only tests CD reads and hence only applies the 2G limit to CD reads. However, the question I now have is: are DVDs typed as CDs in the ATAPI driver? If so, then more work needs to be done here before this patch can be accepted.
>How-To-Repeat: Unless you have my hardware configuration you can't repeat the error. However, I don't doubt that other CDRWs will do something similar.

>Fix: The patch has been improved: it now only tests the bytecount for the case of reading a CD. This eliminates the problem of limiting the bytecount to 2G for disk reads; the limit now applies only to CDs. The only remaining question is whether a DVD is an ATAPI_TYPE_CDROM device type. If so, then this patch is still not complete, because a DVD can be larger than 2G.

The patch must be installed in /usr/src/sys/dev/ata as:

patch -p < patch.file

Here is the patch:

*** atapi-all.c.orig Tue Jul 24 13:21:03 2001
--- atapi-all.c Wed Jul 25 16:03:47 2001
***************
*** 382,387 ****
--- 382,397 ----
          return ATA_OP_CONTINUES;

      case ATAPI_P_READ:
+     if (ATP_PARAM->device_type == ATAPI_TYPE_CDROM) {
+         if ((long)request->bytecount < 0) {
+             printf("%s: %s trying to read CD with bytecount = %lu\n",
+                    atp->devname, atapi_cmd2str(atp->cmd),
+                    (unsigned long)request->bytecount);
+             request->result =
+                 ATAPI_E_ILI | ATAPI_SK_ILLEGAL_REQUEST;
+             break;
+         }
+     }
      if (!(request->flags & ATPR_F_READ)) {
          request->result = inb(atp->controller->ioaddr + ATA_ERROR);
          printf("%s: %s trying to read on write buffer\n",

End of patch file.

Thank you in anticipation of your comments.

>Submitter-Id: current-users
>Originator: Ed Alley
>Organization: Lawrence Livermore National Lab.
>Confidential: no
>Synopsis: Kernel panics when mounting non-fixated CD
>Severity: non-critical
>Priority: low
>Category: kern
>Class: change-request
>Release: FreeBSD 4.3-RELEASE i386
>Environment: System: FreeBSD jordan.llnl.gov 4.3-RELEASE #0: Tue Jul 24 13:38:25 PDT 2001 wea@llnl.gov: /usr/src/sys/compile/JORDAN.5.ipfw i386

>Description: RE: PR kern/29203

I am running FreeBSD 4.3 with an IDE HP cd-writer 9500 series. I apologize if this is a double submit, but I never got a reply from gnats concerning my second submission.
This may have been due to a corrupted subject line which didn't have the RE: in it.

By looking at the code I have discovered that the bytecount should never exceed the maximum size of a signed integer! This can be seen in the file atapi-all.c in the routine atapi_queue_cmd(), which queues up the atapi requests. The bytecount is passed through the argument list of atapi_queue_cmd() as the variable count, which is typed as an int! Therefore, my concern about bytecount being limited to 2G was unfounded. On my machine ints and longs are both 32 bits, so limiting bytecount to 2G is correct.

The page fault was triggered by atapi_read() when it tried to read a bytecount = 0xffff8000, which is larger than 2G. It is curious that when we convert the hex to a signed long we get -32768, which is -2^(15), where ^ means power. According to <machine/limits.h> on my machine this is the value of the smallest short. If the hardware returns short integers and it thinks that -0 = 0x8000, this may be the problem, since 0x8000 would get sign extended for int32, so the argument count would be equal to -32768 and then bytecount would get set to 0xffff8000. (I'm speculating here since I don't know how the hardware works.)

(In view of the above discussion, perhaps a better place for a patch is in atapi_queue_cmd(): a simple test on the sign of count would suffice. I'll look at this also.)

I have previously submitted a patch to avoid the panic that resulted after I tried to mount a non-fixated ISO 9660 CD previously burned with my CD writer. The previous patch had the problem that it tested every ATAPI read for a large bytecount, including IDE disk reads. The patch below only tests CD reads and hence only applies the 2G limit to these.

>How-To-Repeat: Unless you have my hardware configuration you can't repeat the error. However, I don't doubt that other CDRWs may do something similar.

>Fix: The patch has been changed: it now tests the bytecount only for the case of reading a CD.
This eliminates the problem of limiting the bytecount to 2G for disk reads; the limit now applies only to CDs.

Note: I have tried to print out a meaningful message and also tried to get the atapi error mechanism to say something meaningful, but I was not too successful. If someone can improve this I would be grateful.

Also note: it may be better to do this in atapi_queue_cmd().

The patch must be installed in /usr/src/sys/dev/ata as:

patch -p < patch.file

Here is the patch:

*** atapi-all.c.orig Tue Jul 24 13:21:03 2001
--- atapi-all.c Wed Jul 25 16:03:47 2001
***************
*** 382,387 ****
--- 382,397 ----
          return ATA_OP_CONTINUES;

      case ATAPI_P_READ:
+     if (ATP_PARAM->device_type == ATAPI_TYPE_CDROM) {
+         if ((long)request->bytecount < 0) {
+             printf("%s: %s trying to read CD with bytecount = %lu\n",
+                    atp->devname, atapi_cmd2str(atp->cmd),
+                    (unsigned long)request->bytecount);
+             request->result =
+                 ATAPI_E_ILI | ATAPI_SK_ILLEGAL_REQUEST;
+             break;
+         }
+     }
      if (!(request->flags & ATPR_F_READ)) {
          request->result = inb(atp->controller->ioaddr + ATA_ERROR);
          printf("%s: %s trying to read on write buffer\n",

End of patch.

Thank you in anticipation of your comments.

Ed

Responsible Changed
From-To: freebsd-bugs->sos

sos is Mr. ATA

State Changed
From-To: open->closed

This is believed to be fixed in -stable (4.4) and -current. |
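The arithmetic described in this report can be reproduced outside the kernel. What follows is only a minimal user-space sketch in C, not the driver code: the function names consume_unchecked(), bytecount_is_bogus() and bytecount_from_short() are hypothetical, and the 32-bit unsigned type for the count is an assumption based on the values gdb printed in the description. The last function illustrates the reporter's own speculation about a sign-extended short; it is not a confirmed description of the hardware.

```c
#include <stdint.h>

/*
 * Hypothetical model of the subtraction described in the report: the
 * transfer size is taken off an unsigned byte count with no check that
 * enough bytes remain, so the result wraps modulo 2^32.
 */
uint32_t consume_unchecked(uint32_t bytecount, uint32_t size)
{
    return bytecount - size;
}

/*
 * Same idea as the check in the proposed patch, which casts the count
 * to a signed type and tests for a negative value. Written here as an
 * unsigned comparison against INT32_MAX so that no out-of-range
 * conversion to a signed type is needed.
 */
int bytecount_is_bogus(uint32_t bytecount)
{
    return bytecount > (uint32_t)INT32_MAX;
}

/*
 * Illustration of the sign-extension hypothesis: a 16-bit signed value
 * is widened to int (as the count argument would be) and then stored
 * in an unsigned 32-bit byte count.
 */
uint32_t bytecount_from_short(int16_t raw)
{
    int count = raw;          /* sign extension happens here */
    return (uint32_t)count;   /* negative values wrap to large counts */
}
```

Under these assumptions, consume_unchecked(0, 32768) gives 4294934528 (0xffff8000), the value seen in the gdb session, and bytecount_from_short(INT16_MIN) gives the same number. bytecount_is_bogus() flags that value while accepting an ordinary count such as 32768.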