This has been going on since before 5.1-R. Nate is working it with me, but asked for a PR.

DMESG:
Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.1-CURRENT #44: Sun Aug 31 18:06:58 CDT 2003
    ler@lerlaptop.lerctr.org:/usr/obj/usr/src/sys/LERLAPTOP
Preloaded elf kernel "/boot/kernel/kernel" at 0xc05a3000.
Preloaded acpi_dsdt "/boot/DSDT.aml" at 0xc05a3294.
Preloaded elf module "/boot/kernel/acpi.ko" at 0xc05a32d8.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) III Mobile CPU 1133MHz (1129.57-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x6b1 Stepping = 1
  Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory = 527958016 (503 MB)
avail memory = 506515456 (483 MB)
Pentium Pro MTRR support enabled
ACPI: DSDT was overridden.
ACPI-0375: *** Info: Table [DSDT] replaced by host OS
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <FUJ ERG > on motherboard
pcibios: BIOS version 2.10
Using $PIR table, 9 entries at 0xc00fdf30
acpi0: power button is handled as a fixed feature programming model.
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0xfc08-0xfc0b on acpi0
acpi_cpu0: <CPU> port 0x530-0x537 on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib0: slot 2 INTA is routed to irq 11
pcib0: slot 29 INTA is routed to irq 11
pcib0: slot 29 INTB is routed to irq 11
pcib0: slot 31 INTB is routed to irq 11
pcib0: slot 31 INTB is routed to irq 11
pcib0: slot 31 INTB is routed to irq 11
agp0: <Intel 82830M (830M GMCH) SVGA controller> mem 0xe0000000-0xe007ffff,0xe8000000-0xefffffff irq 11 at device 2.0 on pci0
agp0: detected 8060k stolen memory
agp0: aperture size is 128M
pci0: <display> at device 2.1 (no driver attached)
uhci0: <Intel 82801CA/CAM (ICH3) USB controller USB-A> port 0x18c0-0x18df irq 11 at device 29.0 on pci0
usb0: <Intel 82801CA/CAM (ICH3) USB controller USB-A> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 82801CA/CAM (ICH3) USB controller USB-B> port 0x18e0-0x18ff irq 11 at device 29.1 on pci0
usb1: <Intel 82801CA/CAM (ICH3) USB controller USB-B> on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pcib1: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib1: slot 10 INTA is routed to irq 11
pcib1: slot 10 INTB is routed to irq 11
pcib1: slot 13 INTA is routed to irq 11
pcib1: slot 14 INTA is routed to irq 11
cbb0: <TI1520 PCI-CardBus Bridge> irq 11 at device 10.0 on pci1
start (88000000) < sc->membase (e0200000)
end (ffffffff) > sc->memlimit (e02fffff)
cardbus0: <CardBus bus> on cbb0
pccard0: <16-bit PCCard bus> on cbb0
cbb1: <TI1520 PCI-CardBus Bridge> irq 11 at device 10.1 on pci1
start (88000000) < sc->membase (e0200000)
end (ffffffff) > sc->memlimit (e02fffff)
cardbus1: <CardBus bus> on cbb1
pccard1: <16-bit PCCard bus> on cbb1
rl0: <RealTek 8139 10/100BaseTX, rev. C> port 0x2000-0x20ff mem 0xe0200800-0xe02008ff irq 11 at device 13.0 on pci1
rl0: Ethernet address: 00:e0:00:7e:d0:45
miibus0: <MII bus> on rl0
rlphy0: <RealTek internal media interface> on miibus0
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fwohci0: vendor=10cf, dev=2010
fwohci0: <1394 Open Host Controller Interface> mem 0xe0200000-0xe02007ff irq 11 at device 14.0 on pci1
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channel is 4.
fwohci0: EUI64 00:00:0e:10:00:70:a8:72
fwohci0: Phy 1394a available S400, 3 ports.
fwohci0: Link S400, max_rec 1024 bytes.
fwohci0: max_rec 1024 -> 2048
firewire0: <IEEE1394(FireWire) bus> on fwohci0
if_fwe0: <Ethernet over FireWire> on firewire0
if_fwe0: Fake Ethernet address: 02:00:0e:70:a8:72
sbp0: <SBP2/SCSI over firewire> on firewire0
fwohci0: Initiate bus reset
fwohci0: BUS reset
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH3 UDMA100 controller> port 0x1c20-0x1c2f,0x374-0x377,0x170-0x177,0x3f4-0x3f7,0x1f0-0x1f7 mem 0xe0100000-0xe01003ff at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
ichsmb0: <Intel 82801CA (ICH3) SMBus controller> port 0x1c00-0x1c1f irq 11 at device 31.3 on pci0
smbus0: <System Management Bus> on ichsmb0
smb0: <SMBus generic I/O> on smbus0
pcm0: <Intel ICH3 (82801CA)> port 0x1880-0x18bf,0x1000-0x10ff irq 11 at device 31.5 on pci0
pcm0: <SigmaTel STAC9756/57 AC97 Codec>
pci0: <simple comms> at device 31.6 (no driver attached)
acpi_button0: <Power Button> on acpi0
acpi_lid0: <Control Method Lid Switch> on acpi0
acpi_acad0: <AC adapter> on acpi0
acpi_cmbat0: <Control method Battery> on acpi0
speaker0 port 0x61 on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model IntelliMouse, device ID 3
acpi_ec0: <embedded controller: GPE 0x17> port 0x66,0x62 on acpi0
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
sio1 port 0x2e8-0x2ef irq 3 on acpi0
sio1: type 16550A
ppc0 port 0x778-0x77b,0x378-0x37f irq 7 drq 1 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/15 bytes threshold
ppbus0: <Parallel port bus> on ppc0
ppi0: <Parallel I/O> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
orm0: <Option ROMs> at iomem 0xcc800-0xcefff,0xc0000-0xcc7ff on isa0
pmtimer0 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 1129573997 Hz quality 800
Timecounters tick every 10.000 msec
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
fwohci0: txd err= e unknown event
acpi_acad0: acline initialization start
acpi_acad0: On Line
acpi_acad0: acline initialization done, tried 1 times
acpi_cmbat0: battery initialization start
acpi_cmbat0: battery initialization done, tried 1 times
start (88000000) < sc->membase (e0200000)
end (ffffffff) > sc->memlimit (e02fffff)
start (88000000) < sc->membase (e0200000)
end (ffffffff) > sc->memlimit (e02fffff)
wi0: <The Linksys Group, Inc. Instant Wireless Network PC Card> at port 0x100-0x13f irq 11 function 0 config 1 on pccard1
wi0: 802.11 address: 00:06:25:18:1a:37
wi0: using RF:PRISM3(PCMCIA)
wi0: Intersil Firmware: Primary (1.1.0), Station (1.4.2)
wi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
ad0: 57231MB <FUJITSU MHS2060AT> [116280/16/63] at ata0-master UDMA100
acd0: CDRW <SONY CD-RW CRX800E> at ata1-master WDMA2
Mounting root from ufs:/dev/ad0s2a

How-To-Repeat:
Transition this laptop to battery.
Responsible Changed
From-To: freebsd-bugs->njl

I'm working on this.
A resource string was overwriting the GPE MethodNode pointer that was at 0xc14e3a00. Luckily it was at the same address on each boot, so we used a hardware watchpoint (hwatch) in ddb. The #3 case below is the culprit.

---------- Forwarded message ----------
Date: Wed, 17 Sep 2003 19:18:02 -0500
From: Larry Rosenman <ler@lerctr.org>
To: Nate Lawson <nate@root.org>
Cc: marks@ripe.net

--On Wednesday, September 17, 2003 00:01:55 -0700 Nate Lawson <nate@root.org> wrote:

> Thanks. While the dump was too late (since the data had already been
> overwritten), it's useful to have your kernel debug file. I have some
> exact steps for you to do and then I can really track it down.
>
> 1. boot -sd
> 2. hwatch 0xc14e3a00
> 3. c
> 4. At each breakpoint, type tr. Use scroll lock and arrow keys to get to
> the top of the buffer. I only need the top 4-6 function names before
> "breakpoint()" or "calltrap()" but for the top one (probably
> AcpiSomething), I need FuncName + 0xwhatever.
> 5. After you've written that down, goto step 3. Keep doing this until you
> get the panic.
> 6. Don't bother doing a dump. Give me the above function trace for each
> run.

Here we go:

1) generic_bzero
   AcpiEvCreateGpeInfoBlocks

   I figure this is normal, so I didn't go farther in the trace.

2) AcpiEvSaveMethodInfo+0xA1
   AcpiNsWalkNameSpace
   AcpiCreateGpeBlock
   AcpiEvGpeInitialize

3) AcpiRsEndTagResource+0x17
   AcpiRsByteStreamToList
   AcpiRsCreateResourceList
   AcpiRsGetCrsMethodData

then we get the system to a prompt, and the standard panic on transition

LER
Here is where we find the exact resource list that is overwriting the pointer. I built a custom acpica interpreter with a small program attached to step through the resource list decoding routine. This also shows it was the end tag that was finally hitting MethodNode:

(gdb) l *AcpiRsEndTagResource+0x17
0xc0169397 is in AcpiRsEndTagResource (/usr/src/sys/contrib/dev/acpica/rsmisc.c:174).
169         OutputStruct->Id = ACPI_RSTYPE_END_TAG;
170
171         /*
172          * Set the Length parameter
173          */
174         OutputStruct->Length = 0;
175
176         /*
177          * Return the final size of the structure
178          */

---------- Forwarded message ----------
Date: Thu, 18 Sep 2003 14:03:06 -0700 (PDT)
From: Nate Lawson <nate@root.org>
To: Larry Rosenman <ler@lerctr.org>
Cc: marks@ripe.net
Subject: Re: acpi panics

On Thu, 18 Sep 2003, Larry Rosenman wrote:
> That's better. One hit. dmesg attached.

Ok, I have isolated the problem to one particular resource, \PCI0\RSRC. It has various addr 16 and addr 32 types. I am certain the bug is in how ACPICA parses this buffer. Below is the output from two runs. The first is when it calculates the space needed. The second is when it stores the data in the buffer. As you can see, the first run allocates 488 bytes and then the second overflows this by 36 bytes.

Calc pass
---
Resource 88 len 64
Resource 88 len 28
Resource 88 len 64
Resource 88 len 64
Resource 87 len 64
Resource 87 len 64
Resource 87 len 64
Resource 87 len 64
Resource 87 len 12
total len 488

Parse pass
---
size 76
size 28
size 68
size 68
size 68
size 68
size 68
size 68
size 12
total len 524

And it so happens that the buf at 0xc14e3800 + 524 bytes is [drumroll please]
0xc14e3a0c

Which is exactly 12 bytes into the GPE eventinfo buffer. Now I just have to untangle the rsaddr.c functions for WORD/DWORD values and see just WHAT THEIR PROBLEM IS!

-Nate
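
The arithmetic in the message above can be checked with a few lines of C. This is only a sketch: the per-resource sizes and the two addresses are copied from the debug output and the earlier ddb session, while the helper names (total_len, overrun, end_of_write) are made up for illustration and are not part of ACPICA.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-resource sizes copied from the two debug runs quoted above. */
static const size_t calc_pass[]  = { 64, 28, 64, 64, 64, 64, 64, 64, 12 };
static const size_t parse_pass[] = { 76, 28, 68, 68, 68, 68, 68, 68, 12 };

/* Sum n per-resource sizes. */
static size_t total_len(const size_t *sizes, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += sizes[i];
    return total;
}

/* Bytes written past the end of an allocation of `allocated` bytes. */
static size_t overrun(size_t allocated, size_t written)
{
    return written > allocated ? written - allocated : 0;
}

/* First address past the written data (32-bit i386 kernel addresses). */
static uint32_t end_of_write(uint32_t base, size_t written)
{
    return base + (uint32_t)written;
}
```

With these inputs total_len() gives 488 for the calc pass and 524 for the parse pass, overrun(488, 524) is 36, and end_of_write(0xc14e3800, 524) is 0xc14e3a0c, which is 12 bytes past the watched address 0xc14e3a00.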
Here is the offending Buffer from his ASL:

\PCI0\RSRC
Name(RSRC, Buffer(0xa9) {
    0x88, 0xe, 0x0, 0x2, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0,
    0x47, 0x1, 0xf8, 0xc, 0xf8, 0xc, 0x1, 0x8,
    0x88, 0xe, 0x0, 0x1, 0xc, 0x3, 0x0, 0x0, 0x0, 0x0, 0xf7, 0xc, 0x0, 0x0, 0xf8, 0xc, 0x0,
    0x88, 0xe, 0x0, 0x1, 0xc, 0x3, 0x0, 0x0, 0x0, 0xd, 0xff, 0xff, 0x0, 0x0, 0x0, 0xf3, 0x0,
    0x87, 0x18, 0x0, 0x0, 0xc, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xa, 0x0, 0xff, 0xff, 0xb, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0, 0x0,
    0x87, 0x18, 0x0, 0x0, 0xc, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc, 0x0, 0xff, 0xff, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0,
    0x87, 0x18, 0x0, 0x0, 0xc, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xd, 0x0, 0xff, 0xff, 0xe, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0, 0x0,
    0x87, 0x18, 0x0, 0x0, 0xc, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0xbf, 0xfe, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
    0x79, 0x0
})

And my interpretation of it:

88 e 0 2 c 0 0 0 0 0 ff 0 0 0 0 1 0
    WORD address
47 1 f8 c f8 c 1 8
    IO port
88 e 0 1 c 3 0 0 0 0 f7 c 0 0 f8 c 0
    WORD address
88 e 0 1 c 3 0 0 0 d ff ff 0 0 0 f3 0
    WORD address
87 18 0 0 c 3 0 0 0 0 0 0 a 0 ff ff b 0 0 0 0 0 0 0 2 0 0
    DWORD address
87 18 0 0 c 3 0 0 0 0 0 f0 c 0 0 f0 c 0 0 0 0 0 0 0 0 0 0
    DWORD address
87 18 0 0 c 3 0 0 0 0 0 f0 c 0 ff ff d 0 0 0 0 0 0 10 1 0 0
    DWORD address
87 18 0 0 c 3 0 0 0 0 0 0 0 20 ff ff bf fe 0 0 0 0 0 0 c0 de 0
    DWORD address
79 0
    End tag
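
For anyone wanting to repeat the split into descriptors, here is a minimal sketch of a walker over the byte stream. It relies only on the generic ACPI encoding (bit 7 of the tag byte set means a large item with a 16-bit little-endian length in the next two bytes; otherwise the low three bits are the length of a small item and type 0xf is the end tag). The names pr_rsrc, descriptor_size and walk_template are invented for this example and this is not how ACPICA itself is structured.

```c
#include <stddef.h>
#include <stdint.h>

/* The 0xa9-byte RSRC buffer from the ASL above, one descriptor per row. */
static const uint8_t pr_rsrc[] = {
    0x88, 0x0e, 0x00, 0x02, 0x0c, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00,
    0x47, 0x01, 0xf8, 0x0c, 0xf8, 0x0c, 0x01, 0x08,
    0x88, 0x0e, 0x00, 0x01, 0x0c, 0x03, 0x00, 0x00, 0x00, 0x00, 0xf7, 0x0c, 0x00, 0x00, 0xf8, 0x0c, 0x00,
    0x88, 0x0e, 0x00, 0x01, 0x0c, 0x03, 0x00, 0x00, 0x00, 0x0d, 0xff, 0xff, 0x00, 0x00, 0x00, 0xf3, 0x00,
    0x87, 0x18, 0x00, 0x00, 0x0c, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0a, 0x00, 0xff, 0xff, 0x0b,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00,
    0x87, 0x18, 0x00, 0x00, 0x0c, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0c, 0x00, 0xff, 0xff, 0x0c,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00,
    0x87, 0x18, 0x00, 0x00, 0x0c, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0d, 0x00, 0xff, 0xff, 0x0e,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00,
    0x87, 0x18, 0x00, 0x00, 0x0c, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xbf,
    0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x79, 0x00
};

struct walk_result {
    size_t descriptors; /* descriptors seen, including the end tag */
    size_t bytes;       /* bytes consumed up to and including the end tag */
    int    ok;          /* 1 if an end tag was reached inside the buffer */
};

/* Total size of the descriptor at p, or 0 if the header is truncated. */
static size_t descriptor_size(const uint8_t *p, size_t avail)
{
    if (avail == 0)
        return 0;
    if (p[0] & 0x80) {                 /* large item: tag + 16-bit length */
        if (avail < 3)
            return 0;
        return (size_t)(p[1] | (p[2] << 8)) + 3;
    }
    return (size_t)(p[0] & 0x07) + 1;  /* small item: tag + 0..7 bytes */
}

/* Walk descriptors until the end tag or until the buffer runs out. */
static struct walk_result walk_template(const uint8_t *buf, size_t len)
{
    struct walk_result r = { 0, 0, 0 };

    while (r.bytes < len) {
        size_t n = descriptor_size(buf + r.bytes, len - r.bytes);
        if (n == 0 || n > len - r.bytes)
            break;                     /* descriptor runs off the buffer */
        int is_end = (buf[r.bytes] & 0x80) == 0 &&
                     ((buf[r.bytes] >> 3) & 0x0f) == 0x0f;
        r.descriptors++;
        r.bytes += n;
        if (is_end) {
            r.ok = 1;
            break;
        }
    }
    return r;
}
```

Called as walk_template(pr_rsrc, sizeof pr_rsrc), this finds 9 descriptors and consumes all 169 (0xa9) bytes, matching the nine rows in the interpretation above.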
Here is the summary and patch.

---------- Forwarded message ----------
Date: Fri, 19 Sep 2003 08:09:02 -0700 (PDT)
From: Nate Lawson <nate@root.org>
To: acpi-jp@jp.freebsd.org
Cc: acpi-devel@lists.sourceforge.net
Subject: [PATCH] invalid resource lists and extra checking

A FreeBSD user has been having a problem where his system panics on transition to battery from AC. The PR is below:
http://www.freebsd.org/cgi/query-pr.cgi?pr=56254

After some tracking, I found that his GPE block was being overwritten by the end of a resource list (type Address16). It also happened for Address32. I did some more debugging and found that he does not have a Resource Index or Resource Source String but his resources have an extra trailing zero byte on Address type resources (but not fixed-length resources like IO ports). For instance, here is an Address16 resource:

0x88, 0xe, 0x0, 0x2, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0

As you can see, it has a length of 14 (total length 17) which is one extra byte but there is no Resource Source String. The spec explicitly says on page 194 (Table 6-27):

====
Byte 16 -- Resource Source Index (Optional)
Only present if Resource Source (below) is present. This field gives an
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
index to the specific resource descriptor that this device consumes from
in the current resource template for the device object pointed to in
Resource Source.

String -- Resource Source (Optional)
If present, the device ...
====

This can be read that for an Address16, valid lengths are 13 and > 14 since if Resource Index is present, Resource Source String also must be present. I'm not sure if a zero-length string (other than the terminating NUL) is valid. If not, then the valid lengths are 13 and > 15. This is already assumed by AcpiRsGetListLength() where it subtracts the length from 14 to get the length of the Resource Source String.

This allowed me to develop the following patch for Address16. It also includes extra checks that the length meets the minimum specified by the standard and for the extended interrupt resource, that there is at least one interrupt present. Let me know that I can commit this to our vendor branch and can expect it in a future release.

-Nate

Index: rsaddr.c
===================================================================
RCS file: /home/ncvs/src/sys/contrib/dev/acpica/rsaddr.c,v
retrieving revision 1.1.1.11
diff -u -r1.1.1.11 rsaddr.c
--- rsaddr.c 13 Jul 2003 22:43:31 -0000 1.1.1.11
+++ rsaddr.c 19 Sep 2003 04:59:50 -0000
@@ -168,6 +168,10 @@
     Buffer += 1;
     ACPI_MOVE_16_TO_16 (&Temp16, Buffer);
+    /* Check for the minimum length. */
+    if (Temp16 < 13)
+        return_ACPI_STATUS (AE_AML_INVALID_RESOURCE_TYPE);
+
     *BytesConsumed = Temp16 + 3;
     OutputStruct->Id = ACPI_RSTYPE_ADDRESS16;
@@ -275,11 +279,13 @@
     /*
      * This will leave us pointing to the Resource Source Index
      * If it is present, then save it off and calculate the
-     * pointer to where the null terminated string goes:
-     * Each Interrupt takes 32-bits + the 5 bytes of the
-     * stream that are default.
+     * pointer to where the null terminated string goes.
+     *
+     * Note that some buggy resources have a length that indicates the
+     * Index byte is present even though it isn't (since there is no
+     * following Resource String.) We add one to catch these.
      */
-    if (*BytesConsumed > 16)
+    if (*BytesConsumed > 16 + 1)
     {
         /* Dereference the Index */
@@ -555,6 +561,10 @@
      */
     Buffer += 1;
     ACPI_MOVE_16_TO_16 (&Temp16, Buffer);
+
+    /* Check for the minimum length. */
+    if (Temp16 < 23)
+        return_ACPI_STATUS (AE_AML_INVALID_RESOURCE_TYPE);
     *BytesConsumed = Temp16 + 3;
     OutputStruct->Id = ACPI_RSTYPE_ADDRESS32;
@@ -667,9 +677,13 @@
     /*
      * This will leave us pointing to the Resource Source Index
      * If it is present, then save it off and calculate the
-     * pointer to where the null terminated string goes:
+     * pointer to where the null terminated string goes.
+     *
+     * Note that some buggy resources have a length that indicates the
+     * Index byte is present even though it isn't (since there is no
+     * following Resource String.) We add one to catch these.
      */
-    if (*BytesConsumed > 26)
+    if (*BytesConsumed > 26 + 1)
     {
         /* Dereference the Index */
@@ -944,7 +958,11 @@
     Buffer += 1;
     ACPI_MOVE_16_TO_16 (&Temp16, Buffer);
+    /* Check for the minimum length. */
+    if (Temp16 < 43)
+        return_ACPI_STATUS (AE_AML_INVALID_RESOURCE_TYPE);
     *BytesConsumed = Temp16 + 3;
+
     OutputStruct->Id = ACPI_RSTYPE_ADDRESS64;
     /*
@@ -1056,11 +1074,13 @@
     /*
      * This will leave us pointing to the Resource Source Index
      * If it is present, then save it off and calculate the
-     * pointer to where the null terminated string goes:
-     * Each Interrupt takes 32-bits + the 5 bytes of the
-     * stream that are default.
+     * pointer to where the null terminated string goes.
+     *
+     * Note that some buggy resources have a length that indicates the
+     * Index byte is present even though it isn't (since there is no
+     * following Resource String.) We add one to catch these.
      */
-    if (*BytesConsumed > 46)
+    if (*BytesConsumed > 46 + 1)
     {
         /* Dereference the Index */
Index: rsirq.c
===================================================================
RCS file: /home/ncvs/src/sys/contrib/dev/acpica/rsirq.c,v
retrieving revision 1.1.1.12
diff -u -r1.1.1.12 rsirq.c
--- rsirq.c 13 Jul 2003 22:43:39 -0000 1.1.1.12
+++ rsirq.c 19 Sep 2003 05:03:51 -0000
@@ -408,7 +408,11 @@
     Buffer += 1;
     ACPI_MOVE_16_TO_16 (&Temp16, Buffer);
+    /* Check for the minimum length. */
+    if (Temp16 < 6)
+        return_ACPI_STATUS (AE_AML_INVALID_RESOURCE_TYPE);
     *BytesConsumed = Temp16 + 3;
+
     OutputStruct->Id = ACPI_RSTYPE_EXT_IRQ;
     /*
@@ -446,6 +450,12 @@
     Buffer += 1;
     Temp8 = *Buffer;
+    /* Minimum number of IRQs is one. */
+    if (Temp8 < 1) {
+        *BytesConsumed = 0;
+        return_ACPI_STATUS (AE_AML_INVALID_RESOURCE_TYPE);
+    }
+
     OutputStruct->Data.ExtendedIrq.NumberOfInterrupts = Temp8;
     /*
@@ -480,7 +490,8 @@
      * stream that are default.
      */
     if (*BytesConsumed >
-        ((ACPI_SIZE) OutputStruct->Data.ExtendedIrq.NumberOfInterrupts * 4) + 5)
+        ((ACPI_SIZE) OutputStruct->Data.ExtendedIrq.NumberOfInterrupts * 4) +
+        5 + 1)
     {
         /* Dereference the Index */
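
The length rule the patch applies to the Address descriptors can be restated as a small standalone classifier. This is a sketch for illustration only, not code from ACPICA: the enum and the function name classify_addr_length are hypothetical, and the minimum lengths (13, 23 and 43 for Address16, Address32 and Address64) are the ones used in the patch above.

```c
#include <stdint.h>

enum addr_len_kind {
    ADDR_TOO_SHORT,   /* below the minimum: the patch rejects the resource */
    ADDR_NO_SOURCE,   /* exactly the minimum: no Index, no Source String */
    ADDR_STRAY_BYTE,  /* minimum + 1: claims an Index byte but no string follows */
    ADDR_HAS_SOURCE   /* longer: Index byte plus Resource Source String */
};

/*
 * Classify the length field of an Address descriptor.
 * min_len is 13 for Address16, 23 for Address32, 43 for Address64.
 */
static enum addr_len_kind classify_addr_length(uint16_t length, uint16_t min_len)
{
    if (length < min_len)
        return ADDR_TOO_SHORT;
    if (length == min_len)
        return ADDR_NO_SOURCE;
    if (length == min_len + 1)
        return ADDR_STRAY_BYTE;   /* the buggy case seen in this PR */
    return ADDR_HAS_SOURCE;
}
```

Both descriptor types in this PR land in the stray-byte case: the Address16 resources have length 0xe (14, one more than 13) and the Address32 resources have length 0x18 (24, one more than 23).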
State Changed
From-To: open->closed

Patch committed and Intel will integrate the changes. User reports that both the battery transition panic and the shutdown panic are gone.