Summary: | Xfree86's savage and vesa drivers can panic the kernel! | ||
---|---|---|---|
Product: | Base System | Reporter: | Nate Dannenberg <natedac> |
Component: | i386 | Assignee: | msmith |
Status: | Closed FIXED | ||
Severity: | Affects Only Me | ||
Priority: | Normal | ||
Version: | Unspecified | ||
Hardware: | Any | ||
OS: | Any |
Description
Nate Dannenberg
2001-03-21 04:40:01 UTC
I have come up with additional information regarding this problem, in that I now know how to get FreeBSD to provide the crash information for me at startup, via its crashdump feature (see dumpon(8) and savecore(8)).

The problem was that XFree86 4.0.2 (and 4.0.3) will crash my system, an IBM Aptiva (550 MHz Athlon, 96M RAM, Number Nine SR9 with S3 Savage4 chipset and 8M RAM), if I use the "savage" or "vesa" drivers. Using the "vga" driver provides a temporary solution; however, it's not a suitable replacement due to the very nature of plain VGA modes.

----- Text Import Begin -----
su-2.04# gdb -k
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd".
(kgdb) symbol-file kernel
Reading symbols from kernel...(no debugging symbols found)...done.
(kgdb) exec-file /usr/crash/kernel.0
(kgdb) core-file /usr/crash/vmcore.0
IdlePTD 3465216
initial pcb at 2ba4a0
panicstr: general protection fault
panic messages:
---
Fatal trap 9: general protection fault while in kernel mode
instruction pointer = 0x8:0xc024b4a3
stack pointer = 0x10:0xc76a9cf0
frame pointer = 0x10:0xc76a9d10
code segment = base rx0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 3
current process = 344 (XFree86)
interrupt mask = none
trap number = 9
panic: general protection fault

syncing disks... 30 29 21 10 done
Uptime: 11m46s

dumping to dev #ad/0x20001, offset 205184
dump ata0: resetting devices .. done
95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
---
#0 0xc013ebba in dumpsys ()
(kgdb) where
#0 0xc013ebba in dumpsys ()
#1 0xc013e9db in boot ()
#2 0xc013ed58 in poweroff_wait ()
#3 0xc02575bd in trap_fatal ()
#4 0xc0256fcb in trap ()
#5 0xc024b4a3 in i686_mrstoreone ()
#6 0xc024b3e9 in i686_mrstore ()
#7 0xc024ba21 in i686_mrset ()
#8 0xc0252a81 in mem_range_attr_set ()
#9 0xc02529fd in mem_ioctl ()
#10 0xc02528b4 in mmioctl ()
#11 0xc017545e in spec_ioctl ()
#12 0xc0175189 in spec_vnoperate ()
#13 0xc0209a3d in ufs_vnoperatespec ()
#14 0xc0171a44 in vn_ioctl ()
#15 0xc014d10e in ioctl ()
#16 0xc0257869 in syscall2 ()
#17 0xc0249b95 in Xint0x80_syscall ()
#18 0x8091059 in ?? ()
#19 0x80955c7 in ?? ()
#20 0x87760dd in ?? ()
#21 0x8758e32 in ?? ()
#22 0x806b7cc in ?? ()
#23 0x80bab12 in ?? ()
#24 0x806b045 in ?? ()
(kgdb) quit

I'll provide more information if I can. As I write this, I have a new kernel compiling that includes the debugging symbols and the debugger itself (my existing copy lacks these items).

--
/~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~___~~~~~\
| natedac@kscable.com //Z@|___ |
| http://home.kscable.com/natedac |'(__ [_< |
\_C64/C128_-_What's_*YOUR*_hobby?__\___|____/

I've managed to get some new info about this problem. I managed to get a proper debug kernel built and have extracted some information from it using gdb as per the handbook. You may want to delete the panic message and other information I posted earlier; it's incomplete anyways.

bash-2.04# gdb -k kernel.debug /usr/crash/vmcore.2
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
IdlePTD 3457024
initial pcb at 2ba4c0
panicstr: general protection fault
panic messages:
---
Fatal trap 9: general protection fault while in kernel mode
instruction pointer = 0x8:0xc024b4b3
stack pointer = 0x10:0xc76b4cf0
frame pointer = 0x10:0xc76b4d10
code segment = base rx0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 3
current process = 330 (XFree86)
interrupt mask = none
trap number = 9
panic: general protection fault

syncing disks... 33 32 23 12 done
Uptime: 3m14s

dumping to dev #ad/0x20001, offset 205184
dump ata0: resetting devices .. done
95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
---
#0 dumpsys () at ../../kern/kern_shutdown.c:469
469 if (dumping++) {
(kgdb) where
#0 dumpsys () at ../../kern/kern_shutdown.c:469
#1 0xc013e9db in boot (howto=256) at ../../kern/kern_shutdown.c:309
#2 0xc013ed58 in poweroff_wait (junk=0xc028e1c5, howto=-957278880) at ../../kern/kern_shutdown.c:556
#3 0xc02575cd in trap_fatal (frame=0xc76b4cb0, eva=0) at ../../i386/i386/trap.c:951
#4 0xc0256fdb in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi = -1,
    tf_esi = -1, tf_ebp = -949269232, tf_isp = -949269284, tf_ebx = -1,
    tf_edx = -1, tf_ecx = 592, tf_eax = -1, tf_trapno = 9, tf_err = 0,
    tf_eip = -1071336269, tf_cs = 8, tf_eflags = 77974, tf_esp = 65535,
    tf_ss = 0}) at ../../i386/i386/trap.c:613
#5 0xc024b4b3 in i686_mrstoreone (arg=0xc02d7ed4) at ../../i386/i386/i686_mem.c:290
#6 0xc024b3f9 in i686_mrstore (sc=0xc02d7ed4) at ../../i386/i386/i686_mem.c:253
#7 0xc024ba31 in i686_mrset (sc=0xc02d7ed4, mrd=0xc09d81e0, arg=0xc76b4eb0) at ../../i386/i386/i686_mem.c:489
#8 0xc0252a91 in mem_range_attr_set (mrd=0xc09d81e0, arg=0xc76b4eb0) at ../../i386/i386/mem.c:439
#9 0xc0252a0d in mem_ioctl (dev=0xc02b8ebc, cmd=2148298035, data=0xc76b4eac "\bù¿¿", flags=3, p=0xc6f11560) at ../../i386/i386/mem.c:402
#10 0xc02528c4 in mmioctl (dev=0xc02b8ebc, cmd=2148298035, data=0xc76b4eac "\bù¿¿", flags=3, p=0xc6f11560) at ../../i386/i386/mem.c:338
#11 0xc017544e in spec_ioctl (ap=0xc76b4de8) at ../../miscfs/specfs/spec_vnops.c:306
#12 0xc0175179 in spec_vnoperate (ap=0xc76b4de8) at ../../miscfs/specfs/spec_vnops.c:119
#13 0xc0209a09 in ufs_vnoperatespec (ap=0xc76b4de8) at ../../ufs/ufs/ufs_vnops.c:2391
#14 0xc0171a34 in vn_ioctl (fp=0xc0acd740, com=2148298035, data=0xc76b4eac "\bù¿¿", p=0xc6f11560) at vnode_if.h:429
#15 0xc014d0fe in ioctl (p=0xc6f11560, uap=0xc76b4f80) at ../../sys/file.h:178
#16 0xc0257879 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
    tf_edi = 983040, tf_esi = 65536, tf_ebp = -1077937884,
    tf_isp = -949268524, tf_ebx = -1077937912, tf_edx = 673607984,
    tf_ecx = 141856768, tf_eax = 54, tf_trapno = 12, tf_err = 2,
    tf_eip = 673244900, tf_cs = 31, tf_eflags = 12951, tf_esp = -1077937976,
    tf_ss = 47}) at ../../i386/i386/trap.c:1150
#17 0xc0249ba5 in Xint0x80_syscall ()
#18 0x8091059 in ?? ()
#19 0x80955c7 in ?? ()
#20 0x872a0dd in ?? ()
#21 0x8710e32 in ?? ()
#22 0x806b7cc in ?? ()
#23 0x80bab12 in ?? ()
#24 0x806b045 in ?? ()
(kgdb) up 4
#4 0xc0256fdb in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi = -1,
    tf_esi = -1, tf_ebp = -949269232, tf_isp = -949269284, tf_ebx = -1,
    tf_edx = -1, tf_ecx = 592, tf_eax = -1, tf_trapno = 9, tf_err = 0,
    tf_eip = -1071336269, tf_cs = 8, tf_eflags = 77974, tf_esp = 65535,
    tf_ss = 0}) at ../../i386/i386/trap.c:613
613 trap_fatal(&frame, eva);
(kgdb) quit

Problem has been solved by a workaround in the kernel source. Thanks to the help of some of the guys on #BSDcode on EFnet (hi BigSpoon!).
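
As a cross-check on the second backtrace, the signed decimal values gdb prints in the trap frame can be converted back to register contents, and the `cmd` argument of `mem_ioctl()` can be split into its ioctl fields. The sketch below is only an illustration: the helper names `reg_from_gdb` and `ioc_decode` are hypothetical, and the bit layout is the one assumed from BSD's `<sys/ioccom.h>`. With these helpers, `tf_eip = -1071336269` comes out as 0xc024b4b3 (the reported instruction pointer), `tf_ecx = 592` is 0x250, and `cmd = 2148298035` decodes to a 12-byte copy-in request, group 'm', number 51, which appears to correspond to `MEMRANGE_SET`. On x86, `wrmsr` takes the register index in ECX, and 0x250 is the first fixed-range MTRR register, so the dump is at least consistent with a rejected MTRR write; it does not prove it.

```c
#include <stdint.h>

/* gdb prints 32-bit trap-frame fields as signed ints; recover the raw bits. */
static uint32_t reg_from_gdb(int32_t v)
{
    return (uint32_t)v; /* conversion to unsigned is modulo 2^32 */
}

/* Fields of a BSD ioctl command word (layout assumed from <sys/ioccom.h>). */
struct ioc_fields {
    unsigned dir;   /* top 3 bits: 4 = copy in, 2 = copy out, 1 = no data */
    unsigned len;   /* parameter length in bytes (13 bits) */
    char     group; /* subsystem letter, e.g. 'm' for /dev/mem */
    unsigned num;   /* command number within the group */
};

static struct ioc_fields ioc_decode(uint32_t cmd)
{
    struct ioc_fields f;

    f.dir   = cmd >> 29;
    f.len   = (cmd >> 16) & 0x1fff;
    f.group = (char)((cmd >> 8) & 0xff);
    f.num   = cmd & 0xff;
    return f;
}
```

For example, `reg_from_gdb(-949269232)` gives 0xc76b4d10, which matches the frame pointer in the panic message.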
Apparently MTRR doesn't work on this particular processor, an AMD 550 MHz Athlon, or the kernel is somehow detecting it wrong. XFree86 4.4.x obviously is trying to use this feature somehow, and that was causing a kernel panic.

Temporary Workaround:

At line 572 of /sys/i386/i386/i686_mem.c, comment out the commands that attempt to detect the MTRR facility, like this:

---- snip ----
static void
i686_mem_drvinit(void *unused)
{
    /* Try for i686 MTRRs */
/*
    if ((cpu_feature & CPUID_MTRR) &&
        ((cpu_id & 0xf00) == 0x600) &&
        ((strcmp(cpu_vendor, "GenuineIntel") == 0) ||
        (strcmp(cpu_vendor, "AuthenticAMD") == 0))) {
        mem_range_softc.mr_op = &i686_mrops;
    }
*/
}

On Tue, 27 Mar 2001 natedac@kscable.com wrote:
> Athlon, or the kernel is somehow detecting it wrong. XFree86
> 4.4.x obviously is trying to use this feature somehow, and that was
> causing a kernel panic.

Erm... make that 4.0.x (i.e. 4.0.2, 4.0.3)

Thanks to lots of help from BigSpoon on #BSDcode/EFnet (jhb@FreeBSD.org), here's a more proper way to fix the MTRR problem that plagues some machines like mine. This will allow the MTRR support to be disabled at will either in /boot/loader.conf or at the boot prompt.

It appears that today the copy of this file sitting on ftp.freebsd.org has been changed slightly (so a patch/diff against my week-old copy would be of little use anyways ;-)

Anyways, here we go:

Edit /usr/src/sys/i386/i386/i686_mem.c

Somewhere near the start of the file, add these two lines:

------------------------------------------------------------------------------
static int mtrrs_disabled;
TUNABLE_INT_DECL("machdep.mtrrs_disabled", 0, mtrrs_disabled);
------------------------------------------------------------------------------

Then, go to the end of the file, and make these changes to i686_mem_drvinit()...

------------------------------------------------------------------------------
static void
i686_mem_drvinit(void *unused)
{
    /* First, check if the user wants to allow MTRR at all */
    if (mtrrs_disabled) {
        return;
    }

    /* Ok, MTRR is allowed, so try for i686 MTRRs */
    if ((cpu_feature & CPUID_MTRR) &&
        ((cpu_id & 0xf00) == 0x600) &&
        ((strcmp(cpu_vendor, "GenuineIntel") == 0) ||
        (strcmp(cpu_vendor, "AuthenticAMD") == 0))) {
        mem_range_softc.mr_op = &i686_mrops;
    }
}
-----------------------------------------------------------------------------

Finally, add a line to /boot/loader.conf to disable the MTRR support if you need to (leaving it out or misspelling the variable name will cause MTRR support to be detected and enabled normally):

machdep.mtrrs_disabled = 1

I guess at this point, the original PR should remain open until the bug is solved in a more formal manner, but this seems to be an effective workaround. So far, I haven't seen any ill effects from disabling MTRRs.

Responsible Changed
From-To: freebsd-ports->jmz
Over to maintainer.

Responsible Changed
From-To: jmz->jhb
This is more about needing to disable MTRRs on buggy processors, which is a kernel issue, not a ports issue.

Responsible Changed
From-To: jhb->msmith
Mike wants it so he can fix the missing holes in our K7 MTRR support.

State Changed
From-To: open->closed
I believe the patches that I recently committed to -current and -stable should resolve this issue.
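
For anyone wanting to see why the Athlon passes the detection test in the workaround above, the condition in i686_mem_drvinit() can be lifted into a small userland function and tried with sample inputs. This is only a sketch under stated assumptions: `mtrr_would_attach` is a hypothetical name, the kernel globals are replaced by parameters, and `CPUID_MTRR` is taken to be CPUID feature bit 12 (0x1000). The Athlon reports family 6, so `(cpu_id & 0xf00) == 0x600` holds for it just as it does for Intel P6 parts, and only the `mtrrs_disabled` tunable keeps the i686 MTRR code from attaching.

```c
#include <string.h>

#define CPUID_MTRR 0x1000 /* assumption: CPUID feature bit 12 */

/*
 * Userland copy of the test made in the patched i686_mem_drvinit().
 * Returns 1 when the i686 MTRR operations would be installed, else 0.
 */
static int mtrr_would_attach(unsigned cpu_feature, unsigned cpu_id,
                             const char *cpu_vendor, int mtrrs_disabled)
{
    /* First, honor the machdep.mtrrs_disabled tunable. */
    if (mtrrs_disabled)
        return 0;

    /* Otherwise apply the original family 6 / vendor check. */
    return (cpu_feature & CPUID_MTRR) != 0 &&
           (cpu_id & 0xf00) == 0x600 &&
           (strcmp(cpu_vendor, "GenuineIntel") == 0 ||
            strcmp(cpu_vendor, "AuthenticAMD") == 0);
}
```

Calling it with a hypothetical family 6 signature such as `mtrr_would_attach(CPUID_MTRR, 0x622, "AuthenticAMD", 0)` returns 1, while the same call with the last argument set to 1 returns 0, which is the behavior the loader.conf setting relies on.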