Bug 123754 - [ata] [panic] atacontrol(8): atacontrol reinit causing kernel panic
Summary: [ata] [panic] atacontrol(8): atacontrol reinit causing kernel panic
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 7.0-STABLE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Alexander Motin
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-05-17 11:50 UTC by Jeremy Chadwick
Modified: 2009-11-10 23:09 UTC

Description Jeremy Chadwick freebsd_committer freebsd_triage 2008-05-17 11:50:01 UTC
	atacontrol reinit can crash the kernel in some situations.  In our
	situation, we needed to replace a disk, ad6, on ata3-master.  See
	How-To-Repeat for the exact commands that were run.

	I do have a kernel core as a result of this issue, and can make it
	available if need be.  Note that it's quite large (343MBytes).

	Relevant hardware:

atapci1: <Intel AHCI controller> port 0x30e8-0x30ef,0x30dc-0x30df,0x30e0-0x30e7,0x30d8-0x30db,0x30b0-0x30bf mem 0xe8600400-0xe86007ff irq 19 at device 31.2 on pci0
atapci1: [ITHREAD]
atapci1: AHCI Version 01.10 controller with 4 ports detected
ata2: <ATA channel 0> on atapci1
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci1
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci1
ata4: [ITHREAD]
ata5: <ATA channel 3> on atapci1
ata5: [ITHREAD]
ad4: 239372MB <WDC WD2500YS-01SHB1 20.06C06> at ata2-master SATA300
ad6: 239372MB <WDC WD2500YS-01SHB1 20.06C06> at ata3-master SATA300

	vmcore info:

horus# cat /var/crash/info.0
Dump header from device /dev/ad4s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 343969792B (328 MB)
  Blocksize: 512
  Dumptime: Fri May 16 21:06:40 2008
  Hostname: horus.sc1.parodius.com
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 7.0-STABLE #0: Sat May 10 06:28:12 PDT 2008
    root@horus.sc1.parodius.com:/usr/obj/usr/src/sys/PDSMI_PLUS_amd64
  Panic String: from debugger
  Dump Parity: 2593049917
  Bounds: 0
  Dump Status: good

	Results of kgdb /boot/kernel/kernel /var/crash/vmcore.0

horus# kgdb /boot/kernel/kernel /var/crash/vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x258
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0xffffffff80291104
stack pointer           = 0x10:0xffffffffaf0e5920
frame pointer           = 0x10:0xffffffffaf0e5940
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 19373 (atacontrol)
panic: from debugger
cpuid = 0
KDB: stack backtrace:
Uptime: 6d5h3m48s
Physical memory: 2039 MB
Dumping 328 MB: 313 297 281 265 249 233 217 201 185 169 153 137 121 105 89 73 57 41 25 9

Reading symbols from /boot/kernel/pf.ko...Reading symbols from /boot/kernel/pf.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/pf.ko
#0  doadump () at pcpu.h:194
194     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:194
#1  0xffffffff8029cd4f in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xffffffff8029d1bf in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:572
#3  0xffffffff8018bd6c in db_panic (addr=Variable "addr" is not available.
) at /usr/src/sys/ddb/db_command.c:446
#4  0xffffffff8018c379 in db_command (last_cmdp=0xffffffff805e4ac8, cmd_table=0x0, dopager=1) at /usr/src/sys/ddb/db_command.c:413
#5  0xffffffff8018c57b in db_command_loop () at /usr/src/sys/ddb/db_command.c:466
#6  0xffffffff8018e057 in db_trap (type=Variable "type" is not available.
) at /usr/src/sys/ddb/db_main.c:228
#7  0xffffffff802c63f5 in kdb_trap (type=12, code=0, tf=0xffffffffaf0e5870) at /usr/src/sys/kern/subr_kdb.c:524
#8  0xffffffff80411ed2 in trap_fatal (frame=0xffffffffaf0e5870, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:719
#9  0xffffffff80412282 in trap_pfault (frame=0xffffffffaf0e5870, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
#10 0xffffffff80412ad0 in trap (frame=0xffffffffaf0e5870) at /usr/src/sys/amd64/amd64/trap.c:410
#11 0xffffffff803f8b8e in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#12 0xffffffff80291104 in _mtx_lock_sleep (m=0xffffff0001228e18, tid=18446742974257523184, opts=Variable "opts" is not available.
)
    at /usr/src/sys/kern/kern_mutex.c:335
#13 0xffffffff801bf42e in ata_start (dev=0xffffff000123d600) at /usr/src/sys/dev/ata/ata-queue.c:177
#14 0xffffffff801a8539 in ata_ioctl (dev=Variable "dev" is not available.
) at /usr/src/sys/dev/ata/ata-all.c:375
#15 0xffffffff80269e48 in giant_ioctl (dev=0xffffff0001172400, cmd=2147770626, data=0xffffff00134793f0 "\003", fflag=3,
    td=0xffffff00038d69f0) at /usr/src/sys/kern/kern_conf.c:405
#16 0xffffffff80234703 in devfs_ioctl_f (fp=0xffffff00039475a0, com=2147770626, data=0xffffff00134793f0, cred=Variable "cred" is not available.
)
    at /usr/src/sys/fs/devfs/devfs_vnops.c:494
#17 0xffffffff802d330e in kern_ioctl (td=0xffffff00038d69f0, fd=3, com=2147770626, data=0xffffff00134793f0 "\003") at file.h:266
#18 0xffffffff802d361a in ioctl (td=0xffffff00038d69f0, uap=0xffffffffaf0e5be0) at /usr/src/sys/kern/sys_generic.c:570
#19 0xffffffff804124f8 in syscall (frame=0xffffffffaf0e5c70) at /usr/src/sys/amd64/amd64/trap.c:852
#20 0xffffffff803f8d9b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290
#21 0x00000000607094fc in ?? ()
Previous frame inner to this frame (corrupt stack?)
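
The raw decimal values in frames #12 and #15 are easier to read once decoded. The following is a minimal sketch, assuming the standard FreeBSD ioctl command encoding from sys/ioccom.h; the helper name decode_ioctl is hypothetical and not part of any FreeBSD tool.

```python
# Bit layout assumed from FreeBSD's sys/ioccom.h ioctl command encoding.
IOCPARM_SHIFT = 13
IOCPARM_MASK = (1 << IOCPARM_SHIFT) - 1
IOC_VOID = 0x20000000  # no parameters
IOC_OUT = 0x40000000   # kernel copies parameters out to userland
IOC_IN = 0x80000000    # kernel copies parameters in from userland


def decode_ioctl(cmd):
    """Split an ioctl command word into its encoded fields (hypothetical helper)."""
    direction = []
    if cmd & IOC_IN:
        direction.append("in")
    if cmd & IOC_OUT:
        direction.append("out")
    if cmd & IOC_VOID:
        direction.append("void")
    return {
        "hex": hex(cmd),
        "direction": "+".join(direction),
        "length": (cmd >> 16) & IOCPARM_MASK,  # parameter size in bytes
        "group": chr((cmd >> 8) & 0xFF),       # ioctl group character
        "number": cmd & 0xFF,                  # command number within the group
    }


fields = decode_ioctl(2147770626)    # cmd= value from frame #15
tid_hex = hex(18446742974257523184)  # tid= value from frame #12

print(fields)
print(tid_hex)
```

The command decodes to 0x80046102: a 4-byte, copy-in request in group 'a', number 2. That would be consistent with the reinit request issued by atacontrol with data "\003" (channel 3), although the mapping to a named constant is an assumption and is not confirmed in this report. The tid value converts to 0xffffff00038d69f0, the same td pointer shown in frames #15, #17 and #18.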

How-To-Repeat: 	# atacontrol detach ata3
	[removed ad6 disk attached to ata3-master]
	[inserted new disk]
	# atacontrol info ata3
	[showed "no device present" for both master and slave]
	# atacontrol reinit ata3
	[panic]
Comment 1 Remko Lodder freebsd_committer freebsd_triage 2008-05-17 13:30:12 UTC
Responsible Changed
From-To: freebsd-bugs->sos

Hi Soren, can you look at this please?
Comment 2 Andrey V. Elsukov 2008-05-17 17:56:58 UTC
17.05.08, 14:29, "Jeremy Chadwick" <koitsu@FreeBSD.org>:

> >Number:         123754
> >Category:       kern
> >Synopsis:       atacontrol reinit causing kernel panic
> >How-To-Repeat:
> 	# atacontrol detach ata3
> 	[removed ad6 disk attached to ata3-master]
> 	[inserted new disk]
> 	# atacontrol info ata3
> 	[showed "no device present" for both master and slave]
> 	# atacontrol reinit ata3
> 	[panic]

I suggested a fix for this problem in kern/122045.
Also, here is an explanation of why it panics:
http://lists.freebsd.org/pipermail/freebsd-bugs/2008-March/029820.html

--
WBR, Andrey V. Elsukov
Comment 3 Volker 2008-05-17 23:44:03 UTC
DUP of kern/122045
will leave it to Soeren to close either of these
Comment 4 Andrey V. Elsukov 2008-06-07 05:36:48 UTC
Hi,
  The fix was committed to CURRENT.
-- 
WBR, Andrey V. Elsukov
Comment 5 Alexander Motin freebsd_committer freebsd_triage 2009-02-22 00:02:45 UTC
State Changed
From-To: open->patched

The attach/detach/reinit implementation was reworked in 8-CURRENT.
Comment 6 Mark Linimon freebsd_committer freebsd_triage 2009-05-12 05:52:07 UTC
Responsible Changed
From-To: sos->mav

Over to committer as MFC reminder, if applicable.
Comment 7 Alexander Motin freebsd_committer freebsd_triage 2009-11-10 23:08:37 UTC
State Changed
From-To: patched->closed

I am not going to do a massive merge to 7-STABLE.