Bug 200384 - ctld panic at boot on 10.1-STABLE
Summary: ctld panic at boot on 10.1-STABLE
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.1-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Edward Tomasz Napierala
Reported: 2015-05-22 11:34 UTC by Jimmy Olgeni
Modified: 2015-05-24 04:14 UTC
Attachments
Sample ctl.conf (241 bytes, text/plain)
2015-05-22 11:34 UTC, Jimmy Olgeni
Sample textdump 1 (24.50 KB, application/x-tar)
2015-05-22 11:36 UTC, Jimmy Olgeni
Sample textdump 2 (24.50 KB, application/x-tar)
2015-05-22 11:36 UTC, Jimmy Olgeni
Fix (2.32 KB, patch)
2015-05-23 17:28 UTC, Edward Tomasz Napierala

Description Jimmy Olgeni freebsd_committer freebsd_triage 2015-05-22 11:34:54 UTC
Created attachment 157032
Sample ctl.conf

Loading ctld causes a panic at boot (ctld_enable="YES" in rc.conf) when the following kernel options are enabled: DEADLKRES, DEBUG_LOCKS, DEBUG_VFS_LOCKS, DIAGNOSTIC, INVARIANTS, INVARIANT_SUPPORT, WITNESS.

It seems repeatable so far. The options were originally enabled to track down a possible deadlock elsewhere.

Fatal trap 9: general protection fault while in kernel mode
cpuid = 2; apic id = 02
instruction pointer	= 0x20:0xffffffff802e476a
stack pointer	        = 0x28:0xfffffe046a4467c0
frame pointer	        = 0x28:0xfffffe046a4468a0
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 4 (scanner)
trap number		= 9
panic: general protection fault
cpuid = 2
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe046a446340
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe046a4463f0
vpanic() at vpanic+0x126/frame 0xfffffe046a446430
panic() at panic+0x43/frame 0xfffffe046a446490
trap_fatal() at trap_fatal+0x38f/frame 0xfffffe046a4464f0
trap() at trap+0x818/frame 0xfffffe046a446700
calltrap() at calltrap+0x8/frame 0xfffffe046a446700
--- trap 0x9, rip = 0xffffffff802e476a, rsp = 0xfffffe046a4467c0, rbp = 0xfffffe046a4468a0 ---
cam_periph_alloc() at cam_periph_alloc+0x48a/frame 0xfffffe046a4468a0
scsi_scan_lun() at scsi_scan_lun+0x1cb/frame 0xfffffe046a446ae0
scsi_scan_bus() at scsi_scan_bus+0x809/frame 0xfffffe046a446b50
xpt_scanner_thread() at xpt_scanner_thread+0x10b/frame 0xfffffe046a446bb0
fork_exit() at fork_exit+0x84/frame 0xfffffe046a446bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe046a446bf0
--- trap 0, rip = 0, rsp = 0xfffffe046a446cb0, rbp = 0 ---
Uptime: 22s

ctl.conf only references one zvol.
Comment 1 Jimmy Olgeni freebsd_committer freebsd_triage 2015-05-22 11:36:02 UTC
Created attachment 157033
Sample textdump 1
Comment 2 Jimmy Olgeni freebsd_committer freebsd_triage 2015-05-22 11:36:18 UTC
Created attachment 157034
Sample textdump 2
Comment 3 Jimmy Olgeni freebsd_committer freebsd_triage 2015-05-22 18:36:32 UTC
I managed to get a better backtrace (below).

A few more data points, always booting with the debug options enabled:

- Booting with ctld_enable -> panic
- Booting without ctld_enable, and then starting the ctld service -> panic
- Booting without ctld_enable, and kldloading ctl -> panic

- Booting without ctld_enable, and kldloading ctl after loading iscsi -> ok
- Booting with ctld_enable, but with iscsi_load and ctl_load in loader.conf -> ok

The panic seems to depend on module load order: it only occurs when
ctl is loaded without iscsi already present.

(kgdb) bt
#0  doadump (textdump=0) at pcpu.h:219
#1  0xffffffff803600fe in db_dump (dummy=<value optimized out>, dummy2=0, dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:533
#2  0xffffffff8035fb9d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440
#3  0xffffffff8035f914 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493
#4  0xffffffff803622f0 in db_trap (type=<value optimized out>, code=0) at /usr/src/sys/ddb/db_main.c:231
#5  0xffffffff809953d9 in kdb_trap (type=3, code=0, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:656
#6  0xffffffff80d8b73e in trap (frame=0xfffffe046a445210) at /usr/src/sys/amd64/amd64/trap.c:554
#7  0xffffffff80d6fc52 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff80994ae5 in kdb_break () at cpufunc.h:63
#9  0xffffffff80776d9b in scgetc (sc=0xffffffff8185be18, flags=<value optimized out>) at /usr/src/sys/dev/syscons/syscons.c:3591
#10 0xffffffff807795ff in sc_cngetc (cd=0xffffffff81643f70) at /usr/src/sys/dev/syscons/syscons.c:1782
#11 0xffffffff809071c5 in cngetc () at /usr/src/sys/kern/kern_cons.c:406
#12 0xffffffff8095a54e in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:455
#13 0xffffffff8095a9d5 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759
#14 0xffffffff8095aa23 in panic (fmt=0xffffffff81643f70 "\004") at /usr/src/sys/kern/kern_shutdown.c:688
#15 0xffffffff80d8be3f in trap_fatal (frame=<value optimized out>, eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:857
#16 0xffffffff80d8ba98 in trap (frame=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:203
#17 0xffffffff80d6fc52 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#18 0xffffffff802e494a in cam_periph_alloc (periph_ctor=0xffffffff802f9c30 <proberegister>, 
    periph_oninvalidate=<value optimized out>, periph_dtor=<value optimized out>, periph_start=<value optimized out>, 
    name=<value optimized out>, type=<value optimized out>, path=0xfffff80163591600, ac_callback=<value optimized out>, 
    code=<value optimized out>) at /usr/src/sys/cam/cam_periph.c:227
#19 0xffffffff802f94fb in scsi_scan_lun (request_ccb=<value optimized out>) at /usr/src/sys/cam/scsi/scsi_xpt.c:2339
#20 0xffffffff802fd659 in scsi_scan_bus (periph=<value optimized out>, request_ccb=0xfffff80163b33800)
    at /usr/src/sys/cam/scsi/scsi_xpt.c:2037
#21 0xffffffff802f045b in xpt_scanner_thread (dummy=<value optimized out>) at /usr/src/sys/cam/cam_xpt.c:2453
#22 0xffffffff809215d4 in fork_exit (callout=0xffffffff802f0350 <xpt_scanner_thread>, arg=0x0, frame=0xfffffe046a445c00)
    at /usr/src/sys/kern/kern_fork.c:1017
#23 0xffffffff80d7018e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
#24 0x0000000000000000 in ?? ()
Comment 4 Jimmy Olgeni freebsd_committer freebsd_triage 2015-05-22 18:37:22 UTC
Tested on 10.1-STABLE #1 r283287 amd64.
Comment 5 Edward Tomasz Napierala freebsd_committer freebsd_triage 2015-05-23 17:28:18 UTC
Created attachment 157086
Fix
Comment 6 Edward Tomasz Napierala freebsd_committer freebsd_triage 2015-05-23 17:29:13 UTC
Can you try the attached fix?
Comment 7 Jimmy Olgeni freebsd_committer freebsd_triage 2015-05-23 19:17:36 UTC
(In reply to Edward Tomasz Napierala from comment #6)

Looks great! Panics stopped and ctld is working as usual.
Comment 8 commit-hook freebsd_committer freebsd_triage 2015-05-24 04:14:24 UTC
A commit references this bug:

Author: trasz
Date: Sun May 24 04:14:11 UTC 2015
New revision: 283349
URL: https://svnweb.freebsd.org/changeset/base/283349

Log:
  MFC r279554:

  Make periphdriver_register() take XPT lock when modifying the periph_drivers
  array.

  This fixes a panic that sometimes occurred when kldloading ctl.ko.

  PR:		200384
  Sponsored by:	The FreeBSD Foundation

Changes:
_U  stable/10/
  stable/10/sys/cam/cam_periph.c
  stable/10/sys/cam/cam_xpt.c
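
For readers unfamiliar with this class of bug, the locking pattern named in the commit log can be sketched in userspace C. This is not the committed patch. It is a minimal illustration, assuming a NULL-terminated driver array that is replaced wholesale when a driver registers, and a reader that walks the array concurrently (as the scanner thread in the backtrace does through cam_periph_alloc()). All demo_* names are hypothetical, and a pthread mutex stands in for the XPT lock.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a peripheral driver entry. */
struct demo_driver {
	const char *name;
};

/* NULL-terminated array of registered drivers; replaced on every growth. */
static struct demo_driver **demo_drivers;
static int demo_ndrivers;

/* Stand-in for the XPT lock: guards demo_drivers and demo_ndrivers. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Register a driver. The array is grown, filled and published while the
 * lock is held, so a reader never sees a half-updated or freed array.
 * Returns 0 on success, -1 if the allocation fails.
 */
int
demo_register(struct demo_driver *drv)
{
	struct demo_driver **grown, **old;

	pthread_mutex_lock(&demo_lock);
	/* One slot for the new driver, one for the NULL terminator. */
	grown = calloc((size_t)demo_ndrivers + 2, sizeof(*grown));
	if (grown == NULL) {
		pthread_mutex_unlock(&demo_lock);
		return (-1);
	}
	if (demo_ndrivers > 0)
		memcpy(grown, demo_drivers,
		    (size_t)demo_ndrivers * sizeof(*grown));
	grown[demo_ndrivers] = drv;	/* calloc left the terminator NULL */
	old = demo_drivers;
	demo_drivers = grown;
	demo_ndrivers++;
	pthread_mutex_unlock(&demo_lock);

	/* No reader can still hold 'old': readers only walk under the lock. */
	free(old);
	return (0);
}

/* Look up a driver by name, walking the array under the same lock. */
struct demo_driver *
demo_find(const char *name)
{
	struct demo_driver **p, *found = NULL;

	pthread_mutex_lock(&demo_lock);
	for (p = demo_drivers; p != NULL && *p != NULL; p++) {
		if (strcmp((*p)->name, name) == 0) {
			found = *p;
			break;
		}
	}
	pthread_mutex_unlock(&demo_lock);
	return (found);
}
```

Without the lock in demo_register(), a concurrent demo_find() could walk the old array after it has been freed. That is the general kind of failure that debug kernel options such as INVARIANTS tend to turn into an immediate fault rather than silent corruption. Allocating while holding the lock is a userspace simplification.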