Bug 222898

Summary: [iscsi]: panic: page fault - system crashes while working as iSCSI target
Product: Base System Reporter: emz
Component: kernAssignee: Edward Tomasz Napierala <trasz>
Status: Closed FIXED    
Severity: Affects Only Me CC: thomsonk, trasz
Priority: --- Keywords: patch
Version: 11.0-STABLE   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
r315520
none
#pragma clang optimize off none

Description emz 2017-10-10 12:03:30 UTC
I have a priduction system working as iSCSI target. While increasing the number of initiators (currently over 3 hundred) in crashes more often.

Last panic and backtrace:

# kgdb /boot/kernel/kernel vmcore.2 
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
fault virtual address   = 0x40
fault code      = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff82641fdb
stack pointer           = 0x28:0xfffffe10450e0680
frame pointer           = 0x28:0xfffffe10450e06d0
code segment    = base rx0, limit 0xfffff, type 0x1b
      = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process  = 650 (work4)
trap number     = 12
panic: page fault
cpuid = 16
KDB: stack backtrace:
#0 0xffffffff80b023a7 at kdb_backtrace+0x67
#1 0xffffffff80ab88e6 at vpanic+0x186
#2 0xffffffff80ab8753 at panic+0x43
#3 0xffffffff80f72ce2 at trap_fatal+0x322
#4 0xffffffff80f72eac at trap_pfault+0x1bc
#5 0xffffffff80f72560 at trap+0x280
#6 0xffffffff80f559f1 at calltrap+0x8
#7 0xffffffff8262e27a at ctl_work_thread+0xeba
#8 0xffffffff80a70785 at fork_exit+0x85
#9 0xffffffff80f55f2e at fork_trampoline+0xe
Uptime: 21d21h26m5s
Dumping 26603 out of 65386 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/zfs.ko.debug...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /usr/lib/debug//boot/kernel/opensolaris.ko.debug...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols from /usr/lib/debug//boot/kernel/geom_mirror.ko.debug...done.
done.
Loaded symbols for /boot/kernel/geom_mirror.ko
Reading symbols from /boot/kernel/pf.ko...Reading symbols from /usr/lib/debug//boot/kernel/pf.ko.debug...done.
done.
Loaded symbols for /boot/kernel/pf.ko
Reading symbols from /boot/kernel/ichwd.ko...Reading symbols from /usr/lib/debug//boot/kernel/ichwd.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ichwd.ko
Reading symbols from /boot/kernel/ctl.ko...Reading symbols from /usr/lib/debug//boot/kernel/ctl.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ctl.ko
Reading symbols from /boot/kernel/iscsi.ko...Reading symbols from /usr/lib/debug//boot/kernel/iscsi.ko.debug...done.
done.
Loaded symbols for /boot/kernel/iscsi.ko
#0  doadump (textdump=<value optimized out>) at pcpu.h:222
222 pcpu.h: No such file or directory.
  in pcpu.h
(kgdb) bt
#0  doadump (textdump=<value optimized out>) at pcpu.h:222
#1  0xffffffff80ab8369 in kern_reboot (howto=260) at /usr/src-11-r310734/sys/kern/kern_shutdown.c:366
#2  0xffffffff80ab8920 in vpanic (fmt=<value optimized out>, ap=<value optimized out>)
    at /usr/src-11-r310734/sys/kern/kern_shutdown.c:759
#3  0xffffffff80ab8753 in panic (fmt=<value optimized out>) at /usr/src-11-r310734/sys/kern/kern_shutdown.c:690
#4  0xffffffff80f72ce2 in trap_fatal (frame=0xfffffe10450e05c0, eva=64)
    at /usr/src-11-r310734/sys/amd64/amd64/trap.c:801
#5  0xffffffff80f72eac in trap_pfault (frame=0xfffffe10450e05c0, usermode=0)
    at /usr/src-11-r310734/sys/amd64/amd64/trap.c:658
#6  0xffffffff80f72560 in trap (frame=0xfffffe10450e05c0) at /usr/src-11-r310734/sys/amd64/amd64/trap.c:421
#7  0xffffffff80f559f1 in calltrap () at /usr/src-11-r310734/sys/amd64/amd64/exception.S:236
#8  0xffffffff82641fdb in cfiscsi_done (io=<value optimized out>) at icl_conn_if.h:98
#9  0xffffffff8262e27a in ctl_work_thread (arg=0xfffffe0388516600)
    at /usr/src-11-r310734/sys/modules/ctl/../../cam/ctl/ctl.c:13373
#10 0xffffffff80a70785 in fork_exit (callout=0xffffffff8262d3c0 <ctl_work_thread>, arg=0xfffffe0388516600, 
    frame=0xfffffe10450e0ac0) at /usr/src-11-r310734/sys/kern/kern_fork.c:1040
#11 0xffffffff80f55f2e in fork_trampoline () at /usr/src-11-r310734/sys/amd64/amd64/exception.S:611
#12 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb)
Comment 1 emz 2017-10-12 07:23:14 UTC
Environment:

Uname:

FreeBSD san1.lynx.playkey.net 11.0-STABLE FreeBSD 11.0-STABLE #1 r310734M: Thu Sep 21 18:40:08 +05 2017     emz@san1.lynx.playkey.net:/usr/obj-11-r310734/usr/src-11-r310734/sys/GENERIC  amd64

Disks:

[root@san1:sys/SANDEBUG]# camcontrol devlist
<AHCI SGPIO Enclosure 1.00 0001>   at scbus4 target 0 lun 0 (ses0,pass0)
<AHCI SGPIO Enclosure 1.00 0001>   at scbus11 target 0 lun 0 (ses1,pass1)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 8 lun 0 (pass2,da0)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 9 lun 0 (pass3,da1)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 10 lun 0 (pass4,da2)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 11 lun 0 (pass5,da3)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 12 lun 0 (pass6,da4)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 13 lun 0 (pass7,da5)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 14 lun 0 (pass8,da6)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 15 lun 0 (pass9,da7)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 16 lun 0 (pass10,da8)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 17 lun 0 (pass11,da9)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 18 lun 0 (pass12,da10)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 19 lun 0 (pass13,da11)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 20 lun 0 (pass14,da12)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 21 lun 0 (pass15,da13)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 22 lun 0 (pass16,da14)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 23 lun 0 (pass17,da15)
<ATA TOSHIBA THNSN81Q 6101>        at scbus12 target 24 lun 0 (pass18,da16)
<LSI SAS3x40 0601>                 at scbus12 target 32 lun 0 (ses2,pass19)
<ATEN Virtual CDROM YS0J>          at scbus13 target 0 lun 0 (pass20,cd0)

ctl statistics:

# ctladm devlist | wc -l
     821

# ctladm islist | wc -l
     755 (two disks per initiator, some sometimes lose their target for the reason not related to this ticket)

traffic when idling:

 netstat -I ix0 1
            input            ix0           output
   packets  errs idrops      bytes    packets  errs      bytes colls
     50641     0     0   70562167      49719     0   70335884     0
     50099     0     0   70148479      48306     0   68437356     0
     46561     0     0   65304615      44928     0   63583291     0
     48908     0     0   68448307      47207     0   66274503     0
     56893     0     0   79810803      55510     0   78770191     0
     67500     0     0   94948586      65446     0   93477697     0
     49551     0     0   69503383      48396     0   68838826     0
     66153     0     0   92608074      65416     0   93358583     0
     55500     0     0   77802745      54048     0   77154416     0
     44205     0     0   61417486      43995     0   60993381     0
     67484     0     0   94723317      69191     0   94908452     0
^C

when non idling traffic can peak up to 700Kpps-9Gbit/sec.
Comment 2 emz 2017-10-12 07:38:41 UTC
This very system, being boot up with debug kernel, built with

crashes somewhere withing 30 seconds after the ctld is started:

WARNING: 10.0.3.211 (iqn.1991-05.com.microsoft:worker379): no ping reply (NOP-Out) after 5 seconds; dropping connection
WARNING: 10.0.3.211 (iqn.1991-05.com.microsoft:worker379): no ping reply (NOP-Out) after 5 seconds; dropping connection
WARNING: 10.0.3.201 (iqn.1991-05.com.microsoft:worker369): no ping reply (NOP-Out) after 5 seconds; dropping connection
WARNING: 213.152.134.47 (iqn.1991-05.com.microsoft:worker18): no ping reply (NOP-Out) after 5 seconds; dropping connecti
on
WARNING: 213.152.134.51 (iqn.1991-05.com.microsoft:worker22): no ping reply (NOP-Out) after 5 seconds; dropping connecti
on
WARNING: 10.0.3.219 (iqn.1991-05.com.microsoft:worker387): no ping reply (NOP-Out) after 5 seconds; dropping connection
WARNING: 213.152.138.2 (iqn.2016-04.net.playkey.iscsi:esx11): no ping reply (NOP-Out) after 5 seconds; dropping connecti
on
panic: destroying session with 4 outstanding PDUs
cpuid = 5
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe10555e48f0
vpanic() at vpanic+0x186/frame 0xfffffe10555e4970
kassert_panic() at kassert_panic+0x126/frame 0xfffffe10555e49e0
icl_soft_conn_close() at icl_soft_conn_close+0x1f5/frame 0xfffffe10555e4a10
cfiscsi_maintenance_thread() at cfiscsi_maintenance_thread+0x100/frame 0xfffffe10555e4a70
fork_exit() at fork_exit+0x84/frame 0xfffffe10555e4ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe10555e4ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 2m11s
Dumping 8422 out of 65386 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/zfs.ko.debug...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /usr/lib/debug//boot/kernel/opensolaris.ko.debug
...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols from /usr/lib/debug//boot/kernel/geom_mirror.ko.debug
...done.
done.
Loaded symbols for /boot/kernel/geom_mirror.ko
Reading symbols from /boot/kernel/pf.ko...Reading symbols from /usr/lib/debug//boot/kernel/pf.ko.debug...done.
done.
Loaded symbols for /boot/kernel/pf.ko
Reading symbols from /boot/kernel/ichwd.ko...Reading symbols from /usr/lib/debug//boot/kernel/ichwd.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ichwd.ko
Reading symbols from /boot/kernel/ctl.ko...Reading symbols from /usr/lib/debug//boot/kernel/ctl.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ctl.ko
Reading symbols from /boot/kernel/iscsi.ko...Reading symbols from /usr/lib/debug//boot/kernel/iscsi.ko.debug...done.
done.
Loaded symbols for /boot/kernel/iscsi.ko
#0  doadump (textdump=1) at pcpu.h:222
222     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump (textdump=1) at pcpu.h:222
#1  0xffffffff80a2e725 in kern_reboot (howto=<value optimized out>)
    at /usr/src-11-r310734/sys/kern/kern_shutdown.c:366
#2  0xffffffff80a2ed00 in vpanic (fmt=<value optimized out>, 
    ap=<value optimized out>)
    at /usr/src-11-r310734/sys/kern/kern_shutdown.c:759
#3  0xffffffff80a2eb36 in kassert_panic (fmt=<value optimized out>)
    at /usr/src-11-r310734/sys/kern/kern_shutdown.c:649
#4  0xffffffff82875f05 in icl_soft_conn_close (ic=<value optimized out>)
    at /usr/src-11-r310734/sys/modules/iscsi/../../dev/iscsi/icl_soft.c:1443
#5  0xffffffff82845560 in cfiscsi_maintenance_thread (
    arg=<value optimized out>) at icl_conn_if.h:135
#6  0xffffffff809f2064 in fork_exit (
    callout=0xffffffff82845460 <cfiscsi_maintenance_thread>, 
    arg=0xfffff8033e06cc00, frame=0xfffffe10555e4ac0)
    at /usr/src-11-r310734/sys/kern/kern_fork.c:1040
#7  0xffffffff80e7432e in fork_trampoline ()
    at /usr/src-11-r310734/sys/amd64/amd64/exception.S:611
#8  0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb)

This is reproducible, it crashed 3 times in a row from 3 attempts to run the debug kernel. I spoke with mav@, he told me (as far as I undertood) that with is unintended behaviour, and it is probably the possible reason why the target system crashes when running non-debug kernel.

I collected a crashdump from one of the attempts, it's available here along with the kernel and modules binaries, along with non-stripped ones:

http://files2.enaza.ru/ctld-stuck/debug/
Comment 3 emz 2017-10-12 07:40:37 UTC
Seems like I have planned to add the debug kernel configuration later but totally forgot to do it. So, here it is:

ident SANDEBUG
include GENERIC

options WITNESS
options WITNESS_SKIPSPIN
options INVARIANTS
options INVARIANT_SUPPORT
options DIAGNOSTIC
options DEADLKRES
options MALLOC_DEBUG_MAXZONES=8

options DDB
options KDB_UNATTENDED
Comment 4 Edward Tomasz Napierala freebsd_committer freebsd_triage 2017-10-12 16:25:39 UTC
What's the exact version of the kernel you are running?  Could you try to upgrade to at least r315520 and see if it changes anything?
Comment 5 emz 2017-10-13 10:27:11 UTC
This happens on a machine running r310734.
Unfortunately, on a more recent versions, not sure what the exact revision from, I got this: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219760 .

I could try to reproduce this on a r320056 which I have on a test SAN system in the office, but the test environment doesn't have that much of initiators - only about 3-5, and on a production systems I have several hundreds. So, if this is related to the initiators number, then I guess it won't give anything.

However, I will try and post the result back here.
Comment 6 Edward Tomasz Napierala freebsd_committer freebsd_triage 2017-10-13 16:51:43 UTC
Ok, can you try with the r315520.diff applied instead?
Comment 7 Edward Tomasz Napierala freebsd_committer freebsd_triage 2017-10-13 16:52:27 UTC
Created attachment 187144 [details]
r315520
Comment 8 emz 2017-10-16 10:45:45 UTC
Well, I took a risk and installed a stock r315520. Unfortunately, the behavior is identical: crash within dozens of seconds after starting ctld, same panic message on a debug kernel:

===Cut===
WARNING: 213.152.134.114 (iqn.1991-05.com.microsoft:worker85): no ping reply (NOP-Out) after 5 seconds; dropping connect
ion
WARNING: 213.152.138.158 (iqn.1991-05.com.microsoft:worker265): no ping reply (NOP-Out) after 5 seconds; dropping connec
tion
WARNING: 213.152.134.62 (iqn.1991-05.com.microsoft:worker33): no ping reply (NOP-Out) after 5 seconds; dropping connecti
on
WARNING: 213.152.134.114 (iqn.1991-05.com.microsoft:worker85): no ping reply (NOP-Out) after 5 seconds; dropping connect
ion
panic: destroying session with 4 outstanding PDUs
cpuid = 24
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe105789a8f0
vpanic() at vpanic+0x186/frame 0xfffffe105789a970
kassert_panic() at kassert_panic+0x126/frame 0xfffffe105789a9e0
icl_soft_conn_close() at icl_soft_conn_close+0x20a/frame 0xfffffe105789aa10
cfiscsi_maintenance_thread() at cfiscsi_maintenance_thread+0x100/frame 0xfffffe105789aa70
fork_exit() at fork_exit+0x84/frame 0xfffffe105789aab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe105789aab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 1m55s
===Cut===

I am running a non-debug version of r315520 right now (so there's no either explicit regression or improvement over the r310734), it's still tending to crash in random moments of time when loaded by 299 initiators.

As a workaround I will try to diminish the load by transferring some initiators to a spare SAN.

Crashdumps, binaries and infos:

http://files2.enaza.ru/ctld-stuck/r315520/

In the same time I'm still willing to help in any way in resolving this, including the access on a production system, over ssh/IPMI/whatever.
Comment 9 emz 2017-10-16 10:47:56 UTC
Follow-up: it's stock r315520 except the CTL_MAX_PORTS set to 1024, I decided to mention this in case if it matters (but mav@ said that it's very unlikely that this change could affect anything).
Comment 10 emz 2017-10-19 05:56:22 UTC
Is there something more needed from me ? Or may be my last answer was totally discouraging, and installing r315520 wasn't the expected action ? Do I need to test this patch instead ? Because as far as I understood I was given the patch 'cause "Okay, since you claiming that you cannot install r315520 - then take this patch". Did I understand this wrong ?
Comment 11 Edward Tomasz Napierala freebsd_committer freebsd_triage 2017-11-05 18:47:17 UTC
I've spent some time thinking about this, and I believe the KASSERT in question is wrong.  It's perfectly ok for PDUs to be allocated at this point - icl_conn_close() is called before terminating the tasks, so there might be eg cfiscsi_done() running in a parallel thread.  That's why we have the ic_terminating check in icl_queue().  I believe the KASSERT should be moved from icl_conn_close() to icl_conn_free().  Alexander, do you agree?

Having said that - I'd still like to take a look into how things actually look like when this happens, to make sure my reasoning is correct.  Eugene, could you try to repeat this with the iscsi-optimize-off.diff applied, with all the other things just as they were before - with INVARIANTS and DIAGNOSTICS enabled?  Please also include debug files from usr/lib/debug/boot/kernel/; the ones you've put up contain debug info for the kernel itself, but not for the modules.  Or perhaps just build kernel with ctl and iscsi compiled in.
Comment 12 Edward Tomasz Napierala freebsd_committer freebsd_triage 2017-11-05 18:47:48 UTC
Created attachment 187767 [details]
#pragma clang optimize off
Comment 13 commit-hook freebsd_committer freebsd_triage 2018-03-15 17:37:11 UTC
A commit references this bug:

Author: trasz
Date: Thu Mar 15 17:36:14 UTC 2018
New revision: 331013
URL: https://svnweb.freebsd.org/changeset/base/331013

Log:
  Fix iSCSI target crash on session reinstation.

  The crash scenario goes like this: there's a thread waiting on "reinstate";
  because it doesn't update the timeout counter it gets terminated by the
  callout; at this point the maintenance thread starts the termination routine.
  The first thread finishes waiting, proceeds to icl_conn_handoff(), and drops
  the refcount, which allows the maintenance thread to free its resources.  At
  this point another thread receives a PDU.  Boom.

  PR:		222898, 219866
  Reported by:	Eugene M. Zheganin <emz at norma.perm.ru>
  Tested by:	Eugene M. Zheganin <emz at norma.perm.ru>
  Reviewed by:	mav@ (earlier version)
  MFC after:	2 weeks
  Sponsored by:	playkey.net

Changes:
  head/sys/cam/ctl/ctl_frontend_iscsi.c
  head/sys/cam/ctl/ctl_frontend_iscsi.h
Comment 14 Edward Tomasz Napierala freebsd_committer freebsd_triage 2018-03-15 17:40:06 UTC
*** Bug 210683 has been marked as a duplicate of this bug. ***
Comment 15 commit-hook freebsd_committer freebsd_triage 2018-04-16 17:24:46 UTC
A commit references this bug:

Author: trasz
Date: Mon Apr 16 17:24:33 UTC 2018
New revision: 332622
URL: https://svnweb.freebsd.org/changeset/base/332622

Log:
  MFC r331013:

  Fix iSCSI target crash on session reinstation.

  The crash scenario goes like this: there's a thread waiting on "reinstate";
  because it doesn't update the timeout counter it gets terminated by the
  callout; at this point the maintenance thread starts the termination routine.
  The first thread finishes waiting, proceeds to icl_conn_handoff(), and drops
  the refcount, which allows the maintenance thread to free its resources.  At
  this point another thread receives a PDU.  Boom.

  PR:		222898, 219866
  Sponsored by:	playkey.net

Changes:
_U  stable/11/
  stable/11/sys/cam/ctl/ctl_frontend_iscsi.c
  stable/11/sys/cam/ctl/ctl_frontend_iscsi.h