Bug 226110 - Observing ASSERT at cfiscsi_session_delete+0x49 while running traffic on 100 iSCSI LUNs
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Edward Tomasz Napierala
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-22 07:55 UTC by Manish Kumar
Modified: 2018-04-09 14:04 UTC

See Also:


Attachments
core text file (180.98 KB, text/plain)
2018-02-22 07:55 UTC, Manish Kumar

Description Manish Kumar 2018-02-22 07:55:09 UTC
Created attachment 190884
core text file

The target machine hits an assertion while running traffic on 100 iSCSI LUNs (zvols used as LUNs). The panic stack trace:

panic: destroying session with outstanding CTL pdus
cpuid = 4
time = 1519157619
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2c/frame 0xfffffe005f30f880
kdb_backtrace() at kdb_backtrace+0x53/frame 0xfffffe005f30f950
vpanic() at vpanic+0x268/frame 0xfffffe005f30fa20
kassert_panic() at kassert_panic+0xc7/frame 0xfffffe005f30fab0
cfiscsi_session_delete() at cfiscsi_session_delete+0x49/frame 0xfffffe005f30fae0
cfiscsi_maintenance_thread() at cfiscsi_maintenance_thread+0x110/frame 0xfffffe005f30fb30
fork_exit() at fork_exit+0x145/frame 0xfffffe005f30fbb0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe005f30fbb0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic

-------------------------------------
Steps to recreate the issue:
-------------------------------------
The target machine exposes 100 zvols as iSCSI LUNs. The initiator discovers these LUNs, creates a file system on each of them, and runs traffic (iozone). The target machine asserts after a few hours.

OS: FreeBSD 12.0-CURRENT (svn r329019)
I'm using GENERIC config file, with "nooptions       VIMAGE"

==>Start the Target (FreeBSD)
    1. Create 100 zvols
      # zpool create iscsi <slice/partition>
      # for ((i=0;i<100;i++)); do zfs create -V1G -o volmode=dev iscsi/d$i; done
    
    2. Configure the network interface to be used (Target_IP_Address)

    3. /etc/ctl.conf file:
         portal-group pg0 {
            discovery-auth-group no-authentication
            listen <Target_IP_Address>
         }
         target iqn.2016-11.com.xyz.abc:0 {
            auth-group no-authentication
            portal-group pg0
                     lun 0 {
                              path /dev/zvol/iscsi/d0
                              size 1G
                              option vendor "foo"
                              option product "bar"
                              option revision "d0"
                     }
         }
         .
         .
      
        100 targets with 1 LUN each

    4. Start the target daemon
        # service ctld onestart
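
Writing 100 near-identical target stanzas by hand is error-prone, so step 3 can be scripted. The following is only a sketch: the function name gen_ctl_conf is hypothetical, and it assumes that the target name suffix, the zvol name, and the revision string all follow the same index (0 to 99), which the report implies but does not state outright. The IP address in the usage line is a documentation placeholder.

```shell
#!/bin/sh
# Sketch: print a ctl.conf with one portal group and N single-LUN targets.
# Assumptions (not confirmed by the report): target N uses zvol iscsi/dN
# and the IQN suffix ":N". gen_ctl_conf is a hypothetical helper name.

gen_ctl_conf() {
    count=$1    # number of targets to emit
    ip=$2       # address the portal group listens on

    printf 'portal-group pg0 {\n'
    printf '\tdiscovery-auth-group no-authentication\n'
    printf '\tlisten %s\n' "$ip"
    printf '}\n'

    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'target iqn.2016-11.com.xyz.abc:%s {\n' "$i"
        printf '\tauth-group no-authentication\n'
        printf '\tportal-group pg0\n'
        printf '\tlun 0 {\n'
        printf '\t\tpath /dev/zvol/iscsi/d%s\n' "$i"
        printf '\t\tsize 1G\n'
        printf '\t\toption vendor "foo"\n'
        printf '\t\toption product "bar"\n'
        printf '\t\toption revision "d%s"\n' "$i"
        printf '\t}\n'
        printf '}\n'
        i=$((i + 1))
    done
}

# Example (placeholder address): gen_ctl_conf 100 192.0.2.10 > /etc/ctl.conf
```

The function only writes to standard output, so the result can be reviewed before it replaces /etc/ctl.conf.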

==>Initiator (Linux)
    1. Log in to/discover the target
        # iscsiadm -m discovery -t st -p <Target_IP_Address> -l
    2. List all LUNs
        # lsscsi
    3. Format and create a file system ONLY on the discovered LUNs (TAKE CARE NOT TO TOUCH OTHER LUNS).
        # mkfs.ext3 /dev/sdxx      // for all discovered LUNs
    4. Mount all the devices formatted in the above step at different mount points
        # mount /dev/sdxx /mnt/iscsiyy    // for all the LUNs formatted in the above step
    5. Run traffic on all LUNs using iozone
        # cd /mnt/iscsiyy
        # iozone -a -I -+d -g 512m &
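
Initiator steps 3 to 5 repeat the same commands for every LUN, so they can be generated per device. This is a hedged sketch, not the reporter's script: plan_lun_traffic is a hypothetical helper, the device names in the usage line are examples, and the mkdir line is an added assumption that the mount points do not exist yet. It deliberately prints the commands instead of running them, because mkfs on the wrong disk is destructive.

```shell
#!/bin/sh
# Sketch: print (do not run) the per-LUN commands from initiator steps 3-5.
# The caller must pass ONLY the iSCSI-backed disks, e.g. after checking the
# lsscsi output by hand. plan_lun_traffic is a hypothetical helper name.

plan_lun_traffic() {
    n=0
    for dev in "$@"; do
        mnt="/mnt/iscsi$n"
        echo "mkfs.ext3 $dev"
        echo "mkdir -p $mnt"    # assumption: mount point may not exist yet
        echo "mount $dev $mnt"
        echo "(cd $mnt && iozone -a -I -+d -g 512m) &"
        n=$((n + 1))
    done
}

# Example (hypothetical device names): plan_lun_traffic /dev/sdb /dev/sdc
```

Once the printed plan has been checked against the lsscsi listing, it can be saved to a file and run with sh.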
	

==> Target machine asserts after 6-7 hours.
Comment 1 Edward Tomasz Napierala freebsd_committer freebsd_triage 2018-03-15 17:38:15 UTC
Are you able to reproduce this bug with fresh CURRENT, i.e. after r331013?  Thanks!
Comment 2 Manish Kumar 2018-03-27 07:13:48 UTC
(In reply to Edward Tomasz Napierala from comment #1)
I am not able to reproduce this bug on/after r331013.
I hit a few other panics that look similar to another bug (226064). I will update bug 226064 with the panic information.

I have run the test multiple times and have not seen this assert even once. Earlier I used to hit this assert about once in every 4-5 runs.
Comment 3 Manish Kumar 2018-04-09 14:04:59 UTC
Closing the bug as it is no longer observed after r331013. Since other panics are still seen (similar to bug 226064), complete verification is pending.