Bug 259152 - [iscsi] OCI oracle iscsi block device problems
Summary: [iscsi] OCI oracle iscsi block device problems
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: arm64 Any
: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on: 259355
Blocks:
  Show dependency treegraph
 
Reported: 2021-10-13 19:11 UTC by Dave Cottlehuber
Modified: 2021-10-22 14:01 UTC (History)
5 users (show)

See Also:


Attachments
pcap on port 3260 during this transaction (13.56 KB, application/vnd.tcpdump.pcap)
2021-10-13 19:11 UTC, Dave Cottlehuber
no flags Details
pcap v2 with -s 0 (18.54 KB, application/vnd.tcpdump.pcap)
2021-10-15 17:09 UTC, Dave Cottlehuber
no flags Details
permit_expdatasn_0.patch (1.57 KB, patch)
2021-10-15 22:43 UTC, John Baldwin
no flags Details | Diff
post permit_expdatasn_0.patch with iscsi attach & running gpart (227.90 KB, application/vnd.tcpdump.pcap)
2021-10-21 12:25 UTC, Dave Cottlehuber
no flags Details
icl_conn_receive_pdu crash data segment length error (75.75 KB, application/vnd.tcpdump.pcap)
2021-10-21 13:07 UTC, Dave Cottlehuber
no flags Details
iscsid: default conn_max_recv_data_segment_length to our request (1.38 KB, patch)
2021-10-22 07:55 UTC, Dave Cottlehuber
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dave Cottlehuber freebsd_committer 2021-10-13 19:11:42 UTC
Created attachment 228677 [details]
pcap on port 3260 during this transaction

## environment

- Freebsd 14.0-CURRENT 6882064d0ced456c89d161da4bc43c5d97bab60b (2021-10-12)
- OCI hosted iscsi block storage
- neither boot volumes (no auth required) nor additional volumes (with auth) will connect correctly
- pcap + wireshark suggests that the login + response is correct from protocol perspective
- fiddling with MTU (1500 -> 9000 -> 16000) seems not to help

## logs

these are taken from a VM as it's easier to get pcap (attached)

### equivalent linux commands (oracle linux ofc)
sudo iscsiadm -m node -o new -T iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1 -p 169.254.2.3:3260
sudo iscsiadm -m node -o update -T iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1 -n node.startup -v automatic
sudo iscsiadm -m node -T iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1 -p 169.254.2.3:3260 -l

### freebsd term
# cat /etc/iscsi.conf
behemoth {
        TargetAddress   = 169.254.2.3:3260
        TargetName      = iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1
}

# sysctl kern.iscsi.debug=10
# tail -qF /var/log/messages | egrep --line-buffered 'cam|icsci' &
# ngrep -qid vtnet0 -O /tmp/iscsi.pcap . port 3260 &
# iscsictl -Aa

[3858] vtnet0: promiscuous mode enabled
[3883] WARNING: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): ExpDataSN mismatch in SCSI Response (0 vs 1)
[3883] iscsi_session_cleanup: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing
[3883] iscsi_session_terminate_task: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing devq
[3883] iscsi_ioctl_daemon_handoff: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): releasing
[3883] WARNING: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): ExpDataSN mismatch in SCSI Response (0 vs 1)
[3883] iscsi_session_cleanup: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing
[3883] iscsi_session_terminate_task: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing devq
[3883] iscsi_ioctl_daemon_handoff: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): releasing
[3883] WARNING: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): ExpDataSN mismatch in SCSI Response (0 vs 1)
[3883] iscsi_session_cleanup: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing
[3883] iscsi_session_terminate_task: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing devq
[3884] iscsi_ioctl_daemon_handoff: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): releasing
[3884] WARNING: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): ExpDataSN mismatch in SCSI Response (0 vs 1)
[3884] iscsi_session_cleanup: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing
[3884] iscsi_session_terminate_task: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing devq
[3884] iscsi_ioctl_daemon_handoff: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): releasing
[3884] WARNING: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): ExpDataSN mismatch in SCSI Response (0 vs 1)
[3884] iscsi_session_cleanup: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing
[3884] iscsi_session_terminate_task: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing devq
[3884] iscsi_ioctl_daemon_handoff: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): releasing
[3884] WARNING: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): ExpDataSN mismatch in SCSI Response (0 vs 1)
[3884] iscsi_session_cleanup: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing
[3884] iscsi_session_terminate_task: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing devq
[3885] iscsi_ioctl_daemon_handoff: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): releasing
[3885] WARNING: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): ExpDataSN mismatch in SCSI Response (0 vs 1)
[3885] iscsi_session_cleanup: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing
[3885] iscsi_session_terminate_task: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing devq
[3885] iscsi_ioctl_daemon_handoff: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): releasing
[3885] WARNING: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): ExpDataSN mismatch in SCSI Response (0 vs 1)
[3885] iscsi_session_cleanup: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing
[3885] iscsi_session_terminate_task: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): freezing devq
[3885] iscsi_ioctl_daemon_fail: 169.254.2.3:3260 (iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1): iscsid(8) failed: Connection refused

# iscsictl -Lv
Session ID:               2
Initiator name:           iqn.1994-09.org.freebsd:a01.cabal5.net
Initiator portal:
Initiator alias:
Target name:              iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1
Target portal:            169.254.2.3:3260
Target alias:
User:
Secret:
Mutual user:
Mutual secret:
Session type:             Normal
Enable:                   Yes
Session state:            Disconnected
Failure reason:           Connection refused
Header digest:            None
Data digest:              None
MaxRecvDataSegmentLength: 8192
MaxSendDataSegmentLength: 8192
MaxBurstLen:              262144
FirstBurstLen:            65536
ImmediateData:            Yes
iSER (RDMA):              No
Offload driver:           None
Device nodes:             probe0



## wireshark dump from pcap

### client -> server
iSCSI (Login Command)
    Opcode: Login Command (0x03)
    1... .... = T: Transit to next login stage
    .0.. .... = C: Text is complete
    .... 00.. = CSG: Security negotiation (0x0)
    .... ..01 = NSG: Operational negotiation (0x1)
    VersionMax: 0x00
    VersionMin: 0x00
    TotalAHSLength: 0x00
    DataSegmentLength: 163 (0x000000a3)
    ISID: 80 12 4a 8c 6f 4f
        10.. .... = ISID_t: Random (0x2)
        ..00 0000 = ISID_a: 0x00
        ISID_b: 0x124a
        ISID_c: 0x8c
        ISID_d: 0x6f4f
    TSIH: 0x0000
    InitiatorTaskTag: 0x00000000
    CID: 0x0000
    CmdSN: 0x00000000
    ExpStatSN: 0x00000001
    Key/Value Pairs
        KeyValue: AuthMethod=None
        KeyValue: InitiatorName=iqn.1994-09.org.freebsd:a01.cabal5.net
        KeyValue: SessionType=Normal
        KeyValue: TargetName=iqn.2015-12.com.oracleiaas:96df523d-a0c4-4d20-9d97-6900e94a1bc1
    Padding: 00

### server -> client

iSCSI (Login Response)
    Opcode: Login Response (0x23)
    1... .... = T: Transit to next login stage
    .0.. .... = C: Text is complete
    .... 01.. = CSG: Operational negotiation (0x1)
    .... ..11 = NSG: Full feature phase (0x3)
    VersionMax: 0x00
    VersionActive: 0x00
    TotalAHSLength: 0x00
    DataSegmentLength: 226 (0x000000e2)
    ISID: 80 12 4a 8c 6f 4f
        10.. .... = ISID_t: Random (0x2)
        ..00 0000 = ISID_a: 0x00
        ISID_b: 0x124a
        ISID_c: 0x8c
        ISID_d: 0x6f4f
    TSIH: 0x0100
    InitiatorTaskTag: 0x00000000
    StatSN: 0x00000001
    ExpCmdSN: 0x00000000
    MaxCmdSN: 0x00000001
    Status: Success (0x0000)
    Key/Value Pairs
        KeyValue: iSCSIProtocolLevel=NotUnderstood
        KeyValue: HeaderDigest=None
        KeyValue: DataDigest=None
        KeyValue: ImmediateData=Yes
        KeyValue: MaxBurstLength=262144
        KeyValue: FirstBurstLength=65536
        KeyValue: InitialR2T=Yes
        KeyValue: MaxOutstandingR2T=1
        KeyValue: DefaultTime2Wait=2
        KeyValue: DefaultTime2Retain=0
        KeyValue: ErrorRecoveryLevel=0
    Padding: 00 00

### client -> server

iSCSI (SCSI Command)
    Opcode: SCSI Command (0x01)
    .0.. .... = I: Queued delivery
    TotalAHSLength: 0x00
    DataSegmentLength: 0 (0x00000000)
    LUN
        00.. .... = Address Mode: Simple logical unit addressing (0x00)
        ..00 0000  0000 0000 = LUN: 0x0000
    InitiatorTaskTag: 0x05000000
    ExpectedDataTransferLength: 0x00000018
    CmdSN: 0x00000000
    ExpStatSN: 0x00000002
    Data In in: 21
    Response in: 21
Flags: 0xc0, F, R, Attr: Untagged
    1... .... = F: Final PDU in sequence
    .1.. .... = R: Data will be read from target
    ..0. .... = W: No data will be written to target
    .... .000 = Attr: Untagged (0x0)
SCSI CDB Mode Sense(6)
    [LUN: 0x0000]
    [Command Set:Direct Access Device (0x00) (Using default commandset)]
    [Response in: 21]
    Opcode: Mode Sense(6) (0x1a)
    DBD = 0
    00.. .... = Page Control: Current Values (0)
    ..00 1010 = SPC-2 Page Code: Control (0x0a)
    Allocation Length: 24
    Control: 0x00
        00.. .... = Vendor specific: 0x0
        ..00 0... = Reserved: 0x0
        .... .0.. = NACA: Normal ACA is not set
        .... ..0. = Obsolete: 0x0
        .... ...0 = Obsolete: 0x0

### server -> client

iSCSI (SCSI Data In)
    Opcode: SCSI Data In (0x25)
    Flags: 0x80
        1... .... = F: Final PDU in sequence
        .0.. .... = A: Acknowledge not requested
        .... .0.. = O: No residual overflow occurred
        .... ..0. = U: No residual underflow occurred
        .... ...0 = S: Response does not contain SCSI status
    TotalAHSLength: 0x00
    DataSegmentLength: 24 (0x00000018)
    InitiatorTaskTag: 0x05000000
    StatSN: 0x00000000
    ExpCmdSN: 0x00000001
    MaxCmdSN: 0x00000081
    DataSN: 0x00000000
    BufferOffset: 0x00000000
    ResidualCount: 0x00000000
    Request in: 20
    Response in: 21
SCSI Payload (Mode Sense(6) Response Data)
    [LUN: 0x0000]
    [Command Set:Direct Access Device (0x00) (Using default commandset)]
    [SBC Opcode: Mode Sense(6) (0x1a)]
    [Request in: 20]
    [Response in: 21]
    Mode Data Length: 0
iSCSI (SCSI Response)
    Opcode: SCSI Response (0x21)
    Response: Command completed at target (0x00)
    Status: Check Condition (0x02)
    TotalAHSLength: 0x00
    DataSegmentLength: 20 (0x00000014)
    InitiatorTaskTag: 0x05000000
    StatSN: 0x00000002
    ExpCmdSN: 0x00000001
    MaxCmdSN: 0x00000081
    ExpDataSN: 0x00000000
    BidiReadResidualCount: 0x00000000
    ResidualCount: 0x00000000
    Request in: 20
    Time from request: 0.000114000 seconds
    Data In in: 21
    SenseLength: 0x0012
Flags: 0x80
    ...0 .... = o: No overflow of read part of bi-directional command
    .... 0... = u: No underflow of read part of bi-directional command
    .... .0.. = O: No residual overflow occurred
    .... ..0. = U: No residual underflow occurred
SCSI: SNS Info
    [LUN: 0x0000]
    .111 0000 = SNS Error Type: Current Error (0x70)
    Valid: 112
    0... .... = Filemark: False
    .0.. .... = EOM: False
    ..0. .... = ILI: False
    .... 0110 = Sense Key: Unit Attention (0x6)
    Sense Info: 0x00000000
    Additional Sense Length: 10
    Command-Specific Information: 00 00 00 00
    Additional Sense Code+Qualifier: Power On, Reset, Or Bus Device Reset Occurred (0x2900)
    <Additional Sense Code: 0x29>
    <Additional Sense Code Qualifier: 0x00>
    Field Replaceable Unit Code: 0x00
    0... .... = SKSV: False
    .000 0000 0000 0000 0000 0000 = Sense Key Specific: 0x000000


AFAICT this repeats until connection is dropped
Comment 1 Dave Cottlehuber freebsd_committer 2021-10-13 19:15:39 UTC
https://mailarchive.ietf.org/arch/msg/ips/8WBo2pcVojpuwBoCb2IPPRW6VWA/may be relevant.
Comment 3 Dave Cottlehuber freebsd_committer 2021-10-13 19:39:08 UTC
see URL https://reviews.freebsd.org/D31594 for additional context
Comment 4 Dave Cottlehuber freebsd_committer 2021-10-13 22:16:08 UTC
commentin out the error handling as added in D31594 as suggested by Ed,
*and* bumping up mtu to 16000, gets us a bit further, but volume is
still inaccessible and causes the new disk to hang.

[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): connection error; reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): PDU 0x0 StatSN 0 != session ExpStatSN 1 (or + 1); reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): received PDU with unsupported opcode 0x0; reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): PDU 0x0 StatSN 0 != session ExpStatSN 1 (or + 1); reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): received PDU with unsupported opcode 0x0; reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): PDU 0x0 StatSN 0 != session ExpStatSN 1 (or + 1); reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): received PDU with unsupported opcode 0x0; reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): PDU 0x0 StatSN 0 != session ExpStatSN 1 (or + 1); reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): received PDU with unsupported opcode 0x0; reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): PDU 0x0 StatSN 0 != session ExpStatSN 1 (or + 1); reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): received PDU with unsupported opcode 0x0; reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): PDU 0x0 StatSN 0 != session ExpStatSN 1 (or + 1); reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): received PDU with unsupported opcode 0x0; reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): PDU 0x0 StatSN 0 != session ExpStatSN 1 (or + 1); reconnecting
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): received PDU with unsupported opcode 0x0; reconnecting
[2655] WARNING: icl_conn_receive_pdu: received data segment length 16384 is larger than negotiated; dropping connection
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): connection error; reconnecting
[2655] WARNING: icl_conn_receive_pdu: received data segment length 16306705 is larger than negotiated; dropping connection
[2655] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): connection error; reconnecting
...
[2709] da1 at iscsi1 bus 0 scbus1 target 0 lun 1
da1: <ORACLE BlockVolume 1.0>  detached
[2709] GEOM: da1: corrupt or invalid GPT detected.
[2709] GEOM: da1: GPT rejected -- may not be recoverable.
[2709] g_dev_taste: g_dev_taste(da1) failed to g_attach, error=6
[2709] (da1:iscsi1:0:0:1): Periph destroyed
[2722] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): ExpDataSN mismatch in SCSI Response (0 vs 1)
[2722] da1 at iscsi2 bus 0 scbus1 target 0 lun 1
da1: <ORACLE BlockVolume 1.0> Fixed Direct Access SPC-4 SCSI device
da1: 150.000MB/s transfers
da1: Command Queueing enabled
da1: 48128MB (98566144 512 byte sectors)
[2722] (da1:iscsi2:0:0:1): READ(6)/WRITE(6) not supported, increasing minimum_cmd_size to 10.
[2727] WARNING: 169.254.2.7:3260 (iqn.2015-02.oracle.boot:uefi): no ping reply (NOP-In) after 5 seconds; reconnecting
...

root@a01 /u/h/dch# gpart show

^c
^z
Comment 5 John Baldwin freebsd_committer freebsd_triage 2021-10-15 16:04:48 UTC
Please redo the packet capture with '-s 0'.  The SCSI response PDUs are in the same packet as the Data-In but can't be seen in WireShark since the captured packets are truncated.  Given the small number of packets involved, using '-s 0' should be ok without risk of dropping packets.

Note that since there is a Data-In, the ExpDataSN field should be 1 in the SCSI Response.  It is a count of the Data-In PDUs sent in the reply to the original request prior to the SCSI Response PDU that closes it:

https://datatracker.ietf.org/doc/html/rfc7143#section-11.4.8

The error messages in comment #4 suggest a completely garbled connection, e.g. if there are extra bytes in the stream that cause the client to read PDU headers at the wrong offset.  'PDU 0x0' means that it thinks the opcode of the PDU is 0 (NOP-Out).  However, NOP-out PDUs are only sent by the initiator (client), not the target.

I wonder if OCI is using AHS (which isn't generally supported in FreeBSD's iSCSI stack)?  A full pcap might help.

Oh, also, the LoginRespone shows 'iSCSIProtocolLevel=Not Understood" in the reply.  That may be ok though as that just means we can't use features from RFC 7144 I think (reading in RFC 7144 which defines version 2 where as 7143 is version 1).
Comment 6 Dave Cottlehuber freebsd_committer 2021-10-15 17:09:31 UTC
Created attachment 228726 [details]
pcap v2 with -s 0

# tcpdump -i vtnet0 -w /var/tmp/iscsi2.pcap -s 0 port 3260

root@a01:~ # grep -v \# /etc/iscsi.conf
behemoth {
        targetaddress   = 169.254.2.9:3260
        targetname      = iqn.2015-02.oracle.boot:uefi
}

root@a01:~ # sysctl kern.iscsi.debug=10
kern.iscsi.debug: 1 -> 10

root@a01:~ # tcpdump -i vtnet0 -w /tmp/iscsi.pcap -s 0 port 3260 &
[2] 56458

root@a01:~ # service iscsid onestart
Starting iscsid.
root@a01:~ # iscsictl -Lv
root@a01:~ # iscsictl -Aa
root@a01:~ # iscsictl -Lv
Session ID:               7
Initiator name:           iqn.1994-09.org.freebsd:a01.cabal5.net
Initiator portal:         
Initiator alias:          
Target name:              iqn.2015-02.oracle.boot:uefi
Target portal:            169.254.2.9:3260
Target alias:             
User:                     
Secret:                   
Mutual user:              
Mutual secret:            
Session type:             Normal
Enable:                   Yes
Session state:            Disconnected
Failure reason:           Connection refused
Header digest:            None
Data digest:              None
MaxRecvDataSegmentLength: 8192
MaxSendDataSegmentLength: 8192
MaxBurstLen:              262144
FirstBurstLen:            65536
ImmediateData:            Yes
iSER (RDMA):              No
Offload driver:           None
Device nodes:             probe0
Comment 7 Dave Cottlehuber freebsd_committer 2021-10-15 17:38:03 UTC
note to self; more learnings https://reviews.freebsd.org/D31577
Comment 8 John Baldwin freebsd_committer freebsd_triage 2021-10-15 22:43:41 UTC
Created attachment 228734 [details]
permit_expdatasn_0.patch

Not tested, but I did notice that we should require an ExpDataSN of 0 if the response is not command completed.

Looking at the latest trace, it does seem that the SCSI response does include an ExpDataSN of 0, and I've added an exception that tries to exempt this check for errors.  It's not conformant, but Linux also doesn't check for this that I can see (linux/drivers/scsi/libiscsi.c:iscsi_scsi_cmd_rsp()).

This though will just get you back to comment #4 I think and we will still need a pcap with -s 0 of the result to look at for the next issue.
Comment 9 Dave Cottlehuber freebsd_committer 2021-10-21 11:36:59 UTC
thanks John!

permit_expdatasn_0.patch gets us further - we can successfully attach a volume now, and partition the device:


# iscsictl -A -p 169.254.2.2:3260 -t iqn.2015-12.com.oracleiaas:f896d637-0a99-4fc7-9aed-ecfc8c2166f2
# iscsictl -Lv
Session ID:               1
Initiator name:           iqn.1994-09.org.freebsd:hobbes.skunkwerks.at
Initiator portal:         
Initiator alias:          
Target name:              iqn.2015-12.com.oracleiaas:f896d637-0a99-4fc7-9aed-ecfc8c2166f2
Target portal:            169.254.2.2:3260
Target alias:             
User:                     
Secret:                   
Mutual user:              
Mutual secret:            
Session type:             Normal
Enable:                   Yes
Session state:            Connected
Failure reason:           
Header digest:            None
Data digest:              None
MaxRecvDataSegmentLength: 8192
MaxSendDataSegmentLength: 8192
MaxBurstLen:              262144
FirstBurstLen:            65536
ImmediateData:            Yes
iSER (RDMA):              No
Offload driver:           None
Device nodes:             da1 


next, we run into:

# tcpdump -nns 0 -C 32 -w /var/tmp/iscsi/iscsi.pcap port 3260 &
# time dd if=/dev/random of=/dev/da1 count=1 bs=1g

Password:WARNING: icl_conn_receive_pdu: received data segment length 16384 is larger than negotiated; dropping connection

WARNING: 169.254.2.2:3260 (iqn.2015-12.com.oracleiaas:f896d637-0a99-4fc7-9aed-ecfc8c2166f2): connection error; reconnecting
panic: isc->receive_pdu != NULL
cpuid = 1
time = 1634815845
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x184
panic() at panic+0x44
icl_receive_thread() at icl_receive_thread+0x734
fork_exit() at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 0 tid 100279 ]
Stopped at      kdb_enter+0x44: undefined       f901811f
db>
Comment 10 Dave Cottlehuber freebsd_committer 2021-10-21 12:25:25 UTC
Created attachment 228921 [details]
post permit_expdatasn_0.patch with iscsi attach & running gpart

Soon after transferring data, we lose connection and panic. I do not yet have a pcap *during* the crash.
Comment 11 Dave Cottlehuber freebsd_committer 2021-10-21 13:07:42 UTC
Created attachment 228924 [details]
icl_conn_receive_pdu crash data segment length error

putting haproxy in between, in tcp mode, and attaching (but not mounting via iscsi) the block vol to the other node, gives us both a crash, and a pcap.

da1: <ORACLE BlockVolume 1.0> Fixed Direct Access SPC-4 SCSI device
da1: 150.000MB/s transfers
da1: Command Queueing enabled
da1: 51200MB (104857600 512 byte sectors)
iscsi_pdu_handle_scsi_response: 10.0.0.86 (iqn.2015-12.com.oracleiaas:f896d637-0a99-4fc7-9aed-ecfc8c2166f2): freezing devq
(da1:iscsi6:0:0:1): READ(6)/WRITE(6) not supported, increasing minimum_cmd_size to 10.
WARNING: icl_conn_receive_pdu: received data segment length 16384 is larger than negotiated; dropping connection
WARNING: 10.0.0.86 (iqn.2015-12.com.oracleiaas:f896d637-0a99-4fc7-9aed-ecfc8c2166f2): connection error; reconnecting
panic: isc->receive_pdu != NULL
cpuid = 3
time = 1634821349
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
vpanic() at vpanic+0x184
panic() at panic+0x44
icl_receive_thread() at icl_receive_thread+0x734
fork_exit() at fork_exit+0x74
fork_trampoline() at fork_trampoline+0x14
KDB: enter: panic
[ thread pid 0 tid 100287 ]
Stopped at      kdb_enter+0x44: undefined       f901811f
Comment 12 Ed Maste freebsd_committer 2021-10-21 14:32:19 UTC
(In reply to Dave Cottlehuber from comment #11)
Notes from an IRC debugging session with dch@ and andrew@:

The panic looks like failed error handling after a reconnect attempt so needs to be fixed but is not the primary issue; this is:

WARNING: icl_conn_receive_pdu: received data segment length 16384 is larger than negotiated; dropping connection

sys/dev/iscsi/icl_soft.c:

                len = icl_pdu_data_segment_length(request);
                if (len > ic->ic_max_recv_data_segment_length) {  <---
                        ICL_WARN("received data segment "
                            "length %zd is larger than negotiated; "
                            "dropping connection", len);
                        error = EINVAL;
                        break;
                }

ic_max_recv_data_segment_length ultimately gets set by iscsid handling the new connection. In usr.sbin/iscsid/login.c in login_negotiate_key:

        } else if (strcmp(name, "MaxRecvDataSegmentLength") == 0) {
                tmp = strtoul(value, NULL, 10);
                if (tmp <= 0)
                        log_errx(1, "received invalid "
                            "MaxRecvDataSegmentLength");
                if (tmp > isl->isl_max_send_data_segment_length) {
                        log_debugx("capping max_send_data_segment_length "
                            "from %d to %d", tmp,
                            isl->isl_max_send_data_segment_length);
                        tmp = isl->isl_max_send_data_segment_length;
                }
                conn->conn_max_send_data_segment_length = tmp;
                /* We received target's limit, that means it accepted our's. */
                conn->conn_max_recv_data_segment_length =   <---
                    isl->isl_max_recv_data_segment_length;  <---

So it appears conn_max_recv_data_segment_length is only set once we receive MaxRecvDataSegmentLength from the target

from proxied.pcap,

frame 15 our login command contains MaxRecvDataSegmentLength=262144
the response is in frame 17 and does not contain MaxRecvDataSegmentLength, so we never update conn_max_recv_data_segment_length 

This handling for conn_max_recv_data_segment_length came in with
https://cgit.freebsd.org/src/commit/?id=a15fbc904a4d535e0c438bcecac056fca4fa3c5e
Comment 13 Dave Cottlehuber freebsd_committer 2021-10-21 16:18:09 UTC
relevant packets to save others opening tcpdump/wireshark

WARNING: icl_conn_receive_pdu: received data segment length 16384 is larger than negotiated; dropping connection

this is the initial login packet sent from FreeBSD to OCI iscsi target:

iSCSI (Login Command)
    Opcode: Login Command (0x03)
    1... .... = T: Transit to next login stage
    .0.. .... = C: Text is complete
    .... 01.. = CSG: Operational negotiation (0x1)
    .... ..11 = NSG: Full feature phase (0x3)
    VersionMax: 0x00
    VersionMin: 0x00
    TotalAHSLength: 0x00
    DataSegmentLength: 249 (0x000000f9)
    ISID: 80 d5 92 14 52 4d
        10.. .... = ISID_t: Random (0x2)
        ..00 0000 = ISID_a: 0x00
        ISID_b: 0xd592
        ISID_c: 0x14
        ISID_d: 0x524d
    TSIH: 0x0000
    InitiatorTaskTag: 0x00000000
    CID: 0x0000
    CmdSN: 0x00000000
    ExpStatSN: 0x00000001
    Key/Value Pairs
        KeyValue: iSCSIProtocolLevel=2
        KeyValue: HeaderDigest=None
        KeyValue: DataDigest=None
        KeyValue: ImmediateData=Yes
        KeyValue: MaxBurstLength=1048576
        KeyValue: FirstBurstLength=1048576
        KeyValue: InitialR2T=Yes
        KeyValue: MaxOutstandingR2T=1
        KeyValue: MaxRecvDataSegmentLength=262144 <----- 
        KeyValue: DefaultTime2Wait=0
        KeyValue: DefaultTime2Retain=0
        KeyValue: ErrorRecoveryLevel=0
    Padding: 00 00 00

and this below is the specific iscsi message that causes FreeBSD to panic:

Transmission Control Protocol, Src Port: 3260, Dst Port: 24622, Seq: 27737, Ack: 2441, Len: 48
[3 Reassembled TCP Segments (16432 bytes): #194(8948), #195(7436), #196(48)]
    [Frame: 194, payload: 0-8947 (8948 bytes)]
    [Frame: 195, payload: 8948-16383 (7436 bytes)]
    [Frame: 196, payload: 16384-16431 (48 bytes)]
    [Segment count: 3]
    [Reassembled TCP length: 16432]
    [Reassembled TCP Data: 25 81 00 00 00 00 40 00 00 00 00 00 00 00 00 00 27 00 00 00 ff ff ff ff …]
iSCSI (SCSI Data In)
    Opcode: SCSI Data In (0x25)
    Flags: 0x81
        1... .... = F: Final PDU in sequence
        .0.. .... = A: Acknowledge not requested
        .... .0.. = O: No residual overflow occurred
        .... ..0. = U: No residual underflow occurred
        .... ...1 = S: Response contains SCSI status
    Status: Good (0x00)
    TotalAHSLength: 0x00
    DataSegmentLength: 16384 (0x00004000)
    InitiatorTaskTag: 0x27000000
    StatSN: 0x00000029
    ExpCmdSN: 0x00000028
    MaxCmdSN: 0x000000a8
    DataSN: 0x00000000
    BufferOffset: 0x00000000
    ResidualCount: 0x00000000
    Request in: 188
    Time from request: 0.000568000 seconds
SCSI Payload (Read(10) Response Data)
    [LUN: 0x0001]
    [Command Set:Direct Access Device (0x00) ]
    [SBC Opcode: Read(10) (0x28)]
    [Request in: 188]
    [Response in: 196]
SCSI Response (Read(10))
    [LUN: 0x0001]
    [Command Set:Direct Access Device (0x00) ]
    [SBC Opcode: Read(10) (0x28)]
    [Request in: 188]
    [Time from request: 0.000568000 seconds]
    [Status: Good (0x00)]
Comment 14 Dave Cottlehuber freebsd_committer 2021-10-22 07:55:33 UTC
Created attachment 228938 [details]
iscsid: default conn_max_recv_data_segment_length to our request

Author: Ed Maste <emaste@FreeBSD.org>
Date:   Thu Oct 21 11:09:58 2021 -0400

    iscsid: default conn_max_recv_data_segment_length to our request
    
    XXX attempted workaround for PR259152; target did not respond with
    MaxRecvDataSegmentLength and so we did not update
    conn_max_recv_data_segment_length.
    
    PR:             259152

Ed's patch here "works for me" - I can see, mount, & use iscsi targets on both VM and BM now, handling a full 20GiB zpool resilver between volumes without issue. I expect we'll survive a buildworld as well.

My understanding is that we have several "wip patches" that need somebody who's more familiar with the spec and the code to make a call if this is (a) appropriate and (b) if patches need to evolve further to commit to main.
Comment 15 Ed Maste freebsd_committer 2021-10-22 14:01:45 UTC
MaxRecvDataSegmentLength issue (first mentioned in comment #11) split out to PR259355.