Bug 215727 - [iscsi] target sends invalid NOP-out and drops connection if initiator doesn't reply
Status: Closed Works As Intended
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.0-RELEASE
Hardware: Any Any
Importance: --- Affects Many People
Assignee: Edward Tomasz Napierala
Reported: 2017-01-03 12:16 UTC by Nareg Sinenian
Modified: 2017-04-26 19:52 UTC

Attachments
Sample NOP-In sent by the FreeBSD target (496 bytes, application/octet-stream)
2017-01-03 12:16 UTC, Nareg Sinenian

Description Nareg Sinenian 2017-01-03 12:16:52 UTC
Created attachment 178470
Sample NOP-In sent by the FreeBSD target

The target sends NOPs to the initiator and expects a reply. When it doesn't receive a reply, it drops the connection. The problem, however, is that a standards-compliant initiator will not respond to the NOP sent by the target because it contains an invalid target transfer tag.

See RFC7143 Sec. 10.19.1:

   If the target is sending a NOP-In as a ping (intending to receive a
   corresponding NOP-Out), this field is set to a valid value (not the
   reserved value 0xffffffff).

The FreeBSD target sends a NOP-In expecting a reply, but the field is NOT set to a valid value. It is instead set to the reserved value of 0xffffffff.

I have verified this by examining target traffic. A sample log is attached. In this case, the target dropped the connection due to a "ping timeout" as observed on the console: 

WARNING: 10.10.6.5 (iqn.2015-01.com.localhost): connection error; dropping connection
WARNING: 10.10.6.5 (iqn.2015-01.com.localhost): no ping reply (NOP-Out) after 5 seconds; dropping connection
Comment 1 Julien Cigar 2017-01-06 10:00:37 UTC
It looks like the same problem as bug #211990.
Comment 2 Ben RUBSON 2017-01-06 10:31:20 UTC
Nareg, just to know: are you using FreeBSD as the initiator, and is the FreeBSD initiator impacted by this issue?
Or will it correctly reply to 0xffffffff? (It could, as we would expect it to work correctly with the FreeBSD target :)
Comment 3 Nareg Sinenian 2017-01-06 16:28:00 UTC
(In reply to Ben RUBSON from comment #2)

I am not using FreeBSD as the initiator. I have my own open-source initiator that I've written for macOS (www.github.com/iscsi-osx). The problem is on the target side when my initiator is used with FreeBSD. I looked at the PDUs using Wireshark and confirmed that the FreeBSD target was doing this. I have not used the FreeBSD initiator, so that may or may not have problems (and may or may not be consistently problematic with the target).

What's more, I noted that the FreeBSD target would flip-flop: it sent the correct target transfer tag when in debug mode and the incorrect one when not in debug mode.
Comment 4 Edward Tomasz Napierala freebsd_committer freebsd_triage 2017-01-14 11:15:12 UTC
The piece of FreeBSD iSCSI target that assembles the ping requests looks like this:

        bhsni->bhsni_opcode = ISCSI_BHS_OPCODE_NOP_IN;
        bhsni->bhsni_flags = 0x80;
        bhsni->bhsni_initiator_task_tag = 0xffffffff;

The structure is prezeroed, so the target transfer tag will be 0.  The code that responds to ping requests looks like this:

        bhsni->bhsni_opcode = ISCSI_BHS_OPCODE_NOP_IN;
        bhsni->bhsni_flags = 0x80;
        bhsni->bhsni_initiator_task_tag = bhsno->bhsno_initiator_task_tag;
        bhsni->bhsni_target_transfer_tag = 0xffffffff;

Which makes me wonder - are you absolutely sure the PDU in question is really the request, and not the response?
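One way to tell the two cases apart in a raw capture is to read the tags straight from the 48-byte Basic Header Segment. The following is a rough sketch, assuming the RFC 7143 layout (opcode in the low six bits of byte 0, initiator task tag at offset 16, target transfer tag at offset 20, both big-endian); `classify_nop_in`, `read_be32`, and the enum names are hypothetical and not part of the FreeBSD sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ISCSI_BHS_SIZE       48
#define ISCSI_OPCODE_MASK    0x3f
#define ISCSI_OPCODE_NOP_IN  0x20
#define ISCSI_RESERVED_TAG   0xffffffffu

/* Hypothetical classification of a captured PDU header. */
enum nop_in_kind {
	NOT_A_NOP_IN,
	NOP_IN_PING_REQUEST,   /* valid TTT: target wants a NOP-Out */
	NOP_IN_PING_RESPONSE,  /* reserved TTT, valid ITT: answer to a NOP-Out */
	NOP_IN_NO_REPLY_WANTED /* both tags reserved */
};

/* Read a 32-bit big-endian value from a byte buffer. */
static uint32_t
read_be32(const uint8_t *p)
{
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

static enum nop_in_kind
classify_nop_in(const uint8_t *bhs, size_t len)
{
	uint32_t itt, ttt;

	assert(bhs != NULL);
	if (len < ISCSI_BHS_SIZE ||
	    (bhs[0] & ISCSI_OPCODE_MASK) != ISCSI_OPCODE_NOP_IN)
		return (NOT_A_NOP_IN);

	itt = read_be32(bhs + 16);
	ttt = read_be32(bhs + 20);

	if (ttt != ISCSI_RESERVED_TAG)
		return (NOP_IN_PING_REQUEST);
	if (itt != ISCSI_RESERVED_TAG)
		return (NOP_IN_PING_RESPONSE);
	return (NOP_IN_NO_REPLY_WANTED);
}
```

Fed the header bytes of the PDU in question, a helper like this would report a ping request for the first snippet above (tag 0xffffffff in the initiator task tag, 0 in the target transfer tag) and a response for the second.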

Also - could you tell me some more about the flip-flop?  Note that the FreeBSD target only sends ping requests if it hasn't seen activity (as in, any incoming PDUs) in some time, which means it will send requests if the initiator doesn't; if the initiator does send ping requests, the FreeBSD target will only send responses instead.
Comment 5 Edward Tomasz Napierala freebsd_committer freebsd_triage 2017-01-14 11:18:22 UTC
Oh, and the PDU in the packet trace you've provided contains additional data - the FreeBSD target doesn't include any, except in responses, as required by the RFC.
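Whether a captured PDU carries additional data can be checked from its header alone. A small sketch, assuming the RFC 7143 layout in which bytes 5 to 7 of the 48-byte Basic Header Segment hold the DataSegmentLength as a 24-bit big-endian value; the function name `bhs_data_segment_length` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ISCSI_BHS_SIZE 48

/*
 * Hypothetical helper: returns the DataSegmentLength field of a Basic
 * Header Segment. A non-zero result means the PDU carries a data
 * segment after the header.
 */
static uint32_t
bhs_data_segment_length(const uint8_t *bhs, size_t len)
{
	assert(bhs != NULL);
	assert(len >= ISCSI_BHS_SIZE);
	return (((uint32_t)bhs[5] << 16) | ((uint32_t)bhs[6] << 8) |
	    (uint32_t)bhs[7]);
}
```

A NOP-In that reports a non-zero length here carries ping data, which by the comment above the FreeBSD target would only include in a response.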
Comment 6 Nareg Sinenian 2017-01-21 03:02:55 UTC
The full Wireshark dump (it's 80MB) doesn't contain any NOP-Outs, which is what made me believe that these are initiated by the target, not the initiator. (Is there a way I can post that here? There seems to be a file size limit.)

Regardless, however, why would FreeBSD time out the connection?

Let's say that it is the initiator sending these... does FreeBSD time out the connection when it receives NOPs too frequently?
Comment 7 Edward Tomasz Napierala freebsd_committer freebsd_triage 2017-01-21 10:04:39 UTC
I seem to remember a bug in Wireshark where it would fail to decode some of the iSCSI PDUs.  Since you're developing your own initiator, perhaps you could add some debugging printfs to it?

Regarding the timeouts - the algorithm used by the FreeBSD target is very simple: the timeout happens when no PDUs have been received in the last five seconds (the default, controlled by the kern.cam.ctl.iscsi.ping_timeout sysctl).  The timer is reset by receiving not just NOP-Outs, but any kind of PDU.  After the first second the target starts sending NOP-Ins, to request a NOP-Out from the initiator.
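The algorithm described above can be modelled roughly as a one-second tick. This is a simplified sketch and not the actual FreeBSD code: `ping_state`, `ping_tick`, `ping_pdu_received`, and the enum are hypothetical names, and the choice to stay quiet on the first tick is an assumption about what "after the first second" means.

```c
#include <assert.h>
#include <stddef.h>

/* Default from the description above (kern.cam.ctl.iscsi.ping_timeout). */
#define PING_TIMEOUT_DEFAULT 5

enum ping_action {
	PING_NOTHING,         /* traffic seen within the last second */
	PING_SEND_NOP_IN,     /* idle: ask the initiator for a NOP-Out */
	PING_DROP_CONNECTION  /* idle for the whole timeout */
};

struct ping_state {
	int idle_seconds;     /* ticks since the last incoming PDU */
};

/* Any incoming PDU, not just a NOP-Out, resets the timer. */
static void
ping_pdu_received(struct ping_state *ps)
{
	assert(ps != NULL);
	ps->idle_seconds = 0;
}

/* Assumed to be called once per second. */
static enum ping_action
ping_tick(struct ping_state *ps, int timeout)
{
	assert(ps != NULL);
	ps->idle_seconds++;
	if (ps->idle_seconds >= timeout)
		return (PING_DROP_CONNECTION);
	if (ps->idle_seconds < 2)
		return (PING_NOTHING);
	return (PING_SEND_NOP_IN);
}
```

In this model an initiator that never replies, and sends nothing else, sees a few NOP-Ins and is then dropped at the fifth tick, while any single incoming PDU restarts the count.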
Comment 8 Nareg Sinenian 2017-01-22 02:42:21 UTC
Interesting about Wireshark. I'm working off of a trace that a user sent to me, so little else is available in the way of debug information. I'll set up a VM and see if I can reproduce. Stay tuned.
Comment 9 Edward Tomasz Napierala freebsd_committer freebsd_triage 2017-04-11 19:53:06 UTC
Any updates on this?