Bug 260449 - incorrect PRIV_REQUEST() uses in iscsi target code
Summary: incorrect PRIV_REQUEST() uses in iscsi target code
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: John Baldwin
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-12-15 19:56 UTC by Robert Morris
Modified: 2022-05-07 19:37 UTC (History)
5 users (show)

See Also:


Attachments
Cause iscsi target to crash due to incorrect PRIV_REQUEST() uses. (4.56 KB, text/plain)
2021-12-15 19:56 UTC, Robert Morris
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Robert Morris 2021-12-15 19:56:09 UTC
Created attachment 230154 [details]
Cause iscsi target to crash due to incorrect PRIV_REQUEST() uses.

I suspect this line in cfiscsi_done():

  cs = PRIV_REQUEST(io);

should be

  request = PRIV_REQUEST(io);
  cs = PDU_SESSION(request);

And this line in cfiscsi_session_terminate_tasks() looks odd:

  PRIV_REQUEST(io) = cs;

I'd expect ... = request, not ... = cs.

I've attached a program that (on my machine) produces panic from the
first problem, and (once the first is fixed) a kernel page fault due to
the second.

panic: refcount 0xffffffd0023eb7b4 wraparound
panic() at panic+0x2a
_refcount_update_saturated() at _refcount_update_saturated+0x16
refcount_releasen() at refcount_releasen+0x4c
refcount_release() at refcount_release+0xc
cfiscsi_done() at cfiscsi_done+0x3e
ctl_process_done() at ctl_process_done+0x460
ctl_work_thread() at ctl_work_thread+0x13e
fork_exit() at fork_exit+0x80
fork_trampoline() at fork_trampoline+0xa

panic: Fatal page fault at 0xffffffc0000468c4: 0x000000000000ba
--- exception 15, tval = 0xba
cfiscsi_done() at cfiscsi_done+0x5e
ctl_process_done() at ctl_process_done+0x460
ctl_work_thread() at ctl_work_thread+0x13e
fork_exit() at fork_exit+0x80
fork_trampoline() at fork_trampoline+0xa

FreeBSD  14.0-CURRENT FreeBSD 14.0-CURRENT #160 main-n250912-e4746deeda02-dirty: Wed Dec 15 14:36:14 EST 2021     rtm@xxx:/usr/obj/usr/rtm/symbsd/src/riscv.riscv64/sys/RTM  riscv
Comment 1 John Baldwin freebsd_committer freebsd_triage 2021-12-22 18:10:59 UTC
So I think the issue is that in cfiscsi_terminate_tasks() there is no request to store in PRIV_REQUEST() so it instead stores the session directly.  However, it is possible to send a request on the wire that also results in CTL_IO_TASK and and CTL_TASK_I_T_NEXUS_RESET in cfiscsi_pdu_handle_task_request:

	case BHSTMR_FUNCTION_I_T_NEXUS_RESET:
#if 0
		CFISCSI_SESSION_DEBUG(cs, "BHSTMR_FUNCTION_I_T_NEXUS_RESET");
#endif
		io->taskio.task_action = CTL_TASK_I_T_NEXUS_RESET;
		break;


And in this case, PRIV_REQUEST() is the request instead.

Your test program constructs a normal CDB inquiry but then xors the first two bytes with 0x4b03 to generate 0x8b02 (little endian).  0x2 == ISCSI_BHS_OPCODE_TASK_REQUEST and 0x8b as the function == 0x80 | BHSTMR_FUNCTION_I_T_NEXUS_RESET.

I think the bug is we need a clean way to differentiate between an on-the-wire reset vs the internally-initiatired reset.  One way may be to allocate a new CTL_TASK_INTERNAL_RESET or some such that terminate_tasks would use instead of of CTL_TASK_I_T_NEXUS_RESET, and ctl_run_task would treat as identical to CTL_TASK_I_T_NEXUS_RESET.  The only thing I don't yet understand for that is cfumass_done() in the usb storage driver checking for this opcode.  Nothing in the cfumass driver schedules a CTL_IO_TASK, so I suspect the code in cfumass_done is just stale copy/paste from cfiscsi_done?
Comment 2 John Baldwin freebsd_committer freebsd_triage 2022-01-26 21:40:41 UTC
Candidate fix up for review at https://reviews.freebsd.org/D34055
Comment 3 commit-hook freebsd_committer freebsd_triage 2022-01-28 21:10:50 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=2e8d1a55258d39f7315fa4f2164c0fce96e79802

commit 2e8d1a55258d39f7315fa4f2164c0fce96e79802
Author:     John Baldwin <jhb@FreeBSD.org>
AuthorDate: 2022-01-28 21:07:04 +0000
Commit:     John Baldwin <jhb@FreeBSD.org>
CommitDate: 2022-01-28 21:07:04 +0000

    iscsi: Allocate a dummy PDU for the internal nexus reset task.

    When an iSCSI target session is terminated, an internal nexus reset
    task is posted to abort existing tasks belonging to the session.
    Previously, the ctl_io for this internal nexus reset stored a pointer
    to the session in the slot that normally holds a pointer to the PDU
    from the initiator that triggered the I/O request.  The completion
    handler then assumed that any nexus reset I/O was due to an internal
    request and fetched the session pointer (instead of the PDU pointer)
    from the ctl_io.  However, it is possible to trigger a nexus reset via
    an on-the-wire task management PDU.  If such a PDU were sent to the
    target, then the completion handler would incorrectly treat this
    request as an internal request and treat the pointer to the received
    PDU as a pointer to the session instead.

    To fix, allocate a dummy PDU for the internal reset task and use an
    invalid opcode to differentiate internal nexus resets from resets
    requested by the initiator.

    PR:             260449
    Reported by:    Robert Morris <rtm@lcs.mit.edu>
    Reviewed by:    mav
    Sponsored by:   Chelsio Communications
    Differential Revision:  https://reviews.freebsd.org/D34055

 sys/cam/ctl/ctl_frontend_iscsi.c | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)
Comment 4 Ed Maste freebsd_committer freebsd_triage 2022-04-23 02:33:13 UTC
Any plans to MFC @jhb?
Comment 5 commit-hook freebsd_committer freebsd_triage 2022-04-29 23:13:07 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=7f8f8202edc4ae61a65a766de6ec3420a64d3f7d

commit 7f8f8202edc4ae61a65a766de6ec3420a64d3f7d
Author:     John Baldwin <jhb@FreeBSD.org>
AuthorDate: 2022-01-28 21:07:04 +0000
Commit:     John Baldwin <jhb@FreeBSD.org>
CommitDate: 2022-04-29 23:05:55 +0000

    iscsi: Allocate a dummy PDU for the internal nexus reset task.

    When an iSCSI target session is terminated, an internal nexus reset
    task is posted to abort existing tasks belonging to the session.
    Previously, the ctl_io for this internal nexus reset stored a pointer
    to the session in the slot that normally holds a pointer to the PDU
    from the initiator that triggered the I/O request.  The completion
    handler then assumed that any nexus reset I/O was due to an internal
    request and fetched the session pointer (instead of the PDU pointer)
    from the ctl_io.  However, it is possible to trigger a nexus reset via
    an on-the-wire task management PDU.  If such a PDU were sent to the
    target, then the completion handler would incorrectly treat this
    request as an internal request and treat the pointer to the received
    PDU as a pointer to the session instead.

    To fix, allocate a dummy PDU for the internal reset task and use an
    invalid opcode to differentiate internal nexus resets from resets
    requested by the initiator.

    PR:             260449
    Reported by:    Robert Morris <rtm@lcs.mit.edu>
    Reviewed by:    mav
    Sponsored by:   Chelsio Communications
    Differential Revision:  https://reviews.freebsd.org/D34055

    (cherry picked from commit 2e8d1a55258d39f7315fa4f2164c0fce96e79802)

 sys/cam/ctl/ctl_frontend_iscsi.c | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)