Bug 224037 - Kernel crashes when trying to mount certain USB keys reported as WriteProtected
Summary: Kernel crashes when trying to mount certain USB keys reported as WriteProtected
Status: Closed DUPLICATE of bug 210316
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 11.1-STABLE
Hardware: amd64 Any
: --- Affects Only Me
Assignee: freebsd-scsi (Nobody)
URL:
Keywords:
: 224009 (view as bug list)
Depends on:
Blocks:
 
Reported: 2017-12-02 13:35 UTC by Petr Leszkow
Modified: 2018-02-15 16:31 UTC (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Petr Leszkow 2017-12-02 13:35:20 UTC
I am not sure if this is a "KERNEL" bug, but every time when I just plug the write protected USB key ( for example genuine Windows 10 installation key - I am trying to install Windows 10 into VirtualBos ) my systems crashes and reboots itself.


FreeBSD m4700 11.1-RELEASE FreeBSD 11.1-RELEASE #0 r321309: Fri Jul 21 02:08:28 UTC 2017   root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64


Mate Desktop with Polkit settings to auto-mount usb keys.

The only way how to avoid the immediate system crash/reboot is to force mount usb key with ReadOnly option plus auto-mount Polkit policy disabled. Therefore is there any kernel-level-method, to detect correctly write protected keys and mount them read-only automatically? 


Thank you for your help.

PetrL

System log:

ugen2.4: <KDI-MSFT Windows 10> at usbus2
umass1 on uhub4
umass1: <KDI-MSFT Windows 10, class 0/0, rev 2.10/1.10, addr 4> on usbus2
umass1:  SCSI over Bulk-Only; quirks = 0x8100
umass1:4:1: Attached to scbus4
da1 at umass-sim1 bus 1 scbus4 target 0 lun 0
da1: <KDI-MSFT Windows 10 PMAP> Removable Direct Access SPC-4 SCSI device
da1: Serial Number 001A92053F4DB0B0B27403F8
da1: 40.000MB/s transfers
da1: 14784MB (30277632 512 byte sectors)
da1: quirks=0x2<NO_6_BYTE>
(da1:umass-sim1:1:0:0): WRITE(10). CDB: 2a 00 00 00 14 a0 00 00 08 00
(da1:umass-sim1:1:0:0): CAM status: SCSI Status Error
(da1:umass-sim1:1:0:0): SCSI status: Check Condition
(da1:umass-sim1:1:0:0): SCSI sense: DATA PROTECT asc:27,0 (Write protected)
(da1:umass-sim1:1:0:0): Error 13, Unretryable error
g_vfs_done():da1s1[WRITE(offset=1654784, length=4096)]error = 13
(da1:umass-sim1:1:0:0): WRITE(10). CDB: 2a 00 00 00 14 a0 00 00 08 00
(da1:umass-sim1:1:0:0): CAM status: SCSI Status Error
(da1:umass-sim1:1:0:0): SCSI status: Check Condition
(da1:umass-sim1:1:0:0): SCSI sense: DATA PROTECT asc:27,0 (Write protected)
(da1:umass-sim1:1:0:0): Error 13, Unretryable error
g_vfs_done():da1s1[WRITE(offset=1654784, length=4096)]error = 13
(da1:umass-sim1:1:0:0): WRITE(10). CDB: 2a 00 00 00 14 a0 00 00 08 00
(da1:umass-sim1:1:0:0): CAM status: SCSI Status Error
(da1:umass-sim1:1:0:0): SCSI status: Check Condition
(da1:umass-sim1:1:0:0): SCSI sense: DATA PROTECT asc:27,0 (Write protected)
(da1:umass-sim1:1:0:0): Error 13, Unretryable error
g_vfs_done():da1s1[WRITE(offset=1654784, length=4096)]error = 13
fsync: giving up on dirty 0xfffff8005152fb10: tag devfs, type VCHR
    usecount 1, writecount 0, refcount 1850 mountedhere 0xfffff8005170ae00
    flags (VI_ACTIVE)
    v_object 0xfffff8000d03a2d0 ref 0 pages 1848 cleanbuf 1847 dirtybuf 1
    lock type devfs: UNLOCKED
        dev da1s1

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff809c0c2b
stack pointer           = 0x28:0xfffffe04648448e0
frame pointer           = 0x28:0xfffffe0464844910
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 27 (syncer)
trap number             = 9
panic: general protection fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80aada97 at kdb_backtrace+0x67
#1 0xffffffff80a6bb76 at vpanic+0x186
#2 0xffffffff80a6b9e3 at panic+0x43
#3 0xffffffff80edf832 at trap_fatal+0x322
#4 0xffffffff80edee9e at trap+0x5e
#5 0xffffffff80ec3641 at calltrap+0x8
#6 0xffffffff80b07086 at bufwrite+0x256
#7 0xffffffff80b1889e at vop_stdfsync+0x19e
#8 0xffffffff8104cc59 at VOP_FSYNC_APV+0x89
#9 0xffffffff80b2e28a at sched_sync+0x4fa
#10 0xffffffff80a2f815 at fork_exit+0x85
#11 0xffffffff80ec3b7e at fork_trampoline+0xe
Comment 1 Conrad Meyer freebsd_committer freebsd_triage 2017-12-02 16:30:14 UTC
*** Bug 224009 has been marked as a duplicate of this bug. ***
Comment 2 Conrad Meyer freebsd_committer freebsd_triage 2017-12-02 16:32:05 UTC
Hm, this is probably below FS -- CAM.
Comment 3 Petr Leszkow 2017-12-02 17:29:47 UTC
(In reply to Conrad Meyer from comment #2)

Hi Condrad,

I checked the bug #224009 and the only difference in my case is, the USB key was still connected to usb port during panic and crash, therefore the device was not missing/removed physically from computer.

Basically, if the key is present in usb port, RO mount is successfull - system is not affected. Umount - still OK, another mount - key still phys. connected, but with RW option, crash is almost instatnt and the result is computer reboot.

Hope that helps to clarify the crash "sequence".

Thank you.

Petr
Comment 4 Conrad Meyer freebsd_committer freebsd_triage 2017-12-02 17:50:31 UTC
Petr,

I suspect when CAM sees EROFS or EIO errors due to write requests to read only media failing, it ejects the device.  Something about the ejection path is failing to clean up references to some internal object, which is freed before buffers referencing it can be released.
Comment 5 Conrad Meyer freebsd_committer freebsd_triage 2017-12-02 17:51:20 UTC
Sorry, I should have looked more closely at the logs.  13 = EACCES.
Comment 6 Petr Leszkow 2017-12-02 17:57:10 UTC
btw, just checked SCSI Asignment at http://www.t10.org/lists/asc-num.htm#ASC_27

for ASC 27,0 ...see list bellow:

so, I think it would be sufficient if kernel can instruct the cam or any upper layer to do not attempt to mount devices with ASC 27,0 and 27,1 as RW and handle them for example as CD/DVD asc 27,4 asc 27,5 ?

What do you think ? 

Thanks.

Petr




       D - Direct Access Block Device (SBC-4)         Device Column key
       .Z - Host Managed Zoned Block Device (ZBC)     ---------------------
       . T - Sequential Access Device (SSC-5)         blank = code not used
       .  P - Processor Device (SPC-2)                not blank = code used
       .  .R - C/DVD Device (MMC-6)
       .  . O - Optical Memory Block Device (SBC)
       .  .  M - Media Changer Device (SMC-3)
       .  .  .A - Storage Array Device (SCC-2)SBC)
       .  .  . E - SCSI Enclosure Services device (SES-3)
       .  .  .  B - Simplified Direct-Access (Reduced Block) device (RBC)
       .  .  .  .K - Optical Card Reader/Writer device (OCRW)
       .  .  .  . V - Automation/Device Interface device (ADC-4)
       .  .  .  .  F - Object-based Storage Device (OSD-2)
ASC/   .  .  .  .  .
ASCQ   DZTPROMAEBKVF  Description
27/00  DZT RO   BK    WRITE PROTECTED
27/01  DZT RO   BK    HARDWARE WRITE PROTECTED
27/02  DZT RO   BK    LOGICAL UNIT SOFTWARE WRITE PROTECTED
27/03    T R          ASSOCIATED WRITE PROTECT
27/04    T R          PERSISTENT WRITE PROTECT
27/05    T R          PERMANENT WRITE PROTECT
27/06      R       F  CONDITIONAL WRITE PROTECT
27/07  DZ       B     SPACE ALLOCATION FAILED WRITE PROTECT
27/08  DZ             ZONE IS READ ONLY
Comment 7 Petr Leszkow 2017-12-02 17:57:32 UTC
btw, just checked SCSI Asignment at http://www.t10.org/lists/asc-num.htm#ASC_27

for ASC 27,0 ...see list bellow:

so, I think it would be sufficient if kernel can instruct the cam or any upper layer to do not attempt to mount devices with ASC 27,0 and 27,1 as RW and handle them for example as CD/DVD asc 27,4 asc 27,5 ?

What do you think ? 

Thanks.

Petr




       D - Direct Access Block Device (SBC-4)         Device Column key
       .Z - Host Managed Zoned Block Device (ZBC)     ---------------------
       . T - Sequential Access Device (SSC-5)         blank = code not used
       .  P - Processor Device (SPC-2)                not blank = code used
       .  .R - C/DVD Device (MMC-6)
       .  . O - Optical Memory Block Device (SBC)
       .  .  M - Media Changer Device (SMC-3)
       .  .  .A - Storage Array Device (SCC-2)SBC)
       .  .  . E - SCSI Enclosure Services device (SES-3)
       .  .  .  B - Simplified Direct-Access (Reduced Block) device (RBC)
       .  .  .  .K - Optical Card Reader/Writer device (OCRW)
       .  .  .  . V - Automation/Device Interface device (ADC-4)
       .  .  .  .  F - Object-based Storage Device (OSD-2)
ASC/   .  .  .  .  .
ASCQ   DZTPROMAEBKVF  Description
27/00  DZT RO   BK    WRITE PROTECTED
27/01  DZT RO   BK    HARDWARE WRITE PROTECTED
27/02  DZT RO   BK    LOGICAL UNIT SOFTWARE WRITE PROTECTED
27/03    T R          ASSOCIATED WRITE PROTECT
27/04    T R          PERSISTENT WRITE PROTECT
27/05    T R          PERMANENT WRITE PROTECT
27/06      R       F  CONDITIONAL WRITE PROTECT
27/07  DZ       B     SPACE ALLOCATION FAILED WRITE PROTECT
27/08  DZ             ZONE IS READ ONLY
Comment 8 Conrad Meyer freebsd_committer freebsd_triage 2017-12-02 17:58:53 UTC
This line:

> (da1:umass-sim1:1:0:0): CAM status: SCSI Status Error

Indicates CAM_SCSI_STATUS_ERROR.

> (da1:umass-sim1:1:0:0): SCSI status: Check Condition

Indicates SCSI_STATUS_CHECK_COND -> camperiphscsisenseerror().

I don't have time right now to chase it below there and confirm my suspicion.
Comment 9 Conrad Meyer freebsd_committer freebsd_triage 2017-12-02 17:59:37 UTC
(In reply to Petr Leszkow from comment #7)
It's more complicated than that :-).  Filesystems will blindly try to mount R/W and CAM needs to report an appropriate error, but not kill the device.
Comment 10 Andriy Gapon freebsd_committer freebsd_triage 2017-12-02 18:09:25 UTC
I reported a similar, if not the same, problem in bug 210316.
If in this case it is also about msdosfs, then there is my analysis of the issue in that bug.

Also, many years ago I made this proposal: https://lists.freebsd.org/pipermail/freebsd-scsi/2011-April/004864.html
I still use that patch locally.
But it wouldn't help in this situation because of the initial r/w mount attempt.
Comment 11 Petr Leszkow 2017-12-02 18:27:52 UTC
yes, the other OS'es report the fs on usb key as msdos too.

Just to confirm the hw "health" of the usb key, I just mounted it successfully on mac os and linux boxes, both OS'es detected WriteProtect mode and mounted key Read Only. Therefore I think imho we are missing proper SCSI/mode/CAM/etc "sort of workflow or handling" 

Basically as Andryi mentioned, correctly deteceted RO SCSI mode should be used for or instruct "upper layers" to override default RW mode...



Petr(In reply to Andriy Gapon from comment #10)
Comment 12 Conrad Meyer freebsd_committer freebsd_triage 2017-12-02 18:29:04 UTC
(In reply to Andriy Gapon from comment #10)
Given it is a windows key, msdosfs seems likely.  That may be the common thing theme. Feel free to dupe this PR to that one.
Comment 13 Conrad Meyer freebsd_committer freebsd_triage 2017-12-02 18:32:28 UTC
(In reply to Andriy Gapon from comment #10)
I have no objection to the concept of the patch proposed, but wouldn’t squash EACCES or whatever to ENODEV.
Comment 14 Petr Leszkow 2017-12-02 22:34:50 UTC
(In reply to Conrad Meyer from comment #9)

yes, I can imagine that :) I am no developer, ....just a humble Isilon TA :)

Thumbs up for you for on FreeBSD and OneFS work!

I hope the bug can be fixed, it's not problem to avoid it, but it can be a bit annoying especially for desktop users, where just "simple" key can crash your desktop/laptop etc.
Comment 15 Conrad Meyer freebsd_committer freebsd_triage 2017-12-03 02:17:02 UTC

*** This bug has been marked as a duplicate of bug 210316 ***
Comment 16 Andriy Gapon freebsd_committer freebsd_triage 2017-12-03 09:26:05 UTC
(In reply to Conrad Meyer from comment #13)
Call me stubborn, but to me EACCESS is from the realm of software (logical) permissions, not from the realm of hardware access.  I consider EACCESS and EPERM as two confusing twins.
But my opinion is moot.
Comment 17 Andriy Gapon freebsd_committer freebsd_triage 2017-12-03 11:28:51 UTC
(In reply to Petr Leszkow from comment #7)
I actually like the idea in this comment.
I think that we need to fix a bug that leads to the geom_vfs / buffer-cache crash in any case.
It might be also nice to add the r/o mount fallback mechanism.

But on top of those things we could modify g_disk_access() to prevent the write access altogether if a disk is in the write-protected mode.  We could either add a new flag to flags in struct disk or, perhaps, a new parameter (open mode) to d_open.
scsi_da could then use either WP bit (0x80) in the device-specific byte of the mode parameter header (dev_spec field of scsi_mode_header_6 / scsi_mode_header_10) or SWP bit (SCP_SWP) in the Control mode page (0x0A) that we query at PROBE_MODE_SENSE stage.

I am not sufficiently fluent with CAM to implement the CAM (scsi_da) part.  The geom_disk part is rather trivial, of course.

P.S.
It seems that scsi_sa.c already queries bit 7 in dev_spec, so that's an example.
Comment 18 Conrad Meyer freebsd_committer freebsd_triage 2017-12-03 17:33:51 UTC
(In reply to Andriy Gapon from comment #17)
> I think that we need to fix a bug that leads to the geom_vfs / buffer-cache
> crash in any case.

Agreed.  That's my focus with this chain of duplicated bugs.

> It might be also nice to add the r/o mount fallback mechanism.

Yes, although that is an orthogonal enhancement, IMO, and met some resistance in the earlier attempt.  If mount fails with EROFS or EACCES or whatever and dmesg contains the same CAM errors they do now, I think sysamdmins will figure it out.

> But on top of those things we could modify g_disk_access() to prevent the write
> access altogether if a disk is in the write-protected mode.

Yeah, that makes sense too.  It would reduce the complexity required in higher level layers.  Instead of having to wait for the first IO to fail (probably whatever sets the dirty flag on a superblock), filesystems that rely on the underlying device to be a GEOM object (e.g., msdosfs) will encounter an error trying to g_access() (via g_vfs_open(..., 1)) the underlying disk writable.

That would also solve the initial bug without having to change the filesystem level at all.
Comment 19 Conrad Meyer freebsd_committer freebsd_triage 2017-12-03 17:36:21 UTC
So for (3), what is the right way to expose media readonly status to GEOM from CAM?
Comment 20 Conrad Meyer freebsd_committer freebsd_triage 2017-12-04 01:20:32 UTC
(In reply to Conrad Meyer from comment #19)
We could export it via the d_getattr method, which for SCSI is dagetattr().  Or more simply, we could put it in d_flags, set via e.g. daregister (again, for SCSI).
Comment 21 Andriy Gapon freebsd_committer freebsd_triage 2017-12-04 10:33:15 UTC
(In reply to Conrad Meyer from comment #20)
Please see my proof of concept based on the second approach:
https://reviews.freebsd.org/D13360
Comment 22 Scott Long freebsd_committer freebsd_triage 2017-12-04 17:10:21 UTC
Please use getattr, don't use flags.  For things like this that are used only at initialization, it's fine to use a slow path method like getattr.  Save flags for things that are more critical.
Comment 23 Andriy Gapon freebsd_committer freebsd_triage 2017-12-04 17:19:48 UTC
(In reply to Scott Long from comment #22)

Scott,

I envisioned that g_disk_access would act on the WP bit and I am not sure if using d_getattr there would be really convenient as it requires setting up a bio, etc.
On the other hand, I think that it would be nice to expose the write-protect status via a GEOM attribute for other uses.

Another question is whether it should scsi_da code that queries the WP bit or whether it should be scsi_xpt.  Right now dagetattr is short-circuited to xpt_getattr.  But it doesn't have to be, of course.
Comment 24 commit-hook freebsd_committer freebsd_triage 2018-01-15 11:20:18 UTC
A commit references this bug:

Author: avg
Date: Mon Jan 15 11:20:01 UTC 2018
New revision: 327996
URL: https://svnweb.freebsd.org/changeset/base/327996

Log:
  geom_disk / scsi_da: deny opening write-protected disks for writing

  Ths change consists of two parts.

  geom_disk: deny opening a disk for writing if it's marked as
  write-protected.  A new disk(9) flag is added to mark write protected
  disks.  A possible alternative could be to add another parameter to d_open,
  so that the open mode could be passed to it and the disk drivers could
  make the decision internally, but the flag required less churn.

  scsi_da: add a new phase of disk probing to query the all pages mode
  sense page.  We can determine if the disk is write protected using bit 7
  of the device specific field in the mode parameter header returned by
  MODE SENSE.

  PR:		224037
  Reviewed by:	mav
  MFC after:	4 weeks
  Differential Revision: https://reviews.freebsd.org/D13360

Changes:
  head/sys/cam/scsi/scsi_da.c
  head/sys/geom/geom_disk.c
  head/sys/geom/geom_disk.h
Comment 25 commit-hook freebsd_committer freebsd_triage 2018-02-15 15:33:48 UTC
A commit references this bug:

Author: avg
Date: Thu Feb 15 15:33:18 UTC 2018
New revision: 329316
URL: https://svnweb.freebsd.org/changeset/base/329316

Log:
  MFC r327996: geom_disk / scsi_da: deny opening write-protected disks for writing

  Ths change consists of two parts.

  geom_disk: deny opening a disk for writing if it's marked as
  write-protected.  A new disk(9) flag is added to mark write protected
  disks.  A possible alternative could be to add another parameter to d_open,
  so that the open mode could be passed to it and the disk drivers could
  make the decision internally, but the flag required less churn.

  scsi_da: add a new phase of disk probing to query the all pages mode
  sense page.  We can determine if the disk is write protected using bit 7
  of the device specific field in the mode parameter header returned by
  MODE SENSE.

  PR:		224037

Changes:
_U  stable/11/
  stable/11/sys/cam/scsi/scsi_da.c
  stable/11/sys/geom/geom_disk.c
  stable/11/sys/geom/geom_disk.h
Comment 26 commit-hook freebsd_committer freebsd_triage 2018-02-15 16:31:39 UTC
A commit references this bug:

Author: avg
Date: Thu Feb 15 16:31:36 UTC 2018
New revision: 329319
URL: https://svnweb.freebsd.org/changeset/base/329319

Log:
  MFC r327996: geom_disk / scsi_da: deny opening write-protected disks for writing

  Ths change consists of two parts.

  geom_disk: deny opening a disk for writing if it's marked as
  write-protected.  A new disk(9) flag is added to mark write protected
  disks.  A possible alternative could be to add another parameter to d_open,
  so that the open mode could be passed to it and the disk drivers could
  make the decision internally, but the flag required less churn.

  scsi_da: add a new phase of disk probing to query the all pages mode
  sense page.  We can determine if the disk is write protected using bit 7
  of the device specific field in the mode parameter header returned by
  MODE SENSE.

  PR:		224037

Changes:
_U  stable/10/
  stable/10/sys/cam/scsi/scsi_da.c
  stable/10/sys/geom/geom_disk.c
  stable/10/sys/geom/geom_disk.h