Bug 248368 - devd MEDIACHANGE events are delivered unreliably for /dev/cdX devices
Summary: devd MEDIACHANGE events are delivered unreliably for /dev/cdX devices
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-07-30 14:17 UTC by Gleb Popov
Modified: 2023-12-01 10:41 UTC (History)
6 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Gleb Popov freebsd_committer freebsd_triage 2020-07-30 14:17:31 UTC
1. VirtualBox

On a VirtualBox VM with a CD-ROM device attached I run:

# cat /var/run/devd.pipe | grep cd

While it runs, I insert and then eject an ISO image to/from the drive. This yields MEDIACHANGE event only during **ejection**:

!system=DEVFS subsystem=CDEV type=MEDIACHANGE cdev=cd0
!system=GEOM subsystem=DEV type=MEDIACHANGE cdev=cd0

2. Real hardware

The same test done on real computer and DVD-ROM drive results in MEDIACHANGE events emitted during **insertion** only, and not on ejection.

Additionally, devd log is spammed with following errors, not sure if relevant:

!system=CAM subsystem=periph type=error device=cd0 serial="K1KH3TA2730" cam_status="0xcc" scsi_status=2 scsi_sense="71 02 3a 02" CDB="00 00 00 00 00 00 "
Comment 1 Pau Amma 2020-08-02 00:17:46 UTC
Duplicating.

1. VirtualBox 5.2.44

% uname -a
FreeBSD FreeBSD-bug-248402 12.1-RELEASE-p7 FreeBSD 12.1-RELEASE-p7 GENERIC  amd64
% sudo cat /var/run/devd.pipe | grep cd
Password:
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " <--- Inserting .iso image at that point, messages stop.
<--- After a minute or so, remove .iso image, messages as follows:
!system=DEVFS subsystem=CDEV type=MEDIACHANGE cdev=cd0
!system=GEOM subsystem=DEV type=MEDIACHANGE cdev=cd0
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="1e 00 00 00 01 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="25 00 00 00 00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="1e 00 00 00 01 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="25 00 00 00 00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="1e 00 00 00 01 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="25 00 00 00 00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="1e 00 00 00 01 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="25 00 00 00 00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 
^C

2. Bare metal (Lenovo Z50-70)
% uname -a
FreeBSD gadfly 12.1-RELEASE-p7 FreeBSD 12.1-RELEASE-p7 GENERIC  amd64
% sudo cat /var/run/devd.pipe | grep cd
Password:
<--- upon opening the empty caddy
!system=CAM subsystem=periph type=error device=cd0 serial="KYME48F3455" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 02" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="KYME48F3455" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 02" CDB="00 00 00 00 00 00 " 
<--- upon closing the caddy with a DVD-R in it
!system=CAM subsystem=periph type=error device=cd0 serial="KYME48F3455" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 04 01" CDB="00 00 00 00 00 00 " 
!system=DEVFS subsystem=CDEV type=MEDIACHANGE cdev=cd0
!system=GEOM subsystem=DEV type=MEDIACHANGE cdev=cd0
!system=CAM subsystem=periph type=error device=cd0 serial="KYME48F3455" cam_status="0xcc" scsi_status=2 scsi_sense="70 06 28 00" CDB="00 00 00 00 00 00 " 
!system=DEVFS subsystem=CDEV type=CREATE cdev=iso9660/12_1_RELEASE_AMD64_CD
!system=GEOM subsystem=DEV type=CREATE cdev=iso9660/12_1_RELEASE_AMD64_CD
!system=DEVFS subsystem=CDEV type=DESTROY cdev=iso9660/12_1_RELEASE_AMD64_CD
!system=GEOM subsystem=DEV type=DESTROY cdev=iso9660/12_1_RELEASE_AMD64_CD
<--- upon opening the caddy again
!system=CAM subsystem=periph type=error device=cd0 serial="KYME48F3455" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 02" CDB="00 00 00 00 00 00 "
<--- upon closing the emptied caddy
!system=CAM subsystem=periph type=error device=cd0 serial="KYME48F3455" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 04 01" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="KYME48F3455" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 01" CDB="00 00 00 00 00 00 "
Comment 2 Gleb Popov freebsd_committer freebsd_triage 2021-07-22 18:57:38 UTC
Still relevant with 14-CURRENT host, 13-STABLE guest and VirtualBox 6.1
Comment 3 saznik 2022-04-07 07:30:56 UTC
I see the same Prolbem in 12.3-RELEASE-p5
Comment 4 Warner Losh freebsd_committer freebsd_triage 2022-09-25 19:31:09 UTC

cam error = 0xcc. That's CAM_SCSI_STATUS_ERROR   = 0x0c with
        /* The DEV queue is frozen w/this err */
        CAM_DEV_QFRZN           = 0x40,
        /* Autosense data valid for target */
        CAM_AUTOSNS_VALID       = 0x80,
also set. This makes sense.

!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " 

CDB is a TUR (Test Unit Ready) sent to the drive.

0x70 is the 'error code' for CURRENT ERROR.
0x02 is the sense key. This is SSD_KEY_NOT_READY
asc (additional sense code) is 0x3a
ascq (additional sense code qualifier) is 0x00

0x3a 0x00 is 'media not present'
0x3a 0x01 us 'media not present, tray closed' (seen later)
0x3a 0x02 is 'media not present, tray open'

so this seems legit. We don't report this error anywhere except devd, since we report all errors (even ignored ones) there.

The cd driver currently will turn off the 'seen media' bit if it gets a 0x3a asc (regardless of the qualifier) and call the media gone stuff.

So after the media is removed on real hardware, we see:
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="1e 00 00 00 01 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="VB2-01700376" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="25 00 00 00 00 00 00 00 00 00 " 

1E is 'prevent / allow media removal'
25 is 'read capacity'

so it's trying to find the capacity of the media that isn't there, it seems. But it gives up after 3 retries, and we see the same stream of TUR + media not present.

So for closing the drive on real hardware:
<--- upon closing the emptied caddy
!system=CAM subsystem=periph type=error device=cd0 serial="KYME48F3455" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 04 01" CDB="00 00 00 00 00 00 " 
!system=CAM subsystem=periph type=error device=cd0 serial="KYME48F3455" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 01" CDB="00 00 00 00 00 00 "

ASC of 04/01 is 'Logical Unit is becoming ready' so that makes sense (it takes the CD drive a while to realize it has or does not have a disc). It's a transient condition that may, or may not, happen.

ASC of 3a/01 is 'drive closed, no media'

I think that the 'upon opening drive' is two lines too late, since we destroy the media in the error callback.

So the error is that we don't create/destroy the media for virtualbox, but do for actual hardware.

The cd code looks like it's getting a ASC/ASCQ 28/00 which is media change. And it also looks like we never saw media (which is why we never destroy it) so the 'saw media' bit looks like it's getting cleared somewhere else w/o a corresponding disk_media_gone() callback. That looks like it can happen when there's an error (any error) when we're trying to do the prevent/allow stuff. so there may be a different ordering with virtual box that's leading to this situation.

Not sure I have time to come up with a patch, and it doesn't make 100% sense to me (so I'm thinking there's some missing information).
Comment 5 Warner Losh freebsd_committer freebsd_triage 2022-09-25 19:35:45 UTC
My comment has an error 'on real hardware' in the middle and I can't edit it for one of the 'VB2-01700376' entries.
Comment 6 Pau Amma 2022-09-25 22:22:29 UTC
(In reply to Warner Losh from comment #4)
Thanks for this. Would it help to repeat the test with a more recent FreeBSD (13.1 on bare metal, and either that or 14-current in VirtualBox) and this time, get a complete, unedited log?
Comment 7 Lars Engels freebsd_committer freebsd_triage 2022-11-01 20:24:16 UTC
I'm seeing the same every 3 seconds on my notebook running 13.1 but with a da device:

!system=CAM subsystem=periph type=error device=da1 serial="20120501030900000" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 "
Comment 8 Bruce Becker 2023-12-01 10:41:06 UTC
> uname
FreeBSD fbsd13_2 13.2-STABLE FreeBSD 13.2-STABLE stable/13-n256726-7c25a53a2cb9 FBSD13_2 amd64

Built 2023/11/29

Virtualbox guest on FreeBSD 12.4-STABLE host


Every 3 seconds -

Dec  1 05:29:46 <1.6> fbsd13_2 devd[627]: Processing event '!system=CAM subsystem=periph type=error device=cd0 serial="VB7-1a2b3c4d" cam_status="0xcc" scsi_status=2 scsi_sense="70 02 3a 00" CDB="00 00 00 00 00 00 " '