Bug 249395 - Regression in 12.stable/12.2-beta: sg driver not working for makemkv
Summary: Regression in 12.stable/12.2-beta: sg driver not working for makemkv
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 12.1-STABLE
Hardware: Any Any
: --- Affects Some People
Assignee: John Baldwin
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2020-09-17 15:08 UTC by Trenton Schulz
Modified: 2020-10-01 17:31 UTC (History)
2 users (show)

See Also:


Attachments
debug.patch (1.34 KB, patch)
2020-09-17 20:35 UTC, John Baldwin
no flags Details | Diff
dtrace output with 12.2 scsi_sg.c (3.43 KB, text/plain)
2020-09-22 17:20 UTC, Trenton Schulz
no flags Details
dtrace output with previous version of scsi_sg.c (13.10 KB, application/octet-stream)
2020-09-22 17:26 UTC, Trenton Schulz
no flags Details
revert.patch (2.03 KB, patch)
2020-09-22 21:33 UTC, John Baldwin
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Trenton Schulz 2020-09-17 15:08:58 UTC
The Linux-compatible "scsi-generic" driver is not working
in 12-stable when using multimedia/makemkv (a binary-only Linux program); it never finds a drive/disc, which it does on 12.1.

The problem seems to be MFCs in base r361038.

Steps to reproduce:

0. Build a custom kernel with "device sg" and reboot
1. Install multimedia/makemkv
2. run update-makemkv-drives
3. Insert a blu-ray or DVD film in your DVD/blu-ray drive
4. run (probably as root) makemkvcon info disc:0 

When running on 12.1 with base r361038 or later:

sudo makemkvcon info disc:0
MakeMKV v1.15.2 linux(x64-release) started
The program can't find any usable optical drives.
Failed to open disc
Total 0 titles
zsh: segmentation fault  sudo makemkvcon info disc:0


What is expected:

When running on a scsi_sc.c before base r361038

sudo makemkvcon info disc:0
MakeMKV v1.15.2 linux(x64-release) started
Using direct disc access mode
Evaluation version, 28 day(s) out of 30 remaining
Loaded content hash table, will verify integrity of M2TS files.
[snip output]
Operation successfully completed
Total 14 titles
Title  0
[snip output]

Title 13
0 Video Mpeg4 AVC High@L4.1
1 Audio DTS-HD High resolution audio
2 Audio DTS
3 Subtitles HDMV PGS Subtitles
4 Subtitles HDMV PGS Subtitles
5 Subtitles HDMV PGS Subtitles
6 Subtitles HDMV PGS Subtitles

zsh: segmentation fault  sudo makemkvcon info disc:0


Note:
makemkvcon always segfaults on exit. That is another issue.
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2020-09-17 15:15:43 UTC
Is makemkv a 32-bit binary?
Comment 2 John Baldwin freebsd_committer freebsd_triage 2020-09-17 16:02:41 UTC
Also, can you share which device is being accessed (scsi_sg.c is like passX devices, so both the end-device (e.g. cd0?) and the HBA driver (ahci?) matters).  The one thing I am curious about is the change from CAM_DIR_IN | CAM_DIR_OUT to CAM_DIR_BOTH.  ahci may not handle CAM_DIR_BOTH correctly which might be why this is broken if so.  Sadly, CAM_DIR_BOTH is not CAM_DIR_IN | CAM_DIR_OUT but is 0 due to some weirdness in the CAM spec.  I think ahci might do things like 'if (dir & CAM_DIR_IN)' when instead it needs to be doing 'if (dir == CAM_DIR_IN || dir == CAM_DIR_BOTH)' IIRC from when I was looking at the CAM_DIR_BOTH issue before committing the change to head.
Comment 3 Trenton Schulz 2020-09-17 16:28:24 UTC
Mark, makemkvcon is a 64-bit binary.
Comment 4 Trenton Schulz 2020-09-17 16:36:13 UTC
John, the device I was doing this with was a USB Blu-ray drive from ASUS. The dmesg I get is:

umass0 on uhub1
umass0: <Generic External, class 0/0, rev 2.00/1.07, addr 3> on usbus1
umass0:  SCSI over Bulk-Only; quirks = 0x4000
umass0:6:0: Attached to scbus6
cd0 at umass-sim0 bus 0 scbus6 target 0 lun 0
cd0: <ASUS SBW-06D2X-U D501> Removable CD-ROM SCSI device
cd0: 40.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
cd0: quirks=0x10<10_BYTE_ONLY>


So, maybe it is an issue with umass and not ACHI.

Hope this helps.
Comment 5 John Baldwin freebsd_committer freebsd_triage 2020-09-17 20:35:42 UTC
Created attachment 218033 [details]
debug.patch

I looked at umass.c and it doesn't handle CAM_DIR_BOTH, but it handles CAM_DIR_IN | CAM_DIR_OUT (CAM_DIR_NONE) the same as CAM_DIR_BOTH.  Still, the debug patch reverts the CAM_DIR_BOTH change while also adding a printf to another thing that changed in the commit in question.
Comment 6 John Baldwin freebsd_committer freebsd_triage 2020-09-17 20:36:08 UTC
To be clear, this isn't a fix so much as something to test to see if it helps debug the issue farther.
Comment 7 Trenton Schulz 2020-09-18 06:43:12 UTC
(In reply to John Baldwin from comment #5)

OK, and understood this is for further debugging. I'll try it out this evening and let you know what happens.
Comment 8 Trenton Schulz 2020-09-18 19:58:45 UTC
(In reply to John Baldwin from comment #6)

I applied the patch and it I get more or less the same output as before.

makemkvcon info disc:0
MakeMKV v1.15.2 linux(x64-release) started
The program can't find any usable optical drives.
Failed to open disc
Total 0 titles
Segmentation fault (core dumped)

Nothing shows up in dmesg either.

So, there might be some other interaction going on here with the change.
Comment 9 John Baldwin freebsd_committer freebsd_triage 2020-09-21 18:13:52 UTC
Hmm, are you familiar with dtrace?  It would be nice to see which scsi_sg requests are being issued and if they are reporting errors or not, etc.  In particular, a dtrace script that logs the return value from sgioctl() for cmd == SG_IO, and logs the struct sg_io_hdr structure on entry to sgioctl() when cmd == SG_IO.  You would need to do something like:

kldload dtraceall
dtrace -s /path/to/script.d

and then run your test program in another window.

Here's a stab at the script but it is untested:

inline unsigned long SG_IO = 0xc0582285;

fbt::sgioctl:entry
/args[1] == SG_IO/
{
    self->trace = 1;
    print(*(struct sg_io_hdr *)arg2);
}

fbt::sgioctl:return
/self->trace == 1/
{
    self->trace = 0;
    printf("returned %d", arg1);
}
Comment 10 Trenton Schulz 2020-09-22 17:20:03 UTC
Created attachment 218184 [details]
dtrace output with 12.2 scsi_sg.c
Comment 11 Trenton Schulz 2020-09-22 17:26:09 UTC
Created attachment 218185 [details]
dtrace output with previous version of scsi_sg.c

I know of dtrace, but never really had a chance to use it on my own.

Anyway, the script at least generates output. Nice!

I ran makemkvcon with both the 12.2 version of scsc_sg.c (not using the patch from this bug) and the previous version and have attached both to the patch now. The earlier version generated a lot of output so I had to compress it to get under the attachment limet

I hope there is something that gives a clue as to what is happening. If not, I'm willing to try revisions of the dtrace script for further probing.
Comment 12 John Baldwin freebsd_committer freebsd_triage 2020-09-22 21:32:40 UTC
Comment on attachment 218184 [details]
dtrace output with 12.2 scsi_sg.c

Hummmm, I think I see that my original commit was wrong as scsi_sg(4) is using cam_periph_mapmem().  I will revert most of the commit in HEAD and then MFC it.
Comment 13 John Baldwin freebsd_committer freebsd_triage 2020-09-22 21:33:35 UTC
Created attachment 218194 [details]
revert.patch

Would you be able to test this patch that reverts all but the CAM_DIR_BOTH change (which I think is still a legit fix)?
Comment 14 Trenton Schulz 2020-09-23 06:53:19 UTC
(In reply to John Baldwin from comment #13)

Thank you for looking deeper. I won't be near a machine to test it today, but I will take a look at this on Thursday.
Comment 15 Trenton Schulz 2020-09-24 19:53:41 UTC
(In reply to John Baldwin from comment #13)
Had a chance to try the patch this evening. It seems that things work again in makemkvcon. I can again get info from the disc.

So, I will say that this appears fixed to me.

Thank you!
Comment 16 commit-hook freebsd_committer freebsd_triage 2020-09-25 21:20:35 UTC
A commit references this bug:

Author: jhb
Date: Fri Sep 25 21:19:56 UTC 2020
New revision: 366175
URL: https://svnweb.freebsd.org/changeset/base/366175

Log:
  Revert most of r360179.

  I had failed to notice that sgsendccb() was using cam_periph_mapmem()
  and thus was not passing down user pointers directly to drivers.  In
  practice this broke requests submitted from userland.

  PR:		249395
  Reported by:	Trenton Schulz <trueos@norwegianrockcat.com>
  Reviewed by:	scottl
  MFC after:	3 days
  Differential Revision:	https://reviews.freebsd.org/D26550

Changes:
  head/sys/cam/scsi/scsi_sg.c
Comment 17 commit-hook freebsd_committer freebsd_triage 2020-09-30 18:10:41 UTC
A commit references this bug:

Author: jhb
Date: Wed Sep 30 18:09:51 UTC 2020
New revision: 366297
URL: https://svnweb.freebsd.org/changeset/base/366297

Log:
  MFC 366175: Revert most of r360179.

  I had failed to notice that sgsendccb() was using cam_periph_mapmem()
  and thus was not passing down user pointers directly to drivers.  In
  practice this broke requests submitted from userland.

  PR:		249395

Changes:
_U  stable/11/
  stable/11/sys/cam/scsi/scsi_sg.c
_U  stable/12/
  stable/12/sys/cam/scsi/scsi_sg.c
Comment 18 commit-hook freebsd_committer freebsd_triage 2020-10-01 17:31:14 UTC
A commit references this bug:

Author: jhb
Date: Thu Oct  1 17:30:39 UTC 2020
New revision: 366332
URL: https://svnweb.freebsd.org/changeset/base/366332

Log:
  MFS 366297: Revert most of r360179.

  I had failed to notice that sgsendccb() was using cam_periph_mapmem()
  and thus was not passing down user pointers directly to drivers.  In
  practice this broke requests submitted from userland.

  PR:		249395
  Approved by:	re (gjb)

Changes:
_U  releng/12.2/
  releng/12.2/sys/cam/scsi/scsi_sg.c
Comment 19 John Baldwin freebsd_committer freebsd_triage 2020-10-01 17:31:40 UTC
Thanks for the patience and testing!