Bug 212211

Summary: [sysutils/smartmontools]: smartctl not able to access S.M.A.R.T data on drives connected to mrsas(4) RAID
Product: Ports & Packages
Reporter: Andrii Stesin <stesin>
Component: Individual Port(s)
Assignee: Oleksii Samorukov <samm>
Status: Closed FIXED
Severity: Affects Some People
CC: adamz, alex-freebsd-bugs, h, jpaetzel, pkubaj, samm, stesin, terry-freebsd, vrwmiller
Priority: ---
Version: Latest
Hardware: Any
OS: Any
Attachments (description, flags):
- Sample output after patching (flags: none)
- Changes to daily periodic smart script (flags: none)
- Requested info (flags: none)
- Requested --scan output (flags: none)

Description Andrii Stesin 2016-08-28 10:56:56 UTC
Given: a FreeBSD 10.3 system with an LSI MegaRAID 9271-8i SATA/SAS RAID controller.

The recommended (and supported) driver for this newer controller is mrsas(4). mrsas(4) presents virtual drives to the OS as /dev/daXX, although I found that /dev/passX devices are also created for some purpose (each /dev/daN has a corresponding /dev/passN), in case it matters.

Each /dev/daXX hides one or more physical SAS/SATA drives. Using the MegaCLI utility, you can list the physical drives (PDs), each of which is identified:

- either with a pair of values [EnclosureID:SlotID],
- or with the "device ID" which LSI MegaRAID somehow assigns to them.

I have a current smartmontools installed:

smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)

I want to get the S.M.A.R.T. status of any physical drive attached to the MegaRAID, e.g. of an SSD to which MegaRAID assigned device ID 12 and which is hidden under the /dev/da0 virtual drive. But what I actually get is:

[root@chort /]# smartctl -a /dev/da0
smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
START OF INFORMATION SECTION
============================
Vendor: LSI
Product: MR9271-8i
Revision: 3.46
User Capacity: 199,481,098,240 bytes [199 GB]
Logical block size: 512 bytes
Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
[root@chort /]# 

OK, I understand: /dev/da0 is not a physical drive (it may be, say, a striped array of 2 drives), so smartctl gets confused. This is expected behavior.

Looking deeper, I discovered that the Linux version of smartctl is aware of this situation. It allows the following syntax, where "-d megaraid,12" tells smartctl that we are dealing with an LSI MegaRAID and actually want to query the disk with "device ID" 12. I tried it with an Ubuntu live CD and it works well.

But on FreeBSD this mode is not supported:

[root@chort /]# smartctl -a -d megaraid,12 /dev/da0
smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
/dev/da0: Unknown device type 'megaraid,12'
=======> VALID ARGUMENTS ARE: ata, scsi, nvme[,NSID], sat[,auto][,N][+TYPE], usbcypress[,X], usbjmicron[,p][,x][,N], usbprolific, usbsunplus, 3ware,N, hpt,L/M/N, cciss,N, areca,N/E, atacam, auto, test <=======
Use smartctl -h to get a usage summary
[root@chort /]#

I already opened a ticket at https://www.smartmontools.org/ticket/734, but I think a clarification is needed here: does mrsas(4) support this kind of query at all, as of today?

Do we need some modifications to mrsas(4) in order to support syntax like:

smartctl -a -d megaraid,12

or even better,

smartctl -a -d 'megaraid,[9:11'

or both? Or is this entirely on the smartmontools side? Thanks in advance!

WBR,
Andrii Stesin
Comment 1 Andrii Stesin 2016-08-28 11:02:49 UTC
Sorry for misprint (missed bracket), I meant

smartctl -a -d 'megaraid,[9:11]'
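
For illustration, the numeric device-ID form of this syntax can be wrapped in a small loop to cover several physical drives. This is only a sketch: the helper name `megaraid_cmds` is hypothetical, it prints the commands instead of running them, and it assumes a smartctl build that accepts `-d megaraid,N` (at the time of this report, the Linux build). The device IDs 11 and 12 are example values.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one smartctl command per
# MegaRAID device ID. Usage: megaraid_cmds DEVICE ID...
megaraid_cmds() {
    dev=$1
    shift
    for id in "$@"; do
        printf 'smartctl -a -d megaraid,%s %s\n' "$id" "$dev"
    done
}

# Example: device IDs 11 and 12 behind the virtual drive /dev/da0.
megaraid_cmds /dev/da0 11 12
```

Printing the commands first makes it easy to review them before piping the output to `sh` on a machine where the syntax is supported.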
Comment 2 Andrii Stesin 2016-08-28 21:06:02 UTC
BTW, while trying to reach the AVAGO people who deal with mrsas(4), I opened the /usr/src/sys/dev/mrsas.c file. At line 4 it says "Support: freebsdraid@avagotech.com", but this address does not work: the mailbox is rejected by AVAGO. At the bottom of the initial copyright message there is another address, megaraidfbsd@avagotech.com, and this one does not work either (also rejected).

Maybe AVAGO dropped FreeBSD support altogether? Or what am I doing wrong?
Comment 3 Piotr Kubaj freebsd_committer freebsd_triage 2016-08-29 12:55:53 UTC
I can confirm that sysutils/smartmontools doesn't work with mrsas(4). sysutils/storcli also doesn't work:
[root@gen ~/storcli_all_os/FreeBSD]# storcli /call show all
Status = Failure
Description = No Controller found

Using the newest storcli from AVAGO doesn't help:
[root@gen ~/storcli_all_os/FreeBSD]# ./storcli64 /call show all
Status = Failure
Description = No Controller found

Strangely, MegaCli works, but it doesn't provide enough functionality for me.
Comment 4 Oleksii Samorukov freebsd_committer freebsd_triage 2017-10-12 18:31:53 UTC
Hi, I am a smartmontools developer. It should be possible to implement this functionality in the FreeBSD build. Please drop me a note in the ticket if you want to participate in testing.
Comment 5 Josh Paetzel freebsd_committer freebsd_triage 2017-10-13 02:17:36 UTC
Yes, I'd be very interested in participating in testing.
Comment 6 Terry Kennedy 2019-08-11 04:47:59 UTC
Just posting to register my interest in this as well. 12-STABLE amd64 with a SAS3108 (Dell PERC H730) controller, just in case this PR has been languishing because it was filed against a no-longer-supported FreeBSD release.

The mfi driver package provided the mfip kernel module, which exposed each of the controller's drives as a passN device, regardless of configuration / array / etc. status.

With some newer controllers only supported by mrsas, and mfi being deprecated (or at least not suggested as the path to the future), FreeBSD needs functionality similar to mfip in order for smartmontools (and perhaps other software) to work on drives attached to a mrsas controller.

It looks like r342065 added a lot of the support needed for this to mrsas:
https://svnweb.freebsd.org/base?view=revision&revision=342065

Perhaps someone more familiar with this driver could finish up adding passthru support? My company could contribute some reasonable funding (and test hardware) toward getting this implemented if necessary.
Comment 7 Oleksii Samorukov freebsd_committer freebsd_triage 2019-10-02 15:40:56 UTC
Hi Terry. 

Please contact me about adding this support at samm@os2.kiev.ua and samm@net-art.cz. I will need remote access to the hardware.
Comment 8 Mark Linimon freebsd_committer freebsd_triage 2021-05-12 20:09:06 UTC
^Triage: assignee has returned his bit for safekeeping.
Comment 9 henrime 2021-08-17 12:06:47 UTC
Still present today with 13.0-CURRENT
Comment 10 Oleksii Samorukov freebsd_committer freebsd_triage 2021-11-22 15:13:46 UTC
I made an experimental patch to support -d megaraid on FreeBSD; see https://patch-diff.githubusercontent.com/raw/smartmontools/smartmontools/pull/117.patch.

I do not have any mrsas devices around, so I tested it on mfi only. I was also reverse-engineering storcli to find out whether it does the same.

Testing is welcome.
Comment 11 Terry Kennedy 2021-11-23 03:40:28 UTC
Created attachment 229665 [details]
Sample output after patching

(In reply to Oleksii Samorukov from comment #10)

In basic testing, this seems to work - thank you!

Attached is a log file showing status, start of a self-test, self-test in progress, and status after completed self-test.

The patch to ChangeLog does not apply, I think because your patch is based on an unreleased version of smartmontools. However, that's cosmetic.

Another cosmetic issue is that "megaraid" should probably be added to smart_interface::get_valid_dev_types_str() in dev_interface.cpp.

I will do some additional testing and report my findings.

Thanks again!
Comment 12 Terry Kennedy 2021-11-23 04:03:16 UTC
Created attachment 229666 [details]
Changes to daily periodic smart script

The attached patch enables daily SMART status reports as part of the nightly periodic jobs. The following needs to be added to /etc/periodic.conf (season to taste):

daily_status_smart_devices="mrsas0,0 mrsas0,1 mrsas0,2 mrsas0,3"
daily_status_smart_enable="YES"

I believe the patch to enable SMART support provides support for devices other than /dev/mrsas0. In that case, the attached patch will need to be extended to understand those additional devices.
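
The attached patch itself is not reproduced here. As a rough sketch of the idea, and assuming each entry has the form `<controller>,<device ID>` and maps to `/dev/<controller>` with `-d megaraid,<device ID>`, the entries could be translated into smartctl arguments as follows. The helper name `spec_to_args` is hypothetical and is not taken from the attachment; the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: turn a periodic.conf entry such as "mrsas0,2"
# into smartctl arguments. Assumes the "<controller>,<id>" form.
spec_to_args() {
    spec=$1
    case $spec in
        *,*)
            ctrl=${spec%%,*}   # text before the first comma, e.g. mrsas0
            id=${spec#*,}      # text after the first comma, e.g. 2
            printf '%s\n' "-d megaraid,$id /dev/$ctrl"
            ;;
        *)
            # No comma: pass the entry through unchanged.
            printf '%s\n' "$spec"
            ;;
    esac
}

daily_status_smart_devices="mrsas0,0 mrsas0,1 mrsas0,2 mrsas0,3"
for spec in $daily_status_smart_devices; do
    echo "smartctl -H $(spec_to_args "$spec")"
done
```

Because the controller name is taken from the entry rather than hard-coded, the same split would also cover a second controller such as `mrsas1,0`, provided the driver exposes it the same way.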
Comment 13 Oleksii Samorukov freebsd_committer freebsd_triage 2021-11-23 07:07:41 UTC
(In reply to Terry Kennedy from comment #11)

Thank you for the testing and comments, glad to see it's working. One of the things I would like to implement before merging it to master is scanning and a hint for the block devices exported by mrsas. From the source code I see that it exports block devices as /dev/da?. May I ask you to run `smartctl -d scsi -i -r ioctl,3 /dev/daX`, where daX is an mrsas-owned device? And also `camcontrol devlist -v`, if possible. My intention is to detect that the device is owned by mrsas and display a hint about '-d megaraid,N'.
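
As an illustration of the detection idea, the controller behind each da device can be read off the `camcontrol devlist -v` listing, which prints a `scbusN on <driver> bus M:` header followed by the devices on that bus. The sketch below is an example under stated assumptions, not the code that went into smartmontools: the function name `mrsas_disks` is made up, and the sample listing is invented for the demonstration rather than taken from any attachment.

```shell
#!/bin/sh
# Hypothetical helper: read `camcontrol devlist -v` output on stdin and
# print the daN devices that sit on a bus owned by an mrsas controller.
mrsas_disks() {
    awk '
        # Bus header, e.g. "scbus0 on mrsas0 bus 0:"
        /^scbus[0-9]+ on / { on_mrsas = ($3 ~ /^mrsas[0-9]+$/); next }
        # Device line, e.g. "<...> at scbus0 target 0 lun 0 (pass0,da0)"
        on_mrsas && match($0, /\([^)]*\)$/) {
            n = split(substr($0, RSTART + 1, RLENGTH - 2), names, ",")
            for (i = 1; i <= n; i++)
                if (names[i] ~ /^da[0-9]+$/) print "/dev/" names[i]
        }
    '
}

# Invented sample listing, for demonstration only.
mrsas_disks <<'EOF'
scbus0 on mrsas0 bus 0:
<DELL PERC H730 Adp 4.30>          at scbus0 target 0 lun 0 (pass0,da0)
scbus1 on ahcich0 bus 0:
<EXAMPLE DVD-ROM 1.00>             at scbus1 target 0 lun 0 (cd0,pass1)
EOF
```

On a live system the function would be fed with `camcontrol devlist -v | mrsas_disks`.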
Comment 14 Terry Kennedy 2021-11-23 07:19:46 UTC
Created attachment 229667 [details]
Requested info

I'm attaching the output from the two commands you requested. /dev/da0 is a RAID5 of 4 SSDs. This is a PERC H730 (PCI card=0x1f491028 chip=0x005d1000) in a Dell PowerEdge R730.

There seems to be partial support for passthru in the mrsas driver. I think I noted this in my first comment on this PR.

There is quite a large amount of variation in the behavior of the various MegaRAID drivers in FreeBSD. mfi will expose all of the physical member disks as /dev/passX devices if the undocumented mfip module is loaded. One of the other MegaRAID drivers (mpr, maybe?) has an interesting misfeature of only exposing the last physical member disk of a volume as a /dev/passX device.
Comment 15 Oleksii Samorukov freebsd_committer freebsd_triage 2021-11-23 10:19:24 UTC
Thank you. I updated the patch to support scanning. Please try the "smartctl --scan" and "smartctl --scan-open" commands to see whether they work as expected.
Comment 16 Oleksii Samorukov freebsd_committer freebsd_triage 2021-11-23 12:19:39 UTC
P.S. I have already merged things to master. Please test using https://github.com/smartmontools/smartmontools. If it succeeds, I will update the port.
Comment 17 Terry Kennedy 2021-11-23 18:45:20 UTC
(In reply to Oleksii Samorukov from comment #16)

What is the procedure to generate a configure script from the provided configure.ac file? Simply trying "autoconf" produces:

configure.ac:7: error: possibly undefined macro: AM_INIT_AUTOMAKE
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation.
configure.ac:25: error: possibly undefined macro: AM_MAINTAINER_MODE
configure.ac:30: error: possibly undefined macro: AM_PROG_AS
configure.ac:91: error: possibly undefined macro: AM_CONDITIONAL

It appears that the version of smartmontools in ports contains a "pre-cooked" configure script.
Comment 18 Oleksii Samorukov freebsd_committer freebsd_triage 2021-11-23 20:03:46 UTC
> What is the procedure to generate a configure script from the provided configure.ac file?

Just run ./autogen.sh; it should do the job. Alternatively, get a CI build or source from https://builds.smartmontools.org/
Comment 19 Terry Kennedy 2021-11-24 03:03:38 UTC
Created attachment 229686 [details]
Requested --scan output

Results of "--scan", "--scan-open", and "--scan-open -d megaraid" attached.

Thanks again!
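
For readers who want to post-process scan results, `smartctl --scan` prints one `DEVICE -d TYPE # comment` line per device. A minimal sketch that picks out the megaraid entries follows. The function name `megaraid_entries` is hypothetical, and the sample lines are invented for the demonstration; they are not copied from the attachment.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl --scan` style lines on stdin
# ("DEVICE -d TYPE # comment") and print "DEVICE TYPE" for megaraid entries.
megaraid_entries() {
    awk '$2 == "-d" && $3 ~ /^megaraid,[0-9]+$/ { print $1, $3 }'
}

# Invented sample lines, for demonstration only.
megaraid_entries <<'EOF'
/dev/da0 -d scsi # /dev/da0, SCSI device
/dev/mrsas0 -d megaraid,0 # /dev/mrsas0, SCSI device
/dev/mrsas0 -d megaraid,1 # /dev/mrsas0, SCSI device
EOF
```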
Comment 20 Terry Kennedy 2021-11-24 03:38:59 UTC
(In reply to Terry Kennedy from comment #19)

A few additional comments based on looking at the individual commits...

Hinting is working on mrsas:
(0:402) new-gate:~terry/smartmontools-master/smartmontools# ./smartctl -a /dev/da0
smartctl 7.3 (build date Nov 23 2021) [FreeBSD 12.2-STABLE amd64] (local build)
Copyright (C) 2002-21, Bruce Allen, Christian Franke, www.smartmontools.org

Smartctl open device: /dev/da0 failed: DELL or MegaRaid controller, use '-d megaraid,N'

In smartctl.8.in, you added "This interface will also work for Dell PERC controllers." The PERC S100 (and perhaps other S-series) are "soft RAID" based on Box Hill. So this probably won't work there. I can fire up a system with that hardware if you'd like me to check things. I'm pretty sure that "... for Dell PERH H-series controllers" is a safe bet.
Comment 21 Oleksii Samorukov freebsd_committer freebsd_triage 2021-11-24 07:03:55 UTC
Thank you for testing! I can also detect whether the request comes via the mrsas driver, which should be safer. But I was not sure whether this would break JBOD configurations. It would be nice if someone could tell whether JBOD disks on mrsas support SMART directly or not.

P.S. I am going to backport this to the port today. In case of any additional issues, feel free to comment here, open an additional PR, or just report directly on smartmontools.org.
Comment 22 commit-hook freebsd_committer freebsd_triage 2021-11-24 07:29:50 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=40a957b3db989c9d820563005df61464f8a0312d

commit 40a957b3db989c9d820563005df61464f8a0312d
Author:     Oleksii Samorukov <samm@FreeBSD.org>
AuthorDate: 2021-11-24 07:15:14 +0000
Commit:     Oleksii Samorukov <samm@FreeBSD.org>
CommitDate: 2021-11-24 07:28:58 +0000

    sysutils/smartmontools: Implement monitoring of the devices behind mrsas RAID

    PR:             212211
    Reported by:    stesin@gmail.com

 sysutils/smartmontools/Makefile                    |   2 +-
 .../files/patch-os__freebsd.cpp (new)              | 450 +++++++++++++++++++++
 .../smartmontools/files/patch-os__freebsd.h (new)  | 167 ++++++++
 .../smartmontools/files/patch-smartctl.8.in (new)  |  47 +++
 .../files/patch-smartd.conf.5.in (new)             |  56 +++
 sysutils/smartmontools/files/smart.in              |   3 +
 6 files changed, 724 insertions(+), 1 deletion(-)
Comment 23 Alexander Burke 2021-11-24 07:58:40 UTC
(In reply to Terry Kennedy from comment #20)

>I'm pretty sure that "... for Dell PERH H-series controllers" is a safe bet.

Typo aside, if it's good for the H-series, it's surely good for the I-series as well; the only difference is that the I-series is blessed for use in the integrated storage controller slot.
Comment 24 Oleksii Samorukov freebsd_committer freebsd_triage 2021-11-24 08:33:36 UTC
I updated the PERC matching (see r5253, https://www.smartmontools.org/changeset/5253) to match anything with "PERC ". We do not have any such drive in drivedb, and in the case of soft RAID it will not hurt, as they are visible as normal SATA controllers, so it should not alter any drive info.