Bug 227664

Summary: mps(4) regression makes the system unbootable
Product: Base System
Reporter: Nicolas Braud-Santoni <nicolas>
Component: kern
Assignee: Stephen McConnell <slm>
Status: New
Severity: Affects Some People
CC: fox
Priority: ---
Keywords: regression
Version: 11.1-RELEASE   
Hardware: amd64   
OS: Any   
Attachments:
Console log on HardenedBSD 11-STABLE v1100055.1 (flags: none)

Description Nicolas Braud-Santoni 2018-04-21 01:14:59 UTC
Created attachment 192692
Console log on HardenedBSD 11-STABLE v1100055.1

Hi,

On 11.1-RELEASE and CURRENT, the mps(4) driver goes into an infinite loop of rebooting the controller at boot time, hanging the kernel until it eventually panics.

I can consistently reproduce the issue on a Dell PowerEdge R410 server, and I bisected the regression to somewhere between 11.0-RELEASE and 11.1-RC3; I will bisect it further tomorrow.

Furthermore, Rachel (in CC) encountered this regression independently.
@Rachel: Please provide info on the hardware you encountered this on.

Please find attached a screenshot of the system's console, taken on HardenedBSD's 11-STABLE, as I forgot to take any while bisecting the issue (but the log messages were, if not identical, at least very similar).


Best,

  nicoo
Comment 1 Nicolas Braud-Santoni 2018-04-22 11:04:47 UTC
The regression was likely introduced by the following commit, before 11.1-RC1:

6ec4b0641762d521d99545fd95f16cb05c65bec3 Thu Jun  1 16:55:03 2017
   MFC r318895: Fix several problems with mapping code in mps(4).
   MFC r318896: Fix several problems with mapping code in mpr(4).

I've encountered issues building bootable ISOs from a specific checkout of `src/`, so I cannot easily check it right now.
Comment 2 Mark Linimon freebsd_committer freebsd_triage 2018-04-22 16:36:46 UTC
Over to the committer of the MFC in question (r319446).
Comment 3 Stephen McConnell freebsd_committer 2018-04-23 17:50:23 UTC
I see that the FW version is 02.15.63.00. That's really old FW (about 8 years old). Is it possible for you to upgrade your FW and retry? Also, if you can increase the debug output, that might help. You can set the sysctl value for 'debug_level' to an OR of these values (see the mps man page for more info):
#define MPS_INFO        (1 << 0)        /* Basic info */
#define MPS_FAULT       (1 << 1)        /* Hardware faults */
#define MPS_EVENT       (1 << 2)        /* Event data from the controller */
#define MPS_LOG         (1 << 3)        /* Log data from the controller */
#define MPS_RECOVERY    (1 << 4)        /* Command error recovery tracing */
#define MPS_ERROR       (1 << 5)        /* Parameter errors, programming bugs */
#define MPS_INIT        (1 << 6)        /* Things related to system init */
#define MPS_XINFO       (1 << 7)        /* More detailed/noisy info */
#define MPS_USER        (1 << 8)        /* Trace user-generated commands */
#define MPS_MAPPING     (1 << 9)        /* Trace device mappings */
#define MPS_TRACE       (1 << 10)       /* Function-by-function trace */

Initially, you might just try setting them all to see if that gets us somewhere. But before you do this, I would recommend upgrading your FW if you can.
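
For reference, here is a minimal sketch of how the flags listed above combine into a single debug_level value. The flag definitions are copied from the list in this comment; the helper name mps_debug_all() is hypothetical and is not part of the driver.

```c
#include <stdint.h>

/* Flag values copied from the list in this comment. */
#define MPS_INFO        (1 << 0)        /* Basic info */
#define MPS_FAULT       (1 << 1)        /* Hardware faults */
#define MPS_EVENT       (1 << 2)        /* Event data from the controller */
#define MPS_LOG         (1 << 3)        /* Log data from the controller */
#define MPS_RECOVERY    (1 << 4)        /* Command error recovery tracing */
#define MPS_ERROR       (1 << 5)        /* Parameter errors, programming bugs */
#define MPS_INIT        (1 << 6)        /* Things related to system init */
#define MPS_XINFO       (1 << 7)        /* More detailed/noisy info */
#define MPS_USER        (1 << 8)        /* Trace user-generated commands */
#define MPS_MAPPING     (1 << 9)        /* Trace device mappings */
#define MPS_TRACE       (1 << 10)       /* Function-by-function trace */

/*
 * Hypothetical helper (not a driver function): OR together every flag
 * listed above to get the "everything on" debug_level value.
 */
static uint32_t
mps_debug_all(void)
{
	return (MPS_INFO | MPS_FAULT | MPS_EVENT | MPS_LOG | MPS_RECOVERY |
	    MPS_ERROR | MPS_INIT | MPS_XINFO | MPS_USER | MPS_MAPPING |
	    MPS_TRACE);
}
```

With the eleven flags above, bits 0 through 10 are set, so the combined value is 0x7ff. A narrower selection, for example MPS_INIT | MPS_MAPPING | MPS_FAULT, works the same way and gives less noisy output.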
Comment 4 Nicolas Braud-Santoni 2018-04-24 00:31:38 UTC
(In reply to Stephen McConnell from comment #3)
> I see that the FW version is 02.15.63.00. That's really old FW (about 8 years old). Is it possible for you to upgrade your FW and retry?

This is rented hardware, but I will ask the provider whether a firmware upgrade would be possible.

How can I set the sysctl option when booting from an ISO? This is the least inconvenient way I have found to test a specific version/build.
Comment 5 Stephen McConnell freebsd_committer 2018-04-24 16:53:30 UTC
You can set the debug level during boot by selecting '3' on the boot screen to 'Escape to Loader Prompt'. Then type:

set dev.mps.0.debug_level=0xffffffff

This will set all of the debug flags for the mps driver. Type 'boot' afterwards to continue booting.