Created attachment 216866
This patch removes raid map sync functionality from mfi driver
We have found a raid map sync failure on Invader (device id: 5d) as soon as the <mfi> driver is loaded. The failure occurs because the driver cannot fetch an updated raid map from the firmware: raid map is logically unsupported in the driver. The driver does not receive any raid map data as part of MFI_DCMD_LD_MAP_GET_INFO; instead, it receives the config sequence number as part of the MFI_DCMD_LD_GET_LIST DCMD, which results in the raid map sync failure. Below is the firmware log snippet showing the config sequence number mismatch between the driver and the firmware:
C0:ld sync: non-matching seqNums 1
C0:ld sync: 01 unsync'd lds remaining
This issue applies to controllers that have raid map support, such as Thunderbolt (device id: 5b), Invader (device id: 5d), and Fury (device id: 5f). It does not apply to controllers up to and including Liberator (Gen1 and Gen2), as these controllers do not have raid map support.
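For illustration only, the split between affected and unaffected controllers could be expressed as a device id check. This is a minimal sketch and not code from the <mfi> driver or from the attached patch: the function and macro names are hypothetical, and the ids quoted above are assumed to be hexadecimal.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Device ids as quoted in this report, assumed here to be hexadecimal.
 * The SKETCH_ names are hypothetical and do not come from the driver.
 */
#define SKETCH_DEVID_THUNDERBOLT 0x005b
#define SKETCH_DEVID_INVADER     0x005d
#define SKETCH_DEVID_FURY        0x005f

/*
 * Return true for the controllers this report lists as having raid map
 * support, which are the ones where the sync failure is seen.
 * Any other id (e.g. Liberator and older) returns false.
 */
static bool
sketch_has_raid_map_support(uint16_t device_id)
{
	switch (device_id) {
	case SKETCH_DEVID_THUNDERBOLT:
	case SKETCH_DEVID_INVADER:
	case SKETCH_DEVID_FURY:
		return true;
	default:
		return false;
	}
}
```

A check of this shape would only identify which controllers are affected; the proposal in this report is to remove the raid map sync path from the driver altogether.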
Hence, we propose to remove the raid map sync functionality from the <mfi> driver. I have attached a sample patch that covers only the removal of the raid map sync functionality. It does not cover any test cases and is for reference only.
If it looks feasible to remove raid map sync support from the driver, please consider my patch.
Note: we are not seeing this issue with the <mrsas> driver.
Can FreeBSD comment on this bug?
Dell has found that this bug causes excessive event logging, because the incorrect raid map sync call fails. This in turn causes premature wear on the flash components where the logs are stored.
In other words, the bug eventually damages the hardware.
[responding with hat bugmeister@]
So, from looking over the src tree, there does not seem to be anyone actively maintaining this driver -- most of the last few years' worth of commits are issues that arose elsewhere in the codebase.
I'll try to find someone versed in disk driver code to comment on this.
^Triage: Request feedback from original mfi(4) author
Any update/ETA on when the MFI driver will be updated to resolve this issue? The issue currently impacts the card hardware and also renders the log useless due to the excessive prints.
@kubilay or Mark
Any update on finding a maintainer?