Bug 239801 - mfi errors causing zfs checksum errors
Summary: mfi errors causing zfs checksum errors
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.3-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs mailing list
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2019-08-12 13:34 UTC by Daniel Mafua
Modified: 2019-08-22 11:24 UTC

Description Daniel Mafua 2019-08-12 13:34:33 UTC
After upgrading to FreeBSD 11.3 I began having storage issues. (I may have experienced the same problem when testing FreeBSD 12.0.) At first I thought the cause was upgrading my ZFS pool to enable the spacemap_v2 feature, but now I think it is a lower-level problem. After a day or two, checksum errors start to appear on a pool. Scrubbing the pool just results in more errors. If I restart the machine, I can then scrub the pool and no errors appear; after a few days it happens again.

There are only one or two machines on which I haven't upgraded the zpool. I think it is just a coincidence that I noticed the problem after upgrading the zpool (I believe the upgrade itself went okay), although I haven't experienced problems on the one machine whose pool I haven't upgraded.

All three servers are Dell PowerEdge machines with a PERC RAID controller configured as JBOD. Each has two drives configured as a ZFS mirror. One machine had a drive failure a few months ago; the drive was replaced and has been running fine since. Otherwise, all machines have been running for years with no issues and never failed a scrub until the upgrade. They are at different locations, which probably rules out power issues.



mfi0 Adapter:
    Product Name: PERC H330 Adapter
   Serial Number: 59N01F6
        Firmware: 25.3.0.0016
     RAID Levels: JBOD, RAID0, RAID1, RAID5, RAID10, RAID50
  Battery Backup: not present
           NVRAM: 32K
  Onboard Memory: 0M
  Minimum Stripe: 64K
  Maximum Stripe: 64K

mfi0 Physical Drives:
 0 (  932G) JBOD <SEAGATE ST1000NM0023 GS10 serial=Z1W4J1LE> SCSI-6 S0
 1 (  932G) JBOD <SEAGATE ST1000NM0023 GS10 serial=Z1W4J3LX> SCSI-6 S1


Errors reported in /var/log/messages:

kernel: mfi0: I/O error, cmd=0xfffffe0000f9a4a8, status=0x3c, scsi_status=0
kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
kernel: mfisyspd1: hard error cmd=write 675298440-675298991
kernel: mfi0: I/O error, cmd=0xfffffe0000f99dc0, status=0x3c, scsi_status=0
kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
kernel: mfisyspd1: hard error cmd=write 675298504-675299015
kernel: mfi0: I/O error, cmd=0xfffffe0000f9c048, status=0x3c, scsi_status=0
kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
kernel: mfisyspd0: hard error cmd=write 675298504-675299015
kernel: mfi0: I/O error, cmd=0xfffffe0000f9ae38, status=0x3c, scsi_status=0
kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
kernel: mfisyspd0: hard error cmd=write 675298440-675298991
kernel: mfi0: I/O error, cmd=0xfffffe0000f9c048, status=0x3c, scsi_status=0
kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
kernel: mfisyspd0: hard error cmd=write 675298440-675298991
kernel: mfi0: I/O error, cmd=0xfffffe0000f9afd0, status=0x3c, scsi_status=0
kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
kernel: mfisyspd1: hard error cmd=write 675298440-675298991
kernel: mfi0: I/O error, cmd=0xfffffe0000f98bb0, status=0x3c, scsi_status=0
kernel: mfi0: sense error 47, sense_key 15, asc 175, ascq 175
kernel: mfisyspd1: hard error cmd=write 675298440-675298991
kernel: mfi0: I/O error, cmd=0xfffffe0000f9a6c8, status=0x3c, scsi_status=0
kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
kernel: mfisyspd0: hard error cmd=write 675298440-675298991
kernel: mfi0: I/O error, cmd=0xfffffe0000f9b0e0, status=0x3c, scsi_status=0
kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
kernel: mfisyspd0: hard error cmd=write 675298440-675298991
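
In the excerpt above, the same two write ranges are reported on both mfisyspd0 and mfisyspd1, and only one sense line carries non-zero values. To check whether that pattern holds across a longer log, the lines could be tallied with something like the following. This is a rough sketch, not a diagnostic tool: the regular expressions are written against the line formats shown in this report only, and the function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   kernel: mfisyspd1: hard error cmd=write 675298440-675298991
# (format assumed from the log excerpt in this report)
HARD_ERROR = re.compile(
    r"(?P<dev>mfisyspd\d+): hard error cmd=(?P<op>\w+) (?P<start>\d+)-(?P<end>\d+)"
)

# Matches lines such as:
#   kernel: mfi0: sense error 0, sense_key 0, asc 0, ascq 0
SENSE = re.compile(
    r"mfi\d+: sense error (\d+), sense_key (\d+), asc (\d+), ascq (\d+)"
)


def summarize(lines):
    """Count hard errors per (device, op, start, end) and collect non-zero sense data."""
    hard = Counter()
    nonzero_sense = []
    for line in lines:
        m = HARD_ERROR.search(line)
        if m:
            key = (m["dev"], m["op"], int(m["start"]), int(m["end"]))
            hard[key] += 1
            continue
        m = SENSE.search(line)
        if m:
            values = tuple(int(v) for v in m.groups())
            # Keep only sense lines where at least one field is non-zero.
            if any(values):
                nonzero_sense.append(values)
    return hard, nonzero_sense


def report(path):
    """Print a per-device, per-range summary for the log file at `path`."""
    with open(path, errors="replace") as f:
        hard, sense = summarize(f)
    for (dev, op, start, end), count in sorted(hard.items()):
        print(f"{dev} {op} {start}-{end}: {count}")
    for values in sense:
        # Sense fields are logged in decimal; hex is easier to compare with SCSI tables.
        print("non-zero sense:", ", ".join(f"{v:#x}" for v in values))
```

Calling `report("/var/log/messages")` on an affected machine would list each failing range once per device with a count, which makes it easier to see whether both mirror members fail on identical ranges.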