Bug 240145 - [smartpqi][zfs] kernel panic with hanging vdev
Summary: [smartpqi][zfs] kernel panic with hanging vdev
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.2-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Warner Losh
URL:
Keywords: panic
Depends on:
Blocks:
 
Reported: 2019-08-27 12:45 UTC by rainer
Modified: 2022-04-27 13:29 UTC
CC List: 17 users

See Also:


Attachments
Attaching the changes, reducing the maximum transfer size (520 bytes, patch)
2021-11-11 14:41 UTC, Hermes T K
Screencap showing maxphys=131072 and subsequent failure (983.25 KB, image/jpeg)
2021-11-12 22:33 UTC, Peter
Attaching the driver file for FreeBSD 13.0 (47.97 KB, application/x-xz)
2021-12-28 10:08 UTC, Hermes T K
Attaching the driver file for FreeBSD 13.0 (47.97 KB, application/x-xz)
2021-12-28 10:12 UTC, Hermes T K
Screencap showing new driver loaded and subsequent failure (948.83 KB, image/jpeg)
2022-01-10 22:55 UTC, Peter
Attaching the smartpqi bootleg (47.89 KB, application/x-xz)
2022-01-27 10:52 UTC, Hermes T K
Screencap showing "bootleg" driver loaded and subsequent failure (880.50 KB, image/jpeg)
2022-01-27 23:27 UTC, Peter
Screencap showing driver in attachment #230487 actually loaded and successful (987.03 KB, image/jpeg)
2022-02-03 13:44 UTC, Peter

Description rainer 2019-08-27 12:45:43 UTC
Hi,

I get kernel panics like this one:

2019-08-27T09:51:47+02:00 server-log03-prod kernel: <118>[51] 2019-08-27T09:51:47+02:00 server-log03-prod 1 2019-08-27T09:51:47.264114+02:00 server-log03-prod savecore 75563 - - reboot after panic: I/O to pool 'datapool' appears to be hung on vdev guid 3442909230652761189 at '/dev/da0'.

dmesg shows:

[167] [ERROR]::[17:655.0][0,84,0][CPU 7][pqi_map_request][540]:bus_dmamap_load_ccb failed = 36 count = 131072
[167] [WARN]:[17:655.0][CPU 7][pqisrc_io_start][794]:In Progress on 84
[167] Assertion failed at file /usr/src/sys/dev/smartpqi/smartpqi_response.c line 203


before it crashes.
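
For reference, the error 36 from bus_dmamap_load_ccb is an errno value; on FreeBSD it corresponds to EINPROGRESS, which matches the "In Progress" warning on the next line. A quick way to confirm on any FreeBSD box:

# errno values live in sys/errno.h; 36 should come back as EINPROGRESS
# ("Operation now in progress"), i.e. a deferred DMA mapping, not a hard failure
grep -w 36 /usr/include/sys/errno.h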

There's a scrub running, and I assume the panic is triggered by that.

The hardware is a HP DL380 Gen10 with 2*8 disk RAIDz2, booting from a separate controller.


zpool status
  pool: datapool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
	still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
	the pool may no longer be accessible by software that does not support
	the features. See zpool-features(7) for details.
  scan: scrub in progress since Tue Aug 27 03:49:26 2019
	596G scanned at 832M/s, 429M issued at 599K/s, 14.6T total
	0 repaired, 0.00% done, no estimated completion time
config:

	NAME        STATE     READ WRITE CKSUM
	datapool    ONLINE       0     0     0
	  raidz2-0  ONLINE       0     0     0
	    da3     ONLINE       0     0     0
	    da2     ONLINE       0     0     0
	    da1     ONLINE       0     0     0
	    da0     ONLINE       0     0     0
	    da4     ONLINE       0     0     0
	    da5     ONLINE       0     0     0
	    da6     ONLINE       0     0     0
	    da7     ONLINE       0     0     0
	  raidz2-1  ONLINE       0     0     0
	    da11    ONLINE       0     0     0
	    da10    ONLINE       0     0     0
	    da9     ONLINE       0     0     0
	    da8     ONLINE       0     0     0
	    da12    ONLINE       0     0     0
	    da13    ONLINE       0     0     0
	    da14    ONLINE       0     0     0
	    da15    ONLINE       0     0     0
	  raidz2-2  ONLINE       0     0     0
	    da16    ONLINE       0     0     0
	    da17    ONLINE       0     0     0
	    da18    ONLINE       0     0     0
	    da19    ONLINE       0     0     0
	    da20    ONLINE       0     0     0
	    da21    ONLINE       0     0     0
	    da22    ONLINE       0     0     0
	    da23    ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	zroot       ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    da24p4  ONLINE       0     0     0
	    da25p4  ONLINE       0     0     0

errors: No known data errors

datapool is on a HPE E208i-p SR Gen10 1.98
zroot is on a HPE P408i-a SR Gen10 1.98


I've updated all the firmware to what is available in SPP 2019.03.01

It might be a hardware issue, but I'm not really sure where to pin it.
Is it da0?

What do these error-messages mean?
Comment 1 Andriy Gapon freebsd_committer 2019-08-28 08:48:04 UTC
ZFS just reported a stuck I/O operation.
The problem is likely to be either in the driver or in the hardware.
Maybe it's triggered by the I/O load that a scrub creates.
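
For narrowing it down, the deadman behaviour and the scrub pressure can be inspected and tuned via sysctl; the tunable names below are from the legacy ZFS in 12.x and may differ on OpenZFS-based systems:

# current deadman settings and the scrub throttle (legacy ZFS names)
sysctl vfs.zfs.deadman_enabled vfs.zfs.deadman_synctime_ms vfs.zfs.scrub_delay
# suppress the panic while debugging; the hung I/O itself still remains stuck
sysctl vfs.zfs.deadman_enabled=0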
Comment 2 rainer 2019-08-28 09:18:04 UTC
OK, thanks.

I have two of these servers; this is actually the one with less I/O (and fewer drives; it finished scrubbing 19T in 4.5h yesterday).

So, I would also tend to point towards hardware. But what is it?
A specific drive? Or is the HBA toast?

I'll have to look if I can actually swap out the HBA or if I need to swap the motherboard.

I've disabled scrubs, so the server works for the moment.
Comment 3 Peter Eriksson 2019-08-28 11:27:39 UTC
Just another (rather worthless, but anyway) datapoint: 

The same thing happened to us on one of our production file servers just this Monday during prime daytime (1pm). No scrub was running, just a normal load of SMB and NFS traffic (some ~400 SMB clients and ~40 NFS clients).

FreeBSD kernel: 11.2-RELEASE-p10

Hardware: Dell PowerEdge R730xd with an LSI SAS3008 (Dell-branded) HBA; the DATA pool the error occurred in has 12 x 10TB SAS 7200rpm drives in a RAID-Z2 config.

After the reboot, no errors could be found via smartctl or in any logs (other than the "panic" message) on that disk or any other disk.

The vdev pointed at in the panic message was the one named "diskid/DISK-7PK8RSLC" below

# zpool status -v DATA
  pool: DATA
 state: ONLINE
  scan: scrub repaired 0 in 83h42m with 0 errors on Tue Jan  8 07:44:05 2019
config:

	NAME                              STATE     READ WRITE CKSUM
	DATA                              ONLINE       0     0     0
	  raidz2-0                        ONLINE       0     0     0
	    diskid/DISK-7PK784UC          ONLINE       0     0     0
	    diskid/DISK-7PK2GT9G          ONLINE       0     0     0
	    diskid/DISK-7PK8RSLC          ONLINE       0     0     0
	    diskid/DISK-7PK77Z2C          ONLINE       0     0     0
	    diskid/DISK-7PK1U91G          ONLINE       0     0     0
	    diskid/DISK-7PK2GBPG          ONLINE       0     0     0
	  raidz2-1                        ONLINE       0     0     0
	    diskid/DISK-7PK1AZ4G          ONLINE       0     0     0
	    diskid/DISK-7PK2GEEG          ONLINE       0     0     0
	    diskid/DISK-7PK14ARG          ONLINE       0     0     0
	    diskid/DISK-7PK7HS5C          ONLINE       0     0     0
	    diskid/DISK-7PK2GERG          ONLINE       0     0     0
	    diskid/DISK-7PK200TG          ONLINE       0     0     0
	logs
	  diskid/DISK-BTHV7146043R400NGN  ONLINE       0     0     0
	  diskid/DISK-BTHV715403A9400NGN  ONLINE       0     0     0
	cache
	  diskid/DISK-CVCQ72660083400AGN  ONLINE       0     0     0
	spares
	  diskid/DISK-7PK1RNVG            AVAIL   
	  diskid/DISK-7PK784NC            AVAIL   

errors: No known data errors

# sas3ircu 0 DISPLAY 
Avago Technologies SAS3 IR Configuration Utility.
Version 11.00.00.00 (2015.08.04) 
Copyright (c) 2009-2015 Avago Technologies. All rights reserved. 

Read configuration has been initiated for controller 0
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
  Controller type                         : SAS3008
  BIOS version                            : 8.37.00.00
  Firmware version                        : 16.00.04.00
  Channel description                     : 1 Serial Attached SCSI
  Initiator ID                            : 0
  Maximum physical devices                : 543
  Concurrent commands supported           : 9584
  Slot                                    : 5
  Segment                                 : 0
  Bus                                     : 2
  Device                                  : 0
  Function                                : 0
  RAID Support                            : No
...
Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 2
  SAS Address                             : 5000cca-2-51b8-fbb1
  State                                   : Ready (RDY)
  Size (in MB)/(in sectors)               : 9470975/2424569855
  Manufacturer                            : HGST    
  Model Number                            : HUH721010AL4200 
  Firmware Revision                       : LS17
  Serial No                               : 7PK8RSLC
  GUID                                    : N/A
  Protocol                                : SAS
  Drive Type                              : SAS_HDD
...
Comment 4 rainer 2019-09-02 00:06:42 UTC
So, replacing the controller:

HPE E208i-p SR Gen10


seems to have helped.

The scrub went through.

I know hardware errors are difficult to diagnose from the OS above it, but maybe there could somehow be more diagnostics?


We will have to send back this controller (we pre-ordered a new one on a hunch).
Comment 5 rainer 2019-10-05 12:11:19 UTC
Now, the other of the two servers is also acting up.

After rebooting, it finished its scrub though.

I've not yet ordered a replacement HBA but will do soon.

The server with the replaced HBA has never shown a problem again. So far ;-)
Comment 6 wermut 2020-06-28 18:46:19 UTC
I have the same problem with an HPE DL385 Gen10 and the HPE Smart Array P816i-a controller under FreeBSD 12.1, with controller firmware 2.65. The smartpqi manpage states that the HPE Gen10 devices should be supported, but none of the tested systems work correctly with this driver.

ZFS pool and volume creation works without problems. Creating a new virtual machine with vm-bhyve and cloud-init hangs at the step of copying the cloud image over to the new ZFS volume.

[ERROR]::[68.655.0][0,65,0][CPU 31][pqi_map_request][540]:bus_dmamap_load_ccb failed = 36 count = 131072
[WARN]:[68.655.0][CPU 31][pqisrc_io_start][794]:In Progress on 68

...

panic: I/O to pool 'tank' appears to be hung on vdev guid 13498267743616651267 at '/dev/da0'.
cpuid = 28
time = 1593368493
KDB: stack backtrace:
#0 0xffffffff80c1d307 at kdb_backtrace+0x67
#1 0xffffffff80bd063d at vpanic+0x19d
#2 0xffffffff80bd0493 at panic+0x43
#3 0xffffffff82bc17da at vdev_deadman+0x18a
#4 0xffffffff82bc1691 at vdev_deadman+0x41
#5 0xffffffff82bc1691 at vdev_deadman+0x41
#6 0xffffffff82bb21e4 at spa_deadman+0x84
#7 0xffffffff80c2fae4 at taskqueue_run_locked+0x154
#8 0xffffffff80c30e18 at taskqueue_thread_loop+0x98
#9 0xffffffff80b90c53 at fork_exit+0x83
#10 0xffffffff81082c3e at fork_trampoline+0xe

...

(da0:smartpqi0:0:64:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
(da0:smartpqi0:0:64:0): CAM status: Command timeout
(da0:smartpqi0:0:64:0): Error 5, Retries exhausted
(da0:smartpqi0:0:64:0): Synchronize cache failed

...

Dump failed. Partition too small.
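
On the failed crash dump: that message usually just means the configured dump device is smaller than the dump. A rough checklist for getting a usable dump out of the next panic (standard FreeBSD tooling; the partition name below is only an example):

# compare RAM against the dump/swap device size (minidumps are the default,
# so the device normally needs much less than full RAM)
sysctl hw.physmem
swapinfo -h
# show and set the dump device (or put dumpdev="AUTO" in /etc/rc.conf);
# dumpon -l is available on recent releases
dumpon -l
dumpon /dev/da24p3    # example partition name only; use a sufficiently large one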
Comment 7 rainer 2020-11-16 13:31:02 UTC
SPP 2020.09.0 contains firmware 3.00 for these HBAs.

I will see if it fixes these problems, too - because the list of bug-fixes is very long...


(server </root>) 130 # dmesg |grep -i smart
smartpqi0: <P408i-a SR Gen10> port 0xc000-0xc0ff mem 0xf3800000-0xf3807fff at device 0.0 numa-domain 0 on pci9
smartpqi0: using MSI-X interrupts (8 vectors)
ses0 at smartpqi0 bus 0 scbus0 target 66 lun 0
ses0: <HPE Smart Adapter 3.00> Fixed Enclosure Services SPC-3 SCSI device
Comment 8 rainer 2020-11-27 09:11:04 UTC
On another server with firmware 3.0, I got a first "hang on vdev" crash after about a week. I also upgraded to 12.2-RELEASE.
This one only has 8 disks.

Microsemi does have an even newer firmware, but it has not yet made its way to HP (I've asked our VAR how long this will take).

My big syslog-servers have 32 disks, so I will for now not upgrade them.
Comment 9 rainer 2020-12-31 10:53:34 UTC
Meanwhile, the newer firmware is available from HPE.

I've applied it and it's slightly better in that it only crashes every couple of days.

Because the HBA might be a problem, I've now ordered the equivalent Microsemi HBA:

Smart-RAID 3154-8i

At least, I get a FreeBSD utility for configuration and firmware update and don't have to boot a live-CD.


HP has in the past also created more problems in their OEMed HBAs than they solved (compared to the original hardware).
Comment 10 rainer 2021-01-26 14:04:55 UTC
OK, so I still get this panic:

[49406] [ERROR]::[55:655.0][0,68,0][CPU 9][pqi_map_request][540]:bus_dmamap_load_ccb failed = 36 count = 131072
[49406] [WARN]:[55:655.0][CPU 9][pqisrc_io_start][794]:In Progress on 68
[50411] panic: I/O to pool 'datapool' appears to be hung on vdev guid 3875563786885777386 at '/dev/da9'.
[50411] cpuid = 14
[50411] time = 1611665350
[50411] KDB: stack backtrace:
[50411] #0 0xffffffff80c0a8e5 at kdb_backtrace+0x65
[50411] #1 0xffffffff80bbeb9b at vpanic+0x17b
[50411] #2 0xffffffff80bbea13 at panic+0x43
[50411] #3 0xffffffff828a2314 at vdev_deadman+0x184
[50411] #4 0xffffffff828a21d1 at vdev_deadman+0x41
[50411] #5 0xffffffff828a21d1 at vdev_deadman+0x41
[50411] #6 0xffffffff828930f6 at spa_deadman+0x86
[50411] #7 0xffffffff80c1ced4 at taskqueue_run_locked+0x144
[50411] #8 0xffffffff80c1e2c6 at taskqueue_thread_loop+0xb6
[50411] #9 0xffffffff80b804ce at fork_exit+0x7e
[50411] #10 0xffffffff81067f9e at fork_trampoline+0xe
[50411] Uptime: 14h0m11s


This is with the OG Adaptec HBA:

<Adaptec Smart Adapter 3.21>       at scbus1 target 72 lun 0 (ses1,pass12)
<Adaptec 3154-8i 3.21>             at scbus1 target 1088 lun 0 (pass13)

set to HBA mode.
Comment 11 seri 2021-02-01 21:57:40 UTC
MARKED AS SPAM
Comment 12 scott.benesh 2021-02-01 22:18:50 UTC
Have you tried the latest FreeBSD drivers found here:

https://storage.microsemi.com/en-us/speed/raid/aac/unix/smartpqi_freebsd_v4030.0.101_tgz.php

We've been trying to get the latest driver code changes in-tree ("inbox") at

https://reviews.freebsd.org/D24428

but I guess we've lost the magic on getting these changes submitted.
Comment 13 rainer 2021-02-06 22:03:45 UTC
Hi,

thanks for this update.

I will then try your driver. Unfortunately, I don't have a test-environment to try it out.

I guess I can create a boot environment and, if it causes problems, just revert to the old boot environment?

I'm sorry that your efforts are not honored.
Comment 14 rainer 2021-02-06 22:46:39 UTC
OK,

so I realized I can compile this anywhere, not just on my server.

This is what I did:

 - take a FreeBSD 12.2-RELEASE-p3 install
 - download src.txz, extract
 - freebsd-update fetch && freebsd-update install
 - cd /usr/src
 - patch -p 0 < /root/D24428.diff
 - make buildkernel && make installkernel


this is what I get:

cc -target x86_64-unknown-freebsd12.2 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin -c -O2 -pipe -fno-strict-aliasing  -g -nostdinc  -I. -I/usr/src/sys -I/usr/src/sys/contrib/ck/include -I/usr/src/sys/contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common  -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -MD  -MF.depend.nehemiah.o -MTnehemiah.o -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include -fdebug-prefix-map=./x86=/usr/src/sys/x86/include -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float  -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -fstack-protector -gdwarf-2 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -Wno-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equality -Wno-error-unused-function -Wno-error-pointer-sign -Wno-error-shift-negative-value -Wno-address-of-packed-member  -mno-aes -mno-avx  -std=iso9899:1999 -Werror  /usr/src/sys/dev/random/nehemiah.c
ctfconvert -L VERSION -g nehemiah.o
cc -target x86_64-unknown-freebsd12.2 --sysroot=/usr/obj/usr/src/amd64.amd64/tmp -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin -c -O2 -pipe -fno-strict-aliasing  -g -nostdinc  -I. -I/usr/src/sys -I/usr/src/sys/contrib/ck/include -I/usr/src/sys/contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common  -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -MD  -MF.depend.smartpqi_cam.o -MTsmartpqi_cam.o -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include -fdebug-prefix-map=./x86=/usr/src/sys/x86/include -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float  -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -fstack-protector -gdwarf-2 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -Wno-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equality -Wno-error-unused-function -Wno-error-pointer-sign -Wno-error-shift-negative-value -Wno-address-of-packed-member  -mno-aes -mno-avx  -std=iso9899:1999 -Werror  /usr/src/sys/dev/smartpqi/smartpqi_cam.c
In file included from /usr/src/sys/dev/smartpqi/smartpqi_cam.c:34:
In file included from /usr/src/sys/dev/smartpqi/smartpqi_includes.h:86:
/usr/src/sys/dev/smartpqi/smartpqi_defines.h:973:27: error: redefinition of typedef 'OS_ATOMIC64_T' is a C11 feature [-Werror,-Wtypedef-redefinition]
typedef volatile uint64_t OS_ATOMIC64_T;
                          ^
/usr/src/sys/dev/smartpqi/smartpqi_defines.h:825:33: note: previous definition is here
typedef volatile uint64_t       OS_ATOMIC64_T;
                                ^
/usr/src/sys/dev/smartpqi/smartpqi_defines.h:975:9: error: 'OS_ATOMIC64_READ' macro redefined [-Werror,-Wmacro-redefined]
#define OS_ATOMIC64_READ(_softs, target)        atomic_load_acq_64(&(_softs)->target)
        ^
/usr/src/sys/dev/smartpqi/smartpqi_defines.h:826:9: note: previous definition is here
#define OS_ATOMIC64_READ(p)     atomic_load_acq_64(p)
        ^
/usr/src/sys/dev/smartpqi/smartpqi_defines.h:976:9: error: 'OS_ATOMIC64_INC' macro redefined [-Werror,-Wmacro-redefined]
#define OS_ATOMIC64_INC(_softs, target)         atomic_add_64(&(_softs)->target, 1)
        ^
/usr/src/sys/dev/smartpqi/smartpqi_defines.h:831:9: note: previous definition is here
#define OS_ATOMIC64_INC(p)      (atomic_fetchadd_64(p, 1) + 1)
        ^
/usr/src/sys/dev/smartpqi/smartpqi_cam.c:619:4: error: use of undeclared identifier 'bsd_status'
                        bsd_status = EIO;
                        ^
/usr/src/sys/dev/smartpqi/smartpqi_cam.c:623:31: error: use of undeclared identifier 'bsd_status'; did you mean 'dumpstatus'?
        DBG_FUNC("OUT error = %d\n", bsd_status);
                                     ^~~~~~~~~~
                                     dumpstatus
/usr/src/sys/dev/smartpqi/smartpqi_defines.h:1083:58: note: expanded from macro 'DBG_FUNC'
                                printf("[FUNC]:[ %s ] [ %d ]"fmt,__func__,__LINE__,##args);                     \
                                                                                     ^
/usr/src/sys/sys/systm.h:217:5: note: 'dumpstatus' declared here
int     dumpstatus(vm_offset_t addr, off_t count);
        ^
/usr/src/sys/dev/smartpqi/smartpqi_cam.c:623:31: error: format specifies type 'int' but the argument has type 'int (*)(vm_offset_t, off_t)' (aka 'int (*)(unsigned long, long)')
      [-Werror,-Wformat]
        DBG_FUNC("OUT error = %d\n", bsd_status);
                              ~~     ^~~~~~~~~~
/usr/src/sys/dev/smartpqi/smartpqi_defines.h:1083:58: note: expanded from macro 'DBG_FUNC'
                                printf("[FUNC]:[ %s ] [ %d ]"fmt,__func__,__LINE__,##args);                     \
                                                             ~~~                     ^~~~
/usr/src/sys/dev/smartpqi/smartpqi_cam.c:625:9: error: use of undeclared identifier 'bsd_status'; did you mean 'dumpstatus'?
        return bsd_status;
               ^~~~~~~~~~
               dumpstatus
/usr/src/sys/sys/systm.h:217:5: note: 'dumpstatus' declared here
int     dumpstatus(vm_offset_t addr, off_t count);
        ^
/usr/src/sys/dev/smartpqi/smartpqi_cam.c:625:9: error: incompatible pointer to integer conversion returning 'int (vm_offset_t, off_t)' (aka 'int (unsigned long, long)') from a function
      with result type 'int' [-Werror,-Wint-conversion]
        return bsd_status;
               ^~~~~~~~~~
8 errors generated.
*** Error code 1

Stop.
make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/GENERIC
*** Error code 1
*** Error code 1


Unfortunately, I have no idea how to fix this.
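
Side note: compile errors like these can be caught much faster by building only the smartpqi module instead of a full kernel; a minimal sketch, assuming a stock /usr/src layout:

cd /usr/src/sys/modules/smartpqi
make clean
make    # fails within seconds on the same redefinition/undeclared-identifier errors

Since GENERIC compiles the driver into the kernel, a full buildkernel/installkernel is still needed to actually boot a fix; the module build is only a quick consistency check of the patched sources.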
Comment 15 rainer 2021-02-18 18:02:37 UTC
Hi,

with the updated diff, I get:

(f-hosting <src>) 1 # patch  -l -p 0 < /root/D24428.diff 
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_cam.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_cam.c
|+++ sys/dev/smartpqi/smartpqi_cam.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_cam.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 36.
Hunk #3 succeeded at 56.
Hunk #4 succeeded at 67.
Hunk #5 succeeded at 79.
Hunk #6 succeeded at 106.
Hunk #7 succeeded at 135.
Hunk #8 succeeded at 151.
Hunk #9 succeeded at 162.
Hunk #10 succeeded at 184.
Hunk #11 succeeded at 210.
Hunk #12 succeeded at 241.
Hunk #13 succeeded at 257.
Hunk #14 succeeded at 348.
Hunk #15 succeeded at 363.
Hunk #16 succeeded at 380.
Hunk #17 succeeded at 400.
Hunk #18 succeeded at 439.
Hunk #19 succeeded at 466.
Hunk #20 succeeded at 489.
Hunk #21 succeeded at 515.
Hunk #22 failed at 539.
Hunk #23 failed at 577.
Hunk #24 succeeded at 613 (offset -14 lines).
Hunk #25 succeeded at 638 (offset -14 lines).
Hunk #26 succeeded at 646 (offset -14 lines).
Hunk #27 succeeded at 663 (offset -14 lines).
Hunk #28 succeeded at 702 (offset -14 lines).
Hunk #29 succeeded at 723 (offset -14 lines).
Hunk #30 succeeded at 747 (offset -14 lines).
Hunk #31 succeeded at 774 (offset -14 lines).
Hunk #32 succeeded at 798 (offset -14 lines).
Hunk #33 succeeded at 868 (offset -14 lines).
Hunk #34 succeeded at 946 (offset -14 lines).
Hunk #35 succeeded at 957 (offset -14 lines).
Hunk #36 succeeded at 985 (offset -14 lines).
Hunk #37 succeeded at 1003 (offset -14 lines).
Hunk #38 succeeded at 1046 (offset -14 lines).
Hunk #39 succeeded at 1112 (offset -14 lines).
Hunk #40 succeeded at 1125 (offset -14 lines).
Hunk #41 succeeded at 1170 (offset -14 lines).
Hunk #42 succeeded at 1190 (offset -14 lines).
Hunk #43 succeeded at 1206 (offset -14 lines).
Hunk #44 succeeded at 1217 (offset -14 lines).
Hunk #45 succeeded at 1228 (offset -14 lines).
Hunk #46 succeeded at 1247 (offset -14 lines).
Hunk #47 succeeded at 1267 (offset -14 lines).
Hunk #48 succeeded at 1294 (offset -14 lines).
Hunk #49 succeeded at 1317 (offset -14 lines).
2 out of 49 hunks failed--saving rejects to sys/dev/smartpqi/smartpqi_cam.c.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_cmd.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_cmd.c
|+++ sys/dev/smartpqi/smartpqi_cmd.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_cmd.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 34.
Hunk #3 succeeded at 45.
Hunk #4 succeeded at 72.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_defines.h
|===================================================================
|--- sys/dev/smartpqi/smartpqi_defines.h
|+++ sys/dev/smartpqi/smartpqi_defines.h
--------------------------
Patching file sys/dev/smartpqi/smartpqi_defines.h using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 31.
Hunk #3 succeeded at 90.
Hunk #4 succeeded at 100.
Hunk #5 succeeded at 124.
Hunk #6 succeeded at 186.
Hunk #7 succeeded at 208.
Hunk #8 succeeded at 221.
Hunk #9 succeeded at 240.
Hunk #10 succeeded at 276.
Hunk #11 succeeded at 327.
Hunk #12 succeeded at 346.
Hunk #13 succeeded at 355.
Hunk #14 succeeded at 380.
Hunk #15 succeeded at 403.
Hunk #16 succeeded at 423.
Hunk #17 succeeded at 490.
Hunk #18 succeeded at 555.
Hunk #19 succeeded at 604.
Hunk #20 succeeded at 666.
Hunk #21 succeeded at 682.
Hunk #22 succeeded at 706.
Hunk #23 succeeded at 760.
Hunk #24 succeeded at 789.
Hunk #25 succeeded at 800.
Hunk #26 succeeded at 818.
Hunk #27 succeeded at 917.
Hunk #28 failed at 980.
Hunk #29 succeeded at 1031.
Hunk #30 succeeded at 1042.
Hunk #31 succeeded at 1054.
Hunk #32 succeeded at 1073.
Hunk #33 succeeded at 1091.
Hunk #34 succeeded at 1163.
1 out of 34 hunks failed--saving rejects to sys/dev/smartpqi/smartpqi_defines.h.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_discovery.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_discovery.c
|+++ sys/dev/smartpqi/smartpqi_discovery.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_discovery.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 31.
Hunk #3 succeeded at 44.
Hunk #4 succeeded at 58.
Hunk #5 succeeded at 70.
Hunk #6 succeeded at 129.
Hunk #7 succeeded at 150.
Hunk #8 succeeded at 162.
Hunk #9 succeeded at 197.
Hunk #10 succeeded at 212.
Hunk #11 succeeded at 259.
Hunk #12 succeeded at 290.
Hunk #13 succeeded at 302.
Hunk #14 succeeded at 315.
Hunk #15 succeeded at 335.
Hunk #16 succeeded at 348.
Hunk #17 succeeded at 358.
Hunk #18 succeeded at 367.
Hunk #19 succeeded at 399.
Hunk #20 succeeded at 432.
Hunk #21 succeeded at 515.
Hunk #22 succeeded at 530.
Hunk #23 succeeded at 577.
Hunk #24 succeeded at 590.
Hunk #25 succeeded at 602.
Hunk #26 succeeded at 617.
Hunk #27 succeeded at 630.
Hunk #28 succeeded at 642.
Hunk #29 succeeded at 653.
Hunk #30 succeeded at 684.
Hunk #31 succeeded at 753.
Hunk #32 succeeded at 766.
Hunk #33 succeeded at 787.
Hunk #34 succeeded at 808.
Hunk #35 succeeded at 816.
Hunk #36 succeeded at 857.
Hunk #37 succeeded at 877.
Hunk #38 succeeded at 888.
Hunk #39 succeeded at 898.
Hunk #40 succeeded at 927.
Hunk #41 succeeded at 940.
Hunk #42 succeeded at 958.
Hunk #43 succeeded at 991.
Hunk #44 succeeded at 1008.
Hunk #45 succeeded at 1027.
Hunk #46 succeeded at 1038.
Hunk #47 succeeded at 1057.
Hunk #48 succeeded at 1072.
Hunk #49 succeeded at 1108.
Hunk #50 succeeded at 1139.
Hunk #51 succeeded at 1152.
Hunk #52 succeeded at 1172.
Hunk #53 succeeded at 1186.
Hunk #54 succeeded at 1212.
Hunk #55 succeeded at 1277.
Hunk #56 succeeded at 1297.
Hunk #57 succeeded at 1345.
Hunk #58 succeeded at 1391.
Hunk #59 succeeded at 1424.
Hunk #60 succeeded at 1435.
Hunk #61 succeeded at 1486.
Hunk #62 succeeded at 1497.
Hunk #63 succeeded at 1555.
Hunk #64 succeeded at 1563.
Hunk #65 succeeded at 1581.
Hunk #66 succeeded at 1597.
Hunk #67 succeeded at 1616.
Hunk #68 succeeded at 1662.
Hunk #69 succeeded at 1674.
Hunk #70 succeeded at 1684.
Hunk #71 succeeded at 1708.
Hunk #72 succeeded at 1718.
Hunk #73 succeeded at 1737.
Hunk #74 succeeded at 1752.
Hunk #75 succeeded at 1761.
Hunk #76 succeeded at 1775.
Hunk #77 succeeded at 1804.
Hunk #78 succeeded at 1865.
Hunk #79 succeeded at 1882.
Hunk #80 succeeded at 1919.
Hunk #81 succeeded at 1970.
Hunk #82 succeeded at 1987.
Hunk #83 succeeded at 2007.
Hunk #84 succeeded at 2027.
Hunk #85 succeeded at 2045.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_event.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_event.c
|+++ sys/dev/smartpqi/smartpqi_event.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_event.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 37.
Hunk #3 succeeded at 61.
Hunk #4 succeeded at 74.
Hunk #5 succeeded at 94.
Hunk #6 succeeded at 109.
Hunk #7 succeeded at 121.
Hunk #8 succeeded at 168.
Hunk #9 succeeded at 176.
Hunk #10 succeeded at 209.
Hunk #11 succeeded at 224.
Hunk #12 succeeded at 246.
Hunk #13 succeeded at 259.
Hunk #14 succeeded at 281.
Hunk #15 succeeded at 301.
Hunk #16 succeeded at 320.
Hunk #17 succeeded at 347.
Hunk #18 succeeded at 381.
Hunk #19 succeeded at 399.
Hunk #20 succeeded at 419.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_helper.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_helper.c
|+++ sys/dev/smartpqi/smartpqi_helper.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_helper.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 30.
Hunk #3 succeeded at 97.
Hunk #4 succeeded at 142.
Hunk #5 succeeded at 151.
Hunk #6 succeeded at 160.
Hunk #7 succeeded at 188.
Hunk #8 succeeded at 230.
Hunk #9 succeeded at 287.
Hunk #10 succeeded at 299.
Hunk #11 succeeded at 319.
Hunk #12 succeeded at 330.
Hunk #13 succeeded at 353.
Hunk #14 succeeded at 364.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_includes.h
|===================================================================
|--- sys/dev/smartpqi/smartpqi_includes.h
|+++ sys/dev/smartpqi/smartpqi_includes.h
--------------------------
Patching file sys/dev/smartpqi/smartpqi_includes.h using Plan A...
Hunk #1 succeeded at 1.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_init.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_init.c
|+++ sys/dev/smartpqi/smartpqi_init.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_init.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 33.
Hunk #3 succeeded at 67.
Hunk #4 succeeded at 82.
Hunk #5 succeeded at 110.
Hunk #6 succeeded at 133.
Hunk #7 succeeded at 154.
Hunk #8 succeeded at 169.
Hunk #9 succeeded at 185.
Hunk #10 succeeded at 223.
Hunk #11 succeeded at 240.
Hunk #12 succeeded at 264.
Hunk #13 succeeded at 296.
Hunk #14 succeeded at 314.
Hunk #15 succeeded at 327.
Hunk #16 succeeded at 340.
Hunk #17 succeeded at 385.
Hunk #18 succeeded at 609.
Hunk #19 succeeded at 635.
Hunk #20 succeeded at 673.
Hunk #21 succeeded at 690.
Hunk #22 succeeded at 705.
Hunk #23 succeeded at 714.
Hunk #24 succeeded at 750.
Hunk #25 succeeded at 766.
Hunk #26 succeeded at 774.
Hunk #27 succeeded at 793.
Hunk #28 succeeded at 842.
Hunk #29 succeeded at 852.
Hunk #30 succeeded at 877.
Hunk #31 succeeded at 886.
Hunk #32 succeeded at 948.
Hunk #33 succeeded at 1024.
Hunk #34 succeeded at 1067.
Hunk #35 succeeded at 1089.
Hunk #36 succeeded at 1098.
Hunk #37 succeeded at 1119.
Hunk #38 succeeded at 1142.
Hunk #39 succeeded at 1155.
Hunk #40 succeeded at 1185.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_intr.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_intr.c
|+++ sys/dev/smartpqi/smartpqi_intr.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_intr.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 34.
Hunk #3 succeeded at 91.
Hunk #4 succeeded at 113.
Hunk #5 succeeded at 134.
Hunk #6 succeeded at 160.
Hunk #7 succeeded at 171.
Hunk #8 succeeded at 212.
Hunk #9 succeeded at 240.
Hunk #10 succeeded at 254.
Hunk #11 succeeded at 268.
Hunk #12 succeeded at 298.
Hunk #13 succeeded at 326.
Hunk #14 succeeded at 379.
Hunk #15 succeeded at 412.
Hunk #16 succeeded at 437.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_ioctl.h
|===================================================================
|--- sys/dev/smartpqi/smartpqi_ioctl.h
|+++ sys/dev/smartpqi/smartpqi_ioctl.h
--------------------------
Patching file sys/dev/smartpqi/smartpqi_ioctl.h using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 34.
Hunk #3 succeeded at 69.
Hunk #4 succeeded at 77.
Hunk #5 succeeded at 96.
Hunk #6 succeeded at 105.
Hunk #7 succeeded at 136.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_ioctl.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_ioctl.c
|+++ sys/dev/smartpqi/smartpqi_ioctl.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_ioctl.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 37.
Hunk #3 succeeded at 47.
Hunk #4 succeeded at 57.
Hunk #5 succeeded at 99.
Hunk #6 succeeded at 124.
Hunk #7 succeeded at 194.
Hunk #8 succeeded at 207.
Hunk #9 succeeded at 249.
Hunk #10 succeeded at 286.
Hunk #11 succeeded at 339.
Hunk #12 succeeded at 355.
Hunk #13 succeeded at 375.
Hunk #14 succeeded at 389.
Hunk #15 succeeded at 413.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_main.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_main.c
|+++ sys/dev/smartpqi/smartpqi_main.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_main.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 35.
Hunk #3 succeeded at 50.
Hunk #4 succeeded at 132.
Hunk #5 succeeded at 145.
Hunk #6 succeeded at 155.
Hunk #7 succeeded at 175.
Hunk #8 succeeded at 212.
Hunk #9 succeeded at 222.
Hunk #10 succeeded at 243.
Hunk #11 succeeded at 266.
Hunk #12 succeeded at 277.
Hunk #13 succeeded at 325.
Hunk #14 failed at 347.
Hunk #15 failed at 389.
Hunk #16 succeeded at 426 (offset -2 lines).
Hunk #17 failed at 439.
Hunk #18 succeeded at 480 (offset 2 lines).
Hunk #19 succeeded at 497 (offset -2 lines).
Hunk #20 succeeded at 554 (offset 2 lines).
3 out of 20 hunks failed--saving rejects to sys/dev/smartpqi/smartpqi_main.c.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_mem.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_mem.c
|+++ sys/dev/smartpqi/smartpqi_mem.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_mem.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 failed at 30.
Hunk #3 succeeded at 42.
Hunk #4 succeeded at 94.
Hunk #5 succeeded at 118.
Hunk #6 failed at 169.
Hunk #7 succeeded at 187.
Hunk #8 succeeded at 200.
2 out of 8 hunks failed--saving rejects to sys/dev/smartpqi/smartpqi_mem.c.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_misc.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_misc.c
|+++ sys/dev/smartpqi/smartpqi_misc.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_misc.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 31.
Hunk #3 succeeded at 42.
Hunk #4 succeeded at 52.
Hunk #5 failed at 71.
Hunk #6 succeeded at 93 (offset 1 line).
Hunk #7 failed at 102.
Hunk #8 succeeded at 112 (offset 1 line).
Hunk #9 succeeded at 162 (offset 1 line).
2 out of 9 hunks failed--saving rejects to sys/dev/smartpqi/smartpqi_misc.c.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_prototypes.h
|===================================================================
|--- sys/dev/smartpqi/smartpqi_prototypes.h
|+++ sys/dev/smartpqi/smartpqi_prototypes.h
--------------------------
Patching file sys/dev/smartpqi/smartpqi_prototypes.h using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 40.
Hunk #3 succeeded at 55.
Hunk #4 succeeded at 93.
Hunk #5 succeeded at 112.
Hunk #6 succeeded at 129.
Hunk #7 succeeded at 137.
Hunk #8 succeeded at 212.
Hunk #9 succeeded at 229.
Hunk #10 succeeded at 237.
Hunk #11 succeeded at 271.
Hunk #12 succeeded at 295.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_queue.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_queue.c
|+++ sys/dev/smartpqi/smartpqi_queue.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_queue.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 34.
Hunk #3 succeeded at 51.
Hunk #4 succeeded at 60.
Hunk #5 succeeded at 73.
Hunk #6 succeeded at 109.
Hunk #7 succeeded at 123.
Hunk #8 succeeded at 131.
Hunk #9 succeeded at 148.
Hunk #10 succeeded at 160.
Hunk #11 succeeded at 190.
Hunk #12 succeeded at 231.
Hunk #13 succeeded at 240.
Hunk #14 succeeded at 289 with fuzz 2.
Hunk #15 succeeded at 309.
Hunk #16 succeeded at 332.
Hunk #17 succeeded at 343.
Hunk #18 succeeded at 370.
Hunk #19 succeeded at 385.
Hunk #20 succeeded at 408.
Hunk #21 succeeded at 420.
Hunk #22 succeeded at 433.
Hunk #23 succeeded at 443.
Hunk #24 succeeded at 459.
Hunk #25 succeeded at 477.
Hunk #26 succeeded at 492.
Hunk #27 succeeded at 520.
Hunk #28 succeeded at 529.
Hunk #29 succeeded at 539 with fuzz 1.
Hunk #30 succeeded at 549.
Hunk #31 succeeded at 565 with fuzz 1.
Hunk #32 succeeded at 579.
Hunk #33 succeeded at 591.
Hunk #34 succeeded at 599.
Hunk #35 succeeded at 610 with fuzz 1.
Hunk #36 succeeded at 629.
Hunk #37 failed at 672.
Hunk #38 succeeded at 689.
Hunk #39 succeeded at 740.
Hunk #40 succeeded at 761.
Hunk #41 succeeded at 771.
Hunk #42 succeeded at 794.
Hunk #43 succeeded at 817.
Hunk #44 succeeded at 825.
Hunk #45 succeeded at 858.
Hunk #46 succeeded at 900.
Hunk #47 succeeded at 919.
Hunk #48 succeeded at 953.
Hunk #49 succeeded at 969.
Hunk #50 succeeded at 979.
Hunk #51 succeeded at 992.
Hunk #52 succeeded at 1012.
1 out of 52 hunks failed--saving rejects to sys/dev/smartpqi/smartpqi_queue.c.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_request.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_request.c
|+++ sys/dev/smartpqi/smartpqi_request.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_request.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 49.
Hunk #3 succeeded at 61.
Hunk #4 succeeded at 82.
Hunk #5 succeeded at 105.
Hunk #6 succeeded at 122.
Hunk #7 succeeded at 132.
Hunk #8 succeeded at 149.
Hunk #9 succeeded at 167.
Hunk #10 succeeded at 181.
Hunk #11 succeeded at 189.
Hunk #12 succeeded at 216.
Hunk #13 succeeded at 235.
Hunk #14 succeeded at 265.
Hunk #15 succeeded at 281.
Hunk #16 succeeded at 292.
Hunk #17 succeeded at 301.
Hunk #18 succeeded at 318.
Hunk #19 succeeded at 328.
Hunk #20 succeeded at 394.
Hunk #21 succeeded at 403.
Hunk #22 succeeded at 424.
Hunk #23 succeeded at 437.
Hunk #24 succeeded at 481.
Hunk #25 succeeded at 521.
Hunk #26 failed at 603.
Hunk #27 succeeded at 631.
Hunk #28 succeeded at 757.
Hunk #29 succeeded at 801.
Hunk #30 succeeded at 867.
1 out of 30 hunks failed--saving rejects to sys/dev/smartpqi/smartpqi_request.c.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_response.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_response.c
|+++ sys/dev/smartpqi/smartpqi_response.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_response.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 33.
Hunk #3 succeeded at 47.
Hunk #4 succeeded at 57.
Hunk #5 succeeded at 85.
Hunk #6 succeeded at 95.
Hunk #7 succeeded at 176.
Hunk #8 succeeded at 201.
Hunk #9 succeeded at 219.
Hunk #10 succeeded at 231.
Hunk #11 succeeded at 274.
Hunk #12 succeeded at 317.
Hunk #13 succeeded at 348.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_sis.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_sis.c
|+++ sys/dev/smartpqi/smartpqi_sis.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_sis.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 30.
Hunk #3 succeeded at 47.
Hunk #4 succeeded at 82.
Hunk #5 succeeded at 94.
Hunk #6 succeeded at 135.
Hunk #7 succeeded at 162.
Hunk #8 succeeded at 176.
Hunk #9 succeeded at 226.
Hunk #10 succeeded at 249.
Hunk #11 succeeded at 274.
Hunk #12 succeeded at 291.
Hunk #13 succeeded at 306.
Hunk #14 succeeded at 387.
Hunk #15 succeeded at 440.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_structures.h
|===================================================================
|--- sys/dev/smartpqi/smartpqi_structures.h
|+++ sys/dev/smartpqi/smartpqi_structures.h
--------------------------
Patching file sys/dev/smartpqi/smartpqi_structures.h using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 31.
Hunk #3 succeeded at 40.
Hunk #4 succeeded at 57.
Hunk #5 succeeded at 93.
Hunk #6 succeeded at 107.
Hunk #7 succeeded at 188.
Hunk #8 succeeded at 225.
Hunk #9 succeeded at 252.
Hunk #10 succeeded at 276.
Hunk #11 succeeded at 355.
Hunk #12 succeeded at 378.
Hunk #13 succeeded at 388.
Hunk #14 succeeded at 408.
Hunk #15 succeeded at 421.
Hunk #16 succeeded at 437.
Hunk #17 succeeded at 553.
Hunk #18 succeeded at 589.
Hunk #19 succeeded at 705.
Hunk #20 succeeded at 748.
Hunk #21 succeeded at 779.
Hunk #22 succeeded at 795.
Hunk #23 succeeded at 958.
Hunk #24 succeeded at 979.
Hunk #25 succeeded at 1063.
Hunk #26 succeeded at 1086.
Hunk #27 succeeded at 1107.
Hunk #28 succeeded at 1122.
Hunk #29 succeeded at 1150.
Hunk #30 succeeded at 1162.
Hunk #31 succeeded at 1181.
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: sys/dev/smartpqi/smartpqi_tag.c
|===================================================================
|--- sys/dev/smartpqi/smartpqi_tag.c
|+++ sys/dev/smartpqi/smartpqi_tag.c
--------------------------
Patching file sys/dev/smartpqi/smartpqi_tag.c using Plan A...
Hunk #1 succeeded at 1.
Hunk #2 succeeded at 35.
Hunk #3 succeeded at 52.
Hunk #4 succeeded at 77.
Hunk #5 succeeded at 98.
Hunk #6 succeeded at 135.
Hunk #7 succeeded at 195.
Hunk #8 succeeded at 241.
Hunk #9 succeeded at 250.
Hunk #10 succeeded at 264.
done


I did a make clean, then deleted /usr/src and re-extracted a clean src.tar.gz before running freebsd-update fetch && freebsd-update install again.
Then I ran 

patch -l -p 0 < /root/D24428.diff

again.
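
One likely reason for the failed hunks is that the review diff is generated against CURRENT, while this tree is 12.2. A way to separate "bad diff" from "wrong branch" is to fetch the raw diff from Phabricator and dry-run it against a main checkout first (the ?download=true URL is Phabricator's usual raw-diff link and may change):

fetch -o D24428.diff 'https://reviews.freebsd.org/D24428?download=true'
git clone -b main --depth 1 https://git.freebsd.org/src.git src-main
cd src-main
git apply --check --verbose ../D24428.diff    # dry run: lists every hunk that does not apply

If it applies cleanly there but not on releng/12.2, the rejects are genuine 12.x divergence and have to be ported by hand.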
Comment 16 Srikanth 2021-02-24 13:07:38 UTC
Hi,

Does the driver/kernel compile after following the steps mentioned in https://reviews.freebsd.org/D24428 ?
Comment 17 rainer 2021-03-15 21:35:11 UTC
Hi,

I'm not successful here:

(f-hosting <smartpqi>) 0 # git apply --check /root/D24428_2.diff
error: patch failed: sys/dev/smartpqi/smartpqi_cam.c:231
error: sys/dev/smartpqi/smartpqi_cam.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_cmd.c:43
error: sys/dev/smartpqi/smartpqi_cmd.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_defines.h:77
error: sys/dev/smartpqi/smartpqi_defines.h: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_discovery.c:62
error: sys/dev/smartpqi/smartpqi_discovery.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_event.c:35
error: sys/dev/smartpqi/smartpqi_event.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_helper.c:43
error: sys/dev/smartpqi/smartpqi_helper.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_init.c:31
error: sys/dev/smartpqi/smartpqi_init.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_intr.c:32
error: sys/dev/smartpqi/smartpqi_intr.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_ioctl.h:67
error: sys/dev/smartpqi/smartpqi_ioctl.h: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_ioctl.c:53
error: sys/dev/smartpqi/smartpqi_ioctl.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_main.c:134
error: sys/dev/smartpqi/smartpqi_main.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_misc.c:39
error: sys/dev/smartpqi/smartpqi_misc.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_prototypes.h:120
error: sys/dev/smartpqi/smartpqi_prototypes.h: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_queue.c:32
error: sys/dev/smartpqi/smartpqi_queue.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_request.c:47
error: sys/dev/smartpqi/smartpqi_request.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_response.c:85
error: sys/dev/smartpqi/smartpqi_response.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_sis.c:77
error: sys/dev/smartpqi/smartpqi_sis.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_structures.h:29
error: sys/dev/smartpqi/smartpqi_structures.h: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_tag.c:73
error: sys/dev/smartpqi/smartpqi_tag.c: patch does not apply


It seems it checks out HEAD/current.
Is that supposed to happen?
Comment 18 rainer 2021-03-15 21:38:44 UTC
OK,

with

git clone -b releng/12.2 --depth 1 https://git.freebsd.org/src.git src

(f-hosting <smartpqi>) 0 # git apply --check /root/D24428_2.diff
error: patch failed: sys/dev/smartpqi/smartpqi_cam.c:473
error: sys/dev/smartpqi/smartpqi_cam.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_defines.h:856
error: sys/dev/smartpqi/smartpqi_defines.h: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_main.c:312
error: sys/dev/smartpqi/smartpqi_main.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_mem.c:28
error: sys/dev/smartpqi/smartpqi_mem.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_misc.c:69
error: sys/dev/smartpqi/smartpqi_misc.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_queue.c:280
error: sys/dev/smartpqi/smartpqi_queue.c: patch does not apply
error: patch failed: sys/dev/smartpqi/smartpqi_request.c:540
error: sys/dev/smartpqi/smartpqi_request.c: patch does not apply
Comment 19 Srikanth 2021-03-26 15:12:46 UTC
(In reply to rainer from comment #18)
Can you please apply the latest patch on 12.2?
Comment 20 Peter 2021-08-16 13:19:20 UTC
Hopping on this thread as I have the exact same issue. I have spent significant time attempting to debug this and will share what I've found so far. I have two nearly identical systems, HPE DL180 Gen10 with P816i controllers. One has SATA disks, the other SAS. Only the system with SAS disks seems to be affected. Only a ZFS scrub triggers this panic; the system is otherwise stable. The hardware has been verified OK by successfully completing a scrub under CentOS 8.4 with 0 errors. I have been able to reproduce this on every OS/driver/firmware combination up to and including:

FreeBSD 13.0
Microsemi driver v4130 (8/5/2021)
HPE SmartArray Firmware 3.53

I'm willing to help debug as this is a 100% reproducible issue: sometimes within the first 1% of scrub progress, but never past 8-9%.
Comment 21 rainer 2021-08-16 16:28:11 UTC
I believe this has been MFC'ed to 13-stable a while ago and 12-STABLE recently.

Please try that.

https://cgit.freebsd.org/src/log/sys/dev/smartpqi?h=stable/13

https://cgit.freebsd.org/src/commit/sys/dev/smartpqi?h=stable/13&id=1569aab1cb38a38fb619f343ed1e47d4b4070ffe

For me (DL 380 Gen 10 with P408i + Microsemi Smart-RAID 3154-8i) it works without issue, so far.
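
To double-check that a given checkout really contains that MFC before building, something along these lines works against a full (non-shallow) clone:

git clone -b stable/13 https://git.freebsd.org/src.git src && cd src
git merge-base --is-ancestor 1569aab1cb38a38fb619f343ed1e47d4b4070ffe HEAD \
    && echo "smartpqi MFC present" || echo "smartpqi MFC missing"
git log --oneline -3 -- sys/dev/smartpqi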
Comment 22 Peter 2021-08-16 17:07:22 UTC
(In reply to rainer from comment #21)

I don't understand what you're asking me to try. The latest drivers from Microsemi are newer than anything in your links. Are you implying that the official latest drivers don't contain these patches? Or the included driver in 13.0 doesn't contain these patches?
Comment 23 rainer 2021-08-16 18:13:02 UTC
Ah, OK.

Sorry - when I checked, there were no newer drivers on Microsemi's homepage.

I have one DL380 Gen 10 with P408 that I still have a scrub to run on. The other two show no problems.

Maybe you can try 13-stable and if there's a problem, comment on the differential and open a new PR here?

The biggest problem is that none of the committers actually have access to the hardware and are thus reliant on 3rd-parties like us for verification.
Comment 24 Peter 2021-08-16 19:17:52 UTC
(In reply to rainer from comment #23)

As before, I can reproduce this bug on all versions of FreeBSD, up to and including 13.0 stable. I'd prefer not to split the thread as it is the same unresolved issue present in 12.x.
Comment 25 rainer 2021-08-16 19:22:17 UTC
Yes, do so and post the PR here.


How many drives do you have?
Comment 26 Peter 2021-08-16 19:55:53 UTC
(In reply to rainer from comment #25)

See bug #257890
12x Seagate ST16000NM002G
Comment 27 Warner Losh freebsd_committer 2021-08-17 22:55:53 UTC
As far as I know, I've committed all the smartpqi drivers from microsemi. The 13.0 and -current drivers are identical. The 12.x driver has a few differences, but I don't believe they will affect its operation on a 12.x kernel. The latest drivers are not yet in a release, though, so you'd have to test on -stable (which it looks like you are doing).

If there's newer drivers on the microsemi website, can someone point me at them? There were long delays in getting their last release in and I'd like to avoid that in the future by keeping more on top of it.
Comment 28 Peter 2021-08-18 12:06:04 UTC
(In reply to Warner Losh from comment #27)

These are the drivers that were tested after the BSD-included drivers failed.
They have the same issue.

https://storage.microsemi.com/en-us/speed/raid/aac/unix/smartpqi_freebsd_v4130.0.1008_tgz.php
Comment 29 Warner Losh freebsd_committer 2021-08-18 14:38:21 UTC
(In reply to Peter from comment #28)
Yea, the drivers that I found had no source included so including them in FreeBSD is going to be tough.
Comment 30 Mirco Schmidt 2021-08-30 14:58:17 UTC
I was hit by that one too...

Running a "HPE DL380 Gen10" with a "HPE Smart Array P816i-a SR Gen10"

pqi_map_request: bus_dmamap_load_ccb failed error

All this using HPE's latest Firmware for the controller: 
HPE Smart Array P816i-a SR Gen10 	3.53

Hope there'll be a fix soon!
Comment 31 Peter 2021-08-30 16:37:12 UTC
(In reply to Mirco Schmidt from comment #30)

What model# and quantity of disks do you have in this system?
Comment 32 benoitc 2021-08-30 17:46:52 UTC
I have the same kind of error, but with a UFS disk: pqi_map_request: bus_dmamap_load_ccb failed error. This is using an E208i-a SR Gen10, 2 disks in mirror, the others in HBA mode.
Comment 33 benoitc 2021-08-30 17:47:41 UTC
(In reply to benoitc from comment #32)
On the latest release, 13.0-RELEASE-p4.
Comment 34 Mirco Schmidt 2021-08-31 08:52:57 UTC
(In reply to Peter from comment #31)

Hi hi,

I've got 5 x 8TB 7.2k SAS disks (MB008000JWJRQ) behind the P816i-a. In addition to that, there are 2 x 480GB NVMe SSDs on the "HPE NS204i-p Gen10+ Boot Controller", which I intend to use as log & ZIL (VS000480KXALB), and two 240 mSATA SSDs from which I now boot the Proxmox that I had to set up yesterday because the BSD was repeatedly crashing! And the machine had to go live that day...

So I'm now running the BSD from inside KVM with those 5 disks passed through to the VM, and it is stable & fast ;-)
Comment 35 Mirco Schmidt 2021-08-31 08:53:20 UTC
(In reply to Peter from comment #31)

Hi hi,

I've got 5 x 8TB 7.2k SAS disks (MB008000JWJRQ) behind the P816i-a. In addition to that, there are 2 x 480GB NVMe SSDs on the "HPE NS204i-p Gen10+ Boot Controller", which I intend to use as log & ZIL (VS000480KXALB), and two 240 mSATA SSDs from which I now boot the Proxmox that I had to set up yesterday because the BSD was repeatedly crashing! And the machine had to go live that day...

So I'm now running the BSD from inside KVM with those 5 disks passed through to the VM, and it is stable & fast ;-)
Comment 36 Peter 2021-08-31 12:03:13 UTC
(In reply to Mirco Schmidt from comment #35)

Thanks for sharing that - my suspicion is that this issue is related to SAS transport. My systems with SATA disks do not have this issue, but only having one system with SAS disks didn't seem like enough of a sample size. As an added bonus those look like HPE disks so HPE can stop screeching about compatibility issues being the cause of this. ;)
Comment 37 Mirco Schmidt 2021-08-31 15:29:45 UTC
(In reply to Peter from comment #36)

If that is the case (HPE moaning about issues with "unsupported" disks) consider me your testbed!

I'm willing to prove this anytime HPE comes up with a change or firmware upgrade... I can easily drive to the client, drop in a USB stick, boot up to BSD and check whether the upgrade from HPE fixes the issue ;-)
Comment 38 benoitc 2021-08-31 16:37:20 UTC
(In reply to Peter from comment #36)
The disks I have are also HPE drives. What would be a possible fix if it's due to the SAS transport?
Comment 39 Peter 2021-08-31 16:53:41 UTC
(In reply to Mirco Schmidt from comment #37)
I've already ruled out hardware as the issue. My system performs flawlessly under CentOS 8.4. I intend to follow up with HPE when we have a resolution and will definitely let them know HPE disks are also affected. These events are intermittently logged as a hardware failure by the system bios (this seems dependent on the driver version, though), which is why HPE was originally involved.

(In reply to benoitc from comment #38)
Are your disks SATA or SAS? Post the model number if you can find it. All indicators currently point to this being a driver issue and I'm trying to collect as much information as possible for the devs at Microsemi.
Comment 40 benoitc 2021-08-31 16:59:44 UTC
(In reply to Peter from comment #39)
SAS disks: 2 x 300GB (EG000300JWEBF) in RAID 1 and 2 x 2TB (MM2000JEFRC) in HBA mode
Comment 41 benoitc 2021-08-31 17:02:34 UTC
(In reply to benoitc from comment #40)
The RAID 1 is mounted as UFS, while the two others are in a ZFS pool.
Comment 42 benoitc 2021-09-02 08:45:19 UTC
It seems using the latest driver from 13-STABLE applied on releng/13.0 worked for me. I am no longer using hardware RAID, only 2 ZFS pools (2 x 300GB and 2 x 2TB).


Commits used:


commit 2c98463a296974dec38707b3c346c570dbfb3630 (HEAD -> releng/13.0)
Author: Edward Tomasz Napierala <trasz@FreeBSD.org>
Date:   Fri May 28 00:33:37 2021 -0600

    smartpqi: clear CCBs allocated on the stack

    Differential Revision:          https://reviews.freebsd.org/D30299

    (cherry picked from commit e20e60be501204c3ba742e266afecc6c6e498a6c)

commit 0ea861c05c484f5fcc8c1cc36c70f842daef04b1
Author: PAPANI SRIKANTH <papani.srikanth@microchip.com>
Date:   Fri May 28 00:17:56 2021 -0600

    Newly added features and bug fixes in latest Microchip SmartPQI driver

    It includes:

    1)Newly added TMF feature.
    2)Added newly Huawei & Inspur PCI ID's
    3)Fixed smartpqi driver hangs in Z-Pool while running on FreeBSD12.1
    4)Fixed flooding dmesg in kernel while the controller is offline during in ioctls.
    5)Avoided unnecessary host memory allocation for rcb sg buffers.
    6)Fixed race conditions while accessing internal rcb structure.
    7)Fixed where Logical volumes exposing two different names to the OS it's due to the system memory is overwritten with DMA stale data.
    8)Fixed dynamically unloading a smartpqi driver.
    9)Added device_shutdown callback instead of deprecated shutdown_final kernel event in smartpqi driver.
    10)Fixed where Os is crashed during physical drive hot removal during heavy IO.
    11)Fixed OS crash during controller lockup/offline during heavy IO.
    12)Fixed coverity issues in smartpqi driver
    13)Fixed system crash while creating and deleting logical volume in a continuous loop.
    14)Fixed where the volume size is not exposing to OS when it expands.
    15)Added HC3 pci id's.

    Reviewed by:            Scott Benesh (microsemi), Murthy Bhat (microsemi), imp
    Differential Revision:  https://reviews.freebsd.org/D30182

    (cherry picked from commit 9fac68fc3853b696c8479bb3a8181d62cb9f59c9)
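
A rough sketch of one way to apply those two fixes to a releng/13.0 source tree and rebuild (untested as written here; the hashes are the upstream main commits quoted above, and GENERIC is assumed as the kernel config):

        # pick the two upstream commits onto the release branch
        git -C /usr/src checkout releng/13.0
        git -C /usr/src cherry-pick 9fac68fc3853b696c8479bb3a8181d62cb9f59c9
        git -C /usr/src cherry-pick e20e60be501204c3ba742e266afecc6c6e498a6c
        # rebuild and install the kernel (the smartpqi driver is built with it), then reboot
        make -C /usr/src buildkernel KERNCONF=GENERIC
        make -C /usr/src installkernel KERNCONF=GENERIC
        shutdown -r now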
Comment 43 benoitc 2021-09-02 08:47:37 UTC
(In reply to benoitc from comment #42)

 $ dmesg |grep -i smart
smartpqi0: <E208i-a SR Gen10> port 0xc000-0xc0ff mem 0xf3800000-0xf3807fff at device 0.0 numa-domain 0 on pci12
smartpqi0: using MSI-X interrupts (16 vectors)
ses2 at smartpqi0 bus 0 scbus16 target 68 lun 0
ses2: <HPE Smart Adapter 3.53> Fixed Enclosure Services SPC-3 SCSI device
pass7 at smartpqi0 bus 0 scbus16 target 1088 lun 1
da3 at smartpqi0 bus 0 scbus16 target 67 lun 0
da2 at smartpqi0 bus 0 scbus16 target 66 lun 0
da0 at smartpqi0 bus 0 scbus16 target 64 lun 0
da1 at smartpqi0 bus 0 scbus16 target 65 lun 0


 $ sudo camcontrol devlist
<AHCI SGPIO Enclosure 2.00 0001>   at scbus6 target 0 lun 0 (ses0,pass0)
<AHCI SGPIO Enclosure 2.00 0001>   at scbus15 target 0 lun 0 (ses1,pass1)
<HP EG000300JWEBF HPD4>            at scbus16 target 64 lun 0 (da0,pass2)
<HP EG000300JWEBF HPD4>            at scbus16 target 65 lun 0 (da1,pass3)
<HP MM2000JEFRC HPD8>              at scbus16 target 66 lun 0 (da2,pass4)
<HP MM2000JEFRC HPD8>              at scbus16 target 67 lun 0 (da3,pass5)
<HPE Smart Adapter 3.53>           at scbus16 target 68 lun 0 (ses2,pass6)
<HPE E208i-a SR Gen10 3.53>        at scbus16 target 1088 lun 1 (pass7)
Comment 44 Peter 2021-09-02 12:25:33 UTC
(In reply to benoitc from comment #42)

Good to know there's progress being made. The latest driver dated 8/5/21 still contains this issue though. In all fairness, this only became a problem for me after the addition of disks to the system (4x->12x 16TB). It was stable for over a year prior to the addition.
Comment 45 Palle Girgensohn freebsd_committer 2021-11-02 13:57:33 UTC
Hi!

I also have problems with this controller. With 13.0 installed, it crashed quite quickly under just intermediate IO load. After upgrading to -STABLE on October 12 2021, the system is quite stable, BUT when restoring PostgreSQL databases with pg_restore -j 5 (five writes in parallel), the database later reports checksum errors when reading some blocks back.

This seems to happen mainly for big database indexes that were generated in parallel.

I didn't notice until I took a pg_basebackup, because PostgreSQL does not validate the checksum until a block is read.

Sorry, lots of database methods, not necessarily common knowledge for SCSI experts. A pg_basebackup basically copies all the files, quite similar to an rsync, but optionally also validates, as it reads the data, a CRC checksum that was calculated for each block as it was written.

pg_restore reads a database dump, writes all the data to disk and creates the indexes using SQL CREATE INDEX commands, that is, it reads back the written files, calculates the indexes and writes them out.
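
For illustration, a minimal sketch of how such errors surface (paths are examples; this assumes the cluster was initialized with data checksums enabled):

        # take a base backup; block checksums are verified as the data is read back
        pg_basebackup -D /backup/base -X stream -P
        # any "checksum verification failed" warnings point at blocks whose on-disk
        # contents no longer match the CRC computed when they were written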

For about 1.3 TB of database data, the system had 2324 blocks with checksum errors. All but two of them were in indexes, which kind of suggests that this *could* be a PostgreSQL issue, but given the number of users running PostgreSQL as opposed to the number of users running this controller with FreeBSD, I'm reluctant to discredit PostgreSQL here. We should have heard of it if there was a problem with PostgreSQL?

Since most errors were in the indexes, they could be reindexed, and I managed to fix the one data table that was broken, so at the moment my data seems to be safe, but I do not trust this controller-driver-OS combo much at the moment.

Is there anything I can do to help find a solution to the problem? I'm considering moving the databases back to an old "trusted" box, so if it could help, I could perhaps supply you with a login to the box in a week or so? Would that help? It has iLO for remote console as well.

I am using the built in RAID:

$ dmesg |grep -i smart
smartpqi0: <P408i-a SR Gen10> port 0x8000-0x80ff mem 0xe6c00000-0xe6c07fff at device 0.0 numa-domain 0 on pci9
smartpqi0: using MSI-X interrupts (32 vectors)
da0 at smartpqi0 bus 0 scbus0 target 0 lun 1
da1 at smartpqi0 bus 0 scbus0 target 0 lun 2
ses0 at smartpqi0 bus 0 scbus0 target 72 lun 0
ses0: <HPE Smart Adapter 3.53> Fixed Enclosure Services SPC-3 SCSI device
pass3 at smartpqi0 bus 0 scbus0 target 1088 lun 1

$ sudo camcontrol devlist
<HPE RAID 1(1+0) OK>               at scbus0 target 0 lun 1 (pass0,da0)
<HPE RAID 1(1+0) OK>               at scbus0 target 0 lun 2 (pass1,da1)
<HPE Smart Adapter 3.53>           at scbus0 target 72 lun 0 (ses0,pass2)
<HPE P408i-a SR Gen10 3.53>        at scbus0 target 1088 lun 1 (pass3)
<Generic- SD/MMC CRW 1.00>         at scbus1 target 0 lun 0 (da2,pass4)
Comment 46 Palle Girgensohn freebsd_committer 2021-11-02 13:58:25 UTC
(In reply to Palle Girgensohn from comment #45)
...and I'm using UFS, btw. It works better than ZFS for PostgreSQL.
Comment 47 Peter 2021-11-02 14:21:26 UTC
(In reply to Palle Girgensohn from comment #45)

The best tidbit I have to offer at the moment is that I distinctly remember large amounts of ZFS checksum errors on reads under load using a particular version of the smartpqi driver. Unfortunately I don't remember exactly which version(s). After performing a scrub under CentOS, my mind was at ease knowing the integrity of the data written to disk was 100% and that these checksum errors on reads were due to a driver issue. I can't say with any certainty that's what's happening in your case, but it may be worth the peace of mind to investigate.
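
If it helps, a minimal sketch of that kind of verification on the ZFS side (pool name is an example):

        # re-read and verify every block in the pool
        zpool scrub datapool
        # watch progress and the per-vdev read/write/checksum error counters
        zpool status -v datapool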
Comment 48 Palle Girgensohn freebsd_committer 2021-11-02 14:36:11 UTC
(In reply to Peter from comment #47)
It seems to me that I already have the latest version in the FreeBSD ports tree, and Google does not really help me find anything newer. I agree it seems like a driver issue, but I am not sure how to solve it without upgrading the software?
Comment 49 Peter 2021-11-02 15:13:48 UTC
(In reply to Palle Girgensohn from comment #48)
The link in comment #28 contains the latest compiled driver from Microsemi, as well as archived links of all previous versions.
Comment 50 rainer 2021-11-09 20:21:57 UTC
I may be able to further test this - if our customer decides to order the hardware.

This would be 24x 1.8TB SAS, likely on an HP P408i.

Sometime near the end of the year.

I don't have spare hardware for this sitting around, especially not with that many drives (each of these servers is around 20k CHF...)

I will update this ticket once it actually materializes (lead time for those is usually weeks).
Comment 51 Warner Losh freebsd_committer 2021-11-09 20:46:27 UTC
As always, I'd love to have the latest SOURCES in the base system, so if there are changes needed, I'm happy to usher them into the system. I believe that I have the latest publicly available ones there now.
Comment 52 Hermes T K 2021-11-11 05:36:01 UTC
Hi,

In FreeBSD 13.0, while running IO with a 1MB block size, we observed corruption in the received SGL.
This corrupted SGL was leading to a FW lockup.
With this, the driver hangs & crashes.
The incomplete SGL is observed during IO with higher transfer sizes.

Created a FreeBSD Bugzilla ticket for this issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=259129
But there has been no update on that ticket so far.

The workaround for this issue is to reduce the maximum transfer size of the IO.
With the attached patch, I reduced the transfer size & was no longer observing the
issue.

Thanks & Regards
Hermes T K
Comment 53 Peter 2021-11-11 13:12:00 UTC
(In reply to Hermes T K from comment #52)

From bug #259129...
>When we tried in FreeBSD 12.2, the maximum block size allowed to run in fio is 128k.

>We are suspecting some issue in SGL handling with FreeBSD 13.0.


The issue I'm having affects FreeBSD 12.X and 13.X with identical symptoms. That said, if you feel you have a workaround I'll gladly test it. You mentioned an attached patch, but I don't see one here or on the other ticket - only a log file.
Comment 54 Hermes T K 2021-11-11 14:41:06 UTC
Created attachment 229431 [details]
Attaching the changes , reducing the maximum transfer size
Comment 55 Warner Losh freebsd_committer 2021-11-11 15:56:51 UTC
(In reply to Peter from comment #53)
A workaround for people w/o the patch is to set hw.maxphys=131072 which will have the same effect and likely not affect anything else in the system.
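
For example, a minimal sketch of applying that workaround via /boot/loader.conf (tunable name as given above; verify the effective value after reboot, since the exact sysctl name can differ between releases):

        # cap the maximum I/O transfer size at 128k from the next boot onward
        echo 'hw.maxphys="131072"' >> /boot/loader.conf
        shutdown -r now
        # after reboot, confirm the value actually in effect
        sysctl -a | grep -i maxphys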

A question to the microsemi folks: What's the limit the firmware can do?
Comment 56 Hermes T K 2021-11-12 14:26:04 UTC
(In reply to Warner Losh from comment #55)

Question: What's the limit the firmware can do?
Answer: 2MB is the maximum transfer size
Comment 57 Warner Losh freebsd_committer 2021-11-12 16:32:28 UTC
(In reply to Hermes T K from comment #56)
Thanks Hermes

> Answer: 2MB is the maximum transfer size

I wonder why setting the transfer limit to 128k fixed it for the reporter then. maxphys is only 1MB and should be well under that maximum.
Comment 58 Peter 2021-11-12 16:38:42 UTC
(In reply to Warner Losh from comment #57)

Unless I'm missing something, it doesn't appear the maxphys parameter is configurable on 12.2, only on 13.0 AFAICT. I'm working on testing this, but my production system runs on 12.2 so some additional steps are required to do so. I'm planning on testing this weekend and will report back.
Comment 59 Peter 2021-11-12 22:33:38 UTC
Created attachment 229457 [details]
Screencap showing maxphys=131072 and subsequent failure

Fail... no change in behavior
Comment 60 Nils Beyer 2021-12-09 12:15:22 UTC
Hi,

any updates on this? I'm using three Adaptec 1100-4i HBAs, each connected to a separate SuperMicro BPN-SAS3-216EL1 backplane, for a total of 72 bays.

My zpool is created with 67 SSDs in a simple "RAID0"-config:

        zpool create atime=off mountpoint=none test da0 [..] da66

and each time I can reliably lock up a random controller by creating enough load using:

        dd if=/dev/zero of=/mnt/test.dat bs=100M

and, after about five minutes, a parallel

        zpool scrub test

with following kernel messages:

        [...heartbeat...] controller is offline
        [...take_ctrl-offline...] Controller FW is not runniung. Lockup code = 1403a

The Adaptec HBA shows after reboot:

        1719-Slot 10 A controller failure event occurred prior to this power-up
          Previous lock up code=0001403A
        POST Messages Ended. Press any key to continue.

I even tried only one Adaptec 1100 HBA and the three backplanes as a cascade; but the controller locks up using this config as well...



TIA and BR,
Nils
Comment 61 Barry van Someren 2021-12-18 20:45:57 UTC
Hi all,

Just adding myself to this issue. I've recently purchased an HP DL380 Gen10 with the P816i-a SR; see details below:

smartpqi0: <P816i-a SR Gen10> port 0x7000-0x70ff mem 0xe6f00000-0xe6f07fff at device 0.0 numa-domain 0 on pci9

This server is slotted with 10 HPE-branded 8TB SAS disks in passthrough:
da0: <HPE MB008000JWRTD HPD2> Fixed Direct Access SPC-4 SCSI device

I'm running FreeBSD 13.0 Release:

root@storage00:/home/coffeesprout # freebsd-version -kru
13.0-RELEASE-p4
13.0-RELEASE-p4
13.0-RELEASE-p5

And I've tried setting maxphys as a workaround, but as comment #59 said, it doesn't fix the issue.

Btw, in my case a scrub seems to work fine because of the small dataset on this machine. I can reliably trigger the issue by installing python, as it hangs on fsyncing the extract step.
I'm guessing any large unarchive action might fail like this.

See also:

root@storage00:/home/coffeesprout # pkg install python3
Updating FreeBSD repository catalogue...
FreeBSD repository is up to date.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The following 2 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
	python3: 3_3
	python38: 3.8.12

Number of packages to be installed: 2

The process will require 116 MiB more space.

Proceed with this action? [y/N]: y
[1/2] Installing python38-3.8.12...
[1/2] Extracting python38-3.8.12: 100%
load: 0.03  cmd: pkg 30374 [zcw->zcw_cv] 61.48r 1.93u 1.07s 0% 71116k
mi_switch+0xc1 _cv_wait+0xf2 zil_commit_impl+0xeed zfs_fsync+0x7f kern_fsync+0x192 amd64_syscall+0x10c fast_syscall_common+0xf8

At this point anything touching IO (read or write) will just hang and I need to reboot the machine.

The machine is still in my homelab, so if there is anything you want me to try, let me know, as I'd love to get the ball rolling.
Unfortunately I'm not much of a C dev myself.
Comment 62 Barry van Someren 2021-12-19 11:10:50 UTC
(In reply to Barry van Someren from comment #61)

After a good night of sleep and some potent coffee I can confirm that upgrading to the latest stable/13 fixes the issue, or at least I can install packages normally again.
I'll be doing more tests to prepare this machine for production.

Any reason why this is not in releng/13 yet, and is there anything I can do to help?

My working configuration is:

FreeBSD 13 Stable (19 Dec)
root@storage00:/usr/src # git rev-parse HEAD
96787c2ffe6da94e158172428608250e29584a74

Smart Controller and disks (missed in my last post, sorry):

root@storage00:/usr/src # camcontrol devlist
<HPE MB008000JWRTD HPD2>           at scbus0 target 64 lun 0 (pass0,da0)
<HPE MB008000JWRTD HPD2>           at scbus0 target 65 lun 0 (pass1,da1)
<HPE MB008000JWRTD HPD2>           at scbus0 target 66 lun 0 (pass2,da2)
<HPE MB008000JWRTD HPD2>           at scbus0 target 67 lun 0 (pass3,da3)
<HPE MB008000JWRTD HPD2>           at scbus0 target 68 lun 0 (pass4,da4)
<HPE MB008000JWRTD HPD2>           at scbus0 target 69 lun 0 (pass5,da5)
<HPE MB008000JWRTD HPD2>           at scbus0 target 70 lun 0 (pass6,da6)
<HPE MB008000JWWQP HPD6>           at scbus0 target 71 lun 0 (pass7,da7)
<HPE MB008000JWWQP HPD6>           at scbus0 target 72 lun 0 (pass8,da8)
<HPE MB008000JWWQP HPD6>           at scbus0 target 73 lun 0 (pass9,da9)
<HPE Smart Adapter 4.11>           at scbus0 target 74 lun 0 (pass10,ses0)
<HPE P816i-a SR Gen10 4.11>        at scbus0 target 1088 lun 1 (pass11)
<Generic- SD/MMC CRW 1.00>         at scbus1 target 0 lun 0 (da10,pass12)

No special sysctls set this time.
Comment 63 Barry van Someren 2021-12-19 18:25:53 UTC
(In reply to Nils Beyer from comment #60)

Final update for now. Looks like I spoke too soon. While the system is a lot more stable right now, when I really push IO for a long time the array totally locks up and the server panics.
Something as simple as installing postgresql14 and running pgbench -i -s 50000 crashes the machine at 10%.
FreeBSD can't reach the disks anymore even after a reboot, and I'm getting a similar error to Nils's in the BIOS.

Eventually the controller comes back online but it looks like my zpool got totally broken.

In case this is helpful to anybody, I managed to save the logs from when the system locks up, but I feel this is a separate issue from this one.

I'm going to run similar tests using Proxmox to see if it's caused by the driver code or whether the hardware is just wonky, since it is second hand (all the hardware was produced at the start of 2020).
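
For anyone wanting to recreate that load, a rough sketch of the pgbench step (assumes a running PostgreSQL 14 instance, run as the database superuser; the database name is an example):

        # build a very large pgbench dataset; the bulk load plus index builds
        # generate sustained writes and fsyncs similar to what hung the array
        createdb bench
        pgbench -i -s 50000 bench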
Comment 64 Nils Beyer 2021-12-21 01:55:54 UTC
(In reply to Barry van Someren from comment #63)

Hi Barry,

thank you very much for your experience report. For what it's worth, I redid my test under Arch Linux and Debian - both with ZFS. There I did not manage to get the HBAs to produce a firmware lockup, but I did notice that sometimes there was a small "hang" in IO transfer, with corresponding messages in the kernel log that some devices were reset. But no panic or complete lockup. After that small "hang" everything went on normally; no corruptions.

Then I did the same test with FreeBSD 13.0-STABLE in a Dell chassis without an expander backplane (only four drives) and had no problems at all. No lockups, panics or zpool corruptions.

So it seems that the Adaptec/Microsemi controllers - no matter if HBA or RAID controller - have problems with expander backplanes.

No idea if that's fixable by driver or if it's a hardware issue...
Comment 65 Barry van Someren 2021-12-23 09:34:27 UTC
(In reply to Nils Beyer from comment #64)
Hi Nils,

Having re-run the tests with FreeBSD inside a virtual machine on Proxmox with the same hardware, I've not seen anything in the logs indicating a drive/HBA hang, but it's always a possibility.
The test ran at about the same speed and there were no further issues.

In your testing, did you try setting the maxphys parameter as mentioned in comment #55?
Because the system didn't boot back up I moved to testing with Proxmox.

It's the only thing I don't really see in this ticket: somebody who has tested the latest stable/13 with this parameter set or the patch applied. As mentioned in bug #259129, the controller doesn't like receiving transfers larger than 2MB.

I'm a little short on time, but I'll try and see if I can squeeze in one more test with:

FreeBSD 13 Stable (I'll need to redo the OS)
hw.maxphys=131072 set in the /boot/loader.conf
Postgresql 14 pgbench

I'd really just rather run FreeBSD than try to coax Linux into doing the ZFS on root thing :-)
Comment 66 Hermes T K 2021-12-28 10:08:57 UTC
Created attachment 230486 [details]
Attaching the driver file for FreeBSD 13.0
Comment 67 Hermes T K 2021-12-28 10:12:50 UTC
Created attachment 230487 [details]
Attaching the driver file for FreeBSD 13.0
Comment 68 Hermes T K 2021-12-28 10:18:32 UTC
(In reply to Peter from comment #59)
Hi,

I have made some code changes & tested them on FreeBSD 13.0.
With the changes, I did not observe the issue.
Can you please test with the attached driver file & update?

Thanks & Regards
Hermes T K
Comment 69 Peter 2021-12-31 16:55:08 UTC
(In reply to Hermes T K from comment #68)
Thank you Hermes - I will test once I'm back in the building. Probably 2nd week of January and will definitely let you know.
Comment 70 Peter 2022-01-10 22:55:29 UTC
Created attachment 230891 [details]
Screencap showing new driver loaded and subsequent failure

(In reply to Hermes T K from comment #68)
Hermes - Unfortunately there's no change in behavior
Comment 71 Hermes T K 2022-01-11 18:00:10 UTC
(In reply to Peter from comment #70)
Thanks Peter for testing & updating.

Can you please share the server configuration, controller & drive details?



Thanks & Regards
Hermes T K
Comment 72 Hermes T K 2022-01-11 18:00:34 UTC
(In reply to Peter from comment #70)
Thanks Peter for testing & updating.

Can you please share the server configuration, controller & drive details?



Thanks & Regards
Hermes T K
Comment 73 Peter 2022-01-11 19:01:43 UTC
(In reply to Hermes T K from comment #72)

HPE DL180 Gen10, system firmware U31 v2.40
1x Intel Xeon Silver 4208 CPU @ 2.10GHz
HPE SmartArray p816i w/ firmware v3.53
12x Seagate ST16000NM002G as a RAID-Z3 all with firmware E003
Comment 74 Mergen A 2022-01-18 06:42:12 UTC
Hi,

I encountered the same issue: after intensive IO operations, the ZFS pool (zdata) hung, and after a reboot the system got stuck on the same error as below (copied and pasted from one of the comments above):

[167] [ERROR]::[17:655.0][0,84,0][CPU 7][pqi_map_request][540]:bus_dmamap_load_ccb failed = 36 count = 131072
[167] [WARN]:[17:655.0][CPU 7][pqisrc_io_start][794]:In Progress on 84

Entered single-user mode, destroyed the partitions of the disks used in the ZFS pool and could then boot to multi-user. I have a separate ZFS pool for the OS (zroot) and a separate one for data (zdata).

Could reproduce that error by running bonnie++ (benchmarks/bonnie++ port) with the command below:

bonnie++ -u root -r 1024 -s 16384 -d /zdata -f -b -n 1 -c 4

Currently upgrading my server with the HPE Service Pack for ProLiant image (HPE SPP Gen10 2021.10).

My hardware is HPE ProLiant DL380 Gen10

CPU: 2x Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
RAM: 512 GB RAM
RAID/HBA: HPE Smart Array P408i-a SR Gen10
Disks:
- zdata
  - 15x 8TB SATA HDD MB8000GFECR (Bay 1-15)
- zroot
  - 2x 300GB SAS HDD EH000300JWHPL (Bay 21, Bay 23)
All are connected to Port 1

After the upgrade I will test again and write more.
Best Regards,
Mergen
Comment 75 Mergen A 2022-01-18 07:21:54 UTC
Forgot to add: a raidz2 with 13 8TB disks was used, on the latest FreeBSD version.

# freebsd-version
13.0-RELEASE-p6

Please write if there is anything I can check to help.

BR,
Mergen
Comment 76 Mergen A 2022-01-18 08:17:15 UTC
Created a new zdata pool, started the test and in parallel started `zpool iostat zdata 2`.


Got the error below in `/var/log/messages`:

Jan 18 12:58:27 bigpotato kernel: [ERROR]::[92:655.0][0,69,0][CPU 8][pqi_map_request][551]:bus_dmamap_load_ccb failed = 36 count = 995328
Jan 18 12:58:27 bigpotato kernel: [WARN]:[92:655.0][CPU 8][pqisrc_io_start][802]:In Progress on 69

# zpool iostat zdata 2
              capacity     operations     bandwidth 
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zdata       1.20M  94.6T      2     62   147K  1.31M
zdata       1.20M  94.6T      0      0      0      0
zdata       1.20M  94.6T      0      0      0      0
zdata       1.20M  94.6T      0      0      0      0
zdata       1.20M  94.6T      0      0      0      0
zdata       1.20M  94.6T      0      0      0      0
zdata       1.20M  94.6T      0      0      0      0
zdata       1.73M  94.6T      0  1.34K      0   136M
zdata       1000M  94.6T      0  9.79K      0   559M
zdata       1000M  94.6T      0  13.9K      0   465M
zdata       3.32G  94.6T      0  17.4K      0   558M
zdata       3.32G  94.6T      0  19.2K      0   613M
zdata       3.32G  94.6T      0  19.1K      0   613M
zdata       6.26G  94.6T      0  17.5K      0   561M
zdata       6.26G  94.6T      0  17.1K      0   567M
zdata       6.26G  94.6T      0  17.5K      0   562M
zdata       9.19G  94.6T      0  17.0K      0   540M
zdata       9.19G  94.6T      0  19.1K      0   621M
zdata       12.1G  94.6T      0  19.7K      0   616M
zdata       12.1G  94.6T      0  20.1K      0   632M
zdata       15.1G  94.6T      0  20.3K      0   619M
zdata       15.1G  94.6T      0  17.6K      0   591M
zdata       15.1G  94.6T      0  4.63K      0   160M
              capacity     operations     bandwidth 
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zdata       15.1G  94.6T      0      0      0      0
zdata       15.1G  94.6T      0      0      0      0



* cannot create any new files on the system, but can open new shells (I used tmux and can create new windows and panes running as root).

* could connect via ssh as a non-root user.

* could create new file via `echo "test" > new_file.txt`

* could not save a new file via vi (vi other_new_file.txt and :wq); vi hung and could not do anything.

# ps aux | grep vi
ykjam    63876    0.0  0.0 16396  4768  5  D+   13:03      0:00.01 vi test.txt
root     63892    0.0  0.0 12868  2408  7  S+   13:07      0:00.00 grep vi

* could run `zpool list`; result below:

# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zdata  94.6T  15.1G  94.6T        -         -     0%     0%  1.00x    ONLINE  -
zroot   276G  13.7G   262G        -         -     2%     4%  1.00x    ONLINE  -

* executed `zpool scrub zdata`; the command hung, and running `zpool list` again also hung.

* running `camcontrol devlist` also hangs.

BR,
Mergen
Comment 77 Mergen A 2022-01-18 11:27:21 UTC
I have created a hardware RAID 1+0 volume with 12x 8TB disks, created a single-disk ZFS pool on that volume, ran the test, and it passed without any problems.

Is there anything else I can test to help debug this issue?
Comment 78 Mergen A 2022-01-25 08:22:51 UTC
Hi,

After running on RAID, I got the same error when the disks were heavily loaded with IO.
I observed that the SAS disks (I have two in a ZFS mirror) on the same controller work without problems even if I run intensive benchmarks on them.

Also, there are 8 bugs if I search the FreeBSD Bugzilla; 7 are new and 1 is open.

Maybe we should open a single bug and report there?

Best regards,
Mergen
Comment 79 Hermes T K 2022-01-27 10:52:20 UTC
Created attachment 231383 [details]
Attaching the smartpqi bootleg
Comment 80 Hermes T K 2022-01-27 10:54:58 UTC
Hi Mergen & Peter,

Can you please try with the attached smartpqi bootleg driver & update?

Thanks & Regards
Hermes T K
Comment 81 Hermes T K 2022-01-27 11:01:54 UTC
(In reply to Peter from comment #70)
From the attached picture, 

bus_dmamap_load_ccb failed = 36 shows that bus_dmamap_load_ccb is returning an EINPROGRESS status.
But in the provided bootleg, we do not fail on the EINPROGRESS status.
Can you please try again with the latest smartpqi driver attached?

Thanks & Regards
Hermes T K
Comment 82 rainer 2022-01-27 11:16:19 UTC
So, this is for 13.0-RELEASE, I assume?
Comment 83 Mergen A 2022-01-27 12:11:10 UTC
Hi Hermes,

I will be busy for a couple of days; I will try next week with the attached patch.

Hi Rainer,

yes, 13.0-RELEASE is also affected. There are many issues on bugs.freebsd.org when searching for smartpqi; it looks like they are all related.

Best regards,
Mergen
Comment 84 Mergen A 2022-01-27 12:47:06 UTC
Hi Hermes,

I think I did it correctly: I replaced `smartpqi.ko` in the `/boot/kernel/` folder with the file you provided.

Rebooted the server, started the test, and again got a kernel panic:

Jan 27 17:42:18 bigpotato kernel: [ERROR]::[92:655.0][0,73,0][CPU 24][pqi_map_request][551]:bus_dmamap_load_ccb failed = 36 count = 1044480
Jan 27 17:42:18 bigpotato kernel: [WARN]:[92:655.0][CPU 24][pqisrc_io_start][802]:In Progress on 73

Anything else I can do to help?

Best regards,
Mergen
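
For reference, a rough sketch of how the swap was done (assuming the stock module lives under /boot/kernel):

        # keep the original module outside /boot so only one smartpqi.ko remains
        mv /boot/kernel/smartpqi.ko /root/smartpqi.ko.stock
        cp ./smartpqi.ko /boot/kernel/smartpqi.ko
        # regenerate the loader's linker hints and reboot
        kldxref /boot/kernel
        shutdown -r now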
Comment 85 Mergen A 2022-01-27 12:57:09 UTC
After the reboot, I cannot boot normally.

Got the error below on the boot screen:

[ERROR]::[92:655.0][0,75,0][CPU 35][pqi_map_request][551]:bus_dmamap_load_ccb failed = 36 count = 860160
[WANR]:[92:655.0][CPU 35][pqisrc_io_start][802]:In Progress on 75

and the server hangs here.

Best regards,
Mergen
Comment 86 Peter 2022-01-27 23:27:55 UTC
Created attachment 231397 [details]
Screencap showing "bootleg" driver loaded and subsequent failure

(In reply to Hermes T K from comment #80)
No change.
Comment 87 Hermes T K 2022-02-01 16:37:39 UTC
Hi Mergen & Peter,

Thanks for the update. 

[ERROR]::[92:655.0][0,75,0][CPU 35][pqi_map_request][551]:bus_dmamap_load_ccb failed = 36 count = 860160
[WANR]:[92:655.0][CPU 35][pqisrc_io_start][802]:In Progress on 75

The above message is from the old driver, but the newly attached FreeBSD driver does not have these prints.
I also faced a similar issue, where I was unable to upgrade the driver on a FreeBSD 13.0 setup.

I upgraded the driver with the commands below:
root@fbsd13ga:/ # pkg upgrade smartpqi-amd64.txz
Updating FreeBSD repository catalogue...
Fetching packagesite.pkg: 100%    6 MiB   1.1MB/s    00:06
Processing entries: 100%
FreeBSD repository update completed. 31366 packages processed.
All repositories are up to date.
Checking integrity... done (0 conflicting)
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
        smartpqi: 13.4140.0-1008

Number of packages to be installed: 1

Proceed with this action? [y/N]: y
[1/1] Installing smartpqi-13.4140.0-1008...
PRE INSTALL
Extracting smartpqi-13.4140.0-1008: 100%
POST_INSTALL
current directory /

After the reboot, the inbox driver loaded again.
But when I reinstalled the OS, the issue got resolved.

Can you please try again & check whether the driver is getting
updated?
The attached driver version is 13.4210.0-1004.


Thanks & Regards
Hermes T K
Comment 88 Peter 2022-02-01 17:38:37 UTC
(In reply to Hermes T K from comment #87)
Hermes, my apologies - I see where I may have erred. I will retest both driver files and report back.
Comment 89 Peter 2022-02-03 13:44:51 UTC
Created attachment 231536 [details]
Screencap showing driver in attachment #230487 [details] actually loaded and successful

(In reply to Hermes T K from comment #87)
Hermes - in response to your observation that drivers were not loading, I decided to double-check everything I was doing and realized that, unbeknownst to me, I had more than one module in the /boot folder and didn't know which was actually being loaded at start. I removed the extra one, verified the one remaining was the one needing testing, checked that it was loaded, observed that the size reported by kldstat was different from the prior attempt shown in attachment #230891 [details], and proceeded to test.

The scrub ran to completion with 0 errors. 

I apologize profusely for the operator error and thank you immensely for what I believe is a solution to this issue. I will also add that I went back to the 12.2 system where this endeavor started and it did *not* have multiple modules on disk, so this error of mine happened somewhere along the way of creating the 13.0 USB-based system for testing these new drivers.

Driver in attachment #230487 [details] is GOOD.
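
For anyone else retesting, a short sketch of the sanity checks that caught this (paths are examples):

        # make sure only one copy of the module exists under /boot
        find /boot -name smartpqi.ko
        # confirm the kernel actually loaded it, and compare the size kldstat
        # reports against the previous attempt
        kldstat | grep smartpqi
        kldstat -v -n smartpqi.ko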
Comment 90 Mergen A 2022-02-12 07:00:38 UTC
(In reply to Hermes T K from comment #87)

Hi Hermes,

Sorry for the late reply. I will test today and reply ASAP.

Best Regards,
Mergen
Comment 91 Mergen A 2022-02-12 07:40:05 UTC
Hi Hermes,

Installed as you instructed with pkg, and it seems everything now works.
Test with bonnie++ passed. System did not freeze.

Best regards,
Mergen
Comment 92 rainer 2022-02-14 14:12:18 UTC
Ideally, the source of this driver should be integrated into stable/13, so once 13.1 comes around it "just works" and people aren't required to scour Bugzilla and Phabricator.

As per 
https://www.freebsd.org/releases/13.1R/schedule/

code slush for 13.1 is already in less than two weeks.


So, where is the source of this driver?
Comment 93 scott.benesh 2022-02-17 17:14:45 UTC
The source has been in review here https://reviews.freebsd.org/D30182

I guess we are confused as to how to get it into 13.1. What else do we need to do?
Comment 94 Warner Losh freebsd_committer 2022-02-17 17:49:29 UTC
If the source is in stable/13, it will be in 13.1 when it's released.
Is there some way I can confirm the right sources are in the stable/13 branch? It's my belief that they are and I committed the referenced review there.
Comment 95 rainer 2022-02-25 14:41:03 UTC
One question I have is with regard to firmware-versions.

I have several HPE servers with P408i HBAs, and HPE has not yet "ported" the Microchip release 4.72, even after several months. They are still at 4.11.

How bad is this?
Comment 96 scott.benesh 2022-02-25 16:33:34 UTC
That would be answered better by HPE. You should contact them for their plans on a future firmware release.
Comment 97 rainer 2022-02-25 17:04:10 UTC
I know. And I'm trying to get them to update the firmware.

Though that's like trying to juice a stone.

What I wanted to know:
If I understood this correctly, the driver in 13-stable is different from what is in 12.3 (which I use, with firmware 4.11).

How much does that driver rely on the latest firmware (vs. 4.11, which is what is available from HPE)?

Do I have to expect any problems on 13.1 with only firmware 4.11?
Comment 98 scott.benesh 2022-02-25 18:27:10 UTC
We always recommend using the latest FW/Driver combination. 

4.11 firmware was released with the 4130.0.1008 FreeBSD driver on August 5th, 2021. The driver changes we've been talking about in this bug bring the stable/13 driver up to this level. So this would be the best combination to use.

Also, I'm currently not seeing any later driver changes related to firmware, so this version of the driver will also work with the 4.72 firmware and the upcoming next release.

We are still trying to make sure the latest driver code is in 13-stable. I'm still not sure if that's done though.
Comment 99 Warner Losh freebsd_committer 2022-02-25 18:38:38 UTC
(In reply to scott.benesh from comment #98)
It is my firm belief that the latest code in freebsd-current is in stable/13.

I don't know if *that* code is the latest, but if not today is the time to let me know where the newer code is so I can get it into the mainline in time to merge it before the window for 13.1 closes (which is in a matter of days).

Please schedule any investigations in this area accordingly.
Comment 100 scott.benesh 2022-02-25 21:11:55 UTC
(In reply to Warner Losh from comment #99)

Looks like the latest we've sent up. I will have Hermes verify.
Comment 101 Mergen A 2022-04-25 06:44:31 UTC
Hi Guys,
Will the changes be in 13.1-RELEASE? Any idea?

Best regards,
Mergen
Comment 102 Warner Losh freebsd_committer 2022-04-25 08:37:19 UTC
13.1 has the latest code from microchip.

I think that this bug can be closed as resolved.
Comment 103 Palle Girgensohn freebsd_committer 2022-04-25 09:14:13 UTC
(In reply to Warner Losh from comment #102)

Is this source the version v4210.0.1004? Same as the latest binary at https://storage.microsemi.com/en-us/downloads/unix/freebsd/productid=aha-2100-8i&dn=microsemi+adaptec+smarthba+2100-8i.php
?
Comment 104 Warner Losh freebsd_committer 2022-04-25 13:23:21 UTC
(In reply to Palle Girgensohn from comment #103)
Comment 100 is someone from Microchip saying that it's the latest that Microchip has sent up, but the source is labelled 1.4014.0.195.
Comment 105 Warner Losh freebsd_committer 2022-04-25 13:25:13 UTC
Also, there's no source in that download, so I can't check myself to see if what we have matches or not.
Comment 106 Palle Girgensohn freebsd_committer 2022-04-27 07:13:43 UTC
(In reply to Warner Losh from comment #105)

Just looking at the diff, this can hardly be all the stability fixes introduced in the latest binary version.

Until someone from Microchip can verify that the source code is indeed the same version as the binary distribution v4210.0.1004, I can't trust it to fix my problem, bug #259611. The binary distribution has so far proven reliable.
Comment 107 Palle Girgensohn freebsd_committer 2022-04-27 07:13:56 UTC
(In reply to Warner Losh from comment #105)

Just looking at the diff, this can hardly be all the stability fixes introduced in the latest binary version.

Until someone from Microchip can verify that the source code is indeed the same version as the binary distribution v4210.0.1004, I can't trust it to fix my problem, bug #259611. The binary distribution has so far proven reliable.
Comment 108 Warner Losh freebsd_committer 2022-04-27 13:29:48 UTC
(In reply to Palle Girgensohn from comment #106)
They are two different versions. Microsemi only had the resources to include the critical fixes into 13.1. Due to tooling issues, they can't just release the current source, but efforts are underway to make future releases include that source so the project can integrate it into our tree. I have no access to this source (as I work for someone else), but I'm hopeful that this will come to pass soon.