Bug 151910 - [zfs] booting from raidz/raidz2 on ciss(4) doesn't work
Summary: [zfs] booting from raidz/raidz2 on ciss(4) doesn't work
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 8.1-RELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs mailing list
Reported: 2010-11-03 09:40 UTC by Emil Smolenski
Modified: 2018-06-21 13:35 UTC

Description Emil Smolenski 2010-11-03 09:40:10 UTC
FreeBSD 8.1, installed using the Fixit# environment on an HP ProLiant 185 G5 with an HP SmartArray controller, does not boot from a ZFS pool in a raidz or raidz2 setup. Mirror-based configurations work as expected.

There are 6 disks, each configured on the HP SmartArray as a single-disk RAID0 array. Thus there are 6 logical devices (da[0-5]).

System information gathered from the Fixit# environment:

# dmesg
(...)
ciss0: <HP Smart Array P400> port 0xe800-0xe8ff mem 0xdef00000-0xdeffffff,0xdeeff000-0xdeefffff irq 35 at device 0.0 on pci4
ciss0: PERFORMANT Transport
ciss0: [ITHREAD]
(...)
da0 at ciss0 bus 0 scbus0 target 0 lun 0
da0: <COMPAQ RAID 0  VOLUME OK> Fixed Direct Access SCSI-5 device
da0: 135.168MB/s transfers
da0: Command Queueing enabled
da0: 1430767MB (2930211632 512 byte sectors: 255H 32S/T 65535C)
(...)
da1 at ciss0 bus 0 scbus0 target 1 lun 0
(...)
da2 at ciss0 bus 0 scbus0 target 2 lun 0
(...)
da3 at ciss0 bus 0 scbus0 target 3 lun 0
(...)
da4 at ciss0 bus 0 scbus0 target 4 lun 0
(...)
da5 at ciss0 bus 0 scbus0 target 5 lun 0
(...)

# diskinfo -v da0
da0
        512             # sectorsize
        1500268355584   # mediasize in bytes (1.4T)
        2930211632      # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        359094          # Cylinders according to firmware.
        255             # Heads according to firmware.
        32              # Sectors according to firmware.
        PAFGL0T9SXH13E  # Disk ident.
(...)

# camcontrol devlist -v
scbus0 on ciss0 bus 0:
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 0 lun 0 (da0,pass0)
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 1 lun 0 (da1,pass1)
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 2 lun 0 (da2,pass2)
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 3 lun 0 (da3,pass3)
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 4 lun 0 (da4,pass4)
<COMPAQ RAID 0  VOLUME OK>         at scbus0 target 5 lun 0 (da5,pass5)
scbus1 on ciss0 bus 32:
scbus-1 on xpt0 bus 0:
<>                                 at scbus-1 target -1 lun -1 (xpt0)

# pciconf -lv
(...)
ciss0@pci0:4:0:0:       class=0x010400 card=0x3234103c chip=0x3230103c rev=0x04 hdr=0x00
    class      = mass storage
    subclass   = RAID
(...)

I've done several tests with different configurations. Common setup:

Each logical device (array) has a GPT scheme with the following partitions:

=>        34  2930211565  da0  GPT  (1.4T)
          34         128    1  freebsd-boot  (64K)
         162     2097152    2  freebsd-swap  (1.0G)
     2097314     4194304    3  freebsd-zfs  (2.0G)
     6291618  2923919981       - free -  (1.4T)
(...)

Each freebsd-zfs partition has a GPT label: gpt/test0, gpt/test1, etc. The ZFS pool is created with the following default options: canmount=off, checksum=fletcher4, atime=off, setuid=off. Some datasets have the compress=lzjb or compress=gzip option set.

Boot code is initialized on each disk this way:
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0
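
The same command has to be repeated for every logical drive in the pool. A minimal sketch of that loop, assuming the six drives da0 to da5 from this report; print_bootcode_cmds is a hypothetical helper name, and it only prints the commands so they can be reviewed before being run:

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the bootcode command from the
# report for logical drives da0 .. da(N-1).
print_bootcode_cmds() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da$i"
        i=$((i + 1))
    done
}

# Six logical drives, as in the report.
print_bootcode_cmds 6
```

Piping the output into sh would execute the commands; that step is left out here on purpose.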

Results from tests:
1) + One logical device -> works
2) + Mirror built on 2 logical devices -> works
3) + Mirror built on 6 LDs -> works
4) - RAIDZ2 built on 3 LDs -> doesn't work
5) - RAIDZ2 built on 6 LDs -> doesn't work
6) - RAIDZ built on 3 LDs -> doesn't work
7) - RAIDZ built on 6 LDs -> doesn't work

8) + RAIDZ built on 3 USB sticks on the same machine -> works
9) + RAIDZ built on 3 LDs on another machine (aacdX devices) -> works

I encountered three different error messages, depending on where the boot code was built.

a) Error message encountered when using the boot code from the Fixit# media:

error 1 lba 32
error 1 lba 1
error 1 lba 32
error 1 lba 1
No ZFS pools located, can't boot

b) Error message encountered when using boot code that I built myself:

ZFS: i/o error - all block copies unavailable
ZFS: can't read MOS object directory
Can't find root filesystem - giving up
ZFS: unexpected object set type 0 
ZFS: unexpected object set type 0

FreeBSD/x86 boot
Default: test:/boot/kernel/kernel
boot:
ZFS: unexpected object set type 0

c) I also saw a message similar to the one shown above, but with the message "ZFS: can't read MOS".

Example output from the "status" command at the "boot:" prompt (when error b appears):

boot: status
pool: test
config:
          NAME STATE
          test ONLINE
            raidz1 ONLINE
              /dev/gpt/test0 ONLINE
              /dev/gpt/test1 ONLINE
              /dev/gpt/test2 ONLINE

So the boot code sees a healthy ZFS pool with all devices available, but it can't boot from it.

More details on the _working_ configuration (2) (mirror on 2 LDs):

works# zpool list test
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
test  1.98G   434M  1.56G    21%  ONLINE  -

works# zpool status test
  pool: test
 state: ONLINE
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        test           ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            gpt/test0  ONLINE       0     0     0
            gpt/test1  ONLINE       0     0     0

errors: No known data errors

works# zdb -uuu test
Uberblock

        magic = 0000000000bab10c
        version = 14
        txg = 77
        guid_sum = 2401146990467298568
        timestamp = 1287650011 UTC = Thu Oct 21 08:33:31 2010
        rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:e90ca00:200> DVA[1]=<0:26037600:200> DVA[2]=<0:3e00ce00:200> fletcher4 lzjb LE contiguous birth=77 fill=169 cksum=b02658433:48b259d7157:f39e911b0c82:2286d07e38a6a3

More details on the _NOT_working_ configuration (6) (raidz on 3 LDs):

doesntwork# zpool list test
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
test  5.97G   655M  5.33G    10%  ONLINE  -

doesntwork# zpool status test
  pool: test
 state: ONLINE
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        test           ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            gpt/test0  ONLINE       0     0     0
            gpt/test1  ONLINE       0     0     0
            gpt/test2  ONLINE       0     0     0

errors: No known data errors

doesntwork# zdb -uuu test
Uberblock

        magic = 0000000000bab10c
        version = 14
        txg = 78
        guid_sum = 8302404134133891378
        timestamp = 1287656704 UTC = Thu Oct 21 10:25:04 2010
        rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:5b9a7000:400> DVA[1]=<0:a205d800:400> DVA[2]=<0:ea016000:400> fletcher4 lzjb LE contiguous birth=78 fill=159 cksum=a13038304:4232b1013de:dc9ba0c2751d:1f1144c425d7a4
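
For orientation, DVA[0] of this rootbp sits at raidz offset 0x5b9a7000, and unlike the mirror case that offset has to be split across the disks before anything can be read. A rough sketch of that split, assuming 512-byte sectors and the usual raidz column arithmetic (sector index modulo the number of disks gives the starting column, sector index divided by the number of disks gives the row); this arithmetic is an assumption here, not something stated in the report. The function names are made up for the example, and the result excludes the vdev label area and the partition start, so it is not a BIOS LBA:

```shell
#!/bin/sh
# Sketch only: assumes 512-byte sectors (shift of 9).
# $1 = raidz offset from the DVA, $2 = number of disks in the raidz vdev.

# Column (disk index) on which the block starts.
raidz_first_col() {
    echo $(( ($1 >> 9) % $2 ))
}

# Byte offset of that row within each child vdev, before the label area
# and the partition start are added.
raidz_child_off() {
    echo $(( (($1 >> 9) / $2) << 9 ))
}

# DVA[0] of the rootbp in the non-working pool, 3-disk raidz1.
raidz_first_col 0x5b9a7000 3
raidz_child_off 0x5b9a7000 3
```

Comparing offsets computed this way with the "error 1 lba ..." lines printed by the boot code may help narrow down which read fails.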

Please also see what is possibly the same issue, reported on the mailing list: http://lists.freebsd.org/pipermail/freebsd-stable/2010-October/059610.html

I also tried mm's mfsBSD with ZFSv15. A system installed from it also doesn't boot. The same goes for FreeBSD 8.0-RELEASE.

I can provide more details on this issue and test patches.

How-To-Repeat:
1. Install ZFS-only FreeBSD 8.1-RELEASE on HP SmartArray. Use raidz or raidz2.
2. Reboot.
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2010-11-05 07:53:13 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Not sure whether this is a ZFS problem or a ciss problem, so making a guess
and assigning. Committers, feel free to reassign it if wrong.
Comment 2 Andriy Gapon 2010-11-05 13:17:05 UTC
8.1 release had a bug which prevented booting from raidz.
Please see http://www.freebsd.org/cgi/query-pr.cgi?pr=148655

-- 
Andriy Gapon
Comment 3 Emil Smolenski 2010-11-05 14:45:39 UTC
Please look at the "Originator" field in both PRs. I'm very familiar with
PR 148655.

The problem in PR 148655 is related to booting from a _degraded_
raidz/raidz2/mirror. In this PR the problem is booting from a
_non_degraded_ raidz/raidz2, and only when ciss(4) is used. Mirror-based
configurations, SATA disks, USB sticks, and other RAID controllers are all OK.

Of course I tested 8.1-STABLE (from October) and 8.0-RELEASE (where PR 148655
is not applicable), but there was no difference: I still can't boot from
raidz or raidz2.

Thanks.

-- 
am
Comment 4 Guido Falsi freebsd_committer 2010-11-05 15:18:48 UTC
I can confirm this same problem with similar hardware.

At present I'm booting this machine from an USB stick.

Hope this helps give this PR some traction.

-- 
Guido Falsi <mad@madpilot.net>
Comment 5 Andriy Gapon 2010-11-05 17:38:26 UTC
Apologies for missing that important information.
It looks like this might be some exotic issue involving ciss, the BIOS (and the
ciss option ROM), and how our boot code accesses/enumerates BIOS-visible disks.

-- 
Andriy Gapon
Comment 6 Emil Smolenski 2010-11-09 14:27:02 UTC
Thanks!

Unfortunately it didn't help. This is what I've done:

1. Apply patches [1]
2. Full [build|install][world|kernel]
3. Copy /boot/gptzfsboot and /boot/pmbr to target machine
4. On target machine:
# gpart bootcode -b pmbr -p gptzfsboot -i 1 da0
# gpart bootcode -b pmbr -p gptzfsboot -i 1 da1
# gpart bootcode -b pmbr -p gptzfsboot -i 1 da2
5. Copy /boot/zfsloader to /boot on target machine
6. Reboot
7. Type the 'status' command when the error message appears:

ZFS: i/o error - all block copies unavailable
ZFS: can't read MOS object directory
Can't find root filesystem - giving up
ZFS: unexpected object set type 0
ZFS: unexpected object set type 0

FreeBSD/x86 boot
Default: test:/boot/kernel/kernel
boot:
ZFS: unexpected object set type 0

FreeBSD/x86 boot
Default: test:/boot/kernel/kernel
boot: status
pool: test
config:
           NAME STATE
           test ONLINE
             raidz1 ONLINE
               /dev/gpt/test0 ONLINE
               /dev/gpt/test1 ONLINE
               /dev/gpt/test2 ONLINE



[1]

--- biosdisk.c.orig     2010-11-09 10:22:33.311797575 +0100
+++ biosdisk.c  2010-11-09 10:23:42.471306832 +0100
@@ -214,11 +214,6 @@
      /* sequence 0, 0x80 */
      for (base = 0; base <= 0x80; base += 0x80) {
         for (unit = base; (nbdinfo < MAXBDDEV); unit++) {
-           /* check the BIOS equipment list for number of fixed disks */
-           if((base == 0x80) &&
-              (nfd >= *(unsigned char *)PTOV(BIOS_NUMDRIVES)))
-               break;
-
             bdinfo[nbdinfo].bd_unit = unit;
             bdinfo[nbdinfo].bd_flags = (unit < 0x80) ? BD_FLOPPY : 0;

--- zfsboot.c.orig      2010-11-09 10:15:39.893495888 +0100
+++ zfsboot.c   2010-11-09 10:21:07.376447165 +0100
@@ -495,7 +495,7 @@
       * will find any other available pools and it may fill in missing
       * vdevs for the boot pool.
       */
-    for (i = 0; i < *(unsigned char *)PTOV(BIOS_NUMDRIVES); i++) {
+    for (i = 0; i < MAXBDDEV; i++) {
         if ((i | DRV_HARD) == *(uint8_t *)PTOV(ARGS))
             continue;
Comment 7 george 2010-11-13 05:14:12 UTC
Hello,

I see the same on 8.1-RELEASE.  I use raidz2 and a 'status' shows the pool is there just fine, but trying to boot results in the same error message:

ZFS: i/o error - all block copies unavailable
ZFS: can't read MOS object directory
Can't find root filesystem - giving up
ZFS: unexpected object set type 0
ZFS: unexpected object set type 0

I do not have any disks on a CISS device though - I have standard ATA disks on a Gigabyte SATA controller.

George
Comment 8 Volodymyr Pushkar 2011-01-10 10:28:14 UTC
There is another problem with ciss and ZFS boot:
http://www.freebsd.org/cgi/query-pr.cgi?pr=153520
Comment 9 Palle Girgensohn freebsd_committer 2011-11-22 15:52:36 UTC
Hi,

First, it is not a very good idea to run raidz on ciss, since ciss cannot
supply you with JBODs. Instead you have to set up a bunch of RAID-0 volumes
in the ciss controller, pack them together in a raidz, and see your
performance drop to the bottom of the ocean. Just don't is my advice, and
believe me, I did, and barely lived to tell the story... ;-)

Seriously, we have a couple of idle machines with ciss(4) and an iLO (for
remote connections). If someone has the knowledge and time to try and fix
the problems with ciss and ZFS boot, we have the equipment for it.

It is not just raidz. We tried with a standard vanilla zpool, no mirror or
raid at all, on top of a ciss raid-5, and it failed with RC1. [trying RC2
now, but it seems nothing has changed?]. We also tried gptboot for ufs. It
also fails, and I guess this is a bigger problem?

With gptboot, it just goes into a bootloop. With gptzfsboot, it fails with
an error message that seems well known to Google. I'll get back to that when
we've tested with RC2.

If anyone is up to the task of finding the culprit, we can let you into the
machine remotely through the iLO. Let me know.

Best regards
Palle Girgensohn
girgen@FreeBSD.org
Comment 10 Pawel Jakub Dawidek freebsd_committer 2014-06-01 07:24:45 UTC
State Changed
From-To: open->feedback

Could you take a look at two files in FreeBSD HEAD:

sys/boot/i386/zfsboot/zfsboot.c
sys/boot/i386/libi386/biosdisk.c

Look for VIRTUALBOX in there and apply the same changes to your stable/8 code,
or just modify the code to use the code that is compiled with VIRTUALBOX defined.
VirtualBox has a bug where the BIOS reports only one disk available,
but if you ignore that and just look for more, you will find them.
Maybe there is a similar bug in your BIOS?
Please try it out and let me know. If it doesn't work we can add more debugging
to see where and why it fails exactly.
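
The difference between trusting the BIOS drive count and probing past it can be shown with a small simulation. This is only a sketch: it makes no BIOS calls, found_units and its arguments are hypothetical, the limit of 8 is a stand-in for MAXBDDEV, and it assumes that probing stops at the first unit that does not respond:

```shell
#!/bin/sh
# Simulation only: no BIOS calls are made.
# $1 = drive count the BIOS reports
# $2 = number of drives that actually respond to a probe
# $3 = "trust" to stop at the reported count, "probe" to keep looking
found_units() {
    reported=$1
    present=$2
    mode=$3
    max=8            # stand-in for MAXBDDEV; value chosen for the example
    unit=0
    while [ "$unit" -lt "$max" ]; do
        # Behaviour with the BIOS_NUMDRIVES check in place.
        if [ "$mode" = trust ] && [ "$unit" -ge "$reported" ]; then
            break
        fi
        # Assumed probe failure: no drive answers at this unit.
        if [ "$unit" -ge "$present" ]; then
            break
        fi
        # Hard disk unit numbers start at 0x80.
        printf '0x%x\n' $((0x80 + unit))
        unit=$((unit + 1))
    done
}

# BIOS reports one drive although three respond.
found_units 1 3 trust
found_units 1 3 probe
```

With the count trusted, only the first unit is listed; with probing, all three responding units are found, which is the effect the VIRTUALBOX code path is meant to achieve.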


Comment 11 Pawel Jakub Dawidek freebsd_committer 2014-06-01 07:24:45 UTC
Responsible Changed
From-To: freebsd-fs->pjd

I'll take this one.
Comment 12 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 07:59:14 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 13 Kazuhiko Kiriyama 2018-05-22 07:43:34 UTC
The same issue occurred in 12.0-CURRENT (r332796) on two RAID controllers. One is a RAID60 of 2TB SATA (HGST AUG-2017) x 12 on an AVAGO MegaRAID and the other is a RAID60 of 2TB SATA (WD20EFRX) x 12 on an Adaptec 5405. For the MegaRAID machine:

# dmesg
  :
pci1: <ACPI PCI bus> on pcib1
AVAGO MegaRAID SAS FreeBSD mrsas driver version: 06.712.04.00-fbsd
mfi0: <Invader> port 0xe000-0xe0ff mem 0xdf300000-0xdf30ffff,0xdf200000-0xdf2fffff irq 16 at device 0.0 on pci1
mfi0: Using MSI
mfi0: Megaraid SAS driver Ver 4.23 
  :
mfid0 on mfi0
mfid0: 15257600MB (31247564800 sectors) RAID volume 'TrueFC' is optimal
  :
# diskinfo -v mfid0
mfid0
        512             # sectorsize
        15998753177600  # mediasize in bytes (15T)
        31247564800     # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        1945070         # Cylinders according to firmware.
        255             # Heads according to firmware.
        63              # Sectors according to firmware.
                        # Disk descr.
                        # Disk ident.
        No              # TRIM/UNMAP support
        Unknown         # Rotation rate in RPM

# camcontrol devlist -v
scbus0 on ahcich0 bus 0:
<>                                 at scbus0 target -1 lun ffffffff ()
scbus1 on ahcich1 bus 0:
<>                                 at scbus1 target -1 lun ffffffff ()
scbus2 on ahcich2 bus 0:
<>                                 at scbus2 target -1 lun ffffffff ()
scbus3 on ahcich3 bus 0:
<>                                 at scbus3 target -1 lun ffffffff ()
scbus4 on ahcich4 bus 0:
<>                                 at scbus4 target -1 lun ffffffff ()
scbus5 on ahcich5 bus 0:
<>                                 at scbus5 target -1 lun ffffffff ()
scbus6 on ahciem0 bus 0:
<AHCI SGPIO Enclosure 1.00 0001>   at scbus6 target 0 lun 0 (pass0,ses0)
<>                                 at scbus6 target -1 lun ffffffff ()
scbus7 on umass-sim0 bus 0:
<A-DATA USB Flash Drive 0.00>      at scbus7 target 0 lun 0 (da0,pass1)
scbus-1 on xpt0 bus 0:
<>                                 at scbus-1 target -1 lun ffffffff (xpt0)
# pciconf -lv
  :
mfi0@pci0:1:0:0:        class=0x010400 card=0x93631000 chip=0x005d1000 rev=0x02 hdr=0x00
    vendor     = 'LSI Logic / Symbios Logic'
    device     = 'MegaRAID SAS-3 3108 [Invader]'
    class      = mass storage
    subclass   = RAID
  :
# gpart show mfid0
=>         40  31247564720  mfid0  GPT  (15T)
           40         1024      1  freebsd-boot  (512K)
         1064          984         - free -  (492K)
         2048    268435456      2  freebsd-swap  (128G)
    268437504  30979125248      3  freebsd-zfs  (14T)
  31247562752         2008         - free -  (1.0M)

# kldload zfs
# zpool import -fR /mnt zroot
# zpool status 
  pool: zroot
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mfid0p3   ONLINE       0     0     0

errors: No known data errors
#
Comment 14 Marcelo Araujo freebsd_committer 2018-06-04 06:10:34 UTC
I have a similar problem with a feature that I'm working on for bhyve: virtio-scsi.

The installation was made using 4 virtio-scsi disks with a ZFS stripe, and then I got this error:

Error:
ZFS: i/o error - all block copies unavailable
Failed to read node from zroot (5)
Failed to load '/boot/loader.efi'
panic: No bootable partitions found!
Boot Failed. EFI SCSI Device.


If I use a ZFS mirror, everything works fine.


Best,
Comment 15 Allan Jude freebsd_committer 2018-06-21 13:35:01 UTC
(In reply to Marcelo Araujo from comment #14)
Marcelo: Do you happen to have an easy bhyve recipe for me to create the same environment here? I can work on debugging it.

Also, have you tried booting UEFI in bhyve on this setup?