Created attachment 241454 [details]
zfs-ppc-test_not_working.diff

Hello.

after zfs: merge openzfs/zfs@431083f75 / https://cgit.freebsd.org/src/commit/?id=2a58b312b62f908ec92311d1bd8536dbaeb8e55b we can't boot the system anymore:

[...]
powernv_xscom0: <xscom> mem 0x603fc00000000-0x60403ffffffff on ofwbus0
Timecounter "timebase" frequency 512000000 Hz quality 1000
Event timer "decrementer" frequency 512000000 Hz quality 1000
Timecounters tick every 1.000 msec

fatal kernel trap:

   exception       = 0x800 (floating-point unavailable)
   srr0            = 0xc0000000033b8fd0 (0x10b8fd0)
   srr1            = 0x9000000000009032
   current msr     = 0x9000000000009032
   lr              = 0xc0000000024df3c0 (0x1df3c0)
   frame           = 0xc00800000000bfc0
   curthread       = 0xc000000003652e80
          pid = 0, comm = swapper

panic: floating-point unavailable trap
cpuid = 0
time = 1
KDB: stack backtrace:
0xc00800000000bbe0: at kdb_backtrace+0x60
0xc00800000000bcf0: at vpanic+0x1a0
0xc00800000000bda0: at panic+0x44
0xc00800000000bdd0: at trap+0x324
0xc00800000000bf00: at powerpc_interrupt+0x1cc
0xc00800000000bf90: kernel FPU trap by zfs_sha256_ppc: srr1=0x9000000000009032
            r1=0xc00800000000c240 cr=0x48200822 xer=0 ctr=0xc0000000024df3a0 r2=0xc0000000033f0000 frame=0xc00800000000bfc0
0xc00800000000c240: at -0x4
0xc00800000000c270: at SHA2Update+0x1f4
0xc00800000000c320: at abd_checksum_sha256+0x128
0xc00800000000c350: at abd_iterate_func+0x170
0xc00800000000c440: at abd_checksum_sha256+0x78
0xc00800000000c5a0: at chksum_fini+0x468
0xc00800000000c6b0: at chksum_fini+0x134
0xc00800000000c6f0: at chksum_init+0x23c
0xc00800000000c770: at spa_init+0x184
0xc00800000000c7f0: at zfs_kmod_init+0x38
0xc00800000000c860: at zfsdev_detach+0x478
0xc00800000000c8e0: at module_register_init+0xf8
0xc00800000000c980: at mi_startup+0x1f4
0xc00800000000ca50: at __start+0xc4
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at      kdb_enter+0x70: ori     r0, r0, 0x0
db>

After patch zfs-ppc-test_not_working.diff (provided by jhibbits) we get:

nda0 at nvme0 bus 0 scbus0 target 0 lun 1
nda0: <Samsung SSD 980 PRO with Heatsink 1TB 5B2QGXA7 S6WSNJ0W106430L>
nda0: Serial Number S6WSNJ0W106430L
nda0: nvme version 1.3 x4 (max x4) lanes PCIe Gen4 (max Gen4) link
nda0: 953869MB (1953525168 512 byte sectors)
nda1 at nvme1 bus 0 scbus1 target 0 lun 1
nda1: <Samsung SSD 980 PRO with Heatsink 1TB 5B2QGXA7 S6WSNJ0W103585P>
nda1: Serial Number S6WSNJ0W103585P
nda1: nvme version 1.3 x4 (max x4) lanes PCIe Gen4 (max Gen4) link
nda1: 953869MB (1953525168 512 byte sectors)
GEOM_MIRROR: Device mirror/swap0 launched (2/2).
Mounting from zfs:zroot failed with error 6; retrying for 3 more seconds
Mounting from zfs:zroot failed with error 6.

Loader variables:
  vfs.root.mountfrom=zfs:zroot

Manual root filesystem specification:
  <fstype>:<device> [options]
      Mount <device> using filesystem <fstype>
      and with the specified (optional) option list.

    eg. ufs:/dev/da0s1a
        zfs:zroot/ROOT/default
        cd9660:/dev/cd0 ro
          (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)

  ?               List valid disk boot devices
  .               Yield 1 second (for background tasks)
  <empty line>    Abort manual input

mountroot>

This machine is part of the FreeBSD cluster for building PowerPC packages, so we can build kernels to test anytime necessary.

Regards.
Fix brain-o.
Does it work fine with a kernel predating the import?

As in can you test *now* that a kernel from this commit is able to mount:

commit b98fbf3781df16f7797b2bbeabf205dc7d4985ae
Author: Ganbold Tsagaankhuu <ganbold@FreeBSD.org>
Date:   Mon Apr 3 14:20:28 2023 +0000

    Fix driver name.

    Submitted by:   Tyuryukanov S.Y.
I guess we will need something here analogous to what has been done for aarch64: https://reviews.freebsd.org/D39448
Here is what has been done for aarch64 upstream:
https://github.com/openzfs/zfs/pull/14715
https://github.com/openzfs/zfs/pull/14728
cc jhibbits@

iirc ppc doesn't have an fpu_kern(9) implementation and should have kfpu_allowed == 0.
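To illustrate the point about kfpu_allowed, here is a minimal userland sketch of how an implementation table can skip a routine that needs FPU state when the kernel cannot provide it. All names here (`sketch_impl`, `sketch_kfpu_allowed`, `sketch_pick_impl`) are hypothetical and are not the OpenZFS or FreeBSD API; the sketch only assumes that each implementation exposes an "is supported" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the platform's kfpu_allowed answer. */
bool sketch_kfpu_allowed = false;	/* assumption: 0 on ppc, per the comment above */

struct sketch_impl {
	const char *name;
	bool (*is_supported)(void);
};

/* An assembly routine that touches FPU state may only run if that is allowed. */
static bool fpu_impl_supported(void) { return sketch_kfpu_allowed; }

/* The portable C routine is always usable. */
static bool generic_impl_supported(void) { return true; }

/* Ordered by preference: fastest first, generic fallback last. */
static const struct sketch_impl sketch_impls[] = {
	{ "fpu-asm", fpu_impl_supported },
	{ "generic", generic_impl_supported },
};

/* Return the first implementation whose support check passes. */
const struct sketch_impl *sketch_pick_impl(void)
{
	size_t n = sizeof(sketch_impls) / sizeof(sketch_impls[0]);

	for (size_t i = 0; i < n; i++) {
		if (sketch_impls[i].is_supported())
			return (&sketch_impls[i]);
	}
	assert(!"the generic implementation must always be supported");
	return (NULL);
}
```

With `sketch_kfpu_allowed` left at false the picker falls through to "generic", which is the behaviour the comment above asks for on ppc.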
(In reply to Mateusz Guzik from comment #2) I already tested it, and it boots with a kernel built from b98fbf3781df. Regards.
FYI, I've just tested it again with an updated tree.

- 43c6b7a60aff069da7e0ba6c87d3d7a532e812f6 - clean

Timecounters tick every 1.000 msec

fatal kernel trap:

   exception       = 0x800 (floating-point unavailable)
   srr0            = 0xc0000000033bb0b0 (0x10bb0b0)
   srr1            = 0x9000000000009032
   current msr     = 0x9000000000009032
   lr              = 0xc0000000024df57c (0x1df57c)
   frame           = 0xc00800000000bfd0
   curthread       = 0xc000000003654e80
          pid = 0, comm = swapper

panic: floating-point unavailable trap
cpuid = 0
time = 2
KDB: stack backtrace:
0xc00800000000bbf0: at kdb_backtrace+0x60
0xc00800000000bd00: at vpanic+0x1b8
0xc00800000000bdb0: at panic+0x44
0xc00800000000bde0: at trap+0x324
0xc00800000000bf10: at powerpc_interrupt+0x1cc
0xc00800000000bfa0: kernel FPU trap by zfs_sha256_ppc: srr1=0x9000000000009032
            r1=0xc00800000000c250 cr=0x42200842 xer=0 ctr=0xc0000000024df55c r2=0xc0000000033f2000 frame=0xc00800000000bfd0
0xc00800000000c250: at -0x4
0xc00800000000c280: at SHA2Update+0x200
0xc00800000000c330: at abd_checksum_sha256+0x128
0xc00800000000c360: at abd_iterate_func+0x170
0xc00800000000c450: at abd_checksum_sha256+0x78
0xc00800000000c5b0: at chksum_fini+0x3e0
0xc00800000000c6b0: at chksum_fini+0x134
0xc00800000000c6f0: at chksum_init+0x23c
0xc00800000000c770: at spa_init+0x184
0xc00800000000c7f0: at zfs_kmod_init+0x38
0xc00800000000c860: at zfsdev_detach+0x478
0xc00800000000c8e0: at module_register_init+0x114
0xc00800000000c980: at mi_startup+0x1f4
0xc00800000000ca50: at __start+0xc4
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at      kdb_enter+0x70: ori     r0, r0, 0x0
db>

- 43c6b7a60aff069da7e0ba6c87d3d7a532e812f6 - with jhibbits patch

Mounting from zfs:zroot failed with error 6; retrying for 3 more seconds
Mounting from zfs:zroot failed with error 6.

Loader variables:
  vfs.root.mountfrom=zfs:zroot

Manual root filesystem specification:
  <fstype>:<device> [options]
      Mount <device> using filesystem <fstype>
      and with the specified (optional) option list.

    eg. ufs:/dev/da0s1a
        zfs:zroot/ROOT/default
        cd9660:/dev/cd0 ro
          (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)

  ?               List valid disk boot devices
  .               Yield 1 second (for background tasks)
  <empty line>    Abort manual input

mountroot>
Try this: https://people.freebsd.org/~mjg/.junk/zfsppc.diff

it will still fail to mount, but it should explain why
(In reply to Mateusz Guzik from comment #8)

Thanks. I built two kernels:

1) 43c6b7a60aff069da7e0ba6c87d3d7a532e812f6 - zfsppc.diff

Timecounter "timebase" frequency 512000000 Hz quality 1000
Event timer "decrementer" frequency 512000000 Hz quality 1000
Timecounters tick every 1.000 msec

fatal kernel trap:

   exception       = 0x800 (floating-point unavailable)
   srr0            = 0xc0000000033b70b0 (0x10b70b0)
   srr1            = 0x9000000000009032
   current msr     = 0x9000000000009032
   lr              = 0xc0000000024dddb4 (0x1dddb4)
   frame           = 0xc00800000000bfd0
   curthread       = 0xc000000003650e80
          pid = 0, comm = swapper

panic: floating-point unavailable trap
cpuid = 0
time = 2
KDB: stack backtrace:
0xc00800000000bbf0: at kdb_backtrace+0x60
0xc00800000000bd00: at vpanic+0x1b8
0xc00800000000bdb0: at panic+0x44
0xc00800000000bde0: at trap+0x324
0xc00800000000bf10: at powerpc_interrupt+0x1cc
0xc00800000000bfa0: kernel FPU trap by zfs_sha256_ppc: srr1=0x9000000000009032
            r1=0xc00800000000c250 cr=0x42200842 xer=0 ctr=0xc0000000024ddd94 r2=0xc0000000033ee000 frame=0xc00800000000bfd0
0xc00800000000c250: at -0x4
0xc00800000000c280: at SHA2Update+0x200
0xc00800000000c330: at abd_checksum_sha256+0x128
0xc00800000000c360: at abd_iterate_func+0x170
0xc00800000000c450: at abd_checksum_sha256+0x78
0xc00800000000c5b0: at chksum_fini+0x3e0
0xc00800000000c6b0: at chksum_fini+0x134
0xc00800000000c6f0: at chksum_init+0x23c
0xc00800000000c770: at spa_init+0x184
0xc00800000000c7f0: at zfs_kmod_init+0x38
0xc00800000000c860: at zfsdev_detach+0x444
0xc00800000000c8e0: at module_register_init+0x114
0xc00800000000c980: at mi_startup+0x1f4
0xc00800000000ca50: at __start+0xc4
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at      kdb_enter+0x70: ori     r0, r0, 0x0

2) 43c6b7a60aff069da7e0ba6c87d3d7a532e812f6 - zfs-ppc-test_not_working.diff + zfsppc.diff

nda1: 953869MB (1953525168 512 byte sectors)
GEOM_MIRROR: Device mirror/swap0 launched (2/2).
SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR zio_checksum_error_impl 
/usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 [...] SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR feature_get_refcount 
/usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97 SET_ERROR spa_ld_validate_vdevs /usr/local/src/base/sys/contrib/openzfs/module/zfs/spa.c:3582 6 SET_ERROR spa_scan_get_stats /usr/local/src/base/sys/contrib/openzfs/module/zfs/spa_misc.c:2573 2 SET_ERROR spa_removal_get_stats /usr/local/src/base/sys/contrib/openzfs/module/zfs/vdev_removal.c:2521 2 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR spa_checkpoint_get_stats /usr/local/src/base/sys/contrib/openzfs/module/zfs/spa_checkpoint.c:167 1026 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 SET_ERROR vdev_rebuild_get_stats /usr/local/src/base/sys/contrib/openzfs/module/zfs/vdev_rebuild.c:1126 45 SET_ERROR feature_get_refcount /usr/local/src/base/sys/contrib/openzfs/module/zfs/zfeature.c:239 45 Mounting from zfs:zroot failed with error 6. Loader variables: vfs.root.mountfrom=zfs:zroot Manual root filesystem specification: <fstype>:<device> [options] Mount <device> using filesystem <fstype> and with the specified (optional) option list. eg. ufs:/dev/da0s1a zfs:zroot/ROOT/default cd9660:/dev/cd0 ro (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /) ? List valid disk boot devices . 
Yield 1 second (for background tasks) <empty line> Abort manual input mountroot>
/usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97
SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97
SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97
SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97
SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97
SET_ERROR zio_checksum_error_impl /usr/local/src/base/sys/contrib/openzfs/module/zfs/zio_checksum.c:527 97
SET_ERROR spa_ld_validate_vdevs /usr/local/src/base/sys/contrib/openzfs/module/zfs/spa.c:3582 6

As in:

	if (!ZIO_CHECKSUM_EQUAL(actual_cksum, expected_cksum))
		return (SET_ERROR(ECKSUM));

and finally:

	if (rvd->vdev_state <= VDEV_STATE_CANT_OPEN) {
		spa_load_failed(spa, "cannot open vdev tree after invalidating "
		    "some vdevs");
		vdev_dbgmsg_print_tree(rvd, 2);
		return (SET_ERROR(ENXIO));
	}

as all checksum checks failed.

In other words, the new checksum routines return a different result than the original ones.
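The failure chain described above can be sketched in userland as follows. This is a simplified model, not the OpenZFS code: the type and function names (`sketch_cksum_t`, `sketch_verify`, `sketch_open`) are hypothetical, the "open fails only when every read failed" rule is an assumption made for illustration, and the numeric error values 97 and 6 are simply the ones printed in the SET_ERROR log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of a four-word checksum; not the real zio_cksum_t. */
typedef struct {
	uint64_t word[4];
} sketch_cksum_t;

/* Error values as printed in the SET_ERROR log above. */
enum { SKETCH_ENXIO = 6, SKETCH_ECKSUM = 97 };

/* Two checksums match only if all four 64-bit words match. */
bool sketch_cksum_equal(const sketch_cksum_t *a, const sketch_cksum_t *b)
{
	for (int i = 0; i < 4; i++) {
		if (a->word[i] != b->word[i])
			return (false);
	}
	return (true);
}

/* A block whose computed checksum differs from the stored one is rejected. */
int sketch_verify(const sketch_cksum_t *actual, const sketch_cksum_t *expected)
{
	return (sketch_cksum_equal(actual, expected) ? 0 : SKETCH_ECKSUM);
}

/*
 * Simplified assumption: the pool can be opened if at least one read
 * verified; if every read failed, the device tree is unusable.
 */
int sketch_open(const int *read_errors, int nreads)
{
	assert(nreads > 0);
	for (int i = 0; i < nreads; i++) {
		if (read_errors[i] == 0)
			return (0);
	}
	return (SKETCH_ENXIO);
}
```

If the hashing routine itself is wrong, every call to `sketch_verify` returns 97 even though the on-disk data is intact, and `sketch_open` then reports 6, matching the "failed with error 6" seen at the mountroot prompt.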
add this on top of the other stuff:

diff --git a/sys/contrib/openzfs/module/zfs/zio_checksum.c b/sys/contrib/openzfs/module/zfs/zio_checksum.c
index 6090959c5b8c..581c3ccb82e8 100644
--- a/sys/contrib/openzfs/module/zfs/zio_checksum.c
+++ b/sys/contrib/openzfs/module/zfs/zio_checksum.c
@@ -523,8 +523,12 @@ zio_checksum_error_impl(spa_t *spa, const blkptr_t *bp,
 		info->zbc_has_cksum = 1;
 	}
 
-	if (!ZIO_CHECKSUM_EQUAL(actual_cksum, expected_cksum))
+	if (!ZIO_CHECKSUM_EQUAL(actual_cksum, expected_cksum)) {
+		printf("bad sum report:\n");
+		printf("expected: %lx %lx %lx %lx\n", expected_cksum.zc_word[0], expected_cksum.zc_word[1], expected_cksum.zc_word[2], expected_cksum.zc_word[3]);
+		printf("actual: %lx %lx %lx %lx\n", actual_cksum.zc_word[0], actual_cksum.zc_word[1], actual_cksum.zc_word[2], actual_cksum.zc_word[3]);
 		return (SET_ERROR(ECKSUM));
+	}
 
 	return (0);
 }

we may need to revert this from upstream:

commit 4c5fec01a48acc184614ab8735e6954961990235
Author: Tino Reichardt <milky-zfs@mcmilk.de>
Date:   Wed Mar 1 09:40:28 2023 +0100

    Add generic implementation handling and SHA2 impl
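For anyone reproducing the "bad sum report" output outside the kernel, the following is a small userland sketch of the same formatting. The debug patch uses %lx, which matches a 64-bit word only where long is 64 bits wide (as on powerpc64); the sketch uses PRIx64 from <inttypes.h> so it is portable. The helper name `sketch_format_cksum` is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a label and four 64-bit checksum words as hex into buf.
 * Returns what snprintf returns: the length the full string needs.
 */
int sketch_format_cksum(char *buf, size_t len, const char *label,
    const uint64_t w[4])
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len,
	    "%s: %" PRIx64 " %" PRIx64 " %" PRIx64 " %" PRIx64,
	    label, w[0], w[1], w[2], w[3]));
}
```

Calling it once with the expected words and once with the actual words gives two lines that can be compared side by side, which is all the debug patch is meant to provide.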
Created attachment 241637 [details]
Working patch

The old patch made an incorrect assumption. This patch fixes that.
(In reply to Mateusz Guzik from comment #10) With the latest patch from jhibbits, it booted. Do you want me to continue the testing you mentioned in this comment?
drop it
Fixed by 0468e89cb. Thanks for testing!
Thank you very much for digging into this!