Created attachment 144914
fix

I am using GRUB to boot the kernel directly from ZFS. Not long after an upgrade to a recent 10-stable (r268881), GRUB stopped being able to see the pool and boot.

After recovering the system and getting it to boot again, I used gdb on grub-probe to determine that the problem was in lz4 decompression of the uberblock. Here is the problematic code in GRUB 2.00 (with FreeBSD port patches):

grub-core/fs/zfs/zfs_lz4.c:
    #if BYTE_ORDER == BIG_ENDIAN

Apparently <sys/endian.h> isn't included, so both macros are undefined and evaluate to 0 in the #if expression. The comparison is therefore true, and the code incorrectly assumes a big-endian system. Based on this assumption it byte-swaps a 2-byte offset field in the compressed data, which makes the data appear corrupt, and decompression fails.

I am not sure why this problem happened to manifest just now, since GRUB hasn't been updated in a while. My guess is that the recent kernel happens to lz4-compress the uberblock, while earlier kernels happened to lzjb-compress it or leave it uncompressed, so the problem went unnoticed.

This causes disturbing messages like "error: no such device: <pool id>." and "lz4 decompression failed" at the GRUB prompt, and this:

    # grub-probe -d /dev/gpt/mypool
    grub-probe: error: unknown filesystem.

The fix is simply adding #include <sys/endian.h> at the top of zfs_lz4.c:

    # grub-probe -d /dev/gpt/mypool
    zfs

Note that I am also using the patch from bug 188524 for the "hole_birth" feature, and I haven't enabled the "embedded_data" feature on my pool yet. A newly created pool doesn't work in GRUB because of those feature flags, regardless of lz4.

The latest GRUB source uses grub_le_to_cpu16() instead of BYTE_ORDER, so the problem should resolve itself in future versions.
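For anyone wanting to reproduce the preprocessor behaviour in isolation, here is a minimal sketch. It is not the GRUB code: DEMO_BYTE_ORDER and DEMO_BIG_ENDIAN are hypothetical names that are deliberately left undefined, standing in for BYTE_ORDER and BIG_ENDIAN when <sys/endian.h> is missing, and read_le16() and swap16() are illustrative helpers only. It shows that the #if test silently selects the "big-endian" branch, what the resulting swap does to a 2-byte offset, and a byte-wise little-endian read that does not depend on any macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro names, deliberately never defined. In an #if
 * expression, identifiers that are not macros are replaced with 0,
 * so this test reads "0 == 0" and the first branch is selected. */
#if DEMO_BYTE_ORDER == DEMO_BIG_ENDIAN
#define DEMO_TOOK_BIG_ENDIAN_BRANCH 1
#else
#define DEMO_TOOK_BIG_ENDIAN_BRANCH 0
#endif

/* Reports which preprocessor branch was selected above. */
int took_big_endian_branch(void)
{
    return DEMO_TOOK_BIG_ENDIAN_BRANCH;
}

/* Swaps the two bytes of a 16-bit value, as a wrongly selected
 * big-endian branch would do to a little-endian offset field. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reads a 16-bit little-endian value one byte at a time. This gives
 * the same result on any host and needs no byte-order macro. */
uint16_t read_le16(const unsigned char *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

Compiling with a warning such as -Wundef (in GCC and Clang) flags the undefined identifiers in the #if line, which would have pointed at the missing header.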
Incredibly, grub2 is unmaintained...
A commit references this bug:

Author: marino
Date: Sun Jul 27 18:13:32 UTC 2014
New revision: 363087
URL: http://svnweb.freebsd.org/changeset/ports/363087

Log:
  sysutils/grub2: Fix wrong lz4 endianness and general port cleanup

  Due to lack of inclusion of <sys/endian.h>, the lz4 code incorrectly
  assumes a big-endian system. The resulting issues manifest with errors
  like "error: no such device: <pool id>." and "lz4 decompression failed"
  at the grub prompt.

  Modify existing patch to add <sys/endian.h>.

  While here, simplify the port with OPTIONS_SUB framework and fix the
  manpage stuff on the options which apparently has been broken since
  this unmaintained port was staged.

  PR: 192066
  Submitted by: Andrey Zholos

Changes:
  head/sysutils/grub2/Makefile
  head/sysutils/grub2/files/patch-grub-2.00-zfs-feature-flag-support
  head/sysutils/grub2/pkg-plist
Thanks!