| Summary: | [zfs] Panic during data access or scrub on 12.0-STABLE r343904 (blkptr at <addr> DVA 0 has invalid OFFSET) | | |
|---|---|---|---|
| Product: | Base System | Reporter: | Sergey Anokhin <admin> |
| Component: | kern | Assignee: | Eugene Grosbein <eugen> |
| Status: | Closed Feedback Timeout | | |
| Severity: | Affects Only Me | CC: | avos, eugen, rgrimes |
| Priority: | --- | Keywords: | crash |
| Version: | 12.0-STABLE | | |
| Hardware: | amd64 | | |
| OS: | Any | | |
**Description**

**Sergey Anokhin**, 2019-02-12 08:19:25 UTC
**Andriy Voskoboinyk** (comment #2):
Please do not put bugs on stable@, current@, hackers@, etc. Strange, I have assigned it to fs@ ...

**Andriy Gapon** (comment #3, in reply to comment #0):
I think that your best bet is finding the directory that triggers the panic and then moving your data, excluding that directory, to a new pool. You can try to find the directory name in a debugger (e.g., poking around frame 36) or empirically. Unfortunately, you have on-disk data corruption and there is no easy way to fix it (a hard way would be to learn the ZFS on-disk format, find the bad bits, and somehow fix or clear them using a disk/hex editor). There is no obvious software bug, as far as I can tell, so this report is not actionable.

**Sergey Anokhin** (comment #4, in reply to comment #3):
Yes, I agree with you about finding and removing the broken directories, but in any case a kernel panic is bad.

**Rodney W. Grimes** (comment #5, in reply to comments #2 and #0):
From a review of the history, it was placed on stable@ by the submitter when it was created, before you assigned it to fs@. Please do not add stable@ or current@ to the cc: list when creating bugs; this is not a place we wish to have emails from Bugzilla going. Thanks, rgrimes@

**Sergey Anokhin** (in reply to comment #5):
Ok. Thank you for the tip. My apologies... I've noticed some additional info: the kernel panic occurs when trying to move files from one directory to another with a Perl script. The destination directory contains many millions of files. Perhaps this can be helpful.

**Eugene Grosbein** (comment #8, in reply to comment #3):
The inability to repair file system consistency is a bug. Yes, UFS can panic too when it finds an inconsistency, but it has fsck to bring the file system back to a consistent state. ZFS should have a means to fix consistency too, even at the cost of some user data loss in the case of a non-redundant pool.

**Andriy Gapon** (in reply to comment #4):
It's bad, but it's not a bug (not an actionable one, in any case); it's data corruption.
**Andriy Gapon** (comment #9, in reply to comment #8):
I agree. But such a tool will not magically appear.

**Eugene Grosbein** (comment #10, in reply to comment #9):
ZFS already has "zpool scrub". Isn't it the job of scrub to repair consistency? Now it panics too.

**Sergey Anokhin** (comment #11, in reply to comment #9):
Shouldn't scrub eliminate it? Instead, it causes a reboot loop with kernel panics.

**Andriy Gapon** (comment #13, in reply to comments #10 and #11):
ZFS scrub repairs only blocks with bad checksums or otherwise unreadable blocks. ZFS scrub does not analyze, and thus does not fix, any logical inconsistencies. If bad data has a correct checksum, then at present ZFS cannot fix it. Sometimes it can recognize that the data is bad and report an error, sometimes it has no option but to panic, and sometimes it cannot even tell that the data is bad. If you need to recover your data, then the following patch _may_ help you read the pool and move the data elsewhere: https://people.freebsd.org/~avg/zfs-skip-bad-bp-read.diff However, the patch cannot fix the on-disk data, so the pool will remain as it is.

**Sergey Anokhin** (comment #14, in reply to comment #13):
Will the patch help to remove the broken data without a kernel panic?

**Andriy Gapon** (in reply to comment #14):
Most likely no. I suggest re-creating the pool.

Feedback timeout.

^Triage: with closure, there should be assignment to a person.
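The scrub limitation described in comment #13 — scrub repairs blocks whose checksums fail verification but is blind to logically bad data written with a valid checksum — can be illustrated with a toy model. This is illustrative Python, not ZFS code (ZFS actually stores fletcher or SHA-256 checksums in its block pointers):

```python
# Toy model of checksum-based scrubbing (illustrative only, not ZFS code).
# A scrub detects a block whose stored checksum no longer matches its data
# (bit rot), but a block whose bad contents were *written* together with a
# matching checksum looks perfectly healthy to it.
import hashlib

def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def scrub(blocks):
    """Return indices of blocks whose data fails checksum verification."""
    return [i for i, (data, cksum) in enumerate(blocks)
            if checksum(data) != cksum]

good = b"valid metadata"
rotted = (b"vali\x00 metadata", checksum(good))   # bit flipped after the
                                                  # checksum was stored
bad = b"blkptr with invalid OFFSET"               # logically bad data, but...
poisoned = (bad, checksum(bad))                   # ...checksummed as written

blocks = [(good, checksum(good)), rotted, poisoned]
print(scrub(blocks))  # prints [1]: only the bit-rotted block is flagged
```

The "poisoned" block models this bug's situation: the invalid block pointer passes checksum verification, so scrub cannot repair it and the kernel only discovers the problem when it tries to use the value.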