Bug 263885

Summary: fsck fails with UFS journaled soft-updates filesystem under ESXI 7.0u2a due to 13.0 increased kern.maxphys
Product: Base System Reporter: Russell.Yount
Component: kern Assignee: Alexander Motin <mav>
Status: Closed FIXED    
Severity: Affects Some People CC: Russell.Yount, chris, edavidjanousek, fs, grahamperrin, imp, mav, nvass, ronald-lists, ronald
Priority: ---    
Version: 13.0-RELEASE   
Hardware: amd64   
OS: Any   
URL: https://cgit.freebsd.org/src/commit/sys/dev/vmware/pvscsi/pvscsi.c?h=stable/13&id=f51c1d1dd595ce51059489d7e1248ff6ba39664a
Attachments:
Description Flags
script to trigger the bug none
VMWare Screenshot of error none

Description Russell.Yount 2022-05-09 18:23:22 UTC
On FreeBSD 13.0-RELEASE-p11 under VMWARE ESXI 7.0U2a

fsck run on a UFS filesystem with journaled soft-updates enabled cannot access the journal file.

When fsck fails to access .sujournal, the kernel messages show:

May  9 13:38:50 nas3 kernel: pvscsi0: pvscsi_execute_ccb error 27
May  9 13:38:50 nas3 syslogd: last message repeated 1 times
May  9 13:38:50 nas3 kernel: (da1:pvscsi0:0:1:0): READ(10). CDB: 28 00 00 00 a9 c0 00 04 00 00
May  9 13:38:50 nas3 kernel: (da1:pvscsi0:0:1:0): CAM status: The request was too large for this host
May  9 13:38:50 nas3 kernel: (da1:pvscsi0:0:1:0): Error 22, Unretryable error
May  9 13:38:50 nas3 kernel: (da1:pvscsi0:0:1:0): READ(10). CDB: 28 00 00 00 ad c0 00 04 00 00
May  9 13:38:50 nas3 kernel: (da1:pvscsi0:0:1:0): CAM status: The request was too large for this host
May  9 13:38:50 nas3 kernel: (da1:pvscsi0:0:1:0): Error 22, Unretryable error
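
For reference, the size of the failing reads can be worked out from the CDBs above. This is a minimal sketch in sh: in a READ(10) CDB, bytes 2-5 hold the starting LBA and bytes 7-8 hold the transfer length in blocks, both big-endian. The 512-byte block size is an assumption (the log does not state it), and the helper names are hypothetical.

```sh
# Decode fields of a READ(10) CDB given as ten hex bytes, the way CAM prints it.
# Assumption: 512-byte logical blocks; set BLOCK_SIZE if the disk differs.
read10_lba() {
    # CDB bytes 2-5 (arguments 3-6): starting LBA, big-endian
    echo $(( (0x$3 << 24) | (0x$4 << 16) | (0x$5 << 8) | 0x$6 ))
}

read10_bytes() {
    # CDB bytes 7-8 (arguments 8-9): transfer length in blocks, big-endian
    blocks=$(( (0x$8 << 8) | 0x$9 ))
    echo $(( blocks * ${BLOCK_SIZE:-512} ))
}

# First CDB from the log above
read10_lba   28 00 00 00 a9 c0 00 04 00 00
read10_bytes 28 00 00 00 a9 c0 00 04 00 00
```

Under that assumption each request is 1024 blocks, or 524288 bytes (512 KiB), which is larger than the 131072-byte (128 KiB) kern.maxphys value used in the workaround.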

Placing 
     kern.maxphys="131072"
in 
     /boot/loader.conf
remedies the problem with fsck.
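
The loader.conf change can also be scripted. This is a minimal sketch: only the tunable name, the value and the file path come from this report, and set_maxphys is a hypothetical helper name.

```sh
# Append the kern.maxphys tunable to a loader.conf-style file unless a
# kern.maxphys line is already present.
# $1: file to edit (default /boot/loader.conf), $2: value in bytes
set_maxphys() {
    conf=${1:-/boot/loader.conf}
    value=${2:-131072}
    if grep -q '^kern\.maxphys=' "$conf" 2>/dev/null; then
        echo "kern.maxphys is already set in $conf"
    else
        printf 'kern.maxphys="%s"\n' "$value" >> "$conf"
    fi
}
```

Run as root, "set_maxphys /boot/loader.conf 131072" followed by a reboot applies the setting. The helper leaves an existing kern.maxphys line untouched rather than adding a second one.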

Is this possibly a problem with the pvscsi.ko kernel module?
Comment 1 Russell.Yount 2022-05-21 00:45:55 UTC
This bug still exists in 13.1-RELEASE
Comment 2 edavidjanousek 2022-09-15 14:26:39 UTC
This is a very annoying bug. When installing TrueNAS-13.0-U1 on VMWARE ESXI 7.0U2, you will encounter it twice.

1. On installation (you won't be able to select disks). You need to tweak the .iso. My recommendation is to use a hex editor, find the text '# Boot loader file for TrueNAS.' and replace it with 'kern.maxphys="131072"' (fill the rest with spaces).
2. On adding disks to a pool (you won't be able to format them). This time it's easier: just edit /boot/loader.conf, add 'kern.maxphys="131072"' and restart.
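
The hex-editor step in item 1 can be approximated from a shell. This is a minimal sketch, not a tested TrueNAS procedure: patch_string is a hypothetical helper, it relies on grep accepting -a, -b and -o (GNU grep does), and it assumes the first occurrence of the string in the image is the one that matters. Work on a copy of the .iso.

```sh
# Overwrite the first occurrence of string $2 in file $1 with string $3,
# padded with spaces to the same length so the file size does not change.
patch_string() {
    file=$1 old=$2 new=$3
    if [ "${#new}" -gt "${#old}" ]; then
        echo "replacement is longer than the original string" >&2
        return 1
    fi
    # Byte offset of the first match, taken from grep's "offset:match" output
    offset=$(grep -a -b -o -F -- "$old" "$file" | head -n 1 | cut -d: -f1)
    if [ -z "$offset" ]; then
        echo "string not found in $file" >&2
        return 1
    fi
    # Left-justify the new text in a field as wide as the old text and
    # write it over the old bytes without truncating the file
    printf "%-${#old}s" "$new" |
        dd of="$file" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}
```

Example (the image file name is a placeholder): patch_string copy-of-installer.iso '# Boot loader file for TrueNAS.' 'kern.maxphys="131072"'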
Comment 3 Alexander Motin freebsd_committer freebsd_triage 2022-10-29 17:56:46 UTC
Unfortunately, the fix implemented more than a year ago was not merged to the stable/13 branch. I merged it recently after I hit the same issue.
Comment 4 Ronald Klop 2022-10-30 08:29:47 UTC
(In reply to Alexander Motin from comment #3)
Would you mind adding a URL to the commit to this issue?
Comment 6 nvass 2022-11-23 12:47:07 UTC
Created attachment 238278 [details]
script to trigger the bug

Hi,

I still see this on CURRENT. It gets easily triggered by gunion:
sh g_union.sh da1p1 da1p2

(Regarding NVMe:
sh g_union.sh nvd0p1 nvd0p2 deadlocks as well, printing "nvme0: nvme_payload_map err 27")

The script is attached.
Comment 7 Graham Perrin freebsd_committer freebsd_triage 2022-12-29 17:13:50 UTC
Triage: 

* assignment to the committer of f51c1d1dd595ce51059489d7e1248ff6ba39664a 
  (thanks for the then fix/resolution)

* CC author

* reopen, or make a new report for comment 6?
Comment 8 Christopher Brennan 2024-02-09 13:56:59 UTC
Created attachment 248282 [details]
VMWare Screenshot of error

My FreeBSD 13.1 installation (in VMWare) started experiencing this today. All of my VMWare guests had been offline for 2 months due to a host-system disk issue; I've since replaced the faulty disk and rebuilt/repaired the storage pool. Today was the first time I had started to spin these images up again. My Debian image came back up fine, but FreeBSD dumped me into single-user mode. fsck during boot failed with this error. When I then ran fsck in single-user mode, it marked the system clean but still failed with this error, and if I reboot, it just goes through the same failure again.
Comment 9 Ronald Klop freebsd_committer freebsd_triage 2024-02-09 14:54:37 UTC
(In reply to Christopher Brennan from comment #8)

You don't really pose a question, so I don't know if you want a reply.

But the fix for your problem is in FreeBSD 13.2.

A workaround for 13.1 and 13.0 is mentioned in comment #0.
Set kern.maxphys="131072" while booting.
And put this setting in /boot/loader.conf to persist it for reboots.

You can remove the setting from /boot/loader.conf after upgrading to 13.2.
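
Removing the setting after the upgrade can be scripted in the same way. This is a minimal sketch, with unset_maxphys as a hypothetical helper name. It drops only lines that start with kern.maxphys= and leaves the rest of the file alone.

```sh
# Remove any kern.maxphys line from a loader.conf-style file.
# $1: file to edit (default /boot/loader.conf)
unset_maxphys() {
    conf=${1:-/boot/loader.conf}
    [ -f "$conf" ] || return 0
    scratch=$(mktemp) || return 1
    # grep -v exits 1 when no lines remain; that is not an error here
    grep -v '^kern\.maxphys=' "$conf" > "$scratch" || true
    # Rewrite in place so the file keeps its owner and mode
    cat "$scratch" > "$conf"
    rm -f "$scratch"
}
```

Run as root after the system is on 13.2: unset_maxphys /boot/loader.conf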

According to the commit mentioned in this PR, kern.maxphys="262144" might also work. This is the number that 13.2 uses for pvscsi.

I hope this helps.