Summary:        UFS "cylinder checksum failed" on temporary storage or data disk on arm64 VM in Azure
Product:        Base System
Reporter:       Li-Wen Hsu <lwhsu>
Component:      kern
Assignee:       Li-Wen Hsu <lwhsu>
Status:         Closed FIXED
Severity:       Affects Only Me
CC:             Andrew, andrew, arrowd, brd, emaste, grahamperrin, kevans, marklmi26-fbsd, mckusick, schakrabarti, weh, whu
Priority:       ---
Version:        CURRENT
Hardware:       arm64
OS:             Any
Bug Depends on:
Bug Blocks:     271607
Attachments:
Description
Li-Wen Hsu
2022-11-08 23:27:07 UTC
dmesg: https://gist.github.com/lwhsu/31b70e6a7f4e8e35707093a1b570f88c

Errors are logged like:

UFS /dev/da1p1 (/mnt/resource) cylinder checksum failed: cg 0, cgp: 0x9b739572 != bp: 0xfff99985
UFS /dev/da1p1 (/mnt/resource) cylinder checksum failed: cg 1, cgp: 0x2a6a014 != bp: 0x3e3c2145
UFS /dev/da1p1 (/mnt/resource) cylinder checksum failed: cg 3, cgp: 0x55b3a281 != bp: 0x692923d0

The mounted UFS is not writable:

lwhsu@arm64test:/mnt/resource $ sudo touch a
/mnt/resource: create/symlink failed, no inodes free
touch: a: No space left on device

I have tried the following scenarios:
- Using UFS on the "OS disk" (da0): no problem
- Using ZFS on da1: no problem
- Using UFS on md(4) backed by a file on the OS disk: no problem
- Adding another data disk (da2): same error

Souradeep and Wei, can you help check whether you can reproduce this under Hyper-V on a local arm64 machine? Is there any difference between the OS disk and a data disk? Thanks!

Kirk, is there anything you would suggest for digging further into this problem? Thanks!

(In reply to Li-Wen Hsu from comment #2)
Wei faced a similar problem during bring-up. This is from his mail:

> We are trying to bring up FreeBSD on ARM64 Hyper-V. We are seeing the
> iso image booted up fine and we can install the root disk with ZFS.
> However, when choosing to install the root disk with UFS, we are
> seeing UFS checksum errors as below and the installation just failed.
>
> ...
> UFS /dev/da0p2 (/mnt) cylinder checksum failed: cg 41, cgp: 0x45721f90 != bp: 0x23b0cc8
> UFS /dev/da0p2 (/mnt) cylinder checksum failed: cg 42, cgp: 0xbb1ba737 != bp: 0xfc52b46f
> UFS /dev/da0p2 (/mnt) cylinder checksum failed: cg 43, cgp: 0x12671d05 != bp: 0x552e0e5d
> UFS /dev/da0p2 (/mnt) cylinder checksum failed: cg 44, cgp: 0x4224a088 != bp: 0x56db3d0
> UFS /dev/da0p2 (/mnt) cylinder checksum failed: cg 45, cgp: 0xea211974 != bp: 0x913217c9
> UFS /dev/da0p2 (/mnt) cylinder checksum failed: cg 0, cgp: 0xae78ec43 != bp: 0x3467ffcc
> UFS /dev/da0p2 (/mnt) cylinder checksum failed: cg 1, cgp: 0x7fc7e750 != bp: 0x388ef408
> UFS /dev/da0p2 (/mnt) cylinder checksum failed: cg 2, cgp: 0x81ae5ff7 != bp: 0xc6e74caf
>
> /mnt: create/symlink failed, no inodes free

(In reply to schakrabarti@microsoft.com from comment #4)
Thanks. Please note that UFS currently works fine as the OS disk (da0); is there anything different between the OS disk and the data disks? And did you apply any UFS fix for the OS disk during bring-up?

(In reply to Li-Wen Hsu from comment #5)
We never touched UFS during the entire ARM64 bring-up effort on Hyper-V. I hit this problem when trying to install the OS from the iso image. I am not sure whether the problem exists on the OS disk, as I couldn't finish installing the OS when choosing UFS as the root filesystem.

Is it possible to run fsck_ffs on the filesystem? If so, it will fix the bad checksums and you should then be good to go. Can you run tunefs on the filesystem? If so, I could easily add an option to disable cylinder group checksums.

(In reply to Kirk McKusick from comment #7)
> Is it possible to run fsck_ffs on the filesystem? If so, it will fix the bad checksums and you should then be good to go.

Thanks! Running fsck_ffs does fix the issue. I attached the log at https://gist.github.com/lwhsu/76e78a680df5f8107039c1a293ec2e42

> Can you run tunefs on the filesystem?
> If so, I could easily add an option to disable cylinder group checksums.

Yes, we can run tunefs on this filesystem. Moreover, this disk was partitioned and formatted after boot (see the log above). It is odd that a filesystem newly created with gpart and newfs does not behave like da0 (the OS disk) or md(4) devices backed by a mapped file. Is it safe to just disable checking of cylinder group checksums? It still seems wrong to require users to manually turn off cylinder group checksums after every newfs run.

Newfs generates the check hashes using the libufs cgwrite(3) routine. The cgwrite(3) routine calls calculate_crc32c(), which comes from sys/libkern/gsb_crc32.c. The kernel check also uses calculate_crc32c() from sys/libkern/gsb_crc32.c, so they should be getting the same answer. If the structures were somehow laid out differently that might explain it, but then a lot of other things would also break. I also note that the same calculate_crc32c() function is used for the superblock checksum and the inode checksum, both of which work fine, which makes it even more of a mystery why the cylinder group checksums are broken.

Another option is to disable cylinder group checksums in newfs(8) using something like this patch:

diff --git a/sbin/newfs/mkfs.c b/sbin/newfs/mkfs.c
index 48091d7882d0..5000adff138f 100644
--- a/sbin/newfs/mkfs.c
+++ b/sbin/newfs/mkfs.c
@@ -499,8 +499,10 @@ mkfs(struct partition *pp, char *fsys)
 	 */
 	if (Oflag > 1) {
 		sblock.fs_flags |= FS_METACKHASH;
+#ifndef AZURE
 		if (getosreldate() >= P_OSREL_CK_CYLGRP)
 			sblock.fs_metackhash |= CK_CYLGRP;
+#endif
 		if (getosreldate() >= P_OSREL_CK_SUPERBLOCK)
 			sblock.fs_metackhash |= CK_SUPERBLOCK;
 		if (getosreldate() >= P_OSREL_CK_INODE)

I tried newfs on a partition directly on da0 (the OS disk); it works and shows no checksum errors. I think this means there must be something different between the OS disk (da0) and the data disks (da1, da2, ...).
I guess that what newfs wants to write and what actually lands on the disk may differ. Is there any easier way to debug this, or do we need to modify newfs to print out the data before writing?

I do not have any better idea than the debugging approach you are suggesting. Of interest is verifying that the seek offset is correct and that the checksum of what is written matches what ends up on the disk. Note that the checksum is calculated and the block written in the libufs routine lib/libufs/cgroup.c; see the functions cgwrite(), cgwrite1(), and cgput(), where the work is actually done.

I bumped into this problem on my RockPro64 with an SSD attached via an adapter. The symptoms are all the same:

# OK
newfs /dev/nda0
# OK
mount /dev/nda0 /mnt
# spits a ton of "cylinder checksum failed" messages
echo asdasd > /mnt/asd

If I do instead:

newfs /dev/nda0
# fixes a ton of cylinder groups
fsck_ffs -y /dev/nda0
mount /dev/nda0 /mnt
# OK
echo asdasd > /mnt/asd

(In reply to Gleb Popov from comment #12)
Forgot to mention, I'm running 13.1-RELEASE.

Is this still an unresolved issue? If so, I am not sure how to reproduce it, which I need to do to try to find a fix for it.

(In reply to Kirk McKusick from comment #14)
I can check again whether this is still the case for me, as the RockPro is currently being used in production.

(In reply to Kirk McKusick from comment #14)
We are still seeing this on a UFS data disk on ARM64 Hyper-V guests in Azure. I wonder whether Li-Wen (FreeBSD Foundation) can provide you access to a VM in Azure for further debugging.

Created attachment 242385 [details]
adding printf to newfs(1)
I pointed this bug out to Andrew Gierth, and he noted that it sounds a bit like an alignment issue; the varying printfs you were experimenting with could certainly have tweaked the alignment of different buffers. Slapping d_sbunion and d_cgunion in libufs' struct uufsd with an `__attribute__ ((aligned(512)))` seems to be sufficient to alleviate the bug, though that's not really a good long-term solution. Presumably there's something related/hinky in storvsc or vmbus.

(In reply to Kyle Evans from comment #18)
There are some test programs in bug #271766, and in the commits that fix it, that could perhaps be adapted to help narrow down the failure without needing to involve filesystem code.

*** Bug 272719 has been marked as a duplicate of this bug. ***

None of the comments give enough context to replicate the same partition start and size, or other such details, to try newfs with for replication attempts. Comment #12 indicates a replication without needing Azure involved, but again without enough detail to make an approximate simulation of the failure context. It would be nice to have details from a known example that one could use to attempt a replication; in my context that means avoiding Azure while still reasonably replicating the other aspects. Details from Azure contexts that do not require Azure could well serve as a basis for testing.

On a local arm64 Hyper-V instance on a Windows Dev Kit 2023 I tried the test program from bug #271766. I see a difference with a stride size of 32, but not when it's 64.
I will often, but not always, receive data similar to the following with a stride of 32:

% sudo ./a.out /dev/da1 32
Discrepancies found with page offset 32:
found at 1fe5: 1f ff ff
expected 1fe5: 00 00 00
found at 1fea: 02
expected 1fea: 00
found at 1fed: 03 80
expected 1fed: 00 00
Discrepancies found with page offset 96:
found at 1fa5: 1f ff ff
expected 1fa5: 00 00 00
found at 1faa: 02
expected 1faa: 00
found at 1fad: 03 80
expected 1fad: 00 00
Discrepancies found with page offset 160:
found at 1f65: 1f ff ff
expected 1f65: 00 00 00
found at 1f6a: 02
expected 1f6a: 00
found at 1f6d: 03 80
expected 1f6d: 00 00
Discrepancies found with page offset 224:
found at 1f25: 1f ff ff
expected 1f25: 00 00 00
found at 1f2a: 02
expected 1f2a: 00
found at 1f2d: 03 80
expected 1f2d: 00 00
...

(In reply to Andrew Turner from comment #22)
That test program isn't ideal in this case (it could be adapted to be far more comprehensive), but those results do show that (a) corruption is indeed occurring, in this case on reads, and (b) those corruptions consistently happen just after hitting a page boundary.

I think this is due to how the Hyper-V storvsc driver is using bus_dma. It looks like it needs physically contiguous memory, which may not be the case when bus_dma needs to create a bounce buffer. This can happen when the destination is not cache-line aligned. A workaround is to pass the BUS_DMA_COHERENT flag to bus_dma_tag_create; however, this should only be done when we know the bus is cache-coherent. It would be better to move this to the vmbus driver and implement bus_get_dma_tag. Under ACPI we can check the _CCA value to set this flag, but due to how the vmbus driver attaches we don't have the ACPI handle to get this.
(In reply to Andrew Turner from comment #24)
I have tested with the suggested change:

diff --git a/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c b/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
index dff2ee709bed..7c908495c6c3 100644
--- a/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
+++ b/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
@@ -949,9 +949,14 @@ storvsc_init_requests(device_t dev)
 	struct storvsc_softc *sc = device_get_softc(dev);
 	struct hv_storvsc_request *reqp;
 	int error, i;
+	unsigned int dma_flag = 0;

 	LIST_INIT(&sc->hs_free_list);

+#if defined(__aarch64__)
+	dma_flag |= BUS_DMA_COHERENT;
+#endif
+
 	error = bus_dma_tag_create(
 		bus_get_dma_tag(dev),		/* parent */
 		1,				/* alignment */
@@ -962,7 +967,7 @@ storvsc_init_requests(device_t dev)
 		STORVSC_DATA_SIZE_MAX,		/* maxsize */
 		STORVSC_DATA_SEGCNT_MAX,	/* nsegments */
 		STORVSC_DATA_SEGSZ_MAX,		/* maxsegsize */
-		0,				/* flags */
+		dma_flag,			/* flags */
 		NULL,				/* lockfunc */
 		NULL,				/* lockfuncarg */
 		&sc->storvsc_req_dtag);

But it has not solved the issue:

UFS /dev/da1p1 (/datadrive) cylinder checksum failed: cg 811, cgp: 0x4a9c4e5 != bp: 0xb51bda9f
UFS /dev/da1p1 (/datadrive) cylinder checksum failed: cg 812, cgp: 0x54ea7968 != bp: 0xe5586712
UFS /dev/da1p1 (/datadrive) cylinder checksum failed: cg 813, cgp: 0xfd96c35a != bp: 0x4c24dd20
UFS /dev/da1p1 (/datadrive) cylinder checksum failed: cg 814, cgp: 0x3ff7bfd != bp: 0xb24d6587
UFS /dev/da1p1 (/datadrive) cylinder checksum failed: cg 815, cgp: 0xaa83c1cf != bp: 0x1b31dfb5
UFS /dev/da1p1 (/datadrive) cylinder checksum failed: cg 816, cgp: 0x1008f9ad != bp: 0xa1bae7d7
UFS /dev/da1p1 (/datadrive) cylinder checksum failed: cg 817, cgp: 0xb974439f != bp: 0x8c65de5
UFS /dev/da1p1 (/datadrive) cylinder checksum failed: cg 818, cgp: 0x97f71033 != bp: 0x8c2b9d5e
/datadrive: create/symlink failed, no inodes free
[schakrabarti@freebsd-arm-schakrabarti /datadrive]$
Aug 8 09:18:18 freebsd-arm-schakrabarti kernel: pid 961 (touch), uid 1001 inumber 2 on /datadrive: out of inodes
^C
[schakrabarti@freebsd-arm-schakrabarti /datadrive]$ uname -a
FreeBSD freebsd-arm-schakrabarti 14.0-CURRENT FreeBSD 14.0-CURRENT aarch64 1400093 #0 main-n264325-65f8467e3351-dirty: Tue Aug 8 08:43:39 UTC 2023 root@freebsd-arm-schakrabarti:/usr/obj/home/schakrabarti/sandbox/arm64.aarch64/sys/GENERIC arm64

(In reply to schakrabarti@microsoft.com from comment #25)
I will implement vmbus_get_dma_tag() and will share the patch.

(In reply to Andrew Turner from comment #24)
I'm tempted to submit a patch to dd(1) to give it options to use intentionally misaligned buffers, so that we have a tool for more convenient testing of this stuff. (Between this bug and the crypto one, obviously more testing is needed.) Any interest?

(In reply to Andrew Turner from comment #24)
Hi Andrew, Hyper-V already has _CCA in ACPI:

Device (\_SB.VMOD.VMBS)
{
    Name (STA, 0x0F)
    Name (_ADR, Zero)  // _ADR: Address
    Name (_CCA, One)  // _CCA: Cache Coherency Attribute
    Name (_DDN, "VMBUS")  // _DDN: DOS Device Name
    Name (_HID, "VMBus")  // _HID: Hardware ID
    Name (_UID, Zero)  // _UID: Unique ID
    Method (_DIS, 0, NotSerialized)  // _DIS: Disable Device
    {
        STA &= 0x0D
    }

So, for testing purposes I set the storvsc bus_dma_tag as BUS_DMA_COHERENT, but the problem still happens.
diff --git a/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c b/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
index dff2ee709bed..7c908495c6c3 100644
--- a/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
+++ b/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
@@ -949,9 +949,14 @@ storvsc_init_requests(device_t dev)
 	struct storvsc_softc *sc = device_get_softc(dev);
 	struct hv_storvsc_request *reqp;
 	int error, i;
+	unsigned int dma_flag = 0;

 	LIST_INIT(&sc->hs_free_list);

+#if defined(__aarch64__)
+	dma_flag |= BUS_DMA_COHERENT;
+#endif
+
 	error = bus_dma_tag_create(
 		bus_get_dma_tag(dev),		/* parent */
 		1,				/* alignment */
@@ -962,7 +967,7 @@ storvsc_init_requests(device_t dev)
 		STORVSC_DATA_SIZE_MAX,		/* maxsize */
 		STORVSC_DATA_SEGCNT_MAX,	/* nsegments */
 		STORVSC_DATA_SEGSZ_MAX,		/* maxsegsize */
-		0,				/* flags */
+		dma_flag,			/* flags */
 		NULL,				/* lockfunc */
 		NULL,				/* lockfuncarg */
 		&sc->storvsc_req_dtag);

Can you help me understand why bus_get_dma_tag() in vmbus is required, given that setting the flag in storvsc does not solve the checksum issue? Also, if we do need to implement it, where should we get the parameters for bus_dma_tag_create() in vmbus?

My 8 GiByte RPi4B v1.5 boot logs show:

Aug 11 13:37:06 generic kernel: bcm2835_cpufreq0: <CPU Frequency Control> on cpu0
Aug 11 13:37:06 generic kernel: bcm2835_cpufreq0: Unable to find firmware device
Aug 11 13:37:06 generic kernel: device_attach: bcm2835_cpufreq0 attach returned 6

That is this code, which has not changed:

static int
bcm2835_cpufreq_attach(device_t dev)
{
	struct bcm2835_cpufreq_softc *sc;
	struct sysctl_oid *oid;

	/* set self dev */
	sc = device_get_softc(dev);
	sc->dev = dev;
	sc->firmware = devclass_get_device(
	    devclass_find("bcm2835_firmware"), 0);
	if (sc->firmware == NULL) {
		device_printf(dev, "Unable to find firmware device\n");
		return (ENXIO);
	}
	. . .

This looks to me like devclass_find is internal FreeBSD infrastructure about its own internals, not RPi*-specific.
If so, the problem seems likely more general than RPi*'s.

(In reply to Mark Millard from comment #29)
IGNORE: wrong bugzilla entry. Sorry for the noise. No ability to delete.

Created attachment 244092 [details]
Disk alignment test
bus_get_dma_tag is needed because the parent bus may have restrictions on the DMA of a child, e.g. to limit the usable memory. To know whether the BUS_DMA_COHERENT flag can be used in an ACPI attachment you can read _CCA. There is an example of reading it in sys/dev/pci/pci_host_generic_acpi.c.

(In reply to Andrew Turner from comment #32)
I have made the change and tested with the test code; there is no bad alignment on either the boot disk or the data disk, but the UFS checksum failure still happens.

# ./a.out /dev/da1 bad
# ./a.out /dev/da bad /dev/da0 /dev/da0p1 /dev/da0p2 /dev/da1 /dev/da1p1
# ./a.out /dev/da bad /dev/da0 /dev/da0p1 /dev/da0p2 /dev/da1 /dev/da1p1
# ./a.out /dev/da0 bad
#
# geom disk list
Geom name: da0
Providers:
1. Name: da0
   Mediasize: 32246857728 (30G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r2w2e6
   descr: Msft Virtual Disk
   lunname: 4d534654202020208bc8faba34d8cb468624ca4d553eaac1
   lunid: 600224808bc8faba34d8ca4d553eaac1
   ident: (null)
   rotationrate: unknown
   fwsectors: 63
   fwheads: 255

Geom name: da1
Providers:
1. Name: da1
   Mediasize: 536870912000 (500G)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e2
   descr: Msft Virtual Disk
   lunname: 4d534654202020201c9c312fabf3254f8826b1ed5d388e7a
   lunid: 600224801c9c312fabf3b1ed5d388e7a
   ident: (null)
   rotationrate: unknown
   fwsectors: 63
   fwheads: 255

Created attachment 244186 [details]
vmbus bus_get_dma_tag change
Created attachment 244220 [details]
patch to libufs for Azure

As suggested in comment #18, this patch to the structure that libufs uses to hold disk blocks works around a bug in Azure where it is unable to correctly read disk blocks into buffers that are not aligned to 512-byte boundaries.

(In reply to schakrabarti@microsoft.com from comment #33)
See my comment #35 for a workaround for the bug in Azure. Note that you must apply the patch and then run `make buildworld' so that libufs and all the programs that use it get recompiled with the aligned disk buffers. Ideally the problem should be fixed in Azure, as the POSIX standard requires disk I/O to be possible with unaligned buffers.

(In reply to Kirk McKusick from comment #36)
I have tried the patch mentioned in comment #35, but the problem persists. Also, this problem is only seen on arm64, not on x86, when trying to touch a file in da1p1. Even after the bus_get_dma_tag patch and the libufs patch, the error still happens:

UFS /dev/da1p1 (/datadrive) cylinder checksum failed: cg 816, cgp: 0x1008f9ad != bp: 0xa1bae7d7
UFS /dev/da1p1 (/datadrive) cylinder checksum failed: cg 817, cgp: 0xb974439f != bp: 0x8c65de5
UFS /dev/da1p1 (/datadrive) cylinder checksum failed: cg 818, cgp: 0x97f71033 != bp: 0x8c2b9d5e
pid 1013 (touch), uid 1001 inumber 2 on /datadrive: out of inodes

Please ignore my previous comment #38 and comment #37. It seems the patch works; I had to create the partition and format it with UFS again for the patch to take effect. Thanks. Please commit it.

(In reply to schakrabarti@microsoft.com from comment #39)
Thanks for confirming that the patch works. I will need to run it through the FreeBSD Phabricator review process. I will include you among the reviewers so that you can answer any Azure questions that arise.

(In reply to Kirk McKusick from comment #40)
Hi Kirk, can you please let me know when you are planning to put it in Phabricator?
Created attachment 244573 [details]
revised proposed fix

(In reply to schakrabarti@microsoft.com from comment #41)
Sorry for the delay; I have been on vacation and not keeping up with work email. An offline comment I received was that using 512-byte alignment is probably overkill and that 64-byte alignment should be sufficient. To that end, could you please test the attached revised patch to see whether it also solves your issue?

(In reply to Kirk McKusick from comment #42)
I have tested with 64-byte alignment and it worked.

How are you testing? I'm not seeing any issues on either a local arm64 Hyper-V or an Azure instance.

(In reply to schakrabarti@microsoft.com from comment #43)
See https://reviews.freebsd.org/D41724 for the full fix. Please confirm that it resolves the issue in Azure.

(In reply to Andrew Turner from comment #44)
The vmbus bus_get_dma_tag patch also solves the issue, and the libufs change solves it as well.

(In reply to Andrew Turner from comment #44)
I have raised it as https://reviews.freebsd.org/D41728

(In reply to Kirk McKusick from comment #45)
It looks like https://reviews.freebsd.org/D41728 solves the problem. It also fixes the panic during reboot from an add_route page fault: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272666

(In reply to schakrabarti@microsoft.com from comment #48)
Does https://reviews.freebsd.org/D41728 solving the problem mean that https://reviews.freebsd.org/D41724 is no longer needed?

(In reply to Kirk McKusick from comment #49)
With https://reviews.freebsd.org/D41728 applied, the UFS checksum error no longer appears. We don't need https://reviews.freebsd.org/D41724 for the UFS checksum failure from the Azure ARM64 perspective.

(In reply to Kirk McKusick from comment #49)
Yes, I have tested in Azure and can confirm that https://reviews.freebsd.org/D41728 and https://reviews.freebsd.org/D41724 each solve the observed problem independently.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e7a9817b8d328dda04069b65944ce2ed6f54c6f0

commit e7a9817b8d328dda04069b65944ce2ed6f54c6f0
Author:     Souradeep Chakrabarti <schakrabarti@microsoft.com>
AuthorDate: 2023-09-14 07:11:25 +0000
Commit:     Wei Hu <whu@FreeBSD.org>
CommitDate: 2023-09-14 07:11:25 +0000

    Hyper-V: vmbus: implementat bus_get_dma_tag in vmbus

    In ARM64 Hyper-V UFS filesystem is getting corruption and those
    corruptions are consistently happening just after hitting a page
    boundary. It is unable to correctly read disk blocks into buffers
    that are not aligned to 512-byte boundaries. It happens because
    storvsc needs physically contiguous memory which may not be the
    case when bus_dma needs to create a bounce buffer. This can happen
    when the destination is not cache-line aligned.

    Hyper-V VMs have VMbus synthetic devices and PCI pass-thru devices
    that are added dynamically via the VMbus protocol and are not
    represented in the ACPI DSDT. Only the top level VMbus node exists
    in the DSDT. As such, on ARM64 these devices don't pick up
    coherence information and default to not hardware coherent.

    PR:             267654, 272666
    Reviewed by:    andrew, whu
    Tested by:      lwhsu
    MFC after:      3 days
    Sponsored by:   Microsoft
    Differential Revision: https://reviews.freebsd.org/D41728

 sys/dev/hyperv/vmbus/vmbus.c     | 33 +++++++++++++++++++++++++++++++++
 sys/dev/hyperv/vmbus/vmbus_var.h |  1 +
 2 files changed, 34 insertions(+)

A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=85bc81352e4b0d0a9da251bacec35eec130eee49

commit 85bc81352e4b0d0a9da251bacec35eec130eee49
Author:     Souradeep Chakrabarti <schakrabarti@microsoft.com>
AuthorDate: 2023-09-14 07:11:25 +0000
Commit:     Wei Hu <whu@FreeBSD.org>
CommitDate: 2023-09-18 10:26:59 +0000

    Hyper-V: vmbus: implementat bus_get_dma_tag in vmbus

    In ARM64 Hyper-V UFS filesystem is getting corruption and those
    corruptions are consistently happening just after hitting a page
    boundary. It is unable to correctly read disk blocks into buffers
    that are not aligned to 512-byte boundaries. It happens because
    storvsc needs physically contiguous memory which may not be the
    case when bus_dma needs to create a bounce buffer. This can happen
    when the destination is not cache-line aligned.

    Hyper-V VMs have VMbus synthetic devices and PCI pass-thru devices
    that are added dynamically via the VMbus protocol and are not
    represented in the ACPI DSDT.
    Only the top level VMbus node exists in the DSDT. As such, on ARM64
    these devices don't pick up coherence information and default to not
    hardware coherent.

    PR:             267654, 272666
    Reviewed by:    andrew, whu
    Tested by:      lwhsu
    MFC after:      3 days
    Sponsored by:   Microsoft
    Differential Revision: https://reviews.freebsd.org/D41728

    (cherry picked from commit e7a9817b8d328dda04069b65944ce2ed6f54c6f0)

 sys/dev/hyperv/vmbus/vmbus.c     | 33 +++++++++++++++++++++++++++++++++
 sys/dev/hyperv/vmbus/vmbus_var.h |  1 +
 2 files changed, 34 insertions(+)

A commit in branch releng/14.0 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=48a799af88c0c77963ffb23eba797f85f86751dd

commit 48a799af88c0c77963ffb23eba797f85f86751dd
Author:     Souradeep Chakrabarti <schakrabarti@microsoft.com>
AuthorDate: 2023-09-14 07:11:25 +0000
Commit:     Wei Hu <whu@FreeBSD.org>
CommitDate: 2023-09-18 14:57:57 +0000

    Hyper-V: vmbus: implementat bus_get_dma_tag in vmbus

    In ARM64 Hyper-V UFS filesystem is getting corruption and those
    corruptions are consistently happening just after hitting a page
    boundary. It is unable to correctly read disk blocks into buffers
    that are not aligned to 512-byte boundaries. It happens because
    storvsc needs physically contiguous memory which may not be the
    case when bus_dma needs to create a bounce buffer. This can happen
    when the destination is not cache-line aligned.

    Hyper-V VMs have VMbus synthetic devices and PCI pass-thru devices
    that are added dynamically via the VMbus protocol and are not
    represented in the ACPI DSDT. Only the top level VMbus node exists
    in the DSDT. As such, on ARM64 these devices don't pick up
    coherence information and default to not hardware coherent.

    Approved by:    re (gjb)
    PR:             267654, 272666
    Reviewed by:    andrew, whu
    Tested by:      lwhsu
    Sponsored by:   Microsoft
    Differential Revision: https://reviews.freebsd.org/D41728

    (cherry picked from commit e7a9817b8d328dda04069b65944ce2ed6f54c6f0)
    (cherry picked from commit 85bc81352e4b0d0a9da251bacec35eec130eee49)

 sys/dev/hyperv/vmbus/vmbus.c     | 33 +++++++++++++++++++++++++++++++++
 sys/dev/hyperv/vmbus/vmbus_var.h |  1 +
 2 files changed, 34 insertions(+)

With 48a799af88c0 in releng/14.0 now, should this PR be resolved?

(In reply to Ed Maste from comment #55)
I would love to close this bug, but I can't find the place to change the status to resolved in the bug report.

(In reply to Wei Hu from comment #56)
There's a "Status:" dropdown before the first comment, and after changing it to Closed a second dropdown will appear with the resolution. I'll change it to Closed/FIXED.

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=772430dd67955850942d689714ab982da24257ba

commit 772430dd67955850942d689714ab982da24257ba
Author:     Kirk McKusick <mckusick@FreeBSD.org>
AuthorDate: 2023-11-17 22:10:29 +0000
Commit:     Kirk McKusick <mckusick@FreeBSD.org>
CommitDate: 2023-11-17 22:11:24 +0000

    Ensure I/O buffers in libufs(3) are 128-byte aligned.

    Various disk controllers require their buffers to be aligned to a
    cache-line size (128 bytes). For buffers allocated in structures,
    ensure that they are 128-byte aligned. Use aligned_malloc to allocate
    memory to ensure that the returned memory is 128-byte aligned.

    While we are here, we replace the dynamically allocated inode buffer
    with a buffer allocated in the uufsd structure just as the superblock
    and cylinder group buffers do. This can be removed if/when the kernel
    is fixed. Because this problem has existed on one I/O subsystem or
    another since the 1990's, we are probably stuck with dealing with it
    forever.

    The problem most recent showed up in Azure, see:
    https://reviews.freebsd.org/D41728
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267654

    Before these fixes were applied, it was confirmed that the changes
    in this commit also fixed the issue in Azure.

    Reviewed-by: Warner Losh, kib
    Tested-by: Souradeep Chakrabarti of Microsoft (earlier version)
    PR: 267654
    Differential Revision: https://reviews.freebsd.org/D41724

 lib/libufs/Makefile         |  2 +-
 lib/libufs/block.c          | 45 +++++++++++------------------------
 lib/libufs/inode.c          | 16 ++-----------
 lib/libufs/libufs.h         | 57 ++++++++++++++++++++++++++++---------------
 lib/libufs/sblock.c         |  3 ++-
 lib/libufs/type.c           | 12 ++++++----
 lib/libufs/ufs_disk_close.3 |  9 ++++++-
 sbin/fsck_ffs/fsck.h        | 15 ++++++++++++
 sbin/fsck_ffs/fsutil.c      | 11 ++++-----
 sbin/fsck_ffs/inode.c       |  3 +--
 sbin/fsck_ffs/main.c        |  1 -
 sbin/fsck_ffs/pass5.c       |  1 -
 sbin/fsck_ffs/setup.c       | 11 ++++-----
 sbin/fsck_ffs/suj.c         |  3 +--
 14 files changed, 98 insertions(+), 91 deletions(-)