Bug 260200

Summary: AHCI emulation throws CAM errors when backed by -t malloc memory disk devices
Product: Base System Reporter: Michael Dexter <editor>
Component: bhyveAssignee: freebsd-virtualization (Nobody) <virtualization>
Status: Closed FIXED    
Severity: Affects Only Me CC: afedorov, markj
Priority: ---    
Version: 13.0-RELEASE   
Hardware: Any   
OS: Any   

Description Michael Dexter freebsd_triage 2021-12-04 06:10:44 UTC
If a bhyve emulated AHCI device is backed by a 'mdconfig -t malloc' memory disk device, variations on this simple fio(1) test:

fio --name=bhyve --rw=write --runtime=1 --time_based --filename=/dev/ada0 --bs=512b --size=512b

Produces these errors logged in the system messages:

(ada0:ahcich0:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
(ada0:ahcich0:0:0:0): CAM status: ATA Status Error
(ada0:ahcich0:0:0:0): ATA status: 41 (DRDY ERR), error: 04 (ABRT )
(ada0:ahcich0:0:0:0): RES: 41 04 00 00 00 40 00 00 00 00 00
(ada0:ahcich0:0:0:0): Retrying command, 0 more tries remain
(ada0:ahcich0:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
(ada0:ahcich0:0:0:0): CAM status: ATA Status Error
(ada0:ahcich0:0:0:0): ATA status: 41 (DRDY ERR), error: 04 (ABRT )
(ada0:ahcich0:0:0:0): RES: 41 04 00 00 00 40 00 00 00 00 00
(ada0:ahcich0:0:0:0): Error 5, Retries exhausted
(ada0:ahcich0:0:0:0): Synchronize cache failed

The error is consistent with repeat runs and variations on the fio syntax.

The virtio-blk and nvme devices do not exhibit this behavior.
Comment 1 Michael Dexter freebsd_triage 2022-02-10 09:06:59 UTC
Update: bhyve/illumos does not exhibit this issue under similar testing with ramdiskadm(M1) devices.
Comment 2 Aleksandr Fedorov freebsd_committer freebsd_triage 2022-02-12 11:40:07 UTC
This is because md(4) + memory backend doesn't support the DIOCGFLUSH ioctl (BIO_FLUSH command): https://github.com/freebsd/freebsd-src/blob/main/sys/dev/md/md.c#L650

pci_ahci.c: https://github.com/freebsd/freebsd-src/blob/main/usr.sbin/bhyve/pci_ahci.c#L783

So I see two ways to solve this issue.

First, fix bhyve with patch like this:

diff --git a/usr.sbin/bhyve/block_if.c b/usr.sbin/bhyve/block_if.c
index 98c0f9f5f38b..3fb81bb86ae2 100644
--- a/usr.sbin/bhyve/block_if.c
+++ b/usr.sbin/bhyve/block_if.c
@@ -102,6 +102,7 @@ struct blockif_ctxt {
        int                     bc_ischr;
        int                     bc_isgeom;
        int                     bc_candelete;
+       int                     bc_canflush;
        int                     bc_rdonly;
        off_t                   bc_size;
        int                     bc_sectsz;
@@ -224,6 +225,8 @@ static int
 blockif_flush_bc(struct blockif_ctxt *bc)
 {
        if (bc->bc_ischr) {
+               if (bc->bc_canflush == 0)
+                       return (0);
                if (ioctl(bc->bc_fd, DIOCGFLUSH))
                        return (errno);
        } else if (fsync(bc->bc_fd))
@@ -463,7 +466,7 @@ blockif_open(nvlist_t *nvl, const char *ident)
        struct diocgattr_arg arg;
        off_t size, psectsz, psectoff;
        int extra, fd, i, sectsz;
-       int ro, candelete, geom, ssopt, pssopt;
+       int ro, candelete, canflush, geom, ssopt, pssopt;
        int nodelete;
 
 #ifndef WITHOUT_CAPSICUM
@@ -566,6 +569,12 @@ blockif_open(nvlist_t *nvl, const char *ident)
                        candelete = arg.value.i;
                if (ioctl(fd, DIOCGPROVIDERNAME, name) == 0)
                        geom = 1;
+               if (ioctl(fd, DIOCGFLUSH) == 0)
+                       canflush = 1;
+               else if (errno == ENOTSUP) {
+                       canflush = 0;
+                       errno = 0;
+               }
        } else
                psectsz = sbuf.st_blksize;
 
@@ -614,6 +623,7 @@ blockif_open(nvlist_t *nvl, const char *ident)
        bc->bc_ischr = S_ISCHR(sbuf.st_mode);
        bc->bc_isgeom = geom;
        bc->bc_candelete = candelete;
+       bc->bc_canflush = canflush;
        bc->bc_rdonly = ro;
        bc->bc_size = size;
        bc->bc_sectsz = sectsz;

Second, add a NOP support of the BIO_FLUSH cmd to md(4) (mdstart_malloc()).
Comment 3 Aleksandr Fedorov freebsd_committer freebsd_triage 2022-02-12 11:53:24 UTC
Second way:

diff --git a/sys/dev/md/md.c b/sys/dev/md/md.c
index 308374f49f14..affb0780fa4a 100644
--- a/sys/dev/md/md.c
+++ b/sys/dev/md/md.c
@@ -643,6 +643,9 @@ mdstart_malloc(struct md_s *sc, struct bio *bp)
        case BIO_WRITE:
        case BIO_DELETE:
                break;
+       case BIO_FLUSH:
+               bp->bio_resid = 0;
+               return (0);
        default:
                return (EOPNOTSUPP);
        }
Comment 4 Mark Johnston freebsd_committer freebsd_triage 2022-02-12 14:14:51 UTC
(In reply to Aleksandr Fedorov from comment #3)
I'd be tempted to change md(4) to simply ignore BIO_FLUSH in malloc-backed MDs, rather than adding complexity to bhyve.  The second patch looks ok to me.

Should swap-backed MDs also be changed?  What effect should BIO_FLUSH have there?
Comment 5 Michael Dexter freebsd_triage 2022-02-12 23:23:27 UTC
Thank you for looking into this.

I confirmed that this behavior is with both malloc and swap memory devices, but is not present with CTL "ramdisk" devices. i.e.:

mdconfig -s 256m -t malloc -u 11
mdconfig -s 256m -t swap -u 12
ctladm create -b ramdisk -s 256m -o capacity=256m
ctladm port -o on
<pass to VM, run fio against them>
Comment 6 Aleksandr Fedorov freebsd_committer freebsd_triage 2022-02-13 12:30:59 UTC
I think the swap backend should also ignore the BIO_FLUSH command. Because swap is not a persistent storage, I don't know how BIO_FLUSH can be useful.

As for ctl(4), it works with SCSI commands. Ramdisk backend simply ignores the SYNCHRONIZE_CACHE and SYNCHRONIZE_CACHE_16 commands: https://github.com/freebsd/freebsd-src/blob/main/sys/cam/ctl/ctl_backend_ramdisk.c#L748
Comment 7 Aleksandr Fedorov freebsd_committer freebsd_triage 2022-02-13 13:15:47 UTC
https://reviews.freebsd.org/D34260
Comment 8 Michael Dexter freebsd_triage 2022-02-16 04:23:20 UTC
This review applies cleanly against 13.0R and passes my dd(1) and fio(1) tests.

Thank you!
Comment 9 commit-hook freebsd_committer freebsd_triage 2022-02-17 19:25:27 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=cb28dfb27d12f1cbd7831f370c01493c01d74b10

commit cb28dfb27d12f1cbd7831f370c01493c01d74b10
Author:     Aleksandr Fedorov <afedorov@FreeBSD.org>
AuthorDate: 2022-02-17 19:21:56 +0000
Commit:     Aleksandr Fedorov <afedorov@FreeBSD.org>
CommitDate: 2022-02-17 19:21:56 +0000

    md(4): Add dummy support of the BIO_FLUSH command for malloc and swap
    backend.

    PR:     260200
    Reported by:    editor@callfortesting.org
    Reviewed by:    vmaffione (mentor), markj
    Approved by:    vmaffione (mentor), markj
    Differential Revision:  https://reviews.freebsd.org/D34260

 sys/dev/md/md.c | 4 ++++
 1 file changed, 4 insertions(+)
Comment 10 commit-hook freebsd_committer freebsd_triage 2022-02-23 10:59:58 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f40063d329775a7c7bdac90afd261b683ad2fec3

commit f40063d329775a7c7bdac90afd261b683ad2fec3
Author:     Aleksandr Fedorov <afedorov@FreeBSD.org>
AuthorDate: 2022-02-17 19:21:56 +0000
Commit:     Aleksandr Fedorov <afedorov@FreeBSD.org>
CommitDate: 2022-02-23 10:57:44 +0000

    md(4): Add dummy support of the BIO_FLUSH command for malloc and swap
    backend.

    PR:     260200
    Reported by:    editor@callfortesting.org
    Reviewed by:    vmaffione (mentor), markj
    Approved by:    vmaffione (mentor), markj
    Differential Revision:  https://reviews.freebsd.org/D34260

    (cherry picked from commit cb28dfb27d12f1cbd7831f370c01493c01d74b10)

 sys/dev/md/md.c | 4 ++++
 1 file changed, 4 insertions(+)
Comment 11 Michael Dexter freebsd_triage 2024-06-08 05:24:08 UTC
Confirmed working on a 14.0-RELEASE-p6 host with a 14.0-RELEASE-p0 VM.