Bug 236042 - Windows Server 2016 Hyper-V snapshot triggers SCSI errors
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: freebsd-virtualization mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-02-25 23:29 UTC by Alex G
Modified: 2019-11-14 20:15 UTC
CC List: 9 users

See Also:


Attachments
SCSI error during snapshot (23.84 KB, image/png)
2019-04-19 07:25 UTC, Gesture
proposed patch, ported from Linux (793 bytes, patch)
2019-11-13 07:49 UTC, Andriy Gapon

Description Alex G 2019-02-25 23:29:38 UTC
Hi,

I am currently running FreeBSD 12.0-RELEASE in a Hyper-V Gen 2 VM with a SCSI virtual disk.

When Veeam Backup takes a Hyper-V snapshot of the running VM, the following is printed on the console:

Feb 21 08:51:40 8797web01 kernel: hvtimesync0: RTT
Feb 21 08:51:40 8797web01 kernel: (da0:storvsc0:0:0:0): WRITE(10). CDB: 2a 00 0a 6e 5a b0 00 00 08 00 
Feb 21 08:51:40 8797web01 kernel: (da0:storvsc0:0:0:0): CAM status: SCSI Status Error
Feb 21 08:51:40 8797web01 kernel: (da0:storvsc0:0:0:0): SCSI status: Check Condition
Feb 21 08:51:40 8797web01 kernel: (da0:storvsc0:0:0:0): SCSI sense: UNIT ATTENTION asc:3f,2 (Changed operating definition)
Feb 21 08:51:40 8797web01 kernel: (da0:storvsc0:0:0:0): Retrying command (per sense data)
Feb 21 08:54:33 8797web01 kernel: (da0:storvsc0:0:0:0): WRITE(10). CDB: 2a 00 03 b3 14 68 00 00 40 00 
Feb 21 08:54:33 8797web01 kernel: (da0:storvsc0:0:0:0): CAM status: SCSI Status Error
Feb 21 08:54:33 8797web01 kernel: (da0:storvsc0:0:0:0): SCSI status: Check Condition
Feb 21 08:54:33 8797web01 kernel: (da0:storvsc0:0:0:0): SCSI sense: UNIT ATTENTION asc:3f,2 (Changed operating definition)
Feb 21 08:54:33 8797web01 kernel: (da0:storvsc0:0:0:0): Retrying command (per sense data)
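
To see which blocks the failing writes targeted, the CDB bytes in the log above can be decoded by hand. The following is a minimal sketch in Python, assuming the standard WRITE(10) layout (opcode 0x2a, a 4-byte big-endian LBA, a 2-byte big-endian transfer length in blocks). The function name `decode_write10` is illustrative and not part of any FreeBSD tool.

```python
import struct


def decode_write10(cdb_hex: str) -> dict:
    """Decode a WRITE(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("a WRITE(10) CDB must be exactly 10 bytes")
    # Layout: opcode, flags, 4-byte LBA, group number,
    # 2-byte transfer length (in blocks), control byte.
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    if opcode != 0x2A:
        raise ValueError(f"not a WRITE(10) opcode: {opcode:#04x}")
    return {"lba": lba, "blocks": blocks}


if __name__ == "__main__":
    # The two CDBs from the console output above.
    for cdb in ("2a 00 0a 6e 5a b0 00 00 08 00",
                "2a 00 03 b3 14 68 00 00 40 00"):
        print(cdb, "->", decode_write10(cdb))
```

Under that assumption, the first command writes 8 blocks starting at LBA 0x0a6e5ab0 and the second writes 64 blocks starting at LBA 0x03b31468, so both are ordinary data writes that happened to be in flight during the snapshot.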

Could it be that something is still trying to write to the disk when Hyper-V signals the HV VSS driver to freeze the file system?

The system has 1 virtual disk with 3 partitions.
* EFI boot
* swap
* UFS root file system

Thank you.
Cheers,
Alex.
Comment 1 Alex G 2019-02-25 23:34:14 UTC
I tried to email the authors of the Hyper-V integration drivers, but I got a bounce back from Microsoft's email server.

The response from the remote server was:
550 5.4.1 [bsdic@microsoft.com]: Recipient address rejected: Access denied [BL2NAM06FT011.Eop-nam06.prod.protection.outlook.com]
Comment 2 Yold 2019-03-19 13:08:09 UTC
I'm also having the same problem with a different setup:
- Hyper-V 2016 with OPNsense as guest (FreeBSD 11.1-RELEASE-p17)
This VM has a replica on another Hyper-V host, and replication often fails (I need to force a sync again).
Here is the FreeBSD log:

(da0:storvsc0:0:0:0): WRITE(10). CDB: 2a 00 00 cd ac a8 00 01 00 00
(da0:storvsc0:0:0:0): CAM status: SCSI Status Error
(da0:storvsc0:0:0:0): SCSI status: Check Condition
(da0:storvsc0:0:0:0): SCSI sense: UNIT ATTENTION asc:3f,2 (Changed operating definition)
(da0:storvsc0:0:0:0): Retrying command (per sense data)
(da0:storvsc0:0:0:0): WRITE(10). CDB: 2a 00 00 ca 90 68 00 00 40 00
(da0:storvsc0:0:0:0): CAM status: SCSI Status Error
(da0:storvsc0:0:0:0): SCSI status: Check Condition
(da0:storvsc0:0:0:0): SCSI sense: UNIT ATTENTION asc:3f,2 (Changed operating definition)
(da0:storvsc0:0:0:0): Retrying command (per sense data)
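
When comparing logs from several affected guests, it can help to confirm that every report shows the same sense data. The following is a rough sketch in Python that extracts the sense key, ASC and ASCQ from lines like the ones above and counts them. It assumes the `asc:XX,Y` fields are printed in hexadecimal; the names `parse_sense` and `count_sense` are made up for this example.

```python
import re
from collections import Counter

# Matches e.g. "SCSI sense: UNIT ATTENTION asc:3f,2 (Changed operating definition)"
SENSE_RE = re.compile(
    r"SCSI sense: (?P<key>[A-Z ]+) asc:(?P<asc>[0-9a-f]+),(?P<ascq>[0-9a-f]+)"
)


def parse_sense(line: str):
    """Return (sense_key, asc, ascq) for a CAM sense line, or None."""
    m = SENSE_RE.search(line)
    if m is None:
        return None
    return m.group("key").strip(), int(m.group("asc"), 16), int(m.group("ascq"), 16)


def count_sense(lines):
    """Count occurrences of each (sense_key, asc, ascq) triple."""
    counts = Counter()
    for line in lines:
        sense = parse_sense(line)
        if sense is not None:
            counts[sense] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage: python3 this_script.py < /var/log/messages
    for sense, n in count_sense(sys.stdin).items():
        print(n, sense)
```

For the log above this yields a single triple, UNIT ATTENTION with ASC 0x3f and ASCQ 0x02, seen twice.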
Comment 3 Gesture 2019-04-19 07:25:57 UTC
Created attachment 203786
SCSI error during snapshot
Comment 4 Gesture 2019-04-19 07:27:35 UTC
Hi,

I can confirm that I see the same SCSI errors during Hyper-V snapshots.

Greetings
Gesture
Comment 5 Gesture 2019-04-19 07:28:41 UTC
I'm using FreeBSD 11.2 and a pfSense 2.4.4 Hyper-V virtual machine.
Comment 6 thomaslauer 2019-05-20 19:05:15 UTC
Hi, I have 120 pfSense VMs, from 2.3.4 to 2.4.4-2, all Hyper-V VMs with Gen 2 and UFS,
and some VMs with Hyper-V Gen 2 and UFS. All of these VMs have the same issue.

I have only one VM with Gen 2 and ZFS. This VM has no SCSI errors during the snapshot.
Comment 7 Nick 2019-05-31 16:53:28 UTC
I am having this problem with pfSense (2.4.4-RELEASE-p3) running on Hyper-V (Windows Server 2012 R2). In my case, replication might run fine for a while (hours, days), but at some point there is a SCSI Status Error during a WRITE operation, and pfSense/FreeBSD either locks up or keeps partially working and eventually stops responding to network or UI requests. I would love a resolution to this. For the moment, I've disabled replication and it's been fine.

Perhaps useful, perhaps not: I've been running pfSense for years as a replicating VM on Hyper-V (Windows Server 2012 R2) without issues. The problems only started this week. pfSense had previously been running many different versions (2.2, 2.3, 2.4.3). When the problems started this week, I had not upgraded pfSense or the hypervisor. As far as I can tell, "nothing changed".

Nick
Comment 8 Michael 2019-11-01 11:07:10 UTC
I can confirm that I see the same SCSI errors during Hyper-V snapshots.

FreeBSD versions from 11.2 up to the current 13.0 have this problem.
The problem also occurs on ZFS.
Comment 9 Michael 2019-11-01 11:44:42 UTC
Linux seems to have the same problem:
https://bugzilla.redhat.com/show_bug.cgi?id=1502601

I wonder whether FreeBSD has solved it or not.
The SCSI Status Error messages in FreeBSD look very similar to those from the already resolved problem in Linux.
Comment 10 Andriy Gapon freebsd_committer 2019-11-13 07:49:30 UTC
Created attachment 209124
proposed patch, ported from Linux

Could anyone observing the problem please test this patch?
Thanks!

P.S.
There is a review request for it as well:
https://reviews.freebsd.org/D22313
Comment 11 Michael 2019-11-14 20:15:03 UTC
(In reply to Andriy Gapon from comment #10)

It did not help. The messages are exactly the same after applying this patch and rebuilding with:
make cleanworld && make cleandir && make -j8 buildworld && make -j8 buildkernel KERNCONF=GENERIC