Bug 216467 - [request] Hardware (SSE 4.2) CRC32C support in calculate_crc32c()
Summary: [request] Hardware (SSE 4.2) CRC32C support in calculate_crc32c()
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: Conrad Meyer
URL:
Keywords:
: 212454 (view as bug list)
Depends on:
Blocks:
 
Reported: 2017-01-25 21:26 UTC by Ben RUBSON
Modified: 2017-05-22 12:54 UTC (History)
3 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ben RUBSON 2017-01-25 21:26:03 UTC
Hello,

Could it be possible to implement an optimized version of calculate_crc32c() with hardware CRC32C support please ?
SSE 4.2 has this feature, which is really interesting.

Here are some benchmarks with this :
https://github.com/laanwj/crcbench

On an Intel E5-2620v3 :

# ./crcbench 
CRC32C benchmarks
[sw] 65543000 bytes in 69730us monotonic 69726us CPU (940.0MB/s)
[hw-sse42] 65543000 bytes in 12619us monotonic 12618us CPU (5194.4MB/s)

On more data (10GB) :

# ./crcbench 
CRC32C benchmarks
[sw] 10485767000 bytes in 11179263us monotonic 11178667us CPU (938.0MB/s)
[hw-sse42] 10485767000 bytes in 2028412us monotonic 2028305us CPU (5169.7MB/s)

Sounds really interesting and promising.

Main goal behind this request is to improve iSCSI throughput when HeaderDigest and DataDigest are enabled.
Enabling these options made my iSCSI throughput drop from 300MB/s to 200MB/s.
Hardware CRC32C should then help minimising this difference.

Thank you very much !

Ben
Comment 1 Ben RUBSON 2017-01-25 21:28:04 UTC
*** Bug 212454 has been marked as a duplicate of this bug. ***
Comment 2 Conrad Meyer freebsd_committer freebsd_triage 2017-01-25 21:33:04 UTC
We have a patch for this at Isilon.

Unfortunately, our assembly CRC32C implementations are privately licensed Intel code we can't share.  Do you have a permissively licensed replacement we could use in FreeBSD?

There are two algorithms.  One algorithm just uses SSE4.2 instructions and no FPU; that's generally usable in the kernel.  There is another algorithm that also uses PCLMULQDQ (FPU) which is less generally suitable for the kernel, but an option if the thread is already preserving FPU state.
Comment 3 Conrad Meyer freebsd_committer freebsd_triage 2017-01-25 21:36:52 UTC
Looks like Linux has a BSDL copy of the PCLMULQDQ Intel code, so we can just borrow that: http://lxr.free-electrons.com/source/arch/x86/crypto/crc32c-pcl-intel-asm_64.S

Unfortunately, their non-PCLMULQDQ version is not available BSDL.
Comment 4 Ben RUBSON 2017-01-25 21:43:12 UTC
Thank you very much for your feedback !
Can't we simply look at this (which runs fine on FreeBSD) ?
https://github.com/laanwj/crcbench/blob/master/crc32c_sse42.h

Would you prefer a SSE4.2 only algorithm, or with FPU ?
Comment 5 Ben RUBSON 2017-01-25 21:47:11 UTC
In addition, I'm not sure what you really mean by a "permissively licensed replacement".
Comment 6 Conrad Meyer freebsd_committer freebsd_triage 2017-01-25 21:51:00 UTC
(In reply to Ben RUBSON from comment #4)
That file isn't licensed clearly, although the COPYING file for the project appears to be MIT (which would be fine).

I think no-FPU would be preferred, as long as it's not too much slower than the FPU variant.
Comment 7 Conrad Meyer freebsd_committer freebsd_triage 2017-01-25 21:52:10 UTC
(Oh, and the github project is in C++ and uses some compiler intrinsics that may or may not be available -- it would require some massaging to get into FreeBSD.)
Comment 8 Ben RUBSON 2017-01-25 21:58:44 UTC
Yep, -msse4.2, this is why I think this code looks so simple (mainly _mm_crc32_u64 calls).
Comment 9 Conrad Meyer freebsd_committer freebsd_triage 2017-01-26 00:16:53 UTC
Here is a proposed patch: https://reviews.freebsd.org/D9342
Comment 10 Ben RUBSON 2017-01-26 07:05:29 UTC
Wow, many thanks !
I'll then give it a try very shortly !
So you are using the _mm_crc32_u64 functions :)

As a test I could make a ZFS scrub on a healthy pool made of iSCSI disks, to see if ZFS complains (bad) or not (good).
Not the perfect test however.
Comment 11 Conrad Meyer freebsd_committer freebsd_triage 2017-01-26 07:07:34 UTC
(In reply to Ben RUBSON from comment #10)
Yeah, I intend to do some userspace tests to confirm correctness before committing.
Comment 12 Ben RUBSON 2017-01-26 08:46:40 UTC
Patch & compilation went fine with 11.0-RELEASE-p7.

But does not seem to work, at least with the iSCSI test I did :
kernel: WARNING: 192.168.2.1 (iqn.2012-06.srv1:rT1): connection error; reconnecting
kernel: (probe0:iscsi19:0:0:0): INQUIRY. CDB: 12 00 00 00 24 00 
kernel: (probe0:iscsi19:0:0:0): CAM status: CCB request aborted by the host
kernel: (probe0:iscsi19:0:0:0): Retrying command

If I remove HeaderDigest and DataDigest from iscsi.conf, then iSCSI connection is OK.
Comment 13 Conrad Meyer freebsd_committer freebsd_triage 2017-01-26 18:33:28 UTC
(In reply to Ben RUBSON from comment #12)
The sense of the input/output bits was inverted.  Please try the latest version of the patch (v4 in phabricator), which removes the inversion.  I've verified that it returns identical results to our software implementation for a few thousand inputs.  I expect it will fix the iSCSI attach problem.

Thanks!
Comment 14 Conrad Meyer freebsd_committer freebsd_triage 2017-01-26 18:36:59 UTC
> Enabling these options made my iSCSI throughput drop from 300MB/s to 200MB/s.

By the way, this patch may not restore all lost performance.  Some of the difference may come from the target having to compute checksums as well, which we can't do anything about :-).
Comment 15 Ben RUBSON 2017-01-26 18:43:21 UTC
Both my initiators and targets are FreeBSD servers :)
So this nice kernel improvement should really be helpful !

I'm going to give your last (not inverted ;) patch a try.
Thanks too !
Comment 16 Ben RUBSON 2017-01-26 20:50:00 UTC
Just tested but does not fully work : iSCSI connects successfully, but as soon as there is some trafic, disks disconnect and hang, system is not able to recover them.
Comment 17 Conrad Meyer freebsd_committer freebsd_triage 2017-01-26 21:04:05 UTC
(In reply to Ben RUBSON from comment #16)
That doesn't sound like a CRC problem.
Comment 18 Ben RUBSON 2017-01-26 21:59:53 UTC
Yep I agree but strangely when I disable HeaderDigest and DataDigest, it works fine.
And without the patch, HeaderDigest and DataDigest work correctly.
I've performed the test twice.
Comment 19 Conrad Meyer freebsd_committer freebsd_triage 2017-01-26 22:30:47 UTC
(In reply to Ben RUBSON from comment #18)
I noticed a bug in handling of unaligned inputs.  I expect that is the problem you're seeing.
Comment 20 Conrad Meyer freebsd_committer freebsd_triage 2017-01-27 06:45:27 UTC
There is a new patch in phabricator that should resolve the previous issue.  It should also be a little faster for inputs larger than 768 bytes (e.g. 4k and 8k IOs).
Comment 21 Ben RUBSON 2017-01-27 06:46:51 UTC
Such as iSCSI frames :)
Very nice !
Will build and test it in a few minutes, and start a scrub if runs fine.
Comment 22 Ben RUBSON 2017-01-27 07:08:52 UTC
Scrub running !
Result in a few hours, but sounds good actually :)
Comment 23 Ben RUBSON 2017-01-27 17:42:29 UTC
So scrub finished correctly, good news.

And some benchmark, iSCSI troughput :
- without CRC32C : 300 MB/s
- with sw CRC32C : 200 MB/s
- with hw CRC32C : 285 MB/s

Really nice improvement !
Many thanks Conrad, really glad :)
Comment 24 Conrad Meyer freebsd_committer freebsd_triage 2017-01-27 17:49:43 UTC
(In reply to Ben RUBSON from comment #23)
Hey, that's great to hear.  Thanks for trying it out several times, even a few broken patches :-).
Comment 25 Edward Tomasz Napierala freebsd_committer freebsd_triage 2017-01-28 15:53:44 UTC
There is no way for iSCSI to pass the fact the digest failed to the upper layers (like ZFS); the way it's handled is by dropping the connection - afterwards you get a reconnect and a retry of the disk operation.  So, what I'd expect is that you would see a message saying "data digest check failed" in your syslog, then a connection drop, and a reconnect.  Do you see that?
Comment 26 Ben RUBSON 2017-01-29 08:14:21 UTC
>what I'd expect is that you would see a message saying "data digest check failed"
No, I did not see such a message.
When I saw that the patch was not working correctly, I tested with only one disk.
It successfully connected, iscsictl returned it as connected, perfect.
But as soon as I dd if=/dev/mydisk, I got the following :
kernel: WARNING: 192.168.2.1 (iqn.2012-06.srv1:rT1): no ping reply (NOP-In) after 5 seconds; reconnecting
srv1 last message repeated 31 times
srv1 last message repeated 42 times
But no digest failed message.
Comment 27 commit-hook freebsd_committer freebsd_triage 2017-01-31 03:26:54 UTC
A commit references this bug:

Author: cem
Date: Tue Jan 31 03:26:33 UTC 2017
New revision: 313006
URL: https://svnweb.freebsd.org/changeset/base/313006

Log:
  calculate_crc32c: Add SSE4.2 implementation on x86

  Derived from an implementation by Mark Adler.

  The fast loop performs three simultaneous CRCs over subsets of the data
  before composing them.  This takes advantage of certain properties of
  the CRC32 implementation in Intel hardware.  (The CRC instruction takes 1
  cycle but has 2-3 cycles of latency.)

  The CRC32 instruction does not manipulate FPU state.

  i386 does not have the crc32q instruction, so avoid it there.  Otherwise
  the implementation is identical to amd64.

  Add basic userland tests to verify correctness on a variety of inputs.

  PR:		216467
  Reported by:	Ben RUBSON <ben.rubson at gmail.com>
  Reviewed by:	kib@, markj@ (earlier version)
  Sponsored by:	Dell EMC Isilon
  Differential Revision:	https://reviews.freebsd.org/D9342

Changes:
  head/sys/conf/files.amd64
  head/sys/conf/files.i386
  head/sys/libkern/crc32.c
  head/sys/libkern/x86/
  head/sys/libkern/x86/crc32_sse42.c
  head/sys/sys/libkern.h
  head/tests/sys/kern/Makefile
  head/tests/sys/kern/libkern_crc32.c
Comment 28 Ben RUBSON 2017-02-08 08:16:50 UTC
10 days running on 2 servers without issue :)
Do you plan to MFC to stable/11 ?
Thank you !
Comment 29 Conrad Meyer freebsd_committer freebsd_triage 2017-02-08 16:50:57 UTC
(In reply to Ben RUBSON from comment #28)
I don't touch stable branches, but you may ask someone else to MFC it.
Comment 30 Kurt Jaeger freebsd_committer freebsd_triage 2017-05-22 12:54:35 UTC
Merged to 11 in https://svnweb.freebsd.org/changeset/base/317149