Bug 263877 - net/wireguard: go implementation is twice faster than the kernel's one
Status: Closed Works As Intended
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Bernhard Froehlich
Reported: 2022-05-09 12:04 UTC by Loic
Modified: 2022-05-09 19:36 UTC
bugzilla: maintainer-feedback? (decke)


Description Loic 2022-05-09 12:04:21 UTC
Dear Sirs,

  I am getting better performance with the wireguard-go port than with the wireguard-kmod port; I was expecting the opposite.
I thought the problem was specific to my servers, so I tried other servers (about 10 so far, HP and Dell machines with igb/bge/cxl/ix NICs) and saw the same thing.
Yesterday I tried a fresh installation on 2 servers and got the same results.
So I am wondering whether anyone else is seeing the same problem.

Here are some iperf3 results between 2 fresh FreeBSD 13 installations; both servers have a Xeon E5-2640 and 128 GB of RAM:

1) No VPN, LRO and TSO on:
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.10 GBytes  9.41 Gbits/sec    0   1.61 MBytes       
[  5]   1.00-2.00   sec  1.10 GBytes  9.41 Gbits/sec    0   1.61 MBytes       
[  5]   2.00-3.00   sec  1.10 GBytes  9.42 Gbits/sec    0   1.61 MBytes       
[  5]   3.00-4.00   sec  1.10 GBytes  9.41 Gbits/sec    0   1.61 MBytes       
[  5]   4.00-5.00   sec  1.10 GBytes  9.42 Gbits/sec    0   1.61 MBytes       
[  5]   5.00-6.00   sec  1.10 GBytes  9.42 Gbits/sec    0   1.61 MBytes       
[  5]   6.00-7.00   sec  1.10 GBytes  9.41 Gbits/sec    0   1.61 MBytes       
[  5]   7.00-8.00   sec  1.10 GBytes  9.42 Gbits/sec    0   1.61 MBytes       
[  5]   8.00-9.00   sec  1.10 GBytes  9.41 Gbits/sec    0   1.61 MBytes       
[  5]   9.00-10.00  sec  1.10 GBytes  9.41 Gbits/sec    0   1.61 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  11.0 GBytes  9.41 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  11.0 GBytes  9.41 Gbits/sec                  receiver

2) No VPN, LRO and TSO off:
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   518 MBytes  4.34 Gbits/sec    0   1.20 MBytes       
[  5]   1.00-2.00   sec   543 MBytes  4.56 Gbits/sec    0   1.61 MBytes       
[  5]   2.00-3.00   sec   542 MBytes  4.54 Gbits/sec    0   1.61 MBytes       
[  5]   3.00-4.00   sec   537 MBytes  4.51 Gbits/sec    0   1.61 MBytes       
[  5]   4.00-5.00   sec   536 MBytes  4.50 Gbits/sec    0   1.61 MBytes       
[  5]   5.00-6.00   sec   533 MBytes  4.47 Gbits/sec    0   1.61 MBytes       
[  5]   6.00-7.00   sec   533 MBytes  4.47 Gbits/sec    0   1.61 MBytes       
[  5]   7.00-8.00   sec   537 MBytes  4.50 Gbits/sec    0   1.61 MBytes       
[  5]   8.00-9.00   sec   539 MBytes  4.53 Gbits/sec    0   1.61 MBytes       
[  5]   9.00-10.00  sec   540 MBytes  4.52 Gbits/sec    0   1.61 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  5.23 GBytes  4.49 Gbits/sec    0             sender
[  5]   0.00-10.03  sec  5.23 GBytes  4.48 Gbits/sec                  receiver

3) wireguard-go:
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.01   sec   100 MBytes   832 Mbits/sec  307    383 KBytes       
[  5]   1.01-2.00   sec  23.0 MBytes   194 Mbits/sec  337    983 KBytes       
[  5]   2.00-3.00   sec  65.8 MBytes   553 Mbits/sec  530   41.4 KBytes       
[  5]   3.00-4.00   sec  89.8 MBytes   753 Mbits/sec  411   98.1 KBytes       
[  5]   4.00-5.01   sec  99.9 MBytes   830 Mbits/sec  337   1.14 MBytes       
[  5]   5.01-6.00   sec  24.5 MBytes   207 Mbits/sec  651   2.67 KBytes       
[  5]   6.00-7.00   sec  90.6 MBytes   762 Mbits/sec  472   79.1 KBytes       
[  5]   7.00-8.00   sec  88.6 MBytes   744 Mbits/sec  387    152 KBytes       
[  5]   8.00-9.01   sec  40.9 MBytes   339 Mbits/sec  791   1.35 MBytes       
[  5]   9.01-10.00  sec  81.3 MBytes   691 Mbits/sec  1423   91.2 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   705 MBytes   591 Mbits/sec  5646             sender
[  5]   0.00-10.00  sec   703 MBytes   590 Mbits/sec                  receiver

4) wireguard-kmod:
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  39.5 MBytes   331 Mbits/sec    0    485 KBytes       
[  5]   1.00-2.00   sec  39.0 MBytes   327 Mbits/sec    0    490 KBytes       
[  5]   2.00-3.00   sec  39.0 MBytes   327 Mbits/sec    0    490 KBytes       
[  5]   3.00-4.00   sec  39.0 MBytes   327 Mbits/sec    0    490 KBytes       
[  5]   4.00-5.00   sec  39.0 MBytes   327 Mbits/sec    0    490 KBytes       
[  5]   5.00-6.00   sec  39.0 MBytes   328 Mbits/sec    0    490 KBytes       
[  5]   6.00-7.00   sec  39.1 MBytes   327 Mbits/sec    0    490 KBytes       
[  5]   7.00-8.00   sec  39.1 MBytes   329 Mbits/sec    0    490 KBytes       
[  5]   8.00-9.00   sec  39.1 MBytes   328 Mbits/sec    0    490 KBytes       
[  5]   9.00-10.00  sec  39.1 MBytes   328 Mbits/sec    0    490 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   391 MBytes   328 Mbits/sec    0             sender
[  5]   0.00-10.02  sec   391 MBytes   327 Mbits/sec                  receiver
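
For a quick comparison, the sender-side summary bitrates of the four runs above can be put side by side. This is only a minimal sketch: the numbers are copied by hand from the iperf3 summary lines, and the dictionary keys and the ratio() helper are illustrative names, not part of iperf3 or of either port.

```python
# Sender-side summary bitrates copied by hand from the four iperf3 runs
# above, converted to Mbit/s. The labels are illustrative, not tool output.
summary_mbits = {
    "no VPN, LRO/TSO on": 9410.0,
    "no VPN, LRO/TSO off": 4490.0,
    "wireguard-go": 591.0,
    "wireguard-kmod": 328.0,
}


def ratio(a, b):
    """Return how many times higher the bitrate of run `a` is than run `b`."""
    return summary_mbits[a] / summary_mbits[b]


go_vs_kmod = ratio("wireguard-go", "wireguard-kmod")
print(f"wireguard-go vs wireguard-kmod: {go_vs_kmod:.2f}x")

# Compare both VPN runs against the unencrypted run without LRO/TSO.
baseline = "no VPN, LRO/TSO off"
for name in ("wireguard-go", "wireguard-kmod"):
    share = 100 * summary_mbits[name] / summary_mbits[baseline]
    print(f"{name}: {share:.1f}% of the '{baseline}' bitrate")
```

On these figures wireguard-go comes out at roughly 1.8 times the wireguard-kmod bitrate. The bitrate alone does not show that the wireguard-go run had 5646 retransmissions and a bitrate that varied widely from second to second, while the wireguard-kmod run had none and stayed steady.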

Thank you for any suggestions on how to debug this issue.

Best regards
LQ
Comment 1 Bernhard Froehlich freebsd_committer freebsd_triage 2022-05-09 19:05:47 UTC
This is roughly what can be expected at the moment. The reasons behind it are quite simple.

1) The Go implementation has inefficient packet handling and runs in userland, which causes a lot of context switches, but it uses a performance-optimized implementation for the crypto operations.

2) The kernel module has more efficient packet handling but uses a safe, non-optimized crypto implementation.

There is work in the pipeline to improve the performance of the kernel implementation and to use the performance-optimized crypto from the kernel. [1]

[1] https://lists.zx2c4.com/pipermail/wireguard/2022-April/007534.html
Comment 2 Loic 2022-05-09 19:10:12 UTC
Thank you very much for your reply.
While iperf3 is running, 4 cores are in use although the system has 16. Is there a way to use more cores and get better performance while waiting for the new kernel implementation?
Comment 3 Bernhard Froehlich freebsd_committer freebsd_triage 2022-05-09 19:35:18 UTC
Sorry, don't know yet.
Comment 4 Loic 2022-05-09 19:36:46 UTC
No worries