Bug 272461 - Very low throughput mellanox arm64 Azure
Summary: Very low throughput mellanox arm64 Azure
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: arm64 Any
Importance: --- Affects Some People
Assignee: freebsd-net (Nobody)
URL:
Keywords: performance
Duplicates: 272462
Depends on:
Blocks:
 
Reported: 2023-07-12 05:15 UTC by schakrabarti@microsoft.com
Modified: 2023-10-18 04:55 UTC
CC List: 2 users

See Also:


Attachments

Description schakrabarti@microsoft.com 2023-07-12 05:15:18 UTC
On Azure ARM64 with a Mellanox MT27800 Family ConnectX-5 Virtual Function, we are seeing very low throughput in iperf3 testing: 2.4 Gbps instead of the available 24 Gbps. We are running FreeBSD 14.0-CURRENT.

[root@lisa--447-e0-n1 /home/lisatest]# iperf3 -Vsd
iperf 3.13
FreeBSD lisa--447-e0-n1 14.0-CURRENT FreeBSD 14.0-CURRENT #26 main-n263986-4631191c8a5f-dirty: Thu Jul  6 15:55:36 UTC 2023     root@fbsd13-nvme-test:/data/ws/obj/data/ws/main/arm64.aarch64/sys/GENERIC arm64
-----------------------------------------------------------
Server listening on 5201 (test #1)
-----------------------------------------------------------
get_parameters:
{
        "tcp":  true,
        "omit": 0,
        "time": 10,
        "num":  0,
        "blockcount":   0,
        "parallel":     1,
        "len":  131072,
        "pacing_timer": 1000,
        "client_version":       "3.13"
}
SNDBUF is 32768, expecting 0
RCVBUF is 65536, expecting 0
Time: Wed, 12 Jul 2023 05:12:02 UTC
Accepted connection from 10.0.0.4, port 30147
      Cookie: sufbhz7veq7mc2zfkhgfqb3brk7qtekmh35z
      TCP MSS: 0 (default)
Congestion algorithm is cubic
[  5] local 10.0.0.5 port 5201 connected to 10.0.0.4 port 30668
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 10 second test, tos 0
tcpi_snd_cwnd 13980 tcpi_snd_mss 1410 tcpi_rtt 10000
interval_len 1.000051 bytes_transferred 440081080
interval forces keep
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   420 MBytes  3.52 Gbits/sec
tcpi_snd_cwnd 13980 tcpi_snd_mss 1410 tcpi_rtt 10000
interval_len 0.999994 bytes_transferred 428949692
interval forces keep
[  5]   1.00-2.00   sec   409 MBytes  3.43 Gbits/sec
tcpi_snd_cwnd 13980 tcpi_snd_mss 1410 tcpi_rtt 10000
interval_len 1.000187 bytes_transferred 315787268
interval forces keep
[  5]   2.00-3.00   sec   301 MBytes  2.53 Gbits/sec
tcpi_snd_cwnd 13980 tcpi_snd_mss 1410 tcpi_rtt 10000
interval_len 1.000033 bytes_transferred 313509888
interval forces keep
[  5]   3.00-4.00   sec   299 MBytes  2.51 Gbits/sec
tcpi_snd_cwnd 13980 tcpi_snd_mss 1410 tcpi_rtt 10000
interval_len 0.999786 bytes_transferred 312184546
interval forces keep
[  5]   4.00-5.00   sec   298 MBytes  2.50 Gbits/sec
tcpi_snd_cwnd 13980 tcpi_snd_mss 1410 tcpi_rtt 10000
interval_len 1.000079 bytes_transferred 307495692
interval forces keep
[  5]   5.00-6.00   sec   293 MBytes  2.46 Gbits/sec
tcpi_snd_cwnd 13980 tcpi_snd_mss 1410 tcpi_rtt 10000
interval_len 1.000039 bytes_transferred 307502682
interval forces keep
[  5]   6.00-7.00   sec   293 MBytes  2.46 Gbits/sec
tcpi_snd_cwnd 13980 tcpi_snd_mss 1410 tcpi_rtt 10000
interval_len 0.999980 bytes_transferred 309119866
interval forces keep
[  5]   7.00-8.00   sec   295 MBytes  2.47 Gbits/sec
tcpi_snd_cwnd 13980 tcpi_snd_mss 1410 tcpi_rtt 10000
interval_len 0.999948 bytes_transferred 305474486
interval forces keep
[  5]   8.00-9.00   sec   291 MBytes  2.44 Gbits/sec
tcpi_snd_cwnd 13980 tcpi_snd_mss 1410 tcpi_rtt 10000
interval_len 1.000009 bytes_transferred 305829276
interval forces keep
[  5]   9.00-10.00  sec   292 MBytes  2.45 Gbits/sec
tcpi_snd_cwnd 13980 tcpi_snd_mss 1410 tcpi_rtt 10000
interval_len 0.000524 bytes_transferred 0
ignoring short interval with no data
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate
[  5] (sender statistics not available)
[  5]   0.00-10.00  sec  3.12 GBytes  2.68 Gbits/sec                  receiver
rcv_tcp_congestion cubic
get_results
{
        "cpu_util_total":       71.699362492382051,
        "cpu_util_user":        0.1566976289611289,
        "cpu_util_system":      71.589782122913959,
        "sender_has_retransmits":       1,
        "congestion_used":      "newreno",
        "streams":      [{
                        "id":   1,
                        "bytes":        3347795132,
                        "retransmits":  109,
                        "jitter":       0,
                        "errors":       0,
                        "packets":      0,
                        "start_time":   0,
                        "end_time":     10.000558
                }]
}
send_results
{
        "cpu_util_total":       35.778942258224347,
        "cpu_util_user":        0.15585994703341635,
        "cpu_util_system":      35.731875294043533,
        "sender_has_retransmits":       18446744073709551615,
        "congestion_used":      "cubic",
        "streams":      [{
                        "id":   1,
                        "bytes":        3345934476,
                        "retransmits":  18446744073709551615,
                        "jitter":       0,
                        "errors":       0,
                        "packets":      0,
                        "start_time":   0,
                        "end_time":     10.00063
                }]
}
iperf 3.13
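
One more data point from the server log above: get_results reports cpu_util_system of about 72% on the receiving side, so the receiver may simply be CPU-bound in the kernel. A generic way to see where that time goes (FreeBSD base-system tools only; nothing here is specific to this VM):

# per-CPU usage, including kernel threads
top -SHP
# interrupt counts and rates per device/queue
vmstat -i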

The client side also shows retransmits, after which the speed drops:
# iperf3 -c 10.0.0.5
Connecting to host 10.0.0.5, port 5201
[  5] local 10.0.0.4 port 30668 connected to 10.0.0.5 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   421 MBytes  3.52 Gbits/sec    0   1.61 MBytes
[  5]   1.00-2.00   sec   410 MBytes  3.44 Gbits/sec  107    579 KBytes
[  5]   2.00-3.00   sec   302 MBytes  2.53 Gbits/sec    0    879 KBytes
[  5]   3.00-4.00   sec   299 MBytes  2.51 Gbits/sec    0    886 KBytes
[  5]   4.00-5.00   sec   297 MBytes  2.50 Gbits/sec    0    910 KBytes
[  5]   5.00-6.00   sec   293 MBytes  2.46 Gbits/sec    0    961 KBytes
[  5]   6.00-7.00   sec   293 MBytes  2.46 Gbits/sec    0    982 KBytes
[  5]   7.00-8.00   sec   295 MBytes  2.47 Gbits/sec    2    884 KBytes
[  5]   8.00-9.00   sec   291 MBytes  2.45 Gbits/sec    0    929 KBytes
[  5]   9.00-10.00  sec   292 MBytes  2.45 Gbits/sec    0    973 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  3.12 GBytes  2.68 Gbits/sec  109             sender
[  5]   0.00-10.00  sec  3.12 GBytes  2.68 Gbits/sec                  receiver
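
Worth noting: the burst of 107 retransmits in the 1.00-2.00 sec interval is exactly where the cwnd drops from 1.61 MBytes to 579 KBytes, and the rate never recovers afterwards. A minimal way to cross-check iperf3's Retr column against the kernel's own counters (standard FreeBSD tools, not specific to this setup):

# kernel-wide TCP retransmission statistics
netstat -s -p tcp | grep -i retrans
# active congestion control module (cubic in the traces above)
sysctl net.inet.tcp.cc.algorithm
# socket-buffer autotuning ceilings, relevant to the small SNDBUF/RCVBUF reported above
sysctl net.inet.tcp.sendbuf_max net.inet.tcp.recvbuf_max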


ifconfig output:
root@schakrabarti-freebsd:/datadrive/sandbox_21_06/src/sys # ifconfig -a
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
hn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8051b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO,LINKSTATE>
        ether 60:45:bd:d4:97:44
        inet 172.30.0.4 netmask 0xffffff00 broadcast 172.30.0.255
        media: Ethernet 50GBase-KR4 <full-duplex,rxpause,txpause>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
mce0: flags=8a43<UP,BROADCAST,RUNNING,ALLMULTI,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8805bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,LINKSTATE>
        ether 60:45:bd:d4:97:44
        media: Ethernet 50GBase-KR4 <full-duplex,rxpause,txpause>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

Please help us solve this issue, as it will block FreeBSD on Azure ARM64.
Comment 1 Graham Perrin freebsd_committer freebsd_triage 2023-07-12 20:34:18 UTC
*** Bug 272462 has been marked as a duplicate of this bug. ***
Comment 2 schakrabarti@microsoft.com 2023-07-19 07:14:59 UTC
# nuttcp -r 10.0.0.6
11183.5827 MB /  10.00 sec = 9377.7762 Mbps 12 %TX 99 %RX 0 retrans 0.94 msRTT
# nuttcp -t 10.0.0.6
 7382.5625 MB /  10.01 sec = 6189.0401 Mbps 88 %TX 23 %RX 1272 host-retrans 1.09 msRTT
# nuttcp -t 10.0.0.6
 7165.5625 MB /  10.01 sec = 6007.3615 Mbps 95 %TX 22 %RX 932 host-retrans 1.24 msRTT
# nuttcp -t 10.0.0.6
 7328.3125 MB /  10.01 sec = 6143.3396 Mbps 97 %TX 23 %RX 1128 host-retrans 0.89 msRTT
# nuttcp -r 10.0.0.6
11282.3515 MB /  10.00 sec = 9461.1586 Mbps 12 %TX 99 %RX 0 retrans 1.07 msRTT

# ifconfig
lo0: flags=1008049<UP,LOOPBACK,RUNNING,MULTICAST,LOWER_UP> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
hn0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        options=0
        ether 60:45:bd:c7:5b:6a
        inet 10.0.0.4 netmask 0xffffff00 broadcast 10.0.0.255
        media: Ethernet 100GBase-CR4 <full-duplex,rxpause,txpause>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
mce0: flags=1008a43<UP,BROADCAST,RUNNING,ALLMULTI,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        options=18a05ab<RXCSUM,TXCSUM,VLAN_MTU,JUMBO_MTU,VLAN_HWCSUM,TSO4,LRO,NV,LINKSTATE,HWSTATS,TXRTLMT>
        ether 60:45:bd:c7:5b:6a
        media: Ethernet 100GBase-CR4 <full-duplex,rxpause,txpause>
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
# uname -a
FreeBSD lisa--523-e0-n0 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n263986-4631191c8a5f: Fri Jul 14 09:03:58 UTC 2023     root@fbsd13-nvme-test:/data/ws/obj/data/ws/main/arm64.aarch64/sys/GENERIC-NODEBUG arm64
#

The nuttcp server was running on an Ubuntu x86 VM on the same Azure subnet.
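
For completeness, the pair above uses nuttcp's standard flags, roughly as follows (10.0.0.6 is the Ubuntu server, as in the runs above):

# on the Ubuntu x86 side: start nuttcp in server mode
nuttcp -S
# on the FreeBSD ARM64 guest: transmit test (FreeBSD -> Ubuntu)
nuttcp -t 10.0.0.6
# and the receive test (Ubuntu -> FreeBSD)
nuttcp -r 10.0.0.6

So the slow direction (~6 Gbps with 900+ host-retrans) is FreeBSD transmitting, while FreeBSD receiving runs at ~9.4 Gbps with zero retransmits.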
Comment 3 schakrabarti@microsoft.com 2023-07-19 07:40:41 UTC
Some more data points:
On a Linux Ubuntu x86 system, nuttcp -t -T 5 -w 128 -v localhost (loopback) runs at about 32 Gbps:
root@ubuntu2004:/home/lisatest# nuttcp -t -T 5 -w 128 -v localhost
nuttcp-t: v6.1.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5001 tcp -> localhost
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 127.0.0.1 with mss=32767, RTT=0.074 ms
nuttcp-t: send window size = 262144, receive window size = 131072
nuttcp-t: available send window = 131072, available receive window = 65536
nuttcp-t: 19626.5625 MB in 5.00 real seconds = 4019455.69 KB/sec = 32927.3810 Mbps
nuttcp-t: retrans = 0
nuttcp-t: 314025 I/O calls, msec/call = 0.02, calls/sec = 62804.00
nuttcp-t: 0.0user 4.9sys 0:05real 99% 0i+0d 926maxrss 0+24pf 109+168csw

nuttcp-r: v6.1.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5001 tcp
nuttcp-r: accept from 127.0.0.1
nuttcp-r: send window size = 2626560, receive window size = 262144
nuttcp-r: available send window = 1313280, available receive window = 131072
nuttcp-r: 19626.5625 MB in 5.00 real seconds = 4019583.51 KB/sec = 32928.4281 Mbps
nuttcp-r: 318597 I/O calls, msec/call = 0.02, calls/sec = 63720.41
nuttcp-r: 0.0user 2.3sys 0:05real 49% 0i+0d 92maxrss 0+18pf 308774+48csw
root@ubuntu2004:/home/lisatest#


Now the FreeBSD ARM64 guest: about 30 Gbps over loopback, comparable to Linux:

# nuttcp -t -T 5 -w 128 -v localhost
nuttcp-t: v8.2.2: socket
nuttcp-t: buflen=65536, nstream=1, port=5101 tcp -> localhost
nuttcp-t: time limit = 5.00 seconds
nuttcp-t: connect to 127.0.0.1 with mss=16344, RTT=0.051 ms af=inet
nuttcp-t: send window size = 147096, receive window size = 81720
nuttcp-t: 18235.1250 MB in 5.02 real seconds = 3720201.81 KB/sec = 30475.8932 Mbps
nuttcp-t: host-retrans = 0
nuttcp-t: 291762 I/O calls, msec/call = 0.02, calls/sec = 58128.15
nuttcp-t: 0.0user 4.7sys 0:05real 94% 106i+202d 1200maxrss 0+2pf 18597+0csw

nuttcp-r: v8.2.2: socket
nuttcp-r: buflen=65536, nstream=1, port=5101 tcp
nuttcp-r: accept from 127.0.0.1 with af=inet
nuttcp-r: send window size = 49032, receive window size = 147096
nuttcp-r: 18235.1250 MB in 5.02 real seconds = 3717070.73 KB/sec = 30450.2434 Mbps
nuttcp-r: 332594 I/O calls, msec/call = 0.02, calls/sec = 66207.40
nuttcp-r: 0.0user 4.6sys 0:05real 92% 104i+198d 1172maxrss 0+18pf 32651+0csw
#
Comment 4 schakrabarti@microsoft.com 2023-09-14 07:58:41 UTC
If we run iperf3 with multiple servers bound to different ports and connect multiple iperf3 clients, we get throughput similar to Linux (a sketch of this setup is below). So there is no issue with mlx5 or Hyper-V here.
Similar behavior was reported earlier at https://forums.freebsd.org/threads/poor-performance-with-stable-13-and-mellanox-connectx-6-mlx5.85460/
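
For reproducibility, the multi-server setup can be scripted as below (the port numbers are arbitrary, and 10.0.0.5 stands in for the server address from the earlier runs). Note that iperf3, at least through the 3.13 used here, serves each test from a single thread, so one process cannot drive more than one CPU; separate processes on separate ports can:

# on the server: one iperf3 daemon per port
for p in 5201 5202 5203 5204; do
    iperf3 -s -D -p $p
done

# on the client: one iperf3 process per port, run in parallel
for p in 5201 5202 5203 5204; do
    iperf3 -c 10.0.0.5 -p $p -t 10 &
done
wait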