275882 – net/realtek-re-kmod: hangs after update to 199.00

Status:	New

Alias:	None

Product:	Ports & Packages
Classification:	Unclassified
Component:	Individual Port(s)
Version:	Latest
Hardware:	Any Any

Importance:	--- Affects Some People
Assignee:	Alex Dupre

URL:
Keywords:

Depends on:
Blocks:

Reported:	2023-12-22 10:30 UTC by Tino Engel
Modified:	2024-11-25 02:59 UTC
CC List:	22 users

See Also:

Flags:	bugzilla: maintainer-feedback? (ale)

Attachments
kdump of 'ktrace curl google.de' (167.33 KB, text/plain) 2023-12-26 08:52 UTC, Tino Engel	no flags
kdump of 'ktrace -i curl www.freebsd.org' (148.53 KB, text/plain) 2024-01-04 10:07 UTC, Tino Engel	no flags
fruss -f curl www.freebsd.org (94.79 KB, text/plain) 2024-01-05 18:56 UTC, Tino Engel	no flags
0001-net-realtek-re-kmod-downgrade-to-198.00.patch (3.32 KB, patch) 2024-03-13 01:29 UTC, Koichiro Iwao	no flags
0001-net-realrek-re-kmod198-add-port-for-198-version.patch (4.44 KB, patch) 2024-03-13 09:50 UTC, Koichiro Iwao	no flags
console message during panic (218.68 KB, image/jpeg) 2024-06-03 12:49 UTC, rdunkle	no flags

Description Tino Engel 2023-12-22 10:30:15 UTC

On FreeBSD 14.0-RELEASE-p4 amd64 with Killer E3000 I have the following problem:
The driver hangs after update from 198.00 to 199.00. DHCP looks good, but any process that tries to access network simply hangs.
After rollback to 198.00 it works again.

Comment 1 Alexander Vereeken freebsd_triage

2023-12-23 00:08:04 UTC

Hello,

I have the same sort of problem as well.

When I get to the Steam login dialog in Wine, the whole connection gets stuck. As in Tino's case, DHCP etc. looks fine.

Reverting to 198.00 helps for me as well.

Comment 2 Martin Birgmeier 2023-12-25 12:48:05 UTC

Same issue here... ASUS Prime X670-P WiFi motherboard, after upgrade to 1.99 a few ping packets can be exchanged initially after setting the IP address, then nothing.

Comment 3 Alex Dupre freebsd_committer

2023-12-25 16:34:16 UTC

Tino, are you able to debug the issue?

This new version fixed some severe issues for some cards, and introduced new severe issues for others :-(

Comment 4 Alex Dupre freebsd_committer

2023-12-26 08:37:56 UTC

Is LRO enabled after the update to 1.99 on your card? Can you try disabling it and check if the problem persists?
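
A minimal sketch of how LRO could be switched off at runtime for such a test, assuming the interface is named re0 (the reporter's interface name is not given at this point in the thread). The helper name `disable_lro_tso` is hypothetical; it also turns off TSO, which is a commonly paired offload setting and an addition to what was asked here.

```sh
# Turn off LRO and TSO on one interface at runtime. The setting lasts
# until the next reboot or until the interface is reconfigured.
disable_lro_tso() {
    ifconfig "$1" -lro -tso
}

# Intended use, as root on the affected machine (not run here):
#   disable_lro_tso re0
```

Checking the `options=` line of `ifconfig re0` afterwards should show whether LRO is still listed.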

Comment 5 Tino Engel 2023-12-26 08:52:51 UTC

Created attachment 247259
kdump of 'ktrace curl google.de'

Comment 6 Tino Engel 2023-12-26 08:54:54 UTC

Hello Alex,

I already tried to debug the issue, but have no findings yet.
I tried to ktrace/kdump a hanging process ('ktrace curl google.de'). I have attached the output to this ticket. Maybe this gives you a better idea of what is going wrong (it is not evident to me)?

Comment 7 Tino Engel 2023-12-26 09:10:00 UTC

P.S.: I also tried https://wiki.freebsd.org/Networking/10GbE/Router#Disabling_LRO_and_TSO without success.

Comment 8 Tino Engel 2024-01-04 10:07:33 UTC

Created attachment 247441
kdump of 'ktrace -i curl www.freebsd.org'

I have tried again to debug the issue, but unfortunately it seems this is over my head.
I have attached a new trace, this time also tracing the sub-processes.
I am not good at reading kdumps, but I have the impression that curl calls www.freebsd.org and waits forever for an answer.

If anyone has an idea, I am willing to invest more time in this issue.

Comment 9 Tino Engel 2024-01-05 18:56:59 UTC

Created attachment 247468
fruss -f curl www.freebsd.org

I am not willing to give up on this.
I have also attached a truss trace.
I am digging through it; nevertheless, any hints are appreciated.

Comment 10 Martin Birgmeier 2024-01-05 19:21:07 UTC

Hi Tino,

From the behavior I have seen this is an issue in the new driver. After booting it can exchange a few packets and then stops working. This means that userland traces like you are supplying probably won't give many clues as to what is happening.

I have compared the 1.98 and 1.99 sources from https://github.com/alexdupre/rtl_bsd_drv, and there are extensive changes, so it is not easy to find what causes the regression. The best way forward might be to ask the Realtek people who supply the original code, which can be found at https://www.realtek.com/en/component/zoo/category/network-interface-controllers-10-100-1000m-gigabit-ethernet-pci-express-software, to also support FreeBSD 14.

-- Martin

Comment 11 Tino Engel 2024-01-06 11:02:32 UTC

Hi Martin,

I also have compared the 1.98 and 1.99 sources from https://github.com/alexdupre/rtl_bsd_drv. I even tried some minor changes, but did not manage to get it working.
The Realtek site (https://www.realtek.com/en/component/zoo/category/network-interface-controllers-10-100-1000m-gigabit-ethernet-pci-express-software) is confusing. They offer a FreeBSD driver (latest version from 09/2023) for FreeBSD 7 and 8. That absolutely makes no sense to me.
I'll try to contact Realtek, maybe they are gonna help if we are lucky.

Tino

Comment 12 imbutler 2024-01-06 20:26:58 UTC

Something appears to have changed in the kernel ..

I can boot ..

FreeBSD 15.0-CURRENT #4 main-c3268c23de4: Mon Jan  1 20:17:26 EST 2024

 .. but my next snapshot of a build at ..

FreeBSD 15.0-CURRENT #8 main-10f2e94acc1: Tue Jan  2 16:46:09 EST 2024

 .. (or anything after that) panics with the message 're0 taskq'

I used the same module from ports (realtek-re-kmod-199.00_1) in each case.

Comment 13 rdunkle 2024-01-09 14:08:04 UTC

FreeBSD 14.0-STABLE #0 stable/14-53a984a36  arm64.aarch64
I compiled realtek-re-kmod today. The module appears to load OK and the NIC is recognized, but the kernel panics shortly afterwards.
dmesg | grep re0
re0: <Realtek PCIe 2.5GbE Family Controller> mem 0xf3000000-0xf300ffff,0xf3010000-0xf3013fff at device 0.0 on pci1
re0: Using Memory Mapping!
re0: Using line-based interrupt
re0: version:1.98.00
---------------------
I switched back to the realtek-re-kmod from the FreeBSD repo. That one appears to work OK.

Comment 14 Alex Dupre freebsd_committer

2024-01-09 15:14:09 UTC

(In reply to rdunkle from comment #13)

From your log it seems you compiled an old 1.98 version, so it's not actually related to this issue that started with 1.99, according to other users.

Comment 15 rdunkle 2024-01-09 15:45:13 UTC

That dmesg is from the old version, correct. That version runs OK. I did a git pull on ports today and compiled. The new version causes a kernel panic, so I cannot get a dmesg with the new version.

Comment 16 Alexander Vereeken freebsd_triage

2024-01-09 17:08:44 UTC

(In reply to rdunkle from comment #15)

Not even in /var/log/messages ?

Comment 17 rdunkle 2024-01-09 17:36:32 UTC

When I boot with 1.99.04 there is a panic and /var/log/messages is empty.
When I boot with 1.98 the NICs work.
root@orange:/boot/modules # strings if_re.ko | grep 1.99
1.99.04
root@orange:/boot/modules # strings if_re.ko.save | grep 1.98
1.98.00
Is there something else I can do to get useful information for you?

Comment 18 Alexander Vereeken freebsd_triage

2024-01-10 07:20:39 UTC

(In reply to rdunkle from comment #17)

I guess that you can obtain something when you load the module while the system is running.

Remove the module from your loader.conf then load the module manually later with:

kldload /boot/modules/if_re.ko

then the panic should be documented in /var/log/messages.

Comment 19 rdunkle 2024-01-10 08:46:16 UTC

The kldload completes. About 2 seconds later the system reboots. The version info is not written to the log and the previous log entries vanish.

Jan 10 10:30:50 orange kernel: , 1061.
Jan 10 10:30:50 orange ntpd[1008]: ntpd exiting on signal 15 (Terminated)
Jan 10 10:30:50 orange kernel: .
Jan 10 10:30:51 orange kernel: , 736.
Jan 10 10:30:51 orange syslogd: exiting on signal 15
Jan 10 10:32:15 orange syslogd: kernel boot file is /boot/kernel/kernel
Jan 10 10:32:15 orange kernel: ---<<BOOT>>---
Jan 10 10:32:15 orange kernel: Copyright (c) 1992-2023 The FreeBSD Project.
Jan 10 10:32:15 orange kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jan 10 10:32:15 orange kernel:  The Regents of the University of California. All rights reserved.
Jan 10 10:32:15 orange kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
Jan 10 10:32:15 orange kernel: FreeBSD 14.0-STABLE #0 stable/14-53a984a36: Mon Jan  8 12:46:16 EET 2024
Jan 10 10:32:15 orange kernel:     root@sky22.smallcatbrain.com:/usr/obj/usr/src-stable-14/arm64.aarch64/sys/GENERIC arm64

Comment 20 Ott Köstner 2024-01-19 17:58:09 UTC

I can confirm this bug. Everything seems to work, but no traffic goes through the interface.

I have a custom-built 14.0 kernel and realtek-re-kmod-199.00_1 built from the port.
I tried different ifconfig options and got it working at some point, but it was not stable. Also, repeating the same sequence of disabling offload options did not give the same results.

No error messages. Driver loads OK, and ifconfig shows the status active. 

Devices are:
device     = 'RTL8125 2.5GbE Controller'
and
device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'

None of these is working with this driver.

What is more interesting is that on another machine with no Realtek devices, loading this driver disables all traffic on another interface (bge) that is not related to Realtek.

Comment 21 Koichiro Iwao freebsd_committer

2024-01-30 01:11:51 UTC

I also encountered this issue. 198 works fine, and 199 stops working after exchanging a few packets such as DHCP and IPv6 RA.

My devices are:

re0@pci0:2:0:0: class=0x020000 rev=0x15 hdr=0x00 vendor=0x10ec device=0x8168 subvendor=0x103c subdevice=0x806a
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet

re0@pci0:4:0:0: class=0x020000 rev=0x15 hdr=0x00 vendor=0x10ec device=0x8161 subvendor=0x10ec subdevice=0x8168
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet
re1@pci0:5:0:0: class=0x020000 rev=0x0e hdr=0x00 vendor=0x10ec device=0x8168 subvendor=0x17aa subdevice=0x32e1
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet

Comment 22 László Károlyi 2024-02-14 13:59:14 UTC

I've had to recompile version 198 for myself after rebooting with the GENERIC re0 driver, after the update caused ~3hrs of downtime (it took 2hrs to get a console on my server).

Definitely can confirm this is an issue, because it made my server unreachable as well.

What I noticed is, upon first rebooting with the faulty driver, the server responded to 3 pings (IPv6) and then went completely silent.

Looking forward to a fix here because I now can't install the latest driver from realtek-re-kmod.

Comment 23 Victor Volpe 2024-03-12 06:38:45 UTC

Same problem with the version 199.00_1. Previous versions worked as intended.

FreeBSD home.local 13.2-RELEASE-p10 FreeBSD 13.2-RELEASE-p10 GENERIC amd64

re0@pci0:1:0:0: class=0x020000 rev=0x15 hdr=0x00 vendor=0x10ec device=0x8168 subvendor=0x10ec subdevice=0x0123
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet

/boot/loader.conf
if_re_load="YES"
if_re_name="/boot/modules/if_re.ko"
hw.re.max_rx_mbuf_sz="2048"

No feedback yet?

Comment 24 Koichiro Iwao freebsd_committer

2024-03-13 01:29:35 UTC

Created attachment 249124
0001-net-realtek-re-kmod-downgrade-to-198.00.patch

I suggest downgrading this port to 198 until the issue is resolved.

Comment 25 Alex Dupre freebsd_committer

2024-03-13 07:31:14 UTC

(In reply to Koichiro Iwao from comment #24)

Unfortunately 1.98 was broken for another set of people/cards (see for example https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=274995 that was reported by many people), reverting is not a good solution.

Comment 26 Koichiro Iwao freebsd_committer

2024-03-13 08:31:11 UTC

(In reply to Alex Dupre from comment #25)
I see. Then we probably need to create another port for 198. At the very least, 198 needs to be installable via pkg install for the people for whom 199 doesn't work.

Comment 27 Koichiro Iwao freebsd_committer

2024-03-13 08:51:48 UTC

In addition, using the default driver instead of this port is not a solution either. It has a watchdog timeout issue, so using 198 is the only solution so far. They really need 198.

Comment 28 Alex Dupre freebsd_committer

2024-03-13 09:13:14 UTC

(In reply to Koichiro Iwao from comment #27)
I know, that was the main reason to create this port. I have no objections if you want to restore the previous version as a separate port.

Comment 29 Koichiro Iwao freebsd_committer

2024-03-13 09:50:00 UTC

Created attachment 249128
0001-net-realrek-re-kmod198-add-port-for-198-version.patch

Here it is. Feel free to modify it if you think it necessary. It should also be added to quarterly because the quarterly branch has already been updated to 199.

Comment 30 Alex Dupre freebsd_committer

2024-03-13 16:39:27 UTC

(In reply to Koichiro Iwao from comment #29)
I think you can drop the `PORTREVISION=3` from the new port. I'm time limited, you are welcome to commit (and take the maintainership of) this new port.

Comment 31 Victor Volpe 2024-03-14 00:10:48 UTC

(In reply to Koichiro Iwao from comment #27)
4 days of uptime with no watchdog timeout so far. What FreeBSD version are you running?

# uname -a
FreeBSD home.local 13.2-RELEASE-p10 FreeBSD 13.2-RELEASE-p10 GENERIC amd64
# ifconfig re0
re0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=82099<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
        ether 7c:83:34:b1:8f:8f
        inet 192.168.15.250 netmask 0xffffff00 broadcast 192.168.15.255
        inet6 fe80::7e83:34ff:feb1:8f8f%re0 prefixlen 64 scopeid 0x1
        inet6 2804:7f0:ba41:1e60:7e83:**** prefixlen 64 autoconf
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
# netstat -db -I re0
Name    Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll  Drop
re0    1500 <Link#1>      7c:83:34:b1:8f:8f 58536613     0     0 38572546833 101711299     0 99083129652     0   144

Comment 32 László Károlyi 2024-03-14 00:19:16 UTC

(In reply to Victor Volpe from comment #31)
Victor,

there is an entire bug dedicated to the watchdog timeout (bug #166724), I know because I was a victim of it.

Although it recently disappeared for me, I don't want to risk going back to it on a bare-metal production server like mine. I only found out that it had disappeared because a mis-compiled v198 of mine didn't work and the built-in re0 loaded instead, which I noticed weeks later when test-rebooting for some pf rule changes I had made.

Cheers,
László

Comment 33 Victor Volpe 2024-03-14 00:24:47 UTC

(In reply to László Károlyi from comment #32)
Yes, I know that, mate. I was affected too on 12-RELEASE and I've been using the kmod driver since version 196.04. Now, with my system upgraded to 13.2, I have had no more watchdog timeouts since the version 199 bug made me downgrade to the default driver.

Comment 34 László Károlyi 2024-03-14 00:27:43 UTC

(In reply to Victor Volpe from comment #33)
Welp, that makes two of us then.

Maybe more testing is in order for the default driver.

Comment 35 commit-hook freebsd_committer

2024-03-14 02:04:17 UTC

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=b770c919121526ebbf61b81fd6b832619319df60

commit b770c919121526ebbf61b81fd6b832619319df60
Author:     Koichiro Iwao <meta@FreeBSD.org>
AuthorDate: 2024-03-13 08:52:50 +0000
Commit:     Koichiro Iwao <meta@FreeBSD.org>
CommitDate: 2024-03-14 02:03:06 +0000

    net/realrek-re-kmod198: add port for 198 version

    as a workaround for bug 275882. This port can be retired when the bug is
    resolved completely.

    Many people need the 198 version because of the hang-up issue. Another
    set of people need 199 because of another issue. This port is needed to
    satisfy both sets of people until complete until a complete solution for
    275882 is found.

    PR:             275882
    Sponsored by:   Cybertrust Japan

 net/Makefile                             |  1 +
 net/realtek-re-kmod198/Makefile (new)    | 23 +++++++++++++++++++++++
 net/realtek-re-kmod198/distinfo (new)    |  3 +++
 net/realtek-re-kmod198/pkg-descr (new)   | 25 +++++++++++++++++++++++++
 net/realtek-re-kmod198/pkg-message (new) | 22 ++++++++++++++++++++++
 5 files changed, 74 insertions(+)

Comment 36 commit-hook freebsd_committer

2024-03-14 02:06:20 UTC

A commit in branch 2024Q1 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=f967592923a21e7b44c11a45f7a241439a97f163

commit f967592923a21e7b44c11a45f7a241439a97f163
Author:     Koichiro Iwao <meta@FreeBSD.org>
AuthorDate: 2024-03-13 08:52:50 +0000
Commit:     Koichiro Iwao <meta@FreeBSD.org>
CommitDate: 2024-03-14 02:04:19 +0000

    net/realrek-re-kmod198: add port for 198 version

    as a workaround for bug 275882. This port can be retired when the bug is
    resolved completely.

    Many people need the 198 version because of the hang-up issue. Another
    set of people need 199 because of another issue. This port is needed to
    satisfy both sets of people until complete until a complete solution for
    275882 is found.

    PR:             275882
    Sponsored by:   Cybertrust Japan

    (cherry picked from commit b770c919121526ebbf61b81fd6b832619319df60)

 net/Makefile                             |  1 +
 net/realtek-re-kmod198/Makefile (new)    | 23 +++++++++++++++++++++++
 net/realtek-re-kmod198/distinfo (new)    |  3 +++
 net/realtek-re-kmod198/pkg-descr (new)   | 25 +++++++++++++++++++++++++
 net/realtek-re-kmod198/pkg-message (new) | 22 ++++++++++++++++++++++
 5 files changed, 74 insertions(+)

Comment 37 Koichiro Iwao freebsd_committer

2024-03-14 02:10:30 UTC

(In reply to Alex Dupre from comment #30)
Thanks, I have added the port. 

Guys, the temporary workaround until the complete resolution is to use net/realtek-re-kmod198 instead.

Comment 38 Ott Köstner 2024-03-20 17:21:59 UTC

The temporary workaround with net/realtek-re-kmod198 works in my case. That confirms the hardware is OK and this is a driver bug.

Still waiting for the new driver net/realtek-re-kmod to be fixed.

Comment 39 rdunkle 2024-06-03 12:47:54 UTC

A little more data.
FreeBSD 14.1 release, arm64. I did a pkg fetch of Realtek driver 1.99.04.
The system panics at boot during DHCPDISCOVER:
Starting dhclient.
DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 5
panic: driver error: _bus_dma_dflt_lock called
cpuid = 0

Comment 40 rdunkle 2024-06-03 12:49:45 UTC

Created attachment 251191
console message during panic

Comment 41 Alex Dupre freebsd_committer

2024-06-04 09:57:09 UTC

I've ported the new driver version 1.100. I'd like to know if it fixes the issue that many of you are experiencing with the 1.99 version. I'd be glad if you could try building the port replacing the `GH_TAGNAME` variable with the following commits:
- ea4ed1e version with all patchset applied
- eb00816 version with minimal patchset applied

1. Change the variable in the makefile
2. Run `make makesum`
3. Build the port as usual
4. Let me know if any of them work
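
The steps above could be scripted roughly as follows. This is only a sketch: the helper name `set_gh_tagname` is hypothetical, the ports tree is assumed to live in /usr/ports, and the port Makefile is assumed to contain a line starting with `GH_TAGNAME=`.

```sh
# Rewrite the GH_TAGNAME assignment in a port Makefile to the given commit.
# A temporary file is used instead of "sed -i", whose syntax differs
# between FreeBSD sed and GNU sed.
set_gh_tagname() {
    makefile=$1
    tag=$2
    sed "s/^GH_TAGNAME=.*/GH_TAGNAME=${tag}/" "$makefile" > "$makefile.tmp" &&
        mv "$makefile.tmp" "$makefile"
}

# Intended use on a FreeBSD ports tree (not run here):
#   cd /usr/ports/net/realtek-re-kmod
#   set_gh_tagname Makefile ea4ed1e    # or eb00816
#   make makesum                       # refresh distinfo for the new commit
#   make                               # then build and install as usual
```

Repeating the run with the second commit hash gives the other variant to compare.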

Thanks!

Comment 42 rdunkle 2024-06-05 10:36:00 UTC

I built both versions.  I see the same panic--
panic: driver error: _bus_dma_dflt_lock called

Comment 43 László Károlyi 2024-06-22 15:04:15 UTC

(In reply to Alex Dupre from comment #41)
Unfortunately, I can confirm that version 1.100 still doesn't work on my production server; I had to go back to version 1.98.

Quick info about the installed, failing package:
realtek-re-kmod-1100.00
Name           : realtek-re-kmod
Version        : 1100.00
Installed on   : Sat Jun 22 16:41:20 2024 CEST
Origin         : net/realtek-re-kmod
Architecture   : FreeBSD:14:amd64
Prefix         : /usr/local
Categories     : net kld
Licenses       : BSD4CLAUSE
Maintainer     : ale@FreeBSD.org
WWW            : https://github.com/alexdupre/rtl_bsd_drv
Comment        : Kernel driver for Realtek PCIe Ethernet Controllers
Annotations    :
    FreeBSD_version: 1400097
    build_timestamp: 2024-06-15T12:25:56+0000
    built_by       : poudriere-git-3.4.1-30-g79e3edcd
    port_checkout_unclean: no
    port_git_hash  : 38b614919
    ports_top_checkout_unclean: no
    ports_top_git_hash: ffe948747
    repo_type      : binary
    repository     : FreeBSD



-----
The kernel driver says upon booting without the realtek driver:
Chip rev. 0x54000000
MAC rev. 0x00100000


Hope this helps. This is on 14.1-RELEASE-p1.

Version 1.98 still works flawlessly.

Comment 44 Danilo Egea Gondolfo freebsd_committer

2024-07-28 09:52:00 UTC

I'm having a similar issue. I recently got a PC with the NIC below:


re0@pci0:7:0:0:	class=0x020000 rev=0x05 hdr=0x00 vendor=0x10ec device=0x8125 subvendor=0x1043 subdevice=0x87d7
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8125 2.5GbE Controller'
    class      = network
    subclass   = ethernet

I'm using this version of the driver realtek-re-kmod-1100.00

It was working perfectly fine until I decided to enable IPv6 on my home network. Once the interface acquires IPv6 addresses it will completely stop sending/receiving traffic. Disabling IPv6 makes it work again. Is that the same scenario you guys have?

After playing around with some NIC settings I found out that disabling checksum offload will fix the problem for me.

This is what I'm currently doing:

ifconfig_re0="-rxcsum6 -txcsum6 -rxcsum -txcsum DHCP"

This is the state of my NIC:

re0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
	options=2518<VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO,WOL_MAGIC>


After disabling TX/RX checksum offloading it's working reliably with IPv6.

Comment 45 Larry Rosenman freebsd_committer

2024-08-20 08:55:39 UTC

Turning off checksum offload fixes it for me as well on 15-CURRENT.

Comment 46 David Marker 2024-08-20 17:10:46 UTC

Using the latest driver realtek-re-kmod-1100.00 with the following hardware:

re0@pci0:10:0:0:        class=0x020000 rev=0x15 hdr=0x00 vendor=0x10ec device=0x8168 subvendor=0x1849 subdevice=0x8168
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet

I stumbled across this bug report after updating my system (to stable/14 1ff3118d72b1) but in an even more odd way: everything worked fine until I attempted to use `rdate -p time.google.com`. That is an IPv6 address (for me) which did matter, using an IPv4 server was fine.

When it failed, `rdate` would periodically print that it did not receive enough valid responses. But more importantly for this bug, it reliably hung networking. No kernel panic, just immediate loss of network. You don't need to wait for `rdate` either; breaking out immediately with CTRL-C still triggers the issue. Any remote shells I had were immediately non-responsive, as you would expect when the network dies.

I added "-rxcsum6 -txcsum6 -rxcsum -txcsum" in /etc/rc.conf for re0 and no longer have issues.

Comment 47 Alex Dupre freebsd_committer

2024-08-21 12:42:55 UTC

Can you check if all 4 flags are required to fix the issue (-rxcsum6 -txcsum6 -rxcsum -txcsum) or just a subset? I'd like to add such information to the port's pkg-message.
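
One way to work through this systematically is to list every non-empty subset of the four flags and try them one at a time. The sketch below only prints the candidate commands and does not run them; the function name `csum_subsets` is hypothetical and re0 is assumed as the interface name. Each combination presumably needs to start from a freshly configured interface, since an earlier hang may not clear by itself.

```sh
# Print one ifconfig command per non-empty subset of the four
# checksum-offload flags. Nothing is executed here.
csum_subsets() {
    ifn=$1
    for a in "" -rxcsum; do
        for b in "" -txcsum; do
            for c in "" -rxcsum6; do
                for d in "" -txcsum6; do
                    # Unquoted on purpose: empty entries vanish.
                    set -- $a $b $c $d
                    if [ $# -gt 0 ]; then
                        echo "ifconfig $ifn $*"
                    fi
                done
            done
        done
    done
}

# Intended use (prints 15 candidate commands):
#   csum_subsets re0
```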

Comment 48 Larry Rosenman freebsd_committer

2024-08-21 14:32:52 UTC

I tried disabling pairs of the offload flags, and it always fails unless ALL 4 are disabled.

Comment 49 David Marker 2024-08-21 16:04:02 UTC

On stable/14 I tried several combinations. Like comment #48 I found I needed all 4.

If I request just "-rxcsum -txcsum", `ifconfig` will still show RXCSUM in the interface options.
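
A small sketch for checking which checksum-offload capabilities `ifconfig` still reports after the flags have been applied. The function name `csum_flags_left` is hypothetical; it assumes the `options=...<...>` line format seen in the ifconfig output quoted elsewhere in this report and the capability names RXCSUM, TXCSUM, RXCSUM_IPV6 and TXCSUM_IPV6.

```sh
# Print the checksum-offload capabilities that are still listed on the
# interface "options=" line of ifconfig output read from stdin.
# Prints nothing (and returns non-zero) when none remain.
# The "nd6 options=" line is ignored because it does not start with "options=".
csum_flags_left() {
    sed -n 's/^[[:space:]]*options=[0-9a-f]*<\(.*\)>.*$/\1/p' |
        tr ',' '\n' |
        grep -E '^(RX|TX)CSUM(_IPV6)?$'
}

# Intended use on the affected machine (not run here):
#   ifconfig re0 | csum_flags_left
```

VLAN_HWCSUM is deliberately not matched, since it is a separate capability from the four flags discussed here.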

Comment 50 commit-hook freebsd_committer

2024-08-22 09:30:27 UTC

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=ca63dbc1ebe126e526ded8214a88c034fc92ab68

commit ca63dbc1ebe126e526ded8214a88c034fc92ab68
Author:     Alex Dupre <ale@FreeBSD.org>
AuthorDate: 2024-08-22 09:28:10 +0000
Commit:     Alex Dupre <ale@FreeBSD.org>
CommitDate: 2024-08-22 09:29:49 +0000

    net/realtek-re-kmod: suggest to disable checksum offloading if the network hangs

    PR:             275882

 net/realtek-re-kmod/Makefile    | 1 +
 net/realtek-re-kmod/pkg-message | 7 +++++++
 2 files changed, 8 insertions(+)

Comment 51 Jan Przybylak 2024-10-30 19:39:14 UTC

I can confirm this bug and also the solution. I added all four flags in rc.conf and it has been stable for a couple of days now, it was unusable before.

Interestingly, according to ifconfig, the "RXCSUM" flag is still there, despite "-rxcsum" in rc.conf.

Comment 52 Michel Depeige 2024-11-10 18:17:49 UTC

Same issue here…

re0@pci0:8:0:0:	class=0x020000 rev=0x05 hdr=0x00 vendor=0x10ec device=0x8125 subvendor=0x1043 subdevice=0x87d7
    vendor     = 'Realtek Semiconductor Co., Ltd.'
    device     = 'RTL8125 2.5GbE Controller'
    class      = network
    subclass   = ethernet

Can confirm the workaround works as well, thanks!

However, I noticed that version 1.98 doesn't support RXCSUM6, or at least doesn't enable it by default, hence why the issue isn't there.

Is RXCSUM6/TXCSUM6 something new with 1.99 / 1.100 ?

Comment 53 Alex Dupre freebsd_committer

2024-11-11 09:09:07 UTC

(In reply to Michel Depeige from comment #52)

Yes, checksum offloading on IPv6 was introduced in version 1.99.
