On FreeBSD 14.0-RELEASE-p4 amd64 with a Killer E3000 I have the following problem: the driver hangs after the update from 198.00 to 199.00. DHCP looks good, but any process that tries to access the network simply hangs. After rolling back to 198.00 it works again.
Hello, I have much the same problem as well. When I get to the Steam login dialog in Wine, the whole connection gets stuck; as in Tino's case, DHCP etc. looks fine. Reverting to 198.00 helps for me as well.
Same issue here... ASUS Prime X670-P WiFi motherboard, after upgrade to 1.99 a few ping packets can be exchanged initially after setting the IP address, then nothing.
Tino, are you able to debug the issue? This new version fixed some severe issues for some cards, and introduced new severe issues for others :-(
Is LRO enabled after the update to 1.99 on your card? Can you try disabling it and check if the problem persists?
Created attachment 247259 [details] kdump of 'ktrace curl google.de'
Hello Alex, I already tried to debug the issue, but had no finding yet. I tried to ktrace/kdump a hanging process ('ktrace curl google.de'). I have attached the output to this ticket. Maybe this gives you a better idea of what is going wrong (it is not evident to me)?
P.S.: I also tried https://wiki.freebsd.org/Networking/10GbE/Router#Disabling_LRO_and_TSO without success.
Created attachment 247441 [details] kdump of 'ktrace -i curl www.freebsd.org' I have tried again to debug the issue, but unfortunately it seems this is over my head. I have attached a new trace, this time also tracing the sub-processes. I am not good at reading kdumps, but I have the impression curl calls www.freebsd.org and forever waits for an answer. If anyone has an idea, I am willing to invest more time in this issue.
Created attachment 247468 [details] truss -f curl www.freebsd.org I am not willing to give up on this. I have also attached a truss trace. I am digging through it; nevertheless, any hints are appreciated.
Hi Tino, From the behavior I have seen this is an issue in the new driver. After booting it can exchange a few packets and then stops working. This means that userland traces like you are supplying probably won't give many clues as to what is happening. I have compared the 1.98 and 1.99 sources from https://github.com/alexdupre/rtl_bsd_drv, and there are extensive changes, so it is not easy to find what causes the regression. The best way forward might be to ask the Realtek people who supply the original code, which can be found at https://www.realtek.com/en/component/zoo/category/network-interface-controllers-10-100-1000m-gigabit-ethernet-pci-express-software, to also support FreeBSD 14. -- Martin
Hi Martin, I also have compared the 1.98 and 1.99 sources from https://github.com/alexdupre/rtl_bsd_drv. I even tried some minor changes, but did not manage to get it working. The Realtek site (https://www.realtek.com/en/component/zoo/category/network-interface-controllers-10-100-1000m-gigabit-ethernet-pci-express-software) is confusing. They offer a FreeBSD driver (latest version from 09/2023) for FreeBSD 7 and 8. That absolutely makes no sense to me. I'll try to contact Realtek, maybe they are gonna help if we are lucky. Tino
Something appears to have changed in the kernel .. I can boot .. FreeBSD 15.0-CURRENT #4 main-c3268c23de4: Mon Jan 1 20:17:26 EST 2024 .. but my next snapshot of a build at .. FreeBSD 15.0-CURRENT #8 main-10f2e94acc1: Tue Jan 2 16:46:09 EST 2024 .. (or anything after that) panics with the message 're0 taskq' I used the same module from ports (realtek-re-kmod-199.00_1) in each case.
FreeBSD 14.0-STABLE #0 stable/14-53a984a36 arm64.aarch64
I compiled realtek-re-kmod today. This module appears to load OK. The NIC is recognized OK. But the kernel quickly panics.
dmesg | grep re0
re0: <Realtek PCIe 2.5GbE Family Controller> mem 0xf3000000-0xf300ffff,0xf3010000-0xf3013fff at device 0.0 on pci1
re0: Using Memory Mapping!
re0: Using line-based interrupt
re0: version:1.98.00
---------------------
I switched back to the realtek-re-kmod from the FreeBSD repo. That one appears to work OK.
(In reply to rdunkle from comment #13) From your log it seems you compiled an old 1.98 version, so it's not actually related to this issue that started with 1.99, according to other users.
That dmesg is from the old version, correct. That version runs OK. I did a git pull on ports today and compiled. The new version causes a kernel panic, so I cannot get a dmesg with the new version.
(In reply to rdunkle from comment #15) Not even in /var/log/messages ?
When I boot with 1.99.04 there is a panic and /var/log/messages is empty. When I boot with 1.98 the NICs work.
root@orange:/boot/modules # strings if_re.ko | grep 1.99
1.99.04
root@orange:/boot/modules # strings if_re.ko.save | grep 1.98
1.98.00
Is there something else I can do to get useful information for you?
(In reply to rdunkle from comment #17) I guess that you can obtain something when you load the module while the system is running. Remove the module from your loader.conf then load the module manually later with: kldload /boot/modules/if_re.ko then the panic should be documented in /var/log/messages.
The kldload completes. In about 2 seconds the system reboots. The version info is not written to the log and the previous log entries vanish.
Jan 10 10:30:50 orange kernel: , 1061.
Jan 10 10:30:50 orange ntpd[1008]: ntpd exiting on signal 15 (Terminated)
Jan 10 10:30:50 orange kernel: .
Jan 10 10:30:51 orange kernel: , 736.
Jan 10 10:30:51 orange syslogd: exiting on signal 15
Jan 10 10:32:15 orange syslogd: kernel boot file is /boot/kernel/kernel
Jan 10 10:32:15 orange kernel: ---<<BOOT>>---
Jan 10 10:32:15 orange kernel: Copyright (c) 1992-2023 The FreeBSD Project.
Jan 10 10:32:15 orange kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jan 10 10:32:15 orange kernel: The Regents of the University of California. All rights reserved.
Jan 10 10:32:15 orange kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
Jan 10 10:32:15 orange kernel: FreeBSD 14.0-STABLE #0 stable/14-53a984a36: Mon Jan 8 12:46:16 EET 2024
Jan 10 10:32:15 orange kernel: root@sky22.smallcatbrain.com:/usr/obj/usr/src-stable-14/arm64.aarch64/sys/ GENERIC arm64
I can confirm this bug. Everything seems to work, but no traffic goes through the interface. I have custom built 14.0 kernel and realtek-re-kmod-199.00_1 built from port. Tried with different ifconfig options and got it working at some point of time, but it was not stable. Also, repeating the same sequence of disabling offload options did not give the same results. No error messages. Driver loads OK, and ifconfig shows the status active. Devices are: device = 'RTL8125 2.5GbE Controller' and device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller' None of these is working with this driver. What is more interesting is that on another machine with no Realtek devices, loading this driver disables all traffic on another interface(bge), not related to Realtek.
I also encountered this issue. 198 works fine, and 199 stops working after exchanging a few packets such as DHCP and IPv6 RA. My devices are:
re0@pci0:2:0:0: class=0x020000 rev=0x15 hdr=0x00 vendor=0x10ec device=0x8168 subvendor=0x103c subdevice=0x806a
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class = network
    subclass = ethernet
re0@pci0:4:0:0: class=0x020000 rev=0x15 hdr=0x00 vendor=0x10ec device=0x8161 subvendor=0x10ec subdevice=0x8168
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class = network
    subclass = ethernet
re1@pci0:5:0:0: class=0x020000 rev=0x0e hdr=0x00 vendor=0x10ec device=0x8168 subvendor=0x17aa subdevice=0x32e1
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class = network
    subclass = ethernet
I've had to recompile version 198 for myself after rebooting with the GENERIC re0 driver, after the update caused ~3hrs of downtime (it took 2hrs to get a console on my server). I can definitely confirm this is an issue, because it made my server unreachable as well. What I noticed is that, upon first rebooting with the faulty driver, the server responded to 3 pings (IPv6) and then went completely silent. Looking forward to a fix here, because I now can't install the latest driver from realtek-re-kmod.
Same problem with version 199.00_1. Previous versions worked as intended.
FreeBSD home.local 13.2-RELEASE-p10 FreeBSD 13.2-RELEASE-p10 GENERIC amd64
re0@pci0:1:0:0: class=0x020000 rev=0x15 hdr=0x00 vendor=0x10ec device=0x8168 subvendor=0x10ec subdevice=0x0123
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
    class = network
    subclass = ethernet
/boot/loader.conf
if_re_load="YES"
if_re_name="/boot/modules/if_re.ko"
hw.re.max_rx_mbuf_sz="2048"
No feedback yet?
Created attachment 249124 [details] 0001-net-realtek-re-kmod-downgrade-to-198.00.patch I suggest downgrading this port to 198 until the issue is resolved.
(In reply to Koichiro Iwao from comment #24) Unfortunately 1.98 was broken for another set of people/cards (see for example https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=274995 that was reported by many people), reverting is not a good solution.
(In reply to Alex Dupre from comment #25) I see. Then we probably need to create another port for 198. At the very least, 198 needs to be installable via pkg install for the people for whom 199 doesn't work.
In addition, using the default driver instead of this port is not a solution either. It has a watchdog timeout issue, so using 198 is the only solution so far. They really need 198.
(In reply to Koichiro Iwao from comment #27) I know, that was the main reason to create this port. I have no objections if you want to restore the previous version as a separate port.
Created attachment 249128 [details] 0001-net-realrek-re-kmod198-add-port-for-198-version.patch Here it is. Feel free to modify it if you think it necessary. It should also be added to quarterly, because the quarterly branch has already been updated to 199.
(In reply to Koichiro Iwao from comment #29) I think you can drop the `PORTREVISION=3` from the new port. I'm time limited, you are welcome to commit (and take the maintainership of) this new port.
(In reply to Koichiro Iwao from comment #27) 4 days of uptime with no watchdog timeout so far. What FreeBSD version are you running?
# uname -a
FreeBSD home.local 13.2-RELEASE-p10 FreeBSD 13.2-RELEASE-p10 GENERIC amd64
# ifconfig re0
re0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=82099<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
    ether 7c:83:34:b1:8f:8f
    inet 192.168.15.250 netmask 0xffffff00 broadcast 192.168.15.255
    inet6 fe80::7e83:34ff:feb1:8f8f%re0 prefixlen 64 scopeid 0x1
    inet6 2804:7f0:ba41:1e60:7e83:**** prefixlen 64 autoconf
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
    nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
# netstat -db -I re0
Name    Mtu Network       Address              Ipkts Ierrs Idrop      Ibytes     Opkts Oerrs      Obytes  Coll  Drop
re0    1500 <Link#1>      7c:83:34:b1:8f:8f 58536613     0     0 38572546833 101711299     0 99083129652     0   144
(In reply to Victor Volpe from comment #31) Victor, there is an entire bug dedicated to the watchdog timeout (bug #166724), I know because I was a victim of it. Although it recently disappeared for me — which I only managed to find out through a mis-compiled v198 of mine that didn't work and the built-in re0 loaded instead, which I only noticed weeks later by testing rebooting for the pf rule changes I made —, I don't want to risk going back to it on a bare metal, production server like mine is. Cheers, László
(In reply to László Károlyi from comment #32) Yes, I know that, mate. I was affected too on 12-RELEASE, and I've been using the kmod driver since version 196.04. Now, with my system upgraded to 13.2, and after the version 199 bug, I have had no more watchdog timeouts since downgrading to the default driver.
(In reply to Victor Volpe from comment #33) Welp, that makes two of us then. Maybe more testing is in order for the default driver.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/ports/commit/?id=b770c919121526ebbf61b81fd6b832619319df60 commit b770c919121526ebbf61b81fd6b832619319df60 Author: Koichiro Iwao <meta@FreeBSD.org> AuthorDate: 2024-03-13 08:52:50 +0000 Commit: Koichiro Iwao <meta@FreeBSD.org> CommitDate: 2024-03-14 02:03:06 +0000 net/realrek-re-kmod198: add port for 198 version as a workaround for bug 275882. This port can be retired when the bug is resolved completely. Many people need the 198 version because of the hang-up issue. Another set of people need 199 because of another issue. This port is needed to satisfy both sets of people until complete until a complete solution for 275882 is found. PR: 275882 Sponsored by: Cybertrust Japan net/Makefile | 1 + net/realtek-re-kmod198/Makefile (new) | 23 +++++++++++++++++++++++ net/realtek-re-kmod198/distinfo (new) | 3 +++ net/realtek-re-kmod198/pkg-descr (new) | 25 +++++++++++++++++++++++++ net/realtek-re-kmod198/pkg-message (new) | 22 ++++++++++++++++++++++ 5 files changed, 74 insertions(+)
A commit in branch 2024Q1 references this bug: URL: https://cgit.FreeBSD.org/ports/commit/?id=f967592923a21e7b44c11a45f7a241439a97f163 commit f967592923a21e7b44c11a45f7a241439a97f163 Author: Koichiro Iwao <meta@FreeBSD.org> AuthorDate: 2024-03-13 08:52:50 +0000 Commit: Koichiro Iwao <meta@FreeBSD.org> CommitDate: 2024-03-14 02:04:19 +0000 net/realrek-re-kmod198: add port for 198 version as a workaround for bug 275882. This port can be retired when the bug is resolved completely. Many people need the 198 version because of the hang-up issue. Another set of people need 199 because of another issue. This port is needed to satisfy both sets of people until complete until a complete solution for 275882 is found. PR: 275882 Sponsored by: Cybertrust Japan (cherry picked from commit b770c919121526ebbf61b81fd6b832619319df60) net/Makefile | 1 + net/realtek-re-kmod198/Makefile (new) | 23 +++++++++++++++++++++++ net/realtek-re-kmod198/distinfo (new) | 3 +++ net/realtek-re-kmod198/pkg-descr (new) | 25 +++++++++++++++++++++++++ net/realtek-re-kmod198/pkg-message (new) | 22 ++++++++++++++++++++++ 5 files changed, 74 insertions(+)
(In reply to Alex Dupre from comment #30) Thanks, I have added the port. Guys, the temporary workaround until the complete resolution is to use net/realtek-re-kmod198 instead.
The temporary workaround with net/realtek-re-kmod198 works in my case. That confirms the hardware is OK and this is a driver bug. Still waiting for the new driver, net/realtek-re-kmod, to be fixed.
A little more data. FreeBSD 14.1 release, arm64. I did a pkg fetch of the Realtek driver 1.99.04. The system panics at boot during DHCPDISCOVER:
Starting dhclient.
DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 5
panic: driver error: _bus_dma_dflt_lock called
cpuid = 0
Created attachment 251191 [details] console message during panic
I've ported the new driver version 1.100. I'd like to know if it fixes the issue that many of you are experiencing with the 1.99 version. I'd be glad if you could try building the port replacing the `GH_TAGNAME` variable with the following commits:
- ea4ed1e version with all patchset applied
- eb00816 version with minimal patchset applied
1. Change the variable in the makefile
2. Run `make makesum`
3. Build the port as usual
4. Let me know if any of them work
Thanks!
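Steps 1 to 3 above can be sketched roughly as follows. This is only an illustration under stated assumptions: `set_gh_tagname` is a hypothetical helper (not part of the ports tree), the port Makefile is assumed to contain a line beginning with `GH_TAGNAME=`, and the ports tree is assumed to live in /usr/ports. The build commands are left as comments because "as usual" differs between setups.

```shell
# Hypothetical helper: replace the GH_TAGNAME assignment in a port Makefile
# with the given commit. It writes through a temporary file so that it does
# not depend on the differing "sed -i" syntaxes of BSD and GNU sed.
# A single space is used after "=" for simplicity; make ignores it.
set_gh_tagname() {
    makefile=$1
    commit=$2
    tmp="${makefile}.tmp.$$"
    sed "s/^GH_TAGNAME=.*/GH_TAGNAME= ${commit}/" "$makefile" > "$tmp" &&
        mv "$tmp" "$makefile"
}

# Possible usage (paths and build target are assumptions, run as root):
#   cd /usr/ports/net/realtek-re-kmod
#   set_gh_tagname Makefile ea4ed1e    # or eb00816 for the minimal patchset
#   make makesum                       # refresh distinfo for the new commit
#   make install clean                 # one common way to build and install
```

Keeping a copy of the working 1.98 module (as done earlier in this thread with if_re.ko.save) before installing a test build makes it easier to roll back if the network hangs again.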
I built both versions. I see the same panic-- panic: driver error: _bus_dma_dflt_lock called
(In reply to Alex Dupre from comment #41) Unfortunately, I can confirm that version 1.100 still doesn't work on my production server; I had to go back to version 1.98. Quick info about the installed, failing package:
realtek-re-kmod-1100.00
Name           : realtek-re-kmod
Version        : 1100.00
Installed on   : Sat Jun 22 16:41:20 2024 CEST
Origin         : net/realtek-re-kmod
Architecture   : FreeBSD:14:amd64
Prefix         : /usr/local
Categories     : net kld
Licenses       : BSD4CLAUSE
Maintainer     : ale@FreeBSD.org
WWW            : https://github.com/alexdupre/rtl_bsd_drv
Comment        : Kernel driver for Realtek PCIe Ethernet Controllers
Annotations    :
    FreeBSD_version: 1400097
    build_timestamp: 2024-06-15T12:25:56+0000
    built_by       : poudriere-git-3.4.1-30-g79e3edcd
    port_checkout_unclean: no
    port_git_hash  : 38b614919
    ports_top_checkout_unclean: no
    ports_top_git_hash: ffe948747
    repo_type      : binary
    repository     : FreeBSD
-----
The kernel driver says upon booting without the realtek driver:
Chip rev. 0x54000000
MAC rev. 0x00100000
Hope this helps. This is on 14.1-RELEASE-p1. Version 1.98 still works flawlessly.
I'm having a similar issue. I recently got a PC with the NIC below:
re0@pci0:7:0:0: class=0x020000 rev=0x05 hdr=0x00 vendor=0x10ec device=0x8125 subvendor=0x1043 subdevice=0x87d7
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8125 2.5GbE Controller'
    class = network
    subclass = ethernet
I'm using this version of the driver: realtek-re-kmod-1100.00
It was working perfectly fine until I decided to enable IPv6 on my home network. Once the interface acquires IPv6 addresses it completely stops sending/receiving traffic. Disabling IPv6 makes it work again. Is that the same scenario you guys have? After playing around with some NIC settings I found out that disabling checksum offload fixes the problem for me. This is what I'm currently doing:
ifconfig_re0="-rxcsum6 -txcsum6 -rxcsum -txcsum DHCP"
This is the state of my NIC:
re0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
    options=2518<VLAN_MTU,VLAN_HWTAGGING,TSO4,LRO,WOL_MAGIC>
After disabling TX/RX checksum offloading it's working reliably with IPv6.
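For anyone who wants to try the same workaround on a running system before touching rc.conf, here is a small sketch. `print_csum_workaround` is a hypothetical helper that only prints the commands for review instead of running them; the interface name re0 and the use of DHCP are taken from the comment above and are assumptions for any other setup. The `sysrc` line would overwrite an existing `ifconfig_re0` value, so adapt it if the interface has a static configuration.

```shell
# Flags from the comment above that turn off IPv4 and IPv6 checksum offload.
csum_off_flags="-rxcsum6 -txcsum6 -rxcsum -txcsum"

# Hypothetical helper: print (do not run) the commands for one interface.
# The first line applies the flags immediately; the second makes them
# persistent in /etc/rc.conf, assuming the interface is configured via DHCP.
print_csum_workaround() {
    ifname=$1
    echo "ifconfig ${ifname} ${csum_off_flags}"
    echo "sysrc ifconfig_${ifname}=\"${csum_off_flags} DHCP\""
}

print_csum_workaround re0
```

The printed commands need to be run as root. Applying the flags over the affected interface itself can drop a remote session if the link is already hung, so a local console is the safer place to test.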
Turning off checksum offload fixes it for me as well on 15-CURRENT.
Using the latest driver realtek-re-kmod-1100.00 with the following hardware:
re0@pci0:10:0:0: class=0x020000 rev=0x15 hdr=0x00 vendor=0x10ec device=0x8168 subvendor=0x1849 subdevice=0x8168
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller'
    class = network
    subclass = ethernet
I stumbled across this bug report after updating my system (to stable/14 1ff3118d72b1), but in an even odder way: everything worked fine until I attempted to use `rdate -p time.google.com`. That resolves to an IPv6 address (for me), which did matter; using an IPv4 server was fine. When it failed, `rdate` would periodically print that it did not receive enough valid responses. But more importantly for this bug, it reliably hung networking. No kernel panic, just immediate loss of network. You don't need to wait for `rdate` either; breaking out immediately with CTRL-C still triggers the issue. Any remote shells I had were immediately non-responsive, as you would expect when the network dies. I added "-rxcsum6 -txcsum6 -rxcsum -txcsum" in /etc/rc.conf for re0 and no longer have issues.
Can you check if all 4 flags are required to fix the issue (-rxcsum6 -txcsum6 -rxcsum -txcsum) or just a subset? I'd like to add such information to the port's pkg-message.
I tried disabling the offloads in pairs, and it always fails unless ALL 4 are disabled.
On stable/14 I tried several combinations. Like comment #48, I found I needed all 4. If I request just "-rxcsum -txcsum", `ifconfig` will still show RXCSUM in the interface options.
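To repeat this kind of test systematically rather than by hand, the combinations can be enumerated. The sketch below is only an illustration: `list_csum_flag_subsets` is a hypothetical helper that prints every non-empty combination of the four flags, and the interface name re0 in the usage comment is an assumption.

```shell
# Hypothetical helper: print every non-empty combination of the four
# checksum-offload flags, one combination per line (15 lines in total).
# Each bit of the counter i selects one flag.
list_csum_flag_subsets() {
    i=1
    while [ "$i" -le 15 ]; do
        combo=""
        if [ $((i & 1)) -ne 0 ]; then combo="$combo -rxcsum"; fi
        if [ $((i & 2)) -ne 0 ]; then combo="$combo -txcsum"; fi
        if [ $((i & 4)) -ne 0 ]; then combo="$combo -rxcsum6"; fi
        if [ $((i & 8)) -ne 0 ]; then combo="$combo -txcsum6"; fi
        echo "${combo# }"          # strip the leading space
        i=$((i + 1))
    done
}

# Possible manual test loop (run as root, one combination at a time):
#   ifconfig re0 rxcsum txcsum rxcsum6 txcsum6   # re-enable everything first
#   ifconfig re0 <one line of list_csum_flag_subsets output>
#   ...then generate IPv4 and IPv6 traffic and watch whether the link hangs
list_csum_flag_subsets
```

Re-enabling all four offloads between trials keeps each combination independent of the previous one; if the interface is already hung, a reboot or module reload may be needed before the next trial.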
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/ports/commit/?id=ca63dbc1ebe126e526ded8214a88c034fc92ab68 commit ca63dbc1ebe126e526ded8214a88c034fc92ab68 Author: Alex Dupre <ale@FreeBSD.org> AuthorDate: 2024-08-22 09:28:10 +0000 Commit: Alex Dupre <ale@FreeBSD.org> CommitDate: 2024-08-22 09:29:49 +0000 net/realtek-re-kmod: suggest to disable checksum offloading if the network hangs PR: 275882 net/realtek-re-kmod/Makefile | 1 + net/realtek-re-kmod/pkg-message | 7 +++++++ 2 files changed, 8 insertions(+)
I can confirm this bug and also the solution. I added all four flags in rc.conf and it has been stable for a couple of days now, it was unusable before. Interestingly, according to ifconfig, the "RXCSUM" flag is still there, despite "-rxcsum" in rc.conf.
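Since two commenters saw RXCSUM still listed after requesting "-rxcsum", a quick way to see which checksum capabilities ifconfig still reports may be useful. The sketch below is a hypothetical helper, not part of the driver or the port; it only parses the `options=` line of ifconfig output and says nothing about what the hardware actually does.

```shell
# Hypothetical helper: read ifconfig output on stdin and print any checksum
# offload capabilities still listed in the "options=" line, one per line.
# Prints nothing (and returns non-zero) when none of them are listed.
csum_flags_enabled() {
    grep -E '^[[:space:]]*options=' |
        sed -e 's/.*<\(.*\)>.*/\1/' |
        tr ',' '\n' |
        grep -E '^(RXCSUM|TXCSUM|RXCSUM_IPV6|TXCSUM_IPV6)$'
}

# Possible usage on a live system (interface name re0 is an assumption):
#   ifconfig re0 | csum_flags_enabled
```

The match is anchored on whole flag names, so VLAN_HWCSUM and the `nd6 options=` line are not reported.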
Same issue here…
re0@pci0:8:0:0: class=0x020000 rev=0x05 hdr=0x00 vendor=0x10ec device=0x8125 subvendor=0x1043 subdevice=0x87d7
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8125 2.5GbE Controller'
    class = network
    subclass = ethernet
I can confirm the workaround works as well, thanks! However, I noticed that version 1.98 doesn't support RXCSUM6, or at least doesn't enable it by default, which would explain why the issue isn't there. Is RXCSUM6/TXCSUM6 something new with 1.99 / 1.100?
(In reply to Michel Depeige from comment #52) Yes, checksum offloading on IPv6 was introduced in version 1.99.