Summary: | ix0: Watchdog timeout | ||
---|---|---|---|
Product: | Base System | Reporter: | Jiri <silence> |
Component: | kern | Assignee: | Kevin Bowling <kbowling> |
Status: | Closed Unable to Reproduce | ||
Severity: | Affects Only Me | CC: | denis, kbowling, krzysztof.galazka, net |
Priority: | --- | Keywords: | IntelNetworking, needs-qa |
Version: | 12.1-RELEASE | ||
Hardware: | amd64 | ||
OS: | Any | ||
See Also: | https://reviews.freebsd.org/D21712 | ||
Description
Jiri
2020-01-20 08:47:59 UTC
(In reply to Jiri from comment #0)
Could you please check whether applying this patch, https://reviews.freebsd.org/D21712, has any influence? I would like to rule out that the watchdog timeouts are false positives.

(In reply to Krzysztof Galazka from comment #1)
Thank you, O.K., I'll do it at night. The router has now been running for two days under the same traffic conditions (no reboots, no config changes) and no ix0 timeouts have appeared.
Jiri

I have applied the patch. No timeouts or messages like "queue can't be marked as hung if interface is down" have appeared. Next I'll try a switch port shutdown and a traffic torture test. I'll let you know if anything happens.
Jiri

I tried switching the optical link to my ix0 off and on by manually removing the fibers. The kernel did not detect any outage: there was no ix0 down/up message in the log, although the outage lasted about 7 seconds according to the switch. No errors appear in the log, and the system has been running for about 5 days with the recommended patch. Strange, but it looks fully operable.

Two outages were observed. No kernel message, no log event. ix0 stopped communicating while still looking up. ifconfig up/down resolved the issue. Maybe it is a bad network card, if nobody else has this problem?

This looks like 235524 to me. The igb interface will not survive iperf3 -t 300 for me.

Yes, I agree. I have added a second X520 card; it carries very low traffic (a few Mbit/s) and no problem has been observed there. This timeout behavior probably depends on traffic, i.e. high traffic = problem.

Some new information. The kernel is patched to P3. The problem may not be in the Intel driver. I replaced the Intel NIC with a Mellanox ConnectX4 NIC hoping to solve the traffic outages. A cron script is running that tests connectivity to the gateway and takes the interface down and up.
Log attached:

Fri Apr 10 19:57:13 CEST 2020 interface ix0 restart
Fri Apr 10 20:38:13 CEST 2020 interface ix0 restart
Sat Apr 11 17:45:13 CEST 2020 interface ix0 restart
Sat Apr 11 20:00:13 CEST 2020 interface ix0 restart
Sat Apr 11 20:30:13 CEST 2020 interface ix0 restart
Sun Apr 12 19:16:13 CEST 2020 interface ix0 restart
Sun Apr 12 20:30:13 CEST 2020 interface ix0 restart
Sun Apr 26 00:27:13 CEST 2020 interface mce0 restart
Sun Apr 26 04:48:13 CEST 2020 interface mce0 restart
Sun Apr 26 11:12:13 CEST 2020 interface mce0 restart
Wed Apr 29 21:27:13 CEST 2020 interface mce0 restart
Wed Apr 29 21:33:13 CEST 2020 interface mce0 restart

After the network card was changed there were no problems from Apr 12 to Apr 26. But since Apr 26 the Mellanox card shows a different symptom: traffic does not stop, but 80% packet loss appears. Taking the interface down and up solves it, as in the Intel case. Other servers in my network keep connectivity to the gateway, so there are no connectivity or switch issues. Traffic peaks at about 1.3 Gbit/s on the NIC. On the Mellanox NIC this log event appears: "arpresolve: can't allocate llinfo for xxx.xxx.xxx.xxx on mce0"
Jiri

Please reopen if you think there is still an issue with ixgbe.
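The thread mentions a cron script that tests connectivity to the gateway and takes the interface down and up, but the script itself is not shown. A minimal sketch of such a watchdog, assuming POSIX sh and the stock `ping` and `ifconfig` utilities, might look like the following. The function name `check_and_restart`, the variables `PING_CMD`, `IFCONFIG_CMD` and `BOUNCE_DELAY`, the default log path and the probe count are all assumptions made for illustration; only the log line format is taken from the log quoted in the thread.

```sh
#!/bin/sh
# Hypothetical cron watchdog (not the reporter's actual script):
# ping the gateway and bounce the interface if no reply comes back.

# Commands are overridable so the logic can be dry-run without root.
PING_CMD=${PING_CMD:-ping}
IFCONFIG_CMD=${IFCONFIG_CMD:-ifconfig}

check_and_restart() {
    ifname=$1
    gateway=$2
    logfile=${3:-/var/log/ifrestart.log}   # assumed log location

    # Send three quiet probes; ping exits 0 if at least one reply arrives.
    if "$PING_CMD" -c 3 -q "$gateway" >/dev/null 2>&1; then
        return 0
    fi

    # Gateway unreachable: log in the same format as the thread's log,
    # then take the interface down and back up.
    echo "$(date) interface $ifname restart" >> "$logfile"
    "$IFCONFIG_CMD" "$ifname" down
    sleep "${BOUNCE_DELAY:-2}"
    "$IFCONFIG_CMD" "$ifname" up
}

# Run only when an interface and a gateway are given on the command line.
if [ "$#" -ge 2 ]; then
    check_and_restart "$@"
fi
```

Saved under a placeholder path such as `/usr/local/sbin/ifwatch.sh`, it could be run once a minute from root's crontab, for example `* * * * * /usr/local/sbin/ifwatch.sh ix0 192.0.2.1` (the path and gateway address are placeholders). Note that a check based on three probes catches a complete outage reliably but may miss partial packet loss like the 80% loss reported on mce0.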