Bug 17965

Summary: vr (MII-bus version in 4.0 ONLY) driver lock-up problems
Product: Base System Reporter: locke <locke>
Component: kern Assignee: silby
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 4.0-STABLE   
Hardware: Any   
OS: Any   

Description locke 2000-04-13 04:20:01 UTC
Moderate to heavy traffic load on the vr card can periodically cause
the network to completely freeze up (all connections die, everything
unreachable with ping, etc) for about 10-30 secs.  Also, the following
message appears in the system log:
vr0: watchdog timeout

If the network stays down for longer than 10 secs or so, multiple
copies of the above message appear in the system log (I got up to 20+
once).

I have also seen the "rx error: unknown rx error" message
in syslog (see kern/17866), but it is much less frequent than the
"watchdog timeout" message.
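
To quantify how often the lock-up occurs, the watchdog messages can be counted per interface. The following is only a minimal sketch: the function name is hypothetical, and it assumes the message has the "vr0: watchdog timeout" form quoted above (the colon is treated as optional, since a later comment quotes it without one).

```python
import re

# Matches the driver message quoted in this PR, e.g. "vr0: watchdog timeout".
# The colon is optional because the message is also quoted without it.
WATCHDOG_RE = re.compile(r"\b(vr\d+):? watchdog timeout")


def count_watchdog_timeouts(lines):
    """Return a dict mapping interface name to its number of watchdog timeouts.

    `lines` is any iterable of log lines (a list, or an open file object).
    """
    counts = {}
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            iface = match.group(1)
            counts[iface] = counts.get(iface, 0) + 1
    return counts
```

It can be fed an open log file, for example the system log (assumed here to be /var/log/messages; adjust to wherever syslog writes kernel messages on your machine).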

Fix: 

Get to a pre-MII-bus version by doing the following:
Back out /src/sys/pci/if_vr.c to version 1.17
Back out /src/sys/pci/if_vrreg.h to version 1.6
Recompile the kernel and reboot. (There might be syslog messages about
miibus conflicts; I haven't tried to track them down yet, but they are
probably due to my not removing miibus from my kernel config.)
How-To-Repeat:

Any sort of moderate to heavy traffic load (Samba, FTP, etc) can cause
the card to lock up for a relatively short period of time (10-20 secs).
Comment 1 locke 2000-07-28 02:51:42 UTC
Updated (for -STABLE) fix available at http://www.mcs.net/~locke/vrfix/.
Comment 2 Sheldon Hearn freebsd_committer freebsd_triage 2000-07-28 09:53:36 UTC
Responsible Changed
From-To: freebsd-bugs->wpaul

Bill, this one references a patch.
Comment 3 fp 2001-01-18 01:44:47 UTC
Release:
 4.1.1-RELEASE, 4.2-RELEASE


Environment:
 GENERIC and custom kernels


Description:
 We are experiencing much the same problem as kern/17695
 on 5 different systems with the same hardware
 configuration:
  ASUS P3C2000
  PIII Slot1 600MHz<=...<=800MHz
  Adaptec ASC29160 + IBM SCSI3 HD
  ATI MACH64
  D-LINK DFE530TX
  (3 different revisions with the 3 different chips
  VT3043, VT86C100A, VT6102)

 The NIC locks up for anywhere from a few minutes to indefinitely,
 with the 'vr0 watchdog timeout' kernel message.

 Several other systems with the same hardware config EXCEPT
 the motherboard (which is an ASUS CUV4X) do not seem to have
 the problem. I do not think it is due to the i820 memory hub
 bug, as the machines do not crash; only the NIC locks up.


How-To-Repeat:
 Some loaded machines run a few days without problems, then the
 message appears several times a day. Some other unloaded
 machines lock up at the first HTTP request.


Fix:
 The pre-MII-bus driver seems to fix the problem.
 miibus must be disabled in the kernel, and the miibus and vr modules
 must be removed from the modules directory to avoid error messages
 at startup and the loading of unused modules.

 If the lock-up is permanent, only a restart fixes the
 problem (we did not try ifconfig delete followed by reconfiguring the
 interface, but a media change or down/up does not fix it).


Francois Pollet
Perceval Belgium
fp@perceval.net
Comment 4 fp 2001-02-24 14:06:14 UTC
Release:
4.2-RELEASE

Description:
I need to correct the description in my previous followup.

The DFE530TX with the VT6102 chipset is NOT broken!
We replaced the DFE530TX on 5 machines that had the VT3043 and VT86C100A
chipsets with the new one, and we have not had any trouble for several
weeks now.

It seems that the problem could be caused by the phy driver, which is
amphy for the DFE530TX with the VT3043 and VT86C100A chipsets.
ukphy is used for the NIC with the VT6102 chipset.
Moreover, the problem happened on several ASUS motherboards
(at least P3C2000, P2B-B, MEW).
Errors in the kernel log:
 'vr0 watchdog timeout'
 'vr0: rx error: unknown rx error' (!? not logged in syslog)

BAD NIC RELEASE (< Rev A3):
vr0: <VIA VT3043 Rhine I 10/100BaseTX> port 0xd400-0xd47f mem
0xe2000000-0xe200007f irq 10 at device 11.0 on pci1
vr0: Ethernet address: 00:50:ba:08:be:b0
miibus0: <MII bus> on vr0
amphy0: <DM9101 10/100 media interface> on miibus0
amphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

GOOD NIC RELEASE (Rev A3)
vr0: <VIA VT6102 Rhine II 10/100BaseTX> port 0xb400-0xb4ff mem
0xcd000000-0xcd0000ff irq 10 at device 11.0 on pci2
vr0: Ethernet address: 00:50:ba:6e:66:bd
miibus0: <MII bus> on vr0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
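
To check which PHY driver a given NIC ends up with, the boot messages can be scanned for the miibus and PHY probe lines. This is a rough sketch with hypothetical helper names; it assumes the probe lines have exactly the shape shown in the two excerpts above.

```python
import re

# Matches a bus line such as "miibus0: <MII bus> on vr0".
BUS_RE = re.compile(r"^(miibus\d+): <MII bus> on (\w+\d+)")
# Matches a PHY probe line such as
# "amphy0: <DM9101 10/100 media interface> on miibus0".
PHY_RE = re.compile(r"^(\w+?)(\d+): <([^>]*)> on (miibus\d+)")


def phy_drivers(dmesg_lines):
    """Map each NIC (e.g. "vr0") to the PHY driver attached to its miibus."""
    bus_to_nic = {}
    result = {}
    for line in dmesg_lines:
        bus = BUS_RE.match(line)
        if bus:
            # Remember which NIC owns this miibus instance.
            bus_to_nic[bus.group(1)] = bus.group(2)
            continue
        phy = PHY_RE.match(line)
        if phy and phy.group(4) in bus_to_nic:
            # Record the PHY driver name without its unit number.
            result[bus_to_nic[phy.group(4)]] = phy.group(1)
    return result
```

Run against the two excerpts above, a NIC reported with amphy would be one of the cards that showed the lock-up here, while ukphy corresponds to the Rev A3 card that did not.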

Francois
Comment 5 silby freebsd_committer freebsd_triage 2002-05-17 18:44:22 UTC
Responsible Changed
From-To: wpaul->silby
Comment 6 silby freebsd_committer freebsd_triage 2002-08-22 06:04:38 UTC
State Changed
From-To: open->closed

This problem should be solved in if_vr.c as of rev 1.26.2.10. 

If it can still be reproduced with that revision of the driver, 
I'll be happy to reopen the PR.