Bug 171121 - [bge] bge driver not working with BCM5719 (HP Proliant DL 360 G8)
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 9.1-PRERELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
Reported: 2012-08-27 20:50 UTC by Anders Nordby
Modified: 2017-12-31 22:27 UTC
Description Anders Nordby freebsd_committer freebsd_triage 2012-08-27 20:50:02 UTC
I'm having lots of difficulties with BCM5719, which is the default
network card of HP Proliant DL 360 G8 servers. I can get a few ping
replies before I get a couple of these:

bge0: watchdog timeout -- resetting
bge0: watchdog timeout -- resetting

Then everything hangs. Can not log in using ssh.

I'm running: FreeBSD-9.0-RELENG_9-20120701-JPSNAP-amd64

Info about the NIC:

# devinfo -rv | grep phy
                brgphy0 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=1
                brgphy1 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=2
                brgphy2 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=3
                brgphy3 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=4
# grep bge /var/run/dmesg.boot
bge0: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bf0000-0xf6bfffff,0xf6be0000-0xf6beffff,0xf6bd0000-0xf6bdffff irq 32 at device 0.0 on pci3
bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus0: <MII bus> on bge0
bge0: Ethernet address: 2c:76:8a:54:08:14
bge1: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bc0000-0xf6bcffff,0xf6bb0000-0xf6bbffff,0xf6ba0000-0xf6baffff irq 36 at device 0.1 on pci3
bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus1: <MII bus> on bge1
bge1: Ethernet address: 2c:76:8a:54:08:15
bge2: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6b90000-0xf6b9ffff,0xf6b80000-0xf6b8ffff,0xf6b70000-0xf6b7ffff irq 32 at device 0.2 on pci3
bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus2: <MII bus> on bge2
bge2: Ethernet address: 2c:76:8a:54:08:16
bge3: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6b60000-0xf6b6ffff,0xf6b50000-0xf6b5ffff,0xf6b40000-0xf6b4ffff irq 36 at device 0.3 on pci3
bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus3: <MII bus> on bge3
bge3: Ethernet address: 2c:76:8a:54:08:17
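
For readers comparing controllers, the CHIP ID lines in the dmesg output above can be decoded with a small sketch like the following. The shift widths are an assumption inferred from the values printed in the log (0x05719001 yielding 0x5719 and 0x57190), and `decode_chip_id` is a hypothetical helper name, not a driver function.

```python
# Sketch: split a bge(4) chip ID into the ASIC and chip revisions shown in dmesg.
# Assumption: the shift widths are inferred from the logged values, not taken
# from the driver source.

def decode_chip_id(chip_id: int) -> dict:
    """Return the ASIC and chip revision fields derived from a 32-bit chip ID."""
    return {
        "asic_rev": chip_id >> 12,  # drop the low three hex digits
        "chip_rev": chip_id >> 8,   # drop the low two hex digits
    }


fields = decode_chip_id(0x05719001)
print(f"ASIC REV {fields['asic_rev']:#x}; CHIP REV {fields['chip_rev']:#x}")
```

The same helper applied to 0x05720000 gives the BCM5720 values, which makes it easy to tell the two controller families apart in mixed logs.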

Searching other bug reports and posts, I've tried:

hw.bge.allow_asf="0"
hw.pci.enable_msi="0"

But it didn't help. Any ideas?

If I don't use the loader.conf settings above, I also get (before the
watchdog timeouts):

bge0: 2 link states coalesced
bge0: 2 link states coalesced
bge0: 2 link states coalesced
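
To compare machines or driver versions, the symptoms above can be tallied from a saved log. This is a minimal sketch: the message patterns are copied from the log lines in this report, while the function name and the event labels are illustrative assumptions.

```python
import re
from collections import Counter

# Message patterns taken from the log lines quoted in this report.
EVENT_PATTERNS = {
    "watchdog_timeout": re.compile(r"\b(bge\d+): watchdog timeout"),
    "link_coalesced": re.compile(r"\b(bge\d+): \d+ link states coalesced"),
    "link_change": re.compile(r"\b(bge\d+): link state changed to (?:UP|DOWN)"),
}


def summarize_bge_events(lines):
    """Count bge(4) trouble indicators per interface in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name, pattern in EVENT_PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), name)] += 1
    return counts


sample = [
    "bge0: 2 link states coalesced",
    "bge0: 2 link states coalesced",
    "bge0: 2 link states coalesced",
    "bge0: watchdog timeout -- resetting",
    "bge0: watchdog timeout -- resetting",
]
for (nic, event), count in sorted(summarize_bge_events(sample).items()):
    print(nic, event, count)
```

In practice the lines would come from a file such as /var/log/messages; syslog prefixes in front of the interface name do not prevent a match.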
Comment 1 Anders Nordby freebsd_committer freebsd_triage 2012-08-27 21:47:58 UTC
Hi,

In this PR I copy data from the mail thread at
http://lists.freebsd.org/pipermail/freebsd-stable/2012-July/thread.html#68720

Just figured it's better to follow up on this with a PR.

My last response was:

On Wed, Jul 04, 2012 at 06:01:36pm -0700, YongHyeon PYUN wrote:
> There is a WIP version at the following URL.
> http://people.freebsd.org/~yongari/bge/if_bge.c
> http://people.freebsd.org/~yongari/bge/if_bgereg.h
> http://people.freebsd.org/~yongari/bge/brgphy.c
>
> I have a couple of positive feedbacks but it seems it still has
> some issues. Let me know whether it makes any difference on your
> box.

I tried these bge source files in 9.1-PRERELEASE this week, and it does
not help. If I try to log in with SSH I get:

Aug 23 17:30:32  login: ROOT LOGIN (root) ON ttyu0
bge0: watchdog timeout -- resetting
Aug 23 17:31:31  kernel: bge0: watchdog timeout -- resetting
Aug 23 17:31:31  kernel: bge0: link state changed to DOWN
Aug 23 17:31:35  kernel: bge0: link state changed to UP
bge0: watchdog timeout -- resetting
Aug 23 17:33:24  kernel: bge0: watchdog timeout -- resetting
Aug 23 17:33:24  kernel: bge0: link state changed to DOWN
Aug 23 17:33:28  kernel: bge0: link state changed to UP

I tried setting hw.bge.allow_asf to 0, but it did not help.

During boot I get:

pcib3: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci3: <ACPI PCI bus> on pcib3
pci0:3:0:0: failed to read VPD data.
bge0: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bf0000-0xf6bfffff,0xf6be0000-0xf6beffff,0xf6bd0000-0xf6bdffff irq 32 at device 0.0 on pci3
bge0: APE FW version: NCSI v1.0.80.0
bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus0: <MII bus> on bge0
brgphy0: <BCM5719C 1000BASE-T media interface> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 2c:76:8a:54:08:14
pci0:3:0:1: failed to read VPD data.
bge1: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bc0000-0xf6bcffff,0xf6bb0000-0xf6bbffff,0xf6ba0000-0xf6baffff irq 36 at device 0.1 on pci3
bge1: APE FW version: NCSI v1.0.80.0
bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus1: <MII bus> on bge1
brgphy1: <BCM5719C 1000BASE-T media interface> PHY 2 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 2c:76:8a:54:08:15
pci0:3:0:2: failed to read VPD data.
bge2: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6b90000-0xf6b9ffff,0xf6b80000-0xf6b8ffff,0xf6b70000-0xf6b7ffff irq 32 at device 0.2 on pci3
bge2: APE FW version: NCSI v1.0.80.0
bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus2: <MII bus> on bge2
brgphy2: <BCM5719C 1000BASE-T media interface> PHY 3 on miibus2
brgphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge2: Ethernet address: 2c:76:8a:54:08:16
pci0:3:0:3: failed to read VPD data.
bge3: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6b60000-0xf6b6ffff,0xf6b50000-0xf6b5ffff,0xf6b40000-0xf6b4ffff irq 36 at device 0.3 on pci3
bge3: APE FW version: NCSI v1.0.80.0
bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus3: <MII bus> on bge3
brgphy3: <BCM5719C 1000BASE-T media interface> PHY 4 on miibus3
brgphy3:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge3: Ethernet address: 2c:76:8a:54:08:17

-- 
Anders.
Comment 2 Anders Nordby freebsd_committer freebsd_triage 2012-08-27 21:48:19 UTC
Responsible Changed
From-To: freebsd-bugs->yongari

Over to yongari@.
Comment 3 Anders Nordby freebsd_committer freebsd_triage 2012-08-28 13:49:31 UTC
Hi,

The last response from YongHyeon PYUN (yongari@) was:

> I tried setting hw.bge.allow_asf to 0, but it did not help.

The loader tunable has no effect on controllers with an
APE (Application Processor Engine).

> During boot I get:
>
> pcib3: <ACPI PCI-PCI bridge> at device 2.0 on pci0
> pci3: <ACPI PCI bus> on pcib3
> pci0:3:0:0: failed to read VPD data.
> bge0: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem
> 0xf6bf0000-0xf6bfffff,0xf6be0000-0xf6beffff,0xf6bd0000-0xf6bdffff irq 32
> at device 0.0 on pci3
> bge0: APE FW version: NCSI v1.0.80.0
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It seems your APE runs slightly newer NC-SI firmware. I was able to
reproduce watchdog timeouts on a Dell R820, but I'm not sure you're
seeing the same issue here. For some unknown reason, programming the
RX MTU register seems to have no effect with the BCM5720 on the R820.
Receiving frames larger than 175(?) bytes seems to hang the
controller on the R820. The current workaround is to set the MTU of
the sender (i.e. the link partner or switch) to some low value,
128 for example. That will give poor performance but should make
your controller work. I have asked Broadcom for help and am waiting
for answers/hints from them.

Regards,

-- 
Anders.
Comment 4 Anders Nordby freebsd_committer freebsd_triage 2012-08-29 20:59:25 UTC
Hi,

Unfortunately our switches can not set MTU per port, so no luck. :-/
But I could hook up a laptop and test whether having a low MTU on the
link partner helps. Would it be helpful if I tested and verified that?

Has it been a long time since you contacted Broadcom? Is there any hope
of getting this fixed? I'm kinda stuck and may need to consider Linux
instead if this does not sort out. :-(

Bye,
Anders.
Comment 5 pyunyh 2012-08-30 19:39:56 UTC
On Wed, Aug 29, 2012 at 09:59:25PM +0200, Anders Nordby wrote:
> Hi,
> 
> Unfortunately our switches can not set MTU per port, so no luck. :-/
> But I could hook up a laptop and test whether having a low MTU on the
> link partner helps. Would it be helpful if I tested and verified that?
> 

Yes, it will give me another data point.  If lowering the MTU makes
the driver work, it indicates you're seeing the same issue I had with
the BCM5720 on the Dell R820.

> Has it been a long time since you contacted Broadcom? Is there any hope

Unfortunately I didn't get any answers/hints from Broadcom.

> of getting this fixed? I'm kinda stuck and may need to consider Linux
> instead if this does not sort out. :-(
> 

I never give up, but it seems it will take more time than I
initially thought. :-(
Comment 6 Anders Nordby freebsd_committer freebsd_triage 2012-08-31 17:29:14 UTC
Hi,

On Thu, Aug 30, 2012 at 11:39:56am -0700, YongHyeon PYUN wrote:
> Yes, it will give me another data point.  If lowering the MTU makes
> the driver work, it indicates you're seeing the same issue I had with
> the BCM5720 on the Dell R820.

I hooked a laptop running Ubuntu Linux up to the server directly with
only a network cable, and with a low MTU as suggested:

root@ubuntu:~# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 68:b5:99:f2:d4:c6  
          inet addr:192.168.0.2  Bcast:192.168.0.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:128  Metric:1
          RX packets:73 errors:0 dropped:0 overruns:0 frame:0
          TX packets:303 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:4786 (4.7 KB)  TX bytes:52097 (52.0 KB)
          Interrupt:20 Memory:d4700000-d4720000 

Then I can ssh to the host (which I can not when it's connected to our
switches) as well as scp large files to the FreeBSD server:

root@ubuntu:/cdrom/casper# scp filesystem.squashfs 192.168.0.1:
filesystem.squashfs                             4%   30MB   1.7MB/s   06:03 ETA^C
root@ubuntu:/cdrom/casper# scp filesystem.squashfs 192.168.0.1:
filesystem.squashfs                             4%   30MB   1.7MB/s   06:03 ETA^C
root@ubuntu:/cdrom/casper# scp filesystem.squashfs 192.168.0.1:
filesystem.squashfs                            34%  227MB   3.1MB/s   02:18 ETA^C
root@ubuntu:/cdrom/casper# 

The file was around 650 MB. Transfer is quite slow, but it never seems
to stop.

> I never give up, but it seems it will take more time than I
> initially thought. :-(

Let me know if you can work out a fix, I'll test it for sure.

Regards,

-- 
Anders.
Comment 7 anders 2012-09-03 20:54:01 UTC
Hi,

Just to let you know: I checked previously and again today, and there
does not seem to be any newer firmware for this NIC on the hp.com website:

anders@eggsilo:~/Downloads/cp016186/NIC_FW/ncsi$ strings * | grep -i ncsi
NCSI01.00.80

Best regards,

-- 
Anders.
Comment 8 Andrej Zverev freebsd_committer freebsd_triage 2012-09-05 15:11:32 UTC
Hello, same issue here.

Can I help?
Comment 9 Pyun YongHyeon freebsd_committer freebsd_triage 2012-09-06 05:41:17 UTC
State Changed
From-To: open->feedback

I was able to send/receive frames with the latest WIP version on a Dell R820.
Could you try the latest WIP version at the following URL?

http://people.freebsd.org/~yongari/bge/if_bge.c 
http://people.freebsd.org/~yongari/bge/if_bgereg.h 
http://people.freebsd.org/~yongari/bge/brgphy.c 

Note you have to rebuild the kernel or build the bge(4) and mii(4) kernel
driver modules.
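
The module route mentioned above can be sketched as a short plan generator. This is only an illustration under stated assumptions: the source tree is assumed to live in /usr/src, the download directory /root/bge-wip is hypothetical, and the script only prints commands rather than running them. If bge(4) is compiled into the running kernel rather than loaded as a module, a full kernel rebuild is the applicable route instead.

```python
from pathlib import PurePosixPath

# Assumption: the system sources are installed in the default location.
SRC_ROOT = PurePosixPath("/usr/src/sys")

# Where each downloaded WIP file belongs in the source tree.
DESTINATIONS = {
    "if_bge.c": SRC_ROOT / "dev/bge",
    "if_bgereg.h": SRC_ROOT / "dev/bge",
    "brgphy.c": SRC_ROOT / "dev/mii",
}

# Module directories to rebuild: mii first, then bge.
MODULE_DIRS = [SRC_ROOT / "modules/mii", SRC_ROOT / "modules/bge"]


def build_plan(download_dir: str) -> list[str]:
    """Return shell commands that copy the files and rebuild both modules."""
    src = PurePosixPath(download_dir)
    plan = [f"cp {src / name} {dest / name}" for name, dest in DESTINATIONS.items()]
    plan += [f"cd {module} && make && make install" for module in MODULE_DIRS]
    return plan


# Hypothetical download directory; adjust to wherever the files were saved.
for command in build_plan("/root/bge-wip"):
    print(command)
```

Printing the plan first makes it easy to review the paths before touching the source tree; backing up the original three files beforehand is a sensible precaution.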
Comment 10 Anders Nordby freebsd_committer freebsd_triage 2012-09-06 19:59:29 UTC
Hi!

On Thu, Sep 06, 2012 at 04:44:30am +0000, yongari@FreeBSD.org wrote:
> I was able to send/receive frames with latest WIP version on Dell R820.
> Could you try latest WIP version at the following URL?
> 
> http://people.freebsd.org/~yongari/bge/if_bge.c
> http://people.freebsd.org/~yongari/bge/if_bgereg.h
> http://people.freebsd.org/~yongari/bge/brgphy.c
> 
> Note you have to rebuild the kernel or build the bge(4) and mii(4) kernel
> driver modules.
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=171121

Great! This works for me, on this HP Proliant DL 360 G8 that I have this
issue with. I can now log in and also get reasonable transfer speeds.
Example, copying from my workstation:

anders@noname:~/data$ scp HP_Service_Pack_for_Proliant_2012.06.0-0_696111-001_spp_2012.06.0-SPP-2012.06.0B-0.zip root@foo.bar.baz:
Password:
HP_Service_Pack_for_Proliant_2012.06.0-0_6961 100% 1884MB  40.1MB/s   00:47

I hope you can get the updated driver into svn soon? I suppose it is
too late for 9.1-RELEASE.

If you ever make it to Oslo, I owe you beer. :-)

Cheers,

-- 
Anders.
Comment 11 Andrej Zverev freebsd_committer freebsd_triage 2012-09-07 09:34:09 UTC
Hello, sorry for interfering, but I also used these patches and they work! :-)

My dmesg output is here: http://az.semmy.ru/dl380.txt

I did a quick performance test (scp of a file from one server to the dl380)
and this is all I got. But at least it is working now. Thank you!


% systat -ifstat 1 (:scale mbit)

           bge0  in      0.005 Mb/s        263.437 Mb/s            7.572 GB
                   out     0.003 Mb/s          8.001 Mb/s          249.442 MB
Comment 12 pyunyh 2012-09-07 17:49:20 UTC
On Thu, Sep 06, 2012 at 08:59:29PM +0200, Anders Nordby wrote:
> Hi!
> 
> On Thu, Sep 06, 2012 at 04:44:30am +0000, yongari@FreeBSD.org wrote:
> > I was able to send/receive frames with latest WIP version on Dell R820.
> > Could you try latest WIP version at the following URL?
> > 
> > http://people.freebsd.org/~yongari/bge/if_bge.c
> > http://people.freebsd.org/~yongari/bge/if_bgereg.h
> > http://people.freebsd.org/~yongari/bge/brgphy.c
> > 
> > Note you have to rebuild the kernel or build the bge(4) and mii(4) kernel
> > driver modules.
> > 
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=171121
> 
> Great! This works for me, on this HP Proliant DL 360 G8 that I have this
> issue with. I can now log in and also get reasonable transfer speeds.

Nice!

> Example, copying from my workstation:
> 
> anders@noname:~/data$ scp
> HP_Service_Pack_for_Proliant_2012.06.0-0_696111-001_spp_2012.06.0-SPP-2012.06.0B-0.zip
> root@foo.bar.baz:
> Password:
> HP_Service_Pack_for_Proliant_2012.06.0-0_6961 100% 1884MB  40.1MB/s
> 00:47    
> 
> I hope you can get the updated driver into svn soon? I suppose it is
> too late for 9.1-RELEASE.

You're right. And it needs wider testing on all Broadcom
controllers, since I overhauled the ASF/IPMI handling and the controller
reset sequence, which are very sensitive to the many variants.

Could you post the dmesg output (only the bge(4) and brgphy(4)/ukphy(4)
parts)?

> 
> If you ever make it to Oslo, I owe you beer. :-)

:-)

> 
> Cheers,
> 
> -- 
> Anders.
Comment 13 pyunyh 2012-09-08 01:59:25 UTC
On Fri, Sep 07, 2012 at 12:34:09PM +0400, Andrej Zverev wrote:
> Hello, sorry for interfering, but I also used these patches and they work! :-)
> 
> My dmesg output is here: http://az.semmy.ru/dl380.txt
> 
> I did a quick performance test (scp of a file from one server to the dl380)
> and this is all I got. But at least it is working now. Thank you!
> 
> 
> % systat -ifstat 1 (:scale mbit)
> 
>            bge0  in      0.005 Mb/s        263.437 Mb/s            7.572 GB
>                    out     0.003 Mb/s          8.001 Mb/s          249.442 MB

The correct way to measure the performance of an ethernet driver is
to use the netperf or iperf benchmarks, which live in ports.  You should
be able to get more than 920~930Mbps. If you use jumbo frames, the
number would be around 980Mbps.

Anyway thanks for testing!
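
As a sanity check on those numbers, the theoretical ceiling for TCP payload throughput on gigabit Ethernet can be worked out from the per-frame overhead. This is a back-of-the-envelope sketch, assuming IPv4, no VLAN tag, and the 12-byte TCP timestamp option enabled; measured netperf/iperf results land somewhat below these ceilings.

```python
# Per-frame overhead on the wire, in bytes (standard Ethernet framing).
PREAMBLE_SFD = 8
ETH_HEADER = 14
FCS = 4
INTERFRAME_GAP = 12
IPV4_HEADER = 20
TCP_HEADER = 20
TCP_TIMESTAMPS = 12  # assumption: TCP timestamp option is enabled


def max_tcp_goodput_mbps(mtu: int, link_mbps: float = 1000.0) -> float:
    """Upper bound on TCP payload throughput for a given MTU and link speed."""
    payload = mtu - IPV4_HEADER - TCP_HEADER - TCP_TIMESTAMPS
    on_wire = mtu + PREAMBLE_SFD + ETH_HEADER + FCS + INTERFRAME_GAP
    return link_mbps * payload / on_wire


for mtu in (1500, 9000):
    print(f"MTU {mtu}: {max_tcp_goodput_mbps(mtu):.1f} Mbit/s")
```

With a 1500-byte MTU the bound is about 941.5 Mbit/s, and with 9000-byte jumbo frames about 990.0 Mbit/s, which is consistent with the figures quoted above once real-world overhead is taken into account.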
Comment 14 Anders Nordby freebsd_committer freebsd_triage 2012-09-08 07:41:56 UTC
Hi,

On Fri, Sep 07, 2012 at 09:49:20am -0700, YongHyeon PYUN wrote:
>> I hope you can get the updated driver into svn soon? I suppose it is
>> too late for 9.1-RELEASE.
> You're right. And it needs wider testing for all Broadcom
> controllers since I overhauled ASF/IPMI handling and controller
> reset sequence which are very sensitive to many variants.
> 
> Could you post the dmesg output (only the bge(4) and brgphy(4)/ukphy(4)
> parts)?

bge0: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bf0000-0xf6bfffff,0xf6be0000-0xf6beffff,0xf6bd0000-0xf6bdffff irq 32 at device 0.0 on pci3
bge0: APE FW version: NCSI v1.0.80.0
bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus0: <MII bus> on bge0
brgphy0: <BCM5719C 1000BASE-T media interface> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 2c:76:8a:54:08:14
pci0:3:0:1: failed to read VPD data.
bge1: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bc0000-0xf6bcffff,0xf6bb0000-0xf6bbffff,0xf6ba0000-0xf6baffff irq 36 at device 0.1 on pci3
bge1: APE FW version: NCSI v1.0.80.0
bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus1: <MII bus> on bge1
brgphy1: <BCM5719C 1000BASE-T media interface> PHY 2 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 2c:76:8a:54:08:15
pci0:3:0:2: failed to read VPD data.
bge2: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6b90000-0xf6b9ffff,0xf6b80000-0xf6b8ffff,0xf6b70000-0xf6b7ffff irq 32 at device 0.2 on pci3
bge2: APE FW version: NCSI v1.0.80.0
bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus2: <MII bus> on bge2
brgphy2: <BCM5719C 1000BASE-T media interface> PHY 3 on miibus2
brgphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge2: Ethernet address: 2c:76:8a:54:08:16
pci0:3:0:3: failed to read VPD data.
bge3: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6b60000-0xf6b6ffff,0xf6b50000-0xf6b5ffff,0xf6b40000-0xf6b4ffff irq 36 at device 0.3 on pci3
bge3: APE FW version: NCSI v1.0.80.0
bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus3: <MII bus> on bge3
brgphy3: <BCM5719C 1000BASE-T media interface> PHY 4 on miibus3
brgphy3:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge3: Ethernet address: 2c:76:8a:54:08:17
(..)
miibus4: <MII bus> on udav0
ukphy0: <Generic IEEE 802.3u media interface> PHY 0 on miibus4
ukphy0:
none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ue0: <USB Ethernet> on udav0
ue0: Ethernet address: 00:10:14:00:49:ec
bge0: link state changed to UP

I was using ue0 to download your patches. :-)

Regards,

-- 
Anders.
Comment 15 jpmg 2012-10-08 16:15:57 UTC
I've just tested this against a device (one of  four interfaces on a 
Dell PowerEdge R720xd) that identifies itself as

bge2: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem 
0xd51a0000-0xd51affff,0xd51b0000-0xd51bffff,0xd51c0000-0xd51cffff irq 35 
at device 0.0 on pci1
bge2: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
brgphy2: <BCM5720C 1000BASE-T media interface> PHY 1 on miibus2

which generates the "typical" errors of

bge2: Try again
bge2: Try again
miibus2: <MII bus> on bge2
bge2: Ethernet address:  xx:xx:xx:xx:xx:xx   [ real data removed ]
bge2: watchdog timeout - resetting

followed by lots of

bge2: N link states coalesced     [ various values of N between 2 and 6 ]

with occasional

bge2: watchdog timeout -- resetting

and splurges of

bge2: link state changed to UP
bge2: link state changed to DOWN

----------------------------------

I'm assuming this is because some of the modifications made (eg the "fix 
internal FIFO overflow")
are coded to only apply to the 5719 and not the 5720 .  Is it worth me 
trying out adding 5720
cases to each of the 5719-specific fixes?

-patrick.
Comment 16 jpmg 2012-10-08 16:18:47 UTC
I've just tested this against a device (one of four interfaces on a
Dell PowerEdge R720xd) that identifies itself as

bge2: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> 
  mem 0xd51a0000-0xd51affff,0xd51b0000-0xd51bffff,0xd51c0000-0xd51cffff 
  irq 35 at device 0.0 on pci1
bge2: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
brgphy2: <BCM5720C 1000BASE-T media interface> PHY 1 on miibus2

which generates the "typical" errors of

bge2: Try again
bge2: Try again
miibus2: <MII bus> on bge2
bge2: Ethernet address:  xx:xx:xx:xx:xx:xx   [ real data removed ]
bge2: watchdog timeout - resetting

followed by lots of

bge2: N link states coalesced     [ various values of N between 2 and 6 ]

with occasional

bge2: watchdog timeout -- resetting

and splurges of

bge2: link state changed to UP
bge2: link state changed to DOWN

----------------------------------

I'm assuming this is because some of the modifications made (eg the
"fix internal FIFO overflow") are coded to only apply to the 5719 and
not the 5720 .  Is it worth me trying out adding 5720 cases to each of
the 5719-specific fixes?

-patrick.
Comment 17 pyunyh 2012-10-09 17:56:18 UTC
On Mon, Oct 08, 2012 at 03:20:12PM +0000, Patrick Gosling wrote:
> The following reply was made to PR kern/171121; it has been noted by GNATS.
> 
> From: Patrick Gosling <jpmg@eng.cam.ac.uk>
> To: anders@freebsd.org, bug-followup@freebsd.org
> Cc:  
> Subject: Re: kern/171121: [bge] bge driver not working with BCM5719 (HP
>  Proliant DL 360 G8)
> Date: Mon, 08 Oct 2012 16:18:47 +0100
> 
>  I've just tested this against a device (one of four interfaces on a
>  Dell PowerEdge R720xd) that identifies itself as
>  
>  bge2: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> 
>    mem 0xd51a0000-0xd51affff,0xd51b0000-0xd51bffff,0xd51c0000-0xd51cffff 
>    irq 35 at device 0.0 on pci1
>  bge2: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
>  brgphy2: <BCM5720C 1000BASE-T media interface> PHY 1 on miibus2
>  
>  which generates the "typical" errors of
>  
>  bge2: Try again
>  bge2: Try again
>  miibus2: <MII bus> on bge2
>  bge2: Ethernet address:  xx:xx:xx:xx:xx:xx   [ real data removed ]
>  bge2: watchdog timeout - resetting
>  
>  followed by lots of
>  
>  bge2: N link states coalesced     [ various values of N between 2 and 6 ]
>  
>  with occasional
>  
>  bge2: watchdog timeout -- resetting
>  
>  and splurges of
>  
>  bge2: link state changed to UP
>  bge2: link state changed to DOWN
>  
>  ----------------------------------
>  
>  I'm assuming this is because some of the modifications made (eg the
>  "fix internal FIFO overflow") are coded to only apply to the 5719 and
>  not the 5720 .  Is it worth me trying out adding 5720 cases to each of
>  the 5719-specific fixes?
>  

Probably not. Did you try the latest bge(4) in HEAD? If the answer is
yes, you may have to wait until I fully merge all the changes made
in the WIP version.
If you see the issue with the WIP version, though, then it indicates a
different issue.

>  -patrick.
Comment 18 jpmg 2012-10-11 13:02:09 UTC
Ah, that's a very useful pointer.  I just built the kernel from HEAD
(from 10 Oct 2012), and it appears to have fixed the problem for me
for this Dell PowerEdge R720xd BCM5720C interface.

Many thanks.

-patrick.
Comment 19 Maxim Sobolev freebsd_committer freebsd_triage 2012-11-06 19:46:33 UTC
Hi,

We have been having a similar issue on a Dell R620 server with 9.1-RC3
built from sources from the 1st of November (constant UP/DOWN after adding
an IP). The issue has been fixed by installing the WIP driver. It would be
really nice to get it into the tree before 9.1 is out.

We have not tested it much, but the WIP driver appears to be working
reasonably well. The only problem is that the interface takes a very long
time to come online after boot, something like 1-2 minutes, so that
ntpdate fails to get the time. I am not sure if it's something with the
switch or some kind of driver issue.

Relevant dmesg is below.

Thanks!

-Maxim

bge0: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem 
0xd90a0000-0xd90affff,0xd90b0000-0xd90bffff,0xd90c0000-0xd90cffff irq 34 
at device 0.0 on pci2
bge0: APE FW version: NCSI v1.0.85.0
bge0: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
bge0: Ethernet address: 90:b1:1c:06:7d:3f
bge1: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem 
0xd90d0000-0xd90dffff,0xd90e0000-0xd90effff,0xd90f0000-0xd90fffff irq 36 
at device 0.1 on pci2
bge1: APE FW version: NCSI v1.0.85.0
bge1: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
bge1: Ethernet address: 90:b1:1c:06:7d:40
bge2: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem 
0xd91a0000-0xd91affff,0xd91b0000-0xd91bffff,0xd91c0000-0xd91cffff irq 35 
at device 0.0 on pci1
bge2: APE FW version: NCSI v1.0.85.0
bge2: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
bge2: Ethernet address: 90:b1:1c:06:7d:3d
bge3: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem 
0xd91d0000-0xd91dffff,0xd91e0000-0xd91effff,0xd91f0000-0xd91fffff irq 38 
at device 0.1 on pci1
bge3: APE FW version: NCSI v1.0.85.0
bge3: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
bge3: Ethernet address: 90:b1:1c:06:7d:3e
bge3: link state changed to UP
Comment 20 danmason 2012-11-08 21:18:43 UTC
Greetings,

I have a Gen8 HP DL360 running 9.1-RC2.  I recompiled bge using yongari's WIP
files.

-rw-r--r--  1 root  wheel  200820 Sep  6 04:17 if_bge.c
-rw-r--r--  1 root  wheel  104310 Sep  6 04:17 if_bgereg.h
-rw-r--r--  1 root  wheel   31530 May 21 05:23 brgphy.c

The WIP files fix the obvious UP/DOWN bug, and the driver also appears to be
very functional.  My lab only has a 100M uplink but I was able to iperf a bit
over 90Mbit/sec between this server and an out of state server.  I
connected this to another local system via crossover cable to test the gigE
performance and I'm able to get 920Mbit/sec between the two servers.  I
tested with iperf both as a server (send) and a client (receive) with the
same speed results.

I'm emailing this information wondering if there is any specific testing that
I can do to help with the development of this driver.  I was also wondering
if we'll have to wait for 9.2 for this driver to be included in a release?
I know 9.1 was frozen in July but maybe this is just a bug fix for 9.1?  :)



Dan

-- 
Daniel Mason
Systems Engineer
danmason@danmason.net

"and all you touch and all you see, is all your life will ever be"
Comment 21 Pyun YongHyeon freebsd_committer freebsd_triage 2012-11-21 04:50:27 UTC
State Changed
From-To: feedback->patched

The WIP version was committed to CURRENT.
Thanks to all who tested the WIP version.
Comment 22 a 2012-11-28 12:30:13 UTC
I ran into that exact same problem (see
http://lists.freebsd.org/pipermail/freebsd-questions/2012-November/246824.html),
which makes it impossible for me to get my new system (9.1-RC3) online.

Do you have any status whether your patch will make it into the final
9.1-release?

If not, how can I get my system online without any NIC (they're all
the same type) working?

Thanks much in advance for any clue,
-ewald
Comment 23 pyunyh 2012-11-29 00:23:04 UTC
On Wed, Nov 28, 2012 at 12:40:01PM +0000, Ewald Jenisch wrote:

[...]

>  I ran into that exact same problem (see
>  http://lists.freebsd.org/pipermail/freebsd-questions/2012-November/246824.html),
>  which makes it impossible for me to get my new system (9.1-RC3) online.
>  
>  Do you have any status whether your patch will make it into the final
>  9.1-release?

No, it is too late. But the changes were merged to both stable/9
and stable/8.

>  
>  If not, how can I get my system online without any NIC (they're all
>  the same type) working?

Install FreeBSD 9.1-RC3 and use USB ethernet to get access to
Internet.  Copy if_bge.c/if_bgereg.h/brgphy.c from CURRENT or
stable/9 to your box and rebuild kernel.

>  
>  Thanks much in advance for any clue,
>  -ewald
Comment 24 a 2012-11-29 15:22:27 UTC
On Thu, Nov 29, 2012 at 09:23:04AM +0900, YongHyeon PYUN wrote:
> 
> >  
> >  If not, how can I get my system online without any NIC (they're all
> >  the same type) working?
> 
> Install FreeBSD 9.1-RC3 and use USB ethernet to get access to
> Internet.  Copy if_bge.c/if_bgereg.h/brgphy.c from CURRENT or
> stable/9 to your box and rebuild kernel.

Hi,

You also wrote that the updated driver files are in
"if_bge.c/if_bgereg.h/brgphy.c". So as far as I understand these files
should be on any other machine of mine that I update with cvsup, right?

Leaves only the problem of getting them over to the machine in
question...


What do you mean by "USB ethernet"? 

Thanks much in advance for any clue,
-ewald
Comment 25 pyunyh 2012-11-30 01:53:56 UTC
On Thu, Nov 29, 2012 at 04:22:27PM +0100, Ewald Jenisch wrote:
> On Thu, Nov 29, 2012 at 09:23:04AM +0900, YongHyeon PYUN wrote:
> > 
> > >  
> > >  If not, how can I get my system online without any NIC (they're all
> > >  the same type) working?
> > 
> > Install FreeBSD 9.1-RC3 and use USB ethernet to get access to
> > Internet.  Copy if_bge.c/if_bgereg.h/brgphy.c from CURRENT or
> > stable/9 to your box and rebuild kernel.
> 
> Hi,
> 
> You also wrote that the updated driver files are in
> "if_bge.c/if_bgereg.h/brgphy.c". So as far as I understand these files
> should be on any other machine of mine that I update with cvsup, right?

9.1-RC3 does not ship the updated bge(4), so you still have to
manually download the required files. If you have access to CURRENT or
the latest stable/9 sources, you can copy the required files from there too.

> 
> Leaves only the problem of getting them over to the machine in
> question...
> 
> 
> What do you mean by "USB ethernet"? 
> 

I thought you didn't have any working network devices on your box, so
I recommended using a USB-based ethernet controller (e.g. axe(4)) to
get a working network on the box. If you have other network devices
that work with 9.1-RC3, you can directly download the required files
from CURRENT or stable/9 to your 9.1-RC3 box.
Comment 26 Anders Nordby freebsd_committer freebsd_triage 2012-12-01 20:57:54 UTC
On Fri, Nov 30, 2012 at 10:53:56am +0900, YongHyeon PYUN wrote:
>> What do you mean by "USB ethernet"? 
> I thought you didn't have any working network devices on your box, so
> I recommended using a USB-based ethernet controller (e.g. axe(4)) to
> get a working network on the box. If you have other network devices
> that work with 9.1-RC3, you can directly download the required files
> from CURRENT or stable/9 to your 9.1-RC3 box.

Or he could just copy the files over using a usb key, cd/dvd-rom or if
he is lucky like me by mounting an iso image with the files as a virtual
drive (HP ILO).

Looks like snapshots.jp.freebsd.org is down these days, and
ftp://ftp.freebsd.org/pub/FreeBSD/snapshots is rather empty. There are
some JPSNAP snapshots in
ftp://ftp.allbsd.org/pub/FreeBSD-snapshots/amd64-amd64/. The latest
RELENG_9 image is from September 15, and the latest CURRENT (not so
desirable if you are a user and not a developer) from October 6. Not sure
if these are new enough. We should have downloadable and updated stable
snapshots IMO, then this would not be necessary. :-}

Cc to buildadm@jp.FreeBSD.org regarding snapshots.jp.freebsd.org.

Best regards,

-- 
Anders.
Comment 27 danmason 2013-01-18 22:54:06 UTC
Is there a recommendation for using this bge fix in 9.1-RELEASE?  I've
tried a few versions from stable but they didn't seem to work.

Should I be using that WIP source still?  I'm compiling that right now.


Dan


-- 
Daniel Mason
Systems Engineer
danmason@danmason.net

"and all you touch and all you see, is all your life will ever be"
Comment 28 Anders Nordby freebsd_committer freebsd_triage 2013-01-19 11:36:47 UTC
Hi!

On Fri, Jan 18, 2013 at 04:54:06pm -0600, Dan Mason wrote:
> Is there a recommendation for using this bge fix in 9.1-RELEASE?  I've
> tried a few versions from stable but they didn't seem to work.
> 
> Should I be using that WIP source still?  I'm compiling that right now.

It will not be merged to 9.1-RELEASE AFAIK. But you can find it in
-STABLE. YongHyeon said the changes were merged to stable/9 and stable/8
in November.

Unless you want to patch the fix in, I'd try one of the newer RELENG_9
(not RELENG_9_1!) snapshots in
ftp://ftp.allbsd.org/pub/FreeBSD-snapshots/amd64-amd64.

Let us know how it goes. :)

Cheers,

-- 
Anders.
Comment 29 danmason 2013-01-30 23:30:49 UTC
I didn't have a lot of luck finding bge drivers in stable/9 that both fixed
the link up/down problems and would compile in 9.1-RELEASE.  It seems
there were a fair number of other changes made to that driver that don't
play well with 9.1-RELEASE.  I was, however, able to compile those WIP driver
files in 9.1-RELEASE, and initial testing looks good enough so far.

I understand that these driver changes were aimed at 9.2-RELEASE, but
it would be useful if there were a semi-official recommendation on how to
use this driver in 9.1-RELEASE, since all the new HP (and I suspect Dell)
servers have these chipsets.  Maybe some specific revision/branch etc. that's
recommended over the WIP files?

In any case I appreciate all the hard work on this driver.  Thanks!


Dan






-- 
Daniel Mason
Systems Engineer
danmason@danmason.net

"and all you touch and all you see, is all your life will ever be"
Comment 30 Pyun YongHyeon freebsd_committer freebsd_triage 2013-03-14 01:46:54 UTC
State Changed
From-To: patched->closed

All required changes to support BCM5719/BCM5720 were merged to 
stable/9 and stable/8.
Comment 31 Maxim Sobolev freebsd_committer freebsd_triage 2013-04-09 22:51:28 UTC
State Changed
From-To: closed->open

Still not working on (some) HP Proliant DL 360 G8.
Comment 32 Maxim Sobolev freebsd_committer freebsd_triage 2013-04-09 22:59:33 UTC
Hi,

I don't think this issue has been resolved. I have the same kind of 
server here (HP 360 G8) with BCM5719, and none of the 4 onboard 
interfaces seems to be working with the very latest 9-stable kernel. My 
kernel is more recent than rev.248858, which was the MFC to 
9-stable. Upon booting up, one interface shows as active, but as soon as 
I try to ifconfig it, it goes down, and no matter what I do it does not 
come back again. At the link below please find some screenshots of me 
trying to debug the issue.

http://sobomax.sippysoft.com/pr171121.zip

If necessary I can provide shell access to the console via the KVM.

Any help would be appreciated.

Thanks!

-Maxim
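
To double-check that a running kernel really is at or past the MFC
revision mentioned above, something like the following may help. It is
only a sketch: it assumes the kernel was built from an svn checkout, so
that the `uname -v` string carries an rNNNNNN token, and the helper names
are made up for illustration.

```sh
# Extract the svn revision number from a "uname -v" style string.
# Prints nothing if no " rNNNNNN" token is present.
kernel_svn_rev() {
    printf '%s\n' "$1" | sed -n 's/^.* r\([0-9]\{1,\}\).*$/\1/p'
}

# Succeeds if the revision is at least r248858 (the MFC named in this PR).
has_bge_mfc() {
    rev=$(kernel_svn_rev "$1")
    [ -n "$rev" ] && [ "$rev" -ge 248858 ]
}

# Usage on a live system:
#   has_bge_mfc "$(uname -v)" && echo "kernel includes the bge MFC"
```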
Comment 33 pyunyh 2013-04-10 02:09:03 UTC
On Tue, Apr 09, 2013 at 02:49:40PM -0700, Maxim Sobolev wrote:
> Hi,
> 

Hi,

> I don't think this issue has been resolved. I have the same kind of 
> server here (HP 360 G8) with BCM5719, and none of the 4 onboard 
> interfaces seems to be working with the very latest 9-stable kernel. My 

Hmm, did it ever work?

> kernel is more recent than rev.248858, which was the MFC to 
> 9-stable. Upon booting up, one interface shows as active, but as soon as 
> I try to ifconfig it, it goes down, and no matter what I do it does not 
> come back again. Attached please find some screenshots of me trying to 
> debug the issue.

Thanks for the additional info. Does the box have identical bge(4)
controllers? You showed me bge0/brgphy0 output but you tested
bge3, and HP used to deploy different controllers (e.g. two dual-port
controllers). 
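
One rough way to answer the "identical controllers" question from a saved
dmesg is to collect the distinct CHIP ID values that bge(4) prints at
attach time. This is a sketch with a hypothetical helper name; it only
relies on the "bgeN: CHIP ID ..." line format quoted in this PR.

```sh
# stdin: dmesg text; stdout: each distinct bge(4) CHIP ID, one per line.
distinct_bge_chip_ids() {
    sed -n 's/^bge[0-9][0-9]*: CHIP ID \(0x[0-9a-fA-F]*\);.*/\1/p' | sort -u
}

# Usage on a live system:
#   distinct_bge_chip_ids < /var/run/dmesg.boot
# More than one line of output means the ports are not all the same chip.
```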
 
> 
> If necessary I can provide shell access to the console via the KVM.

Sorry no clue yet except trying r248993.

> 
> Any help would be appreciated.

This PR contains lots of mixed issues (from simple questions to real
issues) and I believe all of them were resolved. Given that you're
the first reporter to indicate a driver issue after the MFC, it would
be even better to open a new PR with additional information like
dmesg output and revision number. 

You're using IPMI, right? If you do not configure IPMI, does the
issue still happen? By the way, do not manually configure media.
brgphy(4) does not work well with manual configuration, so always
use auto-negotiation (this should be used to establish a 1000baseT
link anyway). 

I also noticed you're trying to establish a 100baseTX link rather
than a 1000baseT one. Does the link partner not support 1000baseT?
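
As a quick check for a forced media setting, the "media:" line of
ifconfig(8) output can be inspected. The helper below is a hypothetical
sketch; it assumes the usual FreeBSD output format, where an
auto-negotiating interface reports "media: Ethernet autoselect (...)".

```sh
# Succeeds if the given ifconfig(8) output shows media autoselect.
media_is_autoselect() {
    printf '%s\n' "$1" | grep -q 'media: Ethernet autoselect'
}

# Usage on a live system:
#   media_is_autoselect "$(ifconfig bge0)" || echo "bge0 has a forced media setting"
# To return the interface to auto-negotiation (as root):
#   ifconfig bge0 media autoselect
```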

> 
> Thanks!
> 
> -Maxim
Comment 34 Kurt Jaeger 2013-07-31 10:07:44 UTC
Hi!

Just FYI, I just did an upgrade of a HP Proliant DL 360 G8 with
BCM5719 to 9.2-BETA2-amd64, and dmesg.boot still says

pci0:3:0:0: failed to read VPD data.

But for the last few minutes bge0 has at least been working (it failed with 9.1-REL).

More of it:

[...]
pci0:3:0:0: failed to read VPD data.
bge0: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bf0000-0xf6bfffff,0xf6be0000-0xf6beffff,0xf6bd0000-0xf6bdffff irq 32 at device 0.0 on pci3
bge0: APE FW version: NCSI v1.0.88.0
bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus0: <MII bus> on bge0
brgphy0: <BCM5719C 1000BASE-T media interface> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: ac:16:2d:77:22:88
pci0:3:0:1: failed to read VPD data.
bge1: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bc0000-0xf6bcffff,0xf6bb0000-0xf6bbffff,0xf6ba0000-0xf6baffff irq 36 at device 0.1 on pci3
bge1: APE FW version: NCSI v1.0.88.0
bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
miibus1: <MII bus> on bge1
brgphy1: <BCM5719C 1000BASE-T media interface> PHY 2 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: ac:16:2d:77:22:89
[...]

-- 
pi@opsec.eu            +49 171 3101372                         7 years to go !
Comment 35 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 07:59:49 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped