Summary: | [bge] [ipmi] regression IPMI access disabled when bge driver is loaded | ||
---|---|---|---|
Product: | Base System | Reporter: | Laurent Frigault <freebsdbugzilla> |
Component: | kern | Assignee: | Pyun YongHyeon <yongari> |
Status: | Open --- | ||
Severity: | Affects Some People | CC: | andrew.daugherity, net |
Priority: | Normal | Keywords: | regression |
Version: | 10.3-BETA2 | Flags: | koobs: mfc-stable10?, koobs: mfc-stable9? |
Hardware: | amd64 | ||
OS: | Any | ||
URL: | https://svnweb.freebsd.org/changeset/base/241438 |
Description
Laurent Frigault
2015-01-20 17:29:38 UTC
I can confirm this regression on a Dell PowerEdge SC1435 with the same BCM5721 NICs:

% pciconf -lvb bge0
bge0@pci0:1:0:0: class=0x020000 card=0x01eb1028 chip=0x165914e4 rev=0x21 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'NetXtreme BCM5721 Gigabit Ethernet PCI Express'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 64, base 0xefcf0000, size 65536, enabled

bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x004201> mem 0xefcf0000-0xefcfffff irq 33 at device 0.0 on pci1
bge0: CHIP ID 0x00004201; ASIC REV 0x04; CHIP REV 0x42; PCI-E
miibus0: <MII bus> on bge0
brgphy0: <BCM5750 1000BASE-T media interface> PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow

I am running 10.2 and IPMI ceases to work after the kernel loads. I previously ran Linux on this hardware and IPMI worked fine. IPMI shares a physical port with bge0 but has its own MAC address and IP.

I have tested other versions of FreeBSD install images, and for my hardware at least, the regression seems to be between 9.1 and 9.2:

9.1: works
9.2: does not work
9.3: does not work
10.2: does not work

Interestingly, I have some PE850 (not 860) running 9.3 that also have a BCM5721 bge0, and IPMI *does* work on 3/4 of them, but only when connecting from the local subnet, despite the gateway being set in IPMI config. Not sure what's broken with the last one, or when the others started working at all; I know the last time I tried IPMI on those a couple years ago it failed the same way (worked at boot, stopped working once kernel initialized bge0), but that was in the 7.x or 8.x days.

My SC1435 has IPMI 2.0, unlike the 850 & 860, which only have IPMI 1.5, if that matters.

I've found the commit that breaks it: base r241438 (which was MFC into stable/9 as 243546).
Reading the commit history between 9.1 and 9.2, I saw that r248226 (MFC onto 9-stable as r248858) claims to fix IPMI on a Sun X2200 that broke with 241438, so my first test was to see if it was working just before that. I built a 9.2 kernel with sys/dev/bge/if_bge.c rolled back to the commit before the "bad" one, 243541 (MFC from 241436), and IPMI works! (I did not touch any other files.) It also works for a stable/9 kernel (identified as 9.3-STABLE #1 r243541:295788) with if_bge.c at 243541.

If I update if_bge.c to the commit in question (241438 aka 243546), IPMI is broken once more. I also tried r248858 (248226), which supposedly fixed IPMI on those Sun servers, but it did not help here. I have not tried any other commits, as it appears that for my hardware, it works on <=241436 and is broken for >=241438.

I also fixed my 10.2 kernel in the same way by rolling back if_bge.c to r241436. I had to merge r242426 and r242625 to get it to build; after doing so IPMI works in 10.2!

Obviously rolling all the way back like this isn't the solution for everyone, as there have been many other commits since then, but at least I found the breakage point. I don't know the bge driver or kernel well enough to properly fix it, but hopefully this is good information for someone who does.

Assign to committer for apparent regressing changeset made in HEAD. This is a 10.3-RELEASE candidate.

For clarity, this is a regression in 9.x, 10.x, current.

(In reply to Andrew Daugherity from comment #2)
Thank you very much for narrowing down the guilty change set. I don't see differences in the ASF/IPMI code path before/after APE support, except for an additional H/W reset in 9.1. If you don't configure bge(4) at all (i.e. the kernel just attaches the driver), does IPMI work?

No, it doesn't. The only difference is the interface speed is 100BaseTX at boot and then 1000BaseT after running ifconfig or dhclient, but IPMI ceases to work once the kernel loads, before any interface configuration is done.
However, I have found a workaround: enabling PXE in the BIOS. I'm still booting via local disk, not over PXE, but with PXE enabled, it prints a message during BIOS load and apparently resets/initializes the NIC in such a way that IPMI still works after FreeBSD loads its bge driver.

To clarify: with FreeBSD 9.1 (and my test kernels with if_bge.c rolled back) and Linux, IPMI works regardless of PXE setting. With FreeBSD >= 9.2, IPMI only works when PXE is enabled. This is true for both the PowerEdge 850 and PowerEdge SC1435, and I would expect the 860 as well. For completeness, I also tested OpenBSD (snapshot) and NetBSD 7.0, and IPMI also breaks with both of those, even with PXE enabled.

The default Dell BIOS setting is "enabled with PXE" for bge0 and "enabled without PXE" for bge1, but I had disabled PXE on some systems to speed up booting and avoid accidentally booting the wrong device.

(In reply to Andrew Daugherity from comment #6)
Thanks for the PXE-related clue. But I'm confused by the ifconfig/dhclient commands. When did you run those commands? bge(4) does not report the current link speed if the interface is not UP. So if you can see an established link, it means you initialized/upped the controller. Upping the interface makes bge(4) initialize the controller, which in turn touches many registers. The same is true for dhclient(8): the first thing dhclient(8) does is bring the interface UP. In order not to touch bge(4) H/W in bge_init(), you should not have any 'ifconfig_bge0=xxxx' line in rc.conf. What I'd like to know is whether IPMI is broken by the bge_attach() call. Could you check it?

I've done most of my testing with the FreeBSD memdisk install images, both with release kernels and test kernels copied onto the USB key. After choosing "Live CD" and logging in, no network is configured until I run 'dhclient bge0' or 'ifconfig bge0 inet a.b.c.d/NN up'.
Some kernels on the PE850 reported the media speed even before bringing up the interface and others didn't, but that's not the issue here, since it fails when the driver loads, before any configuration happens.

Just to be sure this issue is on attach vs. network configuration, I built a test 9-stable (unmodified r296050) kernel with GENERIC + 'nodevice bge' and tested it with PXE disabled. IPMI continues to work after this kernel is booted and I log in to the live CD environment, but as soon as I 'kldload if_bge' it breaks.

(In reply to Andrew Daugherity from comment #8)
OK, thank you very much for double checking. Could you try the diff at the following URL?
https://people.freebsd.org/~yongari/bge/bge.ipmi.diff
I don't have access to IPMI-aware bge(4) H/W, so it's just compile tested. The diff was generated against HEAD, but I guess it will apply to stable/9 or stable/10.

(In reply to Pyun YongHyeon from comment #9)
Unfortunately, the diff does not fix anything on my hardware. IPMI still works when PXE is enabled and does not work without it.

(In reply to Andrew Daugherity from comment #10)
Uploaded an updated diff. The URL is the same as before. Could you test it again?

(In reply to Pyun YongHyeon from comment #11)
Is that diff meant to be cumulative with the previous one or replace it entirely? With a kernel built with only the new diff (discarding the previous one), there is no change.
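
For anyone who wants to repeat the attach-only test from comment #8 (GENERIC plus 'nodevice bge', then 'kldload if_bge'), here is a minimal sketch. It assumes a source tree in /usr/src; the config name NOBGE and the two helper function names are made up for illustration and do not come from this report. The script only writes the kernel config and prints the remaining steps; it does not build, install, or load anything itself.

```shell
#!/bin/sh
# Sketch only. NOBGE, write_nobge_config and print_test_steps are
# hypothetical names chosen for this example.

# Write a kernel config that is GENERIC without bge(4) compiled in.
# $1: directory to write into (assumed: /usr/src/sys/amd64/conf)
write_nobge_config() {
    conf_dir=$1
    cat > "$conf_dir/NOBGE" <<'EOF'
include GENERIC
ident   NOBGE
nodevice bge
EOF
}

# Print (do not execute) the remaining manual steps of the test.
print_test_steps() {
    cat <<'EOF'
cd /usr/src && make buildkernel KERNCONF=NOBGE
cd /usr/src && make installkernel KERNCONF=NOBGE
# reboot with PXE disabled, then check IPMI from another host
kldload if_bge
# check IPMI again from the other host
EOF
}
```

Typical use would be `write_nobge_config /usr/src/sys/amd64/conf` followed by `print_test_steps`. If IPMI answers before the `kldload` step and stops answering right after it, with no `ifconfig_bge0` line in rc.conf, the breakage is in the attach path rather than in interface configuration.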