Summary: | em(4): Crash with Intel 82571EB NIC with AMD Piledriver and Steamroller APUs | | |
---|---|---|---
Product: | Base System | Reporter: | tinfever <tinfever6> |
Component: | kern | Assignee: | freebsd-net (Nobody) <net> |
Status: | Open --- | | |
Severity: | Affects Some People | CC: | alin_im, ben, emaste, fugkco+freebsd-bugzilla, kbowling, krzysztof.galazka, ron |
Priority: | --- | Keywords: | IntelNetworking, crash, needs-qa |
Version: | 12.0-RELEASE | | |
Hardware: | amd64 | | |
OS: | Any | | |
URL: | https://www.reddit.com/r/PFSENSE/comments/da6nh7/multiport_intel_82571ebbased_network_cards_not/ | | |
Attachments: | | | |

Description
tinfever
2019-09-30 16:50:46 UTC
(In reply to tinfever from comment #0)
Could you please provide the output of pciconf -l -vbc and dmesg for that NIC?

Created attachment 207973 [details]
pciconf -l -vbc and dmesg from AMD Piledriver APU and HP NC364T NIC
pciconf -l -vbc and dmesg as requested in comment #1

Created attachment 207977 [details]
pciconf -l -vbc from NC364T and AMD RX-427BB
Attached copy of requested output of "pciconf -l -vbc" on system using NC364T and AMD RX-427BB.
Created attachment 207978 [details]
dmesg from NC364T and AMD RX-427BB Steamroller
Attached copy of requested output of dmesg on system using NC364T and AMD RX-427BB.
I had surprising difficulty getting these logs off the machine since even an ssh session seems to be enough to crash everything sometimes. I've also noticed that if you catch it crashing and start mashing buttons on the keyboard, you can see it register the key presses really slowly for a second until it eventually registers nothing at all.

Dropping in to state that I am seeing the same issue with my HP t730 and both an HP NC364T and an NC360T. I followed the same reproduction steps as tinfever but with the 12.1 release. Also tested with pfSense 2.5 (based on FreeBSD 12.1) running as an iperf server. I do also see this with pfSense 2.4.5 (based on FreeBSD 11.3). If any further information is needed, please let me know.

Same issue happens to me :| I am running it on an HP T730 with an HP NC365T Network Controller, 32GB SSD, 2x4GB RAM (brand new). Trying to make it work with pfSense 2.4.5 and 2.5 (FreeBSD 12.2-Stable). I have swapped in different RAM sticks, SSDs and NICs. The only thing I have not changed is the CPU. It works for about an hour, maybe less, and then becomes unresponsive and requires a hard reboot. Let me know if you found any solution/workaround or whether I need to repurpose the box for something else. Thanks :)

Stumbled upon the same issue today. Took the card out and it works fine again. Happy to provide any details if necessary.

(In reply to Ace from comment #7)
Please do, I'd like to see the output of 'ifconfig em0' and a 'dmesg'.

Created attachment 227688 [details]
dmesg output
Created attachment 227689 [details]
ifconfig em0 output
Created attachment 227690 [details]
pciconf output
(In reply to Kevin Bowling from comment #8)
Thanks, I've attached them. I've also included pciconf -l -vbc output for good measure, as asked in comment #1.

(In reply to Ace from comment #12)
> ecap 0001[100] = AER 1 0 fatal 1 non-fatal 2 corrected
You have some fatal PCI errors occurring on the card, and that looks consistent with the other pciconf reports. Just to start with a low-effort guess, can you try disabling PCI Link Power management (ASPM) and/or AER (advanced error reporting) in the system's firmware and see what happens? Beyond that, there are a number of relevant errata we may need to check off in the driver to see if we are missing some mitigation: http://iommu.com/datasheets/e1000-datasheets/82571eb-82572ei-gbe-controller-spec-update.pdf. The above two firmware changes stand out to me as eliminating some possible issues.

> try disabling PCI Link Power management (ASPM) and/or AER (advanced error reporting) in the system's firmware and see what happens?
Sorry mate, I'm struggling to figure out how to do this. Sorry if the following sounds dumb in this context. I'm using UEFI but don't see any such option, nor do I find anything on the internet on how to disable these.
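
For anyone comparing the attached pciconf reports, the AER line quoted in the reply to comment #12 can be pulled out of saved `pciconf -l -vbc` output with a short script. This is only a sketch: it assumes the line layout is `AER <version> <n> fatal <n> non-fatal <n> corrected` (each count preceding its label), and the names `AER_RE` and `parse_aer_line` are made up for illustration rather than part of any FreeBSD tool.

```python
import re

# Assumed layout, based on the single line quoted in this thread:
#   ecap 0001[100] = AER 1 0 fatal 1 non-fatal 2 corrected
# i.e. capability version first, then each count followed by its label.
AER_RE = re.compile(
    r"ecap\s+0001\[(?P<offset>[0-9a-fA-F]+)\]\s*=\s*AER\s+(?P<version>\d+)"
    r"\s+(?P<fatal>\d+)\s+fatal"
    r"\s+(?P<nonfatal>\d+)\s+non-fatal"
    r"\s+(?P<corrected>\d+)\s+corrected"
)


def parse_aer_line(line):
    """Return the AER fields found in one line of pciconf output, or None."""
    match = AER_RE.search(line)
    if match is None:
        return None
    return {
        "offset": match.group("offset"),
        "version": int(match.group("version")),
        "fatal": int(match.group("fatal")),
        "nonfatal": int(match.group("nonfatal")),
        "corrected": int(match.group("corrected")),
    }


if __name__ == "__main__":
    sample = "    ecap 0001[100] = AER 1 0 fatal 1 non-fatal 2 corrected"
    print(parse_aer_line(sample))
```

Running the same function over every line of each attachment would make it easier to check whether the reports from the different machines really show the same error counts.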

(In reply to Ace from comment #14)
If the UEFI had options for it, I think it would be obvious, so it may not have the knobs exposed. It will be tricky to proceed and make any fixes without a card.

(In reply to Kevin Bowling from comment #15)
I just bought a new Intel i350-T4, which users online have reported no issues with in combination with the HP T730 and OPNSense, so fingers crossed that will fare better. Having said that, if you are UK based, I'd be happy to post the HP NC364T card to you if it helps other users, since I'll have no use for it. Please contact me directly via email if you're up for that.

^Triage: clear stale flags.