Bug 217299 - Periodic Kernel Panic with FreeBSD 10.3-RELEASE and Myricom cards.
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.3-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs mailing list
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2017-02-22 18:14 UTC by Shirkdog
Modified: 2017-06-03 14:10 UTC
CC List: 2 users

See Also:


Description Shirkdog 2017-02-22 18:14:36 UTC
While working with members of the Bro community who run FreeBSD, I was made aware of an issue affecting a user with a large deployment of FreeBSD systems running Bro NSM. The issue appears to lie in the interaction between FreeBSD and the Myricom driver.

Myricom support has stated this is a known issue in FreeBSD. I am submitting this bug on behalf of Keith (attached as a CC to this bug) to try to figure this out. 

He is not able to catch the kernel panic, but below are syslog messages that lead up to the panic.

Sep 30 06:23:11 gulp9 devd: Processing event '!system=DEVFS subsystem=CDEV type=CREATE cdev=myri_fake.1048575'
Sep 30 06:23:11 gulp9 devd: Pushing table
Sep 30 06:23:11 gulp9 devd: Processing notify event
Sep 30 06:23:11 gulp9 devd: Popping table
Sep 30 06:23:12 gulp9 dev 0xfffff80139c8b400 (myri_fake.0) is on clonelist
Sep 30 06:23:12 gulp9 unit=0, low=1048576, extra=0x0
Sep 30 06:23:12 gulp9   0xfffff80139c8b400 myri_fake.0
Sep 30 06:23:12 gulp9   0xfffff8022dbb2000 myri_fake.1
Sep 30 06:23:12 gulp9   0xfffff8022dd00800 myri_fake.2
Sep 30 06:23:12 gulp9   0xfffff801a7244c00 myri_fake.3
Sep 30 06:23:12 gulp9   0xfffff801a7243000 myri_fake.4
Sep 30 06:23:12 gulp9   0xfffff80139650e00 myri_fake.5
Sep 30 06:23:12 gulp9   0xfffff80139bebe00 myri_fake.6
Sep 30 06:23:12 gulp9   0xfffff801390f1200 myri_fake.7
Sep 30 06:23:12 gulp9   0xfffff8013904cc00 myri_fake.8

Basically, the myri_counters tool tries to open several devices that don’t exist in an attempt to find the device for this NIC. Each failed open raises a CREATE event. Like clockwork, the box will panic after a few weeks, and every box in the cluster will panic in roughly the order they were booted. He has not found a way to configure devfs to just ignore these events. In Bro terms, the cluster is a group of servers set up as worker nodes that process up to 10 Gb/s of network traffic for network monitoring.
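
To help quantify how often these CREATE events fire before a panic, here is a minimal sketch (not part of the original report) that counts devd "Processing event" lines per device in a syslog capture. It assumes the log lines have the same shape as the excerpt above. The function name count_cdev_events and the default prefix "myri_fake." are illustrative choices, not anything provided by Myricom or FreeBSD.

```python
import re
from collections import Counter

# Matches devd lines shaped like the excerpt in this report, e.g.:
#   ... devd: Processing event '!system=DEVFS subsystem=CDEV type=CREATE cdev=myri_fake.1048575'
EVENT_RE = re.compile(
    r"devd: Processing event '!system=DEVFS subsystem=CDEV "
    r"type=(?P<type>\w+) cdev=(?P<cdev>[^'\s]+)'"
)


def count_cdev_events(lines, prefix="myri_fake."):
    """Count DEVFS CDEV events per (type, cdev) for devices whose name starts with prefix.

    `prefix` is a hypothetical filter taken from the names seen in the log excerpt.
    """
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match and match.group("cdev").startswith(prefix):
            counts[(match.group("type"), match.group("cdev"))] += 1
    return counts


# Usage with two lines copied from the excerpt above.
sample = [
    "Sep 30 06:23:11 gulp9 devd: Processing event "
    "'!system=DEVFS subsystem=CDEV type=CREATE cdev=myri_fake.1048575'",
    "Sep 30 06:23:11 gulp9 devd: Pushing table",
]
for (event_type, cdev), n in count_cdev_events(sample).items():
    print(n, event_type, cdev)
```

Running this over the full syslog history of each box (pass an open file object instead of the sample list) and comparing the totals against uptime at the time of each panic might show whether the panics track the number of failed opens. That correlation is only a hypothesis at this point.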

Keith is currently running 10.3-RELEASE-p7 (when I received the info), and he is looking to do a test with FreeBSD 11.0-RELEASE. The same Myricom card did not exhibit this issue on FreeBSD 9.

(additional details about the hardware are available from Keith).
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2017-02-26 20:58:54 UTC
I'm confused.  Which driver is this again?  I don't see anything like it in man4.
Comment 2 klehigh 2017-03-06 23:40:51 UTC
The driver in use is provided by Myricom as part of their SNF software.
Comment 3 Shirkdog 2017-06-03 14:10:20 UTC
Keith, I was not sure if you had a chance to test with FreeBSD 11 yet, but please update the bug if you are still having the same issue.

The issue that needs to be ironed out is whether this is an issue with the operating system or the proprietary driver provided by Myricom for their Sniffer Pro License.

Myricom has pointed to this as a FreeBSD issue, as of course it is the operating system's fault for trying to open fake devices :)

However, Myricom support has not given Keith any specific reason why; he only has these logs and the crashes that occur periodically.