Bug 232549

Summary: [PowerPC64] IPMI driver will attach, but ipmitool and commands sent to the device fail.
Product: Base System
Reporter: Sean Bruno <sbruno>
Component: kern
Assignee: freebsd-bugs (Nobody) <bugs>
Status: New
Severity: Affects Some People
CC: jhb, jhibbits, kbowling, luporl, mmacy, nwhitehorn
Priority: ---    
Version: CURRENT   
Hardware: powerpc   
OS: Any   

Description Sean Bruno freebsd_committer 2018-10-22 23:15:41 UTC
The Tyan PPC hosts we have in the freebsd cluster are administered via IPMI.  I recently added "device ipmi" to our default kernel configuration to test out local interactions with /dev/ipmi0:

ipmi0: IPMI device rev. 1, firmware rev. 1.10, version 2.0, device support mask 0xbf
ipmi0: Number of channels 2
ipmi0: Attached watchdog
ipmi0: Establishing power cycle handler

The attach will emit one or two errors at startup that are only visible on the console:

[464515.035780175,3] BT: seq 0x6f netfn 0x06 cmd 0x42: Maximum queue length exceeded
[464515.036165129,3] BT: seq 0x6e netfn 0x06 cmd 0x42: Removed from queue
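
For anyone reading these console lines, the fields can be pulled apart mechanically. The following is a minimal sketch in Python and is not part of any FreeBSD or OPAL tool. The function name parse_bt_line and the one-entry lookup table are illustrative assumptions, and the line layout is inferred only from the samples in this report. Per the IPMI specification, NetFn 0x06 is the Application request function and command 0x42 under it is Get Channel Info.

```python
import re

# Layout inferred from the console samples in this report, not from an
# OPAL log-format specification.
BT_LINE = re.compile(
    r"\[(?P<time>\d+\.\d+),(?P<level>\d+)\] BT: "
    r"seq (?P<seq>0x[0-9a-fA-F]+) netfn (?P<netfn>0x[0-9a-fA-F]+) "
    r"cmd (?P<cmd>0x[0-9a-fA-F]+): (?P<msg>.*)"
)

# Only the (netfn, cmd) pair seen in this report; names per the IPMI spec.
KNOWN = {(0x06, 0x42): "App request / Get Channel Info"}


def parse_bt_line(line):
    """Return the decoded fields of one 'BT:' console line, or None."""
    # search() rather than match(), so lines with other output interleaved
    # in front of the timestamp are still recognised.
    m = BT_LINE.search(line)
    if m is None:
        return None
    netfn = int(m.group("netfn"), 16)
    cmd = int(m.group("cmd"), 16)
    return {
        "seq": int(m.group("seq"), 16),
        "netfn": netfn,
        "cmd": cmd,
        "name": KNOWN.get((netfn, cmd), "unknown"),
        "msg": m.group("msg").strip(),
    }


if __name__ == "__main__":
    sample = ("[464515.035780175,3] BT: seq 0x6f netfn 0x06 cmd 0x42: "
              "Maximum queue length exceeded")
    print(parse_bt_line(sample))
```

Every BT line quoted in this report decodes to the same request, Get Channel Info, with only the sequence number changing.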

When attempting to interact with /dev/ipmi0 via ipmitool lan print, the following spew occurs:

root@archon.nyi:~ # ipmitool lan print
Error receiving message: Input/output error
[464615.265945540,3] BT: seq 0x7c netfn 0x06 cmd 0x42: Maximum queue length exceeded

Error rec[464615.266728266,3] BT: seq 0x7b netfn 0x06 cmd 0x42: Removed from queue
eivin[464615.266996584,3] BT: seq 0x7e netfn 0x06 cmd 0x42: Maximum queue length exceeded
g message:[464615.267162623,3] BT: seq 0x7d netfn 0x06 cmd 0x42: Removed from queue
[464615.267270229,3] BT: seq 0x7f netfn 0x06 cmd 0x42: Maximum queue length exceeded
 Input/output e[464615.267383071,3] BT: seq 0x7e netfn 0x06 cmd 0x42: Removed from queue
[464615.267496041,3] BT: seq 0x80 netfn 0x06 cmd 0x42: Maximum queue length exceeded
Error rec[464615.267651110,3] BT: seq 0x7f netfn 0x06 cmd 0x42: Removed from queue
eiving message: Input/output error

---- errors continue for several pages.

I'm going to make the "bold" assumption that ipmi(4) has never been used on a BE system, or at least not on one recently.
Comment 1 Sean Bruno freebsd_committer 2018-10-22 23:18:24 UTC
Well, maybe it's not BE, as this appears to be loading the opal_ipmi interface on boot.

opalcons0: <OPAL Consoles> on opal0
uart0: <OPAL Serial Port> on opalcons0
uart0: console
ipmi0: <OPAL IPMI System Interface> on opal0
opalsens0: <OPAL Sensors> on opal0
Comment 2 Leandro Lupori 2018-10-23 12:10:17 UTC
Let me just see if I got it right:

- I assume that using ipmitool from another OS/platform (e.g. Linux or FreeBSD on x86) to Tyan PPC hosts works, right?
- The problem happens when issuing IPMI commands from Tyan PPC FreeBSD hosts to other Tyan PPC hosts, right?
Comment 3 Sean Bruno freebsd_committer 2018-10-23 13:13:30 UTC
(In reply to Leandro Lupori from comment #2)
Using ipmitool to connect to a remote Tyan PPC host works just fine.  It's how you power on and grab the console of these things.

These errors are generated locally, when a Tyan PPC host queries its own BMC, e.g. running ipmitool lan print when the ipmi driver is loaded.
Comment 4 Sean Bruno freebsd_committer 2018-11-21 14:59:05 UTC
It looks like petitboot can talk to the IPMI controller, most of the time.  So, the Linux kernel definitely has the code to do this.
Comment 5 Leandro Lupori freebsd_committer 2019-02-20 13:27:11 UTC
Sean, as a temporary workaround, aren't you able to use IPMI through the network?

Commands like the following work fine for me:

ipmitool -I lanplus -H -P password sol activate
Comment 6 Sean Bruno freebsd_committer 2019-02-20 13:32:09 UTC
(In reply to Leandro Lupori from comment #5)
Yes.  I am currently using IPMI as the console and power control for the freebsd.org machines.

Remotely accessing IPMI over the network works just fine.
Comment 7 John Baldwin freebsd_committer freebsd_triage 2019-02-21 22:55:05 UTC
Can you provide more details about how IPMI over OPAL works?  Is it using one of the backends (KCS, SMIC, etc.) from the IPMI spec?  If it is using the 'BT' backend, I don't think that I've ever seen that in the wild on x86 (and I didn't think we even supported it).

Hmm, looks like it's an entirely separate thing just done via a single opal_call.  Are those 'BT:' messages from the OPAL console?
Comment 8 Sean Bruno freebsd_committer 2019-02-22 00:13:15 UTC
(In reply to John Baldwin from comment #7)
The BT messages appear on the console of the machine.  In this case it's on the serial port.
Comment 9 Justin Hibbits freebsd_committer 2019-02-23 03:00:31 UTC
The BT messages are from OPAL.

The problem is that the IPMI send queue fills up, and never flushes.  The queue is supposed to be ratcheted (processed and removed) when calling OPAL_HANDLE_INTERRUPT and OPAL_POLL_EVENTS, but that doesn't seem to be happening, and I don't understand why.
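
To illustrate the failure mode described above, here is a toy model in Python of a bounded send queue that is only drained from a poll path. It is a sketch under stated assumptions, not OPAL's or FreeBSD's implementation: the class ToySendQueue, the function run, the queue limit, and the choice to evict the oldest entry on overflow are all hypothetical. The firmware_makes_progress flag stands in for whether the OPAL_HANDLE_INTERRUPT / OPAL_POLL_EVENTS path actually processes and removes a queued message.

```python
from collections import deque


class ToySendQueue:
    """Toy bounded queue; NOT the real OPAL BT queue."""

    def __init__(self, max_len):
        self.max_len = max_len
        self.pending = deque()
        self.events = []  # (kind, seq) tuples, loosely mirroring the log

    def send(self, seq):
        self.pending.append(seq)
        if len(self.pending) > self.max_len:
            # Assumed policy: report the overflow, then evict the oldest
            # entry. The real eviction order may differ.
            self.events.append(("overflow", seq))
            evicted = self.pending.popleft()
            self.events.append(("removed", evicted))

    def poll(self, firmware_makes_progress):
        # Hypothetical stand-in for the interrupt/poll path that is
        # expected to ratchet the queue forward.
        if firmware_makes_progress and self.pending:
            return self.pending.popleft()
        return None


def run(n_msgs, max_len, firmware_makes_progress):
    """Send n_msgs requests, polling once after each; return the events."""
    q = ToySendQueue(max_len)
    for seq in range(n_msgs):
        q.send(seq)
        q.poll(firmware_makes_progress)
    return q.events


if __name__ == "__main__":
    print(run(10, 3, firmware_makes_progress=True))
    print(run(10, 3, firmware_makes_progress=False))
```

When polling drains the queue, no events are recorded. When polling never makes progress, every request past the limit produces an overflow/removed pair, which is the same shape as the paired console messages in the description.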