foul1.nyi.freebsd.org has been down for a couple of days already.
We need to get that host online first.
I have tried to reboot it. The machine is responding to power cycle commands over IPMI.

Get HPM.x Capabilities request failed, compcode = d3
Get Device ID command failed: 0xd3 Destination unavailable
Get Chassis Power Status failed: Destination unavailable

The BMC web interface is full of whining along the lines of:

org.open_power.Proc.FSI.Error.MasterDetectionFailure
CALLOUT_DEVICE_PATH=/sys/devices/platform/gpio-fsi/fsi0/slave@00:00/raw
CALLOUT_ERRNO=0
_PID=22804

I toggled the power on the PDUs. No change.
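For anyone triaging similar output, the completion codes in the IPMI errors above can be decoded mechanically. The following is only a minimal sketch: the helper names (`completion_codes`, `describe`) are hypothetical, the regular expression assumes the two message shapes quoted in this report (`compcode = d3` and `failed: 0xd3`), and the table is a small subset of the generic completion codes from the IPMI v2.0 specification.

```python
import re

# Subset of the generic IPMI completion codes (IPMI v2.0 specification).
# Only 0xd3 appears in this report; the others are listed for context.
COMPLETION_CODES = {
    0x00: "Command completed normally",
    0xC0: "Node busy",
    0xC1: "Invalid command",
    0xC3: "Timeout while processing command",
    0xD3: "Destination unavailable",
    0xD4: "Insufficient privilege level",
    0xFF: "Unspecified error",
}

# Assumption: codes show up either as "compcode = d3" or as "failed: 0xd3",
# which are the two shapes quoted in this bug report.
_CODE_RE = re.compile(r"(?:compcode\s*=\s*|failed:\s*0x)([0-9a-fA-F]{2})\b")


def completion_codes(output: str) -> list[int]:
    """Return every completion code found in the tool output, in order."""
    return [int(match, 16) for match in _CODE_RE.findall(output)]


def describe(code: int) -> str:
    """Map a completion code to a short description, if it is in the table."""
    return COMPLETION_CODES.get(code, f"Unknown completion code 0x{code:02x}")


if __name__ == "__main__":
    sample = (
        "Get HPM.x Capabilities request failed, compcode = d3\n"
        "Get Device ID command failed: 0xd3 Destination unavailable\n"
        "Get Chassis Power Status failed: Destination unavailable\n"
    )
    for code in completion_codes(sample):
        print(f"0x{code:02x}: {describe(code)}")
```

Every failing request here reports the same code, 0xd3, which matches the "Destination unavailable" text in the messages themselves.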
Can you check the dmesg of the BMC itself?
Created attachment 228657: foul1 BMC dmesg

Attached.
The dmesg output is similar to that of other Talos machines; nothing stands out to me. The FSI error may be linked to the power-on issue. Bug [1] suggests FSI is important for host power-on.

The only things I can suggest at this moment are:

1 - BMC reboot
2 - full power cycle
3 - reset firmware to factory defaults
4 - reinstall firmware
5 - return board to RaptorCS

[1] https://github.com/openbmc/openbmc/issues/3477
Comments from #talos-workstation @efnet:

"FSI is needed to release the SBE on P9"
"fsi is used quite a bit to power on the talos II, if it's broken it absolutely won't boot."
(In reply to Philip Paeps from comment #4) Hi Philip. I work on upstream OpenBMC support for Power systems. Is there any chance you can post the journal output of a failed power-on in addition to the dmesg output you've already attached here?
Hi. For the record:

- Mainboard replaced.
- One CPU removed.
- The 512 GB NVMes were replaced with new 1 TB NVMes.

foul1 is now building packages again.

https://pkg-status.freebsd.org/foul1/ or http://foul1.nyi.freebsd.org/

Kind Regards.
(In reply to Danilo G. Baio from comment #8) Sweet! Thanks (and thanks to whoever funded this).