Bug 259104 - powerpc64 package builder (foul1.nyi.freebsd.org) is down
Summary: powerpc64 package builder (foul1.nyi.freebsd.org) is down
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Package Infrastructure
Version: Latest
Hardware: powerpc Any
Importance: --- Affects Many People
Assignee: Danilo G. Baio
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-10-12 14:34 UTC by Piotr Kubaj
Modified: 2022-12-15 00:13 UTC
CC List: 10 users

See Also:


Attachments
foul1 BMC dmesg (13.68 KB, text/plain)
2021-10-13 10:28 UTC, Philip Paeps

Description Piotr Kubaj freebsd_committer freebsd_triage 2021-10-12 14:34:54 UTC
foul1.nyi.freebsd.org has been down for a couple of days already.
Comment 1 Li-Wen Hsu freebsd_committer freebsd_triage 2021-10-12 19:48:51 UTC
We need to get that host online first.
Comment 2 Philip Paeps freebsd_committer freebsd_triage 2021-10-13 08:30:01 UTC
I have tried to reboot it.

The machine is not responding to power cycle commands over IPMI:

Get HPM.x Capabilities request failed, compcode = d3
Get Device ID command failed: 0xd3 Destination unavailable
Get Chassis Power Status failed: Destination unavailable
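
For readers unfamiliar with the output above: d3 is the generic IPMI completion code for "Destination unavailable". A minimal sketch for decoding such lines follows. The helper names (extract_code, describe_completion_code) are hypothetical and not part of any IPMI tool, and the table is only a small subset of the generic completion codes.

```python
import re

# Small subset of the generic IPMI completion codes (not exhaustive).
COMPLETION_CODES = {
    0x00: "Command completed normally",
    0xC0: "Node busy",
    0xC1: "Invalid command",
    0xC3: "Timeout while processing command",
    0xD3: "Destination unavailable",
    0xD4: "Insufficient privilege level",
    0xFF: "Unspecified error",
}

# Matches the two numeric forms seen in the output above:
#   "compcode = d3"  and  "failed: 0xd3"
_CODE_RE = re.compile(r"compcode = ([0-9a-fA-F]+)|failed: 0x([0-9a-fA-F]+)")


def extract_code(line):
    """Return the completion code found in an error line, or None."""
    match = _CODE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1) or match.group(2), 16)


def describe_completion_code(code):
    """Map a completion code to its description (hypothetical helper)."""
    return COMPLETION_CODES.get(code, "unknown completion code 0x%02x" % code)


if __name__ == "__main__":
    lines = [
        "Get HPM.x Capabilities request failed, compcode = d3",
        "Get Device ID command failed: 0xd3 Destination unavailable",
        "Get Chassis Power Status failed: Destination unavailable",
    ]
    for line in lines:
        code = extract_code(line)
        if code is not None:
            print("0x%02x: %s" % (code, describe_completion_code(code)))
```

The third line carries no numeric code, so the sketch skips it.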

The BMC web interface is full of whining along the lines of:

org.open_power.Proc.FSI.Error.MasterDetectionFailure
CALLOUT_DEVICE_PATH=/sys/devices/platform/gpio-fsi/fsi0/slave@00:00/raw CALLOUT_ERRNO=0 _PID=22804 
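
When many such entries need to be compared, the KEY=value metadata can be split into a dictionary. This is only a sketch, assuming the fields are whitespace-separated and that values contain no spaces, as in the entry above; parse_fields is a hypothetical helper name.

```python
def parse_fields(line):
    """Split a whitespace-separated KEY=value line into a dict.

    Tokens without an '=' are ignored. Values stay as strings.
    """
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


if __name__ == "__main__":
    entry = (
        "CALLOUT_DEVICE_PATH=/sys/devices/platform/gpio-fsi/fsi0/slave@00:00/raw "
        "CALLOUT_ERRNO=0 _PID=22804"
    )
    for key, value in parse_fields(entry).items():
        print(key, "->", value)
```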

I toggled the power on the PDUs.  No change.
Comment 3 Piotr Kubaj freebsd_committer freebsd_triage 2021-10-13 10:08:56 UTC
Can you check the dmesg of the BMC itself?
Comment 4 Philip Paeps freebsd_committer freebsd_triage 2021-10-13 10:28:44 UTC
Created attachment 228657
foul1 BMC dmesg

Attached.
Comment 5 Alfredo Dal'Ava Junior freebsd_committer freebsd_triage 2021-10-22 22:01:02 UTC
The dmesg output is similar to that of other Talos machines; nothing in it stands out.

The FSI error may be linked to the power-on issue. Bug [1] suggests that FSI is important for host power-on.

The only things I can suggest at the moment are:

1 - BMC reboot
2 - full power cycle
3 - reset firmware to factory defaults
4 - reinstall firmware
5 - return board to RaptorCS


[1] https://github.com/openbmc/openbmc/issues/3477
Comment 6 Alfredo Dal'Ava Junior freebsd_committer freebsd_triage 2021-10-23 01:19:39 UTC
Comments from #talos-workstation on EFnet:

"FSI is needed to release the SBE on P9"
"fsi is used quite a bit to power on the talos II, if it's broken it absolutely won't boot."
Comment 7 Andrew Jeffery 2022-01-10 00:15:57 UTC
(In reply to Philip Paeps from comment #4)

Hi Philip. I work on upstream OpenBMC support for Power systems. Is there any chance you could post the journal output of a failed power-on in addition to the dmesg output you've already attached here?
Comment 8 Danilo G. Baio freebsd_committer freebsd_triage 2022-12-15 00:11:01 UTC
Hi.

For the record:

- The mainboard was replaced.
- One CPU was removed.
- The 512 GB NVMe drives were replaced with new 1 TB NVMe drives.


foul1 is now building packages again.

https://pkg-status.freebsd.org/foul1/
or
http://foul1.nyi.freebsd.org/

Kind Regards.
Comment 9 Graham Perrin freebsd_committer freebsd_triage 2022-12-15 00:13:59 UTC
(In reply to Danilo G. Baio from comment #8)

Sweet! Thanks (and thanks to whoever funded this).