Summary: | 11.0-CURRENT/10-STABLE powerpc64: a PowerMac G5 specific sys/powerpc/ofw/ofw_machdep.c change for reliable PowerMac G5 booting (with lots of RAM) | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | Mark Millard <marklmi26-fbsd> | ||||||
Component: | kern | Assignee: | Justin Hibbits <jhibbits> | ||||||
Status: | Closed FIXED | ||||||||
Severity: | Affects Only Me | CC: | andreast, jhibbits, kristof, nwhitehorn, ppc | ||||||
Priority: | --- | Keywords: | patch | ||||||
Version: | CURRENT | ||||||||
Hardware: | powerpc | ||||||||
OS: | Any | ||||||||
Attachments: |
|
Description
Mark Millard
2015-12-20 10:45:25 UTC
I should have noted the specifics of the types of PowerMac G5's that my activity with this hack has been in use on: 2 PowerMac7,11 examples, one with 16 GBytes RAM and one with 12 GBytes RAM, each with 2 dual-core processors ("Quad core G5" examples). 1 PowerMac7,2 with 8 GBytes RAM, 2 single-core processors. One of the worries about the hack is that other PowerMac/iMac G5 variants than I have tested on might behave differently. I've no evidence for any other type of PowerMac G5 variant. It works great with 10.3 on PowerMac11,2, quad core with 16GB of memory. (In reply to Mark Millard from comment #1) My PowerMac7,11 reference for the Quad Core (copied from an old note) must be wrong: PowerMac11,2 would be right. [It is still weeks before I'll have access to directly look.] I've finally understood why this patch does anything useful. Open Firmware runs in virtual mode on the Powermac G5. This runs inside the kernel page table, which preserves all address translations made by OF before the kernel starts; as a result, the kernel address space is a strict superset of OF's. Where this explodes is if OF uses an unmapped SLB entry. The SLB fault handler runs in real mode and refers to the PCPU pointer in SPRG0, which blows up the kernel. Having a value of SPRG0 that works for the kernel is less fatal than preserving OF's value in this case. I believe that OF's SPRG0 is maintained only for compatibility with some G4 Apple hardware, the eMac in particular, but will check and we can move on with this. I think it should be safe to wrap this in an #ifdef __powerpc64__. Nathan Whitehorn has put out a kernel patch that he is asking for help with testing on PowerMac/iMac G5 variations. See: https://lists.freebsd.org/pipermail/freebsd-ppc/2016-September/008413.html The patch is not just using __powerpc64__ instead of POWERMAC_G5_SPECIFIC_BUILD in my presentation of my hack. Nathan has already tested a PowerMac11,2. If anyone reading this has any other type(s) of Apple G5's and is willing/able it would be good to get in some testing before the change is made to the kernel. I expect that it does not mater much which version/variation of 10.x, 11.0, or 12 is used if the original boot problems can sometimes be observed before the change. Although I'm not sure of the patch automatically applies nicely to all possible vintages of those. As for me. . . It will be a few weeks before I'll again have access in order to test a PowerMac7,2 (or to independently repeat PowerMac11,2 testing). I'll not have access to other Apple G5's. So my range of testing will be rather limited and will not be soon. (In reply to Mark Millard from comment #5) Jukka A. Ukkonen has reported Nathan Whithorns patch to have failed to work but my hack to have worked: In https://lists.freebsd.org/pipermail/freebsd-ppc/2016-September/008416.html he reports the failure of Nathan's patch. . . I just tried the patch on a PowerMac G5 early 2005 model ... cpu0: IBM PowerPC 970 revision 2.2, 2000.19 MHz cpu0: Features dc000000<PPC32,PPC64,ALTIVEC,FPU,MMU> cpu0: HID0 511081<NAP,DPM,NHR,TBEN,ENATTN> which I think is a PowerMac7,3. It still panics right after it has reported VT(ofwfb). --jau But we had an E-mail exchange about him manually applying a simple edit to get my patch and he reports in the end. . . I rebooted the box only a few minutes ago, and this time it booted just fine. Well, the tmpfs has started failing as follows... # mount /tmp mount: tmpfs: Operation not supported by device but supposedly this is an unrelated issue. At least I hope so. ;-) Now the function reads as shown below, and I guess this is exactly what you had in mind. static __inline void ofw_sprg_prepare(void) { if (ofw_real_mode) return; /* * Assume that interrupt are disabled at this point, or * SPRG1-3 could be trashed */ __asm __volatile("mfsprg0 %0\n\t" "mtsprg1 %1\n\t" "mtsprg2 %2\n\t" "mtsprg3 %3\n\t" : "=&r"(ofw_sprg0_save) : "r"(ofmsr[2]), "r"(ofmsr[3]), "r"(ofmsr[4])); } So, from my point of view the PowerMac7,3 seems to be back to relative health with this change. I hope it works for all the other G5s as well. --jau (In reply to Mark Millard from comment #6) That test code does not have the #ifdef __powerpc64__ logic that an official update would need to also G4's and the like: it just serves as a test of the alternative's operation on Jukka's PowerMac7,3. (In reply to Mark Millard from comment #7) Also: The "simple edit" involved was to the official svn source that does not have Nathan's proposed patch. My original wording did not make that explicit. (In reply to Nathan Whitehorn from comment #4) There seems to have been no activity for this since comment #6 reported the patch failed on a PowerMac 7,3 . So it has been sitting in the In Progress state for about 4 months (2016-Sep-13 to 2017-Jan-10) as I write this. Comment 6 basically reports that some of the special purpose register(s) do need to be saved and restored, just not all of them. Nathan's patch had completely disabled ofw_sprg_prepare and its matching restore for __powerpc64__ . Unfortunately for now I've only access to a PowerMac 11,2 (a so-called "Quad Core" PowerMac G5) so for now I could only test the context that Nathan found to be working for his 2016-Sep patch. I should eventually have access to a PowerMac 7,2 form of G5 again. Still I wonder about 205458 showing "In Progress" for long periods. It might get lost with folks thinking it is being worked on when other higher priority things may have displaced its activity for a long time (even starting from now). However, closing the defect seems wrong as well. Going back to new would seem better. But as it shows to me such is not an option: closed is the only alternative to In Progress. Created attachment 179238 [details]
Test patch to preserve SPRG[1-3]
This is an untested patch, combining Nathan's work and Mark's work. Everyone please test this on all varieties of G5's (7,x and 11,x).
Created attachment 179258 [details]
Corrected test patch
The previous patch was obviously wrong, and I didn't test compile it (wrong asm arg indices). This one's been compile tested.
Comment on attachment 179258 [details]
Corrected test patch
As for the result of the corrected test patch. . .
It appears that:
static __inline void
ofw_sprg_prepare(void)
. . .
static __inline void
ofw_sprg_restore(void)
. . .
are only defined under an earlier:
#ifdef AIM
extern register_t ofmsr[5];
extern void *openfirmware_entry;
char save_trap_init[0x2f00]; /* EXC_LAST */
char save_trap_of[0x2f00]; /* EXC_LAST */
int ofwcall(void *);
static int openfirmware(void *args);
. . . (ofw_sprg_<?>'s defined in here) . . .
#ifndef __powerpc64__
__asm __volatile("mtsprg0 %0" :: "r"(ofw_sprg0_save));
#endif
}
#endif
(Matching the ifdef AIM if I checked right.)
But of following parts are outside any AIM
definition requirement:
static int
openfirmware_core(void *args)
{
. . .
ofw_sprg_prepare();
. . .
ofw_sprg_restore();
. . .
}
So without AIM defined the code looks like
it would not compile. (I have AIM defined
for my context.)
(In reply to Mark Millard from comment #12) openfirmware_core() is only defined behind a #ifdef AIM (#ifdef is right over ofw_quiesce()). (In reply to Justin Hibbits from comment #13) That makes the following code in openfirmware_core odd: #if defined(AIM) && !defined(__powerpc64__) /* * Clear battable[] translations */ if (!(cpu_features & PPC_FEATURE_64)) __asm __volatile("mtdbatu 2, %0\n" "mtdbatu 3, %0" : : "r" (0)); isync(); #endif No need to test AIM. I assumed too much from seeing that AIM test. Sorry for the noice. (In reply to Justin Hibbits from comment #11) Context: PowerMac G5 so-called "Quad Core", full 16 GByte of RAM. (Not the most interesting but the only powerpc64 context I currently have access to.) I've had no problems with the classic boot hangups and crashes since I switched to your code (replacing my hack). That even includes during the kernel version bisection effort that eventually found recent locking changes in the kernel (after -r313266) lead to randomly occurring panics and hangups after booting. It is still unknown when I'll again get access to the other, older type of PowerMac G5 (a dual processor, single core per processor G5). The attached patch (2017-01-23) works for me so far. Quad 14GB. A commit references this bug: Author: jhibbits Date: Tue Mar 7 22:11:58 UTC 2017 New revision: 314885 URL: https://svnweb.freebsd.org/changeset/base/314885 Log: Fix booting with >4GB RAM on PowerMac G5 hardware === From Nathan Whitehorn: Open Firmware runs in virtual mode on the Powermac G5. This runs inside the kernel page table, which preserves all address translations made by OF before the kernel starts; as a result, the kernel address space is a strict superset of OF's. Where this explodes is if OF uses an unmapped SLB entry. The SLB fault handler runs in real mode and refers to the PCPU pointer in SPRG0, which blows up the kernel. Having a value of SPRG0 that works for the kernel is less fatal than preserving OF's value in this case. === The result of this is seemingly random panics from NULL dereferences, or hangs immediately upon boot. By not restoring SPRG0 for Open Firmware entry the kernel PCPU pointer is preserved and SLB faults are successful, resulting in a stable kernel. PR: 205458 Reported by: several (over bugzilla, lists, IRC) Reviewed by: andreast Tested by: many (various forms) MFC after: 2 weeks Changes: head/sys/powerpc/ofw/ofw_machdep.c A commit references this bug: Author: jhibbits Date: Sat Apr 1 18:52:49 UTC 2017 New revision: 316367 URL: https://svnweb.freebsd.org/changeset/base/316367 Log: MFC r314885: Fix booting with >4GB RAM on PowerMac G5 hardware === From Nathan Whitehorn: Open Firmware runs in virtual mode on the Powermac G5. This runs inside the kernel page table, which preserves all address translations made by OF before the kernel starts; as a result, the kernel address space is a strict superset of OF's. Where this explodes is if OF uses an unmapped SLB entry. The SLB fault handler runs in real mode and refers to the PCPU pointer in SPRG0, which blows up the kernel. Having a value of SPRG0 that works for the kernel is less fatal than preserving OF's value in this case. === The result of this is seemingly random panics from NULL dereferences, or hangs immediately upon boot. By not restoring SPRG0 for Open Firmware entry the kernel PCPU pointer is preserved and SLB faults are successful, resulting in a stable kernel. PR: 205458 Changes: _U stable/11/ stable/11/sys/powerpc/ofw/ofw_machdep.c MFC'd to stable/11 r316367. This patch doesn't apply directly to stable/10, so unless there's sufficient interest, I don't plan to MFC back to stable/10. (In reply to Justin Hibbits from comment #19) It turns out that if TARGET_ARCH=powerpc is to work on powerpc64 PowerMac's reliably then it needs to detect the context and also avoid the sprg0 update. It turns out that the problems and evidence that I've been reporting are from the code restoring the openfirmware sprg0 lile the TARGET_ARCH=powerpc64 code used to do. I'll submit a separate, new report. But I figured the relationship would be good to note here. |