Bug 239245 - r350074 will panic on ppc64 PowerMac G5 in vm_phys_enqueue_contig
Summary: r350074 will panic on ppc64 PowerMac G5 in vm_phys_enqueue_contig
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: powerpc Any
Importance: --- Affects Some People
Assignee: freebsd-bugs mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-07-16 10:19 UTC by Dennis Clarke
Modified: 2019-08-12 11:38 UTC

See Also:


Attachments
cell phone photo of panic seen on console of PowerMac G5 quad (463.81 KB, image/png)
2019-07-16 10:26 UTC, Dennis Clarke
r350026 panic is consistent (373.33 KB, image/png)
2019-07-16 13:10 UTC, Dennis Clarke
r350074 does the same panic (397.18 KB, image/png)
2019-07-17 03:41 UTC, Dennis Clarke
OpenFirmware messages from PowerMac G5 quad during boot (358.48 KB, image/png)
2019-07-19 23:25 UTC, Dennis Clarke

Description Dennis Clarke 2019-07-16 10:19:50 UTC
As seen also by Francis Little while working on bugid 238730 :

KDB: debugger backends: ddb
KDB: current backend: ddb
---<<BOOT>>---
Copyright (c) 1992-2019 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-CURRENT #0 r350018M: Mon Jul 15 23:32:27 GMT 2019
    root@hydra:/usr/obj/usr/src/r350018/powerpc.powerpc64/sys/GENERIC64 powerpc
gcc version 4.2.1 20070831 patched [FreeBSD]
WARNING: WITNESS option enabled, expect reduced performance.
panic: segind 127 m 0xc000000273832640
cpuid = 0
time = 1
KDB: stack backtrace:
0xe000000000008500: at .kdb_backtrace+0x5c
0xe000000000008630: at .vpanic+0x1b4
0xe0000000000086f0: at .panic+0x38
0xe000000000008780: at .vm_phys_enqueue_contig+0x64
0xe000000000008850: at .vm_page_startup+0x7b4
0xe000000000008920: at .vm_mem_init+0x30
0xe0000000000089b0: at .mi_startup+0x1f8
0xe000000000008a50: at .btext+0xc4
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at      .kdb_enter+0x60:        ld      r2, r1, 0x28
db>

Carefully transcribed from a photo taken with a cell phone.
See attached image and please advise on any error and I hope
that Francis Little can confirm that he sees the exact same
panic on similar PowerMac G5 quad hardware with 8GB memory.


-- 
Dennis Clarke
RISC-V/SPARC/PPC/ARM/CISC
UNIX and Linux spoken
GreyBeard and suspenders optional
Comment 1 Dennis Clarke 2019-07-16 10:26:39 UTC
Created attachment 205817
cell phone photo of panic seen on console of PowerMac G5 quad

As seen this morning and then carefully transcribed into this bug report.
Comment 2 Dennis Clarke 2019-07-16 11:01:40 UTC
Also see a similar ( nearly identical ) panic report from Francis Little : 

https://bz-attachments.freebsd.org/attachment.cgi?id=205809


Carefully looking at the two panic reports side by side I see that
there are very subtle diffs. 

The segind number is 125 for Francis and 127 for mine.  Is this a
segment index number? 

The address difference between the two is : 

    Francis  0xc000000273833710
    My G5    0xc000000273832640
    ----------------------------
                           10d0

I was hoping for some perfect multiple of 0x100 or similar. 

Also for Francis he has the exact same stack backtrace however the 
"Stopped at" is different likely due to the segind 125 address being
different.
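
The subtraction above can be double-checked with a few lines of Python. This is only an arithmetic sketch: the two addresses are the ones quoted from the panic messages, and the helper name is_multiple is made up for illustration.

```python
# Page addresses as printed in the two panic messages quoted above.
FRANCIS_M = 0xc000000273833710
DENNIS_M = 0xc000000273832640


def is_multiple(value, unit):
    """Return True when value is an exact multiple of unit (hypothetical helper)."""
    return value % unit == 0


delta = FRANCIS_M - DENNIS_M
print(hex(delta))                 # difference between the two addresses
print(is_multiple(delta, 0x100))  # is it a clean multiple of 0x100?
```

The difference is indeed 0x10d0, and it is not a multiple of 0x100, which agrees with the hand calculation.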
Comment 3 Dennis Clarke 2019-07-16 13:10:42 UTC
Created attachment 205819
r350026 panic is consistent

The exact same panic with r350026
Comment 4 Kyle Evans freebsd_committer 2019-07-16 13:16:13 UTC
CC'ing dougm@ and alc@, as this backtrace seems to only be possible after r348484
Comment 5 Mark Johnston freebsd_committer 2019-07-16 14:31:52 UTC
(In reply to Kyle Evans from comment #4)
I think the real problem is the very large "segind" value.  VM_PHYSSEG_MAX is 16 on powerpc.  The physical memory segments are initialized based on phys_avail[] in vm_page_startup().

Can we see the output of a verbose boot ("boot -v" at the loader)?  I would in particular like to see the output of this chunk of code:

        if (bootverbose) {
                int indx;

                printf("Physical memory chunk(s):\n");
                for (indx = 0; phys_avail[indx + 1] != 0; indx += 2) {
                        vm_paddr_t size1 =
                            phys_avail[indx + 1] - phys_avail[indx];

#ifdef __powerpc64__
                        printf("0x%016jx - 0x%016jx, %ju bytes (%ju pages)\n",
#else
                        printf("0x%09jx - 0x%09jx, %ju bytes (%ju pages)\n",
#endif
                            (uintmax_t)phys_avail[indx],
                            (uintmax_t)phys_avail[indx + 1] - 1,
                            (uintmax_t)size1, (uintmax_t)size1 / PAGE_SIZE);
                }
        }
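
For anyone who wants to see what that loop reports without reading kernel source, here is a rough userland model in Python. It is a sketch, not kernel code: the sample phys_avail contents and the 4096-byte page size are assumptions for illustration, not values from either failing machine, and memory_chunks and format_chunk are made-up names.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def memory_chunks(phys_avail, page_size=PAGE_SIZE):
    """Walk (start, end) pairs until a pair whose end address is 0."""
    chunks = []
    indx = 0
    while phys_avail[indx + 1] != 0:
        start, end = phys_avail[indx], phys_avail[indx + 1]
        size = end - start
        # Same four values the kernel printf receives.
        chunks.append((start, end - 1, size, size // page_size))
        indx += 2
    return chunks


def format_chunk(chunk):
    """Format one chunk the way the powerpc64 branch of the printf does."""
    start, last, size, pages = chunk
    return "0x%016x - 0x%016x, %d bytes (%d pages)" % (start, last, size, pages)


# Hypothetical table: two ranges followed by the zero terminator pair.
sample_phys_avail = [0x3000, 0x10000, 0x100000, 0x200000, 0, 0]

print("Physical memory chunk(s):")
for chunk in memory_chunks(sample_phys_avail):
    print(format_chunk(chunk))
```

Each line of the real "boot -v" output corresponds to one such pair, so the number of lines shows how many ranges phys_avail[] holds.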
Comment 6 Francis Little 2019-07-16 15:27:16 UTC
(In reply to Dennis Clarke from comment #0)
Hi, yes, I am on G5 Quad with 8GB Ram.

Currently going back to r345425 before I try anything newer!
Comment 7 Mark Millard 2019-07-16 18:41:37 UTC
(In reply to Mark Johnston from comment #5)

I tried but failed to repeat the original problem first: it
booted just fine, making such output from my context probably
not all that useful. Details follow.

I grabbed materials from:

https://artifact.ci.freebsd.org/snapshot/head/r350055/powerpc/powerpc64/

and put them on an SSD partition for a 16 GiByte PowerMac G5
quad. It was unchanged, not even editing of /etc/fstab :
just tar expansion of the .txz files after a newfs -U -t .
(So I had to enter ufs:/dev/ada0s3 when prompted.)

But it booted just fine. Interestingly, uname -apKU reports
itself as being r350056 (not r350055). The revision.txt
showed r350055. (So artifact's version indications can be
wrong?)

Possibilities for why it boots include:

A) The amount of RAM matters.
B) Some FreeBSD configuration that causes differing activity
   matters.
C) ?
Comment 8 Mark Johnston freebsd_committer 2019-07-16 19:07:26 UTC
Alan suggested by mail that this might be the result of incomplete recompilation after r349846, which changed the layout of struct vm_page and in particular changed the offset of the segind field whose value is triggering this panic.

To rule this out, please try compiling a kernel without -DKERNFAST or -DNO_CLEAN.
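
To illustrate why a stale object file could produce a nonsense segind, here is a small Python sketch using ctypes. The two structures are deliberately tiny, hypothetical layouts and are NOT the real struct vm_page before or after r349846; the value 99 is arbitrary. The only value taken from this thread is VM_PHYSSEG_MAX = 16.

```python
import ctypes

VM_PHYSSEG_MAX = 16  # value quoted for powerpc earlier in this thread


class PageOldLayout(ctypes.Structure):
    # Hypothetical "before" layout: segind lives at byte offset 2.
    _fields_ = [
        ("flags", ctypes.c_uint16),
        ("segind", ctypes.c_uint8),
        ("order", ctypes.c_uint8),
    ]


class PageNewLayout(ctypes.Structure):
    # Hypothetical "after" layout: segind moved to byte offset 3.
    _fields_ = [
        ("flags", ctypes.c_uint16),
        ("order", ctypes.c_uint8),
        ("segind", ctypes.c_uint8),
    ]


def read_with_stale_layout(page):
    """Reinterpret the bytes of a new-layout page using the old layout."""
    return PageOldLayout.from_buffer_copy(bytes(page))


# Code built against the new layout stores a valid segment index...
new_page = PageNewLayout(flags=0, order=99, segind=3)
# ...but code built against the old layout reads a different byte.
stale_view = read_with_stale_layout(new_page)

print(PageOldLayout.segind.offset, PageNewLayout.segind.offset)
print(stale_view.segind, stale_view.segind < VM_PHYSSEG_MAX)
```

In this toy case the stale reader picks up the order byte instead of segind and gets a value above VM_PHYSSEG_MAX. A full clean rebuild removes that possibility, which is why the request above asks for one.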
Comment 9 Mark Millard 2019-07-16 19:34:59 UTC
(In reply to Mark Millard from comment #7)

Side note on versions from artifacts:
(to avoid confusions from the oddity)

The r350056 from uname -apKU is even odder:
head goes from r350055 to the next being
r350057.

r350056 is listed in: svn-src-stable-11/2019-July/ .

The uname -apKU output does list:

13.0-CURRENT

and for the KU part:

1300036
1300036

(So not 11 based.)

It appears that the build picked up the wrong
version number from svn, not one listed under
svn-src-head.
Comment 10 Dennis Clarke 2019-07-16 21:50:05 UTC
(In reply to Mark Johnston from comment #5)
I want to jump onto your "chunk of code" but where do I find that?

I can grep around I guess. 

Also, yes, this is always a boot -sv because I have little hopes of
even getting past the kernel loading into memory and beginning to
map out the system memory.  May as well be verbose but I doubt the
request for verbosity means anything given that we don't get very
far.
Comment 11 Dennis Clarke 2019-07-16 21:51:10 UTC
(In reply to Francis Little from comment #6)

Excellent idea ... however I will stay at the bleeding edge.

Perhaps you and I will bracket this sucker with an attack from
both ends until we hit a running kernel revision without the 
panic and then we can do a bisect that is not monstrous.
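
A minimal sketch of that bracket-then-bisect idea, in Python. It assumes a sorted list of candidate head revisions (head revision numbers are not contiguous, as noted in comment 9), with the first known good and the last known bad. The function name first_bad_revision and the is_bad callback are made up; in practice "is_bad" means building and booting that revision by hand.

```python
def first_bad_revision(revisions, is_bad):
    """Binary search for the earliest bad revision.

    revisions: ascending list; revisions[0] is known good and
               revisions[-1] is known bad (assumption of this sketch).
    is_bad:    callable returning True when a revision panics.
    """
    lo, hi = 0, len(revisions) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# Made-up test data: pretend everything from 1040 onward panics.
candidates = [1000, 1010, 1025, 1040, 1055, 1070]
print(first_bad_revision(candidates, lambda rev: rev >= 1040))
```

With N candidate revisions this needs roughly log2(N) build-and-boot cycles, which matters when each build takes well over an hour.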
Comment 12 Dennis Clarke 2019-07-16 22:07:51 UTC
(In reply to Mark Johnston from comment #8)

result of incomplete recompilation  ?

Actually I svn checkout the entire source every time. 
I did try the -DNO_CLEAN once or twice but it seems to
have little or zero effect on the build time. This is a
one hour and 20 minute build every time. Closer to 2 hours
actually and it is from the top and beginning with nothing
from the past every time. 

Also I revert back to :

hydra$ uname -a 
FreeBSD hydra 12.0-RELEASE FreeBSD 12.0-RELEASE r341666 GENERIC  powerpc
hydra$ sysctl -a | grep 'smp'
kern.smp.maxid: 3
kern.smp.maxcpus: 256
kern.smp.active: 0
kern.smp.disabled: 1
kern.smp.cpus: 1
kern.smp.topology: 0
kern.smp.forward_signal_enabled: 1
hydra$ 

So I have a single core working. Nothing more. Not even a fast disk.
Just the original Apple type disk in there. Also 8GB of memory.

This is from the top every time. 

Dennis Clarke
Comment 13 Dennis Clarke 2019-07-17 01:07:13 UTC
(In reply to Mark Johnston from comment #5)
Sorry Mark but it is not clear where you want that code chunk.

Inside vm_page_init_cache_zones() of sys/vm/vm_page.c ??
Comment 14 Mark Millard 2019-07-17 01:23:40 UTC
(In reply to Dennis Clarke from comment #12)

If you can set up independent media for it, I recommend
that you try one of the fairly modern:

https://artifact.ci.freebsd.org/snapshot/head/r*/powerpc/powerpc64/

and compare it to what happens for what you build (for the same r*).

As reported in comment #7, when I tried:

https://artifact.ci.freebsd.org/snapshot/head/r350055/powerpc/powerpc64/

the result boots just fine on the G5 quad that I currently
have access to.

(The apple_boot partition and its contents had been previously
established and were not updated.)

It might even be an idea to deliberately try r350055, matching
my experiment.

(Be warned: the likes of uname -apKU will report an inaccurate
r350056.)
Comment 15 Mark Millard 2019-07-17 01:25:10 UTC
(In reply to Dennis Clarke from comment #13)

If I read Mark J. correctly, the code chunk is already
in place. It is the boot -v output from the code chunk
that he wanted extracted and provided.
Comment 16 Dennis Clarke 2019-07-17 01:59:10 UTC
(In reply to Mark Millard from comment #15)

Oh.  Well I have no idea how to get that boot -v output other than
to take a stab at this r350066 which is building now and see what happens.
Comment 17 Mark Millard 2019-07-17 03:30:27 UTC
(In reply to Dennis Clarke from comment #16)

r350066 is a change to stable/11/sys/modules/spigen/Makefile .

Is there a typo in your version number?
Comment 18 Dennis Clarke 2019-07-17 03:34:37 UTC
(In reply to Mark Millard from comment #17)
r350066 is what I see on my most recent checkout.
Comment 19 Dennis Clarke 2019-07-17 03:41:20 UTC
(In reply to Mark Millard from comment #17)
Sorry .. I spoke too soon. 

I have r350074 here which does the exact same panic in the exact
same way and "boot -sv" means nothing and reveals nothing.
Comment 20 Dennis Clarke 2019-07-17 03:41:47 UTC
Created attachment 205831
r350074 does the same panic

r350074 does the same panic
Comment 21 Mark Millard 2019-07-17 03:55:59 UTC
(In reply to Dennis Clarke from comment #19)

Could you try:

https://artifact.ci.freebsd.org/snapshot/head/r350074/powerpc/powerpc64/

and see if the FreeBSD build that was done on the
FreeBSD server also gets the problem?
Comment 22 Mark Millard 2019-07-17 04:32:34 UTC
(In reply to Dennis Clarke from comment #19)

FYI:

https://artifact.ci.freebsd.org/snapshot/head/r350074/powerpc/powerpc64/

worked fine booting the G5 quad that I currently have
access to.
Comment 23 Francis Little 2019-07-17 05:04:09 UTC
(In reply to Mark Millard from comment #22)
Hi Mark, is there any chance you can try that with only 8GB ram in the machine?

I think both Dennis and I are on 8G in our quads.
Comment 24 Mark Millard 2019-07-17 07:25:20 UTC
(In reply to Francis Little from comment #23)

Maybe at some point I can remove some RAM for a test,
I'm not sure.

But a better test would likely be for a known-failing
context to try the:

https://artifact.ci.freebsd.org/snapshot/head/r350074/powerpc/powerpc64/

materials (instead of a personal/local build).
Comment 25 Francis Little 2019-07-18 09:05:51 UTC
(In reply to Mark Millard from comment #24)

Hi, extracting the sets from:

https://artifact.ci.freebsd.org/snapshot/head/r350074/powerpc/powerpc64/

to my disk just results in a black screen after the boot loader and the machine locks.

Trying boot -sv still just gives a black screen.

I may be doing something wrong... so will try a few other versions.

Similarly, burning:

http://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/13.0/FreeBSD-13.0-CURRENT-powerpc-powerpc64-20190711-r349909-disc1.iso

to a DVD, the boot loader comes up, but after pressing Enter it's just a blank screen and the machine locks.
Comment 26 Dennis Clarke 2019-07-18 20:13:24 UTC
(In reply to Francis Little from comment #25)

I am working with a PowerMac G5 here with 8GB of memory and just a very
trivial baseline install from the 12-RELEASE dvd. That means nothing
from ports and not even pkg other than the stub in place. 

So it took me a moment to figure out the command line for "fetch" which
seems to be a little brother to big bad 'wget' : 

hydra# pwd
/usr/src/artifact
hydra# 
hydra# fetch --no-verify-peer https://artifact.ci.freebsd.org/snapshot/head/r350074/powerpc/powerpc64/kernel.txz
kernel.txz                                     24 MB   12 MBps    02s

hydra# ls 
kernel.txz
hydra# xz -dc kernel.txz | tar -xf - 
hydra# ls
boot            kernel.txz      usr
hydra# 
hydra# 
hydra# uname -a 
FreeBSD hydra 12.0-RELEASE FreeBSD 12.0-RELEASE r341666 GENERIC  powerpc
hydra# mv /boot/kernel /boot/kernel_r341666
hydra# mv boot/kernel /boot 
hydra# 

That should be all I need.  However what is this stuff ? 

hydra# ls -la usr/lib/debug/boot/kernel/
total 824
drwxr-xr-x  2 root  wheel     512 Jul 17 02:22 .
drwxr-xr-x  3 root  wheel     512 Jul 17 02:22 ..
-r-xr-xr-x  1 root  wheel  543800 Jul 17 02:22 fuse.ko
-r-xr-xr-x  1 root  wheel  121696 Jul 17 02:22 if_tap.ko
-r-xr-xr-x  1 root  wheel  121696 Jul 17 02:22 if_tun.ko
hydra# 

I have no idea about those. I have my own already and I will save them :

hydra# 
hydra# cp -p /usr/lib/debug/boot/kernel/fuse.ko /usr/lib/debug/boot/kernel/fuse.ko_mine
hydra# cp -p /usr/lib/debug/boot/kernel/if_tap.ko /usr/lib/debug/boot/kernel/if_tap.ko_mine
hydra# cp -p /usr/lib/debug/boot/kernel/if_tun.ko /usr/lib/debug/boot/kernel/if_tun.ko_mine
hydra# 
hydra# cp -p usr/lib/debug/boot/kernel/* /usr/lib/debug/boot/kernel/
hydra# 

Fine so .. that should be it for this "artifact" kernel stuff that we
did not need to compile ourselves but you and I are clearly hitting a
panic over and over. So I reboot now and interrupt the boot loader to
set kern.smp.disabled=1 and guess what ? 

The damn thing boots. 

Moments like this cause me to lose faith in compilers and source code.

Where was this kernel compiled and why is it different from what we have
carefully home baked with just "make kernel" ? 

I can ssh in from another machine just fine and look around :

hydra# 
hydra# uname -a 
FreeBSD hydra 13.0-CURRENT FreeBSD 13.0-CURRENT r350074 GENERIC  powerpc
hydra# 
hydra# sysctl -a | grep 'smp'
kern.smp.maxid: 3
kern.smp.maxcpus: 256
kern.smp.active: 0
kern.smp.disabled: 1
kern.smp.cpus: 1
kern.smp.threads_per_core: 1
kern.smp.cores: 1
kern.smp.topology: 0
kern.smp.forward_signal_enabled: 1
"devfs","crossmp"
hydra# 
hydra# cat /etc/rc.conf
clear_tmp_enable="YES"
syslogd_flags="-ss"
hostname="hydra"
ifconfig_bge0="inet 172.16.35.8 netmask 0xffffffc0"
defaultrouter="172.16.35.1"
sshd_enable="YES"
ntpd_enable="YES"
# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="AUTO"
hydra# 

At this point I want to try a "shutdown -r 'now' "  and see what smoke
comes out.

hydra# 
hydra# shutdown -r 'now' 
Shutdown NOW!
shutdown: [pid 1074]
hydra#                                                                                
*** FINAL System shutdown message from dclarke@hydra ***                     

System going down IMMEDIATELY                                                  

                                                                               

System shutdown time has arrived
Connection to 172.16.35.8 closed by remote host.
Connection to 172.16.35.8 closed.
.
.
.

Now I watch as the machine does the usual openfirmware bits and just let
it boot with zero input from me. 

Boots fine.  All cores available : 

hydra# uname -a 
FreeBSD hydra 13.0-CURRENT FreeBSD 13.0-CURRENT r350074 GENERIC  powerpc
hydra# uptime
 8:10PM  up 58 secs, 1 user, load averages: 0.57, 0.21, 0.08
hydra# sysctl -a | grep 'smp'
kern.smp.maxid: 3
kern.smp.maxcpus: 256
kern.smp.active: 1
kern.smp.disabled: 0
kern.smp.cpus: 4
kern.smp.threads_per_core: 1
kern.smp.cores: 4
kern.smp.topology: 0
kern.smp.forward_signal_enabled: 1
"devfs","crossmp"
hydra# 

baffled. 

Dennis
Comment 27 Dennis Clarke 2019-07-18 20:50:19 UTC
Using the tarballs in https://artifact.ci.freebsd.org/snapshot/head/r350114/powerpc/powerpc64/  I am able to boot just fine : 


hydra# 
hydra# uname -a 
FreeBSD hydra 13.0-CURRENT FreeBSD 13.0-CURRENT r350114 GENERIC  powerpc
hydra# uptime
 8:45PM  up 2 mins, 1 user, load averages: 0.44, 0.42, 0.19

hydra# sysctl -a | grep 'smp'
kern.smp.maxid: 3
kern.smp.maxcpus: 256
kern.smp.active: 1
kern.smp.disabled: 0
kern.smp.cpus: 4
kern.smp.threads_per_core: 1
kern.smp.cores: 4
kern.smp.topology: 0
kern.smp.forward_signal_enabled: 1
"devfs","crossmp"
hydra# 

So I am hearing in various places ( irc #bsdmips on efnet ) that the
kernel builds done for artifact.ci.freebsd.org/snapshot/head are actual
cross compiles done on an x86_64 boxen with something like : 

    make buildworld TARGET=powerpc TARGET_ARCH=powerpc64

So this is bothersome as it suggests that the native build process on a
ppc64 boxen is somehow slightly broken?


-- 
Dennis Clarke
RISC-V/SPARC/PPC/ARM/CISC
UNIX and Linux spoken
GreyBeard and suspenders optional
Comment 28 Mark Millard 2019-07-19 04:59:02 UTC
(In reply to Dennis Clarke from comment #26)

Mixing a 12 world with a 13 kernel is not guaranteed to work
in general.

Generally uname -apKU should show matching numbers at the end
(the KU part), for example: 1300036 1300036 . (The U is for
User instead of Kernel as I understand.)

When they are not the same it takes more detail about the
differences to know what specifics to expect. It can
vary widely, from mostly operable to failing to finish
booting.

Still, that it did boot is rather good evidence.

If it had failed when or after starting the world stage of
the activity, the reason would have been less clear.
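
The kernel/user comparison can be automated with a short Python sketch. It assumes the last two whitespace-separated fields of "uname -apKU" output are the kernel and user version numbers, and that the major release is the number divided by 100000 (so 1300036 gives 13). The function names are made up, and the second sample line is a hypothetical mismatch.

```python
def kernel_user_versions(uname_line):
    """Return (kernel, user) version numbers from a 'uname -apKU' line."""
    fields = uname_line.split()
    return int(fields[-2]), int(fields[-1])


def major(version):
    """Major release encoded in a FreeBSD version number (assumed rule)."""
    return version // 100000


def describe(uname_line):
    kernel, user = kernel_user_versions(uname_line)
    return {
        "kernel": kernel,
        "user": user,
        "match": kernel == user,
        "kernel_major": major(kernel),
        "user_major": major(user),
    }


matched = "FreeBSD host 13.0-CURRENT FreeBSD 13.0-CURRENT GENERIC powerpc powerpc64 1300036 1300036"
mixed = "FreeBSD host 13.0-CURRENT FreeBSD 13.0-CURRENT GENERIC powerpc powerpc64 1300036 1200086"

print(describe(matched))
print(describe(mixed))
```

A mismatch in the major numbers (13 kernel on a 12 world) is the mixed situation described above.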
Comment 29 Mark Millard 2019-07-19 05:09:08 UTC
(In reply to Mark Millard from comment #28)

I should have mentioned that the .txz files
can be expanded by commands of the form
(for updating a distinct partition mounted
at /mnt):

tar -xpf FILE.txz -C/mnt

(Live replacement of world (base) code by using -C/
can be problematical.)

I'll note that for base.txz (world) various files
cannot be replaced unless something like:

chflags -R noschg /mnt

had been done first to allow the replacements.

I tend to do a newfs before mounting to /mnt
instead.

The debug information for the build goes in a
directory tree under:

/mnt/usr/lib/debug/

(or /usr/lib/debug as seen when later booted).

Both the kernel and base *-dbg.txz material
goes there.
Comment 30 Mark Millard 2019-07-19 05:34:57 UTC
(In reply to Dennis Clarke from comment #27)

> So this is bothersome as it suggests that the native build process on a
> ppc64 boxen is somehow slightly broken?

My suggestion, if you can do it, is to use a:

https://artifact.ci.freebsd.org/snapshot/head/r*/powerpc/powerpc64/

(with r* being after r349846) to establish a working
G5 quad environment, and then see if it can rebuild
itself from scratch and update itself and then boot.

My guess is that it will and that the problem is related
to jumping across some change in an inappropriate way,
not that I know what went wrong for self-hosted.

[I've had a couple of days with FreeBSD activity largely
blocked. It looks like I might have only limited time
for another day or more.]
Comment 31 Francis Little 2019-07-19 09:36:16 UTC
So using the method of extracting the kernel from ci.freebsd.org over my install (I use a 12-STABLE r349903 ISO to install), I am able to boot an r350114 kernel by setting

set usefdt=1

at the boot loader, without that, I get a black screen and the machine locks.

root@PowerMacG5:~ # uname -apKU
FreeBSD PowerMacG5 13.0-CURRENT FreeBSD 13.0-CURRENT r350114 GENERIC  powerpc powerpc64 1300036 1200513

root@PowerMacG5:~ # sysctl -a | grep smp
kern.smp.maxid: 3
kern.smp.maxcpus: 256
kern.smp.active: 1
kern.smp.disabled: 0
kern.smp.cpus: 4
kern.smp.threads_per_core: 1
kern.smp.cores: 4

I get the NIC reorder, but can manage that.

Next, I will get a full kernel, world, lib32 etc of r350114 from artifact.ci.freebsd.org extracted to the drive, checkout r350114 to src and re-build everything and see what happens.
Comment 32 Francis Little 2019-07-19 10:18:35 UTC
Also, the snapshot linked below, when I interrupt the loader and set usefdt=1, boots to the installer and installs a bootable system without a panic:

http://ftp.freebsd.org/pub/FreeBSD/snapshots/powerpc/powerpc64/ISO-IMAGES/13.0/FreeBSD-13.0-CURRENT-powerpc-powerpc64-20190718-r350103-disc1.iso
Comment 33 Francis Little 2019-07-19 13:59:27 UTC
So reinstalling with the FreeBSD-13.0-CURRENT-powerpc-powerpc64-20190718-r350103-disc1.iso and doing a buildworld / kernel etc from the r350114 svn sources has produced a workable G5 Quad!

Booting using usefdt=1

So it looks like it may be a case of skipping over to a recent CURRENT build via a snapshot ISO or from artifact.ci.freebsd.org before rebuilding world etc.
Comment 34 Mark Millard 2019-07-19 17:11:21 UTC
(In reply to Francis Little from comment #31)

If I read Dennis Clarke's material correctly,
he is not using usefdt=1 . Also, as of r347463,
the known issue for which usefdt=1 was being
used was handled and use of usefdt mode was not
known to be required any more.

So you may have run into something new. If so,
figuring out what is distinct between Dennis's
and your contexts that makes the difference
would be important to isolating the problem
that is avoided by usefdt=1 for you.

(Past activity in my context does not suggest
that I'd get a repeat of your result. But I'll
likely not have time to set up and perform any
tests soon.)
Comment 35 Francis Little 2019-07-19 18:04:54 UTC
(In reply to Mark Millard from comment #34)

That is a good question, are the Quads all the same?

My OFW reports: 

Apple PowerMac11,2 5.2.7f1 BootROM Built on 09/30/05 at 15:31:03

Dennis, do you have the same OFW version?
Comment 36 Mark Millard 2019-07-19 18:51:53 UTC
(In reply to Francis Little from comment #35)

For my context:

Apple PowerMac11,2 5.2.7f1 BootROM built on 09/30/05 at 15:31:03

What is added to the quads could matter. I've got:

16 GiByte of ECC RAM

vgapci0@pci0:10:0:0:	class=0x030000 card=0x005210de chip=0x009210de rev=0xa1 hdr=0x00
    vendor     = 'NVIDIA Corporation'
    device     = 'G70 [GeForce 7800 GT]'
    class      = display
    subclass   = VGA

One or two SSDs on the SATA connections.

Nothing else added inside. Outside:

One Ethernet cable connected
one USB keyboard connected
one display connected to the GeForce 7800 GT

Nothing else connected outside.


My reported boots of r350055 and r350074 from artifact.ci
were without usefdt mode being enabled (or anything else
being set specially).



I'll note that I patched usefdt code in my personal builds
because at the time I had access to:

2 distinct types of 64-bit PowerMacs
3 or so types of 32-bit PowerMacs, including one iMac G3.

but usefdt mode could not boot most of them, but could for
the two examples of G5 quads. Getting them to all boot led
to what I've reported about usefdt mode. It also showed that
the quads booting was lucky based on the types of problems
usefdt has for PowerMacs. This was before the fix that has
(generally?) avoided needing usefdt mode.

(I ignore here one type of G4 that I had access to that I've
never gotten FreeBSD to boot. I've only sometimes had access
to one specific machine of that type.)
Comment 37 Francis Little 2019-07-19 19:54:47 UTC
(In reply to Mark Millard from comment #36)

So for mine it's all fairly standard stuff and not a million miles from yours:

Apple PowerMac11,2 5.2.7f1 BootROM Built on 09/30/05 at 15:31:03

cpu0: IBM PowerPC 970MP revision 1.1, 2500.36 MHz
cpu0: Features dc000000<PPC32,PPC64,ALTIVEC,FPU,MMU>
cpu0: HID0 1511081<DEEPNAP,NAP,DPM,NHR,TBEN,ENATTN>
real memory  = 8539688960 (8144 MB)
avail memory = 8131739648 (7755 MB)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs

ada0: <ST31000524AS JC45> ATA8-ACS SATA 3.x device
cd0: <SONY DVD RW DW-Q28A KAS7> Removable CD-ROM SCSI device
ugen1.2: <Mitsumi Electric Hub in Apple Extended USB Keyboard> at usbus1


vgapci0@pci10:10:0:0:	class=0x030000 card=0x001010de chip=0x014110de rev=0xa2 hdr=0x00
    vendor     = 'NVIDIA Corporation'
    device     = 'NV43 [GeForce 6600]'
    class      = display
    subclass   = VGA

1 Display connected

Using Ethernet
Apple USB Keyboard
Generic USB Mouse.

Oddly enough, using 12-R and 12-Stable, I don't need usefdt=1 to run the machine!
Comment 38 Mark Millard 2019-07-19 20:51:27 UTC
(In reply to Francis Little from comment #37)

Your list points out that I forgot to list the CD/DVD:

cd0: <HL-DT-ST DVD-RW GWA-4165B C006> Removable CD-ROM SCSI device

In case you want to see matches to what you were
explicit about:

cpu0: IBM PowerPC 970MP revision 1.1, 2500.31 MHz
cpu0: Features dc000000<PPC32,PPC64,ALTIVEC,FPU,MMU>
cpu0: HID0 1511081<DEEPNAP,NAP,DPM,NHR,TBEN,ENATTN>
real memory  = 17134116864 (16340 MB)
avail memory = 16354693120 (15597 MB)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs

ada0: <OWC Mercury Electra 3G SSD 560ABBF0> ATA8-ACS SATA 2.x device
ada1: <OWC Mercury Electra 3G SSD 560ABBF0> ATA8-ACS SATA 2.x device
ugen0.2: <Kingston HyperX Alloy FPS Mechanical Gaming Keyboard> at usbus0

The RAM, SATA 2 vs. SATA 3 devices, CD/DVD drive, and video card
type seem to be the only possibly-significant differences. But
Dennis Clarke had reported matching your amount of RAM, as I
remember. I wonder what his context will show for the others.
Comment 39 Dennis Clarke 2019-07-19 22:57:48 UTC
(In reply to Francis Little from comment #31)


Regarding the "usefdt=1" there is a blunt comment from Justin Hibbits on
this at the bottom of :

    Bug 233863 - Various PowerMac G5 models may require 
                 kern.smp.disabled=1 and must set usefdt=1 which
                 causes net interface reorder 
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233863

Essentially don't do that. 

So based on the work done there I no longer set it, and the machine works
as expected and the network interfaces don't mystically re-order. 

Let's keep plowing forwards here as this feels like something is wrong
with the native kernel build process but then again it is hard to say
when I have a mixed up kernel and "world" revision. I need to sort that
out pronto.

Francis has a great idea wherein he did an install with a snapshot dvd
image and that may be a good way to go as a starting point *before* any
attempt to build the kernel etc. 

Dennis
Comment 40 Dennis Clarke 2019-07-19 23:13:43 UTC
(In reply to Mark Millard from comment #34)

Yes I think that myself and Francis Little need to get on the same page
in every respect. If we can. I think we have nearly identical hardware
and we certainly do see nearly identical panic data.
Comment 41 Dennis Clarke 2019-07-19 23:25:07 UTC
Created attachment 205912
OpenFirmware messages from PowerMac G5 quad during boot

OpenFirmware messages from PowerMac G5 quad during boot
Comment 42 Dennis Clarke 2019-07-19 23:27:33 UTC
Regarding OpenFirmware the usual boot messages are none too helpful : 

see https://bz-attachments.freebsd.org/attachment.cgi?id=205912

I will shutdown the boxen and try to boot to the Openfirmware screen
and see what I see.
Comment 43 Dennis Clarke 2019-07-19 23:40:29 UTC
(In reply to Francis Little from comment #37)

I think that the three of us have four machines on hand. I have two of
these 64-bit PowerMac G5 types wherein I rarely powerup the smaller of
the pair. However this feels like a good opportunity to employ it. 

Here is a bit of data from the PowerMac G5 "quad" : 

hydra# 
hydra# uname -apKU
FreeBSD hydra 13.0-CURRENT FreeBSD 13.0-CURRENT r350114 GENERIC  powerpc powerpc64 1300036 1200086

    That is a bloody mess I know. 


hydra# pciconf -lv
vgapci0@pci0:10:0:0:    class=0x030000 card=0x001010de chip=0x014110de rev=0xa2 hdr=0x00
    vendor     = 'NVIDIA Corporation'
    device     = 'NV43 [GeForce 6600]'
    class      = display
    subclass   = VGA
pcib2@pci1:0:1:0:       class=0x060400 card=0x00000000 chip=0x01301166 rev=0xa3 hdr=0x01
    vendor     = 'Broadcom'
    device     = 'BCM5780 [HT2000] PCI-X bridge'
    class      = bridge
    subclass   = PCI-PCI
pcib3@pci1:0:2:0:       class=0x060400 card=0x00000000 chip=0x01301166 rev=0xa3 hdr=0x01
    vendor     = 'Broadcom'
    device     = 'BCM5780 [HT2000] PCI-X bridge'
    class      = bridge
    subclass   = PCI-PCI
pcib4@pci1:0:3:0:       class=0x060400 card=0x00000000 chip=0x01321166 rev=0xa3 hdr=0x01
    vendor     = 'Broadcom'
    device     = 'BCM5780 [HT2000] PCI-Express Bridge'
    class      = bridge
    subclass   = PCI-PCI
pcib5@pci1:0:4:0:       class=0x060400 card=0x00000000 chip=0x01321166 rev=0xa3 hdr=0x01
    vendor     = 'Broadcom'
    device     = 'BCM5780 [HT2000] PCI-Express Bridge'
    class      = bridge
    subclass   = PCI-PCI
pcib6@pci1:0:5:0:       class=0x060400 card=0x00000000 chip=0x01321166 rev=0xa3 hdr=0x01
    vendor     = 'Broadcom'
    device     = 'BCM5780 [HT2000] PCI-Express Bridge'
    class      = bridge
    subclass   = PCI-PCI
pcib7@pci1:0:6:0:       class=0x060400 card=0x00000000 chip=0x01321166 rev=0xa3 hdr=0x01
    vendor     = 'Broadcom'
    device     = 'BCM5780 [HT2000] PCI-Express Bridge'
    class      = bridge
    subclass   = PCI-PCI
pcib8@pci1:0:7:0:       class=0x060400 card=0x00000000 chip=0x0053106b rev=0x00 hdr=0x01
    vendor     = 'Apple Inc.'
    device     = 'Shasta PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
pcib9@pci1:0:8:0:       class=0x060400 card=0x00000000 chip=0x0054106b rev=0x00 hdr=0x01
    vendor     = 'Apple Inc.'
    device     = 'Shasta PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
pcib10@pci1:0:9:0:      class=0x060400 card=0x00000000 chip=0x0055106b rev=0x00 hdr=0x01
    vendor     = 'Apple Inc.'
    device     = 'Shasta PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
bge0@pci1:5:4:0:        class=0x020000 card=0x0085106b chip=0x166a14e4 rev=0x03 hdr=0x00
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'NetXtreme BCM5780 Gigabit Ethernet'
    class      = network
    subclass   = ethernet
bge1@pci1:5:4:1:        class=0x020000 card=0x0085106b chip=0x166a14e4 rev=0x03 hdr=0x00
    vendor     = 'Broadcom Inc. and subsidiaries'
    device     = 'NetXtreme BCM5780 Gigabit Ethernet'
    class      = network
    subclass   = ethernet
none0@pci1:2:15:0:      class=0x020000 card=0x00000000 chip=0x0051106b rev=0x00 hdr=0x00
    vendor     = 'Apple Inc.'
    device     = 'Shasta (Sun GEM)'
    class      = network
    subclass   = ethernet
macio0@pci1:1:7:0:      class=0xff0000 card=0x00000000 chip=0x004f106b rev=0x00 hdr=0x00
    vendor     = 'Apple Inc.'
    device     = 'Shasta Mac I/O'
ohci0@pci1:1:11:0:      class=0x0c0310 card=0x00351033 chip=0x00351033 rev=0x43 hdr=0x00
    vendor     = 'NEC Corporation'
    device     = 'OHCI USB Controller'
    class      = serial bus
    subclass   = USB
ohci1@pci1:1:11:1:      class=0x0c0310 card=0x00351033 chip=0x00351033 rev=0x43 hdr=0x00
    vendor     = 'NEC Corporation'
    device     = 'OHCI USB Controller'
    class      = serial bus
    subclass   = USB
ehci0@pci1:1:11:2:      class=0x0c0320 card=0x00e01033 chip=0x00e01033 rev=0x04 hdr=0x00
    vendor     = 'NEC Corporation'
    device     = 'uPD72010x USB 2.0 Controller'
    class      = serial bus
    subclass   = USB
atapci0@pci1:3:12:0:    class=0x01018f card=0x02401166 chip=0x02401166 rev=0x00 hdr=0x00
    vendor     = 'Broadcom'
    device     = 'K2 SATA'
    class      = mass storage
    subclass   = ATA
ata0@pci1:3:13:0:       class=0xff0000 card=0x00000000 chip=0x0050106b rev=0x00 hdr=0x00
    vendor     = 'Apple Inc.'
    device     = 'Shasta IDE'
fwohci0@pci1:3:14:0:    class=0x0c0010 card=0x5811106b chip=0x0052106b rev=0x00 hdr=0x00
    vendor     = 'Apple Inc.'
    device     = 'Shasta Firewire'
    class      = serial bus
    subclass   = FireWire
hydra# 

Next is /var/run/dmesg.boot, which I may as well paste here:

---<<BOOT>>---
Copyright (c) 1992-2019 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-CURRENT r350114 GENERIC powerpc
gcc version 4.2.1 20070831 patched [FreeBSD]
WARNING: WITNESS option enabled, expect reduced performance.
VT(ofwfb): resolution 1280x1024
cpu0: IBM PowerPC 970MP revision 1.1, 2500.34 MHz
cpu0: Features dc000000<PPC32,PPC64,ALTIVEC,FPU,MMU>
cpu0: HID0 1511081<DEEPNAP,NAP,DPM,NHR,TBEN,ENATTN>
real memory  = 8539713536 (8144 MB)
avail memory = 8131776512 (7755 MB)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
random: unblocking device.
random: entropy device external interface
000.000015 [4254] netmap_init               netmap: loaded module
kbd0 at kbdmux0
ofwbus0: <Open Firmware Device Tree> on nexus0
unin0: <Apple UniNorth System Controller> mem 0xf8000000-0xf8ffffff on ofwbus0
unin0: Version 66
pcib0: <IBM CPC945 PCI Express Root> mem 0xf0000000-0xf1ffffff on ofwbus0
pci0: <OFW PCI bus> on pcib0
pcib1: <IBM CPC9X5 HyperTransport Tunnel> mem 0xf2000000-0xf47fffff,0xf8070000-0xf8070fff on ofwbus0
pcib1: 86 HT IRQs on device 7.0
pci1: <OFW PCI bus> on pcib1
pcib1: Enabling MSI window for HyperTransport slave at pci1:0:1:0
pcib2: <OFW PCI-PCI bridge> at device 1.0 on pci1
pci2: <OFW PCI bus> on pcib2
pcib3: <OFW PCI-PCI bridge> at device 2.0 on pci1
pci3: <OFW PCI bus> on pcib3
pcib4: <OFW PCI-PCI bridge> at device 3.0 on pci1
pci4: <OFW PCI bus> on pcib4
pcib5: <OFW PCI-PCI bridge> at device 4.0 on pci1
pci5: <OFW PCI bus> on pcib5
pcib6: <OFW PCI-PCI bridge> at device 5.0 on pci1
pci6: <OFW PCI bus> on pcib6
pcib7: <OFW PCI-PCI bridge> at device 6.0 on pci1
pci7: <OFW PCI bus> on pcib7
pcib8: <OFW PCI-PCI bridge> at device 7.0 on pci1
pci8: <OFW PCI bus> on pcib8
pcib9: <OFW PCI-PCI bridge> at device 8.0 on pci1
pci9: <OFW PCI bus> on pcib9
macio0: <Shasta I/O Controller> mem 0x80000000-0x8007ffff at device 7.0 on pci9
macgpio0: <MacIO GPIO Controller> mem 0x50-0x8a on macio0
pcib10: <OFW PCI-PCI bridge> at device 9.0 on pci1
pci10: <OFW PCI bus> on pcib10
htpic0: <OpenPIC Interrupt Controller> mem 0xf8040000-0xf807ffff on unin0
cpulist0: <Open Firmware CPU Group> on ofwbus0
cpu0: <Open Firmware CPU> on cpulist0
pcr0: <PPC 970 Power Control Register> on cpu0
cpu1: <Open Firmware CPU> on cpulist0
pcr1: <PPC 970 Power Control Register> on cpu1
cpu2: <Open Firmware CPU> on cpulist0
pcr2: <PPC 970 Power Control Register> on cpu2
cpu3: <Open Firmware CPU> on cpulist0
pcr3: <PPC 970 Power Control Register> on cpu3
powermac_nvram0: <Apple NVRAM> mem 0xfff04000-0xfff07fff on ofwbus0
powermac_nvram0: bank0 generation 468, bank1 generation 467
iichb0: <Keywest I2C controller> mem 0xf8001000-0xf8001fff irq 0 on unin0
iicbus0: <OFW I2C bus> on iichb0
iic0: <I2C generic I/O> on iicbus0
ds17750: <Temp-Monitor DS1775> at addr 0x94 on iicbus0
ds16310: <Temp-Monitor DS1631> at addr 0x96 on iicbus0
max66900: <Temp-Monitor MAX6690> at addr 0x98 on iicbus0
max66901: <Temp-Monitor MAX6690> at addr 0x9c on iicbus0
vgapci0: <VGA-compatible display> mem 0xa1000000-0xa1ffffff,0x90000000-0x9fffffff,0xa0000000-0xa0ffffff irq 3 at device 0.0 on pci0
vgapci0: Boot video device
bge0: <Broadcom BCM5714 B3, ASIC rev. 0x008003> mem 0xfa530000-0xfa53ffff,0xfa520000-0xfa52ffff irq 66 at device 4.0 on pci3
bge0: CHIP ID 0x00008003; ASIC REV 0x08; CHIP REV 0x80; PCI-X 33 MHz
miibus0: <MII bus> on bge0
brgphy0: <BCM5780 1000BASE-T media interface> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 00:14:51:64:67:10
bge1: <Broadcom BCM5714 B3, ASIC rev. 0x008003> mem 0xfa510000-0xfa51ffff,0xfa500000-0xfa50ffff irq 67 at device 4.1 on pci3
bge1: CHIP ID 0x00008003; ASIC REV 0x08; CHIP REV 0x80; PCI-X 33 MHz
miibus1: <MII bus> on bge1
brgphy1: <BCM5780 1000BASE-T media interface> PHY 1 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 00:14:51:64:67:11
gem0: <Apple Shasta GMAC Ethernet> mem 0xfa200000-0xfa3fffff at device 15.0 on pci8
gem0: invalid MAC address
device_attach: gem0 attach returned 6
scc0: <Zilog Z8530 dual channel SCC> mem 0x13000-0x13fff,0x8400-0x84ff,0x8500-0x85ff,0x8600-0x86ff,0x8700-0x87ff irq 23,17,18,24,19,20 on macio0
uart0: <z8530, channel A> on scc0
uart1: <z8530, channel B> on scc0
iichb1: <Keywest I2C controller> mem 0x18000-0x18fff irq 27 on macio0
iicbus1: <OFW I2C bus> on iichb1
iic1: <I2C generic I/O> on iicbus1
onyx0: <Texas Instruments PCM3052 Audio Codec> at addr 0x8c on iicbus1
iicbus1: <unknown card> at addr 0x24
pcm0: <Apple I2S Audio Controller> mem 0x10000-0x10fff,0x8000-0x80ff,0x8100-0x81ff irq 28,11,12,30,15,16 on macio0
ohci0: <NEC uPD 9210 USB controller> mem 0x80082000-0x80082fff irq 70 at device 11.0 on pci9
usbus0 on ohci0
ohci1: <NEC uPD 9210 USB controller> mem 0x80081000-0x80081fff irq 70 at device 11.1 on pci9
usbus1 on ohci1
ehci0: <NEC uPD 72010x USB 2.0 controller> mem 0x80080000-0x800800ff irq 70 at device 11.2 on pci9
usbus2: EHCI version 1.0
usbus2 on ehci0
atapci0: <ServerWorks K2 SATA150 controller> mem 0xfa402000-0xfa403fff irq 10 at device 12.0 on pci10
pcib1: failed to reserve resource for pcib10
atapci0: 0x10 bytes of rid 0x20 res 4 failed (0, 0xffffffffffffffff).
ata2: <ATA channel> at channel 0 on atapci0
ata3: <ATA channel> at channel 1 on atapci0
ata4: <ATA channel> at channel 2 on atapci0
ata5: <ATA channel> at channel 3 on atapci0
ata0: <Shasta Kauai ATA Controller> mem 0xfa404000-0xfa407fff irq 38,37 at device 13.0 on pci10
fwohci0: <1394 Open Host Controller Interface> mem 0xfa400000-0xfa400fff irq 39 at device 14.0 on pci10
fwohci0: OHCI version 1.0 (ROM=0)
fwohci0: No. of Isochronous channels is 8.
fwohci0: EUI64 00:11:24:ff:fe:e5:13:d0
fwohci0: invalid speed 7 (fixed to 3).
fwohci0: Phy 1394a available S800, 3 ports.
fwohci0: Link S800, max_rec 4096 bytes.
firewire0: <IEEE1394(FireWire) bus> on fwohci0
fwe0: <Ethernet over FireWire> on firewire0
if_fwe0: Fake Ethernet address: 02:11:24:e5:13:d0
fwe0: Ethernet address: 02:11:24:e5:13:d0
sbp0: <SBP-2/SCSI over FireWire> on firewire0
fwohci0: Initiate bus reset
fwohci0: fwohci_intr_core: BUS reset
fwohci0: PhysicalUpperBound register is not implemented.  Physical memory access is limited to the first 4GB
fwohci0: PhysicalUpperBound = 0x00000000
fwohci0: fwohci_intr_core: node_id=0x00000001, SelfID Count=1, CYCLEMASTER mode
smu0: <Apple System Management Unit> on ofwbus0
smu0: registered as a time-of-day clock, resolution 0.001000s
iichb2: <SMU I2C controller> on smu0
iicbus2: <OFW I2C bus> on iichb2
iic2: <I2C generic I/O> on iicbus2
smusat0: <SMU Satellite Sensors> at addr 0xb0 on iicbus2
smusat1: <SMU Satellite Sensors> at addr 0xb2 on iicbus2
iicbus2: <unknown card> at addr 0xd4
iichb3: <SMU I2C controller> on smu0
iicbus3: <OFW I2C bus> on iichb3
iic3: <I2C generic I/O> on iicbus3
cryptosoft0: <software crypto> on nexus0
Timecounter "timebase" frequency 33333333 Hz quality 0
Event timer "decrementer" frequency 33333333 Hz quality 1000
Timecounters tick every 1.000 msec
firewire0: 2 nodes, maxhop <= 1 cable IRM irm(1)  (me) 
firewire0: bus manager 1 
bge0: link state changed to UP
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 12Mbps Full Speed USB v1.0
max66900: 2 sensors detected.
max66901: 2 sensors detected.
ugen1.1: <NEC OHCI root HUB> at usbus1
uhub0 on usbus1
uhub0: <NEC OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
ugen0.1: <NEC OHCI root HUB> at usbus0
uhub1 on usbus0
uhub1: <NEC OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
usbus2: 480Mbps High Speed USB v2.0
ugen2.1: <NEC EHCI root HUB> at usbus2
uhub2 on usbus2
uhub2: <NEC EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
uhub0: 2 ports with 2 removable, self powered
uhub1: 3 ports with 3 removable, self powered
uhub2: 5 ports with 5 removable, self powered
ada0 at ata2 bus 0 scbus0 target 0 lun 0
ada0: <ST380013AS 3.00> ATA-6 SATA 1.x device
ada0: Serial Number 4MR3C8TG
ada0: 150.000MB/s transfers (SATA 1.x, UDMA5, PIO 8192bytes)
ada0: 76319MB (156301488 512 byte sectors)
Launching APs: 1 2 3
Trying to mount root from ufs:/dev/ada0s3 [rw]...
cd0 at ata0 bus 0 scbus4 target 0 lun 0
cd0: <HL-DT-ST DVD-RW GWA-4165B C006> Removable CD-ROM SCSI device
cd0: Serial Number M0063NE3358
cd0: 66.700MB/s transfers (UDMA4, ATAPI 12bytes, PIO 65534bytes)
cd0: 3267MB (1672851 2048 byte sectors)
WARNING: WITNESS option enabled, expect reduced performance.
ugen1.2: <Lite-On Technology Corp. USB Multimedia Keyboard> at usbus1
ukbd0 on uhub0
ukbd0: <Lite-On Technology Corp. USB Multimedia Keyboard, class 0/0, rev 1.10/1.04, addr 2> on usbus1
kbd1 at ukbd0
uhid0 on uhub0
uhid0: <Lite-On Technology Corp. USB Multimedia Keyboard, class 0/0, rev 1.10/1.04, addr 2> on usbus1
Deprecated code (to be removed in FreeBSD 14): FreeBSD 12.x ABI compat
Deprecated code (to be removed in FreeBSD 14): FreeBSD 12.x ABI compat
lo0: link state changed to UP
bge0: link state changed to DOWN
bge0: link state changed to UP
hydra# 

I will try to get the OpenFirmware data using the usual keyboard Vulcan
grip on five or six keys as I power the beast(s) on.

Dennis
Comment 44 Dennis Clarke 2019-07-19 23:44:24 UTC
*** Bug 238730 has been marked as a duplicate of this bug. ***
Comment 45 Mark Millard 2019-07-20 02:27:52 UTC
(In reply to Dennis Clarke from comment #42)

The output from:

# ofwdump -Pmodel /rom/boot-rom

includes textual lines like:

    'Apple PowerMac11,2 5.2.7f1 BootROM built on 09/30/05 at 15:3'
    '1:03'


FYI: do not try "ofwdump -ap" without usefdt mode for
64-bit FreeBSD: it will crash attempting to extract and
dump out a log property. (32-bit FreeBSD can handle the
ofwdump -ap just fine, even without usefdt mode.)
Comment 46 Mark Millard 2019-07-20 02:38:52 UTC
(In reply to Dennis Clarke from comment #44)

Bugzilla 238730 is not a duplicate of this bug:

It has its own fix with the change to if_bge.c
that was noted in its comments: 239245 is a
distinct problem not fixed by that change.

Marking 238730 as a duplicate may cause the
fix there to be lost/ignored.

Bug 238730 should have its status changed back.
(Not something I can do as far as I know, since
I did not submit it.)
Comment 47 Mark Millard 2019-07-20 02:43:38 UTC
(In reply to Mark Millard from comment #45)

My wording was poor: the "it" that crashes for
ofwdump -ap is the system (FreeBSD), not the
program.

The issue is FreeBSD trying to handle an OpenFirmware
exception and seeing an OpenFirmware stack address that it
does not handle.
Comment 48 Dennis Clarke 2019-07-20 04:26:57 UTC
(In reply to Mark Millard from comment #46)
oops.

    Bug 238730 - r349985 on ppc64 IBM 970MP PowerMac G5 
                  sys/dev/bge/if_bge.c must move the device_get_devclass(bus)

    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=238730

Not a duplicate.

This is a specific fix to sys/dev/bge/if_bge.c 
wherein we must move the device_get_devclass(bus) check to be just
after device_get_devclass(dev) 

Closed as fixed.
Comment 49 Dennis Clarke 2019-07-21 01:07:20 UTC
(In reply to Francis Little from comment #32)

Following in the footsteps of Francis Little I also fetched the
snapshot DVD:

http://ftp.freebsd.org/pub/FreeBSD/snapshots/powerpc/powerpc64/ISO-IMAGES/13.0/FreeBSD-13.0-CURRENT-powerpc-powerpc64-20190718-r350103-disc1.iso

I carefully burned it to a DVD-RW and then used readcd to read back in
the data from the DVD-RW and did a sha512 hash check compare. Perfect.
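
A minimal sketch of that read-back check, assuming the disc contents have
already been read back into a file. The function names and the file names
are placeholders, not taken from this report. A read-back image can carry
trailing padding, so only the first ISO-sized run of bytes is hashed. The
hash tool is assumed to be sha512sum, with FreeBSD's sha512 as a fallback.

```shell
# Hash exactly the first $2 bytes of file $1 with SHA-512.
sha512_prefix() {
    if command -v sha512sum >/dev/null 2>&1; then
        head -c "$2" "$1" | sha512sum | cut -d ' ' -f 1
    else
        # FreeBSD: sha512 reads stdin and -q prints only the digest
        head -c "$2" "$1" | sha512 -q
    fi
}

# Succeed only if the read-back image starts with the exact bytes of the ISO.
verify_burn() {
    iso=$1
    readback=$2
    size=$(wc -c < "$iso" | tr -d ' ')
    [ "$(sha512_prefix "$iso" "$size")" = "$(sha512_prefix "$readback" "$size")" ]
}
```

Usage would be along the lines of: verify_burn disc1.iso readback.img && echo "burn verified"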

I can confirm that usefdt=1 is required. Any attempt to boot that DVD
without it produces a black screen and roaring fans. I also tried
kern.smp.disabled=1 by itself. Blank screen. Fans roaring.

So now that I have this dvd booted I may as well install and then have
consistent output from uname -apKU and then we plow forwards chasing
this dragon. 

Dennis
Comment 50 Mark Millard 2019-07-21 03:27:02 UTC
(In reply to Dennis Clarke from comment #49)

Does direct use of the materials from:

https://artifact.ci.freebsd.org/snapshot/head/r350103/powerpc/powerpc64/

reproduce the problem? Or is it only the DVD boot that
is a problem? How about after installation from the DVD:
still need usefdt mode or was it only the DVD stage that
had the problem?
Comment 51 Mark Millard 2019-07-21 03:49:16 UTC
(In reply to Dennis Clarke from comment #49)

I tried use of the:

https://artifact.ci.freebsd.org/snapshot/head/r350103/powerpc/powerpc64/

materials directly and the G5 quad booted fine from
the partition populated via extraction of the content
from the *.txz files.

I'll see about trying to produce a DVD to test if it
boots, not that I expect it to.
Comment 52 Mark Millard 2019-07-21 04:39:28 UTC
(In reply to Dennis Clarke from comment #49)

I downloaded and burned a copy of:

https://download.freebsd.org/ftp/snapshots/ISO-IMAGES/13.0/FreeBSD-13.0-CURRENT-powerpc-powerpc64-20190718-r350103-disc1.iso

to a DVD-RW. I put it in the G5 quad that I have access to.
It booted fine and Live-CD allowed me to log in as root.
I did not set anything, just let it boot. So no usefdt mode
involved.

(I had no other drives attached to the G5 quad at
the time, so there was no place to install to.)
Comment 53 Dennis Clarke 2019-07-21 17:02:15 UTC
(In reply to Mark Millard from comment #50)

Installation from the dvd failed. The machine locked up and the fans
started roaring. Even the numlock key on the keyboard would not 
work. 

I'll figure out a way to prep the machine with materials from perhaps
the https://artifact.ci.freebsd.org/snapshot/head/r350103/ area.
Comment 54 Mark Millard 2019-07-21 19:47:02 UTC
(In reply to Dennis Clarke from comment #53)

I've now used the DVD that I had burned and reported
on previously for booting all of the following ECC
RAM configurations:

16 GiBytes RAM
12 GiBytes RAM
 8 GiBytes RAM
 4 GiBytes RAM

They all booted just fine with no manual settings
made. (I removed memory in pairs from the outside
in.)

Again I had no other drives attached.

One good thing about this test is that it avoids
any issues of the vintage/variations of materials
in the Apple boot partition on SATA or USB media.

You and I reported the same type of DVD drive:

cd0: <HL-DT-ST DVD-RW GWA-4165B C006> Removable CD-ROM SCSI device
cd0: <HL-DT-ST DVD-RW GWA-4165B C006> Removable CD-ROM SCSI device

This leaves the differences:

    device     = 'G70 [GeForce 7800 GT]'
vs.

    device     = 'NV43 [GeForce 6600]'

and differences in drives attached. Can you try booting
with only the DVD drive present? This could eliminate
anything tied to SATA drives or protocol vintage, leaving
only the video card variations.

(I do not currently have access to a GeForce 6600 or any
alternative video hardware and so can not eliminate this
difference.)

My guess is that the GeForce 6600 is the common factor
for the two of you, since my machine does not have one.
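
One way to narrow such hardware differences down is to compare the device
lines of pciconf -lv output saved from each machine. A minimal sketch,
assuming each machine's output was saved to a text file; the function
names and file names are placeholders.

```shell
# Print the sorted, de-duplicated device names from saved `pciconf -lv` output.
# The "." before and after the capture group matches the single quotes.
pci_devices() {
    sed -n 's/^ *device *= *.\(.*\).$/\1/p' "$1" | sort -u
}

# Show devices present in only one of the two saved listings.
# Lines prefixed "<" exist only in the first file, ">" only in the second.
pci_device_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    pci_devices "$1" > "$a"
    pci_devices "$2" > "$b"
    diff "$a" "$b" | grep '^[<>]'
    rm -f "$a" "$b"
}
```

Usage would be along the lines of: pci_device_diff hydra-pciconf.txt other-pciconf.txt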
Comment 55 Mark Millard 2019-07-21 20:03:07 UTC
(In reply to Mark Millard from comment #54)

One video-context issue I'd not reported on for
my context:

VT(ofwfb): resolution 1920x1080
Comment 56 Francis Little 2019-07-21 20:19:22 UTC
(In reply to Mark Millard from comment #55)

Hi, I'm not using ECC RAM in mine, just 8 x 1GB sticks of plain memory.

Also, I recall reading that there were two motherboard revisions on the G5 Quad, I cannot remember where now. Looking on my board, I have to remove the CPU intake fans to read the model:

On my board, from the UK, I have this in the silk screen:

C 2005
160/105/1 - 1628-A

(The first part "160/105/1" is on a sticker covering the silk screen, "- 1628-A" is on the silk screen.)

And this on the sticker below:

630 - 7431/T6536

I tried booting the DVD without a HDD attached and the system required usefdt=1 still.

I also tried booting my install on the HDD with no DVD Drive attached and it needed usefdt=1
Comment 57 Mark Millard 2019-07-21 22:14:23 UTC
(In reply to Francis Little from comment #56)

The one that I have access to shows:

Silk screen shows:
(c) 2005
820-1628-A
Apple Computer, Inc.

Sticker shows:
630-7431/T6536


Nothing has: 160/105/1

Looks like a match to me.
Comment 58 Dennis Clarke 2019-07-23 12:43:09 UTC
I managed to get an install of r350103 from that same DVD that crashed
a few days ago. I simply went for a minimal install without ports or
src tarball and the process went smoothly. 

hydra$ 
hydra$ uptime
12:41PM  up 3 mins, 2 users, load averages: 0.09, 0.20, 0.10
hydra$ uname -apKU
FreeBSD hydra 13.0-CURRENT FreeBSD 13.0-CURRENT r350103 GENERIC  powerpc powerpc64 1300036 1300036
hydra$ 

The DVD needs usefdt=1 but after install the system boots fine without
it and that makes as much sense as high altitude flying fish. 

Baffled but it is running for the moment. 

Dennis
Comment 59 Alfredo Dal'Ava Júnior 2019-08-12 11:38:29 UTC
I can reproduce a similar issue (same stack on panic) on QEMU using the following steps:

1 - take a clean-installed FreeBSD 12 RELEASE
2 - get src at r350672;
3 - buildkernel
4 - installkernel
5 - shutdown -r now

It does not reproduce every time; the reproduction rate is below 20%. Looking at my continuous integration logs, I see this panic first appearing in early June.

For the record:

- Commenting out the NUMA option in the kernel config file works around it.
- The NUMA feature was introduced/made default in early April, two months before my testbed first hit the issue. So it does not look like a bug in the original code. Something between April and June has changed it, or maybe triggered it.
- Once it panics, a QEMU system_reset hits the panic again. Closing QEMU and starting it again makes it boot. (power cycle?)
- After the panic, if I do a QEMU system_reset and make the loader boot the old (FreeBSD 12) kernel, it works fine. If I then issue a reboot and load the new kernel (the one that was just built), the panic is hit again. (Something like a bad persistent state affecting only the new kernel, which only disappears after closing QEMU.)
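
To put a number on an intermittent failure like this, saved boot logs can
be scanned for the panic signature. A minimal sketch, assuming one console
log file per boot attempt in a single directory; the directory layout, the
.log suffix, and the function name are hypothetical, while the signature
string comes from the backtrace in this report.

```shell
# Print "<panicked> <total>" for the boot logs in directory $1.
# A boot counts as panicked if its log mentions vm_phys_enqueue_contig.
panic_rate() {
    total=0
    hits=0
    for log in "$1"/*.log; do
        [ -f "$log" ] || continue
        total=$((total + 1))
        if grep -q 'vm_phys_enqueue_contig' "$log"; then
            hits=$((hits + 1))
        fi
    done
    echo "$hits $total"
}
```

Usage would be along the lines of: panic_rate ./boot-logs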