Bug 244118

Summary: 13-CURRENT Kernel post r357547 hang during boot on PowerMac G5Q
Product: Base System Reporter: Francis Little <oggy>
Component: kernAssignee: Ryan Libby <rlibby>
Status: Closed FIXED    
Severity: Affects Some People CC: marklmi26-fbsd, rlibby
Priority: --- Keywords: needs-qa, regression
Version: CURRENTFlags: koobs: maintainer-feedback? (rlibby)
Hardware: powerpc   
OS: Any   
Attachments:
Description Flags
Print zone layouts
none
Print zone layouts
none
slb_zone_init: unconditional UMA_ZONE_CONTIG
none
Output of slabs="0"
none
Output of slabs="1" none

Description Francis Little 2020-02-14 10:28:51 UTC
As posted to the mailing list recently, Kernels post r357547 hang during boot on my PowerMac G5 Quad, at:

mountroot waiting for /dev/ada0s3

(This is with SMP disabled)

I have tried revisions:

357547 - Works

357550 - hangs on boot
357556 - hangs on boot
357566 - hangs on boot
357875 - hangs on boot

I think the issue may be with commit 357548 as it changes something in: head/sys/powerpc/aim/slb.c

Although what its changing may be beyond me!
Comment 1 Ryan Libby freebsd_committer freebsd_triage 2020-02-14 16:00:40 UTC
Hmm, r357548 shouldn't have had an effect by itself, but it could be
fallout from related r357549.  Could you try setting this from the boot
loader or in /boot/loader.conf?

vm.debug.uma_multipage_slabs="0"

If that works then we can try to find which zone causes trouble.  Would
you be able to try out a patch that would print debugging info?
Comment 2 Francis Little 2020-02-14 16:15:42 UTC
(In reply to Ryan Libby from comment #1)

Hi,

Yep, that works, so r357550 now boots successfully with:

vm.debug.uma_multipage_slabs="0"

I'm happy to test a patch and update to a more recent rev also.

Thanks
Comment 3 Ryan Libby freebsd_committer freebsd_triage 2020-02-14 16:44:33 UTC
Created attachment 211651 [details]
Print zone layouts

Okay, could you try out this patch?  It won't fix anything, it will
hopefully just tell us what difference r357549 makes.  So, first could
you try it with vm.debug.uma_multipage_slabs="0", and then grep for
"keg.*layout" in /var/log/messages and attach that somewhere, and then
again with vm.debug.uma_multipage_slabs="1"?  There should be a number
of zones which end up with different layouts between the two and likely
one of those is causing the issue.
Comment 4 Ryan Libby freebsd_committer freebsd_triage 2020-02-14 17:03:42 UTC
Created attachment 211652 [details]
Print zone layouts

Sorry, update to patch, I realized I left off a newline.
Comment 5 Ryan Libby freebsd_committer freebsd_triage 2020-02-14 18:31:41 UTC
(In reply to Francis Little from comment #0)
> As posted to the mailing list recently [...]

I wasn't able to locate this thread, so if there's more information
there then could you please either repost it here or direct me to the
thread?
Comment 6 Mark Millard 2020-02-14 21:39:42 UTC
(In reply to Ryan Libby from comment #5)

I tink he means:

https://lists.freebsd.org/pipermail/freebsd-ppc/2020-February/011444.html
https://lists.freebsd.org/pipermail/freebsd-ppc/2020-February/011504.html

but I doubt they add anything. (The one between them is from me
and does not have information useful to you.)


Going in another direction . . .

vm.debug.uma_multipage_slabs="1" leaves his G5 unable to
boot (waiting for mountroot). So how is he to follow the
directions for testing that case?
Comment 7 Francis Little 2020-02-14 21:45:58 UTC
(In reply to Ryan Libby from comment #3)

Hi, as Mark just mentioned, I can only get the logfile with:

vm.debug.uma_multipage_slabs="0"

With it set as:

vm.debug.uma_multipage_slabs="1"

The Machine doesn't get far enough to write the log out!

Regards
Comment 8 Mark Millard 2020-02-14 22:34:17 UTC
(In reply to Francis Little from comment #7)

If you have an appropriate swap/paging partition you might
be able to have /boot/loader.conf contain an assignment
of the general form:

dumpdev="/dev/<???>"

(substituting as necessary). Then you might be able
to cause a kernel dump from ddb (say). (Of cource,
this might run back into the same or other problems.)

If it worked, rebooting with vm.debug.uma_multipage_slabs="0"
should pick up that dump if you have room in /var/crash.
The dump might contain the tail of the sequence of lines
that would have gone into the log file. There might be a
way to extract those lines.
Comment 9 Mark Millard 2020-02-15 01:32:33 UTC
(In reply to Mark Millard from comment #8)

I got to thinking about my history investigating
and I think this is too early for the device to
be available for dump to use.

Also the hangup likely prevents getting to ddb.

So I set up -r357550's artifact materials on a
SSD and got into ddb somewhat before the lockup.

One part of acttrace looked potentially interesting
(typed from a picture and avoiding typing everything
for now):

Tracing command cam pid 7 tid 100034 td  . . . (CPU3)
intr_event_handle
powerpc_dispatch_intr
openpic_dispatch
powerpc_interrupt
kernel EXI trap by __mtx_unlock_flags
__mtx_unlock_flags
xpt_copy_path
xpt_async
cam_periph_error
probe_done
xpt_done_process
xpt_done_td
fork_exit
fork_trampoline
-0x04

I've still got the picture to extract other details from
if required.

For thread 100034, "show locks" reported one:
(I split the line below)

exclusive sleep mutex CAM device lock (CAM device lock)
r = 0 (. . .) locked @ /usr/src/sys/cam/cam_xpt.c:5490
Comment 10 Mark Millard 2020-02-15 03:47:18 UTC
I tried non-debug kernel builds (that involve
some of my own PowerMac patches):

Head -r357548 did not have the problem.
Head -r357549 did have the problem.

So: Confirming Ryan's expectation.
Comment 11 Ryan Libby freebsd_committer freebsd_triage 2020-02-15 05:22:27 UTC
Yeah... I guess that's not going to work easily.  (I guess I'm just used
to debugging on my VM and captuing the console.)

Actually, I _may_ be able to debug this with just information from a
system with vm.debug.uma_multipage_slabs="0", and without the above
patch and log, but with the output of

sysctl vm.uma

Otherwise, I can try to reproduce this on a VM, but I am out of town
right now so I probably won't have time to try setting that up until
next weekend.
Comment 12 Mark Millard 2020-02-15 05:25:13 UTC
(In reply to Ryan Libby from comment #11)

I tried to use pictures to get your originally requested
informatino.

I only have pictures of the non-booting case so I only
show the booting case line in full, where the
non-booting case had some larger slabsize also listed.

UMA Kegs layout: format 0 (ipers 7 * rsize 512) / slabsize 0x1000 = 87% eff
vs. ipers 15 with slabsize 0x2000

UMA Zones layout: format 0 (ipers 2 * rsize 1792) / slabsize 0x1000 = 87% eff
vs. ipers 9 with slabsize 0x4000

keg VMSPACE layout: format 0 (ipers 2 * rsize 1400) / slabsize 0x1000 = 68% eff
vs. ipers 5 slabsize 0x2000
vs. ipers 8 slabsize 0x3000

keg SLB cache layout: format 0 (ipers 7 * rsize 520) / slabsize 0x1000 = 88% eff
vs. ipers 15 slabsize 0x2000

keg THREAD layout: format 0 (ipers 2 * rsize 1376) / slabsize 0x1000 = 67% eff
vs. ipers 5 slabsize 0x2000
vs. ipers 8 slabsize 0x3000
vs. ipers 11 slabsize 0x4000

keg NCLNODE layout: format 0 (ipers 6 * rsize 592) / slabsize 0x1000 = 86% eff
vs. ipers 13 slabsize 0x2000

keg socket layout: format 0 (ipers 4 * rsize 904) / slabsize 0x1000 = 88% eff
vs. ipers 9 slabsize 0x2000

keg sctp_asoc layout: format 0 (ipers 1 * rsize 2288) / slabsize 0x1000 = 55% eff
vs. ipers 3 slabsize 0x2000
vs. ipers 5 slabsize 0x3000

keg sctp_raddr layout: format 0 (ipers 5 * rsize 736) / slabsize 0x1000 = 89% eff
vs. ipers 11 skabsize 0x2000


I will note that sometimes FFS inode, FFS1 dinode, and FFS2 dinode lines
were listed for the failing boots. In the booting case, those are the last
3 keg lines. slabsize 0x1000 always showed for any that displayed.

I got all but the final lines via booting and getting to
ddb> and then scrolling around, taking pictures.

Picture comparison means a higher probability that I missed some
example(s) that should be listed above.
Comment 13 Mark Millard 2020-02-15 05:36:11 UTC
(In reply to Mark Millard from comment #12)

Found one that I missed earlier:

keg Mountpoints layout: format 0 (ipers 1 * rsize 2944) / slabsize 0x1000 = 71% eff
vs. ipers 4 slabsize 0x3000
Comment 14 Ryan Libby freebsd_committer freebsd_triage 2020-02-15 05:45:34 UTC
Created attachment 211661 [details]
slb_zone_init: unconditional UMA_ZONE_CONTIG

(In reply to Mark Millard from comment #12)

Thanks!

This one looks especially interesting:

> keg SLB cache layout: format 0 (ipers 7 * rsize 520) / slabsize 0x1000 = 88% eff
> vs. ipers 15 slabsize 0x2000

Based on that, could either of you try out this patch?

I have not compile tested it, so I apologize in advance if it is broken.
Comment 15 Mark Millard 2020-02-15 06:01:01 UTC
(In reply to Ryan Libby from comment #14)

It boots. I'll see about getting a diff
of the keg lines. It might catch things
I missed.
Comment 16 Mark Millard 2020-02-15 06:21:42 UTC
(In reply to Mark Millard from comment #15)

The diff:

# more /mnt/root/keg_lines_diff.txt 
--- /root/keg_lines_0.txt       2020-02-15 06:06:11.318230000 +0000
+++ /root/keg_lines_1.txt       2020-02-15 06:06:55.069131000 +0000
@@ -1,3 +1,5 @@
 keg UMA Kegs layout: format 0 (ipers 7 * rsize 512) / slabsize 0x1000 = 87% eff
+keg UMA Kegs layout: format 0 (ipers 15 * rsize 512) / slabsize 0x2000 = 93% eff
 keg UMA Zones layout: format 0 (ipers 2 * rsize 1792) / slabsize 0x1000 = 87% eff
+keg UMA Zones layout: format 0 (ipers 9 * rsize 1792) / slabsize 0x4000 = 98% eff
 keg UMA Slabs 0 layout: format 0 (ipers 50 * rsize 80) / slabsize 0x1000 = 97% eff
@@ -26,2 +28,4 @@
 keg VMSPACE layout: format 0 (ipers 2 * rsize 1400) / slabsize 0x1000 = 68% eff
+keg VMSPACE layout: format 0 (ipers 5 * rsize 1400) / slabsize 0x2000 = 85% eff
+keg VMSPACE layout: format 0 (ipers 8 * rsize 1400) / slabsize 0x3000 = 91% eff
 keg UPVO entry layout: format 0 (ipers 42 * rsize 96) / slabsize 0x1000 = 98% eff
@@ -53,2 +57,3 @@
 keg filedesc0 layout: format 0 (ipers 3 * rsize 1104) / slabsize 0x1000 = 80% eff
+keg filedesc0 layout: format 0 (ipers 7 * rsize 1104) / slabsize 0x2000 = 94% eff
 keg rangeset pctrie nodes layout: format 0 (ipers 28 * rsize 144) / slabsize 0x1000 = 98% eff
@@ -62,2 +67,5 @@
 keg THREAD layout: format 0 (ipers 2 * rsize 1376) / slabsize 0x1000 = 67% eff
+keg THREAD layout: format 0 (ipers 5 * rsize 1376) / slabsize 0x2000 = 83% eff
+keg THREAD layout: format 0 (ipers 8 * rsize 1376) / slabsize 0x3000 = 89% eff
+keg THREAD layout: format 0 (ipers 11 * rsize 1376) / slabsize 0x4000 = 92% eff
 keg cpuset layout: format 0 (ipers 31 * rsize 128) / slabsize 0x1000 = 96% eff
@@ -92,2 +100,3 @@
 keg NCLNODE layout: format 0 (ipers 6 * rsize 592) / slabsize 0x1000 = 86% eff
+keg NCLNODE layout: format 0 (ipers 13 * rsize 592) / slabsize 0x2000 = 93% eff
 keg DIRHASH layout: format 0 (ipers 3 * rsize 1024) / slabsize 0x1000 = 75% eff
@@ -101,2 +110,3 @@
 keg Mountpoints layout: format 0 (ipers 1 * rsize 2944) / slabsize 0x1000 = 71% eff
+keg Mountpoints layout: format 0 (ipers 4 * rsize 2944) / slabsize 0x3000 = 95% eff
 keg ksiginfo layout: format 0 (ipers 36 * rsize 112) / slabsize 0x1000 = 98% eff
@@ -105,2 +115,3 @@
 keg socket layout: format 0 (ipers 4 * rsize 904) / slabsize 0x1000 = 88% eff
+keg socket layout: format 0 (ipers 9 * rsize 904) / slabsize 0x2000 = 99% eff
 keg unpcb layout: format 0 (ipers 15 * rsize 256) / slabsize 0x1000 = 93% eff
@@ -123,4 +134,7 @@
 keg sctp_asoc layout: format 0 (ipers 1 * rsize 2288) / slabsize 0x1000 = 55% eff
+keg sctp_asoc layout: format 0 (ipers 3 * rsize 2288) / slabsize 0x2000 = 83% eff
+keg sctp_asoc layout: format 0 (ipers 5 * rsize 2288) / slabsize 0x3000 = 93% eff
 keg sctp_laddr layout: format 0 (ipers 84 * rsize 48) / slabsize 0x1000 = 98% eff
 keg sctp_raddr layout: format 0 (ipers 5 * rsize 736) / slabsize 0x1000 = 89% eff
+keg sctp_raddr layout: format 0 (ipers 11 * rsize 736) / slabsize 0x2000 = 98% eff
 keg sctp_chunk layout: format 0 (ipers 26 * rsize 152) / slabsize 0x1000 = 96% eff
@@ -136,2 +150,3 @@
 keg swpctrie layout: format 0 (ipers 28 * rsize 144) / slabsize 0x1000 = 98% eff
+keg swblk layout: format 0 (ipers 29 * rsize 136) / slabsize 0x1000 = 96% eff
 keg FFS inode layout: format 0 (ipers 25 * rsize 160) / slabsize 0x1000 = 97% eff

So: swblk missing vs. existing is something I missed, along with originally
missing Mountpoints.
Comment 17 Mark Millard 2020-02-15 06:48:47 UTC
(In reply to Mark Millard from comment #16)

The diff also shows that I missed: filedesc0
Comment 18 Francis Little 2020-02-15 08:23:25 UTC
Created attachment 211663 [details]
Output of slabs="0"
Comment 19 Francis Little 2020-02-15 08:23:40 UTC
Created attachment 211664 [details]
Output of slabs="1"
Comment 20 Francis Little 2020-02-15 08:25:27 UTC
I have applied both patches and attached the output of grep "keg.*layout" in /var/log/messages for both:

vm.debug.uma_multipage_slabs="0"

and

vm.debug.uma_multipage_slabs="1"
Comment 21 Mark Millard 2020-02-15 22:30:48 UTC
(In reply to Ryan Libby from comment #14)

FYI:

I made a -r357979 based kernel that also had the
patches involved. It booted the example G5 Quad
just fine. (World/base was still the artifacts
material for -r357550 instead of being my normal
environment.)

So it appears that this is the last issue
outstanding for having a booting context on these
types of machines, at least for now.
Comment 22 Mark Millard 2020-02-17 19:16:58 UTC
(In reply to Mark Millard from comment #21)

Note: I progressed my normal powerpc64 PowerMac
environment to head -r357979 as well. The patch
has worked fine on the G5 "Quad" for that context
as well.
Comment 23 commit-hook freebsd_committer freebsd_triage 2020-02-19 04:46:59 UTC
A commit references this bug:

Author: rlibby
Date: Wed Feb 19 04:46:41 UTC 2020
New revision: 358094
URL: https://svnweb.freebsd.org/changeset/base/358094

Log:
  powerpc: unconditionally mark SLB zones UMA_ZONE_CONTIG

  PR:		244118
  Reported by:	Francis Little <oggy at farscape.co.uk>
  Tested by:	Francis Little, Mark Millard <marklmi at yahoo.com>
  Reviewed by:	markj
  Sponsored by:	Dell EMC Isilon
  Differential Revision:	https://reviews.freebsd.org/D23729

Changes:
  head/sys/powerpc/aim/slb.c
Comment 24 Ryan Libby freebsd_committer freebsd_triage 2020-02-19 04:52:27 UTC
Regression in head from r357549, fixed in r358094.