Bug 248264

Summary: arm64 / RPi4: cpu frequency control stops working
Product: Base System Reporter: Gordon Bergling <gbe>
Component: armAssignee: freebsd-arm (Nobody) <freebsd-arm>
Status: Closed FIXED    
Severity: Affects Some People CC: marklmi26-fbsd
Priority: --- Keywords: regression
Version: CURRENT   
Hardware: Any   
OS: Any   

Description Gordon Bergling freebsd_committer freebsd_triage 2020-07-25 15:31:14 UTC
On a recent CURRENT from the 25th of July the cpu frequency control stops working. During boot the following error messages are displayed,

bcm2835_cpufreq0: ARM 600MHz, Core 200MHz, SDRAM 400MHz, Turbo OFF
mbox0: mbox response error
bcm2835_cpufreq0: can't set clock rate (id=4)
mbox0: mbox response error
bcm2835_cpufreq0: can't set clock rate (id=8)

The result is that the CPUs are always at the highest clockrate, even if they are idle.

dev.cpu.0.temperature: 47.0C
dev.cpu.0.freq_levels: 1500/-1 600/-1
dev.cpu.0.freq: 1500
dev.cpu.0.%parent: cpulist0
dev.cpu.0.%pnpinfo: name=cpu@0 compat=arm,cortex-a72
dev.cpu.0.%location:
dev.cpu.0.%driver: cpu
dev.cpu.0.%desc: Open Firmware CPU

$ uname -a
FreeBSD tiny.0xfce3.net 13.0-CURRENT FreeBSD 13.0-CURRENT #14 cddea6e05fc-c270140(master): Sat Jul 25 16:18:11 CEST 2020     root@tiny.0xfce3.net:/tank/nfs_public/tiny/obj/tank/nfs_public/tiny/src/arm64.aarch64/sys/GENERIC-TCP  arm64

From my memory, on a -CURRENT from about 14th of July the frequency control was working as intended.
Comment 1 Mark Millard 2020-07-26 04:08:43 UTC
Was this booting via a sysutils/u-boot-rpi4 based configuration?
UEFI/ACPI (say, https://github.com/pftf/RPi4/ v1.17 material)?
Something else?

In general RPi4B support is in a development/experimental state
with multiple directions of activity. Indicating more than then
the FreeBSD-project-internal content involved might be important.
Closer to: how to reproduce the example (given that there are
alternatives around).
Comment 2 Gordon Bergling freebsd_committer freebsd_triage 2020-07-26 07:08:10 UTC
(In reply to Mark Millard from comment #1)

I not entirely sure how the the RPi4 gets booted. Its an installation from an RPi3 -CURRENT image from the 2nd of July. I had looked through the logs and can tell that the frequency control was working with the kernel from the 2nd of July and stopped working with a kernel from the 10th of July.

If I can provide more information just let me know.
Comment 3 Mark Millard 2020-07-26 07:25:10 UTC
(In reply to Gordon Bergling from comment #2)

"Its an installation from an RPi3 -CURRENT image from the 2nd of July":
that is the kind if thing that was important to know along the lines
that I'd referenced. It is even being treated as an RPi3 instead of
as an RPi4, a type of alternative I'd not even referenced. The content
of the image can give other details. But . . .

https://lists.freebsd.org/pipermail/freebsd-snapshots/2020-July/thread.html
lists just the two below for 2020-Jul-02:

New FreeBSD snapshots available: stable/12 (20200702 r362880)   Glen Barber
New FreeBSD snapshots available: stable/11 (20200702 r362844)   Glen Barber

In other words, if the date is correct (2020-Jul-02), then it is
either stable/12 ( -r362880 ) or stable/11 ( -r362844 ) instead
of being head/current.

If it is head (current) and is one of the snapshots, then it must be a
different date:

New FreeBSD snapshots available: head (20200625 r362596)   Glen Barber
New FreeBSD snapshots available: head (20200709 r363032)   Glen Barber
New FreeBSD snapshots available: head (20200716 r363236)   Glen Barber
New FreeBSD snapshots available: head (20200723 r363439)   Glen Barber

You might want to use something like:

uname -apKU

and report what it outputs. That would give all the detail necessary
to identify which.
Comment 4 Gordon Bergling freebsd_committer freebsd_triage 2020-07-26 08:03:03 UTC
(In reply to Mark Millard from comment #3)

Not sure why the -CURRENT snapshot isn't listed, but the image was from

https://download.freebsd.org/ftp/snapshots/arm64/aarch64/ISO-IMAGES/13.0/

and named,

FreeBSD-13.0-CURRENT-arm64-aarch64-RPI3-20200702-r362853.img.xz

The output of uname -apKU is the following:

FreeBSD tiny.0xfce3.net 13.0-CURRENT FreeBSD 13.0-CURRENT #14 cddea6e05fc-c270140(master): Sat Jul 25 16:18:11 CEST 2020     root@tiny.0xfce3.net:/tank/nfs_public/tiny/obj/tank/nfs_public/tiny/src/arm64.aarch64/sys/GENERIC-TCP  arm64 aarch64 1300101 1300101
Comment 5 Mark Millard 2020-07-26 08:57:00 UTC
(In reply to Gordon Bergling from comment #4)

> . . . arm64 aarch64 1300101 1300101.

Hmm. 1300101 started after 2020-Jul-02 and after -r362853 .

https://www.freebsd.org/doc/en_US.ISO8859-1/books/porters-handbook/versions.html
reports:

. . .
1300100	362640	June 26, 2020	13.0-CURRENT after changing the internal API between the NFS kernel modules.
1300101	363077	July 10, 2020	13.0-CURRENT after implementing the array_size() function in the LinuxKPI.

Your uname -apKU output is inconsistent with any build prior
to -r363077 (and so any build before 2020-Jul-10).

Another hint may be that the official snaphots probably do
not get strings like "cddea6e05fc-c270140(master)" or
kernel config paths like "root@tiny.0xfce3.net:/tank/ . . . /sys/GENERIC-TCP".
Plain GENERIC may be, but a special GENERIC-TCP kernel config?
Seems unlikely.

A github search indicates that the hash prefix cddea6e05fc605fc
( https://github.com/search?q=hash:cddea6e05fc ) is tied to:

Move Intel GAS to dev/iommu/ as now a part of generic iommu framework.
Reviewed by:	kib
Sponsored by:	DARPA/AFRL
Differential Revision:	https://reviews.freebsd.org/D25799
 master
. . . 
bukinr committed 21 hours ago 

Finding that in svn-src-head is:

https://lists.freebsd.org/pipermail/svn-src-head/2020-July/138206.html

Author: br
Date: Sat Jul 25 11:34:50 2020
New Revision: 363524
URL: https://svnweb.freebsd.org/changeset/base/363524

Log:
  Move Intel GAS to dev/iommu/ as now a part of generic iommu framework.
  
  Reviewed by:	kib
  Sponsored by:	DARPA/AFRL
  Differential Revision:	https://reviews.freebsd.org/D25799


In other words, "today" as of when this bugzilla submittal was made.
Comment 6 Gordon Bergling freebsd_committer freebsd_triage 2020-07-26 09:10:07 UTC
(In reply to Mark Millard from comment #5)

I use the GitHub mirror to update my local sources. The GENERIC-TCP is a custom kernel config with additional TCP related options.

$ cat /usr/src/sys/arm64/conf/GENERIC-TCP
include		GENERIC
options		RATELIMIT
options		TCPHPTS
#options	KERN_TLS

I usually try to update -CURRENT once a week on the RPi4. I bisected the revisions, or better the dates by looking at /var/logs/messages.
Comment 7 Mark Millard 2020-07-26 10:08:18 UTC
(In reply to Gordon Bergling from comment #6)

So, if I gather right at this point, correcting a misinterpretation
on my part:

FreeBSD-13.0-CURRENT-arm64-aarch64-RPI3-20200702-r362853.img.xz is from
well before 2020-Jul-14 or so and through 2020-Jul-14 or so the kernels
with such src/build dates did not show the problem.

Kernel updates you made later eventually started showing the problem.
You build a kernel that is a variation of GENERIC with a couple of
extra options. (For now we  are ignoring make.conf and src.conf
content as hopefully irrelevant.)

It may be that head -r363524 from 2020-Jul-25 is the earliest version
that you have seen the problem on.

You are using the sysutils/u-boot-rpi3 way of booting ( its
u-boot.bin ) based on whatever vintage is in
FreeBSD-13.0-CURRENT-arm64-aarch64-RPI3-20200702-r362853.img.xz .
You have not updated u-boot.bin after starting from that image.

You are using the sysutils/rpi-firmware materials based on
whatever vintage is in
FreeBSD-13.0-CURRENT-arm64-aarch64-RPI3-20200702-r362853.img.xz .
You have not updated those materials after starting from that
image.

That still leaves one file from buildworld buildkernel sorts
of activity but that goes outside the FreeBSD file system
and is part of booting and has an initial version in
FreeBSD-13.0-CURRENT-arm64-aarch64-RPI3-20200702-r362853.img.xz
. . .

Have you been updating EFI/BOOT/bootaa64.efi with any updated
/boot/loader.efi vintages? Vs.: is EFI/BOOT/bootaa64.efi still
as it was in
FreeBSD-13.0-CURRENT-arm64-aarch64-RPI3-20200702-r362853.img.xz ?
Comment 8 Gordon Bergling freebsd_committer freebsd_triage 2020-07-27 08:11:18 UTC
(In reply to Mark Millard from comment #7)

Sorry that the information are combing bit by bit. Below are the contents of src.conf and make.conf, but I would think that these aren't problem.

/etc/src.conf
WITH_EXTRA_TCP_STACKS=1
WITH_BEARSSL=1
WITH_PIE=1
WITH_RETPOLINE=1

/etc/make.conf
MALLOC_PRODUCTION=1

I verified that the problem started between
r362883 (loader: potential memory leak and check return values)
and
r363125 (Implement CLOCK_MONOTONIC_RAW (linux >= 2.6.28)).

I haven't touch any files in /boot/msdos, so they are all from the installation image. The contents is as follows,

-rwxr-xr-x  1 root  wheel     5888 Jul  2 05:10 armstub8-gic.bin
-rwxr-xr-x  1 root  wheel     5888 Jul  2 05:10 armstub8.bin
-rwxr-xr-x  1 root  wheel    27054 Nov 22  2019 bcm2710-rpi-3-b-plus.dtb
-rwxr-xr-x  1 root  wheel    26435 Nov 22  2019 bcm2710-rpi-3-b.dtb
-rwxr-xr-x  1 root  wheel    40659 Nov 22  2019 bcm2711-rpi-4-b.dtb
-rwxr-xr-x  1 root  wheel    52304 Nov 22  2019 bootcode.bin
-rwxr-xr-x  1 root  wheel      135 Jul  2 05:10 config_rpi4.txt
-rwxr-xr-x  1 root  wheel      147 Jul  2 05:10 config.txt
drwxr-xr-x  1 root  wheel     4096 Jul  2 10:39 dtb
drwxr-xr-x  1 root  wheel     4096 Jul  2 10:39 EFI
-rwxr-xr-x  1 root  wheel     2655 Nov 22  2019 fixup_cd.dat
-rwxr-xr-x  1 root  wheel     9816 Nov 22  2019 fixup_db.dat
-rwxr-xr-x  1 root  wheel     9816 Nov 22  2019 fixup_x.dat
-rwxr-xr-x  1 root  wheel     6744 Nov 22  2019 fixup.dat
-rwxr-xr-x  1 root  wheel     6193 Nov 22  2019 fixup4.dat
-rwxr-xr-x  1 root  wheel     3089 Nov 22  2019 fixup4cd.dat
-rwxr-xr-x  1 root  wheel     9181 Nov 22  2019 fixup4db.dat
-rwxr-xr-x  1 root  wheel     9183 Nov 22  2019 fixup4x.dat
-rwxr-xr-x  1 root  wheel     1594 Nov 22  2019 LICENCE.broadcom
drwxr-xr-x  1 root  wheel     4096 Jul  2 10:40 overlays
-rwxr-xr-x  1 root  wheel      124 Jul  2 04:58 README
-rwxr-xr-x  1 root  wheel   688068 Nov 22  2019 start_cd.elf
-rwxr-xr-x  1 root  wheel  4857160 Nov 22  2019 start_db.elf
-rwxr-xr-x  1 root  wheel  3794600 Nov 22  2019 start_x.elf
-rwxr-xr-x  1 root  wheel  2880356 Nov 22  2019 start.elf
-rwxr-xr-x  1 root  wheel  2775076 Nov 22  2019 start4.elf
-rwxr-xr-x  1 root  wheel   775872 Nov 22  2019 start4cd.elf
-rwxr-xr-x  1 root  wheel  4582664 Nov 22  2019 start4db.elf
-rwxr-xr-x  1 root  wheel  3536680 Nov 22  2019 start4x.elf
-rwxr-xr-x  1 root  wheel   499824 Jul  2 04:58 u-boot.bin

and EFI/BOOT is
$ /bin/ls -l /boot/msdos/EFI/BOOT/
total 680
-rwxr-xr-x  1 root  wheel  695088 Jul  2 08:18 bootaa64.efi

I download the lastest image from the 23th of July and u-boot.bin and bootaa64.efi have a newer date and a different md5 checksum so I would assume that they were updated. Which brings me to the question, must /boot/loader.efi keep in sync with the kernel?

Thanks for the much appreciated help so far.
Comment 9 Mark Millard 2020-07-28 18:04:53 UTC
head -r363643 has changed how the cpufreq driver for bcm2835
works:

QUOTE
Author: andrew
Date: Tue Jul 28 10:45:29 2020
New Revision: 363643
URL: https://svnweb.freebsd.org/changeset/base/363643

Log:
  Switch the bcm2835 cpufreq driver to use the firmware interface
  
  Use the new Raspberry Pi firmware driver in the cpufreq driver. It is
  intended all drivers that need to interact with the firmware will move to
  use the firmware driver, this is the first.
END QUOTE

(Prior check-ins added the new RPi firmware driver.)

So it may be important to only deal with testing and reporting
against -r363643 and later for head. Similarly if other use of
the new RPi firmware driver ends up being added.
Comment 10 Gordon Bergling freebsd_committer freebsd_triage 2020-07-28 18:10:02 UTC
(In reply to Mark Millard from comment #9)

I have started a 'buildworld buildkernel' the moment I saw the commit message. It just takes a while on RPi4, even with a -DNO_CLEAN=yes. I report back once I tested the kernel with nextboot.
Comment 11 Gordon Bergling freebsd_committer freebsd_triage 2020-07-29 16:38:49 UTC
After r363643 the problem is fixed.

No mbox related errors are reported during boot.
------------------------------------------------------------------
bcm2835_cpufreq0: ARM 600MHz, Core 200MHz, SDRAM 400MHz, Turbo OFF
------------------------------------------------------------------

And the CPU frequency is changing, checked via

$ sysctl dev.cpu.0.freq

Thanks for the help!