Bug 251113 - sysutils/turbostat fails on zen cpu need newer version
Summary: sysutils/turbostat fails on zen cpu need newer version
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-ports-bugs (Nobody)
URL:
Keywords: buildisok
Depends on:
Blocks:
 
Reported: 2020-11-13 20:49 UTC by Nick Wolff
Modified: 2020-12-04 23:58 UTC
CC List: 2 users

See Also:
bugzilla: maintainer-feedback? (d.scott.phillips)


Attachments
Diff of the ports patch that builds but has nopped out functions (17.96 KB, patch)
2020-11-13 20:49 UTC, Nick Wolff
Epyc 7551 sysctl topology-sched output (18.12 KB, text/plain)
2020-12-04 20:52 UTC, Nick Wolff

Description Nick Wolff 2020-11-13 20:49:17 UTC
Created attachment 219649
Diff of the ports patch that builds but has nopped out functions

Turbostat fails on Zen CPUs; the port needs a newer version of turbostat.

There's a minor sysctl error that I will submit a FreeBSD review for, but even after that fix the current turbostat doesn't understand the CPU topology.

I've included a patch that gets a newer turbostat version building (including the sysctl access fix), but the new functions are nopped out so that it builds. It still doesn't work; the patch only deals with some build fixes for new dependencies.
Comment 1 Nick Wolff 2020-11-13 22:47:54 UTC
For cross-referencing, here is a review I filed for another bug that needs to be fixed in turbostat, where the topology sysctl output can be larger than the 16k buffer allocated for it. The patch in that review is complete and can be put in the ports tree in case there are any systems where turbostat understands the CPU but the topology is too long.
https://reviews.freebsd.org/D27209
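A minimal sketch of the general pattern for reading a value that may not fit a fixed 16k buffer: retry with a larger buffer whenever the read reports ENOMEM, which is the error sysctl(3) documents for a too-short buffer. This is an illustration only and not necessarily what D27209 does. The names `read_variable_length`, `fetch` and `make_fake_sysctl` are hypothetical; `fetch` stands in for the real sysctl call so the logic can be shown without a FreeBSD system.

```python
import errno


def read_variable_length(fetch, initial_size=16 * 1024, max_size=1 << 20):
    """Call fetch(size) with a doubling buffer size until the value fits.

    `fetch` is a hypothetical stand-in for a sysctl read: it returns the
    value as bytes, or raises OSError(ENOMEM) when `size` is too small.
    """
    size = initial_size
    while size <= max_size:
        try:
            return fetch(size)
        except OSError as exc:
            if exc.errno != errno.ENOMEM:
                raise  # unrelated failure, do not retry
            size *= 2  # buffer too small, try again with twice the room
    raise OSError(errno.ENOMEM, "value larger than max_size")


def make_fake_sysctl(value):
    """Build a fake reader that behaves like a sysctl holding `value`."""
    def fetch(size):
        if size < len(value):
            raise OSError(errno.ENOMEM, "buffer too small")
        return value
    return fetch


# An 18000-byte value does not fit in 16k but does fit after one doubling.
spec = read_variable_length(make_fake_sysctl(b"x" * 18000))
```

The upper bound `max_size` is an arbitrary assumption that keeps a misbehaving reader from causing unbounded growth.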
Comment 2 Automation User 2020-11-28 01:58:43 UTC
Build and package info is available at https://gitlab.com/swills/freebsd-ports/pipelines/222446216
Comment 3 Nick Wolff 2020-12-04 18:20:46 UTC
With review D27209 in place to deal with the size of the sysctl, the error message is now:

epyc1% sudo turbostat
turbostat version 17.06.23 - Len Brown <lenb@kernel.org>
CPUID(0): AuthenticAMD 13 CPUID levels; family:model:stepping 0xf:1:2 (15:1:2)
CPUID(1): SSE3 MONITOR - - - TSC MSR - -
CPUID(6): APERF, No-TURBO, No-DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, No-EPB
CPUID(7): No-SGX
NSFOD /sys/devices/system/cpu/cpu24/cpufreq/scaling_driver
zsh: bus error  sudo turbostat


Attaching a core file from a debug build of turbostat, and a debug package built on head that includes the binary with variable information.


I still think the best solution is to import the latest version of turbostat from Linux, but that is a non-zero amount of work.
Comment 4 Nick Wolff 2020-12-04 19:07:47 UTC
I wasn't able to upload the package with the debug binary and the core dump, so here it is: https://drive.google.com/drive/folders/1kKMI7JclesfwpVGvzwvfVPK4K9WaeYBL?usp=sharing
Comment 5 D Scott Phillips freebsd_committer freebsd_triage 2020-12-04 19:58:14 UTC
looks like the topology parsing code is going off the rails? that core file has:

topo = {
  num_packages = 22,
  num_cpus = 256,
  num_cores = 64,
  max_cpu_num = 255,
  num_cores_per_pkg = 8,
  num_threads_per_core = 2
}

where 22 packages must be bogus. Could you post this machine's kern.sched.topology_spec?
Comment 6 Nick Wolff 2020-12-04 20:52:25 UTC
Created attachment 220263
Epyc 7551 sysctl topology-sched output

Attached topology sysctl output as requested.
Comment 7 D Scott Phillips freebsd_committer freebsd_triage 2020-12-04 23:36:48 UTC
What topology line does the kernel print when this machine boots? Is it:

FreeBSD/SMP: 2 package(s) x 4 groups x 2 cache groups x 4 core(s) x 2 hardware threads

If so, it looks like kern.sched.topology_spec collapses package and group together into group level 2. we might need to pull the topology logic all into turbostat to deal with that.
Comment 8 Nick Wolff 2020-12-04 23:58:42 UTC
Yes it appears to match what you thought it would be.

FreeBSD/SMP: Multiprocessor System Detected: 128 CPUs
FreeBSD/SMP: 2 package(s) x 4 groups x 2 cache groups x 4 core(s) x 2 hardware threads
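As a quick arithmetic check, the factors in the boot line can be multiplied out and compared with the "128 CPUs" line and with the topo values from the core file in comment 5. This is only a sketch; the helper name `smp_factors` is made up for illustration, and the parsing assumes the exact "N label x M label" layout shown above.

```python
import math


def smp_factors(line):
    """Return the integer factors from a 'FreeBSD/SMP: N a x M b ...' line."""
    _, _, spec = line.partition(": ")
    return [int(part.split()[0]) for part in spec.split(" x ")]


boot_line = ("FreeBSD/SMP: 2 package(s) x 4 groups x 2 cache groups "
             "x 4 core(s) x 2 hardware threads")

factors = smp_factors(boot_line)              # [2, 4, 2, 4, 2]
logical_cpus = math.prod(factors)             # every level multiplied together
cores = math.prod(factors[:-1])               # drop the hardware-thread level
cores_per_package = math.prod(factors[1:-1])  # drop packages and threads

# Values copied from the topo struct in comment 5.
dumped = {"num_packages": 22, "num_cpus": 256,
          "num_cores": 64, "num_cores_per_pkg": 8}

print(logical_cpus, cores, cores_per_package)
print("packages match:", dumped["num_packages"] == factors[0])
print("cpus match:", dumped["num_cpus"] == logical_cpus)
```

The product is 128 logical CPUs, which agrees with the kernel's own count, and 64 cores, which agrees with num_cores in the core file. The boot line gives 2 packages and 32 cores per package, whereas the core file has num_packages = 22, num_cpus = 256 and num_cores_per_pkg = 8.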

On top of redoing how we get the topology, I think turbostat from Linux kernel 4.17 is just too old. That makes its last commit from Oct 2017: https://github.com/torvalds/linux/commits/master?after=b3298500b23f0b53a8d81e0d5ad98a29db71f4f0+82&branch=master&path%5B%5D=tools&path%5B%5D=power&path%5B%5D=x86&path%5B%5D=turbostat This shows all the newer commits, starting with the first new one at the bottom of the page and continuing, if you select "Newer", until you get to head.


Around here https://github.com/torvalds/linux/commit/40f5cfe7b886676f00e860b482c4bf7103413a24#diff-e8059b77880f85661c08aad6158834e7de2b4625c7828a7a28d4956b0127dfcc is where they made turbostat capable of handling Epyc processors, which ran into a very similar parsing issue on Linux.

Thanks for looking into all this.