Reproduction steps:

1. Get the current Arch Linux ISO (or another rolling-release Linux distribution). The following deals with Arch Linux.
2. Boot the install medium inside the bhyve VM and attempt to run any of: vim, python3, archinstall, gdb (if installed), localedef.
3. All of the above crash with a segfault (SIGSEGV) and error 4 (the cause was a user-mode read resulting in no page being found).
4. Downgrading to glibc-2.39-1 fixes all of the above applications, though in the case of bootstrapping scripts like archinstall this can fail to work if, for instance, the script re-downloads glibc.

Existing board post discussing this:
https://bbs.archlinux.org/viewtopic.php?id=295802

Offending commit:
https://sourceware.org/git/?p=glibc.git;a=commit;h=aa4249266e9906c4bc833e4847f4d8feef59504f

Affects:
- Ryzen 5 7600, possibly more AMD Zen 3 & Zen 4 CPUs

Last working version:
- linux glibc-2.39-1

Relevant /boot/loader.conf:
vmm_load="YES"
hw.vmm.amdvi.enable="1"

Relevant /etc/rc.conf:
vm_enable="YES"
vm_dir="zfs:zroot/vm"

vm-bhyve configuration file:
loader="uefi"
graphics="yes"
xhci_mouse="yes"
cpu="8"
cpu_sockets="1"
cpu_cores="4"
cpu_threads="2"
memory="8G"
ahci_device_limit="8"
network0_type="virtio-net"
network0_switch="public"
disk0_type="nvme"
disk0_name="disk0.img"
Recently I was trying to install nixos-24.05 in a bhyve VM; the official ISO crashed during boot, but 23.11 can be installed without any issues. Searching for the glibc package in the different NixOS releases shows that nixos-24.05 uses glibc 2.39-52 while nixos-23.11 uses glibc 2.38-77. I think this might be the same issue.
So what is the instruction that faults?
How can I help debug this? On my Arch VM I still downgrade glibc, which only works for so long on a rolling release. I tried to install Fedora Server 40 (netinstall), which also fails during installation. It's kind of difficult to debug, as gdb will also crash when the latest glibc is installed. The other user on the Arch forum made some backtraces of some crashing applications: https://bbs.archlinux.org/viewtopic.php?pid=2172581#p2172581
Upgrading a debian sid today:

[....]
Unpacking libc6:amd64 (2.39-4) over (2.38-14) ...
Setting up libc6:amd64 (2.39-4) ...
free(): invalid pointer
dpkg: error processing package libc6:amd64 (--configure):
 installed libc6:amd64 package post-installation script subprocess was killed by signal (Aborted), core dumped
Errors were encountered while processing:
 libc6:amd64
Error: Sub-process /usr/bin/dpkg returned an error code (1)
[....]
CPU: AMD Ryzen 9 5950X 16-Core Processor (3393.72-MHz K8-class CPU)
I see in that sid dmesg:

[173364.415802] apt[5234]: segfault at 7f29a9d1aff8 ip 00007f29aa36ff25 sp 00007ffcc0b4ad18 error 4 in libc.so.6[7f29aa23f000+158000] likely on CPU 3 (core 0, socket 3)
[173364.415815] Code: fe 6f 76 40 48 8d 8c 17 7f ff ff ff c5 fe 6f 7e 60 c5 7e 6f 44 16 e0 48 29 fe 48 83 e1 e0 48 01 ce 0f 1f 40 00 c5 fe 6f 4e 60 <c5> fe 6f 56 40 c5 fe 6f 5e 20 c5 fe 6f 26 48 83 c6 80 c5 fd 7f 49
[173385.774003] traps: appstreamcli[5351] general protection fault ip:7f793c03993a sp:7ffcfcdaae90 error:0 in libc.so.6[7f793bfc5000+158000]

I don't know too much about Linux; let me know if I can help.
(In reply to jordy from comment #3)
> It's kind of difficult to debug as gdb will also crash when the latest glibc
> is installed. The other user of the arch forum made some backtraces of some
> applications crashing:
> https://bbs.archlinux.org/viewtopic.php?pid=2172581#p2172581

The easy solution is presumably to crash it & get a core, then downgrade glibc and use gdb. If you load the core, `disas` should presumably get us what we need to make any kind of progress.
(In reply to Kyle Evans from comment #7) Most likely noit, unfortunately. The downgrade of libc would cause wrong code to be disassembled, because text is not dumped normally. Is there a way to affect the glibc selection of the CPU-optimized functions, like our "ARCHLEVEL" env var?
(In reply to Konstantin Belousov from comment #8) One could `add-symbol-file` in a saved off copy of the broken shlib at the expected offset and get expected disassembly, perhaps? Not as easy as written before, but an option if there's no env for this.
Perhaps LD_LIBRARY_PATH to force the crashing binary to use crashing libc is the easiest.
Is this the kind of thing you need? "disas" didn't work, so I tried dumping the instructions near the program counter instead. (I have no idea what I'm doing when it comes to gdb.)

root@localhost:~# gdb --core=python3.core
GNU gdb (Debian 13.2-1+b2) 13.2
...
Core was generated by `python3'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000553914 in ?? ()
(gdb) bt
#0  0x0000000000553914 in ?? ()
#1  0x0000000000000000 in ?? ()
(gdb) disas
No function contains program counter for selected frame.
(gdb) x/50i ($pc - 100)
   0x5538b0:	test   %eax,%eax
   0x5538b2:	je     0x554064
   0x5538b8:	test   %eax,%eax
   0x5538ba:	jns    0x55406d
   0x5538c0:	mov    %r14,%r12
   0x5538c3:	cmp    %r14,%r15
   0x5538c6:	jae    0x553f1d
   0x5538cc:	mov    %ebp,%r14d
   0x5538cf:	shr    $0x6,%bpl
   0x5538d3:	lea    0x28(%r13),%rax
   0x5538d7:	mov    %r13,0x28(%rsp)
   0x5538dc:	and    $0x1,%ebp
   0x5538df:	shr    $0x5,%r14b
   0x5538e3:	mov    %rax,0x10(%rsp)
   0x5538e8:	mov    %r12,%r13
   0x5538eb:	mov    %bpl,0x8(%rsp)
   0x5538f0:	and    $0x1,%r14d
   0x5538f4:	mov    %rbx,0x30(%rsp)
   0x5538f9:	mov    %r14d,%ebx
   0x5538fc:	mov    %r8,%r14
   0x5538ff:	mov    %r13,%rax
   0x553902:	mov    %r14,%rdx
   0x553905:	sub    %r15,%rax
   0x553908:	sar    $0x4,%rax
   0x55390c:	lea    (%r15,%rax,8),%rbp
   0x553910:	mov    0x0(%rbp),%rsi
=> 0x553914:	mov    0x10(%rsi),%r12
   0x553918:	movzbl 0x20(%rsi),%eax
   0x55391c:	cmp    %r14,%r12
   0x55391f:	cmovle %r12,%rdx
   0x553923:	test   $0x20,%al
   0x553925:	je     0x451a14
   0x55392b:	test   $0x40,%al
   0x55392d:	je     0x554dfc
   0x553933:	add    $0x28,%rsi
   0x553937:	test   %bl,%bl
   0x553939:	je     0x555085
   0x55393f:	cmpb   $0x0,0x8(%rsp)
   0x553944:	je     0x554018
   0x55394a:	mov    0x10(%rsp),%rdi
   0x55394f:	call   0x4217f0
   0x553954:	test   %eax,%eax
   0x553956:	je     0x554030
   0x55395c:	test   %eax,%eax
   0x55395e:	jns    0x554040
   0x553964:	cmp    %rbp,%r15
   0x553967:	jae    0x55404d
   0x55396d:	mov    %rbp,%r13
   0x553970:	jmp    0x5538ff
   0x553972:	nopw   0x0(%rax,%rax,1)

And for vim:

root@localhost:~# gdb --core=vim.core
GNU gdb (Debian 13.2-1+b2) 13.2
...
Core was generated by `vim'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fee03ec47a7 in ?? ()
(gdb) bt
#0  0x00007fee03ec47a7 in ?? ()
#1  0x0000559fc8dc4831 in ?? ()
#2  0x00007fffc3822820 in ?? ()
#3  0x00000000000001a6 in ?? ()
#4  0x00007fee03de9440 in ?? ()
#5  <signal handler called>
#6  0x00007fee03f1339c in ?? ()
#7  0x00007fffc3822860 in ?? ()
#8  0x2c0d8adf099bf900 in ?? ()
#9  0x0000000000000006 in ?? ()
#10 0x00007fee03de9440 in ?? ()
#11 0x00007fffc3822820 in ?? ()
#12 0x00007fffc3822820 in ?? ()
#13 0x00007fffc3822820 in ?? ()
#14 0x00007fee03ec44f2 in ?? ()
#15 0x00007fee04060b50 in ?? ()
#16 0x00007fee03ead4ed in ?? ()
#17 0x0000000000000020 in ?? ()
#18 0x0000000000000000 in ?? ()
(gdb) x/50i ($pc - 100)
   0x7fee03ec4743:	jne    0x7fee03ec4652
   0x7fee03ec4749:	xor    %edx,%edx
   0x7fee03ec474b:	xor    %esi,%esi
   0x7fee03ec474d:	jmp    0x7fee03ec4652
   0x7fee03ec4752:	nopw   0x0(%rax,%rax,1)
   0x7fee03ec4758:	mov    0x19a699(%rip),%rdx        # 0x7fee0405edf8
   0x7fee03ec475f:	neg    %eax
   0x7fee03ec4761:	mov    %eax,%fs:(%rdx)
   0x7fee03ec4764:	mov    $0xffffffff,%edx
   0x7fee03ec4769:	jmp    0x7fee03ec4717
   0x7fee03ec476b:	call   0x7fee03f98b20
   0x7fee03ec4770:	sub    $0x8,%rsp
   0x7fee03ec4774:	call   0x7fee03f18220
   0x7fee03ec4779:	test   %eax,%eax
   0x7fee03ec477b:	jne    0x7fee03ec4788
   0x7fee03ec477d:	add    $0x8,%rsp
   0x7fee03ec4781:	ret
   0x7fee03ec4782:	nopw   0x0(%rax,%rax,1)
   0x7fee03ec4788:	mov    0x19a669(%rip),%rdx        # 0x7fee0405edf8
   0x7fee03ec478f:	mov    %eax,%fs:(%rdx)
   0x7fee03ec4792:	mov    $0xffffffff,%eax
   0x7fee03ec4797:	jmp    0x7fee03ec477d
   0x7fee03ec4799:	nopl   0x0(%rax)
   0x7fee03ec47a0:	mov    $0x3e,%eax
   0x7fee03ec47a5:	syscall
=> 0x7fee03ec47a7:	cmp    $0xfffffffffffff001,%rax
   0x7fee03ec47ad:	jae    0x7fee03ec47b0
   0x7fee03ec47af:	ret
   0x7fee03ec47b0:	mov    0x19a641(%rip),%rcx        # 0x7fee0405edf8
   0x7fee03ec47b7:	neg    %eax
   0x7fee03ec47b9:	mov    %eax,%fs:(%rcx)
   0x7fee03ec47bc:	or     $0xffffffffffffffff,%rax
   0x7fee03ec47c0:	ret
   0x7fee03ec47c1:	cs nopw 0x0(%rax,%rax,1)
   0x7fee03ec47cb:	nopl   0x0(%rax,%rax,1)
   0x7fee03ec47d0:	mov    $0x8,%esi
   0x7fee03ec47d5:	mov    $0x7f,%eax
   0x7fee03ec47da:	syscall
   0x7fee03ec47dc:	cmp    $0xfffffffffffff000,%rax
   0x7fee03ec47e2:	ja     0x7fee03ec47e8
   0x7fee03ec47e4:	ret
   0x7fee03ec47e5:	nopl   (%rax)
   0x7fee03ec47e8:	mov    0x19a609(%rip),%rdx        # 0x7fee0405edf8
   0x7fee03ec47ef:	neg    %eax
   0x7fee03ec47f1:	mov    %eax,%fs:(%rdx)
   0x7fee03ec47f4:	mov    $0xffffffff,%eax
   0x7fee03ec47f9:	ret
   0x7fee03ec47fa:	nopw   0x0(%rax,%rax,1)
   0x7fee03ec4800:	cmpb   $0x0,0x1a2839(%rip)        # 0x7fee04067040
   0x7fee03ec4807:	je     0x7fee03ec4820

To get the above output, I used the latest Debian Sid nocloud image: https://cloud.debian.org/cdimage/cloud/sid/daily/20240721-1815/debian-sid-nocloud-amd64-daily-20240721-1815.tar.xz

I ran it on bhyve on AMD to get the core dump and on KVM on Intel to debug it. The debug version of Python (python3-dbg) doesn't crash, so I don't know how to get debug symbols.
Just in case, could you also please show the GPR dump ((gdb) info registers)? From what you posted, the damage was done elsewhere, and it is an innocent general-purpose instruction that got an invalid memory address. Is it possible to install debugging symbols on the Linux distro you use?
(In reply to Konstantin Belousov from comment #12)

Here are the register values you asked for. Installing the debug symbols using debuginfod (or find-dbgsym-packages) doesn't seem to have changed the backtraces. The damage is presumably done by something related to memcpy/memmove, since this commit is what causes the symptoms to manifest: https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=aa4249266e9906c4bc833e4847f4d8feef59504f;hp=5a461f2949ded98d8211939f84988bc464c7b4fe

Python:

root@localhost:~# gdb --core=python3.core
GNU gdb (Debian 13.2-1+b2) 13.2
...
This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.debian.net>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Core was generated by `python3'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000553914 in ?? ()
(gdb) bt
#0  0x0000000000553914 in ?? ()
#1  0x0000000000000000 in ?? ()
(gdb) info registers
rax            0x5                 5
rbx            0x1                 1
rcx            0x7                 7
rdx            0xc                 12
rsi            0xa2967             665959
rdi            0x7f1c2020d318      139758774833944
rbp            0x7f1c201a4458      0x7f1c201a4458
rsp            0x7ffd271820b0      0x7ffd271820b0
r8             0xc                 12
r9             0x1                 1
r10            0x7f1c202eb078      139758775742584
r11            0x7f1c20434d00      139758777093376
r12            0x7f1c201a4480      139758774404224
r13            0x7f1c201a4480      139758774404224
r14            0xc                 12
r15            0x7f1c201a4430      139758774404144
rip            0x553914            0x553914
eflags         0x10216             [ PF AF IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

Vim:

root@localhost:~# gdb --core=vim.core
GNU gdb (Debian 13.2-1+b2) 13.2
...
This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.debian.net>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Core was generated by `vim'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fee03ec47a7 in ?? ()
(gdb) bt
#0  0x00007fee03ec47a7 in ?? ()
#1  0x0000559fc8dc4831 in ?? ()
#2  0x00007fffc3822820 in ?? ()
#3  0x00000000000001a6 in ?? ()
#4  0x00007fee03de9440 in ?? ()
#5  <signal handler called>
#6  0x00007fee03f1339c in ?? ()
#7  0x00007fffc3822860 in ?? ()
#8  0x2c0d8adf099bf900 in ?? ()
#9  0x0000000000000006 in ?? ()
#10 0x00007fee03de9440 in ?? ()
#11 0x00007fffc3822820 in ?? ()
#12 0x00007fffc3822820 in ?? ()
#13 0x00007fffc3822820 in ?? ()
#14 0x00007fee03ec44f2 in ?? ()
#15 0x00007fee04060b50 in ?? ()
#16 0x00007fee03ead4ed in ?? ()
#17 0x0000000000000020 in ?? ()
#18 0x0000000000000000 in ?? ()
(gdb) info registers
rax            0x0                 0
rbx            0x1                 1
rcx            0x7fee03ec47a7      140660244760487
rdx            0x0                 0
rsi            0x6                 6
rdi            0x1a6               422
rbp            0x6                 0x6
rsp            0x7fffc38220d8      0x7fffc38220d8
r8             0x7fffc3822020      140736473473056
r9             0x559fdb866f50      94145071181648
r10            0x8                 8
r11            0x206               518
r12            0x7fffc3822820      140736473475104
r13            0x6                 6
r14            0x7fffc3822820      140736473475104
r15            0x7fffc3822820      140736473475104
rip            0x7fee03ec47a7      0x7fee03ec47a7
eflags         0x206               [ PF IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
*** Bug 282927 has been marked as a duplicate of this bug. ***
Hi, I'm also an AlmaLinux team member and encountered the same issue with AlmaLinux 9.5 and 10 Kitten. AlmaLinux 10 Kitten has glibc 2.39, but 9.x has older versions. In AlmaLinux 9, glibc-2.34-100.el9_4.4 for 9.4 doesn't have the issue but glibc-2.34-125.el9_5.1 for 9.5 does, so I guessed some change between those versions was responsible. It was the offending commit the reporter mentioned.

https://git.almalinux.org/rpms/glibc/commit/4da5357bcd7b44f2ee8891f477a5a6c34973ddba
https://gitlab.com/redhat/centos-stream/rpms/glibc/-/commit/6dbf26d6f46fce37e63fcf2eec542b03ef4e0704

The AlmaLinux team reverted the changes, which avoided the issue. I'm still not sure whether this is a glibc bug or a bhyve bug; at least, I've never seen this issue anywhere except bhyve. Has anyone seen this on bare-metal Zen 3/4 machines?
Not exposing ERMS in CPUID works fine here as a workaround with Arch Linux on Zen 3:

diff --git a/sys/amd64/vmm/x86.c b/sys/amd64/vmm/x86.c
--- a/sys/amd64/vmm/x86.c
+++ b/sys/amd64/vmm/x86.c
@@ -434,7 +434,7 @@ x86_emulate_cpuid(struct vcpu *vcpu, uint64_t *rax, uint64_t *rbx,
 			    CPUID_STDEXT_BMI1 | CPUID_STDEXT_HLE |
 			    CPUID_STDEXT_AVX2 | CPUID_STDEXT_SMEP |
 			    CPUID_STDEXT_BMI2 |
-			    CPUID_STDEXT_ERMS | CPUID_STDEXT_RTM |
+			    CPUID_STDEXT_RTM |
 			    CPUID_STDEXT_AVX512F |
 			    CPUID_STDEXT_AVX512DQ |
 			    CPUID_STDEXT_RDSEED |
(In reply to Evgenii Khramtsov from comment #16)

Thanks for the patch; it works here as well on EPYC Milan.
If you copy over /usr/bin/ld.so from glibc 2.40 into an older distribution (without replacing the installed /usr/bin/ld.so on the older distribution), does “ld.so --list-diagnostics” work? If so, please attach its output.
I am trying to reproduce it with a different hypervisor/emulation (in this case qemu/kvm) on a Ryzen 9 5900X (Zen 3), but both AlmaLinux 10 Kitten (glibc 2.39) and Debian sid (glibc 2.40) boot and work without any issue. I also verified on Debian sid that the selected memcpy/memmove is indeed the one optimized by the glibc change (__memcpy_avx_unaligned_erms). I even ran the glibc memcpy/memmove tests in this VM, which stress a lot of different sizes and alignments for the different memcpy/memmove implementations. Also, my daily workstation (Ryzen 9 5900X) uses a recent glibc that contains this change, and I haven't seen any memcpy/memmove-related issue. So I am not sure this is a glibc issue.
Created attachment 255691 [details] ld.so --list-diagnostics from glibc240
(In reply to Getz Mikalsen from comment #20)
> ld.so --list-diagnostics from glibc240

Sorry, this says “version.version="2.36"” in the dump, and it lacks the CPUID diagnostics.
(In reply to Florian Weimer from comment #21)

It's /usr/bin/ld.so from glibc 2.40 copied over into an older distribution running 2.36, just like you asked.

I was able to install the latest Arch Linux ISO by not exposing ERMS in CPUID. Then I reverted the patch and it still booted, although it would easily segfault. I can attach the --list-diagnostics output from that VM if that'd be of more use.
(In reply to Getz Mikalsen from comment #22)
> It's /usr/bin/ld.so from glibc 2.40 copied over into an older distribution
> running 2.36 just like you asked.

Sorry, it looks like you have copied over the symbolic link instead of the symbolic link target. I should have mentioned that.
Created attachment 255708 [details] ld-linux-x86-64.so.2 --list-diagnostics from glibc2.40 (In reply to Florian Weimer from comment #23) Sorry my bad, here is the actual ld-linux-x86-64.so.2 output, no symlink this time. :-)
(In reply to Getz Mikalsen from comment #24)

I can confirm that the data looks complete now, thanks. Now I just have to find someone who can make sense of it. 8-)
Just another data point that seems to be related: this bug rears its head on a Ryzen 7 5700G system running 14.2-RELEASE. It's currently impossible to install a bhyve instance with any of the RHEL derivatives or Fedora (AlmaLinux 9.5, CentOS Stream 9, Rocky 9.5, Oracle 9.5, and Fedora 41). You can install AlmaLinux 9.4 or Rocky 9.4 successfully (I haven't tried all the others) and it runs perfectly. Doing a "dnf upgrade" on those leads to a borked system, as segmentation faults start showing up as soon as glibc is upgraded.
(In reply to Darren Henderson from comment #26) Yes, RHEL derivatives 9.5 includes this patch to glibc and it is an issue: https://gitlab.com/redhat/centos-stream/rpms/glibc/-/commit/6dbf26d6f46fce37e63fcf2eec542b03ef4e0704 I reported the issue to glibc upstream but it seems that the cause is on bhyve side because I've only seen this problem on bhyve. https://sourceware.org/bugzilla/show_bug.cgi?id=30994 bhyve guru is wanted!
This has been sitting unassigned for nearly six months now. Can we raise the priority and the scope of affected systems? It is present in 14.2 as well as 14.1 and is going to become more critical as the glibc versions in question propagate further and further. I haven't seen any indication that it only affects a few AMD CPUs, so chances are it's going to start snowballing. I don't want to overstate it, but the potential is there for bhyve to become less and less functional for a significant number of people.
(In reply to Darren Henderson from comment #28)

The glibc changes indicated in this bug were specific to AMD. I have tried numerous times to reproduce this issue on Intel systems but have not seen it occur. Sadly, I don't have an applicable AMD system to test on.

Here's my Arch install as a bhyve guest:

$ uname -a
Linux archlinux 6.12.4-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 09 Dec 2024 14:31:57 +0000 x86_64 GNU/Linux
$ ld.so --list-diagnostics | grep -e version.version
version.version="2.40"

Given some of the ld.so output in previous updates to this bug, I did a diff to try to find the x86 tunable changes between them.

43c40
< version.version="2.36"
---
> version.version="2.40"
189,191c196,199
< x86.cpu_features.rep_movsb_threshold=0x2000
< x86.cpu_features.rep_movsb_stop_threshold=0x80000
< x86.cpu_features.rep_stosb_threshold=0x800
---
> x86.cpu_features.memset_non_temporal_threshold=0xc000000000
> x86.cpu_features.rep_movsb_threshold=0x0
> x86.cpu_features.rep_movsb_stop_threshold=0xc000000000
> x86.cpu_features.rep_stosb_threshold=0xffffffffffffffff

Running binaries with this environment set, so the Intel system would run the same code path in glibc as AMD would, did not cause the issue on my Intel system:

GLIBC_TUNABLES="glibc.cpu.x86_memset_non_temporal_threshold=0xc000000000:glibc.cpu.x86_rep_movsb_threshold=0x0:glibc.cpu.x86_rep_movsb_stop_threshold=0xc000000000:glibc.cpu.x86_rep_stosb_threshold=0xffffffffffffffff"

If anyone has seen it on Intel, it would be great to have the repro steps added to this bug.
I do believe that the problem is specific to AMD. The AMD hardware virtualization assist (SVM) is completely different from the Intel facility (VMX), and perhaps there is a bug in bhyve. What is not clear to me is whether this is a bug to look for in bhyve, or some hardware quirk in the reported CPU models. As a first step, could somebody extract the code that is used in the broken configuration? I suspect that it is actually ERMS (or FSRM) + AVX2 and not just ERMS memcpy().
This is reportedly fixed in Alma Linux.
(In reply to Michael Dexter from comment #31) Can you provide any details on the supposed fix?
If someone can provide me SSH access to a guest system that exhibits the issue once glibc is updated, I can try to debug it (assuming that GDB/ptrace works on the guest, but I don't see why it wouldn't). No root access or glibc update would be needed; the issue should reproduce with an uninstalled upstream rebuild of glibc. I'm not convinced I will be able to reproduce it locally.
(In reply to Konstantin Belousov from comment #32)
> This is reportedly fixed in Alma Linux.

I don't know exactly what Michael Dexter meant, but this is a misunderstanding. Speaking with my AlmaLinux developer hat on, I would say that the issue is not FIXED on the AlmaLinux side. AlmaLinux just reverted the offending commit to AVOID the issue. It is not a FIX at all. There's no progress on the AlmaLinux side toward a fundamental fix.
(In reply to Florian Weimer from comment #33) I can provide you access to the system if you're IPv6 reachable. Could you email me the public SSH key?
Thanks for the offers of machine access. I should have studied the ld.so --list-diagnostics output. It's a recurrence of the previous rep_movsb_threshold bug because it ends up as zero.

The bhyve bug that triggers this is that it reports 1 TiB of L3 cache (x86.cpu_features.level3_cache_size=0x10000000000 in the diagnostic output). This triggers an integer truncation in glibc's cache size computation. Misreporting cache information like this typically impacts performance, so it should be fixed independently of the glibc bug.

The glibc bug fix is below. I'll submit it upstream and we'll backport it.

diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index e9579505..6a0a30ba 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -1021,11 +1021,11 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
     non_temporal_threshold = maximum_non_temporal_threshold;
 
   /* NB: The REP MOVSB threshold must be greater than VEC_SIZE * 8.  */
-  unsigned int minimum_rep_movsb_threshold;
+  unsigned long int minimum_rep_movsb_threshold;
   /* NB: The default REP MOVSB threshold is 4096 * (VEC_SIZE / 16) for
      VEC_SIZE == 64 or 32.  For VEC_SIZE == 16, the default REP MOVSB
      threshold is 2048 * (VEC_SIZE / 16).  */
-  unsigned int rep_movsb_threshold;
+  unsigned long int rep_movsb_threshold;
   if (CPU_FEATURE_USABLE_P (cpu_features, AVX512F)
       && !CPU_FEATURE_PREFERRED_P (cpu_features, Prefer_No_AVX512))
     {

With this fix, the testsuite is very clean; only nptl/tst-mutex10 fails with a test timeout. The cause is unclear. (The test does not actually use elision because the system does not support RTM.)
(In reply to Florian Weimer from comment #36)

Do you see which CPUID leaf causes the trouble? As a guess, I wonder if the following bhyve patch helps (it tries to fix CPUID leaf 0x8000_001D %ecx 3):

diff --git a/sys/amd64/vmm/x86.c b/sys/amd64/vmm/x86.c
index a833b61786e7..8474666b5e6f 100644
--- a/sys/amd64/vmm/x86.c
+++ b/sys/amd64/vmm/x86.c
@@ -256,7 +256,7 @@ x86_emulate_cpuid(struct vcpu *vcpu, uint64_t *rax, uint64_t *rbx,
 				func = 3;	/* unified cache */
 				break;
 			default:
-				logical_cpus = 0;
+				logical_cpus = sockets * threads * cores;
 				level = 0;
 				func = 0;
 				break;
(In reply to Konstantin Belousov from comment #37)

This didn't help. It still reports the same L3 cache size.

$ ld.so --list-diagnostics | grep cache
x86.cpu_features.data_cache_size=0x8000
x86.cpu_features.shared_cache_size=0x1000000000
x86.cpu_features.level1_icache_size=0x8000
x86.cpu_features.level1_icache_linesize=0x40
x86.cpu_features.level1_dcache_size=0x8000
x86.cpu_features.level1_dcache_assoc=0x8
x86.cpu_features.level1_dcache_linesize=0x40
x86.cpu_features.level2_cache_size=0x80000
x86.cpu_features.level2_cache_assoc=0x8
x86.cpu_features.level2_cache_linesize=0x40
x86.cpu_features.level3_cache_size=0x1000000000
x86.cpu_features.level3_cache_assoc=0x0
x86.cpu_features.level3_cache_linesize=0x40
x86.cpu_features.level4_cache_size=0x0
x86.cpu_features.cachesize_non_temporal_divisor=0x4
(In reply to Konstantin Belousov from comment #37)
> Do you see which CPUID leaf causes the trouble?

Let me try based on attachment 255708 [details]. The maximum leaf is 0x80000023 according to this:

x86.processor[0x0].cpuid.eax[0x80000000].eax=0x80000023

Ordinarily, handle_amd in sysdeps/x86/dl-cacheinfo.h would use the modern way of obtaining cache details, using leaf 0x8000001D:

x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x0].eax=0x121
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x0].ebx=0x3f
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x0].ecx=0x0
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x0].edx=0x0
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x1].eax=0x143
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x1].ebx=0x3f
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x1].ecx=0x0
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x1].edx=0x0
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x2].eax=0x163
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x2].ebx=0x3f
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x2].ecx=0x0
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x2].edx=0x0
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x3].eax=0x3ffc100
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x3].ebx=0x0
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x3].ecx=0x0
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x3].edx=0x0
x86.processor[0x0].cpuid.subleaf_eax[0x8000001d].ecx[0x3].until_ecx=0x1ff

L3 cache data is subleaf 3. We have a safety check that requires ECX != 0, in case hypervisors do not fill in this information, which is what is happening here. We fall back to the legacy way of obtaining the cache size.
That uses leaf 0x80000006 for L3 cache information:

x86.processor[0x0].cpuid.eax[0x80000006].eax=0x48002200
x86.processor[0x0].cpuid.eax[0x80000006].ebx=0x68004200
x86.processor[0x0].cpuid.eax[0x80000006].ecx=0x2006140
x86.processor[0x0].cpuid.eax[0x80000006].edx=0x8009140

The base L3 cache size is 2 * (EDX & 0x3ffc0000), so 256 MiB. This is not unreasonable for an EPYC system, and it's probably right. However, that number could be a per-socket number, and the way we use this number for tuning, we need a per-thread amount. We adjust this per leaf 0x80000008. The thread count is in (ECX & 0xff) + 1:

x86.processor[0x0].cpuid.eax[0x80000008].eax=0x3030
x86.processor[0x0].cpuid.eax[0x80000008].ebx=0x7
x86.processor[0x0].cpuid.eax[0x80000008].ecx=0x0
x86.processor[0x0].cpuid.eax[0x80000008].edx=0x10007

So we get 1, and there is no per-thread scale-down. (I think the hypervisor should expose a more realistic count here?) If the CPU family is at least 0x17, we assume that the number is measured per core complex. And that comes again from leaf 0x8000001D, subleaf 3, but this time register EAX. It's computed as (EAX >> 14 & 0xfff) + 1. This evaluates to 4096 here, and I think this is the bug. This CCX count is just way too high. Based on the available information, the glibc code assumes that there are 4096 instances of 256 MiB caches, which translates to 1 TiB of L3 cache (per thread, but the thread count is 1).
(In reply to Koichiro Iwao from comment #34) That was relayed from a colleague and I am awaiting a link. :-|
(In reply to Florian Weimer from comment #39)

For 8000_001Dh, the %ecx == 0 reply seems to be legal, for fully associative caches. This probably explains why my previous attempt did not work; let's arbitrarily set the associativity to 2 (%ecx == 1). From your explanation, and what I see in the code, glibc should then use the 'new way'.

For 8000_0006h, bhyve reflects the data reported by the host CPU.

For 8000_0008h, the reported number of threads is user-controllable, AFAIR. There is a strange force-fallback to legacy reporting of ApicIdSize (%ecx 15:12) when the CPU count per package is less than 16. A useful experiment is to remove it.

diff --git a/sys/amd64/vmm/x86.c b/sys/amd64/vmm/x86.c
index a833b61786e7..b00ae12f802d 100644
--- a/sys/amd64/vmm/x86.c
+++ b/sys/amd64/vmm/x86.c
@@ -150,8 +150,8 @@ x86_emulate_cpuid(struct vcpu *vcpu, uint64_t *rax, uint64_t *rbx,
 			 * pkg_id_shift and other OSes may rely on it.
 			 */
 			width = MIN(0xF, log2(threads * cores));
-			if (width < 0x4)
-				width = 0;
+//			if (width < 0x4)
+//				width = 0;
 			logical_cpus = MIN(0xFF, threads * cores - 1);
 			regs[2] = (width << AMDID_COREID_SIZE_SHIFT) | logical_cpus;
 		}
@@ -256,7 +256,7 @@ x86_emulate_cpuid(struct vcpu *vcpu, uint64_t *rax, uint64_t *rbx,
 				func = 3;	/* unified cache */
 				break;
 			default:
-				logical_cpus = 0;
+				logical_cpus = sockets * threads * cores;
 				level = 0;
 				func = 0;
 				break;
@@ -266,7 +266,7 @@ x86_emulate_cpuid(struct vcpu *vcpu, uint64_t *rax, uint64_t *rbx,
 			regs[0] = (logical_cpus << 14) | (1 << 8) |
 			    (level << 5) | func;
 			regs[1] = (func > 0) ? (CACHE_LINE_SIZE - 1) : 0;
-			regs[2] = 0;
+			regs[2] = 1;	/* Num of cache ways */
 			regs[3] = 0;
 			break;
(In reply to Konstantin Belousov from comment #41)

I still get the same result.

$ ld.so --list-diagnostics | grep cache
x86.cpu_features.data_cache_size=0x8000
x86.cpu_features.shared_cache_size=0x1000000000
x86.cpu_features.level1_icache_size=0x8000
x86.cpu_features.level1_icache_linesize=0x40
x86.cpu_features.level1_dcache_size=0x8000
x86.cpu_features.level1_dcache_assoc=0x8
x86.cpu_features.level1_dcache_linesize=0x40
x86.cpu_features.level2_cache_size=0x80000
x86.cpu_features.level2_cache_assoc=0x8
x86.cpu_features.level2_cache_linesize=0x40
x86.cpu_features.level3_cache_size=0x1000000000
x86.cpu_features.level3_cache_assoc=0x0
x86.cpu_features.level3_cache_linesize=0x40
x86.cpu_features.level4_cache_size=0x0
x86.cpu_features.cachesize_non_temporal_divisor=0x4
(In reply to Koichiro Iwao from comment #42)
> I still get the same result.

With an updated glibc? Which version did you build, and how did you install it?
(In reply to Florian Weimer from comment #43)

I am sure it is with the previous (not fixed) glibc and my bhyve patch applied.

(In reply to Koichiro Iwao from comment #42)

Could you please show the output of 'x86info -r' from patched and non-patched bhyve?
(In reply to Florian Weimer from comment #43)

The glibc I use here is not an updated version. The updated glibc is working fine, and I backported your patch to AlmaLinux 9.5. The updated package will be released soon. I will also report it to CentOS Stream 9 to get the fix backported. Thanks for your effort.
https://git.almalinux.org/rpms/glibc/pulls/3

The issue I'm trying to address here now is the 1 TiB L3 cache issue; I think it is still a bhyve issue and needs to be fixed separately from the glibc issue.

(In reply to Konstantin Belousov from comment #44)

I'll post it later.
Created attachment 256095 [details] x86info-r-vanilla-bhyve-14-1-release.txt
Created attachment 256096 [details] x86info-r-patched-bhyve-14-stable.txt
(In reply to Konstantin Belousov from comment #44)

Sorry, it turned out that I could not apply your patch provided in comment #41. I believe the patch is properly applied now. Now it looks to be working properly.

$ ld.so --list-diagnostics | grep level3
x86.cpu_features.level3_cache_size=0x2
x86.cpu_features.level3_cache_assoc=0x1
x86.cpu_features.level3_cache_linesize=0x1
(In reply to Koichiro Iwao from comment #48) Ok, but does unpatched glibc work on patched bhyve?
(In reply to Konstantin Belousov from comment #49) Yes, unpatched glibc with patched bhyve works. After unpatching bhyve, it causes segfault again.
https://reviews.freebsd.org/D48187
Just for the record: - https://issues.redhat.com/browse/RHEL-71581 - https://issues.redhat.com/browse/RHEL-71583 - https://issues.redhat.com/browse/RHEL-71584
The bhyve patch in comment #41 broke EL8 (AlmaLinux 8, Rocky Linux 8). There was no issue with unpatched bhyve.

Host: FreeBSD 14-STABLE w/ patch comment #14 on AMD Ryzen 7 5700G
Guest: AlmaLinux 8.10, RockyLinux 8.10 (glibc-2.28-251.el8_10.2.x86_64)

Actually, I run an EL8 container on AlmaLinux 9 on bhyve.

AL9$ podman run -it --rm quay.io/almalinuxorg/almalinux:8 /bin/bash
AL9$ echo $?
139
AL9$ dmesg | tail -2
[99577.163764] bash[2877]: segfault at 7ffdaa60f130 ip 00007fb9a9a12cf0 sp 00007ffdaa60e0e8 error 4 in libc-2.28.so[7fb9a996b000+1cd000] likely on CPU 1 (core 1, socket 1)
[99577.163773] Code: 00 00 0f 18 8e c0 20 00 00 0f 18 8e 80 30 00 00 0f 18 8e c0 30 00 00 c5 fe 6f 06 c5 fe 6f 4e 20 c5 fe 6f 56 40 c5 fe 6f 5e 60 <c5> fe 6f a6 00 10 00 00 c5 fe 6f ae 20 10 00 00 c5 fe 6f b6 40 10
(In reply to Koichiro Iwao from comment #53) > Host: FreeBSD 14-STABLE w/ patch comment #14 on AMD Ryzen 7 5700G I meant comment #41 here.
Created attachment 256146 [details] almalinux-8-patched-bhyve-ldso-list-diagnostics.txt Just in case it might be useful, I attach the result of ld.so --list-diagnostics.
(In reply to Koichiro Iwao from comment #55) So can you disassemble the function around the faulted address from libc.so, please?
(In reply to Konstantin Belousov from comment #56) Also, as a blind guess, try to revert this chunk:

-	if (width < 0x4)
-		width = 0;
+//	if (width < 0x4)
+//		width = 0;

and see.
(In reply to Konstantin Belousov from comment #57) I'm not familiar with this low-level area, so I need to know how to disassemble. Anyway, I'll try reverting the width stuff first.
(In reply to Konstantin Belousov from comment #57) I re-added the width stuff; it didn't help.

$ podman run -it --rm quay.io/almalinuxorg/almalinux:8 /bin/bash
[   16.234882] bash[1001]: segfault at 7ffd851bbd80 ip 00007fd58d40fd10 sp 00007ffd851b9d38 error 4 in libc-2.28.so[7fd58d368000+1cd000] likely on CPU 0 (core 0, socket 0)
[   16.234892] Code: c5 fe 6f 56 40 c5 fe 6f 5e 60 c5 fe 6f a6 00 10 00 00 c5 fe 6f ae 20 10 00 00 c5 fe 6f b6 40 10 00 00 c5 fe 6f be 60 10 00 00 <c5> 7e 6f 86 00 20 00 00 c5 7e 6f 8e 20 20 00 00 c5 7e 6f 96 40 20
Giving this a bump... haven't seen any movement for a while now. This strikes me as pretty critical. It is now possible to upgrade from AlmaLinux 9.4 to 9.5; I gather they reverted the glibc change. The same is not true for Rocky etc. You cannot do a fresh install of 9.5 (which makes sense, since the release images are the same as they were in December).
It has been fixed in glibc [1], although it seems that bhyve still sets the L3 cache size to a bogus value (which might impact performance, since it influences which string optimization is selected at runtime). So either backport this glibc fix to the affected distros, or fix the bhyve L3 cache size report (which should fix boot on the affected distros).

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=32470
(In reply to Adhemerval Zanella from comment #61) Somebody needs to help debug the Alma Linux crash report for the supposed fix.
(In reply to Konstantin Belousov from comment #62) The command used was:

> podman run -it --rm quay.io/almalinuxorg/almalinux:8 /bin/bash

This is not going to work until the container image is updated, and that hasn't happened. The same applies to quay.io/almalinuxorg/almalinux:9. I do not see an a8 branch here: https://git.almalinux.org/rpms/glibc so I'm not sure there is even a forked glibc package that contains the revert for AlmaLinux 8. So it doesn't look like there is anything mysterious about the reported failure.
(In reply to Florian Weimer from comment #63) I believe that the Alma issue is different: it occurs only with the patched bhyve. It is probably something that the old glibc wants from CPUID.
(In reply to Florian Weimer from comment #63) Let me sort out the remaining issues.

1TB L3 cache issue:
- Addressed in glibc upstream
- AlmaLinux 9.5 and Kitten 10 already include the upstream patch
- Also addressed on the bhyve side (temporary patch)

However, the temporary bhyve patch broke glibc on AlmaLinux 8 / EL8 (glibc-2.28-251.el8_10.13), so the temporary patch may have a regression. This regression prevents the bhyve patch from being merged. The bhyve patch I mean here is the following:

--- a/sys/amd64/vmm/x86.c
+++ b/sys/amd64/vmm/x86.c
@@ -152,6 +152,8 @@ x86_emulate_cpuid(struct vcpu *vcpu, uint64_t *rax, uint64_t *rbx,
 	 * pkg_id_shift and other OSes may rely on it.
 	 */
 	width = MIN(0xF, log2(threads * cores));
+	if (width < 0x4)
+		width = 0;
 	logical_cpus = MIN(0xFF, threads * cores - 1);
 	regs[2] = (width << AMDID_COREID_SIZE_SHIFT) | logical_cpus;

To reproduce the EL8 glibc issue, run either of the following commands in a PATCHED bhyve environment:

$ podman run -it --rm quay.io/almalinuxorg/almalinux:8 /bin/bash
$ podman run -it --rm quay.io/rockylinux/rockylinux:8 /bin/bash

AlmaLinux doesn't have an a8 branch for the glibc package, so there are no AlmaLinux-specific patches; AlmaLinux's glibc and Rocky Linux's are built from the same source.

So what we need to do is:
- Improve the bhyve patch so that it fixes the EL9 glibc issue (1TB L3 cache issue) without breaking EL8 glibc
- Fix the EL8 glibc issue upstream, if it is an upstream issue

I will give you access to the PATCHED bhyve environment if necessary. Send me your SSH public key.
(In reply to Koichiro Iwao from comment #65)

> - Fix EL8 glibc issue in upstream if it is an upstream issue

If I understand correctly, both AlmaLinux 9.5 and Kitten 10 work correctly now, after glibc upstream fixed the issue and it was backported, right? Do they work both with and without the bhyve workaround? I am trying to understand whether we still have an upstream issue (and I get that this is the current development branch) or whether this is indeed fixed.
(In reply to Adhemerval Zanella from comment #66)

> If I understand correctly, both AlmaLinux 9.5 and Kitten 10 works correctly now after glibc upstream fixed the issue and it was backported, right? Do they work with and without the bhyve workaround?

Yes, that's all correct. You might need to install with the AlmaLinux 9.4 installer ISO and then update to the latest 9.5 packages to avoid the affected glibc. Otherwise, use the latest 9.5 GenericCloud image.
(In reply to Koichiro Iwao from comment #65) This wasn't correct:

> The bhyve patch means here is the following:
>
> --- a/sys/amd64/vmm/x86.c
> +++ b/sys/amd64/vmm/x86.c
> @@ -152,6 +152,8 @@ x86_emulate_cpuid(struct vcpu *vcpu, uint64_t *rax, uint64_t *rbx,
>  	 * pkg_id_shift and other OSes may rely on it.
>  	 */
>  	width = MIN(0xF, log2(threads * cores));
> +	if (width < 0x4)
> +		width = 0;
>  	logical_cpus = MIN(0xFF, threads * cores - 1);
>  	regs[2] = (width << AMDID_COREID_SIZE_SHIFT) | logical_cpus;

This is the patch I actually meant:

--- a/sys/amd64/vmm/x86.c
+++ b/sys/amd64/vmm/x86.c
@@ -258,7 +256,7 @@ x86_emulate_cpuid(struct vcpu *vcpu, uint64_t *rax, uint64_t *rbx,
 			func = 3;	/* unified cache */
 			break;
 		default:
-			logical_cpus = 0;
+			logical_cpus = sockets * threads * cores;
 			level = 0;
 			func = 0;
 			break;
@@ -268,7 +266,7 @@ x86_emulate_cpuid(struct vcpu *vcpu, uint64_t *rax, uint64_t *rbx,
 		regs[0] = (logical_cpus << 14) | (1 << 8) | (level << 5) | func;
 		regs[1] = (func > 0) ? (CACHE_LINE_SIZE - 1) : 0;
-		regs[2] = 0;
+		regs[2] = 1;	/* Num of cache ways */
 		regs[3] = 0;
 		break;
Having just received an AMD 7840U, I wanted to do a little more research into this bug and the current patch. Given the cache values I am seeing, I believe the patch needs a small change.

Cache output from ld.so --list-diagnostics without the patch, i.e., the current code:

x86.cpu_features.data_cache_size=0x8000
x86.cpu_features.shared_cache_size=0x1000000000
x86.cpu_features.level1_icache_size=0x8000
x86.cpu_features.level1_icache_linesize=0x40
x86.cpu_features.level1_dcache_size=0x8000
x86.cpu_features.level1_dcache_assoc=0x8
x86.cpu_features.level1_dcache_linesize=0x40
x86.cpu_features.level2_cache_size=0x100000
x86.cpu_features.level2_cache_assoc=0x8
x86.cpu_features.level2_cache_linesize=0x40
x86.cpu_features.level3_cache_size=0x1000000000
x86.cpu_features.level3_cache_assoc=0x0
x86.cpu_features.level3_cache_linesize=0x40
x86.cpu_features.level4_cache_size=0x0
x86.cpu_features.cachesize_non_temporal_divisor=0x4

As Florian states in #36, the L3 cache size reporting 1TB is what triggers the bug in glibc-2.40 (or a patched 2.39). Applying the patch from https://reviews.freebsd.org/D48187 gives these cache values:

x86.cpu_features.data_cache_size=0x80
x86.cpu_features.shared_cache_size=0x2
x86.cpu_features.level1_icache_size=0x80
x86.cpu_features.level1_icache_linesize=0x40
x86.cpu_features.level1_dcache_size=0x80
x86.cpu_features.level1_dcache_assoc=0x1
x86.cpu_features.level1_dcache_linesize=0x40
x86.cpu_features.level2_cache_size=0x80
x86.cpu_features.level2_cache_assoc=0x1
x86.cpu_features.level2_cache_linesize=0x40
x86.cpu_features.level3_cache_size=0x2
x86.cpu_features.level3_cache_assoc=0x1
x86.cpu_features.level3_cache_linesize=0x1
x86.cpu_features.level4_cache_size=0x0
x86.cpu_features.cachesize_non_temporal_divisor=0x4

While the guest apps now work, the cache sizes are too small and not realistic. This is due to cpuid 0x8000001D not being fully implemented.
Looking at the glibc code: https://github.com/bminor/glibc/blob/glibc-2.40/sysdeps/x86/dl-cacheinfo.h#L309

As Florian discusses in #39, the handle_amd() function first looks at cpuid 0x8000001D for the cache information, which is not providing all of the parameters needed to compute the correct cache sizes. If 0x8000001D is not available, or the returned ecx == 0, it falls back to a legacy mechanism; but for the Zen architecture it will also look at the 0x8000001D eax for NumSharingCache. To get this fallback to work properly, I reverted one of the changes in Konstantin's proposed patch <https://reviews.freebsd.org/D48187> and only used:

--- a/sys/amd64/vmm/x86.c
+++ b/sys/amd64/vmm/x86.c
@@ -150,8 +150,6 @@ x86_emulate_cpuid(struct vcpu *vcpu, uint64_t *rax, uint64_t *rbx,
 	 * pkg_id_shift and other OSes may rely on it.
 	 */
 	width = MIN(0xF, log2(threads * cores));
-	if (width < 0x4)
-		width = 0;
 	logical_cpus = MIN(0xFF, threads * cores - 1);
 	regs[2] = (width << AMDID_COREID_SIZE_SHIFT) | logical_cpus;
 }
@@ -256,7 +254,7 @@ x86_emulate_cpuid(struct vcpu *vcpu, uint64_t *rax, uint64_t *rbx,
 			func = 3;	/* unified cache */
 			break;
 		default:
-			logical_cpus = 0;
+			logical_cpus = sockets * threads * cores;
 			level = 0;
 			func = 0;
 			break;

The reverted change keeps 0x8000001D ecx == 0, which prevents the 0x8000001D path in handle_amd(), while still setting a better value for NumSharingCache for use in the legacy code path.
The reported cache sizes with this change show:

x86.cpu_features.data_cache_size=0x8000
x86.cpu_features.shared_cache_size=0x2000000
x86.cpu_features.level1_icache_size=0x8000
x86.cpu_features.level1_icache_linesize=0x40
x86.cpu_features.level1_dcache_size=0x8000
x86.cpu_features.level1_dcache_assoc=0x8
x86.cpu_features.level1_dcache_linesize=0x40
x86.cpu_features.level2_cache_size=0x100000
x86.cpu_features.level2_cache_assoc=0x8
x86.cpu_features.level2_cache_linesize=0x40
x86.cpu_features.level3_cache_size=0x2000000
x86.cpu_features.level3_cache_assoc=0x0
x86.cpu_features.level3_cache_linesize=0x40
x86.cpu_features.level4_cache_size=0x0
x86.cpu_features.cachesize_non_temporal_divisor=0x4

These values look more reasonable and fix the guest issues on my system. I would like to see whether this matches what other people are seeing for cache sizes with this patch, and whether it resolves any outstanding issues.
(In reply to Mark Peek from comment #69) Thank you for the analysis. I realized that it is just a bug in my patch: the intent was to set the number of cache ways to 1, but I ignored the 'number of ways is the value returned plus one' part of the spec. I updated the patch, basically with your revert, and added a comment explaining the intent.
(In reply to Mark Peek from comment #69) With your patch, EL8 glibc no longer crashes. It looks good to me as far as this issue is concerned. However, it still reports the wrong L3 cache size. My processor is an AMD Ryzen 7 5700G (8-core/16-thread), so it has the following cache sizes:

L1: 64KiB (32KiB instruction + 32KiB data, per core)
L2: 4MiB (512KiB/core)
L3: 16MiB

https://www.techpowerup.com/cpu-specs/ryzen-7-5700g.c2472

(AlmaLinux 9 on bhyve with D48187 patch)
$ ld.so --list-diagnostics | grep cache
x86.cpu_features.data_cache_size=0x8000
x86.cpu_features.shared_cache_size=0x2000000
x86.cpu_features.level1_icache_size=0x8000      # 32768 -> 32KiB
x86.cpu_features.level1_icache_linesize=0x40
x86.cpu_features.level1_dcache_size=0x8000      # 32768 -> 32KiB
x86.cpu_features.level1_dcache_assoc=0x8
x86.cpu_features.level1_dcache_linesize=0x40
x86.cpu_features.level2_cache_size=0x80000      # 524288 -> 512KiB
x86.cpu_features.level2_cache_assoc=0x8
x86.cpu_features.level2_cache_linesize=0x40
x86.cpu_features.level3_cache_size=0x2000000    # 33554432 -> 32MiB <= WRONG!
x86.cpu_features.level3_cache_assoc=0x0
x86.cpu_features.level3_cache_linesize=0x40
x86.cpu_features.level4_cache_size=0x0
x86.cpu_features.cachesize_non_temporal_divisor=0x4
(In reply to Koichiro Iwao from comment #71) In what sense is this wrong?

x86.cpu_features.level3_cache_size=0x2000000    # 33554432 -> 32MiB <= WRONG!

32MB is a relatively reasonable number. What misbehavior do you see?
(In reply to Konstantin Belousov from comment #72)

> In what sense this is wrong?
> 32MB is relatively reasonable number. What misbehavior do you see?

The processor actually has only a 16MB L3 cache, but bhyve reports double that value. Maybe you meant that it's normal for the reported cache size not to match the physical CPU? Sorry if that is absolutely normal. On my other hardware, which has an Intel CPU, bhyve reports exactly the same cache sizes as the physical CPU; that's why I considered it still wrong.

(AlmaLinux 9 on vanilla bhyve 14.2-RELEASE on Intel Celeron N5105)
$ ld.so --list-diagnostics | grep cache
x86.cpu_features.data_cache_size=0x8000
x86.cpu_features.shared_cache_size=0x400000
x86.cpu_features.level1_icache_size=0x8000
x86.cpu_features.level1_icache_linesize=0x40
x86.cpu_features.level1_dcache_size=0x8000
x86.cpu_features.level1_dcache_assoc=0x8
x86.cpu_features.level1_dcache_linesize=0x40
x86.cpu_features.level2_cache_size=0x180000
x86.cpu_features.level2_cache_assoc=0xc
x86.cpu_features.level2_cache_linesize=0x40
x86.cpu_features.level3_cache_size=0x400000
x86.cpu_features.level3_cache_assoc=0x10
x86.cpu_features.level3_cache_linesize=0x40
x86.cpu_features.level4_cache_size=0xffffffffffffffff
x86.cpu_features.cachesize_non_temporal_divisor=0x4
(In reply to Koichiro Iwao from comment #73) Yes, this is normal. We do not aim to report the host values, only something that makes the guest accept the values. So no other issues?
(In reply to Konstantin Belousov from comment #74) Yes, no issues as far as I tested.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=0698ce429f78f548f7eb3e54476fb312109ddd8b

commit 0698ce429f78f548f7eb3e54476fb312109ddd8b
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-12-17 21:09:33 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2025-03-05 12:27:58 +0000

    bhyve: fix CPUID L3 Cache Size reporting for AMD/SVM

    Adjust leaf 0x8000_001D %ecx 3 on AMD (L3 cache params).
    - Report cache as 1-way associative.  Glibc does not believe that there
      are fully associative L3 caches, ignoring the leaf and falling back to
      legacy way of reading cache params.
    - Do not report 4095 logical CPUs per L3 cache, report the true total
      number of emulated CPUs.  The insanely large value tricked some version
      of glibc to overflow 32bit calculation of the L3 cache size, as
      reported in the PR.

    Also, for leaf 0x8000_0008, do not clip ApicIdSize to zero if less
    than 4.  This effectively falls back to legacy.

    PR:             279901
    With the help from:     Florian Weimer <fweimer@redhat.com>
    Reviewed by:    kevans, meta, mp
    Tested by:      meta, mp
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D48187

 sys/amd64/vmm/x86.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f9857ea43a38b74e34ce7f6576ad4e6415413454

commit f9857ea43a38b74e34ce7f6576ad4e6415413454
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2024-12-17 21:09:33 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2025-03-12 00:25:02 +0000

    bhyve: fix CPUID L3 Cache Size reporting for AMD/SVM

    PR:             279901

    (cherry picked from commit 0698ce429f78f548f7eb3e54476fb312109ddd8b)

 sys/amd64/vmm/x86.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
^Triage: assign to committer who resolved.