Created attachment 235873 [details]
unity_1935_cxx.cxx.E.cxx.bz2

clang never finishes on the attached (compressed) C++ unit.

The command
> c++ -fno-strict-aliasing -fno-omit-frame-pointer -march=native -O3 -fPIC -c unity_1935_cxx.cxx.E.cxx
doesn't finish in hours. But when -fno-strict-aliasing is removed, it finishes within a minute.

> $ c++ --version
> FreeBSD clang version 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386ae247c)
> Target: x86_64-unknown-freebsd13.1
> Thread model: posix
> InstalledDir: /usr/bin

This problem was originally discovered when one C++ module in the psi4 project (https://github.com/psi4/psi4) never finished compiling.
The hang (or exponential runtime) appears to be in LLVM's DAGCombiner::Run working on the function deriv2eri3_aB_P__0__F__1___TwoPRep_unit__0__P__1___Ab__up_0. To narrow it down I ran clang++ -emit-llvm to get an IR file. That ran quickly. Then I ran llc from an LLVM 14 build (it's not installed by default) on that IR file and that part hung. The C++ file does not compile with LLVM 15 clang due to changes in intrinsics. The LLVM 15 llc program hangs on the output from LLVM 14 clang. Is anybody in particular responsible for reporting LLVM bugs upstream?
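The two-step narrowing described above (front end to IR, then llc on the IR) can be sketched as a small script. This is only an illustration under assumptions: the helper names stage_commands and first_stuck_stage are hypothetical, the llc options are an approximation of the original c++ command line rather than an exact equivalent, and the timeout is arbitrary.

```python
import subprocess


def stage_commands(source, flags):
    """Build two command lines: source -> LLVM IR, then IR -> assembly.

    The llc options below are an assumed approximation of -O3 -fPIC.
    """
    ir_file = source + ".ll"
    frontend = ["clang++", *flags, "-S", "-emit-llvm", source, "-o", ir_file]
    backend = ["llc", "-O3", "-relocation-model=pic", ir_file, "-o", source + ".s"]
    return frontend, backend


def first_stuck_stage(commands, timeout_s):
    """Run the commands in order.

    Returns the index of the first command that exceeds timeout_s seconds,
    or None if every command finishes in time. subprocess.run kills the
    child process when the timeout expires.
    """
    for index, cmd in enumerate(commands):
        try:
            subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            return index
    return None
```

For example, first_stuck_stage(stage_commands("unity_1935_cxx.cxx.E.cxx", ["-fno-strict-aliasing", "-march=native", "-O3", "-fPIC"]), 600) would be expected to return 1 if only the llc step is stuck, assuming clang++ and llc are both on PATH.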
I believe I'm hitting the same or a related problem after a source upgrade from FreeBSD-13.0-RELEASE-p5 to FreeBSD-13.1-RELEASE-p2. When recompiling all ports afterwards, the build consistently gets stuck on the same file from databases/mariadb105-server-10.5.17: /usr/ports/databases/mariadb105-server/work/mariadb-10.5.17/storage/innobase/pars/pars0sym.cc

The compilation process sits at 100% CPU but produces no output in the object file, and truss(1) shows no syscalls at this point. I let it sit for 8+ hours in this state, but it never progressed, so I killed the process.

I quickly installed a fresh FreeBSD-13.1-p2 in a virtual machine and verified that the ports compilation worked there. The only things that differ between my machines and a default setup are a few exclusions in /etc/src.conf and a CPUTYPE?=bonnell directive in /etc/make.conf. I reverted those changes and recompiled my system, and then the ports compilation of mariadb105-server worked normally again. I restored CPUTYPE?=bonnell and recompiled the system again, and once more the compilation of mariadb gets stuck at the exact same file.

My knowledge is limited and I don't know how to track down the problem further, but the CPUTYPE directive adds a -march= to the compiler arguments, just as the original post shows. Perhaps there's a malfunction here in the clang/llvm 13 version included in FreeBSD-13.1? I had no such problems with clang/llvm 11 in FreeBSD-13.0.

As an extra test I also tried to compile mariadb106-server, and it gets stuck in exactly the same way, albeit on a different file in the same directory. Unfortunately I forgot to make a note of that filename.
(In reply to Morgan Wesström from comment #2) I was unable to reproduce the hang. It may be unrelated to the original bug. There are many reasons a compiler might hang. If you send the compiler a QUIT signal while it is stuck it will crash and save enough information to submit a bug report: the preprocessed source and a command line to invoke the compiler.
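A minimal sketch of the SIGQUIT suggestion above, assuming a POSIX system. The helper name run_or_quit is hypothetical, and whether the compiler then writes out the preprocessed source and run script depends on the compiler's own crash handling, as described in the comment.

```python
import signal
import subprocess


def run_or_quit(cmd, timeout_s):
    """Run cmd; if it is still running after timeout_s seconds, send SIGQUIT.

    Returns (timed_out, returncode). On POSIX, a process terminated by a
    signal has a negative returncode equal to minus the signal number.
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout_s)
        return False, proc.returncode
    except subprocess.TimeoutExpired:
        # The process looks stuck: ask it to quit, then wait for it to exit
        # (the compiler may take a while to write its crash reproducer).
        proc.send_signal(signal.SIGQUIT)
        proc.wait()
        return True, proc.returncode
```

Calling run_or_quit with the full c++ command line as a list and a generous timeout (for example 3600 seconds) would send the signal only if the compile has not finished by then.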
(In reply to John F. Carr from comment #3) Thank you, John. These are two identical 10-year-old Atom D525 boxes. They are slow and take about 18 hours to recompile the whole base system. I will try your suggestion, but it will take a few days before I have something to report. :)
I can reproduce this with Yuri's original test case. I'm currently attempting to reduce the test case to something that can be checked against various versions of clang, to see if it is a regression. After it has been reduced, I will probably submit it as an upstream bug.
(In reply to John F. Carr from comment #3) I've recompiled my system again with CPUTYPE?=bonnell and can now reproduce the stall again. My earlier copy and paste unfortunately referenced the wrong file. The correct file is pars0pars.cc for mariadb105-server-10.5.17. I apologize for that error. I have attached the crash backtrace and the source file. If this isn't related to the original bug report, feel free to delete my posts and advise me whether I should create a new bug report for this. I realize old Atom CPUs don't get much love these days and that I may be the only person affected.
Created attachment 236572 [details] SIGQUIT from mariadb105-server-10.5.17
Created attachment 236573 [details] pars0pars.cc from mariadb105-server-10.5.17
(In reply to Morgan Wesström from comment #7) There's nothing BSD-specific here so I filed a bug report with LLVM. https://github.com/llvm/llvm-project/issues/57764
(In reply to John F. Carr from comment #9) Greatly appreciated, thank you. :) This is far beyond my level of understanding, but I've subscribed to that thread and will monitor it in case more info is requested. Once again, I apologize for hijacking this thread with an unrelated issue; it was not my intention.
LLVM issue 57764 has been accidentally fixed between 16.0 and the latest main branch. There is no obviously relevant commit message, but the latest llc compiles the IR file in 2 minutes instead of forever.
(In reply to John F. Carr from comment #11)
I don't see that, with llvm main (llvmorg-17-init-16183-g7b31a73ffe8) I still get:

Assertion failed: ((ExtraInfo->getCascade(Intf->reg()) < Cascade || VirtReg.isSpillable() < Intf->isSpillable()) && "Cannot decrease cascade number, illegal eviction"), function evictInterference, file /home/dim/src/llvm/llvm-project/llvm/lib/CodeGen/RegAllocGreedy.cpp, line 505.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /home/dim/ins/llvmorg-17-init-16183-g7b31a73ffe8/bin/clang -O2 -c hang.ll
1. Code generation
2. Running pass 'Function Pass Manager' on module 'hang.ll'.
3. Running pass 'Greedy Register Allocator' on function '@_Z26pars_info_add_int4_literalP11pars_info_tPKcm'

on your .ll test case. Maybe you have a 16.0 version with assertions disabled?