During an exp-run for llvm 13 (see bug 258209), it turned out that java/openjdk11 through openjdk13 fail to build with clang 13:

=== Output from failing command(s) repeated here ===
* For target jdk__packages_attribute.done:
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000802c8a991, pid=92123, tid=618713
#
# JRE version:  (11.0.12+7) (build )
# Java VM: OpenJDK 64-Bit Server VM (11.0.12+7-1, mixed mode, tiered, compressed oops, serial gc, bsd-amd64)
# Problematic frame:
# V  [libjvm.so+0xe8a991]  JVM_RaiseSignal+0x3bfbf1
#
# Core dump will be written. Default location: /wrkdirs/usr/ports/java/openjdk11/work/jdk11u-jdk-11.0.12-7-1/make/java.core
#
# An error report file with more information is saved as:
# /wrkdirs/usr/ports/java/openjdk11/work/jdk11u-jdk-11.0.12-7-1/make/hs_err_pid92123.log

These crashes are all caused by the markOop/markOopDesc classes, which are used to keep track of objects, and which are 'marked' using the low few bits. (See https://github.com/openjdk/jdk13u/blob/master/src/hotspot/share/oops/markOop.hpp).

After some laborious bisecting, I found out that these crashes start occurring after the upstream commit https://github.com/llvm/llvm-project/commit/16d03818412 (Return "[CGCall] Annotate this argument with alignment").

From that commit onwards, clang assumes that the "this" pointer is always aligned to the alignment of the actual object, so masking or adding a few low bits of "this" no longer works as expected.

The reason openjdk14 and higher work fine with clang 13, and don't crash similarly, is that the OpenJDK people completely redid the markOop/markOopDesc classes in https://github.com/openjdk/jdk/commit/ae5615c6142a4dc0d9033462f4880d7b3c127e26 ("8229258: Rework markOop and markOopDesc into a simpler mark word value carrier").

For example, the markOopDesc class was renamed to markWord, and it now *stores* a pointer-like value instead of *being* a pointer-like value. This is a much safer way of handling things.

However, this upstream commit is *very* large, as are a few of its follow-ups, which is probably the reason why it has not been backported to JDKs <= 13. I tried manually backporting it, but got lost in many nasty patch conflicts and problems.

I would like to solicit some opinions from our OpenJDK maintainers on how to move forward with this issue. I see a few ways:

* Get someone well-versed in OpenJDK internals to backport '8229258: Rework markOop and markOopDesc' (this is a *lot* of tricky stuff, and has to be done for at least 11, 12 and 13, but maybe for earlier JDKs too).
* Find some alternative way of simplifying the approach in '8229258: Rework markOop and markOopDesc', and backport that.
* Revert the upstream LLVM commit; I don't really like this because we would have to carry that patch forever (as LLVM upstream obviously won't accept it).
* Adjust the port Makefiles for openjdk11 through openjdk13 to use the clang12 port.
* ... something else?
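
For readers unfamiliar with the pattern, here is a minimal C++ sketch of the difference between *being* and *storing* a pointer-like value. It is not HotSpot code: OldMarkDesc, MarkWord, tag_mask and from_pointer are hypothetical names, and the 8-byte alignment and the two tag bits are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Old style, heavily simplified (hypothetical stand-in, not the real
// markOopDesc). The "pointer" to such an object is really a bit pattern
// whose low bits are tags, so it is usually NOT 8-byte aligned.
class alignas(8) OldMarkDesc {  // assumed alignment, for illustration
public:
    static constexpr std::uintptr_t tag_mask = 0x3;  // assumed: two tag bits

    // Calling a member function through a misaligned pointer is undefined
    // behaviour. A compiler that assumes `this` is 8-byte aligned may
    // conclude the low bits are always zero and fold has_tag() to false.
    std::uintptr_t value() const { return reinterpret_cast<std::uintptr_t>(this); }
    bool has_tag() const { return (value() & tag_mask) != 0; }
};

// New style, heavily simplified: the bits live in an ordinary integer
// member, so no assumption about the alignment of `this` affects them.
class MarkWord {
    std::uintptr_t value_;

public:
    static constexpr std::uintptr_t tag_mask = 0x3;  // assumed: two tag bits

    explicit constexpr MarkWord(std::uintptr_t v) : value_(v) {}

    // Combine a sufficiently aligned pointer with a tag in the low bits.
    static MarkWord from_pointer(const void* p, std::uintptr_t tag) {
        const auto bits = reinterpret_cast<std::uintptr_t>(p);
        assert((bits & tag_mask) == 0);  // pointer must leave the tag bits free
        assert((tag & ~tag_mask) == 0);  // tag must fit in the tag bits
        return MarkWord(bits | tag);
    }

    constexpr std::uintptr_t value() const { return value_; }
    constexpr std::uintptr_t tag() const { return value_ & tag_mask; }
    void* pointer() const { return reinterpret_cast<void*>(value_ & ~tag_mask); }
};
```

In the sketch, OldMarkDesc is only declared to show the shape of the problem and should never be called through a tagged pointer. MarkWord::from_pointer(&obj, 0x2) yields a value whose tag() is 0x2 and whose pointer() is &obj again, regardless of what the compiler assumes about "this".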
Created attachment 228727
java/openjdk{8,11,12,13}: work around UB in markOopDesc

Unless there is a strong objection, I will commit the attached patch soon, probably during the weekend.

Since the patches committed upstream for https://bugs.openjdk.java.net/browse/JDK-8229258 are tricky to backport to OpenJDK 8, 11, 12 and 13, the safest workaround is to force use of clang12 from the devel/llvm12 port, iff the system compiler is 13.0.0.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=3822416493cfbbed8fe7a487391b40bec956d671

commit 3822416493cfbbed8fe7a487391b40bec956d671
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2021-10-15 18:18:36 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2021-10-16 12:22:03 +0000

    java/openjdk*: work around UB in markOopDesc, fix builds with clang 13

    During an exp-run for llvm 13 (see bug 258209), it turned out that java/openjdk11 through openjdk13 fail to build with clang 13:

    === Output from failing command(s) repeated here ===
    * For target jdk__packages_attribute.done:

    These crashes are all caused by the markOop/markOopDesc classes, which are used to keep track of objects, and which are 'marked' using the low few bits. (See https://github.com/openjdk/jdk13u/blob/master/src/hotspot/share/oops/markOop.hpp).

    After some laborious bisecting, I found out that these crashes start occurring after the upstream commit https://github.com/llvm/llvm-project/commit/16d03818412 (Return "[CGCall] Annotate this argument with alignment").

    What happens afterwards, is that clang considers the "this" pointer to always be aligned to the alignment of the actual object, and then masking or adding a few low bits is not working as expected.

    The reason openjdk14 and higher work fine with clang 13, and don't crash similarly, is that the OpenJDK people completely redid the markOop/markOopDesc classes in https://github.com/openjdk/jdk/commit/ae5615c6142a4dc0d9033462f4880d7b3c127e26 ("8229258: Rework markOop and markOopDesc into a simpler mark word value carrier").

    E.g., the markOopDesc class was renamed to markWord, and *stores* a pointer-like value instead of *being* a pointer-like value. This is a much safer way of handling things.

    However, this upstream commit is *very* large, as are a few of its follow-ups, which is probably the reason why it has not been backported to JDKs <= 13. I tried manually backporting it, but got lost in many nasty patch conflicts and problems.

    As a workaround, build openjdk8 through 13 with clang12 from the devel/llvm12 port, for the time being.

    In addition, allow openjdk14 through 17 to be built with clang 13, by adding -Wno-unused-but-set-parameter to the compilation flags.

    PR:             258954
    Approved by:    maintainer timeout (2 weeks)
    MFH:            2021Q4

 java/openjdk11/Makefile |  9 +++++++++
 java/openjdk12/Makefile | 10 +++++++++-
 java/openjdk13/Makefile |  9 +++++++++
 java/openjdk14/Makefile |  5 ++++-
 java/openjdk15/Makefile |  4 ++++
 java/openjdk16/Makefile |  4 ++++
 java/openjdk17/Makefile |  5 ++++-
 java/openjdk8/Makefile  |  8 ++++++++
 8 files changed, 51 insertions(+), 3 deletions(-)
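
As an aside on the -Wno-unused-but-set-parameter part of that commit: the following is a minimal C++ sketch of the kind of code that warning is about. The function next_id and its parameters are hypothetical and not taken from OpenJDK. That the warning breaks the build, presumably because warnings are treated as errors there, is an assumption and is not stated in this report.

```cpp
// A parameter that is assigned inside the function but whose value is
// never read afterwards. With -Wunused-but-set-parameter enabled, clang 13
// is expected to diagnose `scratch`; if warnings are treated as errors,
// that diagnostic stops the build.
int next_id(int current, int scratch) {
    scratch = current * 2;  // assigned, but the value is never read
    return current + 1;
}
```

Passing -Wno-unused-but-set-parameter silences this diagnostic without touching the code, which is why it works as a build-flag workaround in a port Makefile.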
Re-opening this, as it seems this was fixed upstream with a backport (according to the mentioned JDK bug):

https://github.com/openjdk/jdk11u/pull/23/commits/f4a05f68ef4223993fd47896a806c8a49df7db71

Could we apply that change to openjdk, and avoid building another llvm just for this, which also pulls in an older version of python and whatnot?

Re-assigning to java for this...
Note that openjdk12 through openjdk16 expired today.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=591a784f324b7d8c45596d758b4c0893626bdbef

commit 591a784f324b7d8c45596d758b4c0893626bdbef
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2022-08-06 10:54:52 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2022-08-06 10:55:32 +0000

    java/openjdk{8,11}: Remove dependency on devel/llvm12 which is no longer necessary

    The workarounds committed in aa1ca89826b5 and 846ff4e95291 are no longer necessary, as both the upstream commits for PR258954 (https://github.com/battleblow/jdk11u/commit/305a68a90c722aa7a7b75589e24d5b5d554c96c1) and PR264065 (https://hg.openjdk.java.net/jdk/jdk/rev/40c07de877ab) are now merged into the distribution tarballs.

    PR:             258954, 264065
    Approved by:    maintainer timeout (1 month)
    MFH:            2022Q3

 java/openjdk11/Makefile | 8 --------
 java/openjdk8/Makefile  | 8 --------
 2 files changed, 16 deletions(-)