Bug 271498 - databases/postgresql15-server: fix build with clang 16 (as system compiler)
Summary: databases/postgresql15-server: fix build with clang 16 (as system compiler)
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s) (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Some People
Assignee: Palle Girgensohn
URL:
Keywords:
: 271428 (view as bug list)
Depends on:
Blocks: 271047
  Show dependency treegraph
 
Reported: 2023-05-18 21:27 UTC by Dimitry Andric
Modified: 2023-07-05 14:15 UTC (History)
3 users (show)

See Also:
bugzilla: maintainer-feedback? (pgsql)


Attachments
databases/postgresql15-server: fix build with clang 16 (as system compiler) (4.22 KB, patch)
2023-05-18 21:28 UTC, Dimitry Andric
no flags Details | Diff
databases/postgresql15-server: fix build with clang 16 (as system compiler) (v2) (4.50 KB, patch)
2023-05-30 20:56 UTC, Dimitry Andric
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dimitry Andric freebsd_committer freebsd_triage 2023-05-18 21:27:36 UTC
During an exp-run for llvm 16 (see bug 271047), it turned out that
databases/postgresql15-server failed to build with clang 16 as the base
system compiler:

  In file included from llvmjit.c:38:
  ../../../../src/include/jit/llvmjit_emit.h:112:23: warning: call to undeclared function 'LLVMBuildStructGEP'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
          LLVMValueRef v_ptr = LLVMBuildStructGEP(b, v, idx, "");
                               ^
  ../../../../src/include/jit/llvmjit_emit.h:112:15: error: incompatible integer to pointer conversion initializing 'LLVMValueRef' (aka 'struct LLVMOpaqueValue *') with an expression of type 'int' [-Wint-conversion]
          LLVMValueRef v_ptr = LLVMBuildStructGEP(b, v, idx, "");
                       ^       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  ../../../../src/include/jit/llvmjit_emit.h:114:9: warning: call to undeclared function 'LLVMBuildLoad'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
          return LLVMBuildLoad(b, v_ptr, name);
                 ^
  ../../../../src/include/jit/llvmjit_emit.h:114:9: error: incompatible integer to pointer conversion returning 'int' from a function with result type 'LLVMValueRef' (aka 'struct LLVMOpaqueValue *') [-Wint-conversion]
          return LLVMBuildLoad(b, v_ptr, name);
                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

and many more similar errors. The same applies to the slave ports
postgresql11-server, postgresql12-server, postgresql13-server, and
postgresql14-server.

This is because PostgreSQL's llvmjit support is not yet ready for LLVM
16's API changes. In LLVM 15 functions like LLVMBuildStructGEP were
already declared deprecated, but in LLVM 16 these have been removed (and
replaced with LLVMBuildStructGEP2, etc).

Until PostgreSQL updates their llvmjit for the changed APIs, I would
like to reintroduce the LLVM version restriction that was removed in
<https://cgit.freebsd.org/ports/commit/?id=1e19f00e5e46>, but slightly
adjusted: there is no more need to check for LLVM ports below 10.

Another thing that is required is to pass a CLANG variable to the
configure script, pointing to the same version that is used for the LLVM
bindings. Otherwise, the .bc files will get compiled by the base system
compiler, and this can lead to a ThinLTO link error, if the base system
compiler is a newer version of llvm.
Comment 1 Dimitry Andric freebsd_committer freebsd_triage 2023-05-18 21:28:13 UTC
Created attachment 242261 [details]
databases/postgresql15-server: fix build with clang 16 (as system compiler)
Comment 2 Palle Girgensohn freebsd_committer freebsd_triage 2023-05-30 20:28:07 UTC
Hi!
Am correct in assuming that the bit:

# Convert LLVM_DEFAULT to COMPILER_VERSION format to make it
# suitable for version comparison.
.if ${LLVM_DEFAULT} >= 70 && ${LLVM_DEFAULT} <= 90
LLVM_DEFAULT_VERSION=   ${LLVM_DEFAULT}
.else
LLVM_DEFAULT_VERSION=   ${LLVM_DEFAULT}0
.endif



could also be removed? the llvm70 80 and 90 ports are not around anymore.
Comment 3 Palle Girgensohn freebsd_committer freebsd_triage 2023-05-30 20:28:18 UTC
Hi!
Am correct in assuming that the bit:

# Convert LLVM_DEFAULT to COMPILER_VERSION format to make it
# suitable for version comparison.
.if ${LLVM_DEFAULT} >= 70 && ${LLVM_DEFAULT} <= 90
LLVM_DEFAULT_VERSION=   ${LLVM_DEFAULT}
.else
LLVM_DEFAULT_VERSION=   ${LLVM_DEFAULT}0
.endif



could also be removed? the llvm70 80 and 90 ports are not around anymore.
Comment 4 Dimitry Andric freebsd_committer freebsd_triage 2023-05-30 20:49:22 UTC
(In reply to Palle Girgensohn from comment #3)
Yes, I think I missed that part? In the latter part of the Makefile it has:

.if ${COMPILER_VERSION} > ${LLVM_DEFAULT_VERSION}
# LLVM versions in ports are, in order, 10, 11, 12...
.  if ${LLVM_PORT_SUFFIX} < 11
PG_LLVM_VERSION=11
PG_COMPILER_VERSION=11
.  elif ${LLVM_PORT_SUFFIX} > 15
# LLVM 16 and later are not yet supported in PostgreSQL.

but I didn't see the fiddling with the version number before that. :)
Comment 5 Dimitry Andric freebsd_committer freebsd_triage 2023-05-30 20:56:48 UTC
Created attachment 242506 [details]
databases/postgresql15-server: fix build with clang 16 (as system compiler) (v2)
Comment 6 Palle Girgensohn freebsd_committer freebsd_triage 2023-05-30 21:10:10 UTC
Another question:

if the system clang rules which llvm we use, does this statement still hold true:

    Another thing that is required is to pass a CLANG variable to the
    configure script, pointing to the same version that is used for the LLVM
    bindings. Otherwise, the .bc files will get compiled by the base system
    compiler, and this can lead to a ThinLTO link error, if the base system
    compiler is a newer version of llvm.


if the system /usr/bin/clang has version 13, and we decide to use llvm13 to match, what's the point in forcing the use the *port's* version of clang13? Does it really differ from the /usr/bin/clang? And in that case, we could just skip the tests and pull the latest llvm15 from the ports tree and build with it, instead of testing which compiler we use.

I don't have the experience with different minor versions of clang, so please enlighten me :)

Palle
Comment 7 Palle Girgensohn freebsd_committer freebsd_triage 2023-05-30 22:48:37 UTC
Since we want the same llvm as the compiler, as I see it we have two choices. Either use the compiler that comes with llvm, in which case we use the default LLVM out of the box. Easy:

```
LLVM_USES= llvm:max=15,lib
LLVN_CONFIGURE_ENV= CC=${LLVM_PREFIX}/bin/clang${LLVM_DEFAULT}
```

or we decide to use the base system compiler, and match LLVM version with it:

```
.include <bsd.port.pre.mk>

.if ${PORT_OPTIONS:MLLVM}
# Sync LLVM to the preferred compiler if possible
# or else use a lower version compiler that is compatible.
# LLVM 16 and later are not yet supported in PostgreSQL.
.  if ${COMPILER_VERSION} >= 160
PG_LLVM_SUFFIX=         15
CONFIGURE_ENV+=         CC=${LOCALBASE}/bin/clang${PG_LLVM_SUFFIX}
.  else
PG_LLVM_SUFFIX=         ${COMPILER_VERSION:C/.$//}
.  endif

LIB_DEPENDS+=           libLLVM-${PG_LLVM_SUFFIX}.so:devel/llvm${PG_LLVM_SUFFIX}
INSTALL_DIRS+=          src/backend/jit/llvm
.endif

```

I suggest the latter, just sticking with the base compiler, seems like the safest bet.
Comment 8 Palle Girgensohn freebsd_committer freebsd_triage 2023-05-30 22:58:27 UTC
(In reply to Palle Girgensohn from comment #6)
Please also comment on the question about CLANG.

When we do need to set CLANG, that is, when we downgrade from 16 to 15, we should set both CC and CLANG variables, right?

so add

CONFIGURE_ENV+=         CLANG=${LOCALBASE}/bin/clang${PG_LLVM_SUFFIX}

to my previous comment, just after the other CONFIGURE_ENV.

Palle
Comment 9 Palle Girgensohn freebsd_committer freebsd_triage 2023-05-30 23:09:13 UTC
or is it just the ,bc files that need the clang compiler version in sync with llvm? could we allow that just the .bc files get clang from ports, same version as llvm, and the rest if postgresql gets built with base clang from /usr/bin/cc?

That would suggest something similar to my first suggestion, just use

```
LLVM_USES= llvm:max=15,lib
LLVM_CONFIGURE_ENV= CLANG=${LLVM_PREFIX}/bin/clang${LLVM_VERSION}
```

The result being that in 13.2, clang14 (/usr/bin/cc from base) is used to compile postgresql, llvm15 is used for llvm-config and llvm-libs, and clang15 from ports is used to compile the .bc files.

I feel that I'm in somewhat deep waters here, so please advice. :-D

Palle
Comment 10 Palle Girgensohn freebsd_committer freebsd_triage 2023-05-30 23:21:24 UTC
*** Bug 271428 has been marked as a duplicate of this bug. ***
Comment 11 Dimitry Andric freebsd_committer freebsd_triage 2023-06-04 18:23:07 UTC
(In reply to Palle Girgensohn from comment #6)
> if the system clang rules which llvm we use, does this statement still hold true:
>
>> Another thing that is required is to pass a CLANG variable to the configure script, pointing to the same version that is used for the LLVM bindings. Otherwise, the .bc files will get compiled by the base system compiler, and this can lead to a ThinLTO link error, if the base system compiler is a newer version of llvm.

The tricky part is that the postgresql configure script has a separate check for "clang", independent of the compiler(s) you passed via CC and CXX:

  checking whether cc accepts -g... yes
  checking for cc option to accept ISO C89... none needed
  checking for cc option to accept ISO C99... none needed
  checking whether we are using the GNU C++ compiler... yes
  checking whether c++ accepts -g... yes
  checking for gawk... (cached) /usr/bin/awk
  checking for LLVM_CONFIG... /usr/local/bin/llvm-config15
  checking for clang... /usr/bin/clang

That last check auto-detects the configure variable CLANG, which is described in the help as:

  CLANG       path to clang compiler to generate bitcode

Indeed, the CLANG value is used to compile bitcode, but to actually link the bitcode into an executable or shared library, it uses the llvm-lto binary from the directory indicated by $LLVM_CONFIG --bindir. And that one is typically the one from the port.

In case base clang is 16 but the port is llvm15, this results in an error at LTO time:

  cd '/wrkdirs/share/dim/ports/databases/postgresql15-server/work/stage/usr/local/lib/postgresql/bitcode' && /usr/local/llvm15/bin/llvm-lto -thinlto -thinlto-action=thinlink -o postgres.index.bc <...long list of .bc files...>
  LLVM ERROR: ThinLTO cannot create input file: Unknown attribute kind (86) (Producer: 'LLVM16.0.4' Reader: 'LLVM 15.0.7')

The only way it could work correctly is if the base system clang and the chosen llvm port are exactly the same version. In that case you could skip passing CLANG=${LOCALBASE}/bin/clang${PG_LLVM_VERSION}, but since you need the llvm port anyway to do LTO on bitcode, you might as well use its clang binary directly.


> if the system /usr/bin/clang has version 13, and we decide to use llvm13 to match, what's the point in forcing the use the *port's* version of clang13? Does it really differ from the /usr/bin/clang?

It could have a minor version number mismatch, but theoretically that should not matter for the LTO stuff. However, as I said, you always need an llvm port to do LTO on bitcode.


> And in that case, we could just skip the tests and pull the latest llvm15 from the ports tree and build with it, instead of testing which compiler we use.

That would simplify things a lot: in case the LLVM option is enabled, you could rip out all versioning checks, always depend directly on devel/llvm15, and pass CC=clang15, CXX=clang++15 and CLANG=clang15 to the configure script. And it will always work predictably.

Conversely, if the LLVM option is disabled by the port user, there is no need to pull in any llvm port, and then CC=cc, CXX=c++ and CLANG would not have to be set at all.
Comment 12 Palle Girgensohn freebsd_committer freebsd_triage 2023-06-05 08:54:15 UTC
(In reply to Dimitry Andric from comment #11)

Sounds reasonable.

My only question is then, do I *need* the same compiler for everything? I mean, for i386 we're currenly building postgres with gcc, since clang cannot build a binary that works pre-pentium5. That might be grossly generous, I guess. [1]

Let's say I set LLVM and CLANG to same port's version, but don't touch the CC:

LLVM_CONFIGURE_WITH=    llvm
LLVM_CONFIGURE_ENV=     LLVM_CONFIG=${LLVM_CONFIG} \
                        CLANG=${LOCALBASE}/bin/clang${LLVM_VERSION}
LLVM_USES=              llvm:max=15,min=11,lib


Will there be likning issues when base clang is 16 then?


[1] https://www.postgresql.org/message-id/20190307140421.GA8362%40gate.oper.dinoex.org
Comment 13 Dimitry Andric freebsd_committer freebsd_triage 2023-06-05 19:26:47 UTC
(In reply to Palle Girgensohn from comment #12)
> My only question is then, do I *need* the same compiler for everything? I mean, for i386 we're currenly building postgres with gcc, since clang cannot build a binary that works pre-pentium5. That might be grossly generous, I guess. [1]

I was not aware of this issue, but if people had problems with clang-built postgres on i386, I will believe them. Better to let it be compiled by gcc as it has always been, as i386 is mostly a dead-end platform.

That is, on i386 you will likely not want to build the postgres JIT stuff anyway, neither will you want LTO as it is way too heavy for this architecture.
Comment 14 Dimitry Andric freebsd_committer freebsd_triage 2023-06-05 19:57:47 UTC
FWIW, on my 14-CURRENT i386 VM, which is built by clang 16, I could build databases/postgresql15-server just fine with clang, and it also starts up correctly. I didn't do any heavy testing, but it seems to work fine.

Note that even on 'plain' i386 it still tries to check for some compiler builtins that require CPU support, e.g.:

checking for _mm_crc32_u8 and _mm_crc32_u32 with CFLAGS=... no
checking for _mm_crc32_u8 and _mm_crc32_u32 with CFLAGS=-msse4.2... yes
checking for __crc32cb, __crc32ch, __crc32cw, and __crc32cd with CFLAGS=... no
checking for __crc32cb, __crc32ch, __crc32cw, and __crc32cd with CFLAGS=-march=armv8-a+crc... no
checking which CRC-32C implementation to use... SSE 4.2 with runtime check

And later on it compiles a few specific files with the -msse4.2 option:

cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-deprecated-non-prototype -O2 -pipe  -fstack-protector-strong -fno-strict-aliasing  -msse4.2 -I../../src/port -DFRONTEND -I../../src/include -I/usr/local/include  -I/usr/local/include -I/usr/local/include -I/usr/local/include -I/usr/local/include -I/usr/local/include  -c -o pg_crc32c_sse42.o pg_crc32c_sse42.c

So I think this is just a CPU specific optimization. I don't have any ancient hardware that does not support SSE anymore, though. :)
Comment 15 Dimitry Andric freebsd_committer freebsd_triage 2023-06-12 18:30:14 UTC
So what's the way forward? I can commit what I have, which does not change a great deal, but fixes the build for clang 16 and later, or we can refactor it a bit more?
Comment 16 Palle Girgensohn freebsd_committer freebsd_triage 2023-06-13 09:16:13 UTC
I have a fix ready, but I'm not 100% certain that it is wise to mix clang compilers. That is, instead of the somewhat complicated juggling for setting the compiler, I just do

LLVM_CONFIGURE_ENV=    LLVM_CONFIG=${LLVM_CONFIG} \
                       CLANG=${LOCALBASE}/bin/clang${LLVM_VERSION}
LLVM_USES=             llvm:max=15,min=11,lib


and postgresql will use the clang from the LLVM port to compile the *.bc files and LLVM parts. The rest is built with the builtin compiler, whatever that is.

I will ask the other packagers before committing, but please hold off you changes for the day.
Comment 17 Dimitry Andric freebsd_committer freebsd_triage 2023-06-13 09:22:07 UTC
(In reply to Palle Girgensohn from comment #16)
Yes, I think that is a good solution. Once Postgres updates their JIT support to work with LLVM >= 16, you can then easily bump the version.
Comment 18 commit-hook freebsd_committer freebsd_triage 2023-07-05 14:14:37 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=ab843ea1ee69a9f541c5c92913a1efe5cc2ca7e6

commit ab843ea1ee69a9f541c5c92913a1efe5cc2ca7e6
Author:     Palle Girgensohn <girgen@FreeBSD.org>
AuthorDate: 2023-05-30 22:52:02 +0000
Commit:     Palle Girgensohn <girgen@FreeBSD.org>
CommitDate: 2023-07-05 14:12:20 +0000

    databases/postgresql15-server: fix build with clang 16 (as system compiler)

    databases/postgresql??-server: use the default LLVM

    Always use the version of LLVM that is default for the ports tree,
    if PostgreSQL builds with it. This simplifies the build process.

    For i386, revert to using clang -msse2 instead of pulling in gcc. CPUs
    older than Pentium5 do not support SSE2 and the binary will not run
    there. We are not expected to support ancient hardware forever, so I've
    added a comment recommending users of ancient CPUs to build from ports
    using gcc.

    During an exp-run for llvm 16 (see bug 271047), it turned out that
    databases/postgresql15-server failed to build with clang 16 as the base
    system compiler:

      In file included from llvmjit.c:38:
      ../../../../src/include/jit/llvmjit_emit.h:112:23: warning: call to undeclared function 'LLVMBuildStructGEP'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
              LLVMValueRef v_ptr = LLVMBuildStructGEP(b, v, idx, "");
                                   ^
      ../../../../src/include/jit/llvmjit_emit.h:112:15: error: incompatible integer to pointer conversion initializing 'LLVMValueRef' (aka 'struct LLVMOpaqueValue *') with an expression of type 'int' [-Wint-conversion]
              LLVMValueRef v_ptr = LLVMBuildStructGEP(b, v, idx, "");
                           ^       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ../../../../src/include/jit/llvmjit_emit.h:114:9: warning: call to undeclared function 'LLVMBuildLoad'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
              return LLVMBuildLoad(b, v_ptr, name);
                     ^
      ../../../../src/include/jit/llvmjit_emit.h:114:9: error: incompatible integer to pointer conversion returning 'int' from a function with result type 'LLVMValueRef' (aka 'struct LLVMOpaqueValue *') [-Wint-conversion]
              return LLVMBuildLoad(b, v_ptr, name);
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    and many more similar errors. The same applies to the slave ports
    postgresql11-server, postgresql12-server, postgresql13-server, and
    postgresql14-server.

    This is because PostgreSQL's llvmjit support is not yet ready for LLVM
    16's API changes. In LLVM 15 functions like LLVMBuildStructGEP were
    already declared deprecated, but in LLVM 16 these have been removed (and
    replaced with LLVMBuildStructGEP2, etc).

    Until PostgreSQL updates their llvmjit for the changed APIs, I would
    like to reintroduce the LLVM version restriction that was removed in
    <https://cgit.freebsd.org/ports/commit/?id=1e19f00e5e46>, but slightly
    adjusted: there is no more need to check for LLVM ports below 10.

    Another thing that is required is to pass a CLANG variable to the
    configure script, pointing to the same version that is used for the LLVM
    bindings. Otherwise, the .bc files will get compiled by the base system
    compiler, and this can lead to a ThinLTO link error, if the base system
    compiler is a newer version of llvm.

    PR:             271498
    Suggested by:   dim@

 databases/postgresql16-server/Makefile | 55 +++++-----------------------------
 1 file changed, 8 insertions(+), 47 deletions(-)
Comment 19 Palle Girgensohn freebsd_committer freebsd_triage 2023-07-05 14:15:43 UTC
Committed a fix based on the ideas here. Thanks!