Created attachment 245745 [details]
poudriere build log

Compiling lang/gcc10 with poudriere on 1400094 reproducibly produces the following lines in /var/log/messages:

Oct 19 08:29:29 jet kernel: pid 56930 (cc1plus), jid 3, uid 65534: exited on signal 11 (core dumped)
Oct 19 08:29:59 jet kernel: pid 66794 (cc1plus), jid 3, uid 65534: exited on signal 11 (core dumped)
Oct 19 09:02:01 jet kernel: pid 67265 (cc1plus), jid 3, uid 65534: exited on signal 11 (core dumped)
Oct 19 09:02:19 jet kernel: pid 76889 (cc1plus), jid 3, uid 65534: exited on signal 11 (core dumped)
Oct 19 09:16:48 jet kernel: pid 50164 (cc1plus), jid 3, uid 65534: exited on signal 11 (core dumped)
Oct 19 09:17:07 jet kernel: pid 59903 (cc1plus), jid 3, uid 65534: exited on signal 11 (core dumped)

I'm attaching the build log and a special version of the build log with a timestamp in front of each line, to help identify the place that produces the above messages.
Created attachment 245747 [details]
poudriere build log with a timestamp on each line: always the 10 lines before and after each signal 11 time from /var/log/messages
I can also reproduce this outside of poudriere, i.e. by just building /usr/ports/lang/gcc10. But although there are signal 11 messages in the kernel logs, there is *nothing* visible in the build output, and the build seems to happily carry on. Now the trick is of course catching the suspect while the crime is being committed :)
Ah, here are the actual troublemakers:

$ find /wrkdirs/share/dim/ports/lang/gcc10/work -type f -name '*.log' | xargs grep "signal terminated"
/wrkdirs/share/dim/ports/lang/gcc10/work/.build/prev-x86_64-portbld-freebsd15.0/32/libstdc++-v3/config.log:xgcc: internal compiler error: Segmentation fault signal terminated program cc1plus
/wrkdirs/share/dim/ports/lang/gcc10/work/.build/prev-x86_64-portbld-freebsd15.0/libstdc++-v3/config.log:xgcc: internal compiler error: Segmentation fault signal terminated program cc1plus

$ grep -C10 "signal terminated" /wrkdirs/share/dim/ports/lang/gcc10/work/.build/prev-x86_64-portbld-freebsd15.0/libstdc++-v3/config.log
configure:14438: checking if /wrkdirs/share/dim/ports/lang/gcc10/work/.build/./gcc/xgcc -shared-libgcc -B/wrkdirs/share/dim/ports/lang/gcc10/work/.build/./gcc -nostdinc++ -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/src -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/src/.libs -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/libsupc++/.libs -B/usr/local/x86_64-portbld-freebsd15.0/bin/ -B/usr/local/x86_64-portbld-freebsd15.0/lib/ -isystem /usr/local/x86_64-portbld-freebsd15.0/include -isystem /usr/local/x86_64-portbld-freebsd15.0/sys-include -fno-checking supports -c -o file.o
configure:14485: result: yes
configure:14515: checking whether the /wrkdirs/share/dim/ports/lang/gcc10/work/.build/./gcc/xgcc -shared-libgcc -B/wrkdirs/share/dim/ports/lang/gcc10/work/.build/./gcc -nostdinc++ -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/src -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/src/.libs -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/libsupc++/.libs -B/usr/local/x86_64-portbld-freebsd15.0/bin/ -B/usr/local/x86_64-portbld-freebsd15.0/lib/ -isystem /usr/local/x86_64-portbld-freebsd15.0/include -isystem /usr/local/x86_64-portbld-freebsd15.0/sys-include -fno-checking linker (/wrkdirs/share/dim/ports/lang/gcc10/work/.build/./gcc/collect-ld) supports shared libraries
configure:14543: result: yes
configure:14686: checking dynamic linker characteristics
configure:15301: result: freebsd15.0 ld.so
configure:15354: checking how to hardcode library paths into programs
configure:15379: result: immediate
configure:15579: checking for compiler with PCH support
xgcc: internal compiler error: Segmentation fault signal terminated program cc1plus
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.
configure:15614: result: no
configure:15619: checking for enabled PCH
configure:15621: result: no
configure:15633: checking for thread model used by GCC
configure:15636: result: posix
configure:15879: checking for atomic builtins for bool
configure:15881: /wrkdirs/share/dim/ports/lang/gcc10/work/.build/./gcc/xgcc -shared-libgcc -B/wrkdirs/share/dim/ports/lang/gcc10/work/.build/./gcc -nostdinc++ -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/src -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/src/.libs -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/libsupc++/.libs -B/usr/local/x86_64-portbld-freebsd15.0/bin/ -B/usr/local/x86_64-portbld-freebsd15.0/lib/ -isystem /usr/local/x86_64-portbld-freebsd15.0/include -isystem /usr/local/x86_64-portbld-freebsd15.0/sys-include -fno-checking -c -O0 -S conftest.cpp >&5

So this is a segfault caused by an internal configure script check for PCH support. It doesn't really say what exact command it runs, but PCH support is notoriously iffy. :)
The configure script makes two files:

$ cat conftest.h
#include <math.h>
$ cat conftest.cc
#include "conftest.h"

Then it builds a precompiled header from the .h file, which is successful:

$ /wrkdirs/share/dim/ports/lang/gcc10/work/.build/prev-gcc/xgcc -shared-libgcc -B/wrkdirs/share/dim/ports/lang/gcc10/work/.build/prev-gcc -nostdinc++ -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/src -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/src/.libs -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/libsupc++/.libs -B/usr/local/x86_64-portbld-freebsd15.0/bin/ -B/usr/local/x86_64-portbld-freebsd15.0/lib/ -isystem /usr/local/x86_64-portbld-freebsd15.0/include -isystem /usr/local/x86_64-portbld-freebsd15.0/sys-include -fno-checking -Werror -Winvalid-pch -Wno-deprecated -x c++-header conftest.h -o conftest.h.gch

And then it tries to compile the .cc file, which segfaults:

$ /wrkdirs/share/dim/ports/lang/gcc10/work/.build/prev-gcc/xgcc -shared-libgcc -B/wrkdirs/share/dim/ports/lang/gcc10/work/.build/prev-gcc -nostdinc++ -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/src -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/src/.libs -L/wrkdirs/share/dim/ports/lang/gcc10/work/.build/x86_64-portbld-freebsd15.0/libstdc++-v3/libsupc++/.libs -B/usr/local/x86_64-portbld-freebsd15.0/bin/ -B/usr/local/x86_64-portbld-freebsd15.0/lib/ -isystem /usr/local/x86_64-portbld-freebsd15.0/include -isystem /usr/local/x86_64-portbld-freebsd15.0/sys-include -fno-checking -Werror -Winvalid-pch -Wno-deprecated conftest.cc
xgcc: internal compiler error: Segmentation fault signal terminated program cc1plus
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.
So this is one of those bugs where gcc chokes on its own .gch files, and even on a very simple one! It does not affect the rest of the build though, because the failed check simply makes configure turn off precompiled header support.
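For anyone who wants to poke at this outside the build tree, the two steps of the check can be approximated with a small script. This is only a sketch under some assumptions: pch_check is a hypothetical helper name, the build-tree-specific -B, -L and -isystem options from the commands above are left out, and -c is added to the second step so that nothing gets linked. Given the Heisenbug-like behaviour, there is no guarantee that a reduced command line still triggers the crash.

```shell
#!/bin/sh
# Sketch of the two-step PCH check described above.
# pch_check is a hypothetical helper; $1 must be an absolute path to the
# compiler under test (or a name found in PATH), e.g. the stage xgcc.
# Return codes: 0 = PCH works, 1 = building the .gch failed,
#               2 = compiling against the .gch failed (the case seen here).
pch_check() {
    cxx=${1:-c++}
    dir=$(mktemp -d) || return 3

    # The same two files the configure check is reported to create.
    printf '#include <math.h>\n' > "$dir/conftest.h"
    printf '#include "conftest.h"\n' > "$dir/conftest.cc"

    rc=0
    (
        cd "$dir" || exit 3
        # Step 1: build the precompiled header from conftest.h.
        "$cxx" -Werror -Winvalid-pch -Wno-deprecated \
            -x c++-header conftest.h -o conftest.h.gch || exit 1
        # Step 2: compile the .cc file, which picks up conftest.h.gch.
        # Assumption: -c is added here to skip the link step.
        "$cxx" -Werror -Winvalid-pch -Wno-deprecated \
            -c conftest.cc -o conftest.o || exit 2
    ) || rc=$?

    rm -rf "$dir"
    return "$rc"
}
```

Calling it as `pch_check /path/to/xgcc; echo $?` prints 2 when the second step fails, which is the situation the config.log records.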
Ha, this is a very nice Heisenbug. If I run the command in gdb, or under valgrind, the problem does not occur, and it tries to link an executable!
I did a grep as well and could not find these places in my build log:

$ grep 'signal terminated' gcc10-10.4.0_1.log
$
(In reply to Matthias Apitz from comment #6) The problem is that the configure scripts hide the error output, so it does not show up in your poudriere logs, but in individual config.log files spread throughout the working dir. So if poudriere cleans up your working dir, the evidence gets lost...
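One way to keep the evidence is to collect the affected config.log files before the working directory is removed. The following is a rough sketch, not a poudriere feature: find_crash_logs and save_crash_logs are hypothetical helper names, and the search string is simply the "internal compiler error" text seen in the config.log excerpts in this report.

```shell
#!/bin/sh
# Sketch: preserve every config.log that recorded a compiler crash.
# Both function names are hypothetical helpers, not part of any tool.

# Print the paths of all config.log files under $1 that mention an ICE.
find_crash_logs() {
    find "$1" -type f -name 'config.log' \
        -exec grep -l 'internal compiler error' {} +
}

# Pack those files into the tarball $2; does nothing if none are found.
save_crash_logs() {
    list=$(mktemp) || return 1
    # grep exits non-zero when nothing matches, which is not an error here.
    find_crash_logs "$1" > "$list" || true
    if [ -s "$list" ]; then
        tar -czf "$2" -T "$list"
    fi
    rm -f "$list"
}
```

For example, `save_crash_logs /wrkdirs/share/dim/ports/lang/gcc10/work /tmp/gcc10-ice-logs.tgz` would archive the two config.log files found earlier in this report. Paths containing newlines are not handled.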
I have run a few tests that you can see here, together with all generated config.log files:

- lang/gcc10: https://cirrus-ci.com/build/5989715258638336
- lang/gcc11: https://cirrus-ci.com/build/6202630880362496
- lang/gcc12: https://cirrus-ci.com/build/5536280025497600
- lang/gcc13: https://cirrus-ci.com/build/5049723984281600

Based on those tests, it seems to me that the issue has been fixed upstream starting with GCC 12. Can you please confirm that the bug does not affect lang/gcc12 or higher? In that case, I would say that we can ignore this issue and simply let older versions of GCC get deprecated.
By the way, I am working on updating all the GCC ports to their latest versions. I have not checked whether the incoming versions of lang/gcc10 and lang/gcc11 have also been fixed upstream. It might be the case.
(In reply to Lorenzo Salvadore from comment #8)

I took the build time of gcc12 from its poudriere log: Oct 15, 10:59 to 15:49. In this time window there were no signal 11 messages from cc1plus in /var/log/messages, only two caused by a known conftest:

Oct 15 11:58:44 jet kernel: pid 42538 (conftest), jid 7, uid 65534: exited on signal 11 (core dumped)
Oct 15 12:06:32 jet kernel: pid 64342 (conftest), jid 7, uid 65534: exited on signal 11 (core dumped)
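The same time-window check can be scripted, which may be handy when repeating it for other GCC ports. This is a sketch that assumes the syslog line format shown in this report; signal11_in_window is a hypothetical helper name, and times are compared as HH:MM:SS strings within a single day.

```shell
#!/bin/sh
# Sketch: list "exited on signal 11" kernel messages for one process name
# within a time window on a given day.
# signal11_in_window is a hypothetical helper.
#   $1 process name, $2 month (e.g. Oct), $3 day of month,
#   $4 start time HH:MM:SS, $5 end time HH:MM:SS,
#   $6 log file (defaults to /var/log/messages)
signal11_in_window() {
    awk -v proc="($1)," -v mon="$2" -v day="$3" -v from="$4" -v to="$5" '
        # Fields: $1 = month, $2 = day, $3 = HH:MM:SS (compared as strings).
        $1 == mon && $2 == day && $3 >= from && $3 <= to &&
        index($0, proc) && index($0, "exited on signal 11")
    ' "${6:-/var/log/messages}"
}
```

For the gcc12 build above, `signal11_in_window cc1plus Oct 15 10:59:00 15:49:59` should print nothing, while the same call with `conftest` should print the two lines quoted in this comment.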
Note that lang/gcc10 expired today. Does this bug apply to newer versions too?
I read that it no longer reproduces with GCC 12, so closing it.