[Note: lang/gcc6's xgcc's cc1 was built by clang for the stage of the lang/gcc6 bootstrap that was in use below. So the code generation problem is clang's, not gcc6's or xgcc's.]

The below was found while trying to figure out why a bootstrap lang/gcc6 build on an armv6/cortex-a7 stable/11 -r307797 crashes in xgcc's cc1 with SIGSYS for some of what xgcc tries to build. I'm recording this here while pursuing the system problems the context has also exposed: truss's error handling for watching the failing cc1 has crash problems of its own.

The cc1 crash (when under gdb) shows a stack address as the pc value once the problem happens. In more detail, when (in gcc/gimple-match-head.c):

bool
gimple_resimplify1 (gimple_seq *seq, code_helper *res_code, tree type, tree *res_ops, tree (*valueize)(tree))
{
. . .
}

returns, the armv6 pc ends up with a stack address instead of a code address.

[There may be other cc1 routines with similar problems. I've only analyzed the one example stack corruption.]

Eliminating the long names (mostly) in the gdb disassembly output for gimple_resimplify1 so the code is easier to see --and showing function stack-handling preamble/post-amble code only:

Dump of assembler code for function _Z18gimple_resimplify1PP6gimpleP11code_helperP9tree_nodePS5_PFS5_S5_E:
0x0105d5ec  push {r0, r1, r4, r6, r8, r11, sp, lr}
0x0105d5f0  add r11, sp, #16 ; 0x10
0x0105d5f4  sub sp, sp, #88 ; 0x58
. . .
0x0105d7e0  sub sp, r11, #16 ; 0x10
0x0105d7e4  pop {r4, r5, r6, r10, r11, pc}

Note that, just after restoring sp to its value from just after the push, the pop does not match the push and restores what was r11 (a pointer into the stack) to the pc register, matching the observed behavior that gdb shows: execution of stack contents.

I have used stepi in gdb to go up to and through the pop and so have seen the evidence of the corruption fairly directly: just after the pop things are messed up in gdb with a stack address shown for the pc value.
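The claim that the pop loads the saved r11 into the pc can be checked by pairing up the two register lists. The following is a minimal Python sketch, not part of the original analysis: it assumes the standard ARM block-transfer rule that the lowest-numbered register goes to the lowest address, and the helper name `slot_sources` is hypothetical.

```python
# ARM push/pop store and load registers in ascending register-number order,
# with the lowest-numbered register at the lowest memory address.
ORDER = ["r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
         "r8", "r9", "r10", "r11", "r12", "sp", "lr", "pc"]


def slot_sources(push_regs, pop_regs):
    """Map each popped register to the pushed register whose saved slot it reads.

    Assumes sp at the pop equals sp just after the push, as in the
    disassembly above (sub sp, r11, #16 undoes the frame setup).
    """
    pushed = sorted(push_regs, key=ORDER.index)
    popped = sorted(pop_regs, key=ORDER.index)
    # zip stops at the shorter list: slots beyond the pop are simply left behind.
    return dict(zip(popped, pushed))


push_regs = ["r0", "r1", "r4", "r6", "r8", "r11", "sp", "lr"]
pop_regs = ["r4", "r5", "r6", "r10", "r11", "pc"]

mapping = slot_sources(push_regs, pop_regs)
for dst, src in mapping.items():
    print(f"{dst} <- slot holding saved {src}")
```

Under those assumptions the pc receives the slot that held r11, and the two slots holding sp and lr are never read.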
So it is a code generation defect for at least armv6 / cortex-a7.

Context details:

root@bananapi-m3:/usr/obj/portswork/usr/ports/lang/gcc6/work/.build/armv6-portbld-freebsd11.0/libgcc # svnlite info /mnt/usr/ports | grep "Re[lv]"
Relative URL: ^/head
Revision: 424540
Last Changed Rev: 424540

root@bananapi-m3:/usr/obj/portswork/usr/ports/lang/gcc6/work/.build/armv6-portbld-freebsd11.0/libgcc # uname -apKU
FreeBSD bananapi-m3 11.0-STABLE FreeBSD 11.0-STABLE #0 r307797M: Sat Oct 29 10:54:45 PDT 2016 markmi@FreeBSDx64:/usr/local/src/crochet/work/obj/arm.armv6/usr/src/sys/ALLWINNER arm armv6 1100505 1100505

(So stable/11 -r307797 was built with crochet, not with my usual procedure.)

The crashing cc1 shows crash problems in truss. ktrace reports odd information from the stack corruption as well but does not crash. So for now the environment with all these issues is being kept in a form appropriate to testing the stable/11 truss issue(s). For example, John Baldwin is working on truss for its issue(s), and when he asks I rebuild world/kernel with his truss-related updates and report what happened. (truss does not work yet for handling the cc1 failure as of when I wrote this.)

root@bananapi-m3:/usr/obj/portswork/usr/ports/lang/gcc6/work/.build/armv6-portbld-freebsd11.0/libgcc # more /etc/make.conf
DEFAULT_VERSIONS+=perl5=5.22
WRKDIRPREFIX=/usr/obj/portswork
WITH_DEBUG=
WITH_DEBUG_FILES=
MALLOC_PRODUCTION=
#
CFLAGS+= -mcpu=cortex-a7
CXXFLAGS+= -mcpu=cortex-a7
CPPFLAGS+= -mcpu=cortex-a7

So -mcpu=cortex-a7 was part of the CFLAGS/CXXFLAGS context while lang/gcc6 was being built. I'm not trying alternatives for such for now as I'm keeping the truss-testing context in place.
(In reply to Mark Millard from comment #0)

Here is a gdb session extraction showing the details just before and after the pop:

(gdb) display/i $pc
1: x/i $pc
0x105d7e0 <_Z18gimple_resimplify1PP6gimpleP11code_helperP9tree_nodePS5_PFS5_S5_E+500>: sub sp, r11, #16 ; 0x10
(gdb) si
0x0105d7e4 in gimple_resimplify1 ()
1: x/i $pc
0x105d7e4 <_Z18gimple_resimplify1PP6gimpleP11code_helperP9tree_nodePS5_PFS5_S5_E+504>: pop {r4, r5, r6, r10, r11, pc}
(gdb) info reg
r0             0x0          0
r1             0x0          0
r2             0x17c8506    24937734
r3             0x65         101
r4             0x21aed420   565105696
r5             0xbfbf6e7c   -1077973380
r6             0xbfbf6e80   -1077973376
r7             0x229ebba0   580828064
r8             0x0          0
r9             0xbfbfe13c   -1077944004
r10            0xbfbfdff0   -1077944336
r11            0xbfbf6b70   -1077974160
r12            0x17ef3b1    25097137
sp             0xbfbf6b60   -1077974176
lr             0x65         101
pc             0x105d7e4    17160164
fps            0x0          0
cpsr           0x80000010   -2147483632
(gdb) x/8x $sp
0xbfbf6b60: 0xbfbf6e80 0xbfbf6e7c 0xbfbf6e80 0xbfbf6e7c
0xbfbf6b70: 0x00000000 0xbfbf6d20 0xbfbf6b80 0x01063c7c
(gdb) si
Error accessing memory address 0x0: Bad address.
0xbfbf6d20 in ?? ()
1: x/i $pc
0xbfbf6d20: svclt 0x00bf6ef0
(gdb) info reg
r0             0x0          0
r1             0x0          0
r2             0x17c8506    24937734
r3             0x65         101
r4             0xbfbf6e80   -1077973376
r5             0xbfbf6e7c   -1077973380
r6             0xbfbf6e80   -1077973376
r7             0x229ebba0   580828064
r8             0x0          0
r9             0xbfbfe13c   -1077944004
r10            0xbfbf6e7c   -1077973380
r11            0x0          0
r12            0x17ef3b1    25097137
sp             0xbfbf6b78   -1077974152
lr             0x65         101
pc             0xbfbf6d20   -1077973728
fps            0x0          0
cpsr           0x80000010   -2147483632

I'll note that the:

(gdb) x/8x $sp
0xbfbf6b60: 0xbfbf6e80 0xbfbf6e7c 0xbfbf6e80 0xbfbf6e7c
0xbfbf6b70: 0x00000000 0xbfbf6d20 0xbfbf6b80 0x01063c7c

matches up with the push's 8 items correctly, the last (lower right/increasing-memory-address order) being the lr value that should have been restored to the pc by the pop--instead of the 0xbfbf6d20 that is restored (only 6 items popped: the last of the 6 by increasing memory address being put in the pc).
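The register values after the pop can be reproduced from the x/8x dump. This is a small Python sketch of the pop's effect, assuming one 32-bit word is loaded per listed register starting at sp; `simulate_pop` is a hypothetical helper, and the input values are copied from the gdb output above.

```python
def simulate_pop(sp, stack_words, pop_regs):
    """Load one 32-bit word per register, lowest address first.

    Returns the loaded register values and the updated sp.
    """
    regs = dict(zip(pop_regs, stack_words))
    return regs, sp + 4 * len(pop_regs)


# Values copied from the gdb session: sp and the 8 words at sp before the pop.
sp = 0xbfbf6b60
stack_words = [0xbfbf6e80, 0xbfbf6e7c, 0xbfbf6e80, 0xbfbf6e7c,
               0x00000000, 0xbfbf6d20, 0xbfbf6b80, 0x01063c7c]
pop_regs = ["r4", "r5", "r6", "r10", "r11", "pc"]

regs, new_sp = simulate_pop(sp, stack_words, pop_regs)
for name, value in regs.items():
    print(f"{name:<4}{value:#010x}")
print(f"sp  {new_sp:#010x}")
```

The simulated r4, r5, r6, r10, r11, pc, and sp agree with the second info reg listing, and the two words left on the stack include the 0x01063c7c return address that never reached the pc.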
(In reply to Mark Millard from comment #1)

I just compared .build/gcc/gimple-match.o to .build/gcc/cc1 for gimple_resimplify1's code and .build/gcc/gimple-match.o is correct yet .build/gcc/cc1 has the push wrong.

# diff from*.txt | more
1c1
< e92d6953 push {r0, r1, r4, r6, r8, fp, sp, lr}
---
> e92d4c70 push {r4, r5, r6, sl, fp, lr}
. . .

(The from*.txt files have the address columns from objdump -d removed and the cc1 material limited to the one routine.)

The "<" one is from the cc1 file and the ">" one is from the .o file. (Of course the .o has branch offsets and such not filled in so there are other differences --but no unexpected ones.)

The pops match in the two from*.txt files.

So. . . Once the truss testing activity that is based on the bad behavior of cc1 as it was built completes, I'll have to see if this is repeatable when lang/gcc6 is rebuilt from scratch.
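The two instruction words from the diff can be decoded by hand to confirm what objdump printed. Here is a minimal Python sketch, assuming the A32 block-transfer layout in which bits 15:0 of the word are the register list (bit i set means register i is included); `reglist` is a hypothetical helper name.

```python
# objdump-style names for r0..r15 (r10=sl, r11=fp, r12=ip).
NAMES = ["r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
         "r8", "r9", "sl", "fp", "ip", "sp", "lr", "pc"]


def reglist(insn):
    """Decode the 16-bit register list in the low half of a push/pop word."""
    return [NAMES[i] for i in range(16) if insn & (1 << i)]


cc1_word = 0xe92d6953   # push as found in the linked cc1
obj_word = 0xe92d4c70   # push as found in gimple-match.o

print(reglist(cc1_word))
print(reglist(obj_word))

# Bits that differ between the two words.
diff_bits = cc1_word ^ obj_word
print(hex(diff_bits), bin(diff_bits).count("1"))
```

The upper halves of the two words are identical (e92d), so the difference is confined to the register list, where six bits differ.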
I tried rebuilding lang/gcc6 and it completed instead of stopping like it did before. So apparently the USB SSD involved glitched during the original build attempt --or some other such non-repeating issue happened. The original bad (xgcc's) cc1 code still shows problems with truss handling of SIGSYS and odd (huge) syscall numbers reported by ktrace (and internally to truss as seen via gdb). But there are separate bugzilla reports for those issues.