Bug 207325 - 11.0-CURRENT/clang 3.8.0 for TARGET_ARCH=powerpc : c++ exceptions cause SEGV (9 line program)
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: CURRENT
Hardware: powerpc Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-02-19 06:41 UTC by Mark Millard
Modified: 2020-03-10 23:45 UTC
0 users

Description Mark Millard 2016-02-19 06:41:01 UTC
When run, the following program of 9 or so lines, compiled by clang 3.8.0 (from projects/clang380-import -r295601) for TARGET_ARCH=powerpc, gets a SEGV:

#include <exception>

int main(void)
{
    try { throw std::exception(); }
    catch (std::exception& e) {} // same result without &
    return 0;
}

(This simplifies what I found in trying to build and use some ports. For example, it blocks using "kyua test -k /usr/tests/Kyuafile", which gets a SEGV and aborts.)

# clang++ -g -std=c++11 -Wall -Wpedantic exception_test.cpp
# ./a.out
Segmentation fault (core dumped)

Trying under gdb:
. . .
(gdb) run
Starting program: /root/c_tests/a.out 

Program received signal SIGSEGV, Segmentation fault.
_Unwind_GetGR (context=0xffffd5a0, index=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:177
177	    return * (_Unwind_Ptr *) ptr;
(gdb) bt
#0  _Unwind_GetGR (context=0xffffd5a0, index=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:177
#1  _Unwind_GetPtr (context=0xffffd5a0, index=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:188
#2  uw_update_context (context=0xffffd5a0, fs=0xffffd0e0) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:1370
#3  _Unwind_RaiseException (exc=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:126
#4  0x4192970c in throw_exception (ex=<optimized out>) at /usr/src/lib/libcxxrt/../../contrib/libcxxrt/exception.cc:751
#5  __cxa_throw (thrown_exception=<optimized out>, tinfo=<optimized out>, dest=<optimized out>) at /usr/src/lib/libcxxrt/../../contrib/libcxxrt/exception.cc:778
#6  0x00000000 in ?? ()

Context details:

# freebsd-version -ku; uname -aKU
11.0-CURRENT
11.0-CURRENT
FreeBSD FBSDG4C1 11.0-CURRENT FreeBSD 11.0-CURRENT #4 r295601M: Sun Feb 14 15:49:49 PST 2016     markmi@FreeBSDx64:/usr/obj/clang_gcc421/powerpc.powerpc/usr/src/sys/GENERICvtsc-NODEBUG  powerpc 1100097 1100097

buildkernel is via gcc 4.2.1
buildworld is via clang 3.8.0

(I've been experimenting with and submitting issues from this environment, an arm rpi2 environment (clang 3.8.0 for both buildworld and buildkernel), and powerpc64 (via powerpc64-gcc, not clang). So there are some fixes/workarounds for various issues in my environment.)

# svnlite status /usr/src/
?       /usr/src/.snap
M       /usr/src/contrib/libc++/include/__config
M       /usr/src/contrib/libcxxrt/guard.cc
M       /usr/src/contrib/llvm/tools/clang/lib/CodeGen/TargetInfo.cpp
M       /usr/src/lib/csu/powerpc64/Makefile
?       /usr/src/restoresymtable
?       /usr/src/sys/arm/conf/RPI2-NODBG
M       /usr/src/sys/boot/ofw/Makefile.inc
M       /usr/src/sys/boot/powerpc/Makefile
M       /usr/src/sys/boot/powerpc/Makefile.inc
M       /usr/src/sys/boot/uboot/Makefile.inc
M       /usr/src/sys/conf/Makefile.powerpc
M       /usr/src/sys/conf/kern.mk
M       /usr/src/sys/conf/kmod.mk
?       /usr/src/sys/powerpc/conf/GENERIC64-NODBG
?       /usr/src/sys/powerpc/conf/GENERIC64vtsc
?       /usr/src/sys/powerpc/conf/GENERIC64vtsc-NODEBUG
?       /usr/src/sys/powerpc/conf/GENERICvtsc
?       /usr/src/sys/powerpc/conf/GENERICvtsc-NODEBUG
M       /usr/src/sys/powerpc/ofw/ofw_machdep.c
M       /usr/src/sys/powerpc/powerpc/exec_machdep.c

For TARGET_ARCH=powerpc the signal delivery has a "red zone" added to deal with clang 3.8.0 moving the stack pointer late on entry to functions and early on exit from functions compared to the ABI. And there is a va_arg fix for va_list's gpr and fpr value handling to be sure the overflow area is used when it should be. There is tracking of command line option changes.
Comment 1 Mark Millard 2016-02-21 00:14:10 UTC
(In reply to Mark Millard from comment #0)

If for TARGET_ARCH=powerpc I try using lang/gcc5 (or lang/gcc49) as the compiler instead of clang 3.8.0:

# g++5 -I /usr/include/c++/v1/ -L /usr/lib/ -g -Wall -pedantic exception_test.cpp
or
# g++5 -g -Wall -pedantic exception_test.cpp
or
# g++49 -g -Wall -pedantic exception_test.cpp
or
# g++49 -I /usr/include/c++/v1/ -L /usr/lib/ -g -Wall -pedantic exception_test.cpp

(Note the lack of -Wl,-rpath=/usr/local/lib/gcc5 or -Wl,-rpath=/usr/local/lib/gcc49 use.)

Then for the ./a.out I get the same SEGV at the same place as with clang 3.8.0 as the compiler in use, despite g++5's/g++49's use of _Unwind_Resume_or_Rethrow being the caller of _Unwind_RaiseException:

# ./a.out
terminate called after throwing an instance of 'std::exception'
Segmentation fault (core dumped)

Picking one core's backtrace to display:

Core was generated by `a.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  _Unwind_GetGR (context=0xffffce80, index=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:177
177	    return * (_Unwind_Ptr *) ptr;
(gdb) bt
#0  _Unwind_GetGR (context=0xffffce80, index=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:177
#1  _Unwind_GetPtr (context=0xffffce80, index=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:188
#2  uw_update_context (context=0xffffce80, fs=0xffffc9c0) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:1370
#3  _Unwind_RaiseException (exc=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:126
#4  0x41a25d8c in _Unwind_Resume_or_Rethrow (exc=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:256
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

The following do not fail:

# g++49 -Wl,-rpath=/usr/local/lib/gcc49 -g -Wall -pedantic exception_test.cpp
# ./a.out
# g++5 -Wl,-rpath=/usr/local/lib/gcc5 -g -Wall -pedantic exception_test.cpp
# ./a.out
#
Comment 2 Mark Millard 2016-02-21 04:17:43 UTC
More comparisons/contrasts:

Renaming the failing a.outs to:

exception_test.clang++380 (requires libc++/libcxxrt)
exception_test.g++49
exception_test.g++5

and using the g++ ones from a 11.0-CURRENT built via gcc 4.2.1 has them all working fine. (clang requires libraries not present and so is not tested this way.)

On a 11.0-CURRENT built via gcc 4.2.1 producing a

exception_test.g++421

and later using it on a clang 3.8.0 based projects/clang380-import -r295601 buildworld gets the SEGV. (It works fine in its original environment.)

Using g++49 to make the following point about the working -Wl,-rpath=? compiles vs. the other failing ones, via differencing ldd output under the clang 3.8.0 based buildworld:

# diff exception_test.ldd_g++49_*
1c1
< exception_test.g++49:
---
> exception_test.g++49_rpath:
4,5c4,5
< 	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x419a1000)
< 	libc.so.7 => /lib/libc.so.7 (0x419c0000)
---
> 	libgcc_s.so.1 => /usr/local/lib/gcc49/libgcc_s.so.1 (0x419a1000)
> 	libc.so.7 => /lib/libc.so.7 (0x419c4000)
# 

All the failing exception_test.* variants bind:

libgcc_s.so.1 => /lib/libgcc_s.so.1

clang++ output or g++ output does not matter: No exception_test.* so bound avoids the SEGV.

All of the

libgcc_s.so.1 => /usr/local/lib/gcc*/libgcc_s.so.1

bindings work fine.



I will note that the exception_test.g++* ones produced from the clang 3.8.0 buildworld context without -Wl,-rpath=? use (in that same buildworld environment):

ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/gcc49 /usr/local/lib/gcc5 /usr/local/lib/gcc6

to bind:

libstdc++.so.6 => /usr/local/lib/gcc49/libstdc++.so.6

but the exception_test.clang++380 example does not use that library yet still has the SEGV problem. It is libgcc_s.so.1 that makes the difference.
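
The ldd comparison above can also be made from inside the running process. The following is a minimal sketch, assuming a platform that provides dl_iterate_phdr in <link.h> (FreeBSD and glibc do); the helper names find_libgcc and bound_libgcc_path are made up for this illustration.

```cpp
#include <link.h>   // dl_iterate_phdr, struct dl_phdr_info
#include <cstddef>
#include <cstring>
#include <string>

// Callback for dl_iterate_phdr: remember the first loaded object whose
// path mentions "libgcc_s". Returning non-zero stops the iteration.
static int find_libgcc(struct dl_phdr_info* info, std::size_t, void* data)
{
    if (info->dlpi_name != nullptr &&
        std::strstr(info->dlpi_name, "libgcc_s") != nullptr) {
        *static_cast<std::string*>(data) = info->dlpi_name;
        return 1;
    }
    return 0;
}

// Hypothetical helper: path of the libgcc_s this process bound,
// or an empty string if no such object is loaded.
std::string bound_libgcc_path()
{
    std::string path;
    dl_iterate_phdr(find_libgcc, &path);
    return path;
}
```

Printing bound_libgcc_path() at the start of main would show whether a given exception_test.* binary picked up /lib/libgcc_s.so.1 or one of the /usr/local/lib/gcc*/ copies.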
Comment 3 Mark Millard 2016-02-26 21:50:03 UTC
(In reply to Mark Millard from comment #2)

In the operation there is a sequence of 2 errors, the 2nd of which gets the SEGV:

A) The catch clause is rejected (or not even found to check?) so std::terminate is called

B) During the std::terminate related execution the SEGV happens.

(A) is the more fundamental issue but I've not gotten far with it yet.

In discovering the (A)/(B) split I did get the following information but I will be deferring investigating this stage for now. . .

During (B) an example backtrace from before the SEGV time frame is:

#0  _Unwind_Resume_or_Rethrow (exc=0x41e12040) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:255
#1  0x4188f2b8 in __cxxabiv1::__cxa_rethrow () at ../../.././../gcc-4.9-20160210/libstdc++-v3/libsupc++/eh_throw.cc:118
#2  0x418926c4 in __gnu_cxx::__verbose_terminate_handler () at ../../.././../gcc-4.9-20160210/libstdc++-v3/libsupc++/vterminate.cc:80
#3  0x4188edbc in __cxxabiv1::__terminate (handler=<optimized out>) at ../../.././../gcc-4.9-20160210/libstdc++-v3/libsupc++/eh_terminate.cc:47
#4  0x4188ee9c in std::terminate () at ../../.././../gcc-4.9-20160210/libstdc++-v3/libsupc++/eh_terminate.cc:57
#5  0x4188f250 in __cxxabiv1::__cxa_throw (obj=0xffffd090, tinfo=0x1810d8c <typeinfo for std::exception@@GLIBCXX_3.4>, dest=<optimized out>)
    at ../../.././../gcc-4.9-20160210/libstdc++-v3/libsupc++/eh_throw.cc:87
#6  0x01800a3c in main () at exception_test.cpp:5
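
One way to observe the (A)/(B) split without a debugger is to replace the terminate handler, so that step (B) never re-enters the unwinder. This is a minimal sketch, assuming the g++/libstdc++ path shown in the backtrace above (where the default verbose handler rethrows); report_terminate and throw_and_catch are hypothetical names, and nothing here fixes (A).

```cpp
#include <exception>
#include <unistd.h>

// Terminate handler: report with write(2) only, then leave via _exit so the
// unwinder is not entered again (no rethrow, so step (B) is skipped).
[[noreturn]] static void report_terminate()
{
    static const char msg[] =
        "std::terminate reached: the catch clause was not used\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    _exit(2);
}

// Returns true if the catch clause ran. On a setup showing error (A) the
// handler above is expected to fire instead and the process exits.
bool throw_and_catch()
{
    std::set_terminate(report_terminate);
    try { throw std::exception(); }
    catch (std::exception&) { return true; }
    return false;
}
```

If the message appears and the process exits with status 2 instead of dumping core, that would be consistent with (A) happening first and the SEGV belonging to the later rethrow.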
Comment 4 Mark Millard 2016-02-26 22:04:28 UTC
I have found the following mismatch between the powerpc code generated and the .eh_frame information generated by clang 3.8.0. (Using objdump and dwarfdump notation below, no relocations.)

00007fd8 <_Unwind_RaiseException> mflr    r0
00007fdc <_Unwind_RaiseException+0x4> stw     r31,-148(r1)
00007fe0 <_Unwind_RaiseException+0x8> stw     r30,-152(r1)
00007fe4 <_Unwind_RaiseException+0xc> stw     r0,4(r1)
00007fe8 <_Unwind_RaiseException+0x10> stwu    r1,-2992(r1)
00007fec <_Unwind_RaiseException+0x14> mr      r31,r1
00007ff0 <_Unwind_RaiseException+0x18> mfcr    r12
. . .
0000827c <_Unwind_RaiseException+0x2a4> lwz     r14,2776(r31)
00008280 <_Unwind_RaiseException+0x2a8> addi    r1,r1,2992
00008284 <_Unwind_RaiseException+0x2ac> lwz     r0,4(r1)
00008288 <_Unwind_RaiseException+0x2b0> lwz     r31,-148(r1)
0000828c <_Unwind_RaiseException+0x2b4> lwz     r30,-152(r1)
00008290 <_Unwind_RaiseException+0x2b8> mtlr    r0
00008294 <_Unwind_RaiseException+0x2bc> blr
00008298 <_Unwind_RaiseException+0x2c0> bl      0001eccc <abort@plt>

The .eh_frame information shows off cfa=2992(r31) over that whole range but 0x828c to 0x8298 comes after R31 is returned to its old value. (See below.)

Also with <off cfa=2992(r31) > for 0x00007ff0 it also lists: <off r31=-148(cfa) > for 0x00007ff0.

In other words:

DW_CFA_offset r31 -148  (37 * -4)

is used as if cfa was not tied to r31's value via <off cfa=2992(r31) >.

The dwarfdump material for this is:

<    0><0x00007fd8:0x0000829c><><fde offset 0x000002b4 length: 0x00000064><eh aug data len 0x0>
        0x00007fd8: <off cfa=00(r1) > 
        0x00007fec: <off cfa=2992(r1) > <off r30=-152(cfa) > <off r31=-148(cfa) > <off r65=04(cfa) > 
        0x00007ff0: <off cfa=2992(r31) > <off r14=-216(cfa) > <off r15=-212(cfa) > <off r16=-208(cfa) > <off r17=-204(cfa) > <off r18=-200(cfa) > <off r19=-196(cfa) > <off r20=-192(cfa) > <off r21=-188(cfa) > <off r22=-184(cfa) > <off r23=-180(cfa) > <off r24=-176(cfa) > <off r25=-172(cfa) > <off r26=-168(cfa) > <off r27=-164(cfa) > <off r28=-160(cfa) > <off r29=-156(cfa) > <off r30=-152(cfa) > <off r31=-148(cfa) > <off r46=-144(cfa) > <off r47=-136(cfa) > <off r48=-128(cfa) > <off r49=-120(cfa) > <off r50=-112(cfa) > <off r51=-104(cfa) > <off r52=-96(cfa) > <off r53=-88(cfa) > <off r54=-80(cfa) > <off r55=-72(cfa) > <off r56=-64(cfa) > <off r57=-56(cfa) > <off r58=-48(cfa) > <off r59=-40(cfa) > <off r60=-32(cfa) > <off r61=-24(cfa) > <off r62=-16(cfa) > <off r63=-8(cfa) > <off r65=04(cfa) > 
 fde section offset 692 0x000002b4 cie offset for fde: 696 0x000002b8
         0 DW_CFA_advance_loc 20  (5 * 4)
         1 DW_CFA_def_cfa_offset 2992
         4 DW_CFA_offset r31 -148  (37 * -4)
         6 DW_CFA_offset r30 -152  (38 * -4)
         8 DW_CFA_offset_extended_sf r65 4  (-1 * -4)
        11 DW_CFA_advance_loc 4  (1 * 4)
        12 DW_CFA_def_cfa_register r31
        14 DW_CFA_offset r14 -216  (54 * -4)
        16 DW_CFA_offset r15 -212  (53 * -4)
        18 DW_CFA_offset r16 -208  (52 * -4)
        20 DW_CFA_offset r17 -204  (51 * -4)
        22 DW_CFA_offset r18 -200  (50 * -4)
        24 DW_CFA_offset r19 -196  (49 * -4)
        26 DW_CFA_offset r20 -192  (48 * -4)
        28 DW_CFA_offset r21 -188  (47 * -4)
        30 DW_CFA_offset r22 -184  (46 * -4)
        32 DW_CFA_offset r23 -180  (45 * -4)
        34 DW_CFA_offset r24 -176  (44 * -4)
        36 DW_CFA_offset r25 -172  (43 * -4)
        38 DW_CFA_offset r26 -168  (42 * -4)
        40 DW_CFA_offset r27 -164  (41 * -4)
        42 DW_CFA_offset r28 -160  (40 * -4)
        44 DW_CFA_offset r29 -156  (39 * -4)
        46 DW_CFA_offset r30 -152  (38 * -4)
        48 DW_CFA_offset r31 -148  (37 * -4)
        50 DW_CFA_offset r46 -144  (36 * -4)
        52 DW_CFA_offset r47 -136  (34 * -4)
        54 DW_CFA_offset r48 -128  (32 * -4)
        56 DW_CFA_offset r49 -120  (30 * -4)
        58 DW_CFA_offset r50 -112  (28 * -4)
        60 DW_CFA_offset r51 -104  (26 * -4)
        62 DW_CFA_offset r52 -96  (24 * -4)
        64 DW_CFA_offset r53 -88  (22 * -4)
        66 DW_CFA_offset r54 -80  (20 * -4)
        68 DW_CFA_offset r55 -72  (18 * -4)
        70 DW_CFA_offset r56 -64  (16 * -4)
        72 DW_CFA_offset r57 -56  (14 * -4)
        74 DW_CFA_offset r58 -48  (12 * -4)
        76 DW_CFA_offset r59 -40  (10 * -4)
        78 DW_CFA_offset r60 -32  (8 * -4)
        80 DW_CFA_offset r61 -24  (6 * -4)
        82 DW_CFA_offset r62 -16  (4 * -4)
        84 DW_CFA_offset r63 -8  (2 * -4)
        86 DW_CFA_nop
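
The "(N * -4)" column in the dump above is the factored offset times the data alignment factor. A minimal sketch of that arithmetic follows, assuming a data alignment factor of -4 as the dump implies; the function and constant names are made up for illustration.

```cpp
#include <cstdint>

// DW_CFA_offset stores an unsigned factored offset; the real offset from
// the CFA is factored_offset * data_alignment_factor (from the CIE).
constexpr std::int64_t cfa_offset(std::uint64_t factored, std::int64_t data_align)
{
    return static_cast<std::int64_t>(factored) * data_align;
}

// DW_CFA_offset_extended_sf stores a signed factored offset instead.
constexpr std::int64_t cfa_offset_sf(std::int64_t factored, std::int64_t data_align)
{
    return factored * data_align;
}

// Data alignment factor, as implied by the "(N * -4)" column (assumption).
constexpr std::int64_t kDataAlign = -4;

static_assert(cfa_offset(37, kDataAlign) == -148, "r31 is saved at CFA-148");
static_assert(cfa_offset(38, kDataAlign) == -152, "r30 is saved at CFA-152");
static_assert(cfa_offset_sf(-1, kDataAlign) == 4, "r65 is saved at CFA+4");
```

These values line up with the prologue stores (stw r31,-148(r1), stw r30,-152(r1), stw r0,4(r1)) only if the CFA is the stack pointer value at function entry.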
Comment 5 Mark Millard 2016-02-26 22:22:31 UTC
(In reply to Mark Millard from comment #4)

A better, more direct wording of some middle material:


The .eh_frame information shown by dwarfdump shows off cfa=2992(r31) over the range starting at 0x00007ff0 but 0x828c to 0x8298 comes after R31 is returned to its old value.

<off cfa=2992(r31) > is just wrong at 0000828c and later in the objdump material.
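
To make the point concrete, here is a minimal sketch of how an unwinder picks the CFA rule for a given pc, using the three row start addresses from the dwarfdump output in comment #4. CfaRow, raise_exception_rows and rule_for are hypothetical names; a real unwinder tracks far more state.

```cpp
#include <cstdint>
#include <vector>

// One row of the unwind table: from `start` onward, CFA = offset + reg.
struct CfaRow {
    std::uint32_t start;
    int reg;
    int offset;
};

// Rows copied from the dwarfdump output for _Unwind_RaiseException.
const std::vector<CfaRow>& raise_exception_rows()
{
    static const std::vector<CfaRow> rows = {
        {0x00007fd8, 1, 0},
        {0x00007fec, 1, 2992},
        {0x00007ff0, 31, 2992},
    };
    return rows;
}

// Return the last row whose start address is <= pc (rows are sorted and
// pc is assumed to lie inside the function).
CfaRow rule_for(std::uint32_t pc)
{
    const std::vector<CfaRow>& rows = raise_exception_rows();
    CfaRow found = rows.front();
    for (const CfaRow& row : rows) {
        if (row.start <= pc) found = row;
    }
    return found;
}
```

rule_for(0x0000828c) still yields cfa=2992(r31), although the lwz at 0x8288 has already reloaded r31 with the caller's value by then.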



The following original wording should likely be ignored and the above should be used instead:



The .eh_frame information shows off cfa=2992(r31) over that whole range but 0x828c to 0x8298 comes after R31 is returned to its old value. (See below.)

Also with <off cfa=2992(r31) > for 0x00007ff0 it also lists: <off r31=-148(cfa) > for 0x00007ff0.

In other words:

DW_CFA_offset r31 -148  (37 * -4)

is used as if cfa was not tied to r31's value via <off cfa=2992(r31) >.
Comment 6 Mark Millard 2016-02-27 23:17:52 UTC
I've tracked this down to misbehavior of clang 3.8.0 for __builtin_dwarf_cfa () for TARGET_ARCH=powerpc in:

#define uw_init_context(CONTEXT)                                           \
  do                                                                       \
    {                                                                      \
      /* Do any necessary initialization to access arbitrary stack frames. \
         On the SPARC, this means flushing the register windows.  */       \
      __builtin_unwind_init ();                                            \
      uw_init_context_1 (CONTEXT, __builtin_dwarf_cfa (),                  \
                         __builtin_return_address (0));                    \
    }                                                                      \
  while (0)
. . .
_Unwind_Reason_Code
_Unwind_RaiseException(struct _Unwind_Exception *exc)
{
  struct _Unwind_Context this_context, cur_context;
  _Unwind_Reason_Code code;

  /* Set up this_context to describe the current stack frame.  */
  uw_init_context (&this_context);

In the below r4 ends up with the __builtin_dwarf_cfa () value:

Dump of assembler code for function _Unwind_RaiseException:
   0x419a8fd8 <+0>:	mflr    r0
   0x419a8fdc <+4>:	stw     r31,-148(r1)
   0x419a8fe0 <+8>:	stw     r30,-152(r1)
   0x419a8fe4 <+12>:	stw     r0,4(r1)
   0x419a8fe8 <+16>:	stwu    r1,-2992(r1)
   0x419a8fec <+20>:	mr      r31,r1
. . .
   0x419a9094 <+188>:	mr      r4,r31
   0x419a9098 <+192>:	mflr    r30
   0x419a909c <+196>:	lwz     r5,2996(r31)
   0x419a90a0 <+200>:	mr      r3,r28
   0x419a90a4 <+204>:	bl      0x419a929c <uw_init_context_1>

That r4 ends up holding the stack pointer (r1) value for after it has been decremented. It is not pointing at the boundary with the caller's frame.

The .eh_frame information and unwind code is set up for it pointing at the boundary with the caller's frame. So the cfa relative addressing is messed up for what it actually extracts.
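
The size of the resulting error can be worked out directly. This is a minimal sketch with a made-up entry stack pointer value (kEntrySp is purely illustrative); the frame size and the -148 offset come from the disassembly and .eh_frame data above.

```cpp
#include <cstdint>

// Frame size of _Unwind_RaiseException, from "stwu r1,-2992(r1)".
constexpr std::uint32_t kFrameSize = 2992;

// Address where the unwinder looks for a register saved at `offset` from
// the CFA. Unsigned arithmetic wraps modulo 2^32, as on a 32-bit target.
constexpr std::uint32_t saved_slot(std::uint32_t cfa, std::int32_t offset)
{
    return cfa + static_cast<std::uint32_t>(offset);
}

// Hypothetical stack pointer value at function entry (illustration only).
constexpr std::uint32_t kEntrySp = 0xffffd000u;
constexpr std::uint32_t kSpAfterStwu = kEntrySp - kFrameSize;

// What .eh_frame assumes: CFA = r31 + 2992 = stack pointer at entry.
constexpr std::uint32_t kExpectedCfa = kSpAfterStwu + kFrameSize;
// What ended up in r4 ("mr r4,r31"): the decremented stack pointer.
constexpr std::uint32_t kPassedCfa = kSpAfterStwu;

static_assert(saved_slot(kExpectedCfa, -148) == kEntrySp - 148,
              "matches stw r31,-148(r1) done before the stwu");
static_assert(saved_slot(kExpectedCfa, -148) - saved_slot(kPassedCfa, -148)
                  == kFrameSize,
              "the lookup lands one frame size (2992 bytes) too low");
```

Under these assumptions every CFA-relative slot is read 2992 bytes below where the prologue stored it, which fits with _Unwind_GetGR later dereferencing a bad pointer.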

Contrast this with gcc/g++ 5.3's TARGET_ARCH=powerpc64 code where r4 is  made to be at the boundary with the caller's frame:

Dump of assembler code for function _Unwind_RaiseException:
   0x00000000501cb810 <+0>:	mflr    r0
   0x00000000501cb814 <+4>:	stdu    r1,-5648(r1)
. . .
   0x00000000501cb8d0 <+192>:	addi    r4,r1,5648
   0x00000000501cb8d4 <+196>:	stw     r12,5656(r1)
   0x00000000501cb8d8 <+200>:	mr      r28,r3
   0x00000000501cb8dc <+204>:	addi    r31,r1,2544
   0x00000000501cb8e0 <+208>:	mr      r3,r27
   0x00000000501cb8e4 <+212>:	addi    r29,r1,112
   0x00000000501cb8e8 <+216>:	bl      0x501cae60 <uw_init_context_1>


NOTE: This may in some way be associated with the clang 3.8.0 ABI violation in how it handles the stack pointer for FreeBSD: TARGET_ARCH=powerpc is currently using a "red zone", decrementing the stack pointer late, and incrementing the stack pointer early compared to the ABI rules. (This is similar to the official FreeBSD ABI for TARGET_ARCH=powerpc64.)
Comment 7 Mark Millard 2016-02-27 23:51:39 UTC
(In reply to Mark Millard from comment #6)

I have submitted this to llvm: 26761 is the number for the bug submittal.

https://llvm.org/bugs/show_bug.cgi?id=26761
Comment 8 Mark Millard 2016-02-28 00:20:27 UTC
(In reply to Mark Millard from comment #6)

# more builtin_dwarf_cfa.cpp 
extern void g(void*);
void f() { g(__builtin_dwarf_cfa()); }

In a TARGET_ARCH=powerpc64 context:

# clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o

builtin_dwarf_cfa.o:     file format elf64-powerpc-freebsd


Disassembly of section .text:
0000000000000000 <._Z1fv> mflr    r0
0000000000000004 <._Z1fv+0x4> std     r31,-8(r1)
0000000000000008 <._Z1fv+0x8> std     r0,16(r1)
000000000000000c <._Z1fv+0xc> stdu    r1,-128(r1)
0000000000000010 <._Z1fv+0x10> mr      r31,r1
0000000000000014 <._Z1fv+0x14> mr      r3,r31
0000000000000018 <._Z1fv+0x18> bl      0000000000000018 <._Z1fv+0x18>
000000000000001c <._Z1fv+0x1c> nop
0000000000000020 <._Z1fv+0x20> addi    r1,r1,128
0000000000000024 <._Z1fv+0x24> ld      r0,16(r1)
0000000000000028 <._Z1fv+0x28> ld      r31,-8(r1)
000000000000002c <._Z1fv+0x2c> mtlr    r0
0000000000000030 <._Z1fv+0x30> blr
        ...

r3 does not point to the boundary with the caller's stack frame.

By contrast for g++49:

# g++49 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o | more

builtin_dwarf_cfa.o:     file format elf64-powerpc-freebsd


Disassembly of section .text:
0000000000000000 <._Z1fv> mflr    r0
0000000000000004 <._Z1fv+0x4> std     r0,16(r1)
0000000000000008 <._Z1fv+0x8> std     r31,-8(r1)
000000000000000c <._Z1fv+0xc> stdu    r1,-128(r1)
0000000000000010 <._Z1fv+0x10> mr      r31,r1
0000000000000014 <._Z1fv+0x14> addi    r9,r31,128
0000000000000018 <._Z1fv+0x18> mr      r3,r9
000000000000001c <._Z1fv+0x1c> bl      000000000000001c <._Z1fv+0x1c>
0000000000000020 <._Z1fv+0x20> nop
0000000000000024 <._Z1fv+0x24> addi    r1,r31,128
0000000000000028 <._Z1fv+0x28> ld      r0,16(r1)
000000000000002c <._Z1fv+0x2c> mtlr    r0
0000000000000030 <._Z1fv+0x30> ld      r31,-8(r1)
0000000000000034 <._Z1fv+0x34> blr
0000000000000038 <._Z1fv+0x38> .long 0x0
000000000000003c <._Z1fv+0x3c> .long 0x90001
0000000000000040 <._Z1fv+0x40> lwz     r0,1(r1)

r3 does point to the boundary with the caller's stack frame.

For TARGET_ARCH=powerpc, clang 3.8.0 first:

# clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o

builtin_dwarf_cfa.o:     file format elf32-powerpc-freebsd


Disassembly of section .text:
00000000 <_Z1fv> mflr    r0
00000004 <_Z1fv+0x4> stw     r31,-4(r1)
00000008 <_Z1fv+0x8> stw     r0,4(r1)
0000000c <_Z1fv+0xc> stwu    r1,-16(r1)
00000010 <_Z1fv+0x10> mr      r31,r1
00000014 <_Z1fv+0x14> mr      r3,r31
00000018 <_Z1fv+0x18> bl      00000018 <_Z1fv+0x18>
0000001c <_Z1fv+0x1c> addi    r1,r1,16
00000020 <_Z1fv+0x20> lwz     r0,4(r1)
00000024 <_Z1fv+0x24> lwz     r31,-4(r1)
00000028 <_Z1fv+0x28> mtlr    r0
0000002c <_Z1fv+0x2c> blr

Then g++5 (5.3):

# g++5 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o

builtin_dwarf_cfa.o:     file format elf32-powerpc-freebsd


Disassembly of section .text:
00000000 <_Z1fv> stwu    r1,-16(r1)
00000004 <_Z1fv+0x4> mflr    r0
00000008 <_Z1fv+0x8> stw     r0,20(r1)
0000000c <_Z1fv+0xc> stw     r31,12(r1)
00000010 <_Z1fv+0x10> mr      r31,r1
00000014 <_Z1fv+0x14> addi    r9,r31,16
00000018 <_Z1fv+0x18> mr      r3,r9
0000001c <_Z1fv+0x1c> bl      0000001c <_Z1fv+0x1c>
00000020 <_Z1fv+0x20> nop
00000024 <_Z1fv+0x24> addi    r11,r31,16
00000028 <_Z1fv+0x28> lwz     r0,4(r11)
0000002c <_Z1fv+0x2c> mtlr    r0
00000030 <_Z1fv+0x30> lwz     r31,-4(r11)
00000034 <_Z1fv+0x34> mr      r1,r11
00000038 <_Z1fv+0x38> blr
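
The same comparison can be made at run time instead of by reading disassembly. A minimal sketch, assuming a compiler that accepts the GCC builtins __builtin_dwarf_cfa and __builtin_frame_address (gcc and clang both do); cfa_minus_frame_address is a hypothetical name. What the frame address means is target-specific, so the number is only useful for comparing compilers on the same target.

```cpp
#include <cstddef>
#include <cstdint>

// Byte distance from this function's frame address to its DWARF CFA.
// noinline keeps the function in a frame of its own.
__attribute__((noinline)) std::ptrdiff_t cfa_minus_frame_address()
{
    const auto cfa = reinterpret_cast<std::uintptr_t>(__builtin_dwarf_cfa());
    const auto fp  = reinterpret_cast<std::uintptr_t>(__builtin_frame_address(0));
    return static_cast<std::ptrdiff_t>(cfa - fp);
}
```

Building this with each compiler for the same TARGET_ARCH and printing the result would show whether the two disagree about where the CFA sits, in line with the mr r3,r31 vs. addi r9,r31,16 difference in the listings above.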
Comment 9 Mark Millard 2016-02-28 01:21:42 UTC
I should have been explicit:

The stack frames boundary that I reference in the 2-line examples are between:

A) f's frame
and
B) f's caller's frame

(Not between f vs. g.)

(The external g function just avoided any potential optimization that might eliminate the code I was trying to produce.)


(B) is rather implicit as I wrote comment #1. It could lead to confusion. Thus this note.

Also: It looks like arm has the same sort of distinction vs. g++:

# clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o

builtin_dwarf_cfa.o:     file format elf32-littlearm


Disassembly of section .text:
00000000 <_Z1fv> push	{fp, lr}
00000004 <_Z1fv+0x4> mov	fp, sp
00000008 <_Z1fv+0x8> mov	r0, fp
0000000c <_Z1fv+0xc> bl	00000000 <_Z1gPv>
00000010 <_Z1fv+0x10> pop	{fp, pc}
# g++5 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o

builtin_dwarf_cfa.o:     file format elf32-littlearm


Disassembly of section .text:
00000000 <_Z1fv> push	{fp, lr}
00000004 <_Z1fv+0x4> add	fp, sp, #4, 0
00000008 <_Z1fv+0x8> add	r3, fp, #4, 0
0000000c <_Z1fv+0xc> mov	r0, r3
00000010 <_Z1fv+0x10> bl	00000000 <_Z1gPv>
00000014 <_Z1fv+0x14> nop			; (mov r0, r0)
00000018 <_Z1fv+0x18> pop	{fp, pc}
Comment 10 Mark Millard 2016-03-06 21:03:19 UTC
I have made a separate submittal for the __builtin_dwarf_cfa() issue after discovering more about the type of context it is messed up in for clang 3.8.0.

Specifically in:

extern void g(void*);
void f0() { g(__builtin_dwarf_cfa()); }
void f1()
{ auto f1_cfa = __builtin_dwarf_cfa(); g(f1_cfa); }

f0 passes g a different offset from the frame pointer than f1 does. g++ has both behave like clang++ 3.8.0 has f1 behave.

Where __builtin_dwarf_cfa() is used in the same routine changes its results for clang++ 3.8.0.
Comment 11 Mark Millard 2016-03-06 21:32:53 UTC
(In reply to Mark Millard from comment #10)

Ignore comment 10. An operator-error of mine was involved: I misread where the offset was being used in the code.
Comment 12 Mark Millard 2019-07-09 01:50:13 UTC
It looks like head has some updates for use of
WITH_LLVM_LIBUNWIND= (although I've not tested
such yet.) (Also involves use of clang 8.0.1
as the compiler.)

Various folks have indicated they do not want to bother
with patching the old library's libunwind code to work
(despite a patch on the lists). Instead they want to
require LIBUNWIND.

So if things test out as working and are MFC'd to stable/12
at some point, and LLVM's libunwind is the default after
such, this can probably be updated to Overcome By Events.
Comment 13 Mark Millard 2020-03-10 23:45:17 UTC
Head no longer has the old libunwind code and my
understanding is that there is no plan to complete
the implementation of DW_CFA_remember_state and
DW_CFA_restore_state in 11 or 12 (which still have
the old libunwind code as I understand).
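
For context on those two opcodes: DW_CFA_remember_state pushes the current unwind rules onto an implicit stack and DW_CFA_restore_state pops them back. The following is a minimal sketch that tracks only the CFA rule (a real unwinder saves the register rules as well); RowState and its members are hypothetical names, not code from either libunwind.

```cpp
#include <vector>

// Simplified unwind row: only the CFA rule is tracked here.
struct CfaRule {
    int reg;
    int offset;
    bool operator==(const CfaRule& o) const
    {
        return reg == o.reg && offset == o.offset;
    }
};

// Minimal interpreter state for the two opcodes.
class RowState {
public:
    explicit RowState(CfaRule initial) : current_(initial) {}

    void def_cfa(int reg, int offset) { current_ = {reg, offset}; }  // DW_CFA_def_cfa
    void remember_state() { saved_.push_back(current_); }            // DW_CFA_remember_state

    // DW_CFA_restore_state: returns false if nothing was remembered.
    bool restore_state()
    {
        if (saved_.empty()) return false;
        current_ = saved_.back();
        saved_.pop_back();
        return true;
    }

    CfaRule current() const { return current_; }

private:
    CfaRule current_;
    std::vector<CfaRule> saved_;
};
```

A compiler can use this pair around an epilogue: remember the body's rules, switch the CFA rule for the return sequence, then restore the rules for code placed after the return. An unwinder lacking the pair cannot follow such tables.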

clang also has progressed so that in head, and
eventually 13.0, clang will be officially the
means of building powerpc families (64-bit and
32-bit), using llvm's libunwind.

The 64-bit ABI has also been changed as part of
the activity.