Bug 207732

Summary: libgcc_s .eh_frame handling messes up interpreting powerpc/powerpc64 frame pointer register use produced by clang 3.8.0
Product: Base System Reporter: Mark Millard <marklmi26-fbsd>
Component: bin Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed Not A Bug    
Severity: Affects Only Me    
Priority: ---    
Version: CURRENT   
Hardware: powerpc   
OS: Any   

Description Mark Millard 2016-03-06 00:59:49 UTC
Because I'm pointing at long-standing FreeBSD libgcc_s code that is not limited to the example TARGET_ARCHs that I'm using, I first review some DWARF CFA material.

Based on dwarf-2.0.0.pdf (with my notes added):

	•	The algorithm to compute the CFA changes as you progress through the prologue and epilogue code. (By definition, the CFA value does not change.) 

So as I understand for the "Call Frame Instruction Usage" . . .

	1.	Initialize a register set by reading the initial_instructions field of the associated CIE. 

The initial CFA value in/for _Unwind_RaiseException is to be established as initialization before interpreting the first instruction of the initial_instructions field of the CIE, which is the initial CIE for the internal exception handling activity.

For that initial CIE (for _Unwind_RaiseException for exception handling): While not part of the Itanium C++ exception ABI (as I understand it), this initialization of the initial CFA value is based on starting from the value returned by __builtin_dwarf_cfa(), used in the likes of _Unwind_RaiseException. (For TARGET_ARCH=powerpc or powerpc64, or possibly others, clang 3.8.0 vs. gcc 4.2.1/4.9/5.3/6.0 do not agree about which frame boundary __builtin_dwarf_cfa() returns. So as it stands the value may sometimes need conversion to a standardized frame boundary. This is a somewhat separate issue, separately reported.)

Relative to the CFA value: The CIE/FDE instructions for the locations of a specific routine only change the CFA rule that would reproduce the CFA value. (This allows back-calculating the value in REG from the CFA value for CFA=OFFSET(REG) contexts: REG=CFA-OFFSET.) Any computation that results in a changed value while interpreting that routine's .eh_frame instructions must be wrong.
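
As a small illustration of the CFA=OFFSET(REG) relationship and its inverse, here is a sketch in Python. The function names and the address are hypothetical, chosen only for this example; nothing here comes from libgcc_s.

```python
# Sketch of the rule CFA = OFFSET(REG) and its inverse REG = CFA - OFFSET.
# All names and the numeric value are hypothetical illustrations.

def cfa_from_rule(reg_value, offset):
    """Evaluate a rule of the form cfa=OFFSET(REG)."""
    return reg_value + offset

def reg_from_cfa(cfa, offset):
    """Back-calculate the register value that satisfies the same rule."""
    return cfa - offset

r1_value = 0x7FFFDFD0                # hypothetical stack pointer value
cfa = cfa_from_rule(r1_value, 48)    # rule cfa=48(r1)

# Applying the inverse with the same offset recovers the register value.
recovered_r1 = reg_from_cfa(cfa, 48)
print(hex(cfa), hex(recovered_r1))
```

The point is only that the rule and the CFA value are different things: changing the rule (a different REG or OFFSET) is expected to describe the same CFA value in a new way.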

So finding a CFA for, say, the caller of _Unwind_RaiseException is not via execution of one of the CIE/FDE "instructions" stored in the .eh_frame information for _Unwind_RaiseException or for its caller: it is a separate, additional step based on the information available that may extract some of the .eh_frame information from the two routines.

	2.	Read and process the FDE’s instruction sequence until a DW_CFA_advance_loc, DW_CFA_set_loc, or the end of the instruction stream is encountered. 

	3.	If a DW_CFA_advance_loc or DW_CFA_set_loc instruction was encountered, then compute a new location value (L2). If L1 >= L2 then process the instruction and go back to step 2. 

	4.	The end of the instruction stream can be thought of as a 
    DW_CFA_set_loc( initial_location + address_range )
instruction. Unless the FDE is ill-formed, L1 should be less than L2 at this point. 

The rules in the register set now apply to location L1. 


So given that dwarf CFA material. . .

As compiled by clang 3.8.0 for powerpc (for example): libcxxrt ends up with (dwarfdump -v -v -F output for __cxa_throw):

<    0><0x00010620:0x00010794><__cxa_throw><fde offset 0x000006c0 length: 0x00000028><eh aug data len 0x0>
       0x00010620: <off cfa=00(r1) > 
       0x00010634: <off cfa=48(r1) > <off r30=-8(cfa) > <off r31=-4(cfa) > <off r65=04(cfa) > 
       0x00010638: <off cfa=48(r31) > <off r25=-28(cfa) > <off r26=-24(cfa) > <off r27=-20(cfa) > <off r28=-16(cfa) > <off r29=-12(cfa) > <off r30=-8(cfa) > <off r31=-4(cfa) > <off r65=04(cfa) > 
fde section offset 1728 0x000006c0 cie offset for fde: 1732 0x000006c4
        0 DW_CFA_advance_loc 20  (5 * 4)
        1 DW_CFA_def_cfa_offset 48
        3 DW_CFA_offset r31 -4  (1 * -4)
        5 DW_CFA_offset r30 -8  (2 * -4)
        7 DW_CFA_offset_extended_sf r65 4  (-1 * -4)
       10 DW_CFA_advance_loc 4  (1 * 4)
       11 DW_CFA_def_cfa_register r31
       13 DW_CFA_offset r25 -28  (7 * -4)
       15 DW_CFA_offset r26 -24  (6 * -4)
       17 DW_CFA_offset r27 -20  (5 * -4)
       19 DW_CFA_offset r28 -16  (4 * -4)
       21 DW_CFA_offset r29 -12  (3 * -4)
       23 DW_CFA_offset r30 -8  (2 * -4)
       25 DW_CFA_nop
       26 DW_CFA_nop
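
The three table rows in the dump can be reproduced from the instruction list by following the quoted "Call Frame Instruction Usage" steps. The following Python is a minimal sketch under stated assumptions, not the libgcc_s or dwarfdump implementation: the tuple encoding and the name rules_at are made up for this example, the CIE is assumed to contain a single "def_cfa r1, 0" (which is what the "<off cfa=00(r1) >" row suggests), and DW_CFA_offset_extended_sf is treated like DW_CFA_offset with the already-factored offset shown in the dump.

```python
# Sketch: compute the rule set that applies at a location (L1) by walking
# the CIE initial instructions and then the FDE instructions.
CIE_INITIAL = [("def_cfa", 1, 0)]  # assumption: gives cfa=00(r1)

FDE_INSTRUCTIONS = [               # transcribed from the dump above
    ("advance_loc", 20),
    ("def_cfa_offset", 48),
    ("offset", 31, -4),
    ("offset", 30, -8),
    ("offset", 65, 4),             # DW_CFA_offset_extended_sf in the dump
    ("advance_loc", 4),
    ("def_cfa_register", 31),
    ("offset", 25, -28),
    ("offset", 26, -24),
    ("offset", 27, -20),
    ("offset", 28, -16),
    ("offset", 29, -12),
    ("offset", 30, -8),
    ("nop",),
    ("nop",),
]

INITIAL_LOCATION = 0x00010620

def rules_at(target):
    """Return ((cfa_reg, cfa_offset), {reg: offset_from_cfa}) for target."""
    cfa_reg, cfa_off = None, 0
    saved = {}
    loc = INITIAL_LOCATION
    for op, *args in CIE_INITIAL + FDE_INSTRUCTIONS:
        if op == "advance_loc":
            new_loc = loc + args[0]      # L2 in the quoted steps
            if target < new_loc:         # L1 < L2: current rules apply
                break
            loc = new_loc                # L1 >= L2: keep processing
        elif op == "def_cfa":
            cfa_reg, cfa_off = args
        elif op == "def_cfa_offset":
            cfa_off = args[0]            # the register part is unchanged
        elif op == "def_cfa_register":
            cfa_reg = args[0]            # the offset part is unchanged
        elif op == "offset":
            saved[args[0]] = args[1]
        # "nop" changes nothing
    return (cfa_reg, cfa_off), saved

for pc in (0x00010620, 0x00010634, 0x00010638):
    (reg, off), saved = rules_at(pc)
    row = " ".join(f"r{r}={o}(cfa)" for r, o in sorted(saved.items()))
    print(f"0x{pc:08x}: cfa={off}(r{reg}) {row}")
```

Note that def_cfa_offset and def_cfa_register each replace only one half of the CFA rule; neither one computes a CFA value.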

Note the cfa and r31 references in:

0x00010634: <off cfa=48(r1) >  . . . <off r31=-4(cfa) > . . .
0x00010638: <off cfa=48(r31) > . . . <off r31=-4(cfa) > . . .

The use of r31 to define cfa is from (in part) the clang++ 3.8.0 code generation using r31 as a frame pointer in addition to r1 as the stack pointer. The matching actual sequence of operations listed above is:

        1 DW_CFA_def_cfa_offset 48
        3 DW_CFA_offset r31 -4  (1 * -4)
. . .
       11 DW_CFA_def_cfa_register r31

The "1 DW_CFA_def_cfa_offset 48" just notes that r1 (the stack pointer) was decremented by 48 by the prior instruction so 48 needs to be added to the new r1 value to reference the same _Unwind_Context cfa value as the prior "<off cfa=00(r1) >" status (from the CIE the FDE references) does.

The "3 DW_CFA_offset r31 -4  (1 * -4)" was generated because the (soon to be old) r31 value was saved at address cfa-4 ("<off r31=-4(cfa) >"). This address, used to access what will be the old/saved r31 value, is recorded in the _Unwind_Context reg[31].

The "11 DW_CFA_def_cfa_register r31" was generated because in the prior instruction r31 was updated to be a copy of r1 for use as a frame pointer. Note that such does not change the _Unwind_Context cfa value. At this stage r1=r31 and 48(r1)=48(r31), and such will hold until either r1 or r31 is changed in the routine (if either is).

The repeat of "<off r31=-4(cfa) >" on the "0x00010638: <off cfa=48(r31) >" line indicates that there is no change to where/how to find the pointer to the old/saved r31 value relative to the CFA value: no new DW_CFA_offset r31 "instruction" for interpretation.
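
The walk-through above can be traced with concrete numbers. This Python sketch uses hypothetical register values and a dictionary as stand-in memory; the order of the r31 save relative to the stack pointer decrement is simplified, since only the resulting state matters here.

```python
# Sketch: the CFA value stays fixed while the CFA rule changes.
# ENTRY_R1 and the caller's r31 value are hypothetical.
ENTRY_R1 = 0x7FFFE000

def evaluate(rule, regs):
    """Evaluate a rule (reg_name, offset), i.e. cfa=OFFSET(REG)."""
    reg_name, offset = rule
    return regs[reg_name] + offset

regs = {"r1": ENTRY_R1, "r31": 0x7FFFE040}   # r31 holds the caller's value
memory = {}

# Row 0x00010620: cfa=00(r1)
cfa_at_entry = evaluate(("r1", 0), regs)

# Prologue effect: r1 is decremented by 48 and the old r31 is stored
# at cfa-4, matching <off r31=-4(cfa) >.
regs["r1"] -= 48
memory[cfa_at_entry - 4] = regs["r31"]

# Row 0x00010634: cfa=48(r1)
cfa_after_alloc = evaluate(("r1", 48), regs)

# r31 becomes a copy of r1 (the frame pointer).
regs["r31"] = regs["r1"]

# Row 0x00010638: cfa=48(r31)
cfa_with_fp = evaluate(("r31", 48), regs)

# Same CFA value under all three rules; the saved-r31 slot does not move.
saved_r31_slot = cfa_with_fp - 4
print(hex(cfa_at_entry), hex(cfa_after_alloc), hex(cfa_with_fp))
```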


[Note the messy mix of different r31's. gcc 4.2.1 does not (normally?) generate such TARGET_ARCH=powerpc code but clang++ 3.8.0 normally does generate such a Frame Pointer and use it in places. Thus clang++ touches an error that g++ 4.2.1 and the like normally do not.]


Unfortunately the above is not the interpretation given by the interpreter in libgcc_s:

"11 DW_CFA_def_cfa_register r31" instead accesses the old/saved r31 value via the pointer in _Unwind_Context reg[31] and then applies the offset 48 to that value.

The result is the wrong cfa value (which should not have changed at all) and all else based on the cfa value is messed up after that. In essence the reg[31] value and the offset value used are of mixed vintages/mixed frames: an arbitrary combination.
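
To make the "mixed vintages" point concrete, here is a Python sketch that contrasts the two readings of "cfa=48(r31)" as this description words them. It models the description only; it is not taken from the libgcc_s source, and every name and value in it is hypothetical.

```python
# Sketch: two readings of the rule cfa=48(r31) for the same frame.
# All values are hypothetical; memory is a plain dict.
ENTRY_R1 = 0x7FFFE000                  # r1 on entry, so the CFA value
CALLER_R31 = 0x7FFFE040                # r31 as the caller left it

live_regs = {1: ENTRY_R1 - 48, 31: ENTRY_R1 - 48}   # after the prologue
memory = {ENTRY_R1 - 4: CALLER_R31}                 # old r31 saved at cfa-4
saved_slot = {31: ENTRY_R1 - 4}                     # where reg[31] points

def cfa_from_live_register(cfa_reg, cfa_off):
    """Reading argued for above: use the register as it is in this frame."""
    return live_regs[cfa_reg] + cfa_off

def cfa_from_saved_slot(cfa_reg, cfa_off):
    """Reading attributed above to the interpreter: load the old/saved
    value through the recorded slot, then apply the offset."""
    return memory[saved_slot[cfa_reg]] + cfa_off

expected = cfa_from_live_register(31, 48)
described = cfa_from_saved_slot(31, 48)
print(hex(expected), hex(described))
```

With these numbers the first reading gives back the entry-time CFA, while the second combines the caller's r31 with this routine's offset of 48 and lands somewhere else.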

Code for a routine that sticks to cfa=OFFSET(r1) for the cfa will not see this error in the .eh_frame information's interpretation. [r1= powerpc/powerpc64 stack pointer.]
Comment 1 Mark Millard 2016-03-06 01:14:20 UTC
(In reply to Mark Millard from comment #0)

Note: The context for libgcc_s was a clang 3.8.0 based buildworld. A gcc buildworld does not involve such a Frame Pointer Register.

I do not know if any TARGET_ARCH's other than powerpc/powerpc64 also generate such Frame Pointer Register like code and so might touch the same error.
Comment 2 Mark Millard 2016-03-06 08:13:58 UTC
With the other errors identified and reported for .eh_frame and C++ exception handling for powerpc, it is getting harder to tell if a problem is a new problem or a consequence of the other ones. (Various problems have no workaround yet to avoid them.)

This turned out to be a consequence of the other problems.

It was easier to discover once I induced gcc 4.2.1 to generate some example code with r31 in use as a frame pointer. (I used alloca.) Observing its behavior and the .eh_frame output indicated I'd misinterpreted where the earliest problem was.