Bug 206810

Summary: 11.0-CURRENT/clang380-import for powerpc (32-bit): signal handlers given insufficient stack alignment
Product: Base System Reporter: Mark Millard <marklmi26-fbsd>
Component: kernAssignee: Justin Hibbits <jhibbits>
Status: Closed FIXED    
Severity: Affects Only Me CC: jhibbits
Priority: ---    
Version: CURRENT   
Hardware: powerpc   
OS: Any   

Description Mark Millard 2016-02-01 00:36:28 UTC
[syslogd, nfsd, mountd, and  make all get the described-below sorts of segmentation faults via similar __vfprintf use during a signal handler for the powerpc (32-bit) context identifed.]

For a system based on buildworld with TARGET_ARCH=powerpc via clang 3.8.0 (and use of project/clang380-import -r294962) the following short program gets a segmentation violation for the use of handler via raise:

#include <signal.h> // for signal, SIGINT, SIG_ERR, raise.
#include <stdio.h>  // for snprintf

void handler(int sig)
{
    char buf[32];
    snprintf(buf, sizeof buf, "%d", sig); // FreeBSD's world does such things in some of its signal handlers
}

int main(void)
{
    handler(0); // works fine
    if (signal(SIGINT, handler) != SIG_ERR) raise(SIGINT); // fails during handler
    return 0;
}

The problem actually occurs during __vfprintf (and the code it inlines), specifically for io_flush(struct io_state *iop, locale_t locale) has:

     return (__sprint(iop->fp, &iop->uio, locale));

The code generated that calculates &iop->uio depends on alignment by masking in the offset of 4 bytes instead of adding 4 to the address. ("li r3,4" and later "rlwimi r23,r3,0,29,29".) With the stack misalignment via the signal delivery the wrong address is calculated (the bit position already has a one). In fact it calculates the same address as its does for iop->fp (fp being the first field of *iop).

For reference:

#define NIOV 8
struct io_state {
     FILE *fp;
     struct __suio uio;      /* output information: summary */
     struct __siov iov[NIOV];/* ... and individual io vectors */
};


With the bad address propagated to

#define COPY(n)   (void)memcpy((void *)fp->_p, (void *)p, (size_t)(n))

(as p via __sfvwrite using COPY) the memcpy gets the segmentation fault.

The signal delivery is generating a modulo-4 alignment that is not aligned for any higher power of 2. Roman Divacky reports via a list response that 16 byte alignment is expected by llvm for ppc32. While 16 is what I expect as the ABI requirement I do not have a definitive reference to the FreeBSD powerpc (32-bit) ABI criteria for this (or any) context: I can not point to the definition or a known-good secondary source to prove a violation of a rule.

But I doubt that this should be considered a clang 3.8.0 defect for ppc32: it should be treated as a FreeBSD signal delivery defect instead as far as I can tell.

Below I provide some gdb evidence of the good (direct handler call) vs. bad (signal based handler call) stack alignments for handler and for __vfprintf (and its inlined code) if you want to look. For __vfprintf's lots of local variables (including arrays) they all end up with unusual alignments from start from a unusual alignment.

You can stop reading this description here to avoid this detail: There is no more after it.


(gdb) run
Starting program: /root/c_tests/a.out 

Breakpoint 10, 0x018006d4 in handler ()
(gdb) bt
#0  0x018006d4 in handler ()
#1  0x01800760 in main ()
(gdb) info frame
Stack level 0, frame at 0xffffdcb0:
pc = 0x18006d4 in handler; saved pc = 0x1800760
called by frame at 0xffffdcd0
Arglist at 0xffffdc60, args: 
Locals at 0xffffdc60, Previous frame's sp is 0xffffdcb0
Saved registers:
 r31 at 0xffffdcac, pc at 0xffffdcb4, lr at 0xffffdcb4

vs.

(gdb) cont
Continuing.

Breakpoint 10, 0x018006d4 in handler ()
(gdb) bt
#0  0x018006d4 in handler ()
#1  <signal handler called>
#2  0x00000000 in ?? ()
(gdb) info frame
Stack level 0, frame at 0xffffd73c:
pc = 0x18006d4 in handler; saved pc = 0xffffe008
called by frame at 0xffffd73c
Arglist at 0xffffd6ec, args: 
Locals at 0xffffd6ec, Previous frame's sp is 0xffffd73c
Saved registers:
 r31 at 0xffffd738, pc at 0xffffd740, lr at 0xffffd740


(gdb) info frame
Stack level 0, frame at 0xffffdad0:
pc = 0x41931590 in __vfprintf (/usr/src/lib/libc/stdio/vfprintf.c:454); saved pc = 0x4199c644
called by frame at 0xffffdc60
source language c.
Arglist at 0xffffd880, args: fp=0xffffdb40, locale=0x419cba40 <__xlocale_global_locale>, fmt0=0x180085c "%d", ap=0xffffdc30
Locals at 0xffffd880, Previous frame's sp is 0xffffdad0
Saved registers:
. . .

vs.

(gdb) info frame
Stack level 0, frame at 0xffffd55c:
pc = 0x41931590 in __vfprintf (/usr/src/lib/libc/stdio/vfprintf.c:454); saved pc = 0x4199c644
called by frame at 0xffffd6ec
source language c.
Arglist at 0xffffd30c, args: fp=0xffffd5cc, locale=0x419cba40 <__xlocale_global_locale>, fmt0=0x180085c "%d", ap=0xffffd6bc
Locals at 0xffffd30c, Previous frame's sp is 0xffffd55c
Saved registers:
. . .
Comment 1 Mark Millard 2016-02-02 09:29:32 UTC
I tried the following change on/for the powerpc (32-bit) PowerMac that I use

Index: /usr/src/sys/powerpc/powerpc/sigcode32.S
===================================================================
--- /usr/src/sys/powerpc/powerpc/sigcode32.S	(revision 294962)
+++ /usr/src/sys/powerpc/powerpc/sigcode32.S	(working copy)
@@ -45,9 +45,9 @@
  */
 	.globl	CNAME(sigcode32),CNAME(szsigcode32)
 CNAME(sigcode32):
-	addi	1,1,-20			/* reserved space for callee */
+	addi	1,1,-32			/* reserved space for callee */
 	blrl
-	addi	3,1,20+SF_UC		/* restore sp, and get &frame->sf_uc */
+	addi	3,1,32+SF_UC		/* restore sp, and get &frame->sf_uc */
 	li	0,SYS_sigreturn
 	sc				/* sigreturn(scp) */
 	li	0,SYS_exit


and the results were:

A) "info frame" in gdb shows signal handlers are now started with 16-byte aligned stack frames.

and

B) The clang 3.8.0 compiled __vfprintf segmentation faults in libc/stdio library code during signal handlers no longer happen because the alignment matches the code requirements.

(Before 2014 it was -16 and 16 instead of -20 and 20, but 16 was too small of a space. The change to -20 and 20 fixed that but no longer produced aligned stack frames: It should have gone from -16 and 16 to -32 and 32 to maintain 16 byte stack alignment while allocating more space.)
Comment 2 commit-hook freebsd_committer freebsd_triage 2016-02-03 01:50:43 UTC
A commit references this bug:

Author: jhibbits
Date: Wed Feb  3 01:50:27 UTC 2016
New revision: 295186
URL: https://svnweb.freebsd.org/changeset/base/295186

Log:
  Align signal stack pointer to 16 bytes.

  The stack must be aligned to 16 bytes at all times.  Clang 3.8 is especially
  adamant about this, and causes strange behavior and segmentation faults if it is
  not the case.

  PR:		kern/206810

Changes:
  head/sys/powerpc/powerpc/exec_machdep.c
  head/sys/powerpc/powerpc/sigcode32.S
Comment 3 Justin Hibbits freebsd_committer freebsd_triage 2016-02-03 01:51:56 UTC
Fixed in r295186.
Comment 4 Mark Millard 2016-02-03 02:38:20 UTC
(In reply to Justin Hibbits from comment #3)

Thanks for the fixes.

Looking at

stable/10/sys/powerpc/powerpc/exec_machdep.c -r293581

and

stable/10/sys/powerpc/powerpc/sigcode32.S -r267040

indicates that the fixes should be MFC'd, especially signcode32.S that has the -20 and 20 usage that definitely misaligns stack for signal handlers. It dates back to 2014-Jun-4 in 10-STABLE.

It would be good if 10.3-RELEASE ended up with the correction.
Comment 5 commit-hook freebsd_committer freebsd_triage 2016-04-07 00:38:33 UTC
A commit references this bug:

Author: jhibbits
Date: Thu Apr  7 00:37:47 UTC 2016
New revision: 297630
URL: https://svnweb.freebsd.org/changeset/base/297630

Log:
  MFC r295186

  Align signal stack pointer to 16 bytes.

  The stack must be aligned to 16 bytes at all times.  Clang 3.8 is especially
  adamant about this, and causes strange behavior and segmentation faults if it is
  not the case.

  PR:             kern/206810

Changes:
_U  stable/10/
  stable/10/sys/powerpc/powerpc/exec_machdep.c
  stable/10/sys/powerpc/powerpc/sigcode32.S