Bug 177624 - [libc] [patch] Swapcontext can get compiled incorrectly
Summary: [libc] [patch] Swapcontext can get compiled incorrectly
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: Unspecified
Hardware: Any Any
: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2013-04-04 03:40 UTC by Brian Demsky
Modified: 2022-10-17 12:39 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Brian Demsky 2013-04-04 03:40:00 UTC
Here is the code for swap context:

int
swapcontext(ucontext_t *oucp, const ucontext_t *ucp)
{
      int ret;

      if ((oucp == NULL) || (ucp == NULL)) {
              errno = EINVAL;
              return (-1);
      }
      oucp->uc_flags &= ~UCF_SWAPPED;
      ret = getcontext(oucp);
      if ((ret == 0) && !(oucp->uc_flags & UCF_SWAPPED)) {
              oucp->uc_flags |= UCF_SWAPPED;
              ret = setcontext(ucp);
      }
      return (ret);
}

On the OS X port of libc in Mac OSX 10.7.5, this gets compiled as:

0x00007fff901e86b2 <swapcontext+0>:     push   %r14
0x00007fff901e86b4 <swapcontext+2>:     push   %rbx
0x00007fff901e86b5 <swapcontext+3>:     sub    $0x8,%rsp
0x00007fff901e86b9 <swapcontext+7>:     test   %rdi,%rdi
0x00007fff901e86bc <swapcontext+10>:    je     0x7fff901e86c6 <swapcontext+20>
0x00007fff901e86be <swapcontext+12>:    mov    %rsi,%rbx
0x00007fff901e86c1 <swapcontext+15>:    test   %rbx,%rbx
0x00007fff901e86c4 <swapcontext+18>:    jne    0x7fff901e86d8 <swapcontext+38>
0x00007fff901e86c6 <swapcontext+20>:    callq  0x7fff90262c88 <__error>
0x00007fff901e86cb <swapcontext+25>:    movl   $0x16,(%rax)
0x00007fff901e86d1 <swapcontext+31>:    mov    $0xffffffff,%eax
0x00007fff901e86d6 <swapcontext+36>:    jmp    0x7fff901e86f3 <swapcontext+65>
0x00007fff901e86d8 <swapcontext+38>:    mov    %rdi,%r14
0x00007fff901e86db <swapcontext+41>:    andb   $0x7f,0x3(%r14)
0x00007fff901e86e0 <swapcontext+46>:    mov    %r14,%rdi
0x00007fff901e86e3 <swapcontext+49>:    callq  0x7fff901e87af <getcontext>
0x00007fff901e86e8 <swapcontext+54>:    test   %eax,%eax
0x00007fff901e86ea <swapcontext+56>:    jne    0x7fff901e86f3 <swapcontext+65>
0x00007fff901e86ec <swapcontext+58>:    mov    (%r14),%ecx
0x00007fff901e86ef <swapcontext+61>:    test   %ecx,%ecx
0x00007fff901e86f1 <swapcontext+63>:    jns    0x7fff901e86fb <swapcontext+73>
0x00007fff901e86f3 <swapcontext+65>:    add    $0x8,%rsp
0x00007fff901e86f7 <swapcontext+69>:    pop    %rbx
0x00007fff901e86f8 <swapcontext+70>:    pop    %r14
0x00007fff901e86fa <swapcontext+72>:    retq   
0x00007fff901e86fb <swapcontext+73>:    or     $0x80000000,%ecx
0x00007fff901e8701 <swapcontext+79>:    mov    %ecx,(%r14)
0x00007fff901e8704 <swapcontext+82>:    mov    %rbx,%rdi
0x00007fff901e8707 <swapcontext+85>:    add    $0x8,%rsp
0x00007fff901e870b <swapcontext+89>:    pop    %rbx
0x00007fff901e870c <swapcontext+90>:    pop    %r14
0x00007fff901e870e <swapcontext+92>:    jmpq   0x7fff90262855 <setcontext>

The problem is that rbx is callee saved by compiled version of swapcontext and then reused before getcontext is called.  Getcontext then stores the wrong value for rbx and setcontext later restores the wrong value for rbx.  If the caller had any value in rbx, it has been trashed at this point.

Brian
Comment 1 Brian Demsky 2013-04-04 04:01:14 UTC
QSBxdWljayBub3RlIHRoYXQgdGhlIHRhaWwgY2FsbCB0byBzZXQgY29udGV4dCBpcyB3aGF0IGFj
dHVhbGx5IHRyYXNoZXMgdGhlIGNvcGllcyBvZiByYnggYW5kIHIxNCBvbiB0aGUgc3RhY2su
Comment 2 Eitan Adler freebsd_committer freebsd_triage 2013-04-04 13:11:34 UTC
---------- Forwarded message ----------
From: Brian Demsky <bdemsky@uci.edu>
Date: 4 April 2013 01:25
Subject: Re: misc/177624: Swapcontext can get compiled incorrectly
To: Eitan Adler <lists@eitanadler.com>
Cc: freebsd-bugs@freebsd.org


> The analysis is a little wrong about the problem.  Ultimately, the tail c=
all to set context trashes the copies of bx and r14 on the stack=E2=80=A6.

--=20
Eitan Adler
Comment 3 Bruce Evans freebsd_committer freebsd_triage 2013-04-04 14:46:32 UTC
On Thu, 4 Apr 2013, Brian Demsky wrote:

>> Description:
> Here is the code for swap context:
>
> int
> swapcontext(ucontext_t *oucp, const ucontext_t *ucp)
> {
>      int ret;
>
>      if ((oucp == NULL) || (ucp == NULL)) {
>              errno = EINVAL;
>              return (-1);
>      }
>      oucp->uc_flags &= ~UCF_SWAPPED;
>      ret = getcontext(oucp);
>      if ((ret == 0) && !(oucp->uc_flags & UCF_SWAPPED)) {
>              oucp->uc_flags |= UCF_SWAPPED;
>              ret = setcontext(ucp);
>      }
>      return (ret);
> }


> On the OS X port of libc in Mac OSX 10.7.5, this gets compiled as:


> ...
> 0x00007fff901e870b <swapcontext+89>:    pop    %rbx
> 0x00007fff901e870c <swapcontext+90>:    pop    %r14
> 0x00007fff901e870e <swapcontext+92>:    jmpq   0x7fff90262855 <setcontext>
>
> The problem is that rbx is callee saved by compiled version of swapcontext and then reused before getcontext is called.  Getcontext then stores the wrong value for rbx and setcontext later restores the wrong value for rbx.  If the caller had any value in rbx, it has been trashed at this point.


Later you wrote:

> The analysis is a little wrong about the problem.  Ultimately, the tail call to set context trashes the copies of bx and r14 on the stack.


The bug seems to be in setcontext().  It must preserve the callee-saved
registers, not restore them.  This would happen automatically if more
were written in C.  But setcontext() can't be written entirely in C,
since it must save all callee-saved registers including ones not used
and therefore not normally saved by any C function that it might be in,
and possibly also including callee-saved registers for nonstandard or
non-C ABIs.  In FreeBSD, it is apparently always a syscall.

In FreeBSD, this bug doesn't occur on at least amd64 or i386 because the
C version of swapcontext() has never been used on these arches.
swapcontext() is a syscall too.

If setcontext() is a syscall, then it has a minor problem even knowing
what the ABI's callee-saved registers are.  At least the FreeBSD amd64
version doesn't know anything about this.  It uses much the same code
as for asynchronous signal handling, so it just restores all registers,
including scratch ones that don't need to be preserved.  It even
restores the return register to the trap frame, although it can't
return this to userland.  This can probably be fixed a library wrapper.

In FreeBSD on amd64, getcontext(), setcontext() and swapcontext() are
all syscalls, but their documenation is misplaced in a section 3 man
page.  swapcontext() is misplaced together with makecontext(), which
actually is library function.  Oops, not quite.  getcontext() is
actually a small wrapper around an undocumented setcontext() syscall
(this is needed to adjust the instruction pointer register).  Only
makecontext() is in C, and not a wrapper.  I don't count the wrappers
that just make a syscall as library functions, since the corresponding
syscalls can be made easily without using the C library and this should
be documented.

Bruce
Comment 4 Bruce Evans freebsd_committer freebsd_triage 2013-04-04 16:16:02 UTC
On Fri, 5 Apr 2013, Bruce Evans wrote:

> The bug seems to be in setcontext().  It must preserve the callee-saved
> registers, not restore them.  This would happen automatically if more
> were written in C.  But setcontext() can't be written entirely in C,
> since it must save all callee-saved registers including ones not used
> and therefore not normally saved by any C function that it might be in,
> and possibly also including callee-saved registers for nonstandard or
> non-C ABIs.  In FreeBSD, it is apparently always a syscall.


This is more than a little wrong.  When setcontext() succeeds, it
doesn't return here.  Then it acts like longjmp() and must restore all
the callee-saved to whatever they were when getcontext() was called.
Otherwise, it must not clobber any callee-saved registers (then it
differs from longjmp().  longjmp() just can't fail).

Now I don't see any bug here.  If the saved state is returned to, then
it is as if getcontext() returned, and the intermediately-saved %rbx
is correct (we will restore the orginal %rbx if we return).  If
setcontext() fails, then it should preserve all callee-saved registers.
In the tail-call case, we have already restored the orginal %rbx and
the failing setcontext() should preserve that.

Bruce
Comment 5 Brian Demsky 2013-04-04 17:43:06 UTC
On Apr 4, 2013, at 8:16 AM, Bruce Evans wrote:

> This is more than a little wrong.  When setcontext() succeeds, it
> doesn't return here.  Then it acts like longjmp() and must restore all
> the callee-saved to whatever they were when getcontext() was called.
> Otherwise, it must not clobber any callee-saved registers (then it
> differs from longjmp().  longjmp() just can't fail).
>=20
> Now I don't see any bug here.  If the saved state is returned to, then
> it is as if getcontext() returned, and the intermediately-saved %rbx
> is correct (we will restore the orginal %rbx if we return).  If
> setcontext() fails, then it should preserve all callee-saved =
registers.
> In the tail-call case, we have already restored the orginal %rbx and
> the failing setcontext() should preserve that.
>=20
> Bruce

Take at setcontext:=20

(gdb) disassemble setcontext
Dump of assembler code for function setcontext:
0x00007fff90262855 <setcontext+0>:      push   %rbx
0x00007fff90262856 <setcontext+1>:      lea    0x38(%rdi),%rbx
0x00007fff9026285a <setcontext+5>:      cmp    0x30(%rdi),%rbx
0x00007fff9026285e <setcontext+9>:      je     0x7fff90262864 =
<setcontext+15>
0x00007fff90262860 <setcontext+11>:     mov    %rbx,0x30(%rdi)
0x00007fff90262864 <setcontext+15>:     mov    0x4(%rdi),%edi
0x00007fff90262867 <setcontext+18>:     callq  0x7fff90262998 =
<sigsetmask>
0x00007fff9026286c <setcontext+23>:     mov    %rbx,%rdi
0x00007fff9026286f <setcontext+26>:     pop    %rbx
0x00007fff90262870 <setcontext+27>:     jmpq   0x7fff90262875 =
<_setcontext>
End of assembler dump.

The stack from swapcontext had rbx and r14 popped after getcontext =
stored everything.  Now we push rbx and then later call sigsetmask.  =
Those two actions guarantee that the memory locations where rbx and r14 =
were on the stack have been overwritten.  When we later return to the =
save context, it will start up swapcontext and pop the wrong values off =
of the stack for rbx and r14.

Brian
Comment 6 Bruce Evans freebsd_committer freebsd_triage 2013-04-04 20:38:51 UTC
On Thu, 4 Apr 2013, Brian Demsky wrote:

> Take at setcontext:
>
> (gdb) disassemble setcontext
> Dump of assembler code for function setcontext:
> 0x00007fff90262855 <setcontext+0>:      push   %rbx
> 0x00007fff90262856 <setcontext+1>:      lea    0x38(%rdi),%rbx
> 0x00007fff9026285a <setcontext+5>:      cmp    0x30(%rdi),%rbx
> 0x00007fff9026285e <setcontext+9>:      je     0x7fff90262864 <setcontext+15>
> 0x00007fff90262860 <setcontext+11>:     mov    %rbx,0x30(%rdi)
> 0x00007fff90262864 <setcontext+15>:     mov    0x4(%rdi),%edi
> 0x00007fff90262867 <setcontext+18>:     callq  0x7fff90262998 <sigsetmask>
> 0x00007fff9026286c <setcontext+23>:     mov    %rbx,%rdi
> 0x00007fff9026286f <setcontext+26>:     pop    %rbx
> 0x00007fff90262870 <setcontext+27>:     jmpq   0x7fff90262875 <_setcontext>
> End of assembler dump.
>
> The stack from swapcontext had rbx and r14 popped after getcontext stored everything.  Now we push rbx and then later call sigsetmask.  Those two actions guarantee that the memory locations where rbx and r14 were on the stack have been overwritten.  When we later return to the save context, it will start up swapcontext and pop the wrong values off of the stack for rbx and r14.


Ah, it is not really rbx and r14, but rsp and the whole stack frame of
swapcontext() that are mishandled.  Even returning from swapcontext()
leaves the saved rsp pointing to garbage.  The stack frame could have
had almost anything on it before it became invalid, but here it has mainly
the saved rbx and r14 (not rbp; however, when compiled by clang on FreeBSD,
it also has the saved rbp, and when compiled with -O0 it also has the
local variable).

Now I think swapcontext() can't be written in C, for the same reason that
setjmp() can't be written in C -- the stack frame cannot be controlled in
C, and if it has anything at all on it (even the return address requires
special handling), then the stack pointer saved in the context becomes
invalid when the function returns, or even earlier for tail calls and
other optimizations.  Also, if the C function calls another function like
the library getcontext(), then there are 2 stack frames below the saved
stack pointer that are hard to control.  In FreeBSD on at least x86,
getcontext() is a special non-automatically generated asm function to
control this.  The automatically generated asm function would have
only the return address in its stack frame, but even this is too much,
and there is even more to control.  The comment about this is incomplete/
partly wrong, but gives a good hint about the problem (*).

This problem is avoided in setjmp() and longjmp() by not leaving anything
on stack frames except the return address for setjmp() at the point where
the stack pointer is saved.  The saved stack pointer is still technically
invalid, since it points to the return address which goes away when
setjmp() returns.  This is handled by not returning in the usual way in
lonjmp().  Instead, setjmp() saves the return address and longjmp()
restores it; the restored stack pointer becomes valid only at the end
of setjmp() when the top word on it is restored.  FreeBSD getcontext()
uses a more hackish method (*).

The saved stack pointer becomes more than technically invalid if the
function that called setjmp() or swapcontext() returns.  Then the
caller's frame becomes invalid.  Such returns are invalid.  This
restriction is clearly documented for setjmp() but not for swapcontext()
in FreeBSD man pages and old C99 and POSIX specs.  The C99 restriction
is only that longjmp() must not be invoked with the saved state after
the function that saved the state using setjmp() returns.  Compilers
must know about this and and not do optimizations (like tail calls?)
that would invalidate the saved stack pointer before the function
returns.

(*) Here is the FreeBSD i386 getcontext():

@ /*
@  * This has to be magic to handle the multiple returns.

Multiple = just 2.

@  * Otherwise, the setcontext() syscall will return here and we'll
@  * pop off the return address and go to the *setcontext* call.
@  */

Actually, we would pop off garbage and go to neverland, with a lower
chance of going to setcontext than most places.

@ 	.weak	_getcontext
@ 	.set	_getcontext,__sys_getcontext
@ 	.weak	getcontext
@ 	.set	getcontext,__sys_getcontext
@ ENTRY(__sys_getcontext)
@ 	movl	(%esp),%ecx	/* save getcontext return address */
@ 	mov	$SYS_getcontext,%eax
@ 	KERNCALL
@ 	jb	HIDENAME(cerror)

When getcontext() fails, the return address on the stack is still valid
and cerror depends on that.

@ 	addl	$4,%esp		/* remove stale (setcontext) return address */

Actually, remove the non-stale (our getcontext) return address if the
return is not from setcontext, and remove stack garbage if the
return is from setcontext.  The stack contents is not related to
setcontext in either case.  In the garbage case, it started as the
return address for another getcontext, but became garbage when that
returned.

@ 	jmp	*%ecx		/* restore return address */

We want our return address in both cases.  We don't know which case
applies and use same code for both.  The comment is imprecise.  We
don't restore the return address.  What we do is return.

The code should be changed to match the comment (don't adjust %esp,
but actually restore the return address to it).  This method is used
without comment by FreeBSD x86 longjmp()).  Optimizing this for speed
is unimportant, but this is probably faster as well as cleaner.  The
jmp may have been faster 20 years ago, but now it unbalances call/return
branch prediction.

The libc swapcontext() can probably be fixed by copying
setjmp()/longjmp().  Except setjmp()/longjmp() has fundamentally broken
stack handling too, at least when setjmp() is actually sigsetjmp().
Then it is necessary to restore the stack pointer atomically with
restoring the signal mask, and this seems to be impossible with using
a single syscall that does both.  The 2 different orders of restoring
them give different problems.  In the above, you seem to have
setcontext() in userland, with some signal unmasking, so I think it
has the same races as setjmp()/longjmp().  Perhaps an atomic syscall
for setcontext() is enough.

Bruce
Comment 7 David Xu freebsd_committer freebsd_triage 2013-04-07 03:41:29 UTC
On 2013/04/05 03:38, Bruce Evans wrote:
> Now I think swapcontext() can't be written in C, for the same reason that
> setjmp() can't be written in C -- the stack frame cannot be controlled in
> C, and if it has anything at all on it (even the return address requires
> special handling), then the stack pointer saved in the context becomes
> invalid when the function returns, or even earlier for tail calls and
> other optimizations.

This reminds me that I can not override swapcontext in libthr, I had
put a wrapper for swapcontext in libthr, I am considering to remove it
now ...
Comment 8 David Xu freebsd_committer freebsd_triage 2013-04-17 10:38:02 UTC
I have worked out a patch:

http://people.freebsd.org/~davidxu/patch/libc_swapcontext.diff
Comment 9 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 08:01:15 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 10 Graham Perrin freebsd_committer freebsd_triage 2022-10-17 12:39:47 UTC
Keyword: 

    patch
or  patch-ready

– in lieu of summary line prefix: 

    [patch]

* bulk change for the keyword
* summary lines may be edited manually (not in bulk). 

Keyword descriptions and search interface: 

    <https://bugs.freebsd.org/bugzilla/describekeywords.cgi>