Bug 182161 - [libc] restarting SYSCALL system call on amd64 loses arguments
Summary: [libc] restarting SYSCALL system call on amd64 loses arguments
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 9.1-RELEASE
Hardware: Any Any
: Normal Affects Only Me
Assignee: Konstantin Belousov
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-16 18:20 UTC by rsc
Modified: 2013-10-06 23:35 UTC (History)
0 users

See Also:


Attachments
file.txt (1.15 KB, text/plain)
2013-09-16 18:20 UTC, rsc
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description rsc 2013-09-16 18:20:00 UTC
FreeBSD 9 (and other versions) appear to support invoking system calls using the SYSCALL instruction. However, that code path does not work for system call that will be restarted due to incoming signals (that is, due to signals with SA_RESTART set in their sigaction settings), because the fast_syscall path in amd64/amd64/exception.S only restores two of the six system call arguments. 

The specific problem we have been seeing is that if a SIGCHLD interrupts wait4 (and we've marked SIGCHLD SA_RESTART), then the wait4 loses its fourth argument, R10, which changes to a different value entirely. If the restarted wait4 succeeds, the copy into the rusage will return "bad address".

I encountered this using Go, which invokes system calls using SYSCALL (because it seemed to work, I guess). I have reproduced it with a simple C program.


The bug reproduces under ktrace, where it becomes easy to see. Watch thread 6264879. The first restart loses the argument (changes it to 0xa00200a0), and the second restart, actually finds a child and fails with errno 14.

..
 50239 6264879 a.out    CALL  wait4(0xc760,0x7fffff9fcfb4,0<><invalid>0,0x7fffff9fcf20)
 50239 6264879 a.out    RET   wait4 RESTART
 50239 6264879 a.out    CALL  wait4(0xc760,0x7fffff9fcfb4,0<><invalid>0,0xa00200a0)

 50239 6264384 a.out    RET   wait4 51039/0xc75f
 50239 6264384 a.out    CALL  sigprocmask(SIG_BLOCK,0x80082c8f0,0x8010078e8)
 50239 6264384 a.out    RET   sigprocmask 0
 50239 6264384 a.out    CALL  fork
 50239 6264384 a.out    RET   fork 51041/0xc761
 50239 6264384 a.out    CALL  sigprocmask(SIG_SETMASK,0x8010078e8,0)
 50239 6264384 a.out    RET   sigprocmask 0
 50239 6264384 a.out    CALL  wait4(0xc761,0x7fffffbfdfb4,0<><invalid>0,0x7fffffbfdf20)

 50239 6264879 a.out    RET   wait4 RESTART
 50239 6264879 a.out    PSIG  SIGCHLD caught handler=0x800825520 mask=0x0 code=0x1
 50239 6264879 a.out    CALL  sigprocmask(SIG_SETMASK,0x7fffff9fca5c,0)
 50239 6264879 a.out    RET   sigprocmask 0
 50239 6264879 a.out    CALL  sigreturn(0x7fffff9fc690)
 50239 6264879 a.out    RET   sigreturn JUSTRETURN
 50239 6264879 a.out    CALL  wait4(0xc760,0x7fffff9fcfb4,0<><invalid>0,0xa00200a0)
 50239 6264879 a.out    RET   wait4 -1 errno 14 Bad address
..

The INT $0x80 path does not have this bug - it restores all the registers correctly - so I will change the Go implementation of system calls on FreeBSD to use INT $0x80.

Fix: Restore the other four arguments at the end of fast_syscall.


Patch attached with submission follows:
How-To-Repeat: Run the attached C program on an unloaded multicore system. It prints 'wait4 returned 14' on most runs. If it doesn't happen in the first few seconds, kill it and start again.
Comment 1 dfilter service freebsd_committer 2013-09-24 13:24:55 UTC
Author: kib
Date: Tue Sep 24 12:24:48 2013
New Revision: 255844
URL: http://svnweb.freebsd.org/changeset/base/255844

Log:
  Ensure that the ERESTART return from the syscall reloads the
  registers, to make the restarted syscall instruction pass the correct
  arguments.
  
  PR:	kern/182161
  Reported by:	Russ Cox <rsc@swtch.com>
  Sponsored by:	The FreeBSD Foundation
  MFC after:	3 days
  Approved by:	re (marius)

Modified:
  head/sys/amd64/amd64/vm_machdep.c

Modified: head/sys/amd64/amd64/vm_machdep.c
==============================================================================
--- head/sys/amd64/amd64/vm_machdep.c	Tue Sep 24 11:49:04 2013	(r255843)
+++ head/sys/amd64/amd64/vm_machdep.c	Tue Sep 24 12:24:48 2013	(r255844)
@@ -400,9 +400,13 @@ cpu_set_syscall_retval(struct thread *td
 		 * for the next iteration.
 		 * %r10 restore is only required for freebsd/amd64 processes,
 		 * but shall be innocent for any ia32 ABI.
+		 *
+		 * Require full context restore to get the arguments
+		 * in the registers reloaded at return to usermode.
 		 */
 		td->td_frame->tf_rip -= td->td_frame->tf_err;
 		td->td_frame->tf_r10 = td->td_frame->tf_rcx;
+		set_pcb_flags(td->td_pcb, PCB_FULL_IRET);
 		break;
 
 	case EJUSTRETURN:
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 2 dfilter service freebsd_committer 2013-09-27 07:57:06 UTC
Author: kib
Date: Fri Sep 27 06:56:58 2013
New Revision: 255905
URL: http://svnweb.freebsd.org/changeset/base/255905

Log:
  MFC r255844:
  Ensure that the ERESTART return from the syscall reloads the registers,
  to make the restarted syscall instruction pass the correct arguments.
  
  PR:	kern/182161

Modified:
  stable/9/sys/amd64/amd64/vm_machdep.c
Directory Properties:
  stable/9/sys/   (props changed)

Modified: stable/9/sys/amd64/amd64/vm_machdep.c
==============================================================================
--- stable/9/sys/amd64/amd64/vm_machdep.c	Thu Sep 26 22:47:02 2013	(r255904)
+++ stable/9/sys/amd64/amd64/vm_machdep.c	Fri Sep 27 06:56:58 2013	(r255905)
@@ -400,9 +400,13 @@ cpu_set_syscall_retval(struct thread *td
 		 * for the next iteration.
 		 * %r10 restore is only required for freebsd/amd64 processes,
 		 * but shall be innocent for any ia32 ABI.
+		 *
+		 * Require full context restore to get the arguments
+		 * in the registers reloaded at return to usermode.
 		 */
 		td->td_frame->tf_rip -= td->td_frame->tf_err;
 		td->td_frame->tf_r10 = td->td_frame->tf_rcx;
+		set_pcb_flags(td->td_pcb, PCB_FULL_IRET);
 		break;
 
 	case EJUSTRETURN:
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 3 dfilter service freebsd_committer 2013-09-27 08:02:56 UTC
Author: kib
Date: Fri Sep 27 07:02:48 2013
New Revision: 255906
URL: http://svnweb.freebsd.org/changeset/base/255906

Log:
  MFC r255844:
  Ensure that the ERESTART return from the syscall reloads the registers,
  to make the restarted syscall instruction pass the correct arguments.
  
  PR:	kern/182161

Modified:
  stable/8/sys/amd64/amd64/vm_machdep.c
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/amd64/   (props changed)

Modified: stable/8/sys/amd64/amd64/vm_machdep.c
==============================================================================
--- stable/8/sys/amd64/amd64/vm_machdep.c	Fri Sep 27 06:56:58 2013	(r255905)
+++ stable/8/sys/amd64/amd64/vm_machdep.c	Fri Sep 27 07:02:48 2013	(r255906)
@@ -395,9 +395,13 @@ cpu_set_syscall_retval(struct thread *td
 		 * for the next iteration.
 		 * %r10 restore is only required for freebsd/amd64 processes,
 		 * but shall be innocent for any ia32 ABI.
+		 *
+		 * Require full context restore to get the arguments
+		 * in the registers reloaded at return to usermode.
 		 */
 		td->td_frame->tf_rip -= td->td_frame->tf_err;
 		td->td_frame->tf_r10 = td->td_frame->tf_rcx;
+		set_pcb_flags(td->td_pcb, PCB_FULL_IRET);
 		break;
 
 	case EJUSTRETURN:
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 4 Jilles Tjoelker freebsd_committer 2013-10-06 23:33:47 UTC
State Changed
From-To: open->closed

Fixed in 10-current, 9-stable and 8-stable. 
Thanks for the report. 


Comment 5 Jilles Tjoelker freebsd_committer 2013-10-06 23:33:47 UTC
Responsible Changed
From-To: freebsd-bugs->kib

Set responsible to committer.