Bug 15328

Summary: Threaded "system" does'nt work in -current
Product: Base System Reporter: nnd <nnd>
Component: binAssignee: Daniel Eischen <deischen>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 4.0-CURRENT   
Hardware: Any   
OS: Any   

Description nnd 1999-12-07 12:10:00 UTC
	The program from the "How-To-Repeat" section just
does'nt finished if compiled with "-pthread" option to gcc.
The same problem exists in the "configure" stage of the 'gtk12'
port and (may be) in (the now deleted by me) 'xmms' program
from ports/audio.

Fix: 

It can be found by Daniel Eischen - the author
of the last commit to the 'lib/libc_r/uthread' ;-)
How-To-Repeat: 
	Compile the following program with the command:

cc -Wall -O -o systest -pthread systest.c

	and start it by "./systest" -

you see the "date" output, but the program does'nt finish
(and cant be killed by SIGINT, but successfully killed by
SIGTERM).

====================== systest.c ========================
#include <stdlib.h>

int main()
{
	system("date");
	exit(0);
}
Comment 1 cpiazza 1999-12-07 18:44:15 UTC
> 	Compile the following program with the command:
> 
> cc -Wall -O -o systest -pthread systest.c
> 
> 	and start it by "./systest" -
> 
> you see the "date" output, but the program does'nt finish
> (and cant be killed by SIGINT, but successfully killed by
> SIGTERM).

This program is working fine on my system.  My -current is from the 5th
of december, though.  Do you think that's relevant?

-Chris
-- 
cpiazza@jaxon.net   cpiazza@FreeBSD.org
        Abbotsford, BC, Canada
Comment 2 nnd 1999-12-08 02:04:42 UTC
In <199912071850.KAA79377@freefall.freebsd.org> Chris Piazza <cpiazza@jaxon.net> wrote:
> The following reply was made to PR bin/15328; it has been noted by GNATS.
> 
> From: Chris Piazza <cpiazza@jaxon.net>
> To: nnd@mail.nsk.ru
> Cc: FreeBSD-gnats-submit@FreeBSD.ORG
> Subject: Re: bin/15328: Threaded "system" does'nt work in -current
> Date: Tue, 7 Dec 1999 10:44:15 -0800
> 
>  > 	Compile the following program with the command:
>  > 
>  > cc -Wall -O -o systest -pthread systest.c
>  > 
>  > 	and start it by "./systest" -
>  > 
>  > you see the "date" output, but the program does'nt finish
>  > (and cant be killed by SIGINT, but successfully killed by
>  > SIGTERM).
>  
>  This program is working fine on my system.  My -current is from the 5th
>  of december, though.  Do you think that's relevant?

	It seems to me that the commit causes this (mis)behavior
of the "system" function in threaded programs is cited down this
letter. What are the versions of the relevant files in your
sources ? 

	N.Dudorov

=================================================================== 
>deischen    1999/12/04 14:55:59 PST
>
>  Modified files:
>    lib/libc_r/uthread   pthread_private.h uthread_fork.c 
>                         uthread_sig.c uthread_sigwait.c 
>  Log:
>  Change signal handling to conform to POSIX specified semantics.
>  Before this change, a signal was delivered to each thread that
>  didn't have the signal masked.  Signals also improperly woke up
>  threads waiting on I/O.  With this change, signals are now
>  handled in the following way:
>  
>    o If a thread is waiting in a sigwait for the signal,
>      then the thread is woken up.
>  
>    o If no threads are sigwait'ing on the signal and a
>      thread is in a sigsuspend waiting for the signal,
>      then the thread is woken up.
>  
>    o In the case that no threads are waiting or suspended
>      on the signal, then the signal is delivered to the
>      first thread we find that has the signal unmasked.
>  
>    o If no threads are waiting or suspended on the signal,
>      and no threads have the signal unmasked, then the signal
>      is added to the process wide pending signal set.  The
>      signal will be delivered to the first thread that unmasks
>      the signal.
>  
>  If there is an installed signal handler, it is only invoked
>  if the chosen thread was not in a sigwait.
>  
>  In the case that multiple threads are waiting or suspended
>  on a signal, or multiple threads have the signal unmasked,
>  we wake up/deliver the signal to the first thread we find.
>  The above rules still apply.
>  
>  Reported by:	Scott Hess <scott@avantgo.com>
>  Reviewed by:	jb, jasone
>  
>  Revision  Changes    Path
>  1.30      +7 -3      src/lib/libc_r/uthread/pthread_private.h
>  1.15      +4 -1      src/lib/libc_r/uthread/uthread_fork.c
>  1.21      +66 -24    src/lib/libc_r/uthread/uthread_sig.c
>  1.12      +6 -3      src/lib/libc_r/uthread/uthread_sigwait.c
Comment 3 cpiazza freebsd_committer freebsd_triage 1999-12-08 02:19:49 UTC
Responsible Changed
From-To: freebsd-bugs->deischen

Over to the committer responsible for the problem. 

Comment 4 cpiazza 1999-12-08 02:20:00 UTC
On Wed, Dec 08, 1999 at 08:04:42AM +0600, nnd@mail.nsk.ru wrote:
> 	It seems to me that the commit causes this (mis)behavior
> of the "system" function in threaded programs is cited down this
> letter. What are the versions of the relevant files in your
> sources ? 

I guess I forgot to build world over the weekend (I usually do), so my
libc_r was out of date.  You're right, this is a problem.

-Chris
-- 
cpiazza@jaxon.net   cpiazza@FreeBSD.org
        Abbotsford, BC, Canada
Comment 5 eischen 1999-12-08 17:06:59 UTC
The problem is that libc's system blocks SIGCHLD before forking 
and eventually calls wait4() which is wrapped by libc_r in uthread_wait4.c.
Since SIGCHLD is blocked, the calling thread is never capable of receiving
SIGCHLD.           

Here's a fix to src/lib/libc_r/uthread/uthread_wait4.c.  Try it and
let me know if it works.

Dan Eischen
eischen@vigrid.com 

Index: uthread_wait4.c
===================================================================
RCS file: /A/cvs/src/lib/libc_r/uthread/uthread_wait4.c,v
retrieving revision 1.6
diff -u -r1.6 uthread_wait4.c
--- uthread_wait4.c	1999/11/28 05:38:11	1.6
+++ uthread_wait4.c	1999/12/08 16:48:35
@@ -40,11 +40,20 @@
 pid_t
 wait4(pid_t pid, int *istat, int options, struct rusage * rusage)
 {
-	pid_t           ret;
+	pid_t	ret;
+	int	sigchld_blocked;
 
 	_thread_enter_cancellation_point();
 	_thread_kern_sig_defer();
 
+	/* Ensure this thread doesn't have SIGCHLD blocked: */
+	if (sigismember(&_thread_run->sigmask, SIGCHLD)) {
+		sigchld_blocked = 1;
+		sigdelset(&_thread_run->sigmask, SIGCHLD);
+	}
+	else
+		sigchld_blocked = 0;
+
 	/* Perform a non-blocking wait4 syscall: */
 	while ((ret = _thread_sys_wait4(pid, istat, options | WNOHANG, rusage)) == 0 && (options & WNOHANG) == 0) {
 		/* Reset the interrupted operation flag: */
@@ -60,6 +69,10 @@
 			break;
 		}
 	}
+
+	/* Restore SIGCHLD to the threads mask if necessary: */
+	if (sigchld_blocked != 0)
+		sigaddset(&_thread_run->sigmask, SIGCHLD);
 
 	_thread_kern_sig_undefer();
 	_thread_leave_cancellation_point();
Comment 6 Daniel Eischen freebsd_committer freebsd_triage 1999-12-17 03:02:36 UTC
State Changed
From-To: open->closed

Fixed in -current and tested by originator. 

Many thanks to N. Dudorov (originator) for helping test bug fixes.