Bug 123062

Summary: C++ exception handling can loop during stacking unwinding in multithreaded programs
Product: Base System Reporter: Andy Newman <an>
Component: threadsAssignee: Konstantin Belousov <kib>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Unspecified   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
file.diff none

Description Andy Newman 2008-04-25 02:10:02 UTC
The gcc runtime's _Unwind_Find_FDE function, invoked during exception handling's stack unwinding, is not safe to execute from within multiple threads.  FreeBSD's dl_iterate_phdr() however permits multiple threads to pass through it though.  The result is surprisingly reliable infinite looping of one or more threads if they just happen to be unwinding at the same time.

A bug report to gcc (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36030), with a fix to _Unwind_Find_FDE to protect its cache manipulations, has the response that glibc's dl_iterate_phdr doesn't  let multiple threads unwind at the same time.  To that end here's a patch to ld-elf.so that adds a lock around the dl_iterate_pdr call (well, within it) that has it operate in the same way as glibc's and avoid the looping problem.

Fix: Apply the attached patch and re-run tests.  No failures have been observed over hundreds of thousands of runs.

The patch adds another lock to ld-elf.so used solely to stop multiple threads running dl_iterate_pdr() at the same time.  At entry the lock is obtained (as a write lock), the iteration performed and then the lock is released.

Note that simply upgrading the existing read lock (rtld_bind_lock) to a write lock does not work and results in deadlock presumably due to other rtld functions being called during the stack unwinding which want to obtain the "bind" lock.



Patch attached with submission follows:
How-To-Repeat: Reproduction is via the small program attached to the gcc bug report.  As described there, the program creates some number of threads, each of which immediately throws an exception. The exceptions are caught and ignored in the thread.  The threads are then joined and the program exits.

Failure does not occur every time due to the usual race condition issues however by running the program in a shell loop it will eventually lock up attempting to join the threads.  Some number of threads will be looping consuming 100% CPU.

This has been tested on multiprocessor machines (two and four way) will reliable behavior.
Comment 1 kabaev 2008-04-26 18:31:59 UTC
  The submitted patch is obviously incorrect. If you are adding
rtld_phdr_lock, you should be grabbing it in addition to
rtld_bind_lock instead of replacing it.

-- 
Alexander Kabaev
Comment 2 Andy Newman 2008-04-27 01:53:13 UTC
Yes, obviously it needs the bind lock before it accesses the objs  
list.  As in,

Index: rtld.c
===================================================================
RCS file: /home/ncvs/root/src/src/libexec/rtld-elf/rtld.c,v
retrieving revision 1.124
diff -u -r1.124 rtld.c
--- rtld.c	17 May 2007 18:00:27 -0000	1.124
+++ rtld.c	27 Apr 2008 00:51:25 -0000
@@ -2098,9 +2098,10 @@
  {
      struct dl_phdr_info phdr_info;
      const Obj_Entry *obj;
-    int error, lockstate;
+    int error, bind_lockstate, phdr_lockstate;

-    lockstate = rlock_acquire(rtld_bind_lock);
+    phdr_lockstate = wlock_acquire(rtld_phdr_lock);
+    bind_lockstate = rlock_acquire(rtld_bind_lock);

      error = 0;

@@ -2119,7 +2120,8 @@
  		break;

      }
-    rlock_release(rtld_bind_lock, lockstate);
+    rlock_release(rtld_bind_lock, bind_lockstate);
+    wlock_release(rtld_phdr_lock, phdr_lockstate);

      return (error);
  }
Index: rtld_lock.c
===================================================================
RCS file: /home/ncvs/root/src/src/libexec/rtld-elf/rtld_lock.c,v
retrieving revision 1.4
diff -u -r1.4 rtld_lock.c
--- rtld_lock.c	3 Apr 2007 18:28:13 -0000	1.4
+++ rtld_lock.c	27 Apr 2008 00:50:01 -0000
@@ -171,7 +171,7 @@
  	lockinfo.thread_clr_flag(mask);
  }

-#define	RTLD_LOCK_CNT	2
+#define	RTLD_LOCK_CNT	3
  struct rtld_lock {
  	void	*handle;
  	int	 mask;
@@ -179,6 +179,7 @@

  rtld_lock_t	rtld_bind_lock = &rtld_locks[0];
  rtld_lock_t	rtld_libc_lock = &rtld_locks[1];
+rtld_lock_t	rtld_phdr_lock = &rtld_locks[2];

  int
  rlock_acquire(rtld_lock_t lock)
Index: rtld_lock.h
===================================================================
RCS file: /home/ncvs/root/src/src/libexec/rtld-elf/rtld_lock.h,v
retrieving revision 1.2
diff -u -r1.2 rtld_lock.h
--- rtld_lock.h	19 Jun 2003 02:39:37 -0000	1.2
+++ rtld_lock.h	27 Apr 2008 00:50:34 -0000
@@ -52,6 +52,7 @@

  extern rtld_lock_t	rtld_bind_lock;
  extern rtld_lock_t	rtld_libc_lock;
+extern rtld_lock_t	rtld_phdr_lock;

  int	rlock_acquire(rtld_lock_t);
  int 	wlock_acquire(rtld_lock_t);
Comment 3 Konstantin Belousov freebsd_committer freebsd_triage 2008-05-05 17:25:34 UTC
Responsible Changed
From-To: freebsd-threads->kib

Take.
Comment 4 dfilter service freebsd_committer freebsd_triage 2008-05-06 10:27:46 UTC
kib         2008-05-06 09:27:41 UTC

  FreeBSD src repository

  Modified files:
    libexec/rtld-elf     rtld.c rtld_lock.c rtld_lock.h 
  Log:
  Fix the problem with the C++ exception handling for the multithreaded
  programs.
  
  From the PR description:
  The gcc runtime's _Unwind_Find_FDE function, invoked during exception
  handling's stack unwinding, is not safe to execute from within multiple
  threads. FreeBSD' s dl_iterate_phdr() however permits multiple threads
  to pass through it though. The result is surprisingly reliable infinite
  looping of one or more threads if they just happen to be unwinding at
  the same time.
  
  Introduce the new lock that is write locked around the dl_iterate_pdr,
  thus providing required exclusion for the stack unwinders.
  
  PR:     threads/123062
  Submitted by:   Andy Newman <an at atrn org>
  Reviewed by:    kan
  MFC after:      2 weeks
  
  Revision  Changes    Path
  1.126     +5 -3      src/libexec/rtld-elf/rtld.c
  1.5       +2 -1      src/libexec/rtld-elf/rtld_lock.c
  1.3       +1 -0      src/libexec/rtld-elf/rtld_lock.h
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
Comment 5 dfilter service freebsd_committer freebsd_triage 2008-05-20 10:38:53 UTC
kib         2008-05-20 09:38:44 UTC

  FreeBSD src repository

  Modified files:        (Branch: RELENG_7)
    libexec/rtld-elf     rtld.c rtld_lock.c rtld_lock.h 
  Log:
  MFC
  rev. 1.126 of libexec/rtld-elf/rtld.c
  rev. 1.5 of libexec/rtld-elf/rtld_lock.c
  rev. 1.3 of libexec/rtld-elf/rtld_lock.h
  
  Fix the problem with the C++ exception handling for the multithreaded
  programs.
  
  From the PR description:
  The gcc runtime's _Unwind_Find_FDE function, invoked during exception
  handling's stack unwinding, is not safe to execute from within multiple
  threads. FreeBSD' s dl_iterate_phdr() however permits multiple threads
  to pass through it though. The result is surprisingly reliable infinite
  looping of one or more threads if they just happen to be unwinding at
  the same time.
  
  Introduce the new lock that is write locked around the dl_iterate_pdr,
  thus providing required exclusion for the stack unwinders.
  
  PR:     threads/123062
  Submitted by:   Andy Newman <an at atrn org>
  Reviewed by:    kan
  
  Revision   Changes    Path
  1.124.2.1  +5 -3      src/libexec/rtld-elf/rtld.c
  1.4.2.1    +2 -1      src/libexec/rtld-elf/rtld_lock.c
  1.2.20.1   +1 -0      src/libexec/rtld-elf/rtld_lock.h
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
Comment 6 Konstantin Belousov freebsd_committer freebsd_triage 2008-05-20 10:42:49 UTC
State Changed
From-To: open->closed

Fix committed to the HEAD and RELENG_7.