Bug 11287 - rfork(RFMEM...) doesn't share LDTs set by i386_set_ldt, breaking wine
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 3.1-STABLE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Sheldon Hearn
Reported: 1999-04-22 23:50 UTC by Juergen Lock
Modified: 2000-08-31 10:25 UTC
Description Juergen Lock 1999-04-22 23:50:00 UTC
	wine now uses kernel threads (rfork()) and expects i386_set_ldt()
	to work across threads, i.e. the new LDT to be global to all threads.
	rfork() copies the LDT regardless of the RFMEM flag, so each thread
	ends up with its own LDT (sys/i386/i386/vm_machdep.c, cpu_fork()).

Fix: 

???  I'll see if I can come up with one, but don't hold your breath...
How-To-Repeat: 
	using a kernel with `options		"USER_LDT"' and wine
	current-CVS (see http://www.winehq.com), or a patched wine-990328
	(diffs are in my post to the freebsd-hackers mailing list which
	you can get at
	http://www.freebsd.org/cgi/mid.cgi?db=&id=19990417224534.A55834@saturn.kn-bremen.de),
	try to start a 16-bit program from a win32 one.  It will die at
	the line

	    SET_CUR_THREAD( pNewTask->thdb );

	in TASK_Reschedule() in loader/task.c, where it's loading the
	%fs register.  If you want all the details, look for the
	`wine with threads?' thread in comp.unix.bsd.freebsd.misc and
	comp.emulators.ms-windows.wine,
	news:<7fnsgs$14jh$1@saturn.kn-bremen.de>
Comment 1 Juergen Lock 1999-04-26 20:10:28 UTC
In article <199904222239.AAA43095@saturn.kn-bremen.de> you write:

>>Description:
>
>	wine now uses kernel threads (rfork()) and expects i386_set_ldt()
>	to work across threads, i.e. the new LDT be global to all threads.
>	rfork() copies the ldt regardless of the RFMEM flag so each thread
>	ends up with its own ldt (sys/i386/i386/vm_machdep.c, cpu_fork()).

>>Fix:

Here's a patch that makes it share the user LDT for rfork(RFTHREAD...),
tested on 3.1-stable.  It works by copying only the pcb_ldt pointer
at fork time and propagating it to all peers in i386_set_ldt(2).  The
`copied pointer' status is indicated by setting pcb_ldt_len = -1; only
p_leader's pcb_ldt_len holds the real size.
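
The marker scheme described above can be sketched in user space roughly as
follows.  This is only an illustration under stated assumptions, not the
kernel code: struct toy_proc, toy_ldt_len() and toy_share_ldt() are
hypothetical names standing in for struct proc/struct pcb and for the loop
over p_peers in the patched i386_set_ldt().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct proc plus its pcb fields. */
struct toy_proc {
	struct toy_proc *p_leader;	/* head of the thread group */
	struct toy_proc *p_peers;	/* next member of the group, or NULL */
	void *pcb_ldt;			/* LDT base pointer, same for the whole group */
	int pcb_ldt_len;		/* real length in the leader, -1 in peers */
};

/* Return the real LDT length for p, following the -1 marker to the leader. */
static int
toy_ldt_len(const struct toy_proc *p)
{
	if (p->pcb_ldt_len == -1) {
		/* A -1 length is only legal in a peer, never in the leader. */
		assert(p->p_leader != NULL && p->p_leader != p);
		return p->p_leader->pcb_ldt_len;
	}
	return p->pcb_ldt_len;
}

/* Install a new table for every member of p's group. */
static void
toy_share_ldt(struct toy_proc *p, void *new_ldt, int len)
{
	struct toy_proc *q;

	for (q = p->p_leader; q != NULL; q = q->p_peers) {
		q->pcb_ldt = new_ldt;
		/* The leader gets the real length, everyone else the marker. */
		q->pcb_ldt_len = (q == p->p_leader) ? len : -1;
	}
}
```

With a leader and one peer linked through p_leader/p_peers, calling
toy_share_ldt() on the peer leaves both pointing at the same table, and
toy_ldt_len() reports the leader's length for either of them.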

 This appears to fix the wine crashes (more in the newsgroups...)
if you add RFTHREAD to its rfork args.  Everything else works as
before; there is only one `problem': if you rfork(RFTHREAD...) and
then do an exec() in the parent, the exec'd program will still share
the LDT, as it will still be the p_leader...  But as there is nothing
else besides wine that uses i386_set_ldt(2), and wine doesn't do this,
it shouldn't really matter.  (Btw, if a child exec()s, shouldn't it
unlink itself from the p_peers list?  Looks like it currently
doesn't.  Hmm.)

 One other change:  I added a handler for trap 12's at cpu_switch_load_{f,g}s
as I was getting these while testing.  The finished patch doesn't seem
to generate them anymore (only trap 9's, for which there already is a
handler), but handling them anyway doesn't hurt, right? :)

 As for style etc., any comments are welcome.  This is only my second
patch to FreeBSD's kernel...

cvs diff: Diffing sys
Index: sys/proc.h
===================================================================
RCS file: /home/cvs/cvs/src/sys/sys/proc.h,v
retrieving revision 1.66.2.2
diff -u -r1.66.2.2 proc.h
--- proc.h	1999/02/23 13:44:36	1.66.2.2
+++ proc.h	1999/04/25 17:35:14
@@ -373,6 +373,7 @@
 void	unsleep __P((struct proc *));
 void	wakeup_one __P((void *chan));
 
+void	cpu_kill9 __P((struct proc *));
 void	cpu_exit __P((struct proc *)) __dead2;
 void	exit1 __P((struct proc *, int)) __dead2;
 void	cpu_fork __P((struct proc *, struct proc *));
cvs diff: Diffing kern
Index: kern/kern_exit.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.71.2.2
diff -u -r1.71.2.2 kern_exit.c
--- kern_exit.c	1999/03/02 00:42:08	1.71.2.2
+++ kern_exit.c	1999/04/26 14:48:47
@@ -41,6 +41,9 @@
 
 #include "opt_compat.h"
 #include "opt_ktrace.h"
+#ifdef __i386__
+#include "opt_user_ldt.h"
+#endif
 
 #include <sys/param.h>
 #include <sys/systm.h>
@@ -139,6 +142,12 @@
 			 * than the internal signal
 			 */
 			kill(p, &killArgs);
+#ifdef __i386__
+#ifdef USER_LDT
+			/* hook to undo LDT sharing */
+			cpu_kill9(q);
+#endif
+#endif
 			nq = q;
 			q = q->p_peers;
 			/*
cvs diff: Diffing i386/i386
Index: i386/i386/machdep.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/i386/machdep.c,v
retrieving revision 1.322.2.4
diff -u -r1.322.2.4 machdep.c
--- machdep.c	1999/02/17 13:08:41	1.322.2.4
+++ machdep.c	1999/04/26 16:34:31
@@ -815,13 +815,34 @@
 #ifdef USER_LDT
 	/* was i386_user_cleanup() in NetBSD */
 	if (pcb->pcb_ldt) {
-		if (pcb == curpcb) {
-			lldt(_default_ldt);
-			currentldt = _default_ldt;
+		if (pcb->pcb_ldt_len != -1) {
+#ifdef DIAGNOSTIC
+			if (p->p_leader != p)
+				panic("setregs: pcb_ldt_len != -1 in peer");
+#endif
+			if (!p->p_peers) {
+				if (pcb == curpcb) {
+					lldt(_default_ldt);
+					currentldt = _default_ldt;
+				}
+				kmem_free(kernel_map, (vm_offset_t)pcb->pcb_ldt,
+					pcb->pcb_ldt_len * sizeof(union descriptor));
+				pcb->pcb_ldt_len = (int)pcb->pcb_ldt = 0;
+			} else {
+				/* XXX what to do here? */
+				printf("setregs: leader exec()ing, keeping shared user ldt\n");
+			}
+#ifdef DIAGNOSTIC
+		} else if (!p->p_leader || p->p_leader == p) {
+			panic("setregs: pcb_ldt_len == -1 in leader");
+#endif
+		} else {
+			if (pcb == curpcb) {
+				lldt(_default_ldt);
+				currentldt = _default_ldt;
+			}
+			pcb->pcb_ldt_len = (int)pcb->pcb_ldt = 0;
 		}
-		kmem_free(kernel_map, (vm_offset_t)pcb->pcb_ldt,
-			pcb->pcb_ldt_len * sizeof(union descriptor));
-		pcb->pcb_ldt_len = (int)pcb->pcb_ldt = 0;
  	}
 #endif
   
Index: i386/i386/sys_machdep.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/i386/sys_machdep.c,v
retrieving revision 1.38
diff -u -r1.38 sys_machdep.c
--- sys_machdep.c	1998/12/07 21:58:19	1.38
+++ sys_machdep.c	1999/04/26 15:05:02
@@ -259,8 +259,16 @@
 void
 set_user_ldt(struct pcb *pcb)
 {
+	int nldt = pcb->pcb_ldt_len;
+	if (nldt == -1) {
+#ifdef DIAGNOSTIC
+		if (pcb != (struct pcb *)&curproc->p_addr->u_pcb)
+			panic("set_user_ldt: pcb->pcb_ldt_len == -1 and pcb != curproc's");
+#endif
+		nldt = ((struct pcb *)&curproc->p_leader->p_addr->u_pcb)->pcb_ldt_len;
+	}
 	gdt_segs[GUSERLDT_SEL].ssd_base = (unsigned)pcb->pcb_ldt;
-	gdt_segs[GUSERLDT_SEL].ssd_limit = (pcb->pcb_ldt_len * sizeof(union descriptor)) - 1;
+	gdt_segs[GUSERLDT_SEL].ssd_limit = (nldt * sizeof(union descriptor)) - 1;
 	ssdtosd(&gdt_segs[GUSERLDT_SEL], &gdt[GUSERLDT_SEL].sd);
 	lldt(GSEL(GUSERLDT_SEL, SEL_KPL));
 	currentldt = GSEL(GUSERLDT_SEL, SEL_KPL);
@@ -301,6 +309,13 @@
 
 	if (pcb->pcb_ldt) {
 		nldt = pcb->pcb_ldt_len;
+		if (nldt == -1) {
+#ifdef DIAGNOSTIC
+			if (!p->p_leader || p->p_leader == p)
+				panic("i386_get_ldt: pcb_ldt_len == -1 in leader");
+#endif
+			nldt = ((struct pcb *)&p->p_leader->p_addr->u_pcb)->pcb_ldt_len;
+		}
 		num = min(uap->num, nldt);
 		lp = &((union descriptor *)(pcb->pcb_ldt))[uap->start];
 	} else {
@@ -335,7 +350,8 @@
 	int error = 0, i, n;
  	int largest_ld;
 	struct pcb *pcb = &p->p_addr->u_pcb;
-	int s;
+	struct proc *q;
+	int nldt, s;
 	struct i386_set_ldt_args ua, *uap;
 
 	if ((error = copyin(args, &ua, sizeof(struct i386_set_ldt_args))) < 0)
@@ -359,24 +375,54 @@
   		return(EINVAL);
   
   	/* allocate user ldt */
- 	if (!pcb->pcb_ldt || (largest_ld >= pcb->pcb_ldt_len)) {
+	nldt = pcb->pcb_ldt_len;
+	if (nldt == -1) {
+#ifdef DIAGNOSTIC
+		if (!p->p_leader || p->p_leader == p)
+			panic("i386_set_ldt: pcb_ldt_len == -1 in leader");
+#endif
+		nldt = ((struct pcb *)&p->p_leader->p_addr->u_pcb)->pcb_ldt_len;
+	}
+ 	if (!pcb->pcb_ldt || (largest_ld >= nldt)) {
  		union descriptor *new_ldt = (union descriptor *)kmem_alloc(
  			kernel_map, SIZE_FROM_LARGEST_LD(largest_ld));
  		if (new_ldt == NULL) {
  			return ENOMEM;
  		}
  		if (pcb->pcb_ldt) {
- 			bcopy(pcb->pcb_ldt, new_ldt, pcb->pcb_ldt_len
+ 			bcopy(pcb->pcb_ldt, new_ldt, nldt
  				* sizeof(union descriptor));
  			kmem_free(kernel_map, (vm_offset_t)pcb->pcb_ldt,
- 				pcb->pcb_ldt_len * sizeof(union descriptor));
+ 				nldt * sizeof(union descriptor));
  		} else {
  			bcopy(ldt, new_ldt, sizeof(ldt));
  		}
-  		pcb->pcb_ldt = (caddr_t)new_ldt;
- 		pcb->pcb_ldt_len = NEW_MAX_LD(largest_ld);
+		/*
+		 * copy pcb_ldt for peers, set their pcb_ldt_len = -1
+		 * to indicate this is a copy
+		 */
+		for (q = p->p_leader; q; q = q->p_peers) {
+			struct pcb *pcb2 = &q->p_addr->u_pcb;
+
+			pcb2->pcb_ldt = (caddr_t)new_ldt;
+			/* the leader gets the real pcb_ldt_len */
+			if (q == p->p_leader)
+				pcb2->pcb_ldt_len = NEW_MAX_LD(largest_ld);
+			else
+				pcb2->pcb_ldt_len = -1;
+			if (pcb2 == curpcb)
+			    set_user_ldt((struct pcb *)&p->p_leader->p_addr->u_pcb);
+		}
+#ifdef DIAGNOSTIC
+		if (!p->p_leader)
+			panic("i386_set_ldt: p_leader == 0");
+  		if (pcb->pcb_ldt != (caddr_t)new_ldt)
+			panic("i386_set_ldt: pcb->pcb_ldt != new_ldt");
+#endif
+#if 0
  		if (pcb == curpcb)
  		    set_user_ldt(pcb);
+#endif
   	}
 
 	/* Check descriptors for access violations */
Index: i386/i386/trap.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/i386/trap.c,v
retrieving revision 1.133
diff -u -r1.133 trap.c
--- trap.c	1999/01/06 23:05:36	1.133
+++ trap.c	1999/04/26 13:44:35
@@ -434,6 +434,29 @@
 
 		switch (type) {
 		case T_PAGEFLT:			/* page fault */
+			if (intr_nesting_level == 0) {
+				/*
+				 * Invalid %fs's and %gs's can be created using
+				 * procfs or PT_SETREGS or by invalidating the
+				 * underlying LDT entry.  This causes a fault
+				 * in kernel mode when the kernel attempts to
+				 * switch contexts.  Lose the bad context
+				 * (XXX) so that we can continue, and generate
+				 * a signal.
+				 */
+				if (frame.tf_eip == (int)cpu_switch_load_fs
+				    && curpcb->pcb_fs) {
+					curpcb->pcb_fs = 0;
+					psignal(p, SIGBUS);
+					return;
+				}
+				if (frame.tf_eip == (int)cpu_switch_load_gs
+				    && curpcb->pcb_gs) {
+					curpcb->pcb_gs = 0;
+					psignal(p, SIGBUS);
+					return;
+				}
+			}
 			(void) trap_pfault(&frame, FALSE, eva);
 			return;
 
Index: i386/i386/vm_machdep.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/i386/vm_machdep.c,v
retrieving revision 1.115
diff -u -r1.115 vm_machdep.c
--- vm_machdep.c	1999/01/06 23:05:37	1.115
+++ vm_machdep.c	1999/04/26 15:31:35
@@ -173,11 +173,32 @@
         /* Copy the LDT, if necessary. */
         if (pcb2->pcb_ldt != 0) {
                 union descriptor *new_ldt;
-                size_t len = pcb2->pcb_ldt_len * sizeof(union descriptor);
+                int nldt = pcb2->pcb_ldt_len;
 
-                new_ldt = (union descriptor *)kmem_alloc(kernel_map, len);
-                bcopy(pcb2->pcb_ldt, new_ldt, len);
-                pcb2->pcb_ldt = (caddr_t)new_ldt;
+		if (nldt == -1) {
+#ifdef DIAGNOSTIC
+			if (!p2->p_leader || p2->p_leader == p2)
+				panic("cpu_fork: pcb_ldt_len == -1 in leader");
+#endif
+			nldt = ((struct pcb *)&p2->p_leader->p_addr->u_pcb)->pcb_ldt_len;
+		}
+		if (p2->p_leader == p1->p_leader) {
+			/*
+			 * this is a rfork(RFTHREAD|...),
+			 * indicate pcb_ldt is a copy
+			 */
+			pcb2->pcb_ldt_len = -1;
+#ifdef DIAGNOSTIC
+			if (p2->p_leader == p2)
+				panic("cpu_fork: p2->p_leader == p1->p_leader and p2 is leader");
+#endif
+		} else {
+			new_ldt = (union descriptor *)kmem_alloc(kernel_map,
+				nldt * sizeof(union descriptor));
+			bcopy(pcb2->pcb_ldt, new_ldt,
+				nldt * sizeof(union descriptor));
+			pcb2->pcb_ldt = (caddr_t)new_ldt;
+		}
         }
 #endif
 
@@ -240,8 +261,13 @@
 			lldt(_default_ldt);
 			currentldt = _default_ldt;
 		}
-		kmem_free(kernel_map, (vm_offset_t)pcb->pcb_ldt,
-			pcb->pcb_ldt_len * sizeof(union descriptor));
+		if (pcb->pcb_ldt_len != -1)
+			kmem_free(kernel_map, (vm_offset_t)pcb->pcb_ldt,
+				pcb->pcb_ldt_len * sizeof(union descriptor));
+#ifdef DIAGNOSTIC
+		else if (!p->p_leader || p->p_leader == p)
+			panic("cpu_exit: pcb_ldt_len == -1 in leader");
+#endif
 		pcb->pcb_ldt_len = (int)pcb->pcb_ldt = 0;
 	}
 #endif
@@ -249,6 +275,25 @@
 	cpu_switch(p);
 	panic("cpu_exit");
 }
+
+#ifdef USER_LDT
+void
+cpu_kill9(p)
+	register struct proc *p;
+{
+	struct pcb *pcb = &p->p_addr->u_pcb; 
+	/*
+	 * hook to undo ldt sharing:
+	 * we are going to be SIGKILL'd so we can just forget our ldt
+	 */
+	if (pcb->pcb_ldt_len == -1)
+		pcb->pcb_ldt_len = (int)pcb->pcb_ldt = 0;
+#ifdef DIAGNOSTIC
+	if (pcb == curpcb)
+		panic("cpu_kill9: pcb == curpcb");
+#endif
+}
+#endif
 
 void
 cpu_wait(p)
cvs diff: Diffing pc98/i386
Index: pc98/i386/machdep.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/pc98/i386/machdep.c,v
retrieving revision 1.105.2.3
diff -u -r1.105.2.3 machdep.c
--- machdep.c	1999/02/19 14:39:52	1.105.2.3
+++ machdep.c	1999/04/26 16:34:38
@@ -828,13 +828,34 @@
 #ifdef USER_LDT
 	/* was i386_user_cleanup() in NetBSD */
 	if (pcb->pcb_ldt) {
-		if (pcb == curpcb) {
-			lldt(_default_ldt);
-			currentldt = _default_ldt;
+		if (pcb->pcb_ldt_len != -1) {
+#ifdef DIAGNOSTIC
+			if (p->p_leader != p)
+				panic("setregs: pcb_ldt_len != -1 in peer");
+#endif
+			if (!p->p_peers) {
+				if (pcb == curpcb) {
+					lldt(_default_ldt);
+					currentldt = _default_ldt;
+				}
+				kmem_free(kernel_map, (vm_offset_t)pcb->pcb_ldt,
+					pcb->pcb_ldt_len * sizeof(union descriptor));
+				pcb->pcb_ldt_len = (int)pcb->pcb_ldt = 0;
+			} else {
+				/* XXX what to do here? */
+				printf("setregs: leader exec()ing, keeping shared user ldt\n");
+			}
+#ifdef DIAGNOSTIC
+		} else if (!p->p_leader || p->p_leader == p) {
+			panic("setregs: pcb_ldt_len == -1 in leader");
+#endif
+		} else {
+			if (pcb == curpcb) {
+				lldt(_default_ldt);
+				currentldt = _default_ldt;
+			}
+			pcb->pcb_ldt_len = (int)pcb->pcb_ldt = 0;
 		}
-		kmem_free(kernel_map, (vm_offset_t)pcb->pcb_ldt,
-			pcb->pcb_ldt_len * sizeof(union descriptor));
-		pcb->pcb_ldt_len = (int)pcb->pcb_ldt = 0;
  	}
 #endif
   
 Happy hacking,
-- 
Juergen Lock <nox.foo@jelal.kn-bremen.de>
(remove dot foo from address to reply)
Comment 2 Luoqi Chen 1999-04-30 17:05:13 UTC
User LDT sharing should really be done in the machine dependent layer.
I have an implementation based on -current (I don't have any machine
running -stable) that you may want to take a look at; the patch is at
http://www.freebsd.org/~luoqi

There are still two problems with this implementation:
- It is incomplete for SMP. We need something similar to TLB shootdown
  when modifying the ldt table.
- It doesn't work correctly in the following case:
  1. process A initially doesn't have a user ldt table
  2. process A forks process B (RFMEM)
  3. process B calls i386_set_ldt()
  Now process B has a user ldt table, but it is inaccessible to A. I can
  see 3 solutions to this problem:
  1. Allocate a user ldt table for all processes.
       This is not really an acceptable solution, as it penalizes everyone
       else for the benefit of a few.
  2. Define another rfork flag RFLDT.
       The problem with this solution is that the flag is too machine-specific.
  3. Any process that wants to share a user ldt with its descendants should
     call i386_set_ldt() prior to any fork.
       This is a workaround in the user application, but should work
       well.

-lq
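
The failing case and workaround 3 from this comment can be modelled in user
space roughly as below.  This is a sketch under stated assumptions, not the
patch itself: struct toy_ldt, struct toy_pcb, toy_fork(), toy_set_ldt() and
toy_free_ldt() are hypothetical names, and the model only tracks a reference
count and a length rather than real descriptors.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space model of a reference-counted LDT. */
struct toy_ldt {
	int refcnt;	/* number of processes pointing at this table */
	int len;	/* number of entries */
};

struct toy_pcb {
	struct toy_ldt *ldt;	/* NULL until the process first sets an entry */
};

/* Model of fork: share the table when share_mem is set, else copy it. */
static void
toy_fork(const struct toy_pcb *parent, struct toy_pcb *child, int share_mem)
{
	child->ldt = parent->ldt;
	if (child->ldt == NULL)
		return;			/* nothing to share or copy yet */
	if (share_mem) {
		child->ldt->refcnt++;
	} else {
		struct toy_ldt *copy = malloc(sizeof(*copy));

		assert(copy != NULL);
		copy->refcnt = 1;
		copy->len = parent->ldt->len;
		child->ldt = copy;
	}
}

/* Model of i386_set_ldt(): allocate a table on first use, then resize. */
static void
toy_set_ldt(struct toy_pcb *pcb, int len)
{
	if (pcb->ldt == NULL) {
		pcb->ldt = malloc(sizeof(*pcb->ldt));
		assert(pcb->ldt != NULL);
		pcb->ldt->refcnt = 1;
	}
	pcb->ldt->len = len;
}

/* Drop one reference, freeing the table when the last user goes away. */
static void
toy_free_ldt(struct toy_pcb *pcb)
{
	if (pcb->ldt != NULL && --pcb->ldt->refcnt == 0)
		free(pcb->ldt);
	pcb->ldt = NULL;
}
```

If A forks B with sharing before either has a table, a later toy_set_ldt()
on B leaves A's pointer NULL, which is the failure described above.  If A
calls toy_set_ldt() first, both end up on the same table with a reference
count of 2, and a later change made through B is visible through A.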
Comment 3 Juergen Lock 1999-04-30 22:25:27 UTC
On Fri, Apr 30, 1999 at 12:05:13PM -0400, Luoqi Chen wrote:
> User LDT sharing should really be done in the machine dependent layer.

Well, was my patch not in the machine dependent layer?

> I have an implementation based on -current (I don't have any machine
> running -stable), you may want to take a look at, the patch is at
> http://www.freebsd.org/~luoqi
> 
> There are still two problems with this implementation:
> - It is incomplete for SMP. We need something similar to TLB shootdown
>   when modifying the ldt table.
> - It doesn't work correctly in the following case:
>   1. process A initially doesn't have a user ldt table
>   2. process A forks process B (RFMEM)
>   3. process B calls i386_set_ldt()
>   now process B has a user ldt table, but inaccessible to A. I can
>   see 3 solutions to this problem:
>   1. Allocate a user ldt table for all processes.
>        This is not really an acceptable solution, it penalize everyone
>        else for the benefit of a few.
>   2. Define another rfork flag RFLDT.
>        The problem with solution is the flag is too machine-specific.
>   3. Any process wants to share user ldt with its descendants should
>      call i386_set_ldt() prior to any fork.
>        This is a workaround in the user application, but should work
>        well.
> 
> -lq

 Anyway I have back-ported your patch to 3.1-stable and it seems
to work, so whichever one gets committed is ok with me...

 I added a diff for pc98/i386/machdep.c, and the cpu_switch_load_{f,g}s
trap 12 handler. (Sorry no http location, maybe you can add this to
your page?  thanx.)

cvs diff: Diffing sys/alpha/alpha
Index: sys/alpha/alpha/vm_machdep.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/alpha/alpha/vm_machdep.c,v
retrieving revision 1.7.2.1
diff -u -r1.7.2.1 vm_machdep.c
--- vm_machdep.c	1999/01/27 20:51:39	1.7.2.1
+++ vm_machdep.c	1999/04/30 18:37:45
@@ -114,11 +114,15 @@
  * ready to run and return to user mode.
  */
 void
-cpu_fork(p1, p2)
+cpu_fork(p1, p2, flags)
 	register struct proc *p1, *p2;
+	int flags;
 {
 	struct user *up = p2->p_addr;
 	int i;
+
+	if ((flags & RFPROC) == 0)
+		return;
 
 	p2->p_md.md_tf = p1->p_md.md_tf;
 	p2->p_md.md_flags = p1->p_md.md_flags & MDP_FPUSED;
cvs diff: Diffing sys/i386/i386
Index: sys/i386/i386/genassym.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/i386/genassym.c,v
retrieving revision 1.62.2.1
diff -u -r1.62.2.1 genassym.c
--- genassym.c	1999/02/22 15:59:39	1.62.2.1
+++ genassym.c	1999/04/30 18:38:27
@@ -125,7 +125,9 @@
 	printf("#define\tPCB_EBX %#x\n", OS(pcb, pcb_ebx));
 	printf("#define\tPCB_EIP %#x\n", OS(pcb, pcb_eip));
 	printf("#define\tTSS_ESP0 %#x\n", OS(i386tss, tss_esp0));
+#ifdef USER_LDT
 	printf("#define\tPCB_USERLDT %#x\n", OS(pcb, pcb_ldt));
+#endif
 	printf("#define\tPCB_FS %#x\n", OS(pcb, pcb_fs));
 	printf("#define\tPCB_GS %#x\n", OS(pcb, pcb_gs));
 #ifdef VM86
Index: sys/i386/i386/machdep.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/i386/machdep.c,v
retrieving revision 1.322.2.4
diff -u -r1.322.2.4 machdep.c
--- machdep.c	1999/02/17 13:08:41	1.322.2.4
+++ machdep.c	1999/04/30 18:39:41
@@ -814,15 +814,7 @@
 
 #ifdef USER_LDT
 	/* was i386_user_cleanup() in NetBSD */
-	if (pcb->pcb_ldt) {
-		if (pcb == curpcb) {
-			lldt(_default_ldt);
-			currentldt = _default_ldt;
-		}
-		kmem_free(kernel_map, (vm_offset_t)pcb->pcb_ldt,
-			pcb->pcb_ldt_len * sizeof(union descriptor));
-		pcb->pcb_ldt_len = (int)pcb->pcb_ldt = 0;
- 	}
+	user_ldt_free(pcb);
 #endif
   
 	bzero((char *)regs, sizeof(struct trapframe));
Index: sys/i386/i386/sys_machdep.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/i386/sys_machdep.c,v
retrieving revision 1.38
diff -u -r1.38 sys_machdep.c
--- sys_machdep.c	1998/12/07 21:58:19	1.38
+++ sys_machdep.c	1999/04/30 19:03:34
@@ -41,6 +41,7 @@
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/sysproto.h>
+#include <sys/malloc.h>
 #include <sys/proc.h>
 
 #include <vm/vm.h>
@@ -65,7 +66,6 @@
 
 
 #ifdef USER_LDT
-void set_user_ldt	__P((struct pcb *pcb));
 static int i386_get_ldt	__P((struct proc *, char *));
 static int i386_set_ldt	__P((struct proc *, char *));
 #endif
@@ -259,13 +259,72 @@
 void
 set_user_ldt(struct pcb *pcb)
 {
-	gdt_segs[GUSERLDT_SEL].ssd_base = (unsigned)pcb->pcb_ldt;
-	gdt_segs[GUSERLDT_SEL].ssd_limit = (pcb->pcb_ldt_len * sizeof(union descriptor)) - 1;
-	ssdtosd(&gdt_segs[GUSERLDT_SEL], &gdt[GUSERLDT_SEL].sd);
+	struct pcb_ldt *pcb_ldt = pcb->pcb_ldt;
+#ifdef SMP
+	gdt[cpuid * NGDT + GUSERLDT_SEL].sd = pcb_ldt->ldt_sd;
+#else
+	gdt[GUSERLDT_SEL].sd = pcb_ldt->ldt_sd;
+#endif
 	lldt(GSEL(GUSERLDT_SEL, SEL_KPL));
 	currentldt = GSEL(GUSERLDT_SEL, SEL_KPL);
 }
 
+struct pcb_ldt *
+user_ldt_alloc(struct pcb *pcb, int len)
+{
+	struct pcb_ldt *pcb_ldt, *new_ldt;
+
+	MALLOC(new_ldt, struct pcb_ldt *, sizeof(struct pcb_ldt),
+		M_SUBPROC, M_WAITOK);
+	if (new_ldt == NULL)
+		return NULL;
+
+	new_ldt->ldt_len = len = NEW_MAX_LD(len);
+	new_ldt->ldt_base = (caddr_t)kmem_alloc(kernel_map,
+		len * sizeof(union descriptor));
+	if (new_ldt->ldt_base == NULL) {
+		FREE(new_ldt, M_SUBPROC);
+		return NULL;
+	}
+	new_ldt->ldt_refcnt = 1;
+	new_ldt->ldt_active = 0;
+
+	gdt_segs[GUSERLDT_SEL].ssd_base = (unsigned)new_ldt->ldt_base;
+	gdt_segs[GUSERLDT_SEL].ssd_limit = len * sizeof(union descriptor) - 1;
+	ssdtosd(&gdt_segs[GUSERLDT_SEL], &new_ldt->ldt_sd);
+
+	if ((pcb_ldt = pcb->pcb_ldt)) {
+		if (len > pcb_ldt->ldt_len)
+			len = pcb_ldt->ldt_len;
+		bcopy(pcb_ldt->ldt_base, new_ldt->ldt_base,
+			len * sizeof(union descriptor));
+	} else {
+		bcopy(ldt, new_ldt->ldt_base, sizeof(ldt));
+	}
+	return new_ldt;
+}
+
+void
+user_ldt_free(struct pcb *pcb)
+{
+	struct pcb_ldt *pcb_ldt = pcb->pcb_ldt;
+
+	if (pcb_ldt == NULL)
+		return;
+
+	if (pcb == curpcb) {
+		lldt(_default_ldt);
+		currentldt = _default_ldt;
+	}
+
+	if (--pcb_ldt->ldt_refcnt == 0) {
+		kmem_free(kernel_map, (vm_offset_t)pcb_ldt->ldt_base,
+			pcb_ldt->ldt_len * sizeof(union descriptor));
+		FREE(pcb_ldt, M_SUBPROC);
+	}
+	pcb->pcb_ldt = NULL;
+}
+
 struct i386_get_ldt_args {
 	int start;
 	union descriptor *desc;
@@ -279,6 +338,7 @@
 {
 	int error = 0;
 	struct pcb *pcb = &p->p_addr->u_pcb;
+	struct pcb_ldt *pcb_ldt = pcb->pcb_ldt;
 	int nldt, num;
 	union descriptor *lp;
 	int s;
@@ -299,10 +359,10 @@
 
 	s = splhigh();
 
-	if (pcb->pcb_ldt) {
-		nldt = pcb->pcb_ldt_len;
+	if (pcb_ldt) {
+		nldt = pcb_ldt->ldt_len;
 		num = min(uap->num, nldt);
-		lp = &((union descriptor *)(pcb->pcb_ldt))[uap->start];
+		lp = &((union descriptor *)(pcb_ldt->ldt_base))[uap->start];
 	} else {
 		nldt = sizeof(ldt)/sizeof(ldt[0]);
 		num = min(uap->num, nldt);
@@ -335,6 +395,7 @@
 	int error = 0, i, n;
  	int largest_ld;
 	struct pcb *pcb = &p->p_addr->u_pcb;
+	struct pcb_ldt *pcb_ldt = pcb->pcb_ldt;
 	int s;
 	struct i386_set_ldt_args ua, *uap;
 
@@ -348,36 +409,37 @@
 	    uap->start, uap->num, (void *)uap->desc);
 #endif
 
- 	/* verify range of descriptors to modify */
- 	if ((uap->start < 0) || (uap->start >= MAX_LD) || (uap->num < 0) ||
- 		(uap->num > MAX_LD))
- 	{
+	/* verify range of descriptors to modify */
+	if ((uap->start < 0) || (uap->start >= MAX_LD) || (uap->num < 0) ||
+		(uap->num > MAX_LD))
+	{
+		return(EINVAL);
+	}
+	largest_ld = uap->start + uap->num - 1;
+	if (largest_ld >= MAX_LD)
  		return(EINVAL);
- 	}
- 	largest_ld = uap->start + uap->num - 1;
- 	if (largest_ld >= MAX_LD)
-  		return(EINVAL);
-  
-  	/* allocate user ldt */
- 	if (!pcb->pcb_ldt || (largest_ld >= pcb->pcb_ldt_len)) {
- 		union descriptor *new_ldt = (union descriptor *)kmem_alloc(
- 			kernel_map, SIZE_FROM_LARGEST_LD(largest_ld));
- 		if (new_ldt == NULL) {
- 			return ENOMEM;
- 		}
- 		if (pcb->pcb_ldt) {
- 			bcopy(pcb->pcb_ldt, new_ldt, pcb->pcb_ldt_len
- 				* sizeof(union descriptor));
- 			kmem_free(kernel_map, (vm_offset_t)pcb->pcb_ldt,
- 				pcb->pcb_ldt_len * sizeof(union descriptor));
- 		} else {
- 			bcopy(ldt, new_ldt, sizeof(ldt));
- 		}
-  		pcb->pcb_ldt = (caddr_t)new_ldt;
- 		pcb->pcb_ldt_len = NEW_MAX_LD(largest_ld);
- 		if (pcb == curpcb)
- 		    set_user_ldt(pcb);
-  	}
+ 
+ 	/* allocate user ldt */
+	if (!pcb_ldt || (largest_ld >= pcb_ldt->ldt_len)) {
+		struct pcb_ldt *new_ldt = user_ldt_alloc(pcb, largest_ld);
+		if (new_ldt == NULL) {
+			return ENOMEM;
+		}
+	        if (pcb_ldt) {
+	                pcb_ldt->ldt_sd = new_ldt->ldt_sd;
+#ifdef SMP
+		/* signal other cpus to reload ldt */
+#endif
+			kmem_free(kernel_map, (vm_offset_t)pcb_ldt->ldt_base,
+				pcb_ldt->ldt_len * sizeof(union descriptor));
+			pcb_ldt->ldt_base = new_ldt->ldt_base;
+			pcb_ldt->ldt_len = new_ldt->ldt_len;
+			FREE(new_ldt, M_SUBPROC);
+	        } else
+			pcb->pcb_ldt = pcb_ldt = new_ldt;
+		if (pcb == curpcb)
+			set_user_ldt(pcb);
+	}
 
 	/* Check descriptors for access violations */
 	for (i = 0, n = uap->start; i < uap->num; i++, n++) {
@@ -388,70 +450,70 @@
 			return(error);
 
 		switch (desc.sd.sd_type) {
- 		case SDT_SYSNULL:	/* system null */ 
- 			desc.sd.sd_p = 0;
-  			break;
- 		case SDT_SYS286TSS: /* system 286 TSS available */
- 		case SDT_SYSLDT:    /* system local descriptor table */
- 		case SDT_SYS286BSY: /* system 286 TSS busy */
- 		case SDT_SYSTASKGT: /* system task gate */
- 		case SDT_SYS286IGT: /* system 286 interrupt gate */
- 		case SDT_SYS286TGT: /* system 286 trap gate */
- 		case SDT_SYSNULL2:  /* undefined by Intel */ 
- 		case SDT_SYS386TSS: /* system 386 TSS available */
- 		case SDT_SYSNULL3:  /* undefined by Intel */
- 		case SDT_SYS386BSY: /* system 386 TSS busy */
- 		case SDT_SYSNULL4:  /* undefined by Intel */ 
- 		case SDT_SYS386IGT: /* system 386 interrupt gate */
- 		case SDT_SYS386TGT: /* system 386 trap gate */
- 		case SDT_SYS286CGT: /* system 286 call gate */ 
- 		case SDT_SYS386CGT: /* system 386 call gate */
- 			/* I can't think of any reason to allow a user proc
- 			 * to create a segment of these types.  They are
- 			 * for OS use only.
- 			 */
-     	    	    	return EACCES;
- 
- 		/* memory segment types */
- 		case SDT_MEMEC:   /* memory execute only conforming */
- 		case SDT_MEMEAC:  /* memory execute only accessed conforming */
- 		case SDT_MEMERC:  /* memory execute read conforming */
- 		case SDT_MEMERAC: /* memory execute read accessed conforming */
-                         /* Must be "present" if executable and conforming. */
-                         if (desc.sd.sd_p == 0)
-                                 return (EACCES);
+		case SDT_SYSNULL:	/* system null */ 
+			desc.sd.sd_p = 0;
  			break;
- 		case SDT_MEMRO:   /* memory read only */
- 		case SDT_MEMROA:  /* memory read only accessed */
- 		case SDT_MEMRW:   /* memory read write */
- 		case SDT_MEMRWA:  /* memory read write accessed */
- 		case SDT_MEMROD:  /* memory read only expand dwn limit */
- 		case SDT_MEMRODA: /* memory read only expand dwn lim accessed */
- 		case SDT_MEMRWD:  /* memory read write expand dwn limit */  
- 		case SDT_MEMRWDA: /* memory read write expand dwn lim acessed */
- 		case SDT_MEME:    /* memory execute only */ 
- 		case SDT_MEMEA:   /* memory execute only accessed */
- 		case SDT_MEMER:   /* memory execute read */
- 		case SDT_MEMERA:  /* memory execute read accessed */
+		case SDT_SYS286TSS: /* system 286 TSS available */
+		case SDT_SYSLDT:    /* system local descriptor table */
+		case SDT_SYS286BSY: /* system 286 TSS busy */
+		case SDT_SYSTASKGT: /* system task gate */
+		case SDT_SYS286IGT: /* system 286 interrupt gate */
+		case SDT_SYS286TGT: /* system 286 trap gate */
+		case SDT_SYSNULL2:  /* undefined by Intel */ 
+		case SDT_SYS386TSS: /* system 386 TSS available */
+		case SDT_SYSNULL3:  /* undefined by Intel */
+		case SDT_SYS386BSY: /* system 386 TSS busy */
+		case SDT_SYSNULL4:  /* undefined by Intel */ 
+		case SDT_SYS386IGT: /* system 386 interrupt gate */
+		case SDT_SYS386TGT: /* system 386 trap gate */
+		case SDT_SYS286CGT: /* system 286 call gate */ 
+		case SDT_SYS386CGT: /* system 386 call gate */
+			/* I can't think of any reason to allow a user proc
+			 * to create a segment of these types.  They are
+			 * for OS use only.
+			 */
+    	    	    	return EACCES;
+
+		/* memory segment types */
+		case SDT_MEMEC:   /* memory execute only conforming */
+		case SDT_MEMEAC:  /* memory execute only accessed conforming */
+		case SDT_MEMERC:  /* memory execute read conforming */
+		case SDT_MEMERAC: /* memory execute read accessed conforming */
+			/* Must be "present" if executable and conforming. */
+			if (desc.sd.sd_p == 0)
+				return (EACCES);
+			break;
+		case SDT_MEMRO:   /* memory read only */
+		case SDT_MEMROA:  /* memory read only accessed */
+		case SDT_MEMRW:   /* memory read write */
+		case SDT_MEMRWA:  /* memory read write accessed */
+		case SDT_MEMROD:  /* memory read only expand dwn limit */
+		case SDT_MEMRODA: /* memory read only expand dwn lim accessed */
+		case SDT_MEMRWD:  /* memory read write expand dwn limit */  
+		case SDT_MEMRWDA: /* memory read write expand dwn lim acessed */
+		case SDT_MEME:    /* memory execute only */ 
+		case SDT_MEMEA:   /* memory execute only accessed */
+		case SDT_MEMER:   /* memory execute read */
+		case SDT_MEMERA:  /* memory execute read accessed */
 			break;
 		default:
 			return(EINVAL);
 			/*NOTREACHED*/
 		}
  
- 		/* Only user (ring-3) descriptors may be present. */
- 		if ((desc.sd.sd_p != 0) && (desc.sd.sd_dpl != SEL_UPL))
- 			return (EACCES);
+		/* Only user (ring-3) descriptors may be present. */
+		if ((desc.sd.sd_p != 0) && (desc.sd.sd_dpl != SEL_UPL))
+			return (EACCES);
 	}
 
 	s = splhigh();
 
 	/* Fill in range */
- 	error = copyin(uap->desc, 
- 		 &((union descriptor *)(pcb->pcb_ldt))[uap->start],
- 		uap->num * sizeof(union descriptor));
- 	if (!error)
-  		p->p_retval[0] = uap->start;
+	error = copyin(uap->desc, 
+		 &((union descriptor *)(pcb_ldt->ldt_base))[uap->start],
+		uap->num * sizeof(union descriptor));
+	if (!error)
+ 		p->p_retval[0] = uap->start;
 
 	splx(s);
 	return(error);
Index: sys/i386/i386/vm_machdep.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/i386/vm_machdep.c,v
retrieving revision 1.115
diff -u -r1.115 vm_machdep.c
--- vm_machdep.c	1999/01/06 23:05:37	1.115
+++ vm_machdep.c	1999/04/30 19:11:14
@@ -57,6 +57,7 @@
 #include <sys/vmmeter.h>
 #include <sys/kernel.h>
 #include <sys/sysctl.h>
+#include <sys/unistd.h>
 
 #include <machine/clock.h>
 #include <machine/cpu.h>
@@ -113,10 +114,29 @@
  * ready to run and return to user mode.
  */
 void
-cpu_fork(p1, p2)
+cpu_fork(p1, p2, flags)
 	register struct proc *p1, *p2;
+	int flags;
 {
-	struct pcb *pcb2 = &p2->p_addr->u_pcb;
+	struct pcb *pcb2;
+ 
+	if ((flags & RFPROC) == 0) {
+#ifdef USER_LDT
+		if ((flags & RFMEM) == 0) {
+			/* unshare user LDT */
+			struct pcb *pcb1 = &p1->p_addr->u_pcb;
+			struct pcb_ldt *pcb_ldt = pcb1->pcb_ldt;
+			if (pcb_ldt && pcb_ldt->ldt_refcnt > 1) {
+				pcb_ldt = user_ldt_alloc(pcb1,pcb_ldt->ldt_len);
+				user_ldt_free(pcb1);
+				pcb1->pcb_ldt = pcb_ldt;
+				if (pcb1 == curpcb)
+					set_user_ldt(pcb1);
+			}
+		}
+#endif
+	        return;
+	}
 
 #if NNPX > 0
 	/* Ensure that p1's pcb is up to date. */
@@ -126,6 +146,7 @@
 
 	/* Copy p1's pcb. */
 	p2->p_addr->u_pcb = p1->p_addr->u_pcb;
+	pcb2 = &p2->p_addr->u_pcb;
 
 	/*
 	 * Create a new fresh stack for the new process.
@@ -153,7 +174,6 @@
 	pcb2->pcb_eip = (int)fork_trampoline;
 	/*
 	 * pcb2->pcb_ldt:	duplicated below, if necessary.
-	 * pcb2->pcb_ldt_len:	cloned above.
 	 * pcb2->pcb_savefpu:	cloned above.
 	 * pcb2->pcb_flags:	cloned above (always 0 here?).
 	 * pcb2->pcb_onfault:	cloned above (always NULL here?).
@@ -172,12 +192,12 @@
 #ifdef USER_LDT
         /* Copy the LDT, if necessary. */
         if (pcb2->pcb_ldt != 0) {
-                union descriptor *new_ldt;
-                size_t len = pcb2->pcb_ldt_len * sizeof(union descriptor);
-
-                new_ldt = (union descriptor *)kmem_alloc(kernel_map, len);
-                bcopy(pcb2->pcb_ldt, new_ldt, len);
-                pcb2->pcb_ldt = (caddr_t)new_ldt;
+		if (flags & RFMEM) {
+			pcb2->pcb_ldt->ldt_refcnt++;
+		} else {
+			pcb2->pcb_ldt = user_ldt_alloc(pcb2,
+				pcb2->pcb_ldt->ldt_len);
+		}
         }
 #endif
 
@@ -235,15 +255,7 @@
 	}
 #endif
 #ifdef USER_LDT
-	if (pcb->pcb_ldt != 0) {
-		if (pcb == curpcb) {
-			lldt(_default_ldt);
-			currentldt = _default_ldt;
-		}
-		kmem_free(kernel_map, (vm_offset_t)pcb->pcb_ldt,
-			pcb->pcb_ldt_len * sizeof(union descriptor));
-		pcb->pcb_ldt_len = (int)pcb->pcb_ldt = 0;
-	}
+	user_ldt_free(pcb);
 #endif
 	cnt.v_swtch++;
 	cpu_switch(p);
cvs diff: Diffing sys/i386/include
Index: sys/i386/include/pcb.h
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/include/pcb.h,v
retrieving revision 1.26
diff -u -r1.26 pcb.h
--- pcb.h	1998/02/03 21:27:50	1.26
+++ pcb.h	1999/04/30 19:12:53
@@ -53,8 +53,11 @@
 	int	pcb_esp;
 	int	pcb_ebx;
 	int	pcb_eip;
-	caddr_t	pcb_ldt;		/* per process (user) LDT */
-	int	pcb_ldt_len;		/* number of LDT entries */
+#ifdef USER_LDT
+	struct  pcb_ldt *pcb_ldt;       /* per process (user) LDT */
+#else
+	struct  pcb_ldt *pcb_ldt_dontuse;
+#endif
 	struct	save87	pcb_savefpu;	/* floating point state for 287/387 */
 	u_char	pcb_flags;
 #define	FP_SOFTFP	0x01	/* process using software fltng pnt emulator */
@@ -71,7 +74,7 @@
 #else
 	struct	pcb_ext	*pcb_ext_dontuse;
 #endif
-	u_long	__pcb_spare[1];	/* adjust to avoid core dump size changes */
+	u_long	__pcb_spare[2];	/* adjust to avoid core dump size changes */
 };
 
 /*
Index: sys/i386/include/pcb_ext.h
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/include/pcb_ext.h,v
retrieving revision 1.1
diff -u -r1.1 pcb_ext.h
--- pcb_ext.h	1997/08/09 04:55:05	1.1
+++ pcb_ext.h	1999/04/30 19:13:58
@@ -43,4 +43,22 @@
 	struct	vm86_kernel ext_vm86;	/* vm86 area */
 };
 
+struct pcb_ldt {
+	caddr_t ldt_base;
+	int     ldt_len;
+	int     ldt_refcnt;
+	u_long  ldt_active;
+	struct  segment_descriptor ldt_sd;
+};
+
+#ifdef KERNEL
+
+#ifdef USER_LDT
+void set_user_ldt __P((struct pcb *));
+struct pcb_ldt *user_ldt_alloc __P((struct pcb *, int));
+void user_ldt_free __P((struct pcb *));
+#endif
+
+#endif
+
 #endif /* _I386_PCB_EXT_H_ */
cvs diff: Diffing sys/i386/include/pc
cvs diff: Diffing sys/kern
Index: sys/kern/kern_fork.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/kern/kern_fork.c,v
retrieving revision 1.54.2.2
diff -u -r1.54.2.2 kern_fork.c
--- kern_fork.c	1999/03/02 00:42:08	1.54.2.2
+++ kern_fork.c	1999/04/30 19:14:56
@@ -162,16 +162,7 @@
 	 */
 	if ((flags & RFPROC) == 0) {
 
-		/*
-		 * Divorce the memory, if it is shared, essentially
-		 * this changes shared memory amongst threads, into
-		 * COW locally.
-		 */
-		if ((flags & RFMEM) == 0) {
-			if (p1->p_vmspace->vm_refcnt > 1) {
-				vmspace_unshare(p1);
-			}
-		}
+		vm_fork(p1, 0, flags);
 
 		/*
 		 * Close all file descriptors.
cvs diff: Diffing sys/sys
Index: sys/sys/proc.h
===================================================================
RCS file: /home/cvs/cvs/src/sys/sys/proc.h,v
retrieving revision 1.66.2.2
diff -u -r1.66.2.2 proc.h
--- proc.h	1999/02/23 13:44:36	1.66.2.2
+++ proc.h	1999/04/30 19:15:32
@@ -375,7 +375,7 @@
 
 void	cpu_exit __P((struct proc *)) __dead2;
 void	exit1 __P((struct proc *, int)) __dead2;
-void	cpu_fork __P((struct proc *, struct proc *));
+void	cpu_fork __P((struct proc *, struct proc *, int));
 int	fork1 __P((struct proc *, int));
 int	trace_req __P((struct proc *));
 void	cpu_wait __P((struct proc *));
cvs diff: Diffing sys/vm
Index: sys/vm/vm_glue.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/vm/vm_glue.c,v
retrieving revision 1.80.2.1
diff -u -r1.80.2.1 vm_glue.c
--- vm_glue.c	1999/01/27 20:51:43	1.80.2.1
+++ vm_glue.c	1999/04/30 19:17:19
@@ -208,6 +208,21 @@
 {
 	register struct user *up;
 
+	if ((flags & RFPROC) == 0) {
+		/*
+		 * Divorce the memory, if it is shared, essentially
+		 * this changes shared memory amongst threads, into
+		 * COW locally.
+		 */
+		if ((flags & RFMEM) == 0) {
+			if (p1->p_vmspace->vm_refcnt > 1) {
+				vmspace_unshare(p1);
+			}
+		}
+		cpu_fork(p1, p2, flags);
+		return;
+	}
+
 	if (flags & RFMEM) {
 		p2->p_vmspace = p1->p_vmspace;
 		p1->p_vmspace->vm_refcnt++;
@@ -257,7 +272,7 @@
 	 * cpu_fork will copy and update the pcb, set up the kernel stack,
 	 * and make the child ready to run.
 	 */
-	cpu_fork(p1, p2);
+	cpu_fork(p1, p2, flags);
 }
 
 /*

Index: sys/pc98/i386/machdep.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/i386/machdep.c,v
retrieving revision 1.322.2.4
diff -u -r1.322.2.4 machdep.c
--- machdep.c	1999/02/17 13:08:41	1.322.2.4
+++ machdep.c	1999/04/30 18:39:41
@@ -814,15 +814,7 @@
 
 #ifdef USER_LDT
 	/* was i386_user_cleanup() in NetBSD */
-	if (pcb->pcb_ldt) {
-		if (pcb == curpcb) {
-			lldt(_default_ldt);
-			currentldt = _default_ldt;
-		}
-		kmem_free(kernel_map, (vm_offset_t)pcb->pcb_ldt,
-			pcb->pcb_ldt_len * sizeof(union descriptor));
-		pcb->pcb_ldt_len = (int)pcb->pcb_ldt = 0;
- 	}
+	user_ldt_free(pcb);
 #endif
   
 	bzero((char *)regs, sizeof(struct trapframe));
Index: sys/i386/i386/trap.c
===================================================================
RCS file: /home/cvs/cvs/src/sys/i386/i386/trap.c,v
retrieving revision 1.133
diff -u -r1.133 trap.c
--- trap.c	1999/01/06 23:05:36	1.133
+++ trap.c	1999/04/26 13:44:35
@@ -434,6 +434,29 @@
 
 		switch (type) {
 		case T_PAGEFLT:			/* page fault */
+			if (intr_nesting_level == 0) {
+				/*
+				 * Invalid %fs's and %gs's can be created using
+				 * procfs or PT_SETREGS or by invalidating the
+				 * underlying LDT entry.  This causes a fault
+				 * in kernel mode when the kernel attempts to
+				 * switch contexts.  Lose the bad context
+				 * (XXX) so that we can continue, and generate
+				 * a signal.
+				 */
+				if (frame.tf_eip == (int)cpu_switch_load_fs
+				    && curpcb->pcb_fs) {
+					curpcb->pcb_fs = 0;
+					psignal(p, SIGBUS);
+					return;
+				}
+				if (frame.tf_eip == (int)cpu_switch_load_gs
+				    && curpcb->pcb_gs) {
+					curpcb->pcb_gs = 0;
+					psignal(p, SIGBUS);
+					return;
+				}
+			}
 			(void) trap_pfault(&frame, FALSE, eva);
 			return;
 
 Regards,
-- 
Juergen Lock <nox.foo@jelal.kn-bremen.de>
(remove dot foo from address to reply)
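
For readers following the vm_machdep.c hunks above: the patch replaces the
unconditional LDT copy in cpu_fork() with a share-or-copy decision keyed on
RFMEM, backed by a reference count. Below is a minimal userland sketch of
that idea only. The names ldt_sketch, ldt_sketch_new(), ldt_sketch_fork(),
ldt_sketch_free() and SKETCH_RFMEM are hypothetical and invented for
illustration; they are not the kernel's struct pcb_ldt, user_ldt_alloc() or
user_ldt_free(), and copying the parent's table contents in the non-shared
case is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the RFMEM flag; not the value from <sys/unistd.h>. */
#define SKETCH_RFMEM 0x1

/* Simplified, hypothetical analogue of a reference-counted descriptor table. */
struct ldt_sketch {
	unsigned char *base;	/* table contents */
	size_t len;		/* table size in bytes (assumed > 0) */
	int refcnt;		/* number of owners sharing this table */
};

/* Allocate a zeroed table with a single owner; returns NULL on failure. */
struct ldt_sketch *
ldt_sketch_new(size_t len)
{
	struct ldt_sketch *ldt = malloc(sizeof(*ldt));

	if (ldt == NULL)
		return (NULL);
	ldt->base = calloc(1, len);
	if (ldt->base == NULL) {
		free(ldt);
		return (NULL);
	}
	ldt->len = len;
	ldt->refcnt = 1;
	return (ldt);
}

/*
 * Decide what the child gets: with SKETCH_RFMEM the parent's table is
 * shared and its reference count is bumped; otherwise the child gets a
 * private copy with its own count of 1.
 */
struct ldt_sketch *
ldt_sketch_fork(struct ldt_sketch *parent, int flags)
{
	struct ldt_sketch *copy;

	if (parent == NULL)
		return (NULL);
	if (flags & SKETCH_RFMEM) {
		parent->refcnt++;
		return (parent);
	}
	copy = ldt_sketch_new(parent->len);
	if (copy != NULL)
		memcpy(copy->base, parent->base, parent->len);
	return (copy);
}

/* Drop one reference; returns 1 if the table was actually freed, else 0. */
int
ldt_sketch_free(struct ldt_sketch *ldt)
{
	if (ldt == NULL)
		return (0);
	assert(ldt->refcnt > 0);
	if (--ldt->refcnt > 0)
		return (0);
	free(ldt->base);
	free(ldt);
	return (1);
}
```

In this sketch a table is released only when its last owner drops it, which
is why the exit and exec paths in the patch can call a single free routine
instead of open-coding kmem_free().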
Comment 4 luoqi 1999-07-23 20:29:23 UTC
> > > I have an implementation based on -current (I don't have any machine
> > > running -stable), you may want to take a look at, the patch is at
> > > http://www.freebsd.org/~luoqi
>
> I just found out this doesn't work if VM86 is not used; here's the fix:
>
> Index: sys/i386/i386/vm_machdep.c
> @@ -69,6 +69,9 @@
>  #include <machine/pcb_ext.h>
>  #include <machine/vm86.h>
>  #endif
> +#ifdef USER_LDT
> +#include <machine/pcb_ext.h>
> +#endif
>  
>  #include <vm/vm.h>
>  #include <vm/vm_param.h>
>
Yes, I'll update the patch set to include this change and to use the
smp_rendezvous() function to synchronize among CPUs.

Thanks
-lq
Comment 5 pfeifer 2000-07-01 01:05:25 UTC
Jürgen, Luoqi, as far as I can see this PR can be closed now?

 o For 3.3 we have patches in the Wine port thanks to Jürgen.

 o 3.4 and 3.5 already have the change in the kernel?

 o 4.0 and above as well as 5-STABLE already have appropriate changes in
   the kernel itself.

Can you please confirm the above or correct me, so that I can update the
README in the Wine port accordingly?

Gerald
-- 
Gerald "Jerry" pfeifer@dbai.tuwien.ac.at http://www.dbai.tuwien.ac.at/~pfeifer/
Have a look at http://petition.eurolinux.org -- it's not about Linux, btw!
Comment 6 pfeifer 2000-08-07 23:47:22 UTC
Unfortunately, I have not heard back from Jürgen or Luoqi, so I'm
going to analyse this piece by piece.

Please remove ports/emulators/wine/files/patch-3.3-sys-ldtshare and
install the patch at the end of this message. (The change made by
patch...ldtshare has been part of at least two FreeBSD releases now,
4.0 and 4.1, and it was not necessary for most Wine applications I
tried, so I don't see a point in keeping it.)

Gerald

Index: README.patch
===================================================================
RCS file: /sw/FreeBSD/CVSUP/ports/emulators/wine/files/README.patch,v
retrieving revision 1.3
diff -u -r1.3 README.patch
--- README.patch	2000/02/08 09:26:18	1.3
+++ README.patch	2000/08/07 22:40:58
@@ -3,11 +3,6 @@
 They unfortunately didn't make it into the base distribution in time
 for the 3.3 release code freeze...
 
-patch-3.3-sys-ldtshare:
-make kernel threads (rfork(), which wine uses) share one LDT instead of
-each having its own.  this fixes the same problem that wine also had on
-linux kernels before 2.2.
-
 patch-3.3-sys-sigtrap:
 stop wine's SIGTRAP handler from being called in the sigreturn syscall,
 causing problems for wine's internal debugger.  (it would still
@@ -29,7 +24,6 @@
 
 Apply as follows:
 
-	(cd /usr/src/sys && patch ) <patch-3.3-sys-ldtshare
 	(cd /usr/src/sys && patch ) <patch-3.3-sys-sigtrap
 
 And if you don't already have it:
@@ -39,27 +33,6 @@
 then build a new kernel. (don't forget to include the options USER_LDT,
 SYSVSHM, SYSVSEM, and SYSVMSG, wine needs these.)
 
-A note about local patches and ctm, cvsup and friends...
-(if you don't know what those are good for see for example
-http://www.freebsd.org/handbook/stable.html)
-ctm cannot deal with local patches (unless you use it to mirror
-the cvs tree of course, instead of the sources directly), with
-cvsup i'm not sure but in any case the workaround is simple:  use
-patch -R to un-apply any local patches before the update (feeding
-it the patches again as above on stdin), then when the update is
-finished apply them again.  Should they fail on the updated sources
-(and you cannot fix it yourself), look for new versions of the
-patches at the place where you got them, or in this case you
-can also look in my current wine port tree at
-http://www.jelal.kn-bremen.de/freebsd/ports/emulators/wine/files/
-
 -current users:
-A LDT patch for -current is at http://people.FreeBSD.org/~luoqi/
-(well in a recent posting on the -current list,
-http://www.freebsd.org/cgi/mid.cgi?db=&id=199911150745.CAA27884@lor.watermarkgroup.com
-he said that version is outdated, seems you have to mail him to
-get a current one), the sigtrap patch looks like it could also
-apply to -current but i haven't tried.  And the fs/gs patch of course
-already is in -current.
-Late note: the LDT sharing fix just seems to have been committed now...
-(to -current that is.)
+The sigtrap patch looks like it could also apply to -current but i haven't
+tried.  And the fs/gs patch of course already is in -current.
Comment 7 dannyboy 2000-08-22 03:29:31 UTC
Just making note here that I've applied Gerald's patch.

-- 
Daniel Harris
Comment 8 Sheldon Hearn freebsd_committer freebsd_triage 2000-08-22 16:07:22 UTC
State Changed
From-To: open->feedback

This one can be closed now, no?
Comment 9 pfeifer 2000-08-24 01:58:56 UTC
sheldonh wrote:
> This one can be closed now, no? 

Not yet, but nearly! :-)

The following patch for the Wine port finally allows this PR to be fully
closed. (The kernel has already been updated, but neither the Wine port
nor this PR have been...)

Please install the patch below and remove files/patch-3.3-sys-fsgs from
the Wine port and also from pkg/PLIST of that port.

Approved by the maintainer of that port. (Myself ;-) )

Gerald

Index: Makefile
===================================================================
RCS file: /sw/FreeBSD/CVSUP/ports/emulators/wine/Makefile,v
retrieving revision 1.91
diff -u -3 -p -r1.91 Makefile
--- Makefile	2000/08/21 22:51:45	1.91
+++ Makefile	2000/08/24 00:46:56
@@ -101,7 +101,6 @@ do-install:
 		${PREFIX}/lib/wine/reg
 	${INSTALL_DATA} ${FILESDIR}/README.patch \
 		${FILESDIR}/patch-3.3-sys-sigtrap \
-		${FILESDIR}/patch-3.3-sys-fsgs \
 		${PREFIX}/lib/wine
 	${INSTALL_DATA} ${WRKSRC}/winedefault.reg ${PREFIX}/lib/wine
 	${ECHO}
Index: files/README.patch
===================================================================
RCS file: /sw/FreeBSD/CVSUP/ports/emulators/wine/files/README.patch,v
retrieving revision 1.4
diff -u -3 -p -r1.4 README.patch
--- files/README.patch	2000/08/21 19:27:04	1.4
+++ files/README.patch	2000/08/21 22:57:51
@@ -1,38 +1,20 @@
-Here are some patches for FreeBSD's kernel that are necessary for wine
-(well not strictly _necessary_ but without them parts of it won't work.)
+Here are some patches for FreeBSD's kernel that are necessary for Wine
+(Well not strictly _necessary_ but without them parts of it won't work).
 They unfortunately didn't make it into the base distribution in time
 for the 3.3 release code freeze...
 
 patch-3.3-sys-sigtrap:
 stop wine's SIGTRAP handler from being called in the sigreturn syscall,
-causing problems for wine's internal debugger.  (it would still
+causing problems for wine's internal debugger.  (It would still
 correctly show a crash backtrace but all commands that use single-
 stepping failed.)
 
-patch-3.3-sys-fsgs:
-always set/use the sc_fs and sc_gs entries in the sigcontext struct,
-making -stable behave the same as -current there.  this should finally
-allow signal handling of a wine that was built on -stable to correctly
-run on -current too.  The corresponding wine change is in the port in
-patches/patch-af, it is also in wine's CVS tree now, so that file will
-disappear when the port is updated after the next wine release.
-(this one was MFC'd Nov 15 1999, so you only need it if you're running a
-system from the -stable branch older than that, like a 3.3-RELEASE.  If you
-happen to try to apply it when its already there patch(1) should complain
-`Reversed (or previously applied) patch detected!  Assume -R? [y]',
-just hit ^C then...)
-
 Apply as follows:
 
 	(cd /usr/src/sys && patch ) <patch-3.3-sys-sigtrap
-
-And if you don't already have it:
-
-	(cd /usr/src/sys && patch ) <patch-3.3-sys-fsgs
 
-then build a new kernel. (don't forget to include the options USER_LDT,
-SYSVSHM, SYSVSEM, and SYSVMSG, wine needs these.)
+and build a new kernel. (Don't forget to include the options USER_LDT,
+SYSVSHM, SYSVSEM, and SYSVMSG which are required by Wine.)
 
--current users:
-The sigtrap patch looks like it could also apply to -current but i haven't
-tried.  And the fs/gs patch of course already is in -current.
+4.x users: The sigtrap patch looks like it could also apply to 4.x but I
+haven't tried.
Index: pkg/MESSAGE
===================================================================
RCS file: /sw/FreeBSD/CVSUP/ports/emulators/wine/pkg/MESSAGE,v
retrieving revision 1.2
diff -u -3 -p -r1.2 MESSAGE
--- pkg/MESSAGE	1999/12/10 17:36:22	1.2
+++ pkg/MESSAGE	2000/08/24 00:48:24
@@ -3,10 +3,6 @@ options USER_LDT, SYSVSHM, SYSVSEM, and 
 you may want to apply the patches in %%PREFIX%%/lib/wine to your
 kernel sources, see the README.patch there.
 
-(Note: if you already installed the patches from the 991031 version of
-this port and you're not tracking -stable or your -stable is older than
-Nov 15 1999:  there is a new patch you need, patch-3.3-sys-fsgs)
-
 And the port now also installs some of wine's doc files which
 describe additional things that are not in the manual pages, see
 %%PREFIX%%/lib/wine/documentation.  There are more in the source tree
Comment 10 Sheldon Hearn freebsd_committer freebsd_triage 2000-08-24 09:13:39 UTC
State Changed
From-To: feedback->open

Got feedback in the form of a patch to be applied to the port. 


Comment 11 Sheldon Hearn freebsd_committer freebsd_triage 2000-08-24 09:13:39 UTC
Responsible Changed
From-To: freebsd-bugs->sheldonh

I'll apply the patch.
Comment 12 Sheldon Hearn freebsd_committer freebsd_triage 2000-08-24 09:17:44 UTC
State Changed
From-To: open->feedback

The port has been updated.  Are there any outstanding issues 
relating to this problem on any branch, or can this be closed 
now? :-)
Comment 13 pfeifer 2000-08-30 20:30:50 UTC
sheldonh wrote:
> The port has been updated.  Are there any outstanding issues relating
> to this problem on any branch, or can this be closed now? :-)

Close it! :-)

(Someone interested in a current Wine port should at least use 4.x, and
even on 3.3, which was the original target of these patches, they were
not required for many applications.)

Gerald
-- 
Gerald "Jerry" pfeifer@dbai.tuwien.ac.at http://www.dbai.tuwien.ac.at/~pfeifer/
Comment 14 Sheldon Hearn freebsd_committer freebsd_triage 2000-08-31 10:24:48 UTC
State Changed
From-To: feedback->closed

The maintainer's happy with the outcome -- closed.