Bug 161498

Summary: [patch] ARM RAS code can fail to restart an atomic sequence.
Product: Base System Reporter: Ian Lepore <freebsd>
Component: armAssignee: freebsd-arm (Nobody) <freebsd-arm>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 8.2-STABLE   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
ras-current.diff none

Description Ian Lepore 2011-10-11 18:50:02 UTC
The RAS implementation for ARM can occasionally fail to restart a sequence 
after being interrupted, leading to a failure of atomic operations which 
manifests in ways such as granting the same pthread mutex to multiple threads 
concurrently.

The "normal" RAS sequence goes like this:

    On entry, RAS_START = 0x00000000, RAS_END = 0xffffffff
    
    1: Set RAS_START to address of Step 1
    2: Set RAS_END to address of Step 4
    3: Do the atomic operation
    4: Set RAS_START to 0x00000000
    5: Set RAS_END to 0xffffffff

While the normal entry condition is for RAS_START/END to be 0x0/0xffffffff 
the code in PUSHFRAMEINSVC does not reliably enforce this (it sometimes 
leaves a stale value in RAS_END), which is the source of one of the two bugs 
in the current implementation.  The other bug is that the PUSHFRAMEINSVC 
code compares the PC to RAS_END using signed-value logic, but proper 
operation of the RAS sequence requires that 0xffffffff be interpreted as the 
highest possible memory address, not -1.

The first bug happens with interaction between a pair of threads doing RAS 
sequences at the same time.  Thread 1 runs through step 4 of the sequence 
and gets interrupted before completing step 5.  The PUSHFRAMEINSVC code 
sees a zero in RAS_START and does nothing; notably it does not change the 
value in RAS_END because START was zero.  Thread 2 begins to run and gets 
interrupted between steps 1 and 2.  Now PUSHFRAMEINSVC sees a non-zero 
RAS_START, and the RAS_END value is whatever thread 1 left there when it 
got interrupted.  The behavior now depends on whether the RAS sequence in 
thread 2 is at a higher or lower memory address than the one in thread 1.  
In one case it accidentally works right, in the other case an interrupt 
between steps 1 and 2 wouldn't lead to a proper restart because the PC is 
not between RAS_START and RAS_END.  But because RAS_START was non-zero
when thread 2 got interrupted, RAS_START/END get reset, so when thread 2 
gets resumed at step 2 it never reloads RAS_START, and another interrrupt 
in the sequence leads to misbehavior.

The second bug (signed value logic) is conceptually simpler, here is a 
sequence that triggers that bug:

On entry, RAS_START/END are 0x0/0xffffffff. A thread enters a RAS 
sequence and completes step 1, then gets interrupted before completing 
step 2.  The PUSHFRAMEINSVC code sees a non-zero RAS_START so it compares 
the PC to RAS_END.  Because of the signed-value logic, the PC will never be 
between RAS_START and -1, so the PUSHFRAMEINSVC code doesn't set up for a 
restart at the RAS_START location.  When the thread eventually resumes at 
step 2, RAS_START will be zero, and if it gets interrupted again before 
the end of the atomic operation it would not get restarted.


The attached patch addresses these two conditions as follows:

Now the PUSHFRAMEINSVC macro always resets RAS_START to zero if it has a non-
zero value, and always sets RAS_END to 0xffffffff if it has a different value.
This differs from the original in that the value of RAS_START isn't used to
decide whether to reset RAS_END. The two decisions are made independently, 
which allows them to reliably enforce the entry-condition for the next thread
regardless of where in the RAS sequence the current thread was interrupted.

The macro now uses conditional execution based on an unsigned compare of the 
PC against RAS_END to decide whether to set up for a restart (strhi rather 
than strgt instruction).  In the case where RAS_START has been set and 
RAS_END is still at 0xffffffff, this will result in setting up for a restart 
at the beginning of the sequence so that an interrupt on the second time 
through will still result in atomic behavior (I.E.  yet another restart for a 
3rd attempt at the atomic op).  

Note that there is some typical ARM conditional-execution trickiness in the 
new logic: the strhi instruction that changes the saved PC to the RAS_START 
address will execute only if the carry flag is set and the zero flag is not.  
If the RAS_START value is zero then the compare with the RAS_END address is 
skipped and the zero flag is still set from the RAS_START compare, causing 
the strhi to get skipped based soley on the RAS_START value, regardless of 
the carry flag. If RAS_START is non-zero then the RAS_END compare happens and 
the strhi executes based on the zero/carry status of the RAS_END compare.  

Note also that the patched code assumes that if RAS_START is non-zero, the
userland PC in PUSHFRAMEINSVC must be greater than RAS_START.  This is 
implicit in the definition of RAS, which prohibits function calls during a 
RAS sequence.  This allows the code to avoid a specific check for the PC 
being greater than RAS_START if RAS_START is non-zero, and explains why the 
series of instructions that used to be conditional on 'gt' are now 'ne'.

Thanks go to Symmetricom, Inc, for sponsoring this work, and to my 
collegue there, Lars Boehnke, for crafting test code to demonstrate the 
original problem and prove the fix works.

Fix: The following patch fixes the problems described above.  The changes to
sysarch.h are just updates to the comments to help future maintainers.

These diffs were developed for freebsd 8.x but apply cleanly to -current.

These changes have been extensively tested at Symmetricom; we are currently
deploying products that include these fixes.
How-To-Repeat: The following small program will demonstrate the problem using the 
atomic_cmpset_int() function.  The failure will be provoked most quickly
in the presence of lots of interrupts, so running with a kernel that has
HZ set unreasonably high will help demonstrate the problem.

--- atomictest.c begins here ---
/*
 * atomictest.c - Demonstrate non-atomicity of ARM RAS code.
 *
 * Build using: make LDFLAGS=-pthread atomictest
 *
 * Run in an environment with lots of interrupts going on (HZ=4096 does
 * the trick nicely for me).  On a 180mhz rm9200 with HZ=100 it takes
 * hours-to-months to see a failure; with HZ=4096 it takes minutes.
 */

#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <machine/atomic.h>

static int sStopThreads = 0;
static const int sNumThreads = 3;
static unsigned int sAtomicLock = 0;

// The implementation is defined in a macro because the RAS sections need to
// be at different addresses to demonstrate the problem.

// CMPSET test -
//  If a thread is able to "lock" sAtomicLock, it should be able to "release"
//  it.  Failure to release indicates multiple threads were able to change the
//  value of sAtomicLock at the same time.

#define THREAD_IMP(thrIdx) \
    unsigned int locked, released;                                           \
                                                                             \
    while (! sStopThreads)                                                   \
    {                                                                        \
        locked = atomic_cmpset_int(&sAtomicLock, 0, (thrIdx+1));             \
        released = atomic_cmpset_int(&sAtomicLock, (thrIdx+1), 0);           \
        if (locked != released)                                              \
        {                                                                    \
            printf("*** ERROR: locked=%u, released=%u\n", locked, released); \
            sStopThreads = 1;                                                \
            break;                                                           \
        }                                                                    \
    }                                                                        \
                                                                             \
    return NULL;                                                             \


void* ThreadFunc1(void* varg)
{
    THREAD_IMP(0);
}

void* ThreadFunc2(void* varg)
{
    THREAD_IMP(1);
}

void* ThreadFunc3(void* varg)
{
    THREAD_IMP(2);
}

void SigHandler(int sig)
{
    if (sig == SIGINT)
        sStopThreads = 1;
    return;
}

int main(int argc, const char* argv[])
{
    pthread_t threads[sNumThreads];
    size_t idx;
    time_t start = time(NULL);

    signal(SIGINT, SigHandler);
    
    pthread_create(&threads[0], NULL, ThreadFunc1, NULL);
    pthread_create(&threads[1], NULL, ThreadFunc2, NULL);
    pthread_create(&threads[2], NULL, ThreadFunc3, NULL);
    
    while (! sStopThreads)
        sleep(1);
    
    // make sure all threads have stopped
    for (idx = 0; idx < sNumThreads; idx ++)
        pthread_join(threads[idx], NULL);

    printf("Elapsed %lld seconds\n", (int64_t)(time(NULL)-start));

    return 0;
} 
--- atomictest.c ends here ---
Comment 1 marktinguely 2011-10-11 23:13:30 UTC
On 10/11/2011 12:42 PM, Ian Lepore wrote:
> The "normal" RAS sequence goes like this:
>
>      On entry, RAS_START = 0x00000000, RAS_END = 0xffffffff
>
>      1: Set RAS_START to address of Step 1
>      2: Set RAS_END to address of Step 4
>      3: Do the atomic operation
>      4: Set RAS_START to 0x00000000
>      5: Set RAS_END to 0xffffffff

good job!

I am looking forward to ARMv6/ARMv7 to replace these with 
ldrex/strex/clrex calls.

Devil's advocate: a person can now put any value in RAS_START and wait 
for an interrupt. I don't see that doing them any good. In the past they 
could do this only if the (RAS_START < PC). In a uni-processor, we 
cannot overlap these critical sections. We are assuming that critical 
sections are made by an atomic command because we do not check to see if 
(RAS_START < PC). Does the RAS_END comparison buy us anything? At this 
point.

     RAS_END == 0xffff_ffff when interrupt is between step 1 and 2
     RAS_END == address of step 5

In both cases, (PC <= RAS_END)

--Mark.
Comment 2 Ian Lepore 2011-10-11 23:29:08 UTC
On Tue, 2011-10-11 at 17:13 -0500, Mark Tinguely wrote:
> On 10/11/2011 12:42 PM, Ian Lepore wrote:
> Does the RAS_END comparison buy us anything? At this point.
> 
>      RAS_END == 0xffff_ffff when interrupt is between step 1 and 2
>      RAS_END == address of step 5
> 
> In both cases, (PC <= RAS_END)
> 
> --Mark.

The RAS_END comparison is still needed to ensure idempotence of the
atomic sequence.  Once the PC passes the "final store" of the atomic
sequence an interrupt must not cause a restart of the sequence, but
until the zero is successfully stored into RAS_START the possibility of
a restart is still in play, so it's the RAS_END comparison that covers
the case of an interrupt while tearing down the RAS mechanism.

Oh! I just re-read your quoted text above... an important distinction is
that RAS_END is the address of step 4, not step 5.  In the asm code it
has to be the address of the next instruction after the one that makes
the final modification of the memory value being protected by the
sequence.

-- Ian
Comment 3 dfilter service freebsd_committer freebsd_triage 2011-10-16 18:59:42 UTC
Author: cognet
Date: Sun Oct 16 17:59:28 2011
New Revision: 226443
URL: http://svn.freebsd.org/changeset/base/226443

Log:
  Fix 2 bugs :
  
  - A race condition could happen if two threads were using RAS at the same time
  as the code didn't reset RAS_END, the RAS code could believe we were not in
  a RAS, when we were in fact.
  - Using signed value logic to compare addresses wasn't such a good idea.
  
  Many thanks to Ian to investigate on these issues.
  
  Pointy hat to: 	cognet
  PR:		arm/161498
  Submitted by:	Ian Lepore <freebsd At damnhippie DOT dyndns dot org
  MFC after:	1 week

Modified:
  head/sys/arm/include/asmacros.h
  head/sys/arm/include/sysarch.h

Modified: head/sys/arm/include/asmacros.h
==============================================================================
--- head/sys/arm/include/asmacros.h	Sun Oct 16 17:38:20 2011	(r226442)
+++ head/sys/arm/include/asmacros.h	Sun Oct 16 17:59:28 2011	(r226443)
@@ -71,9 +71,8 @@
 	ldr	r0, =ARM_RAS_START;					   \
 	mov	r1, #0;							   \
 	str	r1, [r0];						   \
-	ldr	r0, =ARM_RAS_END;					   \
 	mov	r1, #0xffffffff;					   \
-	str	r1, [r0];
+	str	r1, [r0, #4];
 
 /*
  * PULLFRAME - macro to pull a trap frame from the stack in the current mode
@@ -120,20 +119,19 @@
 	stmia	r0, {r13-r14}^;		/* Push the user mode registers */ \
         mov     r0, r0;                 /* NOP for previous instruction */ \
 	ldr	r5, =ARM_RAS_START;	/* Check if there's any RAS */	   \
-	ldr	r3, [r5];						   \
-	cmp	r3, #0;			/* Is the update needed ? */	   \
-	ldrgt	lr, [r0, #16];						   \
-	ldrgt	r1, =ARM_RAS_END;					   \
-	ldrgt	r4, [r1];		/* Get the end of the RAS */	   \
-	movgt	r2, #0;			/* Reset the magic addresses */	   \
-	strgt	r2, [r5];						   \
-	movgt	r2, #0xffffffff;					   \
-	strgt	r2, [r1];						   \
-	cmpgt	lr, r3;			/* Were we in the RAS ? */	   \
-	cmpgt	r4, lr;							   \
-	strgt	r3, [r0, #16];		/* Yes, update the pc */	   \
-	mrs	r0, spsr_all;		/* Put the SPSR on the stack */	   \
-	str	r0, [sp, #-4]!
+	ldr     r4, [r5, #4];           /* reset it to point at the     */ \
+	cmp     r4, #0xffffffff;        /* end of memory if necessary;  */ \
+	movne   r1, #0xffffffff;        /* leave value in r4 for later  */ \
+	strne   r1, [r5, #4];           /* comparision against PC.      */ \
+	ldr     r3, [r5];               /* Retrieve global RAS_START    */ \
+	cmp     r3, #0;                 /* and reset it if non-zero.    */ \
+	movne   r1, #0;                 /* If non-zero RAS_START and    */ \
+	strne   r1, [r5];               /* PC was lower than RAS_END,   */ \
+	ldrne   r1, [r0, #16];          /* adjust the saved PC so that  */ \
+	cmpne   r4, r1;                 /* execution later resumes at   */ \
+	strhi   r3, [r0, #16];          /* the RAS_START location.      */ \
+	mrs     r0, spsr_all;                                              \
+	str     r0, [sp, #-4]!
 
 /*
  * PULLFRAMEFROMSVCANDEXIT - macro to pull a trap frame from the stack

Modified: head/sys/arm/include/sysarch.h
==============================================================================
--- head/sys/arm/include/sysarch.h	Sun Oct 16 17:38:20 2011	(r226442)
+++ head/sys/arm/include/sysarch.h	Sun Oct 16 17:59:28 2011	(r226443)
@@ -42,9 +42,13 @@
  * The ARM_TP_ADDRESS points to a special purpose page, which is used as local
  * store for the ARM per-thread data and Restartable Atomic Sequences support.
  * Put it just above the "high" vectors' page.
- * the cpu_switch() code assumes ARM_RAS_START is ARM_TP_ADDRESS + 4, and
+ * The cpu_switch() code assumes ARM_RAS_START is ARM_TP_ADDRESS + 4, and
  * ARM_RAS_END is ARM_TP_ADDRESS + 8, so if that ever changes, be sure to
  * update the cpu_switch() (and cpu_throw()) code as well.
+ * In addition, code in arm/include/atomic.h and arm/include/asmacros.h
+ * assumes that ARM_RAS_END is at ARM_RAS_START+4, so be sure to update those
+ * if ARM_RAS_END moves in relation to ARM_RAS_START (look for occurrances
+ * of ldr/str rm,[rn, #4]).
  */
 #define ARM_TP_ADDRESS		(ARM_VECTORS_HIGH + 0x1000)
 #define ARM_RAS_START		(ARM_TP_ADDRESS + 4)
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 4 Olivier Houchard freebsd_committer freebsd_triage 2011-11-09 14:14:47 UTC
State Changed
From-To: open->closed

Committed. Thanks!
Comment 5 dfilter service freebsd_committer freebsd_triage 2012-02-07 15:50:49 UTC
Author: cognet
Date: Tue Feb  7 15:50:14 2012
New Revision: 231131
URL: http://svn.freebsd.org/changeset/base/231131

Log:
  MFC r226441 and r226443
  r226441:
  Explicitely set ARM_RAS_START and ARM_RAS_END once the cacheline or the
  page has been allocated, or we could end up using random values, and bad things
   could happen.
  
   PR:		arm/161492
   Submitted by:	Ian Lepore <freebsd AT damnhippie dot dyndns DOT org>
  
  r226443:
  Fix 2 bugs :
  
  - A race condition could happen if two threads were using RAS at the same time
    as the code didn't reset RAS_END, the RAS code could believe we were not in
    a RAS, when we were in fact.
  - Using signed value logic to compare addresses wasn't such a good idea.
    Many thanks to Ian to investigate on these issues.
  
    Pointy hat to: 	cognet
    PR:		arm/161498
    Submitted by:	Ian Lepore <freebsd At damnhippie DOT dyndns dot org

Modified:
  stable/8/sys/arm/arm/machdep.c
  stable/8/sys/arm/include/asmacros.h
  stable/8/sys/arm/include/sysarch.h

Modified: stable/8/sys/arm/arm/machdep.c
==============================================================================
--- stable/8/sys/arm/arm/machdep.c	Tue Feb  7 14:50:33 2012	(r231130)
+++ stable/8/sys/arm/arm/machdep.c	Tue Feb  7 15:50:14 2012	(r231131)
@@ -312,6 +312,8 @@ cpu_startup(void *dummy)
 	m = vm_page_alloc(NULL, 0, VM_ALLOC_NOOBJ | VM_ALLOC_ZERO);
 	pmap_kenter_user(ARM_TP_ADDRESS, VM_PAGE_TO_PHYS(m));
 #endif
+	*(uint32_t *)ARM_RAS_START = 0;
+	*(uint32_t *)ARM_RAS_END = 0xffffffff;
 }
 
 SYSINIT(cpu, SI_SUB_CPU, SI_ORDER_FIRST, cpu_startup, NULL);

Modified: stable/8/sys/arm/include/asmacros.h
==============================================================================
--- stable/8/sys/arm/include/asmacros.h	Tue Feb  7 14:50:33 2012	(r231130)
+++ stable/8/sys/arm/include/asmacros.h	Tue Feb  7 15:50:14 2012	(r231131)
@@ -71,9 +71,8 @@
 	ldr	r0, =ARM_RAS_START;					   \
 	mov	r1, #0;							   \
 	str	r1, [r0];						   \
-	ldr	r0, =ARM_RAS_END;					   \
 	mov	r1, #0xffffffff;					   \
-	str	r1, [r0];
+	str	r1, [r0, #4];
 
 /*
  * PULLFRAME - macro to pull a trap frame from the stack in the current mode
@@ -120,20 +119,19 @@
 	stmia	r0, {r13-r14}^;		/* Push the user mode registers */ \
         mov     r0, r0;                 /* NOP for previous instruction */ \
 	ldr	r5, =ARM_RAS_START;	/* Check if there's any RAS */	   \
-	ldr	r3, [r5];						   \
-	cmp	r3, #0;			/* Is the update needed ? */	   \
-	ldrgt	lr, [r0, #16];						   \
-	ldrgt	r1, =ARM_RAS_END;					   \
-	ldrgt	r4, [r1];		/* Get the end of the RAS */	   \
-	movgt	r2, #0;			/* Reset the magic addresses */	   \
-	strgt	r2, [r5];						   \
-	movgt	r2, #0xffffffff;					   \
-	strgt	r2, [r1];						   \
-	cmpgt	lr, r3;			/* Were we in the RAS ? */	   \
-	cmpgt	r4, lr;							   \
-	strgt	r3, [r0, #16];		/* Yes, update the pc */	   \
-	mrs	r0, spsr_all;		/* Put the SPSR on the stack */	   \
-	str	r0, [sp, #-4]!
+	ldr     r4, [r5, #4];           /* reset it to point at the     */ \
+	cmp     r4, #0xffffffff;        /* end of memory if necessary;  */ \
+	movne   r1, #0xffffffff;        /* leave value in r4 for later  */ \
+	strne   r1, [r5, #4];           /* comparision against PC.      */ \
+	ldr     r3, [r5];               /* Retrieve global RAS_START    */ \
+	cmp     r3, #0;                 /* and reset it if non-zero.    */ \
+	movne   r1, #0;                 /* If non-zero RAS_START and    */ \
+	strne   r1, [r5];               /* PC was lower than RAS_END,   */ \
+	ldrne   r1, [r0, #16];          /* adjust the saved PC so that  */ \
+	cmpne   r4, r1;                 /* execution later resumes at   */ \
+	strhi   r3, [r0, #16];          /* the RAS_START location.      */ \
+	mrs     r0, spsr_all;                                              \
+	str     r0, [sp, #-4]!
 
 /*
  * PULLFRAMEFROMSVCANDEXIT - macro to pull a trap frame from the stack

Modified: stable/8/sys/arm/include/sysarch.h
==============================================================================
--- stable/8/sys/arm/include/sysarch.h	Tue Feb  7 14:50:33 2012	(r231130)
+++ stable/8/sys/arm/include/sysarch.h	Tue Feb  7 15:50:14 2012	(r231131)
@@ -42,9 +42,13 @@
  * The ARM_TP_ADDRESS points to a special purpose page, which is used as local
  * store for the ARM per-thread data and Restartable Atomic Sequences support.
  * Put it just above the "high" vectors' page.
- * the cpu_switch() code assumes ARM_RAS_START is ARM_TP_ADDRESS + 4, and
+ * The cpu_switch() code assumes ARM_RAS_START is ARM_TP_ADDRESS + 4, and
  * ARM_RAS_END is ARM_TP_ADDRESS + 8, so if that ever changes, be sure to
  * update the cpu_switch() (and cpu_throw()) code as well.
+ * In addition, code in arm/include/atomic.h and arm/include/asmacros.h
+ * assumes that ARM_RAS_END is at ARM_RAS_START+4, so be sure to update those
+ * if ARM_RAS_END moves in relation to ARM_RAS_START (look for occurrances
+ * of ldr/str rm,[rn, #4]).
  */
 #define ARM_TP_ADDRESS		(ARM_VECTORS_HIGH + 0x1000)
 #define ARM_RAS_START		(ARM_TP_ADDRESS + 4)
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 6 dfilter service freebsd_committer freebsd_triage 2012-02-07 16:07:39 UTC
Author: cognet
Date: Tue Feb  7 16:07:29 2012
New Revision: 231132
URL: http://svn.freebsd.org/changeset/base/231132

Log:
  MFC r226441 and r226443
  r226441:
  Explicitely set ARM_RAS_START and ARM_RAS_END once the cacheline or the
  page has been allocated, or we could end up using random values, and bad things
  could happen.
  
   PR:		arm/161492
   Submitted by:	Ian Lepore <freebsd AT damnhippie dot dyndns DOT org>
  
  r226443:
  Fix 2 bugs :
  
  - A race condition could happen if two threads were using RAS at the same time
    as the code didn't reset RAS_END, the RAS code could believe we were not in
     a RAS, when we were in fact.
  - Using signed value logic to compare addresses wasn't such a good idea.
  
  Many thanks to Ian to investigate on these issues.
  
    Pointy hat to: 	cognet
    PR:		arm/161498
    Submitted by:	Ian Lepore <freebsd At damnhippie DOT dyndns dot org

Modified:
  stable/9/sys/arm/arm/machdep.c
  stable/9/sys/arm/include/asmacros.h
  stable/9/sys/arm/include/sysarch.h

Modified: stable/9/sys/arm/arm/machdep.c
==============================================================================
--- stable/9/sys/arm/arm/machdep.c	Tue Feb  7 15:50:14 2012	(r231131)
+++ stable/9/sys/arm/arm/machdep.c	Tue Feb  7 16:07:29 2012	(r231132)
@@ -312,6 +312,8 @@ cpu_startup(void *dummy)
 	m = vm_page_alloc(NULL, 0, VM_ALLOC_NOOBJ | VM_ALLOC_ZERO);
 	pmap_kenter_user(ARM_TP_ADDRESS, VM_PAGE_TO_PHYS(m));
 #endif
+	*(uint32_t *)ARM_RAS_START = 0;
+	*(uint32_t *)ARM_RAS_END = 0xffffffff;
 }
 
 SYSINIT(cpu, SI_SUB_CPU, SI_ORDER_FIRST, cpu_startup, NULL);

Modified: stable/9/sys/arm/include/asmacros.h
==============================================================================
--- stable/9/sys/arm/include/asmacros.h	Tue Feb  7 15:50:14 2012	(r231131)
+++ stable/9/sys/arm/include/asmacros.h	Tue Feb  7 16:07:29 2012	(r231132)
@@ -71,9 +71,8 @@
 	ldr	r0, =ARM_RAS_START;					   \
 	mov	r1, #0;							   \
 	str	r1, [r0];						   \
-	ldr	r0, =ARM_RAS_END;					   \
 	mov	r1, #0xffffffff;					   \
-	str	r1, [r0];
+	str	r1, [r0, #4];
 
 /*
  * PULLFRAME - macro to pull a trap frame from the stack in the current mode
@@ -120,20 +119,19 @@
 	stmia	r0, {r13-r14}^;		/* Push the user mode registers */ \
         mov     r0, r0;                 /* NOP for previous instruction */ \
 	ldr	r5, =ARM_RAS_START;	/* Check if there's any RAS */	   \
-	ldr	r3, [r5];						   \
-	cmp	r3, #0;			/* Is the update needed ? */	   \
-	ldrgt	lr, [r0, #16];						   \
-	ldrgt	r1, =ARM_RAS_END;					   \
-	ldrgt	r4, [r1];		/* Get the end of the RAS */	   \
-	movgt	r2, #0;			/* Reset the magic addresses */	   \
-	strgt	r2, [r5];						   \
-	movgt	r2, #0xffffffff;					   \
-	strgt	r2, [r1];						   \
-	cmpgt	lr, r3;			/* Were we in the RAS ? */	   \
-	cmpgt	r4, lr;							   \
-	strgt	r3, [r0, #16];		/* Yes, update the pc */	   \
-	mrs	r0, spsr_all;		/* Put the SPSR on the stack */	   \
-	str	r0, [sp, #-4]!
+	ldr     r4, [r5, #4];           /* reset it to point at the     */ \
+	cmp     r4, #0xffffffff;        /* end of memory if necessary;  */ \
+	movne   r1, #0xffffffff;        /* leave value in r4 for later  */ \
+	strne   r1, [r5, #4];           /* comparision against PC.      */ \
+	ldr     r3, [r5];               /* Retrieve global RAS_START    */ \
+	cmp     r3, #0;                 /* and reset it if non-zero.    */ \
+	movne   r1, #0;                 /* If non-zero RAS_START and    */ \
+	strne   r1, [r5];               /* PC was lower than RAS_END,   */ \
+	ldrne   r1, [r0, #16];          /* adjust the saved PC so that  */ \
+	cmpne   r4, r1;                 /* execution later resumes at   */ \
+	strhi   r3, [r0, #16];          /* the RAS_START location.      */ \
+	mrs     r0, spsr_all;                                              \
+	str     r0, [sp, #-4]!
 
 /*
  * PULLFRAMEFROMSVCANDEXIT - macro to pull a trap frame from the stack

Modified: stable/9/sys/arm/include/sysarch.h
==============================================================================
--- stable/9/sys/arm/include/sysarch.h	Tue Feb  7 15:50:14 2012	(r231131)
+++ stable/9/sys/arm/include/sysarch.h	Tue Feb  7 16:07:29 2012	(r231132)
@@ -42,9 +42,13 @@
  * The ARM_TP_ADDRESS points to a special purpose page, which is used as local
  * store for the ARM per-thread data and Restartable Atomic Sequences support.
  * Put it just above the "high" vectors' page.
- * the cpu_switch() code assumes ARM_RAS_START is ARM_TP_ADDRESS + 4, and
+ * The cpu_switch() code assumes ARM_RAS_START is ARM_TP_ADDRESS + 4, and
  * ARM_RAS_END is ARM_TP_ADDRESS + 8, so if that ever changes, be sure to
  * update the cpu_switch() (and cpu_throw()) code as well.
+ * In addition, code in arm/include/atomic.h and arm/include/asmacros.h
+ * assumes that ARM_RAS_END is at ARM_RAS_START+4, so be sure to update those
+ * if ARM_RAS_END moves in relation to ARM_RAS_START (look for occurrances
+ * of ldr/str rm,[rn, #4]).
  */
 #define ARM_TP_ADDRESS		(ARM_VECTORS_HIGH + 0x1000)
 #define ARM_RAS_START		(ARM_TP_ADDRESS + 4)
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"