[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [f-cpu] CAS in FC0




----- Original Message -----
From: Yann Guidon <whygee@f-cpu.org>
To: <f-cpu@seul.org>
Sent: Wednesday, March 20, 2002 3:18 AM
Subject: Re: [f-cpu] CAS in FC0


> hello,
>
> Christophe wrote:
> > From: Michael Riepe
> > > On Tue, Mar 19, 2002 at 03:29:28AM +0100, Yann Guidon wrote:
> > > [...]
> > > > in fact, the real problem is to think of CAS as an "instruction".
> > > > Things become _so_easy_ if we use a combination of instructions !!!
> > > > We do not need any "CAS instruction" but a variant of the usual
> > > > load and store instructions, just like the locked versions in ALPHA
> > > > (and other CPUs).
> > > >
> > > > * "load locked" will "tag" the line it selected for reading. it's just
a bit
> > > > that the issue logic has to set after decoding the instruction.
> > > Or a bunch of bits (e.g. one per byte, or maybe 64-bit word, inside the
line).
> > >
> > > > * "store conditional" will proceed if two conditions are met : the
specified
> > > > condition is true (we can check for LSB, MSB, zero...) and the "tag" is
still
> > > > set.
> >
> > Which condition ? your tag should tell us if someone else has written to
this
> > line before your store conditional. So there is no condition to test
> > beforehand.

Sorry ! it was too late when I wrote that. In fact I didn't realize your were
writing a CAS implementation so you're right for the comparison :).

And I did another mistake in my example of implementation of CAS, here the
right one without the loop (i forget to remove the loop when I transform my TAS
example into CAS example) :

r1 : pointer, r2 : requested value, r3 : new value

move r2,r4
move r1,r6
load_tagged [r1],r2         // r2 : read value and set tag for [r1]
xor r2,r4,r1
if r1 != 0 jump r0,r63       // the value is not that we expect so return to
our caller !
store_tagged [r6],r3,r1   // r3 : value to write into [r6] if tag still set
(tag is cleared afterward)
jump r0,r63                      // return to our caller

Now note that I expect this implementation returns 0 if success instead of !0.
"store_tagged" returns 0 if tag is set now.

> you forget the purpose of the comparison. On monday, you wrote :
> > - a single word compare-and-swap : cmpxchg reg1,[reg2]
> >
> > atomic {
> >     if (eax == *reg2) {
> >         *reg2 = reg1;
> >         zf = 1; /* zero flag */
> >     } else {
> >         eax = *reg2; // update the requested value since it has changed
> >         zf = 0;
> >     }
> > }

The main difference between the acadimical CAS and the Intel cmpxchg is that
the latter updates the register which would contain the requested value. But
implementing cmpxchg finally looks like our CAS implementation since r2
contains the actual read value when not maching with the requested value. Good.

> so the store to memory is conditioned by the result of the comparison
> AND the integrity of the dirty flag.
> Unless i don't understand your C code.

It's the case for CAS implementation. I did a mistake so you're okay :).

> > > > What about the scheduling ? it's almost as fast, if there are all the
> > > > bypass networks !
> > > >
> > > >   load_tag [r1],r2
> > > >   xor  r2,r3,r4
> > > >   if r4==0 store_locked [r1],r3
> > > >
> > > > (or something like that)
> >
> > And store_locked should return the tag in a register to know if store has
be
> > done. This register can be zero if tag is clear (someone else has already
> > modified our word).
>
> yup. However, notice that most fields in the locked store are used :-/
>

Or use the register which holds the new value to be set according the tag just
after writing this register in memory if you don't have enough place ?

store_tagged [r1],r2 {
    if ([r1].tag == 1) {
        *r1 = r2;
        r2 = 0;
    } else {
        r2 = 1;
    }
}

> > how to implement a software CAS :

Oh forget this one, it is buggy :).

> this can be also written as :
>    loopentry r5

oh yes that's true.

> > jump.nz r6,r0,r5               // the value is not that we expect !
>
> oops, the pointers are in the middle, so it should be
>   if r6==0 jump r5; /* YG "clear" syntax */

ok ok i will remember to use this syntax for the next time

> > nxor r2,r0,r3                     // r3 != 0 if [r1] == r2 until writing r3
> > into [r1]
> what is this instruction ?

r3 = ~(r2 ^ r0);
It may be a mistake but forget we can use only xor if we decide that we return
0 when success.

> > how to implement a PUSH :
> >
> > loadaddr 0f,r5
> > move r2,r3 // our node address to push
> > 0:
>
> idem : "loopentry r5" is here for this purpose.
>
> > load_tagged [r1],r2       // read top
> > store r2,[r3]                    // node->link = top
> > store_tagged r3,[r1],r4 // top = node
> > jump.nz r4,r0,r63             // return to caller if ok
> > jump r0,r5
>
> mmm to me, it looks more like an element that is inserted
> in a linked list, but who cares, it seems to do the job.

I'm speaking about a stack implemented as a list, you want a version using an
array ? I would need something like a CAS2 to do so safely so I'm not bothering
to show it.


*************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe f-cpu       in the body. http://f-cpu.seul.org/