
Re: Rep:[f-cpu] Hot issue : external LSU ?



Christophe Avoinne wrote:
> 
> OK, take this example: I want to increment a memory value atomically.
> 
> CAS version :
> 
> void atomic_inc (int *slot) {
>     int old;
>     do {
>         old = *slot;
>     } while (cas(slot,old,old+1) != old);
> }
> 
> or better, to avoid reading *slot again on each retry:
> 
> void atomic_inc (int *slot) {
>     int old1,old2 = *slot;
>     do {
>         old1 = old2;   /* value we expect to find in *slot */
>     } while ((old2 = cas(slot,old1,old1+1)) != old1);
> }
> 
> ll/sc version :
> 
> void atomic_inc (int *slot) {
>     int old;
>     do {
>         old = ll(slot);
>     } while (!sc(slot,old+1));
> }
> 
> As you can see, ll/sc looks much more "elegant" and is simpler for a
> programmer.
> 
> If you want to apply a function atomically:
> 
> void atomic_apply (int *slot,int (*f) (int)) {
>     int old;
>     do {
>         old = ll(slot);
>     } while (!sc(slot,f(old)));
> }
> 
> in C++ :) :
> 
> template< typename F > void atomic_apply (int &slot,F f) {
>     int old;
>     do {
>         old = ll(&slot);
>     } while (!sc(&slot,f(old)));
> }
> 
> Don't forget:
> * ll reads a value from the matching LSU entry and sets the lock bit.
> * sc doesn't need to read a value from the matching LSU entry and compare
> it with an expected value, unlike CAS; it just needs to check the lock
> bit, so it


Does it have a single lock bit for the whole system, or one lock bit per
address line?

If there is only one, a large-scale system will have a problem. If there
is one per address line, you can't have a flag for every line, so the
lock-bit storage will behave like a cache. You could then "lose" some
lock bits because there isn't enough room. With a lot of processes or a
lot of memory, sooner or later you will have a big problem.
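
To make the "lost lock bit" concern concrete, here is a minimal
single-threaded sketch in C. It assumes a direct-mapped table of
reservations, which is only one possible design and not the F-CPU LSU.
The names sim_ll, sim_sc, lock_table and LOCK_ENTRIES are all
hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_ENTRIES 4  /* hypothetical: far fewer entries than memory lines */

/* One reservation entry: which address is reserved, and whether it is live. */
struct lock_entry {
    const int *addr;
    bool valid;
};

static struct lock_entry lock_table[LOCK_ENTRIES];

/* Direct-mapped index, like a tiny cache: many addresses share one entry. */
static size_t lock_index(const int *addr) {
    return ((uintptr_t)addr / sizeof(int)) % LOCK_ENTRIES;
}

/* ll: read the value and take a reservation, evicting whatever was there. */
static int sim_ll(const int *addr) {
    struct lock_entry *e = &lock_table[lock_index(addr)];
    e->addr = addr;
    e->valid = true;
    return *addr;
}

/* sc: store only if our reservation is still present in the table. */
static bool sim_sc(int *addr, int value) {
    struct lock_entry *e = &lock_table[lock_index(addr)];
    if (!e->valid || e->addr != addr)
        return false;       /* reservation lost: evicted or never taken */
    *addr = value;
    e->valid = false;       /* a reservation is good for one store only */
    return true;
}

/* Returns true when an ll on a colliding address makes a later sc fail. */
static bool demo_lost_reservation(void) {
    static int mem[2 * LOCK_ENTRIES];
    int old = sim_ll(&mem[0]);
    (void)sim_ll(&mem[LOCK_ENTRIES]);  /* maps to the same table entry */
    return !sim_sc(&mem[0], old + 1);  /* fails although mem[0] is unchanged */
}
```

In this model a lost reservation is not a correctness problem, because
the ll/sc loop simply retries. The cost is wasted retries, and it grows
as more addresses compete for the same few entries.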

For CAS, it's a little easier. The I/O bus (as opposed to the memory bus)
should have a read-modify-write cycle. This kind of cycle "locks" the
whole path from the CPU to the memory for a short moment, whatever route
it takes. That's feasible. The performance is very poor, but it is still
much quicker than a mutex. If we add CAS to the bus subsystem, CAS2 could
be added to it too.
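
For reference, the CAS retry loop can be sketched in portable C with the
C11 <stdatomic.h> interface, which is newer than this discussion. This
only shows the programmer-visible side and assumes nothing about how the
bus carries out the read-modify-write cycle. On some targets the compiler
may even implement it with ll/sc-style instructions.

```c
#include <stdatomic.h>

/* Atomically increment *slot with a CAS retry loop.
 * Returns the value that was in *slot just before the increment. */
static int atomic_inc(atomic_int *slot) {
    int old = atomic_load(slot);
    /* On failure, atomic_compare_exchange_weak stores the value it found
     * into old, so the loop never has to re-read *slot explicitly. */
    while (!atomic_compare_exchange_weak(slot, &old, old + 1)) {
        /* retry with the refreshed old */
    }
    return old;
}
```

The weak form may fail spuriously, which is harmless here because the
loop retries anyway.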

nicO

> should be faster.
> 
> ----- Original Message -----
> From: "nico" <nicolas.boulay@ifrance.com>
> To: <f-cpu@seul.org>
> Sent: Saturday, August 31, 2002 4:40 PM
> Subject: Re: Rep:[f-cpu] Hot issue : external LSU ?
> 
> > I can't find the precise behavior of ll/sc again. But I don't much
> > like locking a specific address line. That becomes a very limited
> > resource and raises many problems.
> >
> > And what about distributed memory where pages are duplicated? But maybe
> > ll/sc only seems much more low-level, even though it is visible to the
> > user. So what do you think?
> 
> *************************************************************
> To unsubscribe, send an e-mail to majordomo@seul.org with
> unsubscribe f-cpu       in the body. http://f-cpu.seul.org/