
Re: [f-cpu] CAS in FC0

----- Original Message -----
From: Michael Riepe <michael@stud.uni-hannover.de>
To: <f-cpu@seul.org>
Sent: Thursday, March 21, 2002 12:05 PM
Subject: Re: [f-cpu] CAS in FC0

> > > One more point to consider: Do we really need an extra load_tag?
> > > Tagging could as well be done for all loads.
> >
> > That is what I said to Yann. If we have no equivalent of llp/scp, I suppose
> > we don't need an extra load_tag in fact.
> >
> > One PDF file talks about a load conditional, that is, a load which can
> > fail if locked (like our store conditional). I don't know the advantage. I
> > should reread this PDF.
> Me too.

Here is the extract where I read about a conditional load:

Hardware Contention Control

As a further extension, a processor can provide a conditional load instruction
or Cload. The Cload instruction is a load instruction that succeeds only if the
location being loaded does not have an advisory lock set on it, setting the
advisory lock when it does succeed.

With Cload available, the version number is loaded initially using Cload rather
than a normal load. If the Cload operation fails, the thread waits and retries,
up to some maximum, and then uses the normal load instruction and proceeds.
This waiting avoids performing the update concurrently with another process
updating the same data structure. It also prevents potential starvation when
one operation takes significantly longer than other operations, causing these
other frequently occurring operations to perpetually abort the former. It
appears particularly beneficial in large-scale shared memory systems where the
time to complete a DCAS-governed operation can be significantly extended by
wait times on memory because of contention, increasing the exposure time for
another process to perform an interfering operation. Memory references that
miss can take 100 times as long, or more, because of contention misses. Without
Cload, a process can significantly delay the execution of another process by
faulting in the data being used by the other process and possibly causing its
DCAS to fail as well.

The cost of using Cload in the common case is simply testing whether the Cload
succeeded, given that a load of the version number is required in any case.

Cload can be implemented using the cache-based advisory locking mechanism
implemented in ParaDiGM [8]. Briefly, the processor advises the cache
controller that a particular cache line is ``locked''. Normal loads and stores
ignore the lock bit, but the Cload instruction tests and sets the cache-level
lock for a given cache line or else fails if it is already set. A store
operation clears the bit. This implementation costs an extra 3 bits of cache
tags per cache line plus some logic in the cache controller. Judging by our
experience with ParaDiGM, Cload is quite feasible to implement.
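
To make the extract concrete, here is a small toy model of the Cload behaviour
it describes: normal loads and stores ignore the lock bit, Cload tests and sets
it, and a store clears it. This is only a sketch; the names CacheLine,
read_version and max_retries are mine (hypothetical), not from the paper or
from FC0, and a single Python object stands in for a cache line.

```python
class CacheLine:
    """Toy model of one cache line with an advisory lock bit (hypothetical)."""

    def __init__(self, value=0):
        self.value = value
        self.locked = False

    def load(self):
        # Normal loads ignore the lock bit.
        return self.value

    def store(self, value):
        # A store writes the data and clears the lock bit.
        self.value = value
        self.locked = False

    def cload(self):
        # Conditional load: fail if the line is already locked,
        # otherwise set the lock and return the data.
        if self.locked:
            return False, None
        self.locked = True
        return True, self.value


def read_version(line, max_retries=3):
    """Try Cload up to max_retries times, then fall back to a normal load.

    Returns (value, got_lock). A real thread would wait between retries.
    """
    for _ in range(max_retries):
        ok, value = line.cload()
        if ok:
            return value, True
    return line.load(), False
```

With this model, a first read_version() on a free line gets the lock, a second
one on the same line exhausts its retries and falls back to the plain load, and
a store() frees the line again.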


OK, so a load conditional lets us know whether someone else is already using
our memory location. That seems fine between two processors, where one can wait
for the other to finish, but what about different tasks on the same processor?

task A -: load_locked [r1],r2,r3 ; ok [r1] is not locked so we lock it
<<< task switch >>>
task B -: loopentry r4
task B -: load_locked [r1],r2,r3 ; failure ! someone else is using it
task B -: if r3 == 0 jump r0,r4

I dislike it.

In fact, to handle this well we would need two tags: one for an
inter-processor lock and another for an intra-processor lock.

The load conditional tests the inter-processor lock and sets both the inter-
and intra-processor locks.
The normal load just sets the intra-processor lock.
The normal store clears both the inter- and intra-processor locks.
The store conditional tests both the inter- and intra-processor locks and
clears them.
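
As a rough sketch of the two-tag idea, the four rules can be modelled as below.
Big assumptions on my side: the store conditional succeeds only if both tags
are still set, and it clears both tags whether it succeeds or not. The rules
above do not say either of those things explicitly, and the class and method
names (TaggedLine and so on) are hypothetical.

```python
class TaggedLine:
    """Toy model of a memory line with two lock tags (hypothetical names)."""

    def __init__(self, value=0):
        self.value = value
        self.inter = False  # inter-processor lock tag
        self.intra = False  # intra-processor lock tag

    def load(self):
        # Normal load: just sets the intra-processor tag.
        self.intra = True
        return self.value

    def load_conditional(self):
        # Tests the inter-processor tag; on success sets both tags.
        if self.inter:
            return False, None
        self.inter = True
        self.intra = True
        return True, self.value

    def store(self, value):
        # Normal store: writes and clears both tags.
        self.value = value
        self.inter = False
        self.intra = False

    def store_conditional(self, value):
        # ASSUMPTION: succeeds only if both tags are still set,
        # and clears both tags in either case.
        ok = self.inter and self.intra
        if ok:
            self.value = value
        self.inter = False
        self.intra = False
        return ok
```

Under these assumptions, a load_conditional followed directly by a
store_conditional succeeds, while a normal store in between clears the tags and
makes the store_conditional fail.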

I'm not sure what I'm talking about :) so don't flame me ;)
