Re: Rep:[f-cpu] Hot issue : external LSU ?



In other words, is ll/sc there to "emulate" a true CAS?

In that case, it would be much easier to create a kind of CAS
instruction. This could be made by linking 2 or more instructions (the
first instruction raises a flag, and if the second instruction follows,
it is a true CAS).

The system then emits the CAS functionality that the bus system should
provide. Exactly the same could be done for CAS2 (could you explain
again why we need CAS2? Cedric explained to me that it isn't useful and
that CAS is enough).
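
To make the question concrete, here is a minimal sketch of a CAS
expressed as an ll/sc pair. It is only a toy model: the names
LinkedMemory and compare_and_swap are made up for the illustration, and
the single link flag is an assumption, not the F-CPU design.

```python
class LinkedMemory:
    """Toy memory with one link flag, loosely modelling ll/sc (assumed model)."""

    def __init__(self):
        self.cells = {}
        self.linked_addr = None  # address tagged by the last load-linked

    def load_linked(self, addr):
        self.linked_addr = addr
        return self.cells.get(addr, 0)

    def store(self, addr, value):
        # A plain store to the linked address breaks the link.
        if addr == self.linked_addr:
            self.linked_addr = None
        self.cells[addr] = value

    def store_conditional(self, addr, value):
        # Succeeds only if the link set by load_linked is still intact.
        if self.linked_addr != addr:
            return False
        self.cells[addr] = value
        self.linked_addr = None
        return True


def compare_and_swap(mem, addr, expected, new):
    """CAS built from an ll/sc pair; returns True if the swap happened."""
    while True:
        current = mem.load_linked(addr)
        if current != expected:
            return False
        if mem.store_conditional(addr, new):
            return True
        # The link was lost between ll and sc: retry.
```

The flag raised by the first instruction plays the role of the link
here: the second instruction only writes if the flag is still up.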

nicO

Christophe Avoinne wrote:
> 
> Well, I tried, but I don't see any point in emulating 'll/sc' with a
> CAS.
> 
> ----- Original Message -----
> From: "Nicolas Boulay" <nicolas.boulay@ifrance.com>
> To: <f-cpu@seul.org>
> Sent: Friday, August 30, 2002 7:20 PM
> Subject: Rep:[f-cpu] Hot issue : external LSU ?
> 
> I will reread this carefully later, but just one question: could
> lstore/cload be replaced by CAS and CAS2?
> 
> nicO
> 
> -----Original Message-----
> From: "Christophe Avoinne" <christophe.avoinne@laposte.net>
> To: <f-cpu@seul.org>
> Date: 30/08/02
> Subject: [f-cpu] Hot issue : external LSU ?
> 
> Hmm, well, this discussion seems to be very hot.
> 
> The best I can do is to lay out more clearly the different approaches
> I see (each may or may not have its drawbacks) :
> 
> First :
> ------
> The locked load/store cannot share the instructions of the normal
> load/store. Why? Because lstore is not a simple conditional store: we
> really need to catch the test result in a register to check whether
> the locked write occurred, because numerous algorithms need this
> information after executing a locked store. So you must forget the
> idea of putting the locked load/store in a conditional load/store
> format.
> 
> Second :
> ---------
> Most data doesn't need to be shared between CPUs, so normal load/store
> without any synchronisation can be done at full speed. In fact, it is
> up to the software programmer not to abuse locked load/store
> operations, since there is no real way to speed them up (they will
> always be slower than normal load/store operations).
> 
> ---
> 
> In the case of a uniprocessor :
> ------------------------------
> 
> Locked load/store can be done easily in the LSU with a bit acting like
> a token. The locked load puts a token in an LSU entry; the first
> locked store must be the first to take the token from that LSU entry.
> It should be easy enough. The only trouble is that you cannot handle
> an array of locks that way, because of the byte width of an LSU entry
> (how many bytes does an LSU entry represent?). With the fastest
> implementation, if I 'locked load' two contiguous words, I would in
> fact set the token in the same LSU entry. So if I 'locked store' the
> two contiguous words, the last one would fail (not what we expect, but
> it will just slow things down). But if you have an array of large
> nodes with well separated lock words, this trouble should disappear.
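
The token scheme and its granularity problem can be sketched with a
small toy model. Everything here is hypothetical: the class name
TokenLSU, the 32-byte entry width (the question of the real width is
left open above) and the rule that a successful locked store consumes
the token.

```python
class TokenLSU:
    """Toy LSU: each entry covers `entry_bytes` bytes and holds one token bit."""

    def __init__(self, entry_bytes=32):
        self.entry_bytes = entry_bytes  # assumed width, not an F-CPU figure
        self.tokens = set()             # indices of entries holding a token
        self.memory = {}

    def _entry(self, addr):
        return addr // self.entry_bytes

    def locked_load(self, addr):
        # Put a token in the LSU entry covering this address.
        self.tokens.add(self._entry(addr))
        return self.memory.get(addr, 0)

    def locked_store(self, addr, value):
        # The first locked store takes the token; later ones find none and fail.
        entry = self._entry(addr)
        if entry not in self.tokens:
            return False
        self.tokens.discard(entry)
        self.memory[addr] = value
        return True


lsu = TokenLSU(entry_bytes=32)

# Two contiguous words fall into the same entry and share one token.
lsu.locked_load(0x40)
lsu.locked_load(0x48)
first = lsu.locked_store(0x40, 1)    # True
second = lsu.locked_store(0x48, 2)   # False: the token is already taken

# Well separated lock words land in different entries and do not collide.
lsu.locked_load(0x40)
lsu.locked_load(0x80)
far_a = lsu.locked_store(0x40, 3)    # True
far_b = lsu.locked_store(0x80, 4)    # True
```

The failed store is not an error, only a wasted attempt: the software
retries the locked load/store pair, which is the slowdown mentioned
above.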
> 
> I can see a mistake in the idea of using a separate address space for
> locking. You should not read 'locked load/store' as a semaphore
> 'acquire/release', which is not exactly the purpose of CAS and CAS2.
> Just consider an atomic stack: you want to push an element onto, or
> pop one from, a stack atomically. You just need to change the top
> pointer with a CAS (hence the need for 'locked load/store' to be able
> to access the same address space). With a semaphore you would instead
> need to acquire it first, then modify the top pointer, then release
> it, which does not give exactly the same behaviour (it is a blocking
> solution).
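
A minimal sketch of the atomic stack described above, where push and
pop only ever change the top pointer with a CAS. The names AtomicRef,
Node and AtomicStack are invented for the example, and a Python lock
stands in for the atomicity the hardware CAS would provide.

```python
import threading


class AtomicRef:
    """Shared word with compare-and-swap; the lock only simulates hardware atomicity."""

    def __init__(self, value=None):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        with self._guard:
            if self._value is expected:
                self._value = new
                return True
            return False


class Node:
    def __init__(self, item, next_node):
        self.item = item
        self.next = next_node


class AtomicStack:
    def __init__(self):
        self.top = AtomicRef(None)

    def push(self, item):
        # Retry until the top pointer is swapped from the value we read.
        while True:
            old_top = self.top.load()
            if self.top.compare_and_swap(old_top, Node(item, old_top)):
                return

    def pop(self):
        # Returns None when the stack is empty.
        while True:
            old_top = self.top.load()
            if old_top is None:
                return None
            if self.top.compare_and_swap(old_top, old_top.next):
                return old_top.item
```

No caller ever waits while holding something another caller needs: a
failed CAS just means somebody else made progress, and the loop tries
again with the new top pointer.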
> 
> Another problem: imagine you need to update an entry in a user array
> that is being shared. A locked load/store could be used to be sure
> that no other CPU or task is doing something else with the same entry
> in the meantime (very fine-grained synchronisation). Using a semaphore
> would force you to associate one semaphore with each entry...
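
The per-entry update can be sketched as a locked load/store retry loop
that tests the result of the locked store, with no semaphore attached
to the array. This is a toy model under assumptions: LockedArray and
add_to_entry are hypothetical names, each CPU is given a single link,
and a successful locked store is assumed to break every link on that
entry.

```python
class LockedArray:
    """Toy shared array with one link per CPU, loosely modelling locked load/store."""

    def __init__(self, size):
        self.data = [0] * size
        self.links = {}  # cpu id -> index tagged by that CPU's locked load

    def locked_load(self, cpu, index):
        self.links[cpu] = index
        return self.data[index]

    def locked_store(self, cpu, index, value):
        # The result is returned so the caller can test it, like a register result.
        if self.links.get(cpu) != index:
            return False
        self.data[index] = value
        # A successful store breaks every link on this entry, including our own.
        self.links = {c: i for c, i in self.links.items() if i != index}
        return True


def add_to_entry(array, cpu, index, delta):
    """Retry until the locked store reports success; returns the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        old = array.locked_load(cpu, index)
        if array.locked_store(cpu, index, old + delta):
            return attempts
```

Only accesses to the same entry interfere with each other; a link held
on a different entry is left untouched.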
> 
> So please don't confuse 'locked load/store' with the semaphore
> concept, and don't think that using a separate address space is the
> solution for an 'll/sc' counterpart. Your acquire/release suggestion
> is, say, another solution for another locking purpose.
> 
> ---
> 
> In the case of a multi-processor :
> ---------------------------------
> 
> Having an LSU for each processor leads to a coherency problem. Sharing
> locked entries in each LSU is not a good idea, especially for CPUs
> which never access those locked memory locations. Besides, it is
> difficult and slow to propagate such information between LSUs.
> 
> That is why I was wondering whether an external LSU shared by all CPUs
> could be a solution. You must see it just as a suggestion that can be
> turned down or improved.
> 
> Two cases:
> - locked entries are kept both in internal and external LSUs;
> - locked entries are only kept in external LSU;
> 
> I don't think coherency between internal and external LSUs is a real
> problem (I may be wrong).
> 
> Locked entries are kept in both LSUs :
> 
> - normal load/store don't bother with the external LSU. The internal
> LSU accesses the memory directly (we can expect this is what the
> software programmer will use most of the time).
> - locked load sets a lock in the internal (for intra-cpu locking) and
> external (for inter-cpu locking) LSU entries, no matter their
> contents.
> - locked store checks this lock in the internal (for intra-cpu
> locking) AND external (for inter-cpu locking) LSU entries.
> 
> Having this lock bit in the internal LSU removes the need for an
> external LSU on a uniprocessor (it just needs intra-cpu locking), so
> the external LSU is just an option that adds inter-cpu locking
> capability.
> 
> If an external LSU is not present :
> - locked load sets a lock in the internal LSU (for intra-cpu locking).
> - locked store checks this lock in the internal LSU (for intra-cpu
> locking).
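
The "both LSUs" case and its fallback without an external LSU can be
sketched as follows. This is a toy model, not a hardware description:
the Cpu class is hypothetical, lock bits are modelled as sets of
addresses, and it is assumed that a successful locked store clears the
shared lock so that a competing CPU's locked store then fails.

```python
class Cpu:
    """Toy CPU with internal lock bits and an optional shared external LSU."""

    def __init__(self, memory, external_locks=None):
        self.memory = memory                  # dict shared by all CPUs
        self.internal_locks = set()           # intra-cpu locks
        self.external_locks = external_locks  # set shared by all CPUs, or None

    def locked_load(self, addr):
        # Set the lock in the internal LSU and, if present, the external one.
        self.internal_locks.add(addr)
        if self.external_locks is not None:
            self.external_locks.add(addr)
        return self.memory.get(addr, 0)

    def locked_store(self, addr, value):
        # Check the internal lock AND, if present, the external lock.
        ok = addr in self.internal_locks
        if self.external_locks is not None:
            ok = ok and addr in self.external_locks
        if ok:
            self.memory[addr] = value
            if self.external_locks is not None:
                self.external_locks.discard(addr)
        self.internal_locks.discard(addr)
        return ok

    def store(self, addr, value):
        # Normal store: goes straight to memory and ignores every lock.
        self.memory[addr] = value
```

Passing external_locks=None gives the uniprocessor behaviour, and
handing the same set to several Cpu objects gives the inter-cpu check.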
> 
> Locked entries are only kept in the external LSU :
> 
> - normal load/store don't bother with the external LSU. The internal
> LSU accesses the memory directly (we can expect this is what the
> software programmer will use most of the time). In fact the internal
> LSU has no locked LSU entry (it is insensitive to locked load/store).
> - locked load/store always operate on external LSU entries instead of
> internal LSU entries.
> - locked load sets a lock in the external (for intra/inter-cpu
> locking) LSU entry.
> - locked store checks this lock in the external (for intra/inter-cpu
> locking) LSU entry.
> 
> In this case you need an external LSU even for a uniprocessor.
> 
> An external LSU entry is truly shared between CPUs without
> duplication, so there is no coherency problem.
> 
> A mixture :
> -----------
> Locked load/store can come in a mixture of forms, selected by
> suffixes: intra-cpu locking only (uses only the internal LSU),
> inter-cpu locking only (uses only the external LSU), or both
> intra/inter-cpu locking. That way you can even use intra-cpu-only
> locks for threads on a multiprocessor, for faster execution than the
> usual intra/inter-cpu locks (I'm thinking about locks which are only
> relevant to threads of the same process running on one CPU only).
> 
> As a result :
> ------------
> 
> The main idea behind the external LSU is to prevent normal load/store
> from depending on global locking. An application should mostly use
> normal load/store. Threads in a process could use intra-cpu locked
> load/store if necessary. Intra/inter-cpu locked load/store would only
> be used if coherency between CPUs is needed. Inter-cpu locked
> load/store can be used in situations where we know there is only one
> task accessing the data but coherency among CPUs is needed.
> 
> That's all folks.
> 
> *************************************************************
> To unsubscribe, send an e-mail to majordomo@seul.org with
> unsubscribe f-cpu       in the body. http://f-cpu.seul.org/