
Re: [f-cpu] Re: Navier-Stokes



On Thu, 18 Apr 2002, Yann Guidon wrote:
>> How about special register access methodology opcodes?
>> This way you define the entry point how to access them
>> but do not need to fix everything (layout, number, etc.)
>> from the beginning. Just have it read/write for superuser
>> only, one parameter being SR#.
>
>i don't understand exactly, but just in case, here is the current method :
>
>- there are only get, put, geti and puti as means to access the SRs.
>  These instructions stall the pipeline until it is sure that there is no
>  access violation.

Okay, this is what I meant. But why stall the pipeline? It would
also be possible to just advise using a number of NOP cycles
after critical settings. Better to raise an out-of-order exception
on an access violation. Then the register bank could also be
implemented in a slow mode (maybe with EEPROM backup or whatever).
And finally, raise an exception on a write to any non-writable
bit(!). This would imply a write mask delivered with the write
data, just to be sure that the bits stay free for future
extension...
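
To make the write-mask idea concrete, here is a minimal sketch in Python. It is not F-CPU code: the function name, the exception class and the `writable_bits` parameter are invented for illustration. The only idea taken from the text is that the writer supplies a mask with the data and that touching a non-writable bit raises an exception.

```python
class SRAccessError(Exception):
    """Hypothetical exception for a write that touches non-writable bits."""


def masked_write(current, data, write_mask, writable_bits):
    """Return the new register value, or raise if the mask is illegal.

    current       -- present register contents
    data          -- value supplied by the writer
    write_mask    -- bits the writer intends to change
    writable_bits -- bits the hardware implements as writable (assumed input)
    """
    illegal = write_mask & ~writable_bits
    if illegal:
        raise SRAccessError(f"write mask touches non-writable bits: {illegal:#x}")
    # Only the masked bits take the new data; all other bits keep their value.
    return (current & ~write_mask) | (data & write_mask)
```

Because reserved bits can never be selected by a legal mask, software cannot come to depend on them, which is what keeps them free for later extension.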

>- the SRs are either hardwired (read-only, for example : the mask revision),
>supervisor (read-only too, if you are not running with a suitable privilege)
>or user-controlled (for example your private size registers).
>
>- the hardwired registers provide information to the running system, they are
>  defined during the CPU design and say how many SRs there are, what kind of CPU,
>  what is the register size, what opcodes and what units are provided...
>
>- The "superuser" registers control those features that have any impact on the
>  running system, for example the trap handler addresses, the TLB, the memory mapping...
>  Note that when it is possible, an individual "authorisation" bit is assigned to every
>  resource so a piece of code can deal with only one resource at a time.
>  A task can only reset this bit (when it is set), but not set it, to enforce
>  the system's protection. This is a bit similar to the Hurd tokens but can
>  be used in other places such as Linux as well.
>
>- The "user" registers are those which have no protection bit, for example the
>  task's private performance counters or size attributes. A user can be "granted" access
>  to an SR-controlled resource if the super-user modifies the associated SR enable bit,
>  but otherwise all the task has is a virtual address range to play with.
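
The access rules quoted above can be sketched roughly as follows. This is a toy Python model under stated assumptions, not the F-CPU design: `SRBank`, `grant`, `revoke` and the layout format are invented names, and treating every existing SR as readable is an assumption. Only get/put, the three SR classes, the trap on missing rights and the "a task may clear but not set the bit" rule come from the description.

```python
from enum import Enum


class SRKind(Enum):
    HARDWIRED = "hardwired"    # fixed at design time, never writable
    SUPERVISOR = "supervisor"  # writable only with privilege or a granted enable bit
    USER = "user"              # no protection bit


class SRTrap(Exception):
    """Models the trap taken when the necessary rights are not granted."""


class SRBank:
    def __init__(self, layout):
        # layout: {sr_number: (SRKind, initial_value)}  (hypothetical format)
        self.kind = {n: k for n, (k, _) in layout.items()}
        self.value = {n: v for n, (_, v) in layout.items()}
        self.enable = set()  # supervisor SRs currently opened to the task

    def get(self, sr):
        if sr not in self.kind:
            raise SRTrap(f"no such SR {sr}")
        return self.value[sr]  # assumption: every existing SR is readable

    def put(self, sr, value, privileged=False):
        kind = self.kind.get(sr)
        if kind is None or kind is SRKind.HARDWIRED:
            raise SRTrap(f"SR {sr} is not writable")
        if kind is SRKind.SUPERVISOR and not (privileged or sr in self.enable):
            raise SRTrap(f"SR {sr} needs supervisor rights")
        self.value[sr] = value

    def grant(self, sr, privileged=False):
        # Setting an enable bit is reserved to the supervisor.
        if not privileged:
            raise SRTrap("only the supervisor may set an enable bit")
        self.enable.add(sr)

    def revoke(self, sr):
        # Any task may clear its enable bit, but cannot set it again itself.
        self.enable.discard(sr)
```

A task that drops a right with `revoke` has to ask the supervisor to get it back, which is the token-like behaviour the quoted text describes.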

Do you handle the tasks in here? That really spoils my thinking.
Task private things... for how many tasks?

>There is no specific opcode for managing that, it's all done with get/put
>and it traps if the necessary rights are not granted, that's all.

Okay, get/put are specific opcodes, are they? ;)

>> Anyway, who wants to end up programming around hardware cache
>> strategies? :-/
>nobody "wants" but there are cases (yours ?) where this becomes necessary.

Oops! Necessary??? If I can control the way the cpu handles it
it may not be necessary at all. Shrug!

>> Could probably be easier to change the cache strategy on the fly?
>
>if you want to change the cache strategy, there must be more than one
>in HW. Intel's MTRR (and the competitor's equivalents) provide a means
>to modify the cache strategy for a fixed number of memory ranges
>(for example, the video memory can be set as write-combine and
>multi-CPU shared memory space can be set as uncacheable).

Yes, generally you have lock, disable and LRU in most processors.
I was mainly talking about the aging process of the cache to be
switchable from LRU to something else. It may therefore not need
a cache flush at all.

>It is usually possible to modify this when the system is "alive"
>but there's still a risk of losing data if you switch from one
>policy to another when you forget to properly flush the caches etc...
>But that's for the x86 world.

:-) see above. If you manage to use the same valid bits it
should not cause problems as long as it stays associative.
I didn't talk of changing the load strategy! Just alternate
strategies for identifying which cache entry to replace
next.
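
A small sketch of what I mean, in Python and purely illustrative: one set of an associative cache where tags and valid bits are shared, and only the victim selection is switchable. The class name, the `policy` strings and the round-robin alternative are assumptions made up for this example; nothing here describes an existing F-CPU cache.

```python
class CacheSet:
    """One set of a set-associative cache with a switchable victim policy."""

    def __init__(self, ways):
        self.tags = [None] * ways
        self.valid = [False] * ways   # shared by every replacement policy
        self.last_used = [0] * ways   # bookkeeping used by LRU only
        self.clock = 0
        self.next_rr = 0              # bookkeeping used by round-robin only
        self.policy = "lru"           # may be changed at any time

    def _victim(self):
        # Invalid ways are always filled first, whatever the policy.
        for way, is_valid in enumerate(self.valid):
            if not is_valid:
                return way
        if self.policy == "lru":
            return min(range(len(self.tags)), key=lambda w: self.last_used[w])
        way = self.next_rr
        self.next_rr = (self.next_rr + 1) % len(self.tags)
        return way

    def access(self, tag):
        """Return True on a hit, False on a miss (the line is then filled)."""
        self.clock += 1
        for way in range(len(self.tags)):
            if self.valid[way] and self.tags[way] == tag:
                self.last_used[way] = self.clock
                return True
        way = self._victim()
        self.tags[way] = tag
        self.valid[way] = True
        self.last_used[way] = self.clock
        return False
```

Setting `policy` between two accesses changes only which way gets evicted next. Lines that were already valid still hit, so no flush is involved.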

>It is possible to do this for F-CPU but i don't see the point of a MTRR-like
>system, at least for the first-generation F-CPUs. I am not exactly sure
>of what you have in mind but if it's simply changing the cache
>from write-through to write-back, do not forget that F-CPU is a multitasking
>system and a user task must not be able to change an environment setting
>that might impact the performance of the other tasks.

I am sometimes thinking about the 64bit generation microcontrollers
with 1TByte memory built into the SOC. :-)

>So as a rule of thumb, the FC0 has a private memory range (as fast as possible)
>and public range (uncached), and the task can specify whether to keep in cache
>or not with the "hint bit" (flush) in the load and store instructions. This
>does not impact multi-tasking systems at all and is very simple to implement.

I see - and I appreciate the hint bit.

>> The gcc 2stage profiling optimization
>> features are not thaaat convenient to use for global...

>profiling is not often used, but it is useful in constrained real-time
>applications, such as if you want to track where your soft DVD player
>wastes time. If you have a relatively coherent program you can attempt
>to do some auto-tuning (adaptive programs) which has the advantage of
>being portable, but you often forget what "convenience" means when you are
>CPU-bound, memory-bound and resource-bound, no ?

I did that type of thing for more than 7 years BTW... ;-)

JG
