
Re: [f-cpu] about register mapping



hi,

Nicolas Boulay wrote:

I would like to read some information about register mapping.

google is your friend ;-)
however you will certainly find references to "register renaming", which
is a different issue.

Memory mapped register to speak to device is a good way to easily extend a design.

what do you mean by "Memory mapped register to speak to device" exactly ?
Do you mean that I/O is memory-mapped ? If so, then history says that
it always ends up in a big mess, look at Ralf Brown's Interrupt List (RBIL)
http://www.cs.cmu.edu/~ralf/interrupt-list/inter61d.zip
files Port.a, Port.b and port.c
For example, port 0x300 was "reserved" for prototype cards,
but today it is used by more than ten "commercial" cards
and probably many more, since the RBIL can't list everything.
Fortunately, the PCI standard reduced the pressure on the I/O port
addressing space with its provision for dynamic address allocation
but the "legacy" remains : the PCI standard is and will remain
tainted by the original PC-XT definition.
The moral is : if it is possible to do something messy, it will become a mess.
Even if you make something a bit better later, it's too late.

If I/O boards, or indeed any boards, are to be designed, then it would be
better to use a few wires for an I2C or SPI bus. When booting,
the main CPU will read the parameters from a serial EEPROM
and configure the addresses and other internal registers.

But it is not a panacea, because the main CPU needs a "device driver"
in order to interpret the embedded parameters. This could cause
more problems (CPU compatibility etc.) than it solves.
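
As an illustration of the boot-time configuration idea above, here is a
minimal sketch in C. Everything in it is an assumption made for the example:
the 8-byte descriptor layout, the names board_descriptor, decode_descriptor
and assign_bases, and the idea that each board asks for a power-of-two
address window. It is not part of any F-CPU, I2C or SPI specification; it
only shows that the CPU can hand out addresses itself instead of relying on
hard-wired ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor stored in each board's serial EEPROM. */
typedef struct {
    uint16_t vendor_id;    /* who made the board */
    uint16_t device_id;    /* which board it is */
    uint32_t window_size;  /* requested address window, in bytes */
} board_descriptor;

/* Decode 8 raw EEPROM bytes (assumed little-endian), as they would
   arrive from a serial read over the configuration bus. */
static board_descriptor decode_descriptor(const uint8_t raw[8])
{
    board_descriptor d;
    d.vendor_id   = (uint16_t)(raw[0] | (raw[1] << 8));
    d.device_id   = (uint16_t)(raw[2] | (raw[3] << 8));
    d.window_size = (uint32_t)raw[4]
                  | ((uint32_t)raw[5] << 8)
                  | ((uint32_t)raw[6] << 16)
                  | ((uint32_t)raw[7] << 24);
    return d;
}

/* Give every board a non-overlapping base address, aligned to its
   window size. Assumes each window_size is a non-zero power of two.
   Returns the first address left free after the last board. */
static uint32_t assign_bases(const board_descriptor *boards, size_t n,
                             uint32_t first_free, uint32_t *bases)
{
    uint32_t next = first_free;
    for (size_t i = 0; i < n; i++) {
        uint32_t size = boards[i].window_size;
        assert(size != 0 && (size & (size - 1)) == 0);
        next = (next + size - 1) & ~(size - 1);  /* align up */
        bases[i] = next;
        next += size;
    }
    return next;
}
```

The allocation is decided at boot, so two boards can never claim the same
address the way ISA cards did with port 0x300. The CPU-side code still has
to understand the descriptor format, which is the "device driver" problem
mentioned above.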

But it restricts the kind of interaction we could have with the device.

f-cpu tends to put everything through special registers.
So i would like to better know the pros and cons of that.

There are many points to consider.

First, we must NOT forget that there are "slow" I/O and "high-bandwidth" I/O.

High-bandwidth requires specific hardware that performs the data moves automatically
(a sort of DMA). Configuring this DMA typically requires "slow I/O" access to the
address registers (source + destination) and count register for example.
But this could be much more complex for ATA/IDE, USB or SCSI.
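
To make the split between the two kinds of I/O concrete, here is a small
C model of the simple case described above. It is only a sketch: the
dma_regs layout, the busy flag and the function names are invented for
the example, and memcpy stands in for the hardware that would move the
data on its own. Real controllers (ATA/IDE, USB, SCSI) are far more
involved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register block of a very simple DMA engine. */
typedef struct {
    const uint8_t *src;    /* source address register */
    uint8_t       *dst;    /* destination address register */
    uint32_t       count;  /* number of bytes left to move */
    uint32_t       busy;   /* 1 while a transfer is pending */
} dma_regs;

/* "Slow I/O" part: the CPU writes a few registers, one word at a time. */
static void dma_setup(dma_regs *r, const void *src, void *dst,
                      uint32_t count)
{
    r->src   = (const uint8_t *)src;
    r->dst   = (uint8_t *)dst;
    r->count = count;
    r->busy  = 1;
}

/* "High-bandwidth" part: stand-in for the hardware, which would move
   the whole block without any further help from the CPU. */
static void dma_run(dma_regs *r)
{
    if (!r->busy)
        return;
    memcpy(r->dst, r->src, r->count);
    r->src  += r->count;
    r->dst  += r->count;
    r->count = 0;
    r->busy  = 0;
}
```

Only dma_setup involves the CPU core, and it needs nothing more than three
or four single-word accesses; that is the part that "slow I/O" has to cover.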

The DMA for fast I/O has not been studied yet for F-CPU because it depends on
the kind of peripheral : Disk, network, even video. The DMA engine for each
case is best examined globally, and defining a "one-size-fits-all" DMA engine
could cause problems if a completely new kind of peripheral is created.

So all we do now is manage "slow I/O" at the CPU core level.
One has to consider :
- resource protection (access rights per thread or process)
- bandwidth separation (sending a single-word access request in the middle
of block-wise requests can cause slowdown and make control logic more complex)
- single-CPU vs multi-CPU (some I/O may be common to the whole system but
some others depend on each CPU, such as configuring the local clock or stuff like that)
- centralised vs per-CPU I/O (the PC centralises all accesses to the CPUs so
its I/O uses a single address space but an F-CPU system could have some specific
I/O per CPU, making a heterogeneous system)
- memory hierarchy coherency (caches and cachability) as well as access time
(it may be slower to access external I/O if it has to go through the same bus)
- what we want to access
- how we want to access (and memory address decoding can become very complex)
- synchronicity (I/O must be strictly ordered, while memory access can be reordered)
- the ability to emulate I/O (for example, emulating memory-mapped I/O is difficult,
it requires a modification to the TLB miss handler)
- the number of additional instructions

For F-CPU, we need something both simple and very flexible
that does not interfere with the rest. It is not impossible to
use memory-mapped I/O but its absence makes it easier
to design a simple high-bandwidth memory bus.

Furthermore, the CPU core and probably some integrated HW
need a way to be configured : for example, the clock generator,
or the external memory banks (SDRAM for example) need
some tuning and probably run-time modifications, so it does not
make sense to go through the whole memory hierarchy to manage that.

The GET and PUT instructions in F-CPU are similar to the
RDMSR and WRMSR instructions for x86 (they are local to the CPU).
IN and OUT are other instructions that apply globally to the system.
There is also a similarity with the "coprocessor" instructions
of the MIPS and ARM architectures which are used to
configure the internal registers (memory protection/TLB,
IRQ configuration...)
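
The idea of CPU-local special registers with access rights can be modelled
in a few lines of C. This is a sketch under stated assumptions, not the
F-CPU definition of GET and PUT: the number of registers, the per-register
permission flags, the supervisor argument and the return codes are all
invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SR_COUNT 4  /* hypothetical number of special registers */

enum { SR_OK = 0, SR_DENIED = -1, SR_INVALID = -2 };

/* One special-register file per CPU; nothing here goes through
   the memory hierarchy. */
typedef struct {
    uint64_t value[SR_COUNT];
    uint8_t  user_readable[SR_COUNT];  /* 1 if user code may read */
    uint8_t  user_writable[SR_COUNT];  /* 1 if user code may write */
} sr_file;

/* Model of a GET-like read: check the index, then the access right. */
static int sr_get(const sr_file *f, unsigned index, int supervisor,
                  uint64_t *out)
{
    if (index >= SR_COUNT)
        return SR_INVALID;
    if (!supervisor && !f->user_readable[index])
        return SR_DENIED;
    *out = f->value[index];
    return SR_OK;
}

/* Model of a PUT-like write, with the same checks. */
static int sr_put(sr_file *f, unsigned index, int supervisor,
                  uint64_t v)
{
    if (index >= SR_COUNT)
        return SR_INVALID;
    if (!supervisor && !f->user_writable[index])
        return SR_DENIED;
    f->value[index] = v;
    return SR_OK;
}
```

Because the register number is explicit in the access, checking rights or
emulating a register is a simple table lookup, whereas emulating a
memory-mapped register has to go through the TLB miss handler.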


I think this mail gives some ideas to discuss, but
i guess that you (nicO) have heard and read enough on
the matter of SRs to know the principal pros and cons
of this approach, since you have argued about it.
Here, i try to give a deeper insight and a general
overview, without saying "what shall and shall not be done",
but only "what is".

Happy new year,

nicO

YG
