
Re: [f-cpu] 15 MIPS FC0 emulator



hi !

devik wrote:

Hi,

today I've written part of my own emulator. Why ? Well, the simulator
is better, but it is not complete yet and is probably slow.

well, "probably" must be measured before undertaking this ...
My goal and plan is to make it scalable and working well before making it fast.
otherwise, it ends up with kludges and workarounds for problems that could
be solved simply by changing the original design.

I wanted an emulator that can emit stats on pipeline stalls, is fast,
and is simple enough to be completed in a few days.

if that was possible, you would have it already ;-)


I also want it for gcc performance tests, and to run Linux on it later.

as everybody does ...

It is basically a loop with a switch(), and each opcode is defined by a few
lines (the operation and the pipeline data for the scheduler).

did you do a copy and paste from the snapshot's source code ?
it contains a lot of "standard definitions" ....
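
For readers following the thread, a minimal sketch of such a loop-and-switch
core. This is not devik's code: the opcode names, the insn_t layout and the
cpu_t structure are hypothetical and are not the real F-CPU encodings. The
sketch assumes 64 registers of 64 bits, with register 0 reading as zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes and instruction layout, for illustration only. */
enum { OP_AND, OP_ADD, OP_HALT };

typedef struct {
    uint8_t op, dst, src1, src2;
} insn_t;

typedef struct {
    uint64_t r[64];     /* register file; r[0] is assumed to stay zero */
    uint64_t executed;  /* retired instruction count, for MIPS figures */
} cpu_t;

/* Execute until OP_HALT, an unknown opcode, or the end of the program.
   Returns the number of instructions executed so far. */
uint64_t run(cpu_t *cpu, const insn_t *prog, size_t n)
{
    for (size_t pc = 0; pc < n; pc++) {
        const insn_t *i = &prog[pc];
        uint64_t a = cpu->r[i->src1 & 63];
        uint64_t b = cpu->r[i->src2 & 63];
        uint64_t result;

        switch (i->op) {
        case OP_AND: result = a & b; break;
        case OP_ADD: result = a + b; break;
        case OP_HALT:
        default:     return cpu->executed;
        }
        if ((i->dst & 63) != 0)        /* writes to r0 are discarded */
            cpu->r[i->dst & 63] = result;
        cpu->executed++;
    }
    return cpu->executed;
}
```

Each new opcode is one more case in the switch; the per-opcode pipeline
data would live in a table indexed by the opcode.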

It is incomplete, but when tried with a loop of independent
64-bit ANDs it gives 15 MIPS on my PII/375.

wow, i hope it's fast enough :-)

SIMD ADD.W gives 12 MIPS, and ADD.W with saturation and carry
store gives 7 MIPS. It can still be optimized a bit.

before making it fast, what about making it accurate ?
what about the problem with signed saturations ?

It uses MMX where possible (it helps especially with .B and .W
SIMD ops). SSE would help, but there are far fewer machines
with SSE than with MMX (mine has no SSE, for example).

I hereby award you the prize for "Least Portable Code" of the F-CPU Project,
congratulations ;-)

by "portable", i mean that people with a Mac, a SPARC or other machines
can still compile and execute it. I sometimes use Pentium computers
(of the first generation) and i know that x86 is not the only architecture
on Earth.

Well, that's my usual rant. But you said you contribute to the Linux kernel
so such a hack does not surprise me :-) there might even be a few things
to learn. And if you don't hit the walls a few times, you won't understand
why i rant ;-) For example, now, it would be funny to have a 256 bit version,
or even, an emulator where you can indicate the register size as a command
line parameter .......

Adding further ops is simple, without needing to schedule them: just
specify the type (if not 2r1w) and the latency.

cool :-)

I'll continue working on it (it is only 200 lines right now) but wanted
to share the ideas with you :)

thanks !

It uses timestamping of register writes and a circular FIFO, so
that there are almost no loops in the critical path.

it detects both RAW and write port stalls.

it looks like a good idea. I have to read it more carefully
but it seems to be efficient.
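
One way such timestamping could work, as a rough sketch only (this is a guess
at the idea, not devik's implementation). Assumptions: in-order issue of one
instruction per cycle, a single register write port, and latencies between 1
and RING-1. The names sched_t, issue(), port_busy(), port_book() and RING are
all hypothetical. Each register records the cycle at which it will be
written, and a small ring indexed by cycle number records when the write
port is already booked.

```c
#include <stdint.h>

#define NREGS 64
#define RING  16   /* ring size; must exceed the largest latency */

typedef struct {
    uint64_t now;               /* earliest cycle the next op may issue */
    uint64_t ready_at[NREGS];   /* timestamp of each register's write */
    uint64_t port_cycle[RING];  /* cycle that each ring slot describes */
    uint8_t  port_used[RING];   /* 1 if the write port is booked then */
    uint64_t raw_stalls;        /* cycles lost waiting for sources */
    uint64_t port_stalls;       /* cycles lost waiting for the port */
} sched_t;

static int port_busy(const sched_t *s, uint64_t cycle)
{
    unsigned slot = (unsigned)(cycle % RING);
    return s->port_used[slot] && s->port_cycle[slot] == cycle;
}

static void port_book(sched_t *s, uint64_t cycle)
{
    unsigned slot = (unsigned)(cycle % RING);
    s->port_cycle[slot] = cycle;   /* overwrites a stale, past entry */
    s->port_used[slot] = 1;
}

/* Schedule one 2r1w operation; returns the cycle at which it issues. */
uint64_t issue(sched_t *s, int dst, int src1, int src2, unsigned latency)
{
    uint64_t t = s->now;

    /* RAW: wait until both source registers have been written. */
    if (s->ready_at[src1] > t) t = s->ready_at[src1];
    if (s->ready_at[src2] > t) t = s->ready_at[src2];
    s->raw_stalls += t - s->now;

    /* Write port: delay the issue while the completion cycle is taken. */
    uint64_t start = t;
    while (port_busy(s, t + latency))
        t++;
    s->port_stalls += t - start;

    port_book(s, t + latency);
    s->ready_at[dst] = t + latency;
    s->now = t + 1;
    return t;
}
```

The only loop runs when two results collide on the write port, so the common
case is a few compares and one ring update per instruction.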

devik

PS: It should be specified whether ADD saturation is signed or not !

As far as i remember, it is unsigned.
a signed version is however interesting and useful,
but designing it might be a problem.
Look at the code written by Michael in EU_ASU :
it is a very specific implementation of the add operation,
and Michael did incredible work fitting in as many
ops as possible.
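
To make the signed/unsigned difference concrete, here is a small portable
sketch in plain C (no MMX), assuming four 16-bit lanes packed in a 64-bit
word. The function names are hypothetical, and this is not the EU_ASU
algorithm, only a reference model of the two possible behaviours.

```c
#include <stdint.h>

/* Unsigned saturation: each 16-bit lane clamps at 0xFFFF. */
uint64_t add_w_sat_unsigned(uint64_t a, uint64_t b)
{
    uint64_t out = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t x = (uint32_t)(a >> (16 * lane)) & 0xFFFFu;
        uint32_t y = (uint32_t)(b >> (16 * lane)) & 0xFFFFu;
        uint32_t sum = x + y;
        if (sum > 0xFFFFu)
            sum = 0xFFFFu;
        out |= (uint64_t)sum << (16 * lane);
    }
    return out;
}

/* Signed saturation: each lane clamps to the range -32768 .. 32767. */
uint64_t add_w_sat_signed(uint64_t a, uint64_t b)
{
    uint64_t out = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t ux = (uint32_t)(a >> (16 * lane)) & 0xFFFFu;
        uint32_t uy = (uint32_t)(b >> (16 * lane)) & 0xFFFFu;
        /* Sign-extend 16 -> 32 bits without implementation-defined casts. */
        int32_t x = (int32_t)(ux ^ 0x8000u) - 0x8000;
        int32_t y = (int32_t)(uy ^ 0x8000u) - 0x8000;
        int32_t sum = x + y;
        if (sum > 32767)
            sum = 32767;
        if (sum < -32768)
            sum = -32768;
        out |= (uint64_t)((uint32_t)sum & 0xFFFFu) << (16 * lane);
    }
    return out;
}
```

The same bit patterns give different results: 0xFFFF + 0x0001 clamps to
0xFFFF when unsigned but is (-1) + 1 = 0 when signed, which is why the
manual has to state which one ADD implements.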

now, concerning the ROP2 operations, you can
look at the simulator code : it has a completely tested
version of the ROP2 code, with MUX and combine
included. Yes i know it's not as fast (and it can be
optimised in many ways) but i'm sure it can run
on the spare Pentium 100MHz in my house :-)

have fun,

YG

*************************************************************
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe f-cpu       in the body. http://f-cpu.seul.org/