Re: Re: [f-cpu] GCC 3.1 for F-CPU port



hi !

>De: devik
>> there are several F-CPU assemblers now.
>> even though i don't know which one to trust :-P
>> everyone with a specific syntax, unique features etc ...
>:) Are there reasons other than personal ego ? Like
>everyone wanting their own assembler ? And are these
>documented so that I can select one ?

nobody is "happy" with anyone else's SW.
unfortunately they are loosely documented,
and the only discriminating factor is whether it works ...

>> emulator is a big problem. it will take time
>> before it's completely solved but i am confident.
>
>I understand that it is a big piece of code. But is
>the principle so hairy ?

"in principle, no" :-)

One of the big limitations is C itself.
handling the large SIMD data is not easy, so a library
must be designed ... in practice C's integer types stop at 64 bits,
but F-CPU can be configured to handle wider registers.
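
A rough sketch of what such a library could look like, in C. This is
only an illustration: the 128-bit width, the wide_reg type and the
simd_add name are assumptions made for the example, not part of any
existing F-CPU emulator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register width for this sketch; a real F-CPU
   configuration could be wider or narrower. */
#define REG_BITS  128
#define REG_BYTES (REG_BITS / 8)

/* One emulated register, stored as raw bytes, least significant first. */
typedef struct {
    uint8_t b[REG_BYTES];
} wide_reg;

/* SIMD add: treat both registers as independent lanes of lane_bytes
   bytes (1, 2, 4 or 8) and add lane by lane, wrapping on overflow. */
static wide_reg simd_add(wide_reg x, wide_reg y, unsigned lane_bytes)
{
    wide_reg r;
    assert(lane_bytes == 1 || lane_bytes == 2 ||
           lane_bytes == 4 || lane_bytes == 8);
    for (unsigned lane = 0; lane < REG_BYTES; lane += lane_bytes) {
        unsigned carry = 0;
        for (unsigned i = 0; i < lane_bytes; i++) {
            unsigned sum = (unsigned)x.b[lane + i] + y.b[lane + i] + carry;
            r.b[lane + i] = (uint8_t)sum;
            carry = sum >> 8;   /* the carry never leaves its lane */
        }
    }
    return r;
}
```

Working byte by byte is slow but keeps the code independent of the
register width: changing REG_BITS is enough to emulate a wider machine.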

> I'd expect relatively simple
> code ... But there may be things I'm overlooking ..
yup :-)

> Probably memory and cache simulations are not so simple too.

if we ever go to that point ....

>> compiler ... well ... you seem to have taken over
>> the past efforts :-)))
>
>I didn't know about past efforts ! Huh... I could
>start from that and not from scratch then :(

have a look at http://www.f-cpu.de/gcc/

> I was learning gcc internals for 3 whole days before I coded it...

and as you see, it's neither easy nor well adapted ;-)
but recent GCC versions have shown some signs of
enhancements (support for more "modern" computers)

>> >linux kernel for example and count cycles it takes
>> >until /sbin/init is launched).
>>
>> i don't think that it is a good metric.
>> On top of that, there is no external HW ready.
>
>yep .. I only wanted to see linux booted on f-cpu ;-)

patience, patience ... ;-)

>Probably because I'm familiar with large parts of linux
>and wanted to try to understand arch-dependent parts too.
>I planned to steal HW emulation from bochs :)

huh....
do you really need to put some x86-related things in F-CPU ?...

>> However it can be interesting to code and run the
>> "primary boot monitor" (see at
>> http://f-cpu.seul.org/new/F-CPU_boot.txt )
>> and start bootstrapping stuffs from that point,
>> making simple "toy" or useful software, etc ...
>
>good idea.

nicO doesn't agree but most others do ;-P

> Maybe it could be a starting point for other tests.

yup. It requires a small amount of code
(at most a few thousands lines) and it will be able to
run smoothly with the first emulators and simulators
(no complex HW to simulate/emulate) under a Un*x
environment...

>I wanted to port linux also because when it "runs" you
>can benchmark other code on it (like to compile gcc on it).

you will be able to "boot" a bzImage and even load an initrd
from the primary boot environment. And if you really want
to have fun, you can try to program a multi-boot SW
and even a grub-like program :-)

>> >Also insns other than add and shift should be added (just
>> >now gcc uses its libs).
>>
>> ? i don't understand what that means ....
>> we can't do boolean or shift operations ?
>
>no no ... :-) I was just being lazy and implemented only
>mandatory patterns like movM, jump, jump_indirect, call
>and a few optionals like addM3, shiftM, extendM, compareM
>and all branches.

ouch....

>So I wanted to say that I have to add all others like
>logic, other shifts, rots ....
have fun ....

>Also recently I found that I need reload_inM and reload_outM,
>probably because when I tried to compile "vsprintf" it
>crashes when reloading registers - I used gen_reg_rtx
>in movM which is unfortunately not allowed when
>reload_in_progress is true. Have you played with these ?

if at least i knew what you are talking about ....

>> >There is problem with jump optimizer because it needs
>> >labels tied to jumps but we have them in registers.
>> the 'trick' is maybe to use a "macro" and the instruction
>> can be rescheduled by Cédric's assembler ...
>I have done it exactly how you said. :) The problem can be
>that we disable certain optimizations by this. If you were
>able to instruct gcc to emit the address load together
>with the jump, it would be able to do cse and factor the load
>out of the loop bodies. This is hard to do in the assembler
>because it would need to redo loop detection and expression
>lifespan analysis.

heck. but at least it works, no ?

>I've seen something like this in ia64.md, I'll look at it.

cool...

>> >Also conditional branching is not tuned - it suppresses
>> >loop optimization :(
>>
>> it seems that you do not use the same set of conditions
>> as is implemented (LSB, MSB, zero, instead of greater, etc.).
>
>I use only zero and not-zero accompanied with cmpxx. I didn't
>use MSB because there was discussion about its removal and LSB
>because I don't know good use for it yet ;-)

The problem with MSB is : how to deal with the cases
where F-CPU implements, say, 128-bit registers ?
until now, it's hardwired to bit 63 of the registers
(that's the sign bit ...) but i don't know what will
happen next.
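
To make the question concrete, here is a small C sketch of the two
possible behaviours. The function names and the array-of-words register
layout are assumptions made for the illustration only; nothing here
says which behaviour F-CPU will finally specify.

```c
#include <assert.h>
#include <stdint.h>

/* A register is modelled as an array of 64-bit words, least
   significant word first (assumed layout, for this sketch only). */

/* Hardwired variant: the "MSB" condition always looks at bit 63. */
static int bit63_set(const uint64_t *reg)
{
    return (int)(reg[0] >> 63);
}

/* Width-aware variant: look at the top bit of the whole register,
   where reg_bits is the configured width (a multiple of 64). */
static int msb_set(const uint64_t *reg, unsigned reg_bits)
{
    assert(reg_bits >= 64 && reg_bits % 64 == 0);
    return (int)(reg[reg_bits / 64 - 1] >> 63);
}
```

With 64-bit registers both tests agree; with 128-bit registers they
can give different answers for the same register contents.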

>> >    jmp_direct.nz a0,@L6
>> >    widen.d a3,a1
>> >    bseti log2(1048576),r0,a0
>>
>> wow .... uh ....
>
>I was too lazy to use "*" in define_expand instead of "@" to
>compute "log2_exact()" so I simply emit it for the assembler
>to handle ;-) I use bseti C,r0,r for constants over 0xffff
>which are powers of two.

i'm not sure it's going to be implemented soon (the shifter
is already quite large and i'm less optimistic about inserting
more logic in the critical datapath). But it can be macro'ed
anyway...
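
For reference, the selection rule quoted above (bseti only for
constants over 0xffff that are exact powers of two) can be sketched in
C as follows. The helper names are made up for the example and the
64-bit width is an assumption; this is not the code of the port.

```c
#include <assert.h>
#include <stdint.h>

/* Returns n if c == 2^n, or -1 if c is not a power of two. */
static int log2_exact(uint64_t c)
{
    if (c == 0 || (c & (c - 1)) != 0)
        return -1;
    int n = 0;
    while ((c >>= 1) != 0)
        n++;
    return n;
}

/* Rule from the mail: bseti is chosen only for constants above
   0xffff that are exact powers of two. */
static int use_bseti(uint64_t c)
{
    return c > 0xffff && log2_exact(c) >= 0;
}

/* Value produced by setting bit n in a zero register, which is what
   "bseti n,r0,r" is assumed to compute in this sketch. */
static uint64_t bseti_from_zero(unsigned n)
{
    assert(n < 64);
    return (uint64_t)1 << n;
}
```

For the constant in the listing, log2_exact(1048576) gives 20, so the
macro or the compiler could emit the bit index directly.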

>I played a lot with the code which generates constants. Now it
>is relatively clean, it just doesn't handle negative numbers
>efficiently.

i thought that loadconsx could do the trick...

> I'll need to do:
>nand r0,r0,t0
>; STALL
>loadcons.0 C,t0
>
>or
>
>loadcons.0 C,t0
>widen.d t0,t0
>; STALL
>
>while for positive the best is probably:
>move r0,t0
>loadcons.0 C,t0
>
>which is without possible stall. Of course the "stall" slot
>will be probably used by compiler but not always. Do you
>know better code for small negatives ?

i have not read Cédric's modifications to the manual,
but there is/was an instruction called "loadconsx"
which sign-extends the constants.
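
A C sketch of what sign-extending a 16-bit constant means, and of the
check a compiler could use to decide whether a single instruction is
enough. The 16-bit immediate and 64-bit register sizes and the helper
names are assumptions for the example; the manual remains the
reference for what loadconsx really does.

```c
#include <assert.h>
#include <stdint.h>

/* Does the value fit in a signed 16-bit immediate ? */
static int fits_simm16(int64_t v)
{
    return v >= -32768 && v <= 32767;
}

/* Encode a small value as the 16-bit immediate field. */
static uint16_t encode_simm16(int64_t v)
{
    assert(fits_simm16(v));
    return (uint16_t)v;
}

/* Sign-extend a 16-bit immediate to 64 bits: flip the sign bit,
   then subtract its weight (portable, no implementation-defined
   shifts of negative numbers). */
static uint64_t sext16(uint16_t imm)
{
    return (uint64_t)(((int64_t)imm ^ 0x8000) - 0x8000);
}
```

Under these assumptions a small negative such as -2 is encoded as
0xfffe and comes back as the full-width -2 after extension, without an
extra nand or widen step.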

>I'm thinking about this: If the RTL pass of the compiler allocates
>less than half of the temporary registers then there will be
>some free ones even after optimization. In such a case allocate
>one physical register and presume in patterns that it has
>value -1. Then use it during the combiner pass just like we
>use r0 but for other things. If it was really used, generate
>a new insn in the prolog part to assign -1 to it and let the second
>scheduler pass reschedule it.
>Then we save 1 cycle per use for no runtime penalty. Hehe.
>Take it as interesting reading only - I don't plan to implement
>this beast soon but it is interesting, isn't it ?

i'm not sure i understand everything.
but i know that gcc is not the kind of beast i want
to fight this week-end :-)

>> i have put the sources at
>> http://f-cpu.seul.org/new/gccfcpu.20021203.tgz
>
>I'll definitely look at it !

it's yours :-)
but http://f-cpu.seul.org/new/ contains a lot
of random design files, it's the last web location
that is still updated, if infrequently ;-)

> However I have to postpone
>any other hacking till Christmas because I have important
>project at my work.

have fun anyway !

>best regards,
>devik
>
>PS: do you know about new scheduler in gcc 3.3 ?? It should
>    schedule f-cpu almost perfectly

let's hope so.
but i know that the party is not going to end soon
and gcc still has a lot of problems, it is not well
adapted to F-CPU and new and additional scheduling techniques
must be implemented (like : reducing the overhead of the
prefetches, the number of pointers, etc ...)

and now, let's work ;-)

YG
