[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: FORTH plusminus, was Re: [f-cpu] Winograd DCT on my seul.org account

cedric wrote:
> Hi,
hi ! too

> > cedric wrote:
> > > > What i need is a DataFlow Graph analysis tool that allows me to
> > > > handle large chuncks of code with all the automated tedious tasks
> > > > that i usually do by hand (register allocation, reallocation,
> > > > interleaving...)
> > >
> > > What did you mean by reallocation is i memory operation or register
> > > reallocation ? If it's the second, the problem is the same as register
> > > allocation, and it exist some good algorithm for risc machine (with a lot
> > > of registers), that can do that better than a lot of human ;-)
> >
> > do you speak about register coloring ? the complexity grows at least O(n^2)
> > with the number of registers. the "manual" algo i use (and which can be
> > automated) is O(n), or linear complexity. give it more registers and it
> > will be even happier (contrarily to register coloring).
> Well, a lot of people will be interested by your algorithm I think, you will
> explain it to me "au first jeudi" ?

if you have enough time and if we find a quiet place (you know how much people
can shout there :-D)

> > >         For interleaving, it's, if I correctly understand what you mean
> > > scheduling,
> > instruction scheduling, to be exact (look at the end of the DCT paper)
> > > the problem is not easy,
> > what i do is straight-forward : it's like taking a string aaaaaaaaaaa and
> > bbbbbbbbbbb and interleaving it : abababababababababababababababab...
> I think that gcc can do that well, it's not very difficult, the solution is
> when you have 2 operations that are dependent, you must maximise the distance
> between them. That's all, I think ?
that's pretty well said. This also explains that when the degree of
parallelism (ILP, or with mutliple parallel pipelines) increased and the
latency increases too, the algorithm becomes more critical.

> I know that perhaps not always the best solution, but it good enough.
if you know a better solution, don't make me wait for an explanation ;-P

> > > and need a lot of time to have the right answer.
> > at least the time to type it ;-)
> So it's not so difficult ;-)

As i wrote in the source code of one of the DCT examples (5th version ?)
the is more or less what the P6 core does, but in the _reverse_ order,
which is more efficient (because no physical register is left unused).

> > > But what we can do is detecting where we lost cycle.
> > > (It's what I plan to do in my asm).
> > it's like calling the doctor who says : "you're sick" and goes away.
> Exactly, but actually the human are better than our algorithm, no ?
no, what i meant is : knowing the problem doesn't solve it.
If you can detect a pipeline stall, it won't tell you why it is there,
and how to remove it. OTOH an algorithm that creates good code _by_construction_
is more interesting because you don't have to check the result.

> > writing asm is ok for small code (like the DCT) but not for large code,
> > you need to not have to worry about the representation, only about the
> > function, and asm programming does not make that possible.
> > peephole optimisation might be easier with your tool, but that's all.
> > it doesn't solve the "big problems" which often come from superficial
> > analysis.
> And algorithm, but you know actually developper brain is too expensive, so you
> need a good optimisation but it must be done quickly and without the need of
> learning a new tool or language.

i don't understand what you want to say...

> > btw, do you come to the Libre Software Meeting in Bordeaux ?
> > will you present some of our work ?
> Yes of course, I have some problem with my pop at epita, so if some discussion
> append on it, I can't know them. But we will meet us at the "first jeudi" ?

of course i'll come to Les Halles.
But do you want to present something ? do you have a subject in mind ?

> A+
>   Cedric
To unsubscribe, send an e-mail to majordomo@seul.org with
unsubscribe f-cpu       in the body. http://f-cpu.seul.org/