
Re: [f-cpu] Running long things (was: Re: Yet Another Upload)



On Wed, Apr 23, 2003 at 09:09:58PM +0300, Yedidyah Bar-David wrote:
[...]
> > Some tests last 10 seconds, but Michael's tests are usually extremely
> > long with Vanilla (which itself is already slow software).
> 
> I guess you won't need my help for the 10 seconds stuff. What I hope
> is that the long stuff (e.g. above an hour) can be cut to e.g. 1 hour
> parts without too much overhead. This way, if a machine is rebooted
> in the middle of such a 1 hour job, we lose at most 1 hour, and with
> a sufficiently good batch scheduler (a hand-written script or something
> like PBS which we already use) resubmit it without manual intervention.

I have already considered checkpointing: whenever a testbench reaches a
certain milestone (say, every 1000 operations), it could write its
current status to a file, so that after an interruption the simulation
can `roll forward' and restart from the last checkpoint.
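
To make the checkpointing idea concrete, here is a minimal sketch in
Python. It is not the actual F-CPU testbench code: apply_operation() is a
hypothetical stand-in for one testbench operation, and the state layout,
the JSON file format and the stop_after parameter (which only simulates a
reboot) are assumptions made for illustration.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically, so a reboot mid-write keeps the old file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_op": 0, "seed": 1, "checksum": 0}


def apply_operation(state):
    """Hypothetical stand-in for one testbench operation."""
    state["seed"] = (state["seed"] * 1103515245 + 12345) % 2**31
    state["checksum"] = (state["checksum"] + state["seed"]) % 2**32
    state["next_op"] += 1


def run_testbench(path, total_ops, interval=1000, stop_after=None):
    """Run up to total_ops operations, checkpointing every `interval`.

    stop_after simulates a reboot after that many operations in this run;
    the function then returns None without saving.
    """
    state = load_checkpoint(path)
    done_this_run = 0
    while state["next_op"] < total_ops:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # work since the last checkpoint is lost
        apply_operation(state)
        done_this_run += 1
        if state["next_op"] % interval == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["checksum"]
```

Calling run_testbench() again with the same path picks up at the last
milestone, so an interrupted run loses at most `interval` operations.
The catch is that the saved state must be complete enough to regenerate
everything that follows, which is what makes such a testbench hard to
write.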

With a zoo of machines, I'd prefer the `f-cpu@home' approach: Cut the
testbenches into small slices and execute them in parallel.  The slices
must not be too small, however -- simulation setup time is rather long.
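
The slicing can be sketched as follows, again in Python and again only
as an illustration. The names make_slices(), stimulus() and run_slice()
are hypothetical, and the sketch assumes that the stimulus for operation
i can be derived from i alone, so that a slice does not depend on the
slices before it. The min_slice parameter stands for the smallest slice
that is still worth the setup time.

```python
def make_slices(total_ops, workers, min_slice):
    """Split range(total_ops) into at most `workers` contiguous slices.

    Fewer slices are produced if that is needed to keep each one at
    least min_slice operations long; very short jobs stay in one slice.
    """
    if total_ops <= 0:
        return []
    count = max(1, min(workers, total_ops // min_slice))
    base, extra = divmod(total_ops, count)
    slices, start = [], 0
    for i in range(count):
        end = start + base + (1 if i < extra else 0)
        slices.append((start, end))
        start = end
    return slices


def stimulus(i):
    """Hypothetical operand for operation i, derived from the index only."""
    return (i * 2654435761) % 2**32


def run_slice(start, end):
    """Stand-in for simulating operations start..end-1 on one machine."""
    checksum = 0
    for i in range(start, end):
        checksum = (checksum + stimulus(i)) % 2**32
    return checksum
```

Each (start, end) pair can then be submitted as a separate batch job.
With 10000 operations, 4 machines and min_slice=1000, this yields four
slices of 2500 operations each; with only 2500 operations it falls back
to two slices rather than creating slices shorter than min_slice.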

Unfortunately, both approaches require another kind of testbench which is
particularly hard to write.  Using a vector file isn't an option either --
these files can become really huge (yes, I tried it).  You would need a
second program that generates the vectors on the fly -- and then you'll
again have what we have now, just separated into two processes (which
is not necessarily faster than a single process, even on a dual-processor
machine).

-- 
 Michael "Tired" Riepe <Michael.Riepe@stud.uni-hannover.de>
 "All I wanna do is have a little fun before I die"