
Re: Sv: Strangest thing happened; it seems I'm perfectly normal. That doctor must be a fraud! I want my money back!

Bjarke Hammersholt Roune wrote:

>>Well, the file format is an open spec, so anyone can implement her own
>>writing code...
>>But I'm more and more leaning towards just assuming the tree is correct
>>(perhaps just doing some simple sanity checks in the debug version).
>why not just make it part of the spec how this data *must* be stored?

It's a matter of robustness - simply assuming the data PFile reads is
correct is bad. PFile has to either detect corrupt data or limit the damage
caused by it.

>>>If format 1 dir entries are all stored in an unsorted linked list, why do
>>>we need any sorting? An unsorted linked list is by definition not
>>We need to sort them on reading the dir info (by inserting the entries into
>>the tree). That's what I meant. In contrast we don't need to do any sorting
>>when reading a type 0 Pak, because its dir structures are already sorted.
>This strikes me as a rather strange way to do things. If it's going to be
>sorted anyway at some point, why not just sort them at write-time instead of
>read-time? You just need to be able to specify all the files that go in a pak
>format 1 when writing it at once for this to be efficient. You would still

Yup, right. But as soon as you add files the sorting is destroyed (it would
of course be possible to re-sort everything each time a file is added, but
that's rather slow).

>Btw, how would you make adding files to a format 1 pak work? I mean, the
>data saying what is in the pak file is in the head of the file. This chunk

Nope. The dir info can be spread wildly over the entire PakFile (each entry
just contains a "pointer" to the next entry). That means entries for new
files are just appended to the Pak (and linked to the list of course).
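
To make the appending idea concrete, here is a minimal in-memory sketch. It
is not the actual format 1 layout: the DirEntry and Pak structs, the fixed
array standing in for the file, and all the pak_* names are made up for
illustration. Each entry only knows where the next one is, so adding a file
means writing one entry at the end and patching the old tail's "pointer".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAK_MAX_ENTRIES 64
#define PAK_NO_NEXT ((size_t)-1)

/* Hypothetical dir entry: a name plus the position of the next entry. */
typedef struct {
    char   name[32];
    size_t next;   /* index of the next entry, PAK_NO_NEXT for the last one */
} DirEntry;

/* Hypothetical stand-in for a Pak: the array plays the role of the file. */
typedef struct {
    DirEntry entries[PAK_MAX_ENTRIES];
    size_t   count;  /* entries written so far (the "end of file") */
    size_t   head;   /* first entry of the list */
    size_t   tail;   /* last entry of the list */
} Pak;

static void pak_init(Pak *p)
{
    p->count = 0;
    p->head = PAK_NO_NEXT;
    p->tail = PAK_NO_NEXT;
}

/* Append a new entry at the end and link it to the list. Returns 0 on
   success, -1 if the Pak is full or the name does not fit. */
static int pak_append(Pak *p, const char *name)
{
    assert(p != NULL && name != NULL);
    if (p->count >= PAK_MAX_ENTRIES ||
        strlen(name) >= sizeof p->entries[0].name)
        return -1;

    size_t pos = p->count++;
    strcpy(p->entries[pos].name, name);
    p->entries[pos].next = PAK_NO_NEXT;

    if (p->tail == PAK_NO_NEXT)
        p->head = pos;                  /* first entry ever */
    else
        p->entries[p->tail].next = pos; /* patch the old tail's pointer */
    p->tail = pos;
    return 0;
}

/* Walk the list from the head and count the linked entries. */
static size_t pak_count_linked(const Pak *p)
{
    size_t n = 0;
    for (size_t i = p->head; i != PAK_NO_NEXT; i = p->entries[i].next)
        n++;
    return n;
}
```

Nothing already written has to move; only one existing "next" field changes
per added file.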

>>>You haven't touched these issues which I raise in my Email(s):
>>>1. Implementing a complete binary tree instead of the current incomplete
>>Could be useful for format 0, together with 2. Classification: optimization
>>to be explored deeper when the main functionality is in place and working.
>Well, actually I haven't got anything to do right now, except for that
>benchmark thing. I'd be happy to implement it. I've already figured out how
>to add and remove files while keeping the tree complete at all times (ok, so
>I probably could've found that info on the net, but didn't have anything to

Do you have some code for this, or can you describe in detail how
you're implementing a complete tree? I'm interested because you are right
on this point:

>This also makes it MUCH easier to store the tree: just dump the array to the

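For reference, this is roughly what I understand by "the tree is the array".
It is only a sketch of the well-known implicit layout (children of slot i
live in slots 2i+1 and 2i+2), built once from an already sorted name list.
It is not your add/remove scheme, and build_complete_tree and find_in_tree
are names I made up here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In-order walk over the implicit tree shape: visiting the slots in sorted
   order and handing out the sorted names one by one yields a search tree.
   Returns the number of names consumed so far. */
static size_t fill_in_order(const char **sorted, size_t next,
                            const char **tree, size_t n, size_t i)
{
    if (i >= n)
        return next;
    next = fill_in_order(sorted, next, tree, n, 2 * i + 1); /* left subtree */
    tree[i] = sorted[next++];                               /* this node */
    return fill_in_order(sorted, next, tree, n, 2 * i + 2); /* right subtree */
}

/* Build a complete binary search tree of n names inside tree[0..n-1].
   sorted[] must be in ascending strcmp() order. */
static void build_complete_tree(const char **sorted, const char **tree,
                                size_t n)
{
    size_t used = fill_in_order(sorted, 0, tree, n, 0);
    assert(used == n);
    (void)used;
}

/* Look up key; returns its slot index or -1 if it is not in the tree. */
static long find_in_tree(const char **tree, size_t n, const char *key)
{
    size_t i = 0;
    while (i < n) {
        int c = strcmp(key, tree[i]);
        if (c == 0)
            return (long)i;
        i = (c < 0) ? 2 * i + 1 : 2 * i + 2; /* go left or right */
    }
    return -1;
}
```

Since there are no child pointers, writing tree[0..n-1] out in slot order
and reading it back gives the same tree again.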

>>>4. Using separate trees for files and dirs (becomes feasible if the tree is
>>Useless. Have a look at how the files of a typical game are organized - a
>>directory usually contains either (almost) only subdirectories or (almost)
>>only files. Especially the large directories (several 100 entries) contain
>>only files.
>Usually, as you have correctly observed, data is stored in sub-directories
>of the main game dir, sometimes data is stored 2-3-4 levels from the main
>game dir. In other words, the directory data is going to get accessed A LOT
>of times. Keeping directory data by itself therefore gives a performance

If the game does hundreds of accesses in a row to files in the same subdir,
each time with the full path, then it deserves to get bad performance.
A simple
	ppfChdir ("the/sub/dir/");
	ppfOpen ("thousands.of.files");
makes all your performance fears here obsolete :)

>Often, you will have a data dir with a lot of files, and then a few
>directories. In such a setup, the benefit becomes more profound.

Not often, seldom.
And - same as above. Either the accesses to that subdir happen only once in
a while (=> the speed doesn't matter that much) or many in a row
(=> ppfChdir () to the subdir eliminates the need for those lookups).
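
A back-of-the-envelope sketch of that argument, assuming one tree lookup per
path component (my simplification, not a measured property of PFile). The
helper names are invented for this example and are not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Count the non-empty components of a '/'-separated path. */
static size_t count_components(const char *path)
{
    size_t n = 0;
    int in_segment = 0;

    assert(path != NULL);
    for (; *path != '\0'; path++) {
        if (*path == '/') {
            in_segment = 0;
        } else if (!in_segment) {
            in_segment = 1;
            n++;
        }
    }
    return n;
}

/* Lookups needed when every open passes the full path: each open resolves
   all directory components plus the file name itself. */
static size_t lookups_full_path(const char *dir, size_t files)
{
    return files * (count_components(dir) + 1);
}

/* Lookups needed when the directory is resolved once up front and every
   open only has to find the file name. */
static size_t lookups_with_chdir(const char *dir, size_t files)
{
    return count_components(dir) + files;
}
```

Under that assumption, a thousand opens in "the/sub/dir/" cost about 4000
lookups with full paths and about 1003 after changing into the directory
first.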


Drive A: not responding...Formatting C: instead