[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Strangest thing happened; it seems I'm perfectly normal. That doctor must be a fraud! I want my money back!



Bjarke Hammersholt Roune wrote:

>>>The current implementation of ppf_DirTree can handle a preconstructed tree
>>>by passing it the root node of this tree. ppf_DirTree becomes read-only in
>>>this case, though (why?). This can easily be implemented with this
>>>
>>Because if the preconstucted tree doesn't satisfy the "search tree" or
>>"balanced" requirements changing it could get very messy.
>>
>If its unbalanced or not even a search tree, and that is a problem (which it
>is), why not just MAKE it a balanced binary search tree (and if my idea is
>used, also complete)? How would you end up with suchs a tree, anyway?

Well, the file format is an open spec, so anyone can implement her own
writing code...
But I'm more and more leaning towards just assuming the tree is correct
(perhaps just doing some simple sanity checks in the debug version).

>>I'll have a second look at this when I'll implement the overloading thing
>>however...
>>
>We want overloaded mounting to be as fast as, or nearly as fast as,
>non-overloaded mounting. If we were to merge the trees of all the paks we

Most of all we want file opening to be fast. Mounting performance is
secondary.

>I think keeping the trees for seperate paks seperate is a more elegant way
>to handle this issue and avoid the large-scale sorting and file-tracking,
>not to talk about the mess, involved in merging and unmerging the pak file
>structure at every mount or unmount (or mount/unmount session). Implementing
>things this way would be very easy: When looking for a file (let's say we've
>already found the appropiate directory), that file would first be searched
>for in the tree which belong to the pak with the higest priority, if its
>there, we found our file and we can stop searching. If its not there, we go
>on to the tree with the next higest priority and search there, and if its
>there, we found our file and can stop searching. If it isn't there, we go on
>to the tree with the third higest priority and so on untill we either find
>the file or run out of trees.
>
>In this implementation, mounting is as simple as adding the file/directory
>trees for that pak to the rest of the file/directory trees and give it a
>priority. Files that are supposed to be hiddeno (overloaded) by any of those
>trees will be so automaticly. Unmounting is as simple as removing the trees
>of the pak being unmounted. Any files or directories previously hidden
>(overloaded) by the pak will automaticly pop into view again. Notice that no
>sorting what-so-ever is needed.
>
>I find this implementation effecient, elegant, easy to code and OOP'y
>(things take care of themselves and does so with minimum effort). What more
>can you ask for? We also get rid of that annoying mount/unmont session
>thingy.

Well, it's not as fast as it could be, but it's *much* simpler, right.

>Notice that hashing would really speed up things in an implementation of
>this kind, as we will be doing searches in several trees instead of just 1.
>With hashing, the performance loss should be minimal, though, and in any
>case it would be spread out, in contrast to the performance loss related to
>sorting, which would be all focused at the same moments (at every mount or
>unmount, or mount/unmount session).

Hmmm, I'll have to think about that...
Some speed benchmark results  comparing the speed of added hashing vs.
plain trees for small/mediun/large directories would be helpful here.

>>For format 0, yes. format 1 stores all dir entries in an unsorted linked
>>list, so at mounting time there has to be done some sorting. But format 1
>>paks won't be very big anyway, so this doesn't matter.
>>
>I thought format 1 paks were supposed to be used by developers while
>developing the game so they could easily alter their paks without having to
>recompile them all the time. You mean format 1 paks are also supposed to
>find their way into to the release versions?

Yup. For prototyping some game I wouldn't even recommend using Format #1
Paks, as they only allow *adding* of files, not removing or changing.

Format #1 Paks are intended for data the game creates on the fly, e.g.
savegames.

>Why would format 1 paks be small? I mean, what's stopping them from being
>large? I don't remember any size restrictions being mentioned in the
>documentation...

Well, nothing technical prevents you from taking part in a formula 1
contest with a VW beetle - but you should not be disappointed if you lose.
Format 1 is slow, that's stated clearly in the docs. If someone uses it for
big Paks it's his own fault.
So format 1 Paks usually *are* small.

>If format 1 dir entries are all stored in an unsorted linked list, why would
>we need any sorting? An unsorted linked list is by definition not sorted...

We need to sort them on reading the dir info (by inserting the entries into
the tree). That's what I meant. In contrast we don't need to do any sorting
when reading a type 0 Pak, because its dir structures are already sorted.

>>Well, but then you have to do the same processing on the string you compare
>>with (the one you're searching) -
>>
>Well, this would of course be done before-hand, when the pak file was being
>created. If it wasn't, this kind of hashing would be rather senseless.

No, I mean the hash function has to be run over the respective part of
the path given to ppfOpen ().

>You would have to hash the filename of the file you were searching for prior
>to the search, but as I said above, the filenames you would be searching
>against would already have been hashed.

exactly that ;)

>>>Also, I was thinking about the fact that there usually aren't that many
>>>directories. Perhaps it would be more effecient to store directories in
>>>another kind of structure than a tree?
>>>
>>For example?
>>Arrays and lists are clearly slower.
>>
>Well, I haven't actually come up with anything yet on this... I can say,
>though, that hashing for directories probably wouldn't be a good idea.

right.

>You haven't touched these issues which I raise in my Email(s):
>
>1. Implementing a complete binary tree instead of the current incomplete
>one.

Could be useful for format 0, together with 2. Classification: optimization
to be explored deeper when the main functionality is in place and working.

>2. Storing the nodes in arrays rather than dynamicly allocating each and
>every one (becomes feasible if the tree is complete).


>3. Implementing the tree as a template.

certainly not now. I still believe that making the thing generic would slow
things down. But a generic version of it should be created for other uses,
yes.

>4. Using seperate trees for files and dirs (become feasible if the tree is
>templated).

Useless. Have a look at how the files of a typical game are organized - in
a directory are usually either (almost) only subdirectories or (almost)
only files. Especially the large directories (several 100 entries) contain
only files.

>5. Mounting in sessions (bad idea)

right ;)


	Christian
--

Drive A: not responding...Formatting C: instead