
Re: Re: A little more exact specification of the hash-table



>>That's why it says "in the state it currently is". Why wouldn't the
>>container (or, if the container isn't responsible for this blank-space
>>behavior, the calling code) be aware of this and include it in the resulting
>>value? You just need to tell it "I need info for format 0" or "I need info
>>for format 1".
>
>Right. But with Format 1 e.g. it isn't guaranteed that all entries of a
>given directory are stored en bloc (i.e. entries added later on are
>somewhere else.)
>
Yes. I think it would then perhaps be beneficial to simply handle format 1
writing without implicating the container at all. See below.

>>>Bad. Again problems with the different Pak formats. Not only serialization
>>>is required but also updating of an existing in-pak-structure.
>>>
>>How would you implement that? Keep a change log? Keep two in-memory
>>structures, one altered and one not, and compute the difference and only
>>
>Nope. Write on access.
>
Then the "ppfAddFile()" or whatever function is used for that can update the
pak. This way, the container only needs to serialize to format 0. It'll need
to serialize FROM a whole array of sources, but that's not that big a
problem.

Btw, I think the class that keeps track of files (ppf_File?), and the one
for dirs, should have a member WriteToStream(). The naming isn't important,
of course. The point is that the file/dir class itself should know how to
serialize itself. That removes the dependency where code external to the
file/dir class has to know that class' data layout in order to dump it to
the file. Code external to the file/dir class doesn't care exactly how the
class is serialized, it just wants it to be.
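
A rough sketch of what I mean, nothing more. All names here (ByteSink,
MemorySink, FileEntry, DirEntry) and the on-disk layout are made up for
illustration; ByteSink just stands in for whatever handle type we end up
writing through.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical output handle: anything that accepts raw bytes.
struct ByteSink {
    virtual ~ByteSink() = default;
    virtual void Write(const void* data, std::size_t size) = 0;
};

// In-memory sink, only here so the sketch can be exercised.
struct MemorySink : ByteSink {
    std::vector<unsigned char> bytes;
    void Write(const void* data, std::size_t size) override {
        const unsigned char* p = static_cast<const unsigned char*>(data);
        bytes.insert(bytes.end(), p, p + size);
    }
};

// Writes a 32-bit value in little-endian order, whatever the host order is.
inline void WriteU32(ByteSink& out, std::uint32_t v) {
    const unsigned char b[4] = {
        static_cast<unsigned char>(v & 0xFF),
        static_cast<unsigned char>((v >> 8) & 0xFF),
        static_cast<unsigned char>((v >> 16) & 0xFF),
        static_cast<unsigned char>((v >> 24) & 0xFF)};
    out.Write(b, sizeof b);
}

// Writes a length-prefixed string (assumed layout, not a real Pak format).
inline void WriteString(ByteSink& out, const std::string& s) {
    assert(s.size() <= 0xFFFFFFFFu);
    WriteU32(out, static_cast<std::uint32_t>(s.size()));
    out.Write(s.data(), s.size());
}

// Hypothetical file entry: it alone knows how its fields are laid out.
class FileEntry {
public:
    FileEntry(std::string name, std::uint32_t offset, std::uint32_t size)
        : name_(std::move(name)), offset_(offset), size_(size) {}
    void WriteToStream(ByteSink& out) const {
        WriteString(out, name_);
        WriteU32(out, offset_);
        WriteU32(out, size_);
    }
private:
    std::string name_;
    std::uint32_t offset_;
    std::uint32_t size_;
};

// Hypothetical dir entry: writes its own header, then delegates to its files.
class DirEntry {
public:
    explicit DirEntry(std::string name) : name_(std::move(name)) {}
    void AddFile(FileEntry f) { files_.push_back(std::move(f)); }
    void WriteToStream(ByteSink& out) const {
        WriteString(out, name_);
        WriteU32(out, static_cast<std::uint32_t>(files_.size()));
        for (const FileEntry& f : files_) f.WriteToStream(out);
    }
private:
    std::string name_;
    std::vector<FileEntry> files_;
};
```

The container then only calls WriteToStream() on each entry and never
touches the individual fields.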

I still think having two separate containers for dirs and files is by far
the best implementation. This could all be done internally to the container
class, so the rest of PPlay wouldn't have to know. We'd be saving memory
too, as we wouldn't have to store a bool telling us whether an entry was a
file or dir, and we wouldn't have to size the storage for all entries so
that they can hold whichever one is largest. It's just a big mess having both
structures in the same array, even if they need to be in the same container.

Keeping the arrays separate has only one drawback: if it isn't known
whether a path is a file or dir, you have to look in both tables, doubling
look-up time. However, I think this is largely inconsequential, since only
files can be the last part of a path (you can't open or write to
directories), and only directories can be the first part(s) of a path. I
just don't see a case where you wouldn't know whether it's a directory or a
file you're looking for, and even then, lookups ARE extremely fast. I
remember you being against this with trees, though I can't remember why.
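
To illustrate the lookup argument, here is a minimal sketch, assuming one
table for dirs and one for files per directory. Dir, File, AddDir and
FindFile are hypothetical names, and std::unordered_map only stands in for
our own hash-table. Every path component except the last is looked up in
the dir table only, and the last one in the file table only, so no single
component ever needs two lookups.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical file record; a real one would carry more than a size.
struct File {
    std::uint32_t size;
};

// Hypothetical directory: two separate tables, no "is this a dir?" flag.
struct Dir {
    std::unordered_map<std::string, std::unique_ptr<Dir>> dirs;
    std::unordered_map<std::string, File> files;
};

// Returns the named subdirectory of parent, creating it if needed.
Dir& AddDir(Dir& parent, const std::string& name) {
    assert(!name.empty());
    std::unique_ptr<Dir>& slot = parent.dirs[name];
    if (!slot) slot.reset(new Dir());
    return *slot;
}

// Resolves a '/'-separated path relative to root.
// Returns nullptr if any directory or the final file is missing.
const File* FindFile(const Dir& root, const std::string& path) {
    const Dir* cur = &root;
    std::size_t start = 0;
    for (;;) {
        const std::size_t slash = path.find('/', start);
        if (slash == std::string::npos) break;  // only the file name is left
        auto d = cur->dirs.find(path.substr(start, slash - start));
        if (d == cur->dirs.end()) return nullptr;
        cur = d->second.get();
        start = slash + 1;
    }
    auto f = cur->files.find(path.substr(start));
    return f == cur->files.end() ? nullptr : &f->second;
}
```

Asking for "data/gfx" through FindFile() simply fails, because "gfx" is
only ever searched for in the file table of "data".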

Btw, if we are going to do serializing, should I implement it using a
standard library file-handle, a PFile file handle or a standard library
stream class? I'd go for a PFile file-handle. Else it would be impossible to
write to any alternative file systems.

>>Now, all this work, and then the client really only needs to access 3 files,
>>and it doesn't even get to do that, because right after the first access,
>>the internet directory is changed... :(
>
>Unimportant. Reading ftp dirs is slow anyway:
>(1) get the dir listing over the net
>(2) parse the text and extract file names and their attributes
>(3) construct the proper file and dir objects
>
>This will take far longer than reorganizing the hash
>
Agreed. Later, perhaps it would be an idea to store these differently (like
in a sorted array and using binary search), but this is rarely time-critical
stuff, and we can't even actually connect to an FTP server yet... -> consider
later
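
For the record, the sorted-array idea could look roughly like this. It is
only a sketch: Listing is a hypothetical name, and it assumes the whole
directory listing arrives at once, so it can be sorted a single time
instead of being inserted entry by entry.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical read-mostly listing of one remote directory.
struct Listing {
    std::vector<std::string> names;  // always kept sorted

    // Takes the parsed names in any order and sorts them once.
    void Build(std::vector<std::string> raw) {
        std::sort(raw.begin(), raw.end());
        names = std::move(raw);
    }

    // Binary search over the sorted names: O(log n) comparisons.
    bool Contains(const std::string& name) const {
        assert(std::is_sorted(names.begin(), names.end()));
        return std::binary_search(names.begin(), names.end(), name);
    }
};
```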

With these issues sorted out, I should be able to implement the
hash-table. At least most of it (more issues might come up that I hadn't
thought of).