
Re: Progress Update



Bjarke Hammersholt Roune wrote:
>>How costly is the HashTable->DynHashTable conversion?
>>
>Well, to get data from a HashTable into a DynHashTable is almost free. All

Good.

>done, the job of the DynHashTable is finished. Then the HashTable needs to
>sort the array (it now uses qsort()), and then index it (ie, create a hash
>look-up table).

>Combined, this is O(n).

? qsort is O(n log(n))
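To make the complexity point concrete, here is a rough sketch of the sort-then-index step. It is not the actual HashTable code: `Entry`, `SortAndIndex` and the bucket layout are made up for illustration, and std::sort stands in for qsort(). The sort costs O(n log n); only the indexing pass afterwards is linear.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical entry type; the real HashTable entries look different.
struct Entry {
    unsigned hash;   // precomputed hash of the key
    int      value;
};

// Sorts the entries by bucket (O(n log n)), then records in linear time
// where each bucket starts. The returned index has bucketCount + 1 slots:
// bucket b occupies entries[index[b]] .. entries[index[b + 1] - 1].
std::vector<std::size_t> SortAndIndex(std::vector<Entry>& entries,
                                      unsigned bucketCount)
{
    assert(bucketCount > 0);
    std::sort(entries.begin(), entries.end(),
              [bucketCount](const Entry& a, const Entry& b) {
                  return a.hash % bucketCount < b.hash % bucketCount;
              });

    std::vector<std::size_t> index(bucketCount + 1, 0);
    // Count the entries per bucket, shifted by one slot.
    for (const Entry& e : entries)
        ++index[e.hash % bucketCount + 1];
    // A prefix sum turns the counts into start offsets.
    for (unsigned b = 0; b < bucketCount; ++b)
        index[b + 1] += index[b];
    return index;
}
```
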

>>I'm asking because the current system supports mounting some fs at dir a/
>>and then mounting another fs at a/b/ (i.e. in the "area" occupied by the
>>first fs). For that a directory has to be added to the hash of dir a/
>>
>>But I'm not sure anymore if this is really a useful feature....
>>
>Hmm... Wouldn't situations in which there are overlapping sub-directories
>get nasty? I mean, if there is an a/b/c and then you mount a b at a, and
>this b contains a c? (ie, there is already a/b/c, and you mount a directory
>called b at a, and this b directory contains a c).

That's a different thing. I meant just mounting where no collisions happen
(the current VFS code supports this).

Allowing overriding of pak parts by other paks is however also a feature
very nice for updates - no need to patch the Paks, just tell the game to
also mount the update Paks (only containing the modified files).

But on the other hand changes in the game data are quite rare with updates,
so this feature might be unnecessary. Dunno. We'll go without it for now
and perhaps add it later. Perhaps.
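For reference, the override idea could look roughly like the sketch below. It is purely illustrative and assumes nothing about the current VFS code: `PakSketch` and `OverlaySketch` are invented names, and a std::map stands in for a Pak. Paks are searched in reverse mount order, so an update Pak mounted last shadows the files it replaces.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a mounted Pak: file path -> file contents.
typedef std::map<std::string, std::string> PakSketch;

class OverlaySketch {
public:
    // Later mounts take priority over earlier ones.
    void Mount(const PakSketch& pak) { m_paks.push_back(pak); }

    // Returns the contents from the most recently mounted Pak that
    // contains the path, or nullptr if no Pak has it.
    const std::string* Find(const std::string& path) const {
        assert(!path.empty());
        for (std::size_t i = m_paks.size(); i > 0; --i) {
            PakSketch::const_iterator it = m_paks[i - 1].find(path);
            if (it != m_paks[i - 1].end())
                return &it->second;
        }
        return nullptr;
    }

private:
    std::vector<PakSketch> m_paks;
};
```
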

>The MISC class (template parameter) takes care of serialization. Ie, it must
>define static members SerializeIn(ENTRY_TYPE, FILE*) and
>SerializeOut(ENTRY_TYPE, FILE*). I did it that way because it allowed me to

Ah, ok.

>get by without altering anything else (compatibility with existing code). It
>wouldn't take me 5 minutes to move functionality from MISC into entry
>methods.

I think I actually like the current scheme (with the Misc class). Together
with some typedefs (PakF0ConstructionHashTable etc) it's better than
"hardwiring" the info in the entries.

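A minimal sketch of that scheme as I understand it. The names `IntEntry`, `IntEntryMisc` and `TableSketch` are hypothetical, and I pass the entry by reference and return a bool, which the real SerializeIn()/SerializeOut() may not do:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical entry type for illustration.
struct IntEntry {
    int value;
};

// Hypothetical "Misc" policy class: it alone knows how to read and
// write an IntEntry, so the entry type stays free of I/O code.
struct IntEntryMisc {
    static bool SerializeOut(const IntEntry& entry, std::FILE* file) {
        assert(file != nullptr);
        return std::fwrite(&entry.value, sizeof entry.value, 1, file) == 1;
    }
    static bool SerializeIn(IntEntry& entry, std::FILE* file) {
        assert(file != nullptr);
        return std::fread(&entry.value, sizeof entry.value, 1, file) == 1;
    }
};

// Container sketch: all serialization is delegated to the MISC parameter.
template <class ENTRY_TYPE, class MISC>
class TableSketch {
public:
    static bool WriteAll(const ENTRY_TYPE* entries, std::size_t count,
                         std::FILE* file) {
        for (std::size_t i = 0; i < count; ++i)
            if (!MISC::SerializeOut(entries[i], file))
                return false;
        return true;
    }
    static bool ReadAll(ENTRY_TYPE* entries, std::size_t count,
                        std::FILE* file) {
        for (std::size_t i = 0; i < count; ++i)
            if (!MISC::SerializeIn(entries[i], file))
                return false;
        return true;
    }
};

// A typedef hides the policy class from client code.
typedef TableSketch<IntEntry, IntEntryMisc> IntTable;
```

Client code then only sees the typedef, which is why this reads better than hardwiring the I/O into each entry type.
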
>>(1) New prefix/namespace convention
>>
>I think this is pretty critical. From my experience with the allocators, it's
>almost as easy as clicking the "replace" menu option (or whatever you use).
>There's a lot of code to do it to, though.

Well, and the code for importing stuff from pp::internal to pp has to be
added etc. But I guess you're right. It's not *that* much, partly because
for PFile the code is either only in pp::internal (everything except the
API functions) or in global space (the API functions). The picture is
different with PSound however.

>>(3) Improving the headers, including defining the above macros plus some
>>more
>>
>Well, I suspect that to be a somewhat minor job.

Perhaps. I guess I'll just have to seriously start with it...

>>(4) Adding the new URL system to PFile
>>
>That's a little bit worse. Stuff using the old APIs should still run, build
>and compile, though.

Nothing uses the old APIs anymore. But I guess the HashTable isn't
completely fit for the new one yet. <checking> Well, some simple changes
(only in the Misc thing as it seems) and it's done.

>>(5) Adding the new Hash container to PFile
>>
>The compiler will make it clear where this needs to be done, and I plan to
>make the new interface of the Directory class correspond somewhat 1:1 to the
>old, just with updated names. 

ppf_DirEntry, ppf_ReadDir* (), ppf_TGlobalData and the ppf_PakFileW*
classes have some dependencies on the tree stuff. But mostly that's very
simple.

>To use the new Directory class, though, client
>code will have to use the new URL system.

It does.

>>(3) requires some more planning, but it's a prerequisite for (2)
>>
>I think getting the main headers right is pretty important. It gets
>exponentially harder to alter them as implementation continues. What exactly
>do you have in mind?

Primarily adding those Win32-handling macros. Plus some more things, but I
haven't thought much about it yet. I'll do that in the next few days.

>>(1) requires some testing/planning on how to best do the
>>    export-from-internal-to-pp and export-to-global thing
>>
>First, I thought having a header that included all external stuff from
>::pp::internal to ::pp was the best way to go. Now, I'm more for simply, at
>the bottom of each header file, including the external stuff that was
>defined in that header.

Agreed. Also my preference.
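Roughly like this, as a sketch only. The header and the function names (`HashString`, `ReduceToBucket`) are invented for the example; only the pp / pp::internal split is what we discussed:

```cpp
#include <cassert>
#include <cstddef>

// --- sketch of a single (hypothetical) header ---
namespace pp {
namespace internal {

// External: users should see this as pp::HashString.
inline unsigned HashString(const char* str, std::size_t length) {
    assert(str != nullptr || length == 0);
    unsigned hash = 0;
    for (std::size_t i = 0; i < length; ++i)
        hash = hash * 31u + static_cast<unsigned char>(str[i]);
    return hash;
}

// Internal helper: stays reachable only as pp::internal::ReduceToBucket.
inline unsigned ReduceToBucket(unsigned hash, unsigned bucketCount) {
    assert(bucketCount > 0);
    return hash % bucketCount;
}

} // namespace internal

// Bottom of the header: export just the external symbols into pp.
using internal::HashString;

} // namespace pp
```

The using-declarations sit next to the definitions they export, so adding an external symbol never means touching a second, central header.
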

>perceps allows for a good deal of customisation. I think we should have a
>way of telling whether a symbol is internal or external, and make it possible
>to generate documentation for only external stuff. Many people won't care
>too much about the internals.

I guess so. I haven't looked too much at Perceps yet (mainly because it
lacks good docs ;)

>Yeah... The new URL handling is quite well thought out (compliment), and
>quite simple. I think it's actually quite hard to create a bug with it.

Thx

>>(5) has to be integrated properly with the rest of PFile. Completing (1)
>>    beforehand makes this easier. An alternative would be to change the
>>    hash/allocator naming/scoping to the old scheme for testing/integration
>>
>We could also simply use typedefs, so even though stuff hasn't really been
>changed, it will look like that to new code. That's what I do to get
>Directory to compile.

Right.

>>(6) well, is this already done (HashTable?) or is the Directory class the
>>    main user of these allocators? Anyway, use of them is pretty localized,
>>   so this shouldn't be a too big mess. But it has the same problems with
>>    (1) as the hash stuff.
>>
>The directory class is the reason we thought of having them, as I recall.
>I'm sure we'll find several other places to use them too, though. They are

But that's for later.

>quite good at linked lists (on a very fragmented memory system, you can get
>an improvement of several 1000 percent as a direct result of improved
>locality of reference. Rare situation, though.), and DynHashTable uses an
>allocator for that purpose (DynMemAlloc I think).

Right. I didn't think of that. Nice.
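Just to illustrate the locality argument, a rough sketch. This is not DynMemAlloc; `Node`, `NodePoolSketch` and `BuildList` are made-up names, and the pool has no per-node free. All nodes come from one contiguous block, so list neighbours end up adjacent in memory:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical list node.
struct Node {
    int   value;
    Node* next;
};

// Minimal pool: hands out nodes from one contiguous block. The block is
// sized once and never grows, so pointers into it stay valid.
class NodePoolSketch {
public:
    explicit NodePoolSketch(std::size_t capacity)
        : m_nodes(capacity), m_used(0) {}

    // Returns the next unused node, or nullptr when the pool is exhausted.
    Node* Allocate() {
        if (m_used == m_nodes.size())
            return nullptr;
        return &m_nodes[m_used++];
    }

private:
    std::vector<Node> m_nodes;
    std::size_t       m_used;
};

// Builds the list 0 -> 1 -> ... -> count-1 out of pool nodes.
inline Node* BuildList(NodePoolSketch& pool, int count) {
    Node*  head = nullptr;
    Node** tail = &head;
    for (int i = 0; i < count; ++i) {
        Node* node = pool.Allocate();
        assert(node != nullptr);
        node->value = i;
        node->next  = nullptr;
        *tail = node;
        tail  = &node->next;
    }
    return head;
}
```
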

>>Furthermore there are some other things I have to do:
>>(1) My duties as coordinator of the LGDC
>>(2) Planning for the library database
>>(3) Write a good description of it for our homepage
>>(4) Update the PFile docs (well, I guess I'll wait with that until the
>>    thing stabilizes)
>>
>Do we plan on generating most documentation with perceps, or do we add
>substantial stuff on top of that? Anyways, we don't have any users right
>now, and the system isn't very usable, so I agree that this can easily wait.

Perceps for the API reference (and internal function references for those
wanting to help us - and for ourselves) and DocBook for tutorial / design
discussion / background info / ...

>btw, can perceps write TeX?

AFAIK yes. And if not it shouldn't be hard to add that. In any case perceps
development seems to be getting into gear again and the plans for the next
release look really nice.


>>(5) Update the Coding Standard doc with the changes we discussed
>>
>I think that might be a good idea to do pretty quickly. It'll make a nice
>news item to show people we haven't died ;)

Did you have a look at the homepage lately? ;)

>>(6) Start the coding guide
>>
>What exactly is that? Stuff like this:
>
>DON'T do this:
>
>for (int i = 0; i < 100; ++i)
>	ppStr[i] = "initialize all to this";
>
>DO this:
>
>char** ppTerm = ppStr + 100;
>for (char** ppPos = ppStr; ppPos < ppTerm; ++ppPos)
>	*ppPos = "initialize all to this";
>
>(it's more efficient, if someone's wondering)
>
>Or more stuff like "try not to use templates and exceptions"

Both wrong. Things like:

"Never, Ever Add A Toplevel Directory To CVS Without Clear Approval Of The
Other PP Members !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"

and

"PenguinPlay.h defines these custom types for portability: ..."
"For debugging the following macros are provided:..."
"The basic exceptions defined automatically should cover most things: ..."

and

"Why telling the others of your progress from time to time is a good thing"

etc

>>(7) Do some preparation for my coming (mid-Oct) physics practical
>>(8) Do some preparation for my coming (start-Nov) CS test
>>
>Well, good luck to you!

I hope I won't need it ;)

>>(9) Do 40h/month of coding for money
>>
>Hmm... I didn't think firms actually hired programmers for just 10 hours a

But universities do.

>week. I want a job like that! :)

For about USD240 a month (USD6 an hour)? ;)

>>>How do I read .sgml files?
>>
>>ASCII editor ;)
>>
>Well, yes, that's what I'm doing, but that's what I DON'T want to do.
>
>>BTW - why do you want to read them directly (besides curiosity)?
>>
>I don't! I want to view them normally.

Ah, ok. The ones I write are always also available in a ton of nice formats
on ftpspace and as HTML via the site (not linked yet though - look at
/penguinplay/libpplay/doc/ for now)

>>Ok, so how do we continue now? I'm not sure, but this looks good IMHO:
>>
>>(1) convert the PPlay sources to the namespace use
>>(2) integrate the hash stuff & adapt the PakFile* classes
>>(3) modify the HashTable thing to use the new URL code
>>(4) debug. test.
>>
>That sounds good. I'll be finishing the Directory class first, though (right
>now, it's less than 50 lines).

You should be able to use almost all of the code from the existing one. It
has quite few tree specifics.

>The HashTable doesn't know about URL, or anything else. It shouldn't.
>Directory will take care of this. HashTable just returns an entry that
>matches a given key, if any.

Good. <looking at the code> This isn't as fast as it could be, though - the
URL code provides you with a string (the key) *plus* its length *for free*.
Actually the length is even required for processing because the string
isn't necessarily null-terminated. So some specialized GetEntry () method
would be good.
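Something along these lines, as a sketch only. `NamedEntry` and `LookupSketch` are invented, and the linear search is just a placeholder for the real hash lookup; the point is the pointer-plus-length signature, which never reads past the given length:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical entry: stored name plus its cached length.
struct NamedEntry {
    const char* name;    // null-terminated stored name
    std::size_t length;  // cached strlen(name)
    int         value;
};

class LookupSketch {
public:
    void Add(const char* name, int value) {
        assert(name != nullptr);
        NamedEntry entry = { name, std::strlen(name), value };
        m_entries.push_back(entry);
    }

    // The key is given as pointer + length and need not be
    // null-terminated, e.g. one component inside a longer URL.
    const NamedEntry* GetEntry(const char* key, std::size_t keyLength) const {
        for (std::size_t i = 0; i < m_entries.size(); ++i) {
            const NamedEntry& entry = m_entries[i];
            // Cheap length check first, then compare exactly keyLength bytes.
            if (entry.length == keyLength &&
                std::memcmp(entry.name, key, keyLength) == 0)
                return &entry;
        }
        return nullptr;
    }

private:
    std::vector<NamedEntry> m_entries;
};
```

With a URL like "data/level1" the caller passes the start of the string and a length of 4 to look up "data", without copying or terminating the component first.
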

>>The bad thing here is that during the entire time we can't even compile the
>>code, let alone run it. At least most of the code (HashTable and the main
>>URL processing code are more or less exceptions).
>>
>Well, after (1) and (2) we should be able to compile it.

Right.

>>As mentioned above we could also leave the namespace conversion for later
>>and instead change the hash stuff to use the old convention (simple). That
>>way we have a testable PFile sooner and we have to test less.
>>
>I'd prefer doing the namespace stuff first.

Ok from my side. Peter will have to do that for PSound shortly afterwards,
but as there's no direct dependency (yet) that should be ok.


So here's the new plan:

You take the PFile code, make it namespacy, and adapt the Directory stuff
for the HashTable. After that compilation should be possible.

During that time I do the header update, add the missing DOS-style path
support to the URL code and after that if necessary help you with the
other stuff (especially the ppf_PakFile* things are quite some work because
the Pak format and the way Paks are written change).

Ok?


>Well, now some comments about the code you sent me:
>
>>* Why does for example HashTable::IsEmpty () return a ppInt8 instead of a
>>bool?
>>
>Converting an integer to a bool requires processing. More specifically, it
>requires the compiler to generate something like this:
>
>if (val == 0)
>	boolVal = false;
>else
>	boolVal = true;
>
>The way the empty condition is checked for, is to just return the value of
>m_pEntryArray. If I had made the return value a bool, I'd have the compiler
>generate extra code that isn't necessary.
>
>As an example, take this:
>
>if (hashTable.IsEmpty())
>	// do something

-----------------------------------
	ppInt8 IsEmpty()
		{return (ppInt8)(m_pHashTable);}
-----------------------------------

With that implementation your example if () thing evaluates to true if the
HashTable is *not* empty (IsEmpty () returns 0 (== false) if the thing is
empty). That means a negation is needed. And so the step to the conversion
to bool isn't big. Besides that a compiler should be able to optimize the
following pretty well:

bool IsEmpty ()
{
    return (m_pHashTable == 0);
}

because bool values are internally just perfectly normal ints

>>* I'm no native english speaker, but I'm sure it's "purpose" instead of
>>"porpuse" and "such" instead of "suchs" :)
>>
>Well, I didn't pay too much attention to the spelling (should be mostly ok,
>though), as I was changing the wording a lot all the time.

The above were the only things I noticed. But their use was pretty
persistent ;)

>>* The template types are not documented and I especially was confused
>>by the role of the MISC one (why the ALL_CAPS btw?).
>>
>Template types behave almost exactly like preprocessor symbols. That's the
>reason for the ALL_CAPS.

Hmmm, I don't really like that. But it's just a feeling. Have to think a
bit more.

>>* Although C++ doesn't require it, methods that may throw std::bad_alloc
>>should tell the user of that by using a proper exception declaration
>>(void MyMethod (...) throw (std::bad_alloc))
>>
>MS VC++ doesn't support this (and throws warnings all the time about it).

exception declarations in general?
That's weird. AFAIK they've been in the C++ spec for quite some time now.

>I've made some changes to the HashTable class to make it more efficient.
>Code is cleaner and leaner, better algorithms are used and documentation has
>been improved. I'll add the changes talked about in this Email to that, and
>send it to you one of these days (I'll try to get it done tomorrow).

Not urgent. 


	Christian
-- 

Drive A: not responding...Formatting C: instead