
Re: Funny situation (was: Re: Serialization et al)



> > Above, I state that HashTable should only know how to serialize itself
> > in a format that makes sense relative to itself. To me, that is pretty
> > irrefutable.
> >
> > However, your idea is/was to make derivatives of HashTable. Thus,
> > HashTable still only knows how to serialize itself in a way that makes
> > sense relative to itself, but its derivative class ZipHashTable will
> > also be able to serialize Zip files. The reason this is IMHO still
> > appropriate is that the Zip format makes sense relative to ZipHashTable.
> >
> > ZipHashTable might be implemented in a radically different way than
> > HashTable (that way, it would probably be better to have a pure virtual
> > HashTable base class, and make PakHashTable, ZipHashTable, TarHashTable
> > etc.), if a different internal representation of the data might make it
> > more efficient when dealing with Zip files.
> 
> What the hell is this??? If a ZipFile (similar to PakFile, inherited
> from some common class) is made, it has no need to even *use* a hash
> table in its implementation!!!
>
I agree here. The solution I detailed at the end of the E-mail you
responded to abstracted out implementation details like the fact that
Paks use a hashtable. A directory for a zip could transparently use any
kind of algorithm and representation.

The reason I didn't bother to give the base class I'm talking about here
a better name than HashTable was that keeping the name made it clearer
what I was referring to. In my actual proposed solution, I do give the
common base class a more appropriate name than HashTable.

> The way I see this, there is a
> FooFile::SerializeTo method that just stores the content of the FooFile
> to a stream. If the format of FooFile is simply recursively serializing
> the objects that make up FooFile, then so be it (calling
> HashTable::SerializeTo if it uses that class in its implementation).
> Maybe the format of FooFile is all done within a few methods of the
> FooFile objects and just walking the directory tree nodes to gather the
> information, then so be it.
>
Well, besides the fact that serialization happens at unmount, so it's
umount() and not SerializeTo() that initiates the serialization, the
solution you propose here is perfectly possible to implement within the
framework of the solution I have proposed.

> If the internal implementation has to reflect one to one the internal
> implementation of the archive itself, you're definitely gonna freak out
> at some point. Just consider a zip file versus a tar.gz file. The first
> has each file compressed by itself, and the second has the exact
> reverse, compression is applied to the whole archive. If you force those
> two to share the same internal implementation, you will only get
> yourself in a lunatic asylum.
>
I get the feeling that you have not read my proposed solution. I don't
blame you; it's not a very interesting read if you don't care much
about the implementation details.

The solution I propose completely abstracts out any and all
implementation details of how a Directory derivative does anything,
except for the fact that it stores pointers to files and
sub-directories.

> Isn't it obvious? Doesn't the words "encapsulation", "hidden
> implementation" and "black box" ring a bell?
>
Certainly. I believe it would be perfectly appropriate to describe my
proposed solution in the terms you mention.

> > Now, of course, you are correct that it shouldn't be so insanely
> > difficult to verify the product of the format (ie, the actual pak files)
> > manually that it basically becomes impossible.
> 
> There are actually cases where this is so and everybody is perfectly
> happy. Think of the NFS protocol. It isn't defined in terms of packet
> types and formats. It is defined in terms of an IDL interface! The thing
> assuring compatibility isn't NFS itself, but the Sun RPC layer
> underneath it. Parallel the RPC layer to the serialization process, and
> the NFS layer to the Pak classes.
>
Excuse my ignorance, but I really don't have a clue as to what NFS
(a filesystem?), IDL, or RPC are.

> > > It calculates the faculty of n
> > > Our Prof presented this as example of an "iterative" process (meaning that
> > > a compiler can optimize it to an iterative process) ;)
> > > The recursive version is:
> > >
> > > (defun fak (n)
> > >    (if (= n 1) 1
> > >        (* n (fak (1- n)))
> > >    )
> > > )
> > >
> > > Just as little anecdote...
> >
> > I've just decided I don't like LISP :)
> >
> > C++ is so superior to that.
> 
> Youngster, you're showing your age and experience! What you don't
> understand, you are condemned to reinvent, poorly (paraphrasing somebody
> I do not remember).
>
Ok, I was exaggerating. What I meant to say was: "C++ does this
*way* better than LISP. If that is any indication of how it normally is
to do things in LISP, then C++ must logically be better than LISP
in general."

That just got shortened down to the two sentences above... :)

I don't know the first thing about LISP, so perhaps I shouldn't criticize
it. If that's what *you* meant, well, then you are correct.

What do you prefer:

(defun fak1 (n)
  (fak_iter 1 1 n))

(defun fak_iter (product counter n)
  (if (> counter n)
      product
      (fak_iter (* counter product) (1+ counter) n)))

Or:

/* Computes the factorial of num (assumes num >= 1). */
unsigned int ComputeFaculty(unsigned int num)
{
	for (unsigned int i = num - 1; i > 1; --i)
		num *= i;

	return num;
}

I'd say doing it the iterative way in C++ is as simple as doing it
the recursive way in LISP:

(defun fak (n)
  (if (= n 1)
      1
      (* n (fak (1- n)))))



> Lambda functions are like drug, when you start using them, you can't
> stop and C++ will probably give you some withdrawal symptoms (nausea and
> vomiting, usually). ;-)
> 
> The iterative way is more optimal (use a more clearly defined amount of
> resources and can be unrolled/vectorized even by the dumb C/C++
> compilers that plague us)
>
Hadn't thought of that.

> > Ahh, thank you very much for explaining this out for me. I agree, the
> > code I made would have been much better had it been done with recursion.
> > We should change it sometime we haven't got anything better to do (if we
> > choose to implement the solution I detail in the end of this E-mail,
> > doing it along with that would make sense).
> 
> Keep it iterative.
>
Hmm... I think I agree. The compiler optimization argument is pretty
good.

> About the rest of the e-mail, I just can't believe the amount of
> thinking you are putting into a so well-researched problem as
> serialization.
>
I wouldn't say I've spent overly much time thinking about it. These
things work primarily unconsciously for me, so when I sat down to write,
I knew what I wanted to say, even though I hadn't spent much time
thinking about it consciously.

It all comes down to skill, and how many programming problems you've
solved in the past (ie, experience). For me, that list is still very,
very small. You obviously have much more experience than I do, so even
if the problem seems almost non-existent to you, it takes me a little
while to create a solution that both has a sound design and works the
way it's supposed to.

Anyways, the issue wasn't really serialization, but proper design.
That's certainly something worth spending some time on.

> For a very basic and totally sufficient example of
> serialization applied to a kind of archive file, look at the StrList
> class in Turbo Vision, which you can readily find on the metalab.unc.edu
> site.
>
> Your Pak file example is only a slightly more advanced form of that.
> StrList only lets you get a bunch of bytes by using an integer number,
> where your system has directories and names. But the workings are the
> same; it's only a more sophisticated index/directory.
> 
Thanks for the pointer.