
Re: Funny situation (was: Re: Serialization et al)



Bjarke Hammersholt Roune wrote:

> >  - a directory contains files.
> >  - a directory is a file.
> 
> That would be compared to:
> 
>   - a directory can contain any number of entities of a filesystem.
>   - a directory is not a file.
> 
> Which seems perfectly sensical to me.

Ok, so the inheritance looks like this:

DirEntry -+-> Directory
          |
          +-> File

And Directory contains DirEntry*. Is that right?

If so, how do you make the DirEntry* you got from Directory (while
iterating it for example) into a File* or another Directory*? Because I
assume that a totally general API that would easily survive the addition
of another subclass of DirEntry would not have a set of methods
returning Directory* and another set of methods returning File*, we
agree?

So this solution is just as fucked as mine as far as requiring
downcasting goes. I just take the additional step of saying "well, okay,
a Directory is just about a File" and remove the DirEntry class,
replacing it with File. The way you have it now can be fine too, but not
if you have two sets of methods (for Directory* and File*).

Adding a third subclass would mean major breakage. Major *unwarranted*
breakage might I add.

> There is not a is-a relationship between dirs and files. Sorry. There is
> a commonality, yes, but not a is-a relationship. According to my
> textbooks, in such a situation, the correct way to handle this is to
> capture the similar functionality in one class, and make the two derive
> from this.

As far as Unix is concerned, a directory is a file. A network socket is
also a file, but has no name. The name is a directory-only thing. You
can open a file, unlink it (removes the name in the directory) and the
file will still be there.

> I had a look at the composite pattern. Or, well, an article detailing it
> pretty much, and actually going through how to create a file-system with
> it (as an example of how to utilise it). In this article, directories
> were not files, but they both derived from a common abstract base class.
> (I don't have the GoF's book, so I might've got something wrong here).
> 
> This is very similar to what we have now. The only exception is that in
> the article, the common base class included some methods that really
> only made sense for directories (like iterating over and adding/removal
> of members).

I think that in the book (which I have, but lent it to a friend), they
use a widget tree. This is very similar to what I'm saying. You mention
container-type methods (iterating and add/remove methods) in the parent
class, which do not make sense for a file.

The point is that you shouldn't worry; just take it easy and don't make
a big fuss out of it. You can make a file class implement container-type
methods in a way that makes sense. Make adding a member return an error
(disk full, no inode, "not a directory" or whatever), make removal of a
member act as if the requested member doesn't exist (which is true), and
iterating works just like it should on a directory with no content!
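
A minimal sketch of that idea, assuming a Composite-style base class.
All names here (Node, File, Directory, totalEntries) are hypothetical
and are not the PenguinFile interface; members are held as non-owning
pointers only to keep the example short.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: container-type methods live here, and the
// defaults describe a plain file, i.e. an empty container that refuses
// new members.
class Node {
public:
    explicit Node(std::string name) : name_(std::move(name)) {}
    virtual ~Node() = default;
    const std::string& name() const { return name_; }

    virtual bool add(Node*) { return false; }                  // "not a directory"
    virtual bool remove(const std::string&) { return false; }  // no such member
    virtual std::size_t count() const { return 0; }            // nothing to iterate
    virtual Node* entry(std::size_t) const { return nullptr; }

private:
    std::string name_;
};

// A file simply keeps the defaults.
class File : public Node {
public:
    using Node::Node;
};

// A directory overrides them with real container behaviour.
class Directory : public Node {
public:
    using Node::Node;

    bool add(Node* member) override {
        members_.push_back(member);
        return true;
    }
    bool remove(const std::string& name) override {
        for (auto it = members_.begin(); it != members_.end(); ++it) {
            if ((*it)->name() == name) {
                members_.erase(it);
                return true;
            }
        }
        return false;
    }
    std::size_t count() const override { return members_.size(); }
    Node* entry(std::size_t i) const override {
        return i < members_.size() ? members_[i] : nullptr;
    }

private:
    std::vector<Node*> members_;  // non-owning, for brevity
};

// Walks a tree without any cast: a file just looks like an empty directory.
std::size_t totalEntries(const Node& node) {
    std::size_t total = 1;  // count the node itself
    for (std::size_t i = 0; i < node.count(); ++i)
        total += totalEntries(*node.entry(i));
    return total;
}
```

The price is that the traversal can no longer tell a file from an empty
directory, which is the trade-off being argued about here.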

Me, I prefer keeping container-type methods in the container subclass,
*but* if I do so, I have to admit that I now need a way to get the
subclass pointer if it is one, hence the asDirectory() method. There is
no choice. I consider separate container-type methods for
container members and for non-container members evil, more evil than
a bunch of frenzied dynamic_casts roaming free. ;-)

Also, I'd like to point out a thing about the difference between a file
and a directory. I know the POSIX way, so I'll talk about that, okay?

A filesystem entity is described by its inode data, given by the stat()
system call. As far as that structure goes, there is no difference at
*ALL* between a directory and a file. The type information *is* stored
in it, in the st_mode field (in the bits above the permission bits, the
ones you know as e.g. 0644), but this is a mere detail!

Also, this isn't the content of the file. Getting the actual content of
a file is a different thing, you have to open it, just like to get the
content of a directory, you have to open it (hmm, is that similar or
what?).
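
For illustration, a small sketch using the POSIX calls mentioned above:
stat() with the S_ISDIR/S_ISREG macros for the type bits, and
opendir()/readdir() for opening a directory to get at its content. The
helper names (isDirectory, isRegularFile, countDirEntries) are made up
for this example.

```cpp
#include <dirent.h>
#include <sys/stat.h>

// True if `path` names a directory. Returns false if stat() fails,
// for instance because the path does not exist.
bool isDirectory(const char* path) {
    struct stat st;
    if (stat(path, &st) != 0) return false;
    return S_ISDIR(st.st_mode) != 0;  // type bits live in st_mode
}

// True if `path` names a regular file; same stat() call, different bits.
bool isRegularFile(const char* path) {
    struct stat st;
    if (stat(path, &st) != 0) return false;
    return S_ISREG(st.st_mode) != 0;
}

// Getting at the content is a separate step: the directory has to be
// opened first. Returns the number of entries read, or -1 on failure.
int countDirEntries(const char* path) {
    DIR* dir = opendir(path);
    if (dir == nullptr) return -1;
    int n = 0;
    while (readdir(dir) != nullptr) ++n;
    closedir(dir);
    return n;
}
```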

> > void recursivelyDoSomethingWithFile(File* file) {
> >   Directory* dir = file->isDirectory();
> >   int i;
> >
> >   if(dir)
> >     for(i = 0; i < dir->getCount(); i++)
> >       recursivelyDoSomethingWithFile(dir->getEntry(i));
> >   else
> >     doSomethingWithFile(file);
> > }
> >
> > Maybe the iterator isn't very nice, but this would work and looks very
> > clear and simple to me.
> 
> In exactly what way is this not possible using the current interface?

I do not know. Tell me how you'd do the same piece of code that I did
with your interface.

> Besides, that is a bad way to do it. What if we decide to add a
> completely third file-like-but-not-really-anyway File derivative? You'd
> have to check for that too, even if you just needed to call virtual
> members of File.

I wouldn't have to check for that. Note that my method is called
"recursive". All it cares about is "is this a directory or something
else". If it is something else, then that is the problem of
doSomethingWithFile(). If it is a directory, then it has to do its part,
the recursion.
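
To make that concrete, here is a rough sketch in which a third subclass
is added after the fact. It assumes the File/Directory arrangement
argued for in this message, with a virtual asDirectory(); NamedPipe and
countLeaves are invented for the example, and returning 1 stands in for
doSomethingWithFile().

```cpp
#include <cstddef>
#include <vector>

class Directory;

class File {
public:
    virtual ~File() = default;
    // Returns this object as a Directory, or nullptr if it is not one.
    virtual Directory* asDirectory() { return nullptr; }
};

class Directory : public File {
public:
    Directory* asDirectory() override { return this; }
    void add(File* f) { entries_.push_back(f); }
    std::size_t getCount() const { return entries_.size(); }
    File* getEntry(std::size_t i) const { return entries_[i]; }

private:
    std::vector<File*> entries_;  // non-owning, to keep the sketch short
};

// Added later: neither File nor the traversal below had to change.
class NamedPipe : public File {};

// The recursion only asks "directory or something else?".
std::size_t countLeaves(File* file) {
    if (Directory* dir = file->asDirectory()) {
        std::size_t n = 0;
        for (std::size_t i = 0; i < dir->getCount(); ++i)
            n += countLeaves(dir->getEntry(i));
        return n;
    }
    return 1;  // stand-in for doSomethingWithFile(file)
}
```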

This is one thing that proponents of dynamic_cast (as the "general nice
C++" way to do this) say: that adding a new subclass to File would
require a new asSomethingElse() method (similar to asDirectory()) for
it, whereas using dynamic_cast requires *no* change to the parent class.

An in-between would be a virtual getAs() method that would take an enum
as parameter and work just like dynamic_cast, except that you'd have to
cast the result manually afterward, which *should* be safe, but this
might make some people uncomfortable... These are all evil and do the
same thing: the asDirectory method requires modifying the parent class,
dynamic_cast uses RTTI and the last one requires a cast. Choose your
poison!
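
The three poisons side by side, as a sketch only. The EntryType enum,
the getAs() signature and the via...() helpers are assumptions made for
illustration; nothing here is taken from the actual DirEntry code.

```cpp
class Directory;

// Hypothetical type tag for the getAs() variant.
enum class EntryType { File, Directory };

class DirEntry {
public:
    virtual ~DirEntry() = default;
    // Poison 1: one virtual per subclass, declared in the parent class.
    virtual Directory* asDirectory() { return nullptr; }
    // Poison 3: one virtual for every subclass; the caller casts the result.
    virtual DirEntry* getAs(EntryType) { return nullptr; }
};

class File : public DirEntry {
public:
    DirEntry* getAs(EntryType t) override {
        return t == EntryType::File ? this : nullptr;
    }
};

class Directory : public DirEntry {
public:
    Directory* asDirectory() override { return this; }
    DirEntry* getAs(EntryType t) override {
        return t == EntryType::Directory ? this : nullptr;
    }
};

// Poison 1: requires asDirectory() in the parent class.
Directory* viaAsDirectory(DirEntry* e) { return e->asDirectory(); }

// Poison 2: requires RTTI, but no change to the parent class.
Directory* viaDynamicCast(DirEntry* e) { return dynamic_cast<Directory*>(e); }

// Poison 3: requires a manual cast; it is only safe because getAs()
// returned non-null for EntryType::Directory (a null pointer stays null).
Directory* viaGetAs(DirEntry* e) {
    return static_cast<Directory*>(e->getAs(EntryType::Directory));
}
```

All three give the same answer for a given object; they differ only in
where the cost lands.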

> In the current scheme, you'd use a DirEntry (something like that, a
> common base of File and Directory). Anything in DirEntry is supposed to
> make sense for all its derivatives (they should at least do something
> sensical that doesn't break existing code), and thus old code will never
> have to be reworked (in theory, anyway ;)
> 
> You get exactly the same result with the current scheme, but you get
> better type-safety, less chance of something doing something you didn't
> expect it to do (like nothing) and more stable code (as in never or
> rarely needing to change it).

How do you go from a DirEntry to a Directory? I assume here that having
separate container-type methods for Directory* and File* in DirEntry is
simply ludicrous.

> I dislike both. AsDirectory() is just a pretty facade for dynamic_cast.
> dynamic_cast is actually better (read: more efficient) if implemented
> properly. dynamic_cast should be able to just test the virtual function
> table pointer, while AsDirectory() makes a virtual call through this
> table, and returns a value.

It doesn't do a dynamic_cast. It does a simple "return this" for
Directory and "return NULL" for File, whereas dynamic_cast would need to
iterate through the ancestry of one of the classes, looking for the
other, because of its general algorithm. My approach runs in constant
time, where the time taken by the dynamic_cast approach grows with the
hierarchy. Depending on the inheritance tree and the number of classes
in this part of it, it could require many iterations over multiple
containers to do the dynamic_cast.

(this is assuming the most likely implementation of RTTI, each type_info
node being a container for pointers to parent type_info nodes (could be
none, just one or many). dynamic_cast would get the type_info node of
the object, then recursively explore itself and its parents to see if
the type_info node of the target type is there)
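
A toy model of that assumed implementation, to show where the extra work
would come from. This is not how any particular compiler implements
dynamic_cast; TypeNode and derivesFrom are invented names, and the
`visited` counter exists only to make the cost visible.

```cpp
#include <vector>

// Hypothetical stand-in for a type_info node: a name plus pointers to
// the nodes of the direct parent classes (none, one, or many).
struct TypeNode {
    const char* name;
    std::vector<const TypeNode*> parents;
};

// Returns true if `target` is `node` itself or one of its ancestors.
// `visited` is incremented once per node examined.
bool derivesFrom(const TypeNode& node, const TypeNode& target, int& visited) {
    ++visited;
    if (&node == &target) return true;
    for (const TypeNode* parent : node.parents) {
        if (derivesFrom(*parent, target, visited)) return true;
    }
    return false;
}
```

In this model a failed lookup visits every ancestor of the object's
type, while a virtual asDirectory() is a single indirect call whatever
the hierarchy looks like.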

> It's not that I would hesitate to use any of these approaches if I
> really need to, it's just that I'll try my best to find a solution that
> doesn't need it and that isn't inferior.

Hell, I'll just *have* to take a look at DirEntry for real now. There
must be Real Magic (tm) going on in there! ;-)

(after looking, it turns out you use the "ugly" way rather than the
"evil" way, and no magic)

> > The asDirectory() method is an intermediate way of doing this: it is
> > static, as you can't do asDirectory() on just about any object to ask it
> > if it is a directory, but it lets you have the dynamic advantages of
> > dynamic_cast.
> 
> It *is* RTTI. It's not static in the way Bjarne meant it: He was talking
> about compile-time checked type safety. This quite clearly is not what
> AsDirectory() does.

Blargh. "return this" is pretty static to me. Just as "return NULL" in
fact. Okay, you could take this as home-made RTTI, but know something:
computers don't do objects, only bits, so everything we're talking about
boils down to some other thing. So when I say RTTI, I really mean "RTTI
as documented in the C++ standard". Non-C++-specified-RTTI, I have been
doing since my Applesoft Basic days.

> > One application where RTTI comes in handy is *very* general cases, for
> > example the streaming mechanism in Turbo Vision, that takes a TObject* and
> > streams it. It isn't sufficient to call the virtual void write(TStream*)
> > of the TObject, you also have to identify the object and write down its
> > type ID first in the stream.
> 
> Wouldn't a virtual GetID() be much better?

Yes, just follow.

> > This could be done in a more general way, but it turns out the safest
> > way is the "unsafe" dynamic_cast!
> 
> dynamic_cast isn't unsafe. It's just evil :)

The virtual GetID() way is one of the "more general ways" I was speaking
about. Having the compiler do the work for you (i.e. with RTTI and
dynamic_cast and typeid) is the *safest*. A virtual GetID() has all the
same evilness that dynamic_cast has and can introduce errors on top of
this!
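
One way a hand-written ID can go wrong where the compiler's cannot, as a
sketch. The classes below (Streamable, Button, CheckBox) are hypothetical
and are not Turbo Vision's; the point is only that getID() has to be
maintained by hand in every subclass, while typeid does not.

```cpp
#include <string>
#include <typeinfo>

class Streamable {
public:
    virtual ~Streamable() = default;
    // Hand-maintained type ID: every subclass must remember to override it.
    virtual std::string getID() const { return "Streamable"; }
};

class Button : public Streamable {
public:
    std::string getID() const override { return "Button"; }
};

// Added later; the override of getID() was forgotten, so a CheckBox
// silently identifies itself as a Button.
class CheckBox : public Button {};

// The compiler-maintained alternative: compares the dynamic types.
bool sameDynamicType(const Streamable& a, const Streamable& b) {
    return typeid(a) == typeid(b);
}
```

A stream writer relying on getID() would record the CheckBox as a Button
and read back the wrong type, with no complaint from the compiler.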

> > Yes, that's true. Another case of 10% missing features.
> 
> All those 10%s (notice the plural 's' there) are begging to add up...

Oh no, they're part of the same 10%. :-)

And right after that, I describe a way to keep the performance up
without complicating the interface using iterators.

> > I knew it was used and "blatantly reused" the name because deep inside,
> > I'd like PakFile to be called PakArchive. Just ignore me. :-)
> 
> Hmm... I actually like that. The part about PakArchive (not the ignore
> stuff :).

That's my way of being rude, imposing myself and making you change your
code. Some weird people sporting goatees and drinking weird coffee would
say that I am an intellectual terrorist, but I don't ever use words like
that myself. :-)

> My main concern about the Directory<->File difference is that Files, in
> my world at least, *ARE* data. If you give me a file, what you are really
> talking about is data. I don't really care where that data is coming
> from (ie, the device), what I do care about is that it's data I'm
> getting.

The File class isn't data. There is no read() or write() method, is
there? Where IS the data then? I see none.

Are you saying that you can pass a File* to some method or function to
open it and access its data? I think the same, isn't that coincidence!
But File *definitely* ISN'T data. Well, *file* data I mean. :-)

> When you are giving me a directory, you are not giving me data. You are
> giving me a collection. For me, to have Directory derive from File is
> like having std::vector<T> derive from its template argument. It just
> doesn't compute. At least not for me. container<->data. A collection of
> something is quite different from just one of something.

See it this way: files contain data, and directories contain files.
They're both containers. Now, a simple exercise of abstraction can make
you tell that files are, well, some kind of data too, at some level.
There you go. Clear as crystal.

I can't believe people don't understand Unix, it is SO simple!
Everything you see is a file, how hard can that be??? :-)

Question: "Hey, what is this?"
Answer from the Unix guru: "A file."

You can reuse this dialog wherever you need it. ;-)

> > On the other hand, making this NamedPipe class a "different" class,
> > inheriting from DirEntry, like File and Directory are, would mean adding
> > a FileSystemEntityContainer to Directory and *three* loops in user code.
> > This would definitely freak me out.
> 
> If you needed only functionality that really *is* common for
> directories, files and this third "something", you'd be fine. No need
> for more loops. All that would be declared in the common base class.

I agree with the DirEntry scheme. I feel it is unnecessary, because
there are actually *no* useful methods in File (its power comes from
being used as a parameter to an open function). But you can leave it, if
it makes you feel better. Now, I'm going to look at PenguinFile for a
minute or so...

friends! Now, if dynamic_cast is evil, I wonder what friend is! :-)

Oh shit. GetFile/GetDir/GetEntry. This doesn't look good at ALL. So if I
want to add another subclass of DirEntry, I have to modify DirEntry. So
nice. You know that you'll (unnecessarily) break vtable compatibility
and require a recompile of client applications? Doing it with
dynamic_cast would require no recompile whatsoever and if you don't want
RTTI, an AsEntryType(ppfFileType) would work very nicely and in a nearly
type-safe manner. It's not my fault if C++ doesn't have dynamic typing to
make this nice.

Now, if I do GetDir("foo") and that "foo" is a file, I get NULL, right?
And then I do a GetFile("foo") and it works. Same as checking if
AsEntryType(ppfFT_file) returns NULL and casting otherwise. I found the
casting to be rather forced upon me by C++, by its failings. It tries to
enforce strict type-checking, but in many cases, you have to cheat,
generally in the "interesting" cases where object-orientation becomes
best used. I do not say that C++ sucks lightly, you know.

Casting is either required for strict languages (but done under strict
control) or completely forbidden, in the case of a dynamic language. C++
and Eiffel are examples of strict languages, with dynamic_cast and the ?=
assignment operator (respectively) as the "strictly controlled cast" and
Smalltalk and Perl are examples of dynamic languages.

The GetVFSInterface method and the comments in VFSInterface about using
the FSType for casting purposes seem just as bad and evil as my
asDirectory() method, except that I do it right in your face and require
less user code.

> > VC++ isn't at all at the same level as Kai, Portland or Compaq. It's
> > closer to the GNU compilers (in the same league I might say). Those
> > compilers I listed are compilers made for HPC development, where
> > everything is tweaked to the bone. They take ages to compile and their
> > code runs faster than their shadow (for C/C++).
> >
> According to an article I saw on LGDC MS's compiler severely outperforms
> the GNU compilers, and it was the quickest in the test. Unless my memory
> has completely failed me, that is. The article might be a bit dated,
> though (not sure).

See this graph of compiler performance:

Bad |- GNU - VC++ --------------------- Kai and other HPC -| Good

It is not the following, as you seemed to think:

Bad |- GNU --------------------- VC++ - Kai and other HPC -| Good

> > They are sufficient for games. Nobody has a vector machine handy to play
> > games anyway. ;-)
> 
> I think you should ask a games programmer about that. The only reason
> games run on today's computers is that the programmers crippled the
> software in some way to make it so that they could. Games can *always*
> use more efficiency and processing power. Actually, they always *need*
> this.

The point is that Kai on a PC is barely better than VC++, because there
is no sophisticated hardware to exploit. Now, Kai on the SGI Origin 2000
or that NEC SX-5 we have might give you different results, but those
machines cost hundreds of thousands of dollars for the first and
millions of dollars for the second. Even John Carmack doesn't have one
of these.

Not exactly regular gamer fare. And Kai is much more expensive than
VC++, so the reasoning is that for the *very* slight advantage of Kai on
gamer hardware, it is not worth the price.

> In short, I don't think *anything* is sufficient for problems like
> these. Heck, a human will never be sufficient (at least not normal ones
> like me :).

Yeah, our weather modeling customers keep telling us this all the
time... :-) ("More data points! More data points! Did I mention more
data points?")

-- 
Pierre Phaneuf
Ludus Design, http://ludusdesign.com/
"First they ignore you. Then they laugh at you.
Then they fight you. Then you win." -- Gandhi