
Misc. coding convention proposals



Well, I blatantly allow myself to assume that there's some kind of agreement
that we need some kind of coding convention for things like the naming of
files, functions, methods, macros and classes.

If this is not a convenient time for discussing things like this, just read
the email and let the different arguments sink in, think about it, and when
I get back from my vacation in 2 weeks (+ a few days, as I haven't gone yet
:) I'll repost this post and perhaps then it'll be more convenient.


****** Macros ******
Well, it seems there is more or less already agreement here:

All macros must be in all caps, with words separated by underscores,
with the exception of macros that function like functions.
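
As a small sketch of the rule above: the macro and function names below are
made up for illustration and are not taken from the existing code. The post
does not say how function-like macros should be named, so the sketch assumes
they follow the function naming rules.

```cpp
// Ordinary macros: all caps, words separated by underscores.
#define PP_MAX_PATH_LENGTH 256
#define PP_DEBUG_LEVEL 2

// A macro that functions like a function is the exception. It is assumed
// here (not stated in the post) to be named like a function instead.
#define ppMax(a, b) (((a) > (b)) ? (a) : (b))

// Hypothetical helper that uses both kinds of macro.
inline int ppClampPathLength(int length)
{
    int nonNegative = ppMax(0, length);
    return (nonNegative > PP_MAX_PATH_LENGTH) ? PP_MAX_PATH_LENGTH
                                              : nonNegative;
}
```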


****** File extensions ******
This is a slightly trickier subject. It seems that the .h extension for all
headers (regardless of whether they are C or C++) is already standard and
agreed upon. There are, however, several possibilities for the
implementation files. The extension for C implementation files is no
problem: it's .c (lower case). That leaves us with the C++ implementation
files, where we have several possibilities:

-- Possibility 1:  .C
Here, an uppercase C is used rather than a lower case c. This is completely
off the table, as that would cause problems for any port to Win32 or any
other platform whose file system is case insensitive.

-- Possibility 2:  .cpp
There is one *major* benefit of this extension for Win32 people, and that is
that it's required to get by far the most prevalent compiler on and for the
Win32 platform (MS Visual C++) to work properly on C++ implementation files.
It is possible to get files with other extensions to compile, but it is
extremely inconvenient and takes a lot of time (you have to go through a
menu and set an option for each and every file), and it's generally much
easier to rename the files to .cpp and then rename them back again when they
are to be committed. Many features of MS VC++ also don't work at all with
file extensions other than .cpp.

-- Possibility 3:  .cc
This seems to be the standard extension on Linux, at least for gcc. There
has been some doubt as to whether or not files with extensions other than
.cc will work with autoconf. One person says they do, another says they
don't.

-- Possibility 4:  .c++
This possibility has none of the advantages of .cc or .cpp, and I imagine
some platforms might object to having + in filenames. It seems to me that
this is somewhat off the table.


****** Global function naming ******
Global functions should be prefixed according to the same rules as files
(pp[?]). New words are signified by writing the first letter of each new
word in upper case. The first letter after the prefix should also be upper
case.
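
A minimal sketch of a global function named this way. The function itself is
hypothetical; the "ppf" prefix is the one used for the PenguinFile subproject
elsewhere in this post.

```cpp
#include <cstring>

// "ppf" = pp + project prefix. Every word after the prefix, including the
// first one, starts with an upper case letter.
inline bool ppfHasExtension(const char* fileName, const char* extension)
{
    std::size_t nameLength = std::strlen(fileName);
    std::size_t extLength = std::strlen(extension);
    if (extLength > nameLength)
        return false;

    // Compare the tail of the file name with the extension.
    return std::strcmp(fileName + (nameLength - extLength), extension) == 0;
}
```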


****** File names ******
It seems to me that there is already agreement on this subject: All files
must be prefixed with pp. Files that belong to a specific subproject must be
prefixed with pp+project prefix. The only exception to this is the main
header of each project, which must have the full name of the subproject
itself, and not be prefixed with pp.


****** Class and struct names ******
The question here is how classes and structs should be named, and whether or
not there needs to be a difference between the two naming schemes.

There are not that many reasonable possibilities here, at least not that
have been proposed by anyone (that I am aware of).

Structs should be prefixed according to the same rules as files. New
words are signified by having the first letter of each word be upper case.
The first letter after the pp[?] is also upper case. So, a structure for
files in the PenguinFile subproject could be named ppfFile.

If there needs to be a difference between struct and class names, then there
is the possibility of prefixing classes with pp[?]_ rather than pp[?]. If
there need not be a difference, then the exact same naming scheme should be
applied to classes as structs.

My personal opinion is that there definitely should not be a difference,
partly because struct and class are synonymous (in C++ they differ only in
default access), partly because a certain name means one thing. A struct
called ppError and a class called pp_Error will create exactly the same
associations in the head of a client. Also, the addition of an underscore
should IMHO not signify a major difference between two type specifiers, and
having it do so is downright user hostile.
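
To illustrate why one naming scheme is enough, here is a short sketch in
which a struct and a class end up with the same members and the same access.
ppfFile is the name suggested above; ppfDirectory and ppfHandleSum are
hypothetical names added for the example.

```cpp
// Declared with "struct": members are public by default.
struct ppfFile
{
    int handle;
};

// Declared with "class": members are private by default, so "public:" is
// needed to get the same access as in the struct above.
class ppfDirectory
{
public:
    int handle;
};

// Both types are used in exactly the same way by client code.
inline int ppfHandleSum(const ppfFile& file, const ppfDirectory& directory)
{
    return file.handle + directory.handle;
}
```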


****** Order of variable and method declarations in class declarations ******
I don't know whether there is agreement that a convention is even needed
here, but I think one is.

We have a couple of possibilities (assuming we are ordering by access, not
use):

-- Possibility 1:  public, protected, private
Here, variables and methods are ordered from most accessible to least
accessible. I think it makes a lot of sense to do it this way, as the public
methods are what most people looking in a header file will be after, since
those are the methods that everyone who wishes to use the class will be
using or implementing. The next logical choice is then, IMHO, to put the
protected variables and functions after that, as these variables and
functions will be of interest to everybody who wishes to derive from the
class or alter some of its code. What is left is the variables and functions
with private access. Moving these down to the bottom makes sense IMHO
because they will only be of interest to a person who wishes to alter the
code of the class itself.

What is most used is at the start of the class, and what is least used is at
the end of the class: the more it's used, the easier it is to get to. This
makes IMHO a lot of sense.

-- Possibility 2:  private, protected, public
This is done in some places. I, however, see no arguments for moving the
most-used definitions down to the place where they are hardest to find.

There is also the possibility of ordering variable and method definitions by
use. This is, however, rather non-standard, and it doesn't make much sense,
as public access methods are the definitions that are going to be used the
most, regardless of their use.

The last question here is how to order method definitions in relation to
variable definitions. I believe methods are more often looked up than member
variables, so I propose putting method definitions before variable
definitions within each access type.

Constructors are by far the kind of method that is most frequently looked
up, and I therefore propose putting all constructors before any other method
declaration of the same access type. It makes sense to put the destructor
right after the constructor(s), so I also propose that.

If a class has no use for a constructor or a destructor, I propose defining
them anyway, and just providing an inline implementation that does nothing.
This makes it instantly obvious that the class in question only has a single
constructor taking no arguments. I also propose that destructors be made to
follow the same rules as constructors, as these two kinds of methods are
closely related. This might just be a habit thing for me, though (the
destructor thing).

Also, I propose ordering method implementations in the implementation file
the same way the definitions are ordered in the declaration file, so as to
make it easier to find the right method.
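
The ordering proposed in this section could look roughly like the sketch
below. The class ppfBuffer and its members are hypothetical and only there to
show the layout: public first, then protected, then private, with
constructors and the destructor first, and methods before variables within
each access type.

```cpp
class ppfBuffer
{
public:
    // Constructors first, then the destructor (defined even though it
    // does nothing).
    ppfBuffer() : length(0) {}
    explicit ppfBuffer(int initialLength) : length(initialLength) {}
    ~ppfBuffer() {}

    // Other public methods follow.
    int Length() const { return length; }
    void Resize(int newLength) { length = Clamp(newLength); }

protected:
    // Of interest to people deriving from the class.
    int Clamp(int value) const { return value < 0 ? 0 : value; }

private:
    // Methods would come before variables here too; this class only
    // has a variable.
    int length;
};
```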


****** Method and member variable naming ******
Methods are named just like functions, but without the prefix. The first
letter should still be upper case. There should be wide agreement on this
point.

Member variables are another matter. Currently, it seems most people are
doing nothing special with member variables that they are not doing to
non-member variables.

I propose giving all member variables some kind of prefix to signify that
they are member variables. The benefit of this is that no one can ever be in
doubt about what kind of variable they are dealing with: is it a member or
isn't it? They don't have to look anywhere to find out; just looking at the
name is enough.

This makes it much easier to see what a method does. The reason is that
"doing something" means changing some kind of state. A method will often be
"doing something" by altering the object that it belongs to. Local variables
are usually just implementation details, and do not alter the state of the
object when their state is altered. Member variables, however, by
definition change the state of the object when they change state, as they
*are* the state of the object. Therefore, I've found that being able to
clearly distinguish between member variables and non-member variables has
been extremely nice in my own projects, and I therefore propose doing the
same in this project.

Of course, this raises the question of what this prefix should be.

Some people use the prefix "its", so name becomes itsName. Picking out the
variables that begin with "its" from among a lot of variables that do not
isn't particularly easy, as there is nothing special about 'i', 't' or 's';
those characters are used all over the place, and they don't really stand
out in any way. This diminishes the advantage of having a prefix for member
variables in the first place.

Some people use a prefix of an underscore to signify that a variable is a
member variable. This works. I don't personally like this, however, because
A) there is no inherent logic in an underscore meaning "member" and B)
underscores are supposed to connect things while still indicating that there
exists some kind of relationship between them. It's a kind of limited space.
Prefixing something with an underscore means that the first character of all
members will be an underscore, and this again means that this underscore
will not be connecting anything to anything. This is, to me, a little
counter-intuitive, even though I realise it is often used to avoid
name-clashes. We, however, are not dealing with a name-clash here.

The method I personally prefer is to prefix all members with m_. This has
all the benefits of a single underscore, and actually a bit more, as it's
longer and therefore easier to pick out if it is among a lot of other code.
m_ IMHO makes the look of member variables more "well-rounded". I also think
it's better to reserve leading underscores for name-clash situations and
inclusion guards.
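
A small sketch of the m_ prefix in use, with a hypothetical class ppfCounter.
The point is that the one line that changes the state of the object is easy
to spot among the local variables.

```cpp
class ppfCounter
{
public:
    ppfCounter() : m_count(0) {}
    ~ppfCounter() {}

    // "step" and "doubled" are local implementation details; only the
    // assignment to m_count changes the state of the object.
    void Add(int step)
    {
        int doubled = step * 2;
        m_count = m_count + doubled;
    }

    int Count() const { return m_count; }

private:
    int m_count;
};
```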


****** General variable naming ******
There seems to be wide agreement on signifying separate words in a variable
name by making the first letter of each word upper case, except for the
first. This, I think, is a very good idea.

I would, however, like to propose an addition to this: if the variable is of
pointer type, it should be prefixed with a lowercase p. This has numerous
benefits, and no drawbacks, as it:

A) makes it easier to find out what is a pointer and what is not a pointer
(doh!)

B) prevents memory leaks from happening as often, as you are continually
reminded that this is a pointer

C) prevents you from mistaking a pointer for a variable of a built-in type
such as char, short, int, long, float or double. This doesn't happen often,
but a compiler will usually not even issue a warning for such practice

D) breaks code wherever the variable is used if it is changed from pointer
type to some other type. This is highly desirable, as any code written on
the assumption that a specific variable is a pointer is extremely likely to
be broken anyway when this specific variable is no longer a pointer.
However, even though the code is broken, the compiler may not always
complain everywhere the no-longer-pointer variable is used. Prefixing
pointer variables with p (and therefore removing the p when it is no longer
a pointer) ensures that the compiler generates errors *everywhere* the
no-longer-pointer variable is used. This gives you 1) certainty that all
code that was written on the assumption that the variable was a pointer has
been looked over and, if necessary, updated, and 2) help from the compiler
in this task, as it tells you where the variable has been used, so you don't
need to go looking for it.
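
A minimal sketch of the p prefix, using a hypothetical linked-list node. The
closing comment describes the rename scenario from point D; the names ppfNode
and ppfSumList are assumptions made for the example.

```cpp
struct ppfNode
{
    int value;
    ppfNode* pNext;   // pointer, so prefixed with p
};

// Sums the values in a singly linked list.
inline int ppfSumList(const ppfNode* pHead)
{
    int sum = 0;
    for (const ppfNode* pNode = pHead; pNode != nullptr; pNode = pNode->pNext)
    {
        sum += pNode->value;
    }
    return sum;
}

// If pNext were later changed to, say, an int index, it would be renamed
// to "next". Every remaining use of "pNext" would then fail to compile,
// including uses such as "if (node.pNext)" that would otherwise still
// compile silently with the new type.
```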

This, combined with prefixing member variables with m_, will make the
situation where one wonders about some detail of the type of a certain
variable and needs to go looking for its definition happen much, much less
frequently. And that leads me onto my next subject:


****** Where to declare variables within functions ******
In C, the general practice is to declare all variables at the top of a
function, before any other instructions. In the case of C, this is a
perfectly good idea and should be followed when programming in C.

However, I am of the opinion that C++ is another matter completely in this
context. In C++, variables should IMHO be declared no sooner than they are
needed. There are several good reasons for this: A) in C++, there are
classes, classes have constructors, many constructors take arguments, and
these arguments might not be available at the start of a function (a
situation that happens quite often, actually); B) the declaration of a
variable used in a function will virtually always be on the screen,
eliminating the need to scroll up to check the type of a variable; C) some
variables might only actually be needed within a sub-scope (i.e. a pair of
braces), and declaring them at the outermost scope (the start of the
function) is therefore both somewhat counterintuitive and somewhat
compiler-optimization-unfriendly (most compilers should be able to figure
out when a variable is not used, but still).
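
Reason A can be sketched as follows. The class ppfPath and the function
ppfJoinedLength are hypothetical; the sketch only shows an object whose
constructor argument is not known at the top of the function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class whose constructor needs an argument.
class ppfPath
{
public:
    explicit ppfPath(const std::string& text) : m_text(text) {}
    ~ppfPath() {}

    std::size_t Length() const { return m_text.size(); }

private:
    std::string m_text;
};

inline std::size_t ppfJoinedLength(const std::string& directory,
                                   const std::string& fileName)
{
    // The constructor argument only exists once "joined" has been built,
    // so "path" cannot sensibly be declared at the top of the function.
    std::string joined = directory + "/" + fileName;
    ppfPath path(joined);

    return path.Length();
}
```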


****** Braces ******
I propose that an opening brace should always line up with its closing brace
on a vertical line, thereby making it clear what code belongs in which
areas, as well as making it easier to see where a pair of braces opens and
closes.
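
As a sketch of this brace style (the function name is hypothetical), every
opening brace sits on its own line, in the same column as its closing brace:

```cpp
inline int ppCountPositive(const int* pValues, int count)
{
    int positives = 0;
    for (int index = 0; index < count; ++index)
    {
        if (pValues[index] > 0)
        {
            ++positives;
        }
    }
    return positives;
}
```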


****** for-loop initialising ******
Variables declared in the initialisation field of a for loop should not be
considered to have scope only within the for loop, but rather to have the
scope in which the for loop is present.
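
One way to follow this in practice is sketched below. It assumes, which the
text above does not state, that the motivation is compilers that still let
the loop variable leak into the enclosing scope, whereas standard C++ limits
it to the loop. Declaring the variable once before the loops compiles under
both rules. The function name is hypothetical.

```cpp
inline int ppSumTwice(const int* pValues, int count)
{
    int sum = 0;

    // Declared once, in the enclosing scope. Two loops that each declared
    // "int index" would clash under the older rule, and using "index"
    // after a loop that declared it would fail under the standard rule.
    int index;

    for (index = 0; index < count; ++index)
    {
        sum += pValues[index];
    }

    for (index = 0; index < count; ++index)
    {
        sum += pValues[index];
    }

    return sum;
}
```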