The Old Man and the C

Evan Adams  
Sun Microsystems

Abstract

"You can't teach an old dog new tricks" goes the old
proverb. This is a story about a pack of old dogs (C
programmers) and their odyssey of trying to learn new
tricks (C++ programming).

C++ is a large, complex language which can easily
be abused, but also includes many features to help
programmers more quickly write higher quality code.
The TeamWare group consciously decided which C++
features to use and, just as importantly, which features
not to use. We also incrementally adopted those
features we chose to use. This resulted in a successful
C++ experience.

1.0 Introduction

This paper describes the experience of a group of C
programmers adopting C++ for a new project. It is
written from the viewpoint of C programmers and
describes our expectations, surprises, pleasures,
disappointments and trials and tribulations. It is
intended for C programmers that may be considering a
journey into the realm of C++. It is not intended to be
a critique or evaluation of C++ as a programming
language.

The TeamWare project consisted of 8 very
experienced C programmers, ranging from 4 to 13
years of industrial C experience. In the spring of 1991
we decided to implement the TeamWare project in
C++. It was hoped that using C++ would lead to more
code sharing, more cleanly structured code and
improved internal interfaces. No one within the project
had any significant experience with C++, although two
members of the group had some object oriented
experience.

TeamWare is a set of command line and GUI tools
built from several common libraries. The libraries are
provided by the TeamWare group for use by the
TeamWare applications; they are not provided for
more general use.

TeamWare is a code management product that
encourages parallel development and is built on top of
SCCS. A user makes a copy (bringover) of an SCCS
hierarchy thus creating a personal hierarchy. In this
hierarchy the user makes and tests changes. These
changes are then integrated (putback) into the original
hierarchy. If the integration hierarchy contains
changes which are not in the user's hierarchy, then
TeamWare detects that there have been parallel
changes and refuses the integration. Therefore, users
must incorporate changes in the integration hierarchy
into their own hierarchy before integrating. TeamWare
also includes the filemerge utility, a graphical
three-way differences program allowing users to
merge parallel changes. TeamWare tracks both source
file changes (SCCS deltas) and file renames.

1.1 Which Way To Go?

In the beginning, the group was faced with two paths.
The first path, marketed by Nike, was labeled "Just Do
It" and appealed to our impulsive nature. The second
path, labeled "Crawl Before You Walk", appealed to
our logical selves. The Just Do It path called for each
of us to decide for ourselves which features of the
language to use and how to apply them. The Crawl
Before You Walk path called for the group to use new
features of C++ only as it became apparent that they
added value over our more well understood C
techniques.

1.2 Getting Started

We began by taking a C++ course taught by Hank
Shiffman of SunPro Marketing and offered through
Sun U. and buying a handful of books [Ellis,
Stroustrup 1990], [Eckel 1990], and [Dewhurst, Stark
1989]. We found the Annotated Reference Manual to
be somewhat daunting for the average programmer. As
a language reference manual it is intended more for
compiler writers and people interested in a very
precise language definition. The other two books
explain how to use the language. Dewhurst and Stark's
book is much more concise and was, therefore, the first
reference. Eckel's book was used when Dewhurst and
Stark's was inadequate. Many C++ books have been
published since the spring of 1991 so there may well
be better options available now. 

Initially, some of us felt that C++ would be pretty
easy to pick up. After all, wasn't it just C with a bit
more stuff? Hank's class convinced us otherwise. We
left the class feeling that there was a lot to this
language, some parts we liked, some parts we didn't
and much that we didn't fully understand. For
example, we left the class with the clear message -
"Stay away from multiple inheritance". 

We chose the Crawl Before You Walk path and
started with very modest goals; we would use classes
instead of structures, constructors and destructors and
member functions. We felt certain that our future
would include inheritance and virtual member
functions, but we did not feel ready for them yet.

2.0 Features We Used

2.1 Required Function Prototypes

A function prototype is a function declaration
containing the function's return type and the types of
all its arguments. Function prototypes allow the
compiler to do strong type checking as the compiler
ensures that a function is always called with
parameters of the appropriate types. C++ requires a
function prototype for every function that is called.

Initially, we ported some utility code from a
previous project. It had been written in Kernighan &
Ritchie (K&R) C. The first task was to change all the
function declarations from the K&R C style to the C++
style, and to declare the function prototypes in the
header files. This was tedious work, but quickly
demonstrated the power of requiring accurate function
prototypes. We kept compiling the files until the
compiler no longer complained, knowing that, only
then, did the uses match the definitions. In the long
run, we found required function prototypes to be the
single biggest advantage of C++ over K&R C or even
ANSI C.
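
As a small illustration of the conversion described above,
consider the sketch below. The function scale and its
arguments are hypothetical and are not taken from the
TeamWare sources. The K&R definition appears only in a
comment because C++ does not accept it.

```cpp
// K&R C style definition, shown for comparison only (not valid C++):
//
//     int scale(value, factor)
//         int value;
//         double factor;
//     { ... }

// Prototype, as it would appear in a header file.
int scale(int value, double factor);

// C++ style definition matching the prototype.
int scale(int value, double factor)
{
    return static_cast<int>(value * factor);
}

// With the prototype in scope, a call such as scale("10", 2.5) is a
// compile-time error, because a string literal cannot be converted to int.
```

Every call is checked against the prototype, which is what
made the "compile until the compiler stops complaining"
approach workable.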

Our early C++ days were very frustrating with
respect to the C++ error messages. We found the error
messages to be obscure and not terribly informative.
Each of us had experiences of spending several hours
trying to figure out what the compiler was telling us.
Coming from C programmers this is no small
statement. Over time this problem went away. We
concluded that the C++ error messages are probably
not much worse than the C compiler's, but it took us a
while to gain the same familiarity with them that we
have with the C compiler's.

2.2 Classes

Classes are the essence of C++'s object model. They
are like C structures with the addition of constructors
and destructors, public and private fields, member
functions and the ability for one class to inherit from
another. A class generally consists of the data needed
to represent a certain concept. The member functions are
routines that operate on that data and form the
interface to the class.

2.3 Constructors and Destructors

Constructors are routines that initialize newly
allocated objects and destructors are routines that
clean up before an object is de-allocated. An object is
created by having the new operator allocate memory,
and then the constructor initializes the memory.
Likewise, an object is destroyed by having its
destructor called to clean up the object (such as closing
open file descriptors), and then having the delete
operator de-allocate the memory. The new and
delete operators replace traditional malloc() and
free() usage.

We became fans of constructors and destructors.
Much of our C code had followed the same principles
by providing one routine which would allocate an
instance of the type and initialize its fields, and by
providing a second routine which would de-allocate
the appropriate member fields and the instance of the
type itself. We were pleased to have language support
for what we had been doing by hand. 

One aspect of constructors we found annoying is
that they must be kept very, very simple because it is
awkward to have a constructor return an error value,
such as failure to open a file. We kept our constructors
simple and then added a member function to do things
which might fail. However, this separates the complete
construction of an object into two, possibly separated,
pieces. It is possible to end up with a partially
constructed object. In one case, we passed into the
constructor the address of an error variable so the
constructor could return an error.
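
The two approaches just described might look roughly like
the following sketch. The class names SourceFile and
CheckedFile, and the error convention (1 for failure, 0 for
success), are assumptions made for illustration; they are
not the TeamWare classes.

```cpp
#include <cstdio>

// Hypothetical class, not from TeamWare. The constructor cannot fail;
// the step that can fail is moved into open().
class SourceFile {
public:
    SourceFile() : fp_(nullptr) {}
    ~SourceFile() { if (fp_) std::fclose(fp_); }
    SourceFile(const SourceFile &) = delete;
    SourceFile &operator=(const SourceFile &) = delete;

    // Second stage of construction; returns false on failure.
    bool open(const char *path)
    {
        if (fp_) std::fclose(fp_);
        fp_ = std::fopen(path, "r");
        return fp_ != nullptr;
    }
    bool is_open() const { return fp_ != nullptr; }

private:
    std::FILE *fp_;
};

// Alternative: the constructor reports failure through an error variable
// whose address is supplied by the caller.
class CheckedFile {
public:
    CheckedFile(const char *path, int *error) : fp_(std::fopen(path, "r"))
    {
        *error = (fp_ == nullptr) ? 1 : 0;
    }
    ~CheckedFile() { if (fp_) std::fclose(fp_); }
    CheckedFile(const CheckedFile &) = delete;
    CheckedFile &operator=(const CheckedFile &) = delete;

private:
    std::FILE *fp_;
};
```

With SourceFile, an object exists in a not-yet-opened state
between the constructor and open(), which is the partially
constructed object mentioned above. With CheckedFile, the
caller must remember to test the error variable.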

2.4 Function Overloading 

Function overloading allows you to have more than
one function with the same name as long as those
functions take different types of arguments. In C, the
use of a function named foo maps to one and only one
function defining foo. With function overloading, this
is no longer true. The reader must take into account the
arguments passed to foo to correctly map foo to its
implementation.

As old C programmers we left the C++ class with
an uneasy feeling about function overloading. We
were fairly convinced that use of function overloading
would be confusing and not prove to be worthwhile.

However, constructors encouraged us to use
function overloading. We often found it advantageous
to provide more than one constructor for a class.
Frequently, we would provide a very bare-bones
constructor which initialized all the member fields to
default values along with additional constructors
which did more and more sophisticated initialization.
We found this type of function overloading to be very
natural and useful. Overloaded constructors probably
represented about 90% of all our overloaded functions.
The remaining overloaded functions tended to be ones
which accepted different numbers of arguments. The
ones accepting fewer arguments would supply default
values for the missing arguments and call the one
accepting the most arguments. Default arguments
could have been used instead but, since they made us
nervous, we chose to make the defaulting explicit.
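
A sketch of both patterns follows. The class Label, the
function pad, and the default width of 8 are hypothetical
names and values chosen for illustration only.

```cpp
#include <string>

// Hypothetical class with overloaded constructors of increasing detail.
class Label {
public:
    Label() : text_(""), width_(0) {}                    // bare-bones defaults
    Label(const char *text) : text_(text), width_(0) {}
    Label(const char *text, int width) : text_(text), width_(width) {}

    const std::string &text() const { return text_; }
    int width() const { return width_; }

private:
    std::string text_;
    int width_;
};

// Overloads that differ in argument count. The shorter forms make the
// defaulting explicit by calling the form that takes every argument.
std::string pad(const std::string &text, int width, char fill)
{
    std::string out = text;
    while (static_cast<int>(out.size()) < width)
        out += fill;
    return out;
}

std::string pad(const std::string &text, int width)
{
    return pad(text, width, ' ');
}

std::string pad(const std::string &text)
{
    return pad(text, 8, ' ');
}
```

The same effect could be had with default arguments on the
three-argument pad; the explicit overloads simply keep the
default values visible as ordinary code.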

2.5 Member Functions

Member functions are a suite of functions associated
with a class. They can be viewed as providing the
interface to the class, that is, the set of operations
which can be applied to instances of the class. A
member function is called via a pointer to an object or
an actual object. Each member function is implicitly
passed a this pointer, which is a pointer to the object
through which the call was made. Inside a member
function the scoping rules change. The member fields
and functions can be referenced directly; it is not
necessary to use the this pointer.

Member functions were a big hit. At this point, we
were using objects and member functions, but no
inheritance. Without inheritance, member functions
are primarily syntactic sugar, but one we took a liking
to. One of the larger weaknesses of C is that there is a
single global name space for all functions. In C++,
each class provides a separate name space for its
member functions. This results in member functions
being given less verbose and more descriptive names,
resulting in more readable code.

Initially, we found directly referencing member
fields and functions rather disconcerting as we were
referencing names which were not declared to be
either local or global. Furthermore, we sometimes
declared a parameter with the same name as a member
field. The compiler did not complain about this either
and the scoping rules result in the parameter hiding the
member field. The latter can be a serious problem and
was the source of several very subtle bugs.
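
The hiding problem can be reproduced in a few lines. The
class Counter is a hypothetical example, not TeamWare code.

```cpp
class Counter {
public:
    Counter() : count(0) {}

    // BUG: the parameter named count hides the member field count, so
    // this assigns the parameter to itself and the member is unchanged.
    void set_buggy(int count) { count = count; }

    // One fix: name the member explicitly through the this pointer.
    // Renaming the parameter works equally well.
    void set_fixed(int count) { this->count = count; }

    int get() const { return count; }

private:
    int count;
};
```

After set_buggy(5) the counter still reads 0; after
set_fixed(5) it reads 5. Some compilers can warn about the
self-assignment, but the code is legal.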

Classes and member functions cause C++ to have
more name spaces than C, so the naming conventions
used in C programs often turn out to be inadequate for
C++. It would be advantageous to use a naming
convention which syntactically separates fields from
parameters and variables. We never completely came
to grips with this problem.

2.6 Public, Private, and Protected Fields

Member fields in a class can be either public, private
or protected. Public fields can be referenced from any
object (.) or object pointer (->); private fields can be
referenced only from within their class's member
functions and friends; protected fields are the same
as private unless inheritance is used.

Some kinds of fields are really private to a class.
These keep track of the internal state of an object and
users of the class have no need to either read or write
them. Other fields are of interest to the users of a class.
Making these private requires that functions be
provided to get and set the field. We called these
accessor functions.

We did not discuss group-wide conventions for
using public, private and protected fields, and
consequently two very different styles emerged. The
people writing the libraries dabbled with private
members, but did not find that they added much value
and eventually took to declaring all new members
public. The people writing the GUI applications made
much more significant use of private member fields
and their corresponding accessor functions. When
debugging, they found it useful to be able to set a
breakpoint in a set routine and catch all situations
where a given field was being set.

In hindsight, we believe we would have made
much greater use of private fields if we had been
providing a public API. Private fields give the
implementors of an API much greater control over the
interface by preventing arbitrary access. In our case,
we were providing an API to ourselves and we did not
find this level of interface control necessary.

2.7 Inline Functions

Inline functions have their bodies expanded at each
call site. They eliminate the function call and return
overhead in exchange for duplicate copies of their
bodies. 

A frequent objection C programmers have to
accessor functions is - "Why should I have to make a
function call just to get the value of a field? This will
be too expensive!". Inlined functions provide a very
nice solution to this problem. They allow the
implementor of a class to tightly control its interface
without needlessly sacrificing performance. When we
used private data, the corresponding get routine was
almost always inlined. Set routines were frequently
inlined as well.
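
A minimal sketch of inlined accessor functions is shown
below. The class Window and its width field are
hypothetical. Member functions defined inside the class
body are implicitly inline.

```cpp
class Window {
public:
    Window() : width_(0) {}

    // Get routine: no control structures, a natural candidate for inlining.
    int width() const { return width_; }

    // Set routine: a breakpoint here catches every write to the field.
    void set_width(int w) { width_ = w; }

private:
    int width_;
};
```

Users of the class write w.width() rather than touching the
field, yet an optimizing compiler is free to reduce the call
to a plain load.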

A common question regarding inlined functions is
- "What size function should be inlined?". In
answering this question you must consider the
performance gain resulting from removing the
function call overhead, versus a possible increase in
code size. Someone suggested to our group a rule of
thumb we think makes sense - do not inline any
function which contains control structures.

2.8 Public, Private, and Protected Member Functions

Public, private and protected apply to a class's member
functions as well as its fields. Just as for fields, they
define a member function's scope. We found we used
private member functions less often than private fields.

As with private fields, we believe we would have
made greater use of private member functions if we
had been providing a public API as they separate the
interface from the implementation.

2.9 Inheritance and Virtual Functions

Some of the early code we ported included a list
package. This took us on our first journey into
inheritance and virtual functions. Inheritance provides
for building one class (a derived class) from another (a
base class). The derived class becomes a superset of
the base class and has all the base class's functionality
along with any new functionality provided by the
derived class. Virtual functions allow the derived class
to modify the behavior of the base class. If the base
class contains a virtual function and the derived class
provides a function with the same name and
arguments, then the derived class's function
supersedes the base class's.

Inheritance and virtual functions encourage
implementing generic functionality in a base class, and
then specializing that functionality in each of the
derived classes. A list package is a good example.
Much of the implementation of a list package is
generic and applies to all lists regardless of the
elements they contain. However, some of the
implementation is specific to each type of list. Printing
the elements of a list is an example; walking the list to
visit each element is generic while the actual printing
of an element is specific to the element type.

Many C programmers are initially confused by the
semantics of inheritance and virtual functions. They
frequently have trouble determining whether a given
call will invoke the base class's function or the derived
class's function. Say you have a base class with a
non-virtual member function; you can call this function
via an object derived from the base class. If this
non-virtual function in turn calls a virtual function, does
it call the base class's function or the derived class's
function? The answer is obvious to experienced C++
programmers; however, it is often confusing to C
programmers first learning C++.

We found it much easier to understand these
semantics by thinking about the implementation. First,
the C++ compiler will try to resolve as many function
references as it can at compile time. Any reference to
a non-virtual function is resolved at compile time. Any
reference to a virtual function cannot be resolved at
compile time. Secondly, a class's virtual functions are
put into a table of virtual function pointers and this
table is associated with every instance of that class. In
the earlier example, when a non-virtual function is
called from a derived object, a pointer to the derived
object is passed into the function (the this pointer).   
The virtual function is then found in the this
pointer's virtual function table. Therefore, the derived
object's function gets called.
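
The situation can be written down directly. The classes
Base and Derived and their member functions are
hypothetical names used only for this sketch.

```cpp
#include <string>

class Base {
public:
    virtual ~Base() {}

    // Non-virtual: the call to describe() is resolved at compile time.
    // Inside it, name() is virtual and is looked up through the object.
    std::string describe() const { return "I am " + name(); }

    virtual std::string name() const { return "Base"; }
};

class Derived : public Base {
public:
    virtual std::string name() const { return "Derived"; }
};
```

Calling describe() on a Derived object yields
"I am Derived", whether the call is made on the object
itself or through a pointer to Base.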

2.10 List Package

We had two goals for our list package. First, to
preserve the ability to have lists of things which were
unaware they were in a list. Second, to have typed lists,
that is, a list of ints, a list of char *s, a list of
pointers to class foos, etc., rather than generic lists.
The first couple of attempts at converting the list
package were feeble. The interfaces were clumsy,
there were many friend declarations, and usage
was awkward. The third iteration settled down into
what we felt was a pretty reasonable interface and by
this time all the friend declarations had
disappeared. We also concentrated on making it easy
to create lists of new types.

It has become apparent that object oriented
languages do not deal well with implementing generic
container classes. In C++ this is the motivation behind
templates. However, we were using cfront 2.0
which predated templates.

An example of the generic container problem is
shown by trying to copy a list. Copying a list is
something the base list class should do as it
understands the implementation of lists. A virtual
function should be used to copy an element of a list as
this allows each derived list to provide its own copy
function. So far so good, but what type does the base
class's copy routine return? That is the dilemma. The
only type it knows about is the base class, yet this is the
wrong type to return since it is copying a typed list. We
found it necessary to have the base class's copy routine
return a pointer to the base class and then have each
derived class also supply a copy routine. The derived
class's copy routine then calls the base class's copy
routine and casts the return value to a pointer to the
derived class's type.

This same problem occurs for a few other
functions, and applies to all derived lists. Therefore,
we wrote a macro which, given the derived list's type,
generates all the necessary functions.

Likewise, there are several virtual functions which
manipulate the elements of a list. The elements are
stored as opaque types so each of these functions needs
to cast the opaque type to the appropriate element type.
We wrote a second macro to generate these functions.
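
The sketch below shows one way such a list package could be
arranged. It is a reconstruction under stated assumptions,
not the TeamWare code: the names List, DECLARE_LIST and
IntList are hypothetical, the two macros described above are
folded into one, and elements are stored as heap-allocated
copies behind void pointers.

```cpp
// Generic base list: knows how to walk and copy nodes, but sees the
// elements only as opaque pointers.
class List {
public:
    List() : head_(nullptr), tail_(nullptr), count_(0) {}
    virtual ~List() {}               // derived destructors call clear()
    int count() const { return count_; }

    // Generic copy. The only type it can return is a pointer to the base.
    List *copy() const
    {
        List *result = make_empty();
        for (Node *n = head_; n != nullptr; n = n->next)
            result->append_opaque(copy_element(n->item));
        return result;
    }

protected:
    struct Node { void *item; Node *next; };

    void append_opaque(void *item)
    {
        Node *n = new Node;
        n->item = item;
        n->next = nullptr;
        if (tail_) tail_->next = n; else head_ = n;
        tail_ = n;
        ++count_;
    }
    void *opaque_at(int index) const
    {
        Node *n = head_;
        while (index-- > 0) n = n->next;
        return n->item;
    }
    void clear()
    {
        while (head_) {
            Node *n = head_;
            head_ = n->next;
            free_element(n->item);
            delete n;
        }
        tail_ = nullptr;
        count_ = 0;
    }

    // Element-specific operations supplied by each typed list.
    virtual List *make_empty() const = 0;
    virtual void *copy_element(void *item) const = 0;
    virtual void free_element(void *item) const = 0;

private:
    Node *head_;
    Node *tail_;
    int count_;
    List(const List &);
    List &operator=(const List &);
};

// Generates a typed list: the casting copy() plus the virtual functions
// that cast the opaque element to TYPE.
#define DECLARE_LIST(NAME, TYPE)                                          \
class NAME : public List {                                                \
public:                                                                   \
    NAME() {}                                                             \
    ~NAME() { clear(); }                                                  \
    void append(TYPE v) { append_opaque(new TYPE(v)); }                   \
    TYPE at(int i) const { return *static_cast<TYPE *>(opaque_at(i)); }   \
    NAME *copy() const { return static_cast<NAME *>(List::copy()); }      \
protected:                                                                \
    List *make_empty() const { return new NAME; }                         \
    void *copy_element(void *p) const                                     \
        { return new TYPE(*static_cast<TYPE *>(p)); }                     \
    void free_element(void *p) const { delete static_cast<TYPE *>(p); }   \
};

DECLARE_LIST(IntList, int)
```

Creating another typed list is then a single line, for
example DECLARE_LIST(LongList, long). The casts are confined
to the macro, so users of IntList never see a void pointer.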

We believe that templates would have led to a
cleaner solution to these problems but have not yet had
a chance to try them. In the end, we were pleased with
the list package (and its cousin the hash package). Our
libraries contain nine different types of lists. Several
members of the team commented that creating new list
types was very easy and beneficial.

2.11 More Inheritance

We applied inheritance and virtual functions in several
other places as well. They are very powerful concepts.
It takes a bit of effort to become comfortable with them
but their use can generate significant rewards. We
found that using classes, inheritance, and virtual
functions resulted in greater code sharing. Obviously,
this level of code sharing was possible using C but it
takes much more discipline. With C++, the language
provided support for these concepts, making it much
easier to achieve code sharing.

For those familiar with TeamWare, its
bringover and putback commands do very
similar things yet they differ in a few areas. They are
implemented with a base class called a Transaction
and derived Bringover and Putback classes. The
derived classes do things like argument parsing and
implement the differences between the bringover
and putback commands while the bulk of the work
is done in the Transaction class.

It was common, when a member of the group first
encountered this implementation, for them to ask -
"how do I tell if I'm in a bringover or putback
command? Where is the global variable to test?". The
answer would be - "There is no global variable. An
object knows which one it is so, if you are doing
something unique to one of the commands, then it
should be done in a virtual function.". This was a new
way of thinking.
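
In outline, the arrangement might look like the sketch
below. The class names Transaction, Bringover and Putback
come from the description above; the member functions run
and transfer, and the strings they return, are invented for
illustration and say nothing about the real implementation.

```cpp
#include <string>

class Transaction {
public:
    virtual ~Transaction() {}

    // Shared work lives in the base class. The one step that differs is
    // a virtual call, so no global "which command am I?" flag is needed.
    std::string run() const
    {
        return "lock;" + transfer() + ";unlock";
    }

protected:
    virtual std::string transfer() const = 0;
};

class Bringover : public Transaction {
protected:
    std::string transfer() const { return "parent->child"; }
};

class Putback : public Transaction {
protected:
    std::string transfer() const { return "child->parent"; }
};
```

The base class never tests which command it is running; the
object supplies the command-specific behavior itself.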

Sometimes we would create a class not expecting
it to become a base class only to discover later on that
we needed to derive another class from it. We then
faced the question - should all the base class's
functions be made virtual, or should only those
functions which our derived classes replace be made
virtual? We did a little of both with no obvious results
favoring either technique. We feel this issue comes
back to whether or not you are providing a public API.
If so, then you probably want to make your classes
very flexible and allow derived classes to replace
many of the functions. This implies that most, if not
all, public member functions should be virtual.
Otherwise, our experience indicates that it really
doesn't matter.

2.12 Pure Virtual Functions and Abstract Base Classes

A pure virtual function is a virtual function in a base
class for which no actual function is defined. A pure
virtual function is declared by putting = 0 at the end of
its declaration. If a class contains a pure virtual
function, then it is an abstract base class. The
significance is that the compiler will not allow you to
have any instances of an abstract base class.

Abstract base classes should be used when the base
class is so generic that it is not useful just by itself.
Only when some key functionality is supplied by a
derived class does it become useful. Our list package
was an example of an abstract base class. With typed
lists, an instance of the base list class is not
meaningful. Without pure virtual functions, the base
class's implementation of these functions would
probably consist of printing a nasty message and then
exiting.
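
A minimal sketch of an abstract base class follows. The
names AbstractList and CharPtrList are hypothetical, and the
static_assert checks use a library facility (type_traits)
that became available long after this project.

```cpp
#include <string>
#include <type_traits>

class AbstractList {
public:
    virtual ~AbstractList() {}
    virtual std::string element_type() const = 0;   // pure virtual
};

class CharPtrList : public AbstractList {
public:
    std::string element_type() const { return "char *"; }
};

// AbstractList list;   // would not compile: AbstractList is abstract

static_assert(std::is_abstract<AbstractList>::value,
              "the base list cannot be instantiated");
static_assert(!std::is_abstract<CharPtrList>::value,
              "the typed list can be instantiated");
```

The mistake of creating an untyped list is caught by the
compiler rather than by a run-time message.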

2.13 Operator Overloading

Operator Overloading is the ability to redefine the
basic C++ operators. TeamWare's applications did not
lend themselves to needing operator overloading. We
only overloaded the new and delete operators for
some classes so that we could impose our own
memory management. This was very useful and
allowed us to elegantly gain significant performance
improvements.
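
One common form of class-specific memory management is a
free list that recycles blocks of a fixed size. The sketch
below assumes that technique; the text above does not say
which scheme TeamWare used, and the class Delta and the
counter reused are hypothetical.

```cpp
#include <cstddef>
#include <new>

class Delta {
public:
    Delta() : serial(0) {}
    int serial;

    // Serve requests from the free list when possible.
    static void *operator new(std::size_t size)
    {
        if (size == sizeof(Delta) && free_list != nullptr) {
            FreeBlock *b = free_list;
            free_list = b->next;
            ++reused;
            return b;
        }
        // Make sure every block is large enough to hold a free-list link.
        return ::operator new(size < sizeof(FreeBlock) ? sizeof(FreeBlock)
                                                       : size);
    }

    // Return blocks to the free list instead of the global heap.
    static void operator delete(void *p, std::size_t size)
    {
        if (p == nullptr) return;
        if (size != sizeof(Delta)) { ::operator delete(p); return; }
        FreeBlock *b = static_cast<FreeBlock *>(p);
        b->next = free_list;
        free_list = b;
    }

    static int reused;               // allocations satisfied by recycling

private:
    struct FreeBlock { FreeBlock *next; };
    static FreeBlock *free_list;     // kept for the life of the process
};

int Delta::reused = 0;
Delta::FreeBlock *Delta::free_list = nullptr;
```

Code that uses the class still writes new Delta and delete
in the ordinary way; only the class itself knows that the
memory is being recycled.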

General operator overloading seems like a feature
that is likely to be abused. During the C++ class, Hank
warned us to never overload an operator to do
something entirely different than the operator's normal
semantics. This is very reasonable advice. If you have
an object and doing things like adding two of them
together makes sense, then operator overloading may
be the way to go. However, we would recommend that
it be used with caution.

2.14 Calling C Routines From C++

It is common to need to call C routines from C++ as
many libraries, such as libc, contain routines written
in C. At first glance, this wouldn't appear to pose any
problems. However, since C++ allows function
overloading, it is forced to perform name-mangling on
function names. That is, if you have three different
functions named foo, then C++ has to invent a unique
name for each of them. External routines written in C
will not have mangled names, so C++ allows you to
indicate that a given function is a C routine and that its
name should not be mangled. This is done by
preceding a declaration with extern "C".

The extern "C" declarations provide a nice
mechanism and, when needed, are absolutely crucial.
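
The sketch below shows the usual shape of such a
declaration. The routine legacy_checksum is hypothetical; a
stand-in body is defined in the same file only so that the
example is complete, whereas a real C routine would be
compiled separately by a C compiler.

```cpp
// Declarations as they might appear in a header shared by C and C++
// sources. The __cplusplus guard hides extern "C" from a C compiler.
#ifdef __cplusplus
extern "C" {
#endif

int legacy_checksum(const char *text);

#ifdef __cplusplus
}
#endif

// Stand-in for the C implementation, so that the sketch links on its own.
extern "C" int legacy_checksum(const char *text)
{
    int sum = 0;
    while (*text != '\0')
        sum += static_cast<unsigned char>(*text++);
    return sum;
}

// Overloaded C++ functions get distinct mangled names; the C routine
// keeps the plain name legacy_checksum.
int checksum(const char *text) { return legacy_checksum(text); }
int checksum(char c)
{
    const char buf[2] = { c, '\0' };
    return legacy_checksum(buf);
}
```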

2.15 Calling C++ Routines from C

Calling C++ routines from C is an entirely different
matter. There are two ways to call external C++
routines from C. You can either deduce the routine's
mangled name and call it, or you can define the global
C++ routine to be extern "C" and defeat the name
mangling.

It is more challenging to call C++ member
functions from C because you don't have objects in C.
Say you have a C++ class and a corresponding
structure in C. To call a member function, you would
need to write a wrapper routine in C++ which takes the
structure as a parameter, converts it to an object and
then calls the appropriate member function. The
wrapper routine must also be prepared to convert any
return values.
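
A wrapper of the kind described might be sketched as
follows. Everything here is hypothetical: the structure
c_point, the class Point and the wrapper point_manhattan are
invented for illustration.

```cpp
// Visible to C code: a plain structure and a function with C linkage.
extern "C" {
struct c_point { int x; int y; };
int point_manhattan(const struct c_point *p);
}

// The C++ class that corresponds to the structure.
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    int manhattan() const
    {
        return (x_ < 0 ? -x_ : x_) + (y_ < 0 ? -y_ : y_);
    }

private:
    int x_;
    int y_;
};

// Wrapper written in C++ but callable from C.
extern "C" int point_manhattan(const struct c_point *p)
{
    Point obj(p->x, p->y);      // convert the structure to an object
    return obj.manhattan();     // an int result needs no conversion
}
```

The C side sees only c_point and point_manhattan; the class
and its member function stay on the C++ side.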

We called global C++ routines from C and did so
by deducing the mangled names. This was the wrong
way to do it. We were unaware of the extern "C"
technique until Hank pointed it out while reviewing
this paper. We never tried to call C++ member
functions directly from C.

Having mangled names in our C sources leaves us
at the mercy of the C++ compiler. There is no
guarantee that all C++ compilers will mangle names
in the same way or that a given C++ compiler will
always use the same technique. This creates both
portability and maintenance problems. The extern
"C" approach is the civilized technique.

2.16 Comments

C++ introduces a new syntax for comments: // starts a
comment that runs until the next newline.

This is another topic that we never discussed
amongst the group. Consequently, some people chose
to use // and others /* */. Furthermore, we used
gxv++ and it generated code with // comments.
Ultimately we ended up with some files using //,
some using /* */ and, worse yet, some using both.

There is clearly no right or wrong answer here.
However, it would have been preferable if we had all
used the same style.

2.17 set_new_handler()

The new operator uses malloc() to allocate a new
object. If it is unable to allocate memory, then it returns
a NULL pointer. The C++ library routine
set_new_handler() allows you to register a
function to be called when the new operator fails.

set_new_handler() provides a nice way to
intercept malloc() failures within the built-in new
operator. The alternative is to check the return value of
every call to the new operator. We used
set_new_handler() to register one routine for
command line programs and a different routine for
GUI programs. The command line routine printed an
error message and exited and the GUI routine
displayed a pre-allocated notice and then exited after
the notice was dismissed.
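
Registering a handler for a command line program might look
like the sketch below. The function names are hypothetical.
Note that in later standard C++ a failing new throws
std::bad_alloc when no handler is installed, rather than
returning NULL as described above.

```cpp
#include <cstdio>
#include <cstdlib>
#include <new>

// Hypothetical handler for command line programs: report and exit.
void cli_out_of_memory()
{
    std::fputs("out of memory\n", stderr);
    std::exit(1);
}

// Installs the handler and returns whichever handler was there before.
std::new_handler install_cli_handler()
{
    return std::set_new_handler(cli_out_of_memory);
}
```

A program would call install_cli_handler() once, early in
main(), after which no individual use of new needs its own
check.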

3.0 Features We Chose Not to Use

Until now, this paper has described the features of C++
we used and our experiences with those features. This
section describes the features we chose not to use and
why.

3.1 Multiple Inheritance and Virtual Base Classes

Multiple inheritance allows a derived class to inherit
from more than one base class. Each base class can
also be the product of multiple inheritance, creating an
inheritance DAG. A class derived via multiple
inheritance exports an interface which is the union of
the interfaces exported by all its base classes. The
inheritance DAG can include the same base class more
than once. In this situation, virtual base classes control
whether or not the derived class has just one or many
instances of the base class.

When considering inheritance the is-a versus
has-a relationship is crucial. If a derived class is one
of another class, then inheritance is proper; however,
if the derived class has one of another class, then
aggregation is proper. For example, an editor window
would derive from a generic window class because it
is a window. An editor window would also have a font,
but it would not inherit from the font class. Rather, it
would contain an instance of the font class or a pointer
to an instance. For multiple inheritance to be proper, a
derived class needs to be one of several other classes.
We never encountered this situation.
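
The editor window example can be sketched as follows. The
class and member names, and the font size of 12, are
hypothetical.

```cpp
class Font {
public:
    explicit Font(int size) : size_(size) {}
    int size() const { return size_; }

private:
    int size_;
};

class Window {
public:
    virtual ~Window() {}
    virtual bool editable() const { return false; }
};

// An editor window is a window, so it inherits from Window.
// It has a font, so it contains a Font rather than inheriting from it.
class EditorWindow : public Window {
public:
    EditorWindow() : font_(12) {}
    virtual bool editable() const { return true; }
    int font_size() const { return font_.size(); }

private:
    Font font_;
};
```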

The C++ books we had were little help either.
Despite Waldo's [1991] protests to the contrary, we
agreed with Cargill [1991] that the multiple
inheritance examples in the books were contrived and
could have been more cleanly expressed with
aggregation.

We found single inheritance easy to understand
and use and extremely powerful. However, we found
multiple inheritance to be very complicated and
confusing. With multiple inheritance, the reader of a
class definition must assimilate the entire inheritance
DAG complete with virtual and non-virtual base
classes to understand the class. Multiple inheritance
creates an awkward ambiguity when two base classes
have a member with the same name and the name
resolution rules in a complicated inheritance DAG are
extremely difficult.

More than any other feature in C++, multiple
inheritance appears to be a large wart on the language.

3.2 Reference Parameters

Reference parameters allow you to pass parameters by
reference rather than by value. There are some
situations in which reference parameters are highly
desirable, namely operator overloading, where they
clean up an otherwise very ugly syntactical situation.
There are other cases where they can cause confusion
such as for base types. Reference parameters are an
attempt to clean up some of the notation associated
with pointers in C. In C, if you have a structure and
want to pass a pointer to it as a parameter you must
take the structure's address with the & operator.
References let the compiler do this for you. 

Linton [1993] contends that reference parameters
aid in storage management. Typically, a local object is
passed as a reference parameter. Therefore, the
receiving function is promising not to store the
parameter's address in a global data structure.

More fundamentally, a C++ programmer must
decide if their basic programming model is pointer
based or object based, that is, do they have pointers to
objects or just objects. This is complicated by the fact
that objects can be declared globally and locally, yet
the new operator returns a pointer to an object, not the
object itself. Reference parameters assume that you
are object based, not pointer based; however, this
makes it more awkward to use the new operator. Most
real applications use the new operator because most
applications need to dynamically allocate objects.

Since we were all experienced C programmers, we
were comfortable with the notation associated with
pointers. We had also developed a programming
model in our heads which reference parameters
change. For example, in C, parameters cannot be
changed in a way which affects the calling function.
Early in our development someone used a reference
for an int parameter. Another person was reading the
code and thought he had found a bug because the int
was being passed to a routine and the next line clearly
assumed the int's value had changed. We all knew an
int passed by value could not be changed in the caller
so this must be a bug. However, since the called
routine declared a reference parameter, the compiler was
actually passing in the address of the int, so the
parameter was being modified by the call. We avoided
references because of that incident, because our code
made heavy use of the new operator, and because we
did not use operator overloading. There are some
tricks which old dogs can't learn.
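
The difference at the call site is easy to see in a small
sketch. The two functions are hypothetical.

```cpp
// Pointer parameter: the caller must write &value, so the possibility
// of modification is visible at the call site.
void bump_by_pointer(int *n) { *n += 1; }

// Reference parameter: the caller writes just value, exactly as for
// pass by value, yet the caller's variable is modified.
void bump_by_reference(int &n) { n += 1; }
```

A call reads bump_by_pointer(&v) in the first case and
bump_by_reference(v) in the second. Nothing in the second
call tells a reader trained on C that v may change.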

3.3 Friends

Friends allow you to specify that a function or an
entire class can access a class's private member fields
and functions.

We observed that friends are generally used when
interfaces are not cleanly defined. They can be thought
of as the casts of classes, that is, a way to circumvent
the language's built-in safeguards. We were not
successful in completely avoiding friends, but we
strongly suspect that the two places we used them
signify flaws in our interfaces. We discourage the use
of friends.
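As an illustration of the mechanism, the sketch below gives a non-member function access to a private field. The class Counter and the function peek are hypothetical; the public total() function shows the interface that would make the friend unnecessary.

```cpp
#include <cassert>

class Counter {                          // hypothetical class
public:
    Counter() : hits(0) {}
    void record() { hits++; }
    int  total() const { return hits; }  // the clean, public interface
private:
    int hits;
    friend int peek(const Counter *c);   // grants peek() access to hits
};

// Not a member function, yet allowed to read the private field.
int peek(const Counter *c) { return c->hits; }

int friend_demo()
{
    Counter c;
    c.record();
    c.record();
    assert(peek(&c) == c.total());       // same answer, two routes
    return peek(&c);                     // 2
}
```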

3.4 I/O Streams

C++ provides an I/O Streams package as a
replacement for C's stdio package. Stdout is
represented by the cout object and output is done
with the overloaded << operator; stdin is
represented by the cin object and input is done with
the overloaded >> operator.

We took an initial dislike to this package. We were
perfectly comfortable with printf, we were told that
I/O Streams have somewhat worse performance than
the stdio package, and we did not care for the
syntax. Actually, this package violates one of the
axioms for operator overloading that Hank told us
about - "Never overload an operator for something
other than its normal purpose". This package
overloads the shift operators for input and output.
Deciding to not use C++'s I/O streams was probably
one of the quicker decisions we made.
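For comparison, the sketch below formats the same message with both packages. The function names are hypothetical, and a string stream stands in for cout so that the result can be compared; it accepts the same overloaded << operator.

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>

// Formats a message with the stdio package.
std::string with_stdio(const char *name, int files)
{
    char buf[128];
    std::snprintf(buf, sizeof buf, "%s: %d files\n", name, files);
    return buf;
}

// Formats the same message with I/O Streams.
std::string with_streams(const char *name, int files)
{
    std::ostringstream out;
    out << name << ": " << files << " files\n";
    return out.str();
}

bool same_output()
{
    return with_stdio("ws", 3) == with_streams("ws", 3);
}
```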

3.5 Operator Overloading

As mentioned above, we overloaded the new and
delete operators for some classes. Otherwise, we
did not use this feature. We felt that operator
overloading could be a very seductive feature, as you
can somewhat create your own language. This
temptation should be avoided. There are probably
some very good situations for operator overloading,
but care should be taken to ensure they are not used
gratuitously.
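A class-specific new and delete can be sketched as follows. The class Node and the allocation counter are assumptions made for illustration only; this section does not show how TeamWare's overloaded versions were written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

class Node {                  // hypothetical class
public:
    Node() : next(0) {}
    Node *next;

    static int live;          // Nodes currently allocated (illustrative)

    // Used for "new Node" in place of the global operator new.
    void *operator new(std::size_t size)
    {
        void *p = std::malloc(size);
        if (p == 0)
            throw std::bad_alloc();
        live++;
        return p;
    }

    // Used for "delete ptr" when ptr is a Node *.
    void operator delete(void *p)
    {
        if (p != 0) {
            live--;
            std::free(p);
        }
    }
};

int Node::live = 0;

int overload_demo()
{
    Node *a = new Node;
    Node *b = new Node;
    assert(Node::live == 2);
    delete a;
    delete b;
    return Node::live;        // 0
}
```

Callers still write plain new and delete, so this overload does not change how the code reads.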

3.6 Default Arguments

Default arguments allow a function call to omit its
trailing arguments, which then receive default
values. Default arguments are essentially a short-hand
for writing several overloaded functions.

One person in our group used a few default
arguments and liked them. The rest of us never felt
they were necessary and in similar situations wrote the
overloaded functions.
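The equivalence between the two styles can be sketched as follows; the functions area and area2 are hypothetical.

```cpp
#include <cassert>

// One function with a default value for its trailing argument ...
int area(int width, int height = 1) { return width * height; }

// ... behaves like this pair of overloaded functions.
int area2(int width, int height) { return width * height; }
int area2(int width)             { return area2(width, 1); }

int default_demo()
{
    assert(area(5) == area2(5));         // both supply height = 1
    assert(area(5, 3) == area2(5, 3));
    return area(5);                      // 5
}
```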

3.7 Local Declarations Anywhere

C requires that local variables be declared at the
beginning of statement blocks. C++ lets you declare
variables anywhere. For example, if an int is used
just in a for loop, then the for loop can be written:

for (int i = 0;...

We found very little practical value to this feature.

4.0 Features We Didn't Think About

C++ is a large, complex language. The following are a
set of features that we really didn't think about and
therefore did not use. 

4.1 Global Objects

Global objects are instances of classes with a global
scope. They must be initialized via a constructor
before main() is called. C++ sets up the executable
this way; however, the order in which constructors in
different source files are invoked is not defined. So, if
one global object's initialization depends upon another
there are potential ordering problems.

We did not have any global objects, in part because
we used a pointer based programming model, so we
never encountered the ordering problem. However,
another group in SunPro did encounter this problem
and warned us about it.
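The sketch below shows how a pointer based model sidesteps the problem. The classes Log and Cache and the startup routine are hypothetical; the point is only that global pointers need no constructor calls before main(), so the program itself fixes the order of construction.

```cpp
#include <cassert>

class Log {                   // hypothetical class
public:
    Log() : lines(0) {}
    void add() { lines++; }
    int lines;
};

class Cache {                 // hypothetical class that needs a Log
public:
    Cache(Log *l) : log(l) { log->add(); }
    Log *log;
};

// Global pointers, not global objects: no constructor runs before main().
Log   *the_log   = 0;
Cache *the_cache = 0;

// Called explicitly at startup, so the order is the order written here.
void startup()
{
    the_log   = new Log;
    the_cache = new Cache(the_log);   // safe: the_log already exists
}

int startup_demo()
{
    startup();
    assert(the_cache->log == the_log);
    return the_log->lines;            // 1
}
```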

4.2 Copy Constructors

Copy constructors are used when an object is copied.
For example, if a class has a char * field where each
instance has its own copy of the string, then, when this
object is copied, the char * field should be duplicated
rather than just copying the pointer. Copy constructors
provide this mechanism.

This is another situation which the pointer based
programming model seems to avoid.
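The char * case described above can be sketched as follows. The class Label is hypothetical, and assignment is declared but left undefined to keep the example short.

```cpp
#include <cassert>
#include <cstring>

class Label {                                   // hypothetical class
public:
    Label(const char *s) : text(copy_of(s)) {}
    Label(const Label &other)                   // copy constructor
        : text(copy_of(other.text)) {}
    ~Label() { delete [] text; }
    char *text;
private:
    static char *copy_of(const char *s)
    {
        char *p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
    Label &operator=(const Label &);            // not needed here
};

int copy_demo()
{
    Label a("main.c");
    Label b(a);                // invokes the copy constructor
    b.text[0] = 'M';           // changes b's string only
    assert(std::strcmp(a.text, "main.c") == 0);
    assert(std::strcmp(b.text, "Main.c") == 0);
    return a.text != b.text;   // 1: each object owns its own buffer
}
```

In a pointer based model a Label * is passed around instead, so only the pointer is copied and the copy constructor is never invoked.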

4.3 Static Member Fields

Static member fields are fields which are shared across
all instances of a class. They can be considered
global data which is scoped by a class, that is, they
are accessed through the class's name and are subject
to its access control.

4.4 Static Member Functions

Static member functions are member functions which
are not passed a this pointer. Like static member
fields, they can be considered global functions which
are scoped by a class.
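A minimal sketch of both kinds of static member follows; the class Session is hypothetical.

```cpp
#include <cassert>

class Session {                       // hypothetical class
public:
    Session()  { open_count++; }
    ~Session() { open_count--; }
    // Static member function: no this pointer, callable without an object.
    static int count() { return open_count; }
private:
    static int open_count;            // one copy shared by every Session
};

int Session::open_count = 0;          // the single definition of the field

int static_demo()
{
    assert(Session::count() == 0);    // no instance exists yet
    Session a;
    {
        Session b;
        assert(Session::count() == 2);
    }                                 // b is destroyed here
    return Session::count();          // 1: only a remains
}
```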

4.5 Const

Const is new to ANSI C and to C++. Unfortunately it
is somewhat different in the two languages. We were
coming from a K&R C world without consts.

We never gave consts much thought. Looking back
we do not recall any situations in which consts would
have saved us. Still, they seem like a reasonable
feature, especially in interfaces to class libraries. If an
interface rigorously uses consts in its declarations,
then users can know which parameters may be
modified by which routines. This looks like another
feature which is more valuable when producing a
public API.
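The following sketch shows what such an interface might look like. The function copy_name and the class Path are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// The declaration alone tells a caller that src is only read
// and that dst may be written.
void copy_name(char *dst, const char *src, int dst_size)
{
    std::strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
    // src[0] = 'x';   // would not compile: src points to const char
}

class Path {                                    // hypothetical class
public:
    Path(const char *p) { copy_name(name, p, (int) sizeof name); }
    // A const member function promises not to modify the object.
    int length() const { return (int) std::strlen(name); }
private:
    char name[64];
};

int const_demo()
{
    const Path p("src/main.c");
    return p.length();   // allowed on a const object; 10
}
```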

5.0 Conclusions

The TeamWare group felt that using C++ was an
advantage and that using C++ helped reduce
development time and increase the quality of our code.
We found the single largest advantage to be required
function prototypes. We converted two existing C
programs to C++ and in both cases function
prototypes found one or two latent errors. Required
function prototypes eliminate an entire class of errors
which occur in C and give programmers much more
confidence. However, function prototypes alone are
not sufficient reason to use C++, as an ANSI C
compiler which enforced the use of prototypes would
be a more sensible choice.
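A small sketch of the class of error involved follows; the function half is hypothetical. In K&R C, with no prototype in scope, the int argument below would be passed unconverted and misread by the function.

```cpp
#include <cassert>

// The prototype is visible at every call site.
double half(double x);

double prototype_demo()
{
    // The compiler sees the prototype and converts the int 3 to 3.0.
    double h = half(3);
    assert(h == 1.5);

    // half();      // rejected at compile time: too few arguments
    // half("3");   // rejected at compile time: wrong argument type
    return h;
}

double half(double x) { return x / 2; }
```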

We were very pleased with C++'s object model.
This includes classes, constructors, destructors, single
inheritance, member functions, virtual functions and
abstract base classes. We found these features of the
language easy to adopt and easy to understand. C++'s
object model encourages the good programming
practices of modularization, well defined interfaces
and code sharing. The object model provides valuable
functionality which sets C++ apart from ANSI C.

One significant disappointment with C++ is that it
does not separate a class's interface from its
implementation. Abstract base classes are nice as they
provide an interface specification. However, if the
class also has private member functions, they must be
declared in the class definition along with the public
ones. If a private
member function is added to a class, its interface has
not changed but, in the world of make, the header file
has changed and all files including that header will be
recompiled.
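One way an abstract base class can limit this effect is sketched below. The classes Workspace and LocalWorkspace and the factory function open_workspace are hypothetical, and the paper does not say that TeamWare was organized this way. Only the first part would need to appear in a header.

```cpp
#include <cassert>

// What would live in the header: the interface only.
class Workspace {                        // hypothetical abstract base class
public:
    virtual ~Workspace() {}
    virtual int file_count() const = 0;  // pure virtual function
};
Workspace *open_workspace();             // hypothetical factory function

// What would live in a single source file: the implementation.
class LocalWorkspace : public Workspace {
public:
    LocalWorkspace() : files(0) { scan(); }
    virtual int file_count() const { return files; }
private:
    void scan() { files = 3; }           // private helper (stub value)
    int files;
};

Workspace *open_workspace() { return new LocalWorkspace; }

int interface_demo()
{
    Workspace *w = open_workspace();
    int n = w->file_count();             // dispatched through the interface
    delete w;
    return n;                            // 3
}
```

Adding another private function to LocalWorkspace changes only that source file, so files that include the Workspace header are not recompiled.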

Unfortunately, the object model includes multiple
inheritance and, consequently, virtual base classes.
Most examples of multiple inheritance found in C++
books are contrived and should be replaced with
aggregation. They do not provide good models to
follow. Multiple inheritance adds complexity to a
program. It should only be used if the is-a relationship
is satisfied for each inheritance and if the result is
simpler than aggregation. We believe that groups
which follow this advice will rarely, if ever, use
multiple inheritance.

Second only to multiple inheritance is the decision
regarding a pointer based programming model versus
an object based programming model, or whether or not
to use reference parameters. C programmers tend to
gravitate towards the pointer based programming
model as it is familiar to them. While we favored the
pointer based model, it is more important that a
decision be made and that the decision be applied
consistently throughout the project. In the case of
public APIs it might be appropriate to supply both
pointer and reference interfaces as this allows users of
a library to choose their own style. 

We chose to use a fairly small subset of C++ as we
found it had an overabundance of features.
Unfortunately, the ANSI C++ committee is not done
adding to the language. Exceptions and templates were
added after we started our project. The language is still
growing and looks like it will include run-time type
information (RTTI), name spaces and additional cast
operators. We fear that as C++ grows it becomes less
usable. This may result in C trying to adopt a useful
subset of C++ features. The ANSI C committee has
voted to reconvene itself and has received a proposal
to add classes with single inheritance [Jervis 1993].
While we welcome these features into C, we fear the
war between C and C++ that may result.

We are convinced that incrementally adopting C++
features and making conscious decisions about which
features to use and how to use them was the right thing
to do. If anything, we should have discussed more
thoroughly some of the features of the language, such
as private and public members and comments.

One final anecdote regarding the earlier story
where someone used an int reference parameter. In
discussing this paper, the programmer's comment was:
"Well, the feature was in the language so I figured I
should use it." It is our belief that this is not a
sufficient criterion for using a feature of C++. A feature
should be used only when it can be demonstrated to be
of benefit. A mountain is climbed "because it is there".
The same should not hold true for C++ features. Their
mere existence is not justification for use.

6.0 Summary

Below is a table of C++ features along with our
assessments:

Features We Used                  Comments

Function Prototypes               Most valuable feature
Objects                           Second best feature
Classes                           Well done
Constructors/Destructors          Good programming practice
Member Functions                  Good name space scoping
Single Inheritance                Promotes code sharing
Virtual Functions                 Powerful; promotes code sharing
Pure Virtual Functions            Ugly syntax; valuable
Abstract Base Classes             Ugly syntax; valuable
Function Overloading              Used mostly for constructors
Inlined Functions                 Very nice
Calling C from C++                Strange syntax; nice feature
Comments                          Why?
set_new_handler()                 Very convenient
Private or Protected Fields       Did not use effectively
Private or Protected Functions    Did not use effectively
Operator Overloading              Used only for new and delete
Calling C++ from C                Very awkward

Features We Chose Not to Use      Comments

Multiple Inheritance              Too complicated
Virtual Base Classes              Too complicated
Reference Parameters              Old dogs can't learn new tricks
Friends                           Indicative of problems
I/O Streams                       Saw no advantage over stdio
Default Arguments                 Used function overloading instead
Local Declarations Anywhere       Why?

Features We Didn't Think About    Comments

Global Objects                    Did not need them
Copy Constructors                 Used a pointer-based model
Static Member Fields              Provides tighter scoping
Static Member Functions           Provides tighter scoping
Const                             Good for public APIs

Acknowledgments

I would like to thank the TeamWare development team of
Jules Damji, Jill Foley, Claeton Giordano, Lewie
Knapp, Daniel O'leary, Marla Parker, Mark Sabiers
and our manager John Treacy.

References
[1] T. Cargill, Controversy: The Case Against
Multiple Inheritance in C++, Computing Systems,
4(1); Winter 1991.

[2] S. C. Dewhurst and K. T. Stark, Programming in
C++, Englewood Cliffs, NJ; Prentice Hall, 1989.

[3] B. Eckel, Using C++, Berkeley, CA; McGraw-Hill,
1990.

[4] M. A. Ellis and B. Stroustrup, The Annotated C++
Reference Manual, Reading, MA; Addison-Wesley,

[5] R. Jervis, Classes in C, Working paper for ANSI/
ISO WG14/N298 X3J11/93-044, 1993.

[6] M. A. Linton, private communication, 1993.

[7] J. Waldo, Controversy: The Case for Multiple
Inheritance in C++, Computing Systems, 4(2); Spring
1991.

Biography

Evan Adams has been at Sun Microsystems for over
11 years, working on compilers, debuggers (he wrote the
original dbxtool) and other programming tools.
He spent 4 1/2 years at Amdahl Corporation working on
UTS (Unix on IBM mainframes). He graduated with a BS
in Computer Science from Oregon State University in

  1. Email address: evan at eng dot sun dot com.