by Doug Lea.
An interface encapsulates a coherent set of services and attributes (broadly, a Role), without explicitly binding this functionality to that of any particular object. In general, one object may support several interfaces, and conversely, one interface may be implemented by several objects working in tandem. CORBA IDL is probably the best-known (but very impure) example of an IDL. IBM SOM is in part an IDL, but also contains additional features; many akin to the C++ variants described below.
OO Interfaces may be described as structured collections of operations, where each operation has a name and a type. The type is described as a signature of arguments and results, perhaps along with some indication of semantics (e.g., preconditions and postconditions) and/or protocols (e.g., descriptions of client-visible events such as callbacks produced upon invocation -- see for example PSL).
The types appearing in interface signatures in a pure OO IDL must either be value types representing abstract values in an opaque fashion (e.g., integers, strings), or be handle types that represent capabilities or connections to entities providing the services described in an indicated interface.
One interface may be described as a subinterface of another if it extends its properties (normally by listing additional services). In some IDLs one interface need not explicitly list that it is a subinterface of another; if it contains all of the same features, plus possibly more, it is considered a subinterface (this is known as type conformance -- see OOSD). It is common when defining interface hierarchies to use fairly fine-grained interfaces at the top, each defining only a few operations, and to use (multiple) interface inheritance to define ``fatter'', more useful ones as various combinations of these basic bits of functionality.
C++ does not directly support all of these notions, but contains mechanisms that achieve some of the effect.
Abstract classes can be used to define the C++ version of interfaces. A C++ abstract class describes functionality shared by objects of all concrete classes that are explicitly listed (perhaps indirectly) as subclasses. An object may play multiple roles by inheriting and implementing multiple interfaces. (The language doesn't directly support notions that an object may play different roles at different times, or that a role is implemented by a collection of objects, but these effects can usually be had in one way or another.)
C++ interface-style abstract classes take an idiomatic form:
class AnInterface {
public:
  virtual T1 aService(T2, T3) = 0;
  ...
  virtual T4 anAttribute() const = 0;       // get value
  virtual void anAttribute(T4) = 0;         // set value
  ...
  virtual T5 aReadOnlyAttribute() const = 0;
  virtual ~AnInterface() {}
protected:
  AnInterface() {}
};
Ideally, the types T should consist only of pass-by-value scalar types (int, float, enum, ...), collections of them (structs, ...) and/or pointers to objects of classes also defined via abstract classes (thus representing handles). The lack of native by-value string and array types in C++ is a problem here. One occasionally attractive alternative is to always contain fixed arrays in structs. But in practice, you can make ADT-style types work OK too. This way you can sometimes obtain simpler mechanics and also avoid value copying. ADT-style fake pointer classes may also be used instead of raw pointers for handle types, although there is no perfect way to do this.
Abstract classes sometimes lend themselves to parameterization over some type used in one or more signatures. (Some nice examples are described in Barton and Nachman's book.) Beyond the template prefix, nothing much changes except for the pragmatic problems of dealing with templates in C++. These include, for example, the fact that template instantiation errors are not usually reported until link time. This is best combatted by prefacing each template with a brief comment about what operations are assumed to be supported on the type (e.g., a < comparison).
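As a small sketch of the commenting convention just described, here is a parameterized interface whose assumptions about T are stated up front. The names MaxTracker, SimpleMaxTracker and largestOfThree are hypothetical, invented only for this illustration.

```cpp
#include <cassert>

// Assumes of T: copy construction and a < comparison (strict ordering).
template <class T>
class MaxTracker {
public:
  virtual void offer(const T& x) = 0;   // present a candidate value
  virtual bool empty() const = 0;       // true until the first offer
  virtual T max() const = 0;            // largest value seen; needs !empty()
  virtual ~MaxTracker() {}
protected:
  MaxTracker() {}
};

// Assumes of T, in addition: default construction and assignment.
template <class T>
class SimpleMaxTracker : public virtual MaxTracker<T> {
public:
  SimpleMaxTracker() : any_(false), best_() {}
  void offer(const T& x) {
    if (!any_ || best_ < x) { best_ = x; any_ = true; }
  }
  bool empty() const { return !any_; }
  T max() const { return best_; }
private:
  bool any_;
  T best_;
};

// A client that sees only the parameterized interface.
int largestOfThree(int a, int b, int c) {
  SimpleMaxTracker<int> t;
  MaxTracker<int>& m = t;
  m.offer(a);
  m.offer(b);
  m.offer(c);
  assert(!m.empty());
  return m.max();
}
```

The comments do not make the compiler report errors any earlier, but they tell a reader what went wrong when an instantiation with an unsuitable type fails.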
Since interface classes cannot be directly instantiated, yet serve as virtual base classes for implementations, the constructors should take no arguments and should be listed as protected. Also, for similar reasons, abstract classes should have a no-op virtual destructor, not one listed as ... = 0. Depending on your compiler, you might need to define the no-op constructor and destructor operations outside the class declaration in a separate .C file.
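To make the constructor and destructor conventions concrete, here is a minimal sketch of one interface and one implementation. Counter, SimpleCounter, liveCounters and useThroughInterface are hypothetical names chosen for the example; the live-object count exists only so the effect of the virtual destructor can be observed.

```cpp
#include <cassert>

// Hypothetical interface following the idiomatic form.
class Counter {
public:
  virtual void increment() = 0;     // a service
  virtual int value() const = 0;    // a read-only attribute
  virtual ~Counter() {}             // no-op virtual destructor, not "= 0"
protected:
  Counter() {}                      // no arguments; callable only by subclasses
};

// Number of live SimpleCounter objects, kept only to observe destruction.
static int liveCounters = 0;

class SimpleCounter : public virtual Counter {
public:
  SimpleCounter() : count_(0) { ++liveCounters; }
  ~SimpleCounter() { --liveCounters; }
  void increment() { ++count_; }
  int value() const { return count_; }
private:
  int count_;
};

// Uses the object only through the interface, including deletion.
int useThroughInterface() {
  Counter* c = new SimpleCounter;   // "Counter c;" would not compile
  c->increment();
  c->increment();
  int v = c->value();
  delete c;                         // reaches ~SimpleCounter via the virtual destructor
  assert(liveCounters == 0);
  return v;
}
```

Without the virtual destructor in Counter, deleting through the interface pointer would not be guaranteed to run the SimpleCounter destructor.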
The use of const here and elsewhere has its ups and downs. Officially, it is a good idea, since it helps enforce some of the intended semantic guarantees. But often enough, pragmatic concerns get in the way -- generally, const-ness propagates through all code that any implementations of the services touch. And C++ is sometimes too literal-minded about it to enforce it in a useful way. Sometimes (but only sometimes) it is better just to enforce these semantics manually. (In ADT-style classes, on the other hand, C++ const support tends to work pretty well and should almost always be used.)
Similar remarks hold, but more so, for exceptions. It is hardly ever a good idea to annotate a signature of a C++ abstract class with an exception list -- doing so commits all implementations to raise only the ones listed, which is often impossible to live with. Not listing any says implicitly that any exception may occur, thus requiring manual documentation about the ones that are likely.
Officially, subinterfaces should be declared as public virtual subclasses of all of their direct ancestors. This form of (possibly multiple) inheritance prevents spurious strangenesses when the same operation is inherited along more than one path. The same holds true of the leaf concrete subclasses that implement interfaces. For most purposes public virtual inheritance ought to be considered the default subclassing mechanism.
People often break this rule however. With most compilers, programmers pay very noticeable performance penalties (both time and especially space) when they use virtual base classes. Instead, they stick with 100% single-inheritance designs, in which case regular public subclassing mechanics suffice. If you do this, it alters the way you tend to define base classes. And once you go this route you are normally stuck with it. As a rule of thumb, either use public virtual subclassing consistently for all subclasses or not at all. Doing otherwise leads to dark corners.
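The following sketch shows the consistent public virtual style on a small hierarchy in which one interface is reached along two paths. Readable, Writable, ReadWritable, Cell and roundTrip are hypothetical names used only here.

```cpp
#include <cassert>

class Readable {
public:
  virtual int read() const = 0;
  virtual ~Readable() {}
protected:
  Readable() {}
};

class Writable {
public:
  virtual void write(int v) = 0;
  virtual ~Writable() {}
protected:
  Writable() {}
};

// A "fatter" subinterface combining the two fine-grained ones.
class ReadWritable : public virtual Readable, public virtual Writable {
protected:
  ReadWritable() {}
};

// Readable is reached both directly and through ReadWritable.
// Because every link is public virtual, there is one Readable subobject.
class Cell : public virtual Readable, public virtual ReadWritable {
public:
  Cell() : v_(0) {}
  int read() const { return v_; }
  void write(int v) { v_ = v; }
private:
  int v_;
};

int roundTrip(int v) {
  Cell c;
  Writable* w = &c;
  Readable* r = &c;   // unambiguous conversion: a single Readable base
  assert(w != 0 && r != 0);
  w->write(v);
  return r->read();
}
```

If the links to Readable were plain public instead, Cell would contain two Readable subobjects and the conversion from Cell* to Readable* would be ambiguous.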
In general, try to avoid overloading the same operation name, with the same number of arguments but with different argument or result types in subinterface classes. If you do so, be prepared to study the resolution rules carefully.
newC for each class C.
Factories often have methods that produce instances of several related classes, but all in a compatible way. Factories should themselves be defined via interfaces, so the client need not know which concrete factory object it is using. Ideally, all such matters can be reduced to a single concrete call to construct the appropriate ``master'' concrete factory in a client application.
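A minimal sketch of a factory defined via an interface follows. All names (Button, Label, WidgetFactory, the Plain classes, describeUI, demoFactory) are hypothetical; the point is only that the client code names a concrete class in exactly one place.

```cpp
#include <cassert>
#include <string>

class Button {
public:
  virtual std::string look() const = 0;
  virtual ~Button() {}
protected:
  Button() {}
};

class Label {
public:
  virtual std::string look() const = 0;
  virtual ~Label() {}
protected:
  Label() {}
};

// The factory is itself an interface, with one newC operation per class.
class WidgetFactory {
public:
  virtual Button* newButton() = 0;
  virtual Label* newLabel() = 0;
  virtual ~WidgetFactory() {}
protected:
  WidgetFactory() {}
};

class PlainButton : public virtual Button {
public:
  std::string look() const { return "plain button"; }
};

class PlainLabel : public virtual Label {
public:
  std::string look() const { return "plain label"; }
};

// One concrete factory producing a compatible family of objects.
class PlainWidgetFactory : public virtual WidgetFactory {
public:
  Button* newButton() { return new PlainButton; }
  Label* newLabel() { return new PlainLabel; }
};

// Client code: knows only the interfaces.
std::string describeUI(WidgetFactory& f) {
  Button* b = f.newButton();
  Label* l = f.newLabel();
  assert(b != 0 && l != 0);
  std::string s = b->look() + "+" + l->look();
  delete b;
  delete l;
  return s;
}

std::string demoFactory() {
  PlainWidgetFactory master;   // the single concrete construction
  return describeUI(master);
}
```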
Factories often need to invoke special ``open'' constructors on the concrete classes they generate, that enable them to lay out all properties by initializing internals in any way they please. The basic form of an ``open'' constructor is to have an argument associated with each internal slot (member variable) and to bind the slot to the value of the argument. These kinds of ``open'' constructors are a little dangerous to have around in general, but it is hard to get C++ to agree about the access privileges surrounding them. Ideally, you'd like to have constructors listed as private but with the factories as friends. Unfortunately (in this case), friendship is neither inherited nor transitive, so you can't write something in class C saying that all CFactorys are friends of all SubCs. Often enough, the only alternative is to leave the constructors as public but to document their intended use.
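Where there are few enough concrete classes to list friends by hand, the private-constructor arrangement can be sketched as below. Point, PointFactory, CartesianPoint, CartesianPointFactory and sumOfCoordinates are hypothetical names for the illustration.

```cpp
#include <cassert>

class Point {
public:
  virtual int x() const = 0;
  virtual int y() const = 0;
  virtual ~Point() {}
protected:
  Point() {}
};

class PointFactory {
public:
  virtual Point* newPoint(int x, int y) = 0;
  virtual ~PointFactory() {}
protected:
  PointFactory() {}
};

class CartesianPoint : public virtual Point {
  // Friendship has to be granted here, class by class. A subclass of
  // CartesianPoint would need its own friend declaration, and a subclass
  // of the factory would not gain access automatically.
  friend class CartesianPointFactory;
public:
  int x() const { return x_; }
  int y() const { return y_; }
private:
  // "Open" constructor: one argument per slot, bound directly.
  CartesianPoint(int x, int y) : x_(x), y_(y) {}
  int x_;
  int y_;
};

class CartesianPointFactory : public virtual PointFactory {
public:
  Point* newPoint(int x, int y) { return new CartesianPoint(x, y); }
};

// Clients can obtain points only through a factory.
int sumOfCoordinates(PointFactory& f, int x, int y) {
  Point* p = f.newPoint(x, y);
  assert(p != 0);
  int s = p->x() + p->y();
  delete p;
  return s;
}
```

The need to repeat the friend declaration in every concrete class is exactly the bookkeeping that often pushes people toward public constructors with documented intent.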
It doesn't take too many classes before naming conventions for interfaces start becoming a problem. People tend to want to give the same names to different classes and operations (for example Node, put, etc). But when clients use a class or operation name, they need to be sure about what they are getting. There is only one good solution here: language-based support for modules, packages, or namespaces that provide some kind of nested name prefixing scheme. The ANSI standard C++ contains a namespace construct usable for these purposes, but most compilers do not yet implement it.
Until then, you have to live with non-optimal solutions, for example manual naming conventions in which each class and/or operation name is given a standard prefix reflecting its module name. The least desirable workaround is to use C++ nested classes, which are hardly ever worth fighting with.
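For compilers that do implement the namespace construct, a sketch of two unrelated Node interfaces coexisting might look like this. The namespaces graphs and lists, and the classes inside them, are hypothetical.

```cpp
#include <cassert>

namespace graphs {
  class Node {
  public:
    virtual int degree() const = 0;
    virtual ~Node() {}
  protected:
    Node() {}
  };

  class IsolatedNode : public virtual Node {
  public:
    int degree() const { return 0; }
  };
}

namespace lists {
  // Same simple name, unrelated interface.
  class Node {
  public:
    virtual bool hasNext() const = 0;
    virtual ~Node() {}
  protected:
    Node() {}
  };

  class LastNode : public virtual Node {
  public:
    bool hasNext() const { return false; }
  };
}

// The qualified names tell the client exactly which Node it is getting.
bool namesStayDistinct() {
  graphs::IsolatedNode g;
  lists::LastNode l;
  const graphs::Node& gn = g;
  const lists::Node& ln = l;
  assert(gn.degree() == 0);
  return !ln.hasNext();
}
```

The manual-prefix convention would spell these GraphsNode and ListsNode (or similar) instead, with the same effect but no help from the language.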
On the other hand, nesting typedefs and enums within classes is a simple way of avoiding name-clutter for symbolic type names, and should always be used when the scope of a symbolic type name can be restricted to implementations and clients of a particular interface. For example, an interface defining array-like operations for which all index arguments must be unsigned shorts might include:
class ArrayLikeThing {
public:
  typedef unsigned short Index;
  virtual Index firstElementIndex() = 0;
  ...
};
Note that a client would have to invoke this via something like:
void f(ArrayLikeThing* a) {
  ArrayLikeThing::Index i = a->firstElementIndex();
  ...
}
C++ supports a number of variant definition styles that lie on the border between interface-based and OO techniques. Unlike most other sublanguage interactions, most of these are pretty straightforward and useful.
Sometimes you'd like to add some miscellaneous utilities that conveniently package up a certain sequence or combination of invocations on the base operations defined in an interface. For a too-simple example, suppose you have an interface Coll for a collection of some sort with a put(int x) operation, and a lot of expected clients that will need to put in items in pairs. You'd like to have something like putPair(int a, int b), with the obvious implementation. There are at least four alternatives:
PairColl with method putPair, with the understanding that it could be implemented by a concrete subclass that holds a link to a Coll and sends it pairs of puts.
void putPair(Coll* c, int x, int y) { c->put(x); c->put(y); }.
putPair directly into the Coll class as a non-abstract operation, with the series of puts as the default implementation code.
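Three of these forms can be sketched side by side: the separate PairColl interface, the free function, and the non-abstract operation in the collection interface. VectorColl, LinkedPairColl, CollWithPairs, CountingColl and demoPutPair are hypothetical names, and CollWithPairs stands in for a Coll that has had putPair added to it.

```cpp
#include <cassert>
#include <vector>

class Coll {
public:
  virtual void put(int x) = 0;
  virtual int size() const = 0;
  virtual ~Coll() {}
protected:
  Coll() {}
};

// Simple concrete collection used by the demo.
class VectorColl : public virtual Coll {
public:
  void put(int x) { items_.push_back(x); }
  int size() const { return static_cast<int>(items_.size()); }
private:
  std::vector<int> items_;
};

// Form A: a separate interface layered on top of Coll.
class PairColl {
public:
  virtual void putPair(int a, int b) = 0;
  virtual ~PairColl() {}
protected:
  PairColl() {}
};

class LinkedPairColl : public virtual PairColl {
public:
  explicit LinkedPairColl(Coll* c) : c_(c) {}
  void putPair(int a, int b) { c_->put(a); c_->put(b); }
private:
  Coll* c_;   // link to the underlying collection; not owned
};

// Form B: a free-standing helper function.
void putPair(Coll* c, int x, int y) { c->put(x); c->put(y); }

// Form C: a non-abstract operation inside the collection interface.
class CollWithPairs {
public:
  virtual void put(int x) = 0;
  virtual int size() const = 0;
  virtual void putPair(int a, int b) { put(a); put(b); }   // default
  virtual ~CollWithPairs() {}
protected:
  CollWithPairs() {}
};

class CountingColl : public virtual CollWithPairs {
public:
  CountingColl() : n_(0) {}
  void put(int) { ++n_; }
  void putPair(int, int) { n_ += 2; }   // specialized: one step, not two puts
  int size() const { return n_; }
private:
  int n_;
};

int demoPutPair() {
  VectorColl v;
  LinkedPairColl layered(&v);
  layered.putPair(1, 2);   // through the layered interface
  putPair(&v, 3, 4);       // through the free function
  assert(v.size() == 4);
  CountingColl c;
  c.putPair(5, 6);         // through the overridable default
  return v.size() + c.size();
}
```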
This is a version of the subclassing versus composition issue. In general, the best answers are either (1) or (4). The first provides clean layering of code that uses a set of base functionality to provide more complex functionality. It also allows you to come up with totally different implementations (for example here to rely on objects that somehow maintain all elements in pairs). The second and third are less abstract simplifications of (1), that are sometimes appropriate for one-shot use (for example, as local helpers in a client module). In contrast, the final option doesn't have the layering benefits but does make it easier for concrete subclasses to specialize the operation; it may be that for some implementations there is a faster way of putting two items than invoking put twice.
Another case in which option (1) best applies is when you have a utility that operates on pairs of objects of some nominal type, and you might ever need to specialize that behavior on the basis of both types. You must either encapsulate this as a specializable interface or be prepared to handcraft multiple dispatching at the implementation level.
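Handcrafted multiple dispatching usually means bouncing the call through two virtual invocations, one per argument. A minimal sketch, with hypothetical Shape, Circle, Square, describeMeeting and demoDispatch:

```cpp
#include <cassert>
#include <string>

class Circle;
class Square;

class Shape {
public:
  // First dispatch: resolved on the dynamic type of the receiver.
  virtual std::string meet(const Shape& other) const = 0;
  // Second dispatch: the first argument's type is now known statically.
  virtual std::string meetCircle(const Circle& first) const = 0;
  virtual std::string meetSquare(const Square& first) const = 0;
  virtual ~Shape() {}
protected:
  Shape() {}
};

class Circle : public virtual Shape {
public:
  std::string meet(const Shape& other) const { return other.meetCircle(*this); }
  std::string meetCircle(const Circle&) const { return "circle-circle"; }
  std::string meetSquare(const Square&) const { return "square-circle"; }
};

class Square : public virtual Shape {
public:
  std::string meet(const Shape& other) const { return other.meetSquare(*this); }
  std::string meetCircle(const Circle&) const { return "circle-square"; }
  std::string meetSquare(const Square&) const { return "square-square"; }
};

// The utility itself sees only the nominal type.
std::string describeMeeting(const Shape& a, const Shape& b) {
  return a.meet(b);
}

bool demoDispatch() {
  Circle c;
  Square s;
  assert(describeMeeting(c, c) == "circle-circle");
  return describeMeeting(c, s) == "circle-square"
      && describeMeeting(s, c) == "square-circle";
}
```

The cost is visible in the interface: every new concrete type adds a meetX operation to Shape and to all existing implementations.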
Sometimes, there is a reasonable default implementation for an operation defined in an interface, and this default can be coded in a way that does not introduce any internal representation mechanics. Pure interface-level defaults only work nicely when they do not introduce any internal representation constraints (but see next variant).
One common example is that during development, you might want to stub out operations by simply printing a message whenever they are invoked. This might as well be implemented as a default in the interface class itself. (Although the logistics are sometimes tricky for operations that are supposed to return something; you have to figure out some value to return.)
You can add additional scaffolding via protected methods. For example, if the printed message should take a particular form, you could declare and implement a protected operation printMsg(char*) (or whatever) and then invoke it in the default implementation of all the others.
Even further, you can set things up to rely on representations without actually declaring them. For example, suppose that this printMsg requires a C++ ostream* representing a log file, and that this logfile might be different for different objects. You can still add the default without introducing any representational mechanics by adding a pure virtual protected attribute-style method logFile(), that returns the current log file handle. This might be implemented in different ways in different subclasses -- some might just keep a pointer internally; while others might ask another object what the current log file is every time logFile is invoked. So all together, we'd have:
class X {
protected:
  virtual ostream* logFile() const = 0;
  virtual void printMsg(const char* m) { (*logFile()) << m << " called\n"; }
public:
  virtual void anOp() { printMsg("anOp"); }
  ...
};
Each subclass would have to implement logFile() itself (perhaps just as an accessor for a private: ostream* logFile_), but once this is done, the default mechanisms work as defined.
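Here is a compilable sketch of the same idea with one subclass that keeps the pointer internally. Class X is renamed Traced for the example, and Traced, BufferTraced and traceOnce are hypothetical names; a string stream stands in for the log file so the output can be inspected.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

class Traced {
protected:
  // Pure virtual attribute: no representation is declared here.
  virtual std::ostream* logFile() const = 0;
  virtual void printMsg(const char* m) { (*logFile()) << m << " called\n"; }
  Traced() {}
public:
  virtual void anOp() { printMsg("anOp"); }   // stubbed-out default
  virtual ~Traced() {}
};

// One possible subclass: just keeps a pointer internally.
class BufferTraced : public virtual Traced {
public:
  explicit BufferTraced(std::ostream* log) : logFile_(log) {}
protected:
  std::ostream* logFile() const { return logFile_; }
private:
  std::ostream* logFile_;
};

std::string traceOnce() {
  std::ostringstream log;
  BufferTraced t(&log);
  t.anOp();                        // inherited default writes to the log
  assert(!log.str().empty());
  return log.str();
}
```

A different subclass could implement logFile() by asking some other object for the current stream, without any change to Traced.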
Sometimes, there are good reasons for claiming that some aspect of an interface's functionality must be implemented in a certain way. For example, perhaps all implementations must be compatible with representation types and conventions of some legacy code. Or perhaps there is only one way that you can imagine ever implementing a subset of the attributes or operations described in an interface. All-in-all, this is fairly common.
Once you add slots (member variables) to a class in C++, it stops acting like an interface class -- all subclasses will contain the representations, which means that they should use them to implement functionality. (Having the subclass carry around the slots but not ever using them is just asking for trouble. This is not to say that it's never an option; it's just intrinsically dangerous.)
So the first question to ask is whether you can avoid introducing member variables. Variants of the pure virtual protected attribute idiom often suffice, although they add enough performance overhead to be unattractive in some cases. For example, if all instances are required to maintain an IDNumber, and this number must be maintained in a fixed known representation, and it is commonly accessed in performance-critical code, then it would probably be overkill to delegate maintenance of IDNumbers to some other IDNumberMaintainer object that each object accessed via a virtual attribute, that in turn would probably always be implemented via a direct pointer to the IDNumberMaintainer anyway. So in this case, probably the best option is just to introduce some representation and operations that maintain it in the interface class itself.
However, you can still avoid overcommitment by encasing these mechanics in a subclass of the main interface class, so other options still remain possible while still simplifying and regularizing subclasses that rely upon the same mechanism. This is among the best ways to introduce code-sharing in class hierarchies supporting standardized interfaces.
Usually the best approach in carrying this out is to write things almost as if you were embedding a little inner representation-maintainer class in the main class itself. Given this, the representation should be private, and manipulated only via methods usable by the ``outer'' objects and/or their clients. The maintenance methods typically include initialization via a non-default constructor (which is in turn problematic with virtual inheritance; you might instead need to define an explicit protected initialization method).
There are several ways to set this up, varying in flexibility. The most flexible option is to declare the representation manipulation methods as protected, non-virtual, and possibly inline, with slightly mangled names. Then the default virtual versions of those operations that are publicly exported can be written to just invoke the internal versions. For example:
class ThingWithID {
private:
  int IDRep_;
protected:
  inline int id_() const { return IDRep_; }
  ThingWithID(int ID) : IDRep_(ID) {}
public:
  virtual int id() { return id_(); }
  ...
};
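A sketch of how a subclass might use both entry points follows: clients call the virtual id(), while the subclass's own code goes through the non-virtual id_(). The Account subclass and its sameID operation are hypothetical, and plain single inheritance is used so the non-default constructor can be called directly.

```cpp
#include <cassert>

class ThingWithID {
private:
  int IDRep_;
protected:
  int id_() const { return IDRep_; }        // fast internal access
  ThingWithID(int ID) : IDRep_(ID) {}
public:
  virtual int id() { return id_(); }        // exported, overridable
  virtual ~ThingWithID() {}
};

class Account : public ThingWithID {
public:
  explicit Account(int ID) : ThingWithID(ID), balance_(0) {}
  void deposit(int amount) { balance_ += amount; }
  int balance() const { return balance_; }
  // Internal code can bypass virtual dispatch via the mangled-name version.
  bool sameID(const Account& other) const { return id_() == other.id_(); }
private:
  int balance_;
};

bool demoIDs() {
  Account a(7);
  Account b(7);
  Account c(8);
  a.deposit(10);
  assert(a.balance() == 10);
  return a.id() == 7 && a.sameID(b) && !a.sameID(c);
}
```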
Whenever you enter the world of even partially concrete classes, you also have to make some policy about copying and assignment. If any subclasses will need to support a copy constructor and/or an assignment operator, then scaffolding for them must reside in all (semi-)concrete superclasses.
The vast majority of classes defined via interface-based design do not support any natural meaning for the notion of copying or assignment. In fact, this is true for many other classes as well. Unfortunately, C++ has a rule saying that the compiler will create these for you itself unless you explicitly list them. The best way to disable them entirely is to list them as private (so they are not callable) with no-op implementations; as in:
class ThingWithID {
  ...
private:
  ThingWithID(const ThingWithID& t) {}
  void operator=(const ThingWithID& t) {}
};
With some compilers, it is not even necessary to put in the no-op definitions; the compiler will then complain at link time if they are ever called. (Remember that even private methods may be called (by mistake in this case) internally.)
(These kinds of declarations are not needed in pure abstract class declarations since they are not instantiable to begin with.)
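With a compiler that provides the type-traits header (a C++11 library facility, so an assumption well beyond the compilers discussed here), the effect of the private declarations can be checked at compile time. This sketch uses the declared-but-undefined variant; the public constructor and idOf are additions made only so the example can be exercised.

```cpp
#include <cassert>
#include <type_traits>

class ThingWithID {
public:
  explicit ThingWithID(int ID) : IDRep_(ID) {}
  virtual int id() const { return IDRep_; }
  virtual ~ThingWithID() {}
private:
  ThingWithID(const ThingWithID&);       // declared, never defined
  void operator=(const ThingWithID&);    // declared, never defined
  int IDRep_;
};

// Outside the class, neither operation is accessible.
static_assert(!std::is_copy_constructible<ThingWithID>::value,
              "copy construction should be disabled");
static_assert(!std::is_copy_assignable<ThingWithID>::value,
              "assignment should be disabled");

// Passing by reference still works; passing by value would not compile.
int idOf(const ThingWithID& t) {
  assert(t.id() >= 0);
  return t.id();
}
```

The static checks cover only code outside the class. An accidental internal copy would still compile and then fail at link time, as noted above.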
On the other hand, if you do need copyability in subclasses, then you need to add support for it in the base class. It turns out that in this example, and in all others in which all slots are of native scalar type, the versions of these operations that the C++ compiler would automatically generate would be OK, so you wouldn't really have to code this in the present example, but the form is:
class ThingWithID {
  ...
public:
  ThingWithID(const ThingWithID& t) : IDRep_(t.IDRep_) {}
  ThingWithID& operator=(const ThingWithID& t) { IDRep_ = t.IDRep_; return *this; }
};
(When unimplemented, operator=() might as well be void, but when implemented, it should obey the usual C++ conventions for assignment operators; thus returning *this.)
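A subclass that adds slots of its own then builds its copy operations on top of the base-class scaffolding. In this sketch NamedThing and demoCopy are hypothetical, and the base constructor is made public only to keep the example short.

```cpp
#include <cassert>
#include <string>

class ThingWithID {
public:
  explicit ThingWithID(int ID) : IDRep_(ID) {}
  ThingWithID(const ThingWithID& t) : IDRep_(t.IDRep_) {}
  ThingWithID& operator=(const ThingWithID& t) { IDRep_ = t.IDRep_; return *this; }
  virtual int id() const { return IDRep_; }
  virtual ~ThingWithID() {}
private:
  int IDRep_;
};

class NamedThing : public ThingWithID {
public:
  NamedThing(int ID, const std::string& name) : ThingWithID(ID), name_(name) {}
  // Copy the base part first, then this class's own slots.
  NamedThing(const NamedThing& t) : ThingWithID(t), name_(t.name_) {}
  NamedThing& operator=(const NamedThing& t) {
    ThingWithID::operator=(t);   // delegate the base-class slots
    name_ = t.name_;
    return *this;
  }
  const std::string& name() const { return name_; }
private:
  std::string name_;
};

bool demoCopy() {
  NamedThing a(1, "a");
  NamedThing b(2, "b");
  b = a;              // assignment
  NamedThing c(a);    // copy construction
  assert(c.id() == 1);
  return b.id() == 1 && b.name() == "a" && c.name() == "a";
}
```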