ConcreteClassName ConcreteOperation1(arg1, arg2)
Type ConcreteOperation2( )
AbstractClassName AbstractOperation(Type arg) Type ConcreteOperation( )
Examples of abstract and concrete classes. The names of
abstract classes and of pure virtual functions (functions without an im-
plementation) are set in italics. Note that type information for arguments
and return value is optional.
it is not possible to instantiate any objects using an abstract class; that
would not make sense as some of the operations lack implementation.
New classes can be deﬁned in terms of existing
classes using inheritance. Inheritance relationships is indicated with a
vertical line and a triangle that points toward the parent class, see Fig-
ure A.2. As a convention, the parent class is placed above its subclasses.
As it is clear that a subclass inherits all the operations and all the mem-
ber variables of its parent class, inherited attributes are not repeated in
the class diagram, unless there is a special reason for doing so.
Aggregation and Acquaintance
It is common to use object compo-
sition to create more complex objects. For example, an object represent-
ing a drawing can be composed of graphical objects such as lines and
polygons, see Figure A.2 (a). As another example, a company can be
seen as a composition of its divisions, and each division as a composi-
tion of its departments, see Figure A.2 (b). This type of relationship is
denoted aggregation, see [126, 45]. An aggregation relationship implies
that one object owns or is responsible for another object. As a result,
an aggregated object should never outlive its owner. In class diagrams,
aggregation is shown as an arrow with a diamond shape at its base; the
arrow points toward the aggregated object and an optional label can fur-
ther describe the role of the relationship, see Figure A.2. A ﬁlled circle at
the arrow deﬁnes a one-to-many relationship. In the case of the drawing
in Figure A.2 (a), a single drawing may consist of many graphical objects.
Objects of one class may use the services provided by objects of other
classes. For example, as shown in Figure A.2 (b), a company may use
A Class Diagram Notation
Shape Drawing Line Texture (a) Drawing application
Company Division Department Person employees
Illustration of diﬀerent class relationships.
the services provided by its employees. In the example in Figure A.2 (a),
we allow polygons to be textured. As texture objects consume much
memory, a possible optimization is to let polygons with identical tex-
tures share a common texture object. Thus, polygon objects do not own
the texture objects, they only use them. This type of relationship is de-
noted acquaintance, see [126, 45]. Acquaintance merely knows of another
object, thus it is a much weaker relationship than aggregation. In class
diagrams, acquaintance is depicted just as aggregation, except that there
is no diamond shape at the base of the arrow, see Figure A.2.
Although aggregation and acquaintance are usually implemented the
same way, they may lead to diﬀerent semantics. Consider again the exam-
ple of the company class diagram in Figure A.2 (b): Had the relationship
between a company and its employees been aggregation instead of ac-
quaintance, the meaning would be that companies are like labor camps,
from which the employees never leave.
Names of types and operations begin with a capital let-
ter and contain no underscores. The names of any attributes or function
arguments begin with a lower case letter. If an attribute name is a com-
bination of several words, the words are separated by underscores. These
conventions make it easier to distinguish between types and operations
on the one hand and attributes and arguments on the other hand.
A Class Diagram Notation
Complex RealT Abs( )
RealT Imag( )
(a) Template class
ClassName + PublicOperation( )
# ProtectedOperation( )
- PrivateOperation( )
$ ClassOperation( )
(b) Access modiﬁers
Conventions used to illustrate template classes and to con-
vey information about access rights.
The OMT class diagram notation does not have any construct for
C++ constructs like template classes and template functions. To show
that a function is a template function, we use the convention to let the
name of the type of its argument end with the letter “T”. We use the
same convention to show that a class is a template with respect to a
certain type. As an example, consider the class Complex for representing
complex numbers. We can implement this class as a template, where
the numeric type of the real part and the imaginary part is the template
argument, see Figure A.3 (a).
For some design patterns, the access right to the participating class
methods are an important part of the pattern. Where it is important to
convey access rights, we have borrowed the conventions from the UML
notation . See Figure A.3 for a description of this notation. In case
these symbols are omitted, it can be assumed that all class methods are
public. Member variables are always private, thus no access symbols are
used for these.
In this thesis we use class diagrams to illustrate the high-level design
of classes. Thus, the class diagrams never list every operation of a class.
Instead they list the operations that are most important to convey the
design of a class.
Because path planning deals with moving geometric objects, we need a
systematic way to describe position and orientation. In this appendix
we discuss rigid body transformations. In particular we look at diﬀerent
ways of representing rotations and their drawbacks and advantages from a
path planning point of view. Note that the words rotation and orientation
are used interchangeably in this discussion; any orientation of a geometric
object can be seen as the result of applying a single rotation to the object
when it is in its default orientation.
The Homogeneous Transformation Ma-
In this section we brieﬂy discuss the homogeneous transform as a way
of representing rigid body transformations. The discussion will form a
basis for the design and implementation of a class that encapsulates the
concept of homogeneous transformations.
To each object we attach a coordinate frame. Using the notation of
Craig , let
T denote the homogeneous transformation matrix that
maps position vectors from frame F
to frame F
. That is, given a
P in frame F
, its representation in frame F
P , is given by
B Rigid Body Transformations
Illustration of the notation used to distinguish between
diﬀerent coordinate frames.
Note that the position vectors must be expressed in homogeneous coordi-
nates, i.e., P = (x, y, z, 1)
(if we assume no scaling). The notation with
leading sub- and superscripts might seem awkward at ﬁrst sight, but it
provides a simple mnemonic that help us in writing transform equations
correctly–a subscript must match the following superscript. If we only
consider rigid body transformations, then the homogeneous transforma-
tion matrix has the following structure:
R is the 3 × 3 rotation matrix that describes the orientation of
with respect to frame F
denotes the origin of
, see Figure B.1.
In ﬁelds like computer graphics, the last row can be used for perspec-
tive transformations. For our purposes however, the extra dimension can
be seen as a construct that allow us to treat rotations and translations in
B.1 The Homogeneous Transformation Matrix
a uniform manner. The columns of the rotation matrix in Equation (B.2)
can be interpreted as the principal directions of frame F
T , we can use ordinary matrix inversion to ﬁnd
T . How-
ever, if we consider that rotation matrices are orthonormal, the following
equation gives a more eﬃcient method for computing the inverse:
So far we have seen the homogeneous transform as just a mapping
of position vectors from one frame to another. It can also be seen as
the description of one frame relative another; if we attach a frame F
to a geometric object, such that F
will move together with it, then
the position and orientation of the object relative the world frame F
is completely speciﬁed by
T . In this context we will denote
T as the
pose of the object.
Often the pose of an object is not given directly in terms of the world
frame. This is the case when we deal with kinematic chains, where the
pose of each frame is known only with respect to the previous frame. To
obtain the global pose of, say, frame j in the chain, we multiply all the
transforms leading to this frame:
T · · ·
Note again how the notation help us to get the order of matrix multipli-
As an example, consider Figure B.2 where a robot is about to grasp
a cylinder. For the sake of clarity, the robot arm, to which the gripper is
attached, is not shown. Using Equation (B.5), we have obtained the pose
of the hand,
T , where subscript E stands for end eﬀector. We assume
that the pose of the cylinder is known, and we denote it
T . To plan
the task, we would like to know the pose of the cylinder in terms of the
end-eﬀector frame. Using Equation (B.5), we readily obtain
B Rigid Body Transformations
A parallel-jaw gripper that is about to grasp a cylinder.
More detailed expositions on the material presented in this section can
be found in any introductory text book on robotics or computer graphics.
See, e.g., Craig , Murray et al. , or Watt .
A Class for Homogeneous Transformations
The homogeneous transform introduced in Equation (B.2) is very useful
because it allows us to treat both rotations and translations in a uni-
form manner. However, for computational purposes, the representation
is not eﬃcient; using 4 × 4 matrices in a computer program would not
only waste memory, but also time on multiplying by zeros and ones.
Thus, we would like to have a class that mimics the syntax of Equa-
tions (B.1) and (B.5), but implements it more eﬃciently. If we consider
Equations (B.2) and (B.1), and go back to ordinary vector representation,
it is easy to see what the homogeneous transform actually does:
So, considering the absence of scaling and perspective transformations,
the interface of the class Transform in CoPP follows the form of Equa-
tion (B.1), whereas the implementation uses Equation (B.7).
So far we have only discussed the transformation of position vectors,
which according to Equation (B.7) are transformed by both a rotation
and a translation. However, sometimes we want to transform free vectors,
i.e., vector quantities like velocities and forces. Transforming such vectors
according to Equation (B.7) would be an error, because they should be
rotated only. If we use the homogeneous transform, this can be solved if
we follow the convention that free vectors have the following homogeneous
B.2 Representing Rotations
representation: V = (x, y, z, 0)
. In CoPP, position vectors and free
vectors are treated as two diﬀerent types. That allows multiplication
between a transform and a vector to have diﬀerent meanings, depending
on the type of the vector.
As seen from Equations (B.2) and (B.5), multiplying two transforms
involves multiplying two rotation matrices. If ordinary matrix algebra
is used, the operation of multiplying two rotation matrices will require
27 multiplications and 18 additions. However, if we utilize that the last
column in a rotation matrix is equal to the cross product of the ﬁrst two
columns, then the operation count can be reduced to 24 multiplications
and 15 additions. This optimization is used by the Transform class.
In summary, the class Transform overloads the operator * to repre-
• Multiplication of of two transforms
• Transformation of position vectors
• Transformation of free vectors
In addition the class also provides several useful routines, such as conver-
sion from and to the equivalent axis-angle representation of the rotational
Whereas representing a translation is straightforward, this is not the
case with orientations in 3D. Many diﬀerent representations have been
suggested, each with its own merits and drawbacks. Here we will take a
look at the more common representations and see how they are useful in
the context of path planning.
As seen in Section B.1, a 3×3 rotation matrix can be used to represent the
orientation of one frame with respect to another. This is a very common
way of representing rotations with several advantages such as: elegant
and straightforward syntax, eﬃcient transformation of coordinates, and
existence of specialized hardware for such operations.
There are also some drawbacks with rotation matrices. The nine ele-
ments of a rotation matrix are clearly not independent of each other; in
B Rigid Body Transformations
fact, it is suﬃcient with only three parameters to describe any rotation
matrix. Thus, this representation is ineﬃcient when it comes to memory
use. Furthermore, due to the ﬁnite precision of computers, multiplica-
tion of rotation matrices introduces numerical errors. If many successive
rotations are combined, the resulting matrix may no longer be orthonor-
mal. Using such a matrix to transform an object will not only rotate, but
also shear and scale the object. The problem with numerical drift can be
avoided if the resulting matrix after each multiplication is replaced with
the “closest” orthonormal matrix. This operation is however nontrivial
and costly. If numerical drift should become a problem, then quaternions
provide a more eﬃcient solution, see Section B.2.3.
For applications that require input from humans, rotation matrices
are not user-friendly; imagine typing a nine-element orthonormal matrix
correctly. For such situations, other representations such as axis-angle,
or Euler angles, are used instead.
Euler angles are a more compact way of representing three-dimensional
rotations compared to rotation matrices. With Euler angles, a rotation
is expressed as the result of three successive rotations, α, β, and γ, where
each rotation is about one of the coordinate axes. Commonly the rota-
tions are chosen as: α about the x-axis, followed by β about the y-axis,
and ﬁnally γ about the z-axis. The rotation angles are sometimes referred
to as roll, pitch, and yaw angles .
Euler angles have the beneﬁt of being more compact than rotation
matrices; now only three numbers are required instead of nine. They are
also more user-friendly for applications that requires the user to specify
rotations. There are however several drawbacks with Euler angles when
used in the context of path planning. Most of these drawbacks are be-
cause Euler angles are incapable of correctly representing the topology of
SO(3), causing both theoretical and practical problems.
The roll, pitch, yaw angles are just one possible convention for the
sequence rotations and with respect to which axes to perform these.
Some conventions do not even rotate around the axes of the ﬁxed ref-
erence frame; instead they rotate around the axes of the moving frame.
Craig  showed that, if only the principal axis of the frames are used,
there are no less than 24 possible Euler angle sets. Even though only a
few of these are used in practice, confusion may arise if it is not clear
which particular convention is used.
B.2 Representing Rotations
Even if it is clear which particular convention is used, there are other,
more severe, problems. For a given convention, there are always multiple
sets of parameters which yield the same rotation. Furthermore, there are
even cases where an interval of one the rotation angles yield the same ro-
tation. As an example, consider the following common convention, where
the rotations are about the axis of the moving frame: Rotate α about
the z-axis, followed by β about the (new) y-axis, and ﬁnally γ about the
(once again, new) z-axis. It is easily veriﬁed that rotations (α, β, γ) of the
form (α, 0, −α) yield the unit rotation matrix. Thus there are inﬁnitely
many representations of the identity rotation with this convention. This
behavior is due to a singularity in the conversion from a rotation to the
corresponding Euler angles. Other conventions avoid the singularity for
the identity rotation, but then it occurs for some other rotation instead.
Indeed, it is a fundamental topological fact that singularities can never
be eliminated in any three-dimensional representation of SO(3) .
Another problem with Euler angles is that they are not suitable for
interpolation. Interpolating between two orientations using Euler angles
often results in jerky, unnatural looking motions. This is not only a cos-
metic problem, because the interpolated motions also cause the moving
body to generate a larger swept-volume . This is disadvantageous
to a path planning algorithm as the larger swept-volume increases the
probability of collision.
When choosing the limits for an Euler angle representation, care must
be taken to avoid a double coverage of SO(3). To set the range of each
angle to [−π, +π], would thus be an error; one of the angles should have
]. Which angle should have reduced range depends on
which angle convention is used.
In this section we give a brief introduction to quaternions and the beneﬁts
from using them to represent rotations. For more details, see, e.g., the
book by Kuipers .
Quaternions are a generalization of complex numbers and can be used
to represent rotations in much the same way as complex numbers on
the unit-circle can be used to represent two-dimensional rotations. A
quaternion h can be written as a linear combination
h = w · 1 + xi + yj + yk,
x, y, z, w ∈ R,
B Rigid Body Transformations
where w is the scalar component of the quaternion, and the symbols i, j,
and k denote the “imaginary” parts of a quaternion. Thus, a quater-
nion can be seen as four-dimensional vector. To honor the originator,
the mathematician William Rowan Hamilton, the set of quaternions are
often denoted by H. Hamilton deﬁned the following relationships for the
imaginary parts i
= ijk = −1. From these deﬁnitions, the
formula for multiplication of two quaternions can be derived. This for-
mula can be written compactly if the imaginary part of a quaternion,
xi + yj + zk, is written as vector v. Multiplication of two quaternions h
is then given by:
Note that due to the cross-product in Equation (B.8), quaternion multi-
plication does not commute.
Quaternions and Rotations
It can be shown that any unit quater-
nion represents a rotation . More precisely, the rotation about the
unit vector n by an angle θ is given by the quaternion
h = (cos(θ/2), sin(θ/2)n).
The set of all unit quaternions, and hence the set of all rotations, can be
seen as the surface of the unit sphere in four dimensions, S
. One of the
advantages of this representation is that it, in contrast to Euler angles,
lacks singularities. The only “tricky” part that requires consideration is
that antipodal points of S
represent the same rotation. Thus h and −h
yield the same rotation. This is no more strange than rotating by an angle
θ about the axis n, is equivalent to rotating an angle −θ about the axis
−n. To ensure that each rotation is associated with a unique quaternion,
we could for example choose to only use the upper hemisphere of S
Applying a sequence of rotations, represented by quaternions h
, is equivalent to quaternion multiplication as given by Equation (B.8).
That rotations are not commutative is thus reﬂected by the noncommu-
tative quaternion multiplication. Equation (B.8) shows two additional
advantages of using quaternions to represent rotations. Compared to
matrix multiplication, Equation (B.8) uses less operations. Furthermore,
as with rotation matrices, quaternions can also suﬀer from numerical
drift, that accumulates for each quaternion multiplication. For quater-
nions, this problem is easily remedied as it is reduced to normalizing a
four-dimensional vector to unit length.
B.2 Representing Rotations
Quaternions and Interpolation
One of main reasons for the popu-
larity of quaternions is that they allow smooth and simple interpolation
between orientations. As the set of unit quaternions is identiﬁed with
, the shortest path between two unit quaternions is a great-circle arc.
Shoemake  presented the following elegant formula for spherical lin-
ear interpolation (SLERP) between two quaternions:
, t) =
sin((1 − t)θ)
where t ∈ [0, 1], and θ = arccos(h
). As antipodal points on S
equivalent, an implementation must make sure to negate, e.g., h
the inner product between them is negative. Otherwise the interpolated
motion will not take the shortest path. Furthermore, if the inner product
between the two quaternions are close to one, then the quaternions are
almost identical. In such a case, an implementation should use linear
interpolation to avoid division by a number close to zero, i.e., sin(θ), see
As reported by Kuﬀner , the smooth interpolation provided by
Equation (B.8) was important to solve diﬃcult rigid body problems eﬃ-
Quaternions and Metrics
As the shortest path between two quater-
nions is given by the great-circle arc on S
, it seems natural to deﬁne the
distance between them to be proportional to the length of the arc. The
arc length is proportional to the angle between the quaternions, and a
possible metric is therefore given by
) = arccos(h
Note again that it is important to check wether the inner product is
negative. If so, one of the quaternions should be negated.
As a path planning algorithm may compute the distance between pairs
of conﬁgurations many thousands of times, we are sometimes willing to
use a faster, but less accurate metric. In  it was suggested to use the
following metric instead of that in Equation (B.11):
) = w
(1 − |h
is a weight, relating the rotational distance for rigid body with
the translational distance.
B Rigid Body Transformations
We have introduced a class Transform that follows the syntax of the
homogeneous transformation matrix. It is used to specify the pose of
geometric objects, to represent moving coordinate frames, and in the
communication with collision detection algorithms. The class can also
be used for transformation of vectors, and here position vectors and free
vectors are transformed diﬀerently.
For problems involving kinematic chains, the class Transform is also
used. For rigid body problems, however, we recommend using quater-
nions to specify the rotation. We summarize the following advantages of
using quaternions to represent rotations:
• When using quaternions, it is much easier to compensate for nu-
merical errors, compared to when using rotation matrices.
• Quaternions are a more compact representation than rotation ma-
• It is easy to interpolate between orientations using quaternions.
• The quaternion representation does not suﬀer from any singulari-
It should be mentioned though that quaternion-based interpolation takes
slightly longer time to compute than interpolation of Euler angles. But,
as pointed out in , the total computational time for path planning
problems is likely to be reduced due to the better motions produced by
the quaternion interpolation.
Thus, for rigid body problems, the CoPP framework uses quaternions
as the default representation. To make it easy to do comparisons be-
tween quaternion representations and, e.g., Euler angle representations,
the framework does support other representations as well.
Many path planners represent objects either as sets of triangles or as
convex polyhedra. Representing objects as sets of triangles has several
• Almost any object can be modeled, up to a certain accuracy, using
• Many graphical modelers and CAD tools can give output in this
• Triangle sets can directly be used for visualization.
• There exists eﬃcient collision detection algorithms that work on
For convex polyhedra, there exists numerous eﬃcient collision detec-
tion algorithms and the geometric representation is often more compact
compared to triangle sets. However, with convex polyhedra, the family
of objects that we can model is is much smaller. This drawback can be
alleviated somewhat if we allow objects that are unions of convex poly-
hedra. Still there are objects that are very awkward to decompose into
convex polyhedra, such as torii and other curved objects.
Most available collision detection algorithms either work on triangle
sets or on convex polyhedra. To support both types of algorithms, we
C Geometric Representations
face arr = [0, 1, 2, 3, −1,
1, 4, 2, −1,
7, 5, 6, −1]
Example of and indexed face set with three faces. Note
that the faces need not be joined together.
decided that CoPP should provide both triangle sets and convex polyhe-
dra as geometric representations. We also want to provide a geometric
type to represent unions of convex polyhedra.
Indexed Face Sets
Before implementing the concrete geometric types right away we must
ask ourselves what they have in common; doing so can help us deﬁne a
base class for them. It can be seen that all three types are actually sets
of polygons, albeit with diﬀerent constraints. In the case of a triangle
set, all polygons are constrained to be triangles, whereas in the case of
a convex polyhedron, the polygons must join together to form, well, a
convex polyhedron. Based on these observations, we should have some
class to represent a set of polygons. There exists already a data structure
for representing a set of polygons, namely an indexed face set. An indexed
face set consists of a contiguous array of vertices that can be accessed
through an index starting at zero. Each face is speciﬁed with the indices
of the vertices that form the face. The end of a face is marked by a
-1. One of the main reasons for using this data structure is its eﬃciency
regarding memory consumption: If each face would store its own vertex
data there will in general be a lot of duplicate vertices. An example of
an indexed face set is shown in Figure C.1. Notice that the faces need
not be joined together. Without any constraints though, all we have is a
C.2 Triangle Sets
The class IndexedFaceSet is an abstract class that inherits from
. From the class diagram in Figure C.2 it is seen that this class has
methods for reading the vertices and faces of an indexed face set. An in-
dexed face set is constructed using the methods AddVertex and AddFace.
These functions are protected, meaning that only derived classes (and the
class itself) can call them. As a result, it is up to the derived classes to
implement any required constraints on the polygon sets.
By extracting the common properties of triangle sets and convex poly-
hedra into the abstract class IndexedFaceSet we have gained several
• The book-keeping methods for vertices and faces are used by all
derived classes, thereby promoting code reuse.
• Functions that operate on indexed face sets will also work with
triangle sets and convex polyhedra. This is the case, for example,
for the visualization functions; the same function can be used for
drawing both triangle sets and convex polyhedra.
• A cleaner design in that it better expresses separation of concerns.
• The class hierarchy is easy to extend.
Triangle sets are a very common way to represent geometric objects,
and in computer graphics they are almost ubiquitous. Thanks to the
class IndexedFaceSet deﬁned in the previous section, the concrete class
can be implemented with little eﬀort. The only way to
build a TriangleSet is through the two methods AddTriangle and
, which make sure that the appropriate constraint is en-
forced, see Figure C.2. The second method is a template member function
that takes two iterators as arguments, specifying a sequence of triangles.
The numerous eﬃcient techniques for computing distances between con-
vex polyhedra have contributed to making this geometry representation
a popular choice in many path planners and other virtual-environment