The RG-RRT was able to solve for a swing up controller for the pendulum after expanding
the tree 360 times. The overall search took 3 seconds in Matlab. Meanwhile, the standard
RRT-based planner converged to the goal after adding 2300 nodes to the tree, a process that
required 75 seconds to complete. All simulations in this work were run on a 3ghz Intel Xeon
X5450 CPU. Figure 5-6 presents the ﬁnal tree generated by the standard RRT.
While it’s not obvious from the tree, the RG-RRT results in an approximately bang-bang
trajectory for swing up control. This behavior is more evident in Figure 5-3(a) where each
node corresponds to one of the extreme points, i.e. |u| = u
, that bound the reachable set
for its parent node. This control policy agrees with the bang-bang control strategies that
have been proposed elsewhere for the inverted pendulum [Furuta et al., 1991].
Motion Planning for the Acrobot
For comparison in a similar problem with a higher dimensional search space, a modiﬁed
bidirectional version of the RG-RRT algorithm was applied on a simulated torque-limited
Acrobot [Spong, 1994b], which is a two link pendulum with an actuator at the elbow, and
a passive joint at the base. In the simulation, each link weighed 2kg and was .5m in length
with the COM in the center of the link. The reachable set was approximated by discretizing
actions into three possible torques values. A standard bidirectional RRT was also run for
comparison using the same code base. The results indicate that the standard RRT worked
fastest when the control was sampled from a uniform distribution between the torque limits.
The bi-directional RG-RRT was run 10 times and took an average of 38 seconds to run,
compared to 112 seconds for the standard RRT.
Motion Planning for a Simple Nonholonomic Car
Another class of problem considered is a nonholonomic car navigating through a set of
corridors bounded by obstacles as shown in Figure 5-7. The state of the vehicle is represented
by its position, heading, and forward velocity, i.e. (x, y, θ, v). The control inputs to the system
consist of the angular rate and the forward acceleration, u =
θ, ˙v , both of which are
bounded in magnitude. For our simulation, the vehicle is restricted to forward motion. The
task is to ﬁnd a collision-free trajectory that brings the rectangular-shaped vehicle through
the narrow corridors from the lower-left to the upper-right corners of the environment, both
with an Eastward heading. The setup of this problem is challenging because the only feasible
(b) Standard RRT
Figure 5-7: A comparison of the ﬁnal collision-free trajectories that lead the simple car model
from its initial position in the lower left-hand corner of the environment to the goal at the
upper right. The ﬁnal tree for (a) the RG-RRT consists of 292 nodes and was generated in
under 3 seconds. In contrast, the tree built with the standard RRT-based planner includes
1350 nodes and required 43,000 expansion attempts and 73 seconds to compute.
IEEE [Shkolnik et al., 2009].
paths require the vehicle to substantially slow down and make a wide turn in order to avoid
collision around the tight turns.
One can visualize the state of the system in two dimensions where the location of a point
denotes the vehicle’s (x, y) position. The direction of a vector extending from the point
represents the heading, θ, while the length of the vector denotes the forward velocity, v. In
this space, the reachable set, R, is bound by a quadrilateral positioned in front (per the
orientation vector) of the point in space where the four corners correspond to the four pairs
of saturated control inputs. Any point within this quadrilateral is reachable in time ∆t
according to the diﬀerential constraints.
Both the RG-RRT as well as the standard RRT planner were applied to solve for a
sequence of control inputs that drive the vehicle to the goal pose while satisfying the
diﬀerential constraints and avoiding obstacles.
Both algorithms utilized the Euclidean
distance metric, and shared much of the same code base. Figure 5-7 compares examples
of trees resulting from both algorithms. Each was run a total of 20 times, with the same
start and goal poses and obstacle conﬁguration as shown in the ﬁgure. On average, the
RG-RRT found a path in 2.3 seconds, with a mean of 405 nodes in the tree, and a mean
total of 2150 integrations. Five integrations were performed for each expansion, including
four required to generate a reachable set for each new node.
On the other hand, the
standard RRT required 51 seconds on average, with a mean of 1700 nodes in the tree, and
35,000 integrations required. The large number of integrations resulted from a substantial
proportion of expansion attempts that failed due to collision.
If the reachable set is exactly known, or if the distance metric exactly measures the distance
to the reachable set (perhaps it is not necessary to exactly compute the reachable set itself
to do this), then the resulting RG-RRT will be probabilistically complete. Completeness
may not hold for some approximations of the reachable set.
However, it is trivial to
guarantee completeness by occasionally making a standard RRT expansion step by ignoring
the reachable set when selecting a node nearest to a sample.
The notion of probabilistic completeness is inherently a weak measure. In practice, it
is not possible to run an algorithm for inﬁnite time; to be useful, the RRT must ﬁnd a
plan within a reasonable time frame, whatever that happens to be. Therefore, an algorithm
which has no completeness guarantees, but which has empirically shown to perform well (for
example by solving a planning problem n times out of n attempts, and did so within a short
time frame), is more satisfying than a probabilistically complete algorithm which takes a
very long time to converge.
Comparison of Voronoi regions when approximating the Reachable Set. (a): 3 nodes
of the tree; (b): Reachable set, R, and Voronoi mapping of T; (c) - (e): discrete approximations of
R, and corresponding Voronoi mappings; (f): Voronoi mapping of actual R set
Approximating the Reachable Set
A complete description of a continuous space reachable set can be diﬃcult to determine, and
the distances between samples and these spaces can be time consuming to compute. In this
chapter, I showed that simple discretization of the action space can, at least in some cases,
be an eﬀective approach for approximating the reachable set to speed up RRT exploration.
To explore the eﬀect of approximating a reachable set, consider for example a kinematic car
problem, where each time the car moves, it is limited to a bounded distance, and with some
bounded turning radius. An example of a kinematic car tree with three nodes is shown in
Figure 5-8(a), together with the Voronoi diagram found using the Euclidean distance metric
on the Cartesian coordinate (ignoring rotation of car body). The next image, 5-8(b) shows
a ﬁnely discretized reachable set for each of the three nodes. This image shows color-coded
Voronoi regions, where any sample in those regions are mapped to the corresponding colored
node in the tree. It is clear that the Euclidean distance to the three nodes in this tree
does not produce a useful Voronoi bias that can be taken advantage of by an RRT planner.
Consider that to explore additional actions of the ﬁrst (blue) node, a sample would have to
fall behind (to the left of) the blue node, which does not make much sense for this system.
A Voronoi mapping for the true reachable set of this system is shown in 5-8(f). White
regions correspond to regions where samples are discarded. Thus, any samples behind the
blue node would be thrown away. However, the blue node could be expanded if a sample
lands to the front or right side of this node. The remaining images in this ﬁgure demonstrate
reasonable approximations of the reachable set. In 5-8(c), the reachable set is approximated
using only the four corners of the set. One can see that the resulting Voronoi mapping is
fairly similar to the mapping found if using the actual reachable set, with the exception
that regions near currently explored nodes (the red and green nodes of the original tree) are
white, indicating that samples are discarded. The next image in 5-8(d) is similar, except
that when a node is added to the tree, a new node is added to the parent’s reachable set
approximation, which is moved toward the “inside” of the reachable set from the position
of the new node. The result is less white space corresponding to regions where samples are
discarded. Lastly, in 5-8(e), the reachable set is approximated by a quadrilateral deﬁned
by the four corners of the reachable set. This approximation produces a Voronoi mapping
which is very similar to that of the actual reachable set.
As the control dimension increases, it becomes increasingly ineﬃcient to approximate the
reachable set using discrete sampling, where the number of samples grows exponentially on
dimension. Several recent works, e.g. [Branicky et al., 2008, Green and Kelly, 2007], have
examined the problem of reducing the set of discrete trajectories for motion planning by
pre-computing control actions that maximize path dispersion and minimize probability of
collision, given assumptions or a prior on the probability of encountering obstacles.
In practice, heuristics that approximate the reachable set, even poorly, can perform
signiﬁcantly better than using Euclidean distance to the tree nodes.
Some care should
be taken, because a poor approximation can lead to parts of state space being classiﬁed
as regions where samples are inappropriately discarded, which could lock the RRT search.
One method for achieving a low dimensional approximation is by using a task space, a
topic which I further investigate in the next chapter. Other approaches would include ﬂow
tube approximations, e.g. [Hofmann and Williams, 2006], geometric heuristics, and machine
learning approaches, which is left to future work.
Regions of Inevitable Collision
One beneﬁt of generating the reachable set using discretized actions is that points that
comprise the reachable set can also be eﬃciently tested for collisions before they are added
to the Reachable set, in order to reduce the likelihood of trajectories leaving free space as
part of the exploration phase. This idea has parallels to quickly (but locally) approximating
regions of inevitable collision (RIC), while altering the sampling strategy to avoid expanding
towards such regions. The RIC is ﬁrst mentioned by LaValle in [LaValle et al., 2001], and this
idea is expanded upon in [Fraichard et al., 2003, Fraichard and T., 2007], and with heuristics
in [Chan et al., 2008]. Reachability guidance helps to avoids such collisions by looking one
step into the future. The result is that some nodes will have an empty reachable set, which
means these nodes will never be expanded upon, but also these nodes in the tree serve to
block nearby sampling in state space, because of the dynamic sampling properties of the
RG-RRT, which throws away samples that are closer to the tree rather than the reachable
set of the tree.
Some systems are modeled as having continuous dynamics most of the time, but upon
some event (reaching some surface in state space), there is a switching behavior, or discrete
impulse in the dynamics, so that the trajectory jumps to a diﬀerent part of the state space.
Mixing continuous dynamics and discrete dynamics is known as hybrid dynamics, and can be
challenging for motion planning algorithms such as the RRT to handle. An example of this
might include a simulation of legged robot if collisions are modeled as inelastic events that
cause an instantaneous change in velocity (and/or change of coordinates) when a foot touches
down. The RG-RRT can handle hybrid dynamics by integrating through the impulse when
generating the reachable set. If approximating the reachable set by discretizing the action
space, some care should be taken, because when going through an impulse, the reachable set
may be divided in two, which means any assumptions of a convex reachable set should be
applied now to two separate regions instead of one.
Motion primitives are sometimes used in planning, in order to reduce the action space. See
for example [Frazzoli et al., 2002, Frazzoli et al., 2005], which provides in depth discussion of
motion primitives in a planning setting. As long as there is a way to eﬃciently approximate
the result of using a motion primitive for generating a reachable set, the RG-RRT is amenable
to this approach.
The next chapter will show an example of an application of motion
primitives for tackling higher-dimensional motion planning problems.
The RG-RRT algorithm alleviates much of the sensitivity of randomized sampling for systems
with diﬀerential constraints to the metric that is employed to expand the tree. Essentially,
the algorithm rejects samples, and utilizes the metric only for regions of the state space for
which the tree can make progress towards the sample. The result is a modiﬁed Voronoi bias
that emphasizes nodes that are more likely to promote exploration given the constraints
on the system dynamics. The RG-RRT takes advantage of the fact that sampling is cheap
compared to the process of expanding a node and, therefore, is willing to re-sample until a
point is drawn from a region that yields better exploration.
The size distribution for the modiﬁed Voronoi regions under the RG-RRT varies as the
tree explores the state space. Initially, there is a large variability in the size of the diﬀerent
Voronoi regions but, as the tree expands, the size tends to become more uniform. The lower
bound on the size of the regions, typically corresponding to the inside of the tree, is inversely
proportional to the resolution of the reachable set. In turn, the sampling of the space will be
probabilistically complete so long as the resolution of the reachable set is suﬃcient relative
to the smallest “gap” in the state space.
Overall, the algorithm presented is a simple way of eliminating metric and sampling
speciﬁcity in the RRT algorithm implementation, which is particularly useful when there
are diﬀerential constraints, or when reachable sets are otherwise non-spherical because of
motion primitives or hybrid dynamics. The algorithm was demonstrated to be eﬀective for
planning on systems such as the pendulum, Acrobot, and a simple model of a car.
Application: Motion Planning to Achieve
Bounding Over Rough Terrain with LittleDog
In the previous chapter, reachability guidance was shown to be eﬀective for dynamically
biasing sampling to improve RRT search eﬃciency for dynamic systems.
discretizing the control space does not scale well with dimension. In this chapter, we build
on the sampling ideas developed in this thesis, particularly reachability guidance and task-
space biasing. These approaches are combined together with a motion primitive in an RRT
based planning framework that quickly ﬁnds feasible motion plans for bounding over rough
terrain with the LittleDog robot.
Signiﬁcant portions of this chapter also appear in [Shkolnik et al., 2010].
Bounding on Rough Terrain
While many successful approaches to dynamic legged locomotion exist, we do not yet have
approaches which are general and ﬂexible enough to cope with the incredible variety of
terrain traversed by animals. Progress in motion planning algorithms has enabled general
and ﬂexible solutions for slowly moving robots, but in order to quickly and eﬃciently traverse
very diﬃcult terrain, extending these algorithms to dynamic gaits is essential.
This chapter describes a motion planning strategy to achieve dynamic locomotion over
rough terrain with the LittleDog robot. Recall that LittleDog (Figure 6-1) is a small, 3kg
position-controlled quadruped robot with point feet and was developed by Boston Dynamics
under the DARPA Learning Locomotion program. The program ran over several phases
from 2006 to 2009, and challenged six teams from universities around the United States to
compete in developing algorithms that enable LittleDog to navigate known, uneven terrain,
as quickly as possible. The program was very successful, with many teams demonstrating
robust planning and locomotion over quite challenging terrain (e.g., [Rebula et al., 2007,
Kolter et al., 2008, Pongas et al., 2007, Zucker, 2009]), with an emphasis on walking gaits,
and some short stereotyped dynamic maneuvers that relied on an intermittent existence of
Figure 6-1: LittleDog robot, and a corresponding 5-link planar model (blue circles represent
actuated joints, white circles represent passive joints)
a support polygon to regain control and simplify planning [Byl et al., 2008]. In this work, I
present a method for generating a continuous dynamic bounding gait over rough terrain.
Achieving bounding on LittleDog is diﬃcult for a number of reasons. First of all, the robot
is mechanically limited - high-gear ratio transmissions generally provide suﬃcient torque
but cause severe limits on joint velocities and complicate any attempts at direct torque
control. Furthermore, a stiﬀ frame complicates impact modeling and provides essentially no
opportunity for energy storage. Finally, the robot is underactuated, with the dynamics of the
unactuated joints resulting in a complicated dynamical relationship between the actuated
joints and the interactions with the ground. These eﬀects are dominant enough that they
must be considered during the planning phase.
An essential component of any model-based planning approach is a suﬃciently accurate
identiﬁcation of the system dynamics. Obtaining an accurate dynamic model for LittleDog
is challenging due to subtleties in the ground interactions and the dominant eﬀects of motor
saturations and transmission dynamics. These eﬀects are more pronounced in bounding
gaits than in walking gaits, due to the increased magnitude of ground reaction forces at
impact and the perpetual saturations of the motor; as a result, we required a more detailed
model. In this section, I brieﬂy describe the model of the LittleDog robot, which was largely
developed and identiﬁed by Michael Levashov. A more comprehensive description of the
model and system identiﬁcation can be found in [Shkolnik et al., 2010].
The LittleDog robot has 12 actuators (two in each hip, one in each knee) and a total of
22 essential degrees of freedom (six for the body, three rotational joints in each leg, and one
prismatic spring in each leg). By assuming that the leg springs are over-damped, yielding
ﬁrst-order dynamics, we arrive at a 40 dimensional state space (18 × 2 + 4). However, to
Figure 6-2: Dog bounding up stairs.
Images from video at http://www.ﬂickr.com/photos/istolethetv/3321159490/
keep the model as simple (low-dimensional) as possible, we approximate the dynamics of the
robot using a planar 5-link serial rigid-body chain model, with revolute joints connecting the
links, and a free base joint, as shown in Figure 6-3. The planar model assumes that the back
legs move together as one and the front legs move together as one (see Figure 6-1). Each leg
has a single hip joint, connecting the leg to the main body, and a knee joint. The foot of the
real robot is a rubber-coated ball that connects to the shin through a small spring, which is
constrained to move along the axis of the shin. The spring is stiﬀ, heavily damped, and has
a limited travel range, so it is not considered when computing the kinematics of the robot,
but is important for computing the ground forces. In addition, to reduce the state space,
only the length of the shin spring is considered.
The model’s 7-dimensional conﬁguration space, C = R
, consists of the planar
position of the back foot (x, y), the pitch angle ω, and the 4 actuated joint angles q
, ..., q
The full state of the robot, x = [q, ˙q, l] ∈ X , has 16 dimensions and consists of the robot
conﬁguration, the corresponding velocities, and the two prismatic shin-spring lengths, l =
], one for each foot.
The control command, u, speciﬁes reference angles for the 4
actuated joints. The robot receives joint commands at 100 Hz and then applies an internal
PD controller at 1KHz. For simulation, planning and control purposes, the dynamics are
x[n + 1] = f (x[n], u[n]),
where x[n + 1] is the state at t[n + 1], x[n] is the state at t[n], and u[n] is the actuated
joint position command applied during the time interval between t[n] and t[n + 1]. We will
sometimes refer to the control time step, ∆T = t[n + 1] − t[n] = 0.01 seconds. A ﬁxed-step
4th order Runge-Kutta integration of the continuous Euler-Lagrange dynamics model is used
to compute the state update.
A self-contained motor model is used to describe the movement of the actuated joints.
Motions of these joints are prescribed in the 5-link system, so that as the dynamics are
integrated forward, joint torques are back-computed, and the joint trajectory speciﬁed by
the model is exactly followed. This model is also constrained so that actuated joints respect
bounds placed on angle limits, actuator velocity limits, and actuator torque limits.
addition, forces computed from a ground contact model are applied to the 5-link chain when
the feet are in contact with the ground. The actuated joints are relatively stiﬀ, so the model
is most important for predicting the motion of the unactuated degrees of freedom of the
system, in particular the pitch angle, as well as the horizontal position of the robot.
Figure 6-3: LittleDog model. The state space is X = [q, ˙q, l], where q = [x, y, ω, q