
Linear Algebra and Robot Modeling

Nathan Ratliff

Abstract
Linear algebra is fundamental to robot modeling, control, and opti-
mization. This document reviews some of the basic kinematic equations
and uses them to motivate an SVD-centric geometric perspective on linear
algebra. This perspective illuminates the underlying structure and be-
havior of linear maps and simplifies analysis, especially for reduced rank
matrices. We later review some properties of multidimensional quadratic
forms that are central to optimal control. Most of this material should
already be familiar to the reader; this document explores some specifics
and offers a potentially unique intuition-oriented perspective.

1 Basic kinematic equations


Let C ⊂ Rd be a configuration space. For instance, the configuration space
may be a space of valid joint angles for a robot manipulator. Consider as a
running example the differentiable map mapping a point in the configuration
space (a particular set of joint angles) to the three-dimensional location of the
manipulator’s fingertip. We denote this forward kinematics map abstractly as
φ : C → R3 .
This map is no different from any other multidimensional differentiable mapping from calculus, so we can ask the typical calculus questions, such as how the output changes with changes in the input. Kinematic equations are none other than equations relating derivatives of the inputs to derivatives of the outputs.
Two common kinematic equations are

ẋ = Jφ q̇ and ẍ = Jφ q̈ + J˙φ q̇, (1)

where Jφ = ∂φ/∂q is the Jacobian (total derivative) of φ. These equations are easy to write down, especially when referencing tables of derivative formulas in an Advanced Calculus text, but in robotics we need a strong intuitive understanding of what they mean. For instance, two questions that may pop to mind are
1. How does that first equation relate to the equation δx = Jφ δq?

2. What does J˙φ actually mean, and how do we compute it?

1.1 Interpreting time derivatives
Whenever an equation uses time derivatives, such as q̇, there is an implicit
assumption that q is really a trajectory q : [0, T ] → Rd. The time derivatives give the position q = q(t), velocity q̇ = d/dt q(t), and acceleration q̈ = d/dt q̇(t) of the trajectory at some time t.
Thus, the equation ẋ = Jφ q̇ relates the velocity q̇ in the system's configuration space to the velocity of the point on the end-effector in Cartesian space. Concretely, using our manipulator example, it tells us how joint velocities (the rate at which each joint angle is changing over time) relate to the fingertip's velocity (the rate at which the Cartesian coordinates of the fingertip are changing over time). The implicit assumption here is always that the system is moving along some trajectory q(t), and that this trajectory gives us the time variable that allows us to address questions of how the system evolves over time.
Now, given that q refers implicitly to an underlying trajectory, we can interpret what J˙φ means. As the system moves along its trajectory from q(t) to a point dt in the future, q(t + dt), the associated Jacobian changes slightly as well, since it's a non-constant function of t. J˙φ gives explicitly the rate at which the Jacobian is changing with time:

J˙φ = lim_{∆t→0} (1/∆t) [ Jφ(q(t + ∆t)) − Jφ(q(t)) ]        (2)

We usually calculate this quantity numerically by taking a finite difference between the Jacobian now and the Jacobian found a little bit in the direction of the current velocity. That approximation can be derived from the above using a first-order Taylor expansion and a finite time step:

J˙φ ≈ (1/∆t) [ Jφ(q(t + ∆t)) − Jφ(q(t)) ]        (3)
    = (1/∆t) [ Jφ(q(t) + ∆t q̇) − Jφ(q(t)) ].        (4)
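As a concrete illustration, the following minimal sketch (Python/NumPy; the jacobian callable is a hypothetical stand-in for Jφ, not something defined in this document) implements the finite-difference approximation of Equation 4:

```python
import numpy as np

def jacobian_dot(jacobian, q, q_dot, dt=1e-6):
    """Finite-difference approximation of the time derivative of the Jacobian (Equation 4).

    jacobian -- callable mapping a configuration q (shape (d,)) to the matrix Jphi(q)
    q, q_dot -- current configuration and joint velocity
    dt       -- small time step used for the finite difference
    """
    J_now = jacobian(q)
    J_ahead = jacobian(q + dt * q_dot)   # Jacobian a little bit along the current velocity
    return (J_ahead - J_now) / dt
```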

1.2 Tangent vectors and co-tangent vectors


How does the expression δx = Jφ δq differ from the expression above relating
velocities ẋ = Jφ q̇? Notationally, δq denotes a small displacement of q which
is qualitatively similar to a velocity vector, but depending on whether you’re an
engineer with a tool belt full of math tricks or a mathematical purist the two
expressions can range from being “slightly different but heuristically the same”
to being “fundamentally different”. Clearly, the two are superficially similar because derivatives work in very structured ways, but the perturbation equation doesn't assume that
there’s any underlying trajectory. The quantity δq should be thought of as a
small perturbation in q, and the equation tells us how small perturbations in q
result in small perturbations of x.
There are technical details that we won’t worry about in this class, but
rigorous treatments of these ideas can be found in textbooks on differential
geometry. There, q̇ is known as a tangent vector, and these small perturbations
δq are known as co-tangent vectors. The former generalizes tangents to curves
across a manifold, and the latter generalizes gradients of scalar functions defined
on the manifold.
Within robotics, rigor is often secondary to physical intuition, so definitions
and manipulations of these ideas are less stringent. The bottom line is that q̇
always refers to a time derivative of an underlying trajectory, whereas δq will
simply denote a perturbation (a small movement) of the point q in space, with no
notion of time.
We’ll often use δq to also represent a finite-sized perturbation away from
some approximation point q, which lets us write the first-order Taylor expansion
of a scalar function c : C → R as

c(q + δq) ≈ c(q) + ∇c(q)T δq. (5)

In this notation, the expression δx = (x + δx) − x ≈ Jφ δq can be viewed as a first-order Taylor expansion of the map φ. Following this line of thought, we can also expand the approximation to second order and write1

δx ≈ Jφ δq + (1/2) [ δqT ∇²φk(q) δq ]k ,        (6)

where the k-th entry of the bracketed second-order term is (1/2) δqT ∇²φk(q) δq.
This notion of second-order expansion doesn’t make sense for ẋ = Jφ q̇ since
we always treat this latter equation as an exact differential expression relating
time-rates of change.
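To see the perturbation equation in action, here is a minimal sketch (Python/NumPy) using a hypothetical two-link planar arm with unit link lengths as a stand-in for the forward kinematics map φ (this specific arm is not defined in the text); it checks that a small perturbation δq produces δx ≈ Jφ δq:

```python
import numpy as np

def phi(q):
    """Forward kinematics of a hypothetical 2-link planar arm with unit link lengths.

    Maps joint angles q = (q1, q2) to the end-effector position (x, y)."""
    return np.array([np.cos(q[0]) + np.cos(q[0] + q[1]),
                     np.sin(q[0]) + np.sin(q[0] + q[1])])

def jacobian(q):
    """Analytic Jacobian J_phi = d phi / d q for the arm above (2x2)."""
    return np.array([[-np.sin(q[0]) - np.sin(q[0] + q[1]), -np.sin(q[0] + q[1])],
                     [ np.cos(q[0]) + np.cos(q[0] + q[1]),  np.cos(q[0] + q[1])]])

q = np.array([0.3, 0.8])
dq = 1e-3 * np.array([1.0, -2.0])     # a small perturbation delta-q

dx_exact = phi(q + dq) - phi(q)       # actual change in the fingertip position
dx_linear = jacobian(q) @ dq          # first-order prediction J_phi delta-q
print(np.allclose(dx_exact, dx_linear, atol=1e-5))  # agreement to first order
```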

2 A geometric perspective on linear algebra


Linear algebra is fundamental to control and optimization. This section reviews
some of the most important properties of linear maps from the perspective
of the Singular Value Decomposition (SVD). If we start from the notion that
every matrix has an SVD, the structure and behavior of the matrix become immediately clear, and algebraic manipulations of even reduced rank matrices are
straightforward. We assume familiarity with the basic linear algebraic concepts
and focus here on understanding the underlying geometry of linear maps.

2.1 Linear maps are bijections between fundamental spaces


The row space and column space are fundamental to a linear map. We'll see in this section that a linear map forms a bijection2 between these two spaces, and that all components in the orthogonal complements of these spaces are simply removed. These orthogonal complements are known as the (right) null space and the left null space, respectively.
1 For those familiar with tensor notation and the Einstein summation convention, this second term can be more compactly written as (1/2) (∂ij φk) δq^i δq^j.
2 A bijection is formally a one-to-one and onto function. One may view it as a full pairing between points in the domain and range: every domain point x has a corresponding range point y, and vice versa, under the bijection.

Usually, the row and column spaces are defined as the space spanned by the rows and the space spanned by the columns. Those definitions, though true, aren't
very insightful. This section shows that the SVD provides a nice geometric
view of what these spaces are and how the linear map operates on them. This
decomposition gives geometric insight into the fundamental nature of linear
algebra.
Let A ∈ Rm×n be a matrix representing a linear map from Rn to Rm .
We know from what is sometimes referred to as the fundamental theorem of
linear algebra that A has a Singular Value Decomposition (SVD) of the form
A = U SV T , where U and V are orthogonal matrices and S ∈ Rm×n is a
diagonal matrix of singular values σ1 , . . . , σk . We don’t assume that A is full
rank, so k may be less than both m and n. Since S is a non-square diagonal
matrix with only k ≤ m, n nonzero entries, we can better reveal its structure by
writing it as
   
S = \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} I \\ 0 \end{bmatrix} \Sigma \begin{bmatrix} I & 0 \end{bmatrix}        (7)

where the 0 matrices are appropriately sized and Σ = diag(σ1 , . . . , σk ).


Since S as shown in Equation 7 decomposes as a square diagonal matrix Σ
with zero buffers on either side, we can further decompose both U and V into
 
U = [ U// U⊥ ]  and  V = [ V// V⊥ ],        (8)

where the columns with the subscript ⊥ are the ones annihilated by the zeros and the columns with the subscript // are those that remain. This decomposition
allows us to rewrite the SVD as
A = \begin{bmatrix} U_{//} & U_{\perp} \end{bmatrix}
    \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix}
    \begin{bmatrix} V_{//}^T \\ V_{\perp}^T \end{bmatrix}        (9)

  = \begin{bmatrix} U_{//} & U_{\perp} \end{bmatrix}
    \begin{bmatrix} I \\ 0 \end{bmatrix} \Sigma
    \begin{bmatrix} I & 0 \end{bmatrix}
    \begin{bmatrix} V_{//} & V_{\perp} \end{bmatrix}^T        (10)

  = U//ΣV//T.        (11)

It's fairly easy to show that, in terms of this decomposition, span(V//) is the row space, span(V⊥) is the (right) null space, span(U//) is the column space, and span(U⊥) is the left null space (which we'll see below is the null space of the natural generalized inverse).
The last expression in Equation 11, known as the thin SVD, reveals the
fundamental structure of any (possibly reduced rank) matrix. Some, depending
on background, find it more insightful to view that expression as
A = U//ΣV//T = ∑_{i=1}^{k} σi ui viT ,        (12)

where ui and vi are the k columns of U// and V//, respectively.
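As a concrete check of Equations 11 and 12 (a NumPy sketch with an arbitrary example matrix, not taken from the document), we can build a reduced-rank matrix, split its SVD into the // and ⊥ blocks, and confirm both the rank-one-sum reconstruction and the removal of null-space components:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 5, 4, 2
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))  # rank-k matrix

U, s, Vt = np.linalg.svd(A)          # full SVD: A = U S V^T
tol = 1e-10
k_eff = np.sum(s > tol)              # numerical rank

U_par, U_perp = U[:, :k_eff], U[:, k_eff:]   # U_// spans the column space, U_perp the left null space
V_par, V_perp = Vt[:k_eff].T, Vt[k_eff:].T   # V_// spans the row space, V_perp the (right) null space
Sigma = np.diag(s[:k_eff])

# Thin SVD reconstruction (Equation 11) and the rank-one sum (Equation 12).
A_thin = U_par @ Sigma @ V_par.T
A_sum = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(k_eff))
print(np.allclose(A, A_thin), np.allclose(A, A_sum))

# Components in the null space are removed: A @ (V_perp @ beta) == 0.
beta = rng.standard_normal(n - k_eff)
print(np.allclose(A @ (V_perp @ beta), 0.0))
```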

Geometrically, this expression says that all rank k matrices are simultaneously a correspondence between a k-dimensional orthogonal basis in the domain space BV = {v1, . . . , vk} and a k-dimensional orthogonal basis in the range space BU = {u1, . . . , uk}, with a rule for how vectors should be stretched or compressed (or reflected when σi < 0) along those dimensions. (We refer to these bases below simply as the matrices V// and U//, respectively.)
It's useful to think about each term in the expansion in relation to its corresponding endomorphism3 ∑_i σi ui uiT. Each term of this endomorphism acts on a vector x to find its component (uiT x) ui along the basis element ui and stretch it by σi. Since each basis element is orthogonal to all others, this endomorphism simply stretches, compresses, and/or reflects the vector independently along the given basis directions.
Equation 12, therefore, shows that all matrices have this fundamental behavior. The only twist is that, when the domain and range spaces differ, we need to both decompose the vector x in terms of the domain basis and simultaneously map that decomposition onto the corresponding basis of the range space. Once that connection between the (sub)spaces is established, the matrix can apply the underlying operation defined by its singular values σi.
The following equivalent expressions illustrate the underlying behavior of
the linear map from multiple perspectives:
     
A = U//ΣV//T = (U//V//T)(V//ΣV//T) = (U//ΣU//T)(U//V//T).        (13)

The middle grouping can be read as "stretch then map" and the rightmost as "map then stretch". Each gives a subtly different way of thinking about how the matrix A operates. The first says that we can think of A physically as a stretching/squishing of the domain space followed by an incompressible mapping between orthogonal bases (simply mapping each domain basis element to its corresponding range basis element), whereas the second says we can equivalently view A as first a mapping between basis elements followed by a stretching of the range space. It's natural, then, to think of A's action holistically as simply an association between the two k-dimensional subspaces defined by U//V//T and a corresponding stretching/squishing/reflection of that unified space.
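Equation 13 can be sanity-checked numerically in the same way (again an illustrative NumPy sketch with an arbitrary reduced-rank matrix, not an example from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))   # rank-2 example

U, s, Vt = np.linalg.svd(A)
k = np.sum(s > 1e-10)
U_par, V_par, Sigma = U[:, :k], Vt[:k].T, np.diag(s[:k])

stretch_then_map = (U_par @ V_par.T) @ (V_par @ Sigma @ V_par.T)  # stretch the domain, then map
map_then_stretch = (U_par @ Sigma @ U_par.T) @ (U_par @ V_par.T)  # map, then stretch the range
print(np.allclose(A, stretch_then_map), np.allclose(A, map_then_stretch))
```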

2.2 A natural generalized inverse


Note that the basis-correspondence mappings of the form U//V//T = ∑_{i=1}^{k} ui viT are all between k-dimensional subspaces, not the full space. What happens to the dimensions of x orthogonal to that space in the domain in the expression Ax = (U//ΣV//T) x? They vanish!4 There aren't enough singular values to represent those dimensions, so simply by the basic structure of a linear map, dimensions orthogonal to the fundamental k-dimensional subspaces are removed.
3 An endomorphism is a mapping from a space back onto the same space.
4 More explicitly, the operator V//V//T projects a vector onto the space span(V//), and V⊥V⊥T projects onto span(V⊥), so we can always decompose a domain vector x as x = V//V//T x + V⊥V⊥T x. Using that decomposition, Ax = U//ΣV//T (V//V//T x + V⊥V⊥T x) = U//ΣV//T x, since V//T V⊥ = 0 by construction.
In other words, the mapping between the subspaces is bijective, and any com-
ponents orthogonal to those spaces cannot be represented by the linear map.
Thus, if A = U//ΣV//T is the forward map, implementing a bijection between the fundamental spaces and removing any domain component orthogonal to the row space, then it stands to reason that the opposite procedure, which implements the inverse of that bijection between the fundamental spaces and removes any component orthogonal to the column space, is a natural generalized inverse for this map. This generalized inverse is given explicitly by the map

A† = V//Σ−1 U//T . (14)

It's straightforward to show that A† implements the exact inverse of the bijection between the fundamental k-dimensional subspaces, while any component orthogonal to span(U//) is removed. This expression is exactly the Moore-Penrose pseudoinverse; when A has full row rank, it reduces to the familiar formula

A† = AT (AAT )−1 ,        (15)

as can be shown simply by expanding each A as U//ΣV//T. In other words, by construction, we've shown that the Moore-Penrose pseudoinverse is a natural generalization of the notion of inverse. Prescriptively, it says that between the domain and range spaces the right thing to do is perform either the forward or inverse bijection between the fundamental k-dimensional subspaces and simply remove any components orthogonal to those spaces. For the forward map, we remove any component orthogonal to the row space and perform the forward bijection. For the inverse map, we remove any component orthogonal to the column space and perform the backward bijection.
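The claim that Equation 14 is the Moore-Penrose pseudoinverse is easy to verify numerically (a NumPy sketch with an arbitrary reduced-rank example; np.linalg.pinv computes the Moore-Penrose pseudoinverse, and the full-row-rank formula of Equation 15 would not even apply here since AAT is singular):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # rank 2, reduced rank

U, s, Vt = np.linalg.svd(A)
k = np.sum(s > 1e-10)
U_par, V_par = U[:, :k], Vt[:k].T

A_dagger = V_par @ np.diag(1.0 / s[:k]) @ U_par.T   # Equation 14: V// Sigma^{-1} U//^T
print(np.allclose(A_dagger, np.linalg.pinv(A)))     # matches the Moore-Penrose pseudoinverse

# A^T (A A^T)^{-1} would fail here: A A^T is singular because A is not full row rank.
```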

2.3 Using the SVD to solve reduced rank linear equations


Let A = U//ΣV//T ∈ Rm×n be any rank k matrix as above. A may be reduced
rank, making k smaller than either m or n. Solving the equation

Ax = b

fully for the entire space of valid solutions is straightforward using this decom-
position as we show here:

Ax = b
⇒ U//ΣV//T x = b
⇒ V//T x = Σ−1 U//T b. (16)


x can always be decomposed as a linear combination of the columns of V// and a linear combination of the columns of V⊥, as in x = V//α + V⊥β, where α and β are the associated coefficients. Doing so shows that the left-hand side of the expression in Equation 16 reduces to

V//T x = V//T (V//α + V⊥β)
       = (V//T V//) α + (V//T V⊥) β
       = α.

In other words, the equation Ax = b constrains x only through α, which is given by

α = Σ−1 U//T b.        (17)

The full solution is, therefore,


 
x = V//α + V⊥β = V//Σ−1 U//T b + V⊥β = A† b + V⊥β        (18)

for any (n − k)-dimensional coefficient vector β. As indicated, the first term is


just the well known pseudoinverse solution, and the second term is an element
of the null space spanned by V⊥ .
This notation is somewhat different from that commonly used to describe these solutions (raw matrix operations such as in Equation 15), but expressions derived through applications of the SVD can be implemented easily and robustly, too, since matrix libraries make SVD computations readily available in practice. The trade-off is one of robustness, stability, and geometric insight (SVD) versus raw computational speed (direct matrix operations).
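For instance (a NumPy sketch with an arbitrary consistent, reduced-rank system, not an example from the document), the full solution family of Equation 18 can be constructed directly from the thin SVD:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))   # rank 2, so the system is underdetermined
b = A @ rng.standard_normal(5)                                   # a consistent right-hand side

U, s, Vt = np.linalg.svd(A)
k = np.sum(s > 1e-10)
U_par, V_par, V_perp = U[:, :k], Vt[:k].T, Vt[k:].T

x_pinv = V_par @ np.diag(1.0 / s[:k]) @ U_par.T @ b   # pseudoinverse solution A^dagger b
beta = rng.standard_normal(V_perp.shape[1])           # any (n - k)-dimensional coefficient vector
x = x_pinv + V_perp @ beta                            # Equation 18: every such x solves A x = b
print(np.allclose(A @ x, b))
```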

2.4 Solving the system without algebra


Consider the equation Ax = b, where A ∈ Rm×n can be any possibly reduced
rank matrix. Then, as discussed above, the decomposition A = U//ΣV//T tells
us that A transports specifically only the portion of x lying within span(V//)
to a point in the space span(U//). That means that all vectors of the form

z(β) = x0 + V⊥ β (19)

are transported to the same point under A. In particular, when x0 lies in the row space span(V//), i.e. x0 = V//α, then we know where x0 ends up under the fundamental internal bijection of A, and that means we know where all points z(β) end up.
Thus, to solve the system, we just need to know which point in span(V//)
maps to b ∈ span(U//). Since the forward map takes the coordinates of the
point x in the domain basis V//, inflates them by the factors σi , and applies them
directly to the range basis U//, the inverse operation must just do the opposite. It

should take the coefficients in the range basis U//, shrink them by the inverse factors 1/σi, and apply them to the domain basis V//. Thus (as we derived above), if the forward
map is A = U//ΣV//T , the inverse map between fundamental spaces must be
A† = V//Σ−1 U//T (specifically, U//T finds the components in the k-dimensional
orthogonal basis for span(U//), Σ−1 scales those components, and V// applies the
resulting components directly to the k orthogonal basis elements of span(V//)).
Given this intuition, we’re now equipped to simply write down the full linear
space solution to the system directly:
 
x∗ = V//Σ−1 U//T b + V⊥ β (20)

for any β ∈ Rn−k.
Note that this argument can also be used to solve, in the least squares
sense, the system when b doesn’t actually lie within span(U//). In this case, we
just need to find which element of span(V//) gets mapped onto the orthogonal
projection of b onto span(U//), which is just the point defined by the coordinates
of b in basis U//. Considering only these coordinates, the above argument still
unfolds in the same way and the solution doesn’t change. In other words, we
can use a geometric argument to show that Equation 20 also solves the least squares problem.
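That least-squares claim is also straightforward to check numerically (a NumPy sketch with an arbitrary inconsistent system; np.linalg.lstsq provides a reference least-squares solution):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # tall and rank deficient (rank 2)
b = rng.standard_normal(6)                                       # generically NOT in span(U//)

U, s, Vt = np.linalg.svd(A)
k = np.sum(s > 1e-10)
U_par, V_par = U[:, :k], Vt[:k].T

x_star = V_par @ np.diag(1.0 / s[:k]) @ U_par.T @ b   # Equation 20 with beta = 0
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]        # NumPy's least-squares solution

# Both achieve the same (minimal) residual norm.
print(np.allclose(np.linalg.norm(A @ x_star - b), np.linalg.norm(A @ x_lstsq - b)))
```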

3 Quadratic forms and their manipulation


Quadratic forms are important tools in optimization and control because of their
close connection to linear systems. Above, for instance, we saw that the basic
structure of a linear map encodes an implicit least squares problem, which itself
may be viewed as a quadratic objective function.
This section reviews very briefly some of the basic rules governing quadratic
forms that should ideally be understood at an intuitive level. These rules relate
how quadratics combine with one another and how they behave under linear
transformation. We’ll state the rules here without proof. They can all be
derived by calculation.
1. Adding quadratics always gives another quadratic. Q1 (x) + Q2 (x)
is a quadratic over x, and Qx (x) + Qq (q) is a quadratic function defined
jointly over x and q. Generally, adding any two quadratic functions of
any collection of variables gives a joint quadratic over the union of those
variables. Calculating the coefficients can be tedious, but we always know
that it’ll still be a quadratic.
2. Linearly transforming a quadratic gives another quadratic. If Q(x) is a quadratic and x = Aq + Bu is a linear transformation of q and u into x, then the composition Q(Aq + Bu) = Q̃(q, u) is a quadratic function over q and u.

3. Conditioning: Fixing one variable gives a quadratic over the others. If Q(x, u) is a joint quadratic defined over both x and u, fixing one of the variables, say x at a particular value xt, gives a quadratic function Q(xt, u) = Q̃xt(u) over the remaining variables u.

4. Marginalization: Optimizing over a subset of variables gives a quadratic over the rest. Let Q(x, u) be a quadratic over both x and u. We can solve analytically for the optimal u as a function of x to get an expression u∗(x) that tells us the optimal setting of the variable u given the value x. Plugging that expression back into the quadratic removes u (since we're optimizing over it) and gives Q̃(x) = Q(x, u∗(x)). This new function Q̃(x) is also a quadratic (see the sketch following this list).
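To make conditioning and marginalization concrete, here is a minimal NumPy sketch (an illustrative example, not from the original text) for a joint quadratic Q(x, u) = (1/2)[x; u]T H [x; u] + gT [x; u]; fixing x gives a quadratic in u with Hessian Huu, and minimizing over u gives a quadratic in x whose Hessian is the Schur complement Hxx − Hxu Huu−1 Hux:

```python
import numpy as np

rng = np.random.default_rng(5)
nx, nu = 3, 2

# Build a random joint quadratic Q(x, u) with a positive-definite Hessian H.
M = rng.standard_normal((nx + nu, nx + nu))
H = M @ M.T + (nx + nu) * np.eye(nx + nu)
g = rng.standard_normal(nx + nu)

Hxx, Hxu = H[:nx, :nx], H[:nx, nx:]
Hux, Huu = H[nx:, :nx], H[nx:, nx:]
gx, gu = g[:nx], g[nx:]

def Q(x, u):
    z = np.concatenate([x, u])
    return 0.5 * z @ H @ z + g @ z

# Conditioning: for fixed x_t, u -> Q(x_t, u) is a quadratic in u alone (Hessian Huu).
x_t = rng.standard_normal(nx)
u = rng.standard_normal(nu)
cond = 0.5 * u @ Huu @ u + (Hux @ x_t + gu) @ u + (0.5 * x_t @ Hxx @ x_t + gx @ x_t)
print(np.allclose(Q(x_t, u), cond))

# Marginalization: u*(x) = -Huu^{-1} (Hux x + gu) minimizes Q over u for fixed x,
# and Q(x, u*(x)) is again a quadratic in x with Hessian Hxx - Hxu Huu^{-1} Hux.
def u_star(x):
    return -np.linalg.solve(Huu, Hux @ x + gu)

H_marg = Hxx - Hxu @ np.linalg.solve(Huu, Hux)
g_marg = gx - Hxu @ np.linalg.solve(Huu, gu)
c_marg = -0.5 * gu @ np.linalg.solve(Huu, gu)

x = rng.standard_normal(nx)
lhs = Q(x, u_star(x))                               # marginalized value by direct substitution
rhs = 0.5 * x @ H_marg @ x + g_marg @ x + c_marg    # explicit quadratic in x
print(np.allclose(lhs, rhs))
```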
It's important to be able to perform the calculations that explicitly derive the above results, but it's more important to understand them at an intuitive level: the calculations can be tedious, while the results alone give a lot of theoretical insight into optimization and control problems. In particular, the last two properties, called conditioning and marginalization in reference to their Gaussian analogs (Gaussian inference and quadratic optimization are very closely related), are of central importance to optimal control.
