UNIVERSIDADE DE COIMBRA
INSTITUTO DE SISTEMAS E ROBÓTICA
3030-290 COIMBRA, PORTUGAL

ARTICLE: Calibration of Smooth Camera Models

Author: Pedro MIRALDO (miraldo@isr.uc.pt)
Co–Author: Prof. Dr. Helder ARAUJO (helder@isr.uc.pt)
CONTENTS

1 Introduction
  1.1 Model and Approach
  1.2 Outline of the Paper

2 Notation and Background
  2.1 Notation
  2.2 Plücker coordinates
  2.3 Interpolation

3 Smooth Camera Model

4 The Proposed Method
  4.1 Vector–Valued Function
  4.2 Linear Point–based Calibration
    4.2.1 Calibration Matrix M
    4.2.2 Computation of the Camera Matrix
    4.2.3 Relationship between Control Points, Point Correspondences and Radial Basis Function Used in the Calibration
  4.3 Data Normalization
    4.3.1 Scaling the Coordinates of the Set of Image Points
    4.3.2 Scaling the Coordinates of the Set of World Points
    4.3.3 Camera Model and Calibration with Normalization

5 Experiments
  5.1 Results with Synthetic Data Sets
    5.1.1 Evaluation Results Using Smooth Camera Models
    5.1.2 Results with Noise Added
    5.1.3 Results Using Non–Smooth Camera Models
    5.1.4 Experimental Results Using 3D Data from a Single Surface
  5.2 Results with Data Sets of Real Images
  5.3 Using a Calibrated Perspective Camera to Acquire a Data Set with Real Data
  5.4 Removing Distortion: Results for a Spherical Catadioptric System

6 Conclusions
  6.1 Discussion
  6.2 Closure

References

Biographies
  Pedro Miraldo
  Helder Araujo
Calibration of Smooth Camera Models
Pedro Miraldo and Helder Araujo
Abstract—Generic imaging models can be used to represent any camera. Current generic models are discrete and define a mapping between each pixel in the image and a straight line in 3D space. This paper presents a modification of the generic camera model that simplifies the calibration procedure. The only requirement is that the coordinates of the 3D projecting lines are related by functions that vary smoothly across space. Such a model is obtained by modifying the general imaging model using radial basis functions to interpolate image coordinates and 3D lines, thereby allowing both an increase in resolution (due to their continuous nature) and a more compact representation. Using this variation of the general imaging model, we also develop a calibration procedure. This procedure only requires that a single 3D point be matched to each pixel and, in addition, not all pixels need to be calibrated. As a result, the complexity of the procedure is significantly decreased. Normalization is applied to the coordinates of both image and 3D points, which increases the accuracy of the calibration. Results with both synthetic and real data sets show that the model and the calibration procedure are easily applicable and provide accurate calibration results.
Index Terms—General Camera Models, Camera Calibration, Smooth Vector–Valued Functions.
1 INTRODUCTION

Most cameras can be modeled by a perspective projection, which implies that all projecting rays intersect at a single point. These cameras are usually called central cameras. The calibration process for the perspective camera model (pinhole model) is well known and easy to implement [1], [2], [3]. Other camera models that obey the single viewpoint constraint but do not
satisfy the pinhole model have been defined, for example
special cases of catadioptric camera models [4], [5].
Moreover, central camera models with a non–parametric association from image pixels to rays (General Central Camera Model), and their respective calibration, have also been studied [6], [7], [8].
In the last few years, cameras whose projecting rays do not satisfy the constraint of intersecting at a single effective viewpoint have started to be used, essentially due to the large fields of view that can be obtained. These cameras are called non–central and in many cases are obtained by a generic combination of a perspective camera and a curved mirror (the non–central catadioptric cameras) [9], [10], [11], [12] (Figure 3(a)). They are used in several
applications ranging from robotics to visualization.
Many other non–central camera models have also been
developed such as linear cameras [13], [14], [15], [16]
(Figure 3(d)), fisheye cameras [17], [18] or camera models
that include refractive elements [19], [20] (Figure 3(g)).
All of these systems have, in general, models that are
characterized by a small number of parameters. Since the
number of parameters is small, the calibration procedures are, in general, simple.

• P. Miraldo and H. Araujo are with the Institute for Systems and Robotics, Department of Electrical and Computer Engineering, University of Coimbra, 3030-290 Coimbra, Portugal. E-mail: {miraldo, helder}@isr.uc.pt
• This work was supported by the Portuguese Foundation for Science and Technology (FCT) with grants PTDC/EIA-EIA/122454/2010 and PTDC/EIA-CCO/109120/2008. P. Miraldo also thanks the FCT for his doctoral grant, reference SFRH/BD/49054/2008.
All of these models are specific since they are derived
using geometric models of the cameras [21] and therefore
they lack generality. To deal with these drawbacks, some
techniques were developed to calibrate special classes of
cameras such as [22], [23].
Grossberg and Nayar [24], [25] define a non–parametric discrete imaging model that can represent any type of camera, central or non–central (General Camera Model). Unlike the usual parametric camera models, this model consists of an individual mapping between pixels and rays in 3D space (Figure 1). To each pixel, a set of parameters called a raxel is associated. The set of all raxels (representing all pixels) constitutes the complete General Camera Model. A raxel is a set of parameters including the image pixel coordinates, the coordinates of the associated ray in the world (position and direction) and radiometric parameters.
Grossberg and Nayar also propose a method for estimating the parameters of the General Camera Model. Their approach requires the acquisition of at least two images of a calibration object with known structure, and also requires knowledge of the object motion between the images. It thus requires two world points for the same image pixel.
Sturm and Ramalingam [26] and Ramalingam et al. [27] proposed a calibration method based on the non–parametric imaging model suggested by Grossberg and Nayar. However, they excluded the radiometric entities of the raxel from their model. Their method assumes that the camera is fully described by the coordinates of the rays and the mapping between rays and pixels.
Instead of using two images, Sturm and Ramalingam's method requires three images of the calibration object, acquired from arbitrary and unknown viewing positions.
If three points of the calibration object are seen at the same pixel, the collinearity constraint allows the computation of the motion between the images of the calibration object and, as a result, the estimation of the direction of the corresponding ray in 3D space.

Fig. 1. A unified camera model only assumes that an association between image pixels and rays in the world space exists.
All the methods mentioned above are discrete and non–parametric, using mapping arrays (raxels) to calibrate the imaging model. Each image pixel has an associated set of parameters that is independent of its neighbors.
Most useful camera models have pixel–ray relationships that vary smoothly across the image. That is the case for the central perspective camera, non–central catadioptric systems with quadric mirrors, linear cameras, fisheye cameras and camera models that include refracting elements.
In this article we modify the discrete model defined by Grossberg and Nayar by assuming that the 3D rays associated with the pixels vary smoothly across the image. The assumption of smoothness is used as a constraint in the definition of the camera model and, therefore, in the calibration method. This paper extends the approach presented in [28], where the Smooth Camera Model was presented for the first time. This paper provides further details concerning the application of the Smooth Camera Model, and improves it by using normalized coordinates. An example and discussion of calibration with minimal data is presented. Additional experimental results with synthetic data are presented, including a crossed–slits camera model, a "refraction camera" made up of a perspective camera looking through a volume of water, and a non–smooth configuration made up of four perspective cameras looking at the same scene. Additional experimental results with real images are also included, namely the case of a perspective camera looking at a scene through a water–filled glass tank.
1.1 Model and Approach

Since the estimation of the raxels for all the image pixels requires a lengthy calibration procedure, other authors have also assumed that the relationship between neighboring pixels and neighboring 3D rays in the world varies smoothly.
Fig. 2. In this figure we display the result of the application of the calibration method to the minimal case according to the theoretical constraints (see Section 4.2.2). In the minimal case there are six points in the image and their corresponding points in the world.
In [25] and [26], [27], the authors have also used this constraint. The constraint is applied after the estimation of at least a subset of all possible image pixels. Sturm and Ramalingam used this constraint to interpolate between a subset of calibrated raxels corresponding to neighboring pixels. Grossberg and Nayar [25] defined a Caustic Raxel Model that aimed at estimating a caustic surface from a set of raxels. In that model, the estimation of the caustic (and therefore the calibration of the imaging model) requires the estimation of the raxels, which in turn requires more than one 3D point for each image pixel.
In both previous cases, constraints enforcing smoothness between neighboring pixels are applied after calibration, to interpolate between pixels that were not calibrated. In the proposed approach, the assumption that the relationship between pixels and rays varies smoothly is used directly in the model. For that purpose, a vector–valued function that can represent any Smooth Camera Model is defined. A parametric representation of the General Camera Model is defined, which results in:
• a decrease in the number of unknown parameters;
• the number of parameters becoming independent of the image resolution.
Using this camera model, a new calibration procedure can be developed, where:
• only one world point is needed for each image point, unlike previous methods [24], [26] that require at least two world points for each image point;
• the calibration procedure has lower complexity, since all the parameters are estimated in a single step.
As in [26], only the geometric entities of the General Camera Model are considered.

In Figure 2, an example of the advantages of the proposed method in the calibration of smooth camera models is shown. The minimal case (defined according to the theoretical conditions) is considered, where six correspondences between world and image points are known (see the proof in Section 4.2.2). In this example all the coordinates of the image points are different. The estimated 3D lines pass through the corresponding 3D points using one-to-one mappings, which occurs as a result of applying the constraint that neighboring pixels must correspond to neighboring rays in the world (the smoothness constraint).

The proposed method can be regarded as a black–box general Smooth Camera Model which is linear and leads to a simpler calibration method.
1.2 Outline of the Paper

In the next section, the mathematical notation used in the article is presented. In Section 3 the proposed modification of the General Camera Model, the Smooth Camera Model, is defined. In Section 4 the proposed method is described. Results are presented in Section 5 and discussed in Section 6.
2 NOTATION AND BACKGROUND

2.1 Notation

Matrices are represented as bold capital letters (e.g. $\mathbf{A} \in \mathbb{R}^{n \times m}$, $n$ rows and $m$ columns). Vectors are represented as bold small letters (e.g. $\mathbf{a} \in \mathbb{R}^n$, $n$ elements). By default, a vector is considered a column. Small letters (e.g. $a$) represent one–dimensional elements. By default, the $j$th column vector of $\mathbf{A}$ is specified as $\mathbf{a}_j$, while the $j$th element of a vector $\mathbf{a}$ is written as $a_j$. The element of $\mathbf{A}$ in row $i$ and column $j$ is represented as $a_{i,j}$. Regular capital letters (e.g. $A$) indicate one–dimensional constants. We use the symbol $\sim$ to denote that a vector or matrix is represented up to a scale factor.

Let $\mathbf{U} \in \mathbb{R}^{m \times n}$, $\mathbf{V} \in \mathbb{R}^{k \times l}$ and $\mathbf{C} \in \mathbb{R}^{m \times k}$ be known and $\mathbf{X} \in \mathbb{R}^{n \times l}$ unknown. Using the Kronecker product, we can write

$$\mathbf{U}\mathbf{X}\mathbf{V}^T = \mathbf{C} \implies (\mathbf{V} \otimes \mathbf{U})\,\text{vec}(\mathbf{X}) = \text{vec}(\mathbf{C}) \quad (1)$$

where $\otimes$ is the Kronecker product of $\mathbf{U}$ and $\mathbf{V}$, with $(\mathbf{V} \otimes \mathbf{U}) \in \mathbb{R}^{mk \times nl}$, and $\text{vec}(\mathbf{X})$ is the $nl$–vector formed by stacking the columns of $\mathbf{X}$.

2.2 Plücker coordinates

Plücker coordinates are a special case of Grassmann coordinates [29]. A Grassmann manifold is the set of $k$–dimensional subspaces in an $n$–dimensional vector space, and it is denoted as $\Lambda^k \mathbb{R}^n$. Plücker coordinates can be obtained as a representation of the exterior product of four–dimensional vectors, $\mathbf{x} \wedge \mathbf{w}$. The result of this operation lies in a six–dimensional vector space $\mathbb{R}^6$ and can represent lines in 3D space.

Consider two points in the world ($\mathbf{x}$ and $\mathbf{w}$ in $\mathcal{P}^3$) represented in homogeneous coordinates $\mathbb{R}^4$. Using Plücker coordinates, we can represent the 3D line incident on both points as

$$\mathbf{l} \sim \mathbf{x} \wedge \mathbf{w} = (\underbrace{l_{01}, l_{02}, l_{03}}_{\mathbf{d}},\; \underbrace{l_{23}, l_{31}, l_{12}}_{\mathbf{m}}) \in \Lambda^2 \mathbb{R}^4 \subset \mathbb{R}^6 \quad (2)$$

with $l_{ij} = x_i w_j - x_j w_i$ and basis $\mathbf{e}_{ij} = \mathbf{e}_i \wedge \mathbf{e}_j$ ($\mathbf{e}_i$ being the $\mathbb{R}^4$ basis), where $\mathbf{d}$ and $\mathbf{m}$ are, respectively, the direction and the moment of the line.

Although all elements of the four–dimensional exterior product, $\Lambda^2 \mathbb{R}^4$, belong to $\mathbb{R}^6$, not all elements of $\mathbb{R}^6$ represent lines in 3D space. It can be shown that Equation (2) is the result of a four–dimensional exterior product (and therefore is a line in 3D space) if and only if it belongs to the Klein quadric, which is equivalent to writing $\langle \mathbf{d}, \mathbf{m} \rangle = 0$.

Plücker coordinates enable the computation of the incidence condition between lines and points in the world. Using the direction and moment vectors, a point $\mathbf{p} \in \mathcal{P}^3$ is incident on a line $\mathbf{l} \in \Lambda^2 \mathbb{R}^4$ if

$$\underbrace{\begin{pmatrix} [\mathbf{p}]_\times & -\mathbf{I} \\ \mathbf{0} & \mathbf{p}^T \end{pmatrix}}_{\mathbf{Q}(\mathbf{p})} \mathbf{l} = \mathbf{0} \quad (3)$$

where $\mathbf{p} \in \mathbb{R}^3$ (non–homogeneous representation), $[\mathbf{a}]_\times$ is the matrix that linearizes the exterior product, $[\mathbf{a}]_\times \mathbf{b} = \mathbf{a} \times \mathbf{b}$, and $\mathbf{I}$ is the $3 \times 3$ identity matrix.
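As an illustration, the following minimal NumPy sketch (the helper names plucker_from_points and incidence_matrix are ours, not from the paper) builds the Plücker coordinates of Equation (2) for two non–homogeneous points and verifies both the Klein quadric constraint and the incidence condition of Equation (3):

```python
import numpy as np

def plucker_from_points(x, w):
    """Plücker coordinates of the line through the 3D points x and w
    (Equation (2), with homogeneous scale 1): d = w - x, m = x × w,
    so that <d, m> = 0 (the Klein quadric constraint)."""
    return np.hstack([w - x, np.cross(x, w)])

def incidence_matrix(p):
    """Q(p) of Equation (3): a point p lies on the line l iff Q(p) @ l = 0."""
    px = np.array([[0.0, -p[2], p[1]],
                   [p[2], 0.0, -p[0]],
                   [-p[1], p[0], 0.0]])
    Q = np.zeros((4, 6))
    Q[:3, :3] = px           # [p]x
    Q[:3, 3:] = -np.eye(3)   # -I
    Q[3, 3:] = p             # p^T
    return Q

x, w = np.array([1.0, 0.0, 2.0]), np.array([3.0, 1.0, 5.0])
l = plucker_from_points(x, w)
assert abs(l[:3] @ l[3:]) < 1e-12               # Klein quadric
assert np.allclose(incidence_matrix(x) @ l, 0)  # x is incident on the line
assert np.allclose(incidence_matrix(w) @ l, 0)  # and so is w
```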
2.3 Interpolation

Suppose that we want to estimate an unknown function $f : \mathbb{R}^D \mapsto \mathbb{R}$ from a set of scattered data points $\mathcal{X} \doteq \{\mathbf{x}_i\} \subset \mathbb{R}^D$ (with $D$ a natural number) and $\mathcal{Y} \doteq \{y_i\}$, where the set $\{\mathbf{x}_i, y_i\}$ forms a training data set, $\{y_i = f(\mathbf{x}_i)\}$.

Interpolation requires the computation of an interpolating function $s : \mathbb{R}^D \mapsto \mathbb{R}$ that satisfies

$$s(\mathbf{x}_i) = f(\mathbf{x}_i), \quad \forall i. \quad (4)$$

Radial basis functions (RBF) [30], [31], [32], [33] can be used to solve this problem. For a set of $P$ training points $\{\mathbf{x}_1, \dots, \mathbf{x}_P\}$, the RBF interpolant has the form

$$s(\mathbf{x}) = a_0 + \mathbf{a}_x^T \mathbf{x} + \sum_{i=1}^{P} w_i\, \phi(\|\mathbf{x} - \mathbf{x}_i\|) \quad (5)$$

where $\|.\|$ is the vector 2–norm [34], $\phi : \mathbb{R}^+ \mapsto \mathbb{R}$ is the radial basis function and $\mathbf{a}_x \in \mathbb{R}^D$. Here $a_0$, $\mathbf{a}_x$ and $w_i$ are the interpolant unknowns.

There are two types of kernel functions that can be used as the RBF. One type does not have shape parameters, like thin–plate splines, $\phi(r) = r^2 \log(r)$, or $\phi(r) = r^2$. The other type does have shape parameters, such as Gaussian functions, $\phi(r) = \exp(-\gamma^2 r^2)$, and multi–quadrics, $\phi(r) = (\gamma^2 + r^2)^{1/2}$, where $\gamma$ is the shape parameter.
The interpolation is obtained by estimating the unknown parameters of the interpolant ($\mathbf{a} = (a_0, \mathbf{a}_x)$ and $\mathbf{w} = (w_1, \dots, w_P)$ of Equation (5)). The interpolating function $s(\mathbf{x})$ has $P + D + 1$ degrees of freedom, and the data sets $\mathcal{X}$ and $\mathcal{Y}$ only yield $P$ equations. For the estimation of the unknowns in Equation (5), additional constraints have to be used. Considering the function $\phi$ conditionally positive definite [31], the following equations are verified,

$$\sum_{i=1}^{P} w_i = 0 \quad \text{and} \quad \sum_{i=1}^{P} w_i x_1^{(i)} = \dots = \sum_{i=1}^{P} w_i x_D^{(i)} = 0 \quad (6)$$

[28], [31], where $x_j^{(i)}$ is the $j$th element of the $i$th observation. The use of these constraints allows the estimation of all unknowns.

The estimation of the unknown parameters can be obtained using the following relation

$$\underbrace{\begin{pmatrix} \mathbf{\Phi} & \mathbf{P}^T \\ \mathbf{P} & \mathbf{0} \end{pmatrix}}_{\mathbf{\Gamma}} \begin{pmatrix} \mathbf{w} \\ \mathbf{a} \end{pmatrix} = \begin{pmatrix} \mathbf{y} \\ \mathbf{0} \end{pmatrix} \quad (7)$$

where $\phi_{i,j} = \phi(\|\mathbf{x}_i - \mathbf{x}_j\|)$. Furthermore, considering $D = 2$, we can see that $\mathcal{C}(\mathbf{\Gamma}) = P + 3$ implies $P \geq 3$.
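The following sketch solves the system of Equation (7) for a scalar interpolation problem in $\mathbb{R}^2$, assuming the multi–quadric kernel; the names fit_rbf and eval_rbf are ours:

```python
import numpy as np

def fit_rbf(X, y, gamma=1.0):
    """Solve Equation (7) for the interpolant unknowns (w, a), using the
    multi-quadric kernel phi(r) = sqrt(gamma^2 + r^2) and D = 2."""
    n = X.shape[0]
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    Phi = np.sqrt(gamma**2 + r**2)            # phi_ij = phi(||x_i - x_j||)
    P = np.vstack([np.ones(n), X.T])          # rows: 1, x1, x2  (3 x n)
    Gamma = np.block([[Phi, P.T], [P, np.zeros((3, 3))]])
    sol = np.linalg.solve(Gamma, np.concatenate([y, np.zeros(3)]))
    return sol[:n], sol[n:]                   # w and a = (a0, ax)

def eval_rbf(X, w, a, Xq, gamma=1.0):
    """Evaluate the interpolant of Equation (5) at the query points Xq."""
    r = np.linalg.norm(Xq[:, None, :] - X[None, :, :], axis=2)
    return a[0] + Xq @ a[1:] + np.sqrt(gamma**2 + r**2) @ w

rng = np.random.default_rng(1)
X = rng.random((20, 2))
y = np.sin(X[:, 0]) + X[:, 1]**2
w, a = fit_rbf(X, y)
assert np.allclose(eval_rbf(X, w, a, X), y)   # interpolation condition (4)
```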
3 SMOOTH CAMERA MODEL

In the definition of the General Camera Model (Figure 1), introduced by Grossberg and Nayar [24], each pixel in the image $\mathbf{x} \in \mathcal{P}^2$ is mapped to a ray in 3D space, $\mathbf{l} \in \mathbb{L}^3$. As mentioned before, the model is based on an array of parameters called raxels and it can be used to represent central or non–central camera models, as well as smooth and non–smooth camera models.

In that model, a complete General Camera Model is represented by a non–parametric discrete array of raxels that contains all possible pixels in an image. This means that the pixel–line mappings are required for all pixels, independently of the image resolution or of the smoothness of the variation of the parameters (corresponding to the 3D lines associated with neighboring pixels). If we consider only the geometric entities in Grossberg and Nayar's model, each raxel contains at least seven parameters. Thus, for each pixel, there are seven unknown parameters to be computed: for an image of size $N \times M$, there are $7NM$ unknown parameters. We aim at deriving a model that requires a smaller number of parameters.

Since most imaging devices are characterized by a smooth relationship between image pixels and lines in the world, we use the assumption that the pixel–line mappings can be represented by a smoothly varying vector–valued function $f : \mathcal{P}^2 \mapsto \mathbb{L}^3$ that maps a point in the image plane to a line in 3D space. We use this approach to represent the Smooth Camera Model. This assumption significantly decreases the number of model unknowns and also allows filtering out some of the error due to noise.
In most cases, a general direct projection model does not exist, since one 3D line can be mapped into more than one point in the image plane. As a result, a General Camera Model can only be defined when considering the mapping from image coordinates to 3D lines. A recent result, however, derived an analytical model for the forward projection of a non–central system made up of a perspective camera and a rotationally symmetric conic mirror [10].

As previously described, a single mapping between points in the image and points in the world is insufficient to define a raxel and, as a consequence, it cannot define a General Camera Model: the information on the direction of the line cannot be recovered. However, if we assume that the relationship between image points and world lines varies smoothly, we are imposing new constraints on the lines. Those constraints implicitly allow the complete calibration of the Smooth Camera Model using only a single point in the world for each point in the image, as proved in the following sections. An example can be seen in Figure 2.
4 THE PROPOSED METHOD

In this section we describe the formulation of the proposed approach to represent the Smooth Camera Model (Section 4.1) and the calibration procedure (Section 4.2). In Section 4.3, we describe the data set normalization.
4.1 Vector–Valued Function

A 3D line representation has to be chosen for the output of the vector–valued function. Lines in 3D have four degrees of freedom; however, none of the compact four–variable representations for 3D lines is complete.

Plücker coordinates (Section 2.2) are a complete, elegant and easy to understand line representation. On the other hand, they use six elements to represent four degrees of freedom, are defined up to a scale factor and have an orthogonality constraint associated with their elements.

Instead of using Plücker coordinates to represent 3D lines, we use a vector made up by stacking the direction and moment vectors, $\hat{\mathbf{l}} \sim (\hat{\mathbf{d}}, \hat{\mathbf{m}})$. In other words, two independent vectors are estimated, up to the same scale factor.

Good estimates of $\hat{\mathbf{d}}$ and $\hat{\mathbf{m}}$ yield small deviations from the orthogonality constraint. Moreover, it is possible to find orthogonal vectors $\mathbf{d}$ and $\mathbf{m}$ from their estimates using Schmidt orthogonalization [34], by finding the closest rotation matrix to $(\hat{\mathbf{d}}' \;\; \hat{\mathbf{m}}' \;\; \hat{\mathbf{d}}' \times \hat{\mathbf{m}}')$, where $\mathbf{x}' = \mathbf{x} / \|\mathbf{x}\|$, or by using the algorithm proposed by Bartoli and Sturm [35].

There are several ways to estimate a non–injective function from a set of scattered data. We use the RBF interpolant described in Section 2.3,

$$s(\mathbf{x}) = a_0 + \mathbf{a}_x^T \mathbf{x} + \sum_{i=1}^{P} w_i\, \phi(\|\mathbf{x} - \mathbf{c}_i\|) \quad (8)$$
where $\{\mathbf{c}_i\}, \mathbf{x} \in \mathbb{R}^2$. We can rewrite the previous equation in matrix form as

$$s(\mathbf{x}) = \begin{pmatrix} \boldsymbol{\phi}(\mathbf{x}) & \mathbf{p}(\mathbf{x}) \end{pmatrix} \underbrace{\begin{pmatrix} \mathbf{w} \\ \mathbf{a} \end{pmatrix}}_{\mathbf{h}_{wa}} \quad (9)$$

where $\boldsymbol{\phi}(\mathbf{x})$ and $\mathbf{p}(\mathbf{x})$ are row vectors, with $\phi_i(\mathbf{x}) = \phi(\|\mathbf{x} - \mathbf{c}_i\|)$ and $\mathbf{p}(\mathbf{x}) = (1 \;\; x_1 \;\; x_2)$.

The vector–valued function output is $\hat{\mathbf{l}} \in \mathbb{R}^6$, where each $\hat{l}_i$ is independent. Thus, we can use six independent RBF interpolants to form the vector–valued function $\mathbf{s}(\mathbf{x}) = (s_1(\mathbf{x}) \;\; s_2(\mathbf{x}) \;\; \dots \;\; s_6(\mathbf{x}))$, which can be rewritten as

$$\mathbf{s}(\mathbf{x}) = \begin{pmatrix} \boldsymbol{\phi}(\mathbf{x}) & \mathbf{p}(\mathbf{x}) \end{pmatrix} \underbrace{\begin{pmatrix} \mathbf{h}_{wa}^{(1)} & \mathbf{h}_{wa}^{(2)} & \dots & \mathbf{h}_{wa}^{(6)} \end{pmatrix}}_{\mathbf{H}_{wa}}. \quad (10)$$

The vector–valued function is a row vector, $\mathbf{s}(\mathbf{x}) \in \mathbb{R}^{1 \times 6}$, which implies $\hat{\mathbf{l}}(\mathbf{x}) \sim \mathbf{s}^T(\mathbf{x})$.

For a set $\{\mathbf{c}_i\}$, a matrix $\mathbf{H}_{wa}$ and a certain RBF, the estimates of the direction and moment vectors, as a function of the coordinates of the image points, are given by

$$\hat{\mathbf{l}}(\mathbf{x})^T = \begin{pmatrix} \hat{\mathbf{d}}(\mathbf{x})^T & \hat{\mathbf{m}}(\mathbf{x})^T \end{pmatrix} \sim \begin{pmatrix} \boldsymbol{\phi}(\mathbf{x}) & \mathbf{p}(\mathbf{x}) \end{pmatrix} \mathbf{H}_{wa}. \quad (11)$$

From the previous equation, it can be observed that, for two different imaging systems using the same set of points $\{\mathbf{c}_i\}$, the estimate of $\hat{\mathbf{l}}$ for an image point $\mathbf{x}$ only depends on the matrix $\mathbf{H}_{wa}$. Thus, we call $\mathbf{H}_{wa}$ the camera matrix. On the other hand, for the same imaging system, the values of the parameters of the camera matrix depend on the set $\{\mathbf{c}_i\}$, which is why we call its elements control points.

Usually, in statistics, the set $\{\mathbf{c}_i\}$ is called the centers. In computer vision, however, the word center typically designates the center of projection of an imaging system. As a result, we chose to call the set $\{\mathbf{c}_i\}$ control points.

For a set of $P$ control points $\{\mathbf{c}_i\}$ defined a priori and a camera matrix $\mathbf{H}_{wa} \in \mathbb{R}^{(P+3) \times 6}$, we define the general Smooth Camera Model by the vector–valued function $\mathbf{s} : \mathbb{R}^2 \mapsto \mathbb{R}^6$.

The Smooth Camera Model only depends on the unknown matrix $\mathbf{H}_{wa}$, for a set of previously defined control points. Therefore, the complete calibration of the Smooth Camera Model can be obtained by estimating the $6P + 18$ unknown parameters that make up the camera matrix.
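The following sketch (hypothetical names mq, r_vec and line_estimate) evaluates Equations (9)-(11) for a given camera matrix and set of control points:

```python
import numpy as np

def mq(r, gamma=1.0):
    """Multi-quadric kernel phi(r) = (gamma^2 + r^2)^(1/2)."""
    return np.sqrt(gamma**2 + r**2)

def r_vec(x, C, gamma=1.0):
    """Row vector r(x) = (phi(x) p(x)) used in Equations (9)-(14);
    C holds the P control points, one per row."""
    phi = mq(np.linalg.norm(np.asarray(x) - C, axis=1), gamma)
    return np.concatenate([phi, [1.0, x[0], x[1]]])   # length P + 3

def line_estimate(x, C, Hwa, gamma=1.0):
    """Equation (11): up-to-scale estimates (d_hat, m_hat) for image point x."""
    l_hat = r_vec(x, C, gamma) @ Hwa                  # Hwa: (P + 3, 6)
    return l_hat[:3], l_hat[3:]
```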
4.2 Linear Point–based Calibration

One disadvantage of the use of the General Camera Model is the difficulty of its calibration. Grossberg and Nayar [24] and Sturm and Ramalingam [26] define two different calibration methods. However, in both cases, the complete calibration is achieved only when there is a raxel for every pixel of the image. Note that the estimation of a single raxel requires at least two points in the world for each image pixel. Since the estimation of the raxels for every pixel is a difficult task, interpolation methods were used to estimate all the raxels from a subset of calibrated raxels (as described in Section 1.1). The goal in the calibration of a Smooth Camera Model is to calibrate the model without estimating each and every raxel while, at the same time, allowing the same resolution.

In the method described in this paper, the calibration procedure only requires a set of point correspondences $\{\mathbf{x}_i \mapsto \mathbf{p}_i\}$ (where $\{\mathbf{p}_i\}$ is the set of world points and $\{\mathbf{x}_i\}$ are their corresponding points in the image plane). Point correspondences that satisfy

$$\mathbf{x}_i \neq \mathbf{x}_j, \quad \forall i \neq j, \quad (12)$$

can be used. For each image pixel, only the coordinates of one 3D point are necessary.

There are two sets of unknowns in the calibration procedure: the set of control points $\{\mathbf{c}_i\}$ and the elements of the camera matrix $\mathbf{H}_{wa}$. Control points can be defined a priori, by selecting a set of $P$ scattered image points, which may or may not be a subset of the data points ($\{\mathbf{c}_i\} \subset \{\mathbf{x}_i\}$).

In the rest of this section, we describe a linear method to estimate the $6P + 18$ parameters of the camera matrix, for a set of control points defined a priori.
4.2.1 Calibration Matrix M

World points incident on lines must verify Equation (3). From $\hat{\mathbf{l}}(\mathbf{x}) \sim \mathbf{s}^T(\mathbf{x})$ and Equation (3),

$$\hat{\mathbf{l}}(\mathbf{x})^T \mathbf{Q}(\mathbf{p})^T = \mathbf{s}(\mathbf{x})\,\mathbf{Q}(\mathbf{p})^T = \mathbf{0}. \quad (13)$$

Replacing $\mathbf{s}(\mathbf{x})$ in the previous equation using Equation (10),

$$\underbrace{\begin{pmatrix} \boldsymbol{\phi}(\mathbf{x}) & \mathbf{p}(\mathbf{x}) \end{pmatrix}}_{\mathbf{r}(\mathbf{x})} \mathbf{H}_{wa}\,\mathbf{Q}(\mathbf{p})^T = \mathbf{0} \quad (14)$$

where $\mathbf{r}(\mathbf{x}) \in \mathbb{R}^{1 \times (P+3)}$.

The unknowns are the elements of the matrix $\mathbf{H}_{wa}$. Using the Kronecker product (Section 2.1), Equation (14) can be rewritten so as to isolate the unknown camera matrix,

$$\left( \mathbf{Q}(\mathbf{p}) \otimes \mathbf{r}(\mathbf{x}) \right) \text{vec}(\mathbf{H}_{wa}) = \mathbf{0} \quad (15)$$

where $\mathbf{Q}(\mathbf{p}) \otimes \mathbf{r}(\mathbf{x}) \in \mathbb{R}^{4 \times (6P+18)}$ and $\text{vec}(\mathbf{H}_{wa}) \in \mathbb{R}^{6P+18}$ is the stacking of the columns $\mathbf{h}_{wa}^{(i)}$, for $i = 1, \dots, 6$.

For a set of $N$ point correspondences $\{\mathbf{x}_i \mapsto \mathbf{p}_i\}$ of the same imaging system, $\left( \mathbf{Q}(\mathbf{p}_i) \otimes \mathbf{r}(\mathbf{x}_i) \right) \text{vec}(\mathbf{H}_{wa}) = \mathbf{0}$, and the calibration matrix can be built as

$$\mathbf{M} = \begin{pmatrix} \mathbf{Q}(\mathbf{p}_1) \otimes \mathbf{r}(\mathbf{x}_1) \\ \vdots \\ \mathbf{Q}(\mathbf{p}_N) \otimes \mathbf{r}(\mathbf{x}_N) \\ \mathbf{D} \end{pmatrix} \quad (16)$$

where $\mathbf{M} \in \mathbb{R}^{(4N+18) \times (6P+18)}$ and the matrix $\mathbf{D} \in \mathbb{R}^{18 \times (6P+18)}$ is defined as in [36].

The calibration procedure is thus reduced to the estimation of the unknown camera matrix $\mathbf{H}_{wa}$ such that

$$\mathbf{M}\,\text{vec}(\mathbf{H}_{wa}) = \mathbf{0}, \quad \text{which means} \quad \text{vec}(\mathbf{H}_{wa}) \in \text{null}(\mathbf{M}) \quad (17)$$

where $\text{null}(.)$ denotes the matrix null–space.
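A sketch of Equations (15)-(16), reusing incidence_matrix and r_vec from the earlier sketches; for brevity it omits the 18 constraint rows D of [36]:

```python
import numpy as np

def calibration_matrix(X, Pts, C, gamma=1.0):
    """Stack the 4-row blocks Q(p_i) (x) r(x_i) of Equations (15)-(16) for a
    set of correspondences x_i -> p_i. Without the constraint rows D of [36],
    the result has shape (4N, 6P + 18)."""
    rows = [np.kron(incidence_matrix(p), r_vec(x, C, gamma)[None, :])
            for x, p in zip(X, Pts)]
    return np.vstack(rows)

# vec(Hwa) is then a null-space element of M (Equation (17)), e.g. the right
# singular vector associated with the smallest singular value:
#   _, _, Vt = np.linalg.svd(M)
#   Hwa = Vt[-1].reshape(6, -1).T     # un-stack the columns of Hwa
```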
4.2.2 Computation of the Camera Matrix

From Equation (17), the estimate of the camera matrix must belong to the null–space of the calibration matrix $\mathbf{M}$. To ensure a unique solution, and since $\text{vec}(\mathbf{H}_{wa}) \in \mathbb{R}^{6P+18}$, the dimension of the column space of $\mathbf{M}$ must be $\mathcal{C}(\mathbf{M}) = 6P + 17$, which means that the dimension of the null–space of $\mathbf{M}$ will be $\mathcal{N}(\mathbf{M}) = 1$.

From Equation (11), any solution for the camera matrix is defined up to a scale factor. Thus, assuming that $\mathcal{N}(\mathbf{M}) = 1$, any element of the one–dimensional null–space of $\mathbf{M}$ is a solution, except for the trivial solution $\text{vec}(\mathbf{H}_{wa}) = \mathbf{0}$.

To prove that the dimension of the column space of $\mathbf{M}$ is $\mathcal{C}(\mathbf{M}) = 6P + 17$, we decompose Equation (16) into rows, by means of the Kronecker product. Since the permutation of the rows of any matrix does not change the dimension of its column space, defining $N = 2P$, it is possible to rearrange the matrix $\mathbf{M}$ as a block of matrices similar to $\mathbf{\Gamma}$ (Equation (7)). It can be proved that if a specific set of constraints is met, then the matrices $\mathbf{\Gamma}$ are full rank, $\mathcal{C}(\mathbf{D}) = 17$ and $\mathcal{C}(\mathbf{M}) = 6P + 17$ (see the next section and [36]).

From this result, and since $\mathcal{C}(\mathbf{\Gamma}) = P + 3$ implies $P \geq 3$, we can conclude that the minimal configuration that ensures a unique solution for the computation of the camera matrix corresponds to $N = 6$. An example of a calibration corresponding to the minimal case is shown in Figure 2.
4.2.3 Relationship between Control Points, Point Correspondences and the Radial Basis Function Used in the Calibration

In this section, we describe the constraints that must be met by $\{\mathbf{x}_i\}$ and $\{\mathbf{c}_i\}$ to obtain $\mathcal{C}(\mathbf{M}) = 6P + 17$. Two types of radial basis functions that can be applied to attain that rank are also discussed.

Let us assume that a set of point correspondences $\{\mathbf{x}_i \mapsto \mathbf{p}_i\}$, for $i = 1, \dots, 2P$ and $\mathbf{x}_i \neq \mathbf{x}_j, \forall i \neq j$, is available, and that it can be split into two sub–sets such that

$$\mathcal{K}^{(1)} \doteq \{\mathbf{x}_i\} \;\; \text{for} \;\; i = 1, \dots, P \quad \text{and} \quad \mathcal{K}^{(2)} \doteq \{\mathbf{x}_i\} \;\; \text{for} \;\; i = P+1, \dots, 2P. \quad (18)$$

In [32] and [36] it is shown that only control points ($\{\mathbf{c}_i\}$, for $i = 1, \dots, P$) and data points ($\{\mathbf{x}_i\}$, for $i = 1, \dots, 2P$) that meet the condition

$$d < \epsilon q \quad (19)$$

can be considered. In Equation (19), $d = \min\{d_1, d_2\}$, $0 < \epsilon \leq 1$, $d_1 = \max\{\|\mathbf{x}_i - \mathbf{c}_i\|\}$ with $\{\mathbf{x}_i\}$ in $\mathcal{K}^{(1)}$, $d_2 = \max\{\|\mathbf{x}_i - \mathbf{c}_i\|\}$ with $\{\mathbf{x}_i\}$ in $\mathcal{K}^{(2)}$, and $2q = \min_{j \neq i}\{\|\mathbf{c}_i - \mathbf{c}_j\|\}$.

Note that (from [36]) if we consider $\{\mathbf{c}_i\} \doteq \mathcal{K}^{(1)}$ and if $d_2 < \epsilon q$, then we also obtain $\mathcal{C}(\mathbf{M}) = 6P + 17$. This solution also meets the constraint $\mathbf{x}_i \neq \mathbf{x}_j, \forall i \neq j$.

Quak et al. [32] proved that $\phi_1(r) = (\gamma_1^2 + r^2)^{1/2}$ and $\phi_2(r) = \exp(-\gamma_2 r^2)$ are good choices for the radial basis functions since, by choosing appropriate $\gamma_1$ and $\gamma_2$, the negative effects of small values of $q$ and $\epsilon$, respectively, in Equation (19) can be reduced.
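The condition of Equation (19) is easy to verify numerically. The sketch below assumes, since the pairing is not made explicit here, that the i-th data point of each subset is paired with the i-th control point:

```python
import numpy as np

def separation_condition(K1, K2, C, eps=1.0):
    """Check d < eps * q (Equation (19)). K1, K2: the two halves of the data
    points; C: the control points (one point per row). Assumes the i-th point
    of each half is paired with the i-th control point."""
    d1 = np.max(np.linalg.norm(K1 - C, axis=1))
    d2 = np.max(np.linalg.norm(K2 - C, axis=1))
    # 2q is the smallest distance between two distinct control points
    pair = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=2)
    q = 0.5 * np.min(pair[np.triu_indices(len(C), k=1)])
    return min(d1, d2) < eps * q
```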
4.3 Data Normalization

One of the issues in the method proposed in [28] is related to the shape parameters of the RBF. The choice of these values depends on the distribution of the set of image points $\{\mathbf{x}_i\}$. To avoid this problem, a data normalization procedure for both image and world coordinates was developed. We consider a non–isotropic normalization [37] for both image and world points, such that
• the centroid is at the origin;
• the principal moments are equal to unity.

While the normalization of the image space coordinates is easy to implement, the normalization of the coordinates of the set of world points $\{\mathbf{p}_i\}$ has additional implications: it implies a change in the parameters of the Smooth Camera Model. To account for this change, the inverse of this transformation has to be applied to the estimated parameters of the Smooth Camera Model.
4.3.1 Scaling the Coordinates of the Set of Image Points

For the normalization of the image space, an affine transformation

$$\mathbf{u}_i = a^{-1}(\mathbf{A}\mathbf{x}_i + \mathbf{a}) \quad (20)$$

is considered, where $\mathbf{A} \in \mathbb{R}^{2 \times 2}$ is upper triangular, $\mathbf{a} \in \mathbb{R}^2$ and $a$ is a scalar.

To obtain the normalization parameters $\{\mathbf{A}, \mathbf{a}, a\}$, an affine transformation based on the Cholesky factorization is used [34], [37]. Let us denote the homogeneous representation of the image coordinates $\mathbf{x}$ by $\bar{\mathbf{x}}$. Since the matrix $\sum_{i=1}^{N} \bar{\mathbf{x}}_i \bar{\mathbf{x}}_i^T$ is symmetric and positive definite, using the Cholesky factorization one can define $\sum_{i=1}^{N} \bar{\mathbf{x}}_i \bar{\mathbf{x}}_i^T = N \mathbf{K}_1 \mathbf{K}_1^T$, where $\mathbf{K}_1 \in \mathbb{R}^{3 \times 3}$. Developing this equation, we get $\sum_{i=1}^{N} \mathbf{K}_1^{-1} \bar{\mathbf{x}}_i \bar{\mathbf{x}}_i^T \mathbf{K}_1^{-T} = N\mathbf{I}$. As a result, and using $\bar{\mathbf{u}}_i = \mathbf{K}_1^{-1} \bar{\mathbf{x}}_i$, one obtains $\sum_{i=1}^{N} \bar{\mathbf{u}}_i \bar{\mathbf{u}}_i^T = N\mathbf{I}$, which means that the set $\{\mathbf{u}_i\}$ has its centroid at the origin of the coordinate system and the two principal moments of the set of points are equal to one.

Since $\mathbf{K}_1$ is upper triangular, $\mathbf{K}_1^{-1}$ is also upper triangular, and we can define

$$\mathbf{K}_1^{-1} = \begin{pmatrix} \mathbf{A} & \mathbf{a} \\ \mathbf{0} & a \end{pmatrix}, \quad (21)$$

which includes the parameters of the affine transformation of Equation (20).
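A sketch of this normalization, written for points of arbitrary dimension so that it also covers the world–point case of the next section; the exchange–matrix trick is one way to obtain the upper–triangular factor required by the text, given that NumPy's Cholesky routine returns a lower–triangular factor:

```python
import numpy as np

def normalize_points(X):
    """Non-isotropic normalization of Sections 4.3.1-4.3.2: returns an
    upper-triangular K with sum_i x_i x_i^T = N K K^T (homogeneous
    coordinates) and the normalized, dehomogenized points."""
    N, D = X.shape
    Xh = np.hstack([X, np.ones((N, 1))])         # rows are homogeneous x_i^T
    S = Xh.T @ Xh / N                            # symmetric positive definite
    # np.linalg.cholesky yields a *lower* factor; conjugating with the
    # exchange matrix J gives the upper-triangular factor the text requires
    J = np.eye(D + 1)[::-1]
    K = J @ np.linalg.cholesky(J @ S @ J) @ J    # S = K @ K.T, K upper triangular
    Uh = Xh @ np.linalg.inv(K).T                 # u_i = K^{-1} x_i
    assert np.allclose(Uh.T @ Uh / N, np.eye(D + 1), atol=1e-8)
    return K, Uh[:, :D] / Uh[:, D:]              # dehomogenize: Equation (20)
```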
4.3.2 Scaling the Coordinates of the Set of World Points

To normalize the coordinates of the points in world space, a method similar to that of the previous section is applied. The coordinates of the set of world points $\{\mathbf{p}_i\}$ are changed into $\{\mathbf{q}_i\}$ such that

$$\mathbf{q}_i = b^{-1}(\mathbf{B}\mathbf{p}_i + \mathbf{b}) \quad \text{and} \quad \mathbf{K}_2^{-1} = \begin{pmatrix} \mathbf{B} & \mathbf{b} \\ \mathbf{0} & b \end{pmatrix} \quad (22)$$

where $\mathbf{K}_2^{-1} \in \mathbb{R}^{4 \times 4}$ is an upper triangular matrix (for points in 3D space), $\mathbf{B} \in \mathbb{R}^{3 \times 3}$ is upper triangular and $\mathbf{b} \in \mathbb{R}^3$.

Note that this normalization has a consequence: it changes the coordinates of the line space. The output of the function $\mathbf{s}$ are the coordinates of 3D lines, so one has to account for this change of coordinates, and the line coordinates output by $\mathbf{s}$ have to be mapped back into the original coordinate frame.

Using the affine transformation defined in Equation (22) and the definition of Plücker coordinates, we define an affine transformation in line space as

$$\mathbf{l}^{(2)} \sim \underbrace{\begin{pmatrix} \mathbf{B} & \mathbf{0} \\ -b^{-1}[\mathbf{b}]_\times \mathbf{B} & b^{-1}\det(\mathbf{B})\,\mathbf{B}^{-T} \end{pmatrix}}_{\mathbf{E}} \mathbf{l}^{(1)} \quad (23)$$

(for the derivation of the matrix $\mathbf{E}$, see the author web page). Note that $\mathbf{B}$ is non–singular, which means that the matrix $\mathbf{E}$ is invertible and $\mathbf{l}^{(1)}$ can be estimated as $\mathbf{l}^{(1)} \sim \mathbf{E}^{-1} \mathbf{l}^{(2)}$.
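A sketch that assembles $\mathbf{E}$ from $\mathbf{K}_2^{-1}$, following Equation (23) as stated (the helper name line_transform is ours):

```python
import numpy as np

def line_transform(K2inv):
    """Matrix E of Equation (23), built from the world normalization
    K2^{-1} = [[B, b_vec], [0, b]]. Lines then map as l2 ~ E @ l1,
    and back as l1 ~ inv(E) @ l2."""
    B, b_vec, b = K2inv[:3, :3], K2inv[:3, 3], K2inv[3, 3]
    bx = np.array([[0.0, -b_vec[2], b_vec[1]],
                   [b_vec[2], 0.0, -b_vec[0]],
                   [-b_vec[1], b_vec[0], 0.0]])
    E = np.zeros((6, 6))
    E[:3, :3] = B
    E[3:, :3] = -bx @ B / b
    E[3:, 3:] = np.linalg.det(B) * np.linalg.inv(B).T / b
    return E
```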
4.3.3 Camera Model and Calibration with Normalization

In this section we describe the algorithm for the calibration of the Smooth Camera Model using normalized data.

Step 1: Compute the matrices $\mathbf{K}_1$ and $\mathbf{K}_2$ using the methods described in Sections 4.3.1 and 4.3.2, for the sets of points $\{\mathbf{x}_i\}$ and $\{\mathbf{p}_i\}$ respectively. Compute the affine transformation parameters $\{\mathbf{A}, \mathbf{a}, a\}$ and $\{\mathbf{B}, \mathbf{b}, b\}$ from Equations (21) and (22) respectively. Compute the normalized coordinates $\{\mathbf{u}_i\}$ and $\{\mathbf{q}_i\}$ using Equations (20) and (22) respectively. Using the affine transformation, build the matrix $\mathbf{E}$ of Equation (23).

Step 2: Compute the calibration matrix $\mathbf{M}$ using Equation (16). Note that $\{\mathbf{u}_i\}$ and $\{\mathbf{q}_i\}$ should be used instead of $\{\mathbf{x}_i\}$ and $\{\mathbf{p}_i\}$, respectively.

Step 3: Get the $\text{vec}(\mathbf{H}_{wa})$ that minimizes $\|\mathbf{M}\,\text{vec}(\mathbf{H}_{wa})\|$, using the Singular Value Decomposition [3], and un-stack the vector $\text{vec}(\mathbf{H}_{wa})$ into the matrix $\mathbf{H}_{wa}$.

Step 4: The camera model with normalization is defined by a function $\mathbf{s} : \mathbb{R}^2 \mapsto \mathbb{R}^6$, where $\hat{\mathbf{l}} \sim \mathbf{s}(\mathbf{x})^T$, such that

$$\mathbf{s}(\mathbf{x}) = \mathbf{r}\!\left( a^{-1}(\mathbf{A}\mathbf{x} + \mathbf{a}) \right) \mathbf{H}_{wa}\,\mathbf{E}^{-T} \quad (24)$$

where $\mathbf{r} : \mathbb{R}^2 \mapsto \mathbb{R}^{(P+3)}$ is defined in Equation (14). Note that the coordinate system in which the line coordinates $\hat{\mathbf{l}}(\mathbf{x})$ are represented is the same coordinate system in which the original set of world points $\{\mathbf{p}_i\}$ is represented.
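Putting the four steps together, a compact sketch of the normalized calibration, reusing the hypothetical helpers normalize_points, line_transform, calibration_matrix and r_vec from the previous sections:

```python
import numpy as np

def calibrate(X, Pts, C, gamma=1.0):
    """Steps 1-4 of Section 4.3.3. X: (N, 2) image points, Pts: (N, 3) world
    points, C: (P, 2) control points given in normalized image coordinates."""
    # Step 1: normalization parameters and the line-space transform E
    K1, U = normalize_points(X)
    K2, Q = normalize_points(Pts)
    E = line_transform(np.linalg.inv(K2))
    # Step 2: calibration matrix from the normalized correspondences
    M = calibration_matrix(U, Q, C, gamma)
    # Step 3: least-squares null-space of M via the SVD [3]
    _, _, Vt = np.linalg.svd(M)
    Hwa = Vt[-1].reshape(6, -1).T               # un-stack vec(Hwa), column-wise
    # Step 4: the normalized model of Equation (24), up to scale
    K1inv, Einv = np.linalg.inv(K1), np.linalg.inv(E)
    def model(x):
        u = K1inv @ np.array([x[0], x[1], 1.0])
        return r_vec(u[:2] / u[2], C, gamma) @ Hwa @ Einv.T
    return model
```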
5 EXPERIMENTS

5.1 Results with Synthetic Data Sets

To evaluate the method described, synthetic data sets were generated. Using known camera models, 3D lines are generated and mapped into image points. The data set made up of pairings between 3D points and image points is obtained from the first data set (containing the coordinates of 3D lines) by computing 3D points incident on the lines. The camera models used in the evaluation were: a catadioptric system with a quadric mirror, Figure 3(a); a crossed–slits camera model, Figure 3(d); and a refraction camera model, made up of a camera looking through a volume of water contained between two parallel planes, Figure 3(g).

Since the mappings between points in the image and 3D lines are known, the calibration results can be evaluated by estimating the distance errors (distances between lines) in line space. Note that the camera model is represented by a mapping between points in the image and 3D lines.
5.1.1 Evaluation Results Using Smooth Camera Models

Synthetic data sets were used to evaluate the effect of the number of control points on the error, defined as a distance between ground truth 3D lines and estimated 3D lines. Multi–quadric and exponential (Gaussian) radial basis functions were used.

The synthetic data sets were obtained for the following smooth camera models $f : \mathcal{P}^2 \mapsto \mathbb{L}^3$: non–central catadioptric systems using quadric mirrors [9], the crossed–slits camera model [16] and a refraction–based camera model [19], [20].

The calibration method estimates an interpolant function $\mathbf{s} : \mathbb{R}^2 \mapsto \mathbb{R}^6$ that should fit an imaging model defined by a function $f : \mathcal{P}^2 \mapsto \mathbb{L}^3$, where $f$ is an analytic representation of the corresponding imaging model.

$f$ is used to generate a data set $\{\mathbf{x}_i \mapsto \mathbf{p}_i\}$, for $i = 1, \dots, N$, and $P$ control points are selected from the set $\{\mathbf{x}_i\}$, with the conditions described in Section 4.2.3 being met. Using $f$, a ground truth data set $\{\mathbf{y}_i \mapsto \mathbf{l}_i\}$ is also generated, where $\mathbf{y}_i$ are the ground truth image coordinates and $\mathbf{l}_i$ are the ground truth 3D line coordinates.

The sets $\{\mathbf{x}_i \mapsto \mathbf{p}_i\}$ and $\{\mathbf{c}_i\}$ are used to calibrate the camera model. Using the interpolant function $\mathbf{s}$, the set $\{\mathbf{y}_i \mapsto \hat{\mathbf{l}}_i\}$ is estimated, where $\hat{\mathbf{l}}_i = \mathbf{s}(\mathbf{y}_i)$.
[Figure 3: nine panels. Panels (b)-(c), (e)-(f) and (h)-(i) plot the line distance error against the number of control points (10 to 100), for Gaussian and multi–quadric RBFs; each legend compares normalized coordinates with the original image size and images 25%, 75% and 150% bigger. Panels (a), (d) and (g) show the calibration points and world lines for each camera model.]
Fig. 3. Evaluation of the average line distance error, Equation (27), as a function of the number of control points, for non–central catadioptric systems with a quadric mirror, for a crossed–slits camera model and for a model including refraction (a camera looking through a volume of water contained between two parallel planes), (a), (d) and (g) respectively. For each camera model, the distance error defined in Equation (27) is evaluated for Gaussian and multi–quadric radial basis functions; the errors are shown in (b)-(c), (e)-(f) and (h)-(i), respectively.
To evaluate the calibration method, the distances between $\hat{\mathbf{l}}_i$ and $\mathbf{l}_i$ are used. To characterize the error, the average of the distances $\epsilon_i = d(\hat{\mathbf{l}}_i, \mathbf{l}_i)$ is computed. Let us consider two lines represented by Plücker coordinates $\mathbf{g}$ and $\mathbf{h}$, which can be approximated via a local mapping into a Euclidean 4–space, $\tilde{\mathbf{g}} = (g_1, g_2, g_3, g_4)$ and $\tilde{\mathbf{h}} = (h_1, h_2, h_3, h_4)$ [29], such that

$$\mathbf{g} = (g_3 - g_1,\; g_4 - g_2,\; 1,\; g_2,\; -g_1,\; g_1 g_4 - g_2 g_3) \quad (25)$$
$$\mathbf{h} = (h_3 - h_1,\; h_4 - h_2,\; 1,\; h_2,\; -h_1,\; h_1 h_4 - h_2 h_3). \quad (26)$$

The distance between the lines is estimated using

$$d\left(\tilde{\mathbf{g}}, \tilde{\mathbf{h}}\right)^2 = \sum_{i=1}^{4} (g_i - h_i)^2 + (g_1 - h_1)(g_3 - h_3) + (g_2 - h_2)(g_4 - h_4). \quad (27)$$
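A sketch of this metric; local_coords inverts the mapping of Equations (25)-(26), assuming the line has been scaled so that its third Plücker coordinate equals one:

```python
import numpy as np

def local_coords(l):
    """Invert Equations (25)-(26): recover (g1, g2, g3, g4) from Plücker
    coordinates, assuming the third coordinate is nonzero."""
    l = l / l[2]                     # enforce third coordinate = 1
    g1, g2 = -l[4], l[3]             # moment part: (g2, -g1, g1 g4 - g2 g3)
    return np.array([g1, g2, l[0] + g1, l[1] + g2])

def line_distance2(lg, lh):
    """Squared distance of Equation (27) between two 3D lines."""
    e = local_coords(lg) - local_coords(lh)
    return e @ e + e[0] * e[2] + e[1] * e[3]
```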
The results are evaluated by varying the number of control points, $P$, from 10 to 100. For each number of control points, the calibration is evaluated using the ground truth data set $\{\mathbf{y}_i \mapsto \mathbf{l}_i\}$ for $i = 1, \dots, 120$. For each set of control points, the calibration is repeated 150 times for
different values of $\{\mathbf{x}_i\}$, $\{\mathbf{y}_i\}$ and $\{\mathbf{p}_i\}$, $\{\mathbf{l}_i\}$, respectively. The average of the distance errors is computed for each number of control points.

Fig. 4. Results of tests performed with the three camera models of Figure 3, with noise added. The number of control points is set to 60 and the variance of the noise added to the coordinates of the 3D points used in the calibration procedure is varied. The standard deviation of the noise is proportional to the smallest distance among all the world points of the calibration data set. (a) shows the distance error in line space for Gaussian RBFs and (b) for multi–quadrics. To reduce the errors, more points than the number needed for the minimal solution can be used: the noise is set to 10% of the smallest distance between the world points and $\beta$ is varied, where $N = \beta P$. The results are shown in (c) and (d) for Gaussian and multi–quadric RBFs, respectively.
The calibration results depend on the value of the shape parameter, in the case of both multi–quadric and Gaussian radial basis functions. By normalizing the data coordinates as described in Section 4.3.3, the value of the shape parameter is no longer dependent on the image size or on the spatial distribution of the image points.
For a fixed value of the shape parameter, the error as a function of the spatial distribution of the set $\{\mathbf{x}_i\}$ was evaluated. For that purpose, and for each set $\{\mathbf{x}_i\}$ and $\{\mathbf{y}_i\}$, the sets $\{\mathbf{x}_i^{(j)}\}$ and $\{\mathbf{y}_i^{(j)}\}$ were computed with

$$\mathbf{x}_i^{(j)} = \mathbf{x}_i\, \sigma_j \quad \forall i \quad (28)$$

for $j = 1, 2, 3$, with $\sigma_1 = 1.25$, $\sigma_2 = 1.75$ and $\sigma_3 = 2.5$.
The results are shown in Figure 3. Note that by using
normalized coordinates the error variation does not depend on the image size. Therefore the distance error can
be represented by a single curve.
5.1.2 Results with Noise Added

The same evaluation procedure was repeated after adding Gaussian noise $\{\boldsymbol{\mu}_i\}$ to the 3D coordinates of the points used in the calibration procedure, $\{\mathbf{p}_i\}$. For that purpose, the standard deviation of the norm of the noise vector was defined as

$$\text{std}(\|\boldsymbol{\mu}_i\|) = \alpha e \quad (29)$$

where $e = \min(\|\mathbf{p}_i - \mathbf{p}_j\|)$ for all $i \neq j$. Instead of the set $\{\mathbf{x}_i \mapsto \mathbf{p}_i\}$, the set $\{\mathbf{x}_i \mapsto \tilde{\mathbf{p}}_i\}$, where $\tilde{\mathbf{p}}_i = \mathbf{p}_i + \boldsymbol{\mu}_i$ for all $i$, was used to calibrate.
The errors are evaluated in line space (Equation (27)) for the three previously mentioned camera models, with values of $\alpha$ from zero to 0.20. Calibration was performed using normalized coordinates (Section 4.3.3), with $P = 60$. Results are shown in Figures 4(a) and 4(b) for Gaussian and multi–quadric RBFs, respectively.
Note that in the previous experiments $N = 2P$ was used, with $P$ equal to the number of control points. If, on the other hand, $N = \beta P$ for $\beta > 2$, then the matrix $\mathbf{M}$ can become full column rank and, as a consequence, there will be no exact solution for $\mathbf{H}_{wa}$. However, an estimate for $\text{vec}(\mathbf{H}_{wa})$ can still be obtained in the least–squares sense [3].
The errors for $\beta$ from 2 to 50 were also evaluated (measured as the distances between the ground truth 3D lines and the estimated 3D lines). For that purpose, $P = 60$ was used, and Gaussian noise with $\alpha = 0.10$ (Equation (29)) was added. The results are shown in Figures 4(c) and 4(d) for Gaussian and multi–quadric RBFs, respectively.
5.1.3 Results Using Non–Smooth Camera Models

The approach described in this paper is based on the assumption that the camera model is smooth. Therefore, this method cannot be applied to general camera models whose line positions and orientations do not change smoothly. However, even in some of those camera models, the approach can still be applied, provided that errors at the discontinuities can be accepted.

As an example, the calibration procedure was applied to a non–central camera model obtained by combining images acquired by four different perspective cameras with different poses (Figure 5(a)). The errors in 3D line space (Equation (27)), displayed as a function of the image coordinates, are shown in Figure 5(b). These results were obtained for $P = 250$ and multi–quadric RBFs. The errors are small over most of the global image, with the exception of the boundaries between the individual images.
To obtain results with smaller errors, the image points can be clustered based on the error distribution. To each cluster, a specific radial basis function can be assigned (based on the distribution of control points). Each cluster can then be recalibrated individually.

Fig. 5. Results for a synthetically generated non–smooth camera model, made up by combining four images obtained from four different perspective cameras, as shown in (a). In (b), the errors corresponding to the distances between the 3D lines are shown.
5.1.4 Experimental Results Using 3D Data from a Single Surface

In this section we present the results of a new calibration experiment in which the 3D point data used for the calibration belongs to a single surface. Synthetic data was obtained from a non–central catadioptric system made up of a quadric mirror (Figure 3(a)). The catadioptric system was used to generate the data for the calibration, where it was assumed that the correspondences between the world points and their images were known. We consider world points on a spherical surface. Note that the 3D points cannot belong to a single plane or to a single 3D line, since in those cases the theoretical conditions described in [36] are not satisfied and the null space of the matrix $\mathbf{M}$ has dimension greater than one.

The results are shown in Figure 6. The image points used to evaluate the calibration (ray estimation) are not the image points used in the calibration process. We use a non–minimal solution, with 40 control points and 360 data points, and the camera matrix is estimated in the least–squares sense [3]. In the experiments, the lines are estimated with small error.
5.2 Results with Data Sets of Real Images

For the experiments with real data sets, three different types of imaging systems were calibrated, using the model described in this article and point–based calibration: a perspective camera and two different catadioptric systems, shown in Figures 7(a), 7(d) and 7(g) respectively. Note that the catadioptric camera model with two planar mirrors is a non–smooth camera model.

To acquire the point correspondences $\{\mathbf{x}_i \mapsto \mathbf{p}_i\}$, which constitute the data necessary for the calibration method, a chess board was used. Infrared (IR) LEDs were attached to the chess board and their positions in the world were measured using an IR tracker [38]. This tracker has an accuracy of 0.1 mm and a resolution of 0.01 mm.
Each corner of the chess board image, $\mathbf{x}_i$, is associated with a position in the world, $\mathbf{p}_i$, given by its corresponding position on the 3D chess board. The set of associations $\{\mathbf{x}_i \mapsto \mathbf{p}_i\}$, for $i = 1, \dots, 3840$, was used in the calibration procedure.

The set of image points $\{\mathbf{x}_i\}$ are the yellow points in Figures 7(b), 7(e) and 7(h). The set of world points $\{\mathbf{p}_i\}$ are the yellow points in Figures 7(c), 7(f) and 7(i).
Control points $\{\mathbf{c}_i\}$ are chosen as a subset of $\{\mathbf{x}_i\}$; they are shown as green points in Figures 7(b), 7(e) and 7(h). Using $\{\mathbf{x}_i \mapsto \mathbf{p}_i\}$, the interpolant function $\mathbf{s}$ is estimated. Since in the real experiments we have $N > 2P$, the solution is obtained by solving an over–determined system of equations: the camera matrix is estimated as the least–squares solution of the homogeneous equations [3] (Section 4.3.3).
To evaluate the calibration, a different set of 3D coordinates, corresponding to a different object, was used. The IR tracker has a "test object" (also with LEDs) which is provided to enable the estimation of the 3D coordinates of points. This object was used to generate a new data set $\{\mathbf{y}_i \mapsto \mathbf{w}_i\}$, for $i = 1, \dots, 700$, where $\mathbf{y}_i$ are image points and $\mathbf{w}_i$ are world points. This new data set was used to evaluate the calibration performed with the former data set.
The distance error is defined by the geometric distance $\epsilon_i$ from the world point $\mathbf{w}_i$ to the estimated line $\hat{\mathbf{l}}_i$, where $\hat{\mathbf{l}}_i = \mathbf{s}(\mathbf{y}_i)$. A subset of $\{\mathbf{y}_i\}$ and the corresponding lines $\{\hat{\mathbf{l}}_i\}$ are shown as red squares in Figures 7(b), 7(e) and 7(h), and as red lines in Figures 7(c), 7(f) and 7(i), respectively. We derived the geometric distance between a line (in Plücker coordinates) and a point in the world as $\epsilon_i = \|\boldsymbol{\epsilon}_i\|$, where

$$\boldsymbol{\epsilon}_i = \begin{pmatrix} [\mathbf{w}_i]_\times & -\mathbf{I} \end{pmatrix} \bar{\mathbf{l}}_i \quad (30)$$

and $\bar{\mathbf{l}}_i = \hat{\mathbf{l}}_i / \|\hat{\mathbf{d}}_i\|$. For more information, see the author web page.
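A sketch of Equation (30):

```python
import numpy as np

def point_line_error(w, l_hat):
    """Geometric distance of Equation (30) from a world point w to the
    estimated line l_hat = (d_hat, m_hat)."""
    l_bar = l_hat / np.linalg.norm(l_hat[:3])      # scale so that ||d|| = 1
    d, m = l_bar[:3], l_bar[3:]
    return np.linalg.norm(np.cross(w, d) - m)      # || ( [w]x  -I ) l_bar ||
```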
The numbers of control points used in the calibration were 30, 35 and 40, for the imaging systems of Figures 7(a), 7(d) and 7(g), respectively. The results are shown in Table 1.
Fig. 6. Figure (a) represents the image points (black squares) and control points (red circles) used in the calibration process. Their corresponding world points are shown in Figure (b) (black points); we consider 3D points on a spherical surface. Figure (c) shows the results, with the ground–truth lines shown in black and the estimated lines in red. The image points used to generate the estimated lines of Figure (c) are different from the image points used in the calibration (Figure (a)).
5.3 Using a Calibrated Perspective Camera to Acquire a Data Set with Real Data

To further evaluate the calibration method, an additional data set of real data was acquired.

Consider an object or scene whose images are acquired by two perspective cameras. One of the cameras looks directly at the object/scene, whereas the second camera acquires images of the object by viewing it through a glass tank filled with water (Figure 8(a)). The goal is to calibrate the second imaging system, i.e., including the effect of viewing the object through the water–filled glass tank. An example of an image acquired by this system is shown in Figure 8(b).

A chess board is used to calibrate the system. Several images of the chess board were acquired, with the chess board in different positions and orientations. The direct images of the chess board, i.e., those acquired by the camera that views the scene directly (not through the glass tank), were used to estimate the coordinates of the planes in the world coordinate system. For that purpose the Bouguet calibration toolbox was used [39].
Eighteen different images of the chess board, in different positions and orientations, were used. In each chess board image there were 160 points whose coordinates could be used (these points are located at the intersections of the edges). The points are displayed in Figure 8(c) as yellow and blue dots. The coordinates of the points on the chess board are used as world points, forming the set $\{\mathbf{p}_i^{(j)}\}$, where $\mathbf{p}_i^{(j)}$ are the world coordinates of the $i$th point of the $j$th chess board position.
Since these points are also visible in the images acquired through the water–filled glass tank (i.e., including the effects of refraction), the image points of each chess board image, $\{\mathbf{x}_i^{(j)}\}$ (where $\mathbf{x}_i^{(j)}$ is the image coordinate of the $i$th point of the $j$th chess board position), are associated with the corresponding world coordinates $\{\mathbf{p}_i^{(j)}\}$, forming the set of mappings $\{\mathbf{x}_i^{(j)} \mapsto \mathbf{p}_i^{(j)}\}$.

TABLE 1
Results for the experiments, using the IR tracker [38].

System          Gaussian [cm]        Multi–quadrics [cm]
Projective      0.93058 ± 1.9783     0.54711 ± 0.43567
Sphere mirror   2.8743 ± 2.1074      1.8863 ± 1.2379
Plane mirrors   0.91795 ± 1.0608     0.99492 ± 0.78778
To test the calibration method, the following procedure was followed: for each specific chess board position and orientation $j$, the data from all the other chess board positions and orientations was used as the calibration data set $\{\mathbf{x}_i \mapsto \mathbf{p}_i\}$ (2720 mappings between image and world points), and the coordinates of the points of the $j$th chess board position were used as the ground truth data set $\{\mathbf{y}_i \mapsto \mathbf{w}_i\}$ (160 mappings between image and world points). For each $\mathbf{y}_i$, the distance between the estimated line $\hat{\mathbf{l}}_i$ and the corresponding point in the world $\mathbf{w}_i$ is estimated using Equation (30).

This procedure is repeated for all the chess board positions $j$, and the means and standard deviations of the distance errors are computed, for both Gaussian and multi–quadric RBFs. The number of control points used was 10. The corresponding results were 0.111 ± 0.075 cm for multi–quadrics and 0.313 ± 0.545 cm for Gaussian RBFs, respectively.
An example of the lines estimated using the method described (for a specific plane position) is shown in Figure 8(c), as red lines.
5.4 Removing Distortion: Results for a Spherical
Catadioptric System
In general, undistorted views cannot be computed from images acquired by general non–central camera models (without any additional constraint). As mentioned by Swaminathan et al. [40], obtaining undistorted images from general non–central systems requires knowledge of the scene structure.

Fig. 7. Results with real images, obtained using the method described, for three types of camera models: perspective camera (a), catadioptric system with a spherical mirror (d) and catadioptric system with two planar mirrors (g). Yellow points in (b), (e) and (h) are the sets of image coordinates $\{\mathbf{x}_i\}$ and yellow points in (c), (f) and (i) are the corresponding sets of world points $\{\mathbf{p}_i\}$ used in the calibration method. Red squares in the 2D plots and red rays in the 3D plots are a subset of $\{\mathbf{y}_i \mapsto \hat{\mathbf{l}}_i\}$ obtained with multi–quadric RBFs. Green points in (b), (e) and (h) are the corresponding sets of control points used in the calibration.
Using the approach proposed by Swaminathan et al. (and the knowledge of the scene structure), we performed a new experiment to estimate undistorted images from the non–central catadioptric camera with a spherical mirror.

We used the same calibration process described in Section 5.3. We know the positions of the seventeen chess boards, and we use sixteen of these chess board positions to calibrate the camera model. We then obtained the undistorted image of the 17th plane: using the knowledge of the position of the plane not used in the calibration, and its image, the undistorted image was computed with a virtual perspective camera located at $(0, 0, 0)$. The results are presented in Figure 9. As can be seen from Figures 9(a) and 9(b), the curved lines in the distorted image are mapped into straight lines in the undistorted image.
Fig. 8. In Figure (a), we display the set-up used to acquire the images of the data set. The goal is the calibration of an imaging system that views a scene through a water-filled glass tank. An example of an image acquired by this imaging system is shown in Figure (b); a planar chessboard was used. The data set is made up of the coordinates of the images of the points corresponding to the intersections of the chessboard edges (seen through the tank). An example of the results is shown in Figure (c): in yellow, all the 3D points used in the calibration procedure; in blue, the 3D points with which the estimated lines must be incident; in red, the estimated lines. The units in Figure (c) are millimeters.
6 CONCLUSIONS
6.1 Discussion
The experiments and results with synthetic data were especially relevant for the goals of this paper. They were particularly important to evaluate the effect of the number of control points, as well as to evaluate the sensitivity of the method to noise, by means of a fully specified data set.
Considering the experimental results obtained without noise (Figure 3), those obtained using Gaussian RBF are slightly better than those obtained using multi–quadrics. However, multi–quadrics RBF tend to be less sensitive to image scale variation. The experimental results also show that the inclusion of the coordinate normalization procedure (Section 4.3.3) tends to consistently yield better results, as expected. When noise is added to the synthetic data, and as expected, the use of least-squares estimates yields better results than applying the theoretical condition N = 2P.
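The benefit of over-determination can be illustrated with a toy linear system (a minimal sketch in Python/NumPy; the random matrix merely stands in for the calibration matrix M of Section 4.2, and all sizes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
P = 50                      # toy number of control points -> 2P unknowns
n_unknowns = 2 * P
x_true = rng.normal(size=n_unknowns)

def noisy_system(n_equations, sigma=0.05):
    """Random stand-in for the linear calibration system, with noisy data."""
    M = rng.normal(size=(n_equations, n_unknowns))
    b = M @ x_true + sigma * rng.normal(size=n_equations)
    return M, b

# Theoretical condition N = 2P: a square system, solved exactly.
M_sq, b_sq = noisy_system(n_unknowns)
x_sq = np.linalg.solve(M_sq, b_sq)

# Over-determined system N >> 2P: least-squares estimate.
M_ls, b_ls = noisy_system(8 * n_unknowns)
x_ls, *_ = np.linalg.lstsq(M_ls, b_ls, rcond=None)

print("parameter error, N = 2P :", np.linalg.norm(x_sq - x_true))
print("parameter error, N >> 2P:", np.linalg.norm(x_ls - x_true))
```

With noisy measurements, the redundant equations average out the noise, so the least-squares estimate is typically closer to the true parameters than the solution of the exact square system.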
Fig. 9. In Figure (a) we show the original image obtained from a catadioptric camera system made up of a spherical mirror. In Figure (b), we show the undistorted image.
In some cases the calibration is affected by errors of up to 60 units. These are extreme cases, and we discuss the reasons why they occur. In one case (Figure 3) these errors correspond to results obtained with a very small number of control points (10). Note that this number of control points is small for the specific camera model (for other systems it might be adequate); therefore, in that case, the reason for the error value is the small number of control points. In the case of Figure 4, these values are obtained for noise levels with standard deviations of up to 20 units. These are presented to show the performance of the model in very special (and unrealistic) conditions.
Experiments with synthetic data also show that this method can be applied to the calibration of non–smooth camera models, provided the discontinuities in the data are limited.
To conclude the synthetic experiments, we performed an experiment in which (with appropriate choices for the interpolant parameters) we calibrated a camera model using world points on a single surface (Figure 6). These experiments show that the method can be used without explicitly accounting for the collinearity constraints.
Experiments with real data sets were also performed, to evaluate the behavior of the method under real conditions. The results shown in Figures 7 and 8 display the expected behavior of the 3D line estimates obtained under the assumption of a Smooth Camera Model. In addition, we also show distortion-correction results for a non–central catadioptric camera, using the proposed camera model.
6.2 Closure
The proposed Smooth Camera Model and calibration procedure can be used to calibrate complex smooth camera models, namely in the case of cameras for which no analytical model exists. Although the model is intended for smooth imaging systems, it can also be used in special cases of non–smooth imaging systems.
This approach can model a camera with significantly fewer parameters than the discrete General Camera Model. For the model described in this paper, the number of parameters does not depend on the image size. Instead of the 7NM parameters required by the discrete General Camera Model for an M × N image, this approach only requires 6(P + 3) parameters for P control points.
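As an illustrative example (with hypothetical but plausible sizes): for a 1024 × 768 image, the discrete model requires 7 · 1024 · 768 ≈ 5.5 million parameters, whereas with P = 100 control points the proposed model requires only 6 · (100 + 3) = 618.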
The calibration procedure only requires the 3D coordinates of a single world point for each image point, whereas previous approaches require two or more 3D points for each image point. On the other hand, the calibration parameters estimated for a subset of image points are implicitly generalized to all image pixels, which constitutes an important advantage of this method.
REFERENCES
[1] R. Y. Tsai, "An Efficient and Accurate Camera Calibration Technique for 3D Machine Vision," In CVPR, 1986.
[2] Z. Zhang, "A Flexible New Technique for Camera Calibration," IEEE Trans. Pattern Analysis and Machine Intelligence, 2000.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision (2nd ed.). Cambridge, UK: Cambridge University Press, 2000.
[4] S. Baker and S. K. Nayar, "A Theory of Single–Viewpoint Catadioptric Image Formation," International Journal of Computer Vision, 1999.
[5] S. Gasparini, P. Sturm, and J. P. Barreto, "Plane–Based Calibration of Central Catadioptric Cameras," In ICCV, 2009.
[6] R. Hartley and S. B. Kang, "Parameter–free Radial Distortion Correction with Centre of Distortion Estimation," In ICCV, 2005.
[7] D. Nistér, H. Stewénius, and E. Grossmann, "Non–Parametric Self–Calibration," In ICCV, 2005.
[8] S. Ramalingam, P. Sturm, and S. K. Lodha, "Generic self-calibration of central cameras," Computer Vision and Image Understanding, 2010.
[9] N. Goncalves, "Noncentral Catadioptric Systems with Quadric Mirrors, Geometry and Calibration," Ph.D. dissertation, University of Coimbra, October 2008.
[10] A. Agrawal, Y. Taguchi, and S. Ramalingam, "Analytical Forward Projection for Axial Non–Central Dioptric & Catadioptric Cameras," In ECCV, 2010.
[11] R. Swaminathan, M. D. Grossberg, and S. K. Nayar, "Non–Single Viewpoint Catadioptric Cameras: Geometry and Analysis," International Journal of Computer Vision, 2006.
[12] B. Micusik and T. Pajdla, "Autocalibration & 3D Reconstruction with Non–Central Catadioptric Cameras," In CVPR, 2004.
[13] J. Ponce, "What is a Camera?" In CVPR, 2009.
[14] R. Gupta and R. I. Hartley, "Linear Pushbroom Cameras," IEEE Trans. Pattern Analysis and Machine Intelligence, 1997.
[15] J. Yu and L. McMillan, "General Linear Cameras," In ECCV, 2004.
[16] A. Zomet, D. Feldman, S. Peleg, and D. Weinshall, "Mosaicing New Views: The Crossed–Slits Projection," IEEE Trans. Pattern Analysis and Machine Intelligence, 2003.
[17] S. Hu and X. Feng, "Calibration of Panorama Based on Linear CCD and Fisheye Lens," In CISP, 2009.
[18] J. Kannala and S. S. Brandt, "A Generic Camera Model and Calibration Method for Conventional, Wide–Angle, and Fish–Eye Lenses," IEEE Trans. Pattern Analysis and Machine Intelligence, 2006.
[19] T. Treibitz, Y. Y. Schechner, and H. Singh, "Flat Refractive Geometry," In CVPR, 2008.
[20] K. N. Kutulakos and E. Steger, "A Theory of Refractive and Specular 3D Shape by Light–Path Triangulation," International Journal of Computer Vision, 2008.
[21] P. Sturm, S. Ramalingam, J.-P. Tardif, S. Gasparini, and J. Barreto,
“Camera Models and Fundamental Concepts Used in Geometric
Computer Vision,” Foundations and Trends in Computer Graphics
and Vision, 2011.
[22] M. Menem and T. Pajdla, “Constraints on perspective images and
circular panoramas,” In BMVC, 2004.
[23] S. Thirthala and M. Pollefeys, “Radial Multi-focal Tensors – Applications to Omnidirectional Camera Calibration,” International
Journal of Computer Vision, 2012.
[24] M. D. Grossberg and S. K. Nayar, “A General Imaging Model and
a Method for Finding its Parameters,” In ICCV, 2001.
[25] ——, “The Raxel Imaging Model and Ray–Based Calibration,”
International Journal of Computer Vision, 2005.
[26] P. Sturm and S. Ramalingam, “A Generic Concept for Camera
Calibration,” In ECCV, 2004.
[27] S. Ramalingam, P. Sturm, and S. K. Lodha, “Towards Complete
Generic Camera Calibration,” In CVPR, 2005.
[28] P. Miraldo, H. Araujo, and J. Queiró, “Point–based Calibration Using a Parametric Representation of the General Imaging Model,”
In ICCV, 2011.
[29] H. Pottmann and J. Wallner, Computational Line Geometry. Berlin:
Springer–Verlag, 2001.
[30] M. Buhmann, Radial Basis Functions: Theory and Implementations. Cambridge, UK: Cambridge University Press, 2003.
[31] H. Wendland, Scattered Data Approximation. Cambridge, UK: Cambridge University Press, 2005.
[32] E. Quak, N. Sivakumar, and J. D. Ward, “Least Squares Approximation by Radial Functions,” SIAM Journal on Mathematical
Analysis, 1993.
[33] N. Sivakumar and J. D. Ward, “On the least squares fit by
radial functions to multidimensional scattered data,” Numerische
Mathematik, 1993.
[34] G. H. Golub and C. F. Van Loan, Matrix Computations (3rd ed.).
Baltimore, MD, USA: Johns Hopkins University Press, 1996.
[35] A. Bartoli and P. Sturm, “Structure–From–Motion Using Lines:
Representation, Triangulation and Bundle Adjustment,” Computer
Vision and Image Understanding, 2005.
[36] P. Miraldo, H. Araujo, and J. Queiró, “Unique Solution for
the Estimation of the Plücker Coordinates Using Radial Basis
Functions,” University of Coimbra, Tech. Rep., 2011. [Online].
Available: http://hdl.handle.net/10316/19078
[37] R. Hartley, “In Defense of the Eight–Point Algorithm,” IEEE Trans.
Pattern Analysis and Machine Intelligence, 1997.
[38] Northern Digital Incorporated, “Optotrak Certus Motion Capture
System,” Northern Digital Incorporated, 2009, www.ndigital.com.
[39] J.-Y. Bouguet, Camera Calibration Toolbox for Matlab, July 2010 (last update). [Online]. Available: http://www.vision.caltech.edu/bouguetj/calib_doc
[40] R. Swaminathan, M. D. Grossberg, and S. K. Nayar, "A Perspective on Distortions," In CVPR, 2003.
Pedro Miraldo received the Master's degree in Electrical and Computer Engineering from the University of Coimbra in 2008. Currently, he is a researcher and Ph.D. student at the Institute for Systems and Robotics, University of Coimbra, and his work focuses on general camera models.
Helder Araujo is a Professor in the Department of Electrical and Computer Engineering of the University of Coimbra, Portugal, and an Invited Professor at Blaise Pascal University, Clermont-Ferrand, France. He is a researcher at the Institute for Systems and Robotics, Coimbra. His research interests include computer and robot vision, robot navigation and localization, and sensor modeling for robot navigation and localization.