
Calibration of Smooth Camera Models

IEEE Transactions on Pattern Analysis and Machine Intelligence

Universidade de Coimbra, Instituto de Sistemas e Robótica, 3030-290 Coimbra, Portugal
Article: Calibration of Smooth Camera Models
Author: Pedro Miraldo (miraldo@isr.uc.pt)
Co-author: Prof. Dr. Helder Araujo (helder@isr.uc.pt)

Contents
1 Introduction
  1.1 Model and Approach
  1.2 Outline of the Paper
2 Notation and Background
  2.1 Notation
  2.2 Plücker coordinates
  2.3 Interpolation
3 Smooth Camera Model
4 The Proposed Method
  4.1 Vector-Valued Function
  4.2 Linear Point-based Calibration
    4.2.1 Calibration Matrix M
    4.2.2 Computation of the Camera Matrix
    4.2.3 Relationship between Control Points, Point Correspondences and the Radial Basis Function Used in the Calibration
  4.3 Data Normalization
    4.3.1 Scaling the Coordinates of the Set of Image Points
    4.3.2 Scaling the Coordinates of the Set of World Points
    4.3.3 Camera Model and Calibration with Normalization
5 Experiments
  5.1 Results with Synthetic Data Sets
    5.1.1 Evaluation Results Using Smooth Camera Models
    5.1.2 Results with Noise Added
    5.1.3 Results Using Non-Smooth Camera Models
    5.1.4 Experimental Results Using 3D Data from a Single Surface
  5.2 Results with Data Sets of Real Images
  5.3 Using a Calibrated Perspective Camera to Acquire a Data Set with Real Data
  5.4 Removing Distortion: Results for a Spherical Catadioptric System
6 Conclusions
  6.1 Discussion
  6.2 Closure
References
Biographies (Pedro Miraldo, Helder Araujo)

Calibration of Smooth Camera Models
Pedro Miraldo and Helder Araujo

Abstract—Generic imaging models can be used to represent any camera.
Current generic models are discrete and define a mapping between each pixel in the image and a straight line in 3D space. This paper presents a modification of the generic camera model that simplifies the calibration procedure. The only requirement is that the coordinates of the 3D projecting lines are related by functions that vary smoothly across space. Such a model is obtained by modifying the general imaging model with radial basis functions that interpolate image coordinates and 3D lines, thereby allowing both an increase in resolution (due to their continuous nature) and a more compact representation. Using this variation of the general imaging model we also develop a calibration procedure. This procedure only requires that a 3D point be matched to each pixel; in addition, not all pixels need to be calibrated. As a result, the complexity of the procedure decreases significantly. Normalization is applied to the coordinates of both image and 3D points, which increases the accuracy of the calibration. Results with both synthetic and real data sets show that the model and calibration procedure are easily applicable and provide accurate calibration results.

Index Terms—General Camera Models, Camera Calibration, Smooth Vector-Valued Functions.

1 Introduction

Most cameras can be modeled by a perspective projection, which implies that all projecting rays intersect at a single point. These cameras are usually called central cameras. The calibration process for the perspective camera model (pinhole model) is well known and easy to implement [1], [2], [3]. Other camera models that verify the single-viewpoint constraint but do not satisfy the pinhole model have also been defined, for example special cases of catadioptric camera models [4], [5]. Moreover, central camera models with a non-parametric association from image pixels to rays (the General Central Camera Model), and their respective calibration, have also been studied [6], [7], [8].

In the last few years, cameras whose projecting rays do not satisfy the constraint of intersecting at a single effective viewpoint started to be used, essentially due to the large fields of view that can be obtained. These cameras are called non-central and in many cases are obtained by a generic combination of a perspective camera and a curved mirror: the non-central catadioptric cameras [9], [10], [11], [12] (Figure 3(a)). They are used in several applications ranging from robotics to visualization. Many other non-central camera models have also been developed, such as linear cameras [13], [14], [15], [16] (Figure 3(d)), fisheye cameras [17], [18] or camera models that include refractive elements [19], [20] (Figure 3(g)). All of these systems have, in general, models that are characterized by a small number of parameters. Since the number of parameters is small, the calibration procedures are, in general, simple. All of these models are specific, since they are derived using geometric models of the cameras [21], and therefore they lack generality.

[Footnote: P. Miraldo and H. Araujo are with the Institute for Systems and Robotics, Department of Electrical and Computer Engineering, University of Coimbra, 3030-290 Coimbra, Portugal. E-mail: {miraldo, helder}@isr.uc.pt. This work was supported by the Portuguese Foundation for Science and Technology (FCT) with grants PTDC/EIA-EIA/122454/2010 and PTDC/EIA-CCO/109120/2008. P. Miraldo also thanks the FCT for his doctoral grant, reference SFRH/BD/49054/2008.]
To deal with these drawbacks, techniques were developed to calibrate special classes of cameras, such as [22], [23]. Grossberg and Nayar [24], [25] define a non-parametric discrete imaging model that can represent any type of camera, central or non-central (the General Camera Model). Unlike the usual parametric camera models, this model consists of an individual mapping between pixels and rays in 3D space (Figure 1). To each pixel, a set of parameters called a raxel is associated. The set of all raxels (representing all pixels) constitutes the complete General Camera Model. A raxel is a set of parameters that includes image pixel coordinates, the coordinates of the associated ray in the world (position and direction) and radiometric parameters. Grossberg and Nayar also propose a method for estimating the parameters of the General Camera Model. Their approach requires the acquisition of at least two images of a calibration object with known structure and also requires knowledge of the object motion between the images: it needs two world points for the same image pixel. Sturm and Ramalingam [26] and Ramalingam et al. [27] proposed a calibration method based on the non-parametric imaging model suggested by Grossberg and Nayar. However, they excluded from their model the radiometric entities of the raxel. Their method assumes that the camera is fully described by the coordinates of the rays and the mapping between rays and pixels. Instead of using two images, their method requires three images of the calibration object, acquired from arbitrary and unknown viewing positions. If three points of the calibration object are seen by the same pixel, the collinearity constraint allows the computation of the motion between the images of the calibration object and, as a result, the estimation of the direction of the ray in 3D space.

Fig. 1. A unified camera model only assumes that an association between image pixels and rays in the world space exists.

All the methods mentioned above are discrete and non-parametric, using mapping arrays (raxels) to calibrate the imaging model. Each image pixel has an associated set of parameters that is independent of its neighbors. Most useful camera models have, in general, pixel-ray relationships that vary smoothly along the image. That is the case of the central perspective camera, of non-central catadioptric systems with quadric mirrors, of linear cameras, of fisheye cameras and of camera models that include refracting elements. In this article we change the discrete model defined by Grossberg and Nayar by assuming that the 3D rays associated with the pixels vary smoothly throughout the image. The assumption of smoothness is used as a constraint in the definition of the camera model and therefore in the calibration method.

This paper extends the approach presented in [28], where the Smooth Camera Model was presented for the first time. This paper provides further details concerning the application of the Smooth Camera Model, and improves it by using normalized coordinates. An example and discussion of calibration with minimal data is presented. Additional experimental results with synthetic data are presented.
The additional experimental results include a crossed-slits camera model, a "refraction camera" made up of a perspective camera looking through a volume of water, and a non-smooth configuration made up of four perspective cameras looking at the same scene. Additional experimental results with real images are also included, namely the case of a perspective camera looking at a scene through a water-filled glass tank.

1.1 Model and Approach

Since the estimation of the raxels for all the image pixels requires a lengthy calibration procedure, other authors have also assumed that the relationship between neighboring pixels and neighboring 3D rays in the world varies smoothly.

Fig. 2. In this figure we display the result of the application of the calibration method to the minimal case according to the theoretical constraints (see Section 4.2.2). In the minimal case there are six points in the image and their corresponding points in the world.

In [25] and [26], [27], the authors have also used this constraint. The constraint is applied after the estimation of at least a subset of all possible image pixels. Sturm and Ramalingam used this constraint to interpolate between a subset of calibrated raxels corresponding to neighboring pixels. Grossberg and Nayar in [25] defined a Caustic Raxel Model that aimed at estimating a caustic surface from a set of raxels. In that model the estimation of the caustic (and therefore the calibration of the imaging model) requires the estimation of the raxels. Such estimation requires that more than one 3D point be used for each image pixel. In both previous cases, constraints enforcing smoothness between neighboring pixels are used post-calibration, to interpolate between pixels that were not calibrated.

In the proposed approach, the assumption that the relationship between pixels and rays varies smoothly is used directly in the model. For that purpose, a vector-valued function that can represent any Smooth Camera Model is defined. A parametric representation for the General Camera Model is defined, which results in:
• a decrease in the number of unknown parameters;
• the number of parameters and the image resolution becoming independent.
Using this camera model, a new calibration procedure can be developed, where:
• only one world point is needed for each image point, unlike previous methods [24] and [26], which require at least two points in the world for each image point;
• the calibration procedure has lower complexity, since all the parameters are estimated in a single step.
As in [26], only the geometric entities of the General Camera Model are considered.

In Figure 2, an example of the advantages of the use of the proposed method in the calibration of smooth camera models is shown. The minimal case (defined according to the theoretical conditions) is considered, where six matchings between world and image points are known (see the proof in Section 4.2.2). In this example, all the coordinates of the image points are different. The estimated 3D lines pass through the corresponding 3D points using one-to-one mappings, which occurs as a result of applying the constraint that neighboring pixels must correspond to neighboring rays in the world (the smoothness constraint). The proposed method can be considered a black-box general Smooth Camera Model which is linear and leads to a simpler calibration method.
1.2 Outline of the Paper

In the next section, the mathematical notation that will be used in the article is presented. In Section 3, the proposed change to the General Camera Model, the Smooth Camera Model, is defined. In Section 4, the proposed method is described. Results are presented in Section 5 and discussed in Section 6.

2 Notation and Background

2.1 Notation

Matrices are represented as bold capital letters (e.g. A ∈ R^{n×m}, n rows and m columns). Vectors are represented as bold small letters (e.g. a ∈ R^n, n elements). By default, a vector is considered a column. Small letters (e.g. a) represent one-dimensional elements. By default, the jth column vector of A is specified as a_j. The jth element of a vector a is written as a_j. The element of A in row i and column j is represented as a_{i,j}. Regular capital letters (e.g. A) indicate one-dimensional constants. We use the symbol ∼ to denote that a vector or matrix is represented up to a scale factor.

Let U ∈ R^{m×n}, V ∈ R^{k×l} and C ∈ R^{m×k} be known and X ∈ R^{n×l} unknown. Using the Kronecker product, we can write

$$\mathbf{U}\mathbf{X}\mathbf{V}^T = \mathbf{C} \implies (\mathbf{V} \otimes \mathbf{U})\,\mathrm{vec}(\mathbf{X}) = \mathrm{vec}(\mathbf{C}) \tag{1}$$

where ⊗ is the Kronecker product of U and V, with (V ⊗ U) ∈ R^{mk×nl}, and vec(X) is an nl-vector formed by stacking the columns of X.

2.2 Plücker coordinates

Plücker coordinates are a special case of Grassmann coordinates [29]. A Grassmann manifold is the set of k-dimensional subspaces in an n-dimensional vector space, denoted as Λ^k R^n. Plücker coordinates can be obtained as a representation of the exterior product of four-dimensional vectors, x ∧ w. The result of this operation lies in a six-dimensional vector space R^6, and can represent lines in 3D space. Consider two points in the world (x and w in P^3) represented in homogeneous coordinates R^4. Using Plücker coordinates, we can represent a 3D line, incident on both points, as

$$\mathbf{l} \sim \mathbf{x} \wedge \mathbf{w} = (\underbrace{l_{01},\, l_{02},\, l_{03}}_{\mathbf{d}},\, \underbrace{l_{23},\, l_{31},\, l_{12}}_{\mathbf{m}}) \in \Lambda^2 \mathbb{R}^4 \subset \mathbb{R}^6 \tag{2}$$

with l_{ij} = x_i w_j − x_j w_i, basis e_{ij} = e_i ∧ e_j (e_i are the R^4 basis), and where d and m are, respectively, the direction and the moment of the line. Although all elements of the four-dimensional exterior product Λ^2 R^4 belong to R^6, not all elements of R^6 represent lines in 3D space. It can be shown that Equation (2) is the result of a four-dimensional exterior product (and therefore is a line in 3D space) if and only if it belongs to the Klein quadric, which is equivalent to writing ⟨d, m⟩ = 0.

Plücker coordinates enable the computation of the incidence condition between lines and points in the world. Using the direction and moment vectors, a point p ∈ P^3 is incident on a line l ∈ Λ^2 R^4 if

$$\underbrace{\begin{pmatrix} [\mathbf{p}]_\times & -\mathbf{I} \\ \mathbf{0} & \mathbf{p}^T \end{pmatrix}}_{\mathbf{Q}(\mathbf{p})} \mathbf{l} = \mathbf{0} \tag{3}$$

where p ∈ R^3 (non-homogeneous representation), [a]_× is the matrix that linearizes the exterior product, [a]_× b = a × b, and I is the 3 × 3 identity matrix.
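For illustration, the construction of Equation (2) and the incidence test of Equation (3) can be written compactly. The following NumPy sketch (function names are ours, not part of the original text) builds a Plücker vector from two points and checks both the incidence condition and the Klein quadric constraint.

```python
import numpy as np

def pluecker_from_points(x, w):
    """Line through two homogeneous 4-vectors x, w: stacked (d, m), Eq. (2)."""
    d = x[0] * w[1:] - w[0] * x[1:]        # direction: l01, l02, l03
    m = np.cross(x[1:], w[1:])             # moment:    l23, l31, l12
    return np.concatenate([d, m])

def incidence_matrix(p):
    """Q(p) for a non-homogeneous 3D point p, as in Equation (3)."""
    px = np.array([[0, -p[2], p[1]],
                   [p[2], 0, -p[0]],
                   [-p[1], p[0], 0]])      # [p]_x, the cross-product matrix
    Q = np.zeros((4, 6))
    Q[:3, :3], Q[:3, 3:], Q[3, 3:] = px, -np.eye(3), p
    return Q

# The line through p and q contains p, so Q(p) l must vanish.
p, q = np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 7.0])
l = pluecker_from_points(np.r_[1.0, p], np.r_[1.0, q])
assert np.allclose(incidence_matrix(p) @ l, 0)   # incidence condition, Eq. (3)
assert np.isclose(l[:3] @ l[3:], 0)              # Klein quadric: <d, m> = 0
```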
2.3 Interpolation

Suppose that we want to estimate an unknown function f : R^D ↦ R from a set of scattered data points X = {x_i} ⊂ R^D (with D a natural number) and Y = {y_i}, where {x_i, y_i} forms a training data set with y_i = f(x_i). Interpolation requires the computation of an interpolating function s : R^D ↦ R that satisfies

$$s(\mathbf{x}_i) = f(\mathbf{x}_i), \ \forall i. \tag{4}$$

Radial basis functions (RBF) [30], [31], [32], [33] can be used to solve this problem. For a set of P training points {x_1, ..., x_P}, the RBF interpolant has the form

$$s(\mathbf{x}) = a_0 + \mathbf{a}_x^T \mathbf{x} + \sum_{i=1}^{P} w_i\, \phi(\|\mathbf{x} - \mathbf{x}_i\|) \tag{5}$$

where ||.|| is the vector 2-norm [34], φ : R^+ ↦ R is the radial basis function and a_x ∈ R^D. a_0, a_x and w_i are the interpolant unknowns.

There are two types of kernel functions that can be used in the RBF interpolant. One type does not have shape parameters, like thin-plate splines, φ(r) = r² log(r), or φ(r) = r². The other type does have shape parameters, such as Gaussian functions, φ(r) = exp(−γ²r²), and multi-quadrics, φ(r) = (γ² + r²)^{1/2}, where γ is the shape parameter.

The interpolation is obtained by means of the estimation of the unknown parameters of the interpolant (a = (a_0, a_x) and w = (w_1, ..., w_P) of Equation (5)). The interpolating function s(x) has P + D + 1 degrees of freedom, while the data sets X and Y only yield P equations. For the estimation of the unknowns in Equation (5), additional constraints have to be used. Considering the function φ conditionally positive definite [31], the following equations are verified [28], [31]:

$$\sum_{i=1}^{P} w_i = 0 \quad \text{and} \quad \sum_{i=1}^{P} w_i\, x_1^{(i)} = \cdots = \sum_{i=1}^{P} w_i\, x_D^{(i)} = 0 \tag{6}$$

where x_k^{(i)} denotes the kth coordinate of the ith observation. The use of these constraints allows the estimation of all the unknowns, which can be obtained using the relation

$$\underbrace{\begin{pmatrix} \boldsymbol{\Phi} & \mathbf{P}^T \\ \mathbf{P} & \mathbf{0} \end{pmatrix}}_{\boldsymbol{\Gamma}} \begin{pmatrix} \mathbf{w} \\ \mathbf{a} \end{pmatrix} = \begin{pmatrix} \mathbf{y} \\ \mathbf{0} \end{pmatrix} \tag{7}$$

where φ_{i,j} = φ(||x_i − x_j||). Furthermore, considering D = 2, we can see that C(Γ) = P + 3 implies P ≥ 3.
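As an illustration of Equations (5)-(7), the following sketch (our own minimal example, with an arbitrary smooth test function) assembles the block system Γ for a multi-quadric kernel and verifies the interpolation condition of Equation (4).

```python
import numpy as np

def fit_rbf(X, y, phi):
    """X: (P,2) training points, y: (P,) values, phi: radial kernel."""
    P = len(X)
    Phi = phi(np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1))  # (P,P)
    Pm = np.hstack([np.ones((P, 1)), X])            # polynomial block, (P,3)
    Gamma = np.block([[Phi, Pm], [Pm.T, np.zeros((3, 3))]])   # Eq. (7)
    sol = np.linalg.solve(Gamma, np.r_[y, np.zeros(3)])
    w, a = sol[:P], sol[P:]                         # weights and affine part
    def s(x):                                       # interpolant s(x), Eq. (5)
        return a[0] + a[1:] @ x + w @ phi(np.linalg.norm(X - x, axis=-1))
    return s

gamma = 1.0
mq = lambda r: np.sqrt(gamma**2 + r**2)             # multi-quadric kernel
X = np.random.rand(20, 2)
y = np.sin(X[:, 0]) + X[:, 1]**2                    # arbitrary smooth test data
s = fit_rbf(X, y, mq)
assert np.allclose([s(x) for x in X], y)            # condition of Eq. (4)
```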
3 Smooth Camera Model

In the General Camera Model (Figure 1), introduced by Grossberg and Nayar [24], each pixel in the image x ∈ P^2 is mapped to a ray in 3D space l ∈ L^3. As mentioned before, the model is based on an array of parameters called raxels, and it can represent central or non-central camera models, as well as smooth and non-smooth camera models. In that model, a complete General Camera Model is represented by a non-parametric discrete array of raxels that contains all possible pixels in an image. This means that the pixel-line mappings are required for all pixels, independently of the image resolution or of the smoothness of the variation of the parameters (corresponding to the 3D lines associated with neighboring pixels). If we consider only the geometric entities in Grossberg and Nayar's model, each raxel contains at least seven parameters. Thus, for each pixel, there are seven unknown parameters to be computed; for an image of size N × M, there are 7NM unknowns. We aim at deriving a model that requires a smaller number of parameters.

Since most imaging devices are characterized by a smooth relationship between image pixels and lines in the world, we use the assumption that the pixel-line mappings can be represented by a smoothly varying vector-valued function f : P^2 ↦ L^3 that maps a point in the image plane to a line in 3D space. We use this approach to represent the Smooth Camera Model. This assumption significantly decreases the number of model unknowns and also allows some of the error due to noise to be filtered out.

In most cases, a general direct projection model does not exist, since one 3D line can be mapped into more than one point in the image plane. As a result, a General Camera Model can only be defined when considering the mapping from image coordinates to 3D lines. A recent result, however, derived an analytical model for the forward projection of a non-central system made up of a perspective camera and a rotationally symmetric conic mirror [10]. As previously described, a single mapping between a point in the image and a point in the world is insufficient to define a raxel and, as a consequence, cannot define a General Camera Model: the information about the direction of the line cannot be recovered. However, if we assume that the relationship between image points and world lines varies smoothly, we are imposing new constraints on the lines. Those constraints implicitly allow the complete calibration of the Smooth Camera Model using only a single world point per image point, as proved in the following sections. An example can be seen in Figure 2.

4 The Proposed Method

In this section we describe the formulation of the proposed approach to represent the Smooth Camera Model (Section 4.1) and the calibration procedure (Section 4.2). In Section 4.3, we describe the data set normalization.

4.1 Vector-Valued Function

A 3D line representation has to be chosen for the output of the vector-valued function. Lines in 3D have four degrees of freedom; however, none of the compact four-variable representations for 3D lines is complete. Plücker coordinates (Section 2.2) are a complete, elegant and easy to understand line representation. On the other hand, they use six elements to represent four degrees of freedom, are defined up to a scale factor, and have an orthogonality constraint associated with their elements. Instead of using Plücker coordinates directly to represent 3D lines, we use the vector made up by stacking the direction and moment vectors, l̂ ∼ (d̂, m̂). In other words, two independent vectors are estimated, up to the same scale factor.

Good estimates of d̂ and m̂ yield small deviations from the orthogonality constraint. Moreover, it is possible to find orthogonal vectors d and m from their estimates using Gram-Schmidt orthogonalization [34], by finding the rotation matrix closest to (d̂′ m̂′ d̂′ × m̂′), where x′ = x/||x||, or by using the algorithm proposed by Bartoli and Sturm [35].
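As a small illustration of the first correction mentioned above, the following sketch (ours) applies one Gram-Schmidt step to the estimates (d̂, m̂) so that ⟨d, m⟩ = 0 holds exactly; this is only one of the options listed, not the Bartoli-Sturm algorithm.

```python
import numpy as np

def enforce_klein(d_hat, m_hat):
    """Project m_hat onto the complement of d_hat (one Gram-Schmidt step)."""
    d = d_hat
    m = m_hat - (m_hat @ d) / (d @ d) * d    # remove component along d
    return d, m

d_hat = np.array([1.0, 0.1, 0.0])
m_hat = np.array([0.05, -0.9, 0.4])          # slightly non-orthogonal estimate
d, m = enforce_klein(d_hat, m_hat)
assert np.isclose(d @ m, 0)                  # Klein quadric constraint holds
```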
There are several ways to estimate a non-injective function from a set of scattered data. We use the RBF interpolant described in Section 2.3,

$$s(\mathbf{x}) = a_0 + \mathbf{a}_x^T \mathbf{x} + \sum_{i=1}^{P} w_i\, \phi(\|\mathbf{x} - \mathbf{c}_i\|) \tag{8}$$

where {c_i}, x ∈ R^2. We can rewrite the previous equation in matrix form as

$$s(\mathbf{x}) = \begin{pmatrix} \boldsymbol{\phi}(\mathbf{x}) & \mathbf{p}(\mathbf{x}) \end{pmatrix} \underbrace{\begin{pmatrix} \mathbf{w} \\ \mathbf{a} \end{pmatrix}}_{\mathbf{h}_{wa}} \tag{9}$$

where φ(x) and p(x) are row vectors, with φ_i(x) = φ(||x − c_i||) and p(x) = (1 x_1 x_2).

The output of the vector-valued function is l̂ ∈ R^6, where each l̂_i is independent. Thus, we can use six independent RBF interpolants to form the vector-valued function s(x) = (s_1(x) s_2(x) ... s_6(x)), which can be rewritten as

$$\mathbf{s}(\mathbf{x}) = \begin{pmatrix} \boldsymbol{\phi}(\mathbf{x}) & \mathbf{p}(\mathbf{x}) \end{pmatrix} \underbrace{\begin{pmatrix} \mathbf{h}_{wa}^{(1)} & \mathbf{h}_{wa}^{(2)} & \ldots & \mathbf{h}_{wa}^{(6)} \end{pmatrix}}_{\mathbf{H}_{wa}}. \tag{10}$$

The vector-valued function is a row vector, s(x) ∈ R^{1×6}, which implies l̂(x) ∼ s^T(x). For a set {c_i}, a matrix H_wa and a given RBF, the estimates of the direction and moment vectors, as functions of the coordinates of the image points, are given by

$$\hat{\mathbf{l}}(\mathbf{x})^T \sim \begin{pmatrix} \hat{\mathbf{d}}(\mathbf{x})^T & \hat{\mathbf{m}}(\mathbf{x})^T \end{pmatrix} \sim \begin{pmatrix} \boldsymbol{\phi}(\mathbf{x}) & \mathbf{p}(\mathbf{x}) \end{pmatrix} \mathbf{H}_{wa}. \tag{11}$$

From the previous equation, it can be observed that for two different imaging systems using the same set of points {c_i}, the estimate of l̂ for an image point x only depends on the matrix H_wa; we therefore call H_wa the camera matrix. On the other hand, for the same imaging system, the values of the parameters of the camera matrix depend on the set {c_i}, which is why we call these points control points. Usually, in statistics, the set {c_i} is called centers. In computer vision, however, the word center in an imaging system typically designates the center of projection; as a result, we chose to name the set {c_i} control points.

For a set of P control points {c_i} defined a priori, and a camera matrix H_wa ∈ R^{(P+3)×6}, we define the general Smooth Camera Model by the vector-valued function s : R^2 ↦ R^6. The Smooth Camera Model only depends on the unknown matrix H_wa, for a set of previously defined control points. Therefore, the complete calibration of the Smooth Camera Model can be obtained by estimating the 6P + 18 unknown parameters that make up the camera matrix.

4.2 Linear Point-based Calibration

One disadvantage of the use of the General Camera Model is the difficulty of its calibration. Grossberg and Nayar [24] and Sturm and Ramalingam [26] define two different methods for the calibration. However, in both cases, the complete calibration is achieved only when there is a raxel for every pixel of the image. Note that the estimation of a single raxel requires at least two world points for each image pixel. Since the estimation of the raxels for every pixel is a difficult task, interpolation methods were used to estimate all the raxels from a subset of calibrated raxels (as described in Section 1.1). The goal in the calibration of a Smooth Camera Model is to calibrate the model without estimating each and every raxel while, at the same time, allowing the same resolution.

In the method described in this paper, the calibration procedure only requires a set of point correspondences {x_i ↦ p_i} (where {p_i} is the set of world points and {x_i} are their corresponding points in the image plane). Point correspondences that satisfy

$$\mathbf{x}_i \neq \mathbf{x}_j, \ \forall i \neq j \tag{12}$$

can be used. For each image pixel, only the coordinates of one 3D point are necessary. There are two sets of unknowns in the calibration procedure: the set of control points {c_i} and the elements of the camera matrix H_wa. Control points can be defined a priori, by selecting a set of P scattered image points, which may or may not be a subset of the data points, {c_i} ⊂ {x_i}. In the rest of this section, we describe a linear method to estimate the 6P + 18 parameters of the camera matrix, for a set of control points defined a priori.

4.2.1 Calibration Matrix M

World points incident on lines must verify Equation (3). From l̂(x) ∼ s^T(x) and Equation (3),

$$\hat{\mathbf{l}}(\mathbf{x})^T \mathbf{Q}(\mathbf{p})^T = \mathbf{s}(\mathbf{x})\,\mathbf{Q}(\mathbf{p})^T = \mathbf{0}. \tag{13}$$

Replacing s(x) in the previous equation using Equation (10),

$$\underbrace{\begin{pmatrix} \boldsymbol{\phi}(\mathbf{x}) & \mathbf{p}(\mathbf{x}) \end{pmatrix}}_{\mathbf{r}(\mathbf{x})} \mathbf{H}_{wa}\, \mathbf{Q}(\mathbf{p})^T = \mathbf{0} \tag{14}$$

where r(x) ∈ R^{1×(P+3)}. The unknowns are the elements of the matrix H_wa. Using the Kronecker product (Section 2.1), Equation (14) can be rewritten to isolate the unknown camera matrix:

$$\big(\mathbf{Q}(\mathbf{p}) \otimes \mathbf{r}(\mathbf{x})\big)\,\mathrm{vec}(\mathbf{H}_{wa}) = \mathbf{0} \tag{15}$$

where Q(p) ⊗ r(x) ∈ R^{4×(6P+18)} and vec(H_wa) ∈ R^{6P+18} is the stacking of the columns h_wa^{(i)}, for i = 1, ..., 6. For a set of N point correspondences {x_i ↦ p_i} of the same imaging system, (Q(p_i) ⊗ r(x_i)) vec(H_wa) = 0, and the calibration matrix can be built as

$$\mathbf{M} = \begin{pmatrix} \mathbf{Q}(\mathbf{p}_1) \otimes \mathbf{r}(\mathbf{x}_1) \\ \vdots \\ \mathbf{Q}(\mathbf{p}_N) \otimes \mathbf{r}(\mathbf{x}_N) \\ \mathbf{D} \end{pmatrix} \tag{16}$$

where M ∈ R^{(4N+18)×(6P+18)} and the matrix D ∈ R^{18×(6P+18)} is as in [36]. The calibration procedure is reduced to the estimation of the unknown camera matrix H_wa such that M vec(H_wa) = 0, which means

$$\mathrm{vec}(\mathbf{H}_{wa}) \in \mathrm{null}(\mathbf{M}) \tag{17}$$

where null(.) denotes the matrix null-space.
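The construction of Equations (14)-(17) can be sketched directly in NumPy. In the sketch below, Q_of is the Q(p) of Equation (3) (see the earlier sketch) and phi is the radial kernel; the side-condition block D is built under our reading of [36], namely Equation (6) applied to each of the six interpolants, so treat that block as an assumption rather than the exact matrix of [36].

```python
import numpy as np

def r_vec(x, C, phi):
    """Row vector r(x) = [phi(x) p(x)] of Equation (14); C: (P,2) controls."""
    return np.r_[phi(np.linalg.norm(C - x, axis=-1)), 1.0, x]

def calibrate(Xs, Ps, C, phi, Q_of):
    """Xs: image points, Ps: matched world points, C: control points."""
    P = len(C)
    blocks = [np.kron(Q_of(p), r_vec(x, C, phi)[None, :])     # Eq. (15)
              for x, p in zip(Xs, Ps)]
    # Side conditions of Eq. (6), applied per column of Hwa (our assumption):
    side = np.hstack([np.vstack([np.ones(P), C.T]), np.zeros((3, 3))])
    D = np.kron(np.eye(6), side)                              # 18 x (6P+18)
    M = np.vstack(blocks + [D])                               # Eq. (16)
    _, _, Vt = np.linalg.svd(M)
    h = Vt[-1]                       # (least-squares) null vector, Eq. (17)
    return h.reshape((P + 3, 6), order="F")                   # un-stack Hwa
```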
4.2.2 Computation of the Camera Matrix

From Equation (17), the estimate of the camera matrix must belong to the null-space of the calibration matrix M. To ensure a unique solution, and since vec(H_wa) ∈ R^{6P+18}, the dimension of the column space of M must be C(M) = 6P + 17, which means that the dimension of the null-space of M is N(M) = 1. From Equation (11), any solution for the camera matrix is defined up to a scale factor. Thus, assuming that N(M) = 1, any element of the one-dimensional null-space of M is a solution, except for the trivial solution vec(H_wa) = 0.

To prove that the dimension of the column space of M is C(M) = 6P + 17, we decompose Equation (16) into rows by means of the Kronecker product. Since permuting the rows of a matrix does not change the dimension of its column space, defining N = 2P, it is possible to rearrange the matrix M as a block of matrices similar to Γ (Equation (7)). It can be proved that, if a specific set of constraints is met, the matrices Γ are full rank, C(D) = 17 and C(M) = 6P + 17 (see the next section and [36]). From this result, and since C(Γ) = P + 3 implies P ≥ 3, we conclude that the minimal configuration that ensures a unique solution for the camera matrix corresponds to N = 6. An example of a calibration corresponding to the minimal case is shown in Figure 2.

4.2.3 Relationship between Control Points, Point Correspondences and the Radial Basis Function Used in the Calibration

In this section, we describe the constraints that must be met by {x_i} and {c_i} to obtain C(M) = 6P + 17. Two types of radial basis functions that can be applied to achieve that rank are also discussed. Let us assume that a set of point correspondences {x_i ↦ p_i}, for i = 1, ..., 2P and x_i ≠ x_j, ∀i ≠ j, is available, and that it can be split into two subsets such that

$$K^{(1)} = \{\mathbf{x}_i\} \ \text{for} \ i = 1, \ldots, P \quad \text{and} \quad K^{(2)} = \{\mathbf{x}_i\} \ \text{for} \ i = P+1, \ldots, 2P. \tag{18}$$

In [32] and [36] it is shown that only control points ({c_i}, for i = 1, ..., P) and data points ({x_i}, for i = 1, ..., 2P) that meet the condition

$$d < \epsilon q \tag{19}$$

can be considered. In Equation (19), d = min{d_1, d_2}, 0 < ε ≤ 1, d_1 = max{||x_i − c_i||} for {x_i} in K^{(1)}, d_2 = max{||x_i − c_i||} for {x_i} in K^{(2)}, and 2q = min_{j≠i}{||c_i − c_j||}. Note that (from [36]) if we consider {c_i} = K^{(1)} and if d_2 < εq, then we also obtain C(M) = 6P + 17. This solution also meets the constraint x_i ≠ x_j, ∀i ≠ j.

Quak et al. [32] proved that φ_1(r) = (γ_1² + r²)^{1/2} and φ_2(r) = exp(−γ_2 r²) are good choices for the radial basis functions, since, by choosing appropriate γ_1 and γ_2, the negative effects of small values of q and ε in Equation (19) can be reduced.
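A small sketch of the separation test of Equation (19), under our reading that the i-th data point of each subset is paired with the i-th control point (as in the case {c_i} = K^{(1)} noted above); this pairing is an assumption of the sketch, not stated explicitly in the text.

```python
import numpy as np

def satisfies_condition(K1, K2, C, eps=1.0):
    """K1, K2: (P,2) data subsets, C: (P,2) control points, Eq. (19)."""
    d1 = np.max(np.linalg.norm(K1 - C, axis=-1))   # max ||x_i - c_i||, subset 1
    d2 = np.max(np.linalg.norm(K2 - C, axis=-1))   # max ||x_i - c_i||, subset 2
    sep = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=-1)
    q = 0.5 * np.min(sep[~np.eye(len(C), dtype=bool)])  # 2q = min ||ci - cj||
    return min(d1, d2) < eps * q                   # condition d < eps * q
```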
4.3 Data Normalization

One of the issues in the method proposed in [28] is related to the shape parameters of the RBF: the choice of these values depends on the distribution of the set of image points {x_i}. To avoid this problem, a data normalization procedure for both image and world coordinates was developed. We consider non-isotropic normalization [37] for both image and world points, such that
• the centroid is at the origin;
• the principal moments are equal to unity.
While normalization of the image space coordinates is easy to implement, normalization of the coordinates of the set of world points {p_i} has additional implications: normalizing the world points in the calibration procedure implies a change in the parameters of the Smooth Camera Model. To account for this change, the inverse of the transformation has to be applied to the estimated parameters of the Smooth Camera Model.

4.3.1 Scaling the Coordinates of the Set of Image Points

For the normalization of the image space, an affine transformation

$$\mathbf{u}_i = a^{-1}(\mathbf{A}\mathbf{x}_i + \mathbf{a}) \tag{20}$$

is considered, where A ∈ R^{2×2} is upper triangular and a ∈ R^2. To obtain the normalization parameters {A, a, a}, an affine transformation based on the Cholesky factorization is used [34], [37]. Let us consider the homogeneous representation of the image coordinates x as x̄. Since the matrix Σ_{i=1}^{N} x̄_i x̄_i^T is symmetric and positive definite, using the Cholesky factorization one can write Σ_{i=1}^{N} x̄_i x̄_i^T = N K_1 K_1^T, where K_1 ∈ R^{3×3}. Developing this equation, we get Σ_{i=1}^{N} K_1^{-1} x̄_i x̄_i^T K_1^{-T} = N I. As a result, and using ū_i = K_1^{-1} x̄_i, one obtains Σ_{i=1}^{N} ū_i ū_i^T = N I, which means that the set {u_i} has its centroid at the origin of the coordinate system and the two principal moments of the set of points are equal to one. Since K_1 is upper triangular, K_1^{-1} is also upper triangular, and we can define

$$\mathbf{K}_1^{-1} = \begin{pmatrix} \mathbf{A} & \mathbf{a} \\ \mathbf{0} & a \end{pmatrix} \tag{21}$$

which contains the parameters of the affine transformation of Equation (20).

4.3.2 Scaling the Coordinates of the Set of World Points

To normalize the coordinates of the points in world space, a method similar to that of the previous section is applied. The coordinates of the set of world points {p_i} are changed into {q_i} such that

$$\mathbf{q}_i = b^{-1}(\mathbf{B}\mathbf{p}_i + \mathbf{b}) \quad \text{and} \quad \mathbf{K}_2^{-1} = \begin{pmatrix} \mathbf{B} & \mathbf{b} \\ \mathbf{0} & b \end{pmatrix} \tag{22}$$

where K_2^{-1} ∈ R^{4×4} is an upper triangular matrix (for points in 3D space), B ∈ R^{3×3} is upper triangular and b ∈ R^3. Note that this normalization has consequences: it changes the coordinates of the line space. The outputs of the function s are the coordinates of 3D lines, so one has to account for this change of coordinates: the line coordinates output by s have to be changed back into the original coordinate frame. Using the affine transformation defined in Equation (22) and the definition of Plücker coordinates, we define an affine transformation in line space as

$$\mathbf{l}^{(2)} \sim \underbrace{\begin{pmatrix} \mathbf{B} & \mathbf{0} \\ -b^{-1}[\mathbf{b}]_\times \mathbf{B} & b^{-1}\det(\mathbf{B})\,\mathbf{B}^{-T} \end{pmatrix}}_{\mathbf{E}} \mathbf{l}^{(1)} \tag{23}$$

(for the derivation of the matrix E, see the authors' web page). Note that B is non-singular, which means that the matrix E is invertible and l^{(1)} can be estimated such that l^{(1)} ∼ E^{-1} l^{(2)}.
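A minimal sketch of the normalization described above. NumPy's Cholesky routine returns a lower-triangular factor, so a reversal permutation is used here to obtain the upper-triangular K required by Equations (21) and (22); with the homogeneous coordinate as the last entry, the same routine serves for both image (D = 2) and world (D = 3) points.

```python
import numpy as np

def upper_cholesky(S):
    """Upper-triangular K with S = K K^T, via the reversal permutation."""
    J = np.eye(len(S))[::-1]
    return J @ np.linalg.cholesky(J @ S @ J) @ J

def normalize_points(X):
    """X: (N,D) points; returns normalized points and K (Eq. (20)/(22))."""
    Xh = np.hstack([X, np.ones((len(X), 1))])   # homogeneous rows
    K = upper_cholesky(Xh.T @ Xh / len(X))      # (1/N) sum x x^T = K K^T
    U = np.linalg.solve(K, Xh.T).T              # u_i = K^{-1} x_i
    return U[:, :-1] / U[:, -1:], K             # centroid 0, unit moments
```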
4.3.3 Camera Model and Calibration with Normalization

In this section we describe the algorithm for the calibration of the Smooth Camera Model using normalized data.

Step 1: Compute the matrices K_1 and K_2 using the methods described in Sections 4.3.1 and 4.3.2, for the sets of points {x_i} and {p_i} respectively. Compute the affine transformation parameters {A, a, a} and {B, b, b} from Equations (21) and (22). Compute the normalized coordinates {u_i} and {q_i}, using Equations (20) and (22). From the affine transformation, obtain the matrix E using Equation (23).

Step 2: Compute the calibration matrix M using Equation (16). Note that {u_i} and {q_i} should be used instead of {x_i} and {p_i}.

Step 3: Obtain the vec(H_wa) that minimizes M vec(H_wa), using the Singular Value Decomposition [3], and un-stack the vector vec(H_wa) into the matrix H_wa.

Step 4: The camera model with normalization is defined by a function s : R^2 ↦ R^6 where l̂ ∼ s(x), such that

$$\mathbf{s}(\mathbf{x}) = \mathbf{r}\big(a^{-1}(\mathbf{A}\mathbf{x} + \mathbf{a})\big)\,\mathbf{H}_{wa}\,\mathbf{E}^{-T} \tag{24}$$

where r : R^2 ↦ R^{1×(P+3)} is defined in Equation (14). Note that the coordinate system in which the line coordinates l̂(x) are represented is the same coordinate system in which the original set of world points {p_i} is represented.
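Putting Step 4 together, the following sketch evaluates Equation (24) at a single image point, assuming that K1, Hwa, E, the control points C and the kernel phi come from the calibration steps above; since s(x) is a row vector, transposing Equation (24) yields l̂ = E^{-1} Hwa^T r^T.

```python
import numpy as np

def line_at(x, K1, Hwa, E, C, phi):
    """Plücker coordinates (up to scale) of the ray at image point x."""
    u = np.linalg.solve(K1, np.r_[x, 1.0])      # normalized homogeneous coords
    u = u[:2] / u[2]                            # a^{-1}(A x + a), Eq. (20)
    r = np.r_[phi(np.linalg.norm(C - u, axis=-1)), 1.0, u]   # r(.), Eq. (14)
    return np.linalg.solve(E, Hwa.T @ r)        # back to the original frame
```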
5 Experiments

5.1 Results with Synthetic Data Sets

To evaluate the method described, synthetic data sets were generated. Using known camera models, 3D lines are generated and mapped into image points. The data set made up of pairings between 3D points and image points is obtained from the first data set (containing the coordinates of the 3D lines) by computing 3D points incident on the lines. The camera models used in the evaluation were: a catadioptric system with a quadric mirror (Figure 3(a)); a crossed-slits camera model (Figure 3(d)); and a refraction camera model, made up of a camera looking through a volume of water contained between two parallel planes (Figure 3(g)). Since the mappings between points in the image and 3D lines are known, the calibration results can be evaluated by the estimates of the distance errors (distances between lines) in line space. Note that the camera model is represented by a mapping between points in the image and 3D lines.

5.1.1 Evaluation Results Using Smooth Camera Models

Synthetic data sets were used to evaluate the effect of the number of control points on the error, defined as a distance between ground-truth 3D lines and estimated 3D lines. Multi-quadrics and exponentials were used as radial basis functions. The synthetic data sets were obtained for the following smooth camera models f : P^2 ↦ L^3: non-central catadioptric systems using quadric mirrors [9], the crossed-slits camera model [16] and a refraction-based camera model [19], [20].

The calibration method estimates an interpolant function s : R^2 ↦ R^6 that should fit an imaging model defined by a function f : P^2 ↦ L^3, where f is an analytic representation of the corresponding imaging model. f is used to generate a data set {x_i ↦ p_i}, for i = 1, ..., N, and P control points are selected from the set {x_i}, with the conditions described in Section 4.2.3 being met. Using f, a ground-truth data set {y_i ↦ l_i} is generated, where y_i are the ground-truth image coordinates and l_i the ground-truth 3D line coordinates. The sets {x_i ↦ p_i} and {c_i} are used to calibrate the camera model. Using the interpolant function s, the set {y_i ↦ l̂_i} is estimated, where l̂_i = s(y_i).

Fig. 3. Evaluation of the average line distance error, Equation (27), as a function of the number of control points, for non-central catadioptric systems with a quadric mirror, for a crossed-slits camera model and for a model including refraction (a camera looking through a volume of water contained between two parallel planes), (a), (d) and (g) respectively. For each camera model we evaluate the distance error defined in Equation (27), for Gaussian and multi-quadric radial basis functions; the errors are shown in panels (b)-(c), (e)-(f) and (h)-(i) respectively.

To evaluate the calibration method, the distances between l̂_i and l_i are used. To characterize the error, the average of the distances ε_i = d(l̂_i, l_i) is computed. Let us consider two lines represented by Plücker coordinates g and h, which can be approximated via a local mapping into a Euclidean 4-space, g̃ = (g_1, g_2, g_3, g_4) and h̃ = (h_1, h_2, h_3, h_4) [29], such that

$$\mathbf{g} = (g_3 - g_1,\ g_4 - g_2,\ 1,\ g_2,\ -g_1,\ g_1 g_4 - g_2 g_3) \tag{25}$$

$$\mathbf{h} = (h_3 - h_1,\ h_4 - h_2,\ 1,\ h_2,\ -h_1,\ h_1 h_4 - h_2 h_3). \tag{26}$$

The distance between the lines is estimated using

$$d(\tilde{\mathbf{g}}, \tilde{\mathbf{h}})^2 = \sum_{i=1}^{4} (g_i - h_i)^2 + (g_1 - h_1)(g_3 - h_3) + (g_2 - h_2)(g_4 - h_4). \tag{27}$$
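For completeness, a sketch of this distance (ours): invert the local map of Equations (25)-(26), which is valid where the third Plücker coordinate is non-zero, and apply Equation (27). The inversion formulas follow directly from Equation (25).

```python
import numpy as np

def line_to_r4(l):
    """(g1,g2,g3,g4) with l ~ (g3-g1, g4-g2, 1, g2, -g1, g1 g4 - g2 g3)."""
    v = l / l[2]                      # fix the scale of the Plücker vector
    g1, g2 = -v[4], v[3]
    return np.array([g1, g2, v[0] + g1, v[1] + g2])

def line_distance(g, h):
    """Distance of Equation (27) between two Plücker vectors g, h."""
    a = line_to_r4(g) - line_to_r4(h)
    d2 = a @ a + a[0] * a[2] + a[1] * a[3]
    return np.sqrt(d2)
```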
The results are evaluated by varying the number of control points, P, from 10 to 100. For each number of control points, the calibration is evaluated using the ground-truth data set {y_i ↦ l_i} for i = 1, ..., 120. For each set of control points, the calibration is repeated 150 times for different values of {x_i}, {y_i} and {p_i}, {l_i}, respectively. The average of the distance errors is computed for each number of control points.

The calibration results depend on the value of the shape parameter, in the case of either multi-quadric or Gaussian radial basis functions. By normalizing the data coordinates as described in Section 4.3.3, the value of the shape parameter is no longer dependent on the image size or on the spatial distribution of the image points. For a fixed value of the shape parameter, the error as a function of the spatial distribution of the set {x_i} was evaluated. For that purpose, and for each set {x_i} and {y_i}, the sets {x_i^{(j)}} and {y_i^{(j)}} were computed with

$$\mathbf{x}_i^{(j)} = \mathbf{x}_i\, \sigma_j \ \ \forall i \tag{28}$$

for j = 1, 2, 3, with σ_1 = 1.25, σ_2 = 1.75 and σ_3 = 2.5. The results are shown in Figure 3. Note that by using normalized coordinates the error variation does not depend on the image size; the distance error can therefore be represented by a single curve.

5.1.2 Results with Noise Added

The same evaluation procedure was repeated by adding Gaussian noise {μ_i} to the 3D coordinates of the points used in the calibration procedure, {p_i}. For that purpose, the standard deviation of the norm of the noise vector was defined as

$$\mathrm{std}(\|\boldsymbol{\mu}_i\|) = \alpha e \tag{29}$$

where e = min(||p_i − p_j||) for all i ≠ j. Instead of using the set {x_i ↦ p_i} to calibrate, the set {x_i ↦ p̃_i}, where p̃_i = p_i + μ_i for all i, was used. The errors are evaluated in line space (Equation (27)) for the three previously mentioned camera models, with values of α from zero to 0.20. Calibration was performed using normalized coordinates (Section 4.3.3) with P = 60. Results are shown in Figures 4(a) and 4(b) for Gaussian and multi-quadric RBFs, respectively.

Fig. 4. Results of tests performed with the three previously mentioned camera models (Figure 3), with noise added. We set the number of control points to 60 and vary the variance of the noise added to the coordinates of the 3D points used in the calibration procedure. The standard deviation of the noise is proportional to the smallest distance among all the world points of the calibration data set. (a) shows the distance error in line space for Gaussian RBFs and (b) for multi-quadrics. To reduce the errors, more points than the ones needed for the minimal solution can be used. We consider noise at a 10% rate of the smallest distance between the world points and vary β, where N = βP. The results are shown in (c) and (d) for Gaussian and multi-quadric RBFs respectively.

Note that in the previous experiments N = 2P was used, for P equal to the number of control points. If instead N = βP for β > 2, then the matrix M can become full column rank and, as a consequence, there is no exact solution for H_wa. However, an estimate of vec(H_wa) can still be obtained, in the least-squares sense [3]. The errors for β from 2 to 50 were also evaluated (measured as the distances between the ground-truth and the estimated 3D lines). For that purpose, P = 60 was used, and Gaussian noise with α = 0.10 (Equation (29)) was added. The results are shown in Figures 4(c) and 4(d) for Gaussian and multi-quadric RBFs, respectively.
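A sketch of how such perturbations can be generated. We assume isotropic Gaussian noise with its scale tied to the smallest inter-point distance e of Equation (29); this is a simple approximation of that noise model, not the exact generator used in the experiments.

```python
import numpy as np

def add_noise(Pts, alpha, rng=np.random.default_rng(0)):
    """Pts: (N,3) world points; perturbation scale alpha * e, cf. Eq. (29)."""
    diff = np.linalg.norm(Pts[:, None, :] - Pts[None, :, :], axis=-1)
    e = np.min(diff[diff > 0])          # e = min_{i != j} ||p_i - p_j||
    return Pts + alpha * e * rng.standard_normal(Pts.shape)
```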
5.1.3 Results Using Non-Smooth Camera Models

The approach described in this paper is based on the assumption that the camera model is smooth. Therefore, the method cannot be applied to general camera models whose line positions and orientations do not change smoothly. However, even for some of those camera models, the approach can still be applied, provided that errors at the discontinuities can be accepted. As an example, the calibration procedure was applied to a non-central camera model obtained by combining images acquired by four different perspective cameras with different poses (Figure 5(a)). The errors in 3D line space (Equation (27)), displayed as a function of the image coordinates, are shown in Figure 5(b). These results were obtained for P = 250 and multi-quadric RBFs. The errors are small over most of the global image, with the exception of the boundaries between the individual images.

Fig. 5. Results for a synthetically generated non-smooth camera model, made up by combining four images obtained from four different perspective cameras, as shown in (a). In (b), the errors corresponding to the distances between the 3D lines are shown.

To obtain results with smaller errors, the image points can be clustered based on the error distribution. To each cluster, a specific radial basis function can be assigned (based on the distribution of the control points). Each cluster can then be recalibrated individually.

5.1.4 Experimental Results Using 3D Data from a Single Surface

In this section we present the results of a calibration experiment where the 3D point data used for the calibration belong to a single surface. Synthetic data was obtained from a non-central catadioptric system made up of a quadric mirror (Figure 3(a)). The catadioptric system was used to generate the data for the calibration, where it was assumed that the correspondences between the world points and their images were known. We consider world points on a spherical surface. Note that the 3D points cannot belong to a single plane or to a single 3D line, since in those cases the theoretical conditions described in [36] are not satisfied and the null-space of the matrix M has dimension greater than 1.

The results are shown in Figure 6. The image points used to evaluate the calibration (ray estimation) are not the image points used in the calibration process. We use a non-minimal solution, with 40 control points and 360 data points, and the camera matrix is obtained in the least-squares sense [3]. From the experiments, lines are estimated with small error.

Fig. 6. (a) shows the image points (black squares) and control points (red circles) used in the calibration process. Their corresponding world points are shown in (b) (black points); we consider 3D points on a spherical surface. (c) shows the results, where the ground-truth lines are shown in black and the estimated lines in red. The image points used to generate the estimated lines in (c) are different from the image points used in the calibration, shown in (a).

5.2 Results with Data Sets of Real Images

For the experiments with real data sets, three different types of imaging systems were calibrated, using the model described in this article and point-based calibration: a perspective camera and two different catadioptric systems, shown in Figures 7(a), 7(d) and 7(g) respectively. Note that the catadioptric camera model with two planar mirrors is a non-smooth camera model. To acquire the point correspondences {x_i ↦ p_i}, which constitute the data necessary for the calibration method, a chess board was used. Infrared (IR) LEDs were attached to the chess board and their positions in the world were measured using an IR tracker [38]. This tracker has an accuracy of 0.1 mm and a resolution of 0.01 mm. Each corner of the chess board image, x_i, is associated with a position in the world, p_i, given by its corresponding position on the 3D chess board.
The set of associations {x_i ↦ p_i}, for i = 1, ..., 3840, was used in the calibration procedure. The set of image points {x_i} corresponds to the yellow points in Figures 7(b), 7(e) and 7(h); the set of world points {p_i} corresponds to the yellow points in Figures 7(c), 7(f) and 7(i). The control points {c_i} are chosen as a subset of {x_i}; they are shown as green points in Figures 7(b), 7(e) and 7(h). Using {x_i ↦ p_i}, the interpolant function s is estimated. Since in the real experiments N > 2P, the solution is obtained by solving an over-determined system of equations: the camera matrix is estimated as the least-squares solution of the homogeneous equations [3] (Section 4.3.3).

To evaluate the calibration, a different set of 3D coordinates, corresponding to a different object, was used. The IR tracker has a "test object" (also with LEDs) which is provided to enable the estimation of 3D coordinates of points. This object was used to generate a new data set {y_i ↦ w_i}, for i = 1, ..., 700, where y_i are image points and w_i are world points. This new data set was used to evaluate the calibration performed with the former data set. The distance error is defined by the geometric distance ε_i from the world point w_i to the estimated line l̂_i, where l̂_i = s(y_i). A subset of {y_i} and the corresponding lines {l̂_i} are shown as red squares in Figures 7(b), 7(e) and 7(h), and as red lines in Figures 7(c), 7(f) and 7(i), respectively. We derived the geometric distance between a line (in Plücker coordinates) and a point in the world as ε_i = ||ε_i||, where

$$\boldsymbol{\epsilon}_i = \begin{pmatrix} [\mathbf{w}_i]_\times & -\mathbf{I} \end{pmatrix} \mathbf{l}_i \tag{30}$$

and l_i = l̂_i / ||d̂_i||. For more information, see the authors' web page.
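A sketch of this error measure (Equation (30)): scale the estimated Plücker vector so that the direction has unit norm, then evaluate ||[w]_× d − m||.

```python
import numpy as np

def point_line_error(w, l_hat):
    """Geometric distance from world point w to the estimated line l_hat."""
    l = l_hat / np.linalg.norm(l_hat[:3])        # unit-direction normalization
    d, m = l[:3], l[3:]
    return np.linalg.norm(np.cross(w, d) - m)    # ||[w]_x d - m||, Eq. (30)
```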
The numbers of control points used in the calibration were 30, 35 and 40, for the imaging systems of Figures 7(a), 7(d) and 7(g) respectively. The results are shown in Table 1.

TABLE 1
Results for the experiments, using the IR tracker [38].

  RBF                 | Projective         | Sphere mirror    | Plane mirrors
  Gaussian [cm]       | 0.93058 ± 1.9783   | 2.8743 ± 2.1074  | 0.91795 ± 1.0608
  Multi-quadrics [cm] | 0.54711 ± 0.43567  | 1.8863 ± 1.2379  | 0.99492 ± 0.78778

Fig. 7. Results with real images, obtained using the method described, for three types of camera models: a perspective camera (a), a catadioptric system with a spherical mirror (d) and a catadioptric system with two planar mirrors (g). Yellow points in (b), (e) and (h) are the set of image coordinates {x_i}, and yellow points in (c), (f) and (i) are the corresponding set of world points {p_i} used in the calibration method. Red squares in the 2D plots and red rays in the 3D plots are a subset of {y_i ↦ l̂_i} obtained with multi-quadric RBFs. Green points in (b), (e) and (h) are the corresponding set of control points used in the calibration.

5.3 Using a Calibrated Perspective Camera to Acquire a Data Set with Real Data

To further evaluate the calibration method, an additional data set of real data was acquired. Consider an object or scene whose images are acquired by two perspective cameras. One of the cameras looks directly at the object/scene, whereas the second camera acquires images of the object by viewing it through a glass tank filled with water (Figure 8(a)). The goal is to calibrate the second imaging system, i.e., including the effect of viewing the object through the water-filled glass tank. An example of an image acquired by this system is shown in Figure 8(b).

A chess board is used to calibrate the system. Several images of the chess board were acquired, with the chess board in different positions and orientations. The direct images of the chess board, i.e., those acquired by the camera that views the scene directly (not through the glass tank), were used to estimate the coordinates of the planes in the world coordinate system. For that purpose the Bouguet calibration toolbox was used [39]. Eighteen different images of the chess board in different positions and orientations were used. In each chess board image there were 160 points whose coordinates could be used (these points are located at the intersections of the edges). The points are displayed in Figure 8(c) as yellow and blue dots. The coordinates of the points on the chess board are used as world points, forming the set {p_i^{(j)}}, where p_i^{(j)} are the world coordinates of the ith point of the jth chess board position. Since these points are also visible in the images acquired through the water-filled glass tank (i.e., including the effects of refraction), the images of the points on each chess board image, {x_i^{(j)}} (where x_i^{(j)} is the image coordinate of the ith point of the jth chess board position), are associated with the corresponding world coordinates {p_i^{(j)}}, forming the set of mappings {x_i^{(j)} ↦ p_i^{(j)}}.

To test the calibration method, the following procedure was used: for each specific chess board position and orientation j, all the data from the chess boards at the other positions and orientations was used as the calibration data set, and the coordinates of the points of the jth chess board position were used as ground truth. Therefore the jth chess board position provided the ground-truth data set {y_i ↦ w_i} (160 mappings between image and world points) and the points of the other chess board positions formed the calibration data set {x_i ↦ p_i} (2720 mappings between image and world points). For each y_i, the distances between the estimated lines l̂_i and the corresponding world points w_i are estimated using Equation (30). This procedure is repeated for all chess board positions j, and the means and standard deviations of the distance errors are computed, for both Gaussian and multi-quadric RBFs. The number of control points used was 10. The corresponding results were 0.111 ± 0.075 cm for multi-quadrics and 0.313 ± 0.545 cm for Gaussian RBFs, respectively. An example of the lines estimated using the method described (for a specific plane position) is shown in Figure 8(c), as red lines.

5.4 Removing Distortion: Results for a Spherical Catadioptric System

In general, undistorted views cannot be computed from images acquired by general non-central camera models (without additional constraints).
As mentioned by Swaminathan et al. [40], obtaining undistorted images from general non-central systems requires knowledge of the scene structure. Using the approach proposed by Swaminathan et al. (and the knowledge of the scene structure), we performed an experiment to estimate undistorted images from the non-central catadioptric camera with a spherical mirror. We used the same calibration process described in Section 5.3. The positions of the seventeen chess boards are known, and sixteen of these chess board positions are used to calibrate the camera model. We then obtained the undistorted image of the 17th plane: using the knowledge of the position of the plane not used in the calibration, and its image, the undistorted image was computed with a virtual perspective camera located at (0, 0, 0). The results are presented in Figure 9. As can be seen from Figures 9(a) and 9(b), the curved lines in the distorted image are mapped into straight lines in the undistorted image.

Fig. 8. (a) shows the set-up used to acquire the images of the data set. The goal is the calibration of an imaging system that views a scene through a water-filled glass tank. An example of an image acquired by this imaging system is shown in (b). A planar chess board was used. The data set is made up of the coordinates of the images of the points corresponding to the intersections of the chess board edges (seen through the tank). An example of the results is shown in (c): in yellow, all the 3D points used in the calibration procedure; in blue, the 3D points on which the estimated lines must be incident; the estimated lines are shown in red. The units in (c) are millimeters.

Fig. 9. (a) shows the original image obtained from a catadioptric camera system made up of a spherical mirror; (b) shows the undistorted image.
Fig. 9. In Figure (a) we show the original image obtained from a catadioptric camera system built with a spherical mirror. In Figure (b) we show the undistorted image.

6 CONCLUSIONS

6.1 Discussion

The experiments and results with synthetic data were especially relevant for the goals of this paper. They were particularly important to evaluate the effect of the number of control points, as well as the sensitivity of the method to noise, by means of a fully specified data set. Considering the experimental results obtained without noise (Figure 3), those obtained using the Gaussian RBF are slightly better than those obtained using multi-quadrics. However, the multi-quadric RBF tends to be less sensitive to image scale variation. The experimental results also show that the inclusion of the coordinate normalization procedure (Section 4.3.3) tends to consistently yield better results, as expected. When noise is added to the synthetic data, and as expected, the use of least-squares estimates yields better results than those obtained by applying the theoretical condition $N = 2P$.

In some cases the calibration is affected by errors of up to 60 units. These are extreme cases, and we discuss the reasons why they occur. In one case (Figure 3) these errors correspond to results obtained with a very small number of control points (10). Note that this number of control points is small for the specific camera model (for other systems it might be adequate); in that case, therefore, the reason for the large error is the small number of control points. In the case of Figure 4, these values are obtained for noise levels with standard deviations of up to 20 units; they are presented to show the performance of the model in very special (and unrealistic) conditions. Experiments with synthetic data also show that this method can be applied to the calibration of non-smooth camera models, provided the discontinuities in the data are limited. To conclude the synthetic experiments, we performed an experiment in which (with appropriate choices for the interpolant parameters) a camera model was calibrated using world points lying on a single surface (Figure 6). These experiments show that the method can be used without explicitly accounting for the collinearity constraints.

Experiments with real data sets were also performed, to evaluate the behavior of the method under real conditions. The results shown in Figures 7 and 8 display the expected behavior of the 3D line estimates obtained under the assumption of a Smooth Camera Model. In addition, we also show distortion correction results for a non-central catadioptric camera, using the proposed camera model.

6.2 Closure

The proposed Smooth Camera Model and calibration procedure can be used to calibrate complex smooth camera models, namely in the case of cameras for which no analytical model exists. Although this model is intended for smooth imaging systems, it can also be used in special cases of non-smooth imaging systems. This approach can model a camera with significantly fewer parameters than the discrete General Camera Model, and the number of parameters does not depend on the image size: instead of the $7NM$ parameters required by the discrete General Camera Model for an $M \times N$ image, this approach only requires $6(P + 3)$ parameters for $P$ control points.
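To give a rough sense of the scale of this reduction (the image size and number of control points below are illustrative choices, not values taken from the experiments), consider a $1000 \times 1000$ image and $P = 100$ control points:

$$
7NM = 7 \times 1000 \times 1000 = 7{,}000{,}000,
\qquad
6(P + 3) = 6 \times 103 = 618.
$$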
The calibration procedure only requires the 3D coordinates of a single world point for each image point, whereas previous approaches require two or more 3D points for each image point. On the other hand, the calibration parameters, which are estimated for a subset of image points, are implicitly generalized for all image pixels, which constitutes an important advantage of this method.

REFERENCES

[1] R. Y. Tsai, “An Efficient and Accurate Camera Calibration Technique for 3D Machine Vision,” In CVPR, 1986.
[2] Z. Zhang, “A Flexible New Technique for Camera Calibration,” IEEE Trans. Pattern Analysis and Machine Intelligence, 2000.
[3] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision (2nd ed.). Cambridge, UK: Cambridge University Press, 2000.
[4] S. Baker and S. K. Nayar, “A Theory of Single–Viewpoint Catadioptric Image Formation,” International Journal of Computer Vision, 1999.
[5] S. Gasparini, P. Sturm, and J. P. Barreto, “Plane–Based Calibration of Central Catadioptric Cameras,” In ICCV, 2009.
[6] R. Hartley and S. B. Kang, “Parameter–free Radial Distortion Correction with Centre of Distortion Estimation,” In ICCV, 2005.
[7] D. Nistér, H. Stewénius, and E. Grossmann, “Non–Parametric Self–Calibration,” In ICCV, 2005.
[8] S. Ramalingam, P. Sturm, and S. K. Lodha, “Generic self-calibration of central cameras,” Computer Vision and Image Understanding, 2010.
[9] N. Goncalves, “Noncentral Catadioptric Systems with Quadric Mirrors, Geometry and Calibration,” Ph.D. dissertation, University of Coimbra, October 2008.
[10] A. Agrawal, Y. Taguchi, and S. Ramalingam, “Analytical Forward Projection for Axial Non–Central Dioptric & Catadioptric Cameras,” In ECCV, 2010.
[11] R. Swaminathan, M. D. Grossberg, and S. K. Nayar, “Non–Single Viewpoint Catadioptric Cameras: Geometry and Analysis,” International Journal of Computer Vision, 2006.
[12] B. Micusik and T. Pajdla, “Autocalibration & 3D Reconstruction with Non–Central Catadioptric Cameras,” In CVPR, 2004.
[13] J. Ponce, “What is a Camera?” In CVPR, 2009.
[14] R. Gupta and R. I. Hartley, “Linear Pushbroom Cameras,” IEEE Trans. Pattern Analysis and Machine Intelligence, 1997.
[15] J. Yu and L. McMillan, “General Linear Cameras,” In ECCV, 2004.
[16] A. Zomet, D. Feldman, S. Peleg, and D. Weinshall, “Mosaicing New Views: The Crossed–Slits Projection,” IEEE Trans. Pattern Analysis and Machine Intelligence, 2003.
[17] S. Hu and X. Feng, “Calibration of Panorama Based on Linear CCD and Fisheye Lens,” In CISP, 2009.
[18] J. Kannala and S. S. Brandt, “A Generic Camera Model and Calibration Method for Conventional, Wide–Angle, and Fish-Eye Lenses,” IEEE Trans. Pattern Analysis and Machine Intelligence, 2006.
[19] T. Treibitz, Y. Y. Schechner, and H. Singh, “Flat Refractive Geometry,” In CVPR, 2008.
[20] K. N. Kutulakos and E. Steger, “A Theory of Refractive and Specular 3D Shape by Light–Path Triangulation,” International Journal of Computer Vision, 2008.
[21] P. Sturm, S. Ramalingam, J.-P. Tardif, S. Gasparini, and J. Barreto, “Camera Models and Fundamental Concepts Used in Geometric Computer Vision,” Foundations and Trends in Computer Graphics and Vision, 2011.
[22] M. Menem and T. Pajdla, “Constraints on perspective images and circular panoramas,” In BMVC, 2004.
[23] S. Thirthala and M. Pollefeys, “Radial Multi-focal Tensors – Applications to Omnidirectional Camera Calibration,” International Journal of Computer Vision, 2012.
[24] M. D. Grossberg and S. K. Nayar, “A General Imaging Model and a Method for Finding its Parameters,” In ICCV, 2001.
[25] ——, “The Raxel Imaging Model and Ray–Based Calibration,” International Journal of Computer Vision, 2005.
[26] P. Sturm and S. Ramalingam, “A Generic Concept for Camera Calibration,” In ECCV, 2004.
[27] S. Ramalingam, P. Sturm, and S. K. Lodha, “Towards Complete Generic Camera Calibration,” In CVPR, 2005.
[28] P. Miraldo, H. Araujo, and J. Queiró, “Point–based Calibration Using a Parametric Representation of the General Imaging Model,” In ICCV, 2011.
[29] H. Pottmann and J. Wallner, Computational Line Geometry. Berlin: Springer–Verlag, 2001.
[30] M. Buhmann, Radial Basis Functions: Theory and Implementations. Cambridge, UK: Cambridge University Press, 2003.
[31] H. Wendland, Scattered Data Approximation. Cambridge, UK: Cambridge University Press, 2005.
[32] E. Quak, N. Sivakumar, and J. D. Ward, “Least Squares Approximation by Radial Functions,” SIAM Journal on Mathematical Analysis, 1993.
[33] N. Sivakumar and J. D. Ward, “On the least squares fit by radial functions to multidimensional scattered data,” Numerische Mathematik, 1993.
[34] G. H. Golub and C. F. Van Loan, Matrix Computations (3rd ed.). Baltimore, MD, USA: Johns Hopkins University Press, 1996.
[35] A. Bartoli and P. Sturm, “Structure–From–Motion Using Lines: Representation, Triangulation and Bundle Adjustment,” Computer Vision and Image Understanding, 2005.
[36] P. Miraldo, H. Araujo, and J. Queiró, “Unique Solution for the Estimation of the Plücker Coordinates Using Radial Basis Functions,” University of Coimbra, Tech. Rep., 2011. [Online]. Available: http://hdl.handle.net/10316/19078
[37] R. Hartley, “In Defense of the Eight–Point Algorithm,” IEEE Trans. Pattern Analysis and Machine Intelligence, 1997.
[38] Northern Digital Incorporated, “Optotrak Certus Motion Capture System,” 2009, www.ndigital.com.
[39] J.-Y. Bouguet, Camera Calibration Toolbox for Matlab, 2010 (last update). [Online]. Available: http://www.vision.caltech.edu/bouguetj/calib_doc
[40] R. Swaminathan, M. D. Grossberg, and S. K. Nayar, “A Perspective on Distortions,” In CVPR, 2003.

Pedro Miraldo received the Master's degree in Electrical and Computer Engineering from the University of Coimbra in 2008. He is currently a researcher and PhD student at the Institute for Systems and Robotics, University of Coimbra, and his work focuses on general camera models.

Helder Araujo is a Professor in the Department of Electrical and Computer Engineering of the University of Coimbra, Portugal, and also an Invited Professor at Blaise Pascal University, Clermont-Ferrand, France. He is a researcher at the Institute for Systems and Robotics, Coimbra. His research interests include computer and robot vision, robot navigation and localization, and sensor modeling for robot navigation and localization.