
16720 Computer Vision: Homework 5

3D Reconstruction.
Instructor: Martial Hebert
TAs: Varun Ramakrishna and Tomas Simon
Due Date: October 24th, 2011.

Update Nov 11th: (Code changes) Changed ransacF to use more appropriate thresholds
for the new images and added final call to eightpoint with inliers. Changed triangulate
input / output format to match text. Added temple/some_corresp.mat that you can
use to check your F matrix.
Update Nov 9th: in sevenpoint_norm, return only one F matrix (not a cell array).

Figure 1: Two novel views of an object reconstructed in 3D from two images.

In this assignment you will first explore different methods of estimating the fundamental
matrix given a pair of images. Then, you will compute the camera matrices and
triangulate the 2D points to obtain the 3D scene structure.

1 Fundamental matrix estimation


The fundamental matrix completely describes the epipolar geometry of two images. In
this section, you will recover the fundamental matrix F from a set of point correspondences
in two images. In the temple/ directory, you will find two images from the Middlebury
multi-view dataset. The Middlebury dataset is used to evaluate the performance
of modern 3D reconstruction algorithms.

The Eight Point Algorithm


The 8-point algorithm discussed in class (Geometry of Multiple Views lecture notes,
page 16) and outlined in Section 10.1 of Forsyth & Ponce is arguably the simplest
method for estimating the fundamental matrix. For this section, you are supposed to
manually select point correspondences in an image pair using cpselect, and pass them
to your implementation of the 8-point algorithm. (Note: you can automate this process
by using the Harris corner detector and the matching functions we provided (see later
sections); however, it is suggested you first test your code by picking the correspondences
manually).
Q1.1 (10 points) Submit a function with the following signature for this portion
of the assignment:

function F = eightpoint(p1,p2);

where p1 and p2 are each N × 2 matrices, with the first column of each corresponding
to x coordinates and the second column to y coordinates (from cpselect).
Your function should minimize the epipolar constraint: p_im2^T F p_im1 = 0. You should test your code
on hand-selected correspondences for each image pair. Remember that the x-coordinate
of a point in the image is its column entry, and y-coordinate is the row entry. Note:
8-point is just a figurative name. Your algorithm should use an over-determined system.
To visualize the correctness of your estimated F, use the function displayEpipolarF.
You should scale the data as was discussed in class, by dividing each coordinate by
the image size M. After computing F, you will have to unscale the fundamental matrix.
Hints: if x_normalized = T x, then F_unnormalized = T^T F T.
Also save this F matrix to the mat file q1_1.mat (save q1_1 F p1 p2;).
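The scale-then-unscale pipeline above can be sketched as follows. This is an illustrative NumPy version (the assignment itself is in MATLAB, and the required signature takes only p1 and p2; the extra M argument and function body here are assumptions for the sketch, not the provided or required code):

```python
import numpy as np

def eightpoint(p1, p2, M):
    """Normalized eight-point sketch. p1, p2: N x 2 corresponding points;
    M: scaling factor (e.g. the image size). Returns a 3 x 3 F."""
    # Scale coordinates into roughly [0, 1] for numerical stability.
    T = np.diag([1.0 / M, 1.0 / M, 1.0])
    x1, y1 = p1[:, 0] / M, p1[:, 1] / M
    x2, y2 = p2[:, 0] / M, p2[:, 1] / M
    # Each correspondence contributes one row of the homogeneous system A f = 0.
    A = np.column_stack([x2 * x1, x2 * y1, x2, y2 * x1, y2 * y1, y2,
                         x1, y1, np.ones_like(x1)])
    # Least-squares null vector: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    # Unscale: F_unnormalized = T^T F T.
    return T.T @ F @ T
```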

The Seven Point Algorithm


Q1.2 (20 points) Since the fundamental matrix only has seven degrees of freedom, it
is possible to calculate F using only seven point correspondences; this requires solving a
polynomial equation.

In this section, you will implement the Seven Point Algorithm described in class and
outlined in Section 15.6 of Forsyth and Ponce, and on p. 350 of Szeliski. Use cpselect
to obtain 7 point correspondences this time. The function should have the signature:

function F = sevenpoint_norm(p1, p2, m)

where p1 and p2 are 7-by-2 matrices containing the correspondences, m is the normalizer
(use the max of the image's length and width), and F is a 3-by-3 fundamental matrix.
Note that in general, it is possible to obtain from 1 to 3 solutions for the fundamental
matrix from the polynomial equation. For the purposes of this assignment, just return
one of the solutions.
Verify your solution for F by using displayEpipolarF to visualize F. In your answer
sheet: write your recovered F and print an output of displayEpipolarF.
Also save this F matrix and your correspondences to the mat file q1_2.mat (save q1_2 F p1 p2;).
Hints: Use m to normalize the point values to the range [0, 1], and remember to
unnormalize your computed F afterwards; if x_normalized = T x, then F_unnormalized = T^T F T.
You can use roots. The epipolar lines may not match exactly due to the TAs'
imperfectly selected correspondences.
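The pencil-of-matrices idea behind the seven-point solver can be sketched in NumPy as follows. This is a hypothetical helper (not the required sevenpoint_norm); normalization by m is omitted and assumed already done by the caller, and all solutions are returned rather than one:

```python
import numpy as np

def sevenpoint(p1, p2):
    """p1, p2: 7 x 2 point correspondences (assumed already normalized).
    Returns the list of candidate fundamental matrices."""
    x1, y1, x2, y2 = p1[:, 0], p1[:, 1], p2[:, 0], p2[:, 1]
    A = np.column_stack([x2 * x1, x2 * y1, x2, y2 * x1, y2 * y1, y2,
                         x1, y1, np.ones(7)])
    # The null space of the 7 x 9 system is two-dimensional: F = a*F1 + (1-a)*F2.
    _, _, Vt = np.linalg.svd(A)
    F1, F2 = Vt[-1].reshape(3, 3), Vt[-2].reshape(3, 3)
    # det(a*F1 + (1-a)*F2) is a cubic in a; recover it from four samples and solve.
    det = lambda a: np.linalg.det(a * F1 + (1 - a) * F2)
    samples = [0.0, 1.0, 2.0, 3.0]
    coeffs = np.polyfit(samples, [det(a) for a in samples], 3)
    sols = [r.real for r in np.roots(coeffs) if abs(r.imag) < 1e-6]
    return [a * F1 + (1 - a) * F2 for a in sols]
```

With exact data this yields one to three real solutions, matching the handout's note that the polynomial can have up to three roots.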

Automatic Computation of F
In some real-world applications, manually determining correspondences is infeasible.
Fortunately, there are methods for estimating the epipolar geometry between two im-
ages automatically. In this section you will implement a popular automatic algorithm,
RANSAC (RANdom SAmple Consensus). (The RANSAC algorithm is described in the
Geometry of Multiple Views lecture notes, page 20, Section 15.5.2 of Forsyth and
Ponce, and Section 6.1.4 of Szeliski.) We have provided a function for determining the candidate
correspondences:

[f1_x, f1_y, f2_x, f2_y] = getCandidateCorrespondences(im1, im2)

This function finds features in both images and then finds potential correspondences
by finding the smallest Sum of Absolute Differences (SAD) between feature windows.
It returns those features which matched best when comparing their windows in either
direction (i.e. the minimum SAD match was the same whether comparing a feature
window in im1 to the features in im2 or vice versa). This function relies on our
implementation of the Harris interest point detector, for which we have also provided the
code. You are encouraged to look through each of these functions, as both contain fairly
good examples of vectorization for efficient MATLAB implementations. Make sure to
convert the images to grayscale before calling.
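The mutual-consistency ("either direction") check described above can be sketched as follows. This is an illustrative NumPy version operating on flattened feature windows, not the provided implementation, and the function name is hypothetical:

```python
import numpy as np

def mutual_sad_matches(desc1, desc2):
    """desc1: N x d, desc2: M x d flattened feature windows.
    Returns (i, j) index pairs that are each other's minimum-SAD match."""
    # Pairwise sum of absolute differences between every window pair.
    sad = np.abs(desc1[:, None, :] - desc2[None, :, :]).sum(axis=2)
    best12 = sad.argmin(axis=1)   # best im2 match for each im1 feature
    best21 = sad.argmin(axis=0)   # best im1 match for each im2 feature
    # Keep only mutually consistent (left-right checked) matches.
    return [(i, j) for i, j in enumerate(best12) if best21[j] == i]
```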
Now, given the list of candidate correspondences from the function above, RANSAC
should choose 7 of these randomly, estimate F based on those correspondences, and
evaluate the quality of F . As was discussed in class, estimating the quality of F is done
by counting the total number of inliers among the correspondences. In other words,

count how many of the points in the second image lie on the epipolar line of their
corresponding point from the first image, to within some threshold that you choose.
If the quality of F improves, you keep it. This process repeats (for several hundred
iterations) and the best F is returned at the end.
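For reference, the loop described above might look like the following NumPy sketch. It is not the provided ransacF: for brevity it samples eight points and refits with a plain linear solve rather than the seven-point algorithm, it assumes the input coordinates are already normalized (so the threshold is in normalized units), and all names are hypothetical:

```python
import numpy as np

def fit_f_linear(p1, p2):
    """Minimal linear solve for F from >= 8 correspondences
    (coordinate normalization and rank-2 enforcement omitted for brevity)."""
    x1, y1, x2, y2 = p1[:, 0], p1[:, 1], p2[:, 0], p2[:, 1]
    A = np.column_stack([x2 * x1, x2 * y1, x2, y2 * x1, y2 * y1, y2,
                         x1, y1, np.ones(len(p1))])
    return np.linalg.svd(A)[2][-1].reshape(3, 3)

def epipolar_dist(F, p1, p2):
    """Distance from each point in p2 to the epipolar line of its partner in p1."""
    p1h = np.column_stack([p1, np.ones(len(p1))])
    p2h = np.column_stack([p2, np.ones(len(p2))])
    lines = p1h @ F.T   # rows are epipolar lines [a, b, c] in image 2
    return np.abs(np.sum(p2h * lines, axis=1)) / np.hypot(lines[:, 0], lines[:, 1])

def ransac_f(p1, p2, n_iters=500, tol=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(p1), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(p1), 8, replace=False)   # random minimal sample
        F = fit_f_linear(p1[idx], p2[idx])
        inliers = epipolar_dist(F, p1, p2) < tol      # count consensus
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final refit on all inliers of the best model, as the updated ransacF does.
    return fit_f_linear(p1[best_inliers], p2[best_inliers]), best_inliers
```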
Since this is very similar to what you did in the previous assignment, we have provided
an implementation of the above algorithm with the signature:

function [F] = ransacF(im1, im2)

Note that the actual location of the epipoles for the F found by your code may be fairly
unstable (i.e. between two runs, you may see some variation in the epipole location, or
equivalently the angles of the epipolar lines). All that we are concerned with for this
assignment is that a corresponding point in the second image for a given point in the
first image does indeed lie on the epipolar line. This should be true regardless of the
variations in the epipoles' positions. (Also, feel free to tweak this function.)
Q1.3 (10 points) In your answer sheet: write your best F and print an
output of displayEpipolarF. Also save this F matrix to the mat file q1_3.mat

2 3D Reconstruction
Metric Reconstruction
The fundamental matrix can be used to determine the camera matrices of the two views
used to generate F . In general, M1 and M2 will be projective camera matrices, i.e.,
they are related to each other by a projective transform. In order to obtain the 3D
structure of the scene, we would need to convert these camera matrices to Euclidean
camera matrices. The Euclidean camera matrices encode the overall scene structure up
to a similarity transform, such that the reconstructed 3D structure and the true 3D
structure differ from each other by a global rotation, translation and scaling.
To obtain the Euclidean scene structure, first we convert the fundamental matrix
F to an essential matrix E. Examine the lecture notes and the textbook to find out
how to convert a fundamental matrix to an essential matrix when the internal camera
calibration matrices K1 and K2 are known; these are provided in intrinsics.mat.
Q2.1 (10 points) Write a function to compute the essential matrix E given F and
K1 and K2 with the signature:

function E = essentialMatrix(F, K1, K2)

In your answer sheet: write your estimated E using F from Q1.3 (ransac).
Also save this E matrix to the mat file q2_1.mat. (You may want to debug it on the
manual correspondences if ransac doesn't return a correct F.)
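With the convention p_im2^T F p_im1 = 0 and calibrated coordinates x_n = K^{-1} x, the standard conversion is E = K2^T F K1. A minimal sketch (illustrative Python; the assignment itself is in MATLAB):

```python
import numpy as np

def essential_matrix(F, K1, K2):
    """E relates calibrated coordinates x2n^T E x1n = 0, with xin = Ki^{-1} xi,
    so substituting into x2^T F x1 = 0 gives E = K2^T F K1."""
    return K2.T @ F @ K1
```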

Given an essential matrix, it is possible to retrieve the camera matrices from it.
Assuming M1 is fixed at [I, 0], M2 can be retrieved up to a scale and four-fold rotation

ambiguity; this is in contrast to the projective ambiguity from the fundamental matrix.
For details on recovering M2 , see 7.2 in Szeliski.
To make this assignment more tolerable, this code has been provided. The function:

function [M2s] = camera2(E)

calculates four possible locations for the second camera, assuming the first camera
is fixed at M1 = [I,0]. An essential matrix can be decomposed into four different pairs
of camera matrices; however, only the correct pair of camera matrices would give you
triangulated 3D points in front of both the cameras, see the next section for triangulation
details. Once you have implemented the triangulation function, you can test the 4
different camera matrices in M2s to determine the correct M2.
As a sanity check, to ensure you are on the right track, you can compare your M2
with the TAs calculation (your result may be slightly different):

     [ 0.9994   0.0315   0.0109   0.0341 ]
M2 = [ 0.0332   0.9669   0.2530   1.0000 ]          (1)
     [ 0.0026   0.2532   0.9674   0.0573 ]
The following function to triangulate a set of points, has been provided for you:

function P = triangulate(M1, p1, M2, p2)

where p1 and p2 are N × 2 matrices with the 2-D image coordinates and P is an N × 3
matrix with the corresponding 3-D points, one per row. M1 and M2 are the 3 × 4 camera
matrices. Remember that you will need to multiply the given internal camera calibration
matrices to your obtained solution for the canonical camera matrices to obtain the final
camera matrices.
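The provided triangulate function is a black box here, but a standard linear (DLT) triangulation consistent with the described interface can be sketched as follows (an illustrative NumPy version; the provided MATLAB implementation may differ internally):

```python
import numpy as np

def triangulate(M1, p1, M2, p2):
    """Linear (DLT) triangulation. M1, M2: 3 x 4 camera matrices;
    p1, p2: N x 2 image points. Returns an N x 3 array of 3-D points."""
    P = []
    for (x1, y1), (x2, y2) in zip(p1, p2):
        # Each view contributes two rows of the homogeneous system A X = 0.
        A = np.vstack([x1 * M1[2] - M1[0],
                       y1 * M1[2] - M1[1],
                       x2 * M2[2] - M2[0],
                       y2 * M2[2] - M2[1]])
        X = np.linalg.svd(A)[2][-1]   # null vector = homogeneous 3-D point
        P.append(X[:3] / X[3])
    return np.array(P)
```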
Q2.2 (10 points) Write a script findM2.m to obtain the correct M2 from M2s by
testing the four solutions through triangulation. You can use your own correspondences
or the correspondences from temple/some_corresp.mat as 2D points p1,p2. Save the
correct M2 and 2D points p1,p2 and 3D points P to q2_2.mat (save q2_2 M2 P p1 p2;).

3 3D Visualization
To culminate this project, you will now create a 3D rendering of the temple images. By
treating our two images as a stereo-pair, we can triangulate corresponding points in each
image, and render their 3D locations.
Q3.1 (10 points) Write the function:

function [x2, y2] = epipolarCorrespondence(im1, im2, F, x1, y1)

This function takes in the x and y coordinates of a pixel on im1, and your fundamental
matrix F, and returns the coordinates of the pixel on im2 which correspond to the input
point. The supplied function getCandidateCorrespondences compares small windows

around feature points to look for correspondences. However, it yields many outliers, since
it searches over the entire image for matches. Fortunately, we know the fundamental
matrix F. Therefore, instead of searching for our matching point at every possible
location in im2, we can simply search over the set of pixels that lie along the epipolar
line (recall that the point in im2 corresponding to (x1, y1) in im1 must lie on this
epipolar line). Implementation hints:

- Experiment with various window sizes.

- It may help to use a Gaussian weighting of the window, so that the center has
greater influence than the periphery.

- Since the two images only differ by a small amount, it might be beneficial to
consider matches for which the distance from (x1, y1) to (x2, y2) is small.
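Putting these hints together, a bare-bones version of the search might look like the following NumPy sketch. It omits the Gaussian weighting and the distance prior, assumes grayscale float images and a non-vertical epipolar line, and uses hypothetical names rather than the required function:

```python
import numpy as np

def epipolar_correspondence(im1, im2, F, x1, y1, half=5):
    """Search along the epipolar line in im2 for the window that best
    matches (by SAD) the window around (x1, y1) in im1."""
    h, w = im2.shape
    patch1 = im1[y1 - half:y1 + half + 1, x1 - half:x1 + half + 1]
    a, b, c = F @ np.array([x1, y1, 1.0])   # epipolar line a*x + b*y + c = 0
    best, best_err = None, np.inf
    for x2 in range(half, w - half):
        y2 = int(round(-(a * x2 + c) / b))  # y on the line (assumes b != 0)
        if not (half <= y2 < h - half):
            continue
        patch2 = im2[y2 - half:y2 + half + 1, x2 - half:x2 + half + 1]
        err = np.abs(patch1 - patch2).sum()  # SAD between the two windows
        if err < best_err:
            best, best_err = (x2, y2), err
    return best
```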

To test your function epipolarCorrespondence, we have included a GUI:

function [coordsIM1, coordsIM2] = epipolarMatchGUI(im1, im2, F)

This script allows you to click on a point in im1, and will use your function to display
the corresponding point in im2. The process repeats until you right-click in the figure,
and the sets of matches will be returned.
Included in hw5.zip is a file templeCoords.mat, which contains 288 hand-selected
points from im1, saved in the variables x1 and y1. Use your function epipolarCorrespondence
to calculate the corresponding points x2 and y2 in im2.
Now, we can determine the 3D location of these point correspondences using the
triangulate function. These 3D point locations can then be plotted using the MATLAB
function scatter3. The resulting figure can be rotated using the Rotate 3D tool, which
can be accessed through the figure menubar.
Please take a few screenshots of the 3D visualization, and include them with your
homework submission.

4 Theory Questions : Epipolar Geometry


Q1 Consider the case of two cameras^1 (assume they have the same intrinsic parameter
matrix, i.e., K1 = K2) viewing an object such that the second camera differs from
the first by a pure translation that is parallel to the X-axis. Show that the epipolar lines
in the two cameras are also parallel to the x-axis.

Q2 Suppose that a camera views an object and its reflection in a plane mirror. Show
that this situation is equivalent to two views of the object, and that the fundamental
matrix is skew-symmetric. Hint: (The matrix of a reflection about a plane with normal n
passing through the origin is I − 2nn^T.)
^1 You can use the fact that the fundamental matrix between two cameras given by P1 = K1 [I|0] and
P2 = K2 [R|t] is given by F = [K1 t]_× K1 R K2^{-1}.
