
Note) Slides and example code are available:

https://github.com/mint-lab/3dv_tutorial

An Invitation to 3D Vision:
Two-View Geometry

Sunglok Choi, Assistant Professor, Ph.D.


Computer Science and Engineering Department, SEOULTECH
sunglok@seoultech.ac.kr | https://mint-lab.github.io/
Review) Absolute Camera Pose Estimation

▪ Example) Pose estimation (book) + camera calibration – initially given K [pose_estimation_book3.py]

Offline preparation:
- Load the cover image
- Extract ORB features
- Build 3D object points 𝐗1, 𝐗2, …

For each input image:
- Grab an input image
- Extract ORB features 𝐱1, 𝐱2, …
- Match them to the object features
↓ Pass tentative matches
[Outlier Rejection]
- Remove wrong matches using cv::findHomography() (free from K)
↓ Pass correct matches
[Camera Pose Estimation]
- Estimate camera parameters using cv::calibrateCamera()
  (calibration is included due to the possibly wrong initial K)
↓
[Camera Pose Update]
- Intrinsic: K (initially known)
- Extrinsic: R, 𝐭
- Draw the box using cv::projectPoints()
Table of Contents: Two-view Geometry

▪ Planar Homography
▪ Epipolar Geometry
– Epipolar constraint
– Fundamental and essential matrix
▪ Relative Camera Pose Estimation
▪ Triangulation

Yun Du-seo (1668–1715), Self-Portrait, Korean National Treasure No. 240
Planar Homography

(Figure: a plane observed by two views; a point 𝐱 in one image maps to 𝐱′ = H𝐱 in the other
through the planar homography H. The transformation H composes rotation, translation,
scaling / aspect ratio, shear, and perspective projection.)
Planar Homography
▪ Transformation hierarchy

– Euclidean transformation (a.k.a. rigid transformation), 3 DOF
  H = [cos 𝜃 −sin 𝜃 𝑡𝑥; sin 𝜃 cos 𝜃 𝑡𝑦; 0 0 1]
  Supports: rotation, translation
  Invariants: length, angle, ratio of lengths, parallelism, incidence, cross ratio
  OpenCV: cv.estimateRigidTransform()
– Similarity transformation, 4 DOF
  H = [𝑠 cos 𝜃 −𝑠 sin 𝜃 𝑡𝑥; 𝑠 sin 𝜃 𝑠 cos 𝜃 𝑡𝑦; 0 0 1]
  Adds: scaling
  Invariants: angle, ratio of lengths, parallelism, incidence, cross ratio
– Affine transformation, 6 DOF
  H = [𝑎11 𝑎12 𝑡𝑥; 𝑎21 𝑎22 𝑡𝑦; 0 0 1]
  Adds: aspect ratio, shear
  Invariants: parallelism, incidence, cross ratio
  OpenCV: cv.getAffineTransform(), cv.warpAffine()
– Projective transformation (a.k.a. planar homography), 8 DOF
  H = [𝑎11 𝑎12 𝑡𝑥; 𝑎21 𝑎22 𝑡𝑦; 𝑣1 𝑣2 1]
  Adds: perspective projection
  Invariants: incidence, cross ratio
  OpenCV: cv.getPerspectiveTransform(), cv.findHomography(), cv.warpPerspective()
  (demonstrated in the sketch below)

Note) Similarly, 3D transformations (3D-3D geometry) are represented as 4×4 matrices.
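Note) As a quick illustration of the listed APIs, the sketch below (with made-up point
correspondences, using any image from the tutorial data) computes an affine warp from
3 point pairs and a perspective warp from 4:

import numpy as np
import cv2 as cv

img = cv.imread('../data/sunglok_card.jpg')  # any image works here

# Affine transformation (6 DOF): exactly 3 point pairs
src3 = np.float32([[0, 0], [200, 0], [0, 200]])
dst3 = np.float32([[10, 20], [210, 10], [20, 220]])
A = cv.getAffineTransform(src3, dst3)              # 2x3 matrix
img_affine = cv.warpAffine(img, A, (img.shape[1], img.shape[0]))

# Projective transformation (8 DOF): exactly 4 point pairs
src4 = np.float32([[0, 0], [200, 0], [200, 200], [0, 200]])
dst4 = np.float32([[10, 20], [190, 10], [210, 220], [0, 180]])
H = cv.getPerspectiveTransform(src4, dst4)         # 3x3 matrix
img_persp = cv.warpPerspective(img, H, (img.shape[1], img.shape[0]))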


Note) Affine Transformation

Image: Wikipedia
Note) Affine Transformation

▪ Q) What is the difference between the two images?

– Parallel projection vs. perspective projection
  • Affine camera (a.k.a. weak perspective camera; 𝑓 = ∞)
  • Less natural but (sometimes) useful in technical visualization

Image: Scratchapixel, Wikipedia


Planar Homography

▪ Planar homography estimation

– Unknown: planar homography H (8 DOF)
– Given: point correspondences (𝐱1, 𝐱1′), …, (𝐱𝑛, 𝐱𝑛′)
– Constraints: 𝑛 × projective transformation 𝐱𝑖′ = H𝐱𝑖
– Solutions (𝑛 ≥ 4) → 4-point algorithm
  • OpenCV: cv.getPerspectiveTransform() and cv.findHomography()
  • Note) Simpler transformations need fewer minimal correspondences:
    affine (𝑛 ≥ 3), similarity (𝑛 ≥ 2), Euclidean (𝑛 ≥ 2)
– Note) A planar homography can be decomposed into a relative camera pose (see the sketch below).
  • OpenCV: cv.decomposeHomographyMat()
  • The decomposition needs to know the camera matrices.

(Figure: corresponding points 𝐱𝑖 and 𝐱𝑖′ = H𝐱𝑖 on two views of a plane with relative pose R, 𝐭)
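Note) A minimal sketch of the decomposition, with made-up correspondences and an assumed
camera matrix K (all values for illustration only):

import numpy as np
import cv2 as cv

K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])  # assumed intrinsics

# Four (or more) point correspondences between the two views (made-up values)
pts1 = np.float32([[100, 100], [400, 120], [380, 360], [120, 340]])
pts2 = np.float32([[ 90, 130], [410, 100], [400, 380], [110, 370]])

H, _ = cv.findHomography(pts1, pts2)  # 4-point algorithm

# Up to 4 candidate poses; extra cues (e.g. positive depth) select the right one.
num, Rs, ts, normals = cv.decomposeHomographyMat(H, K)
print(f'{num} candidate (R, t, n) decompositions')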
Planar Homography

▪ Example) Perspective distortion correction [perspective_correction.py]

(Figure: four points pts_src clicked on the distorted card are mapped by 𝐱𝑖′ = H𝐱𝑖
to the rectangle pts_dst of the rectified card.)
Planar Homography

▪ Example) Perspective distortion correction [perspective_correction.py]

import numpy as np
import cv2 as cv

def mouse_event_handler(event, x, y, flags, param):
    if event == cv.EVENT_LBUTTONDOWN:
        param.append((x, y))

if __name__ == '__main__':
    img_file = '../data/sunglok_card.jpg'
    card_size = (450, 250)
    offset = 10

    # Prepare the rectified points
    pts_dst = np.array([[0, 0], [card_size[0], 0], [0, card_size[1]], [card_size[0], card_size[1]]])

    # Load an image
    img = cv.imread(img_file)

    # Get the matched points from mouse clicks
    pts_src = []
    wnd_name = 'Perspective Correction: Point Selection'
    cv.namedWindow(wnd_name)
    cv.setMouseCallback(wnd_name, mouse_event_handler, pts_src)
    while len(pts_src) < 4:
        img_display = img.copy()
        cv.rectangle(img_display, (offset, offset), (offset + card_size[0], offset + card_size[1]), (0, 0, 255), 2)
        idx = min(len(pts_src), len(pts_dst))
        cv.circle(img_display, tuple(offset + pts_dst[idx]), 5, (0, 255, 0), -1)
        cv.imshow(wnd_name, img_display)
        cv.waitKey(1)  # Process GUI events (assumed; the slide is truncated here)
Planar Homography

▪ Example) Perspective distortion correction [perspective_correction.py]

if __name__ == '__main__':
    img_file = '../data/sunglok_card.jpg'
    card_size = (450, 250)
    offset = 10

    # Prepare the rectified points
    pts_dst = np.array([[0, 0], [card_size[0], 0], [0, card_size[1]], [card_size[0], card_size[1]]])

    # Load an image
    img = cv.imread(img_file)

    # Get the matched points from mouse clicks
    pts_src = []
    ...

    if len(pts_src) == 4:
        # Calculate planar homography and rectify perspective distortion
        H, _ = cv.findHomography(np.array(pts_src), pts_dst)
        img_rectify = cv.warpPerspective(img, H, card_size)

        # Show the rectified image
        cv.imshow('Perspective Correction: Rectified Image', img_rectify)
        cv.waitKey(0)

    cv.destroyAllWindows()
Planar Homography

▪ Example) Planar image stitching [image_stitching.py]

import numpy as np
import cv2 as cv

# Load two images
img1 = cv.imread('../data/hill01.jpg')
img2 = cv.imread('../data/hill02.jpg')

# Retrieve matching points
brisk = cv.BRISK_create()
keypoints1, descriptors1 = brisk.detectAndCompute(img1, None)
keypoints2, descriptors2 = brisk.detectAndCompute(img2, None)

fmatcher = cv.DescriptorMatcher_create('BruteForce-Hamming')
match = fmatcher.match(descriptors1, descriptors2)

# Calculate planar homography and merge them
pts1, pts2 = [], []
for i in range(len(match)):
    pts1.append(keypoints1[match[i].queryIdx].pt)
    pts2.append(keypoints2[match[i].trainIdx].pt)
pts1 = np.array(pts1, dtype=np.float32)
pts2 = np.array(pts2, dtype=np.float32)

H, inlier_mask = cv.findHomography(pts2, pts1, cv.RANSAC)

img_merged = cv.warpPerspective(img2, H, (img1.shape[1]*2, img1.shape[0]))
img_merged[:,:img1.shape[1]] = img1  # Copy the first image as-is

# Show the merged image (the call is truncated on the slide; the last argument is assumed)
img_matched = cv.drawMatches(img1, keypoints1, img2, keypoints2, match, None, None, None,
                             inlier_mask.ravel().tolist())
Planar Homography

▪ Example) 2D video stabilization [video_stabilization.py] (demo: a shaking CCTV video)

import cv2 as cv

# Open a video and get the reference image and feature points
video = cv.VideoCapture('../data/traffic.avi')

_, gray_ref = video.read()
if gray_ref.ndim >= 3:
    gray_ref = cv.cvtColor(gray_ref, cv.COLOR_BGR2GRAY)
pts_ref = cv.goodFeaturesToTrack(gray_ref, 2000, 0.01, 10)

# Run and show video stabilization
while True:
    # Read an image from `video`
    valid, img = video.read()
    if not valid:
        break
    if img.ndim >= 3:
        gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    else:
        gray = img.copy()

    # Extract optical flow and calculate planar homography
    pts, status, err = cv.calcOpticalFlowPyrLK(gray_ref, gray, pts_ref, None)
    H, inlier_mask = cv.findHomography(pts, pts_ref, cv.RANSAC)

    # Synthesize a stabilized image
    warp = cv.warpPerspective(img, H, (img.shape[1], img.shape[0]))
Planar Homography

▪ Assumption) A plane is observed by the two views.

– Perspective distortion correction: a true plane
– Planar image stitching: an approximate plane (valid when the scene distance ≫ its depth variation)
– 2D video stabilization: an approximate plane (valid for small camera motion)
Epipolar Geometry

▪ Epipolar constraint

  𝐱′⊺F𝐱 = 0   (F: fundamental matrix, defined on the image plane)
    ↕ (𝐱 = K𝐱̂, 𝐱′ = K′𝐱̂′)
  𝐱̂′⊺K′⊺FK𝐱̂ = 0
    ↕ (E = K′⊺FK)
  𝐱̂′⊺E𝐱̂ = 0   (E: essential matrix, defined on the normalized image plane)

(Figure: a 3D point 𝐗 observed as 𝐱, 𝐱′ on the image planes and as 𝐱̂, 𝐱̂′ on the
normalized image planes of camera #1 and camera #2 with relative pose R, 𝐭)
Epipolar Geometry

▪ Epipolar constraint: Derivation

  𝐗′ = R𝐗 + 𝐭
    ↓ (𝐗 = 𝜆𝐱̂, 𝐗′ = 𝜆′𝐱̂′)
  𝜆′𝐱̂′ = 𝜆R𝐱̂ + 𝐭
    ↓ (cross product with 𝐭; 𝐭 × 𝐭 = 𝟎)
  𝜆′ 𝐭 × 𝐱̂′ = 𝜆 𝐭 × R𝐱̂
    ↓ (dot product with 𝐱̂′)
  𝜆′ 𝐱̂′ ⋅ (𝐭 × 𝐱̂′) = 𝜆 𝐱̂′ ⋅ (𝐭 × R𝐱̂)
    ↓ (𝐱̂′ ⋅ (𝐭 × 𝐱̂′) = 0)
  𝜆 𝐱̂′ ⋅ (𝐭 × R𝐱̂) = 0
    ↓ (E = [𝐭]×R)
  𝐱̂′⊺E𝐱̂ = 0

  Note) [𝐭]× = [0 −𝑡𝑧 𝑡𝑦; 𝑡𝑧 0 −𝑡𝑥; −𝑡𝑦 𝑡𝑥 0] (the skew-symmetric matrix of 𝐭)
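Note) A small numeric check of the derivation (a random pose assumed here, not from the
tutorial repo): build E = [𝐭]×R and verify 𝐱̂′⊺E𝐱̂ ≈ 0 for a point seen by both cameras.

import numpy as np
import cv2 as cv

rng = np.random.default_rng(0)
R, _ = cv.Rodrigues(rng.normal(size=(3, 1)) * 0.1)   # small random rotation
t = rng.normal(size=(3, 1))

t_x = np.array([[0, -t[2,0], t[1,0]],
                [t[2,0], 0, -t[0,0]],
                [-t[1,0], t[0,0], 0]])               # [t]x, the skew-symmetric matrix
E = t_x @ R

X = np.array([[0.7], [-0.3], [5.0]])                 # a 3D point in front of camera #1
X_prime = R @ X + t                                  # the same point in camera #2 coordinates
x_hat = X / X[2]                                     # normalized image point in camera #1
x_hat_prime = X_prime / X_prime[2]                   # normalized image point in camera #2
print((x_hat_prime.T @ E @ x_hat).item())            # ~0 up to round-off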
Epipolar Geometry

▪ Epipolar geometry

– Baseline
  • Distance between the two camera centers, 𝐂 and 𝐂′
– Epipolar plane
  • Plane generated from the three points 𝐗, 𝐂, and 𝐂′
– Epipole (a.k.a. epipolar point)
  • Projection of the other camera's center
  • 𝐞 = P𝐂′ and 𝐞′ = P′𝐂
    – e.g. with P = K[I|𝟎] and P′ = K′[R|𝐭]: 𝐞 = −KR⊺𝐭 and 𝐞′ = K′𝐭

(Figure: the epipolar plane through 𝐗, 𝐂, 𝐂′; the epipoles 𝐞, 𝐞′ lie where the baseline
crosses the two image planes.)
Epipolar Geometry

▪ Epipolar geometry

– Epipolar line
  • 𝐱′⊺F𝐱 = 0 → 𝐱′⊺𝐥′ = 0 where 𝐥′ = F𝐱
    – 𝐱′ lies on the line 𝐥′.
  • 𝐱′⊺F𝐱 = 0 → 𝐥⊺𝐱 = 0 where 𝐥 = F⊺𝐱′
    – 𝐱 lies on the line 𝐥.
  • The search space of feature matching is reduced from the whole image plane to the epipolar line.
– Note) Every epipolar line passes through the epipole.
  • F𝐞 = 0
  • 𝐞 is the null space of F (as illustrated below).

Special case) Stereo cameras: R = I3×3, 𝐭 = [−𝑏, 0, 0]⊺
  → the epipolar lines are horizontal and the epipoles are at infinity.
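Note) A small sketch of the null-space relation (the F below is made up for illustration):
the epipoles fall out of the SVD of F.

import numpy as np

F = np.array([[ 3.3e-07,  7.6e-06, -2.0e-03],
              [-1.0e-05,  1.0e-07,  1.0e-02],
              [ 1.0e-03, -1.2e-02,  1.0e+00]])  # made-up values, not a real estimate

U, S, Vt = np.linalg.svd(F)
e = Vt[-1] / Vt[-1][2]          # right null vector: epipole in image #1 (F e = 0)
e_prime = U[:, -1] / U[2, -1]   # left null vector: epipole in image #2 (F^T e' = 0)
# For a true rank-2 fundamental matrix these satisfy F e = 0 exactly;
# for this made-up F they are the least-squares null vectors.
print(e[:2], e_prime[:2])       # pixel coordinates of the two epipoles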
Epipolar Geometry

▪ Example) Epipolar line visualization [epipolar_line_visualization.py]

(Figure: points clicked on Image #1 produce epipolar lines on Image #2, and vice versa.)
Epipolar Geometry

▪ Example) Epipolar line visualization [epipolar_line_visualization.py]

import numpy as np
import cv2 as cv
import random

def mouse_event_handler(event, x, y, flags, param):
    if event == cv.EVENT_LBUTTONDOWN:
        param.append((x, y))

def draw_straight_line(img, line, color, thickness=1):
    h, w, *_ = img.shape
    a, b, c = line  # Line: ax + by + c = 0
    if abs(a) > abs(b):
        pt1 = (int(c / -a), 0)
        pt2 = (int((b*h + c) / -a), h)
    else:
        pt1 = (0, int(c / -b))          # elided on the slide; follows from ax + by + c = 0
        pt2 = (w, int((a*w + c) / -b))
    cv.line(img, pt1, pt2, color, thickness)

if __name__ == '__main__':
    # Load two images
    img1 = cv.imread('../data/KITTI07/image_0/000000.png', cv.IMREAD_COLOR)
    img2 = cv.imread('../data/KITTI07/image_0/000023.png', cv.IMREAD_COLOR)
    # Note) `F` is derived from `fundamental_mat_estimation.py`.
    F = np.array([[ 3.34638533e-07, 7.58547151e-06, -2.04147752e-03], ...])

    # Register event handlers and show images
    wnd1_name, wnd2_name = 'Epipolar Line: Image #1', 'Epipolar Line: Image #2'
    img1_pts, img2_pts = [], []
    cv.namedWindow(wnd1_name)
    cv.setMouseCallback(wnd1_name, mouse_event_handler, img1_pts)
Epipolar Geometry

▪ Example) Epipolar line visualization [epipolar_line_visualization.py]

if __name__ == '__main__':
    # Load two images
    img1 = cv.imread('../data/KITTI07/image_0/000000.png', cv.IMREAD_COLOR)
    img2 = cv.imread('../data/KITTI07/image_0/000023.png', cv.IMREAD_COLOR)
    F = np.array([[ 3.34638533e-07, 7.58547151e-06, -2.04147752e-03], ...])

    # Register event handlers and show images
    ...

    # Get a point from an image and draw its corresponding epipolar line on the other image
    while True:
        if len(img1_pts) > 0:
            for x, y in img1_pts:
                color = (random.randrange(256), random.randrange(256), random.randrange(256))
                cv.circle(img1, (x, y), 4, color, -1)
                epipolar_line = F @ [[x], [y], [1]]      # l' = F x
                draw_straight_line(img2, epipolar_line, color, 2)
            img1_pts.clear()
        if len(img2_pts) > 0:
            for x, y in img2_pts:
                color = (random.randrange(256), random.randrange(256), random.randrange(256))
                cv.circle(img2, (x, y), 4, color, -1)
                epipolar_line = F.T @ [[x], [y], [1]]    # l = F^T x'
                draw_straight_line(img1, epipolar_line, color, 2)
            img2_pts.clear()
        cv.imshow(wnd2_name, img2)
Relative Camera Pose Estimation

▪ Relative camera pose estimation (~ fundamental/essential matrix estimation)

– Unknown: rotation and translation R, 𝐭 (5 DOF; up to scale, "scale ambiguity")
– Given: point correspondences (𝐱1, 𝐱1′), …, (𝐱𝑛, 𝐱𝑛′) and camera matrices K, K′
– Constraints: 𝑛 × epipolar constraint (𝐱′⊺F𝐱 = 0 or 𝐱̂′⊺E𝐱̂ = 0)
– Relations: R, 𝐭 ↔ E = [𝐭]×R ↔ E = K′⊺FK ↔ F
– Solutions) Fundamental matrix: 7/8-point algorithm (7 DOF; intrinsic + extrinsic)
  • Properties: det(F) = 0 (or rank(F) = 2)
  • Estimation: cv.findFundamentalMat() → 1 solution
  • Conversion to E: E = K′⊺FK
– Solutions) Essential matrix: 5-point algorithm (5 DOF; extrinsic), see the sketch below
  • Properties: det(E) = 0 and 2EE⊺E − tr(EE⊺)E = 𝟎
  • Estimation: cv.findEssentialMat() → 𝑘 solutions
  • Decomposition: cv.decomposeEssentialMat() → 4 solutions, "relative pose ambiguity"
  • Decomposition with positive-depth check: cv.recoverPose() → 1 solution
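Note) A minimal sketch of the 4-fold ambiguity (E built by hand for illustration):
cv.decomposeEssentialMat() returns two rotations and one translation up to sign,
i.e. the candidates (R1, ±𝐭) and (R2, ±𝐭); cv.recoverPose() then keeps the single
candidate that puts the points in front of both cameras.

import numpy as np
import cv2 as cv

# E = [t]x R with R = I and t = (1, 0, 0): a pure sideways translation
t_x = np.array([[0., 0., 0.],
                [0., 0., -1.],
                [0., 1., 0.]])
E = t_x @ np.eye(3)

R1, R2, t = cv.decomposeEssentialMat(E)
print(R1, R2, t, sep='\n')   # four candidates: (R1, t), (R1, -t), (R2, t), (R2, -t)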
Relative Camera Pose Estimation: Scale Ambiguity

The Sandcrawler @ Star Wars IV: A New Hope (1977)

(a) a 2-meter-wide tunnel vs. (b) a 1-meter-wide tunnel: indistinguishable from a single camera

Reference: Choi et al., Resolving Scale Ambiguity in Monocular Visual Odometry, URAI, 2013
Relative Camera Pose Estimation: Scale Ambiguity

▪ How to resolve scale ambiguity

– Additional sensors: speedometers (odometers), IMUs, GPSs, depth/distance sensors (stereo, RGB-D, LiDAR, …)
– Motion constraints: known initial translation, Ackermann steering kinematics
– Observation constraints: known size of objects, known and constant height of the camera

  ∴ 𝜌 = 𝐿 / 𝑙   (scale factor from a known length 𝐿 in (b) the real world
                and its counterpart 𝑙 in (a) the reconstructed world)

Reference: Choi et al., Resolving Scale Ambiguity in Monocular Visual Odometry, URAI, 2013
Relative Camera Pose Estimation: Relative Pose Ambiguity

▪ How to resolve relative pose ambiguity

– Positive depth constraint: among the 4 candidate decompositions, only one places the
  reconstructed points in front of both cameras.

Image: Hartley and Zisserman, Multiple View Geometry in Computer Vision (2nd Edition), Cambridge University Press, 2004
Relative Camera Pose Estimation

▪ Example) Fundamental matrix estimation [fundamental_mat_estimation.py]

import numpy as np
import cv2 as cv

# Load two images
img1 = cv.imread('../data/KITTI07/image_0/000000.png')
img2 = cv.imread('../data/KITTI07/image_0/000023.png')
f, cx, cy = 707.0912, 601.8873, 183.1104  # From the KITTI dataset
K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1]])

# Retrieve matching points
brisk = cv.BRISK_create()
keypoints1, descriptors1 = brisk.detectAndCompute(img1, None)
keypoints2, descriptors2 = brisk.detectAndCompute(img2, None)

fmatcher = cv.DescriptorMatcher_create('BruteForce-Hamming')
match = fmatcher.match(descriptors1, descriptors2)
Relative Camera Pose Estimation

▪ Example) Fundamental matrix estimation [fundamental_mat_estimation.py]

# Load two images
...

# Retrieve matching points
...

# Calculate the fundamental matrix
pts1, pts2 = [], []
for i in range(len(match)):
    pts1.append(keypoints1[match[i].queryIdx].pt)
    pts2.append(keypoints2[match[i].trainIdx].pt)
pts1 = np.array(pts1, dtype=np.float32)
pts2 = np.array(pts2, dtype=np.float32)
F, inlier_mask = cv.findFundamentalMat(pts1, pts2, cv.FM_RANSAC, 0.5, 0.999)
Relative Camera Pose Estimation

▪ Example) Fundamental matrix estimation [fundamental_mat_estimation.py]

# Load two images
...
f, cx, cy = 707.0912, 601.8873, 183.1104  # From the KITTI dataset
K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1]])

# Retrieve matching points
...

# Calculate the fundamental matrix
...
F, inlier_mask = cv.findFundamentalMat(pts1, pts2, cv.FM_RANSAC, 0.5, 0.999)

# Extract relative camera pose between two images
E = K.T @ F @ K
positive_num, R, t, positive_mask = cv.recoverPose(E, pts1, pts2, K, mask=inlier_mask)
...
print(f'* The position of Image #2 = {-R.T @ t}')
# With Image #1 at [0, 0, 0], Image #2 is estimated at about [-0.57, 0.09, 0.82] (unit length)
Relative Camera Pose Estimation

▪ Example) Monocular Visual Odometry (Epipolar Version) [vo_epipolar.cpp]

For each pair of adjacent images (Image 𝑛−1, Image 𝑛), given K and use_5pt = True:

[Tentative Matching]
- Extract optical flow between adjacent images
↓ Pass tentative matches
[Outlier Rejection]
- Remove wrong matches using cv::findEssentialMat()
↓ Pass the essential matrix E
[Relative Pose Estimation]
- Decompose the camera pose using cv::recoverPose()
↓ Pass the relative pose R, 𝐭 (Note: ‖𝐭‖ = 1)
[Relative Pose Accumulation]

  T_n^0 = T_{n-1}^0 · Δ_n^{n-1} = T_{n-1}^0 · [R 𝐭; 𝟎 1]⁻¹
Relative Camera Pose Estimation

▪ Example) Monocular Visual Odometry (Epipolar Version) [vo_epipolar.cpp]

(Demo video at 4× speed; the pipeline is the same as on the previous slide.)
Relative Camera Pose Estimation

▪ Example) Monocular Visual Odometry (Epipolar Version) [vo_epipolar.cpp]

import numpy as np
import cv2 as cv

video_file = '../data/KITTI07/image_0/%06d.png'
f, cx, cy = 707.0912, 601.8873, 183.1104
use_5pt = True
min_inlier_num = 100
min_inlier_ratio = 0.2
traj_file = '../data/vo_epipolar.xyz'

# Open a video and get an initial image
video = cv.VideoCapture(video_file)

_, gray_prev = video.read()
if gray_prev.ndim >= 3 and gray_prev.shape[2] > 1:
    gray_prev = cv.cvtColor(gray_prev, cv.COLOR_BGR2GRAY)

# Prepare a plot to visualize the camera trajectory
...

# Run the monocular visual odometry
K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1]])
camera_traj = np.zeros((1, 3))
camera_pose = np.eye(4)
while True:
    # Grab an image from the video
    valid, img = video.read()
    if not valid:
        break
Relative Camera Pose Estimation

▪ Example) Monocular Visual Odometry (Epipolar Version) [vo_epipolar.cpp]

# Run the monocular visual odometry
K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1]])
camera_traj = np.zeros((1, 3))
camera_pose = np.eye(4)
while True:
    # Grab an image from the video
    valid, img = video.read()
    if not valid:
        break
    if img.ndim >= 3 and img.shape[2] > 1:
        gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    else:
        gray = img.copy()

    # Extract optical flow
    pts_prev = cv.goodFeaturesToTrack(gray_prev, 2000, 0.01, 10)
    pts, status, err = cv.calcOpticalFlowPyrLK(gray_prev, gray, pts_prev, None)
    gray_prev = gray

    # Calculate relative pose
    if use_5pt:
        E, inlier_mask = cv.findEssentialMat(pts_prev, pts, f, (cx, cy), cv.FM_RANSAC, 0.99, 1)
    else:
        F, inlier_mask = cv.findFundamentalMat(pts_prev, pts, cv.FM_RANSAC, 1, 0.99)
        E = K.T @ F @ K
    inlier_num, R, t, inlier_mask = cv.recoverPose(E, pts_prev, pts, focal=f, pp=(cx, cy), mask=inlier_mask)
Relative Camera Pose Estimation

▪ Example) Monocular Visual Odometry (Epipolar Version) [vo_epipolar.cpp]

# Run the monocular visual odometry
while True:
    # Grab an image from the video
    ...

    # Extract optical flow
    ...

    # Calculate relative pose
    if use_5pt:
        E, inlier_mask = cv.findEssentialMat(pts_prev, pts, f, (cx, cy), cv.FM_RANSAC, 0.99, 1)
    else:
        F, inlier_mask = cv.findFundamentalMat(pts_prev, pts, cv.FM_RANSAC, 1, 0.99)
        E = K.T @ F @ K
    inlier_num, R, t, inlier_mask = cv.recoverPose(E, pts_prev, pts, focal=f, pp=(cx, cy), mask=inlier_mask)
    inlier_ratio = inlier_num / len(pts)

    # Accumulate relative pose if the result is reliable
    # T_n^0 = T_{n-1}^0 · Δ_n^{n-1} = T_{n-1}^0 · [R 𝐭; 𝟎 1]⁻¹
    info_color = (0, 255, 0)
    if inlier_num > min_inlier_num and inlier_ratio > min_inlier_ratio:
        T = np.eye(4)
        T[:3, :3] = R
        T[:3, 3] = t.flatten()
        camera_pose = camera_pose @ np.linalg.inv(T)
        info_color = (0, 0, 255)
Relative Camera Pose Estimation

▪ Relative camera pose estimation (~ fundamental/essential matrix estimation)

– Unknown: rotation and translation R, 𝐭 (5 DOF; up to scale, "scale ambiguity")
– Given: point correspondences (𝐱1, 𝐱1′), …, (𝐱𝑛, 𝐱𝑛′) and camera matrices K, K′
– Constraints: 𝑛 × epipolar constraint (𝐱′⊺F𝐱 = 0 or 𝐱̂′⊺E𝐱̂ = 0)
– Relations: R, 𝐭 ↔ E = [𝐭]×R ↔ E = K′⊺FK ↔ F
– Solutions) Fundamental matrix: 7/8-point algorithm (7 DOF; intrinsic + extrinsic)
  • Properties: det(F) = 0 (or rank(F) = 2)
  • Estimation: cv.findFundamentalMat() → 1 solution
  • Conversion to E: E = K′⊺FK
  • Degenerate cases: no translation, correspondences from a single plane
– Solutions) Essential matrix: 5-point algorithm (5 DOF; extrinsic)
  • Properties: det(E) = 0 and 2EE⊺E − tr(EE⊺)E = 𝟎
  • Estimation: cv.findEssentialMat() → 𝑘 solutions
  • Decomposition: cv.decomposeEssentialMat() → 4 solutions, "relative pose ambiguity"
  • Decomposition with positive-depth check: cv.recoverPose() → 1 solution
  • Degenerate cases: no translation (∵ E = [𝐭]×R)
Relative Camera Pose Estimation

▪ Relative camera pose estimation

– Solutions) Fundamental matrix: 7/8-point algorithm (7 DOF; intrinsic + extrinsic)
  • Estimation: cv.findFundamentalMat() → 1 solution
  • Conversion to E: E = K′⊺FK
  • Degenerate cases: no translation, correspondences from a single plane
– Solutions) Essential matrix: 5-point algorithm (5 DOF; extrinsic)
  • Estimation: cv.findEssentialMat() → 𝑘 solutions
  • Decomposition: cv.decomposeEssentialMat() → 4 solutions, "relative pose ambiguity"
  • Decomposition with positive-depth check: cv.recoverPose() → 1 solution
  • Degenerate cases: no translation (∵ E = [𝐭]×R)
– Solutions) Planar homography: 4-point algorithm (8 DOF; up to scale, "scale ambiguity")
  • Estimation: cv.findHomography() → 1 solution
  • Conversion to the calibrated H: Ĥ = K′⁻¹HK
  • Decomposition: cv.decomposeHomographyMat() → 4 solutions, "relative pose ambiguity"
  • Degenerate cases: correspondences not from a single plane

Note) The degenerate cases of the epipolar models and the planar homography are complementary.
Relative Camera Pose Estimation

▪ Relative camera pose estimation

– Solutions) Planar homography: 4-point algorithm (8 DOF; up to scale, "scale ambiguity")
  • Estimation: cv.findHomography() → 1 solution
  • Derivation (for points on a plane with normal 𝐧 and distance 𝑑; verified numerically below)

    𝜆′𝐱̂′ = 𝜆R𝐱̂ + 𝐭
      ↓ ((1/𝑑)𝐧⊺𝐱̂ = 1  (∵ 𝑛𝑥𝑥̂ + 𝑛𝑦𝑦̂ + 𝑛𝑧𝑧̂ − 𝑑 = 0))
    𝐱̂′ = 𝜆″(R + (1/𝑑)𝐭𝐧⊺)𝐱̂
      ↓ (Ĥ = R + (1/𝑑)𝐭𝐧⊺)
    𝐱̂′ = Ĥ𝐱̂
      ↓ (𝐱̂ = K⁻¹𝐱 and 𝐱̂′ = K′⁻¹𝐱′)
    𝐱′ = H𝐱  with H = K′ĤK⁻¹

  • Conversion to the calibrated H: Ĥ = K′⁻¹HK
  • Decomposition: cv.decomposeHomographyMat() → 4 solutions, "relative pose ambiguity"
  • Degenerate cases: correspondences not from a single plane
Relative Camera Pose Estimation: Overview
On the image plane [px], input (𝐱𝑖, 𝐱𝑖′):
– General 2D-2D geometry — Model: fundamental matrix F (7 DOF)
  • Formulation: F = K′⁻⊺EK⁻¹, E = K′⊺FK
  • Estimation: 7-point algorithm (𝑛 ≥ 7) → 𝑘 solutions; (normalized) 8-point algorithm → 1 solution; cv.findFundamentalMat()
  • Degenerate cases: no translational motion; correspondences from a single plane
  • Decomposition to R and 𝐭: convert to an essential matrix and decompose it
– Planar 2D-2D geometry — Model: planar homography H (8 DOF)
  • Formulation: H = K′ĤK⁻¹, Ĥ = K′⁻¹HK
  • Estimation: 4-point algorithm (𝑛 ≥ 4) → 1 solution; cv.findHomography()
  • Input: points on a plane in the image plane
  • Degenerate cases: correspondences not from a single plane
  • Decomposition to R and 𝐭: cv.decomposeHomographyMat()

On the normalized image plane [m], input (𝐱̂𝑖, 𝐱̂𝑖′):
– General 2D-2D geometry — Model: essential matrix E (5 DOF)
  • Formulation: E = [𝐭]×R
  • Estimation: 5-point algorithm (𝑛 ≥ 5) → 𝑘 solutions; cv.findEssentialMat()
  • Degenerate cases: no translational motion
  • Decomposition to R and 𝐭: cv.decomposeEssentialMat(), cv.recoverPose()
– Planar 2D-2D geometry — Model: (calibrated) planar homography Ĥ (8 DOF)
  • Formulation: Ĥ = R + (1/𝑑)𝐭𝐧⊺
  • Estimation: 4-point algorithm (𝑛 ≥ 4) → 1 solution; cv.findHomography()
  • Input: points on a plane in the normalized image plane
  • Degenerate cases: correspondences not from a single plane
  • Decomposition to R and 𝐭: cv.decomposeHomographyMat() with K = I3×3
Triangulation

▪ Triangulation (point localization)

– Unknown: position of a 3D point 𝐗 (3 DOF)
– Given: point correspondence (𝐱, 𝐱′), camera matrices (K, K′), and relative pose (R, 𝐭)
– Constraints: 𝐱 = K[I | 𝟎]𝐗, 𝐱′ = K′[R | 𝐭]𝐗
– Solution
  • OpenCV: cv.triangulatePoints()
  • Special case) Stereo cameras:
    R = I3×3, 𝐭 = [−𝑏, 0, 0]⊺, and K = K′ = [𝑓 0 𝑐𝑥; 0 𝑓 𝑐𝑦; 0 0 1]
    ∴ 𝑍 = 𝑓𝑏 / (𝑥 − 𝑥′)   (see the numeric example below)

(Figure: the 3D point 𝐗 projected to 𝐱 and 𝐱′ on the two image planes with relative pose R, 𝐭)
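Note) A quick numeric instance of the stereo special case (all values made up):

f, b = 700.0, 0.5           # focal length [px] and baseline [m] (assumed values)
x, x_prime = 420.0, 385.0   # matched x-coordinates [px] in the left/right images
Z = f * b / (x - x_prime)   # depth from disparity (x - x_prime = 35 px)
print(Z)                    # 10.0 [m]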
Triangulation

▪ Example) Triangulation

import numpy as np
import cv2 as cv

f, cx, cy = 1000., 320., 240.
pts0 = np.loadtxt('../data/image_formation0.xyz')[:,:2]
pts1 = np.loadtxt('../data/image_formation1.xyz')[:,:2]
output_file = '../data/triangulation.xyz'

# Estimate the relative pose of the two views
F, _ = cv.findFundamentalMat(pts0, pts1, cv.FM_8POINT)
K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1]])
E = K.T @ F @ K
_, R, t, _ = cv.recoverPose(E, pts0, pts1)

# Reconstruct 3D points (triangulation)
# x = K[I|0] X and x' = K'[R|t] X
P0 = K @ np.eye(3, 4, dtype=np.float32)
Rt = np.hstack((R, t))
P1 = K @ Rt
X = cv.triangulatePoints(P0, P1, pts0.T, pts1.T)
X /= X[3]
X = X.T

# Write the reconstructed 3D points
np.savetxt(output_file, X)

(Figures: the input point cloud data/box.xyz and the reconstructed output
data/triangulation.xyz, rendered in CloudCompare.)
Summary

▪ Planar Homography: 𝐱𝑖′ = H𝐱𝑖
  – Example) Perspective distortion correction
  – Example) Planar image stitching
  – Example) 2D video stabilization
▪ Epipolar Geometry: 𝐱′⊺F𝐱 = 0 (on the image plane), 𝐱̂′⊺E𝐱̂ = 0 (on the normalized image plane)
  – Example) Epipolar line visualization
▪ Relative Camera Pose Estimation: finding R and 𝐭 (5 DOF); R, 𝐭 ↔ E = [𝐭]×R ↔ E = K′⊺FK ↔ F
  – Solutions) Fundamental matrix: 7/8-point algorithm (7 DOF)
  – Solutions) Essential matrix: 5-point algorithm (5 DOF)
  – Solutions) Planar homography: 4-point algorithm (8 DOF)
  – Example) Fundamental matrix estimation
  – Example) Monocular Visual Odometry (Epipolar Version)
▪ Triangulation: finding 𝐗 (3 DOF)
  – Example) Triangulation
