Trifocal Morphing
Angus M.K. Siu
Ada S.K. Wan
Rynson W.H. Lau
C.W. Ngo
City University of Hong Kong, Hong Kong
E-mail: {angus,adaskw,rynson}@cs.cityu.edu.hk, cscwngo@cityu.edu.hk
Abstract
Image morphing allows smooth transition between
2D images. However, one of the limitations of
existing image morphing techniques is the lack of
interaction - the viewpoints of the interpolated
images are restrained to the line joining the optical
centers of the source and the destination images.
Another limitation of existing image morphing
techniques is that shape warping often causes
distortion due to barycentric mapping. In this
paper, we present our trifocal morphing technique
to address these problems. The new technique
allows a user to change the viewpoint of the output
images, i.e., increasing the degrees of freedom of
interaction, and supports both interpolation and
extrapolation. By making use of the intrinsic
geometric relationship among the reference images
for projective transformation, the distortion from
barycentric mapping is also prevented. Unlike other
warping-based view transferring techniques,
trifocal morphing provides very smooth transition
between reference images and supports both rigid
and non-rigid scenes.
Keywords: image morphing, trifocal morphing,
novel view synthesis.
1. Introduction
Metamorphosis, or morphing for short, generally refers to
the transformation of shape and visual attributes of a
graphic object (GO) [4]. The shape of a GO in
n-dimensional space is defined as U ⊂ R^n, while its
attributes can be expressed as f: U → R^k. A morph
involves two GOs, U0 and U1. It defines maps C0: U0 → U1
and C1: U1 → U0, specifying a complete correspondence
between points of the two GOs. The morphing process
includes shape warping and attribute blending. If we
define points p0 ∈ U0 and p1 ∈ U1, shape warping
interpolates the shapes between U0 and U1 with a single
parameter λ ∈ [0, 1]:

    w0(p0, λ) = (1 − λ)p0 + λC0(p0)
    w1(p1, λ) = (1 − λ)p1 + λC1(p1)

while attribute blending is defined as:

    G(λ) = (1 − λ) f(W0(λ)) + λ f(W1(1 − λ))
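As a concrete illustration, the warp and blend above can be sketched in a few lines of Python (a minimal sketch: the GOs are reduced to corresponding 2D point sets, and the correspondence maps C0, C1 are given directly by the pairing of the arrays):

```python
import numpy as np

def shape_warp(p0, p1, lam):
    """Interpolate corresponding point sets U0 and U1.

    w0(p0, lam) = (1 - lam) * p0 + lam * C0(p0), where C0 maps each
    point of U0 to its correspondent in U1 (here given directly as p1).
    """
    w0 = (1 - lam) * p0 + lam * p1      # warp of U0 towards U1
    w1 = (1 - lam) * p1 + lam * p0      # warp of U1 towards U0
    return w0, w1

def blend(f0, f1, lam):
    """Blend attribute values sampled at the warped positions."""
    return (1 - lam) * f0 + lam * f1

# Two corresponding points and a scalar attribute (e.g., a grey level)
p0 = np.array([0.0, 0.0])
p1 = np.array([2.0, 4.0])
w0, w1 = shape_warp(p0, p1, 0.5)
print(w0)                         # midway shape: [1. 2.]
print(blend(10.0, 20.0, 0.5))     # midway attribute: 15.0
```

At λ = 0 the morph reproduces U0 with attributes f0; at λ = 1 it reproduces U1 with attributes f1.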
Image morphing [3,5,9] is a special type of morphing
operation where the GOs are defined in the 2D image
domain, i.e., U ⊂ R². It is simple and efficient; both shape
warping and attribute blending are performed in the 2D
image domain. However, these 2D GOs are in fact
projections of 3D GOs. Thus, 3D distortions or unnatural
image transitions may occur if naive image morphing
techniques are used.
View Morphing [8,10] focuses on producing smooth
transition between the source and the destination views. It
considers the fact that each 2D image point is a projection
of a 3D scene point. Natural transition can be achieved by
first prewarping two images, computing a morph (image
warp and cross-dissolve) between the prewarped images,
and then postwarping each in-between image produced by
the morph. View morphing also accommodates changes
in projective shape. However, it does not address the
image partitioning and visibility problems.
In [6], an automatic technique was proposed, which is
based on the joint view triangulation (JVT) method to
carry out image partitioning and matching for image
interpolation / morphing. The JVT method constructs an
efficient representation scheme for the source and the
destination images for image interpolation.
In general, existing image morphing techniques restrict
the viewpoint of the output images and thus do not
support arbitrary view synthesis. In addition, the
output of the morphing function depends only on a
single input parameter (usually time t), which limits
the application of the morphing techniques to video or
other time-dependent applications. The user therefore
cannot interact with the system to obtain a particular
output image from the morphing function. On the other
hand, some image point warping techniques do provide
solutions for transferring image points to an arbitrary view.
Proceedings of the Seventh International Conference on Information Visualization (IV’03)
1093-9547/03 $17.00 © 2003 IEEE
In this paper, we propose a trifocal morphing
technique which allows interactive synthesis of arbitrary
novel views and provides smooth transition between
reference images. In addition, it only requires partial
correspondences and thus supports both rigid and non-rigid scenes.
The rest of this paper is organized as follows. Section
2 presents the overall framework and the details of our
trifocal morphing technique. Section 3 demonstrates some
experimental results of the proposed technique. Finally,
section 4 concludes the paper with discussion on possible
future work.
2. Trifocal Morphing
In this section, we first present the representation scheme
of trifocal morphing. After that, the rendering and
computational details are discussed.
2.1 Representation Scheme
Trifocal morphing is based on three arbitrary reference
images I0, I1, I2 ⊂ R². Each reference image Ii is
composed of a set of GOs, Uji ⊂ Ii ⊂ R². In our method,
we first extract a set of feature points for image I0 and
then partition the image through Delaunay triangulation on
the feature points.
The partitioned triangular patches in image I0 are then
matched with those in images I1 and I2 to obtain a set of
matched triangular patches. For the non-matched regions,
we perform edge-constrained Delaunay triangulation on
each image. As a result, each image is partitioned into a
set of matched triangular patches, Mi, and another set of
non-matched triangular patches, Ni, where Mi ∩ Ni = ∅.
Figure 1 shows an example of three partitioned images.
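The partitioning step can be sketched as follows (a minimal sketch assuming the feature points have already been extracted; it uses SciPy's Delaunay triangulation, and the coordinates below are made up for illustration — patch matching across images is not shown):

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical feature points detected in image I0 (pixel coordinates)
feature_points = np.array([
    [10.0, 10.0], [200.0, 15.0], [120.0, 180.0],
    [30.0, 220.0], [240.0, 210.0],
])

# Partition the image into triangular patches on the feature points
tri = Delaunay(feature_points)

# Each row of tri.simplices indexes the 3 vertices of one patch
for simplex in tri.simplices:
    print(feature_points[simplex])
```

Each resulting simplex is one candidate triangular patch to be matched against images I1 and I2.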
In order to allow effective image warping to any
arbitrary viewpoints, generalized disparity [7] is also
calculated for the feature points. Since the accuracy of
point matching is up to one pixel, point correspondences
do not exactly satisfy the geometric relations. We use
linear triangulation method to get the maximum
likelihood estimate of a 3D point X from its 2D
Figure 1.
correspondences x, x’ and x’’ in reference images I0, I1,
and I2, respectively. Let P, P’ and P” be the 3x4 camera
matrices of images I0, I1, and I2, respectively. Then, x =
PX, x’ = P’X and x” = P”X. These equations can be
combined into a form of AX = 0, where
A
ª xp 3T p1T º
»
«
« yp 3T p 2T »
«
3T
1T »
»
« x ' p' p'
« y ' p' 3T p' 2T »
»
«
« x" p" 3T p"1T »
«
3T
2T »
¬ y" p" p" ¼
P
§ p1T ·
¨ 2T ¸
¨p ¸
¨ p 3T ¸
¹
©
x = (x, y, 1)T
This is a redundant set of equations of the form AX = 0.
We perform singular value decomposition on A and obtain
the least-squares solution for X as the unit singular
vector corresponding to the smallest singular value of A.
After X is obtained, the generalized disparity, δ(x), for 2D
image point x can be calculated as:

    δ(x) = ‖N x‖ / ‖X̃ − C̃‖,  where C = (C̃; 1) and X = (X̃; 1)

and N is the 3×3 projective matrix of image 1, i.e., P = (N | p^4).
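The triangulation and disparity computation can be sketched as follows (a sketch with made-up cameras: `cameras` holds the three 3×4 matrices P, P′, P″, `points2d` the measured correspondences, and the 3D point is chosen for illustration):

```python
import numpy as np

def triangulate(cameras, points2d):
    """Linear triangulation: stack the rows x*p^3T - p^1T and
    y*p^3T - p^2T for each view into A and solve AX = 0 by SVD."""
    rows = []
    for P, (x, y) in zip(cameras, points2d):
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.array(rows)
    # Least-squares solution: right singular vector belonging to the
    # smallest singular value of A
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X / X[3]                  # normalize so X = (X~; 1)

def generalized_disparity(x, N, C, X):
    """delta(x) = ||N x|| / ||X~ - C~|| for homogeneous x = (x, y, 1)."""
    return np.linalg.norm(N @ x) / np.linalg.norm(X[:3] - C)

# Made-up example: three cameras with N = I observing X = (1, 2, 10)
X_true = np.array([1.0, 2.0, 10.0, 1.0])
N = np.eye(3)
centers = [np.array([0.0, 0.0, 0.0]),
           np.array([1.0, 0.0, 0.0]),
           np.array([0.0, 1.0, 0.0])]
cameras = [np.hstack([N, (-N @ C)[:, None]]) for C in centers]

points2d = []
for P in cameras:
    x = P @ X_true                   # project the 3D point into each view
    points2d.append((x[0] / x[2], x[1] / x[2]))

X = triangulate(cameras, points2d)
print(X)                             # recovers (1, 2, 10, 1)
```

With these synthetic cameras the disparity of the first view's point (0.1, 0.2, 1) comes out as 0.1, i.e., the reciprocal of the point's depth, as expected.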
2.2 3D Morphing with 2D images
In image morphing, both shape warping and attribute
blending are carried out in 2D image domain. Hence, it is
considered as 2D morphing. With view morphing [8],
shape warping is performed in 3D spatial domain while
attribute blending is performed in 2D image domain.
Thus, view morphing is considered as 2.5D morphing.
Our trifocal morphing uses three reference images. While
shape warping is carried out in 3D spatial domain,
attribute blending considers the direction of ray in 3D
spatial domain such that reference images are selected to
minimize the angular deviation. Since both shape warping
and attribute blending are performed in 3D spatial
domain, trifocal morphing is considered as 3D morphing
even though only 2D images are available. Table 1
compares the properties of different morphing techniques.
Details of shape warping, attribute blending and attribute
combination of trifocal morphing are presented in the
remaining subsections.
Figure 1. The partitioning for images I0, I1 and I2.
Table 1. Comparison of morphing techniques.

                   No. of images  Source domain    Shape warping      Attribute blending  Morphing
Image Morphing     2              2D image domain  2D image domain    2D image domain     2D
View Morphing      2              2D image domain  3D spatial domain  2D image domain     2.5D
Trifocal Morphing  3              2D image domain  3D spatial domain  3D spatial domain   3D
2.3 Shape Warping
To allow arbitrary view synthesis, we cannot simply
compute the location of feature points in the novel views
by linear interpolation between source and destination
images as in existing morphing techniques. Instead, we
use the warping equation in [7] to transfer the feature
points from the source images to the novel views and then
perform shape warping. The warping equation for
transferring feature points is defined as follows:

    x′ ≅ δ(x) N′⁻¹ (C̃ − C̃′) + N′⁻¹ N x

where x and x′ are points in the reference image and the
output image, respectively; C̃ and C̃′ are the centers of
projection of the reference image and of the output image,
respectively; N and N′ are the 3×3 projective matrices of
the reference image and of the output image, respectively;
and δ(x) is the generalized disparity at point x.
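A sketch of this transfer for a single feature point (the projective matrices, camera centers and disparity below are made up for illustration; with N = N′ = I and δ(x) = 1/depth the result can be checked against a direct reprojection):

```python
import numpy as np

def warp_point(x, disparity, N, N_out, C, C_out):
    """McMillan-style warping equation:
       x' ~ delta(x) * N'^-1 (C - C') + N'^-1 N x
    x is homogeneous (x, y, 1); returns the normalized warped point."""
    N_out_inv = np.linalg.inv(N_out)
    xp = disparity * N_out_inv @ (C - C_out) + N_out_inv @ (N @ x)
    return xp / xp[2]

# Made-up example: identity projective matrices, camera moved along x
N = np.eye(3)
N_out = np.eye(3)
C = np.array([0.0, 0.0, 0.0])        # reference center of projection
C_out = np.array([1.0, 0.0, 0.0])    # novel-view center of projection
x = np.array([0.1, 0.2, 1.0])        # image of the 3D point (1, 2, 10)
delta = 0.1                          # generalized disparity (1 / depth)

print(warp_point(x, delta, N, N_out, C, C_out))   # -> [0.  0.2 1. ]
```

The warped point (0, 0.2, 1) agrees with projecting the 3D point (1, 2, 10) directly from the shifted center, which is what makes the equation usable for transferring feature points to arbitrary novel views.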
2.4 Projective Transformation
After the feature points are warped, image points within
each triangular patch can be transferred. One of the
commonly used methods is barycentric mapping [2],
which maintains a constant distance ratio of the
transferred point to the three vertices of the triangular
patch when mapping the point from the source image to
the output image. However, the major problem of
barycentric mapping is the potential transformation
distortion. Figure 2 shows that there may be severe
distortion in transforming a pair of triangles with
barycentric mapping, where W is the function that transfers
the image points from the source image to the output
image.
Figure 2. Distortion induced by barycentric mapping.

Since a GO in the 2D image domain is a projection of
a GO in the 3D spatial domain, the use of projective
transformation can avoid the distortion due to barycentric
mapping. In order to perform projective transformation,
we need to determine the plane homography of the patch.

A plane homography H is a one-to-one mapping, which
carries out a projective transformation on a plane from image 1
to image 2, such that λx′ = Hx, where H is a 3×3 matrix, λ
is a scale factor, x is a homogeneous 2D point in image 1, and
x′ is a homogeneous 2D point in image 2.

Since a plane homography has 9 − 1 = 8 degrees of
freedom, 4 matches (ui, ui′) between two images are
required to calculate the homography. Thus, the patches
used to determine the homography would need to be
quadrilateral (with 4 vertices) instead of triangular (with 3
vertices). This makes it difficult to recover the plane
homography, and hence to carry out the projective
transformation, for a triangular patch.

In trifocal morphing, we address this homography
computation problem by proposing a new way of
calculating the plane homography from triangular patches
based on the intrinsic geometric relationship among the
reference images. Referring to figure 3, since a 3D
triangle defines a plane in the projective domain, the
plane homography can be obtained even if the shape of the
patch is triangular.

Figure 3. Implicit geometry of a triangular patch.

Let H1 be a 4×3 transformation matrix that transforms
a homogeneous 2D point into the projective 3D space and
H2 be a 3×4 transformation matrix that transforms a
homogeneous 3D point into a 2D point in image 2. The
plane homography H is then given as:

    x′ ≅ Hx = H2 H1 x    (1)

where x and x′ are the corresponding homogeneous 2D
point pair in images 1 and 2, respectively. In addition, we
may determine the projective plane, π, that contains the
three projective 3D points of the triangle vertices. Suppose
that the three projective points defining the plane π are:

    X1 = (X̃1; 1),  X2 = (X̃2; 1),  X3 = (X̃3; 1)    (2)

Then

    π = ( (X̃1 − X̃3) × (X̃2 − X̃3) ; −X̃3ᵀ(X̃1 × X̃2) )

and H1 of equation (1) is given by:

    H1 = λ1 ( (I − C̃π̃ᵀ / (π̃ᵀC̃ + 1)) N⁻¹
              −π̃ᵀN⁻¹ / (π̃ᵀC̃ + 1) )    (3)

where λ1 is a scale factor and π is scaled so that π = (π̃; 1).
H2 is in fact the camera matrix of image 2, P′. Through
offline computation of the trifocal tensor [1] from
correspondences, we can retrieve the camera matrices P,
P′ and the projective matrix N in a fixed projective space.
The 3D points in the projective space can be obtained from
the 2D image correspondences to calculate the projective
plane π. Then, the plane homography can be obtained from
equation (1).
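The construction can be sketched numerically as follows (a sketch under the normalization assumption above, π = (π̃; 1); the cameras and the triangle are made up, with image 1 at the origin and image 2 shifted along x):

```python
import numpy as np

def plane_from_points(X1, X2, X3):
    """Plane through three 3D points: pi = (n; -X3 . (X1 x X2)),
    then scaled so that pi = (pi~; 1)."""
    n = np.cross(X1 - X3, X2 - X3)
    pi = np.append(n, -X3 @ np.cross(X1, X2))
    return pi / pi[3]        # assumes the plane does not pass through the origin

def homography_from_triangle(X1, X2, X3, N, C, P2):
    """H = H2 H1: back-project an image-1 point onto the triangle's
    plane (H1, equation (3) with scale 1), then project with H2 = P'."""
    pi_t = plane_from_points(X1, X2, X3)[:3]
    N_inv = np.linalg.inv(N)
    top = (np.eye(3) - np.outer(C, pi_t) / (pi_t @ C + 1.0)) @ N_inv
    bottom = -(pi_t @ N_inv)[None, :] / (pi_t @ C + 1.0)
    H1 = np.vstack([top, bottom])      # 4x3 back-projection onto the plane
    return P2 @ H1                     # 3x3 plane homography

# Made-up cameras: image 1 at the origin, image 2 shifted along x
N = np.eye(3)
C = np.array([0.0, 0.0, 0.0])
C2 = np.array([1.0, 0.0, 0.0])
P2 = np.hstack([N, (-N @ C2)[:, None]])

# A triangular patch lying on the plane z = 10
X1, X2, X3 = (np.array([0.0, 0.0, 10.0]),
              np.array([1.0, 0.0, 10.0]),
              np.array([0.0, 1.0, 10.0]))

H = homography_from_triangle(X1, X2, X3, N, C, P2)

# The image of X1 in image 1 should map to the image of X1 in image 2
x = np.array([0.0, 0.0, 1.0])
xp = H @ x
print(xp / xp[2])                      # -> [-0.1  0.   1. ]
```

Because H is exact for every point on the triangle's plane, transferring the interior of the patch with H avoids the distortion that barycentric mapping would introduce.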
2.5 Computation of the Blending Coefficients
After warping the triangular patches, we perform the
attribute blending. The problem in attribute blending is
how to select the appropriate reference images and
compute the corresponding texture blending coefficient
for each patch. Referring to figure 4, C1, C2 and C3 are the
locations of the reference images taken by cameras 1, 2
and 3, respectively; D is the location of the desired view;
and p1 and p2 are two example triangular patches.

Figure 4. Selection of appropriate patches for the output view.

To select the appropriate reference images and
blending coefficients λ, a ray is first projected from the
desired view D to the center of the patch (e.g., p1). Rays
are also projected from the reference cameras C1, C2 and
C3 to the patch. We refer to the angles between the output
ray (D, p1) and each of the camera rays (C1, p1), (C2, p1)
and (C3, p1) as θ1, θ2 and θ3, respectively. We select the
patches from the camera with the smallest angle to the
right side of the ray (D, p1) and the camera with the smallest
angle to the left side of the ray (D, p1) for texture blending.
For example, for patch p1, the patches from cameras C1 and
C2 would be selected, while for patch p2, the patches from
cameras C2 and C3 would be selected. The texture blending
coefficients for patch p1 can then be obtained simply by:

    λ1 = θ2 / (θ1 + θ2),  λ2 = 1 − λ1  and  λ3 = 0

where λ1, λ2 and λ3 are the texture blending coefficients of
cameras C1, C2 and C3 on patch p1, respectively. The
texture blending coefficients for patch p2 can be
determined in a similar way. With this method, the
angular deviation between the novel view and the
reference views is minimized and hence, image distortion
due to occlusion is also minimized.

2.6 Combining Attributes

After the texture blending coefficients are computed, the
image attributes can be combined to produce the output
image. Unmatched triangles are drawn first as they may
contain occluded areas. Let the three vertices of a triangle
in image 1 be v1^1, v2^1 and v3^1, and the area of the
triangle in image 1 be s1 = |v1^1 v2^1 v3^1|. The texture
blending coefficient for each pixel of the triangle, λ, is
approximated as the average of the texture blending
coefficients of its three vertices. The resulting pixel value
of the output image I is:

    I = (λ1 s1 I1 + λ2 s2 I2 + λ3 s3 I3) / (λ1 s1 + λ2 s2 + λ3 s3)

3. Results and Discussion
We have implemented the trifocal morphing technique in
Java and conducted a number of experiments to test the
performance of the proposed method. The experiments
were carried out on a Pentium 4 PC with a 2.2GHz
processor and 512MB RAM; no hardware graphics
accelerator was used. We captured three images of an
outdoor scene as the reference images with an
inexpensive digital camera at a resolution of 640x480.
We use an automatic tool to partition and to match the
reference images, which adopts the triangulation concept
similar to the joint view triangulation (JVT) method in
[6]. The registered images are then used for trifocal
morphing. Figure 6 shows several arbitrary views
generated by the trifocal morphing technique. It illustrates
that both interpolation and extrapolation are supported by
this morphing technique. Some video demos of interactive
navigation can be found in http://www.cs.cityu.edu.hk/
~angus/trifocal.
Unlike existing morphing techniques, we target
arbitrary view synthesis instead of linear image
interpolation. The challenge is that it requires 3D
geometric information in order to render views at
arbitrary positions. Although geometric information may
be derived from matching reference images, most images
contain occluded regions which would cause matching
problems. Hence, geometric information may not be
available in those regions. From figure 7, we can see that
some regions in the images are only visible in one of the
images and fail to match, e.g., the rectangular region.
Trifocal morphing handles this problem by implicitly
estimating the geometric information of each non-matched
region from its neighboring matched regions.
Through selecting individual patches from the most
appropriate images, i.e., images closest to the given view,
smooth transition and texture blending can be achieved
for arbitrary view synthesis. Figure 8 shows a sequence of
view point movement. We can see that the rectangular
region, which fails to match among the three reference
images, can smoothly transit from one image to the next.
Figure 9 plots the rendering time of the proposed
method. On average, it takes 0.5s to render an image of
resolution 640x480. To improve output image quality,
bilinear interpolation can be used in pixel warping.
However, the rendering time will increase to 0.7s. If we
reduce the image resolution to 320x240, the processing
time can be reduced significantly to about 0.17s. We can
see that the rendering performance depends mainly on the
image resolution.
Figure 9. Rendering performance of the new method: rendering
time (ms) against frame number, for resolution 640x480 with
bilinear interpolation, 640x480 with nearest neighbour, and
320x240 with bilinear interpolation.
Figure 8. Smooth texture blending on a sequence of rendered images.

4. Conclusion and Future Work
Although image morphing can produce smooth image
transition and appealing visual effects, existing morphing
techniques lack interaction. In this paper, the
trifocal morphing technique is presented, which allows
morphing to be performed in 3D spatial domain, even
though the reference images are in 2D. In shape warping,
we propose a method to calculate the plane homography
from triangular patches based on the intrinsic geometric
relationship among reference images. Thus, shape
warping can be performed with projective transformation
to avoid distortion due to barycentric mapping. In
attribute blending, the angular deviation of the ray
between the output image and the reference images is
considered to minimize mapping errors.
Because of these advantages, trifocal morphing has a
much wider application domain, from single-parameter
time-dependent applications (e.g., producing video
sequences) to unrestricted interactive applications (e.g.,
navigation). An important area of future work is to extend
the technique for image-based rendering (IBR).
References

[1] S. Carlsson, "Duality of Reconstruction and Positioning from Projective Views," Proc. of IEEE Workshop on Representation of Visual Scenes, 1995.
[2] M.S. Floater and C. Gotsman, "How to Morph Tilings Injectively," J. Comp. Appl. Math., 101, pp. 117-129, 1999.
[3] M. Iwanowski, "Image Morphing Based on Morphological Interpolation Combined with Linear Filtering," Journal of WSCG, Vol. 10, 2002.
[4] J. Gomes, L. Velho, B. Costa, and L. Darsa, Warping and Morphing of Graphical Objects, Morgan Kaufmann Publishers, 1998.
[5] S. Lee, G. Wolberg and S.Y. Shin, "Polymorph: Morphing Among Multiple Images," IEEE Computer Graphics and Applications, 18(1), pp. 58-71, 1998.
[6] M. Lhuillier and L. Quan, "Image Interpolation by Joint View Triangulation," Proc. of CVPR, 2, pp. 139-145, 1999.
[7] L. McMillan, "An Image-Based Approach to Three-Dimensional Computer Graphics," Technical Report TR97-013, University of North Carolina at Chapel Hill, 1997.
[8] S. Seitz and C. Dyer, "View Morphing," Proc. of ACM SIGGRAPH '96, pp. 21-30, 1996.
[9] A. Tal and G. Elber, "Image Morphing with Feature Preserving Texture," Computer Graphics Forum, 18(3), 1999.
[10] M. Terasawa, Y. Yamaguchi and K. Odaka, "Real-Time View Morphing for Web Application," Journal of WSCG, Vol. 10, 2002.
Figure 6. Rendered images at arbitrary views.

Figure 7. Regions that cannot be matched (Reference Images 1, 2 and 3).