Trifocal Morphing
Angus M.K. Siu
Ada S.K. Wan
Rynson W.H. Lau
C.W. Ngo
City University of Hong Kong, Hong Kong
E-mail: {angus,adaskw,rynson}@cs.cityu.edu.hk, cscwngo@cityu.edu.hk
Abstract
Image morphing allows smooth transition between
2D images. However, one of the limitations of
existing image morphing techniques is the lack of
interaction - the viewpoints of the interpolated
images are restrained to the line joining the optical
centers of the source and the destination images.
Another limitation of existing image morphing
techniques is that shape warping often causes
distortion due to barycentric mapping. In this
paper, we present our trifocal morphing technique
to address these problems. The new technique
allows a user to change the viewpoint of the output
images, i.e., increasing the degrees of freedom of
interaction, and supports both interpolation and
extrapolation. By making use of the intrinsic
geometric relationship among the reference images
for projective transformation, the distortion from
barycentric mapping is also prevented. Unlike other
warping-based view transferring techniques,
trifocal morphing provides very smooth transition
between reference images and supports both rigid
and non-rigid scenes.
Keywords: image morphing, trifocal morphing,
novel view synthesis.
1. Introduction
Metamorphosis, or morphing for short, generally refers to
the transformation of shape and visual attributes of a
graphic object (GO) [4]. The shape of a GO in
n-dimensional space is defined as U ⊂ R^n, while its
attributes can be expressed as f: U → R^k. A morph
involves two GOs, U0 and U1. It defines maps C0: U0 → U1
and C1: U1 → U0, specifying a complete correspondence
between points of the two GOs. The morphing process
includes shape warping and attribute blending. If we
define points p0 ∈ U0 and p1 ∈ U1, shape warping
interpolates the shapes between U0 and U1 with a single
parameter λ ∈ [0, 1]:

    w0(p0, λ) = (1 − λ)p0 + λC0(p0)
    w1(p1, λ) = (1 − λ)p1 + λC1(p1)

while attribute blending is defined as:

    G(λ) = (1 − λ) f(W0(λ)) + λ f(W1(1 − λ))
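As a concrete illustration, the warp and blend above can be sketched in a few lines of Python (a minimal sketch: the GOs are reduced to corresponding 2D point sets, and the correspondence maps C0, C1 are given directly by the pairing of the arrays):

```python
import numpy as np

def shape_warp(p0, p1, lam):
    """Interpolate corresponding point sets U0 and U1.

    w0(p0, lam) = (1 - lam) * p0 + lam * C0(p0), where C0 maps each
    point of U0 to its correspondent in U1 (here given directly as p1).
    """
    w0 = (1 - lam) * p0 + lam * p1      # warp of U0 towards U1
    w1 = (1 - lam) * p1 + lam * p0      # warp of U1 towards U0
    return w0, w1

def blend(f0, f1, lam):
    """Blend attribute values sampled at the warped positions."""
    return (1 - lam) * f0 + lam * f1

# Two corresponding points and a scalar attribute (e.g., a grey level)
p0 = np.array([0.0, 0.0])
p1 = np.array([2.0, 4.0])
w0, w1 = shape_warp(p0, p1, 0.5)
print(w0)                         # midway shape: [1. 2.]
print(blend(10.0, 20.0, 0.5))     # midway attribute: 15.0
```

At λ = 0 the morph reproduces U0 with attributes f0; at λ = 1 it reproduces U1 with attributes f1.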
Image morphing [3,5,9] is a special type of morphing
operation where the GOs are defined in the 2D image
domain, i.e., U ⊂ R². It is simple and efficient; both shape
warping and attribute blending are performed in the 2D
image domain. However, these 2D GOs are in fact
projections of 3D GOs. Thus, 3D distortions or unnatural
image transitions may occur if naive image morphing
techniques are used.
View Morphing [8,10] focuses on producing smooth
transition between the source and the destination views. It
considers the fact that each 2D image point is a projection
of a 3D scene point. Natural transition can be achieved by
first prewarping two images, computing a morph (image
warp and cross-dissolve) between the prewarped images,
and then postwarping each in-between image produced by
the morph. View morphing also accommodates changes
in projective shape. However, it does not address the
image partitioning and visibility problems.
In [6], an automatic technique was proposed, which is
based on the joint view triangulation (JVT) method to
carry out image partitioning and matching for image
interpolation / morphing. The JVT method constructs an
efficient representation scheme for the source and the
destination images for image interpolation.
In general, existing image morphing techniques restrict
the viewpoint of the output images and thus do not
support arbitrary view synthesis. In addition, the
output of the morphing function depends only on a
single input parameter (usually time t), which limits
the application of the morphing techniques to video or
other time-dependent applications. The user therefore
cannot interact with the system to obtain a particular
output image from the morphing function. On the other
hand, some image point warping techniques do provide
solutions for transferring image points to an arbitrary view.
Proceedings of the Seventh International Conference on Information Visualization (IV’03)
1093-9547/03 $17.00 © 2003 IEEE
In this paper, we propose a trifocal morphing
technique which allows interactive synthesis of arbitrary
novel views and provides smooth transition between
reference images. In addition, it only requires partial
correspondences and thus supports both rigid and non-rigid scenes.
The rest of this paper is organized as follows. Section
2 presents the overall framework and the details of our
trifocal morphing technique. Section 3 demonstrates some
experimental results of the proposed technique. Finally,
section 4 concludes the paper with discussion on possible
future work.
2. Trifocal Morphing
In this section, we first present the representation scheme
of trifocal morphing. After that, the rendering and
computational details are discussed.
2.1 Representation Scheme
Trifocal morphing is based on three arbitrary reference
images I0, I1, I2 ⊂ R². Each reference image Ii is
composed of a set of GOs, Uji ⊂ Ii ⊂ R². In our method,
we first extract a set of feature points for image I0 and
then partition the image through Delaunay triangulation on
the feature points.
The partitioned triangular patches in image I0 are then
matched with those in images I1 and I2 to obtain a set of
matched triangular patches. For the non-matched regions,
we perform edge-constrained Delaunay triangulation on
each image. As a result, each image is partitioned into a
set of matched triangular patches, Mi, and another set of
non-matched triangular patches, Ni, where Mi ∩ Ni = ∅.
Figure 1 shows an example of three partitioned images.
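The partitioning step can be sketched as follows (a minimal sketch assuming the feature points have already been extracted; it uses SciPy's Delaunay triangulation, and the coordinates below are made up for illustration — patch matching across images is not shown):

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical feature points detected in image I0 (pixel coordinates)
feature_points = np.array([
    [10.0, 10.0], [200.0, 15.0], [120.0, 180.0],
    [30.0, 220.0], [240.0, 210.0],
])

# Partition the image into triangular patches on the feature points
tri = Delaunay(feature_points)

# Each row of tri.simplices indexes the 3 vertices of one patch
for simplex in tri.simplices:
    print(feature_points[simplex])
```

Each resulting simplex is one candidate triangular patch to be matched against images I1 and I2.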
In order to allow effective image warping to any
arbitrary viewpoints, generalized disparity [7] is also
calculated for the feature points. Since the accuracy of
point matching is up to one pixel, point correspondences
do not exactly satisfy the geometric relations. We use
linear triangulation method to get the maximum
likelihood estimate of a 3D point X from its 2D
Figure 1.
correspondences x, x’ and x’’ in reference images I0, I1,
and I2, respectively. Let P, P’ and P” be the 3x4 camera
matrices of images I0, I1, and I2, respectively. Then, x =
PX, x’ = P’X and x” = P”X. These equations can be
combined into a form of AX = 0, where
A
ª xp 3T p1T º
»
«
« yp 3T p 2T »
«
3T
1T »
»
« x ' p' p'
« y ' p' 3T p' 2T »
»
«
« x" p" 3T p"1T »
«
3T
2T »
¬ y" p" p" ¼
P
§ p1T ·
¨ 2T ¸
¨p ¸
¨ p 3T ¸
¹
©
x = (x, y, 1)T
This is a redundant set of equations of the form AX = 0.
We perform singular value decomposition on A and obtain
the least-squares solution for X as the unit singular
vector corresponding to the smallest singular value of A.
After X is obtained, the generalized disparity, δ(x), for 2D
image point x can be calculated as:

    δ(x) = ‖N x‖ / ‖X̃ − C̃‖,  where C = (C̃; 1) and X = (X̃; 1)

and N is the 3×3 projective matrix of image 1, i.e., P = (N | p^4).
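The triangulation and disparity computation can be sketched as follows (a sketch with made-up cameras: `cameras` holds the three 3×4 matrices P, P′, P″, `points2d` the measured correspondences, and the 3D point is chosen for illustration):

```python
import numpy as np

def triangulate(cameras, points2d):
    """Linear triangulation: stack the rows x*p^3T - p^1T and
    y*p^3T - p^2T for each view into A and solve AX = 0 by SVD."""
    rows = []
    for P, (x, y) in zip(cameras, points2d):
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.array(rows)
    # Least-squares solution: right singular vector belonging to the
    # smallest singular value of A
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X / X[3]                  # normalize so X = (X~; 1)

def generalized_disparity(x, N, C, X):
    """delta(x) = ||N x|| / ||X~ - C~|| for homogeneous x = (x, y, 1)."""
    return np.linalg.norm(N @ x) / np.linalg.norm(X[:3] - C)

# Made-up example: three cameras with N = I observing X = (1, 2, 10)
X_true = np.array([1.0, 2.0, 10.0, 1.0])
N = np.eye(3)
centers = [np.array([0.0, 0.0, 0.0]),
           np.array([1.0, 0.0, 0.0]),
           np.array([0.0, 1.0, 0.0])]
cameras = [np.hstack([N, (-N @ C)[:, None]]) for C in centers]

points2d = []
for P in cameras:
    x = P @ X_true                   # project the 3D point into each view
    points2d.append((x[0] / x[2], x[1] / x[2]))

X = triangulate(cameras, points2d)
print(X)                             # recovers (1, 2, 10, 1)
```

With these synthetic cameras the disparity of the first view's point (0.1, 0.2, 1) comes out as 0.1, i.e., the reciprocal of the point's depth, as expected.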
2.2 3D Morphing with 2D images
In image morphing, both shape warping and attribute
blending are carried out in 2D image domain. Hence, it is
considered as 2D morphing. With view morphing [8],
shape warping is performed in 3D spatial domain while
attribute blending is performed in 2D image domain.
Thus, view morphing is considered as 2.5D morphing.
Our trifocal morphing uses three reference images. While
shape warping is carried out in 3D spatial domain,
attribute blending considers the direction of ray in 3D
spatial domain such that reference images are selected to
minimize the angular deviation. Since both shape warping
and attribute blending are performed in 3D spatial
domain, trifocal morphing is considered as 3D morphing
even though only 2D images are available. Table 1
compares the properties of different morphing techniques.
Details of shape warping, attribute blending and attribute
combination of trifocal morphing are presented in the
remaining subsections.
Figure 1. The partitioning for images I0, I1 and I2.
Table 1. Comparison of morphing techniques.

                   No. of images  Source domain    Shape warping      Attribute blending  Morphing
Image Morphing     2              2D image domain  2D image domain    2D image domain     2D
View Morphing      2              2D image domain  3D spatial domain  2D image domain     2.5D
Trifocal Morphing  3              2D image domain  3D spatial domain  3D spatial domain   3D
2.3 Shape Warping
To allow arbitrary view synthesis, we cannot simply
compute the location of feature points in the novel views
by linear interpolation between source and destination
images as in existing morphing techniques. Instead, we
use the warping equation in [7] to transfer the feature
points from the source images to the novel views and then
perform shape warping. The warping equation for
transferring feature points is defined as follows:

    x′ ≅ δ(x) N′⁻¹ (C̃ − C̃′) + N′⁻¹ N x

where x and x′ are points in the reference image and the
output image, respectively; C̃ and C̃′ are the centers of
projection of the reference image and of the output image,
respectively; N and N′ are the 3×3 projective matrices of
the reference image and of the output image, respectively;
and δ(x) is the generalized disparity at point x.
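A sketch of this transfer for a single feature point (the projective matrices, camera centers and disparity below are made up for illustration; with N = N′ = I and δ(x) = 1/depth the result can be checked against a direct reprojection):

```python
import numpy as np

def warp_point(x, disparity, N, N_out, C, C_out):
    """McMillan-style warping equation:
       x' ~ delta(x) * N'^-1 (C - C') + N'^-1 N x
    x is homogeneous (x, y, 1); returns the normalized warped point."""
    N_out_inv = np.linalg.inv(N_out)
    xp = disparity * N_out_inv @ (C - C_out) + N_out_inv @ (N @ x)
    return xp / xp[2]

# Made-up example: identity projective matrices, camera moved along x
N = np.eye(3)
N_out = np.eye(3)
C = np.array([0.0, 0.0, 0.0])        # reference center of projection
C_out = np.array([1.0, 0.0, 0.0])    # novel-view center of projection
x = np.array([0.1, 0.2, 1.0])        # image of the 3D point (1, 2, 10)
delta = 0.1                          # generalized disparity (1 / depth)

print(warp_point(x, delta, N, N_out, C, C_out))   # -> [0.  0.2 1. ]
```

The warped point (0, 0.2, 1) agrees with projecting the 3D point (1, 2, 10) directly from the shifted center, which is what makes the equation usable for transferring feature points to arbitrary novel views.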
2.4 Projective Transformation
After the feature points are warped, image points within
each triangular patch can be transferred. One of the
commonly used methods is barycentric mapping [2],
which maintains a constant distance ratio of the
transferred point to the three vertices of the triangular
patch when mapping the point from the source image to
the output image. However, the major problem of
barycentric mapping is the potential transformation
distortion. Figure 2 shows that there may be severe
distortion in transforming a pair of triangles with
barycentric mapping, where W is the function that transfers
the image points from the source image to the output
image.
Figure 2. Distortion induced by barycentric mapping.

Since a GO in the 2D image domain is a projection of
a GO in the 3D spatial domain, the use of projective
transformation can avoid the distortion due to barycentric
mapping. In order to perform projective transformation,
we need to determine the plane homography of the patch.

A plane homography H is a one-to-one mapping, which
carries out a projective transformation on a plane from image 1
to image 2, such that λx′ = Hx, where H is a 3×3 matrix, λ
is a scale factor, x is a homogeneous 2D point in image 1, and
x′ is a homogeneous 2D point in image 2.

Since a plane homography has 9 − 1 = 8 degrees of
freedom, 4 matches (ui, ui′) between two images are
required to calculate the homography. Thus, the patches
used to determine the homography would need to be
quadrilateral (with 4 vertices) instead of triangular (with 3
vertices). This makes it difficult to recover the plane
homography, and hence to carry out the projective
transformation, for a triangular patch.

In trifocal morphing, we address this homography
computation problem by proposing a new way of
calculating the plane homography from triangular patches
based on the intrinsic geometric relationship among the
reference images. Referring to figure 3, since a 3D
triangle defines a plane in the projective domain, the
plane homography can be obtained even if the shape of the
patch is triangular.

Figure 3. Implicit geometry of a triangular patch.

Let H1 be a 4×3 transformation matrix that transforms
a homogeneous 2D point into the projective 3D space and
H2 be a 3×4 transformation matrix that transforms a
homogeneous 3D point into a 2D point in image 2. The
plane homography H is then given as:

    x′ ≅ Hx = H2 H1 x    (1)

where x and x′ are the corresponding homogeneous 2D
point pair in images 1 and 2, respectively. In addition, we
may determine the projective plane, π, that contains the
three projective 3D points of the triangle vertices. Suppose
that the three projective points defining the plane π are:

    X1 = (X̃1; 1),  X2 = (X̃2; 1),  X3 = (X̃3; 1)    (2)

Then

    π = ( (X̃1 − X̃3) × (X̃2 − X̃3) ; −X̃3ᵀ(X̃1 × X̃2) )

and H1 of equation (1) is given by:

    H1 = λ1 ( (I − C̃π̃ᵀ / (π̃ᵀC̃ + 1)) N⁻¹
              −π̃ᵀN⁻¹ / (π̃ᵀC̃ + 1) )    (3)

where λ1 is a scale factor and π is scaled so that π = (π̃; 1).
H2 is in fact the camera matrix of image 2, P′. Through
offline computation of the trifocal tensor [1] from
correspondences, we can retrieve the camera matrices P,
P′ and the projective matrix N in a fixed projective space.
The 3D points in the projective space can be obtained from
the 2D image correspondences to calculate the projective
plane π. Then, the plane homography can be obtained from
equation (1).
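The construction can be sketched numerically as follows (a sketch under the normalization assumption above, π = (π̃; 1); the cameras and the triangle are made up, with image 1 at the origin and image 2 shifted along x):

```python
import numpy as np

def plane_from_points(X1, X2, X3):
    """Plane through three 3D points: pi = (n; -X3 . (X1 x X2)),
    then scaled so that pi = (pi~; 1)."""
    n = np.cross(X1 - X3, X2 - X3)
    pi = np.append(n, -X3 @ np.cross(X1, X2))
    return pi / pi[3]        # assumes the plane does not pass through the origin

def homography_from_triangle(X1, X2, X3, N, C, P2):
    """H = H2 H1: back-project an image-1 point onto the triangle's
    plane (H1, equation (3) with scale 1), then project with H2 = P'."""
    pi_t = plane_from_points(X1, X2, X3)[:3]
    N_inv = np.linalg.inv(N)
    top = (np.eye(3) - np.outer(C, pi_t) / (pi_t @ C + 1.0)) @ N_inv
    bottom = -(pi_t @ N_inv)[None, :] / (pi_t @ C + 1.0)
    H1 = np.vstack([top, bottom])      # 4x3 back-projection onto the plane
    return P2 @ H1                     # 3x3 plane homography

# Made-up cameras: image 1 at the origin, image 2 shifted along x
N = np.eye(3)
C = np.array([0.0, 0.0, 0.0])
C2 = np.array([1.0, 0.0, 0.0])
P2 = np.hstack([N, (-N @ C2)[:, None]])

# A triangular patch lying on the plane z = 10
X1, X2, X3 = (np.array([0.0, 0.0, 10.0]),
              np.array([1.0, 0.0, 10.0]),
              np.array([0.0, 1.0, 10.0]))

H = homography_from_triangle(X1, X2, X3, N, C, P2)

# The image of X1 in image 1 should map to the image of X1 in image 2
x = np.array([0.0, 0.0, 1.0])
xp = H @ x
print(xp / xp[2])                      # -> [-0.1  0.   1. ]
```

Because H is exact for every point on the triangle's plane, transferring the interior of the patch with H avoids the distortion that barycentric mapping would introduce.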
2.5 Computation of the Blending Coefficients
After warping the triangular patches, we perform the
attribute blending. The problem in attribute blending is
how to select the appropriate reference images and
compute the corresponding texture blending coefficient
for each patch. Referring to figure 4, C1, C2 and C3 are the
locations of the reference images taken by cameras 1, 2
and 3, respectively; D is the location of the desired view;
and p1 and p2 are two example triangular patches.

Figure 4. Selection of appropriate patches for the output view.

To select the appropriate reference images and
blending coefficients λ, a ray is first projected from the
desired view D to the center of the patch (e.g., p1). Rays
are also projected from the reference cameras C1, C2 and
C3 to the patch. We refer to the angles between the output
ray (D, p1) and each of the camera rays (C1, p1), (C2, p1)
and (C3, p1) as θ1, θ2 and θ3, respectively. We select the
patches from the camera with the smallest angle to the
right side of the ray (D, p1) and the camera with the smallest
angle to the left side of the ray (D, p1) for texture blending.
For example, for patch p1, the patches from cameras C1 and
C2 would be selected, while for patch p2, the patches from
cameras C2 and C3 would be selected. The texture blending
coefficients for patch p1 can then be obtained simply by:

    λ1 = θ2 / (θ1 + θ2),  λ2 = 1 − λ1  and  λ3 = 0

where λ1, λ2 and λ3 are the texture blending coefficients of
cameras C1, C2 and C3 on patch p1, respectively. The
texture blending coefficients for patch p2 can be
determined in a similar way. With this method, the
angular deviation between the novel view and the
reference views is minimized and hence, image distortion
due to occlusion is also minimized.

2.6 Combining Attributes

After the texture blending coefficients are computed, the
image attributes can be combined to produce the output
image. Unmatched triangles are drawn first as they may
contain occluded areas. Let the three vertices of a triangle
in image 1 be v1^1, v2^1 and v3^1, and the area of the
triangle in image 1 be s1 = |v1^1 v2^1 v3^1|. The texture
blending coefficient for each pixel of the triangle, λ, is
approximated as the average of the texture blending
coefficients of its three vertices. The resulting pixel value
of the output image I is:

    I = (λ1 s1 I1 + λ2 s2 I2 + λ3 s3 I3) / (λ1 s1 + λ2 s2 + λ3 s3)

3. Results and Discussion
We have implemented the trifocal morphing technique in
Java and conducted a number of experiments to test the
performance of the proposed method. The experiments
were carried out on a Pentium 4 PC with a 2.2GHz
processor and 512MB RAM; no hardware graphics
accelerator was used. We captured three images of an
outdoor scene as the reference images with an
inexpensive digital camera at a resolution of 640x480.
We use an automatic tool to partition and to match the
reference images, which adopts the triangulation concept
similar to the joint view triangulation (JVT) method in
[6]. The registered images are then used for trifocal
morphing. Figure 6 shows several arbitrary views
generated by the trifocal morphing technique. It illustrates
that both interpolation and extrapolation are supported by
this morphing technique. Some video demos of interactive
navigation can be found in http://www.cs.cityu.edu.hk/
~angus/trifocal.
Unlike existing morphing techniques, we target
arbitrary view synthesis instead of linear image
interpolation. The challenge is that it requires 3D
geometric information in order to render views at
arbitrary positions. Although geometric information may
be derived from matching reference images, most images
contain occluded regions which would cause matching
problems. Hence, geometric information may not be
available in those regions. From figure 7, we can see that
some regions in the images are only visible in one of the
images and fail to match, e.g., the rectangular region.
Trifocal morphing handles this problem by implicitly
estimating the geometric information of each non-matched
region from its neighboring matched regions.
Through selecting individual patches from the most
appropriate images, i.e., images closest to the given view,
smooth transition and texture blending can be achieved
for arbitrary view synthesis. Figure 8 shows a sequence of
view point movement. We can see that the rectangular
region, which fails to match among the three reference
images, can smoothly transit from one image to the next.
Figure 9 plots the rendering time of the proposed
method. On average, it takes 0.5s to render an image of
resolution 640x480. To improve output image quality,
bilinear interpolation can be used in pixel warping.
However, the rendering time will increase to 0.7s. If we
reduce the image resolution to 320x240, the processing
time can be reduced significantly to about 0.17s. We can
see that the rendering performance depends mainly on the
image resolution.
Figure 9. Rendering performance of the new method: rendering
time (ms) against frame number, for resolution 640x480 with
bilinear interpolation, 640x480 with nearest neighbour, and
320x240 with bilinear interpolation.
Figure 8. Smooth texture blending on a sequence of rendered images.

4. Conclusion and Future Work
Although image morphing can produce smooth image
transition and appealing visual effects, existing morphing
techniques lack interaction. In this paper, the
trifocal morphing technique is presented, which allows
morphing to be performed in 3D spatial domain, even
though the reference images are in 2D. In shape warping,
we propose a method to calculate the plane homography
from triangular patches based on the intrinsic geometric
relationship among reference images. Thus, shape
warping can be performed with projective transformation
to avoid distortion due to barycentric mapping. In
attribute blending, the angular deviation of the ray
between the output image and the reference images is
considered to minimize mapping errors.
Because of these advantages, trifocal morphing has a
much wider application domain, from single-parameter
time-dependent applications (e.g., producing video
sequences) to unrestricted interactive applications (e.g.,
navigation). An important area of future work is to extend
the technique for image-based rendering (IBR).
References

[1] S. Carlsson, "Duality of Reconstruction and Positioning from Projective Views," Proc. of IEEE Workshop on Representation of Visual Scenes, 1995.
[2] M.S. Floater and C. Gotsman, "How to Morph Tilings Injectively," J. Comp. Appl. Math., 101, pp. 117-129, 1999.
[3] M. Iwanowski, "Image Morphing Based on Morphological Interpolation Combined with Linear Filtering," Journal of WSCG, Vol. 10, 2002.
[4] J. Gomes, L. Velho, B. Costa, and L. Darsa, Warping and Morphing of Graphical Objects, Morgan Kaufmann Publishers, 1998.
[5] S. Lee, G. Wolberg and S.Y. Shin, "Polymorph: Morphing Among Multiple Images," IEEE Computer Graphics and Applications, 18(1), pp. 58-71, 1998.
[6] M. Lhuillier and L. Quan, "Image Interpolation by Joint View Triangulation," Proc. of CVPR, 2, pp. 139-145, 1999.
[7] L. McMillan, "An Image-Based Approach to Three-Dimensional Computer Graphics," Technical Report TR97-013, University of North Carolina at Chapel Hill, 1997.
[8] S. Seitz and C. Dyer, "View Morphing," Proc. of ACM SIGGRAPH '96, pp. 21-30, 1996.
[9] A. Tal and G. Elber, "Image Morphing with Feature Preserving Texture," Computer Graphics Forum, 18(3), 1999.
[10] M. Terasawa, Y. Yamaguchi and K. Odaka, "Real-Time View Morphing for Web Application," Journal of WSCG, Vol. 10, 2002.
Figure 6. Rendered images at arbitrary views.

Figure 7. Regions that cannot be matched (Reference Images 1, 2 and 3).