Compression-based Image Registration

Anton Bardera, Miquel Feixas, Imma Boada, Mateu Sbert
IIiA, Universitat de Girona, Campus Montilivi, P4, 17004 Girona, Spain
Email: anton.bardera@udg.es

Abstract— Image registration is an important component of image analysis used to align two or more images. In this paper, we present a new framework for image registration based on compression. The basic idea underlying our approach is the conjecture that two images are correctly registered when we can maximally compress one image given the information in the other. The contribution of this paper is twofold. First, we show that the image registration process can be dealt with from the perspective of a compression problem. Second, we demonstrate that the similarity metric introduced by Li et al. performs well in image registration. Two different versions of the similarity metric have been used: the Kolmogorov version, computed using standard real-world compressors, and the Shannon version, calculated from an estimation of the entropy rate of the images.

I. INTRODUCTION

The registration of two images consists in finding the transformation that brings one image into the best possible spatial correspondence with the other. A common way of solving the registration task is to treat it as a mathematical optimization problem, using a similarity measure to quantify the quality of the alignment of the two images for any given transformation. Some information-theoretic measures, such as mutual information (MI) [1], [2] and normalized mutual information (NMI) [3], have become a standard reference, mainly in medical imaging, due to their accuracy and robustness. On the other hand, the normalized information distance [4], based on the non-computable notion of Kolmogorov complexity, has been introduced for measuring similarity between sequences [5], [6].
It has been successfully applied in areas such as genome phylogeny [5], language phylogeny [6] and classification of music pieces [7]. In essence, the idea behind it is that two objects are similar if we can significantly compress one given the information in the other. In this paper, we apply a normalized version of the information metric [4], called the similarity metric, to image registration, using two different versions. In the first case, the Kolmogorov complexity is computed using standard real-world compressors and, in the second case, an estimation of the entropy rate of the images is used. In both cases, experimental results demonstrate that the similarity metric performs well in image registration. However, the entropy rate approach is more accurate and robust than the Kolmogorov version due to compressor imperfections.

This paper is organized as follows. In Section II we survey background and related work, and in Section III we present our framework for image registration. Experimental results are given in Section IV.

II. BACKGROUND

In this section we review the similarity metric based on the Kolmogorov complexity [6], some basic information-theoretic measures [8], [9] and their application to image registration [1], [2], [10].

A. The Similarity Metric

The Kolmogorov complexity K(x) of a string x is the length of the shortest program to compute x on an appropriate universal computer. Essentially, the Kolmogorov complexity of a string is the length of the ultimate compressed version of the string. The conditional complexity K(x|y) of x relative to y is defined as the length of the shortest program to compute x given y as an auxiliary input to the computation. The joint complexity K(x, y) represents the length of the shortest program for the pair (x, y) [6]. In [4], the information distance is defined as the length of the shortest program that computes x from y and y from x.
It was shown there that, up to an additive logarithmic term, the information distance is given by

E(x, y) = max{K(y|x), K(x|y)}.   (1)

It was also shown that E(x, y) is a metric. It is interesting to note that long strings that differ by a tiny part are intuitively closer than short strings that differ by the same amount. Hence arises the necessity of normalizing the information distance. In [6], the normalized version of E(x, y), called the normalized information distance (NID) or the similarity metric, is defined by

NID(x, y) = max{K(x|y), K(y|x)} / max{K(x), K(y)}
          = (K(x, y) − min{K(x), K(y)}) / max{K(x), K(y)}.   (2)

Li et al. have shown that NID(x, y) is a metric and takes values in [0, 1]. It is also universal in the sense that if two strings are similar according to the particular feature described by a particular normalized admissible distance (not necessarily a metric), then they are also similar in the sense of the normalized information metric [11]. Due to the non-computability of the Kolmogorov complexity, a feasible version of the normalized information distance (2), called the normalized compression distance, is defined as

NCD(x, y) = (C(x, y) − min{C(x), C(y)}) / max{C(x), C(y)}.   (3)

NCD is computed from the lengths of compressed data files. Thus, C(x) (or C(y)) represents the length of the compressed string x (or y) and C(x, y) the length of the compressed pair (x, y). In this way, NCD approximates NID by using a standard real-world compressor.

B. Information-theoretic Measures

Let X be a random variable taking values x in a finite set with distribution p(x) = Pr[X = x]. Likewise, let Y be a random variable taking values y. The Shannon entropy H(X) of a random variable X is defined by

H(X) = − Σ_x p(x) log p(x).   (4)

The Shannon entropy H(X) measures the average uncertainty of random variable X. If the logarithms are taken in base 2, entropy is expressed in bits. The conditional entropy is defined by

H(X|Y) = − Σ_{x,y} p(x, y) log p(x|y),   (5)

where p(x, y) = Pr[X = x, Y = y] is the joint probability and p(x|y) = Pr[X = x | Y = y] is the conditional probability. The conditional entropy H(X|Y) measures the average uncertainty associated with X if we know the outcome of Y. The mutual information between X and Y is defined by

I(X, Y) = H(X) − H(X|Y) = H(Y) − H(Y|X).   (6)

It is a measure of the shared information between X and Y.

We review now the definition of entropy rate. The notation used here is inspired by the work of Feldman and Crutchfield [9]. Given a chain ... X_{−2} X_{−1} X_0 X_1 X_2 ... of random variables X_i, a block of L consecutive random variables is denoted by X^L = X_1 ... X_L. The probability that the particular L-block x^L occurs is denoted by p(x^L). The Shannon entropy of length-L sequences, or L-block entropy, is defined by

H(L) = − Σ_{x^L} p(x^L) log p(x^L),   (7)

where the sum runs over all possible L-blocks. The entropy rate is defined by

h_x = lim_{L→∞} H(L)/L   (8)

and measures the average amount of information per symbol [8]. The entropy rate can also be rewritten as

h_x = lim_{L→∞} h_x(L),   (9)

where h_x(L) = H(X_L | X_{L−1} X_{L−2} ... X_1) is the entropy of a symbol conditioned on a block of L−1 adjacent symbols. The entropy rate of a sequence is a measure of its compressibility, i.e., the optimal achievement for any possible compression algorithm [12].

Fig. 1. Main components of the registration process: the metric compares the pixels of the fixed image with the pixels of the transformed moving image (evaluated via the interpolator), and its fitness value drives the optimizer, which updates the transform parameters.

C. MI-based Image Registration

Image registration is treated as an iterative optimization problem with the goal of finding the spatial mapping that brings the two images into alignment, resulting in a fused image. This process is composed of four elements (see Fig. 1). As input, we have the fixed image X and the moving image Y. The transform represents the spatial mapping of points from the fixed image space to points in the moving image space. The interpolator is used to evaluate moving image intensities at non-grid positions. The metric provides a measure of how well the fixed image is matched by the transformed moving one. This measure forms the quantitative criterion to be optimized by the optimizer over the search space defined by the parameters of the transform.

The crucial point of image registration is the choice of a metric. The registration between two images X and Y can be represented by an information channel X → Y, whose marginal and joint probability distributions are obtained by simple normalization of the corresponding intensity histograms of the overlap area of both images [1]. The most successful automatic image registration methods are based on the maximization of MI. This method, introduced almost simultaneously by Maes et al. [1] and Viola et al. [2], is based on the conjecture that the correct registration corresponds to the maximum MI between the overlap areas of the two images. Later, Studholme et al. [3] proposed a normalization of mutual information defined by

NMI(X, Y) = I(X, Y)/H(X, Y) = 1 − (H(X|Y) + H(Y|X))/H(X, Y),   (10)

which is more robust than MI due to its greater independence of the overlap area. Another justification of its good behaviour is the following. In information theory, the most basic information metric is given by H(X|Y) + H(Y|X). This measure fulfills the requirements for a true distance. The normalization of this distance by the joint entropy is also a true distance [6], [13]. In the image registration literature, this fact has not been sufficiently stressed. It has to be noted that 1 − NMI is very close to the Shannon version of NID (see [6], [13]).

III. REGISTRATION BASED ON THE SIMILARITY METRIC

We propose three different approaches for image registration based on NID. The first two are based on the Kolmogorov complexity (Sect.
III-A) and the third one on the Shannon information theory (Section III-B).

Fig. 2. Lena image and its R, G and B channels. The similarity metric is evaluated within the dashed window.

Fig. 3. JPEG 2000-based registration of the Lena image. (a) Plot of the compressed file length C(x, y). (b) Plot of NCD.

Fig. 4. Scan path on the overlap area between images X and Y.

A. Kolmogorov Version

In this section, the similarity metric is implemented using both image and text compressors.

1) Image compressor-based registration: A color image is represented as a finite number of color image planes, where each color is obtained by filtering the image spectrum and measuring the resulting luminosity energy. Usually, digital color images use three filters, corresponding to the red (R), green (G) and blue (B) channels. In Fig. 2, we show a color image with its RGB channels. Our proposal is to use an image compressor to register monochrome images. To obtain the fused image, the method places each of the monochrome images in a different color channel of an RGB image. Then, for each arrangement of the images, the resulting image and the original ones are compressed. In our experiments, we used the JPEG 2000 compressor, which is based on wavelet technology. C(x, y), C(x), and C(y) are given by the lengths of the JPEG 2000 files corresponding to the compressed fused image and the original images X and Y, respectively. The correct registration is achieved when NCD is minimum. The application of an image compressor requires rectangular images. Since the overlap area, in general, is not rectangular, different strategies can be considered. In our experiments, we reduce the area of the image to be processed to the maximal rectangular area contained in the overlap region.
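To make the procedure concrete, the NCD computation of Eq. (3) can be sketched as follows. This is a minimal illustration rather than the paper's implementation: zlib stands in for the JPEG 2000 and bzip2 compressors, the pair (x, y) is approximated by concatenating the two byte strings (as in Cilibrasi and Vitányi's clustering work) instead of fusing color channels, and the data are hypothetical random bytes rather than image channels.

```python
import random
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance, Eq. (3):
    NCD(x, y) = (C(x, y) - min(C(x), C(y))) / max(C(x), C(y)).
    zlib stands in for the paper's JPEG 2000 / bzip2 compressors,
    and the pair (x, y) is approximated by concatenation."""
    cx = len(zlib.compress(x, 9))
    cy = len(zlib.compress(y, 9))
    cxy = len(zlib.compress(x + y, 9))
    return (cxy - min(cx, cy)) / max(cx, cy)

# Hypothetical data: random byte strings standing in for channels.
random.seed(0)
a = bytes(random.randrange(256) for _ in range(4096))  # "fixed" bytes
b = bytes(random.randrange(256) for _ in range(4096))  # unrelated bytes

# A string is maximally compressible given itself, so NCD(a, a) ~ 0,
# while two independent random strings give NCD close to 1.
assert ncd(a, a) < 0.1 < ncd(a, b)
```

With real images, x and y would be the pixel streams of the two monochrome channels, and the minimum of NCD over the candidate transforms would mark the registration position.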
2) Text compressor-based registration: Our proposal is to convert the registration process into a text compression problem. We scan the overlap area of the two images to be registered (see Fig. 4). From the scan path, strings x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn) are obtained by taking in consecutive order the intensity values of the pixels of the fixed and moving images, respectively. In a similar way, the superimposed string (x, y) = ((x1, y1), (x2, y2), ..., (xn, yn)) is obtained by taking alternately the intensity values of the pixels of both superimposed images. This notation has been extracted from [14]. For each iteration of the registration process, the resulting string files are compressed. In our experiments, we use the bzip2 compressor, based on the Burrows-Wheeler block-sorting text compression algorithm and Huffman coding. The registration position is achieved by minimizing NCD. In this case, C(x, y), C(x), and C(y) are given by the lengths of the files compressed by bzip2. This method has no restrictions on the shape of the overlap area.

B. Shannon Version

From (2), our proposal is to substitute the Kolmogorov complexity by the entropy rate, which expresses the maximum compressibility of a sequence. We assume here that the strings x, y and (x, y) are generated by finite-order stationary Markov sources [14]. Thus, the Shannon version of the similarity metric is defined by

NED(x, y) = (h_{x,y} − min{h_x, h_y}) / max{h_x, h_y},   (11)

where

h_{x,y} = lim_{L→∞} H((X, Y)^L)/L = lim_{L→∞} H((X_L, Y_L) | (X_{L−1}, Y_{L−1}) ... (X_1, Y_1))   (12)

represents the entropy rate of (x, y), i.e., the maximum compressibility of the two superimposed strings. With this proposal, we make use of neighbor information by considering the correspondence between blocks of pixels instead of the correspondence between individual pixels. Similarly to the bzip2 case, to generate the blocks of pixels we follow the scan path on the overlap area of the two images to be registered (see Fig. 4). Each block is composed of L consecutive pixels.
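A minimal sketch of this Shannon version follows, assuming toy symbol lists in place of real scan strings; the helper names block_entropy, entropy_rate and ned are ours, and the entropy rate is estimated at a finite L as the conditional block entropy h(L) = H(L) − H(L−1).

```python
import random
from collections import Counter
from math import log2

def block_entropy(seq, L):
    """L-block entropy H(L), Eq. (7), from the empirical histogram."""
    blocks = Counter(tuple(seq[i:i + L]) for i in range(len(seq) - L + 1))
    n = sum(blocks.values())
    return -sum((c / n) * log2(c / n) for c in blocks.values())

def entropy_rate(seq, L):
    """Finite-L estimate h(L) = H(L) - H(L-1): the entropy of one
    symbol conditioned on a block of L-1 adjacent symbols, Eq. (9)."""
    h = block_entropy(seq, L)
    return h if L == 1 else h - block_entropy(seq, L - 1)

def ned(x, y, L=2):
    """Shannon version of the similarity metric, Eq. (11); the
    superimposed string pairs the pixel values (x_i, y_i)."""
    hx, hy = entropy_rate(x, L), entropy_rate(y, L)
    hxy = entropy_rate(list(zip(x, y)), L)
    return (hxy - min(hx, hy)) / max(hx, hy)

# Toy scan strings over a reduced 4-symbol alphabet (cf. Sect. III-B):
random.seed(0)
x = [random.randrange(4) for _ in range(2000)]
y = [random.randrange(4) for _ in range(2000)]
assert ned(x, x) < 0.1   # identical strings: distance ~ 0
assert ned(x, y) > 0.5   # independent strings: distance ~ 1
```

The reduced alphabet keeps the joint L-block histogram dense enough to estimate reliably, which is exactly the quantization issue discussed below.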
Note that the computations are influenced by the size of the blocks, since both the sparsity of the joint histogram and the computational cost of the process increase with it. To overcome these limitations, reductions of the original alphabet (0..255) to 8 or 16 symbols are appropriate.

IV. RESULTS

In this section, we analyze the performance of the compression-based registration measures. The first experiment analyzes the JPEG 2000-based registration approach. For this test we use the Lena image of Fig. 2, moving the red channel along the X axis from −10 pixels to 10 pixels around the origin. To guarantee a rectangular overlap, we define a rectangular window as shown in Fig. 2. The resulting images are compressed, obtaining C(x, y), C(x) and C(y) from the length of the compressed RGB image, the length of the green-blue image and the length of the red image, respectively. Fig. 3 depicts the obtained compression C(x, y) (a) and the values of the NCD measure (b). Observe that both plots behave similarly, reaching the minimum at the registration position.

Fig. 5. CT-MR (top) and MR-PET (below) pairs of test images.

For the next three experiments, we use two different pairs of medical images of 256 × 256 pixels obtained from a patient's brain (see Fig. 5). The first pair consists of a computed tomography (CT) image, which provides precise anatomical characteristics of the brain, and a magnetic resonance (MR) image, which gives accurate information on the soft tissue. The second pair consists of an MR image and a positron emission tomography (PET) image, which provides functional information. Registration of these modalities is of special interest for medical diagnosis. In the tests with the CT-MR pair, the CT image has been taken as the fixed image and the MR image as the moving one, and in the tests with the MR-PET pair, the PET image has been taken as the moving one.
In both cases, the moving images are translated along the X axis from −10 pixels to 10 pixels around the registration position. The results obtained for each of the methods (JPEG 2000, bzip2, and entropy rate) are illustrated, respectively, in Fig. 6 (a), (b) and (c). With the CT-MR pair, Fig. 6(i.a-b) show the compression C(x, y) and Fig. 6(i.c) the h_{x,y} measure, while Fig. 6(ii.a-b) show the NCD measure and Fig. 6(ii.c) the NED measure. With the MR-PET pair, Fig. 6(iii.a-b) show the NCD measure and Fig. 6(iii.c) the NED measure. To apply the JPEG 2000-based registration method, we define a rectangular window in all test images in order to obtain a rectangular overlap area. Observe that for the MR-CT pair, the compression C(x, y) (Fig. 6(i.a)) has several local minima and the absolute minimum is not at the registration position. On the contrary, the NCD plot (Fig. 6(ii.a)) has a smooth shape, reaching its minimum at the correct position. The undesirable behaviour in the NCD plot of the MR-PET pair (Fig. 6(iii.a)) is due to the high level of noise of the PET images.

From the results obtained with the bzip2-based registration for the MR-CT pair (see Fig. 6(i-ii.b)), it can be seen that the minima of the C(x, y) and NCD plots do not coincide, and neither do the shapes at the extremes of the plots. Observe that the compression C(x, y) decreases with long translations (more than 5 pixels). This behaviour is due to the fact that long translations lead to small overlap regions and a reduction of the information to be compressed. As can be seen in the NCD plot (Fig. 6(ii.b)), this measure corrects this undesirable behaviour and, moreover, the minimum of the function coincides with the registration position. Observe that the NCD plot obtained with the MR-PET pair (Fig. 6(iii.b)), although it behaves better than with the JPEG 2000 method (Fig. 6(iii.a)), still has some local minima.
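The translation sweep used in these experiments can be sketched with the NMI measure of Eq. (10), which is straightforward to compute from intensity histograms. The 1-D "images", the monotone intensity remapping standing in for a second modality, and the helper names nmi and register_1d are all hypothetical.

```python
import random
from collections import Counter
from math import log2

def entropy(counts):
    # Shannon entropy of an empirical histogram, in bits.
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())

def nmi(x, y):
    """Normalized mutual information, Eq. (10): NMI = I(X, Y) / H(X, Y)."""
    hx, hy = entropy(Counter(x)), entropy(Counter(y))
    hxy = entropy(Counter(zip(x, y)))
    return (hx + hy - hxy) / hxy

def register_1d(fixed, moving, max_shift=10):
    """Sweep x-translations of the moving image and return the shift
    maximizing NMI over the overlap, mirroring the experiments above."""
    def overlap_nmi(t):
        f = fixed[max(t, 0):len(fixed) + min(t, 0)]
        m = moving[max(-t, 0):len(moving) + min(-t, 0)]
        return nmi(f, m)
    return max(range(-max_shift, max_shift + 1), key=overlap_nmi)

# Hypothetical 1-D "images": the moving image is the fixed one shifted
# by 3 pixels, seen through a different (injective) intensity mapping.
random.seed(0)
scene = [random.randrange(8) for _ in range(512)]
fixed = scene
moving = [(v * 2 + 1) % 16 for v in scene[3:]] + [0, 0, 0]
assert register_1d(fixed, moving) == 3
```

A compression-based sweep has the same structure, with NCD or NED evaluated on the overlap in place of NMI and the minimum taken instead of the maximum.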
In our experiments with bzip2, we also observe that NCD ≥ 1, which is a consequence of bzip2 compressor imperfections (see [11]). We evaluate the NED measure (Fig. 6(c)) considering three different L values, 1, 2 and 3, represented by solid, dash-dotted and dashed lines, respectively. Due to the high dimensionality of the joint histogram in the L = 3 case, the number of bins has been reduced to 8. For comparison purposes, this quantization has been kept in all cases. In Fig. 6(ii-iii.c), the bold plot represents the standard image registration measure NMI. As expected, the entropy rate estimate decreases with L (Fig. 6(i.c)), giving us a closer approximation of the real entropy rate and, equivalently, of the string compressibility. Observe in Fig. 6(ii-iii.c) the smoothness of the NED curves, without local minima, and the accuracy of the registration, their minimum being achieved at the correct position for both the MR-CT and MR-PET pairs. Note also that NMI behaves very similarly to the entropy rate approach when L = 1. This behaviour was also noted by Kraskov et al. [13] in clustering applications. From our experiments, we can conclude that registration based on Shannon information is more accurate and robust than real-world compressor-based registration.

V. CONCLUSIONS AND FUTURE WORK

We have presented a new compression-based framework for image registration using the similarity metric. The behaviour of this measure has been analyzed from two different perspectives: the Kolmogorov complexity and the Shannon information theory. Experimental results reveal the good performance of both versions, computed using standard real-world compressors and an estimation of the entropy rate, respectively. Due to the
compressor imperfections, more accurate and robust results are obtained in the second case. In contrast to traditional MI-based registration algorithms, which only use individual pixel information to perform the computations, our approaches take into account spatial or structural information contained in the images. Future work will address the analysis of the Shannon version of the similarity metric using different scanning space-filling curves and uniformly distributed random lines [15] in order to improve the capture of structural information.

Fig. 6. (i) Compression and (ii) distance results of MR-CT images, and (iii) distance results of MR-PET images, using (a) JPEG 2000-based method, (b) bzip2-based method, and (c) entropy rate-based method.

ACKNOWLEDGMENTS

This project has been funded in part with grant numbers TIN2004-08065-C02-02, TIN2004-07451-C03-01 and 2001SGR-00296.

REFERENCES

[1] F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens, "Multimodality image registration by maximization of mutual information," IEEE Transactions on Medical Imaging, vol. 16, no. 2, pp. 187–198, 1997.
[2] P. A. Viola, "Alignment by maximization of mutual information," Ph.D.
dissertation, Massachusetts Institute of Technology, Massachusetts (MA), USA, 1995.
[3] C. Studholme, "Measures of 3D medical image alignment," Ph.D. dissertation, University of London, London, UK, August 1997.
[4] C. Bennett, P. Gács, M. Li, P. Vitányi, and W. Zurek, "Information distance," IEEE Transactions on Information Theory, vol. 44, no. 4, pp. 1407–1423, July 1998.
[5] M. Li, J. Badger, X. Chen, S. Kwong, P. Kearney, and H. Zang, "An information-based sequence distance and its applications to whole mitochondrial genome phylogeny," Bioinformatics, vol. 17, no. 2, pp. 149–154, 2001.
[6] M. Li, X. Chen, X. Li, B. Ma, and P. Vitányi, "The similarity metric," IEEE Transactions on Information Theory, vol. 50, no. 12, pp. 3250–3264, December 2004.
[7] R. Cilibrasi, R. de Wolf, and P. Vitányi, "Algorithmic clustering of music," 2003. [Online]. Available: http://arxiv.org/archive/cs/0303025
[8] T. M. Cover and J. A. Thomas, Elements of Information Theory. Wiley Series in Telecommunications, 1991.
[9] D. P. Feldman and J. P. Crutchfield, "Structural information in two-dimensional patterns: Entropy convergence and excess entropy," December 2002. [Online]. Available: http://arxiv.org/abs/cond-mat/0212078
[10] J. P. Pluim, J. Maintz, and M. Viergever, "Mutual-information-based registration of medical images: a survey," IEEE Transactions on Medical Imaging, vol. 22, pp. 986–1004, 2003.
[11] R. Cilibrasi and P. M. B. Vitányi, "Clustering by compression," IEEE Transactions on Information Theory, vol. 51, no. 4, pp. 1523–1545, April 2005.
[12] P. Grassberger, "Data compression and entropy estimates by non-sequential recursive pair substitution." [Online]. Available: http://arxiv.org/abs/physics/0207023
[13] A. Kraskov, H. Stögbauer, R. G. Andrzejak, and P. Grassberger, "Hierarchical clustering based on mutual information," 2003. [Online]. Available: http://arxiv.org/abs/q-bio/0311039
[14] A. Kaltchenko, "Algorithms for estimating information distance with application to bioinformatics and linguistics," April 2004. [Online]. Available: http://arxiv.org/archive/cs/0404039
[15] A. Bardera, M. Feixas, I. Boada, and M. Sbert, "Medical image registration based on random line sampling," in IEEE International Conference on Image Processing (ICIP'05), Proceedings, Genova, Italy, September 2005.