
Robust Hashing via Global and Local Invariant Features for Image Copy Detection

Published: 24 August 2023

Abstract

Robust hashing is a powerful technique for processing large-scale images. Many reported image hashing schemes do not balance discrimination and robustness well, and thus they cannot efficiently detect image copies, especially copies with multiple distortions. To address this, we exploit global and local invariant features to develop a novel robust hashing scheme for image copy detection. A critical contribution is the global feature calculation based on the gray level co-occurrence moment learned from the saliency map determined by the phase spectrum of the quaternion Fourier transform, which can significantly enhance discrimination without reducing robustness. Another essential contribution is the local invariant feature computation via Kernel Principal Component Analysis (KPCA) and vector distances. As KPCA can maintain the geometric relationships within an image, the local invariant features learned with KPCA and vector distances can guarantee discrimination and compactness. Moreover, the global and local invariant features are encrypted to ensure security. Finally, the hash is produced from the ordinal measures of the encrypted features, yielding a short hash. Numerous experiments are conducted to show the efficiency of our scheme. Compared with some well-known hashing schemes, our scheme demonstrates a preferable classification trade-off between discrimination and robustness, and experiments on detecting image copies with multiple distortions illustrate its effectiveness.

    1 Introduction

Copy detection is an important topic in image processing. Image copies with multiple distortions are often generated in many image applications [1, 2]. Hence, finding image copies with multiple distortions in large-scale image collections is a significant task [3, 4, 5]. Figure 1 depicts an original image and its copies, where (a) is the original image, (b) is a copy with a single distortion caused by JPEG Compression (JC), (c) is a copy with double distortions generated by JC and Brightness Adjustment (BA), and (d) is a copy with multiple distortions produced by JC, BA, and Additive White Gaussian Noise (AWGN). For quantitative analysis, the Peak Signal-to-Noise Ratio (PSNR) is adopted to measure the quality of these copies, and the results are presented in Figure 1. It can be seen that multiple distortions lead to more image degradation. Therefore, efficient approaches are needed for detecting image copies with multiple distortions. This article presents a novel robust hashing scheme for multiple-distorted image copy detection.
    Fig. 1.
    Fig. 1. Original image and its copies.
Numerous researchers suggest using image hashing [6, 7, 8] to handle a variety of image processing problems [9, 10, 11], including quality assessment, image retrieval, and copy detection. Image hashing compresses and encodes an image into a short sequence. Generally, image hashing must meet two important performance indicators [12, 13, 14]: discrimination and robustness. Discrimination, also called anti-collision [6, 15], means that different images ought to be compressed and encoded into different hashes. Robustness demands that similar images be mapped to similar or identical hashes. These two indicators are interrelated and mutually constraining: improving one typically degrades the other. Hence, it is an essential task to establish effective hashing schemes with a favorable trade-off between discrimination and robustness.
Presently, most hashing schemes perform well in their target applications. Nevertheless, they do not balance discrimination and robustness well and thus cannot efficiently detect image copies, especially image copies with multiple distortions. To address this, we design a new hashing scheme with global and local invariant features for detecting image copies with multiple distortions. The contributions of our work are highlighted below.
(1) Global features are calculated by using the Phase spectrum of Quaternion Fourier Transform (PQFT) and the Gray Level Co-occurrence Moment (GLCM). Specifically, the PQFT is exploited to extract the saliency map of the image, where all color information of the image pixels is utilized during the extraction. Next, the GLCM is produced from the saliency map. The global feature calculation with PQFT and GLCM can enhance discrimination without reducing robustness.
(2) Local invariant features are computed by using KPCA and vector distances. Since KPCA can maintain the geometric relationships within an image, the local invariant features learned with KPCA can guarantee discrimination. In addition, the low-dimensional features in the KPCA domain are viewed as vectors, and the vector distances in the KPCA domain are taken to represent the features, which guarantees the compactness of our scheme. As far as we know, this is the first work to apply KPCA to robust hashing, and we hope it will inspire the image hashing research community.
(3) The hash is generated by using the ordinal measures of the encrypted global and local invariant features. The encryption of the global and local invariant features provides the security of our scheme. The use of ordinal measures keeps the storage cost low.
Extensive experiments on prevalent image databases are used to validate the performance of our scheme. Compared with some well-known hashing schemes, our scheme shows a preferable classification trade-off between discrimination and robustness. Experiments on detecting image copies with multiple distortions further illustrate the effectiveness of our scheme.
The rest of this article is organized as follows. Related work is reviewed in Section 2. Details of our scheme are presented in Section 3. Experimental results and performance comparisons are discussed in Sections 4 and 5, respectively. Our application to copy detection is introduced in Section 6. Section 7 concludes this article.

    2 Related Work

Generally, a high-performance hashing scheme should reach a desirable trade-off between discrimination and robustness, and numerous researchers have made great efforts to develop effective hashing schemes with high classification performance. Depending on the features extracted, these image hashing schemes can be grouped into three classes.

    2.1 Image Hashing Schemes With Global Features

Some typical schemes employ moments, histograms, Tucker decomposition, Multi-dimensional Scaling (MDS), or the Global Neighborhood Structure (GNS) to extract global features for hash computation. For instance, Reference [2] uses Quaternion Zernike Moments (QZMs) to produce a hashing scheme. This scheme has good discrimination, but its robustness ought to be improved. To enhance robustness, Reference [16] extracts a hash from a third-order tensor by Tucker decomposition. This scheme is robust to common digital operations, but it can only resist small-angle rotation. To resist large-angle rotation, Reference [17] computes Color Vector Angles (CVAs) of a color image, employs the Histogram of CVAs (HCVAs) as the image feature, and compresses it with the DCT. This scheme has excellent rotation robustness, but its discrimination ought to be improved. In another work [18], the polar coordinate transformation and the DFT are combined to construct a rotation-invariant secondary image, and then MDS is employed to create a hash from the secondary image. This scheme has excellent robustness to rotation, but its discrimination ought to be enhanced. Recently, Reference [19] exploited the Dominant Neighborhood Structure (DNS) and GNS to derive global features for hash generation. This hashing scheme works well for Tor domain identification, but its time efficiency is low.

    2.2 Image Hashing Schemes with Local Features

These hashing schemes utilize local features to generate a hash. Commonly used local features include local feature points, block features, and ring features. For example, Reference [20] uses the endpoints of linear targets to design a hashing scheme. This scheme has desirable robustness against gamma correction and geometric attacks. In Reference [21], Random Walk (RW) theory and the statistical features of blocks are employed to form a hash. This scheme is effective in security. Reference [22] proposes an adaptive local feature extraction technique to construct a multi-scale hashing scheme. However, the robustness of this scheme is not satisfactory. To enhance robustness, Reference [23] employs the important DCT coefficients of each image block to make a hash. This scheme achieves the expected robustness to brightness and contrast adjustment. In another work [24], the Canny operator and the DWT are employed to calculate a hash. This scheme performs well in image quality assessment. In Reference [25], DCT coefficients and the LLE technique are utilized to form a hash. This scheme has promising robustness. In Reference [26], a new scheme combining the Center-Symmetric Local Binary Pattern (CSLBP) and SVD is introduced, but its discrimination needs to be enhanced. To enhance discrimination, Reference [27] uses the singular values of the Quaternion SVD (QSVD) to construct a hash. This scheme has excellent discrimination, but its robustness is not optimal yet. To take full account of robustness to geometric distortions, Reference [28] extracts the mean, variance, kurtosis, and skewness of image rings to form a hash. This scheme is robust to geometric distortions. In another work [29], Ring Partition (RP) theory and Non-negative Matrix Factorization (NMF) are employed to compute a hash. This scheme has good robustness. Recently, in Reference [30], the Saliency Map (SM) and Isometric Mapping (Isomap) were jointly used to generate a robust hash. This scheme obtains a relatively preferable balance between discrimination and robustness. In another work [31], a novel scheme with the Feature Map (FM) and Two-Dimensional Principal Component Analysis (2D PCA) is introduced. This scheme detects single distortions well, but it does not consider the issue of multiple distortions.

    2.3 Image Hashing Schemes with Local Features and Global Features

These image hashing schemes combine local and global features to produce a hash. For example, Reference [15] calculates a hash by combining the Zernike moments, texture features, and position features of the salient region. This scheme is well suited to tampered-image detection. To improve robustness, Reference [32] proposes a Fourier Mellin Transform (FMT)-based hashing scheme. This scheme has favorable robustness, but there is still room for improving discrimination. In another work [33], the FMT and Fractal Coding (FC) are utilized to generate a hash. This scheme has excellent robustness, but its discrimination ought to be improved. To enhance discrimination, Reference [34] employs local energy features and three-dimensional global statistical features to generate a hash. This scheme has high spatio-temporal efficiency, but security is not considered. To ensure security, Reference [35] utilizes the local features of the Three-dimensional Space Angle (3DSA) and the global features of the Cool and Warm Hue opposite Color Quaternion (CWHCQ) to form a hash. This scheme has good key dependency, but its classification performance is far from adequate. To improve classification performance, Reference [36] designs a novel hashing scheme based on the two-dimensional DCT and GLCM theory. This scheme obtains a relatively preferable classification performance. In another work [1], the Hu invariant moments and the two-dimensional DCT are employed to form a hash. This scheme performs well in copy detection. Recently, Reference [37] combined the KAZE feature and Singular Value Decomposition (SVD) to generate a hash. This scheme has desirable robustness.
    The categories and core techniques of some famous schemes are presented in Table 1. In summary, there is still much scope for getting a preferable classification performance and making good performance in copy detection, especially for detecting image copies with multiple distortions. In this work, we present a new hashing scheme with global and local invariant features for copy detection.
Table 1.
Scheme | Category | Core technique
[2] | Global | QZMs
[17] | Global | HCVAs + DCT
[18] | Global | DFT + MDS
[19] | Global | DNS + GNS
[24] | Local | Canny + DWT
[25] | Local | DCT + LLE
[27] | Local | QSVD
[29] | Local | RP + NMF
[30] | Local | SM + Isomap
[31] | Local | FM + 2D PCA
[33] | Global + Local | FMT + FC
[35] | Global + Local | 3DSA + CWHCQ
[36] | Global + Local | DCT + GLCM
[37] | Global + Local | KAZE + SVD
Table 1. Core Techniques of Some Typical Schemes

    3 Our Scheme

    Our scheme is composed of four phases. Figure 2 shows the block diagram of our scheme. First, global features are generated by PQFT and GLCM. Second, local invariant features are computed via KPCA and vector distances. Third, local and global features are encrypted. In the end, the encrypted features are quantized via ordinal measures. These phases are explained below.
    Fig. 2.
    Fig. 2. Block diagram of our scheme.

    3.1 Global Feature Calculation

The input image is first resized to \(M_{0} \times M_{0}\) . Subsequently, the global features are calculated with the PQFT and the GLCM. These features can enhance discrimination without reducing robustness. The PQFT and GLCM calculations are explained below.

    3.1.1 PQFT for Saliency Map.

A saliency map indicates the critical image areas on which the Human Visual System (HVS) focuses. To improve visual robustness, the PQFT [38] is used to determine the saliency map. The PQFT uses all color information for saliency map calculation and has been employed in video compression [39], image segmentation [40], and so on. The steps of the PQFT saliency map calculation are briefly explained below.
First, two color opponent components simulating the HVS [38] are generated as follows:
    \(\begin{equation} {BY(i,j) = B(i,j) - Y(i,j),} \end{equation}\)
    (1)
    \(\begin{equation} {RG(i,j) = R(i,j) - G(i,j),} \end{equation}\)
    (2)
    where \((i,j)\) is the position of each pixel in the spatial domain.
Four broadly tuned color channels \(B(i,j)\) , \(Y(i,j)\) , \(R(i,j)\) and \(G(i,j)\) are computed as follows:
    \(\begin{equation} {Y(i,j) = \frac{r(i,j) + g(i,j)}{2} - \frac{\left| r(i,j) - g(i,j) \right|}{2} - b(i,j),} \end{equation}\)
    (3)
    \(\begin{equation} {B(i,j) = b(i,j) - (r(i,j) + g(i,j))/2,} \end{equation}\)
    (4)
    \(\begin{equation} {R(i,j) = r(i,j) - (g(i,j) + b(i,j))/2,} \end{equation}\)
    (5)
    \(\begin{equation} {G(i,j) = g(i,j) - (r(i,j) + b(i,j))/2,} \end{equation}\)
    (6)
    in which \(b(i,j)\) , \(g(i,j)\) , and \(r(i,j)\) are the blue, green, and red channels of the pixel in the \((i,j)\) position, respectively.
For a still image, the motion component \(T(i,j)\) is 0 and the brightness component \(I(i,j)\) is computed as follows:
    \(\begin{equation} {I(i,j) = (r(i,j) + g(i,j) + b(i,j))/3.} \end{equation}\)
    (7)
Second, a quaternion \(q(i,j)\) describing a color image is defined as follows:
    \(\begin{equation} {q(i,j) = T(i,j)+RG(i,j)\eta _{1} + BY(i,j)\eta _{2} + I(i,j)\eta _{3}}, \end{equation}\)
    (8)
    where \(\eta _{1}\) , \(\eta _{2}\) , and \(\eta _{3}\) are the imaginary units that satisfy the below relations: \(\eta _{1}^{2}=-1\) , \(\eta _{2}^{2}=-1\) , \(\eta _{3}^{2}=-1\) , \(\eta _{1}\bot \eta _{2}\) , \(\eta _{1}\bot \eta _{3}\) , \(\eta _{2}\bot \eta _{3}\) , and \(\ \eta _{3} = \eta _{1}\eta _{2}\) . Then, \(q(i,j)\) can be written as follows:
    \(\begin{equation} {q(i,j) = f_{1}(i,j)+f_{2}(i,j)\eta _{2}}, \end{equation}\)
    (9)
    \(\begin{equation} {f_{1}(i,j) = T(i,j)+RG(i,j)\eta _{1}}, \end{equation}\)
    (10)
    \(\begin{equation} {f_{2}(i,j) =BY(i,j)+I(i,j)\eta _{1}}. \end{equation}\)
    (11)
Third, the Quaternion Fourier Transform (QFT) is applied to the image represented by quaternions to obtain \(Q(u,v)\) as follows:
    \(\begin{equation} {Q(u,v) = F_{2}(u,v)\eta _{2}+F_{1}(u,v),} \end{equation}\)
    (12)
    \(\begin{equation} {F_{\varphi }(u,v) = \frac{1}{\sqrt {M_{1}N_{1}}}\sum _{j = 0}^{M_{1} - 1}{\sum _{i = 0}^{N_{1} - 1}{e^{- \mu _{1}2\pi ((\frac{{iu}}{N_{1}}) + (\frac{jv}{M_{1}})) }f_{\varphi }}}(i,j),} \end{equation}\)
    (13)
in which \(\varphi \in \left\lbrace 1,\ 2 \right\rbrace\) , \(M_{1}\) and \(N_{1}\) are the height and width of the image, and \((u,v)\) is the position of the pixel in the frequency domain.
    Fourth, the inverse QFT can be obtained as follows:
    \(\begin{equation} {f_{\varphi }(i,j) = \frac{1}{\sqrt {{M_{1}N_{1}}}}\sum _{v = 0}^{M_{1} - 1}{\sum _{u = 0}^{N_{1}- 1}{e^{- \mu _{1}2\pi ((\frac{iu}{N_{1}}) + (\frac{{jv}}{M_{1}}))}F_{\varphi }(u,v).}}} \end{equation}\)
    (14)
    The above equations are used to acquire frequency domain representation \(Q(u,v)\) of \(q(i,j)\) . For simplicity, \(Q(u,v)\) can be expressed in polar form as follows:
    \(\begin{equation} {Q(u,v) = \left\Vert Q(u,v) \right\Vert e^{\xi \aleph (u,v)}}, \end{equation}\)
    (15)
    in which \(\xi\) is the unit pure quaternion and \(\aleph (u,v)\) is the phase spectrum of \(Q(u,v)\) .
Let \(\left\Vert Q(u,v) \right\Vert\) = 1. Then \(Q(u,v)\) retains only the phase spectrum information in the frequency domain, and the inverse QFT is conducted to obtain the reconstruction of \(Q(u,v)\) as follows:
    \(\begin{equation} {q^{^{\prime }}(i,j) = T^{^{\prime }}(i,j)+RG^{^{\prime }}(i,j)\eta _{1} + BY^{^{\prime }}(i,j)\eta _{2} + I^{^{\prime }}(i,j)\eta _{3}}, \end{equation}\)
    (16)
    in which \(T^{^{\prime }}(i,j)\) , \(RG^{^{\prime }}(i,j)\) , \(BY^{^{\prime }}(i,j),\) and \(I^{^{\prime }}(i,j)\) are the reconstruction information of the quaternion.
    Finally, the saliency map is defined as follows:
    \(\begin{equation} {S(i,j) = g*\left\Vert q^{^{\prime }}(i,j) \right\Vert ^{2}}, \end{equation}\)
    (17)
    where \(g\) is a two-dimensional Gaussian filter. A visual example of saliency map determined by PQFT is displayed in Figure 3.
    Fig. 3.
    Fig. 3. A visual example of saliency map determined by PQFT.

    3.1.2 GLCM.

    GLCM [41] is an effective method to represent the image. It can describe the spatial relations among pixels and has been applied to image recognition [42] and medical image classification [43]. During GLCM calculation, there are two parameters, i.e., the distance \(t\) and the direction \(\theta\) . To extract rich features, four groups of the parameters \((t, \theta)\) are used, i.e., \((1, 0^{\circ })\) , \((1, 45^{\circ })\) , \((1, 90^{\circ }),\) and \((1, 135^{\circ })\) . Therefore, after GLCM calculation on the saliency map, there are four GLCM matrices in total. More details of GLCM calculation can be found in Reference [36].
    To extract compact global features, four statistical features of each GLCM matrix are calculated. These statistical features are Angular Second Moment (ASM), Contrast, Correlation, and Homogeneity. Suppose that \(p(i,j)\) is the matrix element in the \(i\) th row and \(j\) th column. The definitions of these statistical features are given as follows.
    (1) ASM
ASM reflects the texture coarseness and gray distribution of the image [44], and is denoted by \(\chi _{1}\) :
    \(\begin{equation} {\chi _{1} = \sum _{i}^{}{\sum _{j}^{}{p(i,j)}^{2}}}. \end{equation}\)
    (18)
    (2) Contrast. Contrast is used to judge the clarity and depth of the texture. It is denoted by \(\chi _{2}\) :
    \(\begin{equation} {\chi _{2} = \sum _{i}^{}{\sum _{j}^{}p(i,j)|i - j|^{2}}}. \end{equation}\)
    (19)
    (3) Correlation. Correlation indicates the similarity degree of the element to its neighbor, which can be denoted by \(\chi _{3}\) :
    \(\begin{equation} {\chi _{3} = \sum _{i}^{}{\sum _{j}^{}\frac{(j - \mu _{j})(i - \mu _{i})p(i,j)}{\sigma _{i}\sigma _{j}}}}, \end{equation}\)
    (20)
    where \(\mu _{i}\) and \(\ \sigma _{i}\) are the mean and the variance of the \(i\) th row of the GLCM matrix, and \(\mu _{j}\) and \(\sigma _{j}\) are the mean and the variance of the \(j\) th column of the GLCM matrix, respectively.
    (4) Homogeneity. Homogeneity measures the degree of texture regularity and it is computed by \(\chi _{4}\) :
\(\begin{equation} {\chi _{4} = \sum _{i}^{}{\sum _{j}^{}\frac{p(i,j)}{1+|i -j|^{2}}}}. \end{equation}\)
    (21)
    As there are four GLCM matrices and four statistical features are extracted from each GLCM matrix, we can obtain a sequence of 16 global features. For simplicity, the global feature sequence is represented by \(\mathbf {\chi }\) in the below description.

    3.2 Local Invariant Feature Computation

    The computation of the local invariant features includes three phases. The first phase is to select the \(\mathbf {V}\) component with the size of \(M_{0} \times M_{0}\) in HSV color space for extracting stable image features. The second phase is to learn discriminative local invariant features via KPCA. The third phase is to calculate the vector distances in the KPCA domain for producing compact local features. The KPCA and vector distance calculation are explained in detail below.

    3.2.1 KPCA.

KPCA [45] is a nonlinear extension of PCA, which can effectively capture the nonlinear features of data. KPCA has been widely used in numerous applications, such as target detection [46], fault detection [47], and face recognition [48]. However, the use of KPCA in image hashing has seldom been studied. In this work, we propose to use KPCA to learn local features from the \(\mathbf {V}\) component of the image. KPCA derives compact features via the following steps.
(1) Map the data to the feature space. Suppose that \(\mathbf {x}_{i}\) and \(\mathbf {x}_{j}\) are two data points in the input space and there are \(N\) data points in total. A nonlinear mapping function \(\varnothing\) maps each data point into a high-dimensional feature space as follows:
    \(\begin{equation} {\mathbf {x}_{i} \rightarrow \varnothing \left(\mathbf {x}_{i} \right).} \end{equation}\)
    (22)
    (2) Determine the kernel function. The \(N \times N\) matrix of the kernel function \(K(\mathbf {x}_{i},\mathbf {x}_{j})\) is defined as follows:
\(\begin{equation} {K(\mathbf {x}_{i},\mathbf {x}_{j}) =\varnothing (\mathbf {x}_{i})^{\rm {T}}\varnothing (\mathbf {x}_{j}) = \exp \left(- \frac{\Vert \mathbf {x}_{i}-\mathbf {x}_{j}\Vert ^{2}}{2\sigma ^{2}}\right),} \end{equation}\)
    (23)
    in which \(\sigma\) is a kernel parameter.
(3) Calculate the projection of eigenvectors. Suppose that \(\mathbf {C}\) is the covariance matrix in the feature space, which can be computed as follows:
    \(\begin{equation} { \mathbf {C} = \frac{1}{N}\sum _{i = 1}^{N}{\varnothing (\mathbf {x}_{i})\varnothing (\mathbf {x}_{i})^{\rm {T} }}}. \end{equation}\)
    (24)
    The eigenvalues \(\lambda\) and eigenvectors \(\mathbf {E}\) of \(\mathbf {C}\) are computed and the calculation equation is as follows:
    \(\begin{equation} {\lambda \mathbf {E} = \mathbf {CE}}. \end{equation}\)
    (25)
The eigenvalues are arranged in descending order and the projection matrix \(\mathbf {A}_{d} = \lbrack a_{1}, a_{2},\ldots , a_{d}\rbrack\) is built from the eigenvectors \(\mathbf {E}\) of the first \(d\) maximum eigenvalues of \(\mathbf {C}\) . Note that \(\mathbf {E}\) can be expanded by the following equation:
    \(\begin{equation} {\mathbf {E} = \sum _{i = 1}^{N}{\alpha _{i}\varnothing (\mathbf {x}_{i})}}, \end{equation}\)
    (26)
in which \(\alpha _i\) is the \(i\) th expansion coefficient. Finally, the projection of the eigenvectors is computed as follows:
    \(\begin{equation} {\mathbf {Y} = \sum _{j = 1}^{N}{\mathbf {A}_{d}K(\mathbf {x}_{i},\mathbf {x}_{j}),}} \end{equation}\)
    (27)
    where \(\mathbf {Y}\) is a low dimensional matrix with the size of \(d \times N\) . More specific information about KPCA is given in Reference [45].
In this work, the \(\mathbf {V}\) component is partitioned into non-overlapping blocks of size \(m \times m\) . Consequently, the number of blocks is \(N = (M_{0}/m)^{2}\) . For each block, its pixels are concatenated to form a vector with \(m^{2}\) elements, which corresponds to the data point \(\mathbf {x}_{i}\) described in the KPCA. Therefore, there are \(N\) vectors in total. After applying the KPCA, the compact features are extracted. Here, the compact features are the output of the KPCA, i.e., the low-dimensional matrix \(\mathbf {Y}\) with the size of \(d \times N\) .

    3.2.2 Vector Distance Calculation.

To obtain compact features, each column of the low-dimensional matrix \(\mathbf {Y}\) is treated as a vector. As the vector distance [28] is invariant to normal digital operations, it is used to represent the features. To do so, a reference vector \(\mathbf {y}_{0}\) is computed as follows:
    \(\begin{equation} \mathbf {y}_{0} = \left[ y_{0}(1),y_{0}(2),\ldots ,y_{0}(d) \right]^{\rm {T}}, \end{equation}\)
    (28)
    where \(y_{0}(i)\) is the \(i\) th element of \(\mathbf {y}_{0}\) defined as follows:
    \(\begin{equation} {y_{0}(i) = \frac{1}{N}\sum _{j = 1}^{N}{y_{j}(i)}}, \end{equation}\)
    (29)
    in which \(y_{j}(i)\) is the \(i\) th element of \(\mathbf {y}_{j}\) .
Next, the \(L_{2}\) distance between each vector \(\mathbf {y}_{j}\) and the reference vector \(\mathbf {y}_{0}\) is computed as follows:
    \(\begin{equation} {z_{j} = \sqrt {\sum _{i = 1}^{d}{\lbrack y_{0}(i) - y_{j}(i)\rbrack ^{2}}}}. \end{equation}\)
    (30)
    Subsequently, \(z_{j}\) is quantized as follows:
\(\begin{equation} {\phi (j) = \text{round}(z_{j} \times 1000),} \end{equation}\)
    (31)
    in which \(\rm {round(\cdot)}\) is the rounding operation. Last, a feature sequence \(\mathbf {\Phi }\) is obtained as follows:
    \(\begin{equation} {\mathbf {\Phi } = \lbrack \phi (1),\phi (2),\ldots ,\phi (N)\rbrack .} \end{equation}\)
    (32)

    3.3 Encryption

The global feature sequence and the local invariant feature sequence are concatenated to produce the feature sequence \(\mathbf {g} = \lbrack g(1), g(2),\ldots , g(L)\rbrack\) , where the length is \(L = 16 + N\) . To guarantee security, a pseudo-random generator is exploited to permute the elements of \(\mathbf {g}\) . This is achieved as follows. First, a secret key is used as the seed of the random generator to produce \(L\) pseudo-random numbers. Second, these pseudo-random numbers are sorted to form an ordered sequence. Third, the original positions of the sorted numbers are recorded in a sequence \(\mathbf {r}_{1}\) . Finally, the permutation is done by the following equation:
    \(\begin{equation} {r_{2}(j) = g(r_{1}(j)),} \end{equation}\)
    (33)
    in which \(r_{1}(j)\) and \(r_{2}(j)\) are the \(j\) th elements of \(\mathbf {r}_{1}\) and \(\mathbf {r}_{2}\) , respectively. Clearly, there are \(L!\) possibilities of random permutation. Without the correct key, it is difficult to generate the correct encrypted feature. Therefore, the encrypted feature sequence \(\mathbf {r}_{2}\) can be formally defined as follows:
    \(\begin{equation} {\mathbf {r}_{2} = \lbrack r_{2}(1),r_{2}(2),\ldots ,r_{2}(L)\rbrack .} \end{equation}\)
    (34)

    3.4 Ordinal Measures

    Ordinal measures [49] are useful compact features for many robust image applications [50], such as iris recognition and copy detection. To make a short and robust hash, the ordinal measures are exploited to quantize the encrypted feature vector \(\mathbf {r}_{2}\) . To do so, the elements in the encrypted feature sequence are firstly sorted in ascending order. Next, their positions in the sorted sequence are selected as the compact features.
To ease understanding of the ordinal measures, an example is presented in Table 2, in which the first row is the original sequence, the second row is the ascending-sorted sequence, and the last row contains the ordinal measures of the sequence in the first row. The first element of the original sequence is 7, which is at the third position of the sorted sequence; therefore, its ordinal number is 3. Similarly, the second element of the original sequence is 3, which is at the first position of the sorted sequence; thus, its ordinal number is 1.
Table 2.
Original sequence | 7 | 3 | 5 | 19 | 12 | 24 | 28 | 16
Sorted sequence | 3 | 5 | 7 | 12 | 16 | 19 | 24 | 28
Ordinal measures | 3 | 1 | 2 | 6 | 4 | 7 | 8 | 5
Table 2. Example of Ordinal Measures
The final hash \(\mathbf {h}\) can be written as follows:
    \(\begin{equation} {{\bf h}=\lbrack h(1),h(2),\ldots ,h(L)\rbrack ,} \end{equation}\)
    (35)
    where \(h(j)(1 \le j \le L)\) is the position of \(r_{2}(j)\) in the sorted feature sequence. Therefore, our hash contains L integers.

    3.5 Pseudo-code Description

    Our scheme consists of four phases. They are global feature calculation, local invariant feature computation, encryption, and ordinal measures. To improve readability, the pseudo-code of our scheme is described in Algorithm 1.

    3.6 Similarity Evaluation

The \(L_{1}\) norm is a well-known distance for measuring similarity, and its inputs are generally sequences of integers or decimals. As the \(L_{1}\) norm is simple and effective and the hash sequence of our scheme consists of integers, we select the \(L_{1}\) norm as the metric for determining the similarity of two hash sequences. Suppose that \(\mathbf {h}_{1}\) and \(\mathbf {h}_{2}\) are two hash sequences. The \(L_{1}\) norm \(\zeta \left(\mathbf {h}_{\mathbf {1}},\mathbf {h}_{\mathbf {2}} \right)\) is defined as follows:
    \(\begin{equation} {\zeta \left(\mathbf {h}_{\mathbf {1}},\mathbf {h}_{\mathbf {2}} \right) = \sum _{j = 1}^{L}\left| h_{1}(j) - h_{2}(j) \right|,} \end{equation}\)
    (36)
in which \(h_{1}(j)\) and \(h_{2}(j)\) are the \(j\) th elements of \(\mathbf {h}_{1}\) and \(\mathbf {h}_{2}\) , respectively. The smaller the \(L_{1}\) norm is, the more similar the corresponding images are. A threshold \(T\) can be employed to judge whether two images are similar or not.

    4 Experimental Results

Our scheme is implemented on the MATLAB R2018b platform. The adopted computer has an Intel i7-8700 CPU with a 3.20 GHz main frequency and 8 GB RAM. In the following experiments, the parameter settings are as follows: the size of the image is 512 \(\times\) 512, the kernel parameter of KPCA is 10,000, the size of each block is 64 \(\times\) 64, and the feature dimension is 4. In other words, our parameter settings are \(M_{0}\) =512, \(m\) =64, \({\sigma =\text{10,000}}\) , and \(d\) =4. Hence, \(N=64\) and \(L=64+16=80\) .

    4.1 Robustness Test

The Kodak database is exploited to evaluate the robustness of our scheme. This database has 24 color images. For each image, 100 similar versions are formed from 16 multiple-distortion types, including Image Deformation with Rotation (IDR) (angle: 30 \(°\) )+Salt and Pepper Noise (SPN) (density: 0.01)+BA, IDR+SPN+Contrast Adjustment (CA), IDR+SPN+Gamma Correction (GC), IDR+SPN+Gaussian Low-pass Filtering (GLF), IDR+SPN+JC, IDR+SPN+Image Scaling (IS), IDR+SPN+Speckle Noise (SN), IDR+SPN+AWGN, IDR+AWGN (variance: 0.01)+BA, IDR+AWGN+CA, IDR+AWGN+GC, IDR+AWGN+GLF, IDR+AWGN+SPN, IDR+AWGN+JC, IDR+AWGN+IS, and IDR+AWGN+SN. Hence, there are 24 \(\times\) 100 = 2,400 pairs of similar images, and 2,400 + 24 = 2,424 images are employed in total. Table 3 lists the specific distortions and their parameter settings.
Table 3.
Distortions | Parameter setting | Number
IDR+SPN+BA | magnitude \(\in \left\lbrace 10, 20\right\rbrace\) | 2
IDR+SPN+CA | magnitude \(\in \left\lbrace 10, 20\right\rbrace\) | 2
IDR+SPN+GC | \(\gamma \in \left\lbrace 1.25, 1.1, 0.9, 0.75\right\rbrace\) | 4
IDR+SPN+GLF | standard deviation: [0.3, 1.0], step: 0.1 | 8
IDR+SPN+JC | factor: [30, 100], step: 10 | 8
IDR+SPN+IS | ratio \(\in \left\lbrace 2, 1.5, 1.1, 0.9, 0.75, 0.5\right\rbrace\) | 6
IDR+SPN+SN | variance: [0.001, 0.01], step: 0.001 | 10
IDR+SPN+AWGN | variance: [0.001, 0.01], step: 0.001 | 10
IDR+AWGN+BA | magnitude \(\in \left\lbrace 10, 20\right\rbrace\) | 2
IDR+AWGN+CA | magnitude \(\in \left\lbrace 10, 20\right\rbrace\) | 2
IDR+AWGN+GC | \(\gamma \in \left\lbrace 1.25, 1.1, 0.9, 0.75\right\rbrace\) | 4
IDR+AWGN+GLF | standard deviation: [0.3, 1.0], step: 0.1 | 8
IDR+AWGN+JC | factor: [30, 100], step: 10 | 8
IDR+AWGN+SPN | density: [0.001, 0.01], step: 0.001 | 10
IDR+AWGN+IS | ratio \(\in \left\lbrace 2, 1.5, 1.1, 0.9, 0.75, 0.5\right\rbrace\) | 6
IDR+AWGN+SN | variance: [0.001, 0.01], step: 0.001 | 10
Total | | 100
Table 3. Parameter Settings of Multiple Distortions
Table 4 shows the Mean of Median Values (MMV), Mean of Lower Quartiles (MLQ), and Mean of Upper Quartiles (MUQ) of the \(L_{1}\) norms under multiple distortions with different parameters, to further validate the robustness of our hashing. Clearly, the MUQs of the multiple distortions are all less than 600. When the threshold is 600, the proposed scheme correctly identifies 95.75% of similar images; when the threshold is 700, our scheme obtains a correct detection rate of 98.38%. Therefore, our scheme has good robustness according to the high correct detection rate under multiple distortions.
Table 4.
Distortions | MMV | MLQ | MUQ
IDR+SPN+BA | 392.5 | 355.0 | 496.0
IDR+SPN+CA | 390.5 | 348.5 | 488.0
IDR+SPN+GC | 424.5 | 362.0 | 501.0
IDR+SPN+GLF | 391.7 | 343.7 | 480.0
IDR+SPN+JC | 373.1 | 355.2 | 463.7
IDR+SPN+IS | 388.5 | 345.6 | 483.8
IDR+SPN+SN | 398.8 | 357.5 | 495.7
IDR+SPN+AWGN | 429.6 | 388.1 | 523.1
IDR+AWGN+BA | 473.0 | 421.0 | 562.5
IDR+AWGN+CA | 453.5 | 393.5 | 553.0
IDR+AWGN+GC | 471.5 | 410.0 | 559.2
IDR+AWGN+GLF | 426.1 | 379.7 | 518.5
IDR+AWGN+JC | 387.2 | 353.6 | 471.0
IDR+AWGN+SPN | 448.2 | 407.5 | 552.2
IDR+AWGN+IS | 441.8 | 388.3 | 533.0
IDR+AWGN+SN | 461.2 | 409.0 | 553.6
Table 4. Statistical Results of Multiple Distortions

    4.2 Discrimination Test

The VOC2007 database is used to check the discrimination of our scheme. There are 5,011 color images in the VOC2007 database. The \(L_{1}\) norms between each pair of hash codes in this database are produced. Hence, the total number of distances is given by the combinatorial number \(C(5{,}011, 2)= 5{,}011 \times (5{,}011-1)/2=12{,}552{,}555\) . Figure 5 shows the distribution of these distances. The calculation results show that the median value, the lower quartile, and the upper quartile of the \(L_{1}\) norms are 1,342.0, 1,236.0, and 1,444.0, respectively, which are all larger than the biggest MUQ (562.5) of the similar images. This illustrates that our scheme is discriminative.
    Fig. 4.
    Fig. 4. Some Images of the VOC2007 Database.
    Fig. 5.
    Fig. 5. Distribution of distances of different hash codes.

    4.3 Block Size Selection

To view our performance under different block sizes, the Receiver Operating Characteristics (ROC) graph [51] is employed. The ordinate axis and the abscissa axis are defined by \(P_1\) and \(P_2\) as follows:
    \(\begin{equation} P_1 = \frac{ \rm {\# similar\; images\; rightly\; detected}}{\rm {\# similar\; images}}, \end{equation}\)
    (37)
    \(\begin{equation} P_2 = \frac{ \rm {\# different\; images\; falsely\; recognized}}{\rm {\# different\; images}}. \end{equation}\)
    (38)
    A small \(P_2\) value means a good discrimination, while a large \(P_1\) value means a good robustness. A set of points with coordinates \((P_{1}, P_{2})\) are employed to plot a curve of ROC graph. If there are two curves in a graph, then the curve near the upper-left corner has better performance than that away from it. In addition, the Area Under the ROC Curve (AUC) is computed for performance comparison, where the scope of AUC is [0, 1]. The curve with a big AUC is better than the curve with a small one.
In this section, the datasets are the same as the image databases used for robustness in Section 4.1 and discrimination in Section 4.2. Note that the block size is varied while the other parameter settings remain unchanged. Figure 6 shows the curves under different block sizes. The curve of 64 \(\times\) 64 is nearer the upper-left corner than those of the other sizes. Furthermore, the AUCs of different block sizes are calculated. The AUC values of 256 \(\times\) 256, 128 \(\times\) 128, 64 \(\times\) 64, 32 \(\times\) 32, and 16 \(\times\) 16 are 0.15438, 0.77170, 0.99999, 0.99985, and 0.99977, respectively, so the AUC of 64 \(\times\) 64 is larger than those of the other block sizes. This means that the classification performance of the block size 64 \(\times\) 64 is superior to those of 256 \(\times\) 256, 128 \(\times\) 128, 32 \(\times\) 32, and 16 \(\times\) 16. The computational time of 256 \(\times\) 256, 128 \(\times\) 128, 64 \(\times\) 64, 32 \(\times\) 32, and 16 \(\times\) 16 is 0.1005, 0.1163, 0.1453, 0.2601, and 1.0620 s, respectively, and the corresponding hash lengths are 20, 32, 80, 272, and 1,040 integers. Table 5 lists the performance comparison under different block sizes. Considering classification performance, time, and hash length together, the block size 64 \(\times\) 64 provides the best overall performance for the proposed scheme.
Table 5.
Block size | AUC | Time (s) | Length (integers)
256 \(\times\) 256 | 0.15438 | 0.1005 | 20
128 \(\times\) 128 | 0.77170 | 0.1163 | 32
64 \(\times\) 64 | 0.99999 | 0.1453 | 80
32 \(\times\) 32 | 0.99985 | 0.2601 | 272
16 \(\times\) 16 | 0.99977 | 1.0620 | 1,040
Table 5. Performance Comparison of Different Block Sizes
    Fig. 6.
    Fig. 6. Curves of different block sizes.

    4.4 Dimension Selection

A key parameter of our scheme is the dimension selected in the KPCA, i.e., the parameter \(d\) . In the experiments, the \(d\) value is chosen from the set \(\left\lbrace 2, 3, 4, 5, 6 \right\rbrace\) , and the other parameter settings remain the same. Figure 7 shows the ROC curves under different dimensions. The quantitative results show that the AUC values for the dimension values 2, 3, 4, 5, and 6 are 0.99905, 0.99995, 0.99999, 0.99994, and 0.99996, respectively. The AUC value of \(d\) =4 is greater than those of the other dimensions. The computational time for the dimension values 2, 3, 4, 5, and 6 is 0.1453, 0.1597, 0.1625, 0.1686, and 0.1694 s, respectively; there is little difference between the dimensions. Performance comparisons are summarized in Table 6. Our scheme achieves the best classification performance, at a comparable computational time, when the dimension value is 4.
Table 6.
\(d\) | AUC | Time (s)
2 | 0.99905 | 0.1453
3 | 0.99995 | 0.1597
4 | 0.99999 | 0.1625
5 | 0.99994 | 0.1686
6 | 0.99996 | 0.1694
Table 6. Performance Comparison of Different Dimensions
    Fig. 7.
    Fig. 7. Curves of different dimensions.

    4.5 Security Analysis

    To ensure the security of our scheme, the similarity between different hashes created by different keys should be small. In our scheme, the key dependence means that the \(L_{1}\) norms between the hash codes of different keys are very large. The six test images named “Airplane,” “Baboon,” “House,” “Peppers,” “Splash,” and “Tree” are employed to compute the \(L_{1}\) norms between hash codes generated by different keys. First, the hash of each image is generated by the correct key. Second, 100 wrong keys are used to compute 100 different hash codes of each image. Finally, the \(L_{1}\) norms between hashes of the correct key and the incorrect keys are computed. Figure 8 shows the \(L_{1}\) norms between the hashes generated by different keys. In Figure 8, the \(y\) -axis describes the \(L_{1}\) norm and the \(x\) -axis describes the index of the wrong key. The minimum \(L_{1}\) norm is greater than 1,700, which shows that our scheme has good key dependence.
    Fig. 8.
    Fig. 8. The \(L_{1}\) norms between the hashes generated by different keys.

    5 Performance Comparison

To demonstrate the advantage of our scheme, some famous schemes are compared, including the RW scheme [21], SVD-CSLBP scheme [26], HCVA-DCT scheme [17], MDS scheme [18], FM-2D PCA scheme [31], and DCT-GLCM scheme [36]. The compared schemes were recently published in reputable journals or conferences. In addition, the SVD-CSLBP, MDS, and FM-2D PCA schemes also use dimension reduction techniques, namely SVD, MDS, and 2D PCA. To guarantee a fair comparison, the parameter values and hash similarity measurements reported in the papers of these schemes are all employed here, and all images are resized to 512 \(\times\) 512 before they are input to these schemes. In the following, Section 5.1 presents the classification performance, and Section 5.2 compares the performances of time and storage.

    5.1 Classification Performance

    The image databases described in Sections 4.1 and 4.2 are used to evaluate the classification performance of our scheme. Specifically, 2,424 images are used for robustness and 5,011 images are taken for discrimination. The ROC graph is still chosen to conduct visual comparison. In Figure 9, all evaluated schemes’ curves are drawn in the same graph for easy comparison. For a better view of these curves, their local details are enlarged and drawn in the same graph. Clearly, the curve of our scheme is nearer to the upper-left corner than those of the compared schemes. The visual comparison shows that the classification performance of our scheme outperforms those of the compared schemes. For further analysis, the quantitative metric called AUC is also computed.
    Fig. 9.
    Fig. 9. ROC curves of different schemes.
The experimental results show that the AUC of our scheme is 0.99999, whereas the AUCs of the RW scheme, SVD-CSLBP scheme, HCVA-DCT scheme, MDS scheme, FM-2D PCA scheme, DCT-GLCM scheme, our scheme with only global features, and our scheme with only local features are 0.98298, 0.87868, 0.93143, 0.96125, 0.997642, 0.99913, 0.15394, and 0.99981, respectively. Obviously, the AUC of our scheme is larger than those of the compared schemes, which again illustrates that our scheme is superior to all compared schemes. Our scheme achieves this competitive classification performance for the following reasons: the global feature calculation with PQFT and GLCM can enhance discrimination without reducing robustness, and, since KPCA can maintain the geometric relationships within an image, the local invariant features learned with KPCA can guarantee discrimination.

    5.2 Performances of Time and Storage

Time performance is measured by the computational time of calculating a hash code. The results show that the computational time of our scheme is 0.1453 s, while the computational times of the RW scheme, SVD-CSLBP scheme, HCVA-DCT scheme, MDS scheme, FM-2D PCA scheme, DCT-GLCM scheme, our scheme with only global features, and our scheme with only local features are 0.0587, 0.1213, 0.0328, 0.4285, 0.0847, 0.1211, 0.0671, and 0.1024 s, respectively. Our scheme is quicker than several of the compared schemes; in particular, compared with a dimension reduction-based scheme such as the MDS scheme, our scheme is much quicker.
Storage performance is determined by the number of bits required for a hash code. In the experiment, the 5,011 different images in VOC2007 are utilized to analyze the bits needed to store our hash. As each hash includes 80 integers, there are \(\text{5,011} \times 80 = \text{400,880}\) integers in total. According to the ordinal measures, the maximum hash value is 80. As \(80 \lt 128 = 2^{7}\) , only 7 bits are needed to store a hash element. Hence, a hash of our scheme requires \(80 \times 7 = 560\) bits. For our scheme with only global features, 4 bits suffice per hash element because \(16 = 2^{4}\) , so a hash requires \(16 \times 4 = 64\) bits. For our scheme with only local features, 6 bits suffice per hash element because \(64 = 2^{6}\) , so a hash requires \(64 \times 6 = 384\) bits. For the SVD-CSLBP scheme and the HCVA-DCT scheme, the hash lengths are 64 and 20 floating-point numbers, respectively. Note that a floating-point number requires 32 bits according to the IEEE standard. Thus, the SVD-CSLBP scheme and the HCVA-DCT scheme need 2,048 bits and 640 bits for saving a hash code, respectively. The hash lengths of the RW scheme, SVD-CSLBP scheme, HCVA-DCT scheme, MDS scheme, FM-2D PCA scheme, DCT-GLCM scheme, our scheme with only global features, our scheme with only local features, and our scheme are 144, 2,048, 640, 720, 832, 720, 64, 384, and 560 bits, respectively. Obviously, the storage cost of our scheme is higher than that of the RW scheme, but lower than those of the SVD-CSLBP, HCVA-DCT, MDS, FM-2D PCA, and DCT-GLCM schemes. The complexity summary is presented in Table 7.
Table 7.
Scheme | Hash length (bits) | Time (s)
RW | 144 | 0.0587
SVD-CSLBP | 2,048 | 0.1213
HCVA-DCT | 640 | 0.0328
MDS | 720 | 0.4285
FM-2D PCA | 832 | 0.0847
DCT-GLCM | 720 | 0.1211
Our-global | 64 | 0.0671
Our-local | 384 | 0.1024
Our | 560 | 0.1453
Table 7. Performance Summary of Time and Storage

    6 Application to Detection of Image Copies with Multiple Distortions

    With the extensive use of digital images, copyright protection of images has become a significant task of information security. Due to the benefit of robust hashing in robustness and low storage, numerous researchers have utilized robust hashing to pursue copy detection. In this section, we test the copy detection performance of our scheme. Section 6.1 depicts the employed database and metric. Section 6.2 presents the results of copy detection.

    6.1 Database and Metric

To build an image database for copy detection, the UCID [52] is employed. In the experiment, 16 images are randomly chosen as the query images from the 1,338 color images in the UCID. Figure 10 shows the thumbnails of these query images. To simulate copy detection, 10 digital operations are utilized to produce 10 copies of each query image. These 10 operations are listed below: IDR+JC+BA (angle: 30 \(^{\circ }\) , quality factor: 30, parameter: 20), IDR+JC+CA (angle: 30 \(^{\circ }\) , quality factor: 30, parameter: 20), IDR+JC+GC (angle: 30 \(^{\circ }\) , quality factor: 30, \(\gamma\) : 0.9), IDR+JC+GLF (angle: 30 \(^{\circ }\) , quality factor: 30, standard deviation: 0.2), IDR+JC+SPN (angle: 30 \(^{\circ }\) , quality factor: 30, density: 0.02), IDR+JC+IS (angle: 30 \(^{\circ }\) , quality factor: 30, ratio: 0.75), IDR+JC+SN (angle: 30 \(^{\circ }\) , quality factor: 30, variance: 0.02), IDR+JC with Text Adding (IDR+JC+TA) (text content: Copyright 2022), IDR+JC with Logo Embedding (IDR+JC+LE) (size of logo: \(66\times 70\) , weight of logo: 0.2), and IDR+JC+SN (angle: 30 \(^{\circ }\) , quality factor: 30, variance: 0.02). Thus, there are 160 image copies. These image copies and the UCID images excluding the 16 chosen query images form the copy image database. Therefore, the total number of images in the database is \(160+1{,}338-16=1{,}482\) . For every query image, there are 1,472 different images and 10 image copies.
    Fig. 10.
    Fig. 10. The 16 query images selected from the UCID.
To validate the copy detection performance of different schemes, the Mean Average Precision (MAP) is used as the metric. The MAP is computed from the Average Precision (AP), whose calculation depends on the order of the images returned by each scheme. The equation of AP is as follows:
    \(\begin{equation} {\rm{AP}} = \frac{1}{10}\sum_{i = 1}^{10}{f_{i}\left(\frac{\sum_{j = 1}^{i} f_{j}}{i}\right)} \end{equation}\)
    (39)
    in which \(f_{i}=1\) when the \(i\) th returned image is an image copy. Otherwise, \(f_{i}=0\) . The MAP is acquired by computing the average of the APs of all query images. The scope of MAP is [0, 1]. In general, a bigger MAP means a superior copy detection performance.

    6.2 Detection Results

To show its advantage, the copy detection performance of our scheme is compared with the RW scheme [21], SVD-CSLBP scheme [26], HCVA-DCT scheme [17], MDS scheme [18], FM-2D PCA scheme [31], and DCT-GLCM scheme [36]. Note that the SVD-CSLBP, MDS, and FM-2D PCA schemes are also dimension reduction-based schemes. The MAPs of these schemes are calculated and shown in Figure 11. The MAP of our scheme is 0.99820, while the MAPs of the RW scheme, SVD-CSLBP scheme, HCVA-DCT scheme, MDS scheme, FM-2D PCA scheme, and DCT-GLCM scheme are 0.79779, 0.35234, 0.47503, 0.72384, 0.96875, and 0.98748, respectively. Clearly, the MAP of our scheme is greater than those of the compared schemes, so our scheme performs better in copy detection. This is because our scheme has better classification performance, which reduces classification errors during copy detection.
    Fig. 11.
    Fig. 11. MAP comparison.

    7 Conclusions

This article has proposed a novel hashing scheme with global and local invariant features for detecting image copies with multiple distortions. An important contribution is the global feature calculation with the PQFT and the GLCM, which can strengthen discrimination without reducing robustness. Another critical contribution is the local invariant feature computation via KPCA and vector distances. As KPCA can maintain the geometric relationships within an image, the local invariant features learned with KPCA can guarantee discrimination. In addition, the low-dimensional features in the KPCA domain are viewed as vectors, and the vector distances in the KPCA domain are taken to represent the features, which guarantees the compactness of our scheme. The hash is finally determined by using the ordinal measures of the encrypted global and local invariant features. Numerous experiments have been carried out, and the results show that our scheme is superior to some advanced hashing schemes in terms of classification and the detection of image copies with multiple distortions.

    Acknowledgments

    Many thanks to the reviewers for their helpful suggestions. Zhenjun Tang is the corresponding author.


    References

    [1]
    Shiguang Liu and Ziqing Huang. 2019. Efficient image hashing with geometric invariant vector distance for copy detection. ACM Trans. Multimedia Comput., Commun. Appl. 15, 4 (2019), 1–22. DOI:
    [2]
    Junlin Ouyang, Xingzi Wen, Jianxun Liu, and Jinjun Chen. 2016. Robust hashing based on quaternion Zernike moments for image authentication. ACM Trans. Multimedia Comput., Commun. Appl. 12, 4 (2016), 1–13. DOI:
    [3]
    Haozhe Chen, Hang Zhou, Jie Zhang, Dongdong Chen, Weiming Zhang, Kejiang Chen, Gang Hua, and Nenghai Yu. 2023. Perceptual hashing of deep convolutional neural networks for model copy detection. ACM Trans. Multimedia Comput., Commun. Appl. 19, 3, Article 123 (2023), 20 pages. DOI:
    [4]
    Zhaoda Ye and Yuxin Peng. 2019. Sequential cross-modal hashing learning via multi-scale correlation mining. ACM Trans. Multimedia Comput., Commun. Appl. 15, 4 (2019), 1–20.
    [5]
    Xiaoping Liang, Zhenjun Tang, Ziqing Huang, Xianquan Zhang, and Shichao Zhang. 2023. Efficient hashing method using 2D-2D PCA for image copy detection. IEEE Trans. Knowl. Data Eng. 35, 4 (2023), 3765–3778. DOI:
    [6]
    Ziqing Huang and Shiguang Liu. 2020. Perceptual hashing with visual content understanding for reduced-reference screen content image quality assessment. IEEE Trans. Circ. Syst. Video Technol. 31, 7 (2020), 2808–2823.
    [7]
    Chuan Qin, Liang Wu, Xinpeng Zhang, and Guorui Feng. 2021. Efficient non-targeted attack for deep hashing-based image retrieval. IEEE Signal Process. Lett. 28 (2021), 1893–1897. DOI:
    [8]
    Xiaoping Liang, Zhenjun Tang, Xiaolan Xie, Jingli Wu, and Xianquan Zhang. 2021. Robust and fast image hashing with two-dimensional PCA. Multimedia Syst. 27, 3 (2021), 389–401.
    [9]
    Yuenan Li and Linlin Guo. 2018. Robust image fingerprinting via distortion-resistant sparse coding. IEEE Signal Process. Lett. 25, 1 (2018), 140–144. DOI:
    [10]
    Chuan Qin, Xueqin Chen, Xiangyang Luo, Xinpeng Zhang, and Xingming Sun. 2018. Perceptual image hashing via dual-cross pattern encoding and salient structure detection. Info. Sci. 423 (2018), 284–302.
    [11]
    Cheng Xiong, Enli Liu, Xinran Li, Heng Yao, Lei Zhang, and Chuan Qin. 2021. Perceptual image hashing based on multitask neural network. Secur. Commun. Netw. Retrieved from https://www.hindawi.com/journals/scn/2021/8297244/.
    [12]
    Qi Shen and Yan Zhao. 2020. Perceptual hashing for color image based on color opponent component and quadtree structure. Signal Process. 166 (2020), Article ID. 107244. DOI:
    [13]
Chuan Qin, Enli Liu, Guorui Feng, and Xinpeng Zhang. 2021. Perceptual image hashing for content authentication based on convolutional neural network with multiple constraints. IEEE Trans. Circ. Syst. Video Technol. 31, 11 (2021), 4523–4537.
[14]
Ziqing Huang, Zhenjun Tang, Xianquan Zhang, Linlin Ruan, and Xinpeng Zhang. 2023. Perceptual image hashing with locality preserving projection for copy detection. IEEE Trans. Depend. Secure Comput. 20, 1 (2023), 463–477.
[15]
Yan Zhao, Shuozhong Wang, Xinpeng Zhang, and Heng Yao. 2013. Robust hashing for image authentication using Zernike moments and local features. IEEE Trans. Info. Forensics Secur. 8 (2013), 55–63.
[16]
Zhenjun Tang, Lv Chen, Xianquan Zhang, and Shichao Zhang. 2019. Robust image hashing with tensor decomposition. IEEE Trans. Knowl. Data Eng. 31, 3 (2019), 549–560.
[17]
Zhenjun Tang, Xuelong Li, Xianquan Zhang, Shichao Zhang, and Yumin Dai. 2018. Image hashing with color vector angle. Neurocomputing 308 (2018), 147–158.
[18]
Zhenjun Tang, Ziqing Huang, Xianquan Zhang, and Huan Lao. 2017. Robust image hashing with multidimensional scaling. Signal Process. 137 (2017), 240–250.
[19]
Rubel Biswas, Víctor González-Castro, Eduardo Fidalgo, and Enrique Alegre. 2020. Perceptual image hashing based on frequency dominant neighborhood structure applied to Tor domains recognition. Neurocomputing 383 (2020), 24–38.
[20]
Vishal Monga and Brian Evans. 2006. Perceptual image hashing via feature points: Performance evaluation and tradeoffs. IEEE Trans. Image Process. 15, 11 (2006), 3452–3465.
[21]
Xi Huang, Xiaoguang Liu, Gang Wang, and Ming Su. 2016. A robust image hashing with enhanced randomness by using random walk on zigzag blocking. In Proceedings of the IEEE Trustcom/BigDataSE/ISPA. 14–18.
[22]
Caiping Yan, Chi-Man Pun, and Xiaochen Yuan. 2016. Multi-scale image hashing using adaptive local feature extraction for robust tampering detection. Signal Process. 121 (2016), 1–16.
[23]
Zhenjun Tang, Fan Yang, Liyan Huang, and Xianquan Zhang. 2014. Robust image hashing with dominant DCT coefficients. Optik - Int. J. Light Electron Optics 125 (2014), 5102–5107.
[24]
Zhenjun Tang, Ziqing Huang, Heng Yao, Xianquan Zhang, Lv Chen, and Chunqiang Yu. 2018. Perceptual image hashing with weighted DWT features for reduced-reference image quality assessment. Comput. J. 61, 10 (2018), 1695–1709.
[25]
Zhenjun Tang, Huan Lao, Xianquan Zhang, and Kai Liu. 2016. Robust image hashing via DCT and LLE. Comput. Secur. 62 (2016), 133–148.
[26]
Reza Davarzani, Saeed Mozaffari, and Khashayar Yaghmaie. 2016. Perceptual image hashing using center-symmetric local binary patterns. Multimedia Tools Appl. 75, 8 (2016), 4639–4667.
[27]
Zhenjun Tang, Mengzhu Yu, Heng Yao, Hanyun Zhang, Chunqiang Yu, and Xianquan Zhang. 2021. Robust image hashing with singular values of quaternion SVD. Comput. J. 64, 11 (2021), 1656–1671.
[28]
Zhenjun Tang, Xianquan Zhang, Xianxian Li, and Shichao Zhang. 2016. Robust image hashing with ring partition and invariant vector distance. IEEE Trans. Info. Forensics Secur. 11, 1 (2016), 200–214.
[29]
Zhenjun Tang, Xianquan Zhang, and Shichao Zhang. 2014. Robust perceptual image hashing based on ring partition and NMF. IEEE Trans. Knowl. Data Eng. 26 (2014), 711–724.
[30]
Xiaoping Liang, Zhenjun Tang, Jingli Wu, Zhixin Li, and Xinpeng Zhang. 2023. Robust image hashing with Isomap and saliency map for copy detection. IEEE Trans. Multimedia 25 (2023), 1085–1097.
[31]
Xiaoping Liang, Zhenjun Tang, Sheng Li, Chunqiang Yu, and Xianquan Zhang. 2022. A novel hashing scheme via image feature map and 2D PCA. IET Image Process. 16, 12 (2022), 3225–3236.
[32]
Ashwin Swaminathan, Yinian Mao, and Min Wu. 2006. Robust and secure image hashing. IEEE Trans. Info. Forensics Secur. 1 (2006), 215–230.
[33]
Sani Abdullahi, Hongxia Wang, and Tao Li. 2020. Fractal coding-based robust and alignment-free fingerprint image hashing. IEEE Trans. Info. Forensics Secur. 15 (2020), 2587–2601.
[34]
Xiaoran Yuan and Yan Zhao. 2021. Perceptual image hashing based on three-dimensional global features and image energy. IEEE Access 9 (2021), 49325–49337.
    [35]
Yan Zhao and Shuai Liu. 2021. Robust image hashing based on cool and warm hue and space angle. Secur. Commun. Netw. 2021 (2021).
    [36]
    Ziqing Huang and Shiguang Liu. 2021. Perceptual image hashing with texture and invariant vector distance for copy detection. IEEE Trans. Multimedia 23 (2021), 1516–1529.
    [37]
Satendra Pal Singh, Gaurav Bhatnagar, and Amit Kumar Singh. 2022. A new robust reference image hashing system. IEEE Trans. Depend. Secure Comput. 19, 4 (2022), 2211–2225.
[38]
Chenlei Guo, Qi Ma, and Liming Zhang. 2008. Spatio-temporal saliency detection using phase spectrum of quaternion Fourier transform. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR'08). 1–8.
[39]
Chenlei Guo and Liming Zhang. 2010. A novel multiresolution spatiotemporal saliency detection model and its applications in image and video compression. IEEE Trans. Image Process. 19 (2010), 185–198.
[40]
Prerana Mukherjee, Brejesh Lall, and Archit Shah. 2015. Saliency map-based improved segmentation. In Proceedings of the International Conference on Image Processing (ICIP'15). 1290–1294.
[41]
Andrik Rampun, Harry Strange, and Reyer Zwiggelaar. 2013. Texture segmentation using different orientations of GLCM features. In Proceedings of the 6th International Conference on Computer Vision/Computer Graphics Collaboration Techniques and Applications. 1–8.
[42]
Jagmal Singh and Mihai Datcu. 2013. SAR image categorization with log cumulants of the fractional Fourier transform coefficients. IEEE Trans. Geosci. Remote Sens. 51, 12 (2013), 5273–5282.
[43]
Deepak Parashar and Dheeraj Kumar Agrawal. 2021. Automatic classification of glaucoma stages using two-dimensional tensor empirical wavelet transform. IEEE Signal Process. Lett. 28 (2021), 66–70.
[44]
Ziqing Huang and Shiguang Liu. 2018. Robustness and discrimination oriented hashing combining texture and invariant vector distance. In Proceedings of the 26th ACM International Conference on Multimedia. 1389–1397.
[45]
Bernhard Schölkopf, Alex Smola, and Klaus-Robert Müller. 1998. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 10 (1998), 1299–1319.
[46]
WanJing Meng, Tao Ju, and HongYun Yu. 2010. CFAR and KPCA for SAR image target detection. In Proceedings of the International Congress on Image and Signal Processing, Vol. 4. 1832–1835.
[47]
Qiang Gao, Wenjie Liu, Xuewen Zhao, Junfang Li, and Xiao Yu. 2017. Research and application of the distillation column process fault prediction based on the improved KPCA. In Proceedings of the IEEE International Conference on Mechatronics and Automation (ICMA'17). 247–251.
[48]
Xudong Xie and Kin-Man Lam. 2006. Gabor-based kernel PCA with doubly nonlinear mapping for face recognition with a single face image. IEEE Trans. Image Process. 15, 9 (2006), 2481–2492.
[49]
Dinkar N. Bhat and Shree K. Nayar. 1996. Ordinal measures for visual correspondence. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'96). 351–357.
[50]
Zhenan Sun and Tieniu Tan. 2009. Ordinal measures for iris recognition. IEEE Trans. Pattern Anal. Mach. Intell. 31, 12 (2009), 2211–2226.
[51]
Tom Fawcett. 2006. An introduction to ROC analysis. Pattern Recogn. Lett. 27, 8 (2006), 861–874.
    [52]
Gerald Schaefer and Michal Stich. 2004. UCID: An uncompressed colour image database. In Proceedings of SPIE: Storage and Retrieval Methods and Applications for Multimedia, Vol. 5307. 472–480.

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 1 (January 2024). 639 pages.
ISSN: 1551-6857. EISSN: 1551-6865. DOI: 10.1145/3613542
Editor: Abdulmotaleb El Saddik
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 24 August 2023
    Online AM: 27 May 2023
    Accepted: 22 May 2023
    Revised: 20 April 2023
    Received: 05 January 2023
    Published in TOMM Volume 20, Issue 1

    Author Tags

    1. Robust hashing
    2. saliency map
    3. invariant features
    4. image copy detection
    5. dimension reduction

    Funding Sources

    • Guangxi Natural Science Foundation
    • National Natural Science Foundation of China
    • Guangxi “Bagui Scholar” Team for Innovation and Research, Guangxi Talent Highland Project of Big Data Intelligence and Application
    • Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing
