
Robust Hashing via Global and Local Invariant Features for Image Copy Detection

Published: 24 August 2023

Abstract

Robust hashing is a powerful technique for processing large-scale images. Many reported image hashing schemes do not balance discrimination and robustness well, and thus they cannot efficiently detect image copies, especially copies with multiple distortions. To address this, we exploit global and local invariant features to develop a novel robust hashing scheme for image copy detection. A critical contribution is the global feature calculation based on the gray level co-occurrence moment learned from the saliency map determined by the phase spectrum of the quaternion Fourier transform, which can significantly enhance discrimination without reducing robustness. Another essential contribution is the local invariant feature computation via Kernel Principal Component Analysis (KPCA) and vector distances. As KPCA can maintain the geometric relationships within an image, the local invariant features learned with KPCA and vector distances can guarantee discrimination and compactness. Moreover, the global and local invariant features are encrypted to ensure security. Finally, the hash is produced from the ordinal measures of the encrypted features, yielding a short hash. Numerous experiments are conducted to show the efficiency of our scheme. Compared with some well-known hashing schemes, our scheme demonstrates a preferable classification trade-off between discrimination and robustness, and experiments on detecting image copies with multiple distortions illustrate its effectiveness.

    1 Introduction

Copy detection is an important topic in image processing. Image copies with multiple distortions are often generated in many image applications [1, 2]. Hence, finding image copies with multiple distortions in large-scale image collections is a significant task [3, 4, 5]. Figure 1 depicts an original image and its copies, where (a) is the original image, (b) is a copy with a single distortion caused by JPEG Compression (JC), (c) is a copy with double distortions generated by JC and Brightness Adjustment (BA), and (d) is a copy with multiple distortions produced by JC, BA, and Additive White Gaussian Noise (AWGN). For quantitative analysis, the Peak Signal-to-Noise Ratio (PSNR) is adopted to measure the quality of these copies, and the results are presented in Figure 1. It can be seen that multiple distortions lead to more image degradation. Therefore, efficient approaches are needed for detecting image copies with multiple distortions. This article presents a novel robust hashing scheme for multiple-distorted image copy detection.
    Fig. 1.
    Fig. 1. Original image and its copies.
Numerous researchers suggest using image hashing [6, 7, 8] to handle a variety of image processing problems [9, 10, 11], including quality assessment, image retrieval, and copy detection. Image hashing compresses and encodes an image into a short sequence. Generally, image hashing must meet two important performance indicators [12, 13, 14]: discrimination and robustness. Discrimination, also called anti-collision [6, 15], means that different images ought to be compressed and encoded into different hashes. Robustness demands that similar images be mapped to similar or identical hashes. These two indicators are interrelated and mutually constraining: improving one typically degrades the other. Hence, it is an essential task to establish effective hashing schemes with a favorable trade-off between discrimination and robustness.
Presently, most hashing schemes perform well in their target applications. Nevertheless, they do not balance discrimination and robustness well and thus cannot efficiently detect image copies, especially image copies with multiple distortions. To address this, we design a new hashing scheme with global and local invariant features for detecting image copies with multiple distortions. The contributions of our work are highlighted below.
(1) Global features are calculated by using the Phase spectrum of Quaternion Fourier Transform (PQFT) and the Gray Level Co-occurrence Moment (GLCM). Specifically, the PQFT is exploited to extract the saliency map of the image, where all color information of the image pixels is utilized during the extraction. Next, the GLCM is produced from the saliency map. The global feature calculation with PQFT and GLCM can enhance discrimination without reducing robustness.
(2) Local invariant features are computed by using KPCA and vector distances. Since KPCA can maintain the geometric relationships within an image, the local invariant features learned with KPCA can guarantee discrimination. In addition, the low-dimensional features in the KPCA domain are viewed as vectors, and the vector distances in the KPCA domain are taken to represent the features, which guarantees the compactness of our scheme. As far as we know, this is the first work to apply KPCA to robust hashing, and we hope it will inspire the image hashing research community.
(3) The hash is generated by using the ordinal measures of the encrypted global and local invariant features. The encryption of the global and local invariant features provides the security of our scheme. The use of ordinal measures keeps the storage cost low.
Extensive experiments on prevalent image databases are used to validate the performance of our scheme. Compared with some well-known hashing schemes, our scheme shows a preferable classification trade-off between discrimination and robustness. Experiments on detecting image copies with multiple distortions further illustrate the effectiveness of our scheme.
The rest of this article is organized as follows. Related work is reviewed in Section 2. Details of our scheme are presented in Section 3. Experimental results and performance comparisons are discussed in Sections 4 and 5, respectively. Our application to copy detection is introduced in Section 6. Section 7 concludes this article.

    2 Related Work

Generally, a high-performance hashing scheme should reach a desirable trade-off between discrimination and robustness, and numerous researchers have made great efforts to develop effective hashing schemes with high classification performance. Depending on the features extracted, these image hashing schemes can be grouped into three classes.

    2.1 Image Hashing Schemes With Global Features

Some typical schemes employ moments, histograms, Tucker decomposition, Multi-dimensional Scaling (MDS), or the Global Neighborhood Structure (GNS) to extract global features for hash computation. For instance, Reference [2] uses Quaternion Zernike Moments (QZMs) to produce a hashing scheme. This scheme has good discrimination, but its robustness ought to be improved. To enhance robustness, Reference [16] extracts a hash from a third-order tensor by Tucker decomposition. This scheme is robust to common digital operations, but it can only resist small-angle rotation. To resist large-angle rotation, Reference [17] computes Color Vector Angles (CVAs) of a color image, employs the Histogram of CVAs (HCVAs) as the image feature, and compresses it with the DCT. This scheme has excellent rotation robustness, but its discrimination ought to be improved. In another work [18], the polar coordinate transformation and the DFT are combined to construct a rotation-invariant secondary image, and then MDS is employed to create a hash from the secondary image. This scheme has excellent robustness to rotation, but its discrimination ought to be enhanced. Recently, Reference [19] exploited the Dominant Neighborhood Structure (DNS) and GNS to derive global features for hash generation. This hashing scheme works well for Tor domain identification, but its time efficiency is low.

    2.2 Image Hashing Schemes with Local Features

These hashing schemes utilize local features to generate a hash. Commonly used local features include local feature points, block features, and ring features. For example, Reference [20] uses the endpoints of linear targets to design a hashing scheme. This scheme has desirable robustness against gamma correction and geometric attacks. In Reference [21], Random Walk (RW) theory and the statistical features of blocks are employed to form a hash. This scheme is effective in security. Reference [22] proposes an adaptive local feature extraction technique to construct a multi-scale hashing scheme. However, the robustness of this scheme is not satisfactory. To enhance robustness, Reference [23] employs the important DCT coefficients of each image block to make a hash. This scheme achieves the expected robustness to brightness and contrast adjustment. In another work [24], the Canny operator and the DWT are employed to calculate a hash. This scheme performs well in image quality assessment. In Reference [25], DCT coefficients and the LLE technique are utilized to form a hash. This scheme has promising robustness. In Reference [26], a new scheme combining the Center-Symmetric Local Binary Pattern (CSLBP) and SVD is introduced, but its discrimination needs to be enhanced. To enhance discrimination, Reference [27] uses the singular values of the Quaternion SVD (QSVD) to construct a hash. This scheme has excellent discrimination, but its robustness is not optimal yet. To take full account of robustness to geometric distortions, Reference [28] extracts the mean, variance, kurtosis, and skewness of image rings to form a hash. This scheme is robust to geometric distortions. In another work [29], Ring Partition (RP) theory and Non-negative Matrix Factorization (NMF) are employed to compute a hash. This scheme has good robustness. Recently, in Reference [30], the Saliency Map (SM) and Isometric Mapping (Isomap) were jointly used to generate a robust hash. This scheme obtains a relatively preferable balance between discrimination and robustness. In another work [31], a novel scheme with the Feature Map (FM) and Two-Dimensional Principal Component Analysis (2D PCA) is introduced. This scheme detects single distortions well, but it does not consider the issue of multiple distortions.

    2.3 Image Hashing Schemes with Local Features and Global Features

These image hashing schemes combine local and global features to produce a hash. For example, Reference [15] calculates a hash by combining the Zernike moments, texture features, and position features of the salient region. This scheme is well suited to tampered-image detection. To improve robustness, Reference [32] proposes a Fourier Mellin Transform (FMT)-based hashing scheme. This scheme has favorable robustness, but there is still room for improving discrimination. In another work [33], the FMT and Fractal Coding (FC) are utilized to generate a hash. This scheme has excellent robustness, but its discrimination ought to be improved. To enhance discrimination, Reference [34] employs local energy features and three-dimensional global statistical features to generate a hash. This scheme has high spatio-temporal efficiency, but security is not considered. To ensure security, Reference [35] utilizes the local features of the Three-dimensional Space Angle (3DSA) and the global features of the Cool and Warm Hue opposite Color Quaternion (CWHCQ) to form a hash. This scheme has good key dependency, but its classification performance is far from adequate. To improve classification performance, Reference [36] designs a novel hashing scheme based on the two-dimensional DCT and GLCM theory. This scheme obtains a relatively preferable classification performance. In another work [1], the Hu invariant moments and the two-dimensional DCT are employed to form a hash. This scheme performs well in copy detection. Recently, Reference [37] combined the KAZE feature and Singular Value Decomposition (SVD) to generate a hash. This scheme has desirable robustness.
    The categories and core techniques of some famous schemes are presented in Table 1. In summary, there is still much scope for getting a preferable classification performance and making good performance in copy detection, especially for detecting image copies with multiple distortions. In this work, we present a new hashing scheme with global and local invariant features for copy detection.
Table 1.
Scheme | Category | Core technique
[2] | Global | QZMs
[17] | Global | HCVAs + DCT
[18] | Global | DFT + MDS
[19] | Global | DNS + GNS
[24] | Local | Canny + DWT
[25] | Local | DCT + LLE
[27] | Local | QSVD
[29] | Local | RP + NMF
[30] | Local | SM + Isomap
[31] | Local | FM + 2D PCA
[33] | Global + Local | FMT + FC
[35] | Global + Local | 3DSA + CWHCQ
[36] | Global + Local | DCT + GLCM
[37] | Global + Local | KAZE + SVD
Table 1. Core Techniques of Some Typical Schemes

    3 Our Scheme

    Our scheme is composed of four phases. Figure 2 shows the block diagram of our scheme. First, global features are generated by PQFT and GLCM. Second, local invariant features are computed via KPCA and vector distances. Third, local and global features are encrypted. In the end, the encrypted features are quantized via ordinal measures. These phases are explained below.
    Fig. 2.
    Fig. 2. Block diagram of our scheme.

    3.1 Global Feature Calculation

The input image is first resized to \(M_{0} \times M_{0}\) . Subsequently, the global features are calculated with the PQFT and the GLCM. These features can enhance discrimination without reducing robustness. The PQFT and GLCM calculations are explained below.

    3.1.1 PQFT for Saliency Map.

A saliency map indicates the critical image areas on which the Human Visual System (HVS) focuses. To improve visual robustness, the PQFT [38] is used to determine the saliency map. The PQFT uses all color information for saliency map calculation and has been employed in video compression [39], image segmentation [40], and so on. The steps of the PQFT saliency map calculation are briefly explained below.
First, two color opponent components simulating the HVS [38] are generated as follows:
    \(\begin{equation} {BY(i,j) = B(i,j) - Y(i,j),} \end{equation}\)
    (1)
    \(\begin{equation} {RG(i,j) = R(i,j) - G(i,j),} \end{equation}\)
    (2)
    where \((i,j)\) is the position of each pixel in the spatial domain.
Four broadly tuned color channels \(B(i,j)\) , \(Y(i,j)\) , \(R(i,j)\) and \(G(i,j)\) are computed as follows:
    \(\begin{equation} {Y(i,j) = \frac{r(i,j) + g(i,j)}{2} - \frac{\left| r(i,j) - g(i,j) \right|}{2} - b(i,j),} \end{equation}\)
    (3)
    \(\begin{equation} {B(i,j) = b(i,j) - (r(i,j) + g(i,j))/2,} \end{equation}\)
    (4)
    \(\begin{equation} {R(i,j) = r(i,j) - (g(i,j) + b(i,j))/2,} \end{equation}\)
    (5)
    \(\begin{equation} {G(i,j) = g(i,j) - (r(i,j) + b(i,j))/2,} \end{equation}\)
    (6)
    in which \(b(i,j)\) , \(g(i,j)\) , and \(r(i,j)\) are the blue, green, and red channels of the pixel in the \((i,j)\) position, respectively.
For a still image, the motion component \(T(i,j)\) is 0 and the brightness component \(I(i,j)\) is computed as follows:
    \(\begin{equation} {I(i,j) = (r(i,j) + g(i,j) + b(i,j))/3.} \end{equation}\)
    (7)
Second, a quaternion \(q(i,j)\) describing a color image is defined as follows:
    \(\begin{equation} {q(i,j) = T(i,j)+RG(i,j)\eta _{1} + BY(i,j)\eta _{2} + I(i,j)\eta _{3}}, \end{equation}\)
    (8)
    where \(\eta _{1}\) , \(\eta _{2}\) , and \(\eta _{3}\) are the imaginary units that satisfy the below relations: \(\eta _{1}^{2}=-1\) , \(\eta _{2}^{2}=-1\) , \(\eta _{3}^{2}=-1\) , \(\eta _{1}\bot \eta _{2}\) , \(\eta _{1}\bot \eta _{3}\) , \(\eta _{2}\bot \eta _{3}\) , and \(\ \eta _{3} = \eta _{1}\eta _{2}\) . Then, \(q(i,j)\) can be written as follows:
    \(\begin{equation} {q(i,j) = f_{1}(i,j)+f_{2}(i,j)\eta _{2}}, \end{equation}\)
    (9)
    \(\begin{equation} {f_{1}(i,j) = T(i,j)+RG(i,j)\eta _{1}}, \end{equation}\)
    (10)
    \(\begin{equation} {f_{2}(i,j) =BY(i,j)+I(i,j)\eta _{1}}. \end{equation}\)
    (11)
Third, the Quaternion Fourier Transform (QFT) is applied to the image represented by quaternions to obtain \(Q(u,v)\) as follows:
    \(\begin{equation} {Q(u,v) = F_{2}(u,v)\eta _{2}+F_{1}(u,v),} \end{equation}\)
    (12)
    \(\begin{equation} {F_{\varphi }(u,v) = \frac{1}{\sqrt {M_{1}N_{1}}}\sum _{j = 0}^{M_{1} - 1}{\sum _{i = 0}^{N_{1} - 1}{e^{- \mu _{1}2\pi ((\frac{{iu}}{N_{1}}) + (\frac{jv}{M_{1}})) }f_{\varphi }}}(i,j),} \end{equation}\)
    (13)
in which \(\varphi \in \left\lbrace 1,\ 2 \right\rbrace\) , \(M_{1}\) and \(N_{1}\) are the height and width of the image, and \((u,v)\) is the position of the pixel in the frequency domain.
    Fourth, the inverse QFT can be obtained as follows:
    \(\begin{equation} {f_{\varphi }(i,j) = \frac{1}{\sqrt {{M_{1}N_{1}}}}\sum _{v = 0}^{M_{1} - 1}{\sum _{u = 0}^{N_{1}- 1}{e^{- \mu _{1}2\pi ((\frac{iu}{N_{1}}) + (\frac{{jv}}{M_{1}}))}F_{\varphi }(u,v).}}} \end{equation}\)
    (14)
    The above equations are used to acquire frequency domain representation \(Q(u,v)\) of \(q(i,j)\) . For simplicity, \(Q(u,v)\) can be expressed in polar form as follows:
    \(\begin{equation} {Q(u,v) = \left\Vert Q(u,v) \right\Vert e^{\xi \aleph (u,v)}}, \end{equation}\)
    (15)
    in which \(\xi\) is the unit pure quaternion and \(\aleph (u,v)\) is the phase spectrum of \(Q(u,v)\) .
Let \(\left\Vert Q(u,v) \right\Vert\) = 1. Then \(Q(u,v)\) retains only the phase spectrum information in the frequency domain, and the inverse QFT is conducted to obtain the reconstruction of \(Q(u,v)\) as follows:
    \(\begin{equation} {q^{^{\prime }}(i,j) = T^{^{\prime }}(i,j)+RG^{^{\prime }}(i,j)\eta _{1} + BY^{^{\prime }}(i,j)\eta _{2} + I^{^{\prime }}(i,j)\eta _{3}}, \end{equation}\)
    (16)
    in which \(T^{^{\prime }}(i,j)\) , \(RG^{^{\prime }}(i,j)\) , \(BY^{^{\prime }}(i,j),\) and \(I^{^{\prime }}(i,j)\) are the reconstruction information of the quaternion.
    Finally, the saliency map is defined as follows:
    \(\begin{equation} {S(i,j) = g*\left\Vert q^{^{\prime }}(i,j) \right\Vert ^{2}}, \end{equation}\)
    (17)
    where \(g\) is a two-dimensional Gaussian filter. A visual example of saliency map determined by PQFT is displayed in Figure 3.
    Fig. 3.
    Fig. 3. A visual example of saliency map determined by PQFT.

    3.1.2 GLCM.

    GLCM [41] is an effective method to represent the image. It can describe the spatial relations among pixels and has been applied to image recognition [42] and medical image classification [43]. During GLCM calculation, there are two parameters, i.e., the distance \(t\) and the direction \(\theta\) . To extract rich features, four groups of the parameters \((t, \theta)\) are used, i.e., \((1, 0^{\circ })\) , \((1, 45^{\circ })\) , \((1, 90^{\circ }),\) and \((1, 135^{\circ })\) . Therefore, after GLCM calculation on the saliency map, there are four GLCM matrices in total. More details of GLCM calculation can be found in Reference [36].
    To extract compact global features, four statistical features of each GLCM matrix are calculated. These statistical features are Angular Second Moment (ASM), Contrast, Correlation, and Homogeneity. Suppose that \(p(i,j)\) is the matrix element in the \(i\) th row and \(j\) th column. The definitions of these statistical features are given as follows.
    (1) ASM
ASM reflects the texture coarseness and gray distribution of the image [44], and is denoted by \(\chi _{1}\) :
    \(\begin{equation} {\chi _{1} = \sum _{i}^{}{\sum _{j}^{}{p(i,j)}^{2}}}. \end{equation}\)
    (18)
    (2) Contrast. Contrast is used to judge the clarity and depth of the texture. It is denoted by \(\chi _{2}\) :
    \(\begin{equation} {\chi _{2} = \sum _{i}^{}{\sum _{j}^{}p(i,j)|i - j|^{2}}}. \end{equation}\)
    (19)
    (3) Correlation. Correlation indicates the similarity degree of the element to its neighbor, which can be denoted by \(\chi _{3}\) :
    \(\begin{equation} {\chi _{3} = \sum _{i}^{}{\sum _{j}^{}\frac{(j - \mu _{j})(i - \mu _{i})p(i,j)}{\sigma _{i}\sigma _{j}}}}, \end{equation}\)
    (20)
    where \(\mu _{i}\) and \(\ \sigma _{i}\) are the mean and the variance of the \(i\) th row of the GLCM matrix, and \(\mu _{j}\) and \(\sigma _{j}\) are the mean and the variance of the \(j\) th column of the GLCM matrix, respectively.
    (4) Homogeneity. Homogeneity measures the degree of texture regularity and it is computed by \(\chi _{4}\) :
\(\begin{equation} {\chi _{4} = \sum _{i}^{}{\sum _{j}^{}\frac{p(i,j)}{1+|i -j|^{2}}}}. \end{equation}\)
    (21)
    As there are four GLCM matrices and four statistical features are extracted from each GLCM matrix, we can obtain a sequence of 16 global features. For simplicity, the global feature sequence is represented by \(\mathbf {\chi }\) in the below description.

    3.2 Local Invariant Feature Computation

    The computation of the local invariant features includes three phases. The first phase is to select the \(\mathbf {V}\) component with the size of \(M_{0} \times M_{0}\) in HSV color space for extracting stable image features. The second phase is to learn discriminative local invariant features via KPCA. The third phase is to calculate the vector distances in the KPCA domain for producing compact local features. The KPCA and vector distance calculation are explained in detail below.

    3.2.1 KPCA.

KPCA [45] is a nonlinear extension of PCA, which can effectively capture the nonlinear features of data. KPCA has been widely used in numerous applications, such as target detection [46], fault detection [47], and face recognition [48]. However, the use of KPCA in image hashing has seldom been studied. In this work, we propose to use KPCA to learn local features from the \(\mathbf {V}\) component of the image. KPCA derives compact features via the following steps.
(1) Map the data to the feature space. Suppose that \(\mathbf {x}_{i}\) and \(\mathbf {x}_{j}\) are two data points in the input space and there are \(N\) data points in total. A nonlinear mapping function \(\varnothing\) maps each data point into a high-dimensional feature space as follows:
    \(\begin{equation} {\mathbf {x}_{i} \rightarrow \varnothing \left(\mathbf {x}_{i} \right).} \end{equation}\)
    (22)
    (2) Determine the kernel function. The \(N \times N\) matrix of the kernel function \(K(\mathbf {x}_{i},\mathbf {x}_{j})\) is defined as follows:
\(\begin{equation} {K(\mathbf {x}_{i},\mathbf {x}_{j}) =\varnothing (\mathbf {x}_{i})^{\rm {T}}\varnothing (\mathbf {x}_{j}) = \exp \left(- \frac{\Vert \mathbf {x}_{i}-\mathbf {x}_{j}\Vert ^{2}}{2\sigma ^{2}}\right),} \end{equation}\)
    (23)
    in which \(\sigma\) is a kernel parameter.
(3) Calculate the projection of eigenvectors. Suppose that \(\mathbf {C}\) is the covariance matrix in the feature space, which can be computed as follows:
    \(\begin{equation} { \mathbf {C} = \frac{1}{N}\sum _{i = 1}^{N}{\varnothing (\mathbf {x}_{i})\varnothing (\mathbf {x}_{i})^{\rm {T} }}}. \end{equation}\)
    (24)
    The eigenvalues \(\lambda\) and eigenvectors \(\mathbf {E}\) of \(\mathbf {C}\) are computed and the calculation equation is as follows:
    \(\begin{equation} {\lambda \mathbf {E} = \mathbf {CE}}. \end{equation}\)
    (25)
The eigenvalues are arranged in descending order and the projection matrix \(\mathbf {A}_{d} = \lbrack a_{1}, a_{2},\ldots , a_{d}\rbrack\) is built from the eigenvectors \(\mathbf {E}\) of the first \(d\) maximum eigenvalues of \(\mathbf {C}\) . Note that \(\mathbf {E}\) can be expanded by the following equation:
    \(\begin{equation} {\mathbf {E} = \sum _{i = 1}^{N}{\alpha _{i}\varnothing (\mathbf {x}_{i})}}, \end{equation}\)
    (26)
in which \(\alpha _i\) is the \(i\) th expansion coefficient. Finally, the projection of the eigenvectors is computed as follows:
    \(\begin{equation} {\mathbf {Y} = \sum _{j = 1}^{N}{\mathbf {A}_{d}K(\mathbf {x}_{i},\mathbf {x}_{j}),}} \end{equation}\)
    (27)
    where \(\mathbf {Y}\) is a low dimensional matrix with the size of \(d \times N\) . More specific information about KPCA is given in Reference [45].
In this work, the \(\mathbf {V}\) component is partitioned into non-overlapping blocks of size \(m \times m\) . Consequently, the number of blocks is \(N = (M_{0}/m)^{2}\) . For each block, its pixels are concatenated to form a vector with \(m^{2}\) elements, which corresponds to the data point \(\mathbf {x}_{i}\) described in the KPCA. Therefore, there are \(N\) vectors in total. After applying the KPCA, the compact features are extracted. Here, the compact features are the output of the KPCA, i.e., the low-dimensional matrix \(\mathbf {Y}\) with the size of \(d \times N\) .

    3.2.2 Vector Distance Calculation.

To obtain compact features, each column of the low-dimensional matrix \(\mathbf {Y}\) is treated as a vector. As the vector distance [28] is invariant to normal digital operations, it is used to represent the features. To do so, a reference vector \(\mathbf {y}_{0}\) is computed as follows:
    \(\begin{equation} \mathbf {y}_{0} = \left[ y_{0}(1),y_{0}(2),\ldots ,y_{0}(d) \right]^{\rm {T}}, \end{equation}\)
    (28)
    where \(y_{0}(i)\) is the \(i\) th element of \(\mathbf {y}_{0}\) defined as follows:
    \(\begin{equation} {y_{0}(i) = \frac{1}{N}\sum _{j = 1}^{N}{y_{j}(i)}}, \end{equation}\)
    (29)
    in which \(y_{j}(i)\) is the \(i\) th element of \(\mathbf {y}_{j}\) .
Next, the \(L_{2}\) distance between each vector \(\mathbf {y}_{j}\) and the reference vector \(\mathbf {y}_{0}\) is computed as follows:
    \(\begin{equation} {z_{j} = \sqrt {\sum _{i = 1}^{d}{\lbrack y_{0}(i) - y_{j}(i)\rbrack ^{2}}}}. \end{equation}\)
    (30)
    Subsequently, \(z_{j}\) is quantized as follows:
\(\begin{equation} {\phi (j) = \text{round}(z_{j} \times 1000),} \end{equation}\)
    (31)
    in which \(\rm {round(\cdot)}\) is the rounding operation. Last, a feature sequence \(\mathbf {\Phi }\) is obtained as follows:
    \(\begin{equation} {\mathbf {\Phi } = \lbrack \phi (1),\phi (2),\ldots ,\phi (N)\rbrack .} \end{equation}\)
    (32)

    3.3 Encryption

The global feature sequence and the local invariant feature sequence are concatenated to produce the feature sequence \(\mathbf {g} = \lbrack g(1), g(2),\ldots , g(L)\rbrack\) , where the length is \(L = 16 + N\) . To guarantee security, a pseudo-random generator is exploited to permute the elements of \(\mathbf {g}\) . This is achieved as follows. First, a secret key is used as the seed of the random generator to produce \(L\) pseudo-random numbers. Second, these pseudo-random numbers are sorted to form an ordered sequence. Third, the original positions of the sorted numbers are recorded in a sequence \(\mathbf {r}_{1}\) . Finally, the permutation is done by the following equation:
    \(\begin{equation} {r_{2}(j) = g(r_{1}(j)),} \end{equation}\)
    (33)
    in which \(r_{1}(j)\) and \(r_{2}(j)\) are the \(j\) th elements of \(\mathbf {r}_{1}\) and \(\mathbf {r}_{2}\) , respectively. Clearly, there are \(L!\) possibilities of random permutation. Without the correct key, it is difficult to generate the correct encrypted feature. Therefore, the encrypted feature sequence \(\mathbf {r}_{2}\) can be formally defined as follows:
    \(\begin{equation} {\mathbf {r}_{2} = \lbrack r_{2}(1),r_{2}(2),\ldots ,r_{2}(L)\rbrack .} \end{equation}\)
    (34)

    3.4 Ordinal Measures

    Ordinal measures [49] are useful compact features for many robust image applications [50], such as iris recognition and copy detection. To make a short and robust hash, the ordinal measures are exploited to quantize the encrypted feature vector \(\mathbf {r}_{2}\) . To do so, the elements in the encrypted feature sequence are firstly sorted in ascending order. Next, their positions in the sorted sequence are selected as the compact features.
To ease understanding of the ordinal measures, an example is presented in Table 2, in which the first row is the original sequence, the second row is the ascending-sorted sequence, and the last row contains the ordinal measures of the sequence in the first row. The first element of the original sequence is 7, which is at the third position of the sorted sequence; therefore, its ordinal number is 3. Similarly, the second element of the original sequence is 3, which is at the first position of the sorted sequence; thus, its ordinal number is 1.
Table 2.
Original sequence | 7 | 3 | 5 | 19 | 12 | 24 | 28 | 16
Sorted sequence | 3 | 5 | 7 | 12 | 16 | 19 | 24 | 28
Ordinal measures | 3 | 1 | 2 | 6 | 4 | 7 | 8 | 5
Table 2. Example of Ordinal Measures
The final hash \(\mathbf {h}\) can be written as follows:
    \(\begin{equation} {{\bf h}=\lbrack h(1),h(2),\ldots ,h(L)\rbrack ,} \end{equation}\)
    (35)
    where \(h(j)(1 \le j \le L)\) is the position of \(r_{2}(j)\) in the sorted feature sequence. Therefore, our hash contains L integers.

    3.5 Pseudo-code Description

    Our scheme consists of four phases. They are global feature calculation, local invariant feature computation, encryption, and ordinal measures. To improve readability, the pseudo-code of our scheme is described in Algorithm 1.

    3.6 Similarity Evaluation

The \(L_{1}\) norm is a well-known distance for measuring similarity, and its inputs are generally sequences of integers or decimals. As the \(L_{1}\) norm is simple and effective and the hash sequence of our scheme consists of integers, we select the \(L_{1}\) norm as the metric for determining the similarity of two hash sequences. Suppose that \(\mathbf {h}_{1}\) and \(\mathbf {h}_{2}\) are two hash sequences. The \(L_{1}\) norm \(\zeta \left(\mathbf {h}_{\mathbf {1}},\mathbf {h}_{\mathbf {2}} \right)\) is defined as follows:
    \(\begin{equation} {\zeta \left(\mathbf {h}_{\mathbf {1}},\mathbf {h}_{\mathbf {2}} \right) = \sum _{j = 1}^{L}\left| h_{1}(j) - h_{2}(j) \right|,} \end{equation}\)
    (36)
in which \(h_{1}(j)\) and \(h_{2}(j)\) are the \(j\) th elements of \(\mathbf {h}_{1}\) and \(\mathbf {h}_{2}\) , respectively. The smaller the \(L_{1}\) norm is, the more similar the corresponding images are. A threshold \(T\) can be employed to judge whether two images are similar or not.

    4 Experimental Results

Our scheme is implemented on the MATLAB R2018b platform. The adopted computer has an Intel i7-8700 CPU with a 3.20 GHz main frequency and 8 GB RAM. In the following experiments, the parameter settings are as follows: the size of the image is 512 \(\times\) 512, the kernel parameter of KPCA is 10,000, the size of each block is 64 \(\times\) 64, and the feature dimension is 4. In other words, our parameter settings are \(M_{0}\) =512, \(m\) =64, \({\sigma =\text{10,000}}\) , and \(d\) =4. Hence, \(N=64\) and \(L=64+16=80\) .

    4.1 Robustness Test

The Kodak database is exploited to evaluate the robustness of our scheme. This database has 24 color images. For each image, 100 similar versions are formed from 16 multiple-distortion types, including Image Deformation with Rotation (IDR) (angle: 30 \(°\) )+Salt and Pepper Noise (SPN) (density: 0.01)+BA, IDR+SPN+Contrast Adjustment (CA), IDR+SPN+Gamma Correction (GC), IDR+SPN+Gaussian Low-pass Filtering (GLF), IDR+SPN+JC, IDR+SPN+Image Scaling (IS), IDR+SPN+Speckle Noise (SN), IDR+SPN+AWGN, IDR+AWGN (variance: 0.01)+BA, IDR+AWGN+CA, IDR+AWGN+GC, IDR+AWGN+GLF, IDR+AWGN+SPN, IDR+AWGN+JC, IDR+AWGN+IS, and IDR+AWGN+SN. Hence, there are 24 \(\times\) 100 = 2,400 pairs of similar images, and 2,400 + 24 = 2,424 images are employed in total. Table 3 lists the specific distortions and their parameter settings.
Table 3.
Distortions | Parameter setting | Number
IDR+SPN+BA | magnitude \(\in \left\lbrace 10, 20\right\rbrace\) | 2
IDR+SPN+CA | magnitude \(\in \left\lbrace 10, 20\right\rbrace\) | 2
IDR+SPN+GC | \(\gamma \in \left\lbrace 1.25, 1.1, 0.9, 0.75\right\rbrace\) | 4
IDR+SPN+GLF | standard deviation: [0.3, 1.0], step: 0.1 | 8
IDR+SPN+JC | factor: [30, 100], step: 10 | 8
IDR+SPN+IS | ratio \(\in \left\lbrace 2, 1.5, 1.1, 0.9, 0.75, 0.5\right\rbrace\) | 6
IDR+SPN+SN | variance: [0.001, 0.01], step: 0.001 | 10
IDR+SPN+AWGN | variance: [0.001, 0.01], step: 0.001 | 10
IDR+AWGN+BA | magnitude \(\in \left\lbrace 10, 20\right\rbrace\) | 2
IDR+AWGN+CA | magnitude \(\in \left\lbrace 10, 20\right\rbrace\) | 2
IDR+AWGN+GC | \(\gamma \in \left\lbrace 1.25, 1.1, 0.9, 0.75\right\rbrace\) | 4
IDR+AWGN+GLF | standard deviation: [0.3, 1.0], step: 0.1 | 8
IDR+AWGN+JC | factor: [30, 100], step: 10 | 8
IDR+AWGN+SPN | density: [0.001, 0.01], step: 0.001 | 10
IDR+AWGN+IS | ratio \(\in \left\lbrace 2, 1.5, 1.1, 0.9, 0.75, 0.5\right\rbrace\) | 6
IDR+AWGN+SN | variance: [0.001, 0.01], step: 0.001 | 10
Total | | 100
Table 3. Parameter Settings of Multiple Distortions
Table 4 shows the Mean of Median Values (MMV), Mean of Lower Quartiles (MLQ), and Mean of Upper Quartiles (MUQ) of the \(L_{1}\) norms under multiple distortions with different parameters, to further validate the robustness of our hashing. Clearly, the MUQs of the multiple distortions are all less than 600. When the threshold is 600, the proposed scheme correctly identifies 95.75% of similar images; when the threshold is 700, our scheme obtains a correct detection rate of 98.38%. Therefore, our scheme has good robustness according to the high correct detection rate under multiple distortions.
Table 4.
Distortions | MMV | MLQ | MUQ
IDR+SPN+BA | 392.5 | 355.0 | 496.0
IDR+SPN+CA | 390.5 | 348.5 | 488.0
IDR+SPN+GC | 424.5 | 362.0 | 501.0
IDR+SPN+GLF | 391.7 | 343.7 | 480.0
IDR+SPN+JC | 373.1 | 355.2 | 463.7
IDR+SPN+IS | 388.5 | 345.6 | 483.8
IDR+SPN+SN | 398.8 | 357.5 | 495.7
IDR+SPN+AWGN | 429.6 | 388.1 | 523.1
IDR+AWGN+BA | 473.0 | 421.0 | 562.5
IDR+AWGN+CA | 453.5 | 393.5 | 553.0
IDR+AWGN+GC | 471.5 | 410.0 | 559.2
IDR+AWGN+GLF | 426.1 | 379.7 | 518.5
IDR+AWGN+JC | 387.2 | 353.6 | 471.0
IDR+AWGN+SPN | 448.2 | 407.5 | 552.2
IDR+AWGN+IS | 441.8 | 388.3 | 533.0
IDR+AWGN+SN | 461.2 | 409.0 | 553.6
Table 4. Statistical Results of Multiple Distortions

    4.2 Discrimination Test

The VOC2007 database is used to check the discrimination of our scheme. There are 5,011 color images in the VOC2007 database. The \(L_{1}\) norms between each pair of hash codes in this database are produced. Hence, the total number of distances is given by the combinatorial number \(C(5{,}011, 2)= 5{,}011 \times (5{,}011-1)/2=12{,}552{,}555\) . Figure 5 shows the distribution of these distances. The calculation results show that the median value, the lower quartile, and the upper quartile of the \(L_{1}\) norms are 1,342.0, 1,236.0, and 1,444.0, respectively, which are all larger than the biggest MUQ (562.5) of the similar images. This illustrates that our scheme is discriminative.
    Fig. 4.
    Fig. 4. Some Images of the VOC2007 Database.
    Fig. 5.
    Fig. 5. Distribution of distances of different hash codes.

    4.3 Block Size Selection

To view our performance under different block sizes, the Receiver Operating Characteristics (ROC) graph [51] is employed. The ordinate axis and the abscissa axis are defined by \(P_1\) and \(P_2\) as follows:
    \(\begin{equation} P_1 = \frac{ \rm {\# similar\; images\; rightly\; detected}}{\rm {\# similar\; images}}, \end{equation}\)
    (37)
    \(\begin{equation} P_2 = \frac{ \rm {\# different\; images\; falsely\; recognized}}{\rm {\# different\; images}}. \end{equation}\)
    (38)
    A small \(P_2\) value means a good discrimination, while a large \(P_1\) value means a good robustness. A set of points with coordinates \((P_{1}, P_{2})\) are employed to plot a curve of ROC graph. If there are two curves in a graph, then the curve near the upper-left corner has better performance than that away from it. In addition, the Area Under the ROC Curve (AUC) is computed for performance comparison, where the scope of AUC is [0, 1]. The curve with a big AUC is better than the curve with a small one.
In this section, the datasets are the same as the image databases used for robustness in Section 4.1 and discrimination in Section 4.2. Note that the block size is varied while the other parameter settings remain unchanged. Figure 6 shows the curves under different block sizes. The curve of 64 \(\times\) 64 is nearer the upper-left corner than those of the other sizes. Furthermore, the AUCs of different block sizes are calculated. The AUC values of 256 \(\times\) 256, 128 \(\times\) 128, 64 \(\times\) 64, 32 \(\times\) 32, and 16 \(\times\) 16 are 0.15438, 0.77170, 0.99999, 0.99985, and 0.99977, respectively, so the AUC of 64 \(\times\) 64 is larger than those of the other block sizes. This means that the classification performance of the block size 64 \(\times\) 64 is superior to those of 256 \(\times\) 256, 128 \(\times\) 128, 32 \(\times\) 32, and 16 \(\times\) 16. The computational time of 256 \(\times\) 256, 128 \(\times\) 128, 64 \(\times\) 64, 32 \(\times\) 32, and 16 \(\times\) 16 is 0.1005, 0.1163, 0.1453, 0.2601, and 1.0620 s, respectively, and the corresponding hash lengths are 20, 32, 80, 272, and 1,040 integers. Table 5 lists the performance comparison under different block sizes. Considering classification performance, time, and hash length together, the block size 64 \(\times\) 64 provides the best overall performance for the proposed scheme.
Table 5.
Block size | AUC | Time (s) | Length (integers)
256 \(\times\) 256 | 0.15438 | 0.1005 | 20
128 \(\times\) 128 | 0.77170 | 0.1163 | 32
64 \(\times\) 64 | 0.99999 | 0.1453 | 80
32 \(\times\) 32 | 0.99985 | 0.2601 | 272
16 \(\times\) 16 | 0.99977 | 1.0620 | 1,040
Table 5. Performance Comparison of Different Block Sizes
    Fig. 6.
    Fig. 6. Curves of different block sizes.

    4.4 Dimension Selection

A key parameter of our scheme is the dimension selected in the KPCA, i.e., the parameter \(d\) . In the experiments, the \(d\) value is chosen from the set \(\left\lbrace 2, 3, 4, 5, 6 \right\rbrace\) , and the other parameter settings remain the same. Figure 7 shows the ROC curves under different dimensions. The quantitative results show that the AUC values for the dimension values 2, 3, 4, 5, and 6 are 0.99905, 0.99995, 0.99999, 0.99994, and 0.99996, respectively. The AUC value of \(d\) =4 is greater than those of the other dimensions. The computational time for the dimension values 2, 3, 4, 5, and 6 is 0.1453, 0.1597, 0.1625, 0.1686, and 0.1694 s, respectively; there is little difference between the dimensions. Performance comparisons are summarized in Table 6. Our scheme achieves the best classification performance, at a comparable computational time, when the dimension value is 4.
Table 6.
\(d\) | AUC | Time (s)
2 | 0.99905 | 0.1453
3 | 0.99995 | 0.1597
4 | 0.99999 | 0.1625
5 | 0.99994 | 0.1686
6 | 0.99996 | 0.1694
Table 6. Performance Comparison of Different Dimensions
    Fig. 7.
    Fig. 7. Curves of different dimensions.

    4.5 Security Analysis

    To ensure the security of our scheme, the similarity between different hashes created by different keys should be small. In our scheme, the key dependence means that the \(L_{1}\) norms between the hash codes of different keys are very large. The six test images named “Airplane,” “Baboon,” “House,” “Peppers,” “Splash,” and “Tree” are employed to compute the \(L_{1}\) norms between hash codes generated by different keys. First, the hash of each image is generated by the correct key. Second, 100 wrong keys are used to compute 100 different hash codes of each image. Finally, the \(L_{1}\) norms between hashes of the correct key and the incorrect keys are computed. Figure 8 shows the \(L_{1}\) norms between the hashes generated by different keys. In Figure 8, the \(y\) -axis describes the \(L_{1}\) norm and the \(x\) -axis describes the index of the wrong key. The minimum \(L_{1}\) norm is greater than 1,700, which shows that our scheme has good key dependence.
    Fig. 8.
    Fig. 8. The \(L_{1}\) norms between the hashes generated by different keys.

    5 Performance Comparison

To demonstrate the advantage of our scheme, some famous schemes are compared, including the RW scheme [21], SVD-CSLBP scheme [26], HCVA-DCT scheme [17], MDS scheme [18], FM-2D PCA scheme [31], and DCT-GLCM scheme [36]. The compared schemes were recently published in reputable journals or conferences. In addition, the SVD-CSLBP, MDS, and FM-2D PCA schemes also use dimension reduction techniques, namely SVD, MDS, and 2D PCA. To guarantee a fair comparison, the parameter values and hash similarity measurements reported in the papers of these schemes are all employed here, and all images are resized to 512 \(\times\) 512 before they are input to these schemes. In the following, Section 5.1 presents the classification performance, and Section 5.2 compares the performances of time and storage.

    5.1 Classification Performance

    The image databases described in Sections 4.1 and 4.2 are used to evaluate the classification performance of our scheme. Specifically, 2,424 images are used for robustness and 5,011 images are taken for discrimination. The ROC graph is still chosen to conduct visual comparison. In Figure 9, all evaluated schemes’ curves are drawn in the same graph for easy comparison. For a better view of these curves, their local details are enlarged and drawn in the same graph. Clearly, the curve of our scheme is nearer to the upper-left corner than those of the compared schemes. The visual comparison shows that the classification performance of our scheme outperforms those of the compared schemes. For further analysis, the quantitative metric called AUC is also computed.
    Fig. 9.
    Fig. 9. ROC curves of different schemes.
The experimental results show that the AUC of our scheme is 0.99999, whereas the AUCs of the RW scheme, SVD-CSLBP scheme, HCVA-DCT scheme, MDS scheme, FM-2D PCA scheme, DCT-GLCM scheme, our scheme with only global features, and our scheme with only local features are 0.98298, 0.87868, 0.93143, 0.96125, 0.997642, 0.99913, 0.15394, and 0.99981, respectively. Obviously, the AUC of our scheme is larger than those of the compared schemes, which again illustrates that our scheme is superior to all compared schemes. Our scheme achieves this competitive classification performance for the following reasons: the global feature calculation with PQFT and GLCM can enhance discrimination without reducing robustness, and, since KPCA can maintain the geometric relationships within an image, the local invariant features learned with KPCA can guarantee discrimination.

    5.2 Performances of Time and Storage

Time performance is measured by the computational time of calculating a hash code. The results show that the computational time of our scheme is 0.1453 s, while the computational times of the RW scheme, SVD-CSLBP scheme, HCVA-DCT scheme, MDS scheme, FM-2D PCA scheme, DCT-GLCM scheme, our scheme with only global features, and our scheme with only local features are 0.0587, 0.1213, 0.0328, 0.4285, 0.0847, 0.1211, 0.0671, and 0.1024 s, respectively. Our scheme is quicker than several of the compared schemes; in particular, compared with a dimension reduction-based scheme such as the MDS scheme, our scheme is much quicker.
Storage performance is determined by the number of bits required for a hash code. In the experiment, the 5,011 different images in VOC2007 are utilized to analyze the bits needed to store our hash. As each hash includes 80 integers, there are \(\text{5,011} \times 80 = \text{400,880}\) integers in total. According to the ordinal measures, the maximum hash value is 80. As \(80 \lt 128 = 2^{7}\) , only 7 bits are needed to store a hash element. Hence, a hash of our scheme requires \(80 \times 7 = 560\) bits. For our scheme with only global features, 4 bits suffice per hash element because \(16 = 2^{4}\) , so a hash requires \(16 \times 4 = 64\) bits. For our scheme with only local features, 6 bits suffice per hash element because \(64 = 2^{6}\) , so a hash requires \(64 \times 6 = 384\) bits. For the SVD-CSLBP scheme and the HCVA-DCT scheme, the hash lengths are 64 and 20 floating-point numbers, respectively. Note that a floating-point number requires 32 bits according to the IEEE standard. Thus, the SVD-CSLBP scheme and the HCVA-DCT scheme need 2,048 bits and 640 bits for saving a hash code, respectively. The hash lengths of the RW scheme, SVD-CSLBP scheme, HCVA-DCT scheme, MDS scheme, FM-2D PCA scheme, DCT-GLCM scheme, our scheme with only global features, our scheme with only local features, and our scheme are 144, 2,048, 640, 720, 832, 720, 64, 384, and 560 bits, respectively. Obviously, the storage cost of our scheme is higher than that of the RW scheme, but lower than those of the SVD-CSLBP, HCVA-DCT, MDS, FM-2D PCA, and DCT-GLCM schemes. The complexity summary is presented in Table 7.
Table 7.
Scheme | Hash length (bits) | Time (s)
RW | 144 | 0.0587
SVD-CSLBP | 2,048 | 0.1213
HCVA-DCT | 640 | 0.0328
MDS | 720 | 0.4285
FM-2D PCA | 832 | 0.0847
DCT-GLCM | 720 | 0.1211
Our-global | 64 | 0.0671
Our-local | 384 | 0.1024
Our | 560 | 0.1453
Table 7. Performance Summary of Time and Storage

    6 Application to Detection of Image Copies with Multiple Distortions

    With the extensive use of digital images, copyright protection of images has become a significant task of information security. Due to the benefit of robust hashing in robustness and low storage, numerous researchers have utilized robust hashing to pursue copy detection. In this section, we test the copy detection performance of our scheme. Section 6.1 depicts the employed database and metric. Section 6.2 presents the results of copy detection.

    6.1 Database and Metric

To build an image database for copy detection, the UCID [52] is employed. In the experiment, 16 images are randomly chosen as the query images from the 1,338 color images in the UCID. Figure 10 shows the thumbnails of these query images. To simulate copy detection, 10 digital operations are utilized to produce 10 copies of each query image. These 10 operations are listed below: IDR+JC+BA (angle: 30 \(^{\circ }\) , quality factor: 30, parameter: 20), IDR+JC+CA (angle: 30 \(^{\circ }\) , quality factor: 30, parameter: 20), IDR+JC+GC (angle: 30 \(^{\circ }\) , quality factor: 30, \(\gamma\) : 0.9), IDR+JC+GLF (angle: 30 \(^{\circ }\) , quality factor: 30, standard deviation: 0.2), IDR+JC+SPN (angle: 30 \(^{\circ }\) , quality factor: 30, density: 0.02), IDR+JC+IS (angle: 30 \(^{\circ }\) , quality factor: 30, ratio: 0.75), IDR+JC+SN (angle: 30 \(^{\circ }\) , quality factor: 30, variance: 0.02), IDR+JC with Text Adding (IDR+JC+TA) (text content: Copyright 2022), IDR+JC with Logo Embedding (IDR+JC+LE) (size of logo: \(66\times 70\) , weight of logo: 0.2), and IDR+JC+SN (angle: 30 \(^{\circ }\) , quality factor: 30, variance: 0.02). Thus, there are 160 image copies. These image copies and the UCID images excluding the 16 chosen query images form the copy image database. Therefore, the total number of images in the database is \(160+1{,}338-16=1{,}482\) . For every query image, there are 1,472 different images and 10 image copies.
    Fig. 10.
    Fig. 10. The 16 query images selected from the UCID.
To validate the copy detection performance of different schemes, the Mean Average Precision (MAP) is used as the metric. The MAP is computed from the Average Precision (AP), whose calculation depends on the order of the images returned by each scheme. The equation of AP is as follows:
    \(\begin{equation} {\rm{AP}} = \frac{1}{10}\sum_{i = 1}^{10}{f_{i}\left(\frac{\sum_{j = 1}^{i} f_{j}}{i}\right)} \end{equation}\)
    (39)
    in which \(f_{i}=1\) when the \(i\) th returned image is an image copy. Otherwise, \(f_{i}=0\) . The MAP is acquired by computing the average of the APs of all query images. The scope of MAP is [0, 1]. In general, a bigger MAP means a superior copy detection performance.

    6.2 Detection Results

To show its advantage, the copy detection performance of our scheme is compared with the RW scheme [21], SVD-CSLBP scheme [26], HCVA-DCT scheme [17], MDS scheme [18], FM-2D PCA scheme [31], and DCT-GLCM scheme [36]. Note that the SVD-CSLBP, MDS, and FM-2D PCA schemes are also dimension reduction-based schemes. The MAPs of these schemes are calculated and shown in Figure 11. The MAP of our scheme is 0.99820, while the MAPs of the RW scheme, SVD-CSLBP scheme, HCVA-DCT scheme, MDS scheme, FM-2D PCA scheme, and DCT-GLCM scheme are 0.79779, 0.35234, 0.47503, 0.72384, 0.96875, and 0.98748, respectively. Clearly, the MAP of our scheme is greater than those of the compared schemes, so our scheme performs better in copy detection. This is because our scheme has better classification performance, which reduces classification errors during copy detection.
    Fig. 11.
    Fig. 11. MAP comparison.

    7 Conclusions

This article has proposed a novel hashing scheme with global and local invariant features for detecting image copies with multiple distortions. An important contribution is the global feature calculation with the PQFT and the GLCM, which can strengthen discrimination without reducing robustness. Another critical contribution is the local invariant feature computation via KPCA and vector distances. As KPCA can maintain the geometric relationships within an image, the local invariant features learned with KPCA can guarantee discrimination. In addition, the low-dimensional features in the KPCA domain are viewed as vectors, and the vector distances in the KPCA domain are taken to represent the features, which guarantees the compactness of our scheme. The hash is finally determined by using the ordinal measures of the encrypted global and local invariant features. Numerous experiments have been carried out, and the results show that our scheme is superior to some advanced hashing schemes in terms of classification and the detection of image copies with multiple distortions.

    Acknowledgments

    Many thanks to the reviewers for their helpful suggestions. Zhenjun Tang is the corresponding author.


    References

    [1]
    Shiguang Liu and Ziqing Huang. 2019. Efficient image hashing with geometric invariant vector distance for copy detection. ACM Trans. Multimedia Comput., Commun. Appl. 15, 4 (2019), 1–22. DOI:
    [2]
    Junlin Ouyang, Xingzi Wen, Jianxun Liu, and Jinjun Chen. 2016. Robust hashing based on quaternion Zernike moments for image authentication. ACM Trans. Multimedia Comput., Commun. Appl. 12, 4 (2016), 1–13. DOI:
    [3]
    Haozhe Chen, Hang Zhou, Jie Zhang, Dongdong Chen, Weiming Zhang, Kejiang Chen, Gang Hua, and Nenghai Yu. 2023. Perceptual hashing of deep convolutional neural networks for model copy detection. ACM Trans. Multimedia Comput., Commun. Appl. 19, 3, Article 123 (2023), 20 pages. DOI:
    [4]
    Zhaoda Ye and Yuxin Peng. 2019. Sequential cross-modal hashing learning via multi-scale correlation mining. ACM Trans. Multimedia Comput., Commun. Appl. 15, 4 (2019), 1–20.
    [5]
    Xiaoping Liang, Zhenjun Tang, Ziqing Huang, Xianquan Zhang, and Shichao Zhang. 2023. Efficient hashing method using 2D-2D PCA for image copy detection. IEEE Trans. Knowl. Data Eng. 35, 4 (2023), 3765–3778. DOI:
    [6]
    Ziqing Huang and Shiguang Liu. 2020. Perceptual hashing with visual content understanding for reduced-reference screen content image quality assessment. IEEE Trans. Circ. Syst. Video Technol. 31, 7 (2020), 2808–2823.
    [7]
    Chuan Qin, Liang Wu, Xinpeng Zhang, and Guorui Feng. 2021. Efficient non-targeted attack for deep hashing-based image retrieval. IEEE Signal Process. Lett. 28 (2021), 1893–1897. DOI:
    [8]
    Xiaoping Liang, Zhenjun Tang, Xiaolan Xie, Jingli Wu, and Xianquan Zhang. 2021. Robust and fast image hashing with two-dimensional PCA. Multimedia Syst. 27, 3 (2021), 389–401.
    [9]
    Yuenan Li and Linlin Guo. 2018. Robust image fingerprinting via distortion-resistant sparse coding. IEEE Signal Process. Lett. 25, 1 (2018), 140–144. DOI:
    [10]
    Chuan Qin, Xueqin Chen, Xiangyang Luo, Xinpeng Zhang, and Xingming Sun. 2018. Perceptual image hashing via dual-cross pattern encoding and salient structure detection. Info. Sci. 423 (2018), 284–302.
    [11]
    Cheng Xiong, Enli Liu, Xinran Li, Heng Yao, Lei Zhang, and Chuan Qin. 2021. Perceptual image hashing based on multitask neural network. Secur. Commun. Netw. Retrieved from https://www.hindawi.com/journals/scn/2021/8297244/.
    [12]
    Qi Shen and Yan Zhao. 2020. Perceptual hashing for color image based on color opponent component and quadtree structure. Signal Process. 166 (2020), Article ID. 107244. DOI:
    [13]
Chuan Qin, Enli Liu, Guorui Feng, and Xinpeng Zhang. 2021. Perceptual image hashing for content authentication based on convolutional neural network with multiple constraints. IEEE Trans. Circ. Syst. Video Technol. 31, 11 (2021), 4523–4537.
[14]
Ziqing Huang, Zhenjun Tang, Xianquan Zhang, Linlin Ruan, and Xinpeng Zhang. 2023. Perceptual image hashing with locality preserving projection for copy detection. IEEE Trans. Depend. Secure Comput. 20, 1 (2023), 463–477.
[15]
Yan Zhao, Shuozhong Wang, Xinpeng Zhang, and Heng Yao. 2013. Robust hashing for image authentication using Zernike moments and local features. IEEE Trans. Info. Forensics Secur. 8 (2013), 55–63.
[16]
Zhenjun Tang, Lv Chen, Xianquan Zhang, and Shichao Zhang. 2019. Robust image hashing with tensor decomposition. IEEE Trans. Knowl. Data Eng. 31, 3 (2019), 549–560.
[17]
Zhenjun Tang, Xuelong Li, Xianquan Zhang, Shichao Zhang, and Yumin Dai. 2018. Image hashing with color vector angle. Neurocomputing 308 (2018), 147–158.
[18]
Zhenjun Tang, Ziqing Huang, Xianquan Zhang, and Huan Lao. 2017. Robust image hashing with multidimensional scaling. Signal Process. 137 (2017), 240–250.
[19]
Rubel Biswas, Víctor González-Castro, Eduardo Fidalgo, and Enrique Alegre. 2020. Perceptual image hashing based on frequency dominant neighborhood structure applied to Tor domains recognition. Neurocomputing 383 (2020), 24–38.
[20]
Vishal Monga and Brian Evans. 2006. Perceptual image hashing via feature points: Performance evaluation and tradeoffs. IEEE Trans. Image Process. 15, 11 (2006), 3452–3465.
[21]
Xi Huang, Xiaoguang Liu, Gang Wang, and Ming Su. 2016. A robust image hashing with enhanced randomness by using random walk on zigzag blocking. In Proceedings of the IEEE Trustcom/BigDataSE/ISPA. 14–18.
[22]
Caiping Yan, Chi-Man Pun, and Xiaochen Yuan. 2016. Multi-scale image hashing using adaptive local feature extraction for robust tampering detection. Signal Process. 121 (2016), 1–16.
[23]
Zhenjun Tang, Fan Yang, Liyan Huang, and Xianquan Zhang. 2014. Robust image hashing with dominant DCT coefficients. Optik - Int. J. Light Electron Optics 125 (2014), 5102–5107.
[24]
Zhenjun Tang, Ziqing Huang, Heng Yao, Xianquan Zhang, Lv Chen, and Chunqiang Yu. 2018. Perceptual image hashing with weighted DWT features for reduced-reference image quality assessment. Comput. J. 61, 10 (2018), 1695–1709.
[25]
Zhenjun Tang, Huan Lao, Xianquan Zhang, and Kai Liu. 2016. Robust image hashing via DCT and LLE. Comput. Secur. 62 (2016), 133–148.
[26]
Reza Davarzani, Saeed Mozaffari, and Khashayar Yaghmaie. 2016. Perceptual image hashing using center-symmetric local binary patterns. Multimedia Tools Appl. 75, 8 (2016), 4639–4667.
[27]
Zhenjun Tang, Mengzhu Yu, Heng Yao, Hanyun Zhang, Chunqiang Yu, and Xianquan Zhang. 2021. Robust image hashing with singular values of quaternion SVD. Comput. J. 64, 11 (2021), 1656–1671.
[28]
Zhenjun Tang, Xianquan Zhang, Xianxian Li, and Shichao Zhang. 2016. Robust image hashing with ring partition and invariant vector distance. IEEE Trans. Info. Forensics Secur. 11, 1 (2016), 200–214.
[29]
Zhenjun Tang, Xianquan Zhang, and Shichao Zhang. 2014. Robust perceptual image hashing based on ring partition and NMF. IEEE Trans. Knowl. Data Eng. 26 (2014), 711–724.
[30]
Xiaoping Liang, Zhenjun Tang, Jingli Wu, Zhixin Li, and Xinpeng Zhang. 2023. Robust image hashing with Isomap and saliency map for copy detection. IEEE Trans. Multimedia 25 (2023), 1085–1097.
[31]
Xiaoping Liang, Zhenjun Tang, Sheng Li, Chunqiang Yu, and Xianquan Zhang. 2022. A novel hashing scheme via image feature map and 2D PCA. IET Image Process. 16, 12 (2022), 3225–3236.
[32]
Ashwin Swaminathan, Yinian Mao, and Min Wu. 2006. Robust and secure image hashing. IEEE Trans. Info. Forensics Secur. 1 (2006), 215–230.
[33]
Sani Abdullahi, Hongxia Wang, and Tao Li. 2020. Fractal coding-based robust and alignment-free fingerprint image hashing. IEEE Trans. Info. Forensics Secur. 15 (2020), 2587–2601.
[34]
Xiaoran Yuan and Yan Zhao. 2021. Perceptual image hashing based on three-dimensional global features and image energy. IEEE Access 9 (2021), 49325–49337.
    [35]
Yan Zhao and Shuai Liu. 2021. Robust image hashing based on cool and warm hue and space angle. Secur. Commun. Netw. 2021 (2021).
    [36]
    Ziqing Huang and Shiguang Liu. 2021. Perceptual image hashing with texture and invariant vector distance for copy detection. IEEE Trans. Multimedia 23 (2021), 1516–1529.
    [37]
Satendra Pal Singh, Gaurav Bhatnagar, and Amit Kumar Singh. 2022. A new robust reference image hashing system. IEEE Trans. Depend. Secure Comput. 19, 4 (2022), 2211–2225.
[38]
Chenlei Guo, Qi Ma, and Liming Zhang. 2008. Spatio-temporal saliency detection using phase spectrum of quaternion Fourier transform. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR'08). 1–8.
[39]
Chenlei Guo and Liming Zhang. 2010. A novel multiresolution spatiotemporal saliency detection model and its applications in image and video compression. IEEE Trans. Image Process. 19 (2010), 185–198.
[40]
Prerana Mukherjee, Brejesh Lall, and Archit Shah. 2015. Saliency map-based improved segmentation. In Proceedings of the International Conference on Image Processing (ICIP'15). 1290–1294.
[41]
Andrik Rampun, Harry Strange, and Reyer Zwiggelaar. 2013. Texture segmentation using different orientations of GLCM features. In Proceedings of the 6th International Conference on Computer Vision/Computer Graphics Collaboration Techniques and Applications. 1–8.
[42]
Jagmal Singh and Mihai Datcu. 2013. SAR image categorization with log cumulants of the fractional Fourier transform coefficients. IEEE Trans. Geosci. Remote Sens. 51, 12 (2013), 5273–5282.
[43]
Deepak Parashar and Dheeraj Kumar Agrawal. 2021. Automatic classification of glaucoma stages using two-dimensional tensor empirical wavelet transform. IEEE Signal Process. Lett. 28 (2021), 66–70.
[44]
Ziqing Huang and Shiguang Liu. 2018. Robustness and discrimination oriented hashing combining texture and invariant vector distance. In Proceedings of the 26th ACM International Conference on Multimedia. 1389–1397.
[45]
Bernhard Schölkopf, Alex Smola, and Klaus-Robert Müller. 1998. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 10 (1998), 1299–1319.
[46]
WanJing Meng, Tao Ju, and HongYun Yu. 2010. CFAR and KPCA for SAR image target detection. In Proceedings of the International Congress on Image and Signal Processing, Vol. 4. 1832–1835.
[47]
Qiang Gao, Wenjie Liu, Xuewen Zhao, Junfang Li, and Xiao Yu. 2017. Research and application of the distillation column process fault prediction based on the improved KPCA. In Proceedings of the IEEE International Conference on Mechatronics and Automation (ICMA'17). 247–251.
[48]
Xudong Xie and Kin-Man Lam. 2006. Gabor-based kernel PCA with doubly nonlinear mapping for face recognition with a single face image. IEEE Trans. Image Process. 15, 9 (2006), 2481–2492.
[49]
Dinkar N. Bhat and Shree K. Nayar. 1996. Ordinal measures for visual correspondence. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'96). 351–357.
[50]
Zhenan Sun and Tieniu Tan. 2009. Ordinal measures for iris recognition. IEEE Trans. Pattern Anal. Mach. Intell. 31, 12 (2009), 2211–2226.
[51]
Tom Fawcett. 2006. An introduction to ROC analysis. Pattern Recogn. Lett. 27, 8 (2006), 861–874.
    [52]
Gerald Schaefer and Michal Stich. 2004. UCID: An uncompressed colour image database. In Proceedings of SPIE: Storage and Retrieval Methods and Applications for Multimedia, Vol. 5307. 472–480.

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 1 (January 2024). 639 pages.
ISSN: 1551-6857. EISSN: 1551-6865. DOI: 10.1145/3613542
Editor: Abdulmotaleb El Saddik
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 24 August 2023
    Online AM: 27 May 2023
    Accepted: 22 May 2023
    Revised: 20 April 2023
    Received: 05 January 2023
    Published in TOMM Volume 20, Issue 1

    Author Tags

    1. Robust hashing
    2. saliency map
    3. invariant features
    4. image copy detection
    5. dimension reduction

    Funding Sources

    • Guangxi Natural Science Foundation
    • National Natural Science Foundation of China
    • Guangxi “Bagui Scholar” Team for Innovation and Research, Guangxi Talent Highland Project of Big Data Intelligence and Application
    • Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing
