
Robust Image Hashing via CP Decomposition and DCT for Copy Detection

Published: 25 April 2024

Abstract

Copy detection is a key task in image copyright protection. This article proposes a robust image hashing algorithm based on CP decomposition and the discrete cosine transform (DCT) for copy detection. The first contribution is the construction of a third-order tensor from low-frequency coefficients in the DCT domain. Since the low-frequency DCT coefficients contain most of the image energy, they reflect the basic visual content of the image and are little disturbed by noise. Hence, constructing the third-order tensor from the low-frequency DCT coefficients ensures the robustness of our algorithm. The second contribution is the application of CP decomposition to the third-order tensor to learn a short binary hash. As the factor matrices learned by CP decomposition preserve the topology of the original tensor, the binary hash derived from them achieves good discrimination. Extensive experiments and comparisons validate the effectiveness and advantages of our algorithm. The results demonstrate that it outperforms several baseline algorithms in classification and copy detection, and that it is also superior to some baselines in terms of hash length and computational time.

1 Introduction

In the digital era, it is convenient to obtain digital images via mobile devices, e.g., tablet computers and smartphones. Meanwhile, online social platforms, such as Instagram, Little Red Book, and Oasis, are popular and have led to explosive growth of digital images on the Internet [1, 2, 3, 4]. How to efficiently store and manage massive numbers of images has therefore emerged as an important issue in data science research. For example, many users post digital images on social platforms. In the process, the images may undergo manipulations, e.g., compression and format conversion, that do not change their content. Hence, the processed images have content similar to the originals, which means that many image copies exist on the Internet. Figure 1 presents an example with three copy versions of a digital image, in which (a) is the original image, and (b), (c), and (d) are copies generated by brightness adjustment (BA), JPEG compression (JC), and text addition (TA), respectively. An efficient technique is expected to accurately detect all image copies, e.g., all three copies of Figure 1(a). It is therefore imperative to devise efficient methods for detecting image copies. This article introduces a robust hashing algorithm for image copy detection.
Fig. 1. Three copy versions of a digital image.
Image hashing [5, 6, 7, 8, 9] is an effective technology for the efficient management of massive image collections. It not only reduces storage space but also lowers the computational complexity of similarity search. Motivated by these advantages, researchers have applied image hashing to many scenarios, such as authentication, copy detection, retrieval, tampering detection, and quality evaluation [10, 11, 12, 13]. Generally, an image hashing algorithm should satisfy two basic performance indicators, namely robustness and discrimination [14, 15, 16, 17]. Robustness requires a hashing algorithm to map images with similar visual content to similar hashes. Discrimination, also called anti-collision [5, 12], requires a hashing algorithm to map images with different visual content to different hashes. The two indicators constrain each other, and most existing image hashing algorithms do not achieve a desirable balance between them; consequently, their copy detection performance is unsatisfactory. To address this problem, we introduce a new hashing algorithm based on CP decomposition and DCT for image copy detection. The contributions of this work are outlined as follows:
(1) A third-order tensor is constructed by using low-frequency coefficients in the DCT domain. As the low-frequency DCT coefficients contain most of the image energy, they can reflect the basic visual content of the image and are less disturbed by noise. Hence, the third-order tensor construction with the low-frequency DCT coefficients can guarantee the robustness of our algorithm.
(2) CP decomposition is applied to the third-order tensor for learning a short binary hash. The CP decomposition can decompose a third-order tensor into three factor matrices. As the factor matrices learned from the CP decomposition can preserve the topology of the original tensor, the binary hash derived from the factor matrices can reach good discrimination.
Extensive experiments and comparisons are conducted on public image datasets. The results show that our algorithm achieves better classification and copy detection performance than several baseline algorithms, and it also outperforms several baselines in terms of hash length and computational time. The rest of this article is organized as follows. Related work is reviewed in Section 2. Our algorithm is elaborated in Section 3. Experiments and comparisons are discussed in Sections 4 and 5, respectively. Finally, conclusions are drawn in Section 6.

2 Related Work

In the literature, many robust hashing algorithms have been reported for different purposes. Based on the type of feature extraction technique, these algorithms can be divided into three categories. Typical techniques of each category are reviewed below.

2.1 Statistical Features-based Hashing Algorithms

These image hashing algorithms exploit statistical features, such as moments, histograms, variance, skewness, and kurtosis, to derive a hash. For example, Zhao et al. [6] extracted the amplitudes and phases of Zernike moments of image blocks to produce an image hash. Building on Zernike moment theory, Ouyang et al. [2] combined global features from quaternion Zernike moments (QZMs) with local SIFT feature points to form a hash. This scheme exhibits strong security and is suitable for image authentication. Hosny et al. [18] introduced an algorithm based on quaternion polar complex exponential transform (QPCET) moments, which is robust to some content-preserving operations. Tang et al. [19] designed a hashing algorithm using the histogram of the color vector angle (CVA) matrix with ring partition; it can resist rotation. Beyond moments and histograms, other statistical features have also been employed in image hashing research. For instance, Tang et al. [20] used stable statistics (mean, variance, skewness, and kurtosis) of image rings to produce a hash. This hashing ensures rotation robustness effectively, but its discrimination needs to be strengthened.

2.2 Transform Domain-based Hashing Algorithms

These image hashing algorithms extract features in a transform domain to form a hash. Frequently used transforms include the discrete Fourier transform (DFT) [21], discrete wavelet transform (DWT) [22, 23], and DCT [24, 25]. To improve robustness, Wang et al. [26] developed an algorithm based on Watson's visual model computed with the DCT; it performs well against blurring and noise. Tang et al. [22] used low-frequency coefficients in the DCT domain to form a hash. This hashing achieves good robustness, but its discrimination needs to be strengthened. To improve discrimination, Laradji et al. [27] employed the quaternion discrete Fourier transform (QFT) to extract a hash. This hashing has good discrimination, but its robustness is not ideal. In another work, Yan et al. [28] designed a hashing scheme for tamper detection based on the quaternion Fourier-Mellin transform (QFMT). This scheme has good discrimination because the QFMT captures color information well.

2.3 Data Dimensionality Reduction-based Hashing Algorithms

These hashing algorithms use low-dimensional data features to produce an image hash. For example, Tang et al. [29] employed Tucker decomposition (TD) to derive a hash from a third-order tensor in the spatial domain; this algorithm achieves good discrimination. In another work, Tang et al. [30] employed multidimensional scaling (MDS) to calculate a hash, but its classification performance needs enhancement. To improve classification, Qin et al. [31] combined singular value decomposition (SVD), the Canny operator, and CVA to produce a hash. This algorithm demonstrates good classification, but its calculation speed is slow. For faster computation, Liang et al. [32] built a hashing algorithm from a feature map (FM) and 2D PCA, achieving high computational efficiency. Tang et al. [33] employed quaternion SVD (QSVD) to compute a hash from the Euclidean distances between singular values. The classification performance of these two algorithms [32, 33] needs further improvement. Recently, Liang et al. [34] used a saliency map (SM) and isometric mapping (Isomap) to form a hash with competitive classification and copy detection performance. In another work, Huang et al. [35] used locality-preserving projection (LPP) and Gabor filtering (GF) to form a hash with good classification performance and high security.
Apart from the three categories above, other techniques can also be employed for image hashing. For example, to enhance rotation robustness, Li et al. [36] introduced an algorithm based on random GF and lattice vector quantization (LVQ), which effectively resists rotation. Huang et al. [37] produced a hash via random walk (RW) on Zigzag blocks of the image; this scheme provides remarkable security. Zhao et al. [38] combined quaternion features of cool and warm tones with 3D spatial angle features to produce a hash. Table 1 summarizes the core techniques of some hashing algorithms. The above review demonstrates that substantial progress has been made. Nevertheless, most algorithms fail to achieve the desired classification performance, so their copy detection performance is not yet satisfactory. To address this problem, we introduce a new image hashing algorithm based on CP decomposition and DCT for image copy detection.
Table 1. Core Techniques of Some Hashing Algorithms

Algorithm   Core techniques   Year
[2]         QZMs + SIFT       2016
[18]        QPCET             2018
[27]        QFT               2013
[28]        QFMT              2016
[29]        TD                2018
[31]        SVD + CVA         2018
[32]        FM + 2D PCA       2022
[34]        SM + Isomap       2023
[35]        LPP + GF          2023

3 Our Algorithm

Our algorithm includes four steps; its block diagram is displayed in Figure 2. First, the input image is resized and filtered to obtain a pre-processed image. Second, local features are extracted from the pre-processed image via the DCT. Third, a third-order tensor is constructed from the local features in the DCT domain. Finally, CP decomposition is applied to the third-order tensor, and the resulting factor matrices are used to construct a short binary hash. The following sections elaborate on these four steps.
Fig. 2. Block diagram of our algorithm.

3.1 Pre-processing

The first step is pre-processing, which consists of converting the input image to \(F\times F\) pixels using bicubic interpolation and then filtering the resized image with a Gaussian low-pass filter using a 3×3 convolution mask. The filtering template is defined as follows.
\begin{equation} T_g=\frac{T^{(1)}(i,j)}{\sum _i\sum _jT^{(1)}(i,j)}, \end{equation}
(1)
where \(T^{(1)}(i,j)\) is defined as below.
\begin{equation} T^{(1)}(i,j)=e^{\frac{-(i^2+j^2)}{2\sigma ^2}}, \end{equation}
(2)
where \(\sigma\) is the standard deviation.
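As a concrete illustration, the pre-processing step can be sketched in Python as follows. The article does not report its \(\sigma\) value or border handling, so \(\sigma=0.5\), the border mode, and the helper names below are illustrative assumptions.

```python
import numpy as np
from PIL import Image
from scipy.ndimage import convolve

def gaussian_kernel(size=3, sigma=0.5):
    """Normalized 3x3 Gaussian template of Equations (1) and (2)."""
    r = size // 2
    i, j = np.mgrid[-r:r + 1, -r:r + 1]                  # offsets from the center
    t1 = np.exp(-(i ** 2 + j ** 2) / (2 * sigma ** 2))   # Equation (2)
    return t1 / t1.sum()                                 # Equation (1): normalize

def preprocess(path, F=512):
    """Resize to F x F with bicubic interpolation, then Gaussian low-pass filter."""
    img = Image.open(path).convert('RGB').resize((F, F), Image.BICUBIC)
    arr = np.asarray(img, dtype=np.float64)
    k = gaussian_kernel()
    # Filter each of the R, G, B channels with the same 3x3 mask.
    return np.stack([convolve(arr[..., c], k, mode='nearest')
                     for c in range(3)], axis=-1)
```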

3.2 Local Feature Extraction via DCT

We select the low-frequency DCT coefficients of the pre-processed image in the RGB color space as local features. Since the low-frequency DCT coefficients contain most of the image energy, they reflect the basic visual content of the image and are little disturbed by noise [1, 39]. Specifically, the red, green, and blue components of the pre-processed image are first extracted. Then, each color component is partitioned into non-overlapping blocks of size \(S\times S\). Note that S is selected so that F is divisible by S. Therefore, there are \(N={(F/S)}^2\) blocks per color component. The DCT feature extraction for the red component is explained below.
Let \(\mathbf {R}_i\) denote the ith block of the red component, with blocks labeled from top to bottom and left to right. The two-dimensional DCT is then applied to the block \(\mathbf {R}_i\) and is computed as follows.
\begin{equation} \begin{aligned}D_i(u,v)=a(u)a(v)\sum _{l=0}^{S-1}\sum _{n=0}^{S-1}R_i(l,n)\cos \left[\frac{(2l+1)u\pi }{2S}\right] \cos \left[\frac{(2n+1)v\pi }{2S}\right] \end{aligned}, \end{equation}
(3)
where \(R_i(l,n)\) represents the pixel value in the \((l+1)\)-th row and the \((n+1)\)-th column of \(\mathbf {R}_i\), \(D_i(u,v)\) represents the DCT coefficients in the \((u+1)\)-th row and the \((v+1)\)-th column (\(u,v=0,1, \ldots ,S-1\)), and \(a(u)\) is defined as follows:
\[\begin{eqnarray} a(u)= \left\lbrace \begin{array}{ll} \sqrt {1/S}, & \text{if} \quad u = 0 \\ \sqrt {2/S}, & \text{otherwise} \end{array} \right. . \end{eqnarray}\]
(4)
The DCT coefficients are scanned in zigzag order to obtain a one-dimensional sequence. As the high-frequency DCT coefficients contain only a small amount of image information and are susceptible to noise, the first K elements of the zigzag-scanned DCT sequence of block \(\mathbf {R}_i\) are selected as the feature vector \(\mathbf {x}_i=[x_{1,i},x_{2,i}, \ldots ,x_{K,i}]^{\rm {T}}\). The vectors of all image blocks are arranged into a \(K\times N\) feature matrix \(\mathbf {X}_R\) as follows:
\[\begin{eqnarray} \mathbf {X}_R=[\mathbf {x}_1,\mathbf {x}_2, \ldots ,\mathbf {x}_N] = \begin{bmatrix} x_{1,1}& x_{1,2} & \cdots & x_{1,N} \\ x_{2,1} & x_{2,2}& \cdots & x_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ x_{K,1} & x_{K,2} & \cdots & x_{K,N} \\ \end{bmatrix}. \end{eqnarray}\]
(5)
Similarly, the green and blue components are processed in the same way and thus two local feature matrices \(\mathbf {X}_G\) and \(\mathbf {X}_B\) are constructed.
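The block-wise feature extraction of Equations (3)-(5) can be sketched as below; SciPy's orthonormal `dctn` matches the normalization \(a(u)\) of Equation (4). The zigzag helper and block labeling follow the description above, but the function names are assumptions.

```python
import numpy as np
from scipy.fft import dctn

def zigzag_indices(S):
    """Zigzag scan order over an S x S coefficient matrix (low frequencies first)."""
    return sorted(((i, j) for i in range(S) for j in range(S)),
                  key=lambda p: (p[0] + p[1],                    # anti-diagonal index
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def block_dct_features(channel, S=32, K=32):
    """K x N feature matrix of Equation (5) for one F x F color channel."""
    F = channel.shape[0]
    zz = zigzag_indices(S)[:K]                 # first K low-frequency positions
    cols = []
    for bj in range(0, F, S):                  # blocks labeled top to bottom,
        for bi in range(0, F, S):              # then left to right
            D = dctn(channel[bi:bi + S, bj:bj + S], norm='ortho')  # Equation (3)
            cols.append([D[u, v] for u, v in zz])
    return np.array(cols).T                    # shape (K, N), N = (F/S)**2
```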

3.3 Tensor Construction

After local feature extraction, three feature matrices in the DCT domain are available, i.e., \(\mathbf {X}_R\), \(\mathbf {X}_G\), and \(\mathbf {X}_B\). To create a robust third-order tensor, the three feature matrices are partitioned into non-overlapping blocks, which are then stacked to form a third-order tensor. Suppose that the block size is \(D\times D\). For simplicity, D is selected so that both K and N are divisible by D. Thus, there are \(M=3\times (K/D)\times (N/D)\) blocks in total. The blocks of each feature matrix are stacked from left to right and top to bottom, and the feature matrices are processed in the order \(\mathbf {X}_R\), \(\mathbf {X}_G\), \(\mathbf {X}_B\). Finally, a third-order tensor of size \(D\times D\times M\) is obtained. The tensor construction process is visualized in Figure 3. Note that the third-order tensor is built from the low-frequency DCT coefficients; as these coefficients are little disturbed by noise, computing the hash from this tensor ensures the robustness of our algorithm.
Fig. 3. Block diagram of tensor construction.
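Under the block order stated above (left to right, then top to bottom, with \(\mathbf {X}_R\) first), the tensor construction can be sketched as follows; the function name is an assumption.

```python
import numpy as np

def build_tensor(XR, XG, XB, D=32):
    """Stack D x D sub-blocks of the three K x N feature matrices into a
    third-order tensor of size D x D x M, with M = 3 * (K/D) * (N/D)."""
    slices = []
    for X in (XR, XG, XB):            # stacking order: X_R, X_G, X_B
        K, N = X.shape
        for i in range(0, K, D):      # top to bottom
            for j in range(0, N, D):  # left to right within each row of blocks
                slices.append(X[i:i + D, j:j + D])
    return np.stack(slices, axis=-1)  # shape (D, D, M)
```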

3.4 CP Decomposition

CP (CANDECOMP/PARAFAC) decomposition [40, 41] is an efficient dimensionality reduction technique that learns low-dimensional features from an input tensor. It has been employed in image fusion [42], feature extraction [43], data analysis [44], and so on. In general, CP decomposition factorizes a third-order tensor into three factor matrices. The detailed calculation is as follows.
For a third-order tensor \(\mathbf {X}\in \mathbb {R}^{D\times D\times M}\), suppose that its three factor matrices are denoted by \(\mathbf {A}\), \(\mathbf {B}\), and \(\mathbf {C}\), respectively, where \(\mathbf {A} \in \mathbb {R}^{D \times P}\), \(\mathbf {B} \in \mathbb {R}^{D \times P}\), and \(\mathbf {C} \in \mathbb {R}^{M \times P}\). Thus, CP decomposition is represented as below.
\[\begin{eqnarray} \mathbf {X} \approx [\![ \mathbf {A}, \mathbf {B}, \mathbf {C} ]\!] \equiv \sum _{p=1}^{P} \mathbf {a}_{p} \circ \mathbf {b}_{p} \circ \mathbf {c}_{p}, \end{eqnarray}\]
(6)
in which \(\circ\) denotes the vector outer product, P is a positive integer representing the decomposition rank, \(\equiv\) denotes equivalence, \([\![ \quad ]\!]\) is shorthand notation for the CP decomposition, and \(\mathbf {a}_p\), \(\mathbf {b}_p\), and \(\mathbf {c}_p\) are the pth columns of \(\mathbf {A}\), \(\mathbf {B}\), and \(\mathbf {C}\), respectively. Element-wise, the CP decomposition can be written as follows.
\[\begin{eqnarray} x_{ijk}\approx \sum _{p=1}^Pa_{ip}b_{jp}c_{kp}, \end{eqnarray}\]
(7)
where \(a_{ip}\), \(b_{jp}\), and \(c_{kp}\) are the elements in the \(\mathbf {A}\), \(\mathbf {B}\), and \(\mathbf {C}\), respectively.
Note that the factor matrices obtained from CP decomposition have the same number of column vectors. Specifically, the factor matrix \(\mathbf {A}\) is composed of \(\mathbf {a}_p=[a_{1,p},a_{2,p}, \ldots ,a_{D,p}]^{\rm {T}} \quad (p=1, 2, \ldots , P)\), the factor matrix \(\mathbf {B}\) is composed of \(\mathbf {b}_p=[b_{1,p},b_{2,p}, \ldots ,b_{D,p}]^{\rm {T}} \quad (p=1, 2, \ldots , P)\) and the factor matrix \(\mathbf {C}\) is composed of \(\mathbf {c}_p=[c_{1,p},c_{2,p}, \ldots ,c_{M,p}]^{\rm {T}} \quad (p=1, 2, \ldots , P)\). The formal definitions are as follows:
\[\begin{eqnarray} \mathbf {A}=[\mathbf {a}_1,\mathbf {a}_2, \ldots ,\mathbf {a}_P], \end{eqnarray}\]
(8)
\[\begin{eqnarray} \mathbf {B}=[\mathbf {b}_1,\mathbf {b}_2, \ldots ,\mathbf {b}_P], \end{eqnarray}\]
(9)
\[\begin{eqnarray} \mathbf {C}=[\mathbf {c}_1,\mathbf {c}_2, \ldots ,\mathbf {c}_P]. \end{eqnarray}\]
(10)
If the columns of \(\mathbf {A}\), \(\mathbf {B}\), and \(\mathbf {C}\) are normalized to unit length and their scales are absorbed into a weight vector \(\lambda \in \mathbb {R}^P\), the CP decomposition is written as follows.
\[\begin{eqnarray} \mathbf {X}\approx [\![ \lambda ;\mathbf {A},\mathbf {B},\mathbf {C}]\!] \equiv \sum _{p=1}^P\lambda _p \mathbf {a}_p\circ \mathbf {b}_p\circ \mathbf {c}_p. \end{eqnarray}\]
(11)
In practice, the alternating least squares (ALS) method [45, 46, 47] is used to compute the CP decomposition by solving the following optimization problem.
\[\begin{eqnarray} \min _{\lambda _p, \mathbf {a}_p, \mathbf {b}_p, \mathbf {c}_p}\left\Vert \mathbf {X}-\sum _{p=1}^P\lambda _p \mathbf {a}_p\circ \mathbf {b}_p\circ \mathbf {c}_p\right\Vert . \end{eqnarray}\]
(12)
The ALS method finds an optimal solution by iterative calculation; more details can be found in [48]. Figure 4 illustrates the schematic diagram of CP decomposition.
Fig. 4. Schematic diagram of CP decomposition.
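Since the experiments later fix the rank to \(P=1\) (Section 4), the ALS iteration for Equation (12) reduces to the higher-order power method sketched below in NumPy; for general ranks, an off-the-shelf routine such as `parafac` from the tensorly library could be used instead. The iteration count, tolerance, and random initialization are illustrative assumptions.

```python
import numpy as np

def cp_rank1_als(X, iters=50, tol=1e-8):
    """Rank-1 CP decomposition by ALS: X ~= lam * a o b o c (Equation (11), P = 1)."""
    D1, D2, M = X.shape
    rng = np.random.default_rng(0)
    b = rng.standard_normal(D2); b /= np.linalg.norm(b)
    c = rng.standard_normal(M);  c /= np.linalg.norm(c)
    lam = 0.0
    for _ in range(iters):
        # Each step solves the least-squares problem for one factor,
        # holding the other two fixed.
        a = np.einsum('ijk,j,k->i', X, b, c); a /= np.linalg.norm(a)
        b = np.einsum('ijk,i,k->j', X, a, c); b /= np.linalg.norm(b)
        c = np.einsum('ijk,i,j->k', X, a, b)
        lam_new = np.linalg.norm(c); c /= lam_new
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return lam, a, b, c  # columns of A, B, C with weight lambda
```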
As the factor matrices are low-dimensional representations that preserve the topology of the original tensor, the image hash derived from them can achieve good discrimination. Hash generation from the factor matrices proceeds as follows. To convert each factor matrix into a vector, the mean of each row of the matrix is calculated. The feature vectors \(\mathbf {y}^{\rm {(A)}}\), \(\mathbf {y}^{\rm {(B)}}\), and \(\mathbf {y}^{\rm {(C)}}\) of \(\mathbf {A}\), \(\mathbf {B}\), and \(\mathbf {C}\) are thus obtained as follows:
\[\begin{eqnarray} \mathbf {y}^{\rm {(A)}}= \left[y_1^{\rm {(A)}},y_2^{\rm {(A)}}, \ldots ,y_D^{\rm {(A)}}\right]^{\rm {T}}, \end{eqnarray}\]
(13)
\[\begin{eqnarray} \mathbf {y}^{\rm {(B)}}=\left[y_1^{\rm {(B)}},y_2^{\rm {(B)}}, \ldots ,y_D^{\rm {(B)}}\right]^{\rm {T}}, \end{eqnarray}\]
(14)
\[\begin{eqnarray} \mathbf {y}^{\rm {(C)}}=\left[y_1^{\rm {(C)}},y_2^{\rm {(C)}}, \ldots ,y_M^{\rm {(C)}}\right]^{\rm {T}}, \end{eqnarray}\]
(15)
where \(y_i^{\rm {(A)}}\), \(y_i^{\rm {(B)}}\), and \(y_i^{\rm {(C)}}\) represent the means of the ith rows of \(\mathbf {A}\), \(\mathbf {B}\), and \(\mathbf {C}\), respectively.
To reduce the storage cost of the hash, the mean of each feature vector is computed, and \(\mathbf {y}^{\rm {(A)}}\), \(\mathbf {y}^{\rm {(B)}}\), and \(\mathbf {y}^{\rm {(C)}}\) are converted to binary sequences by comparing their elements with these means. The detailed calculations are defined as follows:
\[\begin{eqnarray} h_i^{\rm {(A)}}= \left\lbrace \begin{array}{ll} 1, & \text{if} \quad y_i^{\rm {(A)}}\gt t^{\rm {(A)}} \\ 0, & \text{otherwise} \end{array} \right. , \end{eqnarray}\]
(16)
\[\begin{eqnarray} h_i^{\rm {(B)}}= \left\lbrace \begin{array}{ll} 1, & \text{if} \quad y_i^{\rm {(B)}}\gt t^{\rm {(B)}} \\ 0, & \text{otherwise} \end{array} \right. , \end{eqnarray}\]
(17)
\[\begin{eqnarray} h_i^{\rm {(C)}}= \left\lbrace \begin{array}{ll} 1, & \text{if} \quad y_i^{\rm {(C)}}\gt t^{\rm {(C)}} \\ 0, & \text{otherwise} \end{array} \right. , \end{eqnarray}\]
(18)
in which \(t^{\rm {(A)}}\), \(t^{\rm {(B)}}\), and \(t^{\rm {(C)}}\) represent the means of \(\mathbf {y}^{\rm {(A)}}\), \(\mathbf {y}^{\rm {(B)}}\), and \(\mathbf {y}^{\rm {(C)}}\), respectively. Finally, our image hash \(\mathbf {h}\) is constructed by concatenating the binary sequences of \(\mathbf {y}^{\rm {(A)}}\), \(\mathbf {y}^{\rm {(B)}}\), and \(\mathbf {y}^{\rm {(C)}}\).
\[\begin{eqnarray} \mathbf {h}= \left[h_1^{\rm {(A)}},h_2^{\rm {(A)}}, \ldots ,h_D^{\rm {(A)}},h_1^{\rm {(B)}},h_2^{\rm {(B)}}, \ldots ,h_D^{\rm {(B)}},h_1^{\rm {(C)}},h_2^{\rm {(C)}}, \ldots ,h_M^{\rm {(C)}}\right]. \end{eqnarray}\]
(19)
Hence, the hash length of our algorithm is \(L=2D+M\) bits.
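The hash generation of Equations (13)-(19) then reduces to row means followed by mean-thresholding; a sketch is given below. For \(P=1\), each factor matrix has a single column, so the row means are simply its entries.

```python
import numpy as np

def binarize(y):
    """Equations (16)-(18): threshold a vector at its own mean."""
    return (y > y.mean()).astype(np.uint8)

def hash_from_factors(A, B, C):
    """Equations (13)-(19): row means of A, B, C, binarized and concatenated
    into an L = 2D + M bit hash."""
    yA, yB, yC = A.mean(axis=1), B.mean(axis=1), C.mean(axis=1)
    return np.concatenate([binarize(yA), binarize(yB), binarize(yC)])

# With the rank-1 sketch above:
# h = hash_from_factors(a[:, None], b[:, None], c[:, None])
```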

3.5 Hash Similarity Evaluation

Since our image hash is a compact sequence of binary bits, the Hamming distance is employed to analyze the similarity of two given hashes. Assume that \(\mathbf {h}_1\) and \(\mathbf {h}_2\) are two hash sequences. Then, their Hamming distance can be expressed as below.
\[\begin{eqnarray} d_H(\mathbf {h}_1,\mathbf {h}_2)=\sum _{f=1}^L\vert h_1(f)-h_2(f)\vert , \end{eqnarray}\]
(20)
where \(h_1(f)\) and \(h_2(f)\) represent the fth bit values of the two given hashes \(\mathbf {h}_1\) and \(\mathbf {h}_2\), respectively. In general, a smaller \(d_H\) implies more similar hash sequences.
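A sketch of Equation (20) and the resulting decision rule (Section 4.1 later uses the threshold T = 7):

```python
import numpy as np

def hamming(h1, h2):
    """Equation (20): number of differing bits between two binary hashes."""
    return int(np.sum(h1 != h2))

def is_similar(h1, h2, T=7):
    """Judge two images as similar (e.g., copies) when d_H <= T."""
    return hamming(h1, h2) <= T
```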

3.6 Pseudo-code Description

Our algorithm contains four steps: pre-processing, local feature extraction via DCT, tensor construction, and CP decomposition. To improve readability, the pseudo-code of our algorithm is described in Algorithm 1.

4 Experimental Results

The experimental parameters are as follows. Input images are resized to \(512\times 512\), and the \(\mathbf {R}\), \(\mathbf {G}\), and \(\mathbf {B}\) components are divided into \(32\times 32\) non-overlapping blocks, giving 256 blocks per component. For each block, the first 32 low-frequency DCT coefficients are extracted to construct the feature matrix, yielding three feature matrices of size \(32\times 256\). To construct a third-order tensor, the three feature matrices are divided into \(32\times 32\) non-overlapping blocks, giving 24 blocks in total, and a third-order tensor of size \(32\times 32\times 24\) is constructed by stacking these blocks. The rank of the CP decomposition is 1. Briefly, our parameters are set as follows: \(F=512\), \(S=32\), \(N=(F/S)^2=256\), \(K=32\), \(D=32\), and \(P=1\). Therefore, \(M=3\times (32/32)\times (256/32)= 24\) and our hash length is \(L=2D+M=88\) bits. Our algorithm is implemented in MATLAB R2018b, and all experiments run on a computer with an Intel i7-8700 CPU (3.20 GHz) and 8 GB of memory.
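As a quick sanity check, the parameter arithmetic of this section can be reproduced directly:

```python
F, S, K, D, P = 512, 32, 32, 32, 1
N = (F // S) ** 2             # 256 blocks per color component
M = 3 * (K // D) * (N // D)   # 3 * 1 * 8 = 24 tensor slices
L = 2 * D + M                 # 88-bit hash
print(N, M, L)                # 256 24 88
```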

4.1 Robustness Analysis

To analyze the robustness of our algorithm, we use the open Berkeley dataset [49] to construct a new dataset with 25,800 pairs of similar color images. The Berkeley dataset contains 300 color images of size 481×321 or 321×481; typical images are shown in Figure 5. To create visually similar images, StirMark [50], MATLAB, and Photoshop are utilized. In the experiments, 12 operations are used to test robustness: nine single attacks and three combinational attacks.
Fig. 5. Typical images in the Berkeley dataset.
The parameter settings of the nine single operations are as follows:
Gamma Correction (GC): Four \(\gamma\) values of 1.25, 1.1, 0.9, and 0.75;
BA: Four magnitudes of ±20 and ±10;
Speckle Noise (SN): Nine variances ranging from 0.01 to 0.09 with a step size of 0.01;
Salt and Pepper Noise (SPN): Nine densities ranging from 0.01 to 0.09 with a step size of 0.01;
Contrast Adjustment (CA): Four magnitudes of ±20 and ±10;
Gaussian Low-pass filtering (GLF): Nine standard deviations ranging from 0.9 to 2.5 with a step size of 0.2;
Image Scaling (IS): Five ratio values of 2.0, 1.5, 1.1, 0.9, and 0.75;
JC: Eight quality factors ranging from 30 to 100 with a step size of 10;
Watermark Embedding (WE): Eight strengths ranging from 30 to 100 with a step size of 10.
The parameter settings of the three combinational operations are detailed below. Combinational operation 1 (CO1) consists of rotation (10 rotation angles: ±1\(^{\circ }\), ±2\(^{\circ }\), ±3\(^{\circ }\), ±4\(^{\circ }\), and ±5\(^{\circ }\)), cropping, and rescaling, and is provided by the well-known StirMark tool [50]. It first performs rotation; since rotation introduces padded regions around the four corners of the image, CO1 crops these regions away and then resizes the cropped image to the size of the original. More details of CO1 can be found in [50]. Combinational operation 2 (CO2) is SPN+JC, where the density of SPN is 0.05 and the quality factor of JC ranges from 30 to 100 with a step size of 10. Combinational operation 3 (CO3) is BA+JC, where the magnitude of BA is 20 and the quality factor of JC ranges from 30 to 100 with a step size of 10. CO2 and CO3 both use JC because it is a common operation in practice. In summary, there are 86 different operations for each original image, and thus \(300 \times 86=25,800\) pairs of similar images for robustness analysis.
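For illustration, two of the single attacks (JC and BA) can be approximated in Python with Pillow as sketched below. The article generates its attacked images with StirMark, MATLAB, and Photoshop, so exact pixel results will differ; the helper names are assumptions.

```python
import io
import numpy as np
from PIL import Image

def jpeg_compress(img, quality):
    """JC attack: re-encode the image at the given JPEG quality factor."""
    buf = io.BytesIO()
    img.save(buf, format='JPEG', quality=quality)
    buf.seek(0)
    return Image.open(buf).convert('RGB')

def adjust_brightness(img, magnitude):
    """BA attack: shift intensities by +/- magnitude, clipped to [0, 255]."""
    arr = np.asarray(img, dtype=np.int16) + magnitude
    return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
```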
To assess the quality of these similar images, two well-known image quality assessment (IQA) metrics, SSIM [51] and PSNR [52], are used. Note that SSIM and PSNR are full-reference IQA metrics, which require the reference image to have the same size as its original image. As the images attacked by IS change size, they are resized to the sizes of their original images by bicubic interpolation before quality evaluation. Table 2 presents the statistical results of SSIM and PSNR on the Berkeley dataset. Except for CO1, all SSIM means are greater than 0.7 and all PSNR means are greater than 22 dB, and the standard deviations are small. This illustrates that the attacked images are visually close to their originals. The CO1 results are lower than those of the other operations because CO1 involves rotation and cropping, geometric changes that SSIM and PSNR are not designed to handle.
Table 2. Statistical Results of SSIM and PSNR (dB) Based on the Berkeley Dataset

Operation   Mean (SSIM)   Std. Dev. (SSIM)   Mean (PSNR)   Std. Dev. (PSNR)
BA          0.9696        0.0108             26.9132       3.7959
CA          0.9805        0.0066             33.6820       1.4228
GC          0.9452        0.0309             25.6641       3.6833
JC          0.9562        0.0313             25.6279       0.4663
SN          0.7920        0.1285             24.8308       3.2624
SPN         0.7004        0.1466             23.0555       2.9075
GLF         0.9217        0.0475             29.6256       3.3004
WE          0.9542        0.0400             22.9630       2.1918
IS          0.9346        0.0515             24.7304       1.2437
CO1         0.5955        0.1630             16.2169       2.7083
CO2         0.9198        0.0398             23.1965       0.6217
CO3         0.8951        0.0422             28.7454       1.1595
Hamming distances between the 25,800 pairs of similar images are calculated. Figure 6 displays the mean Hamming distances under different operations, where the abscissa indicates the parameter value of each operation and the ordinate is the mean Hamming distance. In Figure 6, the mean distances for all operations are smaller than 1.0, except for GC, CO1, and CO2. For GC, only three means are slightly larger than 1.0. For CO1, all means are larger than 2 but smaller than 7. For CO2, the eight means are around 1.0, and only two are slightly larger than 1.0. These small means illustrate that our algorithm generates similar hash sequences for similar image pairs. Table 3 lists the statistical results of Hamming distances under different operations on the Berkeley dataset. If the Hamming distance threshold is set to 7, our algorithm correctly detects all similar images when CO1 is excluded; even with CO1 included, it achieves a correct detection rate of 97.26%. This high detection rate verifies the robustness of our algorithm.
Table 3. Statistical Results of Hamming Distances Based on the Berkeley Dataset

Operation   Max   Min   Mean     Std. Dev.
BA          7     0     0.7075   1.0055
CA          4     0     0.4966   0.7486
GC          7     0     0.8133   1.0737
JC          3     0     0.3854   0.6439
SN          7     0     0.6289   0.9414
SPN         4     0     0.3318   0.5903
GLF         3     0     0.0996   0.3233
WE          5     0     0.4258   0.7631
IS          3     0     0.3833   0.6137
CO1         28    0     4.3913   3.2689
CO2         5     0     0.9895   1.0436
CO3         3     0     0.2025   0.4590
Fig. 6. Robustness performance on the Berkeley dataset.

4.2 Discrimination Test

To assess the discrimination of our algorithm, the VOC2012 dataset [53] is employed. This dataset includes 17,125 different color images; some typical images are shown in Figure 7. In the experiment, the Hamming distance between the hash codes of every pair of images is computed, giving \(C_{17125}^2=17125\times (17125-1)/2=146,624,250\) distances in total. Figure 8 displays the distribution of these Hamming distances, where the x-axis is the Hamming distance and the y-axis denotes its frequency. The smallest Hamming distance is 0 and the largest is 57. In addition, the mean Hamming distance of different images is 27.2452, which is much larger than the largest mean Hamming distance of similar images (4.3913). Consequently, our algorithm is clearly discriminative.
Fig. 7. Typical images of the VOC2012 dataset.
Fig. 8. Distribution of 146,624,250 Hamming distances.
For quantitative results, Table 4 lists the correct recognition rate and the false detection rate under different thresholds. In Table 4, discrimination is expressed as the false detection rate for different images, and robustness is represented by the correct recognition rate for similar images. As the threshold increases, robustness increases and discrimination decreases. Based on Table 4, a proper threshold can be set for a practical application according to its performance requirements.
Table 4. Detection Performances Under Different Thresholds

Threshold   Correct recognition rate of similar images   False detection rate of different images
3           89.58%                                       0.005%
4           93.10%                                       0.016%
5           95.02%                                       0.038%
6           96.36%                                       0.080%
7           97.26%                                       0.160%

4.3 Selection of Block Size

The popular Receiver Operating Characteristic (ROC) graph [54] is adopted to evaluate classification performance. In the ROC graph, the horizontal axis presents the False Positive Rate (FPR) and the vertical axis presents the True Positive Rate (TPR), calculated as follows.
\[\begin{eqnarray} P_{\rm {{FPR}}}(d_H\le T)=\frac{N_f}{N_d}, \end{eqnarray}\]
(21)
\[\begin{eqnarray} P_{\rm {{TPR}}}(d_H\le T)=\frac{N_t}{N_s}, \end{eqnarray}\]
(22)
where \(N_d\) is the number of different images, \(N_f\) is the number of different images incorrectly judged as similar, \(N_s\) is the number of similar images, and \(N_t\) is the number of similar images successfully identified. Note that \(P_{\rm {{FPR}}}\) and \(P_{\rm {{TPR}}}\) correspond to discrimination and robustness, respectively. A group of points \((P_{\rm {{FPR}}}, P_{\rm {{TPR}}})\) is calculated over a range of thresholds, and these points are used to plot an ROC curve. By the meanings of \(P_{\rm {{FPR}}}\) and \(P_{\rm {{TPR}}}\), an ROC curve near the top-left corner indicates better classification than one far from it.
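In code, the ROC points of Equations (21) and (22) can be obtained by sweeping a threshold over the Hamming distances of the similar pairs (robustness set) and the different pairs (discrimination set); a minimal sketch, with array names as assumptions:

```python
import numpy as np

def roc_points(d_sim, d_diff, L=88):
    """d_sim: Hamming distances of similar pairs; d_diff: of different pairs."""
    pts = []
    for T in range(L + 1):
        tpr = float(np.mean(d_sim <= T))    # Equation (22): N_t / N_s
        fpr = float(np.mean(d_diff <= T))   # Equation (21): N_f / N_d
        pts.append((fpr, tpr))
    return pts  # plot these, or integrate them to obtain the AUC
```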
In this section, the datasets are identical to those used in Sections 4.1 and 4.2. We discuss the selection of the block size \(S \times S\). In the experiments, only S varies and all other parameters are kept the same; the candidate values are \(S=16\), \(S=32\), and \(S=64\). Figure 9 shows the ROC curves of the different S values. The curve of \(S=16\) is slightly nearer the top-left corner than those of \(S=32\) and \(S=64\), indicating slightly better classification. The area under the ROC curve (AUC), which ranges from 0 to 1, is also calculated; a greater AUC indicates better classification. Table 5 compares the AUC values, hash lengths, and running times of the different S values. The AUCs of \(S=16\), \(S=32\), and \(S=64\) are 0.99976, 0.99967, and 0.99958, respectively. Although \(S=16\) has the largest AUC, the differences are small: 0.00009 between \(S=16\) and \(S=32\), and 0.00018 between \(S=16\) and \(S=64\). The hash lengths of \(S=16\), \(S=32\), and \(S=64\) are 160, 88, and 70 bits, and the running times are 0.226, 0.096, and 0.074 seconds, respectively. Considering AUC, hash length, and running time together, \(S=32\) is the preferable choice for our algorithm.
Table 5. Performances of Different S Values

S    AUC       Hash (bit)   Time (s)
16   0.99976   160          0.226
32   0.99967   88           0.096
64   0.99958   70           0.074
Fig. 9. ROC curves of different S values.

4.4 Selection of K Value

In our algorithm, the K value determines the number of low-frequency DCT coefficients used for local feature extraction. To examine its effect, we select K from the set \(\left\lbrace 32, 64, 128, 256 \right\rbrace\) and keep the other parameters constant. Figure 10 displays the ROC curves of the different K values. The comparison shows that \(K=32\) is clearly superior to \(K=64\), \(K=128\), and \(K=256\), as its curve is much nearer the top-left corner, indicating better classification. The AUCs of \(K=32\), \(K=64\), \(K=128\), and \(K=256\) are 0.99967, 0.99757, 0.99640, and 0.99629, respectively (Table 6); \(K=32\) has the largest value. As to hash length and running time, the hash lengths of \(K=32\), \(K=64\), \(K=128\), and \(K=256\) are 88, 112, 160, and 256 bits, and the running times are 0.096, 0.103, 0.107, and 0.113 seconds, respectively. The hash length of \(K=32\) is the shortest, while the running time varies only slightly with K. Overall, our algorithm performs best when \(K=32\).
Table 6. Performances of Different K Values

K     AUC       Hash (bit)   Time (s)
32    0.99967   88           0.096
64    0.99757   112          0.103
128   0.99640   160          0.107
256   0.99629   256          0.113
Fig. 10. ROC curves of different K values.

4.5 Selection of P Value

In our algorithm, the P value is the rank of the CP decomposition. This section discusses its effect on hash performance. Since the rank is not taken to be greater than the order of the tensor, the rank selection includes \(P=1\), \(P=2\), and \(P=3\). Again, we only change the P value and keep the other parameters constant. Figure 11 illustrates the ROC curves of the different P values. The curve of \(P=1\) is nearest the top-left corner, implying better classification than the other P values. The AUC values of \(P=1\), \(P=2\), and \(P=3\) are 0.99967, 0.99794, and 0.89141, and the running times are 0.096, 0.121, and 0.114 seconds, respectively. Table 7 presents the performance comparison of the different P values. Overall, our algorithm performs best when \(P=1\).
Table 7. Performance of Different P Values

P   AUC       Time (s)
1   0.99967   0.096
2   0.99794   0.121
3   0.89141   0.114
Fig. 11. ROC curves of different P values.

4.6 Selection of Color Space

To validate our choice of color space, several common color spaces are compared: HSV, RGB, YCbCr, CIE L*a*b*, and HSI. In the experiment, three components of each color space are selected to construct the tensor, and the other parameters are kept constant. Figure 12 displays the ROC curves of these color spaces, with their local parts enlarged for detail. The curve nearest the top-left corner is given by the RGB space, which also achieves the largest AUC. The running times of the CIE L*a*b*, RGB, YCbCr, HSV, and HSI spaces are 0.151, 0.096, 0.128, 0.116, and 0.129 seconds, respectively, so the time varies only slightly. Table 8 summarizes the AUCs and running times of the different color spaces. Overall, our algorithm performs best with the RGB color space.
Table 8. Performances of Different Color Spaces

Color space   AUC       Time (s)
RGB           0.99967   0.096
CIE L*a*b*    0.99592   0.151
YCbCr         0.99758   0.128
HSV           0.98968   0.116
HSI           0.94894   0.129
Fig. 12. ROC curves of different color spaces.

5 Performance Comparisons

To demonstrate the superiority of our algorithm, we compare it with several advanced algorithms: the GF-LVQ algorithm [36], TD algorithm [29], MDS algorithm [30], RW algorithm [37], and QSVD algorithm [33]. These algorithms were reported in prestigious journals or conferences and are widely used as baselines. In addition, the QSVD, TD, and MDS algorithms also use dimensionality reduction techniques (SVD, TD, and MDS, respectively). For a fair comparison, all images are converted to 512×512 before being input to the compared algorithms, and the similarity metrics and parameter settings of the compared algorithms follow their source articles. The main parameters of our algorithm are \(S=32\), \(K=32\), and \(P=1\).

5.1 Classification Performance

The image datasets used in Sections 4.1 and 4.2 are adopted to test classification performance: 25,800 pairs of similar images for robustness and 17,125 images for discrimination. The ROC graph is again used for visual comparison. Figure 13 presents the ROC curves of all algorithms in the same graph for easy comparison. The curve of our algorithm is closest to the upper-left corner, which means our algorithm classifies better than the compared algorithms. To further illustrate this, the AUC values of the different algorithms are listed in Table 9. The AUC of our algorithm is 0.99967, while those of the GF-LVQ, TD, MDS, RW, and QSVD algorithms are 0.97027, 0.99807, 0.97587, 0.96161, and 0.99825, respectively. Clearly, the AUC of our algorithm is the largest. For more quantitative results, a TPR comparison at FPR \(\approx 0.01\) is presented in Table 10; our TPR is again the largest. Our algorithm achieves this classification advantage for the following reasons. The third-order tensor constructed from the low-frequency DCT coefficients guarantees robustness, because these coefficients contain most of the image energy and are little disturbed by noise. Moreover, since the factor matrices of the CP decomposition preserve the topology of the original tensor, the binary hash derived from them provides good discrimination.
Table 9. Performance Comparison

Algorithm   AUC       Length (bit)   Time (s)
GF-LVQ      0.97027   120            0.241
TD          0.99807   96             0.147
MDS         0.97587   900            0.132
RW          0.96161   144            0.039
QSVD        0.99825   640            0.352
Our         0.99967   88             0.096
Table 10. TPR Comparison

Algorithm                      GF-LVQ    TD        MDS       RW        QSVD      Our
TPR when FPR \(\approx\) 0.01  0.84164   0.96391   0.86993   0.75646   0.99102   0.99417
Fig. 13. ROC curves of different algorithms.

5.2 Performance of Time and Hash Storage

Time and hash storage are two further critical performance indicators. The computational times of the GF-LVQ, TD, MDS, RW, QSVD, and our algorithm are 0.241, 0.147, 0.132, 0.039, 0.352, and 0.096 seconds, respectively. Our algorithm is slower than the RW algorithm but faster than all other compared algorithms. This can be understood as follows. The 2D DCT and CP decomposition are the main computations of our algorithm; since the block size is small, the cost of the block-based DCT is low, and since the third-order tensor is small, the cost of the CP decomposition is also low. As to hash storage, our hash length is 88 bits, while the hash lengths of the GF-LVQ, TD, MDS, RW, and QSVD algorithms are 120, 96, 900, 144, and 640 bits, respectively; ours is the shortest. The time and hash storage results are also presented in Table 9. In summary, our algorithm exhibits clear advantages with regard to time and hash storage.

5.3 Copy Detection Performance

To further demonstrate our advantage, copy detection experiments are also conducted. To create a copy detection dataset, Wang's dataset [55] is used. Ten images randomly selected from its 1,000 color images are employed as queries. To simulate copy detection, 18 digital operations are performed on each query image, generating 18 copies per query. The copies are generated by the following operations: JC (compression factor: 30, 50, 80), logo insertion (LI) (weight: 0.2, size: \(66\times 70\)), TA (text: copyright 2023), mosaic (parameter: 5, 10), additive white Gaussian noise (AWGN) (variance: 0.01), CA (parameter: 20), GLF (standard deviation: 0.3), BA (parameter: 20), GC (\(\gamma\): 0.75), SPN (density: 0.02), SN (density: 0.02), CO1 (angle: 1\(^{\circ }\), 5\(^{\circ }\)), and IS (ratio: 0.5, 0.75). Thus, there are 180 image copies in total, and the copy detection dataset has 1,180 images.
The precision-recall (PR) graph is employed to check the detection performance of the different algorithms. Specifically, precision and recall values are computed under different thresholds, and the resulting (recall, precision) points are used to plot the PR curve. The quantitative metric is the PR Area Under the Curve (PRAUC), which ranges over [0, 1]; a larger PRAUC means higher accuracy in detecting image copies. Figure 14 presents the PRAUC comparison. The PRAUC of our algorithm is 0.99144, while those of the GF-LVQ, TD, MDS, RW, and QSVD algorithms are 0.60946, 0.64472, 0.74353, 0.82583, and 0.98455, respectively. The PRAUC of our algorithm is the largest, so our algorithm performs better than these baselines in copy detection. The reason is that our algorithm classifies better than the compared algorithms, which reduces classification errors during copy detection.
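A sketch of the PR computation for one query: sweep the Hamming-distance threshold and, at each threshold, treat database images within the threshold as detected copies; points pooled over all queries give the curve, and PRAUC follows by numerical integration (e.g., `np.trapz`). Variable names are assumptions.

```python
import numpy as np

def pr_points(dists, labels, L=88):
    """dists: Hamming distances from one query to each database image;
    labels: boolean array marking its true copies."""
    pts = []
    n_rel = labels.sum()                      # number of true copies
    for T in range(L + 1):
        retrieved = dists <= T
        if retrieved.sum() == 0:
            continue                          # precision undefined at this T
        hits = (retrieved & labels).sum()
        pts.append((hits / n_rel, hits / retrieved.sum()))  # (recall, precision)
    return pts
```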
Fig. 14. PRAUC comparison.

6 Conclusions

This article has presented a new hashing algorithm based on CP decomposition and the DCT for copy detection. A crucial contribution is the construction of a third-order tensor from low-frequency coefficients in the DCT domain. Since the low-frequency DCT coefficients contain most of the image energy, they reflect the basic visual content of the image and are little disturbed by noise; hence, the tensor construction ensures the robustness of our algorithm. Another key contribution is the application of CP decomposition to the third-order tensor to learn a short binary hash. As the factor matrices learned by CP decomposition preserve the topology of the original tensor, the binary hash derived from them achieves good discrimination. Extensive experiments have validated the effectiveness of our algorithm, and performance comparisons have shown that it outperforms several baseline algorithms in classification and copy detection while incurring low computational and storage costs.

Acknowledgments

Many thanks to the referees for their good suggestions.

References

[1]
Shiguang Liu and Ziqing Huang. 2019. Efficient image hashing with geometric invariant vector distance for copy detection. ACM Transactions on Multimedia Computing, Communications, and Applications 15, 4 (2019), 1–22.
[2]
Junlin Ouyang, Xingzi Wen, Jianxun Liu, and Jinjun Chen. 2016. Robust hashing based on quaternion Zernike moments for image authentication. ACM Transactions on Multimedia Computing, Communications, and Applications 12, 4s (2016), 1–13.
[3]
Haozhe Chen, Hang Zhou, Jie Zhang, Dongdong Chen, Weiming Zhang, Kejiang Chen, Gang Hua, and Nenghai Yu. 2023. Perceptual hashing of deep convolutional neural networks for model copy detection. ACM Transactions on Multimedia Computing, Communications, and Applications 19, 3, Article 123 (2023), 20 pages.
[4]
Zhaoda Ye and Yuxin Peng. 2019. Sequential cross-modal hashing learning via multi-scale correlation mining. ACM Transactions on Multimedia Computing, Communications, and Applications 15, 4 (2019), 1–20.
[5]
Xiaoping Liang, Zhenjun Tang, Zhixin Li, Mengzhu Yu, Hanyun Zhang, and Xiangquan Zhang. 2023. Robust hashing via global and local invariant features for image copy detection. ACM Transactions on Multimedia Computing, Communications and Applications 20, 1 (2023), 1–22.
[6]
Yan Zhao, Shuozhong Wang, Xinpeng Zhang, and Heng Yao. 2013. Robust hashing for image authentication using Zernike moments and local features. IEEE Transactions on Information Forensics and Security 8, 1 (2013), 55–63.
[7]
Zhenjun Tang, Ziqing Huang, Heng Yao, Xianquan Zhang, Lv Chen, and Chunqiang Yu. 2018. Perceptual image hashing with weighted DWT features for reduced-reference image quality assessment. The Computer Journal 61, 11 (2018), 1695–1709.
[8]
Xiaoping Liang, Zhenjun Tang, Ziqing Huang, Xianquan Zhang, and Shichao Zhang. 2023. Efficient hashing method using 2D-2D PCA for image copy detection. IEEE Transactions on Knowledge and Data Engineering 35, 4 (2023), 3765–3778.
[9]
Chuan Qin, Enli Liu, Guorui Feng, and Xinpeng Zhang. 2021. Perceptual image hashing for content authentication based on convolutional neural network with multiple constraints. IEEE Transactions on Circuits and Systems for Video Technology 31, 11 (2021), 4523–4537.
[10]
Xiaoping Liang, Zhenjun Tang, Xianquan Zhang, Mengzhu Yu, and Xinpeng Zhang. 2024. Robust hashing with local tangent space alignment for image copy detection. IEEE Transactions on Dependable and Secure Computing (2024), 1–13. https://ieeexplore.ieee.org/document/10226258
[11]
Qi Shen and Yan Zhao. 2020. Perceptual hashing for color image based on color opponent component and quadtree structure. Signal Processing 166 (2020), 107244. https://www.sciencedirect.com/science/article/pii/S0165168419302907
[12]
Ziqing Huang and Shiguang Liu. 2021. Perceptual hashing with visual content understanding for reduced-reference screen content image quality assessment. IEEE Transactions on Circuits and Systems for Video Technology 31, 7 (2021), 2808–2823.
[13]
Ziqing Huang and Shiguang Liu. 2021. Perceptual image hashing with texture and invariant vector distance for copy detection. IEEE Transactions on Multimedia 23 (2021), 1516–1529. https://ieeexplore.ieee.org/abstract/document/9107479
[14]
Xiaoping Liang, Zhenjun Tang, Xiaolan Xie, Jingli Wu, and Xianquan Zhang. 2021. Robust and fast image hashing with two-dimensional PCA. Multimedia Systems 27, 3 (2021), 389–401.
[15]
Yan Zhao and Xiaoran Yuan. 2020. Perceptual image hashing based on color structure and intensity gradient. IEEE Access 8 (2020), 26041–26053. https://ieeexplore.ieee.org/abstract/document/8977470
[16]
Chuan Qin, Yecen Hu, Heng Yao, Xintao Duan, and Liping Gao. 2019. Perceptual image hashing based on weber local binary pattern and color angle representation. IEEE Access 7 (2019), 45460–45471. https://ieeexplore.ieee.org/abstract/document/8675975
[17]
Chuan Qin, Xueqin Chen, Xiangyang Luo, Xinpeng Zhang, and Xingming Sun. 2018. Perceptual image hashing via dual-cross pattern encoding and salient structure detection. Information Sciences 423 (2018), 284–302. https://www.sciencedirect.com/science/article/pii/S0020025517302013
[18]
Khalid M. Hosny, Yasmeen M. Khedr, Walid I. Khedr, and Ehab R. Mohamed. 2018. Robust color image hashing using quaternion polar complex exponential transform for image authentication. Circuits, Systems, and Signal Processing 37, 12 (2018), 5441–5462.
[19]
Zhenjun Tang, Xuelong Li, Xianquan Zhang, Shichao Zhang, and Yumin Dai. 2018. Image hashing with color vector angle. Neurocomputing 308 (2018), 147–158. https://www.sciencedirect.com/science/article/pii/S0925231218304971
[20]
Zhenjun Tang, Xianquan Zhang, Xianxian Li, and Shichao Zhang. 2016. Robust image hashing with ring partition and invariant vector distance. IEEE Transactions on Information Forensics and Security 11, 1 (2016), 200–214.
[21]
Junlin Ouyang, Gouenou Coatrieux, and Huazhong Shu. 2015. Robust hashing for image authentication using quaternion discrete Fourier transform and log-polar transform. Digital Signal Processing 41 (2015), 98–109. https://www.sciencedirect.com/science/article/pii/S1051200415000810
[22]
Zhenjun Tang, Fan Yang, Liyan Huang, and Xianquan Zhang. 2014. Robust image hashing with dominant DCT coefficients. Optik 125, 18 (2014), 5102–5107.
[23]
R. Venkatesan, S. Koon, Mariusz Jakubowski, and P. Moulin. 2000. Robust image hashing. In Proceedings of the International Conference on Image Processing (ICIP). 664–666.
[24]
Chuan Qin, Xueqin Chen, Jing Dong, and Xinpeng Zhang. 2016. Perceptual image hashing with selective sampling for salient structure features. Displays 45 (2016), 26–37. https://www.sciencedirect.com/science/article/pii/S0141938216301020
[25]
Jiri Fridrich and Miroslav Goljan. 2000. Robust hash functions for digital watermarking. In Proceedings of the International Conference on Information Technology: Coding and Computing. 178–183.
[26]
Xiaofeng Wang, Kemu Pang, Xiaorui Zhou, Yang Zhou, Lu Li, and Jianru Xue. 2015. A visual model-based perceptual image hash for content authentication. IEEE Transactions on Information Forensics and Security 10, 7 (2015), 1336–1349.
[27]
Issam H. Laradji, Lahouari Ghouti, and El-Hebri Khiari. 2013. Perceptual hashing of color images using hypercomplex representations. In Proceedings of the 2013 IEEE International Conference on Image Processing. 4402–4406.
[28]
Caiping Yan, Chi-Man Pun, and Xiaochen Yuan. 2016. Quaternion-based image hashing for adaptive tampering localization. IEEE Transactions on Information Forensics and Security 11, 12 (2016), 2664–2677.
[29]
Zhenjun Tang, Lv Chen, Xian Quan Zhang, and Shichao Zhang. 2019. Robust image hashing with tensor decomposition. IEEE Transactions on Knowledge and Data Engineering 31, 3 (2019), 549–560.
[30]
Zhenjun Tang, Ziqing Huang, Xianquan Zhang, and Huan Lao. 2017. Robust image hashing with multidimensional scaling. Signal Processing 137 (2017), 240–250. https://www.sciencedirect.com/science/article/pii/S0165168417300646
[31]
Chuan Qin, Meihui Sun, and Chinchen Chang. 2018. Perceptual hashing for color images based on hybrid extraction of structural features. Signal Processing 142 (2018), 194–205. https://www.sciencedirect.com/science/article/pii/S0165168417302621
[32]
Xiaoping Liang, Zhenjun Tang, Sheng Li, Chunqiang Yu, and Xianquan Zhang. 2022. A novel hashing scheme via image feature map and 2D PCA. IET Image Processing 16, 12 (2022), 3225–3236.
[33]
Zhenjun Tang, Mengzhu Yu, Heng Yao, Hanyun Zhang, Chunqiang Yu, and Xianquan Zhang. 2021. Robust image hashing with singular values of quaternion SVD. The Computer Journal 64, 11 (2021), 1656–1671.
[34]
Xiaoping Liang, Zhenjun Tang, Jingli Wu, Zhixin Li, and Xinpeng Zhang. 2023. Robust image hashing with isomap and saliency map for copy detection. IEEE Transactions on Multimedia 25 (2023), 1085–1097. https://ieeexplore.ieee.org/document/9665342
[35]
Ziqing Huang, Zhenjun Tang, Xianquan Zhang, Linlin Ruan, and Xinpeng Zhang. 2023. Perceptual image hashing with locality preserving projection for copy detection. IEEE Transactions on Dependable and Secure Computing 20, 1 (2023), 463–477.
[36]
Yuenan Li, Zheming Lu, Ce Zhu, and Xiamu Niu. 2011. Robust image hashing based on random Gabor filtering and dithered lattice vector quantization. IEEE Transactions on Image Processing 21, 4 (2011), 1963–1980.
[37]
Xi Huang, Xiaoguang Liu, Gang Wang, and Ming Su. 2016. A robust image hashing with enhanced randomness by using random walk on zigzag blocking. In Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA. IEEE, 14–18.
[38]
Yan Zhao and Shuai Liu. 2021. Robust image hashing based on cool and warm hue and space angle. Security and Communication Networks 3803481 (2021). https://www.hindawi.com/journals/scn/2021/3803481/
[39]
Ziqing Huang and Shiguang Liu. 2018. Robustness and discrimination oriented hashing combining texture and invariant vector distance. In Proceedings of the 26th ACM International Conference on Multimedia. 1389–1397.
[40]
Frank L. Hitchcock. 1927. The expression of a tensor or a polyadic as a sum of products. Journal of Mathematics and Physics 6, 1-4 (1927), 164–189.
[41]
Frank L. Hitchcock. 1928. Multiple invariants and generalized rank of a p-way matrix or tensor. Journal of Mathematics and Physics 7, 1-4 (1928), 39–79.
[42]
Yang Xu, Zebin Wu, Jocelyn Chanussot, Pierre Comon, and Zhihui Wei. 2019. Nonlocal coupled tensor CP decomposition for hyperspectral and multispectral image fusion. IEEE Transactions on Geoscience and Remote Sensing 58, 1 (2019), 348–362.
[43]
Rafał Zdunek, Krzysztof Fonał, and Andrzej Wołczowski. 2019. Linked CP tensor decomposition algorithms for shared and individual feature extraction. Signal Processing: Image Communication 73 (2019), 37–52. https://www.sciencedirect.com/science/article/pii/S092359651831035X
[44]
Miguel A. Veganzones, Jeremy E. Cohen, Rodrigo Cabral Farias, Jocelyn Chanussot, and Pierre Comon. 2015. Nonnegative tensor CP decomposition of hyperspectral data. IEEE Transactions on Geoscience and Remote Sensing 54, 5 (2015), 2577–2588.
[45]
J. Douglas Carroll and Jih-Jie Chang. 1970. Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika 35, 3 (1970), 283–319.
[46]
Yang Xu, Zebin Wu, Jocelyn Chanussot, Pierre Comon, and Zhihui Wei. 2020. Nonlocal coupled tensor CP decomposition for hyperspectral and multispectral image fusion. IEEE Transactions on Geoscience and Remote Sensing 58, 1 (2020), 348–362.
[47]
Pieter M. Kroonenberg and Jan De Leeuw. 1980. Principal component analysis of three-mode data by means of alternating least squares algorithms. Psychometrika 45, 1 (1980), 69–97.
[48]
Dimitri Nion and Lieven De Lathauwer. 2008. An enhanced line search scheme for complex-valued tensor decompositions. Application in DS-CDMA. Signal Processing 88, 3 (2008), 749–755.
[49]
David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik. 2001. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the 8th IEEE International Conference on Computer Vision (ICCV). 416–423.
[50]
Fabien A. P. Petitcolas. 2000. Watermarking schemes evaluation. IEEE Signal Processing Magazine 17, 5 (2000), 58–64.
[51]
Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. 2004. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing 13, 4 (2004), 600–612.
[52]
Zhenjun Tang, Zhiyuan Chen, Zhixin Li, Bineng Zhong, Xianquan Zhang, and Xinpeng Zhang. 2023. Unifying dual-attention and siamese transformer network for full-reference image quality assessment. ACM Transactions on Multimedia Computing, Communications and Applications 19, 6 (2023), 1–24.
[53]
Mark Everingham, Luc Van Gool, Chris Williams, John Winn, Andrew Zisserman, Yusuf Aytar, and Ali Eslami. 2012. The PASCAL Visual Object Classes (VOC) Challenge 2012 Dataset. Retrieved April 1, 2020 from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/
[54]
Tom Fawcett. 2006. An introduction to ROC analysis. Pattern Recognition Letters 27, 8 (2006), 861–874.
[55]
James Wang, Jia Li, and Gio Wiederhold. 2001. SIMPLIcity: Semantics-sensitive integrated matching for picture LIbraries. IEEE Transactions on Pattern Analysis and Machine Intelligence 23, 9 (2001), 947–963.

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 7 (July 2024), 973 pages. EISSN: 1551-6865. DOI: 10.1145/3613662. Editor: Abdulmotaleb El Saddik.
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Received: 10 October 2023; Revised: 21 January 2024; Accepted: 25 February 2024; Online AM: 1 March 2024; Published: 25 April 2024. Published in TOMM Volume 20, Issue 7.

Author Tags

  1. CP decomposition
  2. tensor construction
  3. image hashing
  4. copy detection
  5. dimensionality reduction

Funding Sources

  • National Natural Science Foundation of China
  • Guangxi Natural Science Foundation
  • Guangxi “Bagui Scholar” Team for Innovation and Research
  • Guangxi Talent Highland Project of Big Data Intelligence and Application
  • Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing
