3.1.1 Active Methods.
Active methods are robust mechanisms for safeguarding the integrity of digital images: they exploit known information that is embedded into the image during acquisition and digitization [
58]. Alterations in an image are detected by analyzing such embedded (known) data. The main applications of these image authentication methods include intellectual property protection and owner identification. They are classified according to whether they ensure strict integrity or content authentication, and according to the storage strategy of the authentication data (i.e., an embedded watermark or an external signature). These approaches include watermarking, digital signatures, and conventional cryptography [
139].
Digital Watermarking Techniques. A large amount of literature currently deals with the topic of digital watermarking techniques [
76,
110,
152,
188,
225]. Intuitively, image watermarking alters an image by inserting a mark that guarantees its authenticity. Watermarking methods are classified into three categories: (1)
fragile watermarks (2)
semi-fragile watermarks, and (3)
robust watermarks.
Fragile watermarking methods only allow a strict integrity check, whereas
semi-fragile watermarking methods, based on external signatures, ensure content authentication. The former are highly sensitive to any type of tampering, even the modification of a single bit; the results obtained with these techniques show that they are both robust and secure, and they are therefore regarded as the best solution for copyright protection. The latter, semi-fragile watermarking methods, can be used for forensic purposes: the authenticator can distinguish original images from images whose content has been intentionally modified, while tolerating manipulations that preserve the content of the image. The results reported for them show that, despite their computational complexity, these techniques are secure and robust against all types of counterfeiting attacks. Finally, there are
robust watermarking algorithms, which can survive content-preserving changes such as compression, noise addition, and filtering, as well as geometric changes such as resizing, translation, rotation, and cropping, among many others. They are used for ownership authentication [
64]. These approaches have high computational complexity and also offer a lower embedding capacity. Fragile watermarking methods remain the simplest to implement [
271]. In digital watermarking, a specific message (digest) is inserted when an image is captured by a digital device. The watermark is either "visible" or "invisible" [
223] and is, generally, independent of the image data [
323]. The basic underlying idea is that any attempt to alter the content of an image will also alter the watermark itself. At verification time, a digest is extracted from the image and compared with the original digest; the result indicates whether the image has been modified after acquisition. Watermarking techniques therefore consist of two phases. In the first phase, the image is generated and the watermark is inserted. In the second phase, once the image reaches its destination, the watermark is extracted and compared with the initial watermark. These techniques can be broadly classified into two categories depending on the domain in which the watermark is embedded: (1) the
spatial domain (least significant bit [
284], spread spectrum [
46], random insertion in file [
36]) and (2) the
frequency domain (
Discrete Wavelet Transform (DWT) [
304], DCT [
95], discrete Fourier transform [
306],
Singular Value Decomposition (SVD) [
217]). It has been pointed out that the frequency domain methods are more robust than the spatial domain techniques [
79].
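As a concrete illustration of the spatial-domain approach, the following Python sketch embeds and extracts a least-significant-bit watermark; it assumes 8-bit grayscale NumPy arrays, and the layout of the watermark is illustrative rather than taken from any specific published scheme.

import numpy as np

def embed_lsb(image: np.ndarray, watermark_bits: np.ndarray) -> np.ndarray:
    """Embed a binary watermark in the least significant bit of each pixel.
    `image` is an 8-bit grayscale array; `watermark_bits` holds 0/1 values
    with the same shape (illustrative layout, not a published scheme)."""
    return (image & 0xFE) | (watermark_bits & 1)

def extract_lsb(image: np.ndarray) -> np.ndarray:
    """Recover the embedded bit plane from a (possibly tampered) image."""
    return image & 1

# Verification: any pixel whose recovered bit differs from the original
# watermark marks a potentially tampered region.
rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
mark = rng.integers(0, 2, size=(64, 64), dtype=np.uint8)
marked = embed_lsb(cover, mark)
tampered = marked.copy()
tampered[10:20, 10:20] = 255                      # simulate local tampering
diff_map = extract_lsb(tampered) != mark          # True where the watermark broke
print(diff_map.sum(), "pixels flag possible tampering")

A tampered region is revealed wherever the recovered bit plane disagrees with the stored watermark; frequency-domain schemes follow the same embed-and-verify logic but hide the mark in transform coefficients.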
The advantages of these methods are that they provide authenticated multimedia content and address four essential features: robustness, security, capacity, and invisibility. The disadvantages are as follows: (1) few (and generally expensive) devices are able to embed a watermark during image acquisition, and (2) these techniques, which require specialized software to embed the digest into the image, are unable to distinguish legitimate manipulations performed to enhance image quality (contrast enhancement, sharpening, etc.) from illegitimate ones (see [
271]).
Cryptographic Techniques. Encryption techniques involve two stages. In the encryption stage, the plaintext is converted into ciphertext using a cipher. Conversely, in the decryption stage, the same cipher is used to convert the ciphertext back to plaintext. Approaches based on classic hash functions, such as MD-4 and MD-5 (Message Digest), CRC-32 (32-bit cyclic redundancy check), and SHA-1 (Secure Hash Algorithm), are just some of the many methods in the literature that adopt a common general procedure: a digest of the image is computed, and later recomputed and compared to verify integrity. Given the breadth of this research field, interested readers are advised to refer to other works [
127,
194,
323,
328,
329,
336].
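As a minimal sketch of this common procedure, assuming the digest is computed over the raw file bytes and that the file paths are placeholders, a hash-based integrity check in Python might look as follows (SHA-256 is used here purely as an example of a modern hash function).

import hashlib

def image_digest(path: str, algorithm: str = "sha256") -> str:
    """Compute a hex digest over the raw bytes of an image file."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# At acquisition time the digest would be stored (or signed) alongside the image;
# at verification time a mismatch indicates that the bytes were altered.
# original = image_digest("captured.jpg")        # placeholder path
# assert image_digest("received.jpg") == original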
Digital Signature Techniques. The validation of the authenticity of digital messages is normally based on a digital signature. A valid signature allows the recipient to assume that the message originates from the claimed sender. The digital signature guarantees that the content is authentic, reliable, and comes from a genuine source. These techniques have been applied to improve data integrity and image authentication in many industries, and some are combined with watermarking, steganography, and cryptographic techniques [
15,
181,
259,
301].
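The following Python sketch illustrates the sign-at-capture/verify-at-use idea using the third-party cryptography package; the RSA-PSS parameters and the placeholder image bytes are assumptions made for illustration, not part of any method cited above.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.exceptions import InvalidSignature

# Key generation would happen once, e.g., inside a trusted camera.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

image_bytes = b"...raw image bytes..."  # placeholder for the captured image

# Signing at capture time.
signature = private_key.sign(
    image_bytes,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)

# Verification at the time of use: any change to the bytes invalidates the signature.
try:
    public_key.verify(
        signature,
        image_bytes,
        padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
        hashes.SHA256(),
    )
    print("image is authentic")
except InvalidSignature:
    print("image was modified after signing")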
In conclusion, in active methods, a trustworthy camera calculates a digital watermark or signature from the image at capture time, and any subsequent changes to the image can be detected by checking the watermark or signature at the time of use. The disadvantages of active methods include that digital cameras must be equipped with a watermarking or digital signature chip which, using a private key wired into the camera itself, authenticates each captured image before storing it on the memory card. This implies the definition of a common standard protocol, a requirement that would restrict such solutions to very limited scenarios [
254].
3.1.2 Passive Methods.
As discussed previously, active methods use specially designed digital signatures and watermarking tools in cameras.
Passive methods, sometimes known as blind methods, overcome this limitation by exclusively analyzing the binary information of an image [
101], relying directly on image statistics, without requiring any prior information on image acquisition (Figure
3). Passive methods fall into three categories: (A) based on
tampering operations, (B) based on
intrinsic regularities and inconsistencies, and (C) dealing with
natural and computer graphics images. The most widely studied categories are certainly the first two. The first category is divided into forgery type-dependent and forgery type-independent methods [
35,
42,
242,
264,
312]:
forgery type-dependent techniques depend on the specific type of forgery performed on the image and include methods to detect the most popular kinds of attack, copy-move and image splicing, and
forgery type-independent techniques are independent of the type of attack.
A. Methods Based on Tampering Operations
Forgery Type-Dependent Techniques
Copy-Move Forgery. Copy-move attack refers to forgeries that use a single image to copy a region from it and paste it into the same image to hide or duplicate specific objects. The copied part may or may not be modified. Since it belongs to the same image, its essential properties such as noise, color, and texture do not change, so the final forged image has homogeneous characteristics. This makes it difficult to even locate the forgery. Further operations such as rescaling, filtering, and noise scattering can be applied to hide any traces of the forgery.
The copy-move forgery detection technique is the process of identifying the occurrence of a copy-move in an image and can be classified into three categories: (1) brute force detection, (2) block-based detection, and (3) keypoint-based detection. Due to the computational time complexity of the first category, the literature mainly refers to approaches based on blocks and key points.
Brute force detection is based on an exhaustive search and autocorrelation technique, which checks for any position change. The exhaustive search examines the corresponding image segment through circular shifts, producing a large number of comparisons. The computational times are relatively high.
Block-based detection divides the forged image into overlapping or non-overlapping blocks and analyzes features extracted from each block.
In the block feature extraction phase, several approaches have been applied:
(1)
Since the
Local Binary Patterns (LBP) operator is a rotation-invariant texture descriptor for grayscale images, it is widely used as a grayscale operator. LBP codes can be extracted from the blocks and used for block texture verification. Farooq et al. [
105] proposed a generic passive image forgery scheme that combines a spatial rich model with a textural feature based on the LBP operator. The combination of LBP with co-occurrence matrices makes the model capable of detecting almost all types of forgeries with improved detection accuracy.
(2)
A signal transform into the frequency domain can provide compact signatures for the image blocks, thus allowing duplicate regions to be identified. Mahmood et al. [215] used the stationary wavelet transform and the DCT to detect and localize copy-move operations in digital images. The stationary wavelet transform allows the method to work in both the spatial and the spectral domain, and the feature vectors are reduced by applying the DCT. Dimension reduction is widely used to enhance the performance of such frameworks (a minimal block-based sketch is given after this list). Table
1 references the most used algorithms.
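A minimal sketch of the generic block-based pipeline (not the specific method of Mahmood et al.) follows: overlapping blocks are described by a few low-frequency DCT coefficients, the feature rows are sorted lexicographically, and adjacent rows with near-identical features but a large spatial offset are reported as candidate copy-move pairs. Block size, tolerance, and the minimum shift are illustrative parameters.

import numpy as np
from scipy.fft import dctn

def block_copy_move(img: np.ndarray, block: int = 8, n_coeffs: int = 9,
                    feat_tol: float = 1.0, min_shift: int = 16):
    """Generic block-based copy-move sketch (written for clarity, not speed):
    overlapping blocks are described by their low-frequency DCT coefficients,
    sorted lexicographically, and adjacent entries with near-identical features
    but a large spatial offset are reported."""
    h, w = img.shape
    feats, coords = [], []
    for y in range(h - block + 1):
        for x in range(w - block + 1):
            c = dctn(img[y:y + block, x:x + block].astype(float), norm="ortho")
            # Keep the top-left (low-frequency) coefficients, rounded so that
            # near-duplicate blocks sort next to each other.
            feats.append(np.round(c[:3, :3].ravel()[:n_coeffs]))
            coords.append((y, x))
    feats = np.array(feats)
    order = np.lexsort(feats.T[::-1])              # lexicographic sort of feature rows
    matches = []
    for i, j in zip(order[:-1], order[1:]):
        if np.all(np.abs(feats[i] - feats[j]) <= feat_tol):
            dy, dx = np.subtract(coords[i], coords[j])
            if dy * dy + dx * dx >= min_shift ** 2:  # ignore trivially close blocks
                matches.append((coords[i], coords[j]))
    return matches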
In
keypoint-based detection, feature-based approaches extract key points from the forged image using a two-step process of locating and then describing local interest points. Robust local descriptors are constructed that must be invariant to affine alterations; they are not only robust to noise and geometric transformations but also use scale- and rotation-invariant feature detectors and descriptors. Table 1 lists the main algorithms used in the feature extraction phase. These methods rely on the identification and selection of high-entropy image regions (i.e., the "key points"): one feature vector per keypoint is then extracted. As a result, fewer feature vectors are estimated, which reduces the computational complexity of feature matching and post-processing. Post-processing thresholds are also lower than those of block-based methods. The downside is that the copied regions are not always covered by a satisfactory number of corresponding key points; if a copied region shows little structure, it may be missed entirely. Yang et al. [
331] proposed a keypoint-based copy-move forgery detection and location technique based on a hierarchical point matching that reduces the number of points to improve the matching process. Feature-based approaches are relatively better than block-based and brute force methods in terms of computational efficiency, complexity, and robustness against many transformations such as scaling, rotation, and cropping [
167].
In the matching and filtering phase, in both block-based and keypoint-based detection approaches, to identify matches among the feature descriptors and to reduce the probability of false matches, most authors propose the use of similarity measures and algorithms such as the Euclidean distance, the correlation coefficient, sorting (lexicographic sorting, KD-tree, radix sort), hashing (counting Bloom filters, locality-sensitive hashing), DCT and sequential clustering, Best Bin First, g2NN, 2NN, and clustering (HAC, WPGMC), among others.
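The following Python sketch illustrates a generic keypoint pipeline with OpenCV: ORB keypoints are matched against themselves, a 2NN ratio test filters ambiguous matches, and spatially close pairs are discarded. It is an illustrative composition of the ingredients listed above, not a reimplementation of any cited method; the parameters and file name are placeholders.

import cv2
import numpy as np

def keypoint_copy_move(gray: np.ndarray, ratio: float = 0.75, min_dist: float = 20.0):
    """Generic keypoint-based sketch: detect ORB keypoints, match descriptors
    against themselves, keep matches that pass a 2NN ratio test, and discard
    pairs that are spatially too close to be a copy-move."""
    orb = cv2.ORB_create(nfeatures=5000)
    keypoints, descriptors = orb.detectAndCompute(gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    # k=3 because the best match of a descriptor against itself is the descriptor itself.
    knn = matcher.knnMatch(descriptors, descriptors, k=3)
    pairs = []
    for m in knn:
        if len(m) < 3:
            continue
        _, first, second = m            # skip the trivial self-match
        if first.distance < ratio * second.distance:
            p1 = np.array(keypoints[first.queryIdx].pt)
            p2 = np.array(keypoints[first.trainIdx].pt)
            if np.linalg.norm(p1 - p2) > min_dist:
                pairs.append((tuple(p1), tuple(p2)))
    return pairs

# Usage (hypothetical file name):
# img = cv2.imread("suspect.png", cv2.IMREAD_GRAYSCALE)
# print(len(keypoint_copy_move(img)), "candidate copy-move correspondences")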
In conclusion, both classes of methods have strengths and limitations. To address the limitation of these methods in flat regions, some approaches (see [
111]) also implemented mixed processes to tackle copy-move tampering detection problems, combining techniques based on features with block-based techniques. The approach in the work of Hashmi et al. [
140] combines the keypoint-based SIFT method with a block-based method using the DyWT (Dyadic Wavelet Transform). Additionally, in the work of Su and Kaizhen [
295], a combination of the two approaches is proposed by extracting the SIFT key points in an image and then applying LPP (Locality Preserving Projection) to obtain low-dimensional feature descriptors.
Image Splicing. Splicing is the process of cutting a region out of one image and pasting it into another image. To create a fake image, splicing entails merging at least two images. When images with different characteristics have been well blended, the edges between the spliced regions can be visually imperceptible. Unlike copy-move forgery, in splicing the pasted objects are taken from more than one image. Splicing, however, disturbs higher-order Fourier statistics, and this insight can be used to distinguish fakes [
115,
237,
238,
348].
Image splicing detection is fundamental in image forgery detection [
42]. Table
1 lists the main related algorithms.
It appears from the literature that many of the methods mentioned work well when the analyzed image is compressed with a high quality factor; otherwise, compression artifacts make it very difficult to spot the fake.
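Higher-order statistics such as those cited above are beyond the scope of a short example, but a much simpler cue of the same spirit can be sketched: a block-wise noise-level map, under the assumption that a spliced region carries a noise level different from the rest of the image. Blocks whose estimate deviates strongly from the image-wide median become candidates for closer inspection.

import numpy as np
from scipy.ndimage import median_filter

def block_noise_map(gray: np.ndarray, block: int = 32) -> np.ndarray:
    """Estimate a per-block noise level from the residual after median filtering.
    Blocks whose estimate deviates strongly from the image-wide median are
    candidate spliced regions (a simple cue, not the cited higher-order statistics)."""
    g = gray.astype(float)
    residual = g - median_filter(g, size=3)
    h, w = gray.shape
    rows, cols = h // block, w // block
    noise = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            patch = residual[i * block:(i + 1) * block, j * block:(j + 1) * block]
            # Robust noise estimate: median absolute deviation scaled for Gaussian noise.
            noise[i, j] = 1.4826 * np.median(np.abs(patch - np.median(patch)))
    return noise

# Blocks where `noise` differs markedly from np.median(noise) are flagged for inspection.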
Forgery Type-Independent Techniques
J-Compression. Editing an image involves loading it, modifying it, and then saving it again, which typically re-compresses it. Knowing the history of these compression operations, as well as whether a bitmap image has previously been compressed, provides clues for forgery detection. Table
2 lists the main methods used in J-Compression detection techniques. These methods work well for detecting re-saved images; the problem arises when the images are also rotated, resized, and/or enhanced.
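One simple cue for prior JPEG compression, in the spirit of recompression-based analyses, can be sketched as follows: the image is re-saved at a range of quality factors and the reconstruction error is measured; for a previously compressed image the error curve typically dips near the original quality factor. The quality range and file name are placeholders.

import io
import numpy as np
from PIL import Image

def recompression_error_curve(img: Image.Image, qualities=range(50, 101, 5)):
    """Re-save the image at several JPEG quality factors and measure the mean
    squared difference. For an image that was previously JPEG-compressed, the
    curve typically dips near the original quality factor."""
    original = np.asarray(img.convert("L"), dtype=float)
    errors = []
    for q in qualities:
        buf = io.BytesIO()
        img.convert("L").save(buf, format="JPEG", quality=q)
        buf.seek(0)
        recompressed = np.asarray(Image.open(buf), dtype=float)
        errors.append(((original - recompressed) ** 2).mean())
    return list(qualities), errors

# Usage (placeholder path):
# qs, errs = recompression_error_curve(Image.open("suspect.jpg"))
# print(min(zip(errs, qs)))   # quality factor with the smallest recompression error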
Image Re-Sampling. When spatial transformations such as resizing, rotation, and stretching are applied to a digital image (to a specific object in the image or to all the image content), the type of forgery is known as re-sampling. The resizing of an image changes the dimensions of an object but does not improve the quality of that object. Re-sampling can be performed in different ways: up/down-sampling, mirroring, skewing, and seam carving.
With regard to image re-sampling detection, to create a fake image, selected regions often need to undergo geometric transformations such as rotation, scaling, stretching, skewing, and flipping. For example, if the face of a person is larger in one image, it must be scaled so that the sizes of the faces are similar in the composite image. This requires re-sampling the region onto a new sampling lattice, which introduces periodic correlations between neighboring pixels. These transformations leave traces that are not typically present in original images, and detection techniques seek to identify such traces as re-sampling cues. The interpolation method (e.g., nearest neighbor, bilinear, bicubic) also plays an important role in the re-sampling process and introduces non-negligible statistical changes due to the specific periodic correlations it imposes on the image. These correlations can be used to recognize forgeries caused by re-sampling [
210,
259]. Table
2 summarizes several elements proposed and combined over time.
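A minimal sketch of the underlying intuition follows: interpolation makes the variance of a linear-prediction residual vary periodically with pixel position, and this periodicity appears as peaks in the Fourier spectrum of the residual profile. Random noise stands in for image content, and the scaling factor is illustrative.

import numpy as np
from scipy.ndimage import zoom

def residual_spectrum(gray: np.ndarray) -> np.ndarray:
    """Second-difference residual along rows, its per-column variance, and the
    magnitude spectrum of that profile. Periodic interpolation correlations
    appear as pronounced peaks in the spectrum."""
    g = gray.astype(float)
    residual = g[:, 2:] - 2.0 * g[:, 1:-1] + g[:, :-2]
    profile = residual.var(axis=0)               # variance of the residual per column
    return np.abs(np.fft.rfft(profile - profile.mean()))

rng = np.random.default_rng(1)
original = rng.normal(size=(256, 256))           # noise stands in for an unresampled image
upscaled = zoom(original, 1.5, order=1)          # bilinear re-sampling by a factor of 1.5
# The spectrum of the upscaled image shows pronounced peaks that the original lacks.
print(residual_spectrum(original).max(), residual_spectrum(upscaled).max())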
Image Retouching. Image retouching forgery aims to enhance an object or an image to exhibit or hide a specific feature, such as a change in coloring, lighting, or background. It is commonly used for aesthetic and commercial purposes, to enhance or attenuate image features, and to create a compelling composite of two images, which may require transformations such as rotating, stretching, or scaling one of the images.
With regard to image retouching detection, many methods have been proposed for detecting retouched forgeries, a task that is difficult when the image differs significantly from the original. Additionally, human intervention is often required to interpret the result. Table
2 lists the main related proposals.
Image Morphing. A gradual transformation from one graphical object or image (source) to another graphical object or image (target) is called
image morphing [
155]. Differently from warping, image morphing interpolates between two or more graphical objects: it combines image warping and blending techniques to interpolate objects and create a novel one. The basic idea of morphing is to distort the first image into the other by some pre-defined set of rules. The two basic principles of image morphing are image warping and cross-fading, which must be coupled. During this transformation, the central image (i.e., the generated one) is the key element of the technique, because it determines whether the sequence will look convincing. Intuitively, as the source image evolves it gradually loses its characteristics while the characteristics of the target emerge: the first images in the sequence look more like the source, whereas the central image combines characteristics of both the source and the target and is distorted [
60]. Table
3 lists the basis of the latest generation algorithms and automated morphing attack detection approaches recently proposed [
219,
307].
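The cross-fading half of this process can be sketched in a few lines; a complete morph would additionally warp both images so that corresponding features align before each blend, which is omitted here for brevity.

import numpy as np

def cross_dissolve(source: np.ndarray, target: np.ndarray, steps: int = 5):
    """Generate intermediate frames by linear blending (cross-fading). A full
    morph would additionally warp both images toward shared feature positions
    before blending each frame; only the blending step is illustrated here."""
    frames = []
    for alpha in np.linspace(0.0, 1.0, steps):
        frame = (1.0 - alpha) * source.astype(float) + alpha * target.astype(float)
        frames.append(frame.astype(np.uint8))
    return frames

# The middle frame (alpha around 0.5) carries characteristics of both images,
# which is exactly the frame exploited in morphing attacks on face recognition.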
Inconsistencies (Image Processing Operations). An altered image has typically undergone basic image processing operations (brightness and contrast adjustment, cropping and recompression, sharpening and blurring, and filtering). Tampering produces inconsistencies with respect to the regular models of untouched images, and the traces of these operations can be useful in identifying forgery. Table
3 also lists the most recently proposed techniques in image processing operations detection.
B. Based on Intrinsic Regularities and Inconsistencies
Image regularities are of different natures: for example, optical, processing related, sensor related, and statistical.
Optical Regularities. These regularities include imperfections due to illumination, radial distortion of the lens, and chromatic aberration. Table
4 lists the most recently proposed techniques in optical regularities detection.
Processing Regularities. These regularities include demosaicing, white balancing, and gamma correction. Table
4 lists the most recently proposed techniques in processing regularities detection.
Sensor Regularities (PNU (Pixel Non-Uniformity)). The
Source Camera Identification (SCI) techniques identify the intrinsic evidence left in images by the digital cameras responsible for their acquisition. It is worth mentioning sensor pattern noise (the
Photo Response Non-Uniformity (PRNU)), dust patterns, feature extraction, and the
Camera Response Function (CRF). Table
4 lists the most recently proposed techniques in sensor regularities detection.
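A minimal PRNU-style sketch follows, under simplifying assumptions: the noise residual is obtained with a Gaussian filter (a stand-in for the wavelet-based denoisers used in the literature), the fingerprint is the average residual over several images from the same camera, and identification is based on normalized correlation.

import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(img: np.ndarray) -> np.ndarray:
    """Noise residual = image minus a denoised version (Gaussian filter as a
    simple stand-in for the denoisers used in the PRNU literature)."""
    img = img.astype(float)
    return img - gaussian_filter(img, sigma=1.0)

def camera_fingerprint(images) -> np.ndarray:
    """Average the residuals of several images taken by the same camera."""
    residuals = [noise_residual(im) for im in images]
    return np.mean(residuals, axis=0)

def correlation(fingerprint: np.ndarray, test_image: np.ndarray) -> float:
    """Normalized correlation between the fingerprint and a test residual;
    high values support the hypothesis that the camera took the image."""
    r = noise_residual(test_image)
    a = fingerprint - fingerprint.mean()
    b = r - r.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))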
Statistical Regularities. These regularities include color features, image quality metrics, and natural image statistics. Table
4 lists the most recently proposed techniques in statistical regularities detection. Methods based on image features do not work well when the images come from cameras with similar CCD sensors [
312].
C. Natural and Computer Graphics Images
Graphic software can generate photorealistic images. A challenging problem is to distinguish between computer-generated photorealistic and real (photographic) images. Table
5 lists the main bases for the proposed algorithms in natural and computer graphics image detection.
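A common recipe in this area pairs hand-crafted statistical features with a standard classifier; the sketch below uses simple wavelet sub-band statistics and an SVM purely as an illustration (the feature set, wavelet, and classifier are assumptions, not a specific published method).

import numpy as np
import pywt
from sklearn.svm import SVC

def wavelet_stats(gray: np.ndarray) -> np.ndarray:
    """Simple statistics (mean, variance, third and fourth central moments)
    of wavelet detail sub-bands, a common feature basis for separating
    photographic from computer-generated images."""
    feats = []
    coeffs = pywt.wavedec2(gray.astype(float), "db4", level=3)
    for detail_level in coeffs[1:]:
        for band in detail_level:            # horizontal, vertical, diagonal details
            b = band.ravel()
            feats.extend([b.mean(), b.var(),
                          ((b - b.mean()) ** 3).mean(),
                          ((b - b.mean()) ** 4).mean()])
    return np.array(feats)

# Training would pair such feature vectors with labels (0 = photographic, 1 = CG):
# X = np.stack([wavelet_stats(im) for im in training_images])   # hypothetical data
# clf = SVC(kernel="rbf").fit(X, labels)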