Review

Latent Diffusion Models for Image Watermarking: A Review of Recent Trends and Future Directions

1 Division of Software, Hallym University, Chuncheon City 24252, Gangwon-do, Republic of Korea
2 Division of AI Convergence, Hallym University, Chuncheon City 24252, Gangwon-do, Republic of Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Electronics 2025, 14(1), 25; https://doi.org/10.3390/electronics14010025
Submission received: 24 October 2024 / Revised: 23 December 2024 / Accepted: 24 December 2024 / Published: 25 December 2024
(This article belongs to the Section Artificial Intelligence)

Abstract

Recent advancements in deep learning-based generative models have simplified image generation, increasing the need for improved source tracing and copyright protection, especially with the efficient, high-quality output of latent diffusion models (LDMs) raising concerns about unauthorized use. This paper provides a comprehensive review of watermarking techniques applied to latent diffusion models, focusing on recent trends and the potential utility of these approaches. Watermarking using latent diffusion models offers the potential to overcome the limitations of traditional post-generation watermarking by embedding watermarks in the latent space during the image generation process. This represents a new watermarking paradigm that leverages a degree of freedom unavailable to traditional techniques and motivates further exploration of watermark technology. LDM-based watermarking allows for the natural internalization of watermarks within the content generation process, enabling robust watermarking without compromising image quality. We categorize the methods based on embedding strategies and analyze their effectiveness in achieving key functionalities—source tracing, copyright protection, and AI-generated content identification. The review highlights the strengths and limitations of current techniques and discusses future directions for enhancing the robustness and applicability of watermarking in the evolving landscape of generative AI.

1. Introduction

Recent advancements in the complexity and sophistication of generative models have created new opportunities and challenges. On one hand, they provide unprecedented creativity and efficiency in multimedia content production, but on the other hand, they raise concerns about copyright infringement and unauthorized distribution of AI-generated content due to the ease of generating realistic content. As generative AI becomes increasingly accessible, the ability to trace the origin of digital content, verify authenticity, and protect the rights of content creators is becoming more important. Latent diffusion models (LDMs) [1], in particular, are characterized by their use of latent variable modeling to generate high-resolution images while maintaining computational efficiency. This makes LDMs suitable for a range of practical applications, from entertainment to scientific visualization. For example, in the film industry, LDMs are used for creating high-quality visual effects, and in the medical field, they are used for high-resolution image restoration and analysis. However, the powerful features of LDMs—high fidelity and generative capability—also increase the risk of generated content being misused without proper attribution or consent [2]. Therefore, innovative watermarking solutions that are well integrated into the generation process are needed. Such solutions can serve as a crucial means to clarify the origin of generated content, strengthen copyright protection, and prevent illegal use of content [3,4].
In recent years, interest in integrating deep learning techniques into image steganography and watermarking technologies has been growing [5]. As deep learning-based generative models continue to advance, watermarking and steganography technologies have also evolved in new ways through this integration. Timely consolidation of the latest research findings is crucial for the development of image steganography and watermarking algorithms. However, there is still a lack of comprehensive and systematic summaries and characterizations of deep learning-based steganography and watermarking algorithms. Existing review papers primarily address traditional watermarking techniques [6] and deep learning methodologies [7,8,9,10], whereas this paper focuses on the emerging field of watermarking using latent diffusion models and provides a detailed analysis of how it differs from existing research.
Traditional watermarking studies have mainly focused on embedding watermarks after image generation or adding watermarks to generated content using frequency domains such as Fourier transform [11,12], DCT [13,14], and DWT [15]. These methods are relatively straightforward to implement but can be vulnerable to various attacks, such as image compression, cropping, and filtering. In addition, previous studies have also explored learning-based watermarking [7,8], where deep learning models are used to embed and extract watermarks through end-to-end training. These approaches often utilize convolutional neural networks (CNNs) to embed watermarks invisibly within an image during post-processing [16,17,18,19]. Learning-based methods have demonstrated improved robustness against certain types of attacks compared to traditional methods, as they can adaptively learn features that are resistant to compression, noise, and other transformations. However, learning-based watermarking still primarily focuses on post-generation embedding, which leaves it vulnerable to removal attacks [20] and may affect the image quality if not carefully optimized. In contrast, watermarking using latent diffusion models offers the potential to overcome these limitations by embedding watermarks in the latent space during the image generation process. This allows for the natural internalization of watermarks within the content generation process, enabling robust watermarking without compromising image quality.
This paper provides an in-depth analysis of how watermarking techniques can be integrated into latent diffusion models. Latent diffusion models have greater computational efficiency compared to traditional deep learning-based generative models and have the potential to generate high-quality images [1]. These characteristics offer significant advantages for applying watermarking techniques. For example, by performing watermarking in the latent space, it is possible to maintain the quality of the generated image while ensuring the robustness of the watermark [21]. This plays an important role in preventing the illegal reproduction or misuse of AI-generated content and clarifying the origin of content. We explore the unique challenges and opportunities presented by diffusion models, and how various parts of the diffusion process can be utilized for watermark embedding and tracing. For example, since the latent space of diffusion models deals with intermediate representations of image generation, watermarking in this space provides a high level of security without affecting the visual quality of the generated image [21,22]. This helps to address the problems encountered in traditional post-generation watermarking methods.
The remainder of this paper is organized as follows: Section 2 presents the LDM architecture and background information. Section 3 reviews recent approaches to watermark embedding in LDMs. Section 4 provides an overview of technical evaluation metrics, including imperceptibility, robustness, and capacity. Section 5 discusses key challenges and open research directions in watermarking for latent diffusion models. Finally, Section 6 concludes the paper with a summary of key findings and future directions.

2. Background

LDMs, such as Stable Diffusion, are powerful generative models that use a latent space to perform high-quality image generation efficiently. LDMs break down the image synthesis process into a sequence of denoising steps, each carried out in a lower-dimensional latent space rather than in pixel space [1]. This architecture helps maintain computational efficiency while preserving high visual quality, which is crucial for generating high-resolution content.

LDM Structure Overview

The architecture of LDMs consists of several core components, each playing a unique role in the generative process. Figure 1 provides an overview of the LDM structure. Understanding these components is vital for identifying suitable points for watermark embedding:
  • Autoencoder (encoder and decoder): The encoder is responsible for transforming the high-dimensional input image into a compact latent representation. This step significantly reduces the data’s dimensionality, allowing for more efficient processing while retaining essential features of the original content [1]. The latent space created by the encoder is crucial as it serves as the stage for embedding watermarks without affecting pixel-level details [22]. The decoder reconstructs the image from the latent representation generated by the encoder. It is trained to reverse the transformation applied by the encoder, ensuring that the reconstructed image maintains high fidelity to the original input. A watermark embedded in the latent space can thus be carried through the entire generation process and preserved in the output image [23].
  • Latent diffusion process: The diffusion process is the heart of LDMs, involving both a forward and a reverse process: (1) Forward diffusion: In this phase, Gaussian noise is progressively added to the latent representation, simulating a diffusion process that leads to increasingly noisy versions of the latent variables. The goal of this phase is to create a path through which the model learns to handle noisy data effectively [1]. (2) Reverse diffusion (Denoising): The model is trained to reverse the forward diffusion by gradually removing noise from the latent representation. The reverse process is a stepwise denoising operation that transforms the noisy latent into a clear, usable representation, which can then be decoded into an image. The iterative nature of this process provides multiple opportunities to embed watermarks within the noise removal steps, potentially making them more resilient [22].
  • UNet-based architecture: The denoising process in LDMs is typically implemented using a U-Net architecture, which is well suited for capturing both local and global features through its use of skip connections [1]. These skip connections allow the network to retain fine details from earlier layers, which is essential for high-quality image reconstruction. For watermarking purposes, the U-Net provides specific layers where watermark information could be injected during the denoising process, ensuring that it becomes an inherent part of the generative flow.
  • Conditioning mechanisms: (1) Cross-attention layers: Many LDMs, such as Stable Diffusion, incorporate conditioning inputs (e.g., text prompts) to guide the generative process [1]. This conditioning is facilitated by cross-attention layers that integrate the conditioning information directly into the diffusion process. These layers provide a strategic point where watermarking can be conditioned on specific inputs, adding flexibility to the watermarking strategy [24]. (2) Latent transformation blocks: These blocks are responsible for modifying latent variables at different stages of the generation process. They present another potential point for watermark embedding, where watermarks could be introduced in a manner that integrates seamlessly with the latent transformations, making them harder to remove [24].
By understanding the detailed structure and functions of LDMs, it becomes clear that there are numerous opportunities for embedding watermarks that are robust, efficient, and imperceptible. The following sections will explore specific methodologies for implementing these watermarking strategies and evaluate their effectiveness in practical scenarios.
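To make the preceding description concrete, the following is a minimal sketch of the forward noising step and a single deterministic DDIM-style reverse step operating on a latent tensor, written in PyTorch. The noise schedule, latent shape, and the random tensor standing in for the U-Net prediction are illustrative assumptions rather than the configuration of any particular LDM.

```python
import torch

# Illustrative linear beta schedule and cumulative alphas (assumed values).
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def forward_diffuse(z0, t, noise=None):
    """Forward diffusion: q(z_t | z_0) = N(sqrt(a_bar_t) * z_0, (1 - a_bar_t) * I)."""
    if noise is None:
        noise = torch.randn_like(z0)
    a_bar = alphas_cumprod[t]
    return a_bar.sqrt() * z0 + (1.0 - a_bar).sqrt() * noise

def ddim_reverse_step(zt, t, t_prev, eps_pred):
    """One deterministic DDIM step: predict z_0 from the noise estimate,
    then re-noise it to the lower noise level t_prev."""
    a_bar_t, a_bar_prev = alphas_cumprod[t], alphas_cumprod[t_prev]
    z0_pred = (zt - (1.0 - a_bar_t).sqrt() * eps_pred) / a_bar_t.sqrt()
    return a_bar_prev.sqrt() * z0_pred + (1.0 - a_bar_prev).sqrt() * eps_pred

# A latent of shape (batch, 4, 64, 64), roughly what the standard Stable
# Diffusion autoencoder produces for a 512x512 image (8x spatial downsampling).
z0 = torch.randn(1, 4, 64, 64)
zt = forward_diffuse(z0, t=500)
eps_pred = torch.randn_like(zt)   # stands in for the U-Net's noise prediction
z_prev = ddim_reverse_step(zt, t=500, t_prev=480, eps_pred=eps_pred)
```

Every quantity in this loop, including the initial noise, the intermediate latents, and the decoder that finally maps the denoised latent back to pixels, is a candidate location for watermark embedding, which is exactly the design space surveyed in Section 3.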

3. Watermarking Approaches

In this section, we analyze the latest research trends in watermarking techniques that operate based on LDMs. LDM watermarking provides a level of flexibility that was previously unattainable by embedding watermarks directly into the image generation process. This high degree of freedom enables more sophisticated integration of watermark signals, ensuring robustness while maintaining generated image quality [23]. Recent studies have introduced effective methods for embedding watermarks at various stages of the LDM generation process, as described below.
Figure 2 presents a chronological overview of the recent watermarking techniques integrated into LDMs, highlighting key methods for embedding watermarks across different stages of the generation process. It includes advancements such as latent space modulation, initial noise manipulation, and decoder fine-tuning, showcasing the diversity and sophistication of recent approaches aimed at ensuring robust and traceable watermarking in AI-generated content.

3.1. LDM-Based Watermarking Categorization

Given the modular architecture of LDMs, several components provide distinct opportunities for embedding watermarks effectively. Below are several approaches that leverage different parts of the LDM architecture for watermark embedding:
(1) Latent space embedding (Latent): This approach embeds a watermark by modifying the latent representation produced by the LDM’s variational autoencoder (VAE). By manipulating the features within the latent representation, it is possible to incorporate watermark signals in a manner that is highly integrated and resilient to attacks [34]. This method ensures that the watermark becomes intrinsic to the latent features, resulting in a robust watermarking strategy that remains invisible to human observers.
(2) Noise removal steps in the diffusion process (Diffusion Process): The iterative nature of the reverse diffusion process in LDMs offers multiple opportunities for embedding watermarks incrementally. By applying watermarking progressively at each noise removal step, the embedded information can achieve redundancy, making it harder to remove during post-processing [24]. This makes the watermark more deeply ingrained in the generated content, improving its resistance to adversarial attacks.
(3) Full LDM model optimization (LDM): Another approach is to fine-tune the entire LDM model, ensuring that the watermark is inherently integrated into the outputs generated by the model. By adjusting the overall generative process, the LDM itself learns to produce images that contain embedded watermark signals [35]. This model-level optimization offers a unified watermarking solution that integrates watermarking into every generated image.
(4) Decoder parameter manipulation (Decoder): This technique involves modifying the parameters of the VAE decoder within the LDM to embed a watermark. By making targeted adjustments to the decoder parameters, it is possible to introduce watermark information directly during the image reconstruction process [36]. This approach takes advantage of the final stage of image generation, ensuring that the watermark is applied as the latent representation is translated into a complete image.
(5) Initial noise modification (Noise): A watermark can also be embedded by altering the distribution of the initial noise used as input to the diffusion process. By slightly changing the characteristics of the initial noise, watermark signals can be subtly incorporated into the generated image [23]. This approach is effective because the watermark is present from the start of the generative process, making it less likely to be fully removed during subsequent steps.
Figure 3 and Figure 4 provide visual summaries of these methods. Figure 3 illustrates the structural insertion points for watermarking within the LDM architecture. It provides a simple diagrammatic explanation of where each approach integrates watermarking into the generative model. Figure 4 shows the research trends for each of these approaches, highlighting the progression and preferences in the use of various watermarking techniques within LDMs.
In addition to utilizing the inherent structure of LDMs, watermarking techniques can also be categorized based on additional training requirements or the use of fine-tuning. Some approaches require extensive retraining or fine-tuning of the model to integrate watermarking seamlessly into the generation process, while others focus on efficiency and adaptability. For instance, plug-and-play methods using adapters allow for the insertion of watermarking capabilities without modifying the entire model architecture [25]. These adapter-based methods provide a practical and flexible solution, enabling watermarking to be implemented without extensive computational resources or significant alterations to the pre-existing LDM framework.

3.2. Watermarking Methods

This section provides an overview of watermarking methods in the context of latent diffusion models, analyzing their techniques, characteristics, and pros and cons. The section is structured to include different watermarking approaches, categorized based on their embedding locations and mechanisms, followed by an analysis of their strengths and limitations.
Table 1 provides an overview of various watermarking techniques applied in latent diffusion models. The Embedding Type refers to the point where the watermark is embedded (see Section 3.1 for more details). Requires Training indicates whether additional training is required for the model or adapter to insert the watermark. Controllable describes whether the insertion of the watermark can be controlled. Specifically, Prompt allows control over watermark insertion using a prompt trigger, while Adaptor enables control through the plug-and-play capability of an adaptor. Controllable means the watermark can be selectively applied based on whether the method is used, whereas Uncontrollable means that the watermark is always embedded in the generated image without the possibility of control. The Task is divided into two categories: Verification, which checks whether an image was generated by the model, and Identification, which embeds a message of a certain length for identifying the user who generated the image. Fine-Tuning indicates whether additional fine-tuning of the generative model is required.

3.2.1. Watermark Embedding Through Latent Space

One of the significant strengths of latent space watermarking is its adaptability across different models and applications. Since the watermark is embedded in the latent space, it can be seamlessly integrated into various versions of the Stable Diffusion model without the need for extensive retraining or fine-tuning [23]. This plug-and-play characteristic not only simplifies the implementation process but also enhances the practicality of deploying robust watermarking in real-world scenarios. Zhang et al. [34] introduce a method for embedding watermarks directly into the latent space during image generation, ensuring invisibility and robustness while maintaining image quality. The framework is adaptable to different versions of Stable Diffusion without performance degradation. Figure 5 provides a comparison of watermarked images. Ci et al. [25] present a lightweight, plug-and-play watermark plugin that enables dynamic watermark embedding without requiring per-watermark fine-tuning. WMAdapter uses a novel contextual adapter structure to efficiently integrate watermark information during image generation, maintaining high image quality while providing robust watermarking. A hybrid fine-tuning strategy further enhances image sharpness and minimizes artifacts, offering strong protection against watermark removal attacks. Meng et al. [37] embed watermarks directly into the latent space, unlike traditional pixel-based approaches. This technique ensures robustness without retraining and avoids degrading image quality by progressively injecting watermarks during forward propagation, making it more resistant to removal and tampering. Embedding watermarks in latent space provides a robust and adaptive solution to watermarking [37]. It ensures that the watermark remains invisible while offering flexibility in embedding strategies, which helps maintain image quality and enhance protection against attacks.
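As a minimal conceptual sketch of the idea these latent-space methods share (not the specific algorithm of [25,34,37]), the snippet below adds a low-amplitude pattern derived from a secret key to the latent before decoding and detects it by correlating a recovered latent with the same pattern. The latent shape, the amplitude alpha, and the detection threshold are illustrative assumptions.

```python
import torch

def embed_latent_watermark(z, key, alpha=0.1):
    """Add a low-amplitude keyed pattern to a latent before it is decoded.

    z:   latent tensor, e.g. of shape (1, 4, 64, 64)
    key: integer seed known only to the watermark owner
    """
    g = torch.Generator().manual_seed(key)
    pattern = torch.randn(z.shape, generator=g)   # secret watermark direction
    return z + alpha * pattern

def detect_latent_watermark(z_recovered, key, threshold=0.05):
    """Detect by correlating a recovered latent with the secret pattern."""
    g = torch.Generator().manual_seed(key)
    pattern = torch.randn(z_recovered.shape, generator=g)
    score = (z_recovered * pattern).mean() / pattern.std()
    return score.item(), score.item() > threshold

# In a real pipeline z would come from the diffusion sampler and z_recovered
# from re-encoding the published image with the VAE encoder.
z = torch.randn(1, 4, 64, 64)
z_wm = embed_latent_watermark(z, key=1234)
score, detected = detect_latent_watermark(z_wm, key=1234)
```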

3.2.2. Watermark Embedding via Noise Removal in Diffusion Process

Watermarking during the diffusion sampling process involves modifying the sampling equations to incorporate watermark information directly into the generated data. This approach ensures that the watermark is seamlessly embedded during sampling, making it highly resilient to removal attempts. Liu et al. [38] introduce a novel approach for diffusion models on convex constrained sets. Mirror maps transform data from constrained sets to Euclidean space, enabling efficient diffusion without performance loss. The watermarking scheme modifies the sampling equations to add user-defined tokens that create watermarked outputs, ensuring that only authorized users can verify the watermark, thus providing secure generation with certifiable watermarks. By modifying the sampling equations, these methods embed watermarks directly into the generated output during sampling, ensuring traceability and robustness while preserving the quality of the generated content.
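As a toy illustration of folding watermark information into the sampling update itself (this is not the mirror-map construction of [38]), the function below augments a deterministic DDIM step with a small drift in a direction derived from a secret key; accumulated over many steps, the drift acts as a redundant, trajectory-level watermark signal. The drift magnitude gamma and the key handling are assumptions for illustration.

```python
import torch

def watermarked_ddim_step(zt, t, t_prev, eps_pred, alphas_cumprod, key, gamma=0.01):
    """Standard deterministic DDIM update plus a small keyed drift term."""
    a_bar_t, a_bar_prev = alphas_cumprod[t], alphas_cumprod[t_prev]
    z0_pred = (zt - (1.0 - a_bar_t).sqrt() * eps_pred) / a_bar_t.sqrt()
    z_prev = a_bar_prev.sqrt() * z0_pred + (1.0 - a_bar_prev).sqrt() * eps_pred
    g = torch.Generator().manual_seed(key)
    drift = torch.randn(zt.shape, generator=g)    # fixed secret direction
    return z_prev + gamma * drift
```

Verification would then test whether the published image, or its inverted latent trajectory, correlates with the secret drift direction.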

3.2.3. Watermark Embedding Through Full LDM Optimization

Watermarking in LDMs focuses on embedding secure and traceable watermarks directly during the generative process. This integrated approach ensures that the watermark becomes an inherent part of the generated image, providing robustness against erasure and modification while maintaining high image quality. Ma et al. [26] present a framework for embedding graphical watermarks, such as QR codes, directly into pixel structures during the generative process of the Stable Diffusion model. Unlike post-processing methods vulnerable to attacks, this approach integrates watermarking into the diffusion steps, enhancing robustness against erasure or modification. The model optimizes watermark injection and detection within a single network, improving both efficiency and traceability while ensuring high-fidelity image generation.
To protect the intellectual property of latent diffusion models, Liu et al. [35] propose NAIVEWM and FIXEDWM for embedding watermarks. NAIVEWM activates the watermark using a pre-defined trigger word, while FIXEDWM provides higher stealthiness by activating the watermark only when the trigger is in a specific position. Both methods exhibit minimal performance degradation and robust resistance to attacks, enabling owners to verify model ownership without compromising functionality. Yuan et al. [31] fine-tune the text encoder and conditional diffusion model in two stages, enabling the model to generate a predefined watermark when a specific trigger prompt is used. This method does not require access to the original training data or model parameters, making it practical and cost-effective for copyright protection. The watermark embedding preserves model functionality and allows for ownership verification through third-party validation without impairing the generation of unrelated images. Zhao et al. [27] provide a comprehensive approach for embedding watermarks, covering two pipelines: one for unconditional/class-conditional diffusion models and another for text-to-image models. For smaller models, the watermark is embedded directly during training, while larger models, such as Stable Diffusion, are fine-tuned using a trigger prompt and watermark image. This flexible method ensures robust copyright protection and effective detection of generated content. Xiong et al. [39] introduce an end-to-end watermarking approach that addresses the limitations of existing methods for embedding messages in generated images. The proposed method uses an encoder-decoder architecture and a message matrix to embed dynamic, changeable messages during the image generation process. Unlike previous methods that require retraining to change the embedded message, this approach allows flexible message modification using a message encoder. Additionally, a built-in security mechanism prevents bypassing the message matrix during image generation, providing strong protection against tampering. Min et al. [24] introduce WaDiff, a framework that embeds user-specific fingerprints into images during generation. This approach enables AI-generated content detection and ownership identification while maintaining image quality across different users.
To explore alternative approaches to watermarking, Desu et al. [40] introduce a method for verifying content ownership of generative models by leveraging inherent fingerprints in model outputs. Unlike the methods described earlier, this approach regenerates outputs to compare their similarity to the original. The model’s unique signature is amplified through iterative regeneration, allowing reliable authorship verification. This approach avoids modifying the model or generated content while providing a robust solution for intellectual property protection. Although this deviates somewhat from the full LDM optimization approach discussed in this section, this methodology, which creatively utilizes the LDM structure, is worth mentioning.
Optimizing watermark embedding across the entire latent diffusion model, though it requires higher training costs, allows for the creation of extremely resilient and robust watermarks. This method ensures excellent protection against tampering and enhances the ability to recover the watermark, providing stronger security and traceability for the generated images.

3.2.4. Watermark Embedding via Decoder Parameter Manipulation

Another key area of watermarking in diffusion models involves embedding watermarks in the decoder stage to maintain secure and traceable image generation [36]. These methods focus on modifying the decoder component to embed watermarks, providing an additional layer of traceability.
Fernandez et al. [28] introduce a watermarking method specifically designed for latent diffusion models. By fine-tuning the VAE decoder, Stable Signature embeds invisible watermarks into images during generation. These watermarks are resistant to removal and can verify the model used to generate the image, embedding unique user identifiers without degrading quality. Kim et al. [32] present a method for embedding unique fingerprints into generative models like Stable Diffusion. It allows model distributors to create user-specific models by modulating the model’s weights, enabling tracing of generated images back to individual users. This prevents malicious use of the models and ensures high attribution accuracy without compromising quality. Rezaei et al. [41] present a watermarking method that integrates watermarking directly into the image generation process using multi-scale embedding modules. This ensures robustness against various image transformations while maintaining high perceptual quality. LaWa also functions as a post-generation watermarking method without fine-tuning, offering flexibility for embedding watermarks in generated content. Embedding watermarks at the decoder stage provides an additional layer of security, enabling traceability of generated content back to its origins, ensuring accountability, and deterring unauthorized usage, all while maintaining a low computational cost during training [36].
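The following is a compressed, conceptual sketch of the decoder fine-tuning idea used by methods such as Stable Signature [28]: the decoder is optimized so that a frozen watermark extractor recovers a fixed bit string from every decoded image, while an image-space term keeps its outputs close to those of the original decoder. TinyDecoder and TinyExtractor are small stand-ins for the real VAE decoder and a pretrained extractor (e.g., a HiDDeN-style network [16]); the architectures and loss weights are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDecoder(nn.Module):
    """Stand-in for the VAE decoder: latent (4, 8, 8) -> image (3, 64, 64)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(4, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Tanh())
    def forward(self, z):
        return self.net(z)

class TinyExtractor(nn.Module):
    """Stand-in for a frozen, pretrained watermark extractor (k bit logits)."""
    def __init__(self, k=48):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, k))
    def forward(self, x):
        return self.net(x)

k = 48
message = torch.randint(0, 2, (1, k)).float()          # user-specific bit string
decoder, frozen_decoder, extractor = TinyDecoder(), TinyDecoder(), TinyExtractor(k)
frozen_decoder.load_state_dict(decoder.state_dict())   # reference copy of the decoder
for p in list(extractor.parameters()) + list(frozen_decoder.parameters()):
    p.requires_grad_(False)

opt = torch.optim.Adam(decoder.parameters(), lr=1e-4)
for step in range(100):                                 # toy fine-tuning loop
    z = torch.randn(8, 4, 8, 8)                         # latents from the sampler
    img = decoder(z)
    with torch.no_grad():
        img_ref = frozen_decoder(z)
    msg_loss = F.binary_cross_entropy_with_logits(extractor(img), message.expand(8, -1))
    img_loss = F.mse_loss(img, img_ref)                 # stay close to the original decoder
    loss = msg_loss + 0.1 * img_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```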

3.2.5. Watermark Embedding Through Initial Noise Modification

Another approach to watermarking focuses on embedding information into the initial noise used during the diffusion process. These methods utilize the properties of the initial noise to ensure that the embedded watermark is imperceptible while being resilient against removal attempts.
Wen et al. [23] introduce Tree-Ring Watermarking, a method that embeds watermarks into the Fourier space of the noise used during image generation. It does not require additional training and is highly robust against transformations such as rotations, crops, and noise. The watermark is invisible to human observers and can only be detected by inverting the diffusion process to retrieve the initial noise. Varlamov et al. [29] build on Tree-Ring Watermarking by modifying the initial noise distribution in Fourier space. METR allows embedding multiple unique binary messages as concentric circles in the latent noise, providing flexibility in encoding large amounts of data while maintaining invisibility and resilience to attacks. Tan et al. [30] leverage pretrained diffusion probabilistic models to embed watermarks using the Discrete Wavelet transform. By embedding watermark data in high-frequency sub-band coefficients, WaterDiff ensures robustness and imperceptibility. The reversible nature of diffusion models allows for watermark extraction through a reverse process, achieving a balance between watermark capacity, imperceptibility, and resistance to attacks like JPEG compression.
Yang et al. [22] embed watermarks into the latent space without degrading image quality. Mapping watermarks to Gaussian latent representations maintains the integrity of generated images while ensuring provable detection. The method is resistant to compression attacks, ensuring robust watermarking without performance loss. Zhang et al. [42] introduce ZoDiac, a watermarking framework that embeds invisible watermarks into the latent space of images using pre-trained Stable Diffusion models. ZoDiac is highly resistant to removal, even by advanced attacks, and requires no model retraining. The watermark is injected into the Fourier space of the latent vector, with detection performed using a DDIM inversion, ensuring robustness against multiple removal attempts. Lei et al. [21] embed invisible watermarks into latent variables during the sampling phase of image generation, ensuring resilience against various image processing attacks without compromising quality or requiring retraining. The watermark can be flexibly modified and traced without affecting model performance. Embedding watermarks into the initial noise used in diffusion models ensures that the watermark remains imperceptible while being deeply integrated into the generative process. This method provides a secure and resilient watermarking solution that is effective against various image processing transformations [23].
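To illustrate the flavor of these initial-noise approaches, the sketch below, loosely following the Tree-Ring idea [23], writes a ring-shaped key into the Fourier spectrum of one channel of the initial Gaussian noise; at verification time the published image would first be inverted (e.g., via DDIM inversion) to estimate that noise, and its spectrum is then compared against the key. The ring radii, the constant-valued key, and the detection threshold are illustrative choices rather than the authors' exact construction.

```python
import torch

def ring_mask(size=64, r_inner=10, r_outer=14):
    """Boolean mask selecting a ring of Fourier coefficients (centered spectrum)."""
    ys, xs = torch.meshgrid(torch.arange(size), torch.arange(size), indexing="ij")
    r = ((ys - size // 2) ** 2 + (xs - size // 2) ** 2).float().sqrt()
    return (r >= r_inner) & (r < r_outer)

def embed_ring_watermark(noise, key_value=0.0, channel=0):
    """Overwrite ring coefficients of one latent channel in Fourier space."""
    mask = ring_mask(noise.shape[-1])
    spec = torch.fft.fftshift(torch.fft.fft2(noise[:, channel]), dim=(-2, -1))
    spec[:, mask] = key_value                       # constant-valued ring as the key
    noise = noise.clone()
    noise[:, channel] = torch.fft.ifft2(torch.fft.ifftshift(spec, dim=(-2, -1))).real
    return noise

def detect_ring_watermark(recovered_noise, channel=0, threshold=0.5):
    """Low energy inside the ring indicates the zero-valued key is present."""
    mask = ring_mask(recovered_noise.shape[-1])
    spec = torch.fft.fftshift(torch.fft.fft2(recovered_noise[:, channel]), dim=(-2, -1))
    ring = spec[:, mask].abs().mean().item()
    rest = spec[:, ~mask].abs().mean().item()
    return ring < threshold * rest

# The watermarked noise seeds the sampler; verification inverts the published
# image back to an estimate of this noise and re-applies the detector.
z_T = torch.randn(1, 4, 64, 64)
z_T_wm = embed_ring_watermark(z_T)
assert detect_ring_watermark(z_T_wm) and not detect_ring_watermark(z_T)
```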

4. Technical Review and Discussion

4.1. Evaluation Metrics

In this section, we discuss the various evaluation metrics used to assess the effectiveness of watermarking approaches for LDMs. These metrics are critical for understanding the performance of watermarking schemes in terms of imperceptibility, robustness, and capacity. The following subsections provide an overview of the evaluation metrics used in each of these categories, highlighting their relevance to LDM-based watermarking and contrasting them with traditional approaches.

4.1.1. Imperceptibility Evaluation for Watermarking in Latent Diffusion Models

Imperceptibility refers to how well the embedded watermark is hidden within the generated content without affecting its visual quality. The primary goal of an imperceptible watermark is to remain invisible to the human eye while maintaining the aesthetics and quality of the generated image. Traditionally, metrics such as Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), and Mean Squared Error (MSE) are employed to quantify visual quality differences between original and watermarked images. However, watermarking in generative model domains like LDMs does not involve modifying an original cover image directly, making traditional comparison metrics less applicable (See Figure 6).
Unlike traditional metrics, which provide specific numerical scores for image quality, embedding transparency in the generative setting is assessed qualitatively: since there is no cover image to measure against, the aim is to confirm that watermark embedding causes minimal alteration to the generated output. To properly evaluate imperceptibility within LDMs, quantitative approaches such as Learned Perceptual Image Patch Similarity (LPIPS) [43], Fréchet Inception Distance (FID) [44], or human perceptual studies can be used alongside this qualitative assessment to report concrete results or scores [23].
These evaluation strategies and metrics are crucial for assessing how well watermarking techniques preserve the original quality of generated images. Unlike conventional watermarking, where an original cover image is directly modified, LDM watermarking must ensure that the watermark is seamlessly integrated during the latent generation process, resulting in high-quality outputs that do not raise suspicion.
Given these strategies, recent research has developed alternative evaluation methods specific to the generative context:
  • Perceptual consistency metrics: In studies such as [37], the focus is on maintaining consistency in high-level semantics rather than individual pixel values. Metrics like LPIPS are often utilized. LPIPS leverages deep neural network features to determine perceptual differences, providing a more reliable measure of how similar two images appear to a human observer when traditional pixel-based differences are not available [45].
  • FID: The FID score has also been adopted as a proxy for evaluating imperceptibility in generative contexts. It measures the distance between feature representations of watermarked and non-watermarked images using an Inception network, thus quantifying how closely the distribution of watermarked images matches that of non-watermarked ones. This helps in ensuring that the introduction of a watermark does not degrade the quality of generated images perceptually [29].
  • Human evaluation studies: Some works also incorporate user studies as a part of their imperceptibility evaluation. Participants are asked to differentiate between watermarked and non-watermarked images in a controlled setting to determine if watermarks are noticeable. For example, in the DiffuseTrace [21], this evaluation approach was used to empirically validate the transparency of embedded watermarks across different types of generative content.
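The snippet below illustrates how such scores are typically computed in practice, using the third-party lpips and torchmetrics packages (both download pretrained feature networks on first use); the random tensors merely stand in for batches of non-watermarked and watermarked generations, and the reduced FID feature dimension only keeps the toy example light.

```python
import torch
import lpips                                                   # pip install lpips
from torchmetrics.image.fid import FrechetInceptionDistance   # pip install torchmetrics

# Stand-ins for non-watermarked and watermarked generations:
# float tensors in [0, 1] with shape (N, 3, H, W).
clean = torch.rand(16, 3, 256, 256)
marked = (clean + 0.01 * torch.randn_like(clean)).clamp(0, 1)

with torch.no_grad():
    # LPIPS: paired perceptual distance; the network expects inputs in [-1, 1].
    lpips_fn = lpips.LPIPS(net="alex")
    lpips_score = lpips_fn(clean * 2 - 1, marked * 2 - 1).mean()

    # FID: distributional distance between the two sets (no pairing required).
    # feature=64 keeps the toy example light; 2048 is the standard setting.
    fid = FrechetInceptionDistance(feature=64, normalize=True)
    fid.update(clean, real=True)
    fid.update(marked, real=False)
    fid_score = fid.compute()

print(f"LPIPS: {lpips_score.item():.4f}  FID: {fid_score.item():.2f}")
```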

4.1.2. Evaluation Metrics for Robustness

Robustness refers to the watermark’s resilience to various types of attacks, such as noise addition, compression, or transformations. Traditional robustness metrics often focus on common image processing attacks; however, LDM-based watermarking needs to account for the diffusion model’s generative process and potential adversarial manipulation in the latent space.
  • Bit error rate (BER): BER is used to quantify the robustness of the watermark by measuring how many bits differ between the original and extracted watermark [46]. A low BER implies that the watermark is resilient against attacks like cropping, rotation, and compression.
  • Normalized correlation (NC): NC measures the similarity between the embedded and extracted watermarks after the image has undergone attacks [46]. It helps in determining the extent to which the watermark remains intact despite modifications. A high NC value indicates that the watermark is successfully retrieved and is resistant to degradation.
  • Attack simulation tests: The watermark robustness is tested under several attack scenarios, including JPEG compression, Gaussian noise addition, cropping, and brightness adjustment [46]. These tests ensure that the embedded watermark can withstand transformations that are likely to occur in real-world applications. Different studies use various image attacks, including destructive, constructive, and reconstructive attacks, to evaluate the resilience of the watermark.
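A minimal sketch of how these metrics and attack simulations fit together is shown below; because the watermark extractor itself is method-specific, the example feeds hand-made bit strings into the metrics and shows two representative attacks (JPEG re-encoding and Gaussian noise) implemented with NumPy and Pillow.

```python
import io
import numpy as np
from PIL import Image

def bit_error_rate(original_bits, extracted_bits):
    """Fraction of watermark bits that differ after extraction."""
    return float(np.mean(np.asarray(original_bits) != np.asarray(extracted_bits)))

def normalized_correlation(w_embedded, w_extracted):
    """NC between embedded and extracted watermark signals (1.0 means intact)."""
    a = np.asarray(w_embedded, dtype=np.float64).ravel()
    b = np.asarray(w_extracted, dtype=np.float64).ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def jpeg_attack(img, quality=50):
    """Round-trip an RGB uint8 image through JPEG at the given quality."""
    buf = io.BytesIO()
    Image.fromarray(img).save(buf, format="JPEG", quality=quality)
    return np.array(Image.open(buf))

def gaussian_noise_attack(img, sigma=5.0):
    noisy = img.astype(np.float64) + np.random.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Example: a 48-bit message recovered with two flipped bits after an attack.
msg = np.random.randint(0, 2, 48)
recovered = msg.copy()
recovered[[3, 17]] ^= 1
print(bit_error_rate(msg, recovered))          # 2/48, roughly 0.042
print(normalized_correlation(msg, recovered))

# Attacked image ready to be passed to a method-specific extractor.
watermarked = (np.random.rand(128, 128, 3) * 255).astype(np.uint8)
attacked = gaussian_noise_attack(jpeg_attack(watermarked, quality=50))
```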

4.1.3. Evaluation Metrics for Capacity

Capacity metrics measure how much information can be embedded within an image while maintaining the watermark’s imperceptibility and robustness. Traditional watermarking usually involves a balance between capacity and imperceptibility, often limited by pixel-level distortions. In contrast, LDM watermarking operates in latent space, which offers opportunities for embedding larger amounts of information without affecting visual quality directly.
  • Payload Size: Payload size is the total number of bits that can be embedded within an image. For LDM watermarking, this is a critical parameter, as a higher payload means more information can be embedded without compromising the image quality. For instance, in the Gaussian Shading method [22], the watermarking capacity is quantified in bits, and the authors demonstrate successful embedding of up to 256 bits per image without visible quality degradation.
  • Embedding Rate (Bits per Pixel—bpp): This metric evaluates the density of information embedded in an image, usually expressed as bits per pixel [46]. A higher embedding rate indicates that more watermark information is present per unit of image data, which can affect robustness and image quality if not properly balanced.
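The embedding rate follows directly from the payload and the output resolution, as in the short helper below; the 512 x 512 output size is assumed purely for illustration.

```python
def embedding_rate_bpp(payload_bits: int, width: int, height: int) -> float:
    """Embedding rate in bits per pixel (bpp) for a given payload and image size."""
    return payload_bits / (width * height)

# A 256-bit payload (the capacity reported for Gaussian Shading [22]) in a
# hypothetical 512x512 output:
print(f"{embedding_rate_bpp(256, 512, 512):.6f} bpp")   # about 0.000977 bpp
```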

4.2. Datasets and Open-Source Implementations

Datasets play an essential role in evaluating watermarking methods for LDMs, providing a basis for assessing imperceptibility, robustness, and security under diverse conditions. The following datasets are commonly used in the evaluation of LDM-based watermarking:
  • MS-COCO [47]: The MS-COCO dataset, which contains over 328K images, is used extensively to evaluate the effectiveness of watermarking schemes. For LDM watermarking, a subset of 500 images is often used to assess both the imperceptibility and robustness of embedded watermarks.
  • StableDiffusionDB [48]: This dataset includes images generated by Stable Diffusion based on prompts provided by users. It contains around 500 samples and is used for evaluating the quality of watermarked images under generative settings. The diverse nature of the images, influenced by user-generated prompts, allows for comprehensive robustness testing.
  • Flickr30k [49]: This dataset is utilized for evaluating latent watermarking schemes. With captions provided for each image, it allows researchers to explore how well the watermark remains embedded in images generated under different textual conditions. Latent Watermark has demonstrated its effectiveness using this dataset, showing higher robustness compared to earlier methods.
Open-source implementations are vital for promoting transparency and reproducibility in the development of watermarking techniques for LDMs. Table 2 shows key open-source projects related to LDM watermarking. These datasets and implementations provide a foundation for evaluating the practicality and robustness of different watermarking methods in latent diffusion models, enabling broader community involvement and further advancements in watermarking technology.

5. Challenges and Open Directions

5.1. Architectural Innovations and Watermark Integration

The integration of watermarking modules into the latent diffusion model architecture presents a series of complex challenges. Traditional watermarking methods typically operate in the pixel space, leading to a trade-off between robustness and image quality. The use of latent space for watermark embedding, as discussed in Latent Watermark [37] and DiffuseTrace [21], introduces the potential for maintaining high-quality image generation while embedding robust watermarks. However, architectural innovations are required to make these embeddings more efficient without increasing the computational cost of model training. Recent approaches like WMAdapter [25] suggest using plug-and-play modules that allow for flexible integration without retraining, and this approach could pave the way for more dynamic watermarking strategies. Additionally, Tree-Ring [23] proposes a training-free and model-agnostic approach that offers robustness in watermark embedding, which could be valuable in future architectural designs.

5.2. Enhancing Robustness Against Adversarial Attacks

The robustness of watermarking in latent diffusion models against adversarial attacks remains a significant challenge. Research shows that latent diffusion watermarking methods, such as those embedded during the diffusion process, are susceptible to attacks like latent noise manipulation, variational autoencoder-based reconstructions, and advanced adversarial perturbations [21]. Methods such as [21,37] are advancing towards robustness by embedding watermarks within latent representations in a way that maintains resistance to various attack vectors. Furthermore, adversarial examples such as those discussed in [50,51] demonstrate how adversaries might exploit the weaknesses of watermarking methods. Future work should focus on developing advanced training techniques that involve adversarial training specific to the generative processes of diffusion models, improving the resilience of watermarks embedded in both latent and pixel domains.

5.3. Balancing Imperceptibility and Detectability

Achieving a balance between imperceptibility and detectability is a fundamental challenge for watermarking techniques. Ideally, watermarks should remain invisible to human observers while being easily detectable by automated systems. Techniques like Gaussian Shading [22] provide a promising solution by embedding watermarks in the latent space using standard Gaussian distributions, thereby avoiding direct pixel perturbations that might degrade image quality. Additionally, the WaterDiff approach [30] integrates perceptual metrics to ensure minimal perceptual difference between watermarked and non-watermarked content. Furthermore, AEROBLADE [52] uses autoencoder reconstruction errors to enhance the detectability of latent watermarks without relying on supervised training, thus improving traceability while preserving imperceptibility. Future work should aim to improve these approaches by integrating perceptual metrics directly into the training process, allowing watermarks to adapt dynamically to different content types while maintaining a consistent level of imperceptibility.

5.4. Scalable Watermarking Solutions for Diverse Users

The ability to provide scalable watermarking solutions for a large number of users without needing to retrain the model is an ongoing challenge. Methods like The Stable Signature [28] require fine-tuning of the model for each unique watermark, which becomes impractical in real-world scenarios involving numerous users. The development of plug-and-play watermark modules, as seen in DiffuseTrace [21], allows for a more scalable approach by embedding multi-bit watermarks in latent variables during the image generation process. WMAdapter [25] also demonstrates a method for incorporating watermarking as an adaptable layer that can be added without retraining. However, the stability of these approaches can be a concern, as highlighted in [36], which emphasizes the need for more stable solutions against advanced removal techniques. Future research should explore modular watermarking strategies that can dynamically assign user-specific watermarks while maintaining robustness across different versions of the model.

5.5. Challenges in Payload Capacity and Security

Payload capacity is another major challenge when designing watermarking solutions for latent diffusion models. Traditional methods are often limited by the amount of data that can be embedded without compromising the quality of the output image. For LDMs, the latent space offers an opportunity to embed larger payloads while maintaining imperceptibility. However, maximizing payload capacity often risks reducing the robustness of the watermark. Techniques like METR [29] have made strides in embedding larger messages, but there is still a need for more efficient payload management strategies that can balance capacity with resilience to various attacks.
Security, in terms of preventing unauthorized detection or removal of the watermark, is also a critical concern. Many current methods rely heavily on key-based security, such as using secret keys to embed and extract watermarks securely. Safe-SD [26] introduces a text-prompt-triggered mechanism to enhance watermark security by using secure prompts to verify content authenticity. Future directions should explore hybrid security methods that combine key-based techniques with stochastic embedding processes, making it more difficult for adversaries to reverse-engineer or remove the watermark.

5.6. Open Directions for Enhancing Traceability and Copyright Protection

Traceability and copyright protection are essential for combating the misuse of generative content. Safe-SD [26] introduces a watermark-trigger mechanism using text prompts to enhance traceability, while ProMark [53] provides proactive watermarking strategies aimed at causal attribution. The work by Wang et al. [54] proposes tracing mechanisms that do not require artificial watermarking, offering a different perspective on generative model traceability. Current watermarking approaches need to evolve to handle increasingly sophisticated attacks that aim to remove or obscure watermarks, such as diffusion-based adversarial attacks or latent variable manipulations. Methods like watermark-embedded adversarial examples [55] have shown promise in enhancing copyright protection, but their implementation is currently limited by scalability and computational efficiency concerns. Future research could explore hybrid techniques combining watermarking with other traceability methods like model fingerprinting [56] or blockchain-based logging [57] to ensure a secure and transparent trail of content ownership.

6. Conclusions

Watermarking for latent diffusion models is a rapidly evolving field, driven by the need for effective copyright protection and the prevention of the misuse of generative content. Despite significant advancements, several challenges remain, including enhancing the robustness of embedded watermarks, achieving a balance between imperceptibility and detectability, increasing payload capacity without compromising quality, and developing scalable watermarking solutions. Future research directions should focus on adversarially robust training techniques, dynamic watermarking strategies for multi-user environments, and hybrid approaches that integrate watermarking with other traceability mechanisms. As generative models continue to gain traction across industries, addressing these challenges will be crucial for ensuring the ethical use of AI-generated content. By continuing to innovate in watermarking techniques for latent diffusion models, we can move towards more secure, transparent, and responsible applications of generative technologies.

Author Contributions

H.H.: conceptualization, methodology, writing—original draft; M.K.: conceptualization, data curation, writing—original draft; S.S.: formal analysis, data curation, writing—original draft; J.-U.H.: supervision, writing—original draft. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by a Hallym University Research Fund, 2023 (HRF-202312-006).

Acknowledgments

We would like to acknowledge the use of ChatGPT, developed by OpenAI, for assisting with the English editing and refinement of this manuscript. Its contributions have helped enhance the clarity and precision of our content.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-resolution image synthesis with latent diffusion models. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE Computer Society: Los Alamitos, CA, USA, 2022; pp. 10684–10695. [Google Scholar]
  2. Brundage, M.; Avin, S.; Clark, J.; Toner, H.; Eckersley, P.; Garfinkel, B.; Dafoe, A.; Scharre, P.; Zeitzoff, T.; Filar, B.; et al. The malicious use of artificial intelligence: Forecasting, prevention, and mitigation. arXiv 2018, arXiv:1802.07228. [Google Scholar]
  3. Bender, E.M.; Gebru, T.; McMillan-Major, A.; Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the FAccT ’21: 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual Event, 3–10 March 2021; pp. 610–623. [Google Scholar]
  4. Grinbaum, A.; Adomaitis, L. The Ethical Need for Watermarks in Machine-Generated Language. arXiv 2022, arXiv:2209.03118. [Google Scholar]
  5. Dong, S.; Zhang, R.; Liu, J. Invisible steganography via generative adversarial network. arXiv 2018, arXiv:1807.08571. [Google Scholar]
  6. Singh, P.; Chadha, R.S. A survey of digital watermarking techniques, applications and attacks. Int. J. Eng. Innov. Technol. (IJEIT) 2013, 2, 165–175. [Google Scholar]
  7. Hosny, K.M.; Magdi, A.; ElKomy, O.; Hamza, H.M. Digital image watermarking using deep learning: A survey. Comput. Sci. Rev. 2024, 53, 100662. [Google Scholar] [CrossRef]
  8. Li, Y.; Wang, H.; Barni, M. A survey of deep neural network watermarking techniques. Neurocomputing 2021, 461, 171–193. [Google Scholar] [CrossRef]
  9. Wang, Z.; Byrnes, O.; Wang, H.; Sun, R.; Ma, C.; Chen, H.; Wu, Q.; Xue, M. Data hiding with deep learning: A survey unifying digital watermarking and steganography. IEEE Trans. Comput. Soc. Syst. 2023, 10, 2985–2999. [Google Scholar] [CrossRef]
  10. Amrit, P.; Singh, A.K. Survey on watermarking methods in the artificial intelligence domain and beyond. Comput. Commun. 2022, 188, 52–65. [Google Scholar] [CrossRef]
  11. Tsui, T.K.; Zhang, X.P.; Androutsos, D. Color image watermarking using multidimensional Fourier transforms. IEEE Trans. Inf. Forensics Secur. 2008, 3, 16–28. [Google Scholar] [CrossRef]
  12. Fares, K.; Amine, K.; Salah, E. A robust blind color image watermarking based on Fourier transform domain. Optik 2020, 208, 164562. [Google Scholar] [CrossRef]
  13. Barni, M.; Bartolini, F.; Cappellini, V.; Piva, A. A DCT-domain system for robust image watermarking. Signal Process. 1998, 66, 357–372. [Google Scholar] [CrossRef]
  14. Bors, A.G.; Pitas, I. Image watermarking using DCT domain constraints. In Proceedings of the 3rd IEEE International Conference on Image Processing, Lausanne, Switzerland, 19 September 1996; Volume 3, pp. 231–234. [Google Scholar]
  15. Al-Haj, A. Combined DWT-DCT digital image watermarking. J. Comput. Sci. 2007, 3, 740–746. [Google Scholar] [CrossRef]
  16. Zhu, J.; Kaplan, R.; Johnson, J.; Li, F.-F. HiDDeN: Hiding Data with Deep Networks. In Proceedings of the 15th European Conference, Munich, Germany, 8–14 September 2018. [Google Scholar]
  17. Jia, Z.; Fang, H.; Zhang, W. MBRS: Enhancing robustness of dnn-based watermarking by mini-batch of real and simulated jpeg compression. In Proceedings of the 29th ACM International Conference on Multimedia, Virtual Event, 20–24 October 2021; pp. 41–49. [Google Scholar]
  18. Zhang, C.; Benz, P.; Karjauv, A.; Sun, G.; Kweon, I.S. UDH: Universal deep hiding for steganography, watermarking, and light field messaging. In Proceedings of the NIPS’20: 34th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 6–12 December 2020; pp. 10223–10234. [Google Scholar]
  19. Zhong, X.; Huang, P.C.; Mastorakis, S.; Shih, F.Y. An automated and robust image watermarking scheme based on deep neural networks. IEEE Trans. Multimed. 2020, 23, 1951–1961. [Google Scholar] [CrossRef]
  20. Chen, X.; Wang, W.; Bender, C.; Ding, Y.; Jia, R.; Li, B.; Song, D. REFIT: A unified watermark removal framework for deep learning systems with limited data. In Proceedings of the ASIA CCS ’21: ACM Asia Conference on Computer and Communications Security, Virtual Event, 7–11 June 2021; pp. 321–335. [Google Scholar]
  21. Lei, L.; Gai, K.; Yu, J.; Zhu, L. Diffusetrace: A transparent and flexible watermarking scheme for latent diffusion model. arXiv 2024, arXiv:2405.02696. [Google Scholar]
  22. Yang, Z.; Zeng, K.; Chen, K.; Fang, H.; Zhang, W.; Yu, N. Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 12162–12171. [Google Scholar]
  23. Wen, Y.; Kirchenbauer, J.; Geiping, J.; Goldstein, T. Tree-rings watermarks: Invisible fingerprints for diffusion images. In Proceedings of the NIPS’23: 37th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023. [Google Scholar]
  24. Min, R.; Li, S.; Chen, H.; Cheng, M. A watermark-conditioned diffusion model for ip protection. arXiv 2024, arXiv:2403.10893. [Google Scholar]
  25. Ci, H.; Song, Y.; Yang, P.; Xie, J.; Shou, M.Z. WMAdapter: Adding WaterMark Control to Latent Diffusion Models. arXiv 2024, arXiv:2406.08337. [Google Scholar]
  26. Ma, Z.; Jia, G.; Qi, B.; Zhou, B. Safe-SD: Safe and Traceable Stable Diffusion with Text Prompt Trigger for Invisible Generative Watermarking. arXiv 2024, arXiv:2407.13188. [Google Scholar]
  27. Zhao, Y.; Pang, T.; Du, C.; Yang, X.; Cheung, N.M.; Lin, M. A recipe for watermarking diffusion models. arXiv 2023, arXiv:2303.10137. [Google Scholar]
  28. Fernandez, P.; Couairon, G.; Jégou, H.; Douze, M.; Furon, T. The stable signature: Rooting watermarks in latent diffusion models. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 22466–22477. [Google Scholar]
  29. Varlamov, A.; Diatlova, D.; Spirin, E. METR: Image Watermarking with Large Number of Unique Messages. arXiv 2024, arXiv:2408.08340. [Google Scholar]
  30. Tan, Y.; Peng, Y.; Fang, H.; Chen, B.; Xia, S.T. WaterDiff: Perceptual Image Watermarks Via Diffusion Model. In Proceedings of the ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 14–19 April 2024; pp. 3250–3254. [Google Scholar]
  31. Yuan, Z.; Li, L.; Wang, Z.; Zhang, X. Watermarking for Stable Diffusion Models. IEEE Internet Things J. 2024, 11, 35238–35249. [Google Scholar] [CrossRef]
  32. Kim, C.; Min, K.; Patel, M.; Cheng, S.; Yang, Y. WOUAF: Weight modulation for user attribution and fingerprinting in text-to-image diffusion models. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 8974–8983. [Google Scholar]
  33. Wu, H.; Liu, G.; Yao, Y.; Zhang, X. Watermarking neural networks with watermarked images. IEEE Trans. Circuits Syst. Video Technol. 2020, 31, 2591–2601. [Google Scholar] [CrossRef]
  34. Zhang, G.; Wang, L.; Su, Y.; Liu, A.A. A Training-Free Plug-and-Play Watermark Framework for Stable Diffusion. arXiv 2024, arXiv:2404.05607. [Google Scholar]
  35. Liu, Y.; Li, Z.; Backes, M.; Shen, Y.; Zhang, Y. Watermarking diffusion model. arXiv 2023, arXiv:2305.12502. [Google Scholar]
  36. Hu, Y.; Jiang, Z.; Guo, M.; Gong, N. Stable Signature is Unstable: Removing Image Watermark from Diffusion Models. arXiv 2024, arXiv:2405.07145. [Google Scholar]
  37. Meng, Z.; Peng, B.; Dong, J. Latent Watermark: Inject and Detect Watermarks in Latent Diffusion Space. arXiv 2024, arXiv:2404.00230. [Google Scholar]
  38. Liu, G.H.; Chen, T.; Theodorou, E.; Tao, M. Mirror diffusion models for constrained and watermarked generation. In Proceedings of the NIPS’23: 37th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023. [Google Scholar]
  39. Xiong, C.; Qin, C.; Feng, G.; Zhang, X. Flexible and secure watermarking for latent diffusion model. In Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada, 29 October–3 November 2023; pp. 1668–1676. [Google Scholar]
  40. Desu, A.; He, X.; Xu, Q.; Lu, W. Generative Models are Self-Watermarked: Declaring Model Authentication through Re-Generation. arXiv 2024, arXiv:2402.16889. [Google Scholar]
  41. Rezaei, A.; Akbari, M.; Alvar, S.R.; Fatemi, A.; Zhang, Y. LaWa: Using latent space for in-generation image watermarking. arXiv 2024, arXiv:2408.05868. [Google Scholar]
  42. Zhang, L.; Liu, X.; Martin, A.V.; Bearfield, C.X.; Brun, Y.; Guan, H. Robust Image Watermarking using Stable Diffusion. arXiv 2024, arXiv:2401.04247. [Google Scholar]
  43. Zhang, R.; Isola, P.; Efros, A.A.; Shechtman, E.; Wang, O. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 586–595. [Google Scholar]
  44. Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  45. Hore, A.; Ziou, D. Image quality metrics: PSNR vs. SSIM. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369. [Google Scholar]
  46. Hu, K.; Wang, M.; Ma, X.; Chen, J.; Wang, X.; Wang, X. Learning-based image steganography and watermarking: A survey. Expert Syst. Appl. 2024, 249, 123715. [Google Scholar] [CrossRef]
  47. Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Computer Vision—ECCV 2014, Proceedings of the 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part V 13; Springer International Publishing: Cham, Switzerland, 2014; pp. 740–755. [Google Scholar]
  48. Wang, Z.J.; Montoya, E.; Munechika, D.; Yang, H.; Hoover, B.; Chau, D.H. DiffusionDB: A large-scale prompt gallery dataset for text-to-image generative models. arXiv 2022, arXiv:2210.14896. [Google Scholar]
  49. Plummer, B.A.; Wang, L.; Cervantes, C.M.; Caicedo, J.C.; Hockenmaier, J.; Lazebnik, S. Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 2641–2649. [Google Scholar]
  50. Li, X.; Shen, Q.; Kawaguchi, K. VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 12363–12373. [Google Scholar]
  51. Ye, X.; Huang, H.; An, J.; Wang, Y. DUAW: Data-free universal adversarial watermark against stable diffusion customization. arXiv 2023, arXiv:2308.09889. [Google Scholar]
  52. Ricker, J.; Lukovnikov, D.; Fischer, A. AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 9130–9140. [Google Scholar]
  53. Asnani, V.; Collomosse, J.; Bui, T.; Liu, X.; Agarwal, S. ProMark: Proactive Diffusion Watermarking for Causal Attribution. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 10802–10811. [Google Scholar]
  54. Wang, Z.; Sehwag, V.; Chen, C.; Lyu, L.; Metaxas, D.N.; Ma, S. How to Trace Latent Generative Model Generated Images without Artificial Watermark? arXiv 2024, arXiv:2405.13360. [Google Scholar]
  55. Zhu, P.; Takahashi, T.; Kataoka, H. Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 24420–24430. [Google Scholar]
  56. Nie, G.; Kim, C.; Yang, Y.; Ren, Y. Attributing image generative models using latent fingerprints. In Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; pp. 26150–26165. [Google Scholar]
  57. Megías, D.; Kuribayashi, M.; Qureshi, A. Survey on decentralized fingerprinting solutions: Copyright protection through piracy tracing. Computers 2020, 9, 26. [Google Scholar] [CrossRef]
Figure 1. Overview of the structure of the latent diffusion model (source: [1]).
Figure 2. Chronological overview of generative image watermarking approaches using latent diffusion models ([22,25,26,27,28,29,30,31,32,33]).
Figure 3. Overview of watermark embedding region/type in LDMs: (1) latent space embedding; (2) diffusion process embedding; (3) full model optimization; (4) decoder manipulation; (5) initial noise modification. The ‘***’ symbol indicates the watermarking process.
Figure 4. Research trends in watermarking approaches within LDMs, illustrating the evolution and preference of various watermarking techniques.
Figure 5. Comparison of watermarked images generated by [34] with two other methods (left to right: [28], [39], [34]) for Stable Diffusion (source: [34]).
Figure 6. Watermarking technique through adjustment of the generative semantic space and the inapplicability of PSNR and SSIM metrics. (a) Original image, (b) traditional watermarking, and (c) watermarked image from a generative model. Images are taken from [23].
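As the caption of Figure 6 notes, PSNR and SSIM are reference-based metrics: both compare a watermarked image against a pixel-aligned original, which does not exist when the watermark is internalized in the generation process itself (case (c)). The minimal sketch below illustrates this dependency using scikit-image; the random "cover" image and the toy perturbation are stand-ins for illustration only, not data from any of the reviewed papers.

```python
# Minimal sketch: PSNR and SSIM require a pixel-aligned reference image.
# For in-generation watermarking there is no such cover image, which is
# why Figure 6 describes these metrics as inapplicable in case (c).
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)  # stand-in for an original image
marked = cover.copy()
marked[::8, ::8] ^= 1  # tiny, traditional-style pixel perturbation

psnr = peak_signal_noise_ratio(cover, marked, data_range=255)
ssim = structural_similarity(cover, marked, channel_axis=-1, data_range=255)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}")  # well defined only because `cover` exists
```

For in-generation watermarking, quality is therefore usually assessed either against a non-watermarked generation from the same seed and prompt (e.g., with perceptual metrics such as LPIPS [43]) or at the distribution level (e.g., with FID [44]).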
Table 1. Comparison of watermarking methods in latent diffusion models (Dec denotes the LDM decoder).
Reference | Embedding Type | Requires Training | Controllable | Task | Fine-Tuning
[34] | Latent | O (Adaptor) | Adaptor | Identification | ×
[25] | Latent | O (Adaptor) | Adaptor | Identification | ×
[37] | Latent | O (Adaptor) | Adaptor | Identification | ×
[38] | Diffusion Process | X | Controllable | Verification | ×
[26] | LDM | O (Adaptor) | Prompt, Adaptor | Verification | O
[35] | LDM | O | Prompt | Verification | O
[31] | LDM | O | Prompt | Verification | O
[27] | LDM | O | Prompt | Verification | O
[39] | LDM | X | Uncontrollable | Identification | O
[24] | LDM | X | Uncontrollable | Identification | ×
[40] | LDM | X | Uncontrollable | Verification | ×
[32] | Dec | X | Uncontrollable | Identification | O (Dec)
[28] | Dec | X | Uncontrollable | Verification | O (Dec)
[41] | Dec | X | Adaptor | Verification | ×
[29] | Dec, Initial Noise | X | Controllable | Identification | O (Dec)
[29] | Initial Noise | X | Controllable | Identification | ×
[30] | Initial Noise | X | Controllable | Identification | ×
[22] | Initial Noise | X | Adaptor | Identification | ×
[23] | Initial Noise | X | Controllable | Verification | ×
[42] | Initial Noise | X | Controllable | Verification | ×
[21] | Initial Noise | X | Uncontrollable | Verification | ×
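For the decoder-manipulation rows in Table 1 (embedding type "Dec", e.g., the Stable Signature approach [28]), the sampling procedure is left untouched and only the VAE decoder weights are replaced by watermark-aware ones. The sketch below outlines this plug-in point with the Hugging Face diffusers API; the checkpoint file `wm_decoder.pt` and the fine-tuning run that would produce it are hypothetical placeholders, not artifacts released by any of the cited works.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a stock Stable Diffusion pipeline; nothing about sampling changes.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to("cuda")

# Swap in watermark-aware decoder weights obtained by fine-tuning only the
# VAE decoder against a fixed message and a pretrained extractor.
# "wm_decoder.pt" is a hypothetical placeholder path.
decoder_state = torch.load("wm_decoder.pt", map_location="cuda")
pipe.vae.decoder.load_state_dict(decoder_state)

# Every image decoded through this VAE now carries the embedded mark.
image = pipe("a photo of a lighthouse at dusk").images[0]
image.save("watermarked_sample.png")
```

Because the message is baked into the decoder weights at fine-tuning time rather than chosen per generation, this family appears as "Uncontrollable" in Table 1.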
Table 2. Collection of open-source implementations of watermarking methods based on LDMs.
Year | Source | Model | Implementation | Link
2023 | NeurIPS | MDM [38] | PyTorch | https://github.com/ghliu/mdm (23 December 2024)
2023 | NeurIPS | Tree [23] | PyTorch | https://github.com/YuxinWenRick/tree-ring-watermark (23 December 2024)
2023 | arXiv | WaterDM [27] | PyTorch | https://github.com/yunqing-me/WatermarkDM (23 December 2024)
2023 | ICCV | Stable [28] | PyTorch | https://github.com/facebookresearch/stable_signature (23 December 2024)
2024 | arXiv | METR [29] | PyTorch | https://github.com/deepvk/metr (23 December 2024)
2024 | ECCV | WaDiff [24] | PyTorch | https://github.com/rmin2000/WaDiff (23 December 2024)
2024 | CVPR | Gaussian [22] | PyTorch | https://github.com/bsmhmmlf/Gaussian-Shading (23 December 2024)
2024 | CVPR | WOUAF [32] | PyTorch | https://github.com/kylemin/WOUAF (23 December 2024)
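Several of the repositories in Table 2 (e.g., Tree [23] and Gaussian [22]) belong to the initial-noise-modification category of Table 1: a key is written into the initial latent z_T before sampling, and detection later recovers an estimate of z_T by inverting the sampling process (e.g., via DDIM inversion) and tests for the embedded pattern. The toy sketch below is a simplified illustration of that category rather than a faithful reimplementation of any single method; it shows only the key-embedding and key-checking steps on the latent tensor and omits diffusion sampling and its inversion.

```python
# Simplified illustration of the "Initial Noise" category in Table 1:
# write a key into a low-frequency ring of the Fourier spectrum of the
# initial latent z_T, then check for that key in a recovered latent.
import torch

def _ring_mask(h: int, w: int, radius: int) -> torch.Tensor:
    yy, xx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    return ((yy - h // 2) ** 2 + (xx - w // 2) ** 2) <= radius ** 2

def embed_key(z_T: torch.Tensor, key: torch.Tensor, radius: int = 8) -> torch.Tensor:
    """Write `key` into the low-frequency region of the latent's 2-D spectrum."""
    spec = torch.fft.fftshift(torch.fft.fft2(z_T), dim=(-2, -1))
    mask = _ring_mask(*z_T.shape[-2:], radius)
    spec[..., mask] = key.to(spec.dtype)
    return torch.fft.ifft2(torch.fft.ifftshift(spec, dim=(-2, -1))).real

def detect_key(z_T_hat: torch.Tensor, key: torch.Tensor, radius: int = 8) -> float:
    """Mean absolute deviation between the recovered region and the key (lower = match)."""
    spec = torch.fft.fftshift(torch.fft.fft2(z_T_hat), dim=(-2, -1))
    mask = _ring_mask(*z_T_hat.shape[-2:], radius)
    return (spec[..., mask] - key.to(spec.dtype)).abs().mean().item()

z_T = torch.randn(1, 4, 64, 64)   # initial latent noise for a 512x512 SD image
key = torch.randn(1)              # toy scalar key; real methods use structured patterns
z_T_marked = embed_key(z_T, key)
print(detect_key(z_T_marked, key))  # close to 0 when the key is present
```

In the actual methods of this family, the recovered latent comes from inverting the sampling process on the image under test, and detection applies a statistical test to the recovered pattern rather than a simple distance threshold.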