
A Secure and Efficient Distributed Semantic Communication System for Heterogeneous Internet of Things Devices

Weihao Zeng, Xinyu Xu, Qianyun Zhang, Jiting Shi, Zhijin Qin, Zhenyu Guan

W. Zeng, X. Xu, Q. Zhang, J. Shi and Z. Guan are with the School of Cyber Science and Technology, Beihang University, Beijing 100191, China (e-mail: {zengweihao, 20231010, zhangqianyun, shijiting, guanzhenyu}@buaa.edu.cn). Z. Qin is with the Department of Electronic Engineering, Tsinghua University, Beijing 100084, China (e-mail: qinzhijin@tsinghua.edu.cn).

Abstract

Semantic communications have emerged as a promising solution to the challenge of efficient communication in rapidly evolving and increasingly complex Internet of Things (IoT) networks. However, protecting the security of semantic communication systems within distributed and heterogeneous IoT networks is a critical issue that needs to be addressed. We develop a secure and efficient distributed semantic communication system for IoT scenarios, focusing on three aspects: secure system maintenance, efficient system update, and privacy-preserving system usage. Firstly, we propose a blockchain-based interaction framework that ensures the integrity, authentication, and availability of interactions among IoT devices in order to securely maintain the system. This framework includes a novel digital signature verification mechanism designed for semantic communications, enabling secure and efficient interactions over semantic communications. Secondly, to improve the efficiency of interactions, we develop a flexible semantic communication scheme that leverages compressed semantic knowledge bases. This scheme reduces the data exchange required for system updates and adapts to dynamic task requirements and the diversity of device capabilities. Thirdly, we explore the integration of differential privacy into semantic communications. We analyze the implementation of differential privacy, taking into account the lossy nature of semantic communications and wireless channel distortions. A joint model-channel noise mechanism is introduced to achieve differential privacy preservation in semantic communications without compromising the system’s functionality. Experiments show that the system achieves integrity, availability, efficiency and privacy preservation.

Index Terms:
Semantic communications, Internet of Things, blockchain, differential privacy.

I Introduction

The proliferation of the Internet of Things (IoT) has led to a significant increase in data volumes and network connectivity. This rapid expansion highlights the necessity for efficient communication systems within IoT networks. Semantic communications[1, 2] are a novel communication paradigm that focuses on directly conveying intended meanings and sharing only the essential information relevant to the receiver’s needs, i.e., semantics. Semantic communication systems are built on neural network models and shared knowledge bases, which together extract semantic features from diverse sources and interpret them accurately to facilitate the execution of specific tasks. Semantic communication has emerged as a promising approach to achieving efficient communication in IoT scenarios, and paves the way for more intelligent IoT tasks[3, 4].

However, the distributed and heterogeneous nature of IoT networks and the presence of malicious attackers pose significant challenges to the security and practical deployment of semantic communication systems. Unlike end-to-end semantic communications[5], semantic communication systems within IoT networks require more complex multi-party interactions. Specifically, a critical concern is to synchronize semantic communication models and shared knowledge bases among multiple participants to prevent inaccurate extraction and interpretation of semantic information. In addition, ever-emerging communication tasks in IoT scenarios necessitate ongoing updates of semantic communication systems. This requires IoT devices to collect evolving data about communication tasks in order to update neural network models and tune knowledge bases. The data is inevitably distributed across different devices, so these devices require collaborative model training, such as federated learning[6], to exploit it. It is worth noting that the above interactions are themselves communication tasks, which can also be accomplished through semantic communications, thereby enhancing the efficiency of the entire semantic communication system.

In order to establish a secure distributed semantic communication system, several issues need to be addressed. The first challenge is to achieve interaction integrity, authentication and availability so that semantic communication systems can be securely maintained among IoT devices. The integrity and authentication of interactions are threatened by various attacks, such as data tampering, data falsification and man-in-the-middle attacks[7]. Adversaries can maliciously modify or falsify the information exchanged, causing conflicts among the models and knowledge bases of each device. They can also introduce perturbations into the information related to the collaborative system update, impeding the convergence of models and the representation of knowledge bases[8]. Furthermore, the lossy transmission nature of semantic communication raises significant issues for verifying the integrity and authentication of the exchanged information. Traditional verification mechanisms cannot be directly applied to semantic communications, as small distortions introduced by the semantic communication process can cause verification to fail. This prevents semantic communications from being used to facilitate efficient interactions.

The availability of interactions is also threatened. The inherent dynamics of IoT network topology, along with the potential for device malfunctions, disconnections, and communication delays, make it difficult to maintain the availability of interactions among IoT devices. External attacks, such as distributed denial-of-service attacks, also compromise the availability of interactions. These problems with integrity, authentication and availability emphasize the importance of developing an interaction framework that is trustworthy and fault-tolerant while being able to leverage semantic communications for efficient interactions.

Second, the diverse transmission and computation capabilities of IoT devices are obstacles to the practical deployment of semantic communication systems. During update and synchronization interactions, the direct exchange of entire models and knowledge bases among IoT devices imposes severe burdens on transmission-limited IoT devices. This is due to the substantial size of current implementations of models[3] and knowledge bases, such as knowledge graphs[9, 10], training datasets[11] and feature vector sets[12], which result in overwhelming data transmission requirements. In addition, the immense data size of models and knowledge bases significantly increases the computational overhead of model inference. This challenge is particularly acute for IoT devices with limited computing power, leading to higher latency and a severe degradation of overall system efficiency. Therefore, it is imperative to develop a semantic communication system that facilitates efficient updates and synchronizations with minimal data exchange. This system must also be scalable and elastic enough to accommodate a diverse range of devices and tasks.

Third, preserving the privacy of IoT devices throughout the maintenance and utilization of semantic communication systems is also a critical issue that needs to be addressed. In the context of collaborative training for system maintenance, integrating semantic communications with federated learning[13] limits the exposure of individual training data to other parties by keeping training data local and transmitting only training results, yet privacy concerns remain. The gradient leakage attack[14] is one of the most serious privacy attacks in collaborative model training, in which adversaries maliciously extract private information contained in the gradients exchanged among IoT devices. Similar considerations apply to the usage of semantic communication systems. For tasks that focus on data analysis and do not require precise data recovery, semantic communications deliver only the semantics, while leaving the original data local. However, the sensitive information in the raw data remains implicit in the semantics and can be inferred by methods such as model inversion attacks[15, 16]. Differential privacy (DP)[17, 18, 19] has emerged as a prominent framework for ensuring privacy in data analysis. It provides a mathematically rigorous defense against model inversion attacks and gradient leakage attacks. Therefore, a differential privacy mechanism is needed in semantic communication systems.

To tackle the above challenges in semantic communications within IoT networks, we propose a secure and efficient distributed semantic communication system. Our contributions are as follows.

  1. We propose a blockchain-based interaction framework for secure updates and synchronization of the distributed semantic communication system, ensuring the integrity, authentication and availability of interactions. Furthermore, we design an integrity and authentication verification mechanism for semantic communications, which enables semantic communications to be applied in secure interactions.

  2. We develop a flexible semantic communication scheme for IoT scenarios based on compressed semantic knowledge bases with high-level representations. By mainly updating and synchronizing semantic knowledge vectors, semantic communication systems adapt flexibly to dynamically changing task requirements and reduce the amount of data exchange required during system maintenance. The scheme lets IoT devices strike a balance between transmission and computation consumption by adjusting the size of the knowledge bases used in semantic communications.

  3. We explore the differential privacy model in semantic communication, taking into account both the lossy nature of semantic communication and the distortion caused by wireless channels. Building upon our model, we introduce a joint model-channel noise mechanism that optimally adds noise to signal symbols to achieve differential privacy in semantic communications. The mechanism is able to uniformly and transparently provide differential privacy protection for any data analysis task in semantic communications.

The rest of this article is organized as follows. In Section II, we present the related work. In Section III, we present the system model, including the scenario description, the semantic communication system model with a semantic knowledge base, and the problem definition. Section IV gives an overview of the proposed system, followed by a detailed description of its three main schemes: the blockchain-based interaction framework, the flexible semantic communication scheme and the joint model-channel noise mechanism. The performance of the system is evaluated in Section V. Finally, we conclude our work in Section VI.

II Related Work

Many studies discuss the security of semantic communication systems from a holistic perspective. In [4], the authors evaluated classical security techniques in the context of wireless semantic communication security, and also analyzed attack and defense methods specific to semantic communications. The multi-domain security vulnerabilities of using deep neural networks for semantic communications are discussed in [20]. That paper also explored targeted and non-targeted adversarial attacks on computer vision and the wireless channel with small perturbations. The outcomes of these attacks demonstrated the potential to manipulate the semantics of transmitted information. The authors of [21] clarified the requirements for secure semantic communication and presented the multiple potential security threats that exist at each step of semantic communications, along with possible defenses against these threats.

In addition to the overall perspective, the following paragraphs describe work on semantic communication security from two specific perspectives: data integrity and privacy protection. In semantic communication systems, risks to data integrity arising from data tampering and forgery exist at all stages of data collection, model training, model inference and wireless transmission. To ensure data integrity in semantic communication systems, a semantic signature generation method based on generative adversarial networks is proposed in [22] to protect the integrity of semantics against adversarial perturbations in the end-to-end semantic communication system. Moreover, in distributed semantic communication systems, with a focus on efficient and secure information interaction in Web 3.0 and the Metaverse, the authors of [23, 24] integrate blockchain with semantic communications. The tamper-resistant mechanisms inherent in blockchain and smart contracts are utilized to verify the integrity and authenticity of semantics, and to validate the quality of semantics. However, current studies lack authentication of data sources for lossy semantics, and no proper integrity verification mechanism has been proposed for the lossy transmission of semantic communications.

Attacks against privacy generally occur in the model inference phase. A combined attack involving a model inversion attack and an eavesdropping attack on semantic communication is proposed in [15]. The attacker first intercepts the semantic information transmitted over the wireless channel and then tries to reconstruct the original information by inverting the model, which leads to the leakage of the user’s private information. To resist the model inversion attack, a defense method based on random semantics permutation and substitution[15] is proposed to prevent the attacker from efficiently reconstructing the original information. The authors of [25] proposed an information bottleneck and adversarial learning approach to protect users’ privacy against model inversion attacks, where adversarial learning is used to train encoders to fool adversaries by maximizing reconstruction distortion. To address the privacy risk caused by knowledge discrepancies among communicating nodes, a knowledge-discrepancy-oriented privacy-preserving method for semantic communication is proposed in [26]. Knowledge mapping and disambiguation reduce the knowledge discrepancy between the sender and receiver, and a path-cutting module prevents sensitive data from being leaked. A framework is proposed in [27] to address the utility-informativeness-security trade-off in discrete task-oriented semantic communications. It leverages adversarial learning to achieve privacy preservation. However, current privacy-preserving schemes in semantic communications are limited to specific scenarios and tasks, and lack a mathematically rigorous proof of their privacy-preserving effectiveness.

III System Model

III-A Scenario Description

We investigate the application of semantic communications in distributed IoT networks, as illustrated in Fig. 1. Within IoT networks, IoT devices exhibit a wide range of transmission and computation capabilities. These devices leverage a semantic communication system to exchange semantics associated with specific tasks. These tasks, ranging from simple data collection to complex data analysis, evolve in response to ever-changing environmental conditions. The devices do not simply use static semantic communication models and knowledge bases; they also interact to continuously update and synchronize the semantic communication system. The objective of the system update is to keep pace with the evolving demands of IoT tasks. The aim of synchronizing models and knowledge among participants is to ensure accurate extraction and interpretation of semantic information.

There are attackers in IoT scenarios, categorized into internal and external attackers. Internal attackers within IoT networks are “honest and curious”. They comply with network protocols, but out of curiosity or malicious intent, they may conduct passive attacks, carrying out unauthorized information eavesdropping and analysis. For example, such an adversary might attempt to exploit gradient leakage to gain access to sensitive data without disrupting interaction processes within the network. External attackers are from outside the IoT networks, and can launch active attacks in addition to passive attacks. They initiate active attacks, including data tampering, data falsification, and denial-of-service attacks, with the aim of directly corrupting the update and synchronization processes.

III-B Semantic Communication System with Semantic Knowledge Base

Without loss of generality, we concentrate on semantic communications for the task of text transmission, following [5]. The input sentence to the semantic communication system is denoted as $\boldsymbol{s}=[w_{1},w_{2},\dots,w_{L}]$, where $w_{l}$ is the $l$-th word in the sentence. The transmitter comprises three essential components: a semantic encoder, a channel encoder, and a semantic knowledge base. The semantic encoder is responsible for transforming the input data into meaningful semantic features. By leveraging the semantic knowledge base, the semantic encoder gains access to fundamental understanding and representations that significantly enhance its effectiveness. The channel encoder, which follows the semantic encoder, converts and compresses the semantic representations into fewer signal symbols suitable for transmission over the communication channel, ensuring reliable and efficient data delivery among IoT devices. The signal sent by the transmitter is denoted as

$$\boldsymbol{x}=C_{\boldsymbol{\beta}}\left(S_{\boldsymbol{\alpha}}\left(\boldsymbol{s},\boldsymbol{\kappa}\right)\right) \tag{1}$$

where $\boldsymbol{x}\in\mathbb{C}^{K\times 1}$ represents the power-normalized signal that is to be transmitted, $\boldsymbol{\kappa}\in\mathbb{R}^{P\times Q}$ is the semantic knowledge base with $P$ vectors, each of size $Q$, $S_{\boldsymbol{\alpha}}\left(\cdot\right)$ is the semantic encoder with parameters $\boldsymbol{\alpha}$, and $C_{\boldsymbol{\beta}}\left(\cdot\right)$ is the channel encoder with parameters $\boldsymbol{\beta}$. The signal received at the receiver is

$$\boldsymbol{y}=\boldsymbol{h}\boldsymbol{x}+\boldsymbol{n}_{channel} \tag{2}$$

where $\boldsymbol{y}\in\mathbb{C}^{K\times 1}$ and $\boldsymbol{n}_{channel}$ is the additive white Gaussian noise (AWGN), following $\boldsymbol{n}_{channel}\sim\mathcal{CN}\left(0,\sigma_{n}^{2}\mathbf{I}_{L}\right)$. For the Rayleigh fading channel, the channel coefficient follows $\boldsymbol{h}\sim\mathcal{CN}\left(0,\mathbf{I}_{L}\right)$; for the Rician fading channel, it follows $\boldsymbol{h}\sim\mathcal{CN}\left(\mu_{h}\mathbf{I}_{L},\sigma_{h}^{2}\mathbf{I}_{L}\right)$ with $\mu_{h}=\sqrt{r/(r+1)}$ and $\sigma_{h}=\sqrt{1/(r+1)}$, where $r$ is the Rician coefficient.
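The channel model in (2) can be sketched in a few lines of standard-library Python. This is only an illustrative simulation, not the authors' implementation: it assumes that the product of $\boldsymbol{h}$ and $\boldsymbol{x}$ is taken element-wise with an independent coefficient per symbol, and the helper names `complex_gaussian`, `channel_coefficient` and `transmit` are hypothetical.

```python
import math
import random


def complex_gaussian(rng, mean=0j, variance=1.0):
    """Draw one sample from CN(mean, variance), circularly symmetric."""
    # The total variance is split evenly between real and imaginary parts.
    std = math.sqrt(variance / 2.0)
    return mean + complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def channel_coefficient(rng, kind="awgn", r=1.0):
    """Fading coefficient h for a single symbol (assumption: i.i.d. per symbol)."""
    if kind == "awgn":
        return 1 + 0j  # no fading, only additive noise
    if kind == "rayleigh":
        return complex_gaussian(rng, 0j, 1.0)
    if kind == "rician":
        mu_h = math.sqrt(r / (r + 1.0))
        sigma_h = math.sqrt(1.0 / (r + 1.0))
        return complex_gaussian(rng, complex(mu_h, 0.0), sigma_h ** 2)
    raise ValueError(f"unknown channel kind: {kind}")


def transmit(x, noise_variance, kind="awgn", r=1.0, seed=0):
    """Return y = h * x + n element-wise for a list of complex symbols x."""
    rng = random.Random(seed)
    y = []
    for symbol in x:
        h = channel_coefficient(rng, kind, r)
        n = complex_gaussian(rng, 0j, noise_variance)
        y.append(h * symbol + n)
    return y
```

With `noise_variance=0.0` and `kind="awgn"` the output equals the input, which is a quick sanity check on the model. Note that $\mu_{h}^{2}+\sigma_{h}^{2}=1$ for every $r$, so the Rician coefficient has unit average power.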

The receiver includes a semantic decoder, a channel decoder and a semantic knowledge base. The semantic knowledge base is synchronized with the transmitter’s. The channel decoder processes the received signals to recover semantic features, mitigating errors or distortions introduced during wireless transmission. Subsequently, the semantic decoder leverages the semantic knowledge base to decode these features, recovering the sentence $\boldsymbol{s}$. The operation on the received signal $\boldsymbol{y}$ is

$$\hat{\boldsymbol{s}}=S_{\boldsymbol{\chi}}^{-1}\left(C^{-1}_{\boldsymbol{\psi}}\left(\boldsymbol{y}\right),\boldsymbol{\kappa}\right) \tag{3}$$

where $\hat{\boldsymbol{s}}$ is the recovered sentence, $C^{-1}_{\boldsymbol{\psi}}\left(\cdot\right)$ is the channel decoder with parameters $\boldsymbol{\psi}$, and $S_{\boldsymbol{\chi}}^{-1}\left(\cdot\right)$ is the semantic decoder with parameters $\boldsymbol{\chi}$.

III-C Problem Definition

III-C1 Securing Interactions in Synchronization and Update

The timely synchronization and accurate update of $\boldsymbol{\alpha}$, $\boldsymbol{\chi}$, $\boldsymbol{\beta}$, $\boldsymbol{\psi}$, and $\boldsymbol{\kappa}$ are critical to the overall effectiveness of the semantic communication system. The integrity, authentication and availability of interactions need to be achieved. The models and knowledge bases must not be tampered with or falsified during interactions, and interactions must be fault-tolerant and available in complex and changing IoT networks. To use semantic communications in interactions, it is necessary to verify the integrity and authenticity of $\hat{\boldsymbol{s}}$ under lossy transmission.

III-C2 Building Efficient and Flexible Semantic Communication System with Semantic Knowledge Base

The challenge of efficiency arises from the substantial volume of data exchanged while updating and synchronizing $\boldsymbol{\alpha}$, $\boldsymbol{\chi}$, $\boldsymbol{\beta}$, $\boldsymbol{\psi}$ and $\boldsymbol{\kappa}$. To address this challenge, semantic knowledge bases need to be refined to a small number of vectors, $P$, while maintaining their semantic richness. This refinement is crucial for substantially reducing the transmission overhead on IoT devices and for supplying the semantic encoder with fundamental information at a lower computational load.

Furthermore, the wide range of transmission and computational capabilities requires the system to be adaptable and flexible. The transmission capability restricts the maximum transmitted signal length $M$, and the computation capability limits the number of semantic knowledge vectors $P$ involved in model inference. The objective of the system can be represented as

$$\max\quad\sum_{M\in\boldsymbol{M}}\sum_{P\in\boldsymbol{P}}\zeta_{M,P}\left(\boldsymbol{s},\hat{\boldsymbol{s}}\right) \tag{4}$$

where $\boldsymbol{M}$ represents the set of numbers of symbols that devices can transmit, $\boldsymbol{P}$ represents the set of numbers of semantic knowledge vectors that devices can use, and $\zeta_{M,P}\left(\cdot,\cdot\right)$ measures the similarity between $\boldsymbol{s}$ and $\hat{\boldsymbol{s}}$ when a device transmits $M$ symbols and utilizes $P$ semantic knowledge vectors.
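As a reading aid, objective (4) is a plain double sum over the supported device configurations. The minimal sketch below assumes the similarity $\zeta_{M,P}$ has already been measured and is supplied by the caller as a function of $(M, P)$; the function name `system_objective` and the `similarity` callable are hypothetical and are not defined in the paper.

```python
from typing import Callable, Iterable


def system_objective(
    symbol_budgets: Iterable[int],
    kb_sizes: Iterable[int],
    similarity: Callable[[int, int], float],
) -> float:
    """Sum the similarity score over every (M, P) configuration, as in (4).

    similarity(M, P) is assumed to return the measured similarity between the
    sent and recovered sentences when M symbols and P knowledge vectors are used.
    """
    kb_sizes = list(kb_sizes)  # materialize so the inner loop can repeat
    return sum(similarity(m, p) for m in symbol_budgets for p in kb_sizes)
```

The sum rewards a single system that performs well across all configurations rather than one tuned to a single $(M, P)$ pair, which matches the flexibility requirement stated above.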

III-C3 Achieving Differential Privacy

Considering potential data inference attacks[19] during maintenance and utilization of semantic communications, we need to achieve differential privacy in semantic communications. By adding noise to the transmitted message, called differential privacy noise, the differential privacy mechanism can be effective against such attacks. However, in semantic communication, the transmitted information is also affected by model noise and wireless channel noise. It requires a joint analysis of the impact of differential privacy noise, model noise and wireless channel noise on achieving the differential privacy objective. Based on this, it is necessary to propose an optimal noise addition mechanism to achieve target differential privacy with the least amount of added differential privacy noise.
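To make the noise budget concrete, the sketch below uses the classical Gaussian mechanism, which is a standard differential privacy result and not the joint model-channel mechanism proposed in this paper. It rests on simplifying assumptions that are ours: real-valued symbols, a unit-gain AWGN channel whose noise is independent of the added noise, no model noise, and a threat model in which the channel noise also affects the adversary's observation. The function names are hypothetical.

```python
import math


def gaussian_mechanism_sigma(sensitivity, epsilon, delta):
    """Noise std of the classical (epsilon, delta) Gaussian mechanism.

    Uses sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon, a bound
    that is only valid for 0 < epsilon < 1. sensitivity is the L2 sensitivity.
    """
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("classical bound requires 0 < epsilon < 1 and 0 < delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def extra_noise_sigma(target_sigma, existing_sigma):
    """Std of the noise still to be added on top of existing Gaussian noise.

    Assumption: the existing noise (e.g. channel noise) is Gaussian and
    independent of the added noise, so the variances add.
    """
    residual_variance = target_sigma ** 2 - existing_sigma ** 2
    return math.sqrt(max(0.0, residual_variance))
```

Under these assumptions, noise that is already present lowers the amount of differential privacy noise that has to be injected, which is the intuition behind analyzing the three noise sources jointly.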

IV Proposed Solution

IV-A Overview

Figure 1: Overview of proposed system.

An overview of the proposed secure distributed semantic communication system is shown in Fig. 1. The system consists of three entities, which are elaborated as follows:

  1. IoT devices: These entities are equipped with a range of bandwidth resources and computing capabilities. They can run conventional reliable communication protocols such as Bluetooth or WiFi, which have been widely integrated within IoT ecosystems. In addition, they are capable of semantic communications. These entities do not simply run a static semantic communication system; they interact with each other to continuously update and synchronize it.

  2. Key Generation Center: A trusted third party that plays a crucial role within the network, facilitating network initiation as well as the generation and distribution of public/private key pairs[28]. It is worth noting that it cannot directly organize interactions or perform complicated data processing, due to availability issues caused by complex IoT environments and the limitations of the center’s own capabilities.

  3. Blockchain: A consortium blockchain is an intangible, conceptual entity maintained by IoT devices. This blockchain is crucial for achieving transparent and trustworthy interactions between network participants. It serves as a secure platform, ensuring that all synchronization and update processes are recorded in an immutable and tamper-proof manner. The blockchain thereby supports a secure environment that ensures the integrity, authentication and availability of the semantic communication system.

The system deployment process is comprised of three main phases, which are as follows:

  1. Update: IoT devices collect local training data about tasks and train their local models and semantic knowledge bases. Then, they share their local models and semantic knowledge bases to collectively update the semantic communication system, thereby enabling it to adapt to emerging tasks. This approach ensures that the entire IoT network is able to cope with arising requirements.

  2. Synchronization: Since not all devices may participate in the update process because of limited resources, the synchronization phase is important to ensure that all devices are aligned with the most updated and optimized system. Furthermore, due to the inherent dynamic topology of IoT networks, where devices frequently join and leave the network, it is imperative for newly joining IoT devices to promptly retrieve the latest model to maintain consistency and coherence within the network.

  3. Communication: Once synchronization is complete, IoT devices proceed to the communication phase, where they leverage the semantic communication system to exchange information efficiently.

In the proposed system, the signaling used for controlling interactions is carried by conventional reliable communication protocols. Semantic communications are performed for IoT tasks in the communication phase. During the update and synchronization phases, these devices can choose to use either conventional methods or semantic communications to transmit models and knowledge bases, depending on their conditions. Traditional communication protocols do not require model inference, thereby conserving computational resources. However, they require the transmission of a larger number of signal symbols. In contrast, semantic communications reduce the number of symbols transmitted, but require computational processes for model inference.

The proposed system consists of a blockchain-based interaction framework, an efficient and flexible semantic communication scheme, and a joint model-channel differential privacy noise mechanism. The blockchain-based interaction framework protects the integrity, authentication and availability of system maintenance. Building on the secure interactions provided by the framework, the efficient and flexible semantic communication scheme achieves a more efficient system update with less data exchange. In response to the privacy risks arising from system maintenance and system usage, the joint model-channel differential privacy noise mechanism is proposed to implement differential privacy in semantic communications.

IV-B Blockchain-based Secure Interaction Framework

IoT devices collectively build a blockchain network for trustworthy interactions with integrity, authentication and availability in the semantic communication system. A blockchain[29, 30] is a distributed immutable ledger, constructed as a list of blocks. Each block records a set of transactions, where a transaction represents an operation that reads data from or writes data to the ledger. The set of rules and conditions for querying or modifying the ledger is defined in code, known as smart contracts. Each peer maintains a copy of the ledger through a collaborative process called consensus, which ensures the proper execution of smart contracts, the validation of blocks, and the consistency of the ledger among peers. Once a new block is generated and validated, it is cryptographically linked to the last block of the current ledger and synchronized across the network. The blockchain is fault-tolerant and can withstand a single point of failure.
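The cryptographic linking of blocks can be illustrated with a minimal hash chain. This sketch is not the consortium blockchain used in the paper: it omits consensus, smart contracts and signatures, and the block layout and function names (`block_hash`, `append_block`, `ledger_is_consistent`) are assumptions made for illustration.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 digest over the block's canonical JSON encoding."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(ledger, transactions):
    """Append a block that stores the hash of its predecessor."""
    prev_hash = block_hash(ledger[-1]) if ledger else "0" * 64
    ledger.append(
        {"index": len(ledger), "prev_hash": prev_hash, "transactions": transactions}
    )


def ledger_is_consistent(ledger):
    """Check that every block still points at the hash of the block before it."""
    return all(
        ledger[i]["prev_hash"] == block_hash(ledger[i - 1])
        for i in range(1, len(ledger))
    )


# Hypothetical transactions, for illustration only.
ledger = []
append_block(ledger, ["model upload from device A"])
append_block(ledger, ["model aggregation"])
append_block(ledger, ["model retrieval by device B"])
```

Changing any block that already has a successor alters its hash and breaks the link stored in the next block, so `ledger_is_consistent` returns `False`. Protecting the most recent block additionally relies on consensus among peers, which this sketch leaves out.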

In the blockchain network maintained by IoT devices, model update and synchronization can be treated as blockchain transactions, because they modify or read the ledger data. The blockchain network consists of multiple channels, each of which is a sub-network responsible for a specific semantic communication task. A device can participate in several channels at the same time.

There are three main transactions in the system: model upload, model aggregation and model retrieval. We select FedAvg [6] to aggregate the local models from the devices. To achieve the integrity and authentication of transactions, the interaction workflow is as follows. A device generates a transaction proposal. For the model upload task, the proposal contains the models, knowledge bases and other data. The proposal is then signed and broadcast to the network. Other devices receive and validate the transaction proposal. To validate a received proposal, devices first verify the digital signature to confirm that the proposal originated from a legitimate device within the channel. After the signature is verified, for the model upload task, devices check the integrity of the model; for the model aggregation task, devices check that the FedAvg algorithm was executed correctly. Validated transactions are bundled into a block. The network employs a consensus mechanism to agree on which block to append to the blockchain. For the model retrieval task, a device can access models and knowledge bases directly from its own copy of the ledger.
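The aggregation step that validators re-check can be illustrated with a minimal FedAvg sketch. It assumes the standard FedAvg rule, in which each local model is weighted by its local sample count; the dictionary-of-lists model layout and the names `fedavg`, `local_models` and `num_samples` are hypothetical and are not taken from the proposed system.

```python
from typing import Dict, List, Sequence


def fedavg(local_models: Sequence[Dict[str, List[float]]],
           num_samples: Sequence[int]) -> Dict[str, List[float]]:
    """Average local parameters, weighting each model by its sample count."""
    if not local_models or len(local_models) != len(num_samples):
        raise ValueError("need exactly one sample count per local model")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    aggregated: Dict[str, List[float]] = {}
    for name in local_models[0]:
        acc = [0.0] * len(local_models[0][name])
        for model, n in zip(local_models, num_samples):
            for k, value in enumerate(model[name]):
                acc[k] += (n / total) * value
        aggregated[name] = acc
    return aggregated


# Two hypothetical devices holding 1 and 3 local samples respectively.
models = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
global_model = fedavg(models, num_samples=[1, 3])
```

Because the rule is deterministic, a validating device could recompute the average from the uploaded local models and compare it with the proposed global model; how the smart contract actually performs this check is not specified here.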

Performing the above workflows over conventional reliable communication protocols has been widely studied and discussed. Note that the whole process relies on digital signatures to ensure the integrity and authenticity of transactions. Transmitting a transaction proposal with semantic communications would inevitably cause signature verification to fail, because semantic communications are inherently lossy. To integrate semantic communications into the aforementioned workflows and thereby enhance system performance, we build on the idea of provable data possession [31, 32] and propose a probabilistic signature verification mechanism. The mechanism ensures the integrity and authentication of the semantics transmitted in semantic communications.

We consider the case where Alice wants to transmit semantics to Bob with integrity and authentication. The output of the semantic encoder can be reshaped into one-dimensional data, $\boldsymbol{W}\in\mathbb{R}^{N}$. This data passes through the channel codec and the wireless channel and is received by Bob as $\widehat{\boldsymbol{W}}$. Alice randomly samples $\boldsymbol{W}$ according to a random index set $\boldsymbol{I}$. The sampling result is denoted as $\boldsymbol{W}_{\boldsymbol{I}}\triangleq\left\{\boldsymbol{W}_{i} \mid i\in\boldsymbol{I}\right\}$. Alice signs $\boldsymbol{W}_{\boldsymbol{I}}$ and $\boldsymbol{I}$ with her private key $sk$, which gives $sign\triangleq\left\{\boldsymbol{W}_{\boldsymbol{I}}\|\boldsymbol{I}\right\}_{sk}$. The tuple $\left\{\boldsymbol{W}_{\boldsymbol{I}}\|\boldsymbol{I}\|sign\right\}$ is transmitted to Bob over conventional communication protocols. It is much smaller than $\boldsymbol{W}$.
Bob validates $sign$ with Alice's public key, ensuring the integrity, authentication and non-repudiation of $\left\{\boldsymbol{W}_{\boldsymbol{I}}\|\boldsymbol{I}\right\}$. After $sign$ is validated, Bob samples $\widehat{\boldsymbol{W}}$ with $\boldsymbol{I}$ to obtain $\widehat{\boldsymbol{W}}_{\boldsymbol{I}}\triangleq\left\{\widehat{\boldsymbol{W}}_{i} \mid i\in\boldsymbol{I}\right\}$. Finally, Bob evaluates the difference between $\boldsymbol{W}_{\boldsymbol{I}}$ and $\widehat{\boldsymbol{W}}_{\boldsymbol{I}}$. If the difference is less than a specified threshold, the validation succeeds; otherwise it fails.

To comprehensively quantify the discrepancy between $\boldsymbol{W}_{\boldsymbol{I}}$ and $\widehat{\boldsymbol{W}}_{\boldsymbol{I}}$, we introduce a metric defined as

$$Diff=\left\|\boldsymbol{W}_{\boldsymbol{I}}-\widehat{\boldsymbol{W}}_{\boldsymbol{I}}\right\|_{1}+\left\|\boldsymbol{W}_{\boldsymbol{I}}-\widehat{\boldsymbol{W}}_{\boldsymbol{I}}\right\|_{\infty}. \tag{5}$$

This metric captures two critical aspects of the difference between $\boldsymbol{W}_{\boldsymbol{I}}$ and $\widehat{\boldsymbol{W}}_{\boldsymbol{I}}$. The $L_{1}$ norm, $\|\boldsymbol{W}_{\boldsymbol{I}}-\widehat{\boldsymbol{W}}_{\boldsymbol{I}}\|_{1}$, sums the absolute deviations of all sampled elements and is therefore proportional to the average deviation for a fixed $|\boldsymbol{I}|$; it reflects the overall magnitude of the discrepancy. The $L_{\infty}$ norm, $\|\boldsymbol{W}_{\boldsymbol{I}}-\widehat{\boldsymbol{W}}_{\boldsymbol{I}}\|_{\infty}$, quantifies the maximum deviation and highlights the most significant discrepancy among individual elements.
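The sampling and threshold check described above can be sketched as follows. This is an illustrative sketch only: the digital signature over the sampled values and indices is omitted, and the function names, the noise level and the threshold value are assumptions chosen for the demonstration.

```python
import random
from typing import List, Sequence


def sample_indices(n: int, size: int, rng: random.Random) -> List[int]:
    """Alice: draw the random index set I without replacement."""
    return sorted(rng.sample(range(n), size))


def diff_metric(w_sampled: Sequence[float], w_hat_sampled: Sequence[float]) -> float:
    """Diff in (5): L1 norm plus L-infinity norm of the sampled error."""
    errors = [abs(a - b) for a, b in zip(w_sampled, w_hat_sampled)]
    return sum(errors) + max(errors, default=0.0)


def verify(w_hat: Sequence[float], indices: Sequence[int],
           w_sampled: Sequence[float], threshold: float) -> bool:
    """Bob: resample the received data with I and compare against W_I."""
    w_hat_sampled = [w_hat[i] for i in indices]
    return diff_metric(w_sampled, w_hat_sampled) < threshold


rng = random.Random(0)
w = [rng.uniform(-1.0, 1.0) for _ in range(1000)]    # Alice's data W
indices = sample_indices(len(w), 50, rng)            # index set I
w_sampled = [w[i] for i in indices]                  # W_I, sent reliably

# Received data with small channel/model distortion (assumed noise level).
w_hat = [v + rng.gauss(0.0, 0.001) for v in w]
# The same data after an adversary alters one element that happens to be sampled.
w_tampered = list(w_hat)
w_tampered[indices[0]] += 5.0

THRESHOLD = 1.0  # hypothetical value; the source does not fix a threshold
clean_ok = verify(w_hat, indices, w_sampled, THRESHOLD)
tampered_ok = verify(w_tampered, indices, w_sampled, THRESHOLD)
```

In this toy setting the lossy but honest transmission passes the check while the tampered copy fails it, which is the behavior the mechanism relies on.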

The key to the mechanism is that adversaries cannot know $\boldsymbol{I}$ until the transmission of $\boldsymbol{I}$ is complete. If adversaries learn $\boldsymbol{I}$ before Bob receives $\boldsymbol{W}$, they can attack without being detected by modifying only the data whose indices are not in $\boldsymbol{I}$ and leaving the data whose indices are in $\boldsymbol{I}$ unchanged. Therefore, it is crucial to keep the index set $\boldsymbol{I}$ random and unpredictable. It must be transmitted with a delay or in encrypted form.

We classify attacks on this mechanism into two categories, according to whether the modification of the information exceeds the threshold. For attacks whose modifications exceed the threshold, the integrity and authentication of $\boldsymbol{W}$ improve as the size of the index set $|\boldsymbol{I}|$ increases. If $x$ items of $\boldsymbol{W}$ are modified, the probability of detection with $|\boldsymbol{I}|=I$ is

$$P_{d}=1-\frac{C_{N-x}^{I}}{C_{N}^{I}}. \tag{6}$$

Here $C_{n}^{k}$ denotes the binomial coefficient, so the fraction is the probability that none of the $x$ modified items is sampled.
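As a quick numerical illustration of (6), the sketch below evaluates the detection probability with exact integer binomial coefficients. The function name `detection_probability` and the example values of $N$, $x$ and $I$ are illustrative assumptions.

```python
from math import comb


def detection_probability(n: int, x: int, sample_size: int) -> float:
    """P_d = 1 - C(n - x, sample_size) / C(n, sample_size), as in (6)."""
    if not (0 <= x <= n and 0 <= sample_size <= n):
        raise ValueError("require 0 <= x <= n and 0 <= sample_size <= n")
    # comb(n - x, sample_size) is 0 when fewer than sample_size items are
    # unmodified, in which case detection is certain.
    return 1.0 - comb(n - x, sample_size) / comb(n, sample_size)


p_half = detection_probability(10, 1, 5)          # 1 - 126/252
p_small_set = detection_probability(1000, 10, 50)
p_large_set = detection_probability(1000, 10, 100)
```

For a fixed number of modified items, enlarging the sampled index set raises the detection probability, at the cost of sending more sampled values over the reliable channel.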

For attacks whose modifications stay below the threshold, such as poisoning attacks that introduce subtle noise, the injected perturbation is submerged in the channel and model noise, so the security of the system is maintained.

IV-C Efficient and Flexible Semantic Communication Scheme

Figure 2: Illustration of the proposed efficient and flexible semantic communication scheme.

The proposed system addresses the challenges posed by the varying computational and communication resources of IoT devices. By leveraging a shared semantic knowledge base, we develop a flexible semantic communication system that enables each IoT device to adapt its communication strategy to the available resources. The mechanism also enables efficient model updating, since mainly the compact knowledge bases need to be updated.

The proposed system is shown in Fig. 2. In the proposed scheme, the semantic knowledge base consists of semantic knowledge vectors. Because different semantic communication tasks have different requirements, the semantic knowledge vectors are tailored to each task. We define the list of semantic knowledge vectors for the semantic communication task $t$ as $\boldsymbol{\kappa}^{t}=\left[\boldsymbol{v}_{1}^{t},\boldsymbol{v}_{2}^{t},\cdots,\boldsymbol{v}_{P^{t}}^{t}\right]$, where $P^{t}$ is the total number of vectors and $\boldsymbol{v}_{n}^{t}\in\mathbb{R}^{Q}$ represents the $n$-th $Q$-dimensional vector in $\boldsymbol{\kappa}^{t}$. The detailed processing of the semantic knowledge vectors is illustrated in Fig. 3.
During the initialization phase, both the transmitter and the receiver retrieve the same $S_{\boldsymbol{\alpha}}$, $S_{\boldsymbol{\chi}}^{-1}$, $C_{\boldsymbol{\beta}}$, $C_{\boldsymbol{\psi}}^{-1}$ and $\boldsymbol{\kappa}^{t}$ from the blockchain for the specific task $t$. The transmitter uses the encoder $S_{\boldsymbol{\alpha}}$ to extract features from the input sentences $\boldsymbol{s}^{t}$ with the help of $\boldsymbol{\kappa}^{t}$. The input sentences $\boldsymbol{s}^{t}$ are embedded as $\boldsymbol{s}^{t}_{e}\in\mathbb{R}^{L\times Q}$.
The extracted features are $\boldsymbol{f}\triangleq S_{\boldsymbol{\alpha}}\left(\boldsymbol{s}^{t}\|\boldsymbol{\kappa}^{t}\right)\in\mathbb{R}^{(L+P^{t})\times Q}$. Afterward, $\boldsymbol{f}$ is transmitted to the receiver through the wireless channel with the channel codec, as described in (1), (2) and (3). The features recovered by the channel decoder are denoted as $\hat{\boldsymbol{f}}$. Finally, $\hat{\boldsymbol{f}}$ and $\boldsymbol{\kappa}^{t}$ are fed into $S_{\boldsymbol{\chi}}^{-1}$ to reconstruct the sentence, denoted as $\hat{\boldsymbol{s}^{t}}\triangleq S_{\boldsymbol{\chi}}^{-1}\left(\hat{\boldsymbol{f}}\|\boldsymbol{\kappa}^{t}\right)$. The semantic knowledge vectors $\boldsymbol{\kappa}^{t}$ are generated by a neural model with fixed inputs, called the semantic knowledge network. This model is used only during training.
During semantic communications, a device can use the output of the semantic knowledge network directly, without model inference.

Figure 3: The detailed semantic communication knowledge process in the scheme.

This scheme leverages the shared semantic knowledge base to reduce the amount of data that needs to be transmitted. Furthermore, the transmitter and receiver can balance communication performance against computational and communication demands by pruning $\boldsymbol{\kappa}^{t}$ and $\boldsymbol{f}$. For devices with limited computing power, the transmitter and receiver can negotiate to truncate the basis $\mathcal{B}^{t}$ to reduce the computational cost of semantic encoding and decoding. In addition, the receiver can trim $\boldsymbol{f}$ so that fewer signal symbols need to be transmitted. With the designed training scheme, the vectors in $\boldsymbol{\kappa}^{t}$ and $\boldsymbol{f}$ are both ordered according to their importance for the semantic communication task $t$, so an IoT device can truncate them with minimal loss of communication performance. The method for constructing and updating $\boldsymbol{\kappa}^{t}$ and $\boldsymbol{f}$ in order of importance is introduced in the following.

IV-C1 Training with random pruning mechanism

The forward propagation with the random pruning mechanism is shown in Algorithm 1. Let $\boldsymbol{\kappa}^{t}_{i}$ represent the subsequence of $\boldsymbol{\kappa}^{t}$ comprising its first $i$ elements, and let $\boldsymbol{f}_{j}$ denote the subsequence of $\boldsymbol{f}$ containing its first $j$ elements. For each batch during training, $\boldsymbol{\kappa}^{t}_{i}$ and $\boldsymbol{f}_{j}$ are randomly selected, ranging from the empty set to all elements of $\boldsymbol{\kappa}^{t}$ and $\boldsymbol{f}$. The mechanism allows devices to flexibly adjust the sizes of $\boldsymbol{\kappa}^{t}$ and $\boldsymbol{f}$ according to their own computational and communication capabilities, supporting an elastic semantic communication system.

Input: batch data $\boldsymbol{S}$ from $D$;
$RandomInteger\left(2,N^{t}\right)\to i$;
$RandomInteger\left(0,G\right)\to j$;
Transmitter:
    $S_{\boldsymbol{\alpha}}\left(\boldsymbol{S}\|\boldsymbol{\kappa}^{t}_{i}\right)\to\boldsymbol{f}$;
    Transmit $\boldsymbol{f}_{j}$ over the channel;
Receiver:
    Receive $\hat{\boldsymbol{f}_{j}}$;
    $S_{\boldsymbol{\chi}}^{-1}\left(\hat{\boldsymbol{f}_{j}}\|\boldsymbol{\kappa}^{t}_{i}\right)\to\hat{\boldsymbol{S}}$;
Output: $\boldsymbol{f}$, $\hat{\boldsymbol{f}}$, $\hat{\boldsymbol{S}}$
Algorithm 1 Forward propagation with random pruning mechanism
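The control flow of Algorithm 1 can be sketched in a few lines of Python. The encoder, channel and decoder below are toy stand-ins, not the trained networks $S_{\boldsymbol{\alpha}}$ and $S_{\boldsymbol{\chi}}^{-1}$; taking $N^{t}$ as the length of the knowledge base, treating both random bounds as inclusive, and the name `max_features` for $G$ are assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def forward_random_pruning(
    batch: Sequence[Vector],
    kappa: Sequence[Vector],
    encode: Callable[[Sequence[Vector], Sequence[Vector]], List[Vector]],
    channel: Callable[[List[Vector]], List[Vector]],
    decode: Callable[[List[Vector], Sequence[Vector]], List[Vector]],
    max_features: int,
    rng: random.Random,
) -> Tuple[List[Vector], List[Vector], List[Vector]]:
    """One forward pass with randomly chosen prefix lengths i and j."""
    i = rng.randint(2, len(kappa))     # number of knowledge vectors kept
    j = rng.randint(0, max_features)   # number of feature vectors transmitted
    kappa_i = list(kappa[:i])          # first i knowledge vectors
    f = encode(batch, kappa_i)         # transmitter: extract features
    f_j = f[:j]                        # keep only the first j features
    f_j_hat = channel(f_j)             # channel codec and wireless channel
    s_hat = decode(f_j_hat, kappa_i)   # receiver: reconstruct
    return f, f_j_hat, s_hat


# Toy stand-ins so that the sketch runs end to end (hypothetical).
def toy_encode(s: Sequence[Vector], k: Sequence[Vector]) -> List[Vector]:
    return [list(v) for v in s] + [list(v) for v in k]


def toy_channel(f: List[Vector]) -> List[Vector]:
    return [list(v) for v in f]        # noiseless channel


def toy_decode(f_hat: List[Vector], k: Sequence[Vector]) -> List[Vector]:
    return [list(v) for v in f_hat]


rng = random.Random(7)
batch = [[0.1, 0.2], [0.3, 0.4]]
kappa = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
f, f_hat, s_hat = forward_random_pruning(
    batch, kappa, toy_encode, toy_channel, toy_decode, max_features=6, rng=rng
)
```

Because only prefixes are ever dropped during training, a device can later truncate to any prefix length it can afford without retraining.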

IV-C2 Efficient local network update

Function Train the Semantic Codec():
    Input: batch data $\boldsymbol{S}$ from dataset;
    Freeze $C_{\boldsymbol{\beta}}$, $C_{\boldsymbol{\psi}}^{-1}$, $\boldsymbol{\kappa}^{t}$;
    Forward propagation based on Algorithm 1;
    Compute loss function $\mathcal{L}_{CE}$ by (7);
    Train $S_{\boldsymbol{\alpha}}$, $S_{\boldsymbol{\chi}}^{-1}$ $\to$ Gradient descent with $\mathcal{L}_{CE}$;
    Output: $S_{\boldsymbol{\alpha}}$, $S_{\boldsymbol{\chi}}^{-1}$;

Function Train the Channel Codec():
    Input: batch data $\boldsymbol{S}$ from dataset;
    Freeze $S_{\boldsymbol{\alpha}}$, $S_{\boldsymbol{\chi}}^{-1}$, $\boldsymbol{\kappa}^{t}$;
    Forward propagation based on Algorithm 1;
    Compute loss function $\mathcal{L}_{MSE}$ by (8);
    Train $C_{\boldsymbol{\beta}}$, $C_{\boldsymbol{\psi}}^{-1}$ $\to$ Gradient descent with $\mathcal{L}_{MSE}$;
    Output: $C_{\boldsymbol{\beta}}$, $C_{\boldsymbol{\psi}}^{-1}$;

Function Train the Semantic Knowledge Base():
    Input: batch data $\boldsymbol{S}$ from dataset;
    Freeze $C_{\boldsymbol{\beta}}$, $C_{\boldsymbol{\psi}}^{-1}$, $S_{\boldsymbol{\alpha}}$, $S_{\boldsymbol{\chi}}^{-1}$;
    Forward propagation based on Algorithm 1;
    Compute loss function $\mathcal{L}_{MSE}$ by (8);
    Train $\boldsymbol{\kappa}^{t}$ $\to$ Gradient descent with $\mathcal{L}_{CE}$;
    Output: $\boldsymbol{\kappa}^{t}$;

Function Train the Whole System():
    Input: batch data $\boldsymbol{S}$ from dataset;
    Forward propagation based on Algorithm 1;
    Compute loss function $\mathcal{L}_{CE}$ by (7);
    Train $S_{\boldsymbol{\alpha}}$, $S_{\boldsymbol{\chi}}^{-1}$, $C_{\boldsymbol{\beta}}$, $C_{\boldsymbol{\psi}}^{-1}$, $\boldsymbol{\kappa}^{t}$ $\to$ Gradient descent with $\mathcal{L}_{CE}$;
    Output: $S_{\boldsymbol{\alpha}}$, $S_{\boldsymbol{\chi}}^{-1}$, $C_{\boldsymbol{\beta}}$, $C_{\boldsymbol{\psi}}^{-1}$, $\boldsymbol{\kappa}^{t}$;

Algorithm 2 Local update
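The freeze/train pattern of Algorithm 2 can be summarized as a small schedule. The sketch below only records which parameter groups are trainable in each stage; the group names and helper functions are hypothetical, and the loss used in each stage is deliberately left out.

```python
from typing import Dict, FrozenSet, List, NamedTuple

# Parameter groups: semantic codec (S_alpha, S_chi^-1), channel codec
# (C_beta, C_psi^-1) and the semantic knowledge base (kappa^t).
PARAM_GROUPS: FrozenSet[str] = frozenset(
    {"semantic_codec", "channel_codec", "knowledge_base"}
)


class Stage(NamedTuple):
    name: str
    trainable: FrozenSet[str]


SCHEDULE: List[Stage] = [
    Stage("train_semantic_codec", frozenset({"semantic_codec"})),
    Stage("train_channel_codec", frozenset({"channel_codec"})),
    Stage("train_knowledge_base", frozenset({"knowledge_base"})),
    Stage("train_whole_system", PARAM_GROUPS),
]


def frozen_groups(stage: Stage) -> FrozenSet[str]:
    """Groups whose parameters stay fixed during the given stage."""
    return PARAM_GROUPS - stage.trainable


def requires_grad_flags(stage: Stage) -> Dict[str, bool]:
    """Per-group flag that a training loop could apply before each stage."""
    return {group: group in stage.trainable for group in sorted(PARAM_GROUPS)}


flags_stage_three = requires_grad_flags(SCHEDULE[2])
```

In a deep learning framework, such flags would typically be mapped onto each parameter's gradient switch before the corresponding stage runs.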

As shown in Algorithm 2, the training of the semantic communication system is divided into four steps: the individual training of the semantic codec, of the channel codec and of the semantic knowledge base, followed by the overall training of the whole system. In the first step, $S_{\boldsymbol{\alpha}}$ and $S_{\boldsymbol{\chi}}^{-1}$ are updated with the goal of minimizing the divergence between $\boldsymbol{s}$ and $\hat{\boldsymbol{s}}$. We quantify this divergence with the cross-entropy (CE), which is given by

$$\mathcal{L}_{CE}\left(\boldsymbol{s},\hat{\boldsymbol{s}}\right)=-\sum_{l=1}q\left(w_{l}\right)\log\left(p\left(w_{l}\right)\right)+\left(1-q\left(w_{l}\right)\right)\log\left(1-p\left(w_{l}\right)\right), \tag{7}$$

where $q\left(w_{l}\right)$ denotes the real probability that $w_{l}$ occurs in the original sentence $\boldsymbol{s}$, and $p\left(w_{l}\right)$ is the predicted probability that the same $w_{l}$ appears in the reconstructed sentence $\hat{\boldsymbol{s}}$.
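A minimal sketch of the loss in (7) is given below. It assumes that the sum runs over both terms for every word $w_{l}$ and that predicted probabilities are clipped away from 0 and 1 for numerical safety; the clipping constant `eps` and the function name are assumptions, not part of the original formulation.

```python
import math
from typing import Sequence


def cross_entropy(q: Sequence[float], p: Sequence[float], eps: float = 1e-12) -> float:
    """Cross-entropy between real probabilities q(w_l) and predictions p(w_l)."""
    total = 0.0
    for q_l, p_l in zip(q, p):
        p_l = min(max(p_l, eps), 1.0 - eps)   # avoid log(0)
        total += q_l * math.log(p_l) + (1.0 - q_l) * math.log(1.0 - p_l)
    return -total


good_prediction = cross_entropy([1.0, 0.0], [0.9, 0.1])
poor_prediction = cross_entropy([1.0, 0.0], [0.6, 0.4])
```

A reconstruction that assigns higher probability to the words actually present in the original sentence yields a smaller loss.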

In the second step, $C_{\boldsymbol{\beta}}$ and $C_{\boldsymbol{\psi}}^{-1}$ are updated with $\mathcal{L}_{MSE}$, which is given by

$$\mathcal{L}_{MSE}\left(\boldsymbol{f},\hat{\boldsymbol{f}}\right)=\left\|\boldsymbol{f}-\hat{\boldsymbol{f}}\right\|_{2}. \tag{8}$$

In the third step, $\boldsymbol{\kappa}^{t}$ is updated with $\mathcal{L}_{CE}$. To ensure the broad representational capability of the semantic knowledge base, we introduce the cosine distance into the loss function, aiming to enrich each vector with more diverse information. With this approach, the semantic knowledge base is not only scalable and adaptable but also covers a wider range of representations, which improves the comprehensiveness and accuracy of the semantic communication capabilities of the system. The resulting loss $\mathcal{L}_{\boldsymbol{\kappa}}$ is given by

$$\mathcal{L}_{\boldsymbol{\kappa}}\left(\boldsymbol{s},\hat{\boldsymbol{s}}\right)=\mathcal{L}_{CE}\left(\boldsymbol{s},\hat{\boldsymbol{s}}\right)+\left\|\left(\boldsymbol{\kappa}^{t}\right)^{T}\left(\boldsymbol{\kappa}^{t}\right)\right\|_{2}. \tag{9}$$

Finally, the whole network is trained with $\mathcal{L}_{total}$, which is given by

$$\mathcal{L}_{total}=\mathcal{L}_{CE}+\mathcal{L}_{MSE}+\mathcal{L}_{\boldsymbol{\kappa}}. \tag{10}$$

To support continuous and efficient updates of the semantic communication system, IoT devices can update only the semantic knowledge base based on (9), which requires less data exchange.
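One possible reading of the regularizer in (9) is sketched below. It assumes that the norm of the matrix $\left(\boldsymbol{\kappa}^{t}\right)^{T}\boldsymbol{\kappa}^{t}$ is the Frobenius norm, computed here from the pairwise inner products of the knowledge vectors; the function names are hypothetical and the cross-entropy value is passed in as a plain number.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def gram_penalty(kappa: Sequence[Vector]) -> float:
    """Frobenius norm of the matrix of pairwise inner products (assumed norm)."""
    gram: List[List[float]] = [
        [sum(a * b for a, b in zip(u, v)) for v in kappa] for u in kappa
    ]
    return math.sqrt(sum(g * g for row in gram for g in row))


def knowledge_base_loss(ce_value: float, kappa: Sequence[Vector]) -> float:
    """L_kappa in (9): cross-entropy term plus the penalty on kappa."""
    return ce_value + gram_penalty(kappa)


orthogonal = [[1.0, 0.0], [0.0, 1.0]]   # two dissimilar unit vectors
parallel = [[1.0, 0.0], [1.0, 0.0]]     # two identical unit vectors
penalty_orthogonal = gram_penalty(orthogonal)
penalty_parallel = gram_penalty(parallel)
```

Under this reading, knowledge vectors that point in the same direction incur a larger penalty than mutually orthogonal ones of the same length, which matches the stated goal of enriching each vector with more diverse information.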

IV-D The Joint Model-Channel Differential Privacy Noise Mechanism

In this section, we propose a differential privacy semantic communication scheme for any task that focuses on data analysis and does not require precise data recovery. The proposed scheme prevents attackers from inferring sensitive information contained in the local training data from the analysis results, i.e., the transmitted signal symbols. With this mechanism, IoT devices can efficiently share their models and semantic knowledge bases while preserving privacy during system maintenance and usage. The whole process at the transmitter is

$$\boldsymbol{x}=\Omega(\mathcal{D}), \tag{11}$$

where $\mathcal{D}$ is the raw data collected by the IoT device and $\Omega(\cdot)$ represents the whole process, including data analysis, semantic encoding and channel encoding. $\boldsymbol{x}$ can also be represented as

$$\boldsymbol{x}=\boldsymbol{si}+\boldsymbol{n}_{model}, \tag{12}$$

where $\boldsymbol{si}$ is the semantic information extracted from $\mathcal{D}$ and $\boldsymbol{n}_{model}\sim\mathcal{CN}\left(0,\sigma_{m}^{2}\mathbf{I}\right)$ represents the Gaussian-distributed model noise, which results from unstable gradient descent, training data noise and other factors[33]. After transmission over the wireless channel, based on (2) and (12), the received signal can be represented as

$$\boldsymbol{y}=\boldsymbol{h}\left(\boldsymbol{si}+\boldsymbol{n}_{model}\right)+\boldsymbol{n}_{channel}. \tag{13}$$

Adversaries can only perform malicious analysis based on $\boldsymbol{y}$. We define the process from $\mathcal{D}$ to $\boldsymbol{y}$ as

$$\boldsymbol{y}=\mathcal{M}(\mathcal{D}). \tag{14}$$

Based on (14), semantic communications achieve differential privacy if $\mathcal{M}(\cdot)$ satisfies differential privacy. Formally, $\mathcal{M}:\boldsymbol{D}\to\boldsymbol{Y}$ satisfies $\left(\epsilon,\delta\right)$-differential privacy[34, 35] if and only if, for any two adjacent datasets $\mathcal{D},\mathcal{D}^{\prime}\subseteq\boldsymbol{D}$ and any output set $\boldsymbol{\gamma}\subset\boldsymbol{Y}$, we have

$$Pr[\mathcal{M}(\mathcal{D})\in\boldsymbol{\gamma}]\leq e^{\epsilon}Pr[\mathcal{M}(\mathcal{D}^{\prime})\in\boldsymbol{\gamma}]+\delta, \tag{15}$$

where $\mathcal{D}$ and $\mathcal{D}^{\prime}$ differ in only one sample, $\boldsymbol{D}$ and $\boldsymbol{Y}$ are the sets of all $\mathcal{D}$ and $\boldsymbol{y}$, respectively, $\epsilon$ controls the privacy loss, with smaller values indicating stronger privacy protection, and $\delta$ allows a small probability of deviating from the strict privacy guarantee, which provides flexibility in scenarios where absolute privacy is impractical. Informally, a mechanism satisfies $\left(\epsilon,\delta\right)$-differential privacy if, for any pair of adjacent datasets and any outputs, the ratio of the probabilities of observing these outputs under the mechanism is bounded by $\exp(\epsilon)$ with probability at least $1-\delta$.
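To make the inequality in (15) concrete, the following sketch checks it for a mechanism with a finite output alphabet. It is only an illustration: the function names and the randomized-response example are assumptions introduced here, not part of the proposed system. For discrete outputs, the worst-case output set is the one collecting every outcome where the first probability exceeds $e^{\epsilon}$ times the second, so a single sum suffices.

```python
import math

def hockey_stick(p, q, eps):
    """Maximum over output sets S of P(S) - e^eps * Q(S) for discrete p and q."""
    return sum(max(pi - math.exp(eps) * qi, 0.0) for pi, qi in zip(p, q))

def satisfies_dp(p, q, eps, delta, tol=1e-12):
    """Check the (eps, delta) inequality in both directions for one adjacent pair.

    p and q are the output distributions of the mechanism on two adjacent
    datasets; tol absorbs floating-point rounding.
    """
    return (hockey_stick(p, q, eps) <= delta + tol
            and hockey_stick(q, p, eps) <= delta + tol)

# Hypothetical example: randomized response that reports the true bit with
# probability 0.75, evaluated on the two possible input bits.
p = [0.75, 0.25]
q = [0.25, 0.75]
print(satisfies_dp(p, q, math.log(3), 0.0))   # pure DP with eps = ln 3
print(satisfies_dp(p, q, math.log(2), 0.25))  # smaller eps needs delta = 0.25
```

The example shows the trade-off described above: lowering $\epsilon$ below $\ln 3$ is only possible for this mechanism if a nonzero $\delta$ is tolerated.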

To make $\mathcal{M}(\cdot)$ satisfy differential privacy, we use the analytic Gaussian mechanism[36]. Let $\triangle$ denote the sensitivity of $\mathcal{M}(\cdot)$, defined as the maximum of $\|\mathcal{M}(\mathcal{D})-\mathcal{M}(\mathcal{D}^{\prime})\|_{2}$ over adjacent datasets. For any $\epsilon>0$, $\delta\in(0,1)$ and $\triangle$, there exists a $\sigma$ such that adding Gaussian noise with mean 0 and standard deviation $\sigma$ to the output of $\mathcal{M}$ provides $\left(\epsilon,\delta\right)$-differential privacy.
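A minimal sketch of how such a $\sigma$ can be found numerically is given below. It uses the exact condition from the analytic Gaussian mechanism[36], namely $\Phi\left(\frac{\triangle}{2\sigma}-\frac{\epsilon\sigma}{\triangle}\right)-e^{\epsilon}\Phi\left(-\frac{\triangle}{2\sigma}-\frac{\epsilon\sigma}{\triangle}\right)\leq\delta$, and a plain bisection search. The function names and the bisection strategy are choices made for this illustration, and $\sigma$ is the standard deviation per real dimension, which is an assumption when the noise is complex-valued.

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def gaussian_delta(sigma, eps, sensitivity):
    """Smallest delta for which N(0, sigma^2) noise gives (eps, delta)-DP."""
    a = sensitivity / (2.0 * sigma)
    b = eps * sigma / sensitivity
    return std_normal_cdf(a - b) - math.exp(eps) * std_normal_cdf(-a - b)

def calibrate_sigma(eps, delta, sensitivity, iters=100):
    """Bisection for the smallest sigma that meets the (eps, delta) target."""
    lo, hi = 0.0, sensitivity
    while gaussian_delta(hi, eps, sensitivity) > delta:
        hi *= 2.0                      # grow until the target is met
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, eps, sensitivity) > delta:
            lo = mid                   # too little noise
        else:
            hi = mid                   # enough noise, try less
    return hi

sigma = calibrate_sigma(eps=0.5, delta=1e-5, sensitivity=1.0)
classical = math.sqrt(2.0 * math.log(1.25 / 1e-5)) / 0.5
print(sigma, classical)
```

The printed comparison with the classical calibration $\sigma=\triangle\sqrt{2\ln(1.25/\delta)}/\epsilon$ illustrates why the analytic variant is attractive when the goal is to add as little noise as possible.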

We add Gaussian noise $\boldsymbol{n}_{dp}\sim\mathcal{CN}\left(0,\sigma_{dp}^{2}\mathbf{I}_{L}\right)$ to achieve differential privacy, so the signal received at the adversary is

$$\boldsymbol{y}=\boldsymbol{h}\left(\boldsymbol{si}+\boldsymbol{n}_{model}+\boldsymbol{n}_{dp}\right)+\boldsymbol{n}_{channel}. \tag{16}$$
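The signal model in (16) can be simulated with a few lines of NumPy. This is a sketch under stated assumptions: the function names are hypothetical, the channel $\boldsymbol{h}$ is applied as an element-wise coefficient (a scalar or a per-symbol vector), and each noise term is drawn independently from a circularly symmetric complex Gaussian whose real and imaginary parts each carry half of the variance. Setting `sigma_dp=0` gives the form of (13).

```python
import numpy as np

def complex_gaussian(shape, sigma, rng):
    """Draw CN(0, sigma^2 I): real and imaginary parts each have variance sigma^2 / 2."""
    scale = sigma / np.sqrt(2.0)
    return rng.normal(0.0, scale, shape) + 1j * rng.normal(0.0, scale, shape)

def adversary_observation(si, h, sigma_m, sigma_dp, sigma_c, rng):
    """Return y = h * (si + n_model + n_dp) + n_channel for one transmission."""
    si = np.asarray(si, dtype=complex)
    n_model = complex_gaussian(si.shape, sigma_m, rng)
    n_dp = complex_gaussian(si.shape, sigma_dp, rng)
    n_channel = complex_gaussian(si.shape, sigma_c, rng)
    return h * (si + n_model + n_dp) + n_channel

rng = np.random.default_rng(0)
si = np.ones(16, dtype=complex)
y = adversary_observation(si, h=1.0, sigma_m=0.05, sigma_dp=0.3, sigma_c=0.1, rng=rng)
print(y.shape)
```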

Following (16), since $\boldsymbol{n}_{dp}$, $\boldsymbol{n}_{model}$ and $\boldsymbol{n}_{channel}$ are all Gaussian noise, multiple differential privacy mechanisms accumulate in $\mathcal{M}(\cdot)$. $\boldsymbol{n}_{dp}$, $\boldsymbol{n}_{model}$ and $\boldsymbol{n}_{channel}$ provide $(\epsilon_{dp},\delta_{dp})$-, $(\epsilon_{model},\delta_{model})$- and $(\epsilon_{channel},\delta_{channel})$-differential privacy, respectively.
Because the model noise and channel noise cannot be changed, we need to adjust the differential privacy noise appropriately to achieve the target differential privacy with minimum noise. According to the composition theorem for heterogeneous differential privacy mechanisms[37], for any $\epsilon_{i}>0$, $\delta_{i}\in[0,1]$ for $i\in\{1,\dots,k\}$, the class of $(\epsilon_{i},\delta_{i})$-DP mechanisms satisfies $\left(\hat{\epsilon},1-(1-\hat{\delta})\prod_{i=1}^{k}\left(1-\delta_{i}\right)\right)$-differential privacy, where

$$\begin{aligned}\hat{\epsilon}=\min\Bigg\{&\sum_{i=1}^{k}\epsilon_{i},\\ &\sum_{i=1}^{k}\frac{(e^{\epsilon_{i}}-1)\epsilon_{i}}{e^{\epsilon_{i}}+1}+\sqrt{\sum_{i=1}^{k}2\epsilon_{i}^{2}\log\left(e+\frac{\sqrt{\sum_{i=1}^{k}\epsilon_{i}^{2}}}{\hat{\delta}}\right)},\\ &\sum_{i=1}^{k}\frac{(e^{\epsilon_{i}}-1)\epsilon_{i}}{e^{\epsilon_{i}}+1}+\sqrt{\sum_{i=1}^{k}2\epsilon_{i}^{2}\log\left(\frac{1}{\hat{\delta}}\right)}\Bigg\}.\end{aligned} \tag{17}$$

Based on (16) and (17), the proposed scheme first checks whether the channel noise and model noise are sufficient to achieve the differential privacy objective and, if not, introduces $\boldsymbol{n}_{dp}$ as appropriate. $\boldsymbol{n}_{dp}$ is adjusted to achieve $\hat{\epsilon}<\epsilon^{t}$ and $\hat{\delta}<\delta^{t}$, where $\epsilon^{t}$ and $\delta^{t}$ describe the target differential privacy. The proposed scheme is generic and can be applied in a variety of situations. If analysis results are transmitted using traditional reliable communication protocols, $\boldsymbol{n}_{model}$ and $\boldsymbol{n}_{channel}$ can be regarded as zero and $\boldsymbol{h}$ as the identity matrix in (16). If the model noise or wireless channel noise is difficult to estimate, the scheme can ignore the poorly estimated noise and combine the available Gaussian mechanisms to achieve differential privacy by adjusting $(\epsilon_{i},\delta_{i})$ in (17).
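The bound in (17) is straightforward to evaluate numerically. The sketch below computes $\hat{\epsilon}$ as the minimum of the three expressions and the composed $\delta$ term from the theorem[37]; the function names are hypothetical, and the handling of $\hat{\delta}=0$ (falling back to the plain sum, since the other two expressions are undefined there) is a choice made for this illustration.

```python
import math

def composed_epsilon(eps_list, delta_hat):
    """Evaluate the minimum of the three expressions in (17)."""
    total = sum(eps_list)
    if delta_hat <= 0.0:
        return total                   # only the basic sum is defined
    sq = sum(e * e for e in eps_list)  # sum of eps_i^2
    drift = sum((math.exp(e) - 1.0) * e / (math.exp(e) + 1.0) for e in eps_list)
    second = drift + math.sqrt(2.0 * sq * math.log(math.e + math.sqrt(sq) / delta_hat))
    third = drift + math.sqrt(2.0 * sq * math.log(1.0 / delta_hat))
    return min(total, second, third)

def composed_delta(delta_list, delta_hat):
    """Evaluate 1 - (1 - delta_hat) * prod_i (1 - delta_i)."""
    keep = 1.0 - delta_hat
    for d in delta_list:
        keep *= 1.0 - d
    return 1.0 - keep

# Three mechanisms, e.g. the DP, model and channel noise terms
# (the numeric budgets here are placeholders, not values from the paper).
print(composed_epsilon([0.1, 0.1, 0.1], 1e-5))
print(composed_delta([1e-6, 1e-6, 1e-6], 1e-5))
```

For a small number of mechanisms the plain sum is usually the tightest of the three expressions; the other two only become smaller when many mechanisms with small $\epsilon_{i}$ are composed.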
Since the scheme adds $\boldsymbol{n}_{dp}$ to the symbols after power normalization, which have natural upper and lower bounds, the sensitivity can be easily estimated. This simplifies the implementation of differential privacy and makes the scheme broadly adaptable to different data analysis tasks without the need to analyze the sensitivity task by task.
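One way to see why the sensitivity is easy to bound is sketched below. It assumes a specific form of power normalization (each length-$L$ symbol vector is scaled to average power $P$, so its norm is $\sqrt{PL}$), which may differ from the normalization actually used in the system; the function names are hypothetical. Under that assumption the triangle inequality gives $\|\boldsymbol{x}-\boldsymbol{x}^{\prime}\|_{2}\leq 2\sqrt{PL}$ for any two normalized outputs, independent of the task.

```python
import numpy as np

def power_normalize(z, power=1.0):
    """Scale z so that its average symbol power equals `power` (assumed form)."""
    z = np.asarray(z, dtype=complex)
    return z * np.sqrt(power * z.size / np.sum(np.abs(z) ** 2))

def sensitivity_bound(length, power=1.0):
    """Upper bound on ||x - x'||_2 for any two normalized length-`length` vectors."""
    return 2.0 * np.sqrt(power * length)

rng = np.random.default_rng(1)
a = power_normalize(rng.normal(size=16) + 1j * rng.normal(size=16))
b = power_normalize(rng.normal(size=16) + 1j * rng.normal(size=16))
print(np.linalg.norm(a - b), sensitivity_bound(16))
```

This bound is a worst case; a tighter task-specific estimate would allow less noise, at the cost of the task-by-task analysis that the scheme aims to avoid.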

V Performance Evaluation

In this section, we evaluate the performance of the proposed system. We first evaluate the effectiveness of the proposed compressed semantic knowledge base. Then the flexibility of the semantic communication scheme is evaluated. Finally, we evaluate the impact of the proposed differential privacy protection mechanism on the performance of semantic communications.

Following DeepSC[5], we employ four Transformer encoder layers in the semantic encoder and four Transformer decoder layers in the semantic decoder. The network parameter settings are summarized in Table I. The knowledge base is generated by the semantic knowledge network and consists of eight vectors of size $128$. The dataset used in the experiments consists of the English and French corpora in the proceedings of the European Parliament[38].

TABLE I: The settings of the proposed system

| Module | Layer Name | Units |
|---|---|---|
| Semantic Encoder | 4×Transformer Encoder | 128 (8 heads) |
| Channel Encoder | Dense | 256 |
| | Dense | 16 |
| Channel Decoder | Dense | 128 |
| | Dense | 256 |
| Semantic Decoder | 4×Transformer Decoder | 128 (8 heads) |
| Predictable Layer | Dense | Dictionary size |
| Semantic Knowledge Net | Dense | 128 |
| | Dense | 128×8 |

To demonstrate that the proposed knowledge base enables efficient updating of the semantic communication system, we show the loss evolution of the proposed system in Fig. 4. The loss is $\mathcal{L}_{CE}$ in (7). “SKB in ‘en’ ” and “SKB in ‘fr’ ” denote the use of the English corpus, the French corpus and the English-French corpus to train the semantic knowledge network to generate the semantic knowledge bases, respectively.

Figure 4: Training evolution of the proposed scheme with semantic knowledge bases.

The system is trained to perform text transmission in both English and French. In the first $1200$ epochs, the semantic knowledge network is frozen and only the DeepSC-related modules are trained. After $1200$ epochs, the DeepSC-related modules are trained for only $5$ of every $100$ rounds on average, while the semantic knowledge network starts to be trained for English and French separately to generate compressed semantic knowledge bases. The output of the semantic knowledge network is reshaped to $\mathbb{R}^{8\times 128}$ to form a semantic knowledge base. At the $1200$-th epoch, the system begins to converge. Incorporating the semantic knowledge network allows the system's loss to converge to a lower value. Moreover, this decline in loss is achieved with most of the network frozen, which significantly reduces the amount of data that needs to be shared during collaborative learning in IoT networks.
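The two-phase schedule described above can be written as a small helper that reports which parts of the network receive gradient updates in a given epoch. This is only a sketch: the function and key names are hypothetical, placing the 5 DeepSC rounds at the start of each block of 100 is an assumption (the text only gives the average), and it is also assumed that the semantic knowledge network is trained in every round after epoch 1200.

```python
def trainable_modules(epoch, warmup_epochs=1200, period=100, deepsc_rounds=5):
    """Return which module groups are updated in the given (0-based) epoch."""
    if epoch < warmup_epochs:
        # Phase 1: knowledge network frozen, DeepSC-related modules trained.
        return {"deepsc": True, "knowledge_net": False}
    # Phase 2: knowledge network trained; DeepSC only in a few rounds per period.
    phase = (epoch - warmup_epochs) % period
    return {"deepsc": phase < deepsc_rounds, "knowledge_net": True}

# Count how often the DeepSC-related modules are updated in epochs 1200..1299.
updates = sum(trainable_modules(e)["deepsc"] for e in range(1200, 1300))
print(updates)
```

In a training loop, the returned flags would typically decide which parameter groups are passed to the optimizer in that round, so that only the small knowledge base needs to be exchanged most of the time.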

Figure 5: Comparison of BLEU versus SNR for different $\boldsymbol{\kappa}$ in the English transmission task over different wireless channels: (a) AWGN, (b) Rayleigh, (c) Rician.
Figure 6: Comparison of BLEU versus SNR for different $\boldsymbol{\kappa}$ in the French transmission task over different wireless channels: (a) AWGN, (b) Rayleigh, (c) Rician.

We analyze the performance of the proposed system using the bilingual evaluation understudy (BLEU) score[38]. Fig. 5 and Fig. 6 compare BLEU versus signal-to-noise ratio (SNR) in the English and French transmission tasks with different knowledge bases over AWGN, Rayleigh and Rician channels. DeepSC serves as the baseline for this comparison. The figures show that the proposed scheme achieves a higher BLEU than DeepSC, which does not use a semantic knowledge base. Moreover, the closer the training dataset is to the communication task requirements, the more the trained semantic knowledge base improves the BLEU. These results indicate that the proposed semantic communication scheme based on compressed semantic knowledge bases can achieve efficient system updating and support adjustment for different tasks.
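For readers unfamiliar with the metric, a minimal sentence-level BLEU[38] can be sketched as follows: clipped n-gram precisions are combined by a geometric mean and multiplied by a brevity penalty. The function names, whitespace tokenization, single-reference setting and the absence of smoothing are simplifications assumed here; the evaluation in this paper may use a different configuration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU with uniform weights over 1..max_n grams, no smoothing."""
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty punishes candidates shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1.0 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)

sent = "the cat sat on the mat"
print(bleu(sent, sent))
print(bleu(sent, "the cat sat on mat", max_n=1))
```

A perfectly recovered sentence scores 1, while dropping a word lowers the score through the brevity penalty even if every remaining word is correct.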

Figure 7: Comparison of BLEU versus SNR for different sizes of $\boldsymbol{\kappa}$ and $\boldsymbol{f}$ over different wireless channels: (a) AWGN, (b) Rayleigh, (c) Rician.

We also evaluate the flexibility of the proposed system. Fig. 7 presents a comparative analysis of the performance of the proposed system under different pruning levels. The results indicate that performance improves as the pruning level decreases. Specifically, when transmitting $90\%$ of the semantic features, the proposed system achieves approximately the same BLEU score as DeepSC. Transmitting only $80\%$ of the semantic features achieves a BLEU higher than $0.85$ even at an SNR of $-3$ dB. The proposed scheme outperforms DeepSC at low SNR because the semantic knowledge bases have been shared in advance and are not affected by the current noise.

Figure 8: Comparison of BLEU versus SNR for different $\epsilon$ and $\delta$ over different wireless channels: (a) AWGN, (b) Rayleigh, (c) Rician.

Fig. 8 shows the communication performance of the proposed differential privacy semantic communication scheme for different $\delta$ and $\epsilon$ settings. The results show that the mechanism provides a mathematically rigorous privacy guarantee while maintaining a BLEU of more than $0.8$.

VI Conclusion

We propose a secure, efficient, and privacy-preserving semantic communication system for IoT networks. The proposed solutions have been validated through extensive experiments, which show that they achieve the desired goals of efficiency and privacy preservation.

References

  • [1] Z. Qin, X. Tao, J. Lu, W. Tong, and G. Y. Li, “Semantic communications: Principles and challenges,” arXiv preprint arXiv:2201.01389, 2021.
  • [2] D. Gündüz, Z. Qin, I. E. Aguerri, H. S. Dhillon, Z. Yang, A. Yener, K. K. Wong, and C.-B. Chae, “Beyond transmitting bits: Context, semantics, and task-oriented communications,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 1, pp. 5–41, 2022.
  • [3] H. Xie and Z. Qin, “A lite distributed semantic communication system for internet of things,” IEEE Journal on Selected Areas in Communications, vol. 39, no. 1, pp. 142–153, 2020.
  • [4] H. Du, J. Wang, D. Niyato, J. Kang, Z. Xiong, M. Guizani, and D. I. Kim, “Rethinking wireless communication security in semantic internet of things,” IEEE Wireless Communications, vol. 30, no. 3, pp. 36–43, 2023.
  • [5] H. Xie, Z. Qin, G. Y. Li, and B.-H. Juang, “Deep learning enabled semantic communication systems,” IEEE Transactions on Signal Processing, vol. 69, pp. 2663–2675, 2021.
  • [6] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Artificial intelligence and statistics. PMLR, 2017, pp. 1273–1282.
  • [7] J. Deogirikar and A. Vidhate, “Security attacks in iot: A survey,” in 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud)(I-SMAC). IEEE, 2017, pp. 32–37.
  • [8] L. Feng, Y. Zhao, S. Guo, X. Qiu, W. Li, and P. Yu, “Bafl: A blockchain-based asynchronous federated learning framework,” IEEE Transactions on Computers, vol. 71, no. 5, pp. 1092–1103, 2021.
  • [9] F. Zhou, Y. Li, M. Xu, L. Yuan, Q. Wu, R. Q. Hu, and N. Al-Dhahir, “Cognitive semantic communication systems driven by knowledge graph: principle, implementation, and performance evaluation,” IEEE Transactions on Communications, 2023.
  • [10] S. Jiang, Y. Liu, Y. Zhang, P. Luo, K. Cao, J. Xiong, H. Zhao, and J. Wei, “Reliable semantic communication system enabled by knowledge graph,” Entropy, vol. 24, no. 6, p. 846, 2022.
  • [11] H. Zhang, S. Shao, M. Tao, X. Bi, and K. B. Letaief, “Deep learning-enabled semantic communication systems with task-unaware transmitter and dynamic data,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 1, pp. 170–185, 2022.
  • [12] Y. Sun, H. Chen, X. Xu, P. Zhang, and S. Cui, “Semantic knowledge base-enabled zero-shot multi-level feature transmission optimization,” IEEE Transactions on Wireless Communications, 2023.
  • [13] L. X. Nguyen, H. Q. Le, Y. L. Tun, P. S. Aung, Y. K. Tun, Z. Han, and C. S. Hong, “An efficient federated learning framework for training semantic communication systems,” IEEE Transactions on Vehicular Technology, 2024.
  • [14] W. Wei and L. Liu, “Gradient leakage attack resilient deep learning,” IEEE Transactions on Information Forensics and Security, vol. 17, pp. 303–316, 2021.
  • [15] Y. Chen, Q. Yang, Z. Shi, and J. Chen, “The model inversion eavesdropping attack in semantic communication systems,” in GLOBECOM 2023-2023 IEEE Global Communications Conference. IEEE, 2023, pp. 5171–5177.
  • [16] Y. Zhang, R. Jia, H. Pei, W. Wang, B. Li, and D. Song, “The secret revealer: Generative model-inversion attacks against deep neural networks,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 253–261.
  • [17] C. Dwork, “Differential privacy,” in International colloquium on automata, languages, and programming. Springer, 2006, pp. 1–12.
  • [18] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang, “Deep learning with differential privacy,” in Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, 2016, pp. 308–318.
  • [19] D. Ye, S. Shen, T. Zhu, B. Liu, and W. Zhou, “One parameter defense—defending against data inference attacks via differential privacy,” IEEE Transactions on Information Forensics and Security, vol. 17, pp. 1466–1480, 2022.
  • [20] Y. E. Sagduyu, T. Erpek, S. Ulukus, and A. Yener, “Is semantic communication secure? a tale of multi-domain adversarial attacks,” IEEE Communications Magazine, vol. 61, no. 11, pp. 50–55, 2023.
  • [21] M. Shen, J. Wang, H. Du, D. Niyato, X. Tang, J. Kang, Y. Ding, and L. Zhu, “Secure semantic communications: Challenges, approaches, and opportunities,” IEEE Network, 2023.
  • [22] X. Liu, G. Nan, Q. Cui, Z. Li, P. Liu, Z. Xing, H. Mu, X. Tao, and T. Q. Quek, “Semprotector: A unified framework for semantic protection in deep learning-based semantic communication systems,” IEEE Communications Magazine, vol. 61, no. 11, pp. 56–62, 2023.
  • [23] Y. Lin, Z. Gao, H. Du, D. Niyato, J. Kang, Z. Xiong, and Z. Zheng, “Blockchain-based efficient and trustworthy aigc services in metaverse,” IEEE Transactions on Services Computing, 2024.
  • [24] Y. Lin, Z. Gao, H. Du, D. Niyato, J. Kang, Y. Gao, J. Wang, and A. Jamalipour, “Blockchain-based semantic information sharing and pricing for web 3.0,” IEEE Transactions on Network Science and Engineering, 2023.
  • [25] Y. Wang, S. Guo, Y. Deng, H. Zhang, and Y. Fang, “Privacy-preserving task-oriented semantic communications against model inversion attacks,” IEEE Transactions on Wireless Communications, 2024.
  • [26] S. Cheng, X. Zhang, Y. Sun, Q. Cui, and X. Tao, “Knowledge discrepancy oriented privacy preserving for semantic communication,” IEEE Transactions on Vehicular Technology, 2024.
  • [27] A. Zhang, Y. Wang, and S. Guo, “On the utility-informativeness-security trade-off in discrete task-oriented semantic communication,” IEEE Communications Letters, 2024.
  • [28] Y. Miao, Z. Liu, H. Li, K.-K. R. Choo, and R. H. Deng, “Privacy-preserving byzantine-robust federated learning via blockchain systems,” IEEE Transactions on Information Forensics and Security, vol. 17, pp. 2848–2861, 2022.
  • [29] M. Belotti, N. Božić, G. Pujolle, and S. Secci, “A vademecum on blockchain technologies: When, which, and how,” IEEE Communications Surveys & Tutorials, vol. 21, no. 4, pp. 3796–3838, 2019.
  • [30] E. Androulaki, A. Barger, V. Bortnikov, C. Cachin, K. Christidis, A. De Caro, D. Enyeart, C. Ferris, G. Laventman, Y. Manevich et al., “Hyperledger fabric: a distributed operating system for permissioned blockchains,” in Proceedings of the thirteenth EuroSys conference, 2018, pp. 1–15.
  • [31] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and D. Song, “Provable data possession at untrusted stores,” in Proceedings of the 14th ACM conference on Computer and communications security, 2007, pp. 598–609.
  • [32] J. Li, J. Wu, G. Jiang, and T. Srikanthan, “Blockchain-based public auditing for big data in cloud storage,” Information Processing & Management, vol. 57, no. 6, p. 102382, 2020.
  • [33] H. Xie, Z. Qin, and G. Y. Li, “Semantic communication with memory,” IEEE Journal on Selected Areas in Communications, 2023.
  • [34] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006. Proceedings 3. Springer, 2006, pp. 265–284.
  • [35] R. Xue, K. Xue, B. Zhu, X. Luo, T. Zhang, Q. Sun, and J. Lu, “Differentially private federated learning with an adaptive noise mechanism,” IEEE Transactions on Information Forensics and Security, 2023.
  • [36] B. Balle and Y.-X. Wang, “Improving the gaussian mechanism for differential privacy: Analytical calibration and optimal denoising,” in International Conference on Machine Learning. PMLR, 2018, pp. 394–403.
  • [37] P. Kairouz, S. Oh, and P. Viswanath, “The composition theorem for differential privacy,” in International conference on machine learning. PMLR, 2015, pp. 1376–1385.
  • [38] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, “Bleu: a method for automatic evaluation of machine translation,” in Proceedings of the 40th annual meeting of the Association for Computational Linguistics, 2002, pp. 311–318.