Search Results (639)

Search Parameters:
Keywords = encoding scheme

17 pages, 4918 KiB  
Article
Radio Frequency Signal-Based Drone Classification with Frequency Domain Gramian Angular Field and Convolutional Neural Network
by Yuanhua Fu and Zhiming He
Drones 2024, 8(9), 511; https://doi.org/10.3390/drones8090511 - 21 Sep 2024
Viewed by 363
Abstract
Over the past few years, drones have been utilized in a wide range of applications. However, the illegal operation of drones may pose a series of security risks to sensitive areas such as airports and military bases. Hence, it is vital to develop an effective method of identifying drones to address the above issues. Existing drone classification methods based on radio frequency (RF) signals have low accuracy or a high computational cost. In this paper, we propose a novel RF signal image representation scheme that incorporates a convolutional neural network (CNN), named the frequency domain Gramian Angular Field with a CNN (FDGAF-CNN), to perform drone classification. Specifically, we first compute the time–frequency spectrum of raw RF signals based on short-time Fourier transform (STFT). Then, the 1D frequency spectrum series is encoded as 2D images using a modified GAF transform. Moreover, to further improve the recognition performance, the images obtained from different channels are fused to serve as the input of a CNN classifier. Finally, numerous experiments were conducted on the two available open-source DroneRF and DroneRFa datasets. The experimental results show that the proposed FDGAF-CNN can achieve a relatively high classification accuracy of 98.72% and 98.67% on the above two datasets, respectively, confirming the effectiveness and generalization ability of the proposed method.
(This article belongs to the Special Issue Advances in Detection, Security, and Communication for UAV)
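The encoding step this abstract describes can be illustrated with the standard Gramian Angular Summation Field (GASF) transform. The paper applies a modified GAF to the STFT frequency spectrum, so the numpy sketch below only shows the basic idea of turning a 1D series into a 2D image; the input series is a made-up example.

```python
import numpy as np

def gasf(series):
    # Rescale the 1D series to [-1, 1] so arccos is defined.
    x = np.asarray(series, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1.0, 1.0))       # polar-angle encoding
    # Gramian Angular Summation Field: G[i, j] = cos(phi_i + phi_j)
    return np.cos(phi[:, None] + phi[None, :])

img = gasf([0.0, 0.5, 1.0, 0.5])
print(img.shape)  # (4, 4)
```

The resulting symmetric matrix preserves temporal correlations as image texture, which is what makes a 2D CNN applicable to the original 1D signal.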

33 pages, 19015 KiB  
Article
A Simple Physics-Based Model of Growth-Based Economies Dependent on a Finite Resource Base
by Philip Mitchell and Tadeusz Patzek
Sustainability 2024, 16(18), 8161; https://doi.org/10.3390/su16188161 - 19 Sep 2024
Viewed by 432
Abstract
Mainstream economics describes virtual wealth with theory that is at odds with the physical laws that govern a nation’s physical resources. This confusion fundamentally prevents the realization of “sustainable” economies. The relation between debt and the metabolism of a country (measured by GDP or power consumption) appears to follow a diffusion relationship, in which debt encodes the temporal evolution of an economic potential. Debt enables the production of resources and the realization of a country’s economic wealth potential (the sum of its environmental, geological, and societal endowments, among others). Any economic scheme dependent on finite stocks of free energy for growth must eventually collapse, and as such cannot be considered sustainable. Our simple debt–diffusion model is shown to closely match the trajectories of 44 different economies.
(This article belongs to the Section Energy Sustainability)

18 pages, 1457 KiB  
Article
Enhancing Unmanned Marine Vehicle Security: A Periodic Watermark-Based Detection of Replay Attacks
by Guangrui Bian and Xiaoyang Gao
Appl. Sci. 2024, 14(18), 8298; https://doi.org/10.3390/app14188298 - 14 Sep 2024
Viewed by 279
Abstract
This paper explores a periodic watermark-based replay attack detection method for Unmanned Marine Vehicles modeled in the framework of the Takagi–Sugeno fuzzy system. The precise detection of replay attacks is crucial for ensuring the security of Unmanned Marine Vehicles; however, traditional timestamp-based or encoded measurement-dependent detection approaches often sacrifice system performance to achieve higher detection rates. To reduce the potential performance degradation, a periodic watermark-based detection scheme is developed, in which a compensation signal together with a periodic Gaussian watermark signal is integrated into the actuator. Through compensation calculations conducted with all compensatory signals in each period, the position corresponding to the minimum value of the detection function can be derived. The time at which the attack occurred can then be determined by comparing this position with the watermark signal in the same period. An application on an unmanned marine vehicle (UMV) is shown to demonstrate the effectiveness of the presented scheme in detecting replay attacks while minimizing control costs.
(This article belongs to the Section Transportation and Future Mobility)

20 pages, 6592 KiB  
Article
Lossless Data Hiding in VQ Compressed Images Using Adaptive Prediction Difference Coding
by Sisheng Chen, Jui-Chuan Liu, Ching-Chun Chang and Chin-Chen Chang
Electronics 2024, 13(17), 3532; https://doi.org/10.3390/electronics13173532 - 5 Sep 2024
Viewed by 344
Abstract
Data hiding in digital images is an important covert communication technique. This paper studies lossless data hiding in the image compression domain. We present a novel lossless data hiding scheme in vector quantization (VQ) compressed images using adaptive prediction difference coding. A modified adaptive index rearrangement (AIR) is presented to rearrange a codebook, thereby enhancing the correlation of the adjacent indices in the index tables of cover images. Then, a predictor based on improved median edge detection is used to predict the indices by retaining the first index. The prediction differences are calculated using the exclusive OR (XOR) operation, and the vacancy capacity of each prediction difference type is evaluated. An adaptive prediction difference coding method based on the vacancy capacities of the prediction difference types is presented to encode the prediction difference table. Therefore, the original index table is compressed, and the secret data are embedded into the vacated room. The experimental results demonstrate that the proposed scheme can reduce the pure compression rate compared with the related works.
(This article belongs to the Special Issue Recent Advances in Information Security and Data Privacy)
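Two building blocks named in this abstract, median edge detection (MED) prediction and XOR differencing, can be sketched on a toy index table. The paper's adaptive index rearrangement and difference coding stages are omitted, and the fallback rules for the first row and column are illustrative assumptions.

```python
import numpy as np

def med_predict(a, b, c):
    # Median edge detector: a = left, b = above, c = upper-left neighbor.
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c

def xor_differences(index_table):
    # Keep the first index verbatim; XOR every other index with its
    # prediction. On smooth index tables the differences cluster near
    # zero, which is what makes them compressible.
    t = np.asarray(index_table, dtype=np.int64)
    diff = t.copy()
    rows, cols = t.shape
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                continue                      # retained first index
            if i == 0:
                pred = t[i, j - 1]            # first row: use left neighbor
            elif j == 0:
                pred = t[i - 1, j]            # first column: use above neighbor
            else:
                pred = med_predict(t[i, j - 1], t[i - 1, j], t[i - 1, j - 1])
            diff[i, j] = t[i, j] ^ pred
    return diff

print(xor_differences([[5, 5], [5, 7]]).tolist())  # [[5, 0], [0, 2]]
```

Because XOR is its own inverse, the original table is recoverable exactly from the difference table, which is the property a lossless scheme needs.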

15 pages, 550 KiB  
Article
Performance Analysis of a New Non-Orthogonal Multiple Access Design for Mitigating Information Loss
by Sang-Wook Park, Hyoung-Do Kim, Kyung-Ho Shin, Jin-Woo Kim, Seung-Hwan Seo, Yoon-Ju Choi, Young-Hwan You, Yeon-Kug Moon and Hyoung-Kyu Song
Mathematics 2024, 12(17), 2752; https://doi.org/10.3390/math12172752 - 5 Sep 2024
Viewed by 313
Abstract
This paper proposes a scheme that adds XOR bit operations into the encoding and decoding process of the conventional non-orthogonal multiple access (NOMA) system to alleviate performance degradation caused by the power distribution of the original signal. Because the conventional NOMA combines and sends multiple data within limited resources, it has a higher data rate than orthogonal multiple access (OMA), at the expense of error performance. However, by using the proposed scheme, both error performance and sum rate can be improved. In the proposed scheme, the transmitter sends the original data and the redundancy data in which the exclusive OR (XOR) values of the data are compressed using the superposition coding (SC) technique. After this process, the data rate of users decreases due to redundancy data, but since the original data are sent without power allocation, the data rate of users with poor channel conditions increases compared to the conventional NOMA. As a result, the error performance and sum rate of the proposed scheme are better than those of the conventional NOMA. Additionally, we derive an exact closed-form bit error rate (BER) expression for the proposed downlink NOMA design over Rayleigh fading channels.
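The XOR redundancy idea in this abstract can be sketched independently of superposition coding and power allocation (both omitted here): the transmitter forms an extra stream holding the XOR of the two users' bits, and either user's data can be rebuilt from the other stream plus that redundancy. The bit vectors below are made-up examples.

```python
def encode(bits_user1, bits_user2):
    # Redundancy stream: bitwise XOR of the two users' data streams.
    redundancy = [a ^ b for a, b in zip(bits_user1, bits_user2)]
    return bits_user1, bits_user2, redundancy

def recover_user2(bits_user1, redundancy):
    # XOR is self-inverse: u1 ^ (u1 ^ u2) == u2.
    return [a ^ r for a, r in zip(bits_user1, redundancy)]

u1, u2 = [1, 0, 1, 1], [0, 0, 1, 0]
_, _, red = encode(u1, u2)
print(red)                    # [1, 0, 0, 1]
print(recover_user2(u1, red))  # [0, 0, 1, 0]
```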

22 pages, 3877 KiB  
Article
Mother–Daughter Vessel Operation and Maintenance Routing Optimization for Offshore Wind Farms Using Restructuring Particle Swarm Optimization
by Yuanhang Qi, Haoyu Luo, Gewen Huang, Peng Hou, Rongsen Jin and Yuhui Luo
Biomimetics 2024, 9(9), 536; https://doi.org/10.3390/biomimetics9090536 - 5 Sep 2024
Viewed by 435
Abstract
As the capacity of individual offshore wind turbines increases, prolonged downtime (due to maintenance or faults) will result in significant economic losses. This necessitates enhancing the efficiency of vessel operation and maintenance (O&M) to reduce O&M costs. Existing research mostly focuses on planning O&M schemes for individual vessels. However, there exists a research gap in the scientific scheduling for state-of-the-art O&M vessels. To bridge this gap, this paper considers the use of an advanced O&M vessel in the O&M process, taking into account the downtime costs associated with wind turbine maintenance and repair incidents. A mathematical model is constructed with the objective of minimizing overall O&M expenditure. Building upon this formulation, this paper introduces a novel restructuring particle swarm optimization which is paired with a bespoke encoding and decoding strategy, designed to yield an optimized solution that aligns with the intricate demands of the problem at hand. The simulation results indicate that the proposed method can achieve significant savings of 28.85% in O&M costs. The outcomes demonstrate the algorithm’s proficiency in tackling the model efficiently and effectively.
(This article belongs to the Special Issue Nature-Inspired Metaheuristic Optimization Algorithms 2024)

19 pages, 717 KiB  
Article
Imperative Genetic Programming
by Iztok Fajfar, Žiga Rojec, Árpád Bűrmen, Matevž Kunaver, Tadej Tuma, Sašo Tomažič and Janez Puhan
Symmetry 2024, 16(9), 1146; https://doi.org/10.3390/sym16091146 - 3 Sep 2024
Viewed by 541
Abstract
Genetic programming (GP) has a long-standing tradition in the evolution of computer programs, predominantly utilizing tree and linear paradigms, each with distinct advantages and limitations. Despite the rapid growth of the GP field, there have been disproportionately few attempts to evolve 'real' Turing-like imperative programs (as contrasted with functional programming) from the ground up. Existing research focuses mainly on specific special cases where the structure of the solution is partly known. This paper explores the potential of integrating tree and linear GP paradigms to develop an encoding scheme that universally supports genetic operators without constraints and consistently generates syntactically correct Python programs from scratch. By blending the symmetrical structure of tree-based representations with the inherent asymmetry of linear sequences, we created a versatile environment for program evolution. Our approach was rigorously tested on 35 problems characterized by varying Halstead complexity metrics, to delineate the approach's boundaries. While expected brute-force program solutions were observed, our method yielded more sophisticated strategies, such as optimizing a program by restricting the division trials to the values up to the square root of the number when counting its proper divisors. Despite the recent groundbreaking advancements in large language models, we assert that the GP field warrants continued research. GP embodies a fundamentally different computational paradigm, crucial for advancing our understanding of natural evolutionary processes.
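The square-root strategy this abstract credits to the evolved programs can be written by hand as follows. This is an illustration of the strategy, not an evolved program: every divisor d ≤ √n pairs with n // d, so one pass up to √n finds them all.

```python
import math

def count_proper_divisors(n):
    # Trial division only up to sqrt(n): each divisor d <= sqrt(n)
    # pairs with n // d, so both are counted in one pass.
    count = 0
    for d in range(1, math.isqrt(n) + 1):
        if n % d == 0:
            count += 1                 # d itself
            if d != n // d:
                count += 1             # its partner n // d
    return count - 1                   # exclude n itself (proper divisors)

print(count_proper_divisors(12))  # divisors 1, 2, 3, 4, 6 -> 5
```

The naive brute-force version loops to n - 1; this variant does O(√n) work instead, which is the kind of non-obvious refinement the authors highlight.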

22 pages, 4816 KiB  
Article
Ultrasonic Obstacle Avoidance and Full-Speed-Range Hybrid Control for Intelligent Garages
by Lijie Wang, Xianwen Zhu, Ziyi Li and Shuchao Li
Sensors 2024, 24(17), 5694; https://doi.org/10.3390/s24175694 - 1 Sep 2024
Viewed by 518
Abstract
In the current study, which focuses on the operational safety problem in intelligent three-dimensional garages, an obstacle avoidance measurement and control scheme for the AGV parking robot is proposed. Under the premise of high-precision distance detection using Kalman filtering, a mathematical model of a brushless DC (BLDC) motor with full-speed range hybrid control is established. MATLAB/Simulink (R2022a) is used to build the control model, which has dual closed-loop vector-controlled motors in the low- to medium-speed range, with photoelectric encoders for speed feedback. The simulation results show that, at lower to medium speeds, the maximum overshoot of the output response curve is 1.5%, and the response time is 0.01 s. However, at higher speeds, there is significant jitter in the speed output waveform. Therefore, the speed feedback is switched to a sliding mode observer (SMO) instead of the original speed sensor at high speeds. Experiments show that, based on the SMO, the problem of speed waveform jitter at high motor speeds can be significantly improved, and the BLDC motor system has strong robustness. The above shows that the motor speed under the full-speed range hybrid control system can meet the AGV control and safety requirements.
(This article belongs to the Special Issue Advanced Sensing and Measurement Control Applications)

12 pages, 1128 KiB  
Article
Multi-Feature Fusion in Graph Convolutional Networks for Data Network Propagation Path Tracing
by Dongsheng Jing, Yu Yang, Zhimin Gu, Renjun Feng, Yan Li and Haitao Jiang
Electronics 2024, 13(17), 3412; https://doi.org/10.3390/electronics13173412 - 28 Aug 2024
Viewed by 487
Abstract
With the rapid development of information technology, the complexity of data networks is increasing, especially in electric power systems, where data security and privacy protection are of great importance. Throughout the entire distribution process of the supply chain, it is crucial to closely monitor the propagation paths and dynamics of electrical data to ensure security and quickly initiate comprehensive traceability investigations if any data tampering is detected. This research addresses the challenges of data network complexity and its impact on the security of power systems by proposing an innovative data network propagation path tracing model, which is constructed based on graph convolutional networks (GCNs) and the BERT model. Firstly, propagation trees are constructed based on the propagation structure, and the key attributes of data nodes are extracted and screened. Then, GCNs are utilized to learn the representation of node features with different attribute feature combinations in the propagation path graph, while the Bidirectional Encoder Representations from Transformers (BERT) model is employed to capture the deep semantic features of the original text content. The core of this research is to effectively integrate these two feature representations, namely the structural features obtained by GCNs and the semantic features obtained by the BERT model, in order to enhance the ability of the model to recognize the data propagation path. The experimental results demonstrate that this model performs well in power data propagation and tracing tasks, and the data recognition accuracy reaches 92.5%, which is significantly better than the existing schemes. This achievement not only improves the power system’s ability to cope with data security threats but also provides strong support for protecting data transmission security and privacy.
(This article belongs to the Special Issue Knowledge and Information Extraction Research)
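The integration step this abstract centers on, combining GCN structural embeddings with BERT semantic vectors, is commonly implemented as concatenation followed by a classification head. The numpy sketch below uses that common pattern with illustrative dimensions and random stand-in features; it is not the paper's exact architecture.

```python
import numpy as np

def fuse_features(gcn_feats, bert_feats, w, b):
    # Late fusion: concatenate each node's structural and semantic
    # vectors, then apply a linear classification head.
    fused = np.concatenate([gcn_feats, bert_feats], axis=-1)   # (N, d1 + d2)
    return fused @ w + b                                       # (N, classes)

rng = np.random.default_rng(1)
gcn = rng.standard_normal((5, 64))      # hypothetical GCN node embeddings
bert = rng.standard_normal((5, 768))    # hypothetical BERT sentence vectors
w = rng.standard_normal((64 + 768, 2))  # stand-in for learned head weights
logits = fuse_features(gcn, bert, w, np.zeros(2))
print(logits.shape)  # (5, 2)
```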

23 pages, 1762 KiB  
Article
Dynamic Framing and Power Allocation for Real-Time Wireless Networks with Variable-Length Coding: A Tandem Queue Approach
by Yuanrui Liu, Xiaoyu Zhao, Wei Chen and Ying-Jun Angela Zhang
Network 2024, 4(3), 367-389; https://doi.org/10.3390/network4030017 - 27 Aug 2024
Viewed by 426
Abstract
Ensuring high reliability and low latency poses challenges for numerous applications that require rigid performance guarantees, such as industrial automation and autonomous vehicles. Our research primarily concentrates on addressing the real-time requirements of ultra-reliable low-latency communication (URLLC). Specifically, we tackle the challenge of hard delay constraints in real-time transmission systems, overcoming this obstacle through a finite blocklength coding scheme. In the physical layer, we encode randomly arriving packets using a variable-length coding scheme and transmit the encoded symbols by truncated channel inversion over parallel channels. In the network layer, we model the encoding and transmission processes as tandem queues. These queues backlog the data bits waiting to be encoded and the encoded symbols to be transmitted, respectively. This way, we represent the system as a two-dimensional Markov chain. By focusing on instances when the symbol queue is empty, we simplify the Markov chain into a one-dimensional Markov chain, with the packet queue being the system state. This approach allows us to analytically express power consumption and formulate a power minimization problem under hard delay constraints. Finally, we propose a heuristic algorithm to solve the problem and provide an extensive evaluation of the trade-offs between the hard delay constraint and power consumption.

19 pages, 6601 KiB  
Article
An Innovative Recompression Scheme for VQ Index Tables
by Yijie Lin, Jui-Chuan Liu, Ching-Chun Chang and Chin-Chen Chang
Future Internet 2024, 16(8), 297; https://doi.org/10.3390/fi16080297 - 19 Aug 2024
Viewed by 281
Abstract
As we move into the digital era, the pace of technological advancement is accelerating rapidly. Network traffic often becomes congested during the transmission of large data volumes. To mitigate this, data compression plays a crucial role in minimizing transmitted data. Vector quantization (VQ) stands out as a potent compression technique where each image block is encoded independently as an index linked to a codebook, effectively reducing the bit rate. In this paper, we introduce a novel scheme for recompressing VQ indices, enabling lossless restoration of the original indices during decoding without compromising visual quality. Our method not only considers pixel correlations within each image block but also leverages correlations between neighboring blocks, further optimizing the bit rate. The experimental results demonstrated the superior performance of our approach over existing methods.

22 pages, 1965 KiB  
Article
Long Short-Term Memory-Based Non-Uniform Coding Transmission Strategy for a 360-Degree Video
by Jia Guo, Chengrui Li, Jinqi Zhu, Xiang Li, Qian Gao, Yunhe Chen and Weijia Feng
Electronics 2024, 13(16), 3281; https://doi.org/10.3390/electronics13163281 - 19 Aug 2024
Viewed by 391
Abstract
This paper studies an LSTM-based adaptive transmission method for a 360-degree video and proposes a non-uniform encoding transmission strategy based on LSTM. Our goal is to maximize the user’s video experience by dynamically dividing the 360-degree video into tiles of different numbers and sizes, and selecting different bitrates for each tile. This aims to reduce buffering events and video jitter. To determine the optimal number and size of tiles at the current moment, we constructed a dual-layer stacked LSTM network model. This model predicts, in real-time, the number, size, and bitrate of the tiles needed for the next moment of the 360-degree video based on the distance between the user’s eyes and the screen. In our experiments, we used an exhaustive algorithm to calculate the optimal tile division and bitrate selection scheme for a 360-degree video under different network conditions, and used this dataset to train our prediction model. Finally, by comparing with other advanced algorithms, we demonstrated the superiority of our proposed method.

18 pages, 5207 KiB  
Article
MAPPNet: A Multi-Scale Attention Pyramid Pooling Network for Dental Calculus Segmentation
by Tianyu Nie, Shihong Yao, Di Wang, Conger Wang and Yishi Zhao
Appl. Sci. 2024, 14(16), 7273; https://doi.org/10.3390/app14167273 - 19 Aug 2024
Viewed by 529
Abstract
Dental diseases are among the most prevalent diseases globally, and accurate segmentation of dental calculus images plays a crucial role in periodontal disease diagnosis and treatment planning. However, the current methods are not stable and reliable enough due to the variable morphology of dental calculus and the blurring of the boundaries between the dental edges and the surrounding tissues; we therefore propose an accurate and reliable calculus segmentation algorithm to improve the efficiency of clinical detection. We propose a multi-scale attention pyramid pooling network (MAPPNet) to enhance the performance of dental calculus segmentation. The network incorporates a multi-scale fusion strategy in both the encoder and decoder, forming a model with a dual-ended multi-scale structure. This design, in contrast to employing a multi-scale fusion scheme at a single end, enables more effective capturing of features from diverse scales. Furthermore, the attention pyramid pooling module (APPM) reconstructs the features on this map by leveraging a spatial-first and channel-second attention mechanism. APPM enables the network to adaptively adjust the weights of different locations and channels in the feature map, thereby enhancing the perception of important regions and key features. Experimental evaluation of our collected dental calculus segmentation dataset demonstrates the superior performance of MAPPNet, which achieves an intersection-over-union of 81.46% and an accuracy rate of 98.35%. Additionally, on two publicly available datasets, ISIC2018 (skin lesion dataset) and Kvasir-SEG (gastrointestinal polyp segmentation dataset), MAPPNet achieved an intersection-over-union of 76.48% and 91.38%, respectively. These results validate the effectiveness of our proposed network in accurately segmenting lesion regions and achieving high accuracy rates, surpassing many existing segmentation methods.
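The channel-attention idea behind modules like APPM follows the standard squeeze-and-excitation pattern: global average pooling per channel, a small bottleneck excitation network, then per-channel gating. The numpy sketch below shows that pattern with random stand-ins for learned weights; it is not the paper's exact module.

```python
import numpy as np

def squeeze_excite(feature_map, w1, w2):
    # feature_map: (C, H, W). Squeeze: global average pool per channel.
    z = feature_map.mean(axis=(1, 2))                 # (C,)
    # Excite: bottleneck FC -> ReLU -> FC -> sigmoid gives channel gates.
    s = np.maximum(w1 @ z, 0)                         # (C // r,)
    gates = 1 / (1 + np.exp(-(w2 @ s)))               # (C,) in (0, 1)
    # Recalibrate: scale each channel by its gate, emphasizing or
    # suppressing feature channels adaptively.
    return feature_map * gates[:, None, None]

rng = np.random.default_rng(0)
C, r = 8, 4                              # channels and reduction ratio
x = rng.standard_normal((C, 16, 16))
w1 = rng.standard_normal((C // r, C))    # stand-in for learned weights
w2 = rng.standard_normal((C, C // r))
y = squeeze_excite(x, w1, w2)
print(y.shape)  # (8, 16, 16)
```

Because the gates lie strictly in (0, 1), the block can only attenuate channels relative to their input, which is the "recalibration" the abstract refers to.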

16 pages, 10945 KiB  
Article
Impact of Video Motion Content on HEVC Coding Efficiency
by Khalid A. M. Salih, Ismail Amin Ali and Ramadhan J. Mstafa
Computers 2024, 13(8), 204; https://doi.org/10.3390/computers13080204 - 18 Aug 2024
Viewed by 520
Abstract
Digital video coding aims to reduce the bitrate and keep the integrity of visual presentation. High-Efficiency Video Coding (HEVC) can effectively compress video content to be suitable for delivery over various networks and platforms. Finding the optimal coding configuration is challenging as the compression performance highly depends on the complexity of the encoded video sequence. This paper evaluates the effects of motion content on coding performance and suggests an adaptive encoding scheme based on the motion content of encoded video. To evaluate the effects of motion content on the compression performance of HEVC, we tested three coding configurations with different Group of Pictures (GOP) structures and intra refresh mechanisms. Namely, open GOP IPPP, open GOP Periodic-I, and closed GOP periodic-IDR coding structures were tested using several test sequences with a range of resolutions and motion activity. All sequences were first tested to check their motion activity. The rate–distortion curves were produced for all the test sequences and coding configurations. Our results show that the performance of IPPP coding configuration is significantly better (up to 4 dB) than periodic-I and periodic-IDR configurations for sequences with low motion activity. For test sequences with intermediate motion activity, IPPP configuration can still achieve a reasonable quality improvement over periodic-I and periodic-IDR configurations. However, for test sequences with high motion activity, IPPP configuration has a very small performance advantage over periodic-I and periodic-IDR configurations. Our results indicate the importance of selecting the appropriate coding structure according to the motion activity of the video being encoded.

13 pages, 1615 KiB  
Article
Semi-Supervised Left-Atrial Segmentation Based on Squeeze–Excitation and Triple Consistency Training
by Dongsheng Wang, Tiezhen Xv, Jianshen Li, Jiehui Liu, Jinxi Guo and Lijie Yang
Symmetry 2024, 16(8), 1041; https://doi.org/10.3390/sym16081041 - 14 Aug 2024
Viewed by 443
Abstract
Convolutional neural networks (CNNs) have achieved remarkable success in fully supervised medical image segmentation tasks. However, the acquisition of large quantities of homogeneous labeled data is challenging, making semi-supervised training methods that rely on a small amount of labeled data and pseudo-labels increasingly popular in recent years. Most existing semi-supervised learning methods, however, underestimate the importance of the unlabeled regions during training. This paper posits that these regions may contain crucial information for minimizing the model’s uncertainty prediction. To enhance the segmentation performance of the left-atrium database, this paper proposes a triple consistency segmentation network based on the squeeze-and-excitation mechanism (SETC-Net). Specifically, the paper constructs a symmetric architectural unit called SEConv, which adaptively recalibrates the feature responses in the channel direction by modeling the inter-channel correlations. This allows the network to adaptively weigh each channel according to the task’s needs, thereby emphasizing or suppressing different feature channels. Moreover, SETC-Net is composed of an encoder and three slightly different decoders, which convert the prediction discrepancies among the three decoders into unsupervised loss through a constructed iterative pseudo-labeling scheme, thus encouraging consistent and low-entropy predictions. This allows the model to gradually capture generalized features from these challenging unmarked regions. We evaluated the proposed SETC-Net on the public left-atrium (LA) database. The proposed method achieved an excellent Dice score of 91.14% using only 20% of the labeled data. The experiments demonstrate that the proposed SETC-Net outperforms seven current semi-supervised methods in left-atrium segmentation and is one of the best semi-supervised segmentation methods on the LA database.
(This article belongs to the Section Computer)