Search Results (41)

Search Parameters:
Keywords = AGs-Unet model

15 pages, 6316 KiB  
Article
Deep Learning Based Automatic Left Ventricle Segmentation from the Transgastric Short-Axis View on Transesophageal Echocardiography: A Feasibility Study
by Yuan Tian, Wenting Qin, Zihang Zhao, Chunrong Wang, Yajie Tian, Yuelun Zhang, Kai He, Yuguan Zhang, Le Shen, Zhuhuang Zhou and Chunhua Yu
Diagnostics 2024, 14(15), 1655; https://doi.org/10.3390/diagnostics14151655 - 31 Jul 2024
Abstract
Segmenting the left ventricle from the transgastric short-axis views (TSVs) on transesophageal echocardiography (TEE) is the cornerstone for cardiovascular assessment during perioperative management. Even for seasoned professionals, the procedure remains time-consuming and experience-dependent. The current study aims to evaluate the feasibility of deep learning for automatic segmentation by assessing the validity of different U-Net algorithms. A large dataset containing 1388 TSV acquisitions was retrospectively collected from 451 patients (32% women, average age 53.42 years) who underwent perioperative TEE between July 2015 and October 2023. With image preprocessing and data augmentation, 3336 images were included in the training set, 138 images in the validation set, and 138 images in the test set. Four deep neural networks (U-Net, Attention U-Net, UNet++, and UNeXt) were employed for left ventricle segmentation and compared in terms of the Jaccard similarity coefficient (JSC) and Dice similarity coefficient (DSC) on the test set, as well as the number of network parameters, training time, and inference time. The Attention U-Net and U-Net++ models performed better in terms of JSC (the highest average JSC: 86.02%) and DSC (the highest average DSC: 92.00%), the UNeXt model had the smallest network parameters (1.47 million), and the U-Net model had the least training time (6428.65 s) and inference time for a single image (101.75 ms). The Attention U-Net model outperformed the other three models in challenging cases, including the impaired boundary of left ventricle and the artifact of the papillary muscle. This pioneering exploration demonstrated the feasibility of deep learning for the segmentation of the left ventricle from TSV on TEE, which will facilitate an accelerated and objective alternative of cardiovascular assessment for perioperative management. Full article
(This article belongs to the Special Issue Deep Learning Techniques for Medical Image Analysis)
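The JSC and DSC reported above follow the standard overlap definitions (Jaccard = |A∩B|/|A∪B|, Dice = 2|A∩B|/(|A|+|B|)). A minimal illustrative sketch on flat binary masks, not the authors' code:

```python
def dice_jaccard(mask_a, mask_b):
    """Dice (DSC) and Jaccard (JSC) similarity between two flat binary masks.

    Both masks are sequences of 0/1 values of equal length.
    """
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    union = sum(a | b for a, b in zip(mask_a, mask_b))
    dice = 2 * inter / (sum(mask_a) + sum(mask_b))
    jaccard = inter / union
    return dice, jaccard
```

In practice these would be computed per image over the predicted and ground-truth segmentation masks and then averaged over the test set.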

31 pages, 7122 KiB  
Article
Delineation of 12-Lead ECG Representative Beats Using Convolutional Encoder–Decoders with Residual and Recurrent Connections
by Vessela Krasteva, Todor Stoyanov, Ramun Schmid and Irena Jekova
Sensors 2024, 24(14), 4645; https://doi.org/10.3390/s24144645 - 17 Jul 2024
Abstract
The aim of this study is to address the challenge of 12-lead ECG delineation by different encoder–decoder architectures of deep neural networks (DNNs). This study compares four concepts for encoder–decoders based on a fully convolutional architecture (CED-Net) and its modifications with a recurrent layer (CED-LSTM-Net), residual connections between symmetrical encoder and decoder feature maps (CED-U-Net), and sequential residual blocks (CED-Res-Net). All DNNs transform 12-lead representative beats to three diagnostic ECG intervals (P-wave, QRS-complex, QT-interval) used for the global delineation of the representative beat (P-onset, P-offset, QRS-onset, QRS-offset, T-offset). All DNNs were trained and optimized using the large PhysioNet ECG database (PTB-XL) under identical conditions, applying an advanced approach for machine-based supervised learning with a reference algorithm for ECG delineation (ETM, Schiller AG, Baar, Switzerland). The test results indicate that all DNN architectures are equally capable of reproducing the reference delineation algorithm’s measurements in the diagnostic PTB database with an average P-wave detection accuracy (96.6%) and time and duration errors: mean values (−2.6 to 2.4 ms) and standard deviations (2.9 to 11.4 ms). The validation according to the standard-based evaluation practices of diagnostic electrocardiographs with the CSE database outlines a CED-Net model, which measures P-duration (2.6 ± 11.0 ms), PQ-interval (0.9 ± 5.8 ms), QRS-duration (−2.4 ± 5.4 ms), and QT-interval (−0.7 ± 10.3 ms), which meet all standard tolerances. Noise tests with high-frequency, low-frequency, and power-line frequency noise (50/60 Hz) confirm that CED-Net, CED-Res-Net, and CED-LSTM-Net are robust to all types of noise, mostly presenting a mean duration error < 2.5 ms when compared to measurements without noise. Reduced noise immunity is observed for the U-net architecture. 
Comparative analysis with other published studies scores this research within the lower range of time errors, highlighting its competitive performance. Full article
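The "mean value ± standard deviation" timing errors quoted above are signed offsets between the network's fiducial points and the reference delineation. A minimal sketch of how such error statistics are tallied (illustrative only, using hypothetical millisecond values):

```python
from statistics import mean, stdev

def error_stats(pred_ms, ref_ms):
    """Mean and standard deviation of signed timing errors (prediction
    minus reference), both given in milliseconds."""
    errors = [p - r for p, r in zip(pred_ms, ref_ms)]
    return mean(errors), stdev(errors)
```

A negative mean indicates the detector systematically places the fiducial point earlier than the reference.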

21 pages, 12765 KiB  
Article
Unveiling the Urban Morphology of Small Towns in the Eastern Qinba Mountains: Integrating Earth Observation and Morphometric Analysis
by Xin Zhao and Zuobin Wu
Buildings 2024, 14(7), 2015; https://doi.org/10.3390/buildings14072015 - 2 Jul 2024
Abstract
In the context of the current information age, leveraging Earth observation (EO) technology and spatial analysis methods enables a more accurate understanding of the characteristics of small towns. This study conducted an in-depth analysis of the urban morphology of small towns in the Qinba Mountain Area of Southern Shaanxi by employing large-scale data analysis and innovative urban form measurement methods. The U-Net3+ model, based on deep learning technology, combined with the concave hull algorithm, was used to extract and precisely define the boundaries of 31,799 buildings and small towns. The morphological characteristics of the town core were measured, and the core areas of the small towns were defined using calculated tessellation cells. Hierarchical clustering methods were applied to analyze 12 characteristic indicators of 89 towns, and various metrics were calculated to determine the optimal number of clusters. The analysis identified eight distinct clusters based on the towns’ morphological differences. Significant morphological differences between the small towns in the Qinba Mountain Area were observed. The clustering results revealed that the towns exhibited diverse shapes and distributions, ranging from irregular and sparse to compact and dense forms, reflecting distinct layout patterns influenced by the unique context of each town. The use of the morphometric method, based on cellular and biological morphometry, provided a new perspective on the urban form and deepened the understanding of the spatial structure of the small towns from a micro perspective. These findings not only contribute to the development of quantitative morphological indicators for town development and planning but also demonstrate a novel, data-driven approach to conventional urban morphology studies. Full article

10 pages, 6435 KiB  
Article
A Deep Learning Approach to Automatic Tooth Caries Segmentation in Panoramic Radiographs of Children in Primary Dentition, Mixed Dentition, and Permanent Dentition
by Esra Asci, Munevver Kilic, Ozer Celik, Kenan Cantekin, Hasan Basri Bircan, İbrahim Sevki Bayrakdar and Kaan Orhan
Children 2024, 11(6), 690; https://doi.org/10.3390/children11060690 - 5 Jun 2024
Abstract
Objectives: The purpose of this study was to evaluate the effectiveness of dental caries segmentation on the panoramic radiographs taken from children in primary dentition, mixed dentition, and permanent dentition with Artificial Intelligence (AI) models developed using the deep learning method. Methods: This study used 6075 panoramic radiographs taken from children aged between 4 and 14 to develop the AI model. The radiographs included in the study were divided into three groups: primary dentition (n: 1857), mixed dentition (n: 1406), and permanent dentition (n: 2812). The U-Net model implemented with PyTorch library was used for the segmentation of caries lesions. A confusion matrix was used to evaluate model performance. Results: In the primary dentition group, the sensitivity, precision, and F1 scores calculated using the confusion matrix were found to be 0.8525, 0.9128, and 0.8816, respectively. In the mixed dentition group, the sensitivity, precision, and F1 scores calculated using the confusion matrix were found to be 0.7377, 0.9192, and 0.8185, respectively. In the permanent dentition group, the sensitivity, precision, and F1 scores calculated using the confusion matrix were found to be 0.8271, 0.9125, and 0.8677, respectively. In the total group including primary, mixed, and permanent dentition, the sensitivity, precision, and F1 scores calculated using the confusion matrix were 0.8269, 0.9123, and 0.8675, respectively. Conclusions: Deep learning-based AI models are promising tools for the detection and diagnosis of caries in panoramic radiographs taken from children with different dentition. Full article
(This article belongs to the Section Pediatric Dentistry & Oral Medicine)
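The sensitivity, precision, and F1 scores quoted per dentition group derive from confusion-matrix counts in the standard way. A minimal sketch (the counts below are hypothetical, not the study's data):

```python
def seg_metrics(tp, fp, fn):
    """Sensitivity (recall), precision, and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, precision, f1
```

F1 is the harmonic mean of precision and sensitivity, so it is pulled toward the weaker of the two, as seen in the mixed dentition group where the lower sensitivity (0.7377) drags F1 down to 0.8185.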

17 pages, 6040 KiB  
Article
AM-UNet: Field Ridge Segmentation of Paddy Field Images Based on an Improved MultiResUNet Network
by Xulong Wu, Peng Fang, Xing Liu, Muhua Liu, Peichen Huang, Xianhao Duan, Dakang Huang and Zhaopeng Liu
Agriculture 2024, 14(4), 637; https://doi.org/10.3390/agriculture14040637 - 21 Apr 2024
Cited by 1
Abstract
In order to solve the problem of image boundary segmentation caused by the irregularity of paddy fields in southern China, a high-precision segmentation method based on the improved MultiResUNet model for paddy field mapping is proposed, combining the characteristics of paddy field scenes. We introduce the attention gate (AG) mechanism at the end of the encoder–decoder skip connections in the MultiResUNet model to generate the weights and highlight the response of the field ridge area; add an atrous spatial pyramid pooling (ASPP) module after the encoder down-sampling, using an appropriate combination of dilation rates to improve the identification of small-scale edge details; and use 1 × 1 convolution to enlarge the receptive field after bilinear interpolation to increase segmentation accuracy, thus constructing the AM-UNet paddy field ridge segmentation model. The experimental results show that the IoU, precision, and F1 value of the AM-UNet model are 88.74%, 93.45%, and 93.95%, respectively, and that the inference time for a single image is 168 ms, enabling accurate and real-time segmentation of field ridges in a complex paddy field environment. Thus, the AM-UNet model can provide technical support for the development of vision-based automatic navigation systems for agricultural machines. Full article
(This article belongs to the Section Digital Agriculture)
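The IoU reported here is the same Jaccard overlap used elsewhere in these results, and it relates to Dice monotonically via DSC = 2·IoU/(1 + IoU). A minimal illustrative sketch (not the authors' code):

```python
def iou(mask_a, mask_b):
    """Intersection over union (Jaccard index) of two flat binary masks."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    union = sum(a | b for a, b in zip(mask_a, mask_b))
    return inter / union

def dice_from_iou(j):
    """Dice coefficient implied by a given IoU value."""
    return 2 * j / (1 + j)
```

The relation means a reported IoU of 88.74% corresponds to a Dice score of roughly 94%; the two metrics rank models identically.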

22 pages, 26451 KiB  
Article
Mapping the Distribution of High-Value Broadleaf Tree Crowns through Unmanned Aerial Vehicle Image Analysis Using Deep Learning
by Nyo Me Htun, Toshiaki Owari, Satoshi Tsuyuki and Takuya Hiroshima
Algorithms 2024, 17(2), 84; https://doi.org/10.3390/a17020084 - 17 Feb 2024
Cited by 1
Abstract
High-value timber species with economic and ecological importance are usually distributed at very low densities, such that accurate knowledge of the location of these trees within a forest is critical for forest management practices. Recent technological developments integrating unmanned aerial vehicle (UAV) imagery and deep learning provide an efficient method for mapping forest attributes. In this study, we explored the applicability of high-resolution UAV imagery and a deep learning algorithm to predict the distribution of high-value deciduous broadleaf tree crowns of Japanese oak (Quercus crispula) in an uneven-aged mixed forest in Hokkaido, northern Japan. UAV images were collected in September and October 2022 before and after the color change of the leaves of Japanese oak to identify the optimal timing of UAV image collection. RGB information extracted from the UAV images was analyzed using a ResU-Net model (U-Net model with a Residual Network 101 (ResNet101), pre-trained on large ImageNet datasets, as backbone). Our results, confirmed using validation data, showed that reliable F1 scores (>0.80) could be obtained with both UAV datasets. According to the overlay analyses of the segmentation results and all the annotated ground truth data, the best performance was that of the model with the October UAV dataset (F1 score of 0.95). Our case study highlights a potential methodology to offer a transferable approach to the management of high-value timber species in other regions. Full article
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)

14 pages, 3622 KiB  
Article
AI-Based Approach to One-Click Chronic Subdural Hematoma Segmentation Using Computed Tomography Images
by Andrey Petrov, Alexey Kashevnik, Mikhail Haleev, Ammar Ali, Arkady Ivanov, Konstantin Samochernykh, Larisa Rozhchenko and Vasiliy Bobinov
Sensors 2024, 24(3), 721; https://doi.org/10.3390/s24030721 - 23 Jan 2024
Abstract
This paper presents a computer vision-based approach to chronic subdural hematoma segmentation that can be performed by one click. Chronic subdural hematoma is estimated to occur in 0.002–0.02% of the general population each year and the risk increases with age, with a high frequency of about 0.05–0.06% in people aged 70 years and above. In our research, we developed our own dataset, which includes 53 series of CT scans collected from 21 patients with one or two hematomas. Based on the dataset, we trained two neural network models based on U-Net architecture to automate the manual segmentation process. One of the models performed segmentation based only on the current frame, while the other additionally processed multiple adjacent images to provide context, a technique that is more similar to the behavior of a doctor. We used a 10-fold cross-validation technique to better estimate the developed models’ efficiency. We used the Dice metric for segmentation accuracy estimation, which was 0.77. Also, for testing our approach, we used scans from five additional patients who did not form part of the dataset, and created a scenario in which three medical experts carried out a hematoma segmentation before we carried out segmentation using our best model. We developed the OsiriX DICOM Viewer plugin to implement our solution into the segmentation process. We compared the segmentation time, which was more than seven times faster using the one-click approach, and the experts agreed that the segmentation quality was acceptable for clinical usage. Full article
(This article belongs to the Section Biomedical Sensors)
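The context-aware model described above processes multiple adjacent slices alongside the current one. One common way to assemble such input, sketched here under the assumption of a fixed symmetric window with edge clamping (the paper does not specify its exact windowing scheme):

```python
def slice_windows(series, half_width=1):
    """For each slice in a CT series, gather a context window of
    2*half_width + 1 neighbouring slices, clamping at the volume edges."""
    n = len(series)
    windows = []
    for i in range(n):
        window = [series[max(0, min(n - 1, j))]
                  for j in range(i - half_width, i + half_width + 1)]
        windows.append(window)
    return windows
```

Each window would then be stacked channel-wise and fed to the network, giving it the through-plane context a radiologist uses when scrolling between slices.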

20 pages, 3293 KiB  
Article
Deep Learning-Based Hip X-ray Image Analysis for Predicting Osteoporosis
by Shang-Wen Feng, Szu-Yin Lin, Yi-Hung Chiang, Meng-Han Lu and Yu-Hsiang Chao
Appl. Sci. 2024, 14(1), 133; https://doi.org/10.3390/app14010133 - 22 Dec 2023
Cited by 2
Abstract
Osteoporosis is a common problem in orthopedic medicine, and it has become an important medical issue in orthopedics as Taiwan is gradually becoming an aging society. In the diagnosis of osteoporosis, the bone mineral density (BMD) derived from dual-energy X-ray absorptiometry (DXA) is the main criterion for orthopedic diagnosis of osteoporosis, but due to the high cost of this equipment and the lower penetration rate of the equipment compared to the X-ray images, the problem of osteoporosis has not been effectively solved for many people who suffer from osteoporosis. At present, in clinical diagnosis, doctors are not yet able to accurately interpret X-ray images for osteoporosis manually and must rely on the data obtained from DXA. In recent years, with the continuous development of artificial intelligence, especially in the fields of machine learning and deep learning, significant progress has been made in image recognition. Therefore, it is worthwhile to revisit the question of whether it is possible to use a convolutional neural network model to read a hip X-ray image and then predict the patient’s BMD. In this study, we proposed a hip X-ray image segmentation model and a hip X-ray image recognition classification model. First, we used the U-Net model as a framework to segment the femoral neck, greater trochanter, Ward’s triangle, and the total hip in the hip X-ray images. We then performed image matting and data augmentation. Finally, we constructed a predictive model for osteoporosis using deep learning algorithms. In the segmentation experiments, we used intersection over union (IoU) as the evaluation metric for image segmentation, and both the U-Net model and the U-Net++ model achieved segmentation results greater than or equal to 0.5. In the classification experiments, using the T-score as the classification basis, the total hip using the DenseNet121 model has the highest accuracy of 74%. Full article
(This article belongs to the Topic Electronic Communications, IOT and Big Data)
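The classification above uses the T-score as its basis. Assuming the study follows the standard WHO criteria (the abstract does not state its exact cut-offs), the thresholds look like this:

```python
def classify_tscore(t):
    """Classify a DXA T-score using the standard WHO thresholds
    (assumed here; the study's exact grouping is not stated)."""
    if t <= -2.5:
        return "osteoporosis"
    if t < -1.0:
        return "osteopenia"
    return "normal"
```

Under these cut-offs, a predicted BMD can be mapped directly to a diagnostic class, which is what the DenseNet121 classifier's 74% accuracy refers to.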

20 pages, 7697 KiB  
Article
Integration of Unmanned Aerial Vehicle Imagery and Machine Learning Technology to Map the Distribution of Conifer and Broadleaf Canopy Cover in Uneven-Aged Mixed Forests
by Nyo Me Htun, Toshiaki Owari, Satoshi Tsuyuki and Takuya Hiroshima
Drones 2023, 7(12), 705; https://doi.org/10.3390/drones7120705 - 13 Dec 2023
Cited by 1
Abstract
Uneven-aged mixed forests have been recognized as important contributors to biodiversity conservation, ecological stability, carbon sequestration, the provisioning of ecosystem services, and sustainable timber production. Recently, numerous studies have demonstrated the applicability of integrating remote sensing datasets with machine learning for forest management purposes, such as forest type classification and the identification of individual trees. However, studies focusing on the integration of unmanned aerial vehicle (UAV) datasets with machine learning for mapping of tree species groups in uneven-aged mixed forests remain limited. Thus, this study explored the feasibility of integrating UAV imagery with semantic segmentation-based machine learning classification algorithms to describe conifer and broadleaf species canopies in uneven-aged mixed forests. The study was conducted in two sub-compartments of the University of Tokyo Hokkaido Forest in northern Japan. We analyzed UAV images using the semantic-segmentation based U-Net and random forest (RF) classification models. The results indicate that the integration of UAV imagery with the U-Net model generated reliable conifer and broadleaf canopy cover classification maps in both sub-compartments, while the RF model often failed to distinguish conifer crowns. Moreover, our findings demonstrate the potential of this method to detect dominant tree species groups in uneven-aged mixed forests. Full article
(This article belongs to the Special Issue Feature Papers for Drones in Agriculture and Forestry Section)

20 pages, 4734 KiB  
Article
Facial Wrinkle Detection with Multiscale Spatial Feature Fusion Based on Image Enhancement and ASFF-SEUnet
by Jiang Chen, Mingfang He and Weiwei Cai
Electronics 2023, 12(24), 4897; https://doi.org/10.3390/electronics12244897 - 5 Dec 2023
Cited by 1
Abstract
Wrinkles, crucial for age estimation and skin quality assessment, present challenges due to their uneven distribution, varying scale, and sensitivity to factors like lighting. To overcome these challenges, this study presents facial wrinkle detection with multiscale spatial feature fusion based on image enhancement and an adaptively spatial feature fusion squeeze-and-excitation Unet network (ASFF-SEUnet) model. Firstly, in order to improve wrinkle features and address the issue of uneven illumination in wrinkle images, an innovative image enhancement algorithm named Coiflet wavelet transform Donoho threshold and improved Retinex (CT-DIR) is proposed. Secondly, the ASFF-SEUnet model is designed to enhance the accuracy of full-face wrinkle detection across all age groups under the influence of lighting factors. It replaces the encoder part of the Unet network with EfficientNet, enabling the simultaneous adjustment of depth, width, and resolution for improved wrinkle feature extraction. The squeeze-and-excitation (SE) attention mechanism is introduced to grasp the correlation and importance among features, thereby enhancing the extraction of local wrinkle details. Finally, the adaptively spatial feature fusion (ASFF) module is incorporated to adaptively fuse multiscale features, capturing facial wrinkle information comprehensively. Experimentally, the method excels in detecting facial wrinkles amid complex backgrounds, robustly supporting facial skin quality diagnosis and age assessment. Full article

20 pages, 2974 KiB  
Article
Multi-Layer Preprocessing and U-Net with Residual Attention Block for Retinal Blood Vessel Segmentation
by Ahmed Alsayat, Mahmoud Elmezain, Saad Alanazi, Meshrif Alruily, Ayman Mohamed Mostafa and Wael Said
Diagnostics 2023, 13(21), 3364; https://doi.org/10.3390/diagnostics13213364 - 1 Nov 2023
Cited by 2
Abstract
Retinal blood vessel segmentation is a valuable tool for clinicians to diagnose conditions such as atherosclerosis, glaucoma, and age-related macular degeneration. This paper presents a new framework for segmenting blood vessels in retinal images. The framework has two stages: a multi-layer preprocessing stage and a subsequent segmentation stage employing a U-Net with a multi-residual attention block. The multi-layer preprocessing stage has three steps. The first step is noise reduction, employing a U-shaped convolutional neural network with matrix factorization (CNN with MF) and detailed U-shaped U-Net (D_U-Net) to minimize image noise, culminating in the selection of the most suitable image based on the PSNR and SSIM values. The second step is dynamic data imputation, utilizing multiple models for the purpose of filling in missing data. The third step is data augmentation through the utilization of a latent diffusion model (LDM) to expand the training dataset size. The second stage of the framework is segmentation, where the U-Nets with a multi-residual attention block are used to segment the retinal images after they have been preprocessed and noise has been removed. The experiments show that the framework is effective at segmenting retinal blood vessels. It achieved Dice scores of 95.32, accuracy of 93.56, precision of 95.68, and recall of 95.45. It also achieved efficient results in removing noise using CNN with matrix factorization (MF) and D-U-NET according to values of PSNR and SSIM for (0.1, 0.25, 0.5, and 0.75) levels of noise. The LDM achieved an inception score of 13.6 and an FID of 46.2 in the augmentation step. Full article
(This article belongs to the Special Issue Medical Data Processing and Analysis—2nd Edition)
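The PSNR values used above to pick the best-denoised image follow the standard definition from mean squared error. A minimal sketch on flat pixel sequences (illustrative only, not the authors' pipeline):

```python
import math

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_val ** 2 / mse)
```

Higher PSNR means less residual noise; the framework selects whichever of the two denoisers (CNN with MF or D_U-Net) scores better on PSNR and SSIM for a given image.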

17 pages, 1850 KiB  
Article
Improved UNet with Attention for Medical Image Segmentation
by Ahmed AL Qurri and Mohamed Almekkawy
Sensors 2023, 23(20), 8589; https://doi.org/10.3390/s23208589 - 20 Oct 2023
Cited by 13
Abstract
Medical image segmentation is crucial for medical image processing and the development of computer-aided diagnostics. In recent years, deep Convolutional Neural Networks (CNNs) have been widely adopted for medical image segmentation and have achieved significant success. UNet, which is based on CNNs, is the mainstream method used for medical image segmentation. However, its performance suffers owing to its inability to capture long-range dependencies. Transformers were initially designed for Natural Language Processing (NLP), and sequence-to-sequence applications have demonstrated the ability to capture long-range dependencies. However, their abilities to acquire local information are limited. Hybrid architectures of CNNs and Transformer, such as TransUNet, have been proposed to benefit from Transformer’s long-range dependencies and CNNs’ low-level details. Nevertheless, automatic medical image segmentation remains a challenging task due to factors such as blurred boundaries, the low-contrast tissue environment, and in the context of ultrasound, issues like speckle noise and attenuation. In this paper, we propose a new model that combines the strengths of both CNNs and Transformer, with network architectural improvements designed to enrich the feature representation captured by the skip connections and the decoder. To this end, we devised a new attention module called Three-Level Attention (TLA). This module is composed of an Attention Gate (AG), channel attention, and spatial normalization mechanism. The AG preserves structural information, whereas channel attention helps to model the interdependencies between channels. Spatial normalization employs the spatial coefficient of the Transformer to improve spatial attention akin to TransNorm. To further improve the skip connection and reduce the semantic gap, skip connections between the encoder and decoder were redesigned in a manner similar to that of the UNet++ dense connection. 
Moreover, deep supervision using a side-output channel was introduced, analogous to BASNet, which was originally used for saliency predictions. Two datasets from different modalities, a CT scan dataset and an ultrasound dataset, were used to evaluate the proposed UNet architecture. The experimental results showed that our model consistently improved the prediction performance of the UNet across different datasets. Full article
(This article belongs to the Section Biomedical Sensors)

25 pages, 13542 KiB  
Article
Comparative Analysis of Image Processing Techniques for Enhanced MRI Image Quality: 3D Reconstruction and Segmentation Using 3D U-Net Architecture
by Chee Chin Lim, Apple Ho Wei Ling, Yen Fook Chong, Mohd Yusoff Mashor, Khalilalrahman Alshantti and Mohd Ezane Aziz
Diagnostics 2023, 13(14), 2377; https://doi.org/10.3390/diagnostics13142377 - 14 Jul 2023
Cited by 3
Abstract
Osteosarcoma is a common type of bone tumor, particularly prevalent in children and adolescents between the ages of 5 and 25 who are experiencing growth spurts during puberty. Manual delineation of tumor regions in MRI images can be laborious and time-consuming, and results may be subjective and difficult to replicate. Therefore, a convolutional neural network (CNN) was developed to automatically segment osteosarcoma cancerous cells in three types of MRI images. The study consisted of five main stages. First, 3692 DICOM format MRI images were acquired from 46 patients, including T1-weighted, T2-weighted, and T1-weighted with injection of Gadolinium (T1W + Gd) images. Contrast stretching and median filter were applied to enhance image intensity and remove noise, and the pre-processed images were reconstructed into NIfTI format files for deep learning. The MRI images were then transformed to fit the CNN’s requirements. A 3D U-Net architecture was proposed with optimized parameters to build an automatic segmentation model capable of segmenting osteosarcoma from the MRI images. The 3D U-Net segmentation model achieved excellent results, with mean dice similarity coefficients (DSC) of 83.75%, 85.45%, and 87.62% for T1W, T2W, and T1W + Gd images, respectively. However, the study found that the proposed method had some limitations, including poorly defined borders, missing lesion portions, and other confounding factors. In summary, an automatic segmentation method based on a CNN has been developed to address the challenge of manually segmenting osteosarcoma cancerous cells in MRI images. While the proposed method showed promise, the study revealed limitations that need to be addressed to improve its efficacy. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
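The abstract above summarizes segmentation quality as mean Dice similarity coefficients (DSC) per MRI sequence. As a minimal sketch of how that overlap metric is computed on binary masks (the function name, array shapes, and toy volumes below are illustrative, not taken from the study's code):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    """Dice similarity coefficient between two binary masks.

    DSC = 2 * |P intersect T| / (|P| + |T|); eps avoids division by
    zero when both masks are empty.
    """
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + eps)

# Toy 3D volumes standing in for a predicted and a manual segmentation.
pred = np.zeros((4, 8, 8), dtype=bool)
truth = np.zeros((4, 8, 8), dtype=bool)
pred[1:3, 2:6, 2:6] = True   # 32 voxels
truth[1:3, 3:7, 2:6] = True  # 32 voxels, 24 of them overlapping
print(round(dice_coefficient(pred, truth), 4))  # → 0.75
```

The same formula applies slice-wise or volume-wise; per-sequence means such as those reported above would be averages of this value over the test cases.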

15 pages, 3354 KiB  
Article
Generation of Conventional 18F-FDG PET Images from 18F-Florbetaben PET Images Using Generative Adversarial Network: A Preliminary Study Using ADNI Dataset
by Hyung Jin Choi, Minjung Seo, Ahro Kim and Seol Hoon Park
Medicina 2023, 59(7), 1281; https://doi.org/10.3390/medicina59071281 - 10 Jul 2023
Viewed by 1518
Abstract
Background and Objectives: 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) (PETFDG) images can visualize neuronal injury of the brain in Alzheimer’s disease. Early-phase amyloid PET images are reported to be similar to PETFDG images. This study aimed to generate PETFDG images from 18F-florbetaben PET (PETFBB) images using a generative adversarial network (GAN) and to compare the generated PETFDG (PETGE-FDG) with real PETFDG (PETRE-FDG) images using the structural similarity index measure (SSIM) and the peak signal-to-noise ratio (PSNR). Materials and Methods: Using the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database, 110 participants with both PETFDG and PETFBB images at baseline were included. The paired PETFDG and PETFBB images comprised six and four subset images, respectively, each with a 5 min acquisition time. These subsets were randomly sampled and divided into 249 paired PETFDG and PETFBB subset images for the training datasets and 95 paired subset images for the validation datasets during the deep-learning process. The deep learning model used in this study is composed of a GAN with a U-Net. The differences in SSIM and PSNR values between the PETGE-FDG and PETRE-FDG images of the cycleGAN and pix2pix models were evaluated using the independent Student’s t-test, with statistical significance set at p ≤ 0.05. Results: Participant demographics (age, sex, and diagnosis) showed no statistically significant differences between the training (82 participants) and validation (28 participants) groups. The mean SSIM between the PETGE-FDG and PETRE-FDG images was 0.768 ± 0.135 for the cycleGAN model and 0.745 ± 0.143 for the pix2pix model; the mean PSNR was 32.4 ± 9.5 and 30.7 ± 8.0, respectively. The PETGE-FDG images of the cycleGAN model showed a statistically higher mean SSIM than those of the pix2pix model (p < 0.001). The mean PSNR was also higher for the PETGE-FDG images of the cycleGAN model than for those of the pix2pix model (p < 0.001). Conclusions: We generated PETFDG images from PETFBB images using deep learning. The cycleGAN model generated PETGE-FDG images with higher SSIM and PSNR values than the pix2pix model. Image-to-image translation using deep learning may be useful for generating PETFDG images, which may provide additional information for the management of Alzheimer’s disease without extra image acquisition and the consequent increase in radiation exposure, inconvenience, or expense. Full article
(This article belongs to the Section Neurology)
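The SSIM and PSNR values reported above compare generated and real image pairs. As a hedged sketch of both metrics: `psnr` follows the standard definition, while `global_ssim` applies the SSIM formula to whole-image statistics rather than the usual sliding window, so it only approximates library implementations such as scikit-image's windowed SSIM; the function names and toy images are assumptions, not the study's code.

```python
import numpy as np

def psnr(x: np.ndarray, y: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))

def global_ssim(x: np.ndarray, y: np.ndarray, data_range: float = 1.0) -> float:
    """SSIM formula evaluated on global image statistics.

    The standard metric averages this quantity over local windows;
    this simplified version uses one global window.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return float(((2 * mx * my + c1) * (2 * cov + c2)) /
                 ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

# Toy pair: a uniformly brightened copy gives MSE = 0.01, i.e. PSNR = 20 dB.
x = np.linspace(0.0, 0.8, 64).reshape(8, 8)
y = x + 0.1
print(round(psnr(x, y), 2))  # → 20.0
print(round(global_ssim(x, y), 4))
```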

13 pages, 16597 KiB  
Article
Deep-Learning-Based Morphological Feature Segmentation for Facial Skin Image Analysis
by Huisu Yoon, Semin Kim, Jongha Lee and Sangwook Yoo
Diagnostics 2023, 13(11), 1894; https://doi.org/10.3390/diagnostics13111894 - 29 May 2023
Cited by 8 | Viewed by 3212
Abstract
Facial skin analysis has attracted considerable attention in the skin health domain. The results of facial skin analysis can be used to provide skin care and cosmetic recommendations in aesthetic dermatology. Because several skin features coexist, grouping similar features and processing them together can improve skin analysis. In this study, a deep-learning-based method for the simultaneous segmentation of wrinkles and pores is proposed. Unlike color-based skin analysis, this method is based on the analysis of the morphological structures of the skin. Although multiclass segmentation is widely used in computer vision, this is its first application to facial skin analysis. The architecture of the model is U-Net, which has an encoder–decoder structure. First, two types of attention schemes were added to the network to focus on important areas; attention in deep learning refers to the process by which a neural network focuses on specific parts of its input to improve its performance. Second, a method to enhance the learning of positional information was added to the network, based on the fact that the locations of wrinkles and pores are fixed. Finally, a novel ground-truth generation scheme suited to the resolution of each skin feature (wrinkle and pore) was proposed. The experimental results revealed that the proposed unified method achieved excellent localization of wrinkles and pores, outperforming both conventional image-processing-based approaches and one recent successful deep-learning-based approach. The proposed method could be extended to applications such as age estimation and the prediction of potential diseases. Full article
(This article belongs to the Special Issue Advances in Non-invasive Skin Imaging Techniques)
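The attention schemes mentioned above re-weight feature maps so the network focuses on informative regions. A minimal NumPy sketch of one common variant, an additive attention gate in the spirit of Attention U-Net, is shown below; the paper's code is not published, so the shapes, names, and the use of channel-wise matmuls as 1×1-convolution analogues are illustrative assumptions (real gates typically derive the gating signal from a coarser decoder layer and upsample it).

```python
import numpy as np

def attention_gate(x, g, w_x, w_g, w_psi):
    """Additive attention gate over a skip connection.

    x, g: feature maps of shape (H, W, C); the weight matrices act as
    1x1 convolutions mapping channels C -> C_int -> 1. The gate yields
    a spatial mask in (0, 1) that re-weights the skip features x.
    """
    inter = x @ w_x + g @ w_g              # (H, W, C_int) combined features
    inter = np.maximum(inter, 0.0)         # ReLU nonlinearity
    logits = inter @ w_psi                 # (H, W, 1) attention logits
    mask = 1.0 / (1.0 + np.exp(-logits))   # sigmoid attention coefficients
    return x * mask                        # gated skip features, (H, W, C)

# Toy skip features x and gating signal g with random illustrative weights.
rng = np.random.default_rng(0)
H, W, C, C_int = 4, 4, 8, 4
x = rng.standard_normal((H, W, C))
g = rng.standard_normal((H, W, C))
out = attention_gate(x, g,
                     rng.standard_normal((C, C_int)),
                     rng.standard_normal((C, C_int)),
                     rng.standard_normal((C_int, 1)))
print(out.shape)  # → (4, 4, 8)
```

Because the mask lies in (0, 1), the gate can only attenuate skip features, never amplify them; the network learns to suppress irrelevant regions before the decoder concatenation.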
