
Search Results (2,695)

Search Parameters:
Keywords = feature aggregation

15 pages, 2372 KiB  
Article
PDeT: A Progressive Deformable Transformer for Photovoltaic Panel Defect Segmentation
by Peng Zhou, Hong Fang and Gaochang Wu
Sensors 2024, 24(21), 6908; https://doi.org/10.3390/s24216908 - 28 Oct 2024
Abstract
Defects in photovoltaic (PV) panels can significantly reduce the power generation efficiency of the system and may cause localized overheating due to uneven current distribution. Therefore, adopting precise pixel-level defect detection, i.e., defect segmentation, technology is essential to ensuring stable operation. However, for effective defect segmentation, the feature extractor must adaptively determine the appropriate scale or receptive field for accurate defect localization, while the decoder must seamlessly fuse coarse-level semantics with fine-grained features to enhance high-level representations. In this paper, we propose a Progressive Deformable Transformer (PDeT) for defect segmentation in PV cells. This approach effectively learns spatial sampling offsets and refines features progressively through coarse-level semantic attention. Specifically, the network adaptively captures spatial offset positions and computes self-attention, expanding the model’s receptive field and enabling feature extraction across objects of various shapes. Furthermore, we introduce a semantic aggregation module to refine semantic information, converting the fused feature map into a scale space and balancing contextual information. Extensive experiments demonstrate the effectiveness of our method, achieving an mIoU of 88.41% on our solar cell dataset, outperforming other methods. Additionally, to validate the PDeT’s applicability across different domains, we trained and tested it on the MVTec-AD dataset. The experimental results demonstrate that the PDeT exhibits excellent recognition performance in various other scenarios as well.
(This article belongs to the Special Issue Deep Learning for Perception and Recognition: Method and Applications)
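To make the learned spatial sampling offsets concrete, the sketch below implements a single deformable self-attention layer in PyTorch: offsets and per-point attention weights are predicted from the feature map, and values are gathered at the offset positions with grid sampling. The module name, number of sampling points, and projections are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of deformable attention with learned sampling offsets.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformableAttention2D(nn.Module):
    def __init__(self, dim: int, n_points: int = 4):   # n_points is an assumption
        super().__init__()
        self.n_points = n_points
        self.offset_pred = nn.Conv2d(dim, 2 * n_points, 3, padding=1)  # (dx, dy) per point
        self.attn_pred = nn.Conv2d(dim, n_points, 3, padding=1)        # one logit per point
        self.value_proj = nn.Conv2d(dim, dim, 1)
        self.out_proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x):                           # x: (B, C, H, W)
        B, C, H, W = x.shape
        v = self.value_proj(x)
        offsets = self.offset_pred(x)                # (B, 2K, H, W), normalized units
        attn = self.attn_pred(x).softmax(dim=1)      # (B, K, H, W)
        # Base sampling grid in normalized [-1, 1] coordinates.
        ys = torch.linspace(-1, 1, H, device=x.device)
        xs = torch.linspace(-1, 1, W, device=x.device)
        gy, gx = torch.meshgrid(ys, xs, indexing="ij")
        base = torch.stack((gx, gy), dim=-1)         # (H, W, 2)
        out = 0
        for k in range(self.n_points):
            off = offsets[:, 2 * k:2 * k + 2].permute(0, 2, 3, 1)    # (B, H, W, 2)
            grid = base.unsqueeze(0) + off           # shift each query's sample point
            sampled = F.grid_sample(v, grid, align_corners=True)     # (B, C, H, W)
            out = out + attn[:, k:k + 1] * sampled   # attention-weighted aggregation
        return self.out_proj(out)
```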

16 pages, 3470 KiB  
Article
YOLOv8-Based Estimation of Estrus in Sows Through Reproductive Organ Swelling Analysis Using a Single Camera
by Iyad Almadani, Mohammed Abuhussein and Aaron L. Robinson
Digital 2024, 4(4), 898-913; https://doi.org/10.3390/digital4040044 - 27 Oct 2024
Abstract
Accurate and efficient estrus detection in sows is crucial in modern agricultural practices to ensure optimal reproductive health and successful breeding outcomes. A non-contact method using computer vision to detect a change in a sow’s vulva size holds great promise for automating and enhancing this critical process. However, achieving precise and reliable results depends heavily on maintaining a consistent camera distance during image capture. Variations in camera distance can lead to erroneous estrus estimations, potentially resulting in missed breeding opportunities or false positives. To address this challenge, we propose a robust six-step methodology, accompanied by three stages of evaluation. First, we carefully annotated masks around the vulva to ensure an accurate pixel perimeter calculation of its shape. Next, we meticulously identified keypoints on the sow’s vulva, which enabled precise tracking and analysis of its features. We then harnessed the power of machine learning to train our model using annotated images, which facilitated keypoint detection and segmentation with the state-of-the-art YOLOv8 algorithm. By identifying the keypoints, we performed precise calculations of the Euclidean distances: first, between the labia (horizontal distance), and second, between the clitoris and the perineum (vertical distance). Additionally, by segmenting the vulva’s size, we gained valuable insights into its shape, which helped with performing precise perimeter measurements. Equally important was our effort to calibrate the camera using monocular depth estimation. This calibration helped establish a functional relationship between the measurements on the image (such as the distances between the labia and from the clitoris to the perineum, and the vulva perimeter) and the depth distance to the camera, which enabled accurate adjustments and calibration for our analysis. Lastly, we present a classification method for distinguishing between estrus and non-estrus states in subjects based on the pixel width, pixel length, and perimeter measurements. The method calculated the Euclidean distances between a new data point and reference points from two datasets: “estrus data” and “not estrus data”. Using custom distance functions, we computed the distances for each measurement dimension and aggregated them to determine the overall similarity. The classification process involved identifying the three nearest neighbors of the datasets and employing a majority voting mechanism to assign a label. A new data point was classified as “estrus” if the majority of the nearest neighbors were labeled as estrus; otherwise, it was classified as “non-estrus”. This method provided a robust approach for automated classification, which aided in more accurate and efficient detection of the estrus states. To validate our approach, we propose three evaluation stages. In the first stage, we calculated the Mean Squared Error (MSE) between the ground truth keypoints of the labia distance and the distance between the predicted keypoints, and we performed the same calculation for the distance between the clitoris and perineum. Then, we provided a quantitative analysis and performance comparison, including a comparison between our previous U-Net model and our new YOLOv8 segmentation model. This comparison focused on each model’s performance in terms of accuracy and speed, which highlighted the advantages of our new approach. Lastly, we evaluated the estrus–not-estrus classification model by defining the confusion matrix.
By using this comprehensive approach, we significantly enhanced the accuracy of estrus detection in sows while effectively mitigating human errors and resource wastage. The automation and optimization of this critical process hold the potential to revolutionize estrus detection in agriculture, which will contribute to improved reproductive health management and elevate breeding outcomes to new heights. Through extensive evaluation and experimentation, our research aimed to demonstrate the transformative capabilities of computer vision techniques, paving the way for more advanced and efficient practices in the agricultural domain.
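The classification step described above amounts to a 3-nearest-neighbor vote over three measurements. A minimal sketch, assuming plain Euclidean distance in place of the paper's custom per-dimension distance functions:

```python
# Hedged sketch of the 3-NN majority vote over (width, length, perimeter).
import numpy as np

def classify_estrus(sample, estrus_data, not_estrus_data, k=3):
    """sample: (width, length, perimeter); *_data: arrays of shape (N, 3)."""
    refs = np.vstack([estrus_data, not_estrus_data])
    labels = np.array(["estrus"] * len(estrus_data)
                      + ["non-estrus"] * len(not_estrus_data))
    # Aggregate per-dimension differences into one similarity score
    # (Euclidean distance here; the paper uses custom distance functions).
    dists = np.linalg.norm(refs - np.asarray(sample, dtype=float), axis=1)
    nearest = labels[np.argsort(dists)[:k]]
    # Majority vote among the k nearest neighbors.
    votes = int((nearest == "estrus").sum())
    return "estrus" if votes > k // 2 else "non-estrus"
```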

29 pages, 11619 KiB  
Article
MSA-GCN: Multistage Spatio-Temporal Aggregation Graph Convolutional Networks for Traffic Flow Prediction
by Ji Feng, Jiashuang Huang, Chang Guo and Zhenquan Shi
Mathematics 2024, 12(21), 3338; https://doi.org/10.3390/math12213338 - 24 Oct 2024
Abstract
Timely and accurate traffic flow prediction is crucial for stabilizing road conditions, reducing environmental pollution, and mitigating economic losses. While current graph convolution methods have achieved certain results, they do not fully leverage the true advantages of graph convolution: there is still room for improvement in simultaneously addressing multi-graph convolution, optimizing graphs, and simulating road conditions. To address these issues, this paper proposes MSA-GCN: Multistage Spatio-Temporal Aggregation Graph Convolutional Networks for Traffic Flow Prediction, which divides the process into distinct stages and achieves promising prediction results. In the first stage, we construct a latent similarity adjacency matrix and address the randomness interference features in similarity features through two optimizations using the proposed ConvGRU Attention Layer (CGAL module) and the Causal Similarity Capture Module (CSC module), which includes Granger causality tests. In the second stage, we mine the potential correlation between roads using the Correlation Completion Module (CC module) to create a global correlation adjacency matrix as a complement for potential correlations. In the third stage, we utilize the proposed Auto-LRU autoencoder to pre-train various weather features, encoding them into the model’s prediction process to enhance its ability to simulate the real world and improve interpretability. Finally, in the fourth stage, we fuse these features and use a Bidirectional Gated Recurrent Unit (BiGRU) to model time dependencies, outputting the prediction results through a linear layer. Our model demonstrates performance improvements of 29.33%, 27.03%, and 23.07% on three real-world datasets (PEMSD8, LOSLOOP, and SZAREA) compared to advanced baseline methods, and various ablation experiments validate the effectiveness of each stage and module.
(This article belongs to the Topic New Advances in Granular Computing and Data Mining)
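As a rough illustration of the fourth stage, the sketch below models time dependencies with a bidirectional GRU followed by a linear output layer; the dimensions and prediction horizon are assumptions, not the paper's configuration.

```python
# Minimal sketch of the BiGRU temporal stage over fused features.
import torch
import torch.nn as nn

class BiGRUPredictor(nn.Module):
    def __init__(self, in_dim=64, hidden=128, horizon=12):  # assumed sizes
        super().__init__()
        self.bigru = nn.GRU(in_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, horizon)   # forward + backward states

    def forward(self, x):            # x: (batch, time_steps, in_dim) fused features
        h, _ = self.bigru(x)         # (batch, time_steps, 2 * hidden)
        return self.head(h[:, -1])   # predict the next `horizon` flow values
```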

26 pages, 8224 KiB  
Article
SPFDNet: Water Extraction Method Based on Spatial Partition and Feature Decoupling
by Xuejun Cheng, Kuikui Han, Jian Xu, Guozhong Li, Xiao Xiao, Wengang Zhao and Xianjun Gao
Remote Sens. 2024, 16(21), 3959; https://doi.org/10.3390/rs16213959 - 24 Oct 2024
Abstract
Extracting water information from remote-sensing images is of great research significance for applications such as water resource protection and flood monitoring. Current water extraction methods aggregate richer multi-level features to enhance the output results, yet the water body and the water boundary impose different requirements. Indiscriminate multi-feature fusion can lead to perturbation and competition between these two types of features during optimization, so models cannot accurately locate both the internal vacancies within the water body and the external boundary. Therefore, this paper proposes a water feature extraction network with spatial partitioning and feature decoupling. To ensure that the water features are extracted with deep semantic features and stable spatial information before decoupling, we first design a chunked multi-scale feature aggregation module (CMFAM) to construct a context path for obtaining deep semantic information. Then, an information interaction module (IIM) is designed to exchange information between two spatial paths with fixed resolution intervals. During decoding, a feature decoupling module (FDM) is developed to utilize internal flow prediction to acquire the main-body features, while erasing techniques are employed to obtain boundary features. The deep features of the water body and the detailed boundary information are thus supplemented, strengthening the decoupled body and boundary features. Furthermore, an integrated expansion recoupling module (IERM) is designed for the recoupling stage. The IERM expands the water body and boundary features and adaptively compensates the transition region between them through information guidance. Finally, multi-level constraints are combined to realize the supervision of the decoupled features, so the water body and boundaries can be extracted more accurately. A comparative validation analysis is conducted on public datasets, including the Gaofen Image Dataset (GID) and the GaoFen2020 challenge dataset (GF2020). Compared with seven state-of-the-art methods, the proposed method achieves the best results, with IoUs of 91.22 and 78.93, especially in the localization of water bodies and boundaries. Applying the proposed method in different scenarios shows its stable capability for extracting water bodies with various shapes and areas.
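The body/boundary decoupling can be pictured with a simplified stand-in for the FDM: a learned body-attention map selects interior responses, and erasing it from the full map leaves boundary responses. The flow-based body prediction of the actual module is not reproduced here; this is only a sketch under that simplification.

```python
# Simplified sketch of decoupling body and boundary features by erasing.
import torch
import torch.nn as nn

class FeatureDecoupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # Soft map highlighting the water-body interior (assumed predictor).
        self.body_attn = nn.Sequential(nn.Conv2d(dim, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, x):                 # x: (B, C, H, W) water features
        a = self.body_attn(x)             # interior probability map
        body = x * a                      # main-body features
        boundary = x * (1 - a)            # erase the body to expose boundaries
        return body, boundary
```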

23 pages, 5405 KiB  
Article
CPH-Fmnet: An Optimized Deep Learning Model for Multi-View Stereo and Parameter Extraction in Complex Forest Scenes
by Lingnan Dai, Zhao Chen, Xiaoli Zhang, Dianchang Wang and Lishuo Huo
Forests 2024, 15(11), 1860; https://doi.org/10.3390/f15111860 - 23 Oct 2024
Abstract
The three-dimensional reconstruction of forests is crucial in remote sensing technology, ecological monitoring, and forestry management, as it yields precise forest structure and tree parameters, providing essential data support for forest resource management, evaluation, and sustainable development. Nevertheless, forest 3D reconstruction still encounters obstacles, including high equipment costs, low data collection efficiency, and complex data processing. This work introduces a novel deep learning model, CPH-Fmnet, designed to enhance the accuracy and efficiency of 3D reconstruction in intricate forest environments. CPH-Fmnet enhances the FPN encoder-decoder architecture by incorporating the Channel Attention Mechanism (CA), Path Aggregation Module (PA), and High-Level Feature Selection Module (HFS), alongside the integration of a pre-trained Vision Transformer (ViT), thereby significantly improving the model’s global feature extraction and local detail reconstruction abilities. We selected three representative sample plots in Haidian District, Beijing, China, as the study area and took forest stand sequence photos with an iPhone. Comparative experiments with the conventional SfM + MVS and MVSFormer models, along with comprehensive parameter extraction and ablation studies, substantiated the enhanced efficacy of the proposed CPH-Fmnet model under difficult circumstances such as intricate occlusions, poorly textured areas, and variations in lighting. The test results show that the model outperforms current methods on several evaluation criteria, achieving an RMSE of 1.353, an MAE of only 5.1%, an r value of 1.190, and a forest reconstruction rate of 100%. Furthermore, the model produced a more compact and precise 3D point cloud while accurately determining the properties of the forest trees. The findings indicate that CPH-Fmnet offers an innovative approach for forest resource management and ecological monitoring, characterized by low cost, high accuracy, and high efficiency.
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)
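The Channel Attention Mechanism folded into the encoder can be sketched as a standard squeeze-and-excitation block; the reduction ratio below is an assumption, and the paper's exact CA variant may differ.

```python
# Squeeze-and-excitation style channel attention sketch.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):         # reduction is assumed
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)             # squeeze: global context
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                               # x: (B, C, H, W)
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                    # excite: reweight channels
```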

19 pages, 15395 KiB  
Article
The Effect of a Parcel-Aggregated Cropping Structure Mapping Method in Irrigation-Water Estimation in Arid Regions—A Case Study of the Weigan River Basin in Xinjiang
by Haoyu Wang, Linze Bai, Chunxia Wei, Junli Li, Shuo Li, Chenghu Zhou, Philippe De Maeyer, Wenqi Kou, Chi Zhang, Zhanfeng Shen and Tim Van de Voorde
Remote Sens. 2024, 16(21), 3941; https://doi.org/10.3390/rs16213941 - 23 Oct 2024
Abstract
Effective management of agricultural water resources in arid regions relies on precise estimation of irrigation-water demand. Most previous studies have adopted pixel-level mapping methods to estimate irrigation-water demand, often leading to inaccuracies when applied in arid areas where land salinization is severe and where poorly growing crops cause the growing area to be smaller than the sown area. To address this issue and improve the accuracy of irrigation-water demand estimation, this study utilizes parcel-aggregated cropping structure mapping. We conducted a case study in the Weigan River Basin, Xinjiang, China. Deep learning techniques, the Richer Convolutional Features model, and the bilayer Long Short-Term Memory model were applied to extract parcel-aggregated cropping structures. By analyzing the cropping patterns, we estimated the irrigation-water demand and calculated the supply using statistical data and the water balance approach. The results indicated that in 2020, the cultivated area in the Weigan River Basin was 5.29 × 10⁵ hectares, distributed over 853,404 parcels with an average size of 6202 m². Based on the parcel-aggregated cropping structure, the estimated irrigation-water demand ranges from 25.1 × 10⁸ m³ to 30.0 × 10⁸ m³, representing a 5.57% increase compared to the pixel-level estimates. This increase highlights the effectiveness of the parcel-aggregated cropping structure in capturing the actual irrigation-water requirements, particularly in areas with severe soil salinization and patchy crop growth. The supply was calculated at 24.4 × 10⁸ m³ according to the water balance approach, resulting in a minimal water deficit of 0.64 × 10⁸ m³, underscoring the challenges in managing agricultural water resources in arid regions. Overall, the use of parcel-aggregated cropping structure mapping addresses the issue of irrigation-water demand underestimation associated with pixel-level mapping in arid regions. This study provides a methodological framework for efficient agricultural water resource management and sustainable development in arid regions.
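As a quick arithmetic check of the figures above, the snippet below recomputes the minimum deficit and the implied pixel-level demand from the abstract's numbers; the small gap to the reported 0.64 × 10⁸ m³ deficit presumably reflects rounding in the published demand range.

```python
# Back-of-the-envelope check of the reported water balance.
# All figures are taken from the abstract; units are 1e8 cubic metres.
demand_low, demand_high = 25.1, 30.0   # parcel-aggregated irrigation demand range
pixel_level_low = demand_low / 1.0557  # demand is reported as 5.57% above pixel-level
supply = 24.4                          # supply from the water balance approach

deficit = demand_low - supply          # deficit at the low end of the demand range
print(f"implied pixel-level demand (low end): {pixel_level_low:.1f} x 1e8 m^3")
print(f"minimum deficit: {deficit:.2f} x 1e8 m^3")  # ~0.7 vs. the reported 0.64
```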

16 pages, 1992 KiB  
Article
Enhanced Age-Dependent Motor Impairment in Males of Drosophila melanogaster Modeling Spinocerebellar Ataxia Type 1 Is Linked to Dysregulation of a Matrix Metalloproteinase
by Emma M. Palmer, Caleb A. Snoddy, Peyton M. York, Sydney M. Davis, Madelyn F. Hunter and Natraj Krishnan
Biology 2024, 13(11), 854; https://doi.org/10.3390/biology13110854 - 23 Oct 2024
Abstract
Over the past two decades, Drosophila melanogaster has proven to be successful in modeling the polyglutamine (polyQ) (caused by CAG repeats) family of neurodegenerative disorders, including the faithful recapitulation of pathological features such as polyQ length-dependent formation of protein aggregates and progressive neuronal degeneration. In this study, pan-neuronal expression of human Ataxin-1 with a long polyQ repeat of 82 amino acids was driven using an elav-GAL4 driver line, essentially modeling the polyQ disease spinocerebellar ataxia type 1 (SCA1). Longevity and behavioral analysis of male flies expressing human Ataxin-1 revealed a compromised lifespan and accelerated locomotor activity deficits, both in diurnal activity and in the negative geotaxis response, compared to control flies. Interestingly, this decline in motor response was coupled to an enhancement of matrix metalloproteinase 1 (dMMP1) expression together with declining expression of extracellular matrix (ECM) fibroblast growth factor (FGF) signaling by hedgehog (Hh) and branchless (bnl) and a significant decrease in expression of the survival motor neuron gene (dsmn) in old (30 d) flies. Taken together, our results indicate a role for dysregulation of a matrix metalloproteinase in polyQ disease, with consequent impacts on ECM signaling factors as well as on SMN at the neuromuscular junction, causing overt physiological and behavioral deficits.
(This article belongs to the Special Issue Animal Models for Disease Mechanisms)

15 pages, 1799 KiB  
Article
Heterogeneous Hierarchical Fusion Network for Multimodal Sentiment Analysis in Real-World Environments
by Ju Huang, Wenkang Chen, Fangyi Wang and Haijun Zhang
Electronics 2024, 13(20), 4137; https://doi.org/10.3390/electronics13204137 - 21 Oct 2024
Abstract
Multimodal sentiment analysis models can determine users’ sentiments by utilizing rich information from various sources (e.g., textual, visual, and audio). However, there are two key challenges when deploying such models in real-world environments: (1) limitations in the performance of automatic speech recognition (ASR) models can lead to errors in recognizing sentiment words, which may mislead the sentiment analysis of the textual modality, and (2) variations in information density across modalities complicate the development of a high-quality fusion framework. To address these challenges, this paper proposes a novel Multimodal Sentiment Word Optimization Module and a heterogeneous hierarchical fusion (MSWOHHF) framework. Specifically, the Multimodal Sentiment Word Optimization Module optimizes the sentiment words extracted from the textual modality by the ASR model, thereby reducing sentiment word recognition errors. In the multimodal fusion phase, a heterogeneous hierarchical fusion network architecture is introduced, which first utilizes a Transformer Aggregation Module to fuse the visual and audio modalities, enhancing the high-level semantic features of each modality. A Cross-Attention Fusion Module then integrates the textual modality with the audiovisual fusion. Next, a Feature-Based Attention Fusion Module is proposed that enables fusion by dynamically tuning the weights of both the combined and unimodal representations; sentiment polarity is then predicted using a nonlinear neural network. Finally, experimental results on the MOSI-SpeechBrain, MOSI-IBM, and MOSI-iFlytek datasets show that the MSWOHHF framework outperforms several baselines.
(This article belongs to the Special Issue New Advances in Affective Computing)
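The Cross-Attention Fusion Module can be sketched with standard multi-head cross-attention in which the textual sequence queries the audiovisual fusion; the dimensions, head count, and residual-plus-norm wrapper are assumptions rather than the paper's exact design.

```python
# Sketch of cross-attention fusion: text queries the audiovisual features.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim=256, heads=4):               # assumed sizes
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text, audiovisual):
        # text: (B, Lt, dim) queries; audiovisual: (B, Lav, dim) keys/values
        fused, _ = self.attn(text, audiovisual, audiovisual)
        return self.norm(text + fused)    # residual keeps the textual content
```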

23 pages, 4145 KiB  
Article
A Student Facial Expression Recognition Model Based on Multi-Scale and Deep Fine-Grained Feature Attention Enhancement
by Zhaoyu Shou, Yi Huang, Dongxu Li, Cheng Feng, Huibing Zhang, Yuming Lin and Guangxiang Wu
Sensors 2024, 24(20), 6748; https://doi.org/10.3390/s24206748 - 20 Oct 2024
Abstract
In smart classroom environments, accurately recognizing students’ facial expressions is crucial for teachers to efficiently assess students’ learning states, timely adjust teaching strategies, and enhance teaching quality and effectiveness. In this paper, we propose a student facial expression recognition model based on multi-scale and deep fine-grained feature attention enhancement (SFER-MDFAE) to address the issues of inaccurate facial feature extraction and poor robustness of facial expression recognition in smart classroom scenarios. Firstly, we construct a novel multi-scale dual-pooling feature aggregation module to capture and fuse facial information at different scales, thereby obtaining a comprehensive representation of key facial features; secondly, we design a key region-oriented attention mechanism to focus more on the nuances of facial expressions, further enhancing the representation of multi-scale deep fine-grained features; finally, the fusion of multi-scale and deep fine-grained attention-enhanced features is used to obtain richer and more accurate facial key information and realize accurate facial expression recognition. The experimental results demonstrate that the proposed SFER-MDFAE outperforms existing state-of-the-art methods, achieving an accuracy of 76.18% on FER2013, 92.75% on FERPlus, 92.93% on RAF-DB, 67.86% on AffectNet, and 93.74% on the real smart classroom facial expression dataset (SCFED). These results validate the effectiveness of the proposed method.
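The multi-scale dual-pooling idea can be sketched as average and max pooling at several scales, upsampled and fused by a 1×1 convolution; the scale set and fusion choice are assumptions, not the paper's module.

```python
# Sketch of multi-scale dual-pooling feature aggregation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualPoolingAggregation(nn.Module):
    def __init__(self, dim, scales=(1, 2, 4)):           # assumed scale set
        super().__init__()
        self.scales = scales
        self.fuse = nn.Conv2d(2 * dim * len(scales), dim, 1)

    def forward(self, x):                                # x: (B, C, H, W)
        h, w = x.shape[-2:]
        feats = []
        for s in self.scales:
            avg = F.adaptive_avg_pool2d(x, s)            # smooth context at scale s
            mx = F.adaptive_max_pool2d(x, s)             # salient responses at scale s
            feats += [F.interpolate(avg, (h, w)), F.interpolate(mx, (h, w))]
        return self.fuse(torch.cat(feats, dim=1))        # fused multi-scale features
```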

16 pages, 7311 KiB  
Article
Vehicle Localization Method in Complex SAR Images Based on Feature Reconstruction and Aggregation
by Jinwei Han, Lihong Kang, Jing Tian, Mingyong Jiang and Ningbo Guo
Sensors 2024, 24(20), 6746; https://doi.org/10.3390/s24206746 - 20 Oct 2024
Abstract
Due to the small size of vehicle targets, complex background environments, and the discrete scattering characteristics of high-resolution synthetic aperture radar (SAR) images, existing deep learning networks face challenges in extracting high-quality vehicle features from SAR images, which impacts vehicle localization accuracy. To address this issue, this paper proposes a vehicle localization method for SAR images based on feature reconstruction and aggregation with rotating boxes. Specifically, our method first employs a backbone network that integrates the space-channel reconfiguration module (SCRM), which contains spatial and channel attention mechanisms specifically designed for SAR images to extract features. The network then connects a progressive cross-fusion mechanism (PCFM) that effectively combines multi-view features from different feature layers, enhancing the information content of feature maps and improving feature representation quality. Finally, these features, which contain a large receptive field and enhanced, rich contextual information, are input into a rotating-box vehicle detection head, which effectively reduces false alarms and missed detections. Experiments on a complex-scene SAR image vehicle dataset demonstrate that the proposed method significantly improves vehicle localization accuracy and achieves state-of-the-art performance, demonstrating its superiority and effectiveness.
(This article belongs to the Special Issue Intelligent SAR Target Detection and Recognition)
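The progressive cross-fusion of multi-level features can be pictured as an FPN-style top-down pass in which deeper maps are upsampled and merged into shallower ones step by step; the exact fusion used by the PCFM is not specified here, so this is only a sketch under that assumption.

```python
# FPN-style sketch of progressive fusion across feature levels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProgressiveCrossFusion(nn.Module):
    def __init__(self, dims=(256, 512, 1024), out_dim=256):   # assumed dims
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(d, out_dim, 1) for d in dims)
        self.smooth = nn.ModuleList(nn.Conv2d(out_dim, out_dim, 3, padding=1)
                                    for _ in dims)

    def forward(self, feats):                 # feats: shallow -> deep maps
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        out = laterals[-1]
        fused = [out]
        for i in range(len(laterals) - 2, -1, -1):
            # Upsample the deeper map and merge it into the shallower one.
            out = laterals[i] + F.interpolate(out, size=laterals[i].shape[-2:])
            out = self.smooth[i](out)         # progressively refined map
            fused.insert(0, out)
        return fused                          # enriched multi-level features
```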

17 pages, 3337 KiB  
Article
MulCPred: Learning Multi-Modal Concepts for Explainable Pedestrian Action Prediction
by Yan Feng, Alexander Carballo, Keisuke Fujii, Robin Karlsson, Ming Ding and Kazuya Takeda
Sensors 2024, 24(20), 6742; https://doi.org/10.3390/s24206742 - 20 Oct 2024
Abstract
Pedestrian action prediction is crucial for many applications such as autonomous driving. However, state-of-the-art methods lack the explainability needed for trustworthy predictions. In this paper, a novel framework called MulCPred is proposed that explains its predictions based on multi-modal concepts represented by training samples. Previous concept-based methods have limitations, including the following: (1) they cannot be directly applied to multi-modal cases; (2) they lack the locality needed to attend to details in the inputs; (3) they are susceptible to mode collapse. These limitations are tackled through the following approaches: (1) a linear aggregator that integrates the activation results of the concepts into predictions, associating concepts of different modalities and providing ante hoc explanations of the relevance between the concepts and the predictions; (2) a channel-wise recalibration module that attends to local spatiotemporal regions, equipping the concepts with locality; (3) a feature regularization loss that encourages the concepts to learn diverse patterns. MulCPred is evaluated on multiple datasets and tasks. Both qualitative and quantitative results demonstrate that MulCPred is promising in improving the explainability of pedestrian action prediction without obvious performance degradation. Moreover, by removing unrecognizable concepts, MulCPred shows improved cross-dataset prediction performance, suggesting its potential for further generalization.
(This article belongs to the Section Sensing and Imaging)
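The linear aggregator is what makes the explanations ante hoc: concept activations map to logits through a single linear layer, so each weight directly exposes a concept's contribution. A minimal sketch with assumed sizes:

```python
# Sketch of a linear aggregator over multi-modal concept activations.
import torch
import torch.nn as nn

class LinearConceptAggregator(nn.Module):
    def __init__(self, n_concepts=32, n_classes=2):      # assumed sizes
        super().__init__()
        self.fc = nn.Linear(n_concepts, n_classes)

    def forward(self, activations):          # (B, n_concepts), all modalities
        return self.fc(activations)          # logits; weights act as explanations

    def explain(self, activations):
        # Per-concept contribution to each class: weight * activation,
        # shape (B, n_classes, n_concepts).
        return activations.unsqueeze(1) * self.fc.weight.unsqueeze(0)
```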

29 pages, 14558 KiB  
Article
Compressive Strength Prediction of Fly Ash-Based Concrete Using Single and Hybrid Machine Learning Models
by Haiyu Li, Heungjin Chung, Zhenting Li and Weiping Li
Buildings 2024, 14(10), 3299; https://doi.org/10.3390/buildings14103299 - 18 Oct 2024
Abstract
The compressive strength of concrete is a crucial parameter in structural design, yet its determination in a laboratory setting is both time-consuming and expensive. The prediction of compressive strength in fly ash-based concrete can be accelerated through the use of machine learning algorithms, which can effectively address these problems. This paper presents model algorithms established using artificial intelligence techniques: three single models—a fully connected neural network model (FCNN), a convolutional neural network model (CNN), and a transformer model (TF)—and three hybrid models—FCNN + CNN, TF + FCNN, and TF + CNN. A total of 471 datasets were employed in the experiments, comprising 7 input features: cement (C), fly ash (FA), water (W), superplasticizer (SP), coarse aggregate (CA), fine aggregate (S), and age (D). The six models were applied to predict the compressive strength (CS) of fly ash-based concrete, and the loss function curves, assessment indexes, linear correlation coefficient, and related literature indexes of each model were compared. This analysis revealed that the FCNN + CNN model exhibited the highest prediction accuracy, with the following metrics: R² = 0.95, MSE = 14.18, MAE = 2.32, SMAPE = 0.1, and R = 0.973. Additionally, SHAP was utilized to elucidate the significance of the model parameter features. The findings revealed that C and D exerted the most substantial influence on the model prediction outcomes, followed by W and FA, while CA, S, and SP demonstrated comparatively minimal influence. Finally, a GUI for predicting compressive strength was developed based on the six models and nonlinear functional relationships, and a criterion for minimum strength was derived by comparison and used to optimize a reasonable mixing ratio, achieving fast, concise, and reliable data-driven interaction.
(This article belongs to the Section Building Materials, and Repair & Renovation)
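A minimal sketch of the FCNN single model on the seven input features (C, FA, W, SP, CA, S, D), with an illustrative training step; the layer widths, optimizer settings, and placeholder data are assumptions, not the paper's configuration.

```python
# Sketch of an FCNN regressor for compressive strength from 7 mix features.
import torch
import torch.nn as nn

class FCNN(nn.Module):
    def __init__(self, n_features=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 1),                 # predicted compressive strength
        )

    def forward(self, x):                     # x: (B, 7) mix design + age
        return self.net(x)

# Illustrative training step with mean-squared-error loss (placeholder batch).
model = FCNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.rand(16, 7), torch.rand(16, 1)
opt.zero_grad()
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()
```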

21 pages, 1019 KiB  
Review
Amyotrophic Lateral Sclerosis: Insights and New Prospects in Disease Pathophysiology, Biomarkers and Therapies
by Jameel M. Al-Khayri, Mamtha Ravindran, Akshatha Banadka, Chendanda Devaiah Vandana, Kushalva Priya, Praveen Nagella and Kowshik Kukkemane
Pharmaceuticals 2024, 17(10), 1391; https://doi.org/10.3390/ph17101391 - 18 Oct 2024
Abstract
Amyotrophic Lateral Sclerosis (ALS) is a severe neurodegenerative disorder marked by the gradual loss of motor neurons, leading to significant disability and eventual death. Despite ongoing research, there are still limited treatment options, underscoring the need for a deeper understanding of the disease’s complex mechanisms and the identification of new therapeutic targets. This review provides a thorough examination of ALS, covering its epidemiology, pathology, and clinical features. It investigates the key molecular mechanisms, such as protein aggregation, neuroinflammation, oxidative stress, and excitotoxicity that contribute to motor neuron degeneration. The role of biomarkers is highlighted for their importance in early diagnosis and disease monitoring. Additionally, the review explores emerging therapeutic approaches, including inhibitors of protein aggregation, neuroinflammation modulators, antioxidant therapies, gene therapy, and stem cell-based treatments. The advantages and challenges of these strategies are discussed, with an emphasis on the potential for precision medicine to tailor treatments to individual patient needs. Overall, this review aims to provide a comprehensive overview of the current state of ALS research and suggest future directions for developing effective therapies.

13 pages, 2922 KiB  
Article
Analyzing Amylin Aggregation Inhibition Through Quantum Dot Fluorescence Imaging
by Xiaoyu Yin, Ziwei Liu, Gegentuya Huanood, Hayate Sawatari, Keiya Shimamori, Masahiro Kuragano and Kiyotaka Tokuraku
Int. J. Mol. Sci. 2024, 25(20), 11132; https://doi.org/10.3390/ijms252011132 - 17 Oct 2024
Abstract
Protein aggregation is associated with various diseases caused by protein misfolding. Among them, amylin deposition is a prominent feature of type 2 diabetes. At present, the mechanism of amylin aggregation remains unclear, and this has hindered the treatment of type 2 diabetes. In this study, we analyzed the aggregation process of amylin using the quantum dot (QD) imaging method. QD fluorescence imaging revealed that in the presence of 100 μM amylin, aggregates appeared after 12 h of incubation, while a large number of aggregates formed after 24 h of incubation, with a standard deviation (SD) value of 5.435. In contrast, 50 μM amylin did not induce the formation of aggregates after 12 h of incubation, although a large number of aggregates were observed after 24 h of incubation, with an SD value of 2.883. Confocal laser microscopy observations revealed that these aggregates were deposited in three dimensions. Transmission electron microscopy revealed that amylin existed as misfolded fibrils in vitro and that QDs were uniformly bound to the amylin fibrils. In addition, using a microliter-scale high-throughput screening (MSHTS) system, we found that rosmarinic acid, a polyphenol, inhibited amylin aggregation at a half-maximal effective concentration of 852.8 μM. These results demonstrate that the MSHTS system is a powerful tool for evaluating the inhibitory activity of amylin aggregation. Our findings will contribute to the understanding of the pathogenesis of amylin-related diseases and the discovery of compounds that may be useful in the treatment and prevention of these diseases.
(This article belongs to the Special Issue Quantum Dots for Biomedical Applications)

22 pages, 4866 KiB  
Article
TCEDN: A Lightweight Time-Context Enhanced Depression Detection Network
by Keshan Yan, Shengfa Miao, Xin Jin, Yongkang Mu, Hongfeng Zheng, Yuling Tian, Puming Wang, Qian Yu and Da Hu
Life 2024, 14(10), 1313; https://doi.org/10.3390/life14101313 - 16 Oct 2024
Abstract
The automatic video recognition of depression is becoming increasingly important in clinical applications. However, traditional depression recognition models still face challenges in practical applications, such as high computational costs, the poor application effectiveness of facial movement features, and spatial feature degradation due to model stitching. To overcome these challenges, this work proposes a lightweight Time-Context Enhanced Depression Detection Network (TCEDN). We first use attention-weighted blocks to aggregate and enhance video frame-level features, easing the model’s computational workload. Next, by integrating the temporal and spatial changes of raw video features and facial movement features with self-learned weights, we enhance the precision of depression detection. Finally, a fusion network of a 3-Dimensional Convolutional Neural Network (3D-CNN) and a Convolutional Long Short-Term Memory Network (ConvLSTM) is constructed to minimize spatial feature loss by avoiding feature flattening and to achieve depression score prediction. Tests on the AVEC2013 and AVEC2014 datasets reveal that our approach yields results on par with state-of-the-art techniques for detecting depression using video analysis, while having significantly lower computational complexity than mainstream methods.
(This article belongs to the Section Biochemistry, Biophysics and Computational Biology)
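The attention-weighted frame aggregation can be sketched as a learned relevance score per frame followed by a softmax-weighted sum, which is what shrinks the downstream workload; the feature dimension is an assumption.

```python
# Sketch of attention-weighted aggregation of frame-level features.
import torch
import torch.nn as nn

class FrameAttentionPool(nn.Module):
    def __init__(self, dim=512):                  # assumed feature dimension
        super().__init__()
        self.score = nn.Linear(dim, 1)            # one relevance logit per frame

    def forward(self, frames):                    # frames: (B, T, dim)
        w = self.score(frames).softmax(dim=1)     # (B, T, 1) attention weights
        return (w * frames).sum(dim=1)            # (B, dim) aggregated clip feature
```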
