Search Results (2,147)

Search Parameters:
Keywords = autoencoder

26 pages, 9324 KiB  
Article
From Envelope Spectra to Bearing Remaining Useful Life: An Intelligent Vibration-Based Prediction Model with Quantified Uncertainty
by Haobin Wen, Long Zhang and Jyoti K. Sinha
Sensors 2024, 24(22), 7257; https://doi.org/10.3390/s24227257 - 13 Nov 2024
Abstract
Bearings are pivotal components of rotating machines where any defects could propagate and trigger systematic failures. Once faults are detected, accurately predicting remaining useful life (RUL) is essential for optimizing predictive maintenance. Although data-driven methods demonstrate promising performance in direct RUL prediction, their robustness and practicability need further improvement regarding physical interpretation and uncertainty quantification. This work leverages variational neural networks to model bearing degradation behind envelope spectra. A convolutional variational autoencoder for regression (CVAER) is developed to probabilistically predict RUL distributions with confidence measures. Enhanced average envelope spectra (AES) are used as network input for its physical robustness in bearing condition assessment and fault detection. The use of the envelope spectrum ensures that it contains only bearing-related information by removing other rotor-related frequencies, hence it improves the RUL prediction. Unlike traditional variational autoencoders, the probabilistic regressor and latent generator are formulated to quantify uncertainty in RUL estimates and learn meaningful latent representations conditioned on specific RUL. Experimental validations are conducted on vibration data collected using multiple accelerometers whose natural frequencies cover bearing resonance ranges to ensure fault detection reliability. Beyond conventional bearing diagnosis, envelope spectra are extended for statistical RUL prediction integrating physical knowledge of actual defect conditions. Comparative and ablation studies are conducted against benchmark models to demonstrate their effectiveness. Full article
(This article belongs to the Special Issue Fault Diagnosis and Vibration Signal Processing in Rotor Systems)
11 pages, 1777 KiB  
Article
Pre-Impact Fall Detection for E-Scooter Riding Using an IMU: Threshold-Based, Supervised, and Unsupervised Approaches
by Seunghee Lee, Bummo Koo and Youngho Kim
Appl. Sci. 2024, 14(22), 10443; https://doi.org/10.3390/app142210443 - 13 Nov 2024
Abstract
Pre-impact fall detection during e-scooter riding is essential for rider safety. Both threshold-based and deep learning algorithms (supervised and unsupervised models) were developed in this study. Twenty participants performed normal driving maneuvers such as straight driving, speed bumps, clockwise roundabouts, and counterclockwise roundabouts, along with falls (abnormal driving maneuvers). A 6-axis IMU sensor (Xsens DOT, The Netherlands) was positioned at the T7 location to record data at 60 Hz. The approaches included threshold-based, supervised learning, and unsupervised learning models. The threshold-based approach yielded an accuracy of 98.86% with an F1 score of 0.99, while the supervised model had a slightly lower performance, reaching 86.29% accuracy and an F1 score of 0.56. The unsupervised knowledge distillation model achieved 98.86% accuracy, an F1 score of 0.99, and a memory size of only 46 kB. All models demonstrated lead times of more than 250 ms, sufficient for airbag deployment. Full article
(This article belongs to the Special Issue Traffic Emergency: Forecasting, Control and Planning)
8 pages, 1126 KiB  
Brief Report
Preliminary Study of Airfoil Design Synthesis Using a Conditional Diffusion Model and Smoothing Method
by Kazuo Yonekura, Yuta Oshima and Masaatsu Aichi
Computation 2024, 12(11), 227; https://doi.org/10.3390/computation12110227 - 13 Nov 2024
Viewed by 133
Abstract
Generative models such as generative adversarial networks and variational autoencoders are widely used for design synthesis. A diffusion model is another generative model that outperforms GANs and VAEs in image processing. It has also been applied in design synthesis, but was limited to only shape generation. It is important in design synthesis to generate shapes that satisfy the required performance. For such aims, a conditional diffusion model has to be used, but has not been studied. In this study, we applied a conditional diffusion model to the design synthesis and showed that the output of this diffusion model contains noisy data caused by Gaussian noise. We show that we can conduct flow analysis on the generated data by using smoothing filters. Full article
(This article belongs to the Section Computational Engineering)
19 pages, 5749 KiB  
Article
Video Anomaly Detection Based on Global–Local Convolutional Autoencoder
by Fusheng Sun, Jiahao Zhang, Xiaodong Wu, Zhong Zheng and Xiaowen Yang
Electronics 2024, 13(22), 4415; https://doi.org/10.3390/electronics13224415 - 11 Nov 2024
Viewed by 394
Abstract
Video anomaly detection (VAD) plays a crucial role in fields such as security, production, and transportation. To address the issue of overgeneralization in anomaly behavior prediction by deep neural networks, we propose a network called AMFCFBMem-Net (appearance and motion feature cross-fusion block memory network), which combines appearance and motion feature cross-fusion blocks. Firstly, dual encoders for appearance and motion are employed to separately extract these features, which are then integrated into the skip connection layer to mitigate the model’s tendency to predict abnormal behavior, ultimately enhancing the prediction accuracy for abnormal samples. Secondly, a motion foreground extraction module is integrated into the network to generate a foreground mask map based on speed differences, thereby widening the prediction error margin between normal and abnormal behaviors. To capture the latent features of various models for normal samples, a memory module is introduced at the bottleneck of the encoder and decoder structures. This further enhances the model’s anomaly detection capabilities and diminishes its predictive generalization towards abnormal samples. The experimental results on the UCSD Pedestrian dataset 2 (UCSD Ped2) and CUHK Avenue anomaly detection dataset (CUHK Avenue) demonstrate that, compared to current cutting-edge video anomaly detection algorithms, our proposed method achieves frame-level AUCs of 97.5% and 88.8%, respectively, effectively enhancing anomaly detection capabilities. Full article
25 pages, 6322 KiB  
Article
A Convolution Auto-Encoders Network for Aero-Engine Hot Jet FT-IR Spectrum Feature Extraction and Classification
by Shuhan Du, Wei Han, Zhenping Kang, Yurong Liao and Zhaoming Li
Aerospace 2024, 11(11), 933; https://doi.org/10.3390/aerospace11110933 - 11 Nov 2024
Viewed by 151
Abstract
Aiming at classification and recognition of aero-engines, two telemetry Fourier transform infrared (FT-IR) spectrometers are utilized to measure the infrared spectrum of the aero-engine hot jet, and a spectrum dataset of six types of aero-engines is established. In this paper, a convolutional autoencoder (CAE) is designed for spectral feature extraction and classification, which is composed of an encoder network, a decoder network, and a classification network. The encoder network consists of convolutional layers and maximum pooling layers, the decoder network consists of up-sampling layers and deconvolution layers, and the classification network consists of a flattened layer and a dense layer. In the experiment, data for the spectral dataset were randomly sampled at a ratio of 8:1:1 to produce the training set, validation set, and prediction set, and the performance measures were accuracy, precision, recall, confusion matrix, F1 score, ROC curve, and AUC value. The experimental result of CAE reached 96% accuracy and the prediction running time was 1.57 s. Compared with the classical PCA feature extraction and SVM, XGBoost, AdaBoost, and Random Forest classifier algorithms, as well as AE, CSAE, and CVAE deep learning classification methods, the CAE network can achieve higher accuracy and efficiency and can complete the spectral classification task. Full article
(This article belongs to the Section Aeronautics)
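The 8:1:1 random split mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_811`, the seed, and the rounding rule (any remainder goes to the prediction set) are assumptions made for illustration.

```python
import random

def split_811(items, seed=0):
    """Shuffle a copy of items and split it 8:1:1 (hypothetical helper).

    Returns (train, validation, prediction). Integer arithmetic is used for
    the cut points, so any remainder lands in the prediction set.
    """
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(items)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 8 // 10
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    pred = shuffled[n_train + n_val:]
    return train, val, pred

train, val, pred = split_811(range(100))
print(len(train), len(val), len(pred))
```

With 100 samples the three sets have 80, 10, and 10 members, and every sample appears in exactly one set.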
19 pages, 5871 KiB  
Article
High-Resolution Land Use Land Cover Dataset for Meteorological Modelling—Part 2: ECOCLIMAP-SG-ML an Ensemble Land Cover Map
by Thomas Rieutord, Geoffrey Bessardon and Emily Gleeson
Land 2024, 13(11), 1875; https://doi.org/10.3390/land13111875 - 9 Nov 2024
Viewed by 289
Abstract
While the surface of the Earth plays a key role in weather forecasting through its interaction with the atmosphere, in ensemble numerical weather predictions the uncertainty on the surface is only represented with perturbations in the parameterisations representing the surface processes. Data representing the surface, such as the land cover, are not perturbed. As fully data-driven forecasts without parameterisations are growing in importance, sampling the uncertainty on the land cover data brings a new way of making ensemble forecasts. Our work describes a method of generating ensemble land cover maps for numerical weather prediction. The target land cover map has the ECOCLIMAP-SG labels used in the SURFEX surface model and therefore is expected to have all relevant labels for surface-atmosphere interactions. The method translates the ESA WorldCover map to ECOCLIMAP-SG labels and resolution using auto-encoders. The land cover ensemble members are obtained by sampling the land cover probabilities in the output of the neural network. This paper builds upon the work done in a companion paper describing the high-resolution version of ECOCLIMAP-SG, called ECOCLIMAP-SG+, used for the training and evaluation of the neural network. The output map presented here, called ECOCLIMAP-SG-ML, improves upon the ECOCLIMAP-SG map in terms of resolution (from 300 m to 60 m), overall accuracy (from 0.41 to 0.63), and the ability to produce ensemble members. Full article
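The abstract above says ensemble members are obtained by sampling the land cover probabilities output by the network. A minimal sketch of that idea follows, assuming the output is a grid in which each pixel holds one probability per label. The label names, grid layout, and the function `sample_ensemble` are hypothetical and are not taken from the paper.

```python
import random

def sample_ensemble(prob_map, labels, n_members, seed=0):
    """Draw n_members label maps from per-pixel class probabilities.

    prob_map is a list of rows; each pixel is a list of weights aligned
    with labels. Each member is sampled independently, pixel by pixel.
    """
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        member = [
            [rng.choices(labels, weights=pixel_probs, k=1)[0] for pixel_probs in row]
            for row in prob_map
        ]
        members.append(member)
    return members

# Hypothetical labels and a 1 x 2 probability grid, for illustration only.
labels = ["forest", "crops", "urban"]
prob_map = [[[1.0, 0.0, 0.0], [0.2, 0.5, 0.3]]]
for member in sample_ensemble(prob_map, labels, n_members=3):
    print(member)
```

A pixel whose probability mass sits entirely on one label yields that label in every member, while uncertain pixels vary between members, which is what gives the ensemble its spread.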
20 pages, 11655 KiB  
Article
Variational Color Shift and Auto-Encoder Based on Large Separable Kernel Attention for Enhanced Text CAPTCHA Vulnerability Assessment
by Xing Wan, Juliana Johari and Fazlina Ahmat Ruslan
Information 2024, 15(11), 717; https://doi.org/10.3390/info15110717 - 7 Nov 2024
Viewed by 350
Abstract
Text CAPTCHAs are crucial security measures deployed on global websites to deter unauthorized intrusions. The presence of anti-attack features incorporated into text CAPTCHAs limits the effectiveness of evaluating them, despite CAPTCHA recognition being an effective method for assessing their security. This study introduces a novel color augmentation technique called Variational Color Shift (VCS) to boost the recognition accuracy of different networks. VCS generates a color shift of every input image and then resamples the image within that range to generate a new image, thus expanding the number of samples of the original dataset to improve training effectiveness. In contrast to Random Color Shift (RCS), which treats the color offsets as hyperparameters, VCS estimates color shifts by reparametrizing the points sampled from the uniform distribution using predicted offsets according to every image, which makes the color shifts learnable. To better balance the computation and performance, we also propose two variants of VCS: Sim-VCS and Dilated-VCS. In addition, to solve the overfitting problem caused by disturbances in text CAPTCHAs, we propose an Auto-Encoder (AE) based on Large Separable Kernel Attention (AE-LSKA) to replace the convolutional module with large kernels in the text CAPTCHA recognizer. This new module employs an AE to compress the interference while expanding the receptive field using Large Separable Kernel Attention (LSKA), reducing the impact of local interference on the model training and improving the overall perception of characters. The experimental results show that the recognition accuracy of the model after integrating the AE-LSKA module is improved by at least 15 percentage points on both M-CAPTCHA and P-CAPTCHA datasets. In addition, experimental results demonstrate that color augmentation using VCS is more effective in enhancing recognition, which has higher accuracy compared to RCS and PCA Color Shift (PCA-CS). Full article
(This article belongs to the Special Issue Computer Vision for Security Applications)
15 pages, 487 KiB  
Article
Deep Learning-Based Freight Recommendation System for Freight Brokerage Platform
by Yeon-Soo Kim and Tai-Woo Chang
Systems 2024, 12(11), 477; https://doi.org/10.3390/systems12110477 - 7 Nov 2024
Viewed by 470
Abstract
Platform-based businesses in the logistics market are evolving under the influence of digital transformation, transforming the freight market into an environment where various types of freight can be traded across multiple markets and locations. Freight brokerage platforms have revolutionized the trading relationship between freight owners and vehicle owners. However, this type of system has also introduced inefficiencies, such as unestablished contracts, leading to unnecessary costs and delays. To address this issue, a freight recommendation system can assist users in finding what they are looking for while aiming to reduce failed contracts. With current advances in deep learning, complex patterns based on users’ past behaviors and preferences can be learned, enabling more accurate and personalized recommendations. This study proposes a deep learning-based freight recommendation system to provide personalized services and reduce failed contracts on freight brokerage platforms. The system is built by creating a freight transaction dataset, classifying freight categories through natural language processing and text mining techniques, and incorporating externally derived data on transportation distances. The deep learning model is trained using Autoencoder, Word2Vec, and Graph Neural Networks (GNN), with recommendation logic implemented to suggest suitable freight matches for vehicle owners. This system is expected to increase the market efficiency of the freight logistics industry and is a key step toward improving the long-term profit structure. Full article
19 pages, 3989 KiB  
Article
Population Distribution Forecasting Based on the Fusion of Spatiotemporal Basic and External Features: A Case Study of Lujiazui Financial District
by Xianzhou Cheng, Xiaoming Wang and Renhe Jiang
ISPRS Int. J. Geo-Inf. 2024, 13(11), 395; https://doi.org/10.3390/ijgi13110395 - 6 Nov 2024
Viewed by 382
Abstract
Predicting the distribution of people in the time window approaching a disaster is crucial for post-disaster assistance activities and can be useful for evacuation route selection and shelter planning. However, two major limitations have not yet been addressed: (1) Most spatiotemporal prediction models incorporate spatiotemporal features either directly or indirectly, which results in high information redundancy in the parameters of the prediction model and low computational efficiency. (2) These models usually incorporate certain basic and external features, and they can neither change spatiotemporal addressed features according to spatiotemporal features nor change them in real-time according to spatiotemporal features. The spatiotemporal feature embedding methods for these models are inflexible and difficult to interpret. To overcome these problems, a lightweight population density distribution prediction framework that considers both basic and external spatiotemporal features is proposed. In the study, an autoencoder is used to extract spatiotemporal coded information to form a spatiotemporal attention mechanism, and basic and external spatiotemporal feature attention is fused by a fusion framework with learnable weights. The fused spatiotemporal attention is combined with ResNet as the prediction backbone network to predict the population distribution. Comparison and ablation experimental results show that, compared with classical and popular spatiotemporal prediction frameworks, the computational efficiency and interpretability of the prediction framework are improved by unleashing the scalability of the spatiotemporal features of the model while enhancing the interpretability of the spatiotemporal information. This study has a multiplier effect and provides a reference solution for predicting population distributions in similar regions around the globe. Full article
23 pages, 2249 KiB  
Article
Improved EMAT Sensor Design for Enhanced Ultrasonic Signal Detection in Steel Wire Ropes
by Immanuel Rossteutscher, Oliver Blaschke, Florian Dötzer, Thorsten Uphues and Klaus Stefan Drese
Sensors 2024, 24(22), 7114; https://doi.org/10.3390/s24227114 - 5 Nov 2024
Viewed by 499
Abstract
This study is focused on optimizing electromagnetic acoustic transducer (EMAT) sensors for enhanced ultrasonic guided wave signal generation in steel cables using CAD and modern manufacturing to enable contactless ultrasonic signal transmission and reception. A lab test rig with advanced measurement and data processing was set up to test the sensors’ ability to detect cable damage, like wire breaks and abrasion, while also examining the effect of potential disruptors such as rope soiling. Machine learning algorithms were applied to improve the damage detection accuracy, leading to significant advancements in magnetostrictive measurement methods and providing a new standard for future development in this area. The use of the Vision Transformer Masked Autoencoder Architecture (ViTMAE) and generative pre-training has shown that reliable damage detection is possible despite the considerable signal fluctuations caused by rope movement. Full article
(This article belongs to the Special Issue Feature Papers in Physical Sensors 2024)
17 pages, 1745 KiB  
Article
Joint Learning of Volume Scheduling and Order Placement Policies for Optimal Order Execution
by Siyuan Li, Hui Niu, Jiani Lu and Peng Liu
Mathematics 2024, 12(21), 3440; https://doi.org/10.3390/math12213440 - 4 Nov 2024
Viewed by 405
Abstract
Order execution is an extremely important problem in the financial domain, and recently, more and more researchers have tried to employ reinforcement learning (RL) techniques to solve this challenging problem. There are a lot of difficulties for conventional RL methods to tackle the order execution problem, such as the large action space including price and quantity, and the long-horizon property. As naturally order execution is composed of a low-frequency volume scheduling stage and a high-frequency order placement stage, most existing RL-based order execution methods treat these stages as two distinct tasks and offer a partial solution by addressing either one individually. However, the current literature fails to model the non-negligible mutual influence between these two tasks, leading to impractical order execution solutions. To address these limitations, we propose a novel automatic order execution approach based on the hierarchical RL framework (OEHRL), which jointly learns the policies for volume scheduling and order placement. OEHRL first extracts the state embeddings at both the macro and micro levels with a sequential variational auto-encoder model. Based on the effective embeddings, OEHRL generates a hindsight expert dataset, which is used to train a hierarchical order execution policy. In the hierarchical structure, the high-level policy is in charge of the target volume and the low-level learns to determine the prices for a series of the allocated sub-orders from the high level. These two levels collaborate seamlessly and contribute to the optimal order execution policy. Extensive experiment results on 200 stocks across the US and China A-share markets validate the effectiveness of the proposed approach. Full article
(This article belongs to the Special Issue Machine Learning and Finance)
22 pages, 2103 KiB  
Article
Nonlinear Dynamic Process Monitoring Based on Discriminative Denoising Autoencoder and Canonical Variate Analysis
by Jun Liang, Daoguang Liu, Yinxiao Zhan and Jiayu Fan
Actuators 2024, 13(11), 440; https://doi.org/10.3390/act13110440 - 2 Nov 2024
Viewed by 357
Abstract
Modern industrial processes are characterized by increasing complexity, often exhibiting pronounced dynamic behaviors and significant nonlinearity. Addressing these dynamic and nonlinear characteristics is essential for effective process monitoring. However, many existing methods for process monitoring and fault diagnosis are insufficient in handling these challenges. In this article, we present a novel process monitoring approach, CVA-DisDAE, which integrates an improved Denoising Autoencoder (DAE) with Canonical Variate Analysis (CVA) to address the challenges posed by dynamic behaviors and nonlinear relationships in industrial processes. First, CVA is employed to reduce data dimensionality and minimize information redundancy by maximizing correlations between past and future observations, thereby effectively capturing process dynamics. Following this, we introduce a discriminative DAE model (DisDAE) designed to serve as a semi-supervised denoising autoencoder for precise feature extraction. This is achieved by incorporating both between-class separability and within-class variability into the traditional DAE framework. The key distinction between the proposed DisDAE and the conventional DAE lies in the integration of a linear discriminant analysis (LDA) penalty into the DAE’s loss function, resulting in extracted features that are more conducive to fault classification. Finally, we validate the effectiveness of the proposed semi-supervised dynamic process monitoring approach through its application to the Tennessee Eastman benchmark process, demonstrating its superior performance. Full article
(This article belongs to the Section Control Systems)
21 pages, 3213 KiB  
Article
An Autoencoder-Based Task-Oriented Semantic Communication System for M2M Communication
by Prabhath Samarathunga, Hossein Rezaei, Maheshi Lokumarambage, Thushan Sivalingam, Nandana Rajatheva and Anil Fernando
Algorithms 2024, 17(11), 492; https://doi.org/10.3390/a17110492 - 2 Nov 2024
Viewed by 394
Abstract
Semantic communication (SC) is a communication paradigm that has gained significant attention, as it offers a potential solution to move beyond Shannon’s formulation in bandwidth-limited communication channels by delivering the semantic meaning of the message rather than its exact form. In this paper, we propose an autoencoder-based SC system for transmitting images between two machines over error-prone channels to support emerging applications such as VIoT, XR, M2M, and M2H communications. The proposed autoencoder architecture, with a semantically modeled encoder and decoder, transmits image data as a reduced-dimension vector (latent vector) through an error-prone channel. The decoder then reconstructs the image to determine its M2M implications. The autoencoder is trained for different noise levels under various channel conditions, and both image quality and classification accuracy are used to evaluate the system’s efficacy. A CNN image classifier measures accuracy, as no image quality metric is available for SC yet. The simulation results show that all proposed autoencoders maintain high image quality and classification accuracy at high SNRs, while the autoencoder trained with zero noise underperforms other trained autoencoders at moderate SNRs. The results further indicate that all other proposed autoencoders trained under different noise levels are highly robust against channel impairments. We compare the proposed system against a comparable JPEG transmission system, and results reveal that the proposed system outperforms the JPEG system in compression efficiency by up to 50% and in received image quality with an image coding gain of up to 17 dB. Full article
(This article belongs to the Special Issue Machine Learning Algorithms for Image Understanding and Analysis)
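The abstract above describes sending a latent vector through an error-prone channel at different noise levels and SNRs. One common way to model such a channel is additive white Gaussian noise (AWGN) scaled to a target SNR. The sketch below assumes that model; the abstract does not state which channel model the authors used, and the function name `add_awgn` is hypothetical.

```python
import math
import random

def add_awgn(latent, snr_db, seed=0):
    """Return the latent vector with Gaussian noise added at the given SNR (dB).

    Noise power is set to signal_power / 10**(snr_db / 10), where signal
    power is the mean squared value of the latent vector. AWGN is an
    assumption made for illustration, not the paper's stated channel.
    """
    rng = random.Random(seed)
    signal_power = sum(x * x for x in latent) / len(latent)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in latent]

latent = [0.5, -1.2, 0.8, 0.1]            # made-up latent vector
for snr_db in (0, 10, 30):
    noisy = add_awgn(latent, snr_db)
    err = sum((a - b) ** 2 for a, b in zip(latent, noisy)) / len(latent)
    print(snr_db, round(err, 4))
```

In an autoencoder-based system of this kind, such a function would sit between the encoder output and the decoder input during training and evaluation, so the decoder learns to reconstruct from corrupted latent vectors.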
21 pages, 12658 KiB  
Article
A Dual-Module System for Copyright-Free Image Recommendation and Infringement Detection in Educational Materials
by Yeongha Kim, Soyeon Kim, Seonghyun Min, Youngung Han, Ohyoung Lee and Wongyum Kim
J. Imaging 2024, 10(11), 277; https://doi.org/10.3390/jimaging10110277 - 1 Nov 2024
Viewed by 566
Abstract
Images are extensively utilized in educational materials due to their efficacy in conveying complex concepts. However, unauthorized use of images frequently results in legal issues related to copyright infringement. To mitigate this problem, we introduce a dual-module system specifically designed for educators. The first module, a copyright infringement detection system, employs deep learning techniques to verify the copyright status of images. It utilizes a Convolutional Variational Autoencoder (CVAE) model to extract significant features from copyrighted images and compares them against user-provided images. If infringement is detected, the second module, an image retrieval system, recommends alternative copyright-free images using a Vision Transformer (ViT)-based hashing model. Evaluation on benchmark datasets demonstrates the system’s effectiveness, achieving a mean Average Precision (mAP) of 0.812 on the Flickr25k dataset. Additionally, a user study involving 65 teachers indicates high satisfaction levels, particularly in addressing copyright concerns and ease of use. Our system significantly aids educators in creating educational materials that comply with copyright regulations. Full article
(This article belongs to the Section Image and Video Processing)
24 pages, 5255 KiB  
Article
Deep Ensemble Remote Sensing Scene Classification via Category Distribution Association
by Zhenxin He, Guoxu Li, Zheng Wang, Guanxiong He, Hao Yan and Rong Wang
Remote Sens. 2024, 16(21), 4084; https://doi.org/10.3390/rs16214084 - 1 Nov 2024
Viewed by 630
Abstract
Recently, deep learning models have been successfully and widely applied in the field of remote sensing scene classification. However, existing deep models largely overlook the distinct learning difficulties associated with discriminating different pairs of scenes. Consequently, leveraging the relationships within category distributions and employing ensemble learning algorithms hold considerable potential in addressing these issues. In this paper, we propose a category-distribution-associated deep ensemble learning model that pays more attention to instances that are difficult to identify between similar scenes. The core idea is to utilize the degree of difficulty between categories to guide model learning, which is primarily divided into two modules: category distribution information extraction and scene classification. This method employs an autoencoder to capture distinct scene distributions within the samples and constructs a similarity matrix based on the discrepancies between distributions. Subsequently, the scene classification module adopts a stacking ensemble framework, where the base layer utilizes various neural networks to capture sample representations from shallow to deep levels. The meta layer incorporates a novel multiclass boosting algorithm that integrates sample distribution and representations of information to discriminate scenes. Exhaustive empirical evaluations on remote sensing scene benchmarks demonstrate the effectiveness and superiority of our proposed method over the state-of-the-art approaches. Full article