Figure 13 - uploaded by Syed Shakib Sarwar
Performance comparison of incremental learning approaches.

Source publication
Article
Deep convolutional neural network (DCNN) based supervised learning is a widely practiced approach for large-scale image classification. However, retraining these large networks to accommodate new, previously unseen data demands high computational time and energy. Also, previously seen training samples may not be available at the time o...

Similar publications

Conference Paper
This paper addresses the problem of using unlabeled data in transfer learning. Specifically, we focus on transfer learning for a new unlabeled dataset using partially labeled training datasets that consist of a small number of labeled data points and a large number of unlabeled data points. To enable transfer learning, we assume that the training a...
Preprint
In this manuscript, we automate the grading of diabetic retinopathy and macular edema from fundus images using an ensemble of convolutional neural networks. The limited availability of labeled data for supervised learning was circumvented by using a transfer learning approach. The models in the ensemble were pre-trained...
Conference Paper
This paper proposes a method of human activity monitoring based on sparse acceleration data and GPS positioning collected during daily smartphone use. The application addresses, in particular, the elderly population with regular activity patterns associated with daily routines. The approach is based on the clustering of a...
Article
Recently, deep learning with convolutional neural networks has been used for image classification and figure recognition. In our research, we used Computed Tomography (CT) scans to train a double convolutional Deep Neural Network (CDNN) and a regular CDNN. These topologies were tested against lung cancer images to determine the Tx cancer stage in which t...
Preprint
Transfer learning through fine-tuning a pre-trained neural network with an extremely large dataset, such as ImageNet, can significantly accelerate training while the accuracy is frequently bottlenecked by the limited dataset size of the new target task. To solve the problem, some regularization methods, constraining the outer layer weights of the t...

Citations

... When constructing offline DT models using data-driven methods such as surrogate models or machine learning models, online update methods serve two purposes: updating model parameters through optimization, Bayesian estimation or incremental computation (Huang et al., 2005), and adaptively modifying model structures using Bayesian techniques (Yu et al., 2021) or heuristic strategies (Han et al., 2022). Areas relevant to these online update mechanisms encompass online learning (Hoi et al., 2021), incremental learning (Sarwar et al., 2020), lifelong learning (Parisi et al., 2019), and dynamic neural networks (Han et al., 2022). Nevertheless, these methods encounter the issue of catastrophic forgetting, where the knowledge acquired from offline data diminishes gradually during online updates and eventually disappears. ...
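The incremental-computation route to online parameter updating can be illustrated with a minimal, self-contained sketch (an illustrative example, not the cited method): Welford's online algorithm refreshes a running mean and variance one observation at a time, so the estimate is updated without revisiting stored data.

```python
# Illustrative sketch of incremental computation: a running mean/variance is
# refreshed one sample at a time (Welford's algorithm), so the "model" is
# updated online without storing or revisiting past observations.

class RunningStats:
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        # Incorporate one new observation in O(1) time and memory.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / self.n if self.n > 1 else 0.0

stats = RunningStats()
for x in [2.0, 4.0, 6.0]:
    stats.update(x)
print(stats.mean)  # 4.0
```

The same update-in-place pattern generalizes to recursive least squares and Bayesian filtering, where each new sample refines the parameter estimate directly.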
Preprint
A digital twin (DT) is a model that mirrors a physical system and is continuously updated with real-time data from the physical system. Recent implementations of reduced-order-model-based DT (DT-ROM) have been applied in aerodynamics and structural health monitoring, where partial differential equations (PDEs) are utilized to update reduced bases and coefficients. However, these methods are not directly applicable when the PDEs of the system are unknown. This paper addresses the online update challenge for DT-ROM in scenarios lacking known PDEs of the system. To tackle the challenge, a systematic online update and application method is proposed. During the online update, the projection residual of online data on the reduced bases determines the necessity of updating reduced bases, while the prediction residual of online data obtained by the current DT-ROM is used to decide whether to update the coefficient model. By sequentially evaluating both criteria, the method selectively incorporates essential online data for the online DT model update. During the online application, a criterion defined based on online data is adopted to determine whether the offline DT-ROM or the online one is applied to output final predictions. The capability of the proposed method is tested through three numerical and three engineering problems. Results indicate that the proposed online update method consistently reduces both projection and prediction residuals, thereby progressively enhancing the performance of the online DT-ROM on test data. Meanwhile, the online application method provides a prediction performance better than using offline DT-ROM only. Both demonstrate that the proposed work could be applied to online DT update where the PDEs of the system are unknown.
... The online learning problem encompasses the capability of deep learning architectures to continuously improve the learned model by integrating new data while retaining previously acquired knowledge. Various methods [16][17][18][19] employ network architectures that expand during the training process. Another approach involves freezing or slowing down the learning process in specific parts of the network [20,21]. ...
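The freezing strategy mentioned in the snippet above can be reduced to a small, hypothetical sketch (parameter names such as `backbone.w` are invented for illustration): a gradient step is applied only to parameters outside a frozen set, so knowledge held in the frozen part is untouched while training on new data.

```python
# Hypothetical sketch of freezing part of a network during continual training:
# gradients are applied only to parameters not in the frozen set, so the
# knowledge stored in frozen parameters is preserved.

def sgd_step(params, grads, frozen, lr=0.1):
    """Apply one gradient step, skipping parameters in `frozen`."""
    return {
        name: (value if name in frozen else value - lr * grads[name])
        for name, value in params.items()
    }

params = {"backbone.w": 1.0, "head.w": 0.5}   # invented names
grads = {"backbone.w": 2.0, "head.w": 2.0}

# Freeze the shared backbone; only the task-specific head is updated.
params = sgd_step(params, grads, frozen={"backbone.w"})
print(params)  # {'backbone.w': 1.0, 'head.w': 0.3}
```

In deep learning frameworks the same effect is usually achieved by disabling gradient computation for the frozen layers before training on the new task.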
Article
Insulated gate bipolar transistor (IGBT) is a power semiconductor module. Voids may arise in its solder process when a contaminant or gas is absorbed into the solder joint. They heavily influence the heat-exchange efficiency of the IGBT, so void inspection is very important. The segmentation of the solder region is a crucial step for automated defect detection of IGBT based on an x-ray computed laminography (CL) system. In recent years, deep learning has made remarkable progress in semantic segmentation and has been used for the segmentation of the solder joint between the direct bonded copper (DBC) substrate and the baseplate, which has been proved to be accurate and efficient. However, deep learning architectures exhibit a critical drop in performance due to catastrophic forgetting when new IGBT samples are encountered. Hence, this paper proposes to use online learning techniques to continuously improve the learned model by feeding it new IGBT samples without losing previously learned knowledge.
... 2) Exemplar-free methods: Exemplar-free methods do not require old exemplar samples and can prevent catastrophic forgetting. Some techniques constrain the training of particular network modules to preserve old knowledge [20], while others expand or modify the network architecture when adding new classes [21], [22]. Another effective strategy to retain old knowledge is Knowledge Distillation (KD) [23], [24]. ...
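Knowledge Distillation, named in the snippet above as a strategy for retaining old knowledge, can be sketched minimally (an illustrative toy, not any cited implementation): the student is penalized for diverging from the teacher's temperature-softened output distribution, so old-class behavior is preserved without storing old exemplars.

```python
import math

# Toy sketch of Knowledge Distillation (KD): the student is trained to match
# the teacher's temperature-softened output distribution.

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions."""
    p = softmax(teacher_logits, temperature)  # soft targets from the old model
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

teacher = [3.0, 1.0, 0.2]
aligned = [3.0, 1.0, 0.2]   # student still matches the old model
drifted = [0.2, 1.0, 3.0]   # student has drifted (forgetting)

# The loss is lower when the student agrees with the teacher.
print(distillation_loss(teacher, aligned) < distillation_loss(teacher, drifted))  # True
```

In practice this distillation term is added to the ordinary classification loss on the new classes, with the temperature controlling how much probability mass the soft targets spread over non-maximal classes.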
Article
Deep Neural Network (DNN) based semantic segmentation of robotic instruments and tissues can enhance the precision of surgical activities in robot-assisted surgery. However, unlike biological learning, DNNs cannot learn incremental tasks over time and exhibit catastrophic forgetting, the sharp decline in performance on previously learned tasks after learning a new one. Specifically, when data scarcity is the issue, the model shows a rapid drop in performance on previously learned instruments after learning new data with new instruments. The problem becomes worse when privacy concerns prevent releasing the old instruments' dataset for the old model, and the data for new or updated versions of the instruments is unavailable to the continual learning model. For this purpose, we develop a privacy-preserving synthetic continual semantic segmentation framework by blending and harmonizing (i) open-source old-instrument foregrounds with a synthesized background, without revealing real patient data, and (ii) new-instrument foregrounds with an extensively augmented real background. To boost balanced logit distillation from the old model to the continual learning model, we design overlapping class-aware temperature normalization (CAT) by controlling model learning utility. We also introduce multi-scale shifted-feature distillation (SD) to maintain long- and short-range spatial relationships among the semantic objects, where conventional short-range spatial features with limited information reduce the power of feature distillation. We demonstrate the effectiveness of our framework on the EndoVis 2017 and 2018 instrument segmentation datasets in a generalized continual learning setting. Code is available at https://github.com/XuMengyaAmy/Synthetic_CAT_SD.
... The other set of approaches utilizes model parameter sharing. Studies 35,36 show that retraining later layers of the neural network models can effectively capture domain-specific information. ...
Preprint
Wearable Internet of Things (WIoT) and Artificial Intelligence (AI) are rapidly emerging technologies for healthcare. These technologies enable seamless data collection and precise analysis toward fast, resource-abundant, and personalized patient care. However, conventional machine learning workflow requires data to be transferred to the remote cloud server, which leads to significant privacy concerns. To tackle this problem, researchers have proposed federated learning, where end-point users collaboratively learn a shared model without sharing local data. However, data heterogeneity, i.e., variations in data distributions within a client (intra-client) or across clients (inter-client), degrades the performance of federated learning. Existing state-of-the-art methods mainly consider inter-client data heterogeneity, whereas intra-client variations have not received much attention. To address intra-client variations in federated learning, we propose a federated clustered multi-domain learning algorithm based on ClusterGAN, multi-domain learning, and graph neural networks. We applied the proposed algorithm to a case study on stress-level prediction, and our proposed algorithm outperforms two state-of-the-art methods by 4.4% in accuracy and 0.06 in the F1 score. In addition, we demonstrate the effectiveness of the proposed algorithm by investigating variants of its different modules.
... The technique used in this analysis, termed the 'forgetting frontier', is a measure of the maximum performance on new data learned for a given stable model performance on old data. A comparison of accuracy loss against model parameter size is given in Table 4 [125]. Their approach focused on using network sharing in the unique clone-and-branch technique, where the cloned layers provide a better starting point for the weights than randomly initialised ones, resulting in faster-learning kernels and faster convergence. ...
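The clone-and-branch warm start described in the snippet above can be illustrated with a toy sketch (the dictionary-of-branches model and task names are invented for illustration): a branch for a new task starts as a copy of already-trained layers instead of random values.

```python
import copy
import random

# Toy sketch of the clone-and-branch idea: when a new task arrives, its
# task-specific layers are cloned from an existing trained branch rather than
# randomly initialized, giving the new branch a warm start.

def random_init(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def add_branch(model, new_task, clone_from=None, rng=None):
    """Attach a branch for `new_task`; clone weights if a source is given."""
    if clone_from is not None:
        # Warm start: an independent copy of the trained branch's weights.
        model[new_task] = copy.deepcopy(model[clone_from])
    else:
        # Cold start: random initialization.
        model[new_task] = random_init(2, 2, rng or random.Random(0))
    return model

model = {"task_a": [[0.5, -0.2], [0.1, 0.3]]}   # previously trained branch
model = add_branch(model, "task_b", clone_from="task_a")
print(model["task_b"] == model["task_a"])  # True: cloned starting point
```

The clone is a deep copy, so subsequent training of the new branch leaves the original task's weights untouched.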
Article
Deep learning based visual cognition has greatly improved the accuracy of defect detection, reducing processing times and increasing product throughput across a variety of manufacturing use cases. There is, however, a continuing need for rigorous procedures to dynamically update model-based detection methods that use sequential streaming during the training phase. This paper reviews how new process, training or validation information is rigorously incorporated in real time when detection exceptions arise during inspection. In particular, consideration is given to how new tasks, classes or decision pathways are added to existing models or datasets in a controlled fashion. An analysis of studies from the incremental learning literature is presented, with emphasis on the mitigation of process complexity challenges such as catastrophic forgetting. Further, practical implementation issues known to affect the complexity of deep learning model architectures, including memory allocation for incoming sequential data and incremental learning accuracy, are considered. The paper highlights case study results and methods that have been used to successfully mitigate such real-time manufacturing challenges.
... In the literature, the term "incremental learning" refers to incremental network growth, network shrinking, or online learning. Other terms are also used, such as lifelong learning, constructive learning, evolutionary learning, stepwise learning and continual learning [3,4]. ...
... Most incremental learning methods can be grouped into families of techniques with similar characteristics, reflecting different perspectives on solving the catastrophic forgetting problem: mask-based methods [5,6], architecture expansion methods [4,7], regularization methods [8 -13], and pseudo-rehearsal methods [14 -19]. ...
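Of these families, the regularization-based one admits a particularly compact illustration (a hypothetical EWC-like penalty with uniform parameter importance, a simplifying assumption; the numbers are invented): a quadratic term anchors the weights to their values after the previous task, so drifting far from the old solution is discouraged even when it lowers the new task's loss.

```python
# Illustrative sketch of a regularization-family continual learning method:
# a quadratic penalty anchors parameters to their post-previous-task values
# (an EWC-like scheme with uniform importance weights, assumed for simplicity).

def continual_loss(task_loss, params, old_params, lam=1.0):
    """New-task loss plus a penalty for moving away from the old solution."""
    penalty = sum((p - o) ** 2 for p, o in zip(params, old_params))
    return task_loss + lam * penalty

old = [1.0, 1.0]                                   # weights after the old task
near = continual_loss(0.30, [1.1, 0.9], old)       # small drift, small penalty
far = continual_loss(0.25, [3.0, -2.0], old)       # big drift, big penalty
print(near < far)  # True: staying close to old weights is preferred
```

Methods such as EWC refine this by weighting each parameter's penalty with an estimate of its importance to the old task, rather than the uniform weight used here.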
Article
In this paper, the relevance of developing methods and algorithms for neural network incremental learning is shown. Families of incremental learning techniques are presented. The possibility of using the extreme learning machine for incremental learning is assessed. Experiments show that the extreme learning machine is suitable for incremental learning, but as the number of training examples increases, the neural network becomes unsuitable for further learning. To solve this problem, we propose a neural network incremental learning algorithm that alternately uses the extreme learning machine to correct only the output-layer weights (operation mode) and the backpropagation method (deep learning) to correct all network weights (sleep mode). During the operation mode, the neural network is assumed to produce results or learn from new tasks, optimizing its weights in the sleep mode. The proposed algorithm features the ability to adapt in real time to changing external conditions in the operation mode. The effectiveness of the proposed algorithm is shown by an example of solving an approximation problem. Approximation results after each step of the algorithm are presented. A comparison is made of the mean square error values when using the extreme learning machine alone for incremental learning and when using the developed algorithm of alternate neural network incremental learning.
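The extreme-learning-machine step used in the operation mode above can be sketched on toy data (the data, seed and two-unit hidden layer are assumptions for illustration): hidden weights stay fixed at random values while the output weights are obtained in closed form by least squares.

```python
import math
import random

# Toy sketch of an extreme learning machine (ELM): the hidden layer is drawn
# at random and never trained; only the output weights are fitted, in closed
# form, by solving the 2x2 normal equations of a least-squares problem.

rng = random.Random(42)
W = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]  # (weight, bias) pairs

def hidden(x):
    """Fixed random feature map (tanh units)."""
    return [math.tanh(w * x + b) for w, b in W]

def fit_output_weights(xs, ys):
    """Solve H^T H beta = H^T y for the two output weights."""
    H = [hidden(x) for x in xs]
    a = sum(h[0] * h[0] for h in H)
    b = sum(h[0] * h[1] for h in H)
    d = sum(h[1] * h[1] for h in H)
    r0 = sum(h[0] * y for h, y in zip(H, ys))
    r1 = sum(h[1] * y for h, y in zip(H, ys))
    det = a * d - b * b
    return [(d * r0 - b * r1) / det, (a * r1 - b * r0) / det]

xs = [0.0, 0.5, 1.0, 1.5]   # toy inputs (assumed data)
ys = [0.1, 0.4, 0.8, 1.1]   # toy targets
beta = fit_output_weights(xs, ys)  # "operation mode": only this layer changes

def predict(x):
    return sum(bi * hi for bi, hi in zip(beta, hidden(x)))
```

Because only a linear solve is needed per update, this step is cheap enough for online use; the paper's sleep mode would then retrain all weights with backpropagation.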
... Despite its success in various applications, DRL faces several challenges that must be addressed to fully realize its potential. One of the primary challenges is the high computational cost required for training deep neural networks [14]. Training a DRL agent often requires large amounts of data and computation resources, which can make it impractical for some applications. ...
Article
The purpose of the research is to explore and develop Deep Reinforcement Learning and Q-Learning algorithms in order to improve Ethereum cybersecurity against contract vulnerabilities, support the smart contract market, and establish research leadership in the area. Deep Reinforcement Learning (Deep RL) is gaining popularity among AI researchers due to its ability to handle complex, dynamic, and particularly high-dimensional cyber protection problems. The benchmark of RL is goal-oriented behavior that increases rewards, decreases penalties or losses, and enhances real-time interaction between an agent and its surroundings. The paper examines the three major cryptocurrencies (Bitcoin, Litecoin and Ethereum) and the role played by cyber-attacks. The Design Science Research Paradigm, as applied in Information Systems research, was used, as it is hinged on the idea that information and understanding of a design problem and its solution are attained in the crafting of an artefact. The proposed constructs were in the form of Deep Reinforcement Learning and Q-Learning algorithms designed to improve Ethereum cybersecurity. Smart contracts on the Ethereum blockchain can automatically enforce contracts made between two unknown parties. Blockchain (BC) and artificial intelligence (AI) are used together to strengthen and complement one another. Consensus algorithms (CAs) of BC and deep reinforcement learning (DRL) in ETS were thoroughly reviewed. In order to integrate many DCRs and provide grid services, this article suggests an effective incentive-based autonomous DCR control and management framework. This framework simultaneously adjusts the grid's active power with accuracy, optimizes DCR allocations, and increases profits for all prosumers and system operators. The best incentives in a continuous action space to persuade prosumers to reduce their energy consumption were found using a model-free deep deterministic policy gradient-based strategy.
Extensive experiments were carried out using real-world data to demonstrate the framework's efficacy.