distributed training
Recently Published Documents


TOTAL DOCUMENTS: 189 (FIVE YEARS 85)
H-INDEX: 8 (FIVE YEARS 3)

2022 ◽  
Author(s):  
Natali Alfonso Burgos ◽  
Karol Kiš ◽  
Peter Bakarac ◽  
Michal Kvasnica ◽  
Giovanni Licitra

We explore a bilingual next-word predictor (NWP) under federated optimization for a mobile application. A character-based LSTM is server-trained on English and Dutch texts from a custom parallel corpus. This is used as the target performance. We simulate a federated learning environment to assess the feasibility of distributed training for the same model. The popular Federated Averaging (FedAvg) algorithm is used as the aggregation method. We show that the federated LSTM achieves decent performance, yet it is still sub-optimal. We suggest possible next steps to bridge this performance gap. Furthermore, we explore the effects of language imbalance by varying the ratio of English and Dutch training texts (or clients). We show the model upholds the performance of the balanced case up to an 80/20 imbalance before decaying rapidly. Lastly, we describe the implementation of local client training, word prediction and client-server communication in a custom virtual keyboard for Android platforms. Additionally, homomorphic encryption is applied to provide secure aggregation, guarding the user from malicious servers.
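
The aggregation step of Federated Averaging mentioned in this abstract can be sketched as a weighted average of client parameter vectors, with each weight proportional to the client's number of local samples. This is a minimal illustration only: the function name `fedavg`, the flat-list model representation, and the two toy clients are assumptions, not the authors' implementation, and the homomorphic encryption layer is not shown.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Weighted average of client parameter vectors (FedAvg aggregation step).

    Each client's contribution is proportional to its number of local samples.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, w in enumerate(weights):
            aggregated[i] += w * size / total
    return aggregated


# Hypothetical 80/20 split between an English and a Dutch client.
english = [1.0, 0.0]
dutch = [0.0, 1.0]
global_model = fedavg([english, dutch], [80, 20])
```

With an 80/20 split in sample counts, the larger client dominates the average, which gives some intuition for why a strong language imbalance can pull the shared model toward one language.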


Computing ◽  
2022 ◽  
Author(s):  
Adrián Castelló ◽  
Mar Catalán ◽  
Manuel F. Dolz ◽  
Enrique S. Quintana-Ortí ◽  
José Duato

Author(s):  
Ganesan Ponnuswami ◽  
Sriram Kailasam ◽  
Dileep Aroor Dinesh

2021 ◽  
Vol 12 (1) ◽  
pp. 292
Author(s):  
Yunyong Ko ◽  
Sang-Wook Kim

The recent unprecedented success of deep learning (DL) in various fields is underpinned by its use of large-scale data and models. Training a large-scale deep neural network (DNN) model with large-scale data, however, is time-consuming. To speed up the training of massive DNN models, data-parallel distributed training based on the parameter server (PS) has been widely applied. In general, synchronous PS-based training suffers from synchronization overhead, especially in heterogeneous environments. To reduce this overhead, asynchronous PS-based training uses asynchronous communication between the PS and the workers, so that the PS processes the request of each worker independently without waiting. Despite the performance improvement of asynchronous training, however, it inevitably introduces differences among the local models of the workers, and such differences may slow model convergence. To address this problem, we propose a novel asynchronous PS-based training algorithm, SHAT, that considers (1) the scale of distributed training and (2) the heterogeneity among workers in order to reduce the difference among the local models of the workers. The extensive empirical evaluation demonstrates that (1) the model trained by SHAT converges to an accuracy up to 5.22% higher than that of state-of-the-art algorithms, and (2) the model convergence of SHAT is robust under various heterogeneous environments.
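
The staleness problem described in this abstract can be illustrated with a toy asynchronous parameter server that applies each gradient as soon as it arrives. This is a generic sketch under stated assumptions, not the SHAT algorithm: the class `AsyncParameterServer`, the version counter, and the quadratic stand-in loss are all hypothetical, and the two workers are simulated sequentially rather than with real concurrency.

```python
from typing import List, Tuple


class AsyncParameterServer:
    """Toy parameter server that applies each gradient as soon as it arrives."""

    def __init__(self, params: List[float], lr: float) -> None:
        self.params = list(params)
        self.lr = lr
        self.version = 0  # number of updates applied so far

    def pull(self) -> Tuple[List[float], int]:
        """Return a copy of the current parameters and their version."""
        return list(self.params), self.version

    def push(self, grad: List[float], pulled_version: int) -> int:
        """Apply a gradient immediately; return its staleness in updates."""
        staleness = self.version - pulled_version
        self.params = [p - self.lr * g for p, g in zip(self.params, grad)]
        self.version += 1
        return staleness


def grad_quadratic(params: List[float]) -> List[float]:
    """Gradient of f(w) = 0.5 * ||w||^2, used as a stand-in loss."""
    return list(params)


server = AsyncParameterServer([1.0], lr=0.5)
fast_params, fast_ver = server.pull()   # fast worker pulls version 0
slow_params, slow_ver = server.pull()   # slow worker also pulls version 0
s_fast = server.push(grad_quadratic(fast_params), fast_ver)
# The slow worker's gradient was computed on parameters that are now outdated.
s_slow = server.push(grad_quadratic(slow_params), slow_ver)
```

The slow worker's push has a staleness of one update: its gradient was computed on a model the server has already moved away from. In heterogeneous clusters this gap grows, which is the divergence among local models that the abstract sets out to reduce.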


2021 ◽  
Author(s):  
Mohammed Adnan ◽  
Shivam Kalra ◽  
Jesse C. Cresswell ◽  
Graham W. Taylor ◽  
Hamid Tizhoosh

The artificial intelligence revolution has been spurred forward by the availability of large-scale datasets. In contrast, the paucity of large-scale medical datasets hinders the application of machine learning in healthcare. The lack of publicly available multi-centric and diverse datasets mainly stems from confidentiality and privacy concerns around sharing medical data. To demonstrate a feasible path forward in medical imaging, we conduct a case study of applying a differentially private federated learning framework for analysis of histopathology images, the largest and perhaps most complex medical images. We study the effects of IID and non-IID distributions along with the number of healthcare providers, i.e., hospitals and clinics, and the individual dataset sizes, using The Cancer Genome Atlas (TCGA) dataset, a public repository, to simulate a distributed environment. We empirically compare the performance of private, distributed training to conventional training and demonstrate that distributed training can achieve similar performance with strong privacy guarantees. We also study the effect of different source domains for histopathology images by evaluating the performance using external validation. Our work indicates that differentially private federated learning is a viable and reliable framework for the collaborative development of machine learning models in medical image analysis.
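
One common way to make federated aggregation differentially private is to clip each client update to a fixed L2 norm and add Gaussian noise to the average. The abstract does not state which mechanism the authors use, so the sketch below is an assumption chosen for illustration; `clip_norm`, `noise_multiplier`, and the toy per-hospital updates are hypothetical, and the accounting that turns the noise level into an (epsilon, delta) guarantee is not shown.

```python
import math
import random
from typing import List, Sequence


def clip_l2(update: Sequence[float], clip_norm: float) -> List[float]:
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]


def private_average(updates: Sequence[Sequence[float]], clip_norm: float,
                    noise_multiplier: float, rng: random.Random) -> List[float]:
    """Average clipped client updates and add Gaussian noise to the result."""
    n = len(updates)
    clipped = [clip_l2(u, clip_norm) for u in updates]
    # Noise std on the sum is noise_multiplier * clip_norm; dividing by n
    # gives the std on the average.
    sigma = noise_multiplier * clip_norm / n
    dim = len(clipped[0])
    return [sum(c[i] for c in clipped) / n + rng.gauss(0.0, sigma)
            for i in range(dim)]


hospital_updates = [[3.0, 4.0], [0.3, 0.4]]  # hypothetical per-hospital updates
clipped_first = clip_l2(hospital_updates[0], clip_norm=1.0)
# noise_multiplier=0.0 disables the noise so the clipping effect is visible.
noiseless = private_average(hospital_updates, 1.0, 0.0, random.Random(0))
```

Clipping bounds how much any single hospital can shift the average, and the noise then masks what remains of an individual contribution. A larger `noise_multiplier` would typically strengthen privacy at some cost in accuracy.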


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Xiaodong Wang ◽  
Zhe’nan He ◽  
Ying Wang ◽  
Linlin Dang ◽  
Weifang Han ◽  
...  

The intestine is an important organ of the human body, and its internal structure often needs to be observed in clinical applications to provide a basis for accurate diagnosis. However, because the intestinal data obtained by a single institution is limited, deep learning models cannot be trained effectively, and the results are unsatisfactory. For this reason, we propose a distributed training method based on federated learning to alleviate the problems of scarce patient sample data, unshared data, and uneven data distribution. Blockchain is introduced to enhance the interaction between networks and to remove the single point of failure of the federated learning server. We fully exploit the multiscale features of the samples to construct a fusion enhancement model and an intestinal segmentation module for accurate positioning. At the local end, the centerline extraction algorithm is optimized, with the edge as the main cue and the source as the auxiliary cue, to realize centerline extraction.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Qingliang Meng ◽  
Meiyu Huang ◽  
Yao Xu ◽  
Naijin Liu ◽  
Xueshuang Xiang

For the space-based remote sensing system, onboard intelligent processing based on deep learning has become an inevitable trend. To adapt to the dynamic changes of the observation scenes, there is an urgent need to perform distributed deep learning onboard to fully utilize the plentiful real-time sensing data of multiple satellites from a smart constellation. However, the network bandwidth of the smart constellation is very limited. Therefore, it is of great significance to carry out distributed training research in a low-bandwidth environment. This paper proposes a Randomized Decentralized Parallel Stochastic Gradient Descent (RD-PSGD) method for distributed training in a low-bandwidth network. To reduce the communication cost, each node in RD-PSGD just randomly transfers part of the information of the local intelligent model to its neighborhood. We further speed up the algorithm by optimizing the programming of random index generation and parameter extraction. For the first time, we theoretically analyze the convergence property of the proposed RD-PSGD and validate the advantage of this method by simulation experiments on various distributed training tasks for image classification on different benchmark datasets and deep learning network architectures. The results show that RD-PSGD can effectively save the time and bandwidth cost of distributed training and reduce the complexity of parameter selection compared with the TopK-based method. The method proposed in this paper provides a new perspective for the study of onboard intelligent processing, especially for online learning on a smart satellite constellation.
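
The core communication idea in this abstract, sending only a randomly chosen part of the local model to a neighbor, can be sketched as follows. This is not the RD-PSGD update rule from the paper: the `fraction` parameter, the `{index: value}` message format, and the simple 50/50 averaging on the received coordinates are assumptions made for illustration.

```python
import random
from typing import Dict, List


def random_partial_message(params: List[float], fraction: float,
                           rng: random.Random) -> Dict[int, float]:
    """Pick a random subset of coordinates and return them as {index: value}."""
    k = max(1, round(len(params) * fraction))
    indices = rng.sample(range(len(params)), k)
    return {i: params[i] for i in indices}


def merge_message(local: List[float], message: Dict[int, float]) -> List[float]:
    """Average received coordinates with local ones; leave the rest unchanged."""
    merged = list(local)
    for i, value in message.items():
        merged[i] = 0.5 * (local[i] + value)
    return merged


rng = random.Random(42)
node_a = [1.0] * 10
node_b = [0.0] * 10
msg = random_partial_message(node_a, fraction=0.3, rng=rng)  # 3 of 10 values sent
node_b_new = merge_message(node_b, msg)
```

Only three of the ten parameters cross the link in this round, so the bandwidth cost scales with `fraction` rather than with the model size. Unlike a TopK-style scheme, the indices are chosen at random, so no sorting of parameter magnitudes is needed.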

