VADAF: Visualization for Abnormal Client Detection and Analysis in Federated Learning

Authors:

Wei Chen

ACM Transactions on Interactive Intelligent Systems (TiiS), Volume 11, Issue 3-4

Article No.: 26, Pages 1 - 23

https://doi.org/10.1145/3426866

Published: 03 September 2021

Abstract

Federated Learning (FL) provides a powerful solution to distributed machine learning on a large corpus of decentralized data. It ensures privacy and security by performing computation on devices (which we refer to as clients) based on local data to improve the shared global model. However, the inaccessibility of the data and the invisibility of the computation make it challenging to interpret and analyze the training process, and especially to distinguish potential client anomalies. Identifying these anomalies can help experts diagnose and improve FL models. We therefore propose a visual analytics system, VADAF, that depicts the training dynamics and facilitates the analysis of potential client anomalies. Specifically, we design a visualization scheme that supports massive training dynamics in the FL environment. Moreover, we introduce an anomaly detection method to detect potential client anomalies, which are then analyzed using both visual and objective estimation of the client model. Three case studies demonstrate the effectiveness of our system in understanding the FL training process and supporting abnormal client detection and analysis.

Index Terms

Computing methodologies
  Machine learning
    Learning paradigms
      Unsupervised learning
        Anomaly detection
Human-centered computing
  Visualization
    Visualization application domains
      Visual analytics

Published In

ACM Transactions on Interactive Intelligent Systems, Volume 11, Issue 3-4

December 2021

483 pages

ISSN: 2160-6455

EISSN: 2160-6463

DOI: 10.1145/3481699

Editor: Michelle X. Zhou, Juji, Inc., USA

Copyright © 2021 Association for Computing Machinery.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 September 2021

Accepted: 01 September 2020

Revised: 01 September 2020

Received: 01 November 2019

Qualifiers

Research-article
Refereed

Funding Sources

National Natural Science Foundation of China
Alibaba-Zhejiang University Joint Institute of Frontier Technologies (AZFT)

Cited By

Symeonides M, Trihinas D, Nikolaidis F (2024) FedMon: A Federated Learning Monitoring Toolkit. IoT 5:2 (227-249). Online publication date: 11-Apr-2024. https://doi.org/10.3390/iot5020012

Huang L, Cui W, Zhu B, Zhang H (2023) Visually Analysing the Fairness of Clustered Federated Learning with Non-IID Data. 2023 International Joint Conference on Neural Networks (IJCNN) (01-10). Online publication date: 18-Jun-2023. https://doi.org/10.1109/IJCNN54540.2023.10191762

Anitha G, Jegatheesan A, Baburaj E (2023) A Comparative Analysis of Federated Learning Towards Big data IoT with Future Perspectives. 2023 3rd International Conference on Computing and Information Technology (ICCIT) (518-525). Online publication date: 13-Sep-2023. https://doi.org/10.1109/ICCIT58132.2023.10273901

Chen W, Zhang T, Zhu H, Wang X, Wang Y (2022) Perspectives on cross-domain visual analysis of cyber-physical-social big data (三元空间大数据跨域可视化分析展望). Frontiers of Information Technology & Electronic Engineering 22:12 (1559-1564). Online publication date: 5-Jan-2022. https://doi.org/10.1631/FITEE.2100553
