DOI: 10.1145/3580305.3599349
research-article
Public Access

FedSkill: Privacy Preserved Interpretable Skill Learning via Imitation

Published: 04 August 2023

Abstract

Imitation learning, which replicates experts' skills from their demonstrations, has shown significant success in various decision-making tasks. However, two critical challenges still hinder the deployment of imitation learning techniques in real-world applications. First, existing methods lack the intrinsic interpretability needed to explain the underlying rationale of the learned skill, which makes the learned policy untrustworthy. Second, because expert demonstrations from each end user (client) are scarce, learning a policy from different data silos is necessary but challenging in privacy-sensitive applications such as finance and healthcare. To this end, we present a privacy-preserved interpretable skill learning framework (FedSkill) that enables global policy learning to incorporate data from different sources and provides explainable interpretations to each local user without violating privacy and data sovereignty. Specifically, our proposed interpretable skill learning model captures the varying patterns in the trajectories of expert demonstrations and extracts prototypical information as skills that provide implicit guidance for policy learning and explicit explanations in the reasoning process. Moreover, we design a novel aggregation mechanism coupled with the base skill learning model to preserve global information utilization and maintain local interpretability under the federated framework. Thorough experiments on three datasets and empirical studies demonstrate that FedSkill not only outperforms state-of-the-art imitation learning methods but also exhibits good interpretability in a federated setting. FedSkill is the first attempt to bridge the gaps among federated learning, interpretable machine learning, and imitation learning.
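The abstract names two ingredients, prototypes used as skills and an aggregation step across clients, without specifying either. The following is a minimal sketch of the general idea only, and it is not FedSkill's actual mechanism. It assumes that each client holds the same number of prototype vectors, that prototype k on one client already corresponds to prototype k on every other client, and that aggregation is a plain weighted average in the style of federated averaging. The function names `weighted_average` and `nearest_prototype` and all numbers are hypothetical.

```python
import math
from typing import List, Sequence

Prototypes = List[List[float]]


def weighted_average(client_prototypes: Sequence[Prototypes],
                     client_sizes: Sequence[int]) -> Prototypes:
    """Average aligned prototypes across clients, weighted by data size.

    Assumption: a generic federated-averaging step, not the aggregation
    mechanism proposed in the paper.
    """
    total = sum(client_sizes)
    n_protos = len(client_prototypes[0])
    dim = len(client_prototypes[0][0])
    merged = [[0.0] * dim for _ in range(n_protos)]
    for protos, size in zip(client_prototypes, client_sizes):
        weight = size / total
        for k in range(n_protos):
            for d in range(dim):
                merged[k][d] += weight * protos[k][d]
    return merged


def nearest_prototype(embedding: Sequence[float],
                      prototypes: Prototypes) -> int:
    """Return the index of the prototype closest to the embedding.

    The matched prototype serves as a human-readable reference case
    (the "skill") for the embedded trajectory segment.
    """
    distances = [math.dist(embedding, p) for p in prototypes]
    return distances.index(min(distances))


# Two hypothetical clients, each with two 2-dimensional prototypes.
client_a = [[0.0, 0.0], [4.0, 4.0]]   # learned from 1 demonstration
client_b = [[2.0, 0.0], [4.0, 8.0]]   # learned from 3 demonstrations

# Only prototype vectors are shared here; raw demonstrations stay local.
global_prototypes = weighted_average([client_a, client_b], [1, 3])

# Explain a new (hypothetical) trajectory embedding by its closest skill.
skill_index = nearest_prototype([1.0, 0.5], global_prototypes)
```

In this sketch only the prototype vectors leave each client, which is the property that makes federated training attractive for data silos. Sharing prototypes alone is not a formal privacy guarantee.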

Supplementary Material

MP4 File (<rtfp1142>-2min-promo.mp4)
In this video, we present our recent work, FedSkill, a privacy-preserved imitation learning framework that discovers interpretable skills. We first discuss two practical challenges of imitation learning from the perspective of expert demonstrations. Next, we provide a motivating example based on dynamic treatment recommendations to illustrate our proposed framework and contributions. Then, we give a quick overview of our proposed methodology, including the design of an interpretable skill learning model and the privacy-preserved federated framework with knowledge alignment and local interpretability acquisition. Lastly, we present part of the experimental results to demonstrate the effectiveness of FedSkill in yielding interpretable skills when exploiting demonstrations from multiple data sources.


Cited By

  • (2024) Adversarial client detection via non-parametric subspace monitoring in the internet of federated things. IISE Transactions, 1-13. DOI: 10.1080/24725854.2024.2367224. Online publication date: 29-Jul-2024

    Published In

    KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2023
    5996 pages
    ISBN:9798400701030
    DOI:10.1145/3580305
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. federated learning
    2. imitation learning
    3. interpretable machine learning
    4. prototype

    Qualifiers

    • Research-article

    Conference

    KDD '23
    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Article Metrics

    • Downloads (Last 12 months): 245
    • Downloads (Last 6 weeks): 25
    Reflects downloads up to 03 Oct 2024

    The conference sponsors are committed to making content openly accessible in a timely manner.
    This article is provided by ACM and the conference, through the ACM OpenTOC service.