short-paper

Visual Analysis of DDPG Models by Exploring the Space of Experience

Authors:

You Lu

VINCI '23: Proceedings of the 16th International Symposium on Visual Information Communication and Interaction

Article No.: 28, Pages 1 - 5

https://doi.org/10.1145/3615522.3615550

Published: 20 October 2023

Abstract

Deep Reinforcement Learning (DRL) has been remarkably successful, but a lack of RL expertise and the complexity of DRL models hinder model understanding. In this paper, we focus on visual analysis of experience data to improve the interpretability of DRL models. The analysis involves step aggregation, high-dimensional state data analysis, and spatio-temporal modeling of experience data. In addition, we introduce DDPGVis, a visual system that combines multiple views to show statistics and allows users to explore the experience space. Case studies confirm its effectiveness.

Index Terms

1. Human-centered computing
  1. Visualization
    1. Visualization techniques

Published In

VINCI '23: Proceedings of the 16th International Symposium on Visual Information Communication and Interaction

September 2023

308 pages

ISBN: 9798400707513

DOI: 10.1145/3615522

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper
Research
Refereed limited

Funding Sources

National Natural Science Foundation of China

Conference

VINCI 2023

VINCI 2023: The 16th International Symposium on Visual Information Communication and Interaction

September 22 - 24, 2023

Guangzhou, China

Acceptance Rates

Overall Acceptance Rate 71 of 193 submissions, 37%
