Suicidal Pedestrian: Generation of Safety-Critical Scenarios for Autonomous Vehicles

Authors: Yuhang Yang, Kalle Kujanpaa, Amin Babadi, Joni Pajarinen, Alexander Ilin

Abstract: Developing reliable autonomous driving algorithms poses challenges in testing, particularly when it comes to safety-critical traffic scenarios involving pedestrians. An open question is how to simulate rare events, not necessarily found in autonomous driving datasets or scripted simulations, but which can occur in testing, and, in the end may lead to severe pedestrian related accidents. This paper presents a method for designing a suicidal pedestrian agent within the CARLA simulator, enabling the automatic generation of traffic scenarios for testing safety of autonomous vehicles (AVs) in dangerous situations with pedestrians. The pedestrian is modeled as a reinforcement learning (RL) agent with two custom reward functions that allow the agent to either arbitrarily or with high velocity to collide with the AV. Instead of significantly constraining the initial locations and the pedestrian behavior, we allow the pedestrian and autonomous car to be placed anywhere in the environment and the pedestrian to roam freely to generate diverse scenarios. To assess the performance of the suicidal pedestrian and the target vehicle during testing, we propose three collision-oriented evaluation metrics. Experimental results involving two state-of-the-art autonomous driving algorithms trained end-to-end with imitation learning from sensor data demonstrate the effectiveness of the suicidal pedestrian in identifying decision errors made by autonomous vehicles controlled by the algorithms.

Submitted 1 September, 2023; originally announced September 2023.

Comments: 6 pages; 5 figures; 2 tables

arXiv:2210.01426 [pdf, other]

Continuous Monte Carlo Graph Search

Authors: Kalle Kujanpää, Amin Babadi, Yi Zhao, Juho Kannala, Alexander Ilin, Joni Pajarinen

Abstract: Online planning is crucial for high performance in many complex sequential decision-making tasks. Monte Carlo Tree Search (MCTS) employs a principled mechanism for trading off exploration for exploitation for efficient online planning, and it outperforms comparison methods in many discrete decision-making domains such as Go, Chess, and Shogi. Subsequently, extensions of MCTS to continuous domains have been developed. However, the inherent high branching factor and the resulting explosion of the search tree size are limiting the existing methods. To address this problem, we propose Continuous Monte Carlo Graph Search (CMCGS), an extension of MCTS to online planning in environments with continuous state and action spaces. CMCGS takes advantage of the insight that, during planning, sharing the same action policy between several states can yield high performance. To implement this idea, at each time step, CMCGS clusters similar states into a limited number of stochastic action bandit nodes, which produce a layered directed graph instead of an MCTS search tree. Experimental evaluation shows that CMCGS outperforms comparable planning methods in several complex continuous DeepMind Control Suite benchmarks and 2D navigation and exploration tasks with limited sample budgets. Furthermore, CMCGS can be scaled up through parallelization, and it outperforms the Cross-Entropy Method (CEM) in continuous control with learned dynamics models.

Submitted 7 February, 2024; v1 submitted 4 October, 2022; originally announced October 2022.

Comments: Accepted at AAMAS 2024 (full paper & oral)

arXiv:2009.10337 [pdf, other]

Learning Task-Agnostic Action Spaces for Movement Optimization

Authors: Amin Babadi, Michiel van de Panne, C. Karen Liu, Perttu Hämäläinen

Abstract: We propose a novel method for exploring the dynamics of physically based animated characters, and learning a task-agnostic action space that makes movement optimization easier. Like several previous papers, we parameterize actions as target states, and learn a short-horizon goal-conditioned low-level control policy that drives the agent's state towards the targets. Our novel contribution is that with our exploration data, we are able to learn the low-level policy in a generic manner and without any reference movement data. Trained once for each agent or simulation environment, the policy improves the efficiency of optimizing both trajectories and high-level policies across multiple tasks and optimization algorithms. We also contribute novel visualizations that show how using target states as actions makes optimized trajectories more robust to disturbances; this manifests as wider optima that are easy to find. Due to its simplicity and generality, our proposed approach should provide a building block that can improve a large variety of movement optimization methods and applications.

Submitted 23 July, 2021; v1 submitted 22 September, 2020; originally announced September 2020.

Comments: Accepted as a regular paper by IEEE Transactions on Visualization and Computer Graphics (TVCG) in July 2021

arXiv:1909.07869 [pdf, other]

Visualizing Movement Control Optimization Landscapes

Authors: Perttu Hämäläinen, Juuso Toikka, Amin Babadi, C. Karen Liu

Abstract: A large body of animation research focuses on optimization of movement control, either as action sequences or policy parameters. However, as closed-form expressions of the objective functions are often not available, our understanding of the optimization problems is limited. Building on recent work on analyzing neural network training, we contribute novel visualizations of high-dimensional control optimization landscapes; this yields insights into why control optimization is hard and why common practices like early termination and spline-based action parameterizations make optimization easier. For example, our experiments show how trajectory optimization can become increasingly ill-conditioned with longer trajectories, but parameterizing control as partial target states---e.g., target angles converted to torques using a PD-controller---can act as an efficient preconditioner. Both our visualizations and quantitative empirical data also indicate that neural network policy optimization scales better than trajectory optimization for long planning horizons. Our work advances the understanding of movement optimization and our visualizations should also provide value in educational use.

Submitted 22 August, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

Comments: Accepted to IEEE Transactions on Visualization and Computer Graphics (IEEE TVCG)
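
Note: the abstract above mentions target angles being converted to torques with a PD controller. The following is a minimal sketch of that conversion, not the paper's implementation. The gains `kp` and `kd`, the single-joint model with unit inertia, the time step, and the function names `pd_torque` and `simulate` are all assumptions chosen for illustration.

```python
def pd_torque(target_angle, angle, angular_velocity, kp=50.0, kd=10.0):
    """Proportional-derivative control law (gains are illustrative assumptions).

    The proportional term pulls the joint toward the target angle;
    the derivative term damps the joint's angular velocity.
    """
    return kp * (target_angle - angle) - kd * angular_velocity


def simulate(target_angle, steps=4000, dt=0.001, inertia=1.0):
    """Integrate one hypothetical joint with semi-implicit Euler steps."""
    angle, velocity = 0.0, 0.0
    for _ in range(steps):
        torque = pd_torque(target_angle, angle, velocity)
        velocity += (torque / inertia) * dt  # update velocity from torque
        angle += velocity * dt               # then update angle from velocity
    return angle


if __name__ == "__main__":
    # The optimizer would choose target angles; the PD law turns them into torques.
    print(simulate(1.0))
```

Under this parameterization an optimizer searches over target angles rather than raw torques, which is the kind of action space the abstract describes as acting like a preconditioner.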

arXiv:1907.11842 [pdf, other]

Self-Imitation Learning of Locomotion Movements through Termination Curriculum

Authors: Amin Babadi, Kourosh Naderi, Perttu Hämäläinen

Abstract: Animation and machine learning research have shown great advancements in the past decade, leading to robust and powerful methods for learning complex physically-based animations. However, learning can take hours or days, especially if no reference movement data is available. In this paper, we propose and evaluate a novel combination of techniques for accelerating the learning of stable locomotion movements through self-imitation learning of synthetic animations. First, we produce synthetic and cyclic reference movement using a recent online tree search approach that can discover stable walking gaits in a few minutes. This allows us to use reinforcement learning with Reference State Initialization (RSI) to find a neural network controller for imitating the synthesized reference motion. We further accelerate the learning using a novel curriculum learning approach called Termination Curriculum (TC), that adapts the episode termination threshold over time. The combination of the RSI and TC ensures that simulation budget is not wasted in regions of the state space not visited by the final policy. As a result, our agents can learn locomotion skills in just a few hours on a modest 4-core computer. We demonstrate this by producing locomotion movements for a variety of characters.

Submitted 20 September, 2019; v1 submitted 27 July, 2019; originally announced July 2019.

Comments: 2019 ACM SIGGRAPH Conference on Motion, Interaction and Games (MIG 2019)

arXiv:1810.02541 [pdf, other]

PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation

Authors: Perttu Hämäläinen, Amin Babadi, Xiaoxiao Ma, Jaakko Lehtinen

Abstract: Proximal Policy Optimization (PPO) is a highly popular model-free reinforcement learning (RL) approach. However, we observe that in a continuous action space, PPO can prematurely shrink the exploration variance, which leads to slow progress and may make the algorithm prone to getting stuck in local optima. Drawing inspiration from CMA-ES, a black-box evolutionary optimization method designed for robustness in similar situations, we propose PPO-CMA, a proximal policy optimization approach that adaptively expands the exploration variance to speed up progress. With only minor changes to PPO, our algorithm considerably improves performance in Roboschool continuous control benchmarks. Our results also show that PPO-CMA, as opposed to PPO, is significantly less sensitive to the choice of hyperparameters, allowing one to use it in complex movement optimization tasks without requiring tedious tuning.

Submitted 3 November, 2020; v1 submitted 5 October, 2018; originally announced October 2018.

Comments: This paper has been accepted to IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2020). The arxiv version also includes an appendix that covers more results

arXiv:1808.06201 [pdf, other]

Intelligent Middle-Level Game Control

Authors: Amin Babadi, Kourosh Naderi, Perttu Hämäläinen

Abstract: We propose the concept of intelligent middle-level game control, which lies on a continuum of control abstraction levels between the following two dual opposites: 1) high-level control that translates player's simple commands into complex actions (such as pressing Space key for jumping), and 2) low-level control which simulates real-life complexities by directly manipulating, e.g., joint rotations of the character as it is done in the runner game QWOP. We posit that various novel control abstractions can be explored using recent advances in movement intelligence of game characters. We demonstrate this through design and evaluation of a novel 2-player martial arts game prototype. In this game, each player guides a simulated humanoid character by clicking and dragging body parts. This defines the cost function for an online continuous control algorithm that executes the requested movement. Our control algorithm uses Covariance Matrix Adaptation Evolution Strategy (CMA-ES) in a rolling horizon manner with custom population seeding techniques. Our playtesting data indicates that intelligent middle-level control results in producing novel and innovative gameplay without frustrating interface complexities.

Submitted 19 August, 2018; originally announced August 2018.

Comments: 2018 IEEE Conference on Computational Intelligence and Games (IEEE CIG 2018)

Showing 1–7 of 7 results for author: Babadi, A