arXiv:2405.00644

ConstrainedZero: Chance-Constrained POMDP Planning using Learned Probabilistic Failure Surrogates and Adaptive Safety Constraints

Authors: Robert J. Moss, Arec Jamgochian, Johannes Fischer, Anthony Corso, Mykel J. Kochenderfer

Abstract: To plan safely in uncertain environments, agents must balance utility with safety constraints. Safe planning problems can be modeled as a chance-constrained partially observable Markov decision process (CC-POMDP) and solutions often use expensive rollouts or heuristics to estimate the optimal value and action-selection policy. This work introduces the ConstrainedZero policy iteration algorithm that solves CC-POMDPs in belief space by learning neural network approximations of the optimal value and policy with an additional network head that estimates the failure probability given a belief. This failure probability guides safe action selection during online Monte Carlo tree search (MCTS). To avoid overemphasizing search based on the failure estimates, we introduce $Δ$-MCTS, which uses adaptive conformal inference to update the failure threshold during planning. The approach is tested on a safety-critical POMDP benchmark, an aircraft collision avoidance system, and the sustainability problem of safe CO$_2$ storage. Results show that by separating safety constraints from the objective we can achieve a target level of safety without optimizing the balance between rewards and costs.

Submitted 1 May, 2024; originally announced May 2024.

Comments: In Proceedings of the 2024 International Joint Conference on Artificial Intelligence (IJCAI)
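
The abstract above mentions adaptive conformal inference for updating a failure threshold. As a rough illustration only, the sketch below applies the generic adaptive conformal inference update, alpha <- alpha + gamma * (target - err), to a synthetic stream of uniform scores. It is not the paper's Δ-MCTS procedure: the helper name `adapt_threshold`, the step size `gamma`, and the synthetic scores are all assumptions made for this example.

```python
import random

def adapt_threshold(scores, target=0.1, gamma=0.05, alpha0=0.1):
    """Run the generic adaptive conformal inference update on a score stream.

    Hypothetical helper: scores are assumed to lie in [0, 1), and a score
    counts as an error when it exceeds the current cutoff 1 - alpha.
    Returns the empirical error rate and the final alpha.
    """
    alpha = alpha0
    errors = 0
    for s in scores:
        err = 1 if s > 1.0 - alpha else 0
        errors += err
        # An error lowers alpha (raising the cutoff); a non-error raises alpha.
        alpha += gamma * (target - err)
    return errors / len(scores), alpha

# Synthetic stream of uniform scores (an assumption, not data from the paper).
rng = random.Random(0)
stream = [rng.random() for _ in range(5000)]
rate, final_alpha = adapt_threshold(stream)
print(f"empirical error rate: {rate:.3f}, final alpha: {final_alpha:.3f}")
```

Under this update the long-run error rate is pulled toward `target` regardless of how the scores are distributed, which is the property that makes the rule attractive for adjusting a threshold online.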

arXiv:2310.03217

Formal and Practical Elements for the Certification of Machine Learning Systems

Authors: Jean-Guillaume Durand, Arthur Dubois, Robert J. Moss

Abstract: Over the past decade, machine learning has demonstrated impressive results, often surpassing human capabilities in sensing tasks relevant to autonomous flight. Unlike traditional aerospace software, the parameters of machine learning models are not hand-coded nor derived from physics but learned from data. They are automatically adjusted during a training phase, and their values do not usually correspond to physical requirements. As a result, requirements cannot be directly traced to lines of code, hindering the current bottom-up aerospace certification paradigm. This paper attempts to address this gap by 1) demystifying the inner workings and processes to build machine learning models, 2) formally establishing theoretical guarantees given by those processes, and 3) complementing these formal elements with practical considerations to develop a complete certification argument for safety-critical machine learning systems. Based on a scalable statistical verifier, our proposed framework is model-agnostic and tool-independent, making it adaptable to many use cases in the industry. We demonstrate results on a widespread application in autonomous flight: vision-based landing.

Submitted 4 October, 2023; originally announced October 2023.

Comments: Best of Conference at the 2023 Digital Avionics Systems Conference (DASC)

arXiv:2306.00249

BetaZero: Belief-State Planning for Long-Horizon POMDPs using Learned Approximations

Authors: Robert J. Moss, Anthony Corso, Jef Caers, Mykel J. Kochenderfer

Abstract: Real-world planning problems, including autonomous driving and sustainable energy applications like carbon storage and resource exploration, have recently been modeled as partially observable Markov decision processes (POMDPs) and solved using approximate methods. To solve high-dimensional POMDPs in practice, state-of-the-art methods use online planning with problem-specific heuristics to reduce planning horizons and make the problems tractable. Algorithms that learn approximations to replace heuristics have recently found success in large-scale fully observable domains. The key insight is the combination of online Monte Carlo tree search with offline neural network approximations of the optimal policy and value function. In this work, we bring this insight to partially observable domains and propose BetaZero, a belief-state planning algorithm for high-dimensional POMDPs. BetaZero learns offline approximations that replace heuristics to enable online decision making in long-horizon problems. We address several challenges inherent in large-scale partially observable domains; namely challenges of transitioning in stochastic environments, prioritizing action branching with a limited search budget, and representing beliefs as input to the network. To formalize the use of all limited search information, we train against a novel $Q$-weighted visit counts policy. We test BetaZero on various well-established POMDP benchmarks found in the literature and a real-world problem of critical mineral exploration. Experiments show that BetaZero outperforms state-of-the-art POMDP solvers on a variety of tasks.

Submitted 31 July, 2024; v1 submitted 31 May, 2023; originally announced June 2023.

Comments: Presented at the Reinforcement Learning Conference (RLC) 2024

Journal ref: RLJ, Volume 1, Issue (Number) 1, 2024

arXiv:2305.02449

doi 10.2514/1.I011395

Bayesian Safety Validation for Failure Probability Estimation of Black-Box Systems

Authors: Robert J. Moss, Mykel J. Kochenderfer, Maxime Gariel, Arthur Dubois

Abstract: Estimating the probability of failure is an important step in the certification of safety-critical systems. Efficient estimation methods are often needed due to the challenges posed by high-dimensional input spaces, risky test scenarios, and computationally expensive simulators. This work frames the problem of black-box safety validation as a Bayesian optimization problem and introduces a method that iteratively fits a probabilistic surrogate model to efficiently predict failures. The algorithm is designed to search for failures, compute the most-likely failure, and estimate the failure probability over an operating domain using importance sampling. We introduce three acquisition functions that aim to reduce uncertainty by covering the design space, optimize the analytically derived failure boundaries, and sample the predicted failure regions. Results show this Bayesian safety validation approach provides a more accurate estimate of failure probability with orders of magnitude fewer samples and performs well across various safety validation metrics. We demonstrate this approach on three test problems, a stochastic decision making system, and a neural network-based runway detection system. This work is open sourced (https://github.com/sisl/BayesianSafetyValidation.jl) and currently being used to supplement the FAA certification process of the machine learning components for an autonomous cargo aircraft.

Submitted 29 June, 2024; v1 submitted 3 May, 2023; originally announced May 2023.

Journal ref: AIAA Journal of Aerospace Information Systems (JAIS) 21.7 (2024): 533-546
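
The abstract above refers to estimating a failure probability with importance sampling. The sketch below is a minimal, generic illustration of that idea on a toy problem and is not the paper's surrogate-based method: the "system" is assumed to be a standard normal variable that fails when it exceeds a threshold of 4, and the proposal distribution, function name, and sample count are assumptions chosen for this example.

```python
import math
import random

def failure_probability_is(n_samples=20000, threshold=4.0, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Samples are drawn from an assumed proposal N(threshold, 1) centered on
    the failure boundary and reweighted by the density ratio p(x) / q(x).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # log p(x) - log q(x) for unit-variance Gaussians; constants cancel.
            log_w = -0.5 * x * x + 0.5 * (x - threshold) ** 2
            total += math.exp(log_w)
    return total / n_samples

estimate = failure_probability_is()
# Closed-form tail probability of the standard normal, for comparison.
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
print(f"importance sampling: {estimate:.3e}, exact: {exact:.3e}")
```

Sampling directly from N(0, 1) would produce almost no failures at this threshold, so shifting the proposal toward the failure region and correcting with weights is what lets a modest number of samples give a usable estimate.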

arXiv:2210.08975

Prioritizing emergency evacuations under compounding levels of uncertainty

Authors: Lisa J. Einstein, Robert J. Moss, Mykel J. Kochenderfer

Abstract: Well-executed emergency evacuations can save lives and reduce suffering. However, decision makers struggle to determine optimal evacuation policies given the chaos, uncertainty, and value judgments inherent in emergency evacuations. We propose and analyze a decision support tool for pre-crisis training exercises for teams preparing for civilian evacuations and explore the tool in the case of the 2021 U.S.-led evacuation from Afghanistan. We use different classes of Markov decision processes (MDPs) to capture compounding levels of uncertainty in (1) the priority category of who appears next at the gate for evacuation, (2) the distribution of priority categories at the population level, and (3) individuals' claimed priority category. We compare the number of people evacuated by priority status under eight heuristic policies. The optimized MDP policy achieves the best performance compared to all heuristic baselines. We also show that accounting for the compounding levels of model uncertainty incurs added complexity without improvement in policy performance. Useful heuristics can be extracted from the optimized policies to inform human decision makers. We open-source all tools to encourage robust dialogue about the trade-offs, limitations, and potential of integrating algorithms into high-stakes humanitarian decision-making.

Submitted 30 September, 2022; originally announced October 2022.

Comments: Submitted to the IEEE Global Humanitarian Technology Conference

arXiv:2011.02559

Adaptive Stress Testing of Trajectory Predictions in Flight Management Systems

Authors: Robert J. Moss, Ritchie Lee, Nicholas Visser, Joachim Hochwarth, James G. Lopez, Mykel J. Kochenderfer

Abstract: To find failure events and their likelihoods in flight-critical systems, we investigate the use of an advanced black-box stress testing approach called adaptive stress testing. We analyze a trajectory predictor from a developmental commercial flight management system which takes as input a collection of lateral waypoints and en-route environmental conditions. Our aim is to search for failure events relating to inconsistencies in the predicted lateral trajectories. The intention of this work is to find likely failures and report them back to the developers so they can address and potentially resolve shortcomings of the system before deployment. To improve search performance, this work extends the adaptive stress testing formulation to be applied more generally to sequential decision-making problems with episodic reward by collecting the state transitions during the search and evaluating at the end of the simulated rollout. We use a modified Monte Carlo tree search algorithm with progressive widening as our adversarial reinforcement learner. The performance is compared to direct Monte Carlo simulations and to the cross-entropy method as an alternative importance sampling baseline. The goal is to find potential problems otherwise not found by traditional requirements-based testing. Results indicate that our adaptive stress testing approach finds more failures and finds failures with higher likelihood relative to the baseline approaches.

Submitted 4 November, 2020; originally announced November 2020.

Comments: 10 pages, 10 figures, 6 algorithms. Digital Avionics Systems Conference (DASC) 2020

arXiv:2009.09043

Cross-Entropy Method Variants for Optimization

Authors: Robert J. Moss

Abstract: The cross-entropy (CE) method is a popular stochastic method for optimization due to its simplicity and effectiveness. Designed for rare-event simulations where the probability of a target event occurring is relatively small, the CE-method relies on enough objective function calls to accurately estimate the optimal parameters of the underlying distribution. Certain objective functions may be computationally expensive to evaluate, and the CE-method could potentially get stuck in local minima. This is compounded with the need to have an initial covariance wide enough to cover the design space of interest. We introduce novel variants of the CE-method to address these concerns. To mitigate expensive function calls, during optimization we use every sample to build a surrogate model to approximate the objective function. The surrogate model augments the belief of the objective function with less expensive evaluations. We use a Gaussian process for our surrogate model to incorporate uncertainty in the predictions which is especially helpful when dealing with sparse data. To address local minima convergence, we use Gaussian mixture models to encourage exploration of the design space. We experiment with evaluation scheduling techniques to reallocate true objective function calls earlier in the optimization when the covariance is the largest. To test our approach, we created a parameterized test objective function with many local minima and a single global minimum. Our test function can be adjusted to control the spread and distinction of the minima. Experiments were run to stress the cross-entropy method variants and results indicate that the surrogate model-based approach reduces local minima convergence using the same number of function evaluations.

Submitted 18 September, 2020; originally announced September 2020.

Comments: 9 pages, 6 figures, code available at https://github.com/mossr/CrossEntropyVariants.jl
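
For readers unfamiliar with the baseline that the abstract above builds on, the following is a minimal sketch of the plain cross-entropy method in one dimension with a single Gaussian proposal. It does not include the surrogate-model, mixture-model, or scheduling variants the paper introduces, and the function name, sample sizes, and quadratic test objective are assumptions made for this example.

```python
import random
import statistics

def cross_entropy_minimize(f, mean, std, n_samples=100, n_elite=10,
                           iterations=50, seed=0):
    """Minimize a 1-D function f with the basic cross-entropy method.

    Each iteration samples from N(mean, std), keeps the n_elite samples
    with the lowest objective values, and refits mean and std to them.
    """
    rng = random.Random(seed)
    for _ in range(iterations):
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        samples.sort(key=f)
        elite = samples[:n_elite]
        mean = statistics.fmean(elite)
        # Small floor keeps the proposal from collapsing to zero width.
        std = statistics.pstdev(elite) + 1e-12
    return mean

# Hypothetical test objective with its minimum at x = 3.
best = cross_entropy_minimize(lambda x: (x - 3.0) ** 2, mean=0.0, std=5.0)
print(f"estimated minimizer: {best:.4f}")
```

Because the proposal narrows around whichever region the elite samples occupy, an objective with several minima can trap this basic version, which is the failure mode the abstract says its variants aim to reduce.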

arXiv:2005.02979

doi 10.1613/jair.1.12716

A Survey of Algorithms for Black-Box Safety Validation of Cyber-Physical Systems

Authors: Anthony Corso, Robert J. Moss, Mark Koren, Ritchie Lee, Mykel J. Kochenderfer

Abstract: Autonomous cyber-physical systems (CPS) can improve safety and efficiency for safety-critical applications, but require rigorous testing before deployment. The complexity of these systems often precludes the use of formal verification and real-world testing can be too dangerous during development. Therefore, simulation-based techniques have been developed that treat the system under test as a black box operating in a simulated environment. Safety validation tasks include finding disturbances in the environment that cause the system to fail (falsification), finding the most-likely failure, and estimating the probability that the system fails. Motivated by the prevalence of safety-critical artificial intelligence, this work provides a survey of state-of-the-art safety validation techniques for CPS with a focus on applied algorithms and their modifications for the safety validation problem. We present and discuss algorithms in the domains of optimization, path planning, reinforcement learning, and importance sampling. Problem decomposition techniques are presented to help scale algorithms to large state spaces, which are common for CPS. A brief overview of safety-critical applications is given, including autonomous vehicles and aircraft collision avoidance systems. Finally, we present a survey of existing academic and commercially available safety validation tools.

Submitted 14 October, 2021; v1 submitted 6 May, 2020; originally announced May 2020.

Journal ref: Journal of Artificial Intelligence Research, vol. 72, p. 377-428, 2021

Showing 1–8 of 8 results for author: Moss, R J