
Showing 1–50 of 85 results for author: Abate, A

Searching in archive cs.
  1. arXiv:2408.11607  [pdf, other]

    cs.MA cs.AI cs.GT cs.LG eess.SY

    Networked Communication for Mean-Field Games with Function Approximation and Empirical Mean-Field Estimation

    Authors: Patrick Benjamin, Alessandro Abate

    Abstract: Recent works have provided algorithms by which decentralised agents, which may be connected via a communication network, can learn equilibria in Mean-Field Games from a single, non-episodic run of the empirical system. However, these algorithms are given for tabular settings: this computationally limits the size of players' observation space, meaning that the algorithms are not able to handle anyt…

    Submitted 21 August, 2024; originally announced August 2024.

  2. arXiv:2408.03093  [pdf, other]

    cs.LG cs.AI eess.SY

    Learning Provably Robust Policies in Uncertain Parametric Environments

    Authors: Yannik Schnitzer, Alessandro Abate, David Parker

    Abstract: We present a data-driven approach for learning MDP policies that are robust across stochastic environments whose transition probabilities are defined by parameters with an unknown distribution. We produce probably approximately correct (PAC) guarantees for the performance of these learned policies in a new, unseen environment over the unknown distribution. Our approach is based on finite samples o…

    Submitted 6 August, 2024; originally announced August 2024.

  3. arXiv:2407.10971  [pdf, other]

    cs.LG

    Walking the Values in Bayesian Inverse Reinforcement Learning

    Authors: Ondrej Bajgar, Alessandro Abate, Konstantinos Gatsis, Michael A. Osborne

    Abstract: The goal of Bayesian inverse reinforcement learning (IRL) is recovering a posterior distribution over reward functions using a set of demonstrations from an expert optimizing for a reward unknown to the learner. The resulting posterior over rewards can then be used to synthesize an apprentice policy that performs well on the same or a similar task. A key challenge in Bayesian IRL is bridging the c…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Published at the 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)

  4. arXiv:2406.15753  [pdf, other]

    cs.LG cs.AI stat.ML

    The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret

    Authors: Lukas Fluri, Leon Lang, Alessandro Abate, Patrick Forré, David Krueger, Joar Skalse

    Abstract: In reinforcement learning, specifying reward functions that capture the intended task can be very challenging. Reward learning aims to address this issue by learning the reward function. However, a learned reward model may have a low error on the training distribution, and yet subsequently produce a policy with large regret. We say that such a reward model has an error-regret mismatch. The main so…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 58 pages, 1 figure

  5. arXiv:2406.10023  [pdf, other]

    cs.LG cs.CL stat.ML

    Deep Bayesian Active Learning for Preference Modeling in Large Language Models

    Authors: Luckeciano C. Melo, Panagiotis Tigas, Alessandro Abate, Yarin Gal

    Abstract: Leveraging human preferences for steering the behavior of Large Language Models (LLMs) has demonstrated notable success in recent years. Nonetheless, data selection and labeling are still a bottleneck for these systems, particularly at large scale. Hence, selecting the most informative points for acquiring human feedback may considerably reduce the cost of preference labeling and unleash the furth…

    Submitted 14 June, 2024; originally announced June 2024.

  6. arXiv:2405.17304  [pdf, ps, other]

    cs.LO eess.SY

    Stochastic Omega-Regular Verification and Control with Supermartingales

    Authors: Alessandro Abate, Mirco Giacobbe, Diptarko Roy

    Abstract: We present for the first time a supermartingale certificate for $ω$-regular specifications. We leverage the Robbins & Siegmund convergence theorem to characterize supermartingale certificates for the almost-sure acceptance of Streett conditions on general stochastic processes, which we call Streett supermartingales. This enables effective verification and control of discrete-time stochastic dynami…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: The conference version of this manuscript appeared at CAV'24

  7. arXiv:2405.15723  [pdf, other]

    cs.LO cs.LG

    Bisimulation Learning

    Authors: Alessandro Abate, Mirco Giacobbe, Yannik Schnitzer

    Abstract: We introduce a data-driven approach to computing finite bisimulations for state transition systems with very large, possibly infinite state space. Our novel technique computes stutter-insensitive bisimulations of deterministic systems, which we characterize as the problem of learning a state classifier together with a ranking function for each class. Our procedure learns a candidate state classifi…

    Submitted 24 May, 2024; originally announced May 2024.

  8. arXiv:2405.08232  [pdf, other]

    cs.CE

    Distributionally Robust Aggregation of Electric Vehicle Flexibility

    Authors: Karan Mukhi, Chengrui Qu, Pengcheng You, Alessandro Abate

    Abstract: We address the problem of characterising the aggregate flexibility in populations of electric vehicles (EVs) with uncertain charging requirements. Building on previous results that provide exact characterisations of the aggregate flexibility in populations with known charging requirements, in this paper we extend the aggregation methods so that charging requirements are uncertain, but sampled from…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 6 pages, conference

  9. arXiv:2405.06624  [pdf, other]

    cs.AI

    Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

    Authors: David "davidad" Dalrymple, Joar Skalse, Yoshua Bengio, Stuart Russell, Max Tegmark, Sanjit Seshia, Steve Omohundro, Christian Szegedy, Ben Goldhaber, Nora Ammann, Alessandro Abate, Joe Halpern, Clark Barrett, Ding Zhao, Tan Zhi-Xuan, Jeannette Wing, Joshua Tenenbaum

    Abstract: Ensuring that AI systems reliably and robustly avoid harmful or dangerous behaviours is a crucial challenge, especially for AI systems with a high degree of autonomy and general intelligence, or systems used in safety-critical contexts. In this paper, we will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI. The core feature of these appro…

    Submitted 8 July, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

  10. arXiv:2404.18813  [pdf, other]

    eess.SY cs.LG cs.LO

    Safe Reach Set Computation via Neural Barrier Certificates

    Authors: Alessandro Abate, Sergiy Bogomolov, Alec Edwards, Kostiantyn Potomkin, Sadegh Soudjani, Paolo Zuliani

    Abstract: We present a novel technique for online safety verification of autonomous systems, which performs reachability analysis efficiently for both bounded and unbounded horizons by employing neural barrier certificates. Our approach uses barrier certificates given by parameterized neural networks that depend on a given initial set, unsafe sets, and time horizon. Such networks are trained efficiently off…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: IFAC Conference on Analysis and Design of Hybrid Systems

  11. arXiv:2404.03314  [pdf, other]

    cs.GT eess.SY

    Learning to Bid in Forward Electricity Markets Using a No-Regret Algorithm

    Authors: Arega Getaneh Abate, Dorsa Majdi, Jalal Kazempour, Maryam Kamgarpour

    Abstract: It is a common practice in the current literature of electricity markets to use game-theoretic approaches for strategic price bidding. However, they generally rely on the assumption that the strategic bidders have prior knowledge of rival bids, either perfectly or with some uncertainty. This is not necessarily a realistic assumption. This paper takes a different approach by relaxing such an assump…

    Submitted 4 April, 2024; originally announced April 2024.

  12. arXiv:2403.15398  [pdf]

    cs.CY

    An International and Multidisciplinary Teaching Experience with Real Industrial Team Project Development

    Authors: Martin Mellado, Eduardo Vendrell, Filomena Ferrucci, Andrea Abate, Detlef Zuhlke, Bernard Riera

    Abstract: This paper presents the design, objectives, experiences, and results of an international cooperation project funded by the European Commission in the context of the Erasmus Intensive Programme (IP, for short) designed to improve students' curricula. An IP is a short programme of study (minimum 2 weeks) that brings together university students and staff from at least three countries in order to enc…

    Submitted 17 February, 2024; originally announced March 2024.

    Comments: 21 pages

  13. arXiv:2403.06854  [pdf, other]

    cs.LG

    Quantifying the Sensitivity of Inverse Reinforcement Learning to Misspecification

    Authors: Joar Skalse, Alessandro Abate

    Abstract: Inverse reinforcement learning (IRL) aims to infer an agent's preferences (represented as a reward function $R$) from their behaviour (represented as a policy $π$). To do this, we need a behavioural model of how $π$ relates to $R$. In the current literature, the most common behavioural models are optimality, Boltzmann-rationality, and causal entropy maximisation. However, the true relationship bet…

    Submitted 11 March, 2024; originally announced March 2024.

  14. arXiv:2401.15838  [pdf, other]

    stat.ML cs.LG cs.MA math.OC stat.CO

    Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers

    Authors: Alexandros E. Tzikas, Licio Romao, Mert Pilanci, Alessandro Abate, Mykel J. Kochenderfer

    Abstract: Many machine learning applications require operating on a spatially distributed dataset. Despite technological advances, privacy considerations and communication constraints may prevent gathering the entire dataset in a central unit. In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers, which is commonly used in the optimization literatur…

    Submitted 28 January, 2024; originally announced January 2024.

  15. arXiv:2401.14811  [pdf, ps, other]

    cs.AI cs.LG

    On the Limitations of Markovian Rewards to Express Multi-Objective, Risk-Sensitive, and Modal Tasks

    Authors: Joar Skalse, Alessandro Abate

    Abstract: In this paper, we study the expressivity of scalar, Markovian reward functions in Reinforcement Learning (RL), and identify several limitations to what they can express. Specifically, we look at three classes of RL tasks; multi-objective RL, risk-sensitive RL, and modal RL. For each class, we derive necessary and sufficient conditions that describe when a problem in this class can be expressed usi…

    Submitted 26 January, 2024; originally announced January 2024.

    Journal ref: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:1974-1984, 2023

  16. arXiv:2312.11314  [pdf, other]

    cs.LG cs.LO eess.SY

    Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis

    Authors: Rohan Mitta, Hosein Hasanbeig, Jun Wang, Daniel Kroening, Yiannis Kantaros, Alessandro Abate

    Abstract: This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL), such that the safety constraint violations are bounded at any point during learning. In a variety of RL applications the safety of the agent is particularly important, e.g. autonomous platforms or robots that work in proximity of humans. As enforcing safety during training might severely limit th…

    Submitted 18 December, 2023; originally announced December 2023.

  17. arXiv:2312.06344  [pdf, other]

    eess.SY cs.LO

    Learning Robust Policies for Uncertain Parametric Markov Decision Processes

    Authors: Luke Rickard, Alessandro Abate, Kostas Margellos

    Abstract: Synthesising verifiably correct controllers for dynamical systems is crucial for safety-critical problems. To achieve this, it is important to account for uncertainty in a robust manner, while at the same time it is often of interest to avoid being overly conservative with the view of achieving a better cost. We propose a method for verifiably safe policy synthesis for a class of finite state mode…

    Submitted 15 May, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: 10 pages, accepted for oral presentation at L4DC

  18. arXiv:2311.09793  [pdf, other]

    eess.SY cs.LG cs.LO

    Fossil 2.0: Formal Certificate Synthesis for the Verification and Control of Dynamical Models

    Authors: Alec Edwards, Andrea Peruffo, Alessandro Abate

    Abstract: This paper presents Fossil 2.0, a new major release of a software tool for the synthesis of certificates (e.g., Lyapunov and barrier functions) for dynamical systems modelled as ordinary differential and difference equations. Fossil 2.0 is much improved from its original release, including new interfaces, a significantly expanded certificate portfolio, controller synthesis and enhanced extensibili…

    Submitted 16 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: HSCC 2024 Tool Paper

  19. arXiv:2311.09786  [pdf, other]

    eess.SY cs.AI cs.LO

    Correct-by-Construction Control for Stochastic and Uncertain Dynamical Models via Formal Abstractions

    Authors: Thom Badings, Nils Jansen, Licio Romao, Alessandro Abate

    Abstract: Automated synthesis of correct-by-construction controllers for autonomous systems is crucial for their deployment in safety-critical scenarios. Such autonomous systems are naturally modeled as stochastic dynamical models. The general problem is to compute a controller that provably satisfies a given task, represented as a probabilistic temporal logic specification. However, factors such as stochas…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: In Proceedings FMAS 2023, arXiv:2311.08987. arXiv admin note: text overlap with arXiv:2301.01526

    Journal ref: EPTCS 395, 2023, pp. 144-152

  20. arXiv:2310.01951  [pdf, other]

    cs.LG cs.AI

    Probabilistic Reach-Avoid for Bayesian Neural Networks

    Authors: Matthew Wicker, Luca Laurenti, Andrea Patane, Nicola Paoletti, Alessandro Abate, Marta Kwiatkowska

    Abstract: Model-based reinforcement learning seeks to simultaneously learn the dynamics of an unknown stochastic environment and synthesise an optimal policy for acting in it. Ensuring the safety and robustness of sequential decisions made through a policy in such an environment is a key challenge for policies intended for safety-critical scenarios. In this work, we investigate two complementary problems: f…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: 47 pages, 10 figures. arXiv admin note: text overlap with arXiv:2105.10134

  21. arXiv:2309.15257  [pdf, other]

    cs.LG cs.AI

    STARC: A General Framework For Quantifying Differences Between Reward Functions

    Authors: Joar Skalse, Lucy Farnik, Sumeet Ramesh Motwani, Erik Jenner, Adam Gleave, Alessandro Abate

    Abstract: In order to solve a task using reinforcement learning, it is necessary to first formalise the goal of that task as a reward function. However, for many real-world tasks, it is very difficult to manually specify a reward function that never incentivises undesirable behaviour. As a result, it is increasingly popular to use reward learning algorithms, which attempt to learn a reward fun…

    Submitted 11 March, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

  22. arXiv:2309.06090  [pdf, other]

    eess.SY cs.LG cs.LO

    A General Verification Framework for Dynamical and Control Models via Certificate Synthesis

    Authors: Alec Edwards, Andrea Peruffo, Alessandro Abate

    Abstract: An emerging branch of control theory specialises in certificate learning, concerning the specification of a desired (possibly complex) system behaviour for an autonomous or control model, which is then analytically verified by means of a function-based proof. However, the synthesis of controllers abiding by these complex requirements is in general a non-trivial task and may elude the most expert c…

    Submitted 1 July, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

  23. arXiv:2308.10587  [pdf, other]

    cs.FL

    Formal Analysis and Verification of Max-Plus Linear Systems

    Authors: Muhammad Syifa'ul Mufid, Andrea Micheli, Alessandro Abate, Alessandro Cimatti

    Abstract: Max-Plus Linear (MPL) systems are an algebraic formalism with practical applications in transportation networks, manufacturing and biological systems. In this paper, we investigate the problem of automatically analyzing the properties of MPL, taking into account both structural properties such as transient and cyclicity, and the open problem of user-defined temporal properties. We propose Time-Dif…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: 28 pages (including appendixes)

  24. arXiv:2307.15546  [pdf, other]

    cs.LO cs.LG eess.SY

    On the Trade-off Between Efficiency and Precision of Neural Abstraction

    Authors: Alec Edwards, Mirco Giacobbe, Alessandro Abate

    Abstract: Neural abstractions have been recently introduced as formal approximations of complex, nonlinear dynamical models. They comprise a neural ODE and a certified upper bound on the error between the abstract neural network and the concrete dynamical model. So far neural abstractions have exclusively been obtained as neural networks consisting entirely of $ReLU$ activation functions, resulting in neura…

    Submitted 2 October, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: Appeared at QEST 2023. Added codebase link; corrected Eq. 11

  25. arXiv:2307.05059  [pdf, ps, other]

    cs.GT cs.AI cs.MA

    On Imperfect Recall in Multi-Agent Influence Diagrams

    Authors: James Fox, Matt MacDermott, Lewis Hammond, Paul Harrenstein, Alessandro Abate, Michael Wooldridge

    Abstract: Multi-agent influence diagrams (MAIDs) are a popular game-theoretic model based on Bayesian networks. In some settings, MAIDs offer significant advantages over extensive-form game representations. Previous work on MAIDs has assumed that agents employ behavioural policies, which set independent conditional probability distributions over actions for each of their decisions. In settings with imperfec…

    Submitted 11 July, 2023; originally announced July 2023.

    Comments: In Proceedings TARK 2023, arXiv:2307.04005

    Journal ref: EPTCS 379, 2023, pp. 201-220

  26. arXiv:2306.02766  [pdf, other]

    cs.MA cs.AI cs.LG cs.SI eess.SY

    Networked Communication for Decentralised Agents in Mean-Field Games

    Authors: Patrick Benjamin, Alessandro Abate

    Abstract: We introduce networked communication to the mean-field game framework, in particular to oracle-free settings where $N$ decentralised agents learn along a single, non-episodic run of the empirical system. We prove that our architecture, with only a few reasonable assumptions about network structure, has sample guarantees bounded between those of the centralised- and independent-learning cases. We d…

    Submitted 28 June, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

  27. arXiv:2303.17618  [pdf, other]

    cs.LG eess.SY

    Data-driven abstractions via adaptive refinements and a Kantorovich metric [extended version]

    Authors: Adrien Banse, Licio Romao, Alessandro Abate, Raphaël M. Jungers

    Abstract: We introduce an adaptive refinement procedure for smart, and scalable abstraction of dynamical systems. Our technique relies on partitioning the state space depending on the observation of future outputs. However, this knowledge is dynamically constructed in an adaptive, asymmetric way. In order to learn the optimal structure, we define a Kantorovich-inspired metric between Markov chains, and we u…

    Submitted 30 October, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: This paper is an extended version of a CDC2023 submission

  28. arXiv:2303.13657  [pdf, other]

    math.OC cs.LG

    Policy Evaluation in Distributional LQR

    Authors: Zifan Wang, Yulong Gao, Siyi Wang, Michael M. Zavlanos, Alessandro Abate, Karl H. Johansson

    Abstract: Distributional reinforcement learning (DRL) enhances the understanding of the effects of the randomness in the environment by letting agents learn the distribution of a random return, rather than its expected value as in standard RL. At the same time, a main challenge in DRL is that policy evaluation in DRL typically relies on the representation of the return distribution, which needs to be carefu…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: 12 pages

  29. arXiv:2302.13888  [pdf, other]

    cs.GT cs.CC

    k-Prize Weighted Voting Games

    Authors: Wei-Chen Lee, David Hyland, Alessandro Abate, Edith Elkind, Jiarui Gan, Julian Gutierrez, Paul Harrenstein, Michael Wooldridge

    Abstract: We introduce a natural variant of weighted voting games, which we refer to as k-Prize Weighted Voting Games. Such games consist of n players with weights, and k prizes, of possibly differing values. The players form coalitions, and the i-th largest coalition (by the sum of weights of its members) wins the i-th largest prize, which is then shared among its members. We present four solution concepts…

    Submitted 2 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: Accepted to AAMAS 2023

  30. arXiv:2301.11683  [pdf, other]

    cs.LO cs.LG eess.SY

    Neural Abstractions

    Authors: Alessandro Abate, Alec Edwards, Mirco Giacobbe

    Abstract: We present a novel method for the safety verification of nonlinear dynamical models that uses neural networks to represent abstractions of their dynamics. Neural networks have extensively been used before as approximators; in this work, we make a step further and use them for the first time as abstractions. For a given dynamical model, our method synthesises a neural network that overapproximates…

    Submitted 27 January, 2023; originally announced January 2023.

    Comments: NeurIPS 2022

  31. Quantitative Verification with Neural Networks

    Authors: Alessandro Abate, Alec Edwards, Mirco Giacobbe, Hashan Punchihewa, Diptarko Roy

    Abstract: We present a data-driven approach to the quantitative verification of probabilistic programs and stochastic dynamical models. Our approach leverages neural networks to compute tight and sound bounds for the probability that a stochastic process hits a target condition within finite time. This problem subsumes a variety of quantitative verification questions, from the reachability and safety analys…

    Submitted 11 March, 2024; v1 submitted 15 January, 2023; originally announced January 2023.

    Comments: The conference version of this manuscript appeared at CONCUR 2023

    ACM Class: F.3.1; D.2.4

  32. Reasoning about Causality in Games

    Authors: Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, Michael Wooldridge

    Abstract: Causal reasoning and game-theoretic reasoning are fundamental topics in artificial intelligence, among many other disciplines: this paper is concerned with their intersection. Despite their importance, a formal framework that supports both these forms of reasoning has, until now, been lacking. We offer a solution in the form of (structural) causal games, which can be seen as extending Pearl's caus…

    Submitted 17 April, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

    Comments: Published in Artificial Intelligence (2023)

  33. Robust Control for Dynamical Systems With Non-Gaussian Noise via Formal Abstractions

    Authors: Thom Badings, Licio Romao, Alessandro Abate, David Parker, Hasan A. Poonawala, Marielle Stoelinga, Nils Jansen

    Abstract: Controllers for dynamical systems that operate in safety-critical settings must account for stochastic disturbances. Such disturbances are often modeled as process noise in a dynamical system, and common assumptions are that the underlying distributions are known and/or Gaussian. In practice, however, these assumptions may be unrealistic and can lead to poor approximations of the true noise distri…

    Submitted 4 January, 2023; originally announced January 2023.

    Comments: To appear in the Journal of Artificial Intelligence Research (JAIR). arXiv admin note: text overlap with arXiv:2110.12662

    Journal ref: Journal of Artificial Intelligence Research (JAIR) 76 (2023) 341-391

  34. Lexicographic Multi-Objective Reinforcement Learning

    Authors: Joar Skalse, Lewis Hammond, Charlie Griffin, Alessandro Abate

    Abstract: In this work we introduce reinforcement learning techniques for solving lexicographic multi-objective problems. These are problems that involve multiple reward signals, and where the goal is to learn a policy that maximises the first reward signal, and subject to this constraint also maximises the second reward signal, and so on. We present a family of both action-value and policy gradient algorit…

    Submitted 28 December, 2022; originally announced December 2022.

    Journal ref: IJCAI 2022; Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence. Main Track, Pages 3430-3436

  35. arXiv:2212.03201  [pdf, ps, other]

    cs.LG

    Misspecification in Inverse Reinforcement Learning

    Authors: Joar Skalse, Alessandro Abate

    Abstract: The aim of Inverse Reinforcement Learning (IRL) is to infer a reward function $R$ from a policy $π$. To do this, we need a model of how $π$ relates to $R$. In the current literature, the most common models are optimality, Boltzmann rationality, and causal entropy maximisation. One of the primary motivations behind IRL is to infer human preferences from human behaviour. However, the true relationsh…

    Submitted 24 March, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 2023

  36. arXiv:2212.00679  [pdf, other]

    eess.SY cs.AI cs.RO

    Formal Controller Synthesis for Markov Jump Linear Systems with Uncertain Dynamics

    Authors: Luke Rickard, Thom Badings, Licio Romao, Alessandro Abate

    Abstract: Automated synthesis of provably correct controllers for cyber-physical systems is crucial for deployment in safety-critical scenarios. However, hybrid features and stochastic or unknown behaviours make this problem challenging. We propose a method for synthesising controllers for Markov jump linear systems (MJLSs), a class of discrete-time models for cyber-physical systems, so that they certifiabl…

    Submitted 4 August, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: 15 pages, accepted to QEST

  37. arXiv:2210.05989  [pdf, other]

    eess.SY cs.AI cs.RO

    Probabilities Are Not Enough: Formal Controller Synthesis for Stochastic Dynamical Models with Epistemic Uncertainty

    Authors: Thom Badings, Licio Romao, Alessandro Abate, Nils Jansen

    Abstract: Capturing uncertainty in models of complex dynamical systems is crucial to designing safe controllers. Stochastic noise causes aleatoric uncertainty, whereas imprecise knowledge of model parameters leads to epistemic uncertainty. Several approaches use formal abstractions to synthesize policies that satisfy temporal specifications related to safety and reachability. However, the underlying models…

    Submitted 7 December, 2022; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Accepted at AAAI 2023

  38. arXiv:2209.15320  [pdf, other]

    cs.LG cs.AI cs.RO

    Bounded Robustness in Reinforcement Learning via Lexicographic Objectives

    Authors: Daniel Jarne Ornia, Licio Romao, Lewis Hammond, Manuel Mazo Jr., Alessandro Abate

    Abstract: Policy robustness in Reinforcement Learning may not be desirable at any cost: the alterations caused by robustness requirements from otherwise optimal policies should be explainable, quantifiable and formally verifiable. In this work we study how policies can be maximally robust to arbitrary observational noise by analysing how they are altered by this noise through a stochastic linear operator in…

    Submitted 11 December, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

  39. arXiv:2209.10341  [pdf, other]

    cs.LG cs.AI cs.LO

    LCRL: Certified Policy Synthesis via Logically-Constrained Reinforcement Learning

    Authors: Hosein Hasanbeig, Daniel Kroening, Alessandro Abate

    Abstract: LCRL is a software tool that implements model-free Reinforcement Learning (RL) algorithms over unknown Markov Decision Processes (MDPs), synthesising policies that satisfy a given linear temporal specification with maximal probability. LCRL leverages partially deterministic finite-state machines known as Limit Deterministic Büchi Automata (LDBA) to express a given linear temporal specification. A…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: Evaluated and Accepted by the 19th International Conference on Quantitative Evaluation of Systems 2022

  40. arXiv:2208.11838  [pdf, other]

    cs.LG cs.AI

    Learning Task Automata for Reinforcement Learning using Hidden Markov Models

    Authors: Alessandro Abate, Yousif Almulla, James Fox, David Hyland, Michael Wooldridge

    Abstract: Training reinforcement learning (RL) agents using scalar reward signals is often infeasible when an environment has sparse and non-Markovian rewards. Moreover, handcrafting these reward functions before training is prone to misspecification, especially when the environment's dynamics are only partially known. This paper proposes a novel pipeline for learning non-Markovian task specifications as su…

    Submitted 3 October, 2023; v1 submitted 24 August, 2022; originally announced August 2022.

    Comments: 14 pages, 7 figures, Accepted to the 26th European Conference on Artificial Intelligence (ECAI 2023)

  41. arXiv:2208.06385   

    cs.LG eess.SY

    Low Emission Building Control with Zero-Shot Reinforcement Learning

    Authors: Scott R. Jeen, Alessandro Abate, Jonathan M. Cullen

    Abstract: Heating and cooling systems in buildings account for 31% of global energy use, much of which are regulated by Rule Based Controllers (RBCs) that neither maximise energy efficiency nor minimise emissions by interacting optimally with the grid. Control via Reinforcement Learning (RL) has been shown to significantly improve building energy efficiency, but existing solutions require access to buildin…

    Submitted 15 August, 2022; v1 submitted 12 August, 2022; originally announced August 2022.

    Comments: This paper should have been submitted as a replacement for arXiv:2206.14191

  42. Low Emission Building Control with Zero-Shot Reinforcement Learning

    Authors: Scott R. Jeen, Alessandro Abate, Jonathan M. Cullen

    Abstract: Heating and cooling systems in buildings account for 31% of global energy use, much of which are regulated by Rule Based Controllers (RBCs) that neither maximise energy efficiency nor minimise emissions by interacting optimally with the grid. Control via Reinforcement Learning (RL) has been shown to significantly improve building energy efficiency, but existing solutions require access to building…

    Submitted 5 March, 2023; v1 submitted 28 June, 2022; originally announced June 2022.

    Comments: Accepted at AAAI 2023. Code available via https://enjeeneer.io/projects/pearl/

  43. arXiv:2203.07475  [pdf, other]

    cs.LG cs.AI stat.ML

    Invariance in Policy Optimisation and Partial Identifiability in Reward Learning

    Authors: Joar Skalse, Matthew Farrugia-Roberts, Stuart Russell, Alessandro Abate, Adam Gleave

    Abstract: It is often very challenging to manually design reward functions for complex, real-world tasks. To solve this, one can instead use reward learning to infer a reward function from data. However, there are often multiple reward functions that fit the data equally well, even in the infinite-data limit. This means that the reward function is only partially identifiable. In this work, we formally chara…

    Submitted 7 June, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: ICML 2023. 9 pages main paper, 26 pages total, 3 figures

    ACM Class: I.2.6

  44. arXiv:2110.12662  [pdf, other]

    eess.SY cs.AI cs.RO

    Sampling-Based Robust Control of Autonomous Systems with Non-Gaussian Noise

    Authors: Thom S. Badings, Alessandro Abate, Nils Jansen, David Parker, Hasan A. Poonawala, Marielle Stoelinga

    Abstract: Controllers for autonomous systems that operate in safety-critical settings must account for stochastic disturbances. Such disturbances are often modelled as process noise, and common assumptions are that the underlying distributions are known and/or Gaussian. In practice, however, these assumptions may be unrealistic and can lead to poor approximations of the true noise distribution. We present a…

    Submitted 13 December, 2021; v1 submitted 25 October, 2021; originally announced October 2021.

    Journal ref: AAAI 2022 (distinguished paper)

  45. arXiv:2105.10134  [pdf, other]

    cs.LG

    Certification of Iterative Predictions in Bayesian Neural Networks

    Authors: Matthew Wicker, Luca Laurenti, Andrea Patane, Nicola Paoletti, Alessandro Abate, Marta Kwiatkowska

    Abstract: We consider the problem of computing reach-avoid probabilities for iterative predictions made with Bayesian neural network (BNN) models. Specifically, we leverage bound propagation techniques and backward recursion to compute lower bounds for the probability that trajectories of the BNN model reach a given set of states while avoiding a set of unsafe states. We use the lower bounds in the context…

    Submitted 19 June, 2021; v1 submitted 21 May, 2021; originally announced May 2021.

    Comments: Accepted, UAI 2021. 17 pages

  46. arXiv:2104.14691  [pdf, other]

    math.PR cs.CE

    Grid-Free Computation of Probabilistic Safety with Malliavin Calculus

    Authors: Francesco Cosentino, Harald Oberhauser, Alessandro Abate

    Abstract: This work concerns continuous-time, continuous-space stochastic dynamical systems described by stochastic differential equations (SDE). It presents a new approach to compute probabilistic safety regions, namely sets of initial conditions of the SDE associated to trajectories that are safe with a probability larger than a given threshold. The approach introduces a functional that is minimised at th…

    Submitted 10 January, 2023; v1 submitted 29 April, 2021; originally announced April 2021.

  47. arXiv:2102.12855  [pdf, other]

    cs.LG cs.AI cs.FL cs.LO

    Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic

    Authors: Mingyu Cai, Mohammadhosein Hasanbeig, Shaoping Xiao, Alessandro Abate, Zhen Kan

    Abstract: This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDP) with unknown transition probabilities over continuous state and action spaces. Linear temporal logic (LTL) is used to specify high-level tasks over infinite horizon, which can be converted into a limit deterministic generalized Büchi automaton (LDGBA) with several accepting sets.…

    Submitted 23 January, 2022; v1 submitted 23 February, 2021; originally announced February 2021.

    Comments: arXiv admin note: text overlap with arXiv:2010.06797

    Journal ref: IEEE Robotics and Automation Letters, 2021

  48. arXiv:2102.05008  [pdf, other]

    cs.MA cs.AI cs.GT

    Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

    Authors: Lewis Hammond, James Fox, Tom Everitt, Alessandro Abate, Michael Wooldridge

    Abstract: Multi-agent influence diagrams (MAIDs) are a popular form of graphical model that, for certain classes of games, have been shown to offer key complexity and explainability advantages over traditional extensive form game (EFG) representations. In this paper, we extend previous work on MAIDs by introducing the concept of a MAID subgame, as well as subgame perfect and trembling hand perfect equilibri…

    Submitted 9 February, 2021; originally announced February 2021.

    Comments: Accepted to the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-21)

  49. arXiv:2102.00582  [pdf, other]

    cs.AI cs.LG cs.LO cs.MA

    Multi-Agent Reinforcement Learning with Temporal Logic Specifications

    Authors: Lewis Hammond, Alessandro Abate, Julian Gutierrez, Michael Wooldridge

    Abstract: In this paper, we study the problem of learning to satisfy temporal logic specifications with a group of agents in an unknown environment, which may exhibit probabilistic behaviour. From a learning perspective these specifications provide a rich formal language with which to capture tasks or objectives, while from a logic and automated verification perspective the introduction of learning capabili…

    Submitted 9 February, 2021; v1 submitted 31 January, 2021; originally announced February 2021.

    Comments: Accepted to the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-21)

  50. arXiv:2101.07491  [pdf, ps, other]

    cs.LO eess.SY

    Automated Verification and Synthesis of Stochastic Hybrid Systems: A Survey

    Authors: Abolfazl Lavaei, Sadegh Soudjani, Alessandro Abate, Majid Zamani

    Abstract: Stochastic hybrid systems have received significant attention as a relevant modelling framework describing many systems, from engineering to the life sciences: they enable the study of numerous applications, including transportation networks, biological systems and chemical reaction networks, smart energy and power grids, and beyond. Automated verification and policy synthesis for stochastic hybr…

    Submitted 10 March, 2022; v1 submitted 19 January, 2021; originally announced January 2021.