Multiagent Systems
See recent articles
- [1] arXiv:2407.12113 (cross-list from cs.LG) [pdf, html, other]
-
Title: A Graph-based Adversarial Imitation Learning Framework for Reliable & Realtime Fleet Scheduling in Urban Air MobilityComments: Accepted for presentation at the AIAA Aviation Forum 2024Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
The advent of Urban Air Mobility (UAM) presents the scope for a transformative shift in the domain of urban transportation. However, its widespread adoption and economic viability depends in part on the ability to optimally schedule the fleet of aircraft across vertiports in a UAM network, under uncertainties attributed to airspace congestion, changing weather conditions, and varying demands. This paper presents a comprehensive optimization formulation of the fleet scheduling problem, while also identifying the need for alternate solution approaches, since directly solving the resulting integer nonlinear programming problem is computationally prohibitive for daily fleet scheduling. Previous work has shown the effectiveness of using (graph) reinforcement learning (RL) approaches to train real-time executable policy models for fleet scheduling. However, such policies can often be brittle on out-of-distribution scenarios or edge cases. Moreover, training performance also deteriorates as the complexity (e.g., number of constraints) of the problem increases. To address these issues, this paper presents an imitation learning approach where the RL-based policy exploits expert demonstrations yielded by solving the exact optimization using a Genetic Algorithm. The policy model comprises Graph Neural Network (GNN) based encoders that embed the space of vertiports and aircraft, Transformer networks to encode demand, passenger fare, and transport cost profiles, and a Multi-head attention (MHA) based decoder. Expert demonstrations are used through the Generative Adversarial Imitation Learning (GAIL) algorithm. Interfaced with a UAM simulation environment involving 8 vertiports and 40 aircrafts, in terms of the daily profits earned reward, the new imitative approach achieves better mean performance and remarkable improvement in the case of unseen worst-case scenarios, compared to pure RL results.
- [2] arXiv:2407.12131 (cross-list from cs.CY) [pdf, other]
-
Title: Improving Health Information Access in the World's Largest Maternal Mobile Health Program via Bandit AlgorithmsArshika Lalan, Shresth Verma, Paula Rodriguez Diaz, Panayiotis Danassis, Amrita Mahale, Kumar Madhu Sudan, Aparna Hegde, Milind Tambe, Aparna TanejaComments: Published at Innovative Applications of Artificial Intelligence (IAAI 2024)Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Harnessing the wide-spread availability of cell phones, many nonprofits have launched mobile health (mHealth) programs to deliver information via voice or text to beneficiaries in underserved communities, with maternal and infant health being a key area of such mHealth programs. Unfortunately, dwindling listenership is a major challenge, requiring targeted interventions using limited resources. This paper focuses on Kilkari, the world's largest mHealth program for maternal and child care - with over 3 million active subscribers at a time - launched by India's Ministry of Health and Family Welfare (MoHFW) and run by the non-profit ARRMAN. We present a system called CHAHAK that aims to reduce automated dropouts as well as boost engagement with the program through the strategic allocation of interventions to beneficiaries. Past work in a similar domain has focused on a much smaller scale mHealth program and used markovian restless multiarmed bandits to optimize a single limited intervention resource. However this paper demonstrates the challenges in adopting a markovian approach in Kilkari; therefore CHAHAK instead relies on non-markovian time-series restless bandits, and optimizes multiple interventions to improve listenership. We use real Kilkari data from the Odisha state in India to show CHAHAK's effectiveness in harnessing multiple interventions to boost listenership, benefiting marginalized communities. When deployed CHAHAK will assist the largest maternal mHealth program to date.
- [3] arXiv:2407.12318 (cross-list from cs.GT) [pdf, html, other]
-
Title: Information Compression in Dynamic GamesComments: 54 pages, 3 figuresSubjects: Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA); Systems and Control (eess.SY); Optimization and Control (math.OC); Statistics Theory (math.ST)
One of the reasons why stochastic dynamic games with an underlying dynamic system are challenging is since strategic players have access to enormous amount of information which leads to the use of extremely complex strategies at equilibrium. One approach to resolve this challenge is to simplify players' strategies by identifying appropriate compression of information maps so that the players can make decisions solely based on the compressed version of information, called the information state. For finite dynamic games with asymmetric information, inspired by the notion of information state for single-agent control problems, we propose two notions of information states, namely mutually sufficient information (MSI) and unilaterally sufficient information (USI). Both these information states are obtained with information compression maps independent of the strategy profile. We show that Bayes-Nash Equilibria (BNE) and Sequential Equilibria (SE) exist when all players use MSI-based strategies. We prove that when all players employ USI-based strategies the resulting sets of BNE and SE payoff profiles are the same as the sets of BNE and SE payoff profiles resulting when all players use full information-based strategies. We prove that when all players use USI-based strategies the resulting set of weak Perfect Bayesian Equilibrium (wPBE) payoff profiles can be a proper subset of all wPBE payoff profiles. We identify MSI and USI in specific models of dynamic games in the literature. We end by presenting an open problem: Do there exist strategy-dependent information compression maps that guarantee the existence of at least one equilibrium or maintain all equilibria that exist under perfect recall? We show, by a counterexample, that a well-known strategy-dependent information compression map used in the literature does not possess any of the properties of MSI or USI.
Cross submissions for Thursday, 18 July 2024 (showing 3 of 3 entries )
- [4] arXiv:2306.04090 (replaced) [pdf, html, other]
-
Title: PlayBest: Professional Basketball Player Behavior Synthesis via Planning with DiffusionXiusi Chen, Wei-Yao Wang, Ziniu Hu, David Reynoso, Kun Jin, Mingyan Liu, P. Jeffrey Brantingham, Wei WangComments: CIKM 2024Subjects: Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
Dynamically planning in complex systems has been explored to improve decision-making in various domains. Professional basketball serves as a compelling example of a dynamic spatio-temporal game, encompassing context-dependent decision-making. However, processing the diverse on-court signals and navigating the vast space of potential actions and outcomes make it difficult for existing approaches to swiftly identify optimal strategies in response to evolving circumstances. In this study, we formulate the sequential decision-making process as a conditional trajectory generation process. Based on the formulation, we introduce PlayBest (PLAYer BEhavior SynThesis), a method to improve player decision-making. We extend the diffusion probabilistic model to learn challenging environmental dynamics from historical National Basketball Association (NBA) player motion tracking data. To incorporate data-driven strategies, an auxiliary value function is trained with corresponding rewards. To accomplish reward-guided trajectory generation, we condition the diffusion model on the value function via classifier-guided sampling. We validate the effectiveness of PlayBest through simulation studies, contrasting the generated trajectories with those employed by professional basketball teams. Our results reveal that the model excels at generating reasonable basketball trajectories that produce efficient plays. Moreover, the synthesized play strategies exhibit an alignment with professional tactics, highlighting the model's capacity to capture the intricate dynamics of basketball games.
- [5] arXiv:2308.05731 (replaced) [pdf, html, other]
-
Title: Rethinking the Integration of Prediction and Planning in Deep Learning-Based Automated Driving Systems: A ReviewSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Automated driving has the potential to revolutionize personal, public, and freight mobility. Beside accurately perceiving the environment, automated vehicles must plan a safe, comfortable, and efficient motion trajectory. To promote safety and progress, many works rely on modules that predict the future motion of surrounding traffic. Modular automated driving systems commonly handle prediction and planning as sequential, separate tasks. While this accounts for the influence of surrounding traffic on the ego vehicle, it fails to anticipate the reactions of traffic participants to the ego vehicle's behavior. Recent models increasingly integrate prediction and planning in a joint or interdependent step to model bi-directional interactions. To date, a comprehensive overview of different integration principles is lacking. We systematically review state-of-the-art deep learning-based prediction and planning, and focus on integrated prediction and planning models. Different facets of the integration ranging from model architecture and model design to behavioral aspects are considered and related to each other. Moreover, we discuss the implications, strengths, and limitations of different integration principles. By pointing out research gaps, describing relevant future challenges, and highlighting trends in the research field, we identify promising directions for future research.
- [6] arXiv:2310.19039 (replaced) [pdf, html, other]
-
Title: Machine Learning for the identification of phase-transitions in interacting agent-based systems: a Desai-Zwanzig exampleNikolaos Evangelou, Dimitrios G. Giovanis, George A. Kevrekidis, Grigorios A. Pavliotis, Ioannis G. KevrekidisComments: 13 pages, 10 FiguresSubjects: Dynamical Systems (math.DS); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Deriving closed-form, analytical expressions for reduced-order models, and judiciously choosing the closures leading to them, has long been the strategy of choice for studying phase- and noise-induced transitions for agent-based models (ABMs). In this paper, we propose a data-driven framework that pinpoints phase transitions for an ABM- the Desai-Zwanzig model in its mean-field limit, using a smaller number of variables than traditional closed-form models. To this end, we use the manifold learning algorithm Diffusion Maps to identify a parsimonious set of data-driven latent variables, and show that they are in one-to-one correspondence with the expected theoretical order parameter of the ABM. We then utilize a deep learning framework to obtain a conformal reparametrization of the data-driven coordinates that facilitates, in our example, the identification of a single parameter-dependent ODE in these coordinates. We identify this ODE through a residual neural network inspired by a numerical integration scheme (forward Euler). We then use the identified ODE - enabled through an odd symmetry transformation - to construct the bifurcation diagram exhibiting the phase transition.
- [7] arXiv:2312.13910 (replaced) [pdf, html, other]
-
Title: Multi-Agent Probabilistic Ensembles with Trajectory Sampling for Connected Autonomous VehiclesSubjects: Robotics (cs.RO); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Autonomous Vehicles (AVs) have attracted significant attention in recent years and Reinforcement Learning (RL) has shown remarkable performance in improving the autonomy of vehicles. In that regard, the widely adopted Model-Free RL (MFRL) promises to solve decision-making tasks in connected AVs (CAVs), contingent on the readiness of a significant amount of data samples for training. Nevertheless, it might be infeasible in practice and possibly lead to learning instability. In contrast, Model-Based RL (MBRL) manifests itself in sample-efficient learning, but the asymptotic performance of MBRL might lag behind the state-of-the-art MFRL algorithms. Furthermore, most studies for CAVs are limited to the decision-making of a single AV only, thus underscoring the performance due to the absence of communications. In this study, we try to address the decision-making problem of multiple CAVs with limited communications and propose a decentralized Multi-Agent Probabilistic Ensembles with Trajectory Sampling algorithm MA-PETS. In particular, in order to better capture the uncertainty of the unknown environment, MA-PETS leverages Probabilistic Ensemble (PE) neural networks to learn from communicated samples among neighboring CAVs. Afterwards, MA-PETS capably develops Trajectory Sampling (TS)-based model-predictive control for decision-making. On this basis, we derive the multi-agent group regret bound affected by the number of agents within the communication range and mathematically validate that incorporating effective information exchange among agents into the multi-agent learning scheme contributes to reducing the group regret bound in the worst case. Finally, we empirically demonstrate the superiority of MA-PETS in terms of the sample efficiency comparable to MFBL.
- [8] arXiv:2406.14922 (replaced) [pdf, html, other]
-
Title: Social learning with complex contagionComments: 19 pages, 5 figuresSubjects: Physics and Society (physics.soc-ph); Multiagent Systems (cs.MA); Neural and Evolutionary Computing (cs.NE); Adaptation and Self-Organizing Systems (nlin.AO); Populations and Evolution (q-bio.PE)
We introduce a mathematical model that combines the concepts of complex contagion with payoff-biased imitation, to describe how social behaviors spread through a population. Traditional models of social learning by imitation are based on simple contagion -- where an individual may imitate a more successful neighbor following a single interaction. Our framework generalizes this process to incorporate complex contagion, which requires multiple exposures before an individual considers adopting a different behavior. We formulate this as a discrete time and state stochastic process in a finite population, and we derive its continuum limit as an ordinary differential equation that generalizes the replicator equation, the most widely used dynamical model in evolutionary game theory. When applied to linear frequency-dependent games, our social learning with complex contagion produces qualitatively different outcomes than traditional imitation dynamics: it can shift the Prisoner's Dilemma from a unique all-defector equilibrium to either a stable mixture of cooperators and defectors in the population, or a bistable system; it changes the Snowdrift game from a single to a bistable equilibrium; and it can alter the Coordination game from bistability at the boundaries to two internal equilibria. The long-term outcome depends on the balance between the complexity of the contagion process and the strength of selection that biases imitation towards more successful types. Our analysis intercalates the fields of evolutionary game theory with complex contagions, and it provides a synthetic framework that describes more realistic forms of behavioral change in social systems.
- [9] arXiv:2407.06886 (replaced) [pdf, html, other]
-
Title: Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AIYang Liu, Weixing Chen, Yongjie Bai, Jingzhou Luo, Xinshuai Song, Kaixuan Jiang, Zhida Li, Ganlong Zhao, Junyi Lin, Guanbin Li, Wen Gao, Liang LinComments: The first comprehensive review of Embodied AI in the era of MLMs, 39 pages. We also provide the paper list for Embodied AI: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA); Robotics (cs.RO)
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General Intelligence (AGI) and serves as a foundation for various applications that bridge cyberspace and the physical world. Recently, the emergence of Multi-modal Large Models (MLMs) and World Models (WMs) have attracted significant attention due to their remarkable perception, interaction, and reasoning capabilities, making them a promising architecture for the brain of embodied agents. However, there is no comprehensive survey for Embodied AI in the era of MLMs. In this survey, we give a comprehensive exploration of the latest advancements in Embodied AI. Our analysis firstly navigates through the forefront of representative works of embodied robots and simulators, to fully understand the research focuses and their limitations. Then, we analyze four main research targets: 1) embodied perception, 2) embodied interaction, 3) embodied agent, and 4) sim-to-real adaptation, covering the state-of-the-art methods, essential paradigms, and comprehensive datasets. Additionally, we explore the complexities of MLMs in virtual and real embodied agents, highlighting their significance in facilitating interactions in dynamic digital and physical environments. Finally, we summarize the challenges and limitations of embodied AI and discuss their potential future directions. We hope this survey will serve as a foundational reference for the research community and inspire continued innovation. The associated project can be found at this https URL.