Search | arXiv e-print repository

arXiv:2407.19277 [pdf, other]

Predicting the Progression of Cancerous Tumors in Mice: A Machine and Deep Learning Intuition

Authors: Amit K Chattopadhyay, Aimee Pascaline N Unkundiye, Gillian Pearce, Steven Russell

Abstract: The study explores Artificial Intelligence (AI) powered modeling to predict the evolution of cancer tumor cells in mice under different forms of treatment. The AI models are analyzed against varying ambient and systemic parameters, e.g. drug dosage, volume of the cancer cell mass, and time taken to destroy the cancer cell mass. The data required for the analysis have been synthetically extracted from plots available in both published and unpublished literature (primarily using a Matlab architecture called "Grabit"), that are then statistically standardized around the same baseline for comparison. Three forms of treatment are considered - saline (multiple concentrations used), magnetic nanoparticles (mNPs) and fluorodeoxyglycose iron oxide magnetic nanoparticles (mNP-FDGs) - analyzed using three Machine Learning (ML) algorithms, Decision Tree (DT), Random Forest (RF), Multilinear Regression (MLR), and a Deep Learning (DL) module, the Adaptive Neural Network (ANN). The AI models are trained on 60-80% data, the rest used for validation. Assessed over all three forms of treatment, ANN consistently outperforms other predictive models. Our models predict mNP-FDG as the most potent treatment regime that kills the cancerous tumor completely in ca 13 days from the start of treatment. The models can be generalized to other forms of cancer treatment regimens.

Submitted 31 July, 2024; v1 submitted 27 July, 2024; originally announced July 2024.

Comments: 7 figures, 24 pages

Journal ref: Annals of Biostatistics and Biometric Applications 2024

arXiv:2407.03459 [pdf, other]

Quantum decoherence by magnetic fluctuations in a candidate axion insulator

Authors: Ruben Saatjian, Kohtaro Yamakawa, Ryan S. Russell, James G. Analytis, John W. Harter

Abstract: In magnetic topological insulators, spontaneous time-reversal symmetry breaking by intrinsic magnetic order can open an energy gap in the topological surface spectrum. In the resulting state, exotic properties like axion electrodynamics, the quantum anomalous Hall effect, and other topological magnetoelectric responses are expected to emerge. A detailed understanding of the magnetic order and its coupling to the topological surface states is essential to harness and tune these properties. Here, we leverage near-resonant electric quadrupole optical second harmonic generation to probe magnetic fluctuations in the candidate axion insulator EuSn$_2$(As,P)$_2$ across its antiferromagnetic phase boundary. We observe a pronounced dimensional crossover in the quantum decoherence induced by magnetic fluctuations, where two-dimensional in-plane ferromagnetic correlations at high temperatures give way to three-dimensional long-range order at the Néel temperature. We also observe the breaking of rotational symmetry within the long-range-ordered antiferromagnetic state and map out the resulting spatial domain structure. More generally, we demonstrate the unique capabilities of nonlinear optical spectroscopy to study quantum coherence and fluctuations in magnetic quantum materials.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2406.19501 [pdf, other]

Monitoring Latent World States in Language Models with Propositional Probes

Authors: Jiahai Feng, Stuart Russell, Jacob Steinhardt

Abstract: Language models are susceptible to bias, sycophancy, backdoors, and other tendencies that lead to unfaithful responses to the input context. Interpreting internal states of language models could help monitor and correct unfaithful behavior. We hypothesize that language models represent their input contexts in a latent world model, and seek to extract this latent world state from the activations. We do so with 'propositional probes', which compositionally probe tokens for lexical information and bind them into logical propositions representing the world state. For example, given the input context ''Greg is a nurse. Laura is a physicist.'', we decode the propositions ''WorksAs(Greg, nurse)'' and ''WorksAs(Laura, physicist)'' from the model's activations. Key to this is identifying a 'binding subspace' in which bound tokens have high similarity (''Greg'' and ''nurse'') but unbound ones do not (''Greg'' and ''physicist''). We validate propositional probes in a closed-world setting with finitely many predicates and properties. Despite being trained on simple templated contexts, propositional probes generalize to contexts rewritten as short stories and translated to Spanish. Moreover, we find that in three settings where language models respond unfaithfully to the input context -- prompt injections, backdoor attacks, and gender bias -- the decoded propositions remain faithful. This suggests that language models often encode a faithful world model but decode it unfaithfully, which motivates the search for better interpretability tools for monitoring LMs.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.10026 [pdf]

Retiming dynamics of harmonically modelocked laser solitons in a self-driven optomechanical lattice

Authors: Xiaocong Wang, Benhai Wang, Wenbin He, Xintong Zhang, Qi Huang, Zhiyuan Huang, Xin Jiang, Philip St. J. Russell, Meng Pang

Abstract: Harmonic mode-locking, realized actively or passively, is an effective technique for increasing the repetition rate of lasers, with important applications in optical sampling, laser micro-machining and frequency metrology. It is critically important to understand how a harmonically mode-locked pulse train responds to external perturbations and noise, so as to make sure that it is stable and resistant to noise. Here, in a series of carefully designed experiments, we elucidate the retiming dynamics of laser pulses generated in a soliton fiber laser harmonically mode-locked at ~2 GHz to the acoustic resonance in a photonic crystal fiber (PCF) core. We characterize the self-driven optomechanical lattice along the PCF using a homodyne set-up, and reveal that each soliton undergoes damped oscillatory retiming within its trapping potential after an abrupt perturbation. In addition we show, through statistical analysis of the intra-cavity pulse spacing, how the trapping potentials are effective for suppressing timing jitter. The experimental results are well described using a dynamic model including dissipation, which provides valuable insight into the stability and noise performance of optomechanically mode-locked laser systems, and may also be useful for studying complex inter-soliton interactions.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.00877 [pdf, other]

Evidence of Learned Look-Ahead in a Chess-Playing Neural Network

Authors: Erik Jenner, Shreyas Kapur, Vasil Georgiev, Cameron Allen, Scott Emmons, Stuart Russell

Abstract: Do neural networks learn to implement algorithms such as look-ahead or search "in the wild"? Or do they rely purely on collections of simple heuristics? We present evidence of learned look-ahead in the policy network of Leela Chess Zero, the currently strongest neural chess engine. We find that Leela internally represents future optimal moves and that these representations are crucial for its final output in certain board states. Concretely, we exploit the fact that Leela is a transformer that treats every chessboard square like a token in language models, and give three lines of evidence (1) activations on certain squares of future moves are unusually important causally; (2) we find attention heads that move important information "forward and backward in time," e.g., from squares of future moves to squares of earlier ones; and (3) we train a simple probe that can predict the optimal move 2 turns ahead with 92% accuracy (in board states where Leela finds a single best line). These findings are an existence proof of learned look-ahead in neural networks and might be a step towards a better understanding of their capabilities.

Submitted 2 June, 2024; originally announced June 2024.

Comments: Project page: https://leela-interp.github.io/

arXiv:2405.20519 [pdf, other]

Diffusion On Syntax Trees For Program Synthesis

Authors: Shreyas Kapur, Erik Jenner, Stuart Russell

Abstract: Large language models generate code one token at a time. Their autoregressive generation process lacks the feedback of observing the program's output. Training LLMs to suggest edits directly can be challenging due to the scarcity of rich edit data. To address these problems, we propose neural diffusion models that operate on syntax trees of any context-free grammar. Similar to image diffusion models, our method also inverts ``noise'' applied to syntax trees. Rather than generating code sequentially, we iteratively edit it while preserving syntactic validity, which makes it easy to combine this neural model with search. We apply our approach to inverse graphics tasks, where our model learns to convert images into programs that produce those images. Combined with search, our model is able to write graphics programs, see the execution result, and debug them to meet the required specifications. We additionally show how our system can write graphics programs for hand-drawn sketches.

Submitted 30 May, 2024; originally announced May 2024.

Comments: https://tree-diffusion.github.io

arXiv:2405.17713 [pdf, other]

AI Alignment with Changing and Influenceable Reward Functions

Authors: Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan

Abstract: Existing AI alignment approaches assume that preferences are static, which is unrealistic: our preferences change, and may even be influenced by our interactions with AI systems themselves. To clarify the consequences of incorrectly assuming static preferences, we introduce Dynamic Reward Markov Decision Processes (DR-MDPs), which explicitly model preference changes and the AI's influence on them. We show that despite its convenience, the static-preference assumption may undermine the soundness of existing alignment techniques, leading them to implicitly reward AI systems for influencing user preferences in ways users may not truly want. We then explore potential solutions. First, we offer a unifying perspective on how an agent's optimization horizon may partially help reduce undesirable AI influence. Then, we formalize different notions of AI alignment that account for preference change from the outset. Comparing the strengths and limitations of 8 such notions of alignment, we find that they all either err towards causing undesirable AI influence, or are overly risk-averse, suggesting that a straightforward solution to the problems of changing preferences may not exist. As there is no avoiding grappling with changing preferences in real-world settings, this makes it all the more important to handle these issues with care, balancing risks and capabilities. We hope our work can provide conceptual clarity and constitute a first step towards AI alignment practices which explicitly account for (and contend with) the changing and influenceable nature of human preferences.

Submitted 27 May, 2024; originally announced May 2024.

Comments: Accepted to ICML 2024

arXiv:2405.06624 [pdf, other]

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Authors: David "davidad" Dalrymple, Joar Skalse, Yoshua Bengio, Stuart Russell, Max Tegmark, Sanjit Seshia, Steve Omohundro, Christian Szegedy, Ben Goldhaber, Nora Ammann, Alessandro Abate, Joe Halpern, Clark Barrett, Ding Zhao, Tan Zhi-Xuan, Jeannette Wing, Joshua Tenenbaum

Abstract: Ensuring that AI systems reliably and robustly avoid harmful or dangerous behaviours is a crucial challenge, especially for AI systems with a high degree of autonomy and general intelligence, or systems used in safety-critical contexts. In this paper, we will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI. The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees. This is achieved by the interplay of three core components: a world model (which provides a mathematical description of how the AI system affects the outside world), a safety specification (which is a mathematical description of what effects are acceptable), and a verifier (which provides an auditable proof certificate that the AI satisfies the safety specification relative to the world model). We outline a number of approaches for creating each of these three core components, describe the main technical challenges, and suggest a number of potential solutions to them. We also argue for the necessity of this approach to AI safety, and for the inadequacy of the main alternative approaches.

Submitted 8 July, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.04669 [pdf, other]

Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics

Authors: Hanlin Zhu, Baihe Huang, Shaolun Zhang, Michael Jordan, Jiantao Jiao, Yuandong Tian, Stuart Russell

Abstract: Auto-regressive large language models (LLMs) show impressive capacities to solve many complex reasoning tasks while struggling with some simple logical reasoning tasks such as inverse search: when trained on ''A is B'', LLM fails to directly conclude ''B is A'' during inference, which is known as the ''reversal curse'' (Berglund et al., 2023). In this paper, we theoretically analyze the reversal curse via the training dynamics of (stochastic) gradient descent for two auto-regressive models: (1) a bilinear model that can be viewed as a simplification of a one-layer transformer; (2) one-layer transformers using the framework of Tian et al. (2023a). Our analysis reveals a core reason why the reversal curse happens: the (effective) weights of both auto-regressive models show asymmetry, i.e., the increase of weights from a token $A$ to token $B$ during training does not necessarily cause the increase of the weights from $B$ to $A$. Moreover, our analysis can be naturally applied to other logical reasoning tasks such as chain-of-thought (COT) (Wei et al., 2022b). We show the necessity of COT, i.e., a model trained on ''$A \to B$'' and ''$B \to C$'' fails to directly conclude ''$A \to C$'' without COT (also empirically observed by Allen-Zhu and Li (2023)), for one-layer transformers via training dynamics, which provides a new perspective different from previous work (Feng et al., 2024) that focuses on expressivity. Finally, we also conduct experiments to validate our theory on multi-layer transformers under different settings.

Submitted 7 May, 2024; originally announced May 2024.

Comments: 40 pages, 15 figures

arXiv:2404.16182 [pdf]

Optomagnetic forces on YIG/YFeO3 microspheres levitated in chiral hollow-core photonic crystal fibre

Authors: Soumya Chakraborty, Gordon K. L. Wong, Ferdi Oda, Vanessa Wachter, Silvia Viola Kusminskiy, Tadahiro Yokosawa, Sabine Hübner, Benjamin Apeleo Zubiri, Erdmann Spiecker, Monica Distaso, Philip St. J. Russell, Nicolas Y. Joly

Abstract: We explore a magnetooptomechanical system consisting of a single magnetic microparticle optically levitated within the core of a helically twisted single-ring hollow-core photonic crystal fibre. We use newly-developed magnetic particles that have a core of antiferromagnetic yttrium-ortho-ferrite (YFeO3) and a shell of ferrimagnetic YIG (Y3Fe5O12) approximately 50 nm thick. Using a 632.8 nm probe beam, we observe optical-torque-induced rotation of the particle and rotation of the magnetization vector in presence of an external static magnetic field. This one-of-a-kind platform opens a path to novel investigations of optomagnetic physics with levitated magnetic particles.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.12536 [pdf]

Asteroid (101955) Bennu in the Laboratory: Properties of the Sample Collected by OSIRIS-REx

Authors: Dante S. Lauretta, Harold C. Connolly, Jr., Joseph E. Aebersold, Conel M. O. D. Alexander, Ronald-L. Ballouz, Jessica J. Barnes, Helena C. Bates, Carina A. Bennett, Laurinne Blanche, Erika H. Blumenfeld, Simon J. Clemett, George D. Cody, Daniella N. DellaGiustina, Jason P. Dworkin, Scott A. Eckley, Dionysis I. Foustoukos, Ian A. Franchi, Daniel P. Glavin, Richard C. Greenwood, Pierre Haenecour, Victoria E. Hamilton, Dolores H. Hill, Takahiro Hiroi, Kana Ishimaru, Fred Jourdan, et al. (28 additional authors not shown)

Abstract: On 24 September 2023, the NASA OSIRIS-REx mission dropped a capsule to Earth containing approximately 120 g of pristine carbonaceous regolith from Bennu. We describe the delivery and initial allocation of this asteroid sample and introduce its bulk physical, chemical, and mineralogical properties from early analyses. The regolith is very dark overall, with higher-reflectance inclusions and particles interspersed. Particle sizes range from sub-micron dust to a stone about 3.5 cm long. Millimeter-scale and larger stones typically have hummocky or angular morphologies. A subset of the stones appears mottled by brighter material that occurs as veins and crusts. Hummocky stones have the lowest densities and mottled stones have the highest. Remote sensing of the surface of Bennu detected hydrated phyllosilicates, magnetite, organic compounds, carbonates, and scarce anhydrous silicates, all of which the sample confirms. We also find sulfides, presolar grains, and, less expectedly, Na-rich phosphates, as well as other trace phases. The sample composition and mineralogy indicate substantial aqueous alteration and resemble those of Ryugu and the most chemically primitive, low-petrologic-type carbonaceous chondrites. Nevertheless, we find distinct hydrogen, nitrogen, and oxygen isotopic compositions, and some of the material we analyzed is enriched in fluid-mobile elements. Our findings underscore the value of sample return, especially for low-density material that may not readily survive atmospheric entry, and lay the groundwork for more comprehensive analyses.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 73 pages, 22 figures

arXiv:2404.10271 [pdf, other]

Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback

Authors: Vincent Conitzer, Rachel Freedman, Jobst Heitzig, Wesley H. Holliday, Bob M. Jacobs, Nathan Lambert, Milan Mossé, Eric Pacuit, Stuart Russell, Hailey Schoelkopf, Emanuel Tewolde, William S. Zwicker

Abstract: Foundation models such as GPT-4 are fine-tuned to avoid unsafe or otherwise problematic behavior, such as helping to commit crimes or producing racist text. One approach to fine-tuning, called reinforcement learning from human feedback, learns from humans' expressed preferences over multiple outputs. Another approach is constitutional AI, in which the input from humans is a list of high-level principles. But how do we deal with potentially diverging input from humans? How can we aggregate the input into consistent data about "collective" preferences or otherwise use it to make collective choices about model behavior? In this paper, we argue that the field of social choice is well positioned to address these questions, and we discuss ways forward for this agenda, drawing on discussions in a recent workshop on Social Choice for AI Ethics and Safety held in Berkeley, CA, USA in December 2023.

Submitted 4 June, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

Comments: 15 pages, 4 figures

MSC Class: 68T01; 68T50; 91B14; 91B12

ACM Class: I.2.0; I.2.7; K.4.2; I.2.m; J.4

arXiv:2403.19107 [pdf]

Synthetic Medical Imaging Generation with Generative Adversarial Networks For Plain Radiographs

Authors: John R. McNulty, Lee Kho, Alexandria L. Case, Charlie Fornaca, Drew Johnston, David Slater, Joshua M. Abzug, Sybil A. Russell

Abstract: In medical imaging, access to data is commonly limited due to patient privacy restrictions and the issue that it can be difficult to acquire enough data in the case of rare diseases.[1] The purpose of this investigation was to develop a reusable open-source synthetic image generation pipeline, the GAN Image Synthesis Tool (GIST), that is easy to use as well as easy to deploy. The pipeline helps to improve and standardize AI algorithms in the digital health space by generating high quality synthetic image data that is not linked to specific patients. Its image generation capabilities include the ability to generate imaging of pathologies or injuries with low incidence rates. This improvement of digital health AI algorithms could improve diagnostic accuracy, aid in patient care, decrease medicolegal claims, and ultimately decrease the overall cost of healthcare. The pipeline builds on existing Generative Adversarial Networks (GANs) algorithms, and preprocessing and evaluation steps were included for completeness. For this work, we focused on ensuring the pipeline supports radiography, with a focus on synthetic knee and elbow x-ray images. In designing the pipeline, we evaluated the performance of current GAN architectures, studying the performance on available x-ray data. We show that the pipeline is capable of generating high quality and clinically relevant images based on a lay person's evaluation and the Fréchet Inception Distance (FID) metric.

Submitted 27 March, 2024; originally announced March 2024.

Report number: Public Release Case Number 22-3965

arXiv:2403.08392 [pdf]

Nonwoven Reinforced Photocurable Poly(glycerol sebacate)-Based Hydrogels

Authors: Michael Phillips, Giuseppe Tronci, Christopher M. Pask, Stephen J. Russell

Abstract: Implantable hydrogels should ideally possess mechanical properties matched to the surrounding tissues to enable adequate mechanical function while regeneration occurs. This can be challenging, especially when degradable systems with high water content and hydrolysable chemical bonds are required in anatomical sites under constant mechanical stimulation, e.g. a foot ulcer cavity. In these circumstances, the design of hydrogel composites is a promising strategy to provide controlled structural features and macroscopic properties over time. To explore this strategy, the synthesis of a new photocurable elastomeric polymer, poly(glycerol-co-sebacic acid-co-lactic acid-co-polyethylene glycol) acrylate (PGSLPA), is investigated, along with its processing into UV-cured hydrogels, electrospun nonwovens and fibre-reinforced variants, without the need for a high temperature curing step or use of hazardous solvents. The mechanical properties of bioresorbable PGSLPA hydrogels were studied with and without electrospun nonwoven reinforcement and with varied layered configurations, aiming to determine the effects of microstructure on bulk compressive strength and elasticity. The nonwoven reinforced PGSLPA hydrogels exhibited a 60 % increase in compressive strength and an 80 % increase in elastic moduli compared to fibre-free PGSLPA samples. Mechanical properties of the fibre-reinforced hydrogels could also be modulated by altering the layering arrangement of the nonwoven and hydrogel phase. The nanofibre reinforced PGSLPA hydrogels also exhibited good elastic recovery, as evidenced by hysteresis in compression fatigue stress-strain evaluations showing a return to original dimensions.

Submitted 17 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

Comments: 26 pages, 12 figures, 3 tables. Accepted in Polymers

arXiv:2403.06003 [pdf, other]

A Generalized Acquisition Function for Preference-based Reward Learning

Authors: Evan Ellis, Gaurav R. Ghosal, Stuart J. Russell, Anca Dragan, Erdem Bıyık

Abstract: Preference-based reward learning is a popular technique for teaching robots and autonomous systems how a human user wants them to perform a task. Previous works have shown that actively synthesizing preference queries to maximize information gain about the reward function parameters improves data efficiency. The information gain criterion focuses on precisely identifying all parameters of the reward function. This can potentially be wasteful as many parameters may result in the same reward, and many rewards may result in the same behavior in the downstream tasks. Instead, we show that it is possible to optimize for learning the reward function up to a behavioral equivalence class, such as inducing the same ranking over behaviors, distribution over choices, or other related definitions of what makes two rewards similar. We introduce a tractable framework that can capture such definitions of similarity. Our experiments in a synthetic environment, an assistive robotics environment with domain transfer, and a natural language processing problem with real datasets demonstrate the superior performance of our querying method over the state-of-the-art information gain method.

Submitted 9 March, 2024; originally announced March 2024.

arXiv:2402.17747 [pdf, other]

When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback

Authors: Leon Lang, Davis Foote, Stuart Russell, Anca Dragan, Erik Jenner, Scott Emmons

Abstract: Past analyses of reinforcement learning from human feedback (RLHF) assume that the human evaluators fully observe the environment. What happens when human feedback is based only on partial observations? We formally define two failure cases: deceptive inflation and overjustification. Modeling the human as Boltzmann-rational w.r.t. a belief over trajectories, we prove conditions under which RLHF is guaranteed to result in policies that deceptively inflate their performance, overjustify their behavior to make an impression, or both. Under the new assumption that the human's partial observability is known and accounted for, we then analyze how much information the feedback process provides about the return function. We show that sometimes, the human's feedback determines the return function uniquely up to an additive constant, but in other realistic cases, there is irreducible ambiguity. We propose exploratory research directions to help tackle these challenges and caution against blindly applying RLHF in partially observable settings.

Submitted 8 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

arXiv:2402.08062 [pdf, ps, other]

Avoiding Catastrophe in Continuous Spaces by Asking for Help

Authors: Benjamin Plaut, Hanlin Zhu, Stuart Russell

Abstract: Most reinforcement learning algorithms with formal regret guarantees assume all mistakes are reversible and essentially rely on trying all possible behaviors. This approach leads to poor outcomes when some mistakes are irreparable or even catastrophic. We propose a variant of the contextual bandit problem where the goal is to minimize the chance of catastrophe. Specifically, we assume that the payoff each round represents the chance of avoiding catastrophe that round, and try to maximize the product of payoffs (the overall chance of avoiding catastrophe). We allow a limited number of queries to a mentor and assume a Lipschitz continuous payoff function. We first show that in general, any algorithm either constantly queries the mentor or is nearly guaranteed to cause catastrophe. However, when the mentor policy class has bounded Natarajan dimension and contains at least some "reasonable" policies, we provide an algorithm whose regret and rate of querying the mentor both approach 0 as the time horizon grows. We also present an alternative algorithm which provides the same regret and query guarantees when the mentor's action changes a constant number of times in a 1D state space, and can handle adversarially chosen states.

Submitted 26 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

arXiv:2401.09155 [pdf]

Frequency conversion of vortex states by chiral forward Brillouin scattering in twisted photonic crystal fibre

Authors: Xinglin Zeng, Philip St. J. Russell, Birgit Stiller

Abstract: Optical vortex states (higher optical modes with helical phase progression and carrying orbital angular momentum) have been explored to increase the flexibility and capacity of optical fibres employed, for example, in mode-division multiplexing, optical trapping and multimode imaging. A common requirement in such systems is high-fidelity transfer of signals between different frequency bands and modes, which for vortex modes is not so straightforward. Here we report intervortex conversion between backward-propagating circularly polarised vortex modes at one wavelength, using chiral flexural phonons excited by chiral forward stimulated Brillouin scattering at a different wavelength. The experiment is carried out using chiral photonic crystal fibre, which robustly preserves circular polarisation states. The chiral acoustic wave, which has the geometry of a spinning single-spiral corkscrew, provides the orbital angular momentum necessary to conserve angular momentum between the coupled optical vortex modes. The results open up new opportunities for interband optical frequency conversion and the manipulation of vortex states in both classical and quantum regimes.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2312.12747 [pdf, other]

ALMANACS: A Simulatability Benchmark for Language Model Explainability

Authors: Edmund Mills, Shiye Su, Stuart Russell, Scott Emmons

Abstract: How do we measure the efficacy of language model explainability methods? While many explainability methods have been developed, they are typically evaluated on bespoke tasks, preventing an apples-to-apples comparison. To help fill this gap, we present ALMANACS, a language model explainability benchmark. ALMANACS scores explainability methods on simulatability, i.e., how well the explanations improve behavior prediction on new inputs. The ALMANACS scenarios span twelve safety-relevant topics such as ethical reasoning and advanced AI behaviors; they have idiosyncratic premises to invoke model-specific behavior; and they have a train-test distributional shift to encourage faithful explanations. By using another language model to predict behavior based on the explanations, ALMANACS is a fully automated benchmark. We use ALMANACS to evaluate counterfactuals, rationalizations, attention, and Integrated Gradients explanations. Our results are sobering: when averaged across all topics, no explanation method outperforms the explanation-free control. We conclude that despite modest successes in prior work, developing an explanation method that aids simulatability in ALMANACS remains an open challenge.

Submitted 19 December, 2023; originally announced December 2023.

Comments: Code is available at https://github.com/edmundmills/ALMANACS

arXiv:2312.08369 [pdf, other]

The Effective Horizon Explains Deep RL Performance in Stochastic Environments

Authors: Cassidy Laidlaw, Banghua Zhu, Stuart Russell, Anca Dragan

Abstract: Reinforcement learning (RL) theory has largely focused on proving minimax sample complexity bounds. These require strategic exploration algorithms that use relatively limited function classes for representing the policy or value function. Our goal is to explain why deep RL algorithms often perform well in practice, despite using random exploration and much more expressive function classes like neural networks. Our work arrives at an explanation by showing that many stochastic MDPs can be solved by performing only a few steps of value iteration on the random policy's Q function and then acting greedily. When this is true, we find that it is possible to separate the exploration and learning components of RL, making it much easier to analyze. We introduce a new RL algorithm, SQIRL, that iteratively learns a near-optimal policy by exploring randomly to collect rollouts and then performing a limited number of steps of fitted-Q iteration over those rollouts. Any regression algorithm that satisfies basic in-distribution generalization properties can be used in SQIRL to efficiently solve common MDPs. This can explain why deep RL works, since it is empirically established that neural networks generalize well in-distribution. Furthermore, SQIRL explains why random exploration works well in practice. We leverage SQIRL to derive instance-dependent sample complexity bounds for RL that are exponential only in an "effective horizon" of lookahead and on the complexity of the class used for function approximation. Empirically, we also find that SQIRL performance strongly correlates with PPO and DQN performance in a variety of stochastic environments, supporting that our theoretical analysis is predictive of practical performance. Our code and data are available at https://github.com/cassidylaidlaw/effective-horizon.

Submitted 12 April, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

Journal ref: ICLR 2024 (Spotlight)

arXiv:2311.15862 [pdf, other]

Evidence for a kilometre-scale seismically slow layer atop the core-mantle boundary from normal modes

Authors: Stuart Russell, Jessica C. E. Irving, Lisanne Jagt, Sanne Cottaar

Abstract: Geodynamic modelling and seismic studies have highlighted the possibility that a thin layer of low seismic velocities, potentially molten, may sit atop the core-mantle boundary but has thus far eluded detection. In this study we employ normal modes, an independent data type to body waves, to assess the visibility of a seismically slow layer atop the core-mantle boundary to normal mode centre frequencies. Using forward modelling and a dataset of 353 normal mode observations we find that some centre frequencies are sensitive to one-dimensional kilometre-scale structure at the core-mantle boundary. Furthermore, a global slow and dense layer 1 - 3 km thick is better-fitting than no layer. The well-fitting parameter space is broad with a wide range of possible seismic parameters, which precludes inferring a possible composition or phase. Our methodology cannot uniquely detect a layer in the Earth but one should be considered possible and accounted for in future studies.

Submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.01011 [pdf, other]

Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game

Authors: Sam Toyer, Olivia Watkins, Ethan Adrian Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, Stuart Russell

Abstract: While Large Language Models (LLMs) are increasingly being used in real-world applications, they remain vulnerable to prompt injection attacks: malicious third party prompts that subvert the intent of the system designer. To help researchers study this problem, we present a dataset of over 126,000 prompt injection attacks and 46,000 prompt-based "defenses" against prompt injection, all created by players of an online game called Tensor Trust. To the best of our knowledge, this is currently the largest dataset of human-generated adversarial examples for instruction-following LLMs. The attacks in our dataset have a lot of easily interpretable structure, and shed light on the weaknesses of LLMs. We also use the dataset to create a benchmark for resistance to two types of prompt injection, which we refer to as prompt extraction and prompt hijacking. Our benchmark results show that many models are vulnerable to the attack strategies in the Tensor Trust dataset. Furthermore, we show that some attack strategies from the dataset generalize to deployed LLM-based applications, even though they have a very different set of constraints to the game. We release all data and source code at https://tensortrust.ai/paper

Submitted 2 November, 2023; originally announced November 2023.

arXiv:2310.17688 [pdf, other]

doi 10.1126/science.adn0117

Managing extreme AI risks amid rapid progress

Authors: Yoshua Bengio, Geoffrey Hinton, Andrew Yao, Dawn Song, Pieter Abbeel, Trevor Darrell, Yuval Noah Harari, Ya-Qin Zhang, Lan Xue, Shai Shalev-Shwartz, Gillian Hadfield, Jeff Clune, Tegan Maharaj, Frank Hutter, Atılım Güneş Baydin, Sheila McIlraith, Qiqi Gao, Ashwin Acharya, David Krueger, Anca Dragan, Philip Torr, Stuart Russell, Daniel Kahneman, Jan Brauner, Sören Mindermann

Abstract: Artificial Intelligence (AI) is progressing rapidly, and companies are shifting their focus to developing generalist AI systems that can autonomously act and pursue goals. Increases in capabilities and autonomy may soon massively amplify AI's impact, with risks that include large-scale social harms, malicious uses, and an irreversible loss of human control over autonomous AI systems. Although researchers have warned of extreme risks from AI, there is a lack of consensus about how exactly such risks arise, and how to manage them. Society's response, despite promising first steps, is incommensurate with the possibility of rapid, transformative progress that is expected by many experts. AI safety research is lagging. Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness, and barely address autonomous systems. In this short consensus paper, we describe extreme risks from upcoming, advanced AI systems. Drawing on lessons learned from other safety-critical technologies, we then outline a comprehensive plan combining technical research and development with proactive, adaptive governance mechanisms for a more commensurate preparation.

Submitted 22 May, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

Comments: Published in Science: https://www.science.org/doi/10.1126/science.adn0117

arXiv:2310.15288 [pdf, other]

Active teacher selection for reinforcement learning from human feedback

Authors: Rachel Freedman, Justin Svegliato, Kyle Wray, Stuart Russell

Abstract: Reinforcement learning from human feedback (RLHF) enables machine learning systems to learn objectives from human feedback. A core limitation of these systems is their assumption that all feedback comes from a single human teacher, despite querying a range of distinct teachers. We propose the Hidden Utility Bandit (HUB) framework to model differences in teacher rationality, expertise, and costliness, formalizing the problem of learning from multiple teachers. We develop a variety of solution algorithms and apply them to two real-world domains: paper recommendation systems and COVID-19 vaccine testing. We find that the Active Teacher Selection (ATS) algorithm outperforms baseline algorithms by actively selecting when and which teacher to query. The HUB framework and ATS algorithm demonstrate the importance of leveraging differences between teachers to learn accurate reward models, facilitating future research on active teacher selection for robust reward modeling.

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2310.01706 [pdf, other]

On Representation Complexity of Model-based and Model-free Reinforcement Learning

Authors: Hanlin Zhu, Baihe Huang, Stuart Russell

Abstract: We study the representation complexity of model-based and model-free reinforcement learning (RL) in the context of circuit complexity. We prove theoretically that there exists a broad class of MDPs such that their underlying transition and reward functions can be represented by constant depth circuits with polynomial size, while the optimal $Q$-function suffers an exponential circuit complexity in constant-depth circuits. By drawing attention to the approximation errors and building connections to complexity theory, our theory provides unique insights into why model-based algorithms usually enjoy better sample complexity than model-free algorithms from a novel representation complexity perspective: in some cases, the ground-truth rule (model) of the environment is simple to represent, while other quantities, such as $Q$-function, appear complex. We empirically corroborate our theory by comparing the approximation error of the transition kernel, reward function, and optimal $Q$-function in various Mujoco environments, which demonstrates that the approximation errors of the transition kernel and reward function are consistently lower than those of the optimal $Q$-function. To the best of our knowledge, this work is the first to study the circuit complexity of RL, which also provides a rigorous framework for future research.

Submitted 10 March, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: 23 pages, 9 figures, to be published in ICLR 2024

arXiv:2309.00236 [pdf, other]

Image Hijacks: Adversarial Images can Control Generative Models at Runtime

Authors: Luke Bailey, Euan Ong, Stuart Russell, Scott Emmons

Abstract: Are foundation models secure against malicious actors? In this work, we focus on the image input to a vision-language model (VLM). We discover image hijacks, adversarial images that control the behaviour of VLMs at inference time, and introduce the general Behaviour Matching algorithm for training image hijacks. From this, we derive the Prompt Matching method, allowing us to train hijacks matching the behaviour of an arbitrary user-defined text prompt (e.g. 'the Eiffel Tower is now located in Rome') using a generic, off-the-shelf dataset unrelated to our choice of prompt. We use Behaviour Matching to craft hijacks for four types of attack, forcing VLMs to generate outputs of the adversary's choice, leak information from their context window, override their safety training, and believe false statements. We study these attacks against LLaVA, a state-of-the-art VLM based on CLIP and LLaMA-2, and find that all attack types achieve a success rate of over 80%. Moreover, our attacks are automated and require only small image perturbations.

Submitted 22 April, 2024; v1 submitted 31 August, 2023; originally announced September 2023.

Comments: Project page at https://image-hijacks.github.io

arXiv:2307.14745 [pdf, other]

Using Multi-Agent MicroServices (MAMS) for Agent Based Modelling

Authors: Martynas Jagutis, Sean Russell, Rem Collier

Abstract: This paper demonstrates the use of the Multi-Agent MicroServices (MAMS) architectural style through a case study based around the development of a prototype traffic simulation in which agents model a population of individuals who travel from home to work and vice versa by car.

Submitted 27 July, 2023; originally announced July 2023.

Comments: 4 page demo paper accepted at EMAS. Paper has been extended from this version and submitted for publication in the formal proceedings

arXiv:2306.09309 [pdf, other]

Who Needs to Know? Minimal Knowledge for Optimal Coordination

Authors: Niklas Lauffer, Ameesh Shah, Micah Carroll, Michael Dennis, Stuart Russell

Abstract: To optimally coordinate with others in cooperative games, it is often crucial to have information about one's collaborators: successful driving requires understanding which side of the road to drive on. However, not every feature of collaborators is strategically relevant: the fine-grained acceleration of drivers may be ignored while maintaining optimal coordination. We show that there is a well-defined dichotomy between strategically relevant and irrelevant information. Moreover, we show that, in dynamic games, this dichotomy has a compact representation that can be efficiently computed via a Bellman backup operator. We apply this algorithm to analyze the strategically relevant information for tasks in both a standard and a partially observable version of the Overcooked environment. Theoretical and empirical results show that our algorithms are significantly more efficient than baselines. Videos are available at https://minknowledge.github.io.

Submitted 13 July, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: To be published at ICML 2023

ACM Class: I.2.6; I.2.11

arXiv:2306.06924 [pdf, other]

TASRA: a Taxonomy and Analysis of Societal-Scale Risks from AI

Authors: Andrew Critch, Stuart Russell

Abstract: While several recent works have identified societal-scale and extinction-level risks to humanity arising from artificial intelligence, few have attempted an exhaustive taxonomy of such risks. Many exhaustive taxonomies are possible, and some are useful, particularly if they reveal new risks or practical approaches to safety. This paper explores a taxonomy based on accountability: whose actions lead to the risk, are the actors unified, and are they deliberate? We also provide stories to illustrate how the various risk types could each play out, including risks arising from unanticipated interactions of many AI systems, as well as risks from deliberate misuse, for which combined technical and policy solutions are indicated.

Submitted 14 June, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

MSC Class: 68T01; ACM Class: I.2.0

arXiv:2305.11220 [pdf, other]

Protecting quantum modes in optical fibres

Authors: M. A. T. Butt, P. Roth, G. K. L. Wong, M. H. Frosz, L. L. Sanchez-Soto, E. A. Anashkina, A. V. Andrianov, P. Banzer, P. S. J. Russell, G. Leuchs

Abstract: Polarization-preserving fibers maintain the two polarization states of an orthogonal basis. Quantum communication, however, requires sending at least two nonorthogonal states and these cannot both be preserved. We present a new scheme that allows for using polarization encoding in a fiber not only in the discrete, but also in the continuous-variable regime. For the example of a helically twisted photonic-crystal fibre, we experimentally demonstrate that using appropriate nonorthogonal modes, the polarization-preserving fiber does not fully scramble these modes over the full Poincaré sphere, but that the output polarization will stay on a great circle; that is, within a one-dimensional protected subspace, which can be parametrized by a single variable. This will allow for more efficient measurements of quantum excitations in nonorthogonal modes.

Submitted 18 May, 2023; originally announced May 2023.

Comments: 7 pages, 4 figures, accepted in Phys. Rev. Applied

arXiv:2305.05687 [pdf, other]

doi 10.3847/1538-4357/accc89

Coronal Heating as Determined by the Solar Flare Frequency Distribution Obtained by Aggregating Case Studies

Authors: James Paul Mason, Alexandra Werth, Colin G. West, Allison A. Youngblood, Donald L. Woodraska, Courtney Peck, Kevin Lacjak, Florian G. Frick, Moutamen Gabir, Reema A. Alsinan, Thomas Jacobsen, Mohammad Alrubaie, Kayla M. Chizmar, Benjamin P. Lau, Lizbeth Montoya Dominguez, David Price, Dylan R. Butler, Connor J. Biron, Nikita Feoktistov, Kai Dewey, N. E. Loomis, Michal Bodzianowski, Connor Kuybus, Henry Dietrick, Aubrey M. Wolfe, et al. (977 additional authors not shown)

Abstract: Flare frequency distributions represent a key approach to addressing one of the largest problems in solar and stellar physics: determining the mechanism that counter-intuitively heats coronae to temperatures that are orders of magnitude hotter than the corresponding photospheres. It is widely accepted that the magnetic field is responsible for the heating, but there are two competing mechanisms that could explain it: nanoflares or Alfvén waves. To date, neither can be directly observed. Nanoflares are, by definition, extremely small, but their aggregate energy release could represent a substantial heating mechanism, presuming they are sufficiently abundant. One way to test this presumption is via the flare frequency distribution, which describes how often flares of various energies occur. If the slope of the power law fitting the flare frequency distribution is above a critical threshold, $α=2$ as established in prior literature, then there should be a sufficient abundance of nanoflares to explain coronal heating. We performed $>$600 case studies of solar flares, made possible by an unprecedented number of data analysts via three semesters of an undergraduate physics laboratory course. This allowed us to include two crucial, but nontrivial, analysis methods: pre-flare baseline subtraction and computation of the flare energy, which requires determining flare start and stop times. We aggregated the results of these analyses into a statistical study to determine that $α= 1.63 \pm 0.03$. This is below the critical threshold, suggesting that Alfvén waves are an important driver of coronal heating.

Submitted 9 May, 2023; originally announced May 2023.

Comments: 1,002 authors, 14 pages, 4 figures, 3 tables, published by The Astrophysical Journal on 2023-05-09, volume 948, page 71

arXiv:2304.09853 [pdf, other]

Bridging RL Theory and Practice with the Effective Horizon

Authors: Cassidy Laidlaw, Stuart Russell, Anca Dragan

Abstract: Deep reinforcement learning (RL) works impressively in some environments and fails catastrophically in others. Ideally, RL theory should be able to provide an understanding of why this is, i.e. bounds predictive of practical performance. Unfortunately, current theory does not quite have this ability. We compare standard deep RL algorithms to prior sample complexity bounds by introducing a new dataset, BRIDGE. It consists of 155 deterministic MDPs from common deep RL benchmarks, along with their corresponding tabular representations, which enables us to exactly compute instance-dependent bounds. We choose to focus on deterministic environments because they share many interesting properties of stochastic environments, but are easier to analyze. Using BRIDGE, we find that prior bounds do not correlate well with when deep RL succeeds vs. fails, but discover a surprising property that does. When actions with the highest Q-values under the random policy also have the highest Q-values under the optimal policy (i.e. when it is optimal to be greedy on the random policy's Q function), deep RL tends to succeed; when they don't, deep RL tends to fail. We generalize this property into a new complexity measure of an MDP that we call the effective horizon, which roughly corresponds to how many steps of lookahead search would be needed in that MDP in order to identify the next optimal action, when leaf nodes are evaluated with random rollouts. Using BRIDGE, we show that the effective horizon-based bounds are more closely reflective of the empirical performance of PPO and DQN than prior sample complexity bounds across four metrics. We also find that, unlike existing bounds, the effective horizon can predict the effects of using reward shaping or a pre-trained exploration policy. Our code and data are available at https://github.com/cassidylaidlaw/effective-horizon

Submitted 11 January, 2024; v1 submitted 19 April, 2023; originally announced April 2023.

Journal ref: NeurIPS 2023 (Oral)

arXiv:2303.00894 [pdf, other]

Active Reward Learning from Multiple Teachers

Authors: Peter Barnett, Rachel Freedman, Justin Svegliato, Stuart Russell

Abstract: Reward learning algorithms utilize human feedback to infer a reward function, which is then used to train an AI system. This human feedback is often a preference comparison, in which the human teacher compares several samples of AI behavior and chooses which they believe best accomplishes the objective. While reward learning typically assumes that all feedback comes from a single teacher, in practice these systems often query multiple teachers to gather sufficient training data. In this paper, we investigate this disparity, and find that algorithmic evaluation of these different sources of feedback facilitates more accurate and efficient reward learning. We formally analyze the value of information (VOI) when reward learning from teachers with varying levels of rationality, and define and evaluate an algorithm that utilizes this VOI to actively select teachers to query for feedback. Surprisingly, we find that it is often more informative to query comparatively irrational teachers. By formalizing this problem and deriving an analytical solution, we hope to facilitate improvement in reward learning approaches to aligning AI behavior with human values.

Submitted 1 March, 2023; originally announced March 2023.

arXiv:2302.12564 [pdf]

Valleytronics in bulk MoS$_2$ by optical control of parity and time symmetries

Authors: Igor Tyulnev, Álvaro Jiménez-Galán, Julita Poborska, Lenard Vamos, Rui F. Silva, Philip St. J. Russell, Francesco Tani, Olga Smirnova, Misha Ivanov, Jens Biegert

Abstract: The valley degree of freedom of electrons in materials promises routes toward energy-efficient information storage with enticing prospects towards quantum information processing. Current challenges in utilizing valley polarization are symmetry conditions that require monolayer structures or specific material engineering, non-resonant optical control to avoid energy dissipation, and the ability to switch valley polarization at optical speed. We demonstrate all-optical and non-resonant control over valley polarization using bulk MoS$_2$, a centrosymmetric material with zero Berry curvature at the valleys. Our universal method utilizes spin-angular momentum-shaped tri-foil optical control pulses to switch the material's electronic topology to induce valley polarization by transiently breaking time and space inversion symmetry through a simple phase rotation. The dependence of the generation of the second harmonic of an optical probe pulse on the phase rotation directly demonstrates the efficacy of valley polarization. It shows that direct optical control over the valley degree of freedom is not limited to monolayer structures. Instead, it is possible for systems with an arbitrary number of layers and bulk materials. Universal and non-resonant valley control at optical speeds unlocks the possibility of engineering efficient, multi-material valleytronic devices operating on quantum coherent timescales.

Submitted 24 February, 2023; originally announced February 2023.

Comments: 4 figures

arXiv:2301.04694 [pdf]

Science Priorities for the Extraction of the Solid MSR Samples from their Sample Tubes

Authors: N. Dauphas, S. S. Russell, D. Beaty, F. Thiessen, J. Barnes, L. Bonal, J. Bridges, T. Bristow, J. Eiler, L. Ferriere, T. Fornaro, J. Gattacceca, B. Hoffman, E. J. Javaux, T. Kleine, H. Y. McSween, M. Prasad, L. Rampe, M. Schmidt, B. Schoene, K. L. Siebach, J. Stern, N. Tosca

Abstract: Preservation of the chemical and structural integrity of samples that will be brought back from Mars is paramount to achieving the scientific objectives of MSR. Given our knowledge of the nature of the samples retrieved at Jezero by Perseverance, at least two options need to be tested for opening the sample tubes: (1) One or two radial cuts at the end of the tube to slide the sample out. (2) Two radial cuts at the ends of the tube and two longitudinal cuts to lift the upper half of the tube and access the sample. Strategy 1 will likely minimize contamination but incurs the risk of affecting the physical integrity of weakly consolidated samples. Strategy 2 will be optimal for preserving the physical integrity of the samples but increases the risk of contamination and mishandling of the sample as more manipulations and additional equipment will be needed. A flexible approach to opening the sample tubes is therefore required, and several options need to be available, depending on the nature of the rock samples returned. Both opening strategies 1 and 2 may need to be available when the samples are returned to handle different sample types (e.g., loosely bound sediments vs. indurated magmatic rocks). This question should be revisited after engineering tests are performed on analogue samples. The MSR sample tubes will have to be opened under stringent BSL4 conditions and this aspect needs to be integrated into the planning.

Submitted 11 January, 2023; originally announced January 2023.

Comments: 8 pages, 3 figures, 1 table, report NASA-ESA Mars Rock Team Report

arXiv:2211.11972 [pdf, other]

imitation: Clean Imitation Learning Implementations

Authors: Adam Gleave, Mohammad Taufeeque, Juan Rocamonde, Erik Jenner, Steven H. Wang, Sam Toyer, Maximilian Ernestus, Nora Belrose, Scott Emmons, Stuart Russell

Abstract: imitation provides open-source implementations of imitation and reward learning algorithms in PyTorch. We include three inverse reinforcement learning (IRL) algorithms, three imitation learning algorithms and a preference comparison algorithm. The implementations have been benchmarked against previous results, and automated tests cover 98% of the code. Moreover, the algorithms are implemented in a modular fashion, making it simple to develop novel algorithms in the framework. Our source code, including documentation and examples, is available at https://github.com/HumanCompatibleAI/imitation

Submitted 21 November, 2022; originally announced November 2022.

arXiv:2211.00716 [pdf, ps, other]

Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian

Authors: Paria Rashidinejad, Hanlin Zhu, Kunhe Yang, Stuart Russell, Jiantao Jiao

Abstract: Offline reinforcement learning (RL), which refers to decision-making from a previously-collected dataset of interactions, has received significant attention over the past years. Much effort has focused on improving offline RL practicality by addressing the prevalent issue of partial data coverage through various forms of conservative policy learning. While the majority of algorithms do not have finite-sample guarantees, several provable conservative offline RL algorithms are designed and analyzed within the single-policy concentrability framework that handles partial coverage. Yet, in the nonlinear function approximation setting where confidence intervals are difficult to obtain, existing provable algorithms suffer from computational intractability, prohibitively strong assumptions, and suboptimal statistical rates. In this paper, we leverage the marginalized importance sampling (MIS) formulation of RL and present the first set of offline RL algorithms that are statistically optimal and practical under general function approximation and single-policy concentrability, bypassing the need for uncertainty quantification. We identify that the key to successfully solving the sample-based approximation of the MIS problem is ensuring that certain occupancy validity constraints are nearly satisfied. We enforce these constraints by a novel application of the augmented Lagrangian method and prove the following result: with the MIS formulation, augmented Lagrangian is enough for statistically optimal offline RL.
In stark contrast to prior algorithms that induce additional conservatism through methods such as behavior regularization, our approach provably eliminates this need and reinterprets regularizers as "enforcers of occupancy validity" rather than "promoters of conservatism."

Submitted 1 November, 2022; originally announced November 2022.

Comments: 49 pages, 1 figure

arXiv:2211.00241 [pdf, other]

Adversarial Policies Beat Superhuman Go AIs

Authors: Tony T. Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D. Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell

Abstract: We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies against it, achieving a >97% win rate against KataGo running at superhuman settings. Our adversaries do not win by playing Go well. Instead, they trick KataGo into making serious blunders. Our attack transfers zero-shot to other superhuman Go-playing AIs, and is comprehensible to the extent that human experts can implement it without algorithmic assistance to consistently beat superhuman AIs. The core vulnerability uncovered by our attack persists even in KataGo agents adversarially trained to defend against our attack. Our results demonstrate that even superhuman AI systems may harbor surprising failure modes. Example games are available at https://goattack.far.ai/.

Submitted 13 July, 2023; v1 submitted 31 October, 2022; originally announced November 2022.

Comments: Accepted to ICML 2023, see paper for changelog

ACM Class: I.2.6

arXiv:2208.07976 [pdf]

doi 10.3847/2041-8213/ac83bd

Presolar stardust in asteroid Ryugu

Authors: Jens Barosch, Larry R. Nittler, Jianhua Wang, Conel M. O'D. Alexander, Bradley T. De Gregorio, Cécile Engrand, Yoko Kebukawa, Kazuhide Nagashima, Rhonda M. Stroud, Hikaru Yabuta, Yoshinari Abe, Jérôme Aléon, Sachiko Amari, Yuri Amelin, Ken-ichi Bajo, Laure Bejach, Martin Bizzarro, Lydie Bonal, Audrey Bouvier, Richard W. Carlson, Marc Chaussidon, Byeon-Gak Choi, George D. Cody, Emmanuel Dartois, Nicolas Dauphas, et al. (99 additional authors not shown)

Abstract: We have conducted a NanoSIMS-based search for presolar material in samples recently returned from C-type asteroid Ryugu as part of JAXA's Hayabusa2 mission. We report the detection of all major presolar grain types with O- and C-anomalous isotopic compositions typically identified in carbonaceous chondrite meteorites: 1 silicate, 1 oxide, 1 O-anomalous supernova grain of ambiguous phase, 38 SiC, and 16 carbonaceous grains. At least two of the carbonaceous grains are presolar graphites, whereas several grains with moderate C isotopic anomalies are probably organics. The presolar silicate was located in a clast with a less altered lithology than the typical extensively aqueously altered Ryugu matrix. The matrix-normalized presolar grain abundances in Ryugu are 4.8$^{+4.7}_{-2.6}$ ppm for O-anomalous grains, 25$^{+6}_{-5}$ ppm for SiC grains and 11$^{+5}_{-3}$ ppm for carbonaceous grains. Ryugu is isotopically and petrologically similar to carbonaceous Ivuna-type (CI) chondrites. To compare the in situ presolar grain abundances of Ryugu with CI chondrites, we also mapped Ivuna and Orgueil samples and found a total of SiC grains and 6 carbonaceous grains. No O-anomalous grains were detected. The matrix-normalized presolar grain abundances in the CI chondrites are similar to those in Ryugu: 23$^{+7}_{-6}$ ppm SiC and 9.0$^{+5.3}_{-4.6}$ ppm carbonaceous grains. Thus, our results provide further evidence in support of the Ryugu-CI connection.
They also reveal intriguing hints of small-scale heterogeneities in the Ryugu samples, such as locally distinct degrees of alteration that allowed the preservation of delicate presolar material.

Submitted 16 August, 2022; originally announced August 2022.

Comments: 12 pages, 3 figures, 2 tables. Published in ApJL

Journal ref: 2022, The Astrophysical Journal Letters, 935, L3 (12pp)

arXiv:2208.07006 [pdf, ps, other]

Cooperative and uncooperative institution designs: Surprises and problems in open-source game theory

Authors: Andrew Critch, Michael Dennis, Stuart Russell

Abstract: It is increasingly possible for real-world agents, such as software-based agents or human institutions, to view the internal programming of other such agents that they interact with. For instance, a company can read the bylaws of another company, or one software system can read the source code of another. Game-theoretic equilibria between the designers of such agents are called \emph{program equilibria}, and we call this area \emph{open-source game theory}. In this work we demonstrate a series of counterintuitive results on open-source games, which are independent of the programming language in which agents are written. We show that certain formal institution designs that one might expect to defect against each other will instead turn out to cooperate, or conversely, cooperate when one might expect them to defect. The results hold in a setting where each institution has full visibility into the other institution's true operating procedures. We also exhibit examples and ten open problems for better understanding these phenomena. We argue that contemporary game theory remains ill-equipped to study program equilibria, given that even the outcomes of single games in open-source settings remain counterintuitive and poorly understood. Nonetheless, some of these open-source agents exhibit desirable characteristics -- e.g., they can unexploitably create incentives for cooperation and legibility from other agents -- such that analyzing them could yield considerable benefits.

Submitted 15 August, 2022; originally announced August 2022.

Comments: 41 pages

MSC Class: 93A14; 93A16; 91-08; 91A11; 91A35; 91A68; 91A44; 91B06; 91B41; 91B52

ACM Class: F.3.1; F.4.1; I.2.3; J.4

arXiv:2207.11538 [pdf]

doi 10.1103/PhysRevApplied.18.064069

Temporal self-compression and self-frequency shift of sub-microjoule pulses at 8 MHz repetition rate

Authors: Francesco Tani, Jacob Lampen, Martin Butryn, Michael H. Frosz, Jie Jiang, Martin Fermann, Philip St. J. Russell

Abstract: We combine soliton dynamics in gas-filled hollow-core photonic crystal fibers with a state-of-the-art fiber laser to realize a turn-key system producing few-fs pulses at 8 MHz repetition rate at pump energies as low as 220 nJ. Furthermore, by exploiting the soliton self-frequency shift in a second hydrogen-filled hollow-core fiber, we efficiently generate pulses as short as 22 fs, continuously tunable from 1100 nm to 1474 nm.

Submitted 23 July, 2022; originally announced July 2022.

arXiv:2207.03470 [pdf, other]

For Learning in Symmetric Teams, Local Optima are Global Nash Equilibria

Authors: Scott Emmons, Caspar Oesterheld, Andrew Critch, Vincent Conitzer, Stuart Russell

Abstract: Although it has been known since the 1970s that a globally optimal strategy profile in a common-payoff game is a Nash equilibrium, global optimality is a strict requirement that limits the result's applicability. In this work, we show that any locally optimal symmetric strategy profile is also a (global) Nash equilibrium. Furthermore, we show that this result is robust to perturbations to the common payoff and to the local optimum. Applied to machine learning, our result provides a global guarantee for any gradient method that finds a local optimum in symmetric strategy space. While this result indicates stability to unilateral deviation, we nevertheless identify broad classes of games where mixed local optima are unstable under joint, asymmetric deviations. We analyze the prevalence of instability by running learning algorithms in a suite of symmetric games, and we conclude by discussing the applicability of our results to multi-agent RL, cooperative inverse RL, and decentralized POMDPs.

Submitted 7 July, 2022; originally announced July 2022.

arXiv:2205.15705 [pdf]

High-quality 8-fold self-compression of ultrashort near-UV pulses in Ar-filled ultrathin-walled photonic crystal fiber

Authors: Jie Luan, Philip St. J. Russell, David Novoa

Abstract: We demonstrate generation of 7.6 fs near-UV pulses centered at 400 nm via 8-fold soliton-effect self-compression in an Ar-filled hollow-core kagomé-style photonic crystal fiber with ultrathin core walls. Analytical calculations of the effective compression length and soliton order permit adjustment of the experimental parameters, and numerical modelling of the nonlinear pulse dynamics in the fiber accurately predicts the spectro-temporal profiles of the self-compressed pulses. After compensation of phase distortion introduced by the optical elements along the beam path from the fiber to the diagnostics, 71% of the pulse energy was in the main temporal lobe, with peak powers in excess of 0.2 GW. The convenient set-up opens up new opportunities for time-resolved studies in spectroscopy, chemistry and materials science.

Submitted 31 May, 2022; originally announced May 2022.

Comments: 7 pages, 5 figures

arXiv:2205.08229 [pdf, other]

doi 10.1093/gji/ggac315

A Re-examination of Ellipticity Corrections for Seismic Phases

Authors: Stuart Russell, John F. Rudge, Jessica C. E. Irving, Sanne Cottaar

Abstract: The Earth's ellipticity of figure has an effect on the travel times of seismic waves over teleseismic distances. Tables of ellipticity corrections and coefficients have been used by seismologists for several decades; however, due to the increasing variety and complexity of seismic phases in use, current tables of ellipticity coefficients are now outmoded and incomplete. We present a Python package, EllipticiPy, for the calculation of ellipticity corrections that removes the dependence on pre-calculated coefficients at discrete source depths and epicentral distances. EllipticiPy also facilitates the calculation of ellipticity corrections on other planetary bodies. When applied to both Earth and Mars, the magnitudes of ellipticity corrections are on the order of single seconds and are significant for some seismic studies on Earth but remain negligible on Mars due to other greater sources of uncertainty.

Submitted 11 August, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

Comments: Main paper of 11 pages, 4 figures and 1 table plus a supplement of 12 pages and 1 table

arXiv:2205.07886 [pdf, other]

An Empirical Investigation of Representation Learning for Imitation

Authors: Xin Chen, Sam Toyer, Cody Wild, Scott Emmons, Ian Fischer, Kuang-Huei Lee, Neel Alex, Steven H Wang, Ping Luo, Stuart Russell, Pieter Abbeel, Rohin Shah

Abstract: Imitation learning often needs a large demonstration set in order to handle the full range of situations that an agent might find itself in during deployment. However, collecting expert demonstrations can be expensive. Recent work in vision, reinforcement learning, and NLP has shown that auxiliary representation learning objectives can reduce the need for large amounts of expensive, task-specific data. Our Empirical Investigation of Representation Learning for Imitation (EIRLI) investigates whether similar benefits apply to imitation learning. We propose a modular framework for constructing representation learning algorithms, then use our framework to evaluate the utility of representation learning for imitation across several environment suites. In the settings we evaluate, we find that existing algorithms for image-based representation learning provide limited value relative to a well-tuned baseline with image augmentations. To explain this result, we investigate differences between imitation learning and other settings where representation learning has provided significant benefit, such as image classification. Finally, we release a well-documented codebase which both replicates our findings and provides a modular framework for creating new representation learning algorithms out of reusable components.

Submitted 16 May, 2022; originally announced May 2022.

Comments: Accepted to NeurIPS2021 Datasets and Benchmarks Track

arXiv:2204.11971 [pdf]

Optical vortex Brillouin laser

Authors: Xinglin Zeng, Philip St. J. Russell, Yang Chen, Zheqi Wang, Gordon K. L. Wong, Paul Roth, Michael H. Frosz, Birgit Stiller

Abstract: Optical vortices, which have been extensively studied over the last decades, offer an additional degree of freedom useful in many applications, such as optical tweezers and quantum control. Stimulated Brillouin scattering, providing a narrow linewidth and a strong nonlinear response, has been used to realise quasi-continuous wave (CW) lasers. Here, we report stable oscillation of optical vortices and acoustic modes in a Brillouin laser based on chiral photonic crystal fibre, which robustly supports helical Bloch modes (HBMs) that carry circularly polarized optical vortices and display circular birefringence. We implement a narrow-linewidth Brillouin fibre laser that stably emits 1st- and 2nd-order vortex-carrying HBMs. Angular momentum conservation selection rules dictate that pump and backward Brillouin signals have opposite topological charge and spin. Additionally, we show that when the chiral PCF is placed within a laser ring cavity, the linewidth-narrowing associated with lasing permits the peak of the Brillouin gain that corresponds to the acoustic mode to be measured with resolution of 10 kHz and accuracy of 520 kHz. The results pave the way to a new generation of vortex-carrying SBS systems with applications in quantum information processing and vortex-carrying nonreciprocal systems.

Submitted 25 April, 2022; originally announced April 2022.

arXiv:2204.11966 [pdf, other]

Estimating and Penalizing Induced Preference Shifts in Recommender Systems

Authors: Micah Carroll, Anca Dragan, Stuart Russell, Dylan Hadfield-Menell

Abstract: The content that a recommender system (RS) shows to users influences them. Therefore, when choosing a recommender to deploy, one is implicitly also choosing to induce specific internal states in users. Even more, systems trained via long-horizon optimization will have direct incentives to manipulate users: in this work, we focus on the incentive to shift user preferences so they are easier to satisfy. We argue that - before deployment - system designers should: estimate the shifts a recommender would induce; evaluate whether such shifts would be undesirable; and perhaps even actively optimize to avoid problematic shifts. These steps involve two challenging ingredients: estimation requires anticipating how hypothetical algorithms would influence user preferences if deployed - we do this by using historical user interaction data to train a predictive user model which implicitly contains their preference dynamics; evaluation and optimization additionally require metrics to assess whether such influences are manipulative or otherwise unwanted - we use the notion of "safe shifts", that define a trust region within which behavior is safe: for instance, the natural way in which users would shift without interference from the system could be deemed "safe". In simulated experiments, we show that our learned preference dynamics model is effective in estimating user preferences and how they would respond to new recommenders.
Additionally, we show that recommenders that optimize for staying in the trust region can avoid manipulative behaviors while still generating engagement.

Submitted 14 July, 2022; v1 submitted 25 April, 2022; originally announced April 2022.

Comments: Accepted to ICML 2022 (Spotlight)

Journal ref: Proceedings of the 39th International Conference on Machine Learning, PMLR 162:2686-2708, 2022

arXiv:2203.12053 [pdf, other]

Upmixing via style transfer: a variational autoencoder for disentangling spatial images and musical content

Authors: Haici Yang, Sanna Wager, Spencer Russell, Mike Luo, Minje Kim, Wontak Kim

Abstract: In the stereo-to-multichannel upmixing problem for music, one of the main tasks is to set the directionality of the instrument sources in the multichannel rendering results. In this paper, we propose a modified variational autoencoder model that learns a latent space to describe the spatial images in multichannel music. We seek to disentangle the spatial images and music content, so the learned latent variables are invariant to the music. At test time, we use the latent variables to control the panning of sources. We propose two upmixing use cases: transferring the spatial images from one song to another and blind panning based on the generative model. We report objective and subjective evaluation results to empirically show that our model captures spatial images separately from music content and achieves transfer-based interactive panning.

Submitted 22 March, 2022; originally announced March 2022.

arXiv:2203.07475 [pdf, other]

Invariance in Policy Optimisation and Partial Identifiability in Reward Learning

Authors: Joar Skalse, Matthew Farrugia-Roberts, Stuart Russell, Alessandro Abate, Adam Gleave

Abstract: It is often very challenging to manually design reward functions for complex, real-world tasks. To solve this, one can instead use reward learning to infer a reward function from data. However, there are often multiple reward functions that fit the data equally well, even in the infinite-data limit. This means that the reward function is only partially identifiable. In this work, we formally characterise the partial identifiability of the reward function given several popular reward learning data sources, including expert demonstrations and trajectory comparisons. We also analyse the impact of this partial identifiability for several downstream tasks, such as policy optimisation. We unify our results in a framework for comparing data sources and downstream tasks by their invariances, with implications for the design and selection of data sources for reward learning.

Submitted 7 June, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

Comments: ICML 2023. 9 pages main paper, 26 pages total, 3 figures

ACM Class: I.2.6

arXiv:2203.03680 [pdf]

Nonreciprocal vortex isolator by stimulated Brillouin scattering in chiral photonic crystal fibre

Authors: Xinglin Zeng, Philip St. J. Russell, Christian Wolff, Michael H. Frosz, Gordon K. L. Wong, Birgit Stiller

Abstract: Optical non-reciprocity, which breaks the symmetry between forward and backward propagating optical waves, has become vital in photonic systems and enables many key devices, such as optical isolators, circulators and optical routers. Most conventional optical isolators involve magneto-optic materials, but devices based on optical nonlinearities, optomechanically induced transparency and stimulated Brillouin scattering (SBS) have also been demonstrated. So far, however, they have only been implemented for linearly or randomly polarized LP01-like fundamental modes. Here we report a light-driven nonreciprocal isolator for optical vortex modes, based on topology-selective SBS in chiral photonic crystal fibre. The device can be reconfigured as an amplifier or an isolator by adjusting the frequency of the control signal. The experimental results show vortex isolation of 22 dB, which is at the state-of-the-art in fundamental mode isolators using SBS. This unique device may find applications in optical communications, fibre lasers, quantum information processing and optical tweezers.

Submitted 7 March, 2022; originally announced March 2022.

Showing 1–50 of 240 results for author: Russell, S