Search | arXiv e-print repository

Memory-Enhanced Neural Solvers for Efficient Adaptation in Combinatorial Optimization

Authors: Felix Chalumeau, Refiloe Shabe, Noah de Nicola, Arnu Pretorius, Thomas D. Barrett, Nathan Grinsztajn

Abstract: Combinatorial Optimization is crucial to numerous real-world applications, yet still presents challenges due to its (NP-)hard nature. Amongst existing approaches, heuristics often offer the best trade-off between quality and scalability, making them suitable for industrial use. While Reinforcement Learning (RL) offers a flexible framework for designing heuristics, its adoption over handcrafted heuristics remains incomplete within industrial solvers. Existing learned methods still lack the ability to adapt to specific instances and fully leverage the available computational budget. The current best methods either rely on a collection of pre-trained policies, or on data-inefficient fine-tuning; hence failing to fully utilize newly available information within the constraints of the budget. In response, we present MEMENTO, an RL approach that leverages memory to improve the adaptation of neural solvers at inference time. MEMENTO enables updating the action distribution dynamically based on the outcome of previous decisions. We validate its effectiveness on benchmark problems, in particular Traveling Salesman and Capacitated Vehicle Routing, demonstrating it can successfully be combined with standard methods to boost their performance under a given budget, both in and out-of-distribution, improving their performance on all 12 evaluated tasks.

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.13637 [pdf]

The Source of Hydrogen in Earth's Building Blocks

Authors: Thomas J Barrett, James F. J. Bryson, Kalotina Geraki

Abstract: Despite being pivotal to the habitability of our planet, the process by which Earth gained its present-day hydrogen budget is unclear. Due to their isotopic similarity to terrestrial rocks across a range of elements, enstatite chondrites (ECs) are thought to be the meteorites that best represent Earth's building blocks. Because of ECs' nominally anhydrous mineralogy, these building blocks have long been presumed to have supplied negligible hydrogen to the proto-Earth. Instead, hydrogen has been proposed to have been delivered to our planet after its main stage of formation by impacts from hydrated asteroids. In this case, our planet's habitability would have its origins in a stochastic process. However, ECs have recently been found to unexpectedly contain enough hydrogen to readily explain Earth's present-day water budget. Although this result would transform the processes we believe are required for rocky planets to be suitable to life, the mineralogical source of ~80% of hydrogen in these meteorites was previously unknown. As such, the reason ECs are seemingly rich in hydrogen was unclear. Here, we apply sulfur X-ray absorption near edge structure (S-XANES) spectroscopy to ECs, finding that most (~70%) of their hydrogen is bonded to sulfur. Moreover, the concentration of the S-H bond is intimately linked to the abundance of micrometre-scale pyrrhotite (Fe1-xS, 0<x<0.125), suggesting most hydrogen in these meteorites is carried in this phase. These findings elucidate the presence of hydrogen in Earth's building blocks, providing the key evidence that unlocks a systematic, rather than stochastic, origin of Earth's hydrogen.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 19 pages, 12 figures

arXiv:2405.15840 [pdf, other]

Learning the Language of Protein Structure

Authors: Benoit Gaujac, Jérémie Donà, Liviu Copoiu, Timothy Atkinson, Thomas Pierrot, Thomas D. Barrett

Abstract: Representation learning and de novo generation of proteins are pivotal computational biology tasks. Whilst natural language processing (NLP) techniques have proven highly effective for protein sequence modelling, structure modelling presents a complex challenge, primarily due to its continuous and three-dimensional nature. Motivated by this discrepancy, we introduce an approach using a vector-quantized autoencoder that effectively tokenizes protein structures into discrete representations. This method transforms the continuous, complex space of protein structures into a manageable, discrete format with a codebook ranging from 4096 to 64000 tokens, achieving high-fidelity reconstructions with backbone root mean square deviations (RMSD) of approximately 1-5 Å. To demonstrate the efficacy of our learned representations, we show that a simple GPT model trained on our codebooks can generate novel, diverse, and designable protein structures. Our approach not only provides representations of protein structure, but also mitigates the challenges of disparate modal representations and sets a foundation for seamless, multi-modal integration, enhancing the capabilities of computational methods in protein design.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.03162 [pdf, other]

Advancing Multimodal Medical Capabilities of Gemini

Authors: Lin Yang, Shawn Xu, Andrew Sellergren, Timo Kohlberger, Yuchen Zhou, Ira Ktena, Atilla Kiraly, Faruk Ahmed, Farhad Hormozdiari, Tiam Jaroensri, Eric Wang, Ellery Wulczyn, Fayaz Jamil, Theo Guidroz, Chuck Lau, Siyuan Qiao, Yun Liu, Akshay Goel, Kendall Park, Arnav Agharwal, Nick George, Yang Wang, Ryutaro Tanno, David G. T. Barrett, Wei-Hung Weng , et al. (22 additional authors not shown)

Abstract: Many clinical tasks require an understanding of specialized data, such as medical images and genomics, which is not typically found in general-purpose large multimodal models. Building upon Gemini's multimodal models, we develop several models within the new Med-Gemini family that inherit core capabilities of Gemini and are optimized for medical use via fine-tuning with 2D and 3D radiology, histopathology, ophthalmology, dermatology and genomic data. Med-Gemini-2D sets a new standard for AI-based chest X-ray (CXR) report generation based on expert evaluation, exceeding previous best results across two separate datasets by an absolute margin of 1% and 12%, where 57% and 96% of AI reports on normal cases, and 43% and 65% on abnormal cases, are evaluated as "equivalent or better" than the original radiologists' reports. We demonstrate the first ever large multimodal model-based report generation for 3D computed tomography (CT) volumes using Med-Gemini-3D, with 53% of AI reports considered clinically acceptable, although additional research is needed to meet expert radiologist reporting quality. Beyond report generation, Med-Gemini-2D surpasses the previous best performance in CXR visual question answering (VQA) and performs well in CXR classification and radiology VQA, exceeding SoTA or baselines on 17 of 20 tasks. In histopathology, ophthalmology, and dermatology image classification, Med-Gemini-2D surpasses baselines across 18 out of 20 tasks and approaches task-specific model performance. Beyond imaging, Med-Gemini-Polygenic outperforms the standard linear polygenic risk score-based approach for disease risk prediction and generalizes to genetically correlated diseases for which it has never been trained. Although further development and evaluation are necessary in the safety-critical medical domain, our results highlight the potential of Med-Gemini across a wide range of medical tasks.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2404.18416 [pdf, other]

Capabilities of Gemini Models in Medicine

Authors: Khaled Saab, Tao Tu, Wei-Hung Weng, Ryutaro Tanno, David Stutz, Ellery Wulczyn, Fan Zhang, Tim Strother, Chunjong Park, Elahe Vedadi, Juanma Zambrano Chaves, Szu-Yeu Hu, Mike Schaekermann, Aishwarya Kamath, Yong Cheng, David G. T. Barrett, Cathy Cheung, Basil Mustafa, Anil Palepu, Daniel McDuff, Le Hou, Tomer Golany, Luyang Liu, Jean-baptiste Alayrac, Neil Houlsby , et al. (42 additional authors not shown)

Abstract: Excellence in a wide variety of medical applications poses considerable challenges for AI, requiring advanced reasoning, access to up-to-date medical knowledge and understanding of complex multimodal data. Gemini models, with strong general capabilities in multimodal and long-context reasoning, offer exciting possibilities in medicine. Building on these core strengths of Gemini, we introduce Med-Gemini, a family of highly capable multimodal models that are specialized in medicine with the ability to seamlessly use web search, and that can be efficiently tailored to novel modalities using custom encoders. We evaluate Med-Gemini on 14 medical benchmarks, establishing new state-of-the-art (SoTA) performance on 10 of them, and surpass the GPT-4 model family on every benchmark where a direct comparison is viable, often by a wide margin. On the popular MedQA (USMLE) benchmark, our best-performing Med-Gemini model achieves SoTA performance of 91.1% accuracy, using a novel uncertainty-guided search strategy. On 7 multimodal benchmarks including NEJM Image Challenges and MMMU (health & medicine), Med-Gemini improves over GPT-4V by an average relative margin of 44.5%. We demonstrate the effectiveness of Med-Gemini's long-context capabilities through SoTA performance on a needle-in-a-haystack retrieval task from long de-identified health records and medical video question answering, surpassing prior bespoke methods using only in-context learning. Finally, Med-Gemini's performance suggests real-world utility by surpassing human experts on tasks such as medical text summarization, alongside demonstrations of promising potential for multimodal medical dialogue, medical research and education. Taken together, our results offer compelling evidence for Med-Gemini's potential, although further rigorous evaluation will be crucial before real-world deployment in this safety-critical domain.

Submitted 1 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

arXiv:2402.15410 [pdf, other]

Detailed Report on the Measurement of the Positive Muon Anomalous Magnetic Moment to 0.20 ppm

Authors: D. P. Aguillard, T. Albahri, D. Allspach, A. Anisenkov, K. Badgley, S. Baeßler, I. Bailey, L. Bailey, V. A. Baranov, E. Barlas-Yucel, T. Barrett, E. Barzi, F. Bedeschi, M. Berz, M. Bhattacharya, H. P. Binney, P. Bloom, J. Bono, E. Bottalico, T. Bowcock, S. Braun, M. Bressler, G. Cantatore, R. M. Carey, B. C. K. Casey , et al. (168 additional authors not shown)

Abstract: We present details on a new measurement of the muon magnetic anomaly, $a_\mu = (g_\mu-2)/2$. The result is based on positive muon data taken at Fermilab's Muon Campus during the 2019 and 2020 accelerator runs. The measurement uses $3.1$ GeV$/c$ polarized muons stored in a $7.1$-m-radius storage ring with a $1.45$ T uniform magnetic field. The value of $a_\mu$ is determined from the measured difference between the muon spin precession frequency and its cyclotron frequency. This difference is normalized to the strength of the magnetic field, measured using Nuclear Magnetic Resonance (NMR). The ratio is then corrected for small contributions from beam motion, beam dispersion, and transient magnetic fields. We measure $a_\mu = 116\,592\,057(25) \times 10^{-11}$ (0.21 ppm). This is the world's most precise measurement of this quantity and represents a factor of $2.2$ improvement over our previous result based on the 2018 dataset. In combination, the two datasets yield $a_\mu(\text{FNAL}) = 116\,592\,055(24) \times 10^{-11}$ (0.20 ppm). Combining this with the measurements from Brookhaven National Laboratory for both positive and negative muons, the new world average is $a_\mu(\text{exp}) = 116\,592\,059(22) \times 10^{-11}$ (0.19 ppm).

Submitted 22 May, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

Comments: 48 pages, 29 figures; 4 pages of Supplement Material; version accepted for publication in Physical Review D

Report number: FERMILAB-PUB-24-0084-AD-CSAID-PPD

arXiv:2311.18260 [pdf, other]

Consensus, dissensus and synergy between clinicians and specialist foundation models in radiology report generation

Authors: Ryutaro Tanno, David G. T. Barrett, Andrew Sellergren, Sumedh Ghaisas, Sumanth Dathathri, Abigail See, Johannes Welbl, Karan Singhal, Shekoofeh Azizi, Tao Tu, Mike Schaekermann, Rhys May, Roy Lee, SiWai Man, Zahra Ahmed, Sara Mahdavi, Yossi Matias, Joelle Barral, Ali Eslami, Danielle Belgrave, Vivek Natarajan, Shravya Shetty, Pushmeet Kohli, Po-Sen Huang, Alan Karthikesalingam , et al. (1 additional authors not shown)

Abstract: Radiology reports are an instrumental part of modern medicine, informing key clinical decisions such as diagnosis and treatment. The worldwide shortage of radiologists, however, restricts access to expert care and imposes heavy workloads, contributing to avoidable errors and delays in report delivery. While recent progress in automated report generation with vision-language models offers clear potential in ameliorating the situation, the path to real-world adoption has been stymied by the challenge of evaluating the clinical quality of AI-generated reports. In this study, we build a state-of-the-art report generation system for chest radiographs, Flamingo-CXR, by fine-tuning a well-known vision-language foundation model on radiology data. To evaluate the quality of the AI-generated reports, a group of 16 certified radiologists provide detailed evaluations of AI-generated and human written reports for chest X-rays from an intensive care setting in the United States and an inpatient setting in India. At least one radiologist (out of two per case) preferred the AI report to the ground truth report in over 60% of cases for both datasets. Amongst the subset of AI-generated reports that contain errors, the most frequently cited reasons were related to the location and finding, whereas for human written reports, most mistakes were related to severity and finding. This disparity suggested potential complementarity between our AI system and human experts, prompting us to develop an assistive scenario in which Flamingo-CXR generates a first-draft report, which is subsequently revised by a clinician. This is the first demonstration of clinician-AI collaboration for report writing, and the resultant reports are assessed to be equivalent or preferred by at least one radiologist to reports written by experts alone in 80% of in-patient cases and 60% of intensive care cases.

Submitted 20 December, 2023; v1 submitted 30 November, 2023; originally announced November 2023.

arXiv:2311.17371 [pdf, other]

Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs

Authors: Andries Smit, Paul Duckworth, Nathan Grinsztajn, Thomas D. Barrett, Arnu Pretorius

Abstract: Recent advancements in large language models (LLMs) underscore their potential for responding to inquiries in various domains. However, ensuring that generative agents provide accurate and reliable answers remains an ongoing challenge. In this context, multi-agent debate (MAD) has emerged as a promising strategy for enhancing the truthfulness of LLMs. We benchmark a range of debating and prompting strategies to explore the trade-offs between cost, time, and accuracy. Importantly, we find that multi-agent debating systems, in their current form, do not reliably outperform other proposed prompting strategies, such as self-consistency and ensembling using multiple reasoning paths. However, when performing hyperparameter tuning, several MAD systems, such as Multi-Persona, perform better. This suggests that MAD protocols might not be inherently worse than other approaches, but that they are more sensitive to different hyperparameter settings and difficult to optimize. We build on these results to offer insights into improving debating strategies, such as adjusting agent agreement levels, which can significantly enhance performance and even surpass all other non-debate protocols we evaluated. We provide an open-source repository to the community with several state-of-the-art protocols together with evaluation scripts to benchmark across popular research datasets.

Submitted 18 July, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

Comments: 2 pages, 13 figures

arXiv:2311.13569 [pdf, other]

Combinatorial Optimization with Policy Adaptation using Latent Space Search

Authors: Felix Chalumeau, Shikha Surana, Clement Bonnet, Nathan Grinsztajn, Arnu Pretorius, Alexandre Laterre, Thomas D. Barrett

Abstract: Combinatorial Optimization underpins many real-world applications and yet, designing performant algorithms to solve these complex, typically NP-hard, problems remains a significant research challenge. Reinforcement Learning (RL) provides a versatile framework for designing heuristics across a broad spectrum of problem domains. However, despite notable progress, RL has not yet supplanted industrial solvers as the go-to solution. Current approaches emphasize pre-training heuristics that construct solutions but often rely on search procedures with limited variance, such as stochastically sampling numerous solutions from a single policy or employing computationally expensive fine-tuning of the policy on individual problem instances. Building on the intuition that performant search at inference time should be anticipated during pre-training, we propose COMPASS, a novel RL approach that parameterizes a distribution of diverse and specialized policies conditioned on a continuous latent space. We evaluate COMPASS across three canonical problems - Travelling Salesman, Capacitated Vehicle Routing, and Job-Shop Scheduling - and demonstrate that our search strategy (i) outperforms state-of-the-art approaches on 11 standard benchmarking tasks and (ii) generalizes better, surpassing all other approaches on a set of 18 procedurally transformed instance distributions.

Submitted 28 May, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: Fix typo in formula and add a reference

arXiv:2308.06230 [pdf, other]

doi 10.1103/PhysRevLett.131.161802

Measurement of the Positive Muon Anomalous Magnetic Moment to 0.20 ppm

Authors: D. P. Aguillard, T. Albahri, D. Allspach, A. Anisenkov, K. Badgley, S. Baeßler, I. Bailey, L. Bailey, V. A. Baranov, E. Barlas-Yucel, T. Barrett, E. Barzi, F. Bedeschi, M. Berz, M. Bhattacharya, H. P. Binney, P. Bloom, J. Bono, E. Bottalico, T. Bowcock, S. Braun, M. Bressler, G. Cantatore, R. M. Carey, B. C. K. Casey , et al. (166 additional authors not shown)

Abstract: We present a new measurement of the positive muon magnetic anomaly, $a_\mu \equiv (g_\mu - 2)/2$, from the Fermilab Muon $g\!-\!2$ Experiment using data collected in 2019 and 2020. We have analyzed more than 4 times the number of positrons from muon decay than in our previous result from 2018 data. The systematic error is reduced by more than a factor of 2 due to better running conditions, a more stable beam, and improved knowledge of the magnetic field weighted by the muon distribution, $\tilde{\omega}'_p$, and of the anomalous precession frequency corrected for beam dynamics effects, $\omega_a$. From the ratio $\omega_a / \tilde{\omega}'_p$, together with precisely determined external parameters, we determine $a_\mu = 116\,592\,057(25) \times 10^{-11}$ (0.21 ppm). Combining this result with our previous result from the 2018 data, we obtain $a_\mu(\text{FNAL}) = 116\,592\,055(24) \times 10^{-11}$ (0.20 ppm). The new experimental world average is $a_\mu(\text{Exp}) = 116\,592\,059(22) \times 10^{-11}$ (0.19 ppm), which represents a factor of 2 improvement in precision.

Submitted 4 October, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

Comments: 8 pages, 3 figures

Report number: FERMILAB-PUB-23-385-AD-CSAID-PPD

Journal ref: Phys. Rev. Lett. 131, 161802 (2023)

arXiv:2305.09072 [pdf, other]

doi 10.1145/3593013.3594114

Skin Deep: Investigating Subjectivity in Skin Tone Annotations for Computer Vision Benchmark Datasets

Authors: Teanna Barrett, Quan Ze Chen, Amy X. Zhang

Abstract: To investigate the well-observed racial disparities in computer vision systems that analyze images of humans, researchers have turned to skin tone as a more objective annotation than race metadata for fairness performance evaluations. However, the current state of skin tone annotation procedures is highly varied. For instance, researchers use a range of untested scales and skin tone categories, have unclear annotation procedures, and provide inadequate analyses of uncertainty. In addition, little attention is paid to the positionality of the humans involved in the annotation process--both designers and annotators alike--and the historical and sociological context of skin tone in the United States. Our work is the first to investigate the skin tone annotation process as a sociotechnical project. We surveyed recent skin tone annotation procedures and conducted annotation experiments to examine how subjective understandings of skin tone are embedded in skin tone annotation procedures. Our systematic literature review revealed the uninterrogated association between skin tone and race and the limited effort to analyze annotator uncertainty in current procedures for skin tone annotation in computer vision evaluation. Our experiments demonstrated that design decisions in the annotation procedure such as the order in which the skin tone scale is presented or additional context in the image (i.e., presence of a face) significantly affected the resulting inter-annotator agreement and individual uncertainty of skin tone annotations. We call for greater reflexivity in the design, analysis, and documentation of procedures for evaluation using skin tone.

Submitted 15 May, 2023; originally announced May 2023.

Comments: To appear in FAcct '23

arXiv:2305.04899 [pdf, other]

doi 10.1088/1361-6455/acf9d2

Bursts of polarised single photons from atom-cavity sources

Authors: Jan Ole Ernst, Juan-Rafael Alvarez, Thomas D. Barrett, Axel Kuhn

Abstract: Photonic qubits play an instrumental role in the development of advanced quantum technologies, including quantum networking, boson sampling and measurement based quantum computing. A promising framework for the deterministic production of indistinguishable single photons is an atomic emitter coupled to a single mode of a high finesse optical cavity. Polarisation control is an important cornerstone, particularly when the polarisation defines the state of a quantum bit. Here, we propose a scheme for producing bursts of polarised single photons by coupling a generalised atomic emitter to an optical cavity, exploiting a particular choice of quantisation axis. In connection with two re-preparation methods, simulations predict 10-photon bursts coincidence count rates on the order of 1 kHz with single 87Rb atoms trapped in a state of the art optical cavity. This paves the way for novel n-photon experiments with atom-cavity sources.

Submitted 25 August, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Journal ref: Journal of Physics B: Atomic, Molecular and Optical Physics, Volume 56, Number 20, 2023

arXiv:2303.12035 [pdf, other]

doi 10.1021/acs.nanolett.3c04190

Quantum gas-enabled direct mapping of active current density in percolating networks of nanowires

Authors: J. Fekete, P. Joshi, T. J. Barrett, T. M. James, R. Shah, A. Gadge, S. Bhumbra, F. Oručević, P. Krüger

Abstract: Electrically percolating nanowire networks are amongst the most promising candidates for next-generation transparent electrodes. Scientific interest in these materials stems from their intrinsic current distribution heterogeneity, leading to phenomena like percolating pathway re-routing and localized self-heating, which can cause irreversible damage. Without an experimental technique to resolve the current distribution, and an underpinning nonlinear percolation model, one relies on empirical rules and safety factors to engineer these materials. We introduce Bose-Einstein microscopy to address the long-standing problem of imaging active current flow in 2D materials. We report on improvement of the performance of this technique, whereby observation of dynamic redistribution of current pathways becomes feasible. We show how this, combined with existing thermal imaging methods, eliminates the need for assumptions between electrical and thermal properties. This will enable testing and modelling individual junction behaviour and hotspot formation. Investigating both reversible and irreversible mechanisms will contribute to the advancement of devices with improved performance and reliability.

Submitted 9 November, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

Journal ref: Nano Lett. 24, 1309 (2024)

arXiv:2212.05983 [pdf]

Measuring the Electronic Bandgap of Carbon Nanotube Networks in Non-ideal p-n Diodes

Authors: Gideon Oyibo, Thomas Barrett, Sharadh Jois, Jeffrey L. Blackburn, Ji Ung Lee

Abstract: The measurement of the bandgap in quasi-one dimensional materials such as carbon nanotubes is challenging due to their dimensionality. In this work, we measure the electronic bandgap of networks of polymer-wrapped semiconducting single-walled carbon nanotubes (s-SWCNTs) using non-ideal p-n diodes. Using these diodes, we measure the electronic bandgap and excitonic levels of different polymer-wrapped s-SWCNTs with varying diameters: arc discharge (~1.55nm), (7,5) (0.83nm), and (6,5) (0.76nm). Our values are consistent with theoretical predictions, providing insight into the fundamental properties of networks of s-SWCNTs.

Submitted 12 December, 2022; originally announced December 2022.

arXiv:2211.02799 [pdf]

Evaluating Novel Mask-RCNN Architectures for Ear Mask Segmentation

Authors: Saurav K. Aryal, Teanna Barrett, Gloria Washington

Abstract: The human ear is generally universal, collectible, distinct, and permanent. Ear-based biometric recognition is a niche and recent approach that is being explored. For any ear-based biometric algorithm to perform well, ear detection and segmentation need to be accurately performed. While significant work has been done in existing literature for bounding boxes, few approaches output a segmentation mask for ears. This paper trains and compares three newer models to the state-of-the-art MaskRCNN (ResNet 101 +FPN) model across four different datasets. The Average Precision (AP) scores reported show that the newer models outperform the state-of-the-art but no one model performs the best over multiple datasets.

Submitted 4 November, 2022; originally announced November 2022.

Comments: Accepted into ICCBS 2022

arXiv:2210.07986 [pdf]

doi 10.1021/acs.nanolett.2c03544

All-carbon nanotube solar cell devices mimic photosynthesis

Authors: Gideon Oyibo, Thomas Barrett, Sharadh Jois, Jeffrey Blackburn, Ji Ung Lee

Abstract: Photovoltaics has two main processes: Optical absorption and power conversion. In photosynthesis, the two equivalent processes are optical absorption and chemical conversion. Whereas in the latter, the two processes are carried out by distinct proteins, in conventional photovoltaic diodes, the two processes are convoluted because the optical and transport paths are the same, leading to inefficiencies. Here, we separate the site and direction of light absorption from those of power generation to show that semiconducting single-walled carbon nanotubes (s-SWCNTs) provide an artificial system that models photosynthesis in a tandem geometry. Using different s-SWCNT chiralities, we implement an energy funnel in dual-gated p-n diodes. This enables the capture of photons from multiple regions of the solar spectrum and the funneling of photogenerated excitons to the smallest bandgap s-SWCNT layer, where they become free carriers. As a result, we demonstrate an increase in the magnitude and spectral response of photocurrent by adding more s-SWCNT layers of different bandgaps without a corresponding deleterious increase in the dark leakage current.

Submitted 14 October, 2022; originally announced October 2022.

arXiv:2210.03475 [pdf, other]

Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization

Authors: Nathan Grinsztajn, Daniel Furelos-Blanco, Shikha Surana, Clément Bonnet, Thomas D. Barrett

Abstract: Applying reinforcement learning (RL) to combinatorial optimization problems is attractive as it removes the need for expert knowledge or pre-solved instances. However, it is unrealistic to expect an agent to solve these (often NP-)hard problems in a single shot at inference due to their inherent complexity. Thus, leading approaches often implement additional search strategies, from stochastic sampling and beam search to explicit fine-tuning. In this paper, we argue for the benefits of learning a population of complementary policies, which can be simultaneously rolled out at inference. To this end, we introduce Poppy, a simple training procedure for populations. Instead of relying on a predefined or hand-crafted notion of diversity, Poppy induces an unsupervised specialization targeted solely at maximizing the performance of the population. We show that Poppy produces a set of complementary policies, and obtains state-of-the-art RL results on four popular NP-hard problems: traveling salesman, capacitated vehicle routing, 0-1 knapsack, and job-shop scheduling.

Submitted 13 November, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

arXiv:2209.13083 [pdf, other]

Why neural networks find simple solutions: the many regularizers of geometric complexity

Authors: Benoit Dherin, Michael Munn, Mihaela Rosca, David G. T. Barrett

Abstract: In many contexts, simpler models are preferable to more complex models and the control of this model complexity is the goal for many methods in machine learning such as regularization, hyperparameter tuning and architecture design. In deep learning, it has been difficult to understand the underlying mechanisms of complexity control, since many traditional measures are not naturally suitable for deep neural networks. Here we develop the notion of geometric complexity, which is a measure of the variability of the model function, computed using a discrete Dirichlet energy. Using a combination of theoretical arguments and empirical results, we show that many common training heuristics such as parameter norm regularization, spectral norm regularization, flatness regularization, implicit gradient regularization, noise regularization and the choice of parameter initialization all act to control geometric complexity, providing a unifying framework in which to characterize the behavior of deep learning models.

Submitted 23 December, 2022; v1 submitted 26 September, 2022; originally announced September 2022.

Comments: Accepted as a NeurIPS 2022 paper

arXiv:2206.06758 [pdf, other]

Universally Expressive Communication in Multi-Agent Reinforcement Learning

Authors: Matthew Morris, Thomas D. Barrett, Arnu Pretorius

Abstract: Allowing agents to share information through communication is crucial for solving complex tasks in multi-agent reinforcement learning. In this work, we consider the question of whether a given communication protocol can express an arbitrary policy. By observing that many existing protocols can be viewed as instances of graph neural networks (GNNs), we demonstrate the equivalence of joint action selection to node labelling. With standard GNN approaches provably limited in their expressive capacity, we draw from existing GNN literature and consider augmenting agent observations with: (1) unique agent IDs and (2) random noise. We provide a theoretical analysis as to how these approaches yield universally expressive communication, and also prove them capable of targeting arbitrary sets of actions for identical agents. Empirically, these augmentations are found to improve performance on tasks where expressive communication is required, whilst, in general, the optimal communication protocol is found to be task-dependent.

Submitted 13 January, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

Comments: Published in NeurIPS 2022

MSC Class: 68T07; 68T42; 68R10 (Primary) 68T20; 05C15 (Secondary) ACM Class: I.2.11; I.2.6; I.2.8

arXiv:2205.14345 [pdf, other]

Reinforcement Learning for Branch-and-Bound Optimisation using Retrospective Trajectories

Authors: Christopher W. F. Parsonson, Alexandre Laterre, Thomas D. Barrett

Abstract: Combinatorial optimisation problems framed as mixed integer linear programmes (MILPs) are ubiquitous across a range of real-world applications. The canonical branch-and-bound algorithm seeks to exactly solve MILPs by constructing a search tree of increasingly constrained sub-problems. In practice, its solving time performance is dependent on heuristics, such as the choice of the next variable to constrain ('branching'). Recently, machine learning (ML) has emerged as a promising paradigm for branching. However, prior works have struggled to apply reinforcement learning (RL), citing sparse rewards, difficult exploration, and partial observability as significant challenges. Instead, leading ML methodologies resort to approximating high quality handcrafted heuristics with imitation learning (IL), which precludes the discovery of novel policies and requires expensive data labelling. In this work, we propose retro branching; a simple yet effective approach to RL for branching. By retrospectively deconstructing the search tree into multiple paths each contained within a sub-tree, we enable the agent to learn from shorter trajectories with more predictable next states. In experiments on four combinatorial tasks, our approach enables learning-to-branch without any expert guidance or pre-training. We outperform the current state-of-the-art RL branching algorithm by 3-5x and come within 20% of the best IL method's performance on MILPs with 500 constraints and 1000 variables, with ablations verifying that our retrospectively constructed trajectories are essential to achieving these results.

Submitted 5 December, 2022; v1 submitted 28 May, 2022; originally announced May 2022.

Comments: Accepted to AAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence

Journal ref: AAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

arXiv:2205.14105 [pdf, other]

Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration

Authors: Thomas D. Barrett, Christopher W. F. Parsonson, Alexandre Laterre

Abstract: From logistics to the natural sciences, combinatorial optimisation on graphs underpins numerous real-world applications. Reinforcement learning (RL) has shown particular promise in this setting as it can adapt to specific problem structures and does not require pre-solved instances for these, often NP-hard, problems. However, state-of-the-art (SOTA) approaches typically suffer from severe scalability issues, primarily due to their reliance on expensive graph neural networks (GNNs) at each decision step. We introduce ECORD; a novel RL algorithm that alleviates this expense by restricting the GNN to a single pre-processing step, before entering a fast-acting exploratory phase directed by a recurrent unit. Experimentally, ECORD achieves a new SOTA for RL algorithms on the Maximum Cut problem, whilst also providing orders of magnitude improvement in speed and scalability. Compared to the nearest competitor, ECORD reduces the optimality gap by up to 73% on 500 vertex graphs with a decreased wall-clock time. Moreover, ECORD retains strong performance when generalising to larger graphs with up to 10000 vertices.

Submitted 27 May, 2022; originally announced May 2022.

arXiv:2204.11973 [pdf, other]

On Automorphism Criteria for Comparing Amounts of Mathematical Structure

Authors: Thomas William Barrett, JB Manchak, James Owen Weatherall

Abstract: Wilhelm (2021) has recently defended a criterion for comparing structure of mathematical objects, which he calls Subgroup. He argues that Subgroup is better than SYM*, another widely adopted criterion. We argue that this is mistaken; Subgroup is strictly worse than SYM*. We then formulate a new criterion that improves on both SYM* and Subgroup, answering Wilhelm's criticisms of SYM* along the way. We conclude by arguing that no criterion that looks only to the automorphisms of mathematical objects to compare their structure can be fully satisfactory.

Submitted 25 April, 2022; originally announced April 2022.

Comments: 13 pages, 1 figure (depicting a giraffe)

arXiv:2201.08754 [pdf, other]

doi 10.1103/PhysRevLett.130.123401

Probing the Degree of Coherence through the Full 1D to 3D crossover

Authors: Robert Shah, Thomas Barrett, Andrea Colcelli, Fedja Orucevic, Andrea Trombettoni, Peter Kruger

Abstract: We experimentally study a gas of quantum degenerate $^{87}$Rb atoms throughout the full dimensional crossover, from a one-dimensional (1D) system exhibiting phase fluctuations consistent with 1D theory to a three-dimensional (3D) phase-coherent system, thereby smoothly interpolating between these distinct, well-understood regimes. Using a hybrid trapping architecture combining an atom chip with a printed circuit board, we continuously adjust the system's dimensionality over a wide range while measuring the phase fluctuations through the power spectrum of density ripples in time-of-flight expansion. Our measurements confirm that the chemical potential $μ$ controls the departure of the system from 3D and that the fluctuations are dependent on both $μ$ and the temperature $T$. Through a rigorous study we quantitatively observe how inside the crossover the dependence on $T$ gradually disappears as the system becomes 3D. Throughout the entire crossover the fluctuations are shown to be determined by the relative occupation of 1D axial collective excitations.

Submitted 22 March, 2023; v1 submitted 21 January, 2022; originally announced January 2022.

Comments: 10 pages, 6 figures

arXiv:2111.15090 [pdf, other]

The Geometric Occam's Razor Implicit in Deep Learning

Authors: Benoit Dherin, Michael Munn, David G. T. Barrett

Abstract: In over-parameterized deep neural networks there can be many possible parameter configurations that fit the training data exactly. However, the properties of these interpolating solutions are poorly understood. We argue that over-parameterized neural networks trained with stochastic gradient descent are subject to a Geometric Occam's Razor; that is, these networks are implicitly regularized by the geometric model complexity. For one-dimensional regression, the geometric model complexity is simply given by the arc length of the function. For higher-dimensional settings, the geometric model complexity depends on the Dirichlet energy of the function. We explore the relationship between this Geometric Occam's Razor, the Dirichlet energy and other known forms of implicit regularization. Finally, for ResNets trained on CIFAR-10, we observe that Dirichlet energy measurements are consistent with the action of this implicit Geometric Occam's Razor.

Submitted 30 November, 2021; v1 submitted 29 November, 2021; originally announced November 2021.

Comments: Accepted as a NeurIPS 2021 workshop paper (OPT2021)

arXiv:2111.00206 [pdf, other]

One Step at a Time: Pros and Cons of Multi-Step Meta-Gradient Reinforcement Learning

Authors: Clément Bonnet, Paul Caron, Thomas Barrett, Ian Davies, Alexandre Laterre

Abstract: Self-tuning algorithms that adapt the learning process online encourage more effective and robust learning. Among all the methods available, meta-gradients have emerged as a promising approach. They leverage the differentiability of the learning rule with respect to some hyper-parameters to adapt them in an online fashion. Although meta-gradients can be accumulated over multiple learning steps to avoid myopic updates, this is rarely used in practice. In this work, we demonstrate that whilst multi-step meta-gradients do provide a better learning signal in expectation, this comes at the cost of a significant increase in variance, hindering performance. In the light of this analysis, we introduce a novel method mixing multiple inner steps that enjoys a more accurate and robust meta-gradient signal, essentially trading off bias and variance in meta-gradient estimation. When applied to the Snake game, the mixing meta-gradient algorithm can cut the variance by a factor of 3 while achieving similar or higher performance.

Submitted 30 October, 2021; originally announced November 2021.

Comments: 14 pages, 6 figures, 2 tables

arXiv:2109.12606 [pdf, other]

Autoregressive neural-network wavefunctions for ab initio quantum chemistry

Authors: Thomas D. Barrett, Aleksei Malyshev, A. I. Lvovsky

Abstract: In recent years, neural network quantum states (NNQS) have emerged as powerful tools for the study of quantum many-body systems. Electronic structure calculations are one such canonical many-body problem that have attracted significant research efforts spanning multiple decades, whilst only recently being attempted with NNQS. However, the complex non-local interactions and high sample complexity are significant challenges that call for bespoke solutions. Here, we parameterise the electronic wavefunction with a novel autoregressive neural network (ARN) that permits highly efficient and scalable sampling, whilst also embedding physical priors reflecting the structure of molecular systems without sacrificing expressibility. This allows us to perform electronic structure calculations on molecules with up to 30 spin-orbitals -- at least an order of magnitude more Slater determinants than previous applications of conventional NNQS -- and we find that our ansatz can outperform the de-facto gold-standard coupled cluster methods even in the presence of strong quantum correlations. With a highly expressive neural network for which sampling is no longer a computational bottleneck, we conclude that the barriers to further scaling are not associated with the wavefunction ansatz itself, but rather are inherent to any variational Monte Carlo approach.

Submitted 25 January, 2022; v1 submitted 26 September, 2021; originally announced September 2021.

Comments: 8 pages, plus Methods and Supplementary Information

arXiv:2106.09705 [pdf, other]

How to administer an antidote to Schrödinger's cat

Authors: Juan-Rafael Álvarez, Mark IJspeert, Oliver Barter, Ben Yuen, Thomas D. Barrett, Dustin Stuart, Jerome Dilley, Annemarie Holleczek, Axel Kuhn

Abstract: In his 1935 Gedankenexperiment, Erwin Schrödinger imagined a poisonous substance which has a 50% probability of being released, based on the decay of a radioactive atom. As such, the life of the cat and the state of the poison become entangled, and the fate of the cat is determined upon opening the box. We present an experimental technique that keeps the cat alive on any account. This method relies on the time-resolved Hong-Ou-Mandel effect: two long, identical photons impinging on a beam splitter always bunch in either of the outputs. Interpreting the first photon detection as the state of the poison, the second photon is identified as the state of the cat. Even after the collapse of the first photon's state, we show their fates are intertwined through quantum interference. We demonstrate this by a sudden phase change between the inputs, administered conditionally on the outcome of the first detection, which steers the second photon to a pre-defined output and ensures that the cat is always observed alive.

Submitted 17 June, 2021; originally announced June 2021.

Comments: 7 pages, 4 figures, including supplementary material with 4 pages and 4 figures

arXiv:2105.13922 [pdf, other]

Discretization Drift in Two-Player Games

Authors: Mihaela Rosca, Yan Wu, Benoit Dherin, David G. T. Barrett

Abstract: Gradient-based methods for two-player games produce rich dynamics that can solve challenging problems, yet can be difficult to stabilize and understand. Part of this complexity originates from the discrete update steps given by simultaneous or alternating gradient descent, which causes each player to drift away from the continuous gradient flow -- a phenomenon we call discretization drift. Using backward error analysis, we derive modified continuous dynamical systems that closely follow the discrete dynamics. These modified dynamics provide an insight into the notorious challenges associated with zero-sum games, including Generative Adversarial Networks. In particular, we identify distinct components of the discretization drift that can alter performance and in some cases destabilize the game. Finally, quantifying discretization drift allows us to identify regularizers that explicitly cancel harmful forms of drift or strengthen beneficial forms of drift, and thus improve performance of GAN training.

Submitted 1 July, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

arXiv:2105.08702 [pdf]

Component Based Solutions Under Architecture

Authors: T. A. Barrett, H. A. Proper

Abstract: Many of today's applications have an, almost tangible, monolithic nature. They are built as 'islands', purporting to be self contained, offering little or nothing in the way of integration with other applications. In the past, being large and self-contained may have eliminated the need to interact with other solutions to some extent. However, in the business environments of today the interaction with other applications becomes paramount. As a result of this, many ad-hoc point-to-point integration solutions have been built between different applications. This has already led to an 'application spaghetti' at many of our customer sites. Many of today's applications are poorly structured, which makes their responsiveness to business change sluggish. The application spaghetti with its plethora of point-to-point interfaces further inhibits the responsiveness to change.

Submitted 18 May, 2021; originally announced May 2021.

arXiv:2104.03281 [pdf, other]

doi 10.1103/PhysRevLett.126.141801

Measurement of the Positive Muon Anomalous Magnetic Moment to 0.46 ppm

Authors: B. Abi, T. Albahri, S. Al-Kilani, D. Allspach, L. P. Alonzi, A. Anastasi, A. Anisenkov, F. Azfar, K. Badgley, S. Baeßler, I. Bailey, V. A. Baranov, E. Barlas-Yucel, T. Barrett, E. Barzi, A. Basti, F. Bedeschi, A. Behnke, M. Berz, M. Bhattacharya, H. P. Binney, R. Bjorkquist, P. Bloom, J. Bono, E. Bottalico, et al. (212 additional authors not shown)

Abstract: We present the first results of the Fermilab Muon g-2 Experiment for the positive muon magnetic anomaly $a_μ\equiv (g_μ-2)/2$. The anomaly is determined from the precision measurements of two angular frequencies. Intensity variation of high-energy positrons from muon decays directly encodes the difference frequency $ω_a$ between the spin-precession and cyclotron frequencies for polarized muons in a magnetic storage ring. The storage ring magnetic field is measured using nuclear magnetic resonance probes calibrated in terms of the equivalent proton spin precession frequency $\tilde{ω}'_p$ in a spherical water sample at 34.7$^{\circ}$C. The ratio $ω_a / \tilde{ω}'_p$, together with known fundamental constants, determines $a_μ({\rm FNAL}) = 116\,592\,040(54)\times 10^{-11}$ (0.46 ppm). The result is 3.3 standard deviations greater than the standard model prediction and is in excellent agreement with the previous Brookhaven National Laboratory (BNL) E821 measurement. After combination with previous measurements of both $μ^+$ and $μ^-$, the new experimental average of $a_μ({\rm Exp}) = 116\,592\,061(41)\times 10^{-11}$ (0.35 ppm) increases the tension between experiment and theory to 4.2 standard deviations.

Submitted 7 April, 2021; originally announced April 2021.

Comments: 10 pages; 4 figures

Report number: FERMILAB-PUB-21-132-E

Journal ref: Phys. Rev. Lett. 126, 141801 (2021)

arXiv:2104.03247 [pdf, other]

doi 10.1103/PhysRevD.103.072002

Measurement of the anomalous precession frequency of the muon in the Fermilab Muon g-2 experiment

Authors: T. Albahri, A. Anastasi, A. Anisenkov, K. Badgley, S. Baeßler, I. Bailey, V. A. Baranov, E. Barlas-Yucel, T. Barrett, A. Basti, F. Bedeschi, M. Berz, M. Bhattacharya, H. P. Binney, P. Bloom, J. Bono, E. Bottalico, T. Bowcock, G. Cantatore, R. M. Carey, B. C. K. Casey, D. Cauz, R. Chakraborty, S. P. Chang, A. Chapelain, et al. (153 additional authors not shown)

Abstract: The Muon g-2 Experiment at Fermi National Accelerator Laboratory (FNAL) has measured the muon anomalous precession frequency $ω_a$ to an uncertainty of 434 parts per billion (ppb), statistical, and 56 ppb, systematic, with data collected in four storage ring configurations during its first physics run in 2018. When combined with a precision measurement of the magnetic field of the experiment's muon storage ring, the precession frequency measurement determines a muon magnetic anomaly of $a_μ({\rm FNAL}) = 116\,592\,040(54) \times 10^{-11}$ (0.46 ppm). This article describes the multiple techniques employed in the reconstruction, analysis and fitting of the data to measure the precession frequency. It also presents the averaging of the results from the eleven separate determinations of $ω_a$, and the systematic uncertainties on the result.

Submitted 7 April, 2021; originally announced April 2021.

Comments: 29 pages, 19 figures. Published in Physical Review D

Report number: FERMILAB-PUB-21-183-E

Journal ref: Phys. Rev. D 103, 072002 (2021)

arXiv:2104.03240 [pdf, other]

doi 10.1103/PhysRevAccelBeams.24.044002

Beam dynamics corrections to the Run-1 measurement of the muon anomalous magnetic moment at Fermilab

Authors: T. Albahri, A. Anastasi, K. Badgley, S. Baeßler, I. Bailey, V. A. Baranov, E. Barlas-Yucel, T. Barrett, F. Bedeschi, M. Berz, M. Bhattacharya, H. P. Binney, P. Bloom, J. Bono, E. Bottalico, T. Bowcock, G. Cantatore, R. M. Carey, B. C. K. Casey, D. Cauz, R. Chakraborty, S. P. Chang, A. Chapelain, S. Charity, R. Chislett, et al. (152 additional authors not shown)

Abstract: This paper presents the beam dynamics systematic corrections and their uncertainties for the Run-1 data set of the Fermilab Muon g-2 Experiment. Two corrections to the measured muon precession frequency $ω_a^m$ are associated with well-known effects owing to the use of electrostatic quadrupole (ESQ) vertical focusing in the storage ring. An average vertically oriented motional magnetic field is felt by relativistic muons passing transversely through the radial electric field components created by the ESQ system. The correction depends on the stored momentum distribution and the tunes of the ring, which has relatively weak vertical focusing. Vertical betatron motions imply that the muons do not orbit the ring in a plane exactly orthogonal to the vertical magnetic field direction. A correction is necessary to account for an average pitch angle associated with their trajectories. A third small correction is necessary because muons that escape the ring during the storage time are slightly biased in initial spin phase compared to the parent distribution. Finally, because two high-voltage resistors in the ESQ network had longer than designed RC time constants, the vertical and horizontal centroids and envelopes of the stored muon beam drifted slightly, but coherently, during each storage ring fill. This led to the discovery of an important phase-acceptance relationship that requires a correction. The sum of the corrections to $ω_a^m$ is 0.50 $\pm$ 0.09 ppm; the uncertainty is small compared to the 0.43 ppm statistical precision of $ω_a^m$.

Submitted 23 April, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

Comments: 35 pages, 29 figures. Accepted by Phys. Rev. Accel. Beams

Report number: FERMILAB-PUB-21-133-E

Journal ref: Phys. Rev. Accel. Beams 24, 044002 (2021)

arXiv:2104.03201 [pdf, other]

doi 10.1103/PhysRevA.103.042208

Magnetic Field Measurement and Analysis for the Muon g-2 Experiment at Fermilab

Authors: T. Albahri, A. Anastasi, K. Badgley, S. Baeßler, I. Bailey, V. A. Baranov, E. Barlas-Yucel, T. Barrett, F. Bedeschi, M. Berz, M. Bhattacharya, H. P. Binney, P. Bloom, J. Bono, E. Bottalico, T. Bowcock, G. Cantatore, R. M. Carey, B. C. K. Casey, D. Cauz, R. Chakraborty, S. P. Chang, A. Chapelain, S. Charity, R. Chislett, et al. (148 additional authors not shown)

Abstract: The Fermi National Accelerator Laboratory has measured the anomalous precession frequency $a_μ = (g_μ-2)/2$ of the muon to a combined precision of 0.46 parts per million with data collected during its first physics run in 2018. This paper documents the measurement of the magnetic field in the muon storage ring. The magnetic field is monitored by nuclear magnetic resonance systems and calibrated in terms of the equivalent proton spin precession frequency in a spherical water sample at 34.7$^\circ$C. The measured field is weighted by the muon distribution resulting in $\tilde{ω}'_p$, the denominator in the ratio $ω_a/\tilde{ω}'_p$ that together with known fundamental constants yields $a_μ$. The reported uncertainty on $\tilde{ω}'_p$ for the Run-1 data set is 114 ppb consisting of uncertainty contributions from frequency extraction, calibration, mapping, tracking, and averaging of 56 ppb, and contributions from fast transient fields of 99 ppb.

Submitted 17 June, 2022; v1 submitted 7 April, 2021; originally announced April 2021.

Comments: Added one citation and corrected missing normalization in Eqs (35) and (36)

Report number: FERMILAB-PUB-21-109-E

Journal ref: Phys. Rev. A 103, 042208 (2021)

arXiv:2101.12726 [pdf, other]

An Environmental Monitoring Network for Quantum Gas Experiments and Devices

Authors: T. J. Barrett, W. Evans, A. Gadge, S. Bhumbra, S. Sleegers, R. Shah, J. Fekete, F. Orucevic, P. Kruger

Abstract: Quantum technology is approaching a level of maturity, recently demonstrated in space-borne experiments and in-field measurements, which would allow for adoption by non-specialist users. Parallel advancements made in microprocessor-based electronics and database software can be combined to create robust, versatile and modular experimental monitoring systems. Here, we describe a monitoring network used across a number of cold atom laboratories with a shared laser system. The ability to diagnose malfunction, unexpected or unintended behaviour and passively collect data for key experimental parameters, such as vacuum chamber pressure, laser beam power, or resistances of important conductors, significantly reduces debugging time. This allows for efficient control over a number of experiments and remote control when access is limited.

Submitted 14 September, 2021; v1 submitted 29 January, 2021; originally announced January 2021.

Comments: 15 pages, 5 figures

arXiv:2101.12176 [pdf, other]

On the Origin of Implicit Regularization in Stochastic Gradient Descent

Authors: Samuel L. Smith, Benoit Dherin, David G. T. Barrett, Soham De

Abstract: For infinitesimal learning rates, stochastic gradient descent (SGD) follows the path of gradient flow on the full batch loss function. However moderately large learning rates can achieve higher test accuracies, and this generalization benefit is not explained by convergence bounds, since the learning rate which maximizes test accuracy is often larger than the learning rate which minimizes training loss. To interpret this phenomenon we prove that for SGD with random shuffling, the mean SGD iterate also stays close to the path of gradient flow if the learning rate is small and finite, but on a modified loss. This modified loss is composed of the original loss function and an implicit regularizer, which penalizes the norms of the minibatch gradients. Under mild assumptions, when the batch size is small the scale of the implicit regularization term is proportional to the ratio of the learning rate to the batch size. We verify empirically that explicitly including the implicit regularizer in the loss can enhance the test accuracy when the learning rate is small.

Submitted 28 January, 2021; originally announced January 2021.

Comments: Accepted as a conference paper at ICLR 2021

arXiv:2012.06317 [pdf]

Lunar Volatiles and Solar System Science

Authors: Parvathy Prem, Ákos Kereszturi, Ariel N. Deutsch, Charles A. Hibbitts, Carl A. Schmidt, Cesare Grava, Casey I. Honniball, Craig J. Hardgrove, Carlé M. Pieters, David B. Goldstein, Donald C. Barker, Debra H. Needham, Dana M. Hurley, Erwan Mazarico, Gerardo Dominguez, G. Wesley Patterson, Georgiana Y. Kramer, Julie Brisset, Jeffrey J. Gillis-Davis, Julie L. Mitchell, Jamey R. Szalay, Jasper S. Halekas, James T. Keane, James W. Head, Kathleen E. Mandt, et al. (16 additional authors not shown)

Abstract: Understanding the origin and evolution of the lunar volatile system is not only compelling lunar science, but also fundamental Solar System science. This white paper (submitted to the US National Academies' Decadal Survey in Planetary Science and Astrobiology 2023-2032) summarizes recent advances in our understanding of lunar volatiles, identifies outstanding questions for the next decade, and discusses key steps required to address these questions.

Submitted 9 December, 2020; originally announced December 2020.

arXiv:2009.12095 [pdf, ps, other]

doi 10.1364/OL.401675

Fully reconfigurable coherent optical vector-matrix multiplication

Authors: James Spall, Xianxin Guo, Thomas D. Barrett, A. I. Lvovsky

Abstract: Optics is a promising platform in which to help realise the next generation of fast, parallel and energy-efficient computation. We demonstrate a reconfigurable free-space optical multiplier that is capable of over 3000 computations in parallel, using spatial light modulators with a pixel resolution of only 340x340. This enables vector-matrix multiplication and parallel vector-vector multiplication with vector size of up to 56. Our design is the first to simultaneously support optical implementation of reconfigurable, large-size and real-valued linear algebraic operations. Such an optical multiplier can serve as a building block of special-purpose optical processors such as optical neural networks and optical Ising machines.

Submitted 25 September, 2020; originally announced September 2020.

Comments: 4 pages, 4 figures

Journal ref: Optics Letters Vol. 45, Issue 20, pp. 5752-5755 (2020)

arXiv:2009.11162 [pdf, other]

Implicit Gradient Regularization

Authors: David G. T. Barrett, Benoit Dherin

Abstract: Gradient descent can be surprisingly good at optimizing deep neural networks without overfitting and without explicit regularization. We find that the discrete steps of gradient descent implicitly regularize models by penalizing gradient descent trajectories that have large loss gradients. We call this Implicit Gradient Regularization (IGR) and we use backward error analysis to calculate the size of this regularization. We confirm empirically that implicit gradient regularization biases gradient descent toward flat minima, where test errors are small and solutions are robust to noisy parameter perturbations. Furthermore, we demonstrate that the implicit gradient regularization term can be used as an explicit regularizer, allowing us to control this gradient regularization directly. More broadly, our work indicates that backward error analysis is a useful theoretical approach to the perennial question of how learning rate, model size, and parameter regularization interact to determine the properties of overparameterized models optimized with gradient descent.

Submitted 18 July, 2022; v1 submitted 23 September, 2020; originally announced September 2020.

Comments: Correction to formula A.14 in Appendix A.1 and update to the acknowledgments

Journal ref: Published as a conference paper at ICLR 2021

arXiv:2002.06991 [pdf, other]

Learning Group Structure and Disentangled Representations of Dynamical Environments

Authors: Robin Quessard, Thomas D. Barrett, William R. Clements

Abstract: Learning disentangled representations is a key step towards effectively discovering and modelling the underlying structure of environments. In the natural sciences, physics has found great success by describing the universe in terms of symmetry preserving transformations. Inspired by this formalism, we propose a framework, built upon the theory of group representation, for learning representations of a dynamical environment structured around the transformations that generate its evolution. Experimentally, we learn the structure of explicitly symmetric environments without supervision from observational data generated by sequential interactions. We further introduce an intuitive disentanglement regularisation to ensure the interpretability of the learnt representations. We show that our method enables accurate long-horizon predictions, and demonstrate a correlation between the quality of predictions and disentanglement in the latent space.

Submitted 25 October, 2020; v1 submitted 17 February, 2020; originally announced February 2020.

Comments: Accepted to NeurIPS 2020

arXiv:1912.12256 [pdf, ps, other]

Backpropagation through nonlinear units for all-optical training of neural networks

Authors: Xianxin Guo, Thomas D. Barrett, Zhiming M. Wang, A. I. Lvovsky

Abstract: Backpropagation through nonlinear neurons is an outstanding challenge to the field of optical neural networks and the major conceptual barrier to all-optical training schemes. Each neuron is required to exhibit a directionally dependent response to propagating optical signals, with the backwards response conditioned on the forward signal, which is highly non-trivial to implement optically. We propose a practical and surprisingly simple solution that uses saturable absorption to provide the network nonlinearity. We find that the backward propagating gradients required to train the network can be approximated in a pump-probe scheme that requires only passive optical elements. Simulations show that, with readily obtainable optical depths, our approach can achieve equivalent performance to state-of-the-art computational networks on image classification benchmarks, even in deep networks with multiple sequential gradient approximations. This scheme is compatible with leading optical neural network proposals and therefore provides a feasible path towards end-to-end optical training.

Submitted 8 October, 2020; v1 submitted 23 December, 2019; originally announced December 2019.

Comments: Error fixed in Fig.1

Journal ref: Photonics Research 9, B71-B80 (2021)

arXiv:1909.04063 [pdf, other]

Exploratory Combinatorial Optimization with Reinforcement Learning

Authors: Thomas D. Barrett, William R. Clements, Jakob N. Foerster, A. I. Lvovsky

Abstract: Many real-world problems can be reduced to combinatorial optimization on a graph, where the subset or ordering of vertices that maximize some objective function must be found. With such tasks often NP-hard and analytically intractable, reinforcement learning (RL) has shown promise as a framework with which efficient heuristic methods to tackle these problems can be learned. Previous works construct the solution subset incrementally, adding one element at a time, however, the irreversible nature of this approach prevents the agent from revising its earlier decisions, which may be necessary given the complexity of the optimization task. We instead propose that the agent should seek to continuously improve the solution by learning to explore at test time. Our approach of exploratory combinatorial optimization (ECO-DQN) is, in principle, applicable to any combinatorial problem that can be defined on a graph. Experimentally, we show our method to produce state-of-the-art RL performance on the Maximum Cut problem. Moreover, because ECO-DQN can start from any arbitrary configuration, it can be combined with other search methods to further improve performance, which we demonstrate using a simple random search.

Submitted 31 January, 2020; v1 submitted 9 September, 2019; originally announced September 2019.

Comments: In Proceedings of the 34th National Conference on Artificial Intelligence, AAAI 2020

Journal ref: Proceedings of Thirty-fourth AAAI conference on artificial intelligence, 3243-3250 (2020)

arXiv:1903.08628 [pdf, ps, other]

doi 10.1088/1367-2630/ab8ab0

Pushing Purcell-enhancement beyond its limits

Authors: Thomas D. Barrett, Thomas H. Doherty, Axel Kuhn

Abstract: Purcell-enhanced emission from a coupled emitter-cavity system is a fundamental manifestation of cavity quantum electrodynamics. Starting from a theoretical description we derive a scheme for photon emission from an emitter coupled to a birefringent cavity that exceeds hitherto anticipated limitations. Based on a recent study and experimental investigation of the intra-cavity coupling of orthogonal polarisation modes in birefringent cavities, we now decouple the emitter and the photon prior to emission from the cavity mode. Effectively, this is "hiding" the emitter from the photon in the cavity to suppress re-excitation, increasing the overall emission through the cavity mirrors. In doing so we show that tailored cavity birefringence can offer significant advantages and that these are practically achievable within the bounds of present-day technology. It is found that birefringence can mitigate the tradeoff between stronger emitter-cavity coupling and efficient photon extraction. This allows for longer cavities to be constructed without a loss of performance -- a significant result for applications where dielectric mirrors interfere with any trapping fields confining the emitter. We then generalise our model to consider a variety of equivalent schemes. For instance, detuning a pair of ground states in a three-level emitter coupled to a cavity in a Lambda-system is shown to provide the same enhancement, and it can be combined with a birefringent cavity to further increase performance.
Additionally, it is found that when directly connecting multiple ground states of the emitter to form a chain of coupled states, the extraction efficiency approaches its fundamental upper limit. The principles proposed in this work can be applied in multiple ways to any emitter-cavity system, paving the way to surpassing the traditional limits of such systems with technologies that exist today.

Submitted 29 March, 2020; v1 submitted 20 March, 2019; originally announced March 2019.

Comments: 8 pages, 8 figures plus 3 page appendix

arXiv:1902.00120 [pdf, other]

Learning to Make Analogies by Contrasting Abstract Relational Structure

Authors: Felix Hill, Adam Santoro, David G. T. Barrett, Ari S. Morcos, Timothy Lillicrap

Abstract: Analogical reasoning has been a principal focus of various waves of AI research. Analogy is particularly challenging for machines because it requires relational structures to be represented such that they can be flexibly applied across diverse domains of experience. Here, we study how analogical reasoning can be induced in neural networks that learn to perceive and reason about raw visual data. We find that the critical factor for inducing such a capacity is not an elaborate architecture, but rather, careful attention to the choice of data and the manner in which it is presented to the model. The most robust capacity for analogical reasoning is induced when networks learn analogies by contrasting abstract relational structures in their input domains, a training method that uses only the input data to force models to learn about important abstract features. Using this technique we demonstrate capacities for complex, visual and symbolic analogy making and generalisation in even the simplest neural network architectures.

Submitted 31 January, 2019; originally announced February 2019.

arXiv:1810.13373 [pdf, other]

Analyzing biological and artificial neural networks: challenges with opportunities for synergy?

Authors: David G. T. Barrett, Ari S. Morcos, Jakob H. Macke

Abstract: Deep neural networks (DNNs) transform stimuli across multiple processing stages to produce representations that can be used to solve complex tasks, such as object recognition in images. However, a full understanding of how they achieve this remains elusive. The complexity of biological neural networks substantially exceeds the complexity of DNNs, making it even more challenging to understand the representations that they learn. Thus, both machine learning and computational neuroscience are faced with a shared challenge: how can we analyze their representations in order to understand how they solve complex tasks? We review how data-analysis concepts and techniques developed by computational neuroscientists can be useful for analyzing representations in DNNs, and in turn, how recently developed techniques for analysis of DNNs can be useful for understanding representations in biological neural networks. We explore opportunities for synergy between the two fields, such as the use of DNNs as in-silico model systems for neuroscience, and how this synergy can lead to new hypotheses about the operating principles of biological neural networks.

Submitted 31 October, 2018; originally announced October 2018.

arXiv:1807.07633 [pdf, ps, other]

doi 10.1103/PhysRevLett.122.083602

Polarisation oscillations in birefringent emitter-cavity systems

Authors: Thomas D. Barrett, Oliver Barter, Dustin Stuart, Ben Yuen, Axel Kuhn

Abstract: We present the effects of resonator birefringence on the cavity-enhanced interfacing of quantum states of light and matter, including the first observation of single photons with a time-dependent polarisation state that evolves within their coherence time. A theoretical model is introduced and experimentally verified by the modified polarisation of temporally-long single photons emitted from a $^{87}$Rb atom coupled to a high-finesse optical cavity by a vacuum-stimulated Raman adiabatic passage (V-STIRAP) process. Further theoretical investigation shows how a change in cavity birefringence can both impact the atom-cavity coupling and engender starkly different polarisation behaviour in the emitted photons. With polarisation a key resource for encoding quantum states of light and modern micron-scale cavities particularly prone to birefringence, the consideration of these effects is vital to the faithful realisation of efficient and coherent emitter-photon interfaces for distributed quantum networking and communications.

Submitted 26 March, 2019; v1 submitted 19 July, 2018; originally announced July 2018.

Comments: 9 pages, 5 figures including Supplemental Material

Journal ref: Phys. Rev. Lett. 122, 083602 (2019)

arXiv:1807.04225 [pdf, other]

Measuring abstract reasoning in neural networks

Authors: David G. T. Barrett, Felix Hill, Adam Santoro, Ari S. Morcos, Timothy Lillicrap

Abstract: Whether neural networks can learn abstract reasoning or whether they merely rely on superficial statistics is a topic of recent debate. Here, we propose a dataset and challenge designed to probe abstract reasoning, inspired by a well-known human IQ test. To succeed at this challenge, models must cope with various generalisation `regimes' in which the training and test data differ in clearly-defined ways. We show that popular models such as ResNets perform poorly, even when the training and test sets differ only minimally, and we present a novel architecture, with a structure designed to encourage reasoning, that does significantly better. When we vary the way in which the test questions and training data differ, we find that our model is notably proficient at certain forms of generalisation, but notably weak at others. We further show that the model's ability to generalise improves markedly if it is trained to predict symbolic explanations for its answers. Altogether, we introduce and explore ways to both measure and induce stronger abstract reasoning in neural networks. Our freely-available dataset should motivate further progress in this direction.

Submitted 11 July, 2018; originally announced July 2018.

Comments: ICML 2018

arXiv:1806.02215 [pdf, other]

Spectral Inference Networks: Unifying Deep and Spectral Learning

Authors: David Pfau, Stig Petersen, Ashish Agarwal, David G. T. Barrett, Kimberly L. Stachenfeld

Abstract: We present Spectral Inference Networks, a framework for learning eigenfunctions of linear operators by stochastic optimization. Spectral Inference Networks generalize Slow Feature Analysis to generic symmetric operators, and are closely related to Variational Monte Carlo methods from computational physics. As such, they can be a powerful tool for unsupervised representation learning from video or graph-structured data. We cast training Spectral Inference Networks as a bilevel optimization problem, which allows for online learning of multiple eigenfunctions. We show results of training Spectral Inference Networks on problems in quantum mechanics and feature learning for videos on synthetic datasets. Our results demonstrate that Spectral Inference Networks accurately recover eigenfunctions of linear operators and can discover interpretable representations from video in a fully unsupervised manner.

Submitted 16 January, 2020; v1 submitted 6 June, 2018; originally announced June 2018.

Comments: Fixed typo in math in section 4

Journal ref: Seventh International Conference on Learning Representations (ICLR 2019)

arXiv:1804.10455 [pdf, ps, other]

doi 10.1088/1367-2630/aad14e

Nonlinear Zeeman Effects in the Cavity-Enhanced Emission of Polarised Photons

Authors: Thomas D. Barrett, Dustin Stuart, Oliver Barter, Axel Kuhn

Abstract: We theoretically and experimentally investigate nonlinear Zeeman effects within a polarised single-photon source that uses a single 87Rb atom strongly coupled to a high finesse optical cavity. The breakdown of the atomic hyperfine structure in the D2 transition manifold for intermediate strength magnetic fields is shown to result in asymmetric and, ultimately, inhibited operation of the polarised atom-photon interface. The coherence of the system is considered using Hong-Ou-Mandel interference of the emitted photons. This informs the next steps to be taken and the modelling of future implementations, based on feasible cavity designs operated in regimes minimising nonlinear Zeeman effects, is presented and shown to provide improved performance.

Submitted 27 April, 2018; originally announced April 2018.

Comments: 12 pages, 8 figures

Journal ref: New J. Phys. 20, 073030 (2018)

arXiv:1804.08663 [pdf, other]

A Discriminative Acoustic-Prosodic Approach for Measuring Local Entrainment

Authors: Megan M. Willi, Stephanie A. Borrie, Tyson S. Barrett, Ming Tu, Visar Berisha

Abstract: Acoustic-prosodic entrainment describes the tendency of humans to align or adapt their speech acoustics to each other in conversation. This alignment of spoken behavior has important implications for conversational success. However, modeling the subtle nature of entrainment in spoken dialogue continues to pose a challenge. In this paper, we propose a straightforward definition for local entrainment in the speech domain and operationalize an algorithm based on this: acoustic-prosodic features that capture entrainment should be maximally different between real conversations involving two partners and sham conversations generated by randomly mixing the speaking turns from the original two conversational partners. We propose an approach for measuring local entrainment that quantifies alignment of behavior on a turn-by-turn basis, projecting the differences between interlocutors' acoustic-prosodic features for a given turn onto a discriminative feature subspace that maximizes the difference between real and sham conversations. We evaluate the method using the derived features to drive a classifier aiming to predict an objective measure of conversational success (i.e., low versus high), on a corpus of task-oriented conversations. The proposed entrainment approach achieves 72% classification accuracy using a Naive Bayes classifier, outperforming three previously established approaches evaluated on the same conversational corpus.

Submitted 12 July, 2018; v1 submitted 23 April, 2018; originally announced April 2018.

arXiv:1803.10222 [pdf, ps, other]

doi 10.1088/2058-9565/aafaba

Multimode interferometry for entangling atoms in quantum networks

Authors: Thomas D. Barrett, Allison Rubenok, Dustin Stuart, Oliver Barter, Annemarie Holleczek, Jerome Dilley, Peter B. R. Nisbet-Jones, Konstantinos Poulios, Graham D. Marshall, Jeremy L. O'Brien, Alberto Politi, Jonathan C. F. Matthews, Axel Kuhn

Abstract: We bring together a cavity-enhanced light-matter interface with a multimode interferometer (MMI) integrated onto a photonic chip and demonstrate the potential of such hybrid systems to tailor distributed entanglement in a quantum network. The MMI is operated with pairs of narrowband photons produced a priori deterministically from a single 87Rb atom strongly coupled to a high-finesse optical cavity. Non-classical coincidences between photon detection events show no loss of coherence when interfering pairs of these photons through the MMI in comparison to the two-photon visibility directly measured using Hong-Ou-Mandel interference on a beam splitter. This demonstrates the ability of integrated multimode circuits to mediate the entanglement of remote stationary nodes in a quantum network interlinked by photonic qubits.

Submitted 29 November, 2018; v1 submitted 26 March, 2018; originally announced March 2018.

Comments: 10 pages, 5 figures

Journal ref: Thomas D Barrett et al 2019 Quantum Sci. Technol. 4 025008

Showing 1–50 of 59 results for author: Barrett, T