-
Gemini: Dynamic Bias Correction for Autonomous Experimentation and Molecular Simulation
Authors:
Riley J. Hickman,
Florian Häse,
Loïc M. Roch,
Alán Aspuru-Guzik
Abstract:
Bayesian optimization has emerged as a powerful strategy to accelerate scientific discovery by means of autonomous experimentation. However, expensive measurements are required to accurately estimate materials properties, and can quickly become a hindrance to exhaustive materials discovery campaigns. Here, we introduce Gemini: a data-driven model capable of using inexpensive measurements as proxies for expensive measurements by correcting systematic biases between property evaluation methods. We recommend using Gemini for regression tasks with sparse data and in an autonomous workflow setting where its predictions of expensive-to-evaluate objectives can be used to construct a more informative acquisition function, thus reducing the number of expensive evaluations an optimizer needs to achieve desired target values. In a regression setting, we showcase the ability of our method to make accurate predictions of DFT calculated bandgaps of hybrid organic-inorganic perovskite materials. We further demonstrate the benefits that Gemini provides to autonomous workflows by augmenting the Bayesian optimizer Phoenics to yield a scalable optimization framework leveraging multiple sources of measurement. Finally, we simulate an autonomous materials discovery platform for optimizing the activity of electrocatalysts for the oxygen evolution reaction. Realizing autonomous workflows with Gemini, we show that the number of measurements of a composition space comprising expensive and rare metals needed to achieve a target overpotential is significantly reduced when measurements from a proxy composition system with less expensive metals are available.
Submitted 4 March, 2021;
originally announced March 2021.
-
Olympus: a benchmarking framework for noisy optimization and experiment planning
Authors:
Florian Häse,
Matteo Aldeghi,
Riley J. Hickman,
Loïc M. Roch,
Melodie Christensen,
Elena Liles,
Jason E. Hein,
Alán Aspuru-Guzik
Abstract:
Research challenges encountered across science, engineering, and economics can frequently be formulated as optimization tasks. In chemistry and materials science, recent growth in laboratory digitization and automation has sparked interest in optimization-guided autonomous discovery and closed-loop experimentation. Experiment planning strategies based on off-the-shelf optimization algorithms can be employed in fully autonomous research platforms to achieve desired experimentation goals with the minimum number of trials. However, the experiment planning strategy that is most suitable to a scientific discovery task is a priori unknown, while rigorous comparisons of different strategies are highly time and resource demanding. As optimization algorithms are typically benchmarked on low-dimensional synthetic functions, it is unclear how their performance would translate to noisy, higher-dimensional experimental tasks encountered in chemistry and materials science. We introduce Olympus, a software package that provides a consistent and easy-to-use framework for benchmarking optimization algorithms against realistic experiments emulated via probabilistic deep-learning models. Olympus includes a collection of experimentally derived benchmark sets from chemistry and materials science and a suite of experiment planning strategies that can be easily accessed via a user-friendly Python interface. Furthermore, Olympus facilitates the integration, testing, and sharing of custom algorithms and user-defined datasets. In brief, Olympus mitigates the barriers associated with benchmarking optimization algorithms on realistic experimental scenarios, promoting data sharing and the creation of a standard framework for evaluating the performance of experiment planning strategies.
Submitted 30 March, 2021; v1 submitted 8 October, 2020;
originally announced October 2020.
-
Gryffin: An algorithm for Bayesian optimization of categorical variables informed by expert knowledge
Authors:
Florian Häse,
Matteo Aldeghi,
Riley J. Hickman,
Loïc M. Roch,
Alán Aspuru-Guzik
Abstract:
Designing functional molecules and advanced materials requires complex design choices: tuning continuous process parameters such as temperatures or flow rates, while simultaneously selecting catalysts or solvents. To date, the development of data-driven experiment planning strategies for autonomous experimentation has largely focused on continuous process parameters despite the urge to devise efficient strategies for the selection of categorical variables. Here, we introduce Gryffin, a general purpose optimization framework for the autonomous selection of categorical variables driven by expert knowledge. Gryffin augments Bayesian optimization based on kernel density estimation with smooth approximations to categorical distributions. Leveraging domain knowledge in the form of physicochemical descriptors, Gryffin can significantly accelerate the search for promising molecules and materials. Gryffin can further highlight relevant correlations between the provided descriptors to inspire physical insights and foster scientific intuition. In addition to comprehensive benchmarks, we demonstrate the capabilities and performance of Gryffin on three examples in materials science and chemistry: (i) the discovery of non-fullerene acceptors for organic solar cells, (ii) the design of hybrid organic-inorganic perovskites for light harvesting, and (iii) the identification of ligands and process parameters for Suzuki-Miyaura reactions. Our results suggest that Gryffin, in its simplest form, is competitive with state-of-the-art categorical optimization algorithms. However, when leveraging domain knowledge provided via descriptors, Gryffin outperforms other approaches while simultaneously refining this domain knowledge to promote scientific understanding.
Submitted 28 May, 2021; v1 submitted 26 March, 2020;
originally announced March 2020.
-
From absorption spectra to charge transfer in PEDOT nanoaggregates with machine learning
Authors:
Loïc M. Roch,
Semion K. Saikin,
Florian Häse,
Pascal Friederich,
Randall H. Goldsmith,
Salvador León,
Alán Aspuru-Guzik
Abstract:
Fast and inexpensive characterization of materials properties is a key element to discover novel functional materials. In this work, we suggest an approach employing three classes of Bayesian machine learning (ML) models to correlate electronic absorption spectra of nanoaggregates with the strength of intermolecular electronic couplings in organic conducting and semiconducting materials. As a specific model system, we consider PEDOT:PSS, a cornerstone material for organic electronic applications, and so analyze the couplings between charged dimers of closely packed PEDOT oligomers that are at the heart of the material's unrivaled conductivity. We demonstrate that ML algorithms can identify correlations between the coupling strengths and the electronic absorption spectra. We also show that ML models can be trained to be transferable across a broad range of spectral resolutions, and that the electronic couplings can be predicted from the simulated spectra with an 88% accuracy when ML models are used as classifiers. Although the ML models employed in this study were trained on data generated by a multi-scale computational workflow, they were able to leverage experimental data.
Submitted 24 September, 2019;
originally announced September 2019.
-
Beyond Ternary OPV: High-Throughput Experimentation and Self-Driving Laboratories Optimize Multi-Component Systems
Authors:
Stefan Langner,
Florian Häse,
José Darío Perea,
Tobias Stubhan,
Jens Hauch,
Loïc M. Roch,
Thomas Heumueller,
Alán Aspuru-Guzik,
Christoph J. Brabec
Abstract:
Fundamental advances to increase the efficiency as well as stability of organic photovoltaics (OPVs) are achieved by designing ternary blends, which represents a clear trend towards multi-component active layer blends. We report the development of high-throughput and autonomous experimentation methods for the effective optimization of multi-component polymer blends for OPVs. A method for automated film formation enabling the fabrication of up to 6048 films per day is introduced. Equipping this automated experimentation platform with Bayesian optimization, a self-driving laboratory is constructed that autonomously evaluates measurements to design and execute the next experiments. To demonstrate the potential of these methods, a four-dimensional parameter space of quaternary OPV blends is mapped and optimized for photo-stability. While with conventional approaches roughly 100 mg of material would be necessary, the robot-based platform can screen 2,000 combinations with less than 10 mg, and machine-learning-enabled autonomous experimentation identifies the stable compositions with less than 1 mg.
Submitted 24 September, 2019; v1 submitted 8 September, 2019;
originally announced September 2019.
-
Self-driving laboratory for accelerated discovery of thin-film materials
Authors:
Benjamin P. MacLeod,
Fraser G. L. Parlane,
Thomas D. Morrissey,
Florian Häse,
Loïc M. Roch,
Kevan E. Dettelbach,
Raphaell Moreira,
Lars P. E. Yunker,
Michael B. Rooney,
Joseph R. Deeth,
Veronica Lai,
Gordon J. Ng,
Henry Situ,
Ray H. Zhang,
Michael S. Elliott,
Ted H. Haley,
David J. Dvorak,
Alán Aspuru-Guzik,
Jason E. Hein,
Curtis P. Berlinguette
Abstract:
Discovering and optimizing commercially viable materials for clean energy applications typically takes over a decade. Self-driving laboratories that iteratively design, execute, and learn from material science experiments in a fully autonomous loop present an opportunity to accelerate this research. We report here a modular robotic platform driven by a model-based optimization algorithm capable of autonomously optimizing the optical and electronic properties of thin-film materials by modifying the film composition and processing conditions. We demonstrate this platform by using it to maximize the hole mobility of organic hole transport materials commonly used in perovskite solar cells and consumer electronics. This demonstration highlights the possibilities of using autonomous laboratories to discover organic and inorganic materials relevant to materials sciences and clean energy technologies.
Submitted 10 March, 2020; v1 submitted 12 June, 2019;
originally announced June 2019.
-
PHOENICS: A universal deep Bayesian optimizer
Authors:
Florian Häse,
Loïc M. Roch,
Christoph Kreisbeck,
Alán Aspuru-Guzik
Abstract:
In this work we introduce PHOENICS, a probabilistic global optimization algorithm combining ideas from Bayesian optimization with concepts from Bayesian kernel density estimation. We propose an inexpensive acquisition function balancing the explorative and exploitative behavior of the algorithm. This acquisition function enables intuitive sampling strategies for an efficient parallel search of global minima. The performance of PHOENICS is assessed via an exhaustive benchmark study on a set of 15 discrete, quasi-discrete and continuous multidimensional functions. Unlike optimization methods based on Gaussian processes (GP) and random forests (RF), we show that PHOENICS is less sensitive to the nature of the co-domain, and outperforms GP and RF optimizations. We illustrate the performance of PHOENICS on the Oregonator, a difficult case-study describing a complex chemical reaction network. We demonstrate that only PHOENICS was able to reproduce qualitatively and quantitatively the target dynamic behavior of this nonlinear reaction dynamics. We recommend PHOENICS for rapid optimization of scalar, possibly non-convex, black-box unknown objective functions.
Submitted 4 January, 2018;
originally announced January 2018.
-
Toward Accurate Adsorption Energetics on Clay Surfaces
Authors:
Andrea Zen,
Loïc M Roch,
Stephen J Cox,
Xiao L Hu,
Sandro Sorella,
Dario Alfè,
Angelos Michaelides
Abstract:
Clay minerals are ubiquitous in nature, and the manner in which they interact with their surroundings has important industrial and environmental implications. Consequently, a molecular-level understanding of the adsorption of molecules on clay surfaces is crucial. In this regard computer simulations play an important role, yet the accuracy of widely used empirical force fields (FF) and density functional theory (DFT) exchange-correlation functionals is often unclear in adsorption systems dominated by weak interactions. Herein we present results from quantum Monte Carlo (QMC) for water and methanol adsorption on the prototypical clay kaolinite. To the best of our knowledge, this is the first time QMC has been used to investigate adsorption at a complex, natural surface such as a clay. As well as being valuable in their own right, the QMC benchmarks obtained provide reference data against which the performance of cheaper DFT methods can be tested. Indeed using various DFT exchange-correlation functionals yields a very broad range of adsorption energies, and it is unclear a priori which evaluation is better. QMC reveals that in the systems considered here it is essential to account for van der Waals (vdW) dispersion forces since this alters both the absolute and relative adsorption energies of water and methanol. We show, via FF simulations, that incorrect relative energies can lead to significant changes in the interfacial densities of water and methanol solutions at the kaolinite interface. Despite the clear improvements offered by the vdW-corrected and the vdW-inclusive functionals, absolute adsorption energies are often overestimated, suggesting that the treatment of vdW forces in DFT is not yet a solved problem.
Submitted 17 November, 2016;
originally announced November 2016.