
Equation identification for fluid flows via physics-informed neural networks

Authors: Alexander New, Marisel Villafañe-Delgado, Charles Shugert

Abstract: Scientific machine learning (SciML) methods such as physics-informed neural networks (PINNs) are used to estimate parameters of interest from governing equations and small quantities of data. However, there has been little work in assessing how well PINNs perform for inverse problems across wide ranges of governing equations across the mathematical sciences. We present a new and challenging benchmark problem for inverse PINNs based on a parametric sweep of the 2D Burgers' equation with rotational flow. We show that a novel strategy that alternates between first- and second-order optimization proves superior to typical first-order strategies for estimating parameters. In addition, we propose a novel data-driven method to characterize PINN effectiveness in the inverse setting. PINNs' physics-informed regularization enables them to leverage small quantities of data more efficiently than the data-driven baseline. However, both PINNs and the baseline can fail to recover parameters for highly inviscid flows, motivating the need for further development of PINN methods.

Submitted 30 August, 2024; originally announced August 2024.

Comments: Published at ICML 2024 AI4Science: https://openreview.net/forum?id=XsvCLEYH3O

arXiv:2408.17255

Self-supervised learning for crystal property prediction via denoising

Authors: Alexander New, Nam Q. Le, Michael J. Pekala, Christopher D. Stiles

Abstract: Accurate prediction of the properties of crystalline materials is crucial for targeted discovery, and this prediction is increasingly done with data-driven models. However, for many properties of interest, the number of materials for which a specific property has been determined is much smaller than the number of known materials. To overcome this disparity, we propose a novel self-supervised learning (SSL) strategy for material property prediction. Our approach, crystal denoising self-supervised learning (CDSSL), pretrains predictive models (e.g., graph networks) with a pretext task based on recovering valid material structures when given perturbed versions of these structures. We demonstrate that CDSSL models out-perform models trained without SSL, across material types, properties, and dataset sizes.

Submitted 30 August, 2024; originally announced August 2024.

Comments: Published at ICML 2024 AI4Science: https://openreview.net/forum?id=yML9ufAEoV

arXiv:2311.16860

Data-efficient operator learning for solving high Mach number fluid flow problems

Authors: Noah Ford, Victor J. Leon, Honest Mrema, Jeffrey Gilbert, Alexander New

Abstract: We consider the problem of using SciML to predict solutions of high Mach fluid flows over irregular geometries. In this setting, data is limited, and so it is desirable for models to perform well in the low-data setting. We show that Neural Basis Functions (NBF), which learns a basis of behavior modes from the data and then uses this basis to make predictions, is more effective than a basis-unaware baseline model. In addition, we identify continuing challenges in the space of predicting solutions for this type of problem.

Submitted 4 December, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

arXiv:2311.00060

Ensemble models outperform single model uncertainties and predictions for operator-learning of hypersonic flows

Authors: Victor J. Leon, Noah Ford, Honest Mrema, Jeffrey Gilbert, Alexander New

Abstract: High-fidelity computational simulations and physical experiments of hypersonic flows are resource intensive. Training scientific machine learning (SciML) models on limited high-fidelity data offers one approach to rapidly predict behaviors for situations that have not been seen before. However, high-fidelity data is itself in limited quantity to validate all outputs of the SciML model in unexplored input space. As such, an uncertainty-aware SciML model is desired. The SciML model's output uncertainties could then be used to assess the reliability and confidence of the model's predictions. In this study, we extend a DeepONet using three different uncertainty quantification mechanisms: mean-variance estimation, evidential uncertainty, and ensembling. The uncertainty aware DeepONet models are trained and evaluated on the hypersonic flow around a blunt cone object with data generated via computational fluid dynamics over a wide range of Mach numbers and altitudes. We find that ensembling outperforms the other two uncertainty models in terms of minimizing error and calibrating uncertainty in both interpolative and extrapolative regimes.

Submitted 3 November, 2023; v1 submitted 31 October, 2023; originally announced November 2023.

Comments: This work was accepted after peer-review and presented at the 2023 NeurIPS Machine Learning and the Physical Sciences workshop. https://ml4physicalsciences.github.io/2023/

arXiv:2309.12323

Evaluating the diversity and utility of materials proposed by generative models

Authors: Alexander New, Michael Pekala, Elizabeth A. Pogue, Nam Q. Le, Janna Domenico, Christine D. Piatko, Christopher D. Stiles

Abstract: Generative machine learning models can use data generated by scientific modeling to create large quantities of novel material structures. Here, we assess how one state-of-the-art generative model, the physics-guided crystal generation model (PGCGM), can be used as part of the inverse design process. We show that the default PGCGM's input space is not smooth with respect to parameter variation, making material optimization difficult and limited. We also demonstrate that most generated structures are predicted to be thermodynamically unstable by a separate property-prediction model, partially due to out-of-domain data challenges. Our findings suggest how generative models might be improved to enable better inverse design.

Submitted 9 August, 2023; originally announced September 2023.

Comments: 12 pages, 9 figures. Published at SynS & ML @ ICML2023: https://openreview.net/forum?id=2ZYbmYTKoR

arXiv:2301.07799

doi 10.1016/j.neunet.2023.01.007

A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems

Authors: Megan M. Baker, Alexander New, Mario Aguilar-Simon, Ziad Al-Halah, Sébastien M. R. Arnold, Ese Ben-Iwhiwhu, Andrew P. Brna, Ethan Brooks, Ryan C. Brown, Zachary Daniels, Anurag Daram, Fabien Delattre, Ryan Dellana, Eric Eaton, Haotian Fu, Kristen Grauman, Jesse Hostetler, Shariq Iqbal, Cassandra Kent, Nicholas Ketz, Soheil Kolouri, George Konidaris, Dhireesha Kudithipudi, Erik Learned-Miller, Seungwon Lee, et al. (22 additional authors not shown)

Abstract: Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed. This critical gap may be addressed through the development of "Lifelong Learning" systems that are capable of 1) Continuous Learning, 2) Transfer and Adaptation, and 3) Scalability. Unfortunately, efforts to improve these capabilities are typically treated as distinct areas of research that are assessed independently, without regard to the impact of each separate capability on other aspects of the system. We instead propose a holistic approach, using a suite of metrics and an evaluation framework to assess Lifelong Learning in a principled way that is agnostic to specific domains or system techniques. Through five case studies, we show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems. We highlight how the proposed suite of metrics quantifies performance trade-offs present during Lifelong Learning system development - both the widely discussed Stability-Plasticity dilemma and the newly proposed relationship between Sample Efficient and Robust Learning. Further, we make recommendations for the formulation and use of metrics to guide the continuing development of Lifelong Learning systems and assess their progress in the future.

Submitted 18 January, 2023; originally announced January 2023.

Comments: To appear in Neural Networks

arXiv:2210.07880

Tunable Complexity Benchmarks for Evaluating Physics-Informed Neural Networks on Coupled Ordinary Differential Equations

Authors: Alexander New, Benjamin Eng, Andrea C. Timm, Andrew S. Gearhart

Abstract: In this work, we assess the ability of physics-informed neural networks (PINNs) to solve increasingly-complex coupled ordinary differential equations (ODEs). We focus on a pair of benchmarks: discretized partial differential equations and harmonic oscillators, each of which has a tunable parameter that controls its complexity. Even by varying network architecture and applying a state-of-the-art training method that accounts for "difficult" training regions, we show that PINNs eventually fail to produce correct solutions to these benchmarks as their complexity -- the number of equations and the size of time domain -- increases. We identify several reasons why this may be the case, including insufficient network capacity, poor conditioning of the ODEs, and high local curvature, as measured by the Laplacian of the PINN loss.

Submitted 14 October, 2022; originally announced October 2022.

Comments: Accepted at KGML-AAAI-22: https://sites.google.com/vt.edu/kgml-aaai-22

arXiv:2209.05245

Continual learning benefits from multiple sleep mechanisms: NREM, REM, and Synaptic Downscaling

Authors: Brian S. Robinson, Clare W. Lau, Alexander New, Shane M. Nichols, Erik C. Johnson, Michael Wolmetz, William G. Coon

Abstract: Learning new tasks and skills in succession without losing prior learning (i.e., catastrophic forgetting) is a computational challenge for both artificial and biological neural networks, yet artificial systems struggle to achieve parity with their biological analogues. Mammalian brains employ numerous neural operations in support of continual learning during sleep. These are ripe for artificial adaptation. Here, we investigate how modeling three distinct components of mammalian sleep together affects continual learning in artificial neural networks: (1) a veridical memory replay process observed during non-rapid eye movement (NREM) sleep; (2) a generative memory replay process linked to REM sleep; and (3) a synaptic downscaling process which has been proposed to tune signal-to-noise ratios and support neural upkeep. We find benefits from the inclusion of all three sleep components when evaluating performance on a continual learning CIFAR-100 image classification benchmark. Maximum accuracy improved during training and catastrophic forgetting was reduced during later tasks. While some catastrophic forgetting persisted over the course of network training, higher levels of synaptic downscaling lead to better retention of early tasks and further facilitated the recovery of early task accuracy during subsequent training. One key takeaway is that there is a trade-off at hand when considering the level of synaptic downscaling to use - more aggressive downscaling better protects early tasks, but less downscaling enhances the ability to learn new tasks. Intermediate levels can strike a balance with the highest overall accuracies during training. Overall, our results both provide insight into how to adapt sleep components to enhance artificial continual learning systems and highlight areas for future neuroscientific sleep research to further such systems.

Submitted 9 September, 2022; originally announced September 2022.

Comments: 9 pages, 12 figures, code available upon reasonable request. Corresponding author: William G. Coon (will.coon@jhuapl.edu)

arXiv:2208.01687

Neural Basis Functions for Accelerating Solutions to High Mach Euler Equations

Authors: David Witman, Alexander New, Hicham Alkendry, Honest Mrema

Abstract: We propose an approach to solving partial differential equations (PDEs) using a set of neural networks which we call Neural Basis Functions (NBF). This NBF framework is a novel variation of the POD DeepONet operator learning approach where we regress a set of neural networks onto a reduced order Proper Orthogonal Decomposition (POD) basis. These networks are then used in combination with a branch network that ingests the parameters of the prescribed PDE to compute a reduced order approximation to the PDE. This approach is applied to the steady state Euler equations for high speed flow conditions (mach 10-30) where we consider the 2D flow around a cylinder which develops a shock condition. We then use the NBF predictions as initial conditions to a high fidelity Computational Fluid Dynamics (CFD) solver (CFD++) to show faster convergence. Lessons learned for training and implementing this algorithm will be presented as well.

Submitted 2 August, 2022; originally announced August 2022.

Comments: Published at ICML 2022 AI for Science workshop: https://openreview.net/forum?id=dvqjD3peY5S

arXiv:2208.01684

Curvature-informed multi-task learning for graph networks

Authors: Alexander New, Michael J. Pekala, Nam Q. Le, Janna Domenico, Christine D. Piatko, Christopher D. Stiles

Abstract: Properties of interest for crystals and molecules, such as band gap, elasticity, and solubility, are generally related to each other: they are governed by the same underlying laws of physics. However, when state-of-the-art graph neural networks attempt to predict multiple properties simultaneously (the multi-task learning (MTL) setting), they frequently underperform a suite of single property predictors. This suggests graph networks may not be fully leveraging these underlying similarities. Here we investigate a potential explanation for this phenomenon: the curvature of each property's loss surface significantly varies, leading to inefficient learning. This difference in curvature can be assessed by looking at spectral properties of the Hessians of each property's loss function, which is done in a matrix-free manner via randomized numerical linear algebra. We evaluate our hypothesis on two benchmark datasets (Materials Project (MP) and QM8) and consider how these findings can inform the training of novel multi-task learning models.

Submitted 2 August, 2022; originally announced August 2022.

Comments: Published at the ICML 2022 AI for Science workshop: https://openreview.net/forum?id=m5RYtApKFOg

arXiv:2207.14378

Latent Properties of Lifelong Learning Systems

Authors: Corban Rivera, Chace Ashcraft, Alexander New, James Schmidt, Gautam Vallabha

Abstract: Creating artificial intelligence (AI) systems capable of demonstrating lifelong learning is a fundamental challenge, and many approaches and metrics have been proposed to analyze algorithmic properties. However, for existing lifelong learning metrics, algorithmic contributions are confounded by task and scenario structure. To mitigate this issue, we introduce an algorithm-agnostic explainable surrogate-modeling approach to estimate latent properties of lifelong learning algorithms. We validate the approach for estimating these properties via experiments on synthetic data. To validate the structure of the surrogate model, we analyze real performance data from a collection of popular lifelong learning approaches and baselines adapted for lifelong classification and lifelong reinforcement learning.

Submitted 28 July, 2022; originally announced July 2022.

Comments: Accepted at 1st Conference on Lifelong Learning Agents (CoLLAs) Workshop Track, 2022

arXiv:2203.07454

L2Explorer: A Lifelong Reinforcement Learning Assessment Environment

Authors: Erik C. Johnson, Eric Q. Nguyen, Blake Schreurs, Chigozie S. Ewulum, Chace Ashcraft, Neil M. Fendley, Megan M. Baker, Alexander New, Gautam K. Vallabha

Abstract: Despite groundbreaking progress in reinforcement learning for robotics, gameplay, and other complex domains, major challenges remain in applying reinforcement learning to the evolving, open-world problems often found in critical application spaces. Reinforcement learning solutions tend to generalize poorly when exposed to new tasks outside of the data distribution they are trained on, prompting an interest in continual learning algorithms. In tandem with research on continual learning algorithms, there is a need for challenge environments, carefully designed experiments, and metrics to assess research progress. We address the latter need by introducing a framework for continual reinforcement-learning development and assessment using Lifelong Learning Explorer (L2Explorer), a new, Unity-based, first-person 3D exploration environment that can be continuously reconfigured to generate a range of tasks and task variants structured into complex and evolving evaluation curricula. In contrast to procedurally generated worlds with randomized components, we have developed a systematic approach to defining curricula in response to controlled changes with accompanying metrics to assess transfer, performance recovery, and data efficiency. Taken together, the L2Explorer environment and evaluation approach provides a framework for developing future evaluation methodologies in open-world settings and rigorously evaluating approaches to lifelong learning.

Submitted 14 March, 2022; originally announced March 2022.

Comments: 10 Pages submitted to AAAI AI for Open Worlds Symposium 2022

arXiv:2201.08278

Lifelong Learning Metrics

Authors: Alexander New, Megan Baker, Eric Nguyen, Gautam Vallabha

Abstract: The DARPA Lifelong Learning Machines (L2M) program seeks to yield advances in artificial intelligence (AI) systems so that they are capable of learning (and improving) continuously, leveraging data on one task to improve performance on another, and doing so in a computationally sustainable way. Performers on this program developed systems capable of performing a diverse range of functions, including autonomous driving, real-time strategy, and drone simulation. These systems featured a diverse range of characteristics (e.g., task structure, lifetime duration), and an immediate challenge faced by the program's testing and evaluation team was measuring system performance across these different settings. This document, developed in close collaboration with DARPA and the program performers, outlines a formalism for constructing and characterizing the performance of agents performing lifelong learning scenarios.

Submitted 20 January, 2022; originally announced January 2022.

arXiv:1811.11190

Semantically-aware population health risk analyses

Authors: Alexander New, Sabbir M. Rashid, John S. Erickson, Deborah L. McGuinness, Kristin P. Bennett

Abstract: One primary task of population health analysis is the identification of risk factors that, for some subpopulation, have a significant association with some health condition. Examples include finding lifestyle factors associated with chronic diseases and finding genetic mutations associated with diseases in precision health. We develop a combined semantic and machine learning system that uses a health risk ontology and knowledge graph (KG) to dynamically discover risk factors and their associated subpopulations. Semantics and the novel supervised cadre model make our system explainable. Future population health studies are easily performed and documented with provenance by specifying additional input and output KG cartridges.

Submitted 27 November, 2018; originally announced November 2018.

Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:cs/0101200

arXiv:1808.04880

A Precision Environment-Wide Association Study of Hypertension via Supervised Cadre Models

Authors: Alexander New, Kristin P. Bennett

Abstract: We consider the problem in precision health of grouping people into subpopulations based on their degree of vulnerability to a risk factor. These subpopulations cannot be discovered with traditional clustering techniques because their quality is evaluated with a supervised metric: the ease of modeling a response variable over observations within them. Instead, we apply the supervised cadre model (SCM), which does use this metric. We extend the SCM formalism so that it may be applied to multivariate regression and binary classification problems. We also develop a way to use conditional entropy to assess the confidence in the process by which a subject is assigned their cadre. Using the SCM, we generalize the environment-wide association study (EWAS) workflow to be able to model heterogeneity in population risk. In our EWAS, we consider more than two hundred environmental exposure factors and find their association with diastolic blood pressure, systolic blood pressure, and hypertension. This requires adapting the SCM to be applicable to data generated by a complex survey design. After correcting for false positives, we found 25 exposure variables that had a significant association with at least one of our response variables. Eight of these were significant for a discovered subpopulation but not for the overall population. Some of these associations have been identified by previous researchers, while others appear to be novel. We examine several discovered subpopulations in detail, and we find that they are interpretable and that they suggest further research questions.

Submitted 9 December, 2018; v1 submitted 14 August, 2018; originally announced August 2018.

Comments: 9 pages, 5 figures

arXiv:1802.02500

doi 10.1109/IJCNN.2018.8489618

Cadre Modeling: Simultaneously Discovering Subpopulations and Predictive Models

Authors: Alexander New, Curt Breneman, Kristin P. Bennett

Abstract: We consider the problem in regression analysis of identifying subpopulations that exhibit different patterns of response, where each subpopulation requires a different underlying model. Unlike statistical cohorts, these subpopulations are not known a priori; thus, we refer to them as cadres. When the cadres and their associated models are interpretable, modeling leads to insights about the subpopulations and their associations with the regression target. We introduce a discriminative model that simultaneously learns cadre assignment and target-prediction rules. Sparsity-inducing priors are placed on the model parameters, under which independent feature selection is performed for both the cadre assignment and target-prediction processes. We learn models using adaptive step size stochastic gradient descent, and we assess cadre quality with bootstrapped sample analysis. We present simulated results showing that, when the true clustering rule does not depend on the entire set of features, our method significantly outperforms methods that learn subpopulation-discovery and target-prediction rules separately. In a materials-by-design case study, our model provides state-of-the-art prediction of polymer glass transition temperature. Importantly, the method identifies cadres of polymers that respond differently to structural perturbations, thus providing design insight for targeting or avoiding specific transition temperature ranges. It identifies chemically meaningful cadres, each with interpretable models. Further experimental results show that cadre methods have generalization that is competitive with linear and nonlinear regression models and can identify robust subpopulations.

Submitted 23 October, 2018; v1 submitted 7 February, 2018; originally announced February 2018.

Comments: 8 pages, 6 figures

Journal ref: In 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 2018

Showing 1–16 of 16 results for author: New, A