Search | arXiv e-print repository

Band-gap regression with architecture-optimized message-passing neural networks

Authors: Tim Bechtel, Daniel T. Speckhard, Jonathan Godwin, Claudia Draxl

Abstract: Graph-based neural networks and, specifically, message-passing neural networks (MPNNs) have shown great potential in predicting physical properties of solids. In this work, we train an MPNN to first classify materials through density functional theory data from the AFLOW database as being metallic or semiconducting/insulating. We then perform a neural-architecture search to explore the model architecture and hyperparameter space of MPNNs to predict the band gaps of the materials identified as non-metals. The parameters in the search include the number of message-passing steps, latent size, and activation-function, among others. The top-performing models from the search are pooled into an ensemble that significantly outperforms existing models from the literature. Uncertainty quantification is evaluated with Monte-Carlo Dropout and ensembling, with the ensemble method proving superior. The domain of applicability of the ensemble model is analyzed with respect to the crystal systems, the inclusion of a Hubbard parameter in the density functional calculations, and the atomic species building up the materials.

Submitted 12 September, 2023; originally announced September 2023.
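
As a rough illustration of the ensembling idea mentioned in the abstract above, the sketch below averages the predictions of several models and uses their spread as an uncertainty estimate. The function name `ensemble_predict`, the choice of standard deviation as the uncertainty measure, and the toy numbers are all assumptions made for illustration; the paper's models, data, and exact uncertainty measure are not reproduced here.

```python
from statistics import mean, pstdev


def ensemble_predict(member_predictions):
    """Combine per-model band-gap predictions (in eV) for one material.

    Returns (ensemble mean, population standard deviation across members).
    Using the spread as an uncertainty proxy is an assumption of this sketch.
    """
    if not member_predictions:
        raise ValueError("need at least one ensemble member")
    return mean(member_predictions), pstdev(member_predictions)


# Toy predictions from three hypothetical ensemble members.
gap, uncertainty = ensemble_predict([1.0, 1.2, 1.4])
```

A larger spread between members would flag a material on which the ensemble disagrees, which is one simple way to read an ensemble as an uncertainty estimator.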

arXiv:2209.12466 [pdf, other]

Learned Force Fields Are Ready For Ground State Catalyst Discovery

Authors: Michael Schaarschmidt, Morgane Riviere, Alex M. Ganose, James S. Spencer, Alexander L. Gaunt, James Kirkpatrick, Simon Axelrod, Peter W. Battaglia, Jonathan Godwin

Abstract: We present evidence that learned density functional theory ("DFT") force fields are ready for ground state catalyst discovery. Our key finding is that relaxation using forces from a learned potential yields structures with similar or lower energy to those relaxed using the RPBE functional in over 50% of evaluated systems, despite the fact that the predicted forces differ significantly from the ground truth. This has the surprising implication that learned potentials may be ready for replacing DFT in challenging catalytic systems such as those found in the Open Catalyst 2020 dataset. Furthermore, we show that a force field trained on a locally harmonic energy surface with the same minima as a target DFT energy is also able to find lower or similar energy structures in over 50% of cases. This "Easy Potential" converges in fewer steps than a standard model trained on true energies and forces, which further accelerates calculations. Its success illustrates a key point: learned potentials can locate energy minima even when the model has high force errors. The main requirement for structure optimisation is simply that the learned potential has the correct minima. Since learned potentials are fast and scale linearly with system size, our results open the possibility of quickly finding ground states for large systems.

Submitted 26 September, 2022; originally announced September 2022.

arXiv:2206.00133 [pdf, other]

Pre-training via Denoising for Molecular Property Prediction

Authors: Sheheryar Zaidi, Michael Schaarschmidt, James Martens, Hyunjik Kim, Yee Whye Teh, Alvaro Sanchez-Gonzalez, Peter Battaglia, Razvan Pascanu, Jonathan Godwin

Abstract: Many important problems involving molecular property prediction from 3D structures have limited data, posing a generalization challenge for neural networks. In this paper, we describe a pre-training technique based on denoising that achieves a new state-of-the-art in molecular property prediction by utilizing large datasets of 3D molecular structures at equilibrium to learn meaningful representations for downstream tasks. Relying on the well-known link between denoising autoencoders and score-matching, we show that the denoising objective corresponds to learning a molecular force field -- arising from approximating the Boltzmann distribution with a mixture of Gaussians -- directly from equilibrium structures. Our experiments demonstrate that using this pre-training objective significantly improves performance on multiple benchmarks, achieving a new state-of-the-art on the majority of targets in the widely used QM9 dataset. Our analysis then provides practical insights into the effects of different factors -- dataset sizes, model size and architecture, and the choice of upstream and downstream datasets -- on pre-training.

Submitted 24 October, 2022; v1 submitted 31 May, 2022; originally announced June 2022.

arXiv:2201.05647 [pdf, other]

Tools and Practices for Responsible AI Engineering

Authors: Ryan Soklaski, Justin Goodwin, Olivia Brown, Michael Yee, Jason Matterer

Abstract: Responsible Artificial Intelligence (AI) - the practice of developing, evaluating, and maintaining accurate AI systems that also exhibit essential properties such as robustness and explainability - represents a multifaceted challenge that often stretches standard machine learning tooling, frameworks, and testing methods beyond their limits. In this paper, we present two new software libraries - hydra-zen and the rAI-toolbox - that address critical needs for responsible AI engineering. hydra-zen dramatically simplifies the process of making complex AI applications configurable, and their behaviors reproducible. The rAI-toolbox is designed to enable methods for evaluating and enhancing the robustness of AI-models in a way that is scalable and that composes naturally with other popular ML frameworks. We describe the design principles and methodologies that make these tools effective, including the use of property-based testing to bolster the reliability of the tools themselves. Finally, we demonstrate the composability and flexibility of the tools by showing how various use cases from adversarial robustness and explainable AI can be concisely implemented with familiar APIs.

Submitted 14 January, 2022; originally announced January 2022.

arXiv:2112.15275 [pdf, other]

Learned Coarse Models for Efficient Turbulence Simulation

Authors: Kimberly Stachenfeld, Drummond B. Fielding, Dmitrii Kochkov, Miles Cranmer, Tobias Pfaff, Jonathan Godwin, Can Cui, Shirley Ho, Peter Battaglia, Alvaro Sanchez-Gonzalez

Abstract: Turbulence simulation with classical numerical solvers requires high-resolution grids to accurately resolve dynamics. Here we train learned simulators at low spatial and temporal resolutions to capture turbulent dynamics generated at high resolution. We show that our proposed model can simulate turbulent dynamics more accurately than classical numerical solvers at the comparably low resolutions across various scientifically relevant metrics. Our model is trained end-to-end from data and is capable of learning a range of challenging chaotic and turbulent dynamics at low resolution, including trajectories generated by the state-of-the-art Athena++ engine. We show that our simpler, general-purpose architecture outperforms various more specialized, turbulence-specific architectures from the learned turbulence simulation literature. In general, we see that learned simulators yield unstable trajectories; however, we show that tuning training noise and temporal downsampling solves this problem. We also find that while generalization beyond the training distribution is a challenge for learned models, training noise, added loss constraints, and dataset augmentation can help. Broadly, we conclude that our learned simulator outperforms traditional solvers run on coarser grids, and emphasize that simple design choices can offer stability and robust generalization.

Submitted 22 April, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

Journal ref: (2022) International Conference on Learning Representations

arXiv:2112.02958 [pdf, other]

Automap: Towards Ergonomic Automated Parallelism for ML Models

Authors: Michael Schaarschmidt, Dominik Grewe, Dimitrios Vytiniotis, Adam Paszke, Georg Stefan Schmid, Tamara Norman, James Molloy, Jonathan Godwin, Norman Alexander Rink, Vinod Nair, Dan Belov

Abstract: The rapid rise in demand for training large neural network architectures has brought into focus the need for partitioning strategies, for example by using data, model, or pipeline parallelism. Implementing these methods is increasingly supported through program primitives, but identifying efficient partitioning strategies requires expensive experimentation and expertise. We present the prototype of an automated partitioner that seamlessly integrates into existing compilers and existing user workflows. Our partitioner enables SPMD-style parallelism that encompasses data parallelism and parameter/activation sharding. Through a combination of inductive tactics and search in a platform-independent partitioning IR, automap can recover expert partitioning strategies such as Megatron sharding for transformer layers.

Submitted 6 December, 2021; originally announced December 2021.

Comments: Workshop on ML for Systems at NeurIPS 2021

arXiv:2107.09422 [pdf, other]

Large-scale graph representation learning with very deep GNNs and self-supervision

Authors: Ravichandra Addanki, Peter W. Battaglia, David Budden, Andreea Deac, Jonathan Godwin, Thomas Keck, Wai Lok Sibon Li, Alvaro Sanchez-Gonzalez, Jacklynn Stott, Shantanu Thakoor, Petar Veličković

Abstract: Effectively and efficiently deploying graph neural networks (GNNs) at scale remains one of the most challenging aspects of graph representation learning. Many powerful solutions have only ever been validated on comparatively small datasets, often with counter-intuitive outcomes -- a barrier which has been broken by the Open Graph Benchmark Large-Scale Challenge (OGB-LSC). We entered the OGB-LSC with two large-scale GNNs: a deep transductive node classifier powered by bootstrapping, and a very deep (up to 50-layer) inductive graph regressor regularised by denoising objectives. Our models achieved an award-level (top-3) performance on both the MAG240M and PCQM4M benchmarks. In doing so, we demonstrate evidence of scalable self-supervised graph representation learning, and utility of very deep GNNs -- both very important open issues. Our code is publicly available at: https://github.com/deepmind/deepmind-research/tree/master/ogb_lsc.

Submitted 20 July, 2021; originally announced July 2021.

Comments: To appear at KDD Cup 2021. 13 pages, 3 figures. All authors contributed equally

arXiv:2107.02868 [pdf]

Principles for Evaluation of AI/ML Model Performance and Robustness

Authors: Olivia Brown, Andrew Curtis, Justin Goodwin

Abstract: The Department of Defense (DoD) has significantly increased its investment in the design, evaluation, and deployment of Artificial Intelligence and Machine Learning (AI/ML) capabilities to address national security needs. While there are numerous AI/ML successes in the academic and commercial sectors, many of these systems have also been shown to be brittle and nonrobust. In a complex and ever-changing national security environment, it is vital that the DoD establish a sound and methodical process to evaluate the performance and robustness of AI/ML models before these new capabilities are deployed to the field. This paper reviews the AI/ML development process, highlights common best practices for AI/ML model evaluation, and makes recommendations to DoD evaluators to ensure the deployment of robust AI/ML capabilities for national security needs.

Submitted 6 July, 2021; originally announced July 2021.

arXiv:2106.07971 [pdf, other]

Simple GNN Regularisation for 3D Molecular Property Prediction & Beyond

Authors: Jonathan Godwin, Michael Schaarschmidt, Alexander Gaunt, Alvaro Sanchez-Gonzalez, Yulia Rubanova, Petar Veličković, James Kirkpatrick, Peter Battaglia

Abstract: In this paper we show that simple noise regularisation can be an effective way to address GNN oversmoothing. First we argue that regularisers addressing oversmoothing should both penalise node latent similarity and encourage meaningful node representations. From this observation we derive "Noisy Nodes", a simple technique in which we corrupt the input graph with noise, and add a noise correcting node-level loss. The diverse node level loss encourages latent node diversity, and the denoising objective encourages graph manifold learning. Our regulariser applies well-studied methods in simple, straightforward ways which allow even generic architectures to overcome oversmoothing and achieve state of the art results on quantum chemistry tasks, and improve results significantly on Open Graph Benchmark (OGB) datasets. Our results suggest Noisy Nodes can serve as a complementary building block in the GNN toolkit.

Submitted 15 March, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

Comments: ICLR 2022 Camera Ready
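
The abstract above describes corrupting the input graph with noise and adding a noise-correcting node-level loss. A minimal sketch of that recipe might look as follows, assuming the corrupted features are 3D node positions, the noise is Gaussian, and the per-node target is the displacement back to the clean position. The helper names `corrupt_positions` and `node_denoising_loss` are hypothetical, and how the auxiliary loss is weighted against the main objective is not shown.

```python
import random


def corrupt_positions(positions, sigma, rng):
    """Add Gaussian noise to every node coordinate.

    Returns the noisy positions and the per-node denoising targets,
    defined here (as an assumption) as the displacement back to the
    clean position, i.e. the negated noise.
    """
    noisy, targets = [], []
    for pos in positions:
        noise = [rng.gauss(0.0, sigma) for _ in pos]
        noisy.append([p + n for p, n in zip(pos, noise)])
        targets.append([-n for n in noise])
    return noisy, targets


def node_denoising_loss(predicted, targets):
    """Mean squared error over all nodes and coordinates."""
    squared = [
        (p - t) ** 2
        for pred_vec, target_vec in zip(predicted, targets)
        for p, t in zip(pred_vec, target_vec)
    ]
    return sum(squared) / len(squared)


rng = random.Random(0)
clean = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
noisy, targets = corrupt_positions(clean, sigma=0.1, rng=rng)
# A model that predicted the targets exactly would incur zero auxiliary loss.
perfect_loss = node_denoising_loss(targets, targets)
```

In training, a GNN would receive `noisy` as input and its per-node outputs would take the place of the first argument to `node_denoising_loss`.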

arXiv:2101.00079 [pdf, other]

Graph Networks with Spectral Message Passing

Authors: Kimberly Stachenfeld, Jonathan Godwin, Peter Battaglia

Abstract: Graph Neural Networks (GNNs) are the subject of intense focus by the machine learning community for problems involving relational reasoning. GNNs can be broadly divided into spatial and spectral approaches. Spatial approaches use a form of learned message-passing, in which interactions among vertices are computed locally, and information propagates over longer distances on the graph with greater numbers of message-passing steps. Spectral approaches use eigendecompositions of the graph Laplacian to produce a generalization of spatial convolutions to graph structured data which access information over short and long time scales simultaneously. Here we introduce the Spectral Graph Network, which applies message passing to both the spatial and spectral domains. Our model projects vertices of the spatial graph onto the Laplacian eigenvectors, which are each represented as vertices in a fully connected "spectral graph", and then applies learned message passing to them. We apply this model to various benchmark tasks including a graph-based variant of MNIST classification, molecular property prediction on MoleculeNet and QM9, and shortest path problems on random graphs. Our results show that the Spectral GN promotes efficient training, reaching high performance with fewer training iterations despite having more parameters. The model also provides robustness to edge dropout and outperforms baselines for the classification tasks. We also explore how these performance benefits depend on properties of the dataset.

Submitted 31 December, 2020; originally announced January 2021.

arXiv:2007.03832 [pdf, other]

Fast Training of Deep Neural Networks Robust to Adversarial Perturbations

Authors: Justin Goodwin, Olivia Brown, Victoria Helus

Abstract: Deep neural networks are capable of training fast and generalizing well within many domains. Despite their promising performance, deep networks have shown sensitivities to perturbations of their inputs (e.g., adversarial examples) and their learned feature representations are often difficult to interpret, raising concerns about their true capability and trustworthiness. Recent work in adversarial training, a form of robust optimization in which the model is optimized against adversarial examples, demonstrates the ability to improve performance sensitivities to perturbations and yield feature representations that are more interpretable. Adversarial training, however, comes with an increased computational cost over that of standard (i.e., nonrobust) training, rendering it impractical for use in large-scale problems. Recent work suggests that a fast approximation to adversarial training shows promise for reducing training time and maintaining robustness in the presence of perturbations bounded by the infinity norm. In this work, we demonstrate that this approach extends to the Euclidean norm and preserves the human-aligned feature representations that are common for robust models. Additionally, we show that using a distributed training scheme can further reduce the time to train robust deep networks. Fast adversarial training is a promising approach that will provide increased security and explainability in machine learning applications for which robust optimization was previously thought to be impractical.

Submitted 7 July, 2020; originally announced July 2020.

arXiv:2005.10310 [pdf, other]

doi 10.1109/PLANS46316.2020.9109931

Maplets: An Efficient Approach for Cooperative SLAM Map Building Under Communication and Computation Constraints

Authors: Kevin M. Brink, Jincheng Zhang, Andrew R. Willis, Ryan E. Sherrill, Jamie L. Godwin

Abstract: This article introduces an approach to facilitate cooperative exploration and mapping of large-scale, near-ground, underground, or indoor spaces via a novel integration framework for locally-dense agent map data. The effort targets limited Size, Weight, and Power (SWaP) agents with an emphasis on limiting required communications and redundant processing. The approach uses a unique organization of batch optimization engines to enable a highly efficient two-tier optimization structure. Tier I consists of agents that create and potentially share local maplets (local maps, limited in size) which are generated using Simultaneous Localization and Mapping (SLAM) map-building software and then marginalized to a more compact parameterization. Maplets are generated in an overlapping manner and used to estimate the transform and uncertainty between those overlapping maplets, providing accurate and compact odometry or delta-pose representation between the maplets' local frames. The delta poses can be shared between agents, and in cases where maplets have salient features (for loop closures), the compact representation of the maplet can also be shared. The second optimization tier consists of a global optimizer that seeks to optimize those maplet-to-maplet transformations, including any loop closures identified. This can provide an accurate global "skeleton" of the traversed space without operating on the high-density point cloud. This compact version of the map data allows for scalable, cooperative exploration with limited communication requirements where most of the individual maplets, or low fidelity renderings, are only shared if desired.

Submitted 20 May, 2020; originally announced May 2020.

arXiv:2005.10222 [pdf, other]

doi 10.1117/12.2558168

Compute-Bound and Low-Bandwidth Distributed 3D Graph-SLAM

Authors: Jincheng Zhang, Andrew R. Willis, Jamie Godwin

Abstract: This article describes a new approach for distributed 3D SLAM map building. The key contribution of this article is the creation of a distributed graph-SLAM map-building architecture responsive to bandwidth and computational needs of the robotic platform. Responsiveness is afforded by the integration of a 3D point cloud to plane cloud compression algorithm that approximates dense 3D point clouds using local planar patches. Compute-bound platforms may restrict the computational duration of the compression algorithm and low-bandwidth platforms can restrict the size of the compression result. The backbone of the approach is an ultra-fast adaptive 3D compression algorithm that transforms swaths of 3D planar surface data into planar patches attributed with image textures. Our approach uses DVO SLAM, a leading algorithm for 3D mapping, and extends it by computationally isolating map integration tasks from local Guidance, Navigation, and Control tasks and includes an addition of a network protocol to share the compressed plane clouds. The joint effect of these contributions allows agents with 3D sensing capabilities to calculate and communicate compressed map information commensurate with their onboard computational resources and communication channel capacities. This opens SLAM mapping to new categories of robotic platforms that may have computational and memory limits that prohibit other SLAM solutions.

Submitted 20 May, 2020; originally announced May 2020.

arXiv:2002.09405 [pdf, other]

Learning to Simulate Complex Physics with Graph Networks

Authors: Alvaro Sanchez-Gonzalez, Jonathan Godwin, Tobias Pfaff, Rex Ying, Jure Leskovec, Peter W. Battaglia

Abstract: Here we present a machine learning framework and model implementation that can learn to simulate a wide variety of challenging physical domains, involving fluids, rigid solids, and deformable materials interacting with one another. Our framework, which we term "Graph Network-based Simulators" (GNS), represents the state of a physical system with particles, expressed as nodes in a graph, and computes dynamics via learned message-passing. Our results show that our model can generalize from single-timestep predictions with thousands of particles during training, to different initial conditions, thousands of timesteps, and at least an order of magnitude more particles at test time. Our model was robust to hyperparameter choices across various evaluation metrics: the main determinants of long-term performance were the number of message-passing steps, and mitigating the accumulation of error by corrupting the training data with noise. Our GNS framework advances the state-of-the-art in learned physical simulation, and holds promise for solving a wide range of complex forward and inverse problems.

Submitted 14 September, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

Comments: Accepted at ICML 2020

arXiv:2001.11062 [pdf, other]

Safe Predictors for Enforcing Input-Output Specifications

Authors: Stephen Mell, Olivia Brown, Justin Goodwin, Sung-Hyun Son

Abstract: We present an approach for designing correct-by-construction neural networks (and other machine learning models) that are guaranteed to be consistent with a collection of input-output specifications before, during, and after algorithm training. Our method involves designing a constrained predictor for each set of compatible constraints, and combining them safely via a convex combination of their predictions. We demonstrate our approach on synthetic datasets and an aircraft collision avoidance problem.

Submitted 29 January, 2020; originally announced January 2020.

Comments: 10 pages, 5 figures, paper accepted to the NeurIPS 2019 Workshop on Machine Learning with Guarantees and the NeurIPS 2019 Workshop on Safety and Robustness in Decision Making
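
To make the "convex combination of constrained predictors" idea from the abstract above concrete, here is a deliberately simplified one-dimensional sketch. It assumes a single specification ("for inputs in `region`, the output must lie in `bounds`"), enforces it by clamping, and blends the clamped and unclamped predictions with an input-dependent weight that equals 1 inside the region. The function names, the clamping step, and the weighting rule are illustrative assumptions, not the construction used in the paper.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def safe_predict(x, base_model, region, bounds):
    """Blend a constrained and an unconstrained prediction.

    region: (x_min, x_max) where the specification applies (assumed 1-D).
    bounds: (y_min, y_max) required of the output inside that region.
    """
    y = base_model(x)
    constrained = clamp(y, *bounds)
    # Weight is 1 inside the region and decays with distance outside it,
    # so the output is always a convex combination of the two predictions.
    distance = max(region[0] - x, 0.0, x - region[1])
    w = 1.0 / (1.0 + distance)
    return w * constrained + (1.0 - w) * y


def model(x):
    """Hypothetical stand-in for a trained network."""
    return 10.0 * x


inside = safe_predict(0.5, model, region=(0.0, 1.0), bounds=(-1.0, 1.0))
outside = safe_predict(3.0, model, region=(0.0, 1.0), bounds=(-1.0, 1.0))
```

Because the weight is exactly 1 inside the region, the specification holds there regardless of the parameters of `model`, which is the sense in which such a predictor is consistent with its specification before, during, and after training.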

arXiv:1906.03164 [pdf, other]

Kernelized Capsule Networks

Authors: Taylor Killian, Justin Goodwin, Olivia Brown, Sung-Hyun Son

Abstract: Capsule Networks attempt to represent patterns in images in a way that preserves hierarchical spatial relationships. Additionally, research has demonstrated that these techniques may be robust against adversarial perturbations. We present an improvement to training capsule networks with added robustness via non-parametric kernel methods. The representations learned through the capsule network are used to construct covariance kernels for Gaussian processes (GPs). We demonstrate that this approach achieves comparable prediction performance to Capsule Networks while improving robustness to adversarial perturbations and providing a meaningful measure of uncertainty that may aid in the detection of adversarial inputs.

Submitted 7 June, 2019; originally announced June 2019.

Comments: Paper accepted to the ICML 2019 Workshop on Understanding and Improving Generalization in Deep Learning

arXiv:1905.03592 [pdf]

AI Enabling Technologies: A Survey

Authors: Vijay Gadepally, Justin Goodwin, Jeremy Kepner, Albert Reuther, Hayley Reynolds, Siddharth Samsi, Jonathan Su, David Martinez

Abstract: Artificial Intelligence (AI) has the opportunity to revolutionize the way the United States Department of Defense (DoD) and Intelligence Community (IC) address the challenges of evolving threats, data deluge, and rapid courses of action. Developing an end-to-end artificial intelligence system involves parallel development of different pieces that must work together in order to provide capabilities that can be used by decision makers, warfighters and analysts. These pieces include data collection, data conditioning, algorithms, computing, robust artificial intelligence, and human-machine teaming. While much of the popular press today surrounds advances in algorithms and computing, most modern AI systems leverage advances across numerous different fields. Further, while certain components may not be as visible to end-users as others, our experience has shown that each of these interrelated components plays a major role in the success or failure of an AI system. This article is meant to highlight many of these technologies that are involved in an end-to-end AI system. The goal of this article is to provide readers with an overview of terminology, technical details and recent highlights from academia, industry and government. Where possible, we indicate relevant resources that can be used for further reading and understanding.

Submitted 8 May, 2019; originally announced May 2019.

arXiv:1811.10714 [pdf, other]

Learning Robust Representations for Automatic Target Recognition

Authors: Justin A. Goodwin, Olivia M. Brown, Taylor W. Killian, Sung-Hyun Son

Abstract: Radio frequency (RF) sensors are used alongside other sensing modalities to provide rich representations of the world. Given the high variability of complex-valued target responses, RF systems are susceptible to attacks masking true target characteristics from accurate identification. In this work, we evaluate different techniques for building robust classification architectures exploiting learned physical structure in received synthetic aperture radar signals of simulated 3D targets.

Submitted 26 November, 2018; originally announced November 2018.

arXiv:1612.09113 [pdf, other]

Deep Semi-Supervised Learning with Linguistically Motivated Sequence Labeling Task Hierarchies

Authors: Jonathan Godwin, Pontus Stenetorp, Sebastian Riedel

Abstract: In this paper we present a novel Neural Network algorithm for conducting semi-supervised learning for sequence labeling tasks arranged in a linguistically motivated hierarchy. This relationship is exploited to regularise the representations of supervised tasks by backpropagating the error of the unsupervised task through the supervised tasks. We introduce a neural network where lower layers are supervised by junior downstream tasks and the final layer task is an auxiliary unsupervised task. The architecture shows improvements of up to two percentage points F1 for Chunking compared to a plausible baseline.

Submitted 29 December, 2016; originally announced December 2016.

arXiv:1304.1495 [pdf]

Uncertainty and Incompleteness

Authors: Piero P. Bonissone, David A. Cyrluk, James W. Goodwin, Jonathan Stillman

Abstract: Two major difficulties in using default logics are their intractability and the problem of selecting among multiple extensions. We propose an approach to these problems based on integrating nonmonotonic reasoning with plausible reasoning based on triangular norms. A previously proposed system for reasoning with uncertainty (RUM) performs uncertain monotonic inferences on an acyclic graph. We have extended RUM to allow nonmonotonic inferences and cycles within nonmonotonic rules. By restricting the size and complexity of the nonmonotonic cycles we can still perform efficient inferences. Uncertainty measures provide a basis for deciding among multiple defaults. Different algorithms and heuristics for finding the optimal defaults are discussed.

Submitted 27 March, 2013; originally announced April 2013.

Comments: Appears in Proceedings of the Fifth Conference on Uncertainty in Artificial Intelligence (UAI1989)

Report number: UAI-P-1989-PG-34-45

Showing 1–20 of 20 results for author: Goodwin, J