Search | arXiv e-print repository

Data-driven low-dimensional model of a sedimenting flexible fiber

Authors: Andrew J Fox, Michael D. Graham

Abstract: The dynamics of flexible filaments entrained in flow, important for understanding many biological and industrial processes, are computationally expensive to model with full-physics simulations. This work describes a data-driven technique to create high-fidelity low-dimensional models of flexible fiber dynamics using machine learning; the technique is applied to sedimentation in a quiescent, viscou… ▽ More The dynamics of flexible filaments entrained in flow, important for understanding many biological and industrial processes, are computationally expensive to model with full-physics simulations. This work describes a data-driven technique to create high-fidelity low-dimensional models of flexible fiber dynamics using machine learning; the technique is applied to sedimentation in a quiescent, viscous Newtonian fluid, using results from detailed simulations as the data set. The approach combines an autoencoder neural network architecture to learn a low-dimensional latent representation of the filament shape, with a neural ODE that learns the evolution of the particle in the latent state. The model was designed to model filaments of varying flexibility, characterized by an elasto-gravitational number $\mathcal{B}$, and was trained on a data set containing the evolution of fibers beginning at set angles of inclination. For the range of $\mathcal{B}$ considered here (100-10000), the filament shape dynamics can be represented with high accuracy with only four degrees of freedom, in contrast to the 93 present in the original bead-spring model used to generate the dynamic trajectories. We predict the evolution of fibers set at arbitrary angles and demonstrate that our data-driven model can accurately forecast the evolution of a fiber at both trained and untrained elasto-gravitational numbers. △ Less

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2312.10235 [pdf, other]

Building symmetries into data-driven manifold dynamics models for complex flows

Authors: Carlos E. Pérez De Jesús, Alec J. Linot, Michael D. Graham

Abstract: Symmetries in a dynamical system provide an opportunity to dramatically improve the performance of data-driven models. For fluid flows, such models are needed for tasks related to design, understanding, prediction, and control. In this work we exploit the symmetries of the Navier-Stokes equations (NSE) and use simulation data to find the manifold where the long-time dynamics live, which has many f… ▽ More Symmetries in a dynamical system provide an opportunity to dramatically improve the performance of data-driven models. For fluid flows, such models are needed for tasks related to design, understanding, prediction, and control. In this work we exploit the symmetries of the Navier-Stokes equations (NSE) and use simulation data to find the manifold where the long-time dynamics live, which has many fewer degrees of freedom than the full state representation, and the evolution equation for the dynamics on that manifold. We call this method ''symmetry charting''. The first step is to map to a ''fundamental chart'', which is a region in the state space of the flow to which all other regions can be mapped by a symmetry operation. To map to the fundamental chart we identify a set of indicators from the Fourier transform that uniquely identify the symmetries of the system. We then find a low-dimensional coordinate representation of the data in the fundamental chart with the use of an autoencoder. We use a variation called an implicit rank minimizing autoencoder with weight decay, which in addition to compressing the dimension of the data, also gives estimates of how many dimensions are needed to represent the data: i.e. the dimension of the invariant manifold of the long-time dynamics. Finally, we learn dynamics on this manifold with the use of neural ordinary differential equations. We apply symmetry charting to two-dimensional Kolmogorov flow in a chaotic bursting regime. This system has a continuous translation symmetry, and discrete rotation and shift-reflect symmetries. With this framework we observe that less data is needed to learn accurate data-driven models, more robust estimates of the manifold dimension are obtained, equivariance of the NSE is satisfied, better short-time tracking with respect to the true data is observed, and long-time statistics are correctly captured. △ Less

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2311.04552 [pdf, other]

doi 10.1007/978-3-031-53767-7_13

A 3D generative model of pathological multi-modal MR images and segmentations

Authors: Virginia Fernandez, Walter Hugo Lopez Pinaya, Pedro Borges, Mark S. Graham, Tom Vercauteren, M. Jorge Cardoso

Abstract: Generative modelling and synthetic data can be a surrogate for real medical imaging datasets, whose scarcity and difficulty to share can be a nuisance when delivering accurate deep learning models for healthcare applications. In recent years, there has been an increased interest in using these models for data augmentation and synthetic data sharing, using architectures such as generative adversari… ▽ More Generative modelling and synthetic data can be a surrogate for real medical imaging datasets, whose scarcity and difficulty to share can be a nuisance when delivering accurate deep learning models for healthcare applications. In recent years, there has been an increased interest in using these models for data augmentation and synthetic data sharing, using architectures such as generative adversarial networks (GANs) or diffusion models (DMs). Nonetheless, the application of synthetic data to tasks such as 3D magnetic resonance imaging (MRI) segmentation remains limited due to the lack of labels associated with the generated images. Moreover, many of the proposed generative MRI models lack the ability to generate arbitrary modalities due to the absence of explicit contrast conditioning. These limitations prevent the user from adjusting the contrast and content of the images and obtaining more generalisable data for training task-specific models. In this work, we propose brainSPADE3D, a 3D generative model for brain MRI and associated segmentations, where the user can condition on specific pathological phenotypes and contrasts. The proposed joint imaging-segmentation generative model is shown to generate high-fidelity synthetic images and associated segmentations, with the ability to combine pathologies. We demonstrate how the model can alleviate issues with segmentation model performance when unexpected pathologies are present in the data. △ Less

Submitted 8 November, 2023; originally announced November 2023.

Comments: Accepted for publication at the 2023 Deep Generative Models (DGM4MICCAI) MICCAI workshop (Vancouver, Canada)

arXiv:2310.06790 [pdf, other]

Enhancing Predictive Capabilities in Data-Driven Dynamical Modeling with Automatic Differentiation: Koopman and Neural ODE Approaches

Authors: C. Ricardo Constante-Amores, Alec J. Linot, Michael D. Graham

Abstract: Data-driven approximations of the Koopman operator are promising for predicting the time evolution of systems characterized by complex dynamics. Among these methods, the approach known as extended dynamic mode decomposition with dictionary learning (EDMD-DL) has garnered significant attention. Here we present a modification of EDMD-DL that concurrently determines both the dictionary of observables… ▽ More Data-driven approximations of the Koopman operator are promising for predicting the time evolution of systems characterized by complex dynamics. Among these methods, the approach known as extended dynamic mode decomposition with dictionary learning (EDMD-DL) has garnered significant attention. Here we present a modification of EDMD-DL that concurrently determines both the dictionary of observables and the corresponding approximation of the Koopman operator. This innovation leverages automatic differentiation to facilitate gradient descent computations through the pseudoinverse. We also address the performance of several alternative methodologies. We assess a 'pure' Koopman approach, which involves the direct time-integration of a linear, high-dimensional system governing the dynamics within the space of observables. Additionally, we explore a modified approach where the system alternates between spaces of states and observables at each time step -- this approach no longer satisfies the linearity of the true Koopman operator representation. For further comparisons, we also apply a state space approach (neural ODEs). We consider systems encompassing two and three-dimensional ordinary differential equation systems featuring steady, oscillatory, and chaotic attractors, as well as partial differential equations exhibiting increasingly complex and intricate behaviors. Our framework significantly outperforms EDMD-DL. Furthermore, the state space approach offers superior performance compared to the 'pure' Koopman approach where the entire time evolution occurs in the space of observables. When the temporal evolution of the Koopman approach alternates between states and observables at each time step, however, its predictions become comparable to those of the state space approach. △ Less

Submitted 17 March, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2307.15208 [pdf, other]

Generative AI for Medical Imaging: extending the MONAI Framework

Authors: Walter H. L. Pinaya, Mark S. Graham, Eric Kerfoot, Petru-Daniel Tudosiu, Jessica Dafflon, Virginia Fernandez, Pedro Sanchez, Julia Wolleb, Pedro F. da Costa, Ashay Patel, Hyungjin Chung, Can Zhao, Wei Peng, Zelong Liu, Xueyan Mei, Oeslle Lucena, Jong Chul Ye, Sotirios A. Tsaftaris, Prerna Dogra, Andrew Feng, Marc Modat, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

Abstract: Recent advances in generative AI have brought incredible breakthroughs in several areas, including medical imaging. These generative models have tremendous potential not only to help safely share medical data via synthetic datasets but also to perform an array of diverse applications, such as anomaly detection, image-to-image translation, denoising, and MRI reconstruction. However, due to the comp… ▽ More Recent advances in generative AI have brought incredible breakthroughs in several areas, including medical imaging. These generative models have tremendous potential not only to help safely share medical data via synthetic datasets but also to perform an array of diverse applications, such as anomaly detection, image-to-image translation, denoising, and MRI reconstruction. However, due to the complexity of these models, their implementation and reproducibility can be difficult. This complexity can hinder progress, act as a use barrier, and dissuade the comparison of new methods with existing works. In this study, we present MONAI Generative Models, a freely available open-source platform that allows researchers and developers to easily train, evaluate, and deploy generative models and related applications. Our platform reproduces state-of-art studies in a standardised way involving different architectures (such as diffusion models, autoregressive transformers, and GANs), and provides pre-trained models for the community. We have implemented these models in a generalisable fashion, illustrating that their results can be extended to 2D or 3D scenarios, including medical images with different modalities (like CT, MRI, and X-Ray data) and from different anatomical areas. Finally, we adopt a modular and extensible approach, ensuring long-term maintainability and the extension of current applications for future features. △ Less

Submitted 27 July, 2023; originally announced July 2023.

arXiv:2307.03777 [pdf, other]

Unsupervised 3D out-of-distribution detection with latent diffusion models

Authors: Mark S. Graham, Walter Hugo Lopez Pinaya, Paul Wright, Petru-Daniel Tudosiu, Yee H. Mah, James T. Teo, H. Rolf Jäger, David Werring, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

Abstract: Methods for out-of-distribution (OOD) detection that scale to 3D data are crucial components of any real-world clinical deep learning system. Classic denoising diffusion probabilistic models (DDPMs) have been recently proposed as a robust way to perform reconstruction-based OOD detection on 2D datasets, but do not trivially scale to 3D data. In this work, we propose to use Latent Diffusion Models… ▽ More Methods for out-of-distribution (OOD) detection that scale to 3D data are crucial components of any real-world clinical deep learning system. Classic denoising diffusion probabilistic models (DDPMs) have been recently proposed as a robust way to perform reconstruction-based OOD detection on 2D datasets, but do not trivially scale to 3D data. In this work, we propose to use Latent Diffusion Models (LDMs), which enable the scaling of DDPMs to high-resolution 3D medical data. We validate the proposed approach on near- and far-OOD datasets and compare it to a recently proposed, 3D-enabled approach using Latent Transformer Models (LTMs). Not only does the proposed LDM-based approach achieve statistically significant better performance, it also shows less sensitivity to the underlying latent representation, more favourable memory scaling, and produces better spatial anomaly maps. Code is available at https://github.com/marksgraham/ddpm-ood △ Less

Submitted 7 July, 2023; originally announced July 2023.

Comments: Accepted at MICCAI 2023

arXiv:2305.01090 [pdf, ps, other]

Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems

Authors: Kevin Zeng, Carlos E. Pérez De Jesús, Andrew J. Fox, Michael D. Graham

Abstract: While many phenomena in physics and engineering are formally high-dimensional, their long-time dynamics often live on a lower-dimensional manifold. The present work introduces an autoencoder framework that combines implicit regularization with internal linear layers and $L_2$ regularization (weight decay) to automatically estimate the underlying dimensionality of a data set, produce an orthogonal… ▽ More While many phenomena in physics and engineering are formally high-dimensional, their long-time dynamics often live on a lower-dimensional manifold. The present work introduces an autoencoder framework that combines implicit regularization with internal linear layers and $L_2$ regularization (weight decay) to automatically estimate the underlying dimensionality of a data set, produce an orthogonal manifold coordinate system, and provide the mapping functions between the ambient space and manifold space, allowing for out-of-sample projections. We validate our framework's ability to estimate the manifold dimension for a series of datasets from dynamical systems of varying complexities and compare to other state-of-the-art estimators. We analyze the training dynamics of the network to glean insight into the mechanism of low-rank learning and find that collectively each of the implicit regularizing layers compound the low-rank representation and even self-correct during training. Analysis of gradient descent dynamics for this architecture in the linear case reveals the role of the internal linear layers in leading to faster decay of a "collective weight variable" incorporating all layers, and the role of weight decay in breaking degeneracies and thus driving convergence along directions in which no decay would occur in its absence. We show that this framework can be naturally extended for applications of state-space modeling and forecasting by generating a data-driven dynamic model of a spatiotemporally chaotic partial differential equation using only the manifold coordinates. Finally, we demonstrate that our framework is robust to hyperparameter choices. △ Less

Submitted 6 December, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

arXiv:2305.01071 [pdf, other]

Right HTML, Wrong JSON: Challenges in Replaying Archived Webpages Built with Client-Side Rendering

Authors: Michele C. Weigle, Michael L. Nelson, Sawood Alam, Mark Graham

Abstract: Many web sites are transitioning how they construct their pages. The conventional model is where the content is embedded server-side in the HTML and returned to the client in an HTTP response. Increasingly, sites are moving to a model where the initial HTTP response contains only an HTML skeleton plus JavaScript that makes API calls to a variety of servers for the content (typically in JSON format… ▽ More Many web sites are transitioning how they construct their pages. The conventional model is where the content is embedded server-side in the HTML and returned to the client in an HTTP response. Increasingly, sites are moving to a model where the initial HTTP response contains only an HTML skeleton plus JavaScript that makes API calls to a variety of servers for the content (typically in JSON format), and then builds out the DOM client-side, more easily allowing for periodically refreshing the content in a page and allowing dynamic modification of the content. This client-side rendering, now predominant in social media platforms such as Twitter and Instagram, is also being adopted by news outlets, such as CNN.com. When conventional web archiving techniques, such as crawling with Heritrix, are applied to pages that render their content client-side, the JSON responses can become out of sync with the HTML page in which it is to be embedded, resulting in temporal violations on replay. Because the violative JSON is not directly observable in the page (i.e., in the same manner a violative embedded image is), the temporal violations can be difficult to detect. We describe how the top level CNN.com page has used client-side rendering since April 2015 and the impact this has had on web archives. Between April 24, 2015 and July 21, 2016, we found almost 15,000 mementos with a temporal violation of more than 2 days between the base CNN.com HTML and the JSON responses used to deliver the content under the main story. One way to mitigate this problem is to use browser-based crawling instead of conventional crawlers like Heritrix, but browser-based crawling is currently much slower than non-browser-based tools such as Heritrix. △ Less

Submitted 1 May, 2023; originally announced May 2023.

Comments: 20 pages, preprint version of paper accepted at the 2023 ACM/IEEE Joint Conference on Digital Libraries (JCDL)

arXiv:2301.12098 [pdf, other]

Turbulence control in plane Couette flow using low-dimensional neural ODE-based models and deep reinforcement learning

Authors: Alec J. Linot, Kevin Zeng, Michael D. Graham

Abstract: The high dimensionality and complex dynamics of turbulent flows remain an obstacle to the discovery and implementation of control strategies. Deep reinforcement learning (RL) is a promising avenue for overcoming these obstacles, but requires a training phase in which the RL agent iteratively interacts with the flow environment to learn a control policy, which can be prohibitively expensive when th… ▽ More The high dimensionality and complex dynamics of turbulent flows remain an obstacle to the discovery and implementation of control strategies. Deep reinforcement learning (RL) is a promising avenue for overcoming these obstacles, but requires a training phase in which the RL agent iteratively interacts with the flow environment to learn a control policy, which can be prohibitively expensive when the environment involves slow experiments or large-scale simulations. We overcome this challenge using a framework we call "DManD-RL" (data-driven manifold dynamics-RL), which generates a data-driven low-dimensional model of our system that we use for RL training. With this approach, we seek to minimize drag in a direct numerical simulation (DNS) of a turbulent minimal flow unit of plane Couette flow at Re=400 using two slot jets on one wall. We obtain, from DNS data with $\mathcal{O}(10^5)$ degrees of freedom, a 25-dimensional DManD model of the dynamics by combining an autoencoder and neural ordinary differential equation. Using this model as the environment, we train an RL control agent, yielding a 440-fold speedup over training on the DNS, with equivalent control performance. The agent learns a policy that laminarizes 84% of unseen DNS test trajectories within 900 time units, significantly outperforming classical opposition control (58%), despite the actuation authority being much more restricted. The agent often achieves laminarization through a counterintuitive strategy that drives the formation of two low-speed streaks, with a spanwise wavelength that is too small to be self-sustaining. The agent demonstrates the same performance when we limit observations to wall shear rate. △ Less

Submitted 28 January, 2023; originally announced January 2023.

arXiv:2301.04638 [pdf, other]

Dynamics of a data-driven low-dimensional model of turbulent minimal Couette flow

Authors: Alec J. Linot, Michael D. Graham

Abstract: Because the Navier-Stokes equations are dissipative, the long-time dynamics of a flow in state space are expected to collapse onto a manifold whose dimension may be much lower than the dimension required for a resolved simulation. On this manifold, the state of the system can be exactly described in a coordinate system parameterizing the manifold. Describing the system in this low-dimensional coor… ▽ More Because the Navier-Stokes equations are dissipative, the long-time dynamics of a flow in state space are expected to collapse onto a manifold whose dimension may be much lower than the dimension required for a resolved simulation. On this manifold, the state of the system can be exactly described in a coordinate system parameterizing the manifold. Describing the system in this low-dimensional coordinate system allows for much faster simulations and analysis. We show, for turbulent Couette flow, that this description of the dynamics is possible using a data-driven manifold dynamics modeling method. This approach consists of an autoencoder to find a low-dimensional manifold coordinate system and a set of ordinary differential equations defined by a neural network. Specifically, we apply this method to minimal flow unit turbulent plane Couette flow at $\textit{Re}=400$, where a fully resolved solutions requires $\mathcal{O}(10^5)$ degrees of freedom. Using only data from this simulation we build models with fewer than $20$ degrees of freedom that quantitatively capture key characteristics of the flow, including the streak breakdown and regeneration cycle. At short-times, the models track the true trajectory for multiple Lyapunov times, and, at long-times, the models capture the Reynolds stress and the energy balance. For comparison, we show that the models outperform POD-Galerkin models with $\sim$2000 degrees of freedom. Finally, we compute unstable periodic orbits from the models. Many of these closely resemble previously computed orbits for the full system; additionally, we find nine orbits that correspond to previously unknown solutions in the full system. △ Less

Submitted 11 January, 2023; originally announced January 2023.

arXiv:2212.01493 [pdf]

Applications of AI in Astronomy

Authors: S. G. Djorgovski, A. A. Mahabal, M. J. Graham, K. Polsterer, A. Krone-Martins

Abstract: We provide a brief, and inevitably incomplete overview of the use of Machine Learning (ML) and other AI methods in astronomy, astrophysics, and cosmology. Astronomy entered the big data era with the first digital sky surveys in the early 1990s and the resulting Terascale data sets, which required automating of many data processing and analysis tasks, for example the star-galaxy separation, with bi… ▽ More We provide a brief, and inevitably incomplete overview of the use of Machine Learning (ML) and other AI methods in astronomy, astrophysics, and cosmology. Astronomy entered the big data era with the first digital sky surveys in the early 1990s and the resulting Terascale data sets, which required automating of many data processing and analysis tasks, for example the star-galaxy separation, with billions of feature vectors in hundreds of dimensions. The exponential data growth continued, with the rise of synoptic sky surveys and the Time Domain Astronomy, with the resulting Petascale data streams and the need for a real-time processing, classification, and decision making. A broad variety of classification and clustering methods have been applied for these tasks, and this remains a very active area of research. Over the past decade we have seen an exponential growth of the astronomical literature involving a variety of ML/AI applications of an ever increasing complexity and sophistication. ML and AI are now a standard part of the astronomical toolkit. As the data complexity continues to increase, we anticipate further advances leading towards a collaborative human-AI discovery. △ Less

Submitted 2 December, 2022; originally announced December 2022.

Comments: 12 pages, 1 figure, an invited review chapter, to appear in: Artificial Intelligence for Science, eds. A. Choudhary, G. Fox and T. Hey, Singapore: World Scientific, in press (2023)

arXiv:2211.11061 [pdf, other]

doi 10.1103/PhysRevE.107.034215

Deep learning delay coordinate dynamics for chaotic attractors from partial observable data

Authors: Charles D. Young, Michael D. Graham

Abstract: A common problem in time series analysis is to predict dynamics with only scalar or partial observations of the underlying dynamical system. For data on a smooth compact manifold, Takens theorem proves a time delayed embedding of the partial state is diffeomorphic to the attractor, although for chaotic and highly nonlinear systems learning these delay coordinate mappings is challenging. We utilize… ▽ More A common problem in time series analysis is to predict dynamics with only scalar or partial observations of the underlying dynamical system. For data on a smooth compact manifold, Takens theorem proves a time delayed embedding of the partial state is diffeomorphic to the attractor, although for chaotic and highly nonlinear systems learning these delay coordinate mappings is challenging. We utilize deep artificial neural networks (ANNs) to learn discrete discrete time maps and continuous time flows of the partial state. Given training data for the full state, we also learn a reconstruction map. Thus, predictions of a time series can be made from the current state and several previous observations with embedding parameters determined from time series analysis. The state space for time evolution is of comparable dimension to reduced order manifold models. These are advantages over recurrent neural network models, which require a high dimensional internal state or additional memory terms and hyperparameters. We demonstrate the capacity of deep ANNs to predict chaotic behavior from a scalar observation on a manifold of dimension three via the Lorenz system. We also consider multivariate observations on the Kuramoto-Sivashinsky equation, where the observation dimension required for accurately reproducing dynamics increases with the manifold dimension via the spatial extent of the system. △ Less

Submitted 20 November, 2022; originally announced November 2022.

arXiv:2211.07740 [pdf, other]

Denoising diffusion models for out-of-distribution detection

Authors: Mark S. Graham, Walter H. L. Pinaya, Petru-Daniel Tudosiu, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

Abstract: Out-of-distribution detection is crucial to the safe deployment of machine learning systems. Currently, unsupervised out-of-distribution detection is dominated by generative-based approaches that make use of estimates of the likelihood or other measurements from a generative model. Reconstruction-based methods offer an alternative approach, in which a measure of reconstruction error is used to det… ▽ More Out-of-distribution detection is crucial to the safe deployment of machine learning systems. Currently, unsupervised out-of-distribution detection is dominated by generative-based approaches that make use of estimates of the likelihood or other measurements from a generative model. Reconstruction-based methods offer an alternative approach, in which a measure of reconstruction error is used to determine if a sample is out-of-distribution. However, reconstruction-based approaches are less favoured, as they require careful tuning of the model's information bottleneck - such as the size of the latent dimension - to produce good results. In this work, we exploit the view of denoising diffusion probabilistic models (DDPM) as denoising autoencoders where the bottleneck is controlled externally, by means of the amount of noise applied. We propose to use DDPMs to reconstruct an input that has been noised to a range of noise levels, and use the resulting multi-dimensional reconstruction error to classify out-of-distribution inputs. We validate our approach both on standard computer-vision datasets and on higher dimension medical datasets. Our approach outperforms not only reconstruction-based methods, but also state-of-the-art generative-based approaches. Code is available at https://github.com/marksgraham/ddpm-ood. △ Less

Submitted 20 April, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

arXiv:2210.16708 [pdf, other]

doi 10.1103/PhysRevFluids.8.044402

Data-driven low-dimensional dynamic model of Kolmogorov flow

Authors: Carlos E. Pérez De Jesús, Michael D. Graham

Abstract: Reduced order models (ROMs) that capture flow dynamics are of interest for decreasing computational costs for simulation as well as for model-based control approaches. This work presents a data-driven framework for minimal-dimensional models that effectively capture the dynamics and properties of the flow. We apply this to Kolmogorov flow in a regime consisting of chaotic and intermittent behavior… ▽ More Reduced order models (ROMs) that capture flow dynamics are of interest for decreasing computational costs for simulation as well as for model-based control approaches. This work presents a data-driven framework for minimal-dimensional models that effectively capture the dynamics and properties of the flow. We apply this to Kolmogorov flow in a regime consisting of chaotic and intermittent behavior, which is common in many flows processes and is challenging to model. The trajectory of the flow travels near relative periodic orbits (RPOs), interspersed with sporadic bursting events corresponding to excursions between the regions containing the RPOs. The first step in development of the models is use of an undercomplete autoencoder to map from the full state data down to a latent space of dramatically lower dimension. Then models of the discrete-time evolution of the dynamics in the latent space are developed. By analyzing the model performance as a function of latent space dimension we can estimate the minimum number of dimensions required to capture the system dynamics. To further reduce the dimension of the dynamical model, we factor out a phase variable in the direction of translational invariance for the flow, leading to separate evolution equations for the pattern and phase. At a model dimension of five for the pattern dynamics, as opposed to the full state dimension of 1024 (i.e. a 32x32 grid), accurate predictions are found for individual trajectories out to about two Lyapunov times, as well as for long-time statistics. Further small improvements in the results occur at a dimension of nine. The nearly heteroclinic connections between the different RPOs, including the quiescent and bursting time scales, are well captured. We also capture key features of the phase dynamics. Finally, we use the low-dimensional representation to predict future bursting events, finding good success. △ Less

Submitted 1 August, 2023; v1 submitted 29 October, 2022; originally announced October 2022.

Journal ref: Phys. Rev. Fluids 8, 044402 (2023)

arXiv:2209.08256 [pdf, other]

doi 10.1007/978-3-031-16980-9_8

Can segmentation models be trained with fully synthetically generated data?

Authors: Virginia Fernandez, Walter Hugo Lopez Pinaya, Pedro Borges, Petru-Daniel Tudosiu, Mark S Graham, Tom Vercauteren, M Jorge Cardoso

Abstract: In order to achieve good performance and generalisability, medical image segmentation models should be trained on sizeable datasets with sufficient variability. Due to ethics and governance restrictions, and the costs associated with labelling data, scientific development is often stifled, with models trained and tested on limited data. Data augmentation is often used to artificially increase the… ▽ More In order to achieve good performance and generalisability, medical image segmentation models should be trained on sizeable datasets with sufficient variability. Due to ethics and governance restrictions, and the costs associated with labelling data, scientific development is often stifled, with models trained and tested on limited data. Data augmentation is often used to artificially increase the variability in the data distribution and improve model generalisability. Recent works have explored deep generative models for image synthesis, as such an approach would enable the generation of an effectively infinite amount of varied data, addressing the generalisability and data access problems. However, many proposed solutions limit the user's control over what is generated. In this work, we propose brainSPADE, a model which combines a synthetic diffusion-based label generator with a semantic image generator. Our model can produce fully synthetic brain labels on-demand, with or without pathology of interest, and then generate a corresponding MRI image of an arbitrary guided style. Experiments show that brainSPADE synthetic data can be used to train segmentation models with performance comparable to that of models trained on real data. △ Less

Submitted 17 September, 2022; originally announced September 2022.

Comments: 12 pages, 2 (+2 App.) figures, 3 tables. Accepted at Simulation and Synthesis in Medical Imaging workshop (MICCAI 2022)

arXiv:2209.03177 [pdf, other]

Morphology-preserving Autoregressive 3D Generative Modelling of the Brain

Authors: Petru-Daniel Tudosiu, Walter Hugo Lopez Pinaya, Mark S. Graham, Pedro Borges, Virginia Fernandez, Dai Yang, Jeremy Appleyard, Guido Novati, Disha Mehra, Mike Vella, Parashkev Nachev, Sebastien Ourselin, Jorge Cardoso

Abstract: Human anatomy, morphology, and associated diseases can be studied using medical imaging data. However, access to medical imaging data is restricted by governance and privacy concerns, data ownership, and the cost of acquisition, thus limiting our ability to understand the human body. A possible solution to this issue is the creation of a model able to learn and then generate synthetic images of th… ▽ More Human anatomy, morphology, and associated diseases can be studied using medical imaging data. However, access to medical imaging data is restricted by governance and privacy concerns, data ownership, and the cost of acquisition, thus limiting our ability to understand the human body. A possible solution to this issue is the creation of a model able to learn and then generate synthetic images of the human body conditioned on specific characteristics of relevance (e.g., age, sex, and disease status). Deep generative models, in the form of neural networks, have been recently used to create synthetic 2D images of natural scenes. Still, the ability to produce high-resolution 3D volumetric imaging data with correct anatomical morphology has been hampered by data scarcity and algorithmic and computational limitations. This work proposes a generative model that can be scaled to produce anatomically correct, high-resolution, and realistic images of the human brain, with the necessary quality to allow further downstream analyses. The ability to generate a potentially unlimited amount of data not only enables large-scale studies of human anatomy and pathology without jeopardizing patient privacy, but also significantly advances research in the field of anomaly detection, modality synthesis, learning under limited data, and fair and ethical AI. Code and trained models are available at: https://github.com/AmigoLab/SynthAnatomy. △ Less

Submitted 7 September, 2022; originally announced September 2022.

Comments: 13 pages, 3 figures, 2 tables, accepted at SASHIMI MICCAI 2022

MSC Class: 68T99 (Primary) 92C55 (Secondary) ACM Class: I.2.1; J.3

arXiv:2207.09060 [pdf, other]

Data Science and Machine Learning in Education

Authors: Gabriele Benelli, Thomas Y. Chen, Javier Duarte, Matthew Feickert, Matthew Graham, Lindsey Gray, Dan Hackett, Phil Harris, Shih-Chieh Hsu, Gregor Kasieczka, Elham E. Khoda, Matthias Komm, Mia Liu, Mark S. Neubauer, Scarlet Norberg, Alexx Perloff, Marcel Rieger, Claire Savard, Kazuhiro Terao, Savannah Thais, Avik Roy, Jean-Roch Vlimant, Grigorios Chachamis

Abstract: The growing role of data science (DS) and machine learning (ML) in high-energy physics (HEP) is well established and pertinent given the complex detectors, large data, sets and sophisticated analyses at the heart of HEP research. Moreover, exploiting symmetries inherent in physics data have inspired physics-informed ML as a vibrant sub-field of computer science research. HEP researchers benefit gr… ▽ More The growing role of data science (DS) and machine learning (ML) in high-energy physics (HEP) is well established and pertinent given the complex detectors, large data, sets and sophisticated analyses at the heart of HEP research. Moreover, exploiting symmetries inherent in physics data have inspired physics-informed ML as a vibrant sub-field of computer science research. HEP researchers benefit greatly from materials widely available materials for use in education, training and workforce development. They are also contributing to these materials and providing software to DS/ML-related fields. Increasingly, physics departments are offering courses at the intersection of DS, ML and physics, often using curricula developed by HEP researchers and involving open software and data used in HEP. In this white paper, we explore synergies between HEP research and DS/ML education, discuss opportunities and challenges at this intersection, and propose community activities that will be mutually beneficial. △ Less

Submitted 19 July, 2022; originally announced July 2022.

Comments: Contribution to Snowmass 2021

arXiv:2206.03461 [pdf, other]

Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models

Authors: Walter H. L. Pinaya, Mark S. Graham, Robert Gray, Pedro F Da Costa, Petru-Daniel Tudosiu, Paul Wright, Yee H. Mah, Andrew D. MacKinnon, James T. Teo, Rolf Jager, David Werring, Geraint Rees, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

Abstract: Deep generative models have emerged as promising tools for detecting arbitrary anomalies in data, dispensing with the necessity for manual labelling. Recently, autoregressive transformers have achieved state-of-the-art performance for anomaly detection in medical imaging. Nonetheless, these models still have some intrinsic weaknesses, such as requiring images to be modelled as 1D sequences, the ac… ▽ More Deep generative models have emerged as promising tools for detecting arbitrary anomalies in data, dispensing with the necessity for manual labelling. Recently, autoregressive transformers have achieved state-of-the-art performance for anomaly detection in medical imaging. Nonetheless, these models still have some intrinsic weaknesses, such as requiring images to be modelled as 1D sequences, the accumulation of errors during the sampling process, and the significant inference times associated with transformers. Denoising diffusion probabilistic models are a class of non-autoregressive generative models recently shown to produce excellent samples in computer vision (surpassing Generative Adversarial Networks), and to achieve log-likelihoods that are competitive with transformers while having fast inference times. Diffusion models can be applied to the latent representations learnt by autoencoders, making them easily scalable and great candidates for application to high dimensional data, such as medical images. Here, we propose a method based on diffusion models to detect and segment anomalies in brain imaging. By training the models on healthy data and then exploring its diffusion and reverse steps across its Markov chain, we can identify anomalous areas in the latent space and hence identify anomalies in the pixel space. Our diffusion models achieve competitive performance compared with autoregressive approaches across a series of experiments with 2D CT and MRI data involving synthetic and real pathological lesions with much reduced inference times, making their usage clinically viable. △ Less

Submitted 7 June, 2022; originally announced June 2022.

arXiv:2205.10650 [pdf, other]

Transformer-based out-of-distribution detection for clinically safe segmentation

Authors: Mark S Graham, Petru-Daniel Tudosiu, Paul Wright, Walter Hugo Lopez Pinaya, U Jean-Marie, Yee Mah, James Teo, Rolf H Jäger, David Werring, Parashkev Nachev, Sebastien Ourselin, M Jorge Cardoso

Abstract: In a clinical setting it is essential that deployed image processing systems are robust to the full range of inputs they might encounter and, in particular, do not make confidently wrong predictions. The most popular approach to safe processing is to train networks that can provide a measure of their uncertainty, but these tend to fail for inputs that are far outside the training data distribution… ▽ More In a clinical setting it is essential that deployed image processing systems are robust to the full range of inputs they might encounter and, in particular, do not make confidently wrong predictions. The most popular approach to safe processing is to train networks that can provide a measure of their uncertainty, but these tend to fail for inputs that are far outside the training data distribution. Recently, generative modelling approaches have been proposed as an alternative; these can quantify the likelihood of a data sample explicitly, filtering out any out-of-distribution (OOD) samples before further processing is performed. In this work, we focus on image segmentation and evaluate several approaches to network uncertainty in the far-OOD and near-OOD cases for the task of segmenting haemorrhages in head CTs. We find all of these approaches are unsuitable for safe segmentation as they provide confidently wrong predictions when operating OOD. We propose performing full 3D OOD detection using a VQ-GAN to provide a compressed latent representation of the image and a transformer to estimate the data likelihood. Our approach successfully identifies images in both the far- and near-OOD cases. We find a strong relationship between image likelihood and the quality of a model's segmentation, making this approach viable for filtering images unsuitable for segmentation. To our knowledge, this is the first time transformers have been applied to perform OOD detection on 3D image data. Code is available at github.com/marksgraham/transformer-ood. △ Less

Submitted 17 May, 2023; v1 submitted 21 May, 2022; originally announced May 2022.

Comments: Accepted at MIDL 2022 (Oral)

arXiv:2205.03204 [pdf, other]

Bounded-degree plane geometric spanners in practice

Authors: Frederick Anderson, Anirban Ghosh, Matthew Graham, Lucas Mougeot, David Wisnosky

Abstract: The construction of bounded-degree plane geometric spanners has been a focus of interest since 2002 when Bose, Gudmundsson, and Smid proposed the first algorithm to construct such spanners. To date, eleven algorithms have been designed with various trade-offs in degree and stretch factor. We have implemented these sophisticated algorithms in C++ using the CGAL library and experimented with them us… ▽ More The construction of bounded-degree plane geometric spanners has been a focus of interest since 2002 when Bose, Gudmundsson, and Smid proposed the first algorithm to construct such spanners. To date, eleven algorithms have been designed with various trade-offs in degree and stretch factor. We have implemented these sophisticated algorithms in C++ using the CGAL library and experimented with them using large synthetic and real-world pointsets. Our experiments have revealed their practical behavior and real-world efficacy. We share the implementations via GitHub for broader uses and future research. We present a simple practical algorithm, named AppxStretchFactor, that can estimate stretch factors (obtains lower bounds on the exact stretch factors) of geometric spanners - a challenging problem for which no practical algorithm is known yet. In our experiments with bounded-degree plane geometric spanners, we find that AppxStretchFactor estimates stretch factors almost precisely. Further, it gives linear runtime performance in practice for the pointset distributions considered in this work, making it much faster than the naive Dijkstra-based algorithm for calculating stretch factors △ Less

Submitted 6 May, 2022; originally announced May 2022.

arXiv:2205.01716 [pdf, other]

Experiments with Unit Disk Cover Algorithms for Covering Massive Pointsets

Authors: Rachel Friederich, Matthew Graham, Anirban Ghosh, Brian Hicks, Ronald Shevchenko

Abstract: Given a set of $n$ points in the plane, the Unit Disk Cover (UDC) problem asks to compute the minimum number of unit disks required to cover the points, along with a placement of the disks. The problem is NP-hard and several approximation algorithms have been designed over the last three decades. In this paper, we have engineered and experimentally compared practical performances of some of these… ▽ More Given a set of $n$ points in the plane, the Unit Disk Cover (UDC) problem asks to compute the minimum number of unit disks required to cover the points, along with a placement of the disks. The problem is NP-hard and several approximation algorithms have been designed over the last three decades. In this paper, we have engineered and experimentally compared practical performances of some of these algorithms on massive pointsets. The goal is to investigate which algorithms run fast and give good approximation in practice. We present a simple $7$-approximation algorithm for UDC that runs in $O(n)$ expected time and uses $O(s)$ extra space, where $s$ denotes the size of the generated cover. In our experiments, it turned out to be the speediest of all. We also present two heuristics to reduce the sizes of covers generated by it without slowing it down by much. To our knowledge, this is the first work that experimentally compares geometric covering algorithms. Experiments with them using massive pointsets (in the order of millions) throw light on their practical uses. We share the engineered algorithms via GitHub - https://github.com/ghoshanirban/UnitDiskCoverAlgorithms for broader uses and future research in the domain of geometric optimization. △ Less

Submitted 3 May, 2022; originally announced May 2022.

arXiv:2205.00579 [pdf, other]

doi 10.1098/rspa.2022.0297

Data-driven control of spatiotemporal chaos with reduced-order neural ODE-based models and reinforcement learning

Authors: Kevin Zeng, Alec J. Linot, Michael D. Graham

Abstract: Deep reinforcement learning (RL) is a data-driven method capable of discovering complex control strategies for high-dimensional systems, making it promising for flow control applications. In particular, the present work is motivated by the goal of reducing energy dissipation in turbulent flows, and the example considered is the spatiotemporally chaotic dynamics of the Kuramoto-Sivashinsky equation… ▽ More Deep reinforcement learning (RL) is a data-driven method capable of discovering complex control strategies for high-dimensional systems, making it promising for flow control applications. In particular, the present work is motivated by the goal of reducing energy dissipation in turbulent flows, and the example considered is the spatiotemporally chaotic dynamics of the Kuramoto-Sivashinsky equation (KSE). A major challenge associated with RL is that substantial training data must be generated by repeatedly interacting with the target system, making it costly when the system is computationally or experimentally expensive. We mitigate this challenge in a data-driven manner by combining dimensionality reduction via an autoencoder with a neural ODE framework to obtain a low-dimensional dynamical model from just a limited data set. We substitute this data-driven reduced-order model (ROM) in place of the true system during RL training to efficiently estimate the optimal policy, which can then be deployed on the true system. For the KSE actuated with localized forcing ("jets") at four locations, we demonstrate that we are able to learn a ROM that accurately captures the actuated dynamics as well as the underlying natural dynamics just from snapshots of the KSE experiencing random actuations. Using this ROM and a control objective of minimizing dissipation and power cost, we extract a control policy from it using deep RL. We show that the ROM-based control strategy translates well to the true KSE and highlight that the RL agent discovers and stabilizes an underlying forced equilibrium solution of the KSE system. We show that this forced equilibrium captured in the ROM and discovered through RL is related to an existing known equilibrium solution of the natural KSE. △ Less

Submitted 1 May, 2022; originally announced May 2022.

arXiv:2203.15706 [pdf, other]

doi 10.1016/j.jcp.2022.111838

Stabilized Neural Ordinary Differential Equations for Long-Time Forecasting of Dynamical Systems

Authors: Alec J. Linot, Joshua W. Burby, Qi Tang, Prasanna Balaprakash, Michael D. Graham, Romit Maulik

Abstract: In data-driven modeling of spatiotemporal phenomena careful consideration often needs to be made in capturing the dynamics of the high wavenumbers. This problem becomes especially challenging when the system of interest exhibits shocks or chaotic dynamics. We present a data-driven modeling method that accurately captures shocks and chaotic dynamics by proposing a novel architecture, stabilized neu… ▽ More In data-driven modeling of spatiotemporal phenomena careful consideration often needs to be made in capturing the dynamics of the high wavenumbers. This problem becomes especially challenging when the system of interest exhibits shocks or chaotic dynamics. We present a data-driven modeling method that accurately captures shocks and chaotic dynamics by proposing a novel architecture, stabilized neural ordinary differential equation (ODE). In our proposed architecture, we learn the right-hand-side (RHS) of an ODE by adding the outputs of two NN together where one learns a linear term and the other a nonlinear term. Specifically, we implement this by training a sparse linear convolutional NN to learn the linear term and a dense fully-connected nonlinear NN to learn the nonlinear term. This is in contrast with the standard neural ODE which involves training only a single NN for learning the RHS. We apply this setup to the viscous Burgers equation, which exhibits shocked behavior, and show better short-time tracking and prediction of the energy spectrum at high wavenumbers than a standard neural ODE. We also find that the stabilized neural ODE models are much more robust to noisy initial conditions than the standard neural ODE approach. We also apply this method to chaotic trajectories of the Kuramoto-Sivashinsky equation. In this case, stabilized neural ODEs keep long-time trajectories on the attractor, and are highly robust to noisy initial conditions, while standard neural ODEs fail at achieving either of these results. We conclude by demonstrating how stabilizing neural ODEs provide a natural extension for use in reduced-order modeling by projecting the dynamics onto the eigenvectors of the learned linear term. △ Less

Submitted 3 October, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

arXiv:2109.08445 [pdf, other]

Developing Visualisations to Enhance an Insider Threat Product: A Case Study

Authors: Martin Graham, Robert Kukla, Oleksii Mandrychenko, Darren Hart, Jessie Kennedy

Abstract: This paper describes the process of developing data visualisations to enhance a commercial software platform for combating insider threat, whose existing UI, while perfectly functional, was limited in its ability to allow analysts to easily spot the patterns and outliers that visualisation naturally reveals. We describe the design and development process, proceeding from initial tasks/requirements… ▽ More This paper describes the process of developing data visualisations to enhance a commercial software platform for combating insider threat, whose existing UI, while perfectly functional, was limited in its ability to allow analysts to easily spot the patterns and outliers that visualisation naturally reveals. We describe the design and development process, proceeding from initial tasks/requirements gathering, understanding the platform's data formats, the rationale behind the visualisation's design, and then refining the prototype through gathering feedback from representative domain experts who are also current users of the software. Through a number of example scenarios, we show that the visualisation can support the identified tasks and aid analysts in discovering and understanding potentially risky insider activity within a large user base. △ Less

Submitted 17 September, 2021; originally announced September 2021.

Comments: VizSec 2021

ACM Class: H.5.2

arXiv:2109.00060 [pdf, other]

doi 10.1063/5.0069536

Data-Driven Reduced-Order Modeling of Spatiotemporal Chaos with Neural Ordinary Differential Equations

Authors: Alec J. Linot, Michael D. Graham

Abstract: Dissipative partial differential equations that exhibit chaotic dynamics tend to evolve to attractors that exist on finite-dimensional manifolds. We present a data-driven reduced order modeling method that capitalizes on this fact by finding the coordinates of this manifold and finding an ordinary differential equation (ODE) describing the dynamics in this coordinate system. The manifold coordinat… ▽ More Dissipative partial differential equations that exhibit chaotic dynamics tend to evolve to attractors that exist on finite-dimensional manifolds. We present a data-driven reduced order modeling method that capitalizes on this fact by finding the coordinates of this manifold and finding an ordinary differential equation (ODE) describing the dynamics in this coordinate system. The manifold coordinates are discovered using an undercomplete autoencoder -- a neural network (NN) that reduces then expands dimension. Then the ODE, in these coordinates, is approximated by a NN using the neural ODE framework. Both of these methods only require snapshots of data to learn a model, and the data can be widely and/or unevenly spaced. We apply this framework to the Kuramoto-Sivashinsky for different domain sizes that exhibit chaotic dynamics. With this system, we find that dimension reduction improves performance relative to predictions in the ambient space, where artifacts arise. Then, with the low-dimensional model, we vary the training data spacing and find excellent short- and long-time statistical recreation of the true dynamics for widely spaced data (spacing of ~0.7 Lyapunov times). We end by comparing performance with various degrees of dimension reduction, and find a "sweet spot" in terms of performance vs. dimension. △ Less

Submitted 31 August, 2021; originally announced September 2021.

arXiv:2108.05928 [pdf, other]

doi 10.1038/s42256-022-00575-4

Data-driven discovery of intrinsic dynamics

Authors: Daniel Floryan, Michael D. Graham

Abstract: Dynamical models underpin our ability to understand and predict the behavior of natural systems. Whether dynamical models are developed from first-principles derivations or from observational data, they are predicated on our choice of state variables. The choice of state variables is driven by convenience and intuition, and in the data-driven case the observed variables are often chosen to be the… ▽ More Dynamical models underpin our ability to understand and predict the behavior of natural systems. Whether dynamical models are developed from first-principles derivations or from observational data, they are predicated on our choice of state variables. The choice of state variables is driven by convenience and intuition, and in the data-driven case the observed variables are often chosen to be the state variables. The dimensionality of these variables (and consequently the dynamical models) can be arbitrarily large, obscuring the underlying behavior of the system. In truth, these variables are often highly redundant and the system is driven by a much smaller set of latent intrinsic variables. In this study, we combine the mathematical theory of manifolds with the representational capacity of neural networks to develop a method that learns a system's intrinsic state variables directly from time series data, and also learns predictive models for their dynamics. What distinguishes our method is its ability to reduce data to the intrinsic dimensionality of the nonlinear manifold they live on. This ability is enabled by the concepts of charts and atlases from the theory of manifolds, whereby a manifold is represented by a collection of patches that are sewn together -- a necessary representation to attain intrinsic dimensionality. We demonstrate this approach on several high-dimensional systems with low-dimensional behavior. The resulting framework provides the ability to develop dynamical models of the lowest possible dimension, capturing the essence of a system. △ Less

Submitted 14 June, 2022; v1 submitted 12 August, 2021; originally announced August 2021.

Comments: 27 pages + 15 pages Supplementary Information

Journal ref: Nature Machine Intelligence, Vol. 4, Issue 12, pp. 1113-1120 (2022)

arXiv:2107.10695 [pdf, other]

Low latency allcast over broadcast erasure channels

Authors: Mark A. Graham, Ayalvadi J. Ganesh, Robert J. Piechocki

Abstract: Consider n nodes communicating over an unreliable broadcast channel. Each node has a single packet that needs to be communicated to all other nodes. Time is slotted, and a time slot is long enough for each node to broadcast one packet. Each broadcast reaches a random subset of nodes. The objective is to minimise the time until all nodes have received all packets. We study two schemes, (i) random r… ▽ More Consider n nodes communicating over an unreliable broadcast channel. Each node has a single packet that needs to be communicated to all other nodes. Time is slotted, and a time slot is long enough for each node to broadcast one packet. Each broadcast reaches a random subset of nodes. The objective is to minimise the time until all nodes have received all packets. We study two schemes, (i) random relaying, and (ii) random linear network coding, and analyse their performance in an asymptotic regime in which n tends to infinity. Simulation results for a wide range of n are also presented. △ Less

Submitted 22 July, 2021; originally announced July 2021.

Comments: Submitted to IEEE Transactions on Information Theory

MSC Class: 94A05 94A15

arXiv:2104.05437 [pdf, other]

doi 10.1103/PhysRevE.104.014210

Symmetry reduction for deep reinforcement learning active control of chaotic spatiotemporal dynamics

Authors: Kevin Zeng, Michael D. Graham

Abstract: Deep reinforcement learning (RL) is a data-driven, model-free method capable of discovering complex control strategies for macroscopic objectives in high-dimensional systems, making its application towards flow control promising. Many systems of flow control interest possess symmetries that, when neglected, can significantly inhibit the learning and performance of a naive deep RL approach. Using a… ▽ More Deep reinforcement learning (RL) is a data-driven, model-free method capable of discovering complex control strategies for macroscopic objectives in high-dimensional systems, making its application towards flow control promising. Many systems of flow control interest possess symmetries that, when neglected, can significantly inhibit the learning and performance of a naive deep RL approach. Using a test-bed consisting of the Kuramoto-Sivashinsky Equation (KSE), equally spaced actuators, and a goal of minimizing dissipation and power cost, we demonstrate that by moving the deep RL problem to a symmetry-reduced space, we can alleviate limitations inherent in the naive application of deep RL. We demonstrate that symmetry-reduced deep RL yields improved data efficiency as well as improved control policy efficacy compared to policies found by naive deep RL. Interestingly, the policy learned by the the symmetry aware control agent drives the system toward an equilibrium state of the forced KSE that is connected by continuation to an equilibrium of the unforced KSE, despite having been given no explicit information regarding its existence. I.e., to achieve its goal, the RL algorithm discovers and stabilizes an equilibrium state of the system. Finally, we demonstrate that the symmetry-reduced control policy is robust to observation and actuation signal noise, as well as to system parameters it has not observed before. △ Less

Submitted 9 April, 2021; originally announced April 2021.

Comments: Submitted to Physical Review E

Journal ref: Phys. Rev. E 104, 014210 (2021)

arXiv:2103.08201 [pdf, other]

Geometric Change Detection in Digital Twins using 3D Machine Learning

Authors: Tiril Sundby, Julia Maria Graham, Adil Rasheed, Mandar Tabib, Omer San

Abstract: Digital twins are meant to bridge the gap between real-world physical systems and virtual representations. Both stand-alone and descriptive digital twins incorporate 3D geometric models, which are the physical representations of objects in the digital replica. Digital twin applications are required to rapidly update internal parameters with the evolution of their physical counterpart. Due to an es… ▽ More Digital twins are meant to bridge the gap between real-world physical systems and virtual representations. Both stand-alone and descriptive digital twins incorporate 3D geometric models, which are the physical representations of objects in the digital replica. Digital twin applications are required to rapidly update internal parameters with the evolution of their physical counterpart. Due to an essential need for having high-quality geometric models for accurate physical representations, the storage and bandwidth requirements for storing 3D model information can quickly exceed the available storage and bandwidth capacity. In this work, we demonstrate a novel approach to geometric change detection in the context of a digital twin. We address the issue through a combined solution of Dynamic Mode Decomposition (DMD) for motion detection, YOLOv5 for object detection, and 3D machine learning for pose estimation. DMD is applied for background subtraction, enabling detection of moving foreground objects in real-time. The video frames containing detected motion are extracted and used as input to the change detection network. The object detection algorithm YOLOv5 is applied to extract the bounding boxes of detected objects in the video frames. Furthermore, the rotational pose of each object is estimated in a 3D pose estimation network. A series of convolutional neural networks conducts feature extraction from images and 3D model shapes. Then, the network outputs the estimated Euler angles of the camera orientation with respect to the object in the input image. By only storing data associated with a detected change in pose, we minimize necessary storage and bandwidth requirements while still being able to recreate the 3D scene on demand. △ Less

Submitted 15 March, 2021; originally announced March 2021.

arXiv:2102.13352 [pdf, other]

doi 10.3847/1538-3881/abea7b

Tails: Chasing Comets with the Zwicky Transient Facility and Deep Learning

Authors: Dmitry A. Duev, Bryce T. Bolin, Matthew J. Graham, Michael S. P. Kelley, Ashish Mahabal, Eric C. Bellm, Michael W. Coughlin, Richard Dekany, George Helou, Shrinivas R. Kulkarni, Frank J. Masci, Thomas A. Prince, Reed Riddle, Maayane T. Soumagnac, Stéfan J. van der Walt

Abstract: We present Tails, an open-source deep-learning framework for the identification and localization of comets in the image data of the Zwicky Transient Facility (ZTF), a robotic optical time-domain survey currently in operation at the Palomar Observatory in California, USA. Tails employs a custom EfficientDet-based architecture and is capable of finding comets in single images in near real time, rath… ▽ More We present Tails, an open-source deep-learning framework for the identification and localization of comets in the image data of the Zwicky Transient Facility (ZTF), a robotic optical time-domain survey currently in operation at the Palomar Observatory in California, USA. Tails employs a custom EfficientDet-based architecture and is capable of finding comets in single images in near real time, rather than requiring multiple epochs as with traditional methods. The system achieves state-of-the-art performance with 99% recall, 0.01% false positive rate, and 1-2 pixel root mean square error in the predicted position. We report the initial results of the Tails efficiency evaluation in a production setting on the data of the ZTF Twilight survey, including the first AI-assisted discovery of a comet (C/2020 T2) and the recovery of a comet (P/2016 J3 = P/2021 A3). △ Less

Submitted 26 February, 2021; originally announced February 2021.

arXiv:2011.00867 [pdf, other]

Accessible Data Curation and Analytics for International-Scale Citizen Science Datasets

Authors: Benjamin Murray, Eric Kerfoot, Mark S. Graham, Carole H. Sudre, Erika Molteni, Liane S. Canas, Michela Antonelli, Kerstin Klaser, Alessia Visconti, Andrew T. Chan, Paul W. Franks, Richard Davies, Jonathan Wolf, Tim Spector, Claire J. Steves, Marc Modat, Sebastien Ourselin

Abstract: The Covid Symptom Study, a smartphone-based surveillance study on COVID-19 symptoms in the population, is an exemplar of big data citizen science. Over 4.7 million participants and 189 million unique assessments have been logged since its introduction in March 2020. The success of the Covid Symptom Study creates technical challenges around effective data curation for two reasons. Firstly, the scal… ▽ More The Covid Symptom Study, a smartphone-based surveillance study on COVID-19 symptoms in the population, is an exemplar of big data citizen science. Over 4.7 million participants and 189 million unique assessments have been logged since its introduction in March 2020. The success of the Covid Symptom Study creates technical challenges around effective data curation for two reasons. Firstly, the scale of the dataset means that it can no longer be easily processed using standard software on commodity hardware. Secondly, the size of the research group means that replicability and consistency of key analytics used across multiple publications becomes an issue. We present ExeTera, an open source data curation software designed to address scalability challenges and to enable reproducible research across an international research group for datasets such as the Covid Symptom Study dataset. △ Less

Submitted 17 February, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

ACM Class: D.m; E.2; H.3.3; I.7

arXiv:2010.01926 [pdf, other]

Test-time Unsupervised Domain Adaptation

Authors: Thomas Varsavsky, Mauricio Orbes-Arteaga, Carole H. Sudre, Mark S. Graham, Parashkev Nachev, M. Jorge Cardoso

Abstract: Convolutional neural networks trained on publicly available medical imaging datasets (source domain) rarely generalise to different scanners or acquisition protocols (target domain). This motivates the active field of domain adaptation. While some approaches to the problem require labeled data from the target domain, others adopt an unsupervised approach to domain adaptation (UDA). Evaluating UDA… ▽ More Convolutional neural networks trained on publicly available medical imaging datasets (source domain) rarely generalise to different scanners or acquisition protocols (target domain). This motivates the active field of domain adaptation. While some approaches to the problem require labeled data from the target domain, others adopt an unsupervised approach to domain adaptation (UDA). Evaluating UDA methods consists of measuring the model's ability to generalise to unseen data in the target domain. In this work, we argue that this is not as useful as adapting to the test set directly. We therefore propose an evaluation framework where we perform test-time UDA on each subject separately. We show that models adapted to a specific target subject from the target domain outperform a domain adaptation method which has seen more data of the target domain but not this specific target subject. This result supports the thesis that unsupervised domain adaptation should be used at test-time, even if only using a single target-domain subject △ Less

Submitted 5 October, 2020; originally announced October 2020.

Comments: Accepted at MICCAI 2020

arXiv:2009.07573 [pdf, other]

Hierarchical brain parcellation with uncertainty

Authors: Mark S. Graham, Carole H. Sudre, Thomas Varsavsky, Petru-Daniel Tudosiu, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

Abstract: Many atlases used for brain parcellation are hierarchically organised, progressively dividing the brain into smaller sub-regions. However, state-of-the-art parcellation methods tend to ignore this structure and treat labels as if they are `flat'. We introduce a hierarchically-aware brain parcellation method that works by predicting the decisions at each branch in the label tree. We further show ho… ▽ More Many atlases used for brain parcellation are hierarchically organised, progressively dividing the brain into smaller sub-regions. However, state-of-the-art parcellation methods tend to ignore this structure and treat labels as if they are `flat'. We introduce a hierarchically-aware brain parcellation method that works by predicting the decisions at each branch in the label tree. We further show how this method can be used to model uncertainty separately for every branch in this label tree. Our method exceeds the performance of flat uncertainty methods, whilst also providing decomposed uncertainty estimates that enable us to obtain self-consistent parcellations and uncertainty maps at any level of the label hierarchy. We demonstrate a simple way these decision-specific uncertainty maps may be used to provided uncertainty-thresholded tissue maps at any level of the label tree. △ Less

Submitted 16 September, 2020; originally announced September 2020.

Comments: To be published in the MICCAI 2020 workshop: Uncertainty for Safe Utilization of Machine Learning in Medical Imaging

arXiv:2002.05692 [pdf, other]

Neuromorphologicaly-preserving Volumetric data encoding using VQ-VAE

Authors: Petru-Daniel Tudosiu, Thomas Varsavsky, Richard Shaw, Mark Graham, Parashkev Nachev, Sebastien Ourselin, Carole H. Sudre, M. Jorge Cardoso

Abstract: The increasing efficiency and compactness of deep learning architectures, together with hardware improvements, have enabled the complex and high-dimensional modelling of medical volumetric data at higher resolutions. Recently, Vector-Quantised Variational Autoencoders (VQ-VAE) have been proposed as an efficient generative unsupervised learning approach that can encode images to a small percentage… ▽ More The increasing efficiency and compactness of deep learning architectures, together with hardware improvements, have enabled the complex and high-dimensional modelling of medical volumetric data at higher resolutions. Recently, Vector-Quantised Variational Autoencoders (VQ-VAE) have been proposed as an efficient generative unsupervised learning approach that can encode images to a small percentage of their initial size, while preserving their decoded fidelity. Here, we show a VQ-VAE inspired network can efficiently encode a full-resolution 3D brain volume, compressing the data to $0.825\%$ of the original size while maintaining image fidelity, and significantly outperforming the previous state-of-the-art. We then demonstrate that VQ-VAE decoded images preserve the morphological characteristics of the original data through voxel-based morphology and segmentation experiments. Lastly, we show that such models can be pre-trained and then fine-tuned on different datasets without the introduction of bias. △ Less

Submitted 13 February, 2020; originally announced February 2020.

arXiv:2002.00854 [pdf]

doi 10.1371/journal.pone.0233660

Measuring relative opinion from location-based social media: A case study of the 2016 U.S. presidential election

Authors: Zhaoya Gong, Tengteng Cai, Jean-Claude Thill, Scott Hale, Mark Graham

Abstract: Social media has become an emerging alternative to opinion polls for public opinion collection, while it is still posing many challenges as a passive data source, such as structurelessness, quantifiability, and representativeness. Social media data with geotags provide new opportunities to unveil the geographic locations of users expressing their opinions. This paper aims to answer two questions:… ▽ More Social media has become an emerging alternative to opinion polls for public opinion collection, while it is still posing many challenges as a passive data source, such as structurelessness, quantifiability, and representativeness. Social media data with geotags provide new opportunities to unveil the geographic locations of users expressing their opinions. This paper aims to answer two questions: 1) whether quantifiable measurement of public opinion can be obtained from social media and 2) whether it can produce better or complementary measures compared to opinion polls. This research proposes a novel approach to measure the relative opinion of Twitter users towards public issues in order to accommodate more complex opinion structures and take advantage of the geography pertaining to the public issues. To ensure that this new measure is technically feasible, a modeling framework is developed including building a training dataset by adopting a state-of-the-art approach and devising a new deep learning method called Opinion-Oriented Word Embedding. With a case study of the tweets selected for the 2016 U.S. presidential election, we demonstrate the predictive superiority of our relative opinion approach and we show how it can aid visual analytics and support opinion predictions. Although the relative opinion measure is proved to be more robust compared to polling, our study also suggests that the former can advantageously complement the later in opinion prediction. △ Less

Submitted 20 April, 2020; v1 submitted 3 February, 2020; originally announced February 2020.

Journal ref: PLoS ONE 15(5): e0233660 (2020)

arXiv:2001.04263 [pdf, other]

doi 10.1103/PhysRevE.101.062209

Deep learning to discover and predict dynamics on an inertial manifold

Authors: Alec J. Linot, Michael D. Graham

Abstract: A data-driven framework is developed to represent chaotic dynamics on an inertial manifold (IM), and applied to solutions of the Kuramoto-Sivashinsky equation. A hybrid method combining linear and nonlinear (neural-network) dimension reduction transforms between coordinates in the full state space and on the IM. Additional neural networks predict time-evolution on the IM. The formalism accounts fo… ▽ More A data-driven framework is developed to represent chaotic dynamics on an inertial manifold (IM), and applied to solutions of the Kuramoto-Sivashinsky equation. A hybrid method combining linear and nonlinear (neural-network) dimension reduction transforms between coordinates in the full state space and on the IM. Additional neural networks predict time-evolution on the IM. The formalism accounts for translation invariance and energy conservation, and substantially outperforms linear dimension reduction, reproducing very well key dynamic and statistical features of the attractor. △ Less

Submitted 21 May, 2020; v1 submitted 20 December, 2019; originally announced January 2020.

Comments: Accepted in Physical Review E

Journal ref: Phys. Rev. E 101, 062209 (2020)

arXiv:1911.11779 [pdf, other]

doi 10.1038/s42254-019-0097-4

Enabling real-time multi-messenger astrophysics discoveries with deep learning

Authors: E. A. Huerta, Gabrielle Allen, Igor Andreoni, Javier M. Antelis, Etienne Bachelet, Bruce Berriman, Federica Bianco, Rahul Biswas, Matias Carrasco, Kyle Chard, Minsik Cho, Philip S. Cowperthwaite, Zachariah B. Etienne, Maya Fishbach, Francisco Förster, Daniel George, Tom Gibbs, Matthew Graham, William Gropp, Robert Gruendl, Anushri Gupta, Roland Haas, Sarah Habib, Elise Jennings, Margaret W. G. Johnson , et al. (35 additional authors not shown)

Abstract: Multi-messenger astrophysics is a fast-growing, interdisciplinary field that combines data, which vary in volume and speed of data processing, from many different instruments that probe the Universe using different cosmic messengers: electromagnetic waves, cosmic rays, gravitational waves and neutrinos. In this Expert Recommendation, we review the key challenges of real-time observations of gravit… ▽ More Multi-messenger astrophysics is a fast-growing, interdisciplinary field that combines data, which vary in volume and speed of data processing, from many different instruments that probe the Universe using different cosmic messengers: electromagnetic waves, cosmic rays, gravitational waves and neutrinos. In this Expert Recommendation, we review the key challenges of real-time observations of gravitational wave sources and their electromagnetic and astroparticle counterparts, and make a number of recommendations to maximize their potential for scientific discovery. These recommendations refer to the design of scalable and computationally efficient machine learning algorithms; the cyber-infrastructure to numerically simulate astrophysical sources, and to process and interpret multi-messenger astrophysics data; the management of gravitational wave detections to trigger real-time alerts for electromagnetic and astroparticle follow-ups; a vision to harness future developments of machine learning and cyber-infrastructure resources to cope with the big-data requirements; and the need to build a community of experts to realize the goals of multi-messenger astrophysics. △ Less

Submitted 26 November, 2019; originally announced November 2019.

Comments: Invited Expert Recommendation for Nature Reviews Physics. The art work produced by E. A. Huerta and Shawn Rosofsky for this article was used by Carl Conway to design the cover of the October 2019 issue of Nature Reviews Physics

Journal ref: Nature Reviews Physics volume 1, pages 600-608 (2019)

arXiv:1909.06429 [pdf, other]

Recommendation or Discrimination?: Quantifying Distribution Parity in Information Retrieval Systems

Authors: Rinat Khaziev, Bryce Casavant, Pearce Washabaugh, Amy A. Winecoff, Matthew Graham

Abstract: Information retrieval (IR) systems often leverage query data to suggest relevant items to users. This introduces the possibility of unfairness if the query (i.e., input) and the resulting recommendations unintentionally correlate with latent factors that are protected variables (e.g., race, gender, and age). For instance, a visual search system for fashion recommendations may pick up on features o… ▽ More Information retrieval (IR) systems often leverage query data to suggest relevant items to users. This introduces the possibility of unfairness if the query (i.e., input) and the resulting recommendations unintentionally correlate with latent factors that are protected variables (e.g., race, gender, and age). For instance, a visual search system for fashion recommendations may pick up on features of the human models rather than fashion garments when generating recommendations. In this work, we introduce a statistical test for "distribution parity" in the top-K IR results, which assesses whether a given set of recommendations is fair with respect to a specific protected variable. We evaluate our test using both simulated and empirical results. First, using artificially biased recommendations, we demonstrate the trade-off between statistically detectable bias and the size of the search catalog. Second, we apply our test to a visual search system for fashion garments, specifically testing for recommendation bias based on the skin tone of fashion models. Our distribution parity test can help ensure that IR systems' results are fair and produce a good experience for all users. △ Less

Submitted 13 September, 2019; originally announced September 2019.

arXiv:1907.05164 [pdf]

Disease classification of macular Optical Coherence Tomography scans using deep learning software: validation on independent, multi-centre data

Authors: Kanwal K. Bhatia, Mark S. Graham, Louise Terry, Ashley Wood, Paris Tranos, Sameer Trikha, Nicolas Jaccard

Abstract: Purpose: To evaluate Pegasus-OCT, a clinical decision support software for the identification of features of retinal disease from macula OCT scans, across heterogenous populations involving varying patient demographics, device manufacturers, acquisition sites and operators. Methods: 5,588 normal and anomalous macular OCT volumes (162,721 B-scans), acquired at independent centres in five countrie… ▽ More Purpose: To evaluate Pegasus-OCT, a clinical decision support software for the identification of features of retinal disease from macula OCT scans, across heterogenous populations involving varying patient demographics, device manufacturers, acquisition sites and operators. Methods: 5,588 normal and anomalous macular OCT volumes (162,721 B-scans), acquired at independent centres in five countries, were processed using the software. Results were evaluated against ground truth provided by the dataset owners. Results: Pegasus-OCT performed with AUROCs of at least 98% for all datasets in the detection of general macular anomalies. For scans of sufficient quality, the AUROCs for general AMD and DME detection were found to be at least 99% and 98%, respectively. Conclusions: The ability of a clinical decision support system to cater for different populations is key to its adoption. Pegasus-OCT was shown to be able to detect AMD, DME and general anomalies in OCT volumes acquired across multiple independent sites with high performance. Its use thus offers substantial promise, with the potential to alleviate the burden of growing demand in eye care services caused by retinal disease. △ Less

Submitted 11 July, 2019; originally announced July 2019.

arXiv:1905.06457 [pdf, other]

An interdisciplinary survey of network similarity methods

Authors: Emily Evans, Marissa Graham

Abstract: Comparative graph and network analysis play an important role in both systems biology and pattern recognition, but existing surveys on the topic have historically ignored or underserved one or the other of these fields. We present an integrative introduction to the key objectives and methods of graph and network comparison in each field, with the intent of remaining accessible to relative novices… ▽ More Comparative graph and network analysis play an important role in both systems biology and pattern recognition, but existing surveys on the topic have historically ignored or underserved one or the other of these fields. We present an integrative introduction to the key objectives and methods of graph and network comparison in each field, with the intent of remaining accessible to relative novices in order to mitigate the barrier to interdisciplinary idea crossover. To guide our investigation, and to quantitatively justify our assertions about what the key objectives and methods of each field are, we have constructed a citation network containing 5,793 vertices from the full reference lists of over two hundred relevant papers, which we collected by searching Google Scholar for ten different network comparison-related search terms. We investigate its basic statistics and community structure, and frame our presentation around the papers found to have high importance according to five different standard centrality measures. △ Less

Submitted 15 May, 2019; originally announced May 2019.

arXiv:1902.00522 [pdf, ps, other]

Deep Learning for Multi-Messenger Astrophysics: A Gateway for Discovery in the Big Data Era

Authors: Gabrielle Allen, Igor Andreoni, Etienne Bachelet, G. Bruce Berriman, Federica B. Bianco, Rahul Biswas, Matias Carrasco Kind, Kyle Chard, Minsik Cho, Philip S. Cowperthwaite, Zachariah B. Etienne, Daniel George, Tom Gibbs, Matthew Graham, William Gropp, Anushri Gupta, Roland Haas, E. A. Huerta, Elise Jennings, Daniel S. Katz, Asad Khan, Volodymyr Kindratenko, William T. C. Kramer, Xin Liu, Ashish Mahabal , et al. (23 additional authors not shown)

Abstract: This report provides an overview of recent work that harnesses the Big Data Revolution and Large Scale Computing to address grand computational challenges in Multi-Messenger Astrophysics, with a particular emphasis on real-time discovery campaigns. Acknowledging the transdisciplinary nature of Multi-Messenger Astrophysics, this document has been prepared by members of the physics, astronomy, compu… ▽ More This report provides an overview of recent work that harnesses the Big Data Revolution and Large Scale Computing to address grand computational challenges in Multi-Messenger Astrophysics, with a particular emphasis on real-time discovery campaigns. Acknowledging the transdisciplinary nature of Multi-Messenger Astrophysics, this document has been prepared by members of the physics, astronomy, computer science, data science, software and cyberinfrastructure communities who attended the NSF-, DOE- and NVIDIA-funded "Deep Learning for Multi-Messenger Astrophysics: Real-time Discovery at Scale" workshop, hosted at the National Center for Supercomputing Applications, October 17-19, 2018. Highlights of this report include unanimous agreement that it is critical to accelerate the development and deployment of novel, signal-processing algorithms that use the synergy between artificial intelligence (AI) and high performance computing to maximize the potential for scientific discovery with Multi-Messenger Astrophysics. We discuss key aspects to realize this endeavor, namely (i) the design and exploitation of scalable and computationally efficient AI algorithms for Multi-Messenger Astrophysics; (ii) cyberinfrastructure requirements to numerically simulate astrophysical sources, and to process and interpret Multi-Messenger Astrophysics data; (iii) management of gravitational wave detections and triggers to enable electromagnetic and astro-particle follow-ups; (iv) a vision to harness future developments of machine and deep learning and cyberinfrastructure resources to cope with the scale of discovery in the Big Data Era; (v) and the need to build a community that brings domain experts together with data scientists on equal footing to maximize and accelerate discovery in the nascent field of Multi-Messenger Astrophysics. △ Less

Submitted 1 February, 2019; originally announced February 2019.

Comments: 15 pages, no figures. White paper based on the "Deep Learning for Multi-Messenger Astrophysics: Real-time Discovery at Scale" workshop, hosted at NCSA, October 17-19, 2018 http://www.ncsa.illinois.edu/Conferences/DeepLearningLSST/

arXiv:1712.10068 [pdf, other]

doi 10.1145/3178876.3186094

Platform Criminalism: The 'Last-Mile' Geography of the Darknet Market Supply Chain

Authors: Martin Dittus, Joss Wright, Mark Graham

Abstract: Does recent growth of darknet markets signify a slow reorganisation of the illicit drug trade? Where are darknet markets situated in the global drug supply chain? In principle, these platforms allow producers to sell directly to end users, bypassing traditional trafficking routes. And yet, there is evidence that many offerings originate from a small number of highly active consumer countries, rath… ▽ More Does recent growth of darknet markets signify a slow reorganisation of the illicit drug trade? Where are darknet markets situated in the global drug supply chain? In principle, these platforms allow producers to sell directly to end users, bypassing traditional trafficking routes. And yet, there is evidence that many offerings originate from a small number of highly active consumer countries, rather than from countries that are primarily known for drug production. In a large-scale empirical study, we determine the darknet trading geography of three plant-based drugs across four of the largest darknet markets, and compare it to the global footprint of production and consumption for these drugs. We present strong evidence that cannabis and cocaine vendors are primarily located in a small number of consumer countries, rather than producer countries, suggesting that darknet trading happens at the 'last mile', possibly leaving old trafficking routes intact. A model to explain trading volumes of opiates is inconclusive. We cannot find evidence for significant production-side offerings across any of the drug types or marketplaces. Our evidence further suggests that the geography of darknet market trades is primarily driven by existing consumer demand, rather than new demand fostered by individual markets. △ Less

Submitted 11 March, 2018; v1 submitted 28 December, 2017; originally announced December 2017.

ACM Class: K.4.1, K.4.4

arXiv:1709.06257 [pdf, other]

doi 10.1109/SSCI.2017.8280984

Deep-Learnt Classification of Light Curves

Authors: Ashish Mahabal, Kshiteej Sheth, Fabian Gieseke, Akshay Pai, S. George Djorgovski, Andrew Drake, Matthew Graham, the CSS/CRTS/PTF Collaboration

Abstract: Astronomy light curves are sparse, gappy, and heteroscedastic. As a result standard time series methods regularly used for financial and similar datasets are of little help and astronomers are usually left to their own instruments and techniques to classify light curves. A common approach is to derive statistical features from the time series and to use machine learning methods, generally supervis… ▽ More Astronomy light curves are sparse, gappy, and heteroscedastic. As a result standard time series methods regularly used for financial and similar datasets are of little help and astronomers are usually left to their own instruments and techniques to classify light curves. A common approach is to derive statistical features from the time series and to use machine learning methods, generally supervised, to separate objects into a few of the standard classes. In this work, we transform the time series to two-dimensional light curve representations in order to classify them using modern deep learning techniques. In particular, we show that convolutional neural networks based classifiers work well for broad characterization and classification. We use labeled datasets of periodic variables from CRTS survey and show how this opens doors for a quick classification of diverse classes with several possible exciting extensions. △ Less

Submitted 19 September, 2017; originally announced September 2017.

Comments: 8 pages, 9 figures, 6 tables, 2 listings. Accepted to 2017 IEEE Symposium Series on Computational Intelligence (SSCI)

Journal ref: 2017 IEEE Symposium Series on Computational Intelligence (SSCI), Honolulu, HI, USA, 2017, p2757

arXiv:1605.02688 [pdf, other]

Theano: A Python framework for fast computation of mathematical expressions

Authors: The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano , et al. (88 additional authors not shown)

Abstract: Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively and continuously developed since 2008, mu… ▽ More Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively and continuously developed since 2008, multiple frameworks have been built on top of it and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it. △ Less

Submitted 9 May, 2016; originally announced May 2016.

Comments: 19 pages, 5 figures

arXiv:1605.02265 [pdf, other]

doi 10.1016/j.jss.2018.03.047

Software search is not a science, even among scientists: A survey of how scientists and engineers find software

Authors: Michael Hucka, Matthew J. Graham

Abstract: Improved software discovery is a prerequisite for greater software reuse: after all, if someone cannot find software for a particular task, they cannot reuse it. Understanding people's approaches and preferences when they look for software could help improve facilities for software discovery. We surveyed people working in several scientific and engineering fields to better understand their approac… ▽ More Improved software discovery is a prerequisite for greater software reuse: after all, if someone cannot find software for a particular task, they cannot reuse it. Understanding people's approaches and preferences when they look for software could help improve facilities for software discovery. We surveyed people working in several scientific and engineering fields to better understand their approaches and selection criteria. We found that even among highly-trained people, the rudimentary approaches of relying on general Web searches, the opinions of colleagues, and the literature were still the most commonly used. However, those who were involved in software development differed from nondevelopers in their use of social help sites, software project repositories, software catalogs, and organization-specific mailing lists or forums. For example, software developers in our sample were more likely to search in community sites such as Stack Overflow even when seeking ready-to-run software rather than source code, and likewise, asking colleagues was significantly more important when looking for ready-to-run software. Our survey also provides insight into the criteria that matter most to people when they are searching for ready-to-run software. Finally, our survey also identifies some factors that can prevent people from finding software. △ Less

Submitted 28 May, 2018; v1 submitted 7 May, 2016; originally announced May 2016.

Journal ref: Journal of Systems and Software, 141:171-191, 2018

arXiv:1601.04385 [pdf]

doi 10.1016/j.future.2015.10.013

Real-Time Data Mining of Massive Data Streams from Synoptic Sky Surveys

Authors: S. G. Djorgovski, M. J. Graham, C. Donalek, A. A. Mahabal, A. J. Drake, M. Turmon, T. Fuchs

Abstract: The nature of scientific and technological data collection is evolving rapidly: data volumes and rates grow exponentially, with increasing complexity and information content, and there has been a transition from static data sets to data streams that must be analyzed in real time. Interesting or anomalous phenomena must be quickly characterized and followed up with additional measurements via optim… ▽ More The nature of scientific and technological data collection is evolving rapidly: data volumes and rates grow exponentially, with increasing complexity and information content, and there has been a transition from static data sets to data streams that must be analyzed in real time. Interesting or anomalous phenomena must be quickly characterized and followed up with additional measurements via optimal deployment of limited assets. Modern astronomy presents a variety of such phenomena in the form of transient events in digital synoptic sky surveys, including cosmic explosions (supernovae, gamma ray bursts), relativistic phenomena (black hole formation, jets), potentially hazardous asteroids, etc. We have been developing a set of machine learning tools to detect, classify and plan a response to transient events for astronomy applications, using the Catalina Real-time Transient Survey (CRTS) as a scientific and methodological testbed. The ability to respond rapidly to the potentially most interesting events is a key bottleneck that limits the scientific returns from the current and anticipated synoptic sky surveys. Similar challenge arise in other contexts, from environmental monitoring using sensor networks to autonomous spacecraft systems. Given the exponential growth of data rates, and the time-critical response, we need a fully automated and robust approach. We describe the results obtained to date, and the possible future developments. △ Less

Submitted 17 January, 2016; originally announced January 2016.

Comments: 14 pages, an invited paper for a special issue of Future Generation Computer Systems, Elsevier Publ. (2015). This is an expanded version of a paper arXiv:1407.3502 presented at the IEEE e-Science 2014 conf., with some new content

arXiv:1410.7670 [pdf]

doi 10.1109/BigData.2014.7004282

Immersive and Collaborative Data Visualization Using Virtual Reality Platforms

Authors: Ciro Donalek, S. G. Djorgovski, Scott Davidoff, Alex Cioc, Anwell Wang, Giuseppe Longo, Jeffrey S. Norris, Jerry Zhang, Elizabeth Lawler, Stacy Yeh, Ashish Mahabal, Matthew Graham, Andrew Drake

Abstract: Effective data visualization is a key part of the discovery process in the era of big data. It is the bridge between the quantitative content of the data and human intuition, and thus an essential component of the scientific path from data into knowledge and understanding. Visualization is also essential in the data mining process, directing the choice of the applicable algorithms, and in helping… ▽ More Effective data visualization is a key part of the discovery process in the era of big data. It is the bridge between the quantitative content of the data and human intuition, and thus an essential component of the scientific path from data into knowledge and understanding. Visualization is also essential in the data mining process, directing the choice of the applicable algorithms, and in helping to identify and remove bad data from the analysis. However, a high complexity or a high dimensionality of modern data sets represents a critical obstacle. How do we visualize interesting structures and patterns that may exist in hyper-dimensional data spaces? A better understanding of how we can perceive and interact with multi dimensional information poses some deep questions in the field of cognition technology and human computer interaction. To this effect, we are exploring the use of immersive virtual reality platforms for scientific data visualization, both as software and inexpensive commodity hardware. These potentially powerful and innovative tools for multi dimensional data visualization can also provide an easy and natural path to a collaborative data visualization and exploration, where scientists can interact with their data and their colleagues in the same visual space. Immersion provides benefits beyond the traditional desktop visualization tools: it leads to a demonstrably better perception of a datascape geometry, more intuitive data understanding, and a better retention of the perceived relationships in the data. △ Less

Submitted 28 October, 2014; originally announced October 2014.

Comments: 6 pages, refereed proceedings of 2014 IEEE International Conference on Big Data, page 609, ISBN 978-1-4799-5665-4

arXiv:1407.3692 [pdf]

Helium: Visualization of Large Scale Plant Pedigrees

Authors: Paul D. Shaw, Martin Graham, Jessie Kennedy, Iain Milne, David F. Marshall

Abstract: Background: Plant breeders are utilising an increasingly diverse range of data types in order to identify lines that have desirable characteristics which are suitable to be taken forward in plant breeding programmes. There are a number of key morphological and physiological traits such as disease resistance and yield that are required to be maintained, and improved upon if a commercial variety is… ▽ More Background: Plant breeders are utilising an increasingly diverse range of data types in order to identify lines that have desirable characteristics which are suitable to be taken forward in plant breeding programmes. There are a number of key morphological and physiological traits such as disease resistance and yield that are required to be maintained, and improved upon if a commercial variety is to be successful. Computational tools that provide the ability to pull this data together, and integrate with pedigree structure, will enable breeders to make better decisions on which plant lines are used in crossings to meet both critical demands for increased yield/production and adaptation to climate change. Results: We have used a large and unique set of experimental barley (H. vulgare) data to develop a prototype pedigree visualization system and performed a subjective user evaluation with domain experts to guide and direct the development of an interactive pedigree visualization tool which we have called Helium. Conclusions: We show that Helium allows users to easily integrate a number of data types along with large plant pedigrees to offer an integrated environment in which they can explore pedigree data. We have also verified that users were happy with the abstract representation of pedigrees that we have used in our visualization tool. △ Less

Submitted 11 July, 2014; originally announced July 2014.

Comments: BioVis 2014 conference

arXiv:1310.1976 [pdf]

doi 10.1109/BigData.2013.6691731

Feature Selection Strategies for Classifying High Dimensional Astronomical Data Sets

Authors: Ciro Donalek, Arun Kumar A., S. G. Djorgovski, Ashish A. Mahabal, Matthew J. Graham, Thomas J. Fuchs, Michael J. Turmon, N. Sajeeth Philip, Michael Ting-Chang Yang, Giuseppe Longo

Abstract: The amount of collected data in many scientific fields is increasing, all of them requiring a common task: extract knowledge from massive, multi parametric data sets, as rapidly and efficiently possible. This is especially true in astronomy where synoptic sky surveys are enabling new research frontiers in the time domain astronomy and posing several new object classification challenges in multi di… ▽ More The amount of collected data in many scientific fields is increasing, all of them requiring a common task: extract knowledge from massive, multi parametric data sets, as rapidly and efficiently possible. This is especially true in astronomy where synoptic sky surveys are enabling new research frontiers in the time domain astronomy and posing several new object classification challenges in multi dimensional spaces; given the high number of parameters available for each object, feature selection is quickly becoming a crucial task in analyzing astronomical data sets. Using data sets extracted from the ongoing Catalina Real-Time Transient Surveys (CRTS) and the Kepler Mission we illustrate a variety of feature selection strategies used to identify the subsets that give the most information and the results achieved applying these techniques to three major astronomical problems. △ Less

Submitted 7 October, 2013; originally announced October 2013.

Comments: 7 pages, to appear in refereed proceedings of Scalable Machine Learning: Theory and Applications, IEEE BigData 2013

arXiv:1308.0683 [pdf]

doi 10.1080/00330124.2014.907699

Where in the World are You? Geolocation and Language Identification in Twitter

Authors: Mark Graham, Scott A. Hale, Devin Gaffney

Abstract: The movements of ideas and content between locations and languages are unquestionably crucial concerns to researchers of the information age, and Twitter has emerged as a central, global platform on which hundreds of millions of people share knowledge and information. A variety of research has attempted to harvest locational and linguistic metadata from tweets in order to understand important ques… ▽ More The movements of ideas and content between locations and languages are unquestionably crucial concerns to researchers of the information age, and Twitter has emerged as a central, global platform on which hundreds of millions of people share knowledge and information. A variety of research has attempted to harvest locational and linguistic metadata from tweets in order to understand important questions related to the 300 million tweets that flow through the platform each day. However, much of this work is carried out with only limited understandings of how best to work with the spatial and linguistic contexts in which the information was produced. Furthermore, standard, well-accepted practices have yet to emerge. As such, this paper studies the reliability of key methods used to determine language and location of content in Twitter. It compares three automated language identification packages to Twitter's user interface language setting and to a human coding of languages in order to identify common sources of disagreement. The paper also demonstrates that in many cases user-entered profile locations differ from the physical locations users are actually tweeting from. As such, these open-ended, user-generated, profile locations cannot be used as useful proxies for the physical locations from which information is published to Twitter. △ Less

Submitted 3 August, 2013; originally announced August 2013.

Showing 1–50 of 54 results for author: Graham, M