A Quadrature Approach for General-Purpose Batch Bayesian Optimization via Probabilistic Lifting

Authors: Masaki Adachi, Satoshi Hayakawa, Martin Jørgensen, Saad Hamid, Harald Oberhauser, Michael A. Osborne

Abstract: Parallelisation in Bayesian optimisation is a common strategy but faces several challenges: the need for flexibility in acquisition functions and kernel choices, flexibility dealing with discrete and continuous variables simultaneously, model misspecification, and lastly fast massive parallelisation. To address these challenges, we introduce a versatile and modular framework for batch Bayesian optimisation via probabilistic lifting with kernel quadrature, called SOBER, which we present as a Python library based on GPyTorch/BoTorch. Our framework offers the following unique benefits: (1) Versatility in downstream tasks under a unified approach. (2) A gradient-free sampler, which does not require the gradient of acquisition functions, offering domain-agnostic sampling (e.g., discrete and mixed variables, non-Euclidean space). (3) Flexibility in domain prior distribution. (4) Adaptive batch size (autonomous determination of the optimal batch size). (5) Robustness against a misspecified reproducing kernel Hilbert space. (6) Natural stopping criterion.

Submitted 19 April, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

Comments: This work is the journal extension of the workshop paper (arXiv:2301.11832) and AISTATS paper (arXiv:2306.05843). 48 pages, 11 figures

MSC Class: 62C10; 62F15

arXiv:2404.07008 [pdf, other]

doi 10.1007/978-3-031-63787-2_9

Knowledge graphs for empirical concept retrieval

Authors: Lenka Tětková, Teresa Karen Scheidt, Maria Mandrup Fogh, Ellen Marie Gaunby Jørgensen, Finn Årup Nielsen, Lars Kai Hansen

Abstract: Concept-based explainable AI is promising as a tool to improve the understanding of complex models at the premises of a given user, viz. as a tool for personalized explainability. An important class of concept-based explainability methods is constructed with empirically defined concepts, indirectly defined through a set of positive and negative examples, as in the TCAV approach (Kim et al., 2018). While it is appealing to the user to avoid formal definitions of concepts and their operationalization, it can be challenging to establish relevant concept datasets. Here, we address this challenge using general knowledge graphs (e.g., Wikidata or WordNet) for comprehensive concept definition and present a workflow for user-driven data collection in both text and image domains. The concepts derived from knowledge graphs are defined interactively, providing an opportunity for personalization and ensuring that the concepts reflect the user's intentions. We test the retrieved concept datasets on two concept-based explainability methods, namely concept activation vectors (CAVs) and concept activation regions (CARs) (Crabbe and van der Schaar, 2022). We show that CAVs and CARs based on these empirical concept datasets provide robust and accurate explanations. Importantly, we also find good alignment between the models' representations of concepts and the structure of knowledge graphs, i.e., human representations. This supports our conclusion that knowledge graph-based concepts are relevant for XAI.

Submitted 10 April, 2024; originally announced April 2024.

Comments: Preprint. Accepted to The 2nd World Conference on eXplainable Artificial Intelligence

arXiv:2306.05843 [pdf, other]

Adaptive Batch Sizes for Active Learning: A Probabilistic Numerics Approach

Authors: Masaki Adachi, Satoshi Hayakawa, Martin Jørgensen, Xingchen Wan, Vu Nguyen, Harald Oberhauser, Michael A. Osborne

Abstract: Active learning parallelization is widely used, but typically relies on fixing the batch size throughout experimentation. This fixed approach is inefficient because of a dynamic trade-off between cost and speed -- larger batches are more costly, smaller batches lead to slower wall-clock run-times -- and the trade-off may change over the run (larger batches are often preferable earlier). To address this trade-off, we propose a novel Probabilistic Numerics framework that adaptively changes batch sizes. By framing batch selection as a quadrature task, our integration-error-aware algorithm facilitates the automatic tuning of batch sizes to meet predefined quadrature precision objectives, akin to how typical optimizers terminate based on convergence thresholds. This approach obviates the necessity for exhaustive searches across all potential batch sizes. We also extend this to scenarios with constrained active learning and constrained optimization, interpreting constraint violations as reductions in the precision requirement, to subsequently adapt batch construction. Through extensive experiments, we demonstrate that our approach significantly enhances learning efficiency and flexibility in diverse Bayesian batch active learning and Bayesian optimization applications.

Submitted 21 February, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: Accepted at AISTATS 2024. 33 pages, 6 figures

MSC Class: 62C10; 62F15

arXiv:2303.08874 [pdf, other]

Bayesian Quadrature for Neural Ensemble Search

Authors: Saad Hamid, Xingchen Wan, Martin Jørgensen, Binxin Ru, Michael Osborne

Abstract: Ensembling can improve the performance of Neural Networks, but existing approaches struggle when the architecture likelihood surface has dispersed, narrow peaks. Furthermore, existing methods construct equally weighted ensembles, and this is likely to be vulnerable to the failure modes of the weaker architectures. By viewing ensembling as approximately marginalising over architectures we construct ensembles using the tools of Bayesian Quadrature -- tools which are well suited to the exploration of likelihood surfaces with dispersed, narrow peaks. Additionally, the resulting ensembles consist of architectures weighted commensurate with their performance. We show empirically -- in terms of test likelihood, accuracy, and expected calibration error -- that our method outperforms state-of-the-art baselines, and verify via ablation studies that its components do so independently.

Submitted 17 March, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

arXiv:2303.05263 [pdf, other]

Fast post-process Bayesian inference with Variational Sparse Bayesian Quadrature

Authors: Chengkun Li, Grégoire Clarté, Martin Jørgensen, Luigi Acerbi

Abstract: In applied Bayesian inference scenarios, users may have access to a large number of pre-existing model evaluations, for example from maximum-a-posteriori (MAP) optimization runs. However, traditional approximate inference techniques make little to no use of this available information. We propose the framework of post-process Bayesian inference as a means to obtain a quick posterior approximation from existing target density evaluations, with no further model calls. Within this framework, we introduce Variational Sparse Bayesian Quadrature (VSBQ), a method for post-process approximate inference for models with black-box and potentially noisy likelihoods. VSBQ reuses existing target density evaluations to build a sparse Gaussian process (GP) surrogate model of the log posterior density function. Subsequently, we leverage sparse-GP Bayesian quadrature combined with variational inference to achieve fast approximate posterior inference over the surrogate. We validate our method on challenging synthetic scenarios and real-world applications from computational neuroscience. The experiments show that VSBQ builds high-quality posterior approximations by post-processing existing optimization traces, with no further model evaluations.

Submitted 18 June, 2024; v1 submitted 9 March, 2023; originally announced March 2023.

Comments: 52 pages, 14 figures

arXiv:2301.11832 [pdf, other]

SOBER: Highly Parallel Bayesian Optimization and Bayesian Quadrature over Discrete and Mixed Spaces

Authors: Masaki Adachi, Satoshi Hayakawa, Saad Hamid, Martin Jørgensen, Harald Oberhauser, Michael A. Osborne

Abstract: Batch Bayesian optimisation and Bayesian quadrature have been shown to be sample-efficient methods of performing optimisation and quadrature where expensive-to-evaluate objective functions can be queried in parallel. However, current methods do not scale to large batch sizes -- a frequent desideratum in practice (e.g. drug discovery or simulation-based inference). We present a novel algorithm, SOBER, which permits scalable and diversified batch global optimisation and quadrature with arbitrary acquisition functions and kernels over discrete and mixed spaces. The key to our approach is to reformulate batch selection for global optimisation as a quadrature problem, which relaxes acquisition function maximisation (non-convex) to kernel recombination (convex). Bridging global optimisation and quadrature can efficiently solve both tasks by balancing the merits of exploitative Bayesian optimisation and explorative Bayesian quadrature. We show that SOBER outperforms 11 competitive baselines on 12 synthetic and diverse real-world tasks.

Submitted 5 July, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

Comments: 34 pages, 12 figures

MSC Class: 62C10; 62F15

arXiv:2211.07435 [pdf]

Enabling Autonomous Teams and Continuous Deployment at Scale

Authors: Torgeir Dingsøyr, Magne Jørgensen, Frode Odde Carlsen, Lena Carlström, Jens Engelsrud, Kine Hansvold, Mari Heibø-Bagheri, Kjetil Røe, Karl Ove Vika Sørensen

Abstract: In this article, we give advice on transitioning to a more agile delivery model for large-scale agile development projects based on experience from the Parental Benefit Project of the Norwegian Labour and Welfare Administration. The project modernized a central part of the organization's IT portfolio and included up to ten development teams working in parallel. The project successfully changed from using a delivery model which combined traditional project management elements and agile methods to a more agile delivery model with autonomous teams and continuous deployment. This transition was completed in tandem with the project execution. We identify key lessons learned which will be useful for other organizations considering similar changes and report how the new delivery model reduced risk and opened up a range of new possibilities for delivering the benefits of digitalization.

Submitted 14 November, 2022; originally announced November 2022.

arXiv:2209.00343 [pdf, other]

Bézier Gaussian Processes for Tall and Wide Data

Authors: Martin Jørgensen, Michael A. Osborne

Abstract: Modern approximations to Gaussian processes are suitable for "tall data", with a cost that scales well in the number of observations, but underperform on "wide data", scaling poorly in the number of input features. That is, as the number of input features grows, good predictive performance requires the number of summarising variables, and their associated cost, to grow rapidly. We introduce a kernel that allows the number of summarising variables to grow exponentially with the number of input features, but requires only linear cost in both number of observations and input features. This scaling is achieved through our introduction of the Bézier buttress, which allows approximate inference without computing matrix inverses or determinants. We show that our kernel has close similarities to some of the most used kernels in Gaussian process regression, and empirically demonstrate the kernel's ability to scale to both tall and wide datasets.

Submitted 13 October, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

arXiv:2206.04734 [pdf, other]

Fast Bayesian Inference with Batch Bayesian Quadrature via Kernel Recombination

Authors: Masaki Adachi, Satoshi Hayakawa, Martin Jørgensen, Harald Oberhauser, Michael A. Osborne

Abstract: Calculation of Bayesian posteriors and model evidences typically requires numerical integration. Bayesian quadrature (BQ), a surrogate-model-based approach to numerical integration, is capable of superb sample efficiency, but its lack of parallelisation has hindered its practical applications. In this work, we propose a parallelised (batch) BQ method, employing techniques from kernel quadrature, that possesses an empirically exponential convergence rate. Additionally, just as with Nested Sampling, our method permits simultaneous inference of both posteriors and model evidence. Samples from our BQ surrogate model are re-selected to give a sparse set of samples, via a kernel recombination algorithm, requiring negligible additional time to increase the batch size. Empirically, we find that our approach significantly outperforms the sampling efficiency of both state-of-the-art BQ techniques and Nested Sampling in various real-world datasets, including lithium-ion battery analytics.

Submitted 27 January, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

Comments: 38 pages, 6 figures

MSC Class: 62C10; 62F15

Journal ref: NeurIPS 35, 16533--16547 (2022)

arXiv:2204.13808 [pdf, other]

Analysing the Influence of Attack Configurations on the Reconstruction of Medical Images in Federated Learning

Authors: Mads Emil Dahlgaard, Morten Wehlast Jørgensen, Niels Asp Fuglsang, Hiba Nassar

Abstract: The idea of federated learning is to train deep neural network models collaboratively and share them with multiple participants without exposing their private training data to each other. This is highly attractive in the medical domain due to the privacy of patient records. However, a recently proposed method called Deep Leakage from Gradients enables attackers to reconstruct data from shared gradients. This study shows how easy it is to reconstruct images for different data initialization schemes and distance measures. We show how data and model architecture influence the optimal choice of initialization scheme and distance measure configurations when working with single images. We demonstrate that the choice of initialization scheme and distance measure can significantly increase convergence speed and quality. Furthermore, we find that the optimal attack configuration depends largely on the nature of the target image distribution and the complexity of the model architecture.

Submitted 25 April, 2022; originally announced April 2022.

arXiv:2203.15864 [pdf]

Measurement of software development effort estimation bias: Avoiding biased measures of estimation bias

Authors: Magne Jørgensen

Abstract: In this paper, we propose improvements in how estimation bias, e.g., the tendency towards under-estimating the effort, is measured. The proposed approach emphasizes the need to know what the estimates are meant to represent, i.e., the type of estimate we evaluate and the need for a match between the type of estimate given and the bias measure used. We show that even perfect estimates of the mean effort will not lead to an expectation of zero estimation bias when applying the frequently used bias measure: (actual effort - estimated effort)/actual effort. This measure will instead reward under-estimates of the mean effort. We also provide examples of bias measures that match estimates of the mean and the median effort, and argue that there are, in general, no practical bias measures for estimates of the most likely effort. The paper concludes with implications for the evaluation of bias of software development effort estimates.

Submitted 29 March, 2022; originally announced March 2022.

Comments: aircconline.com/csit/abstract/v12n6/csit120607.html

MSC Class: D.2 K6.3
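
Illustration (not from the paper): the abstract's claim about the measure (actual effort - estimated effort)/actual effort can be checked with a small worked example. The sketch below assumes a hypothetical task whose actual effort is 50 or 150 hours with equal probability; the function name `mean_relative_bias` and all numbers are illustrative assumptions. For a fixed estimate e the expected value of the measure is 1 - e * E[1/actual], so it is zero at the harmonic mean of the actual efforts rather than at their arithmetic mean.

```python
from fractions import Fraction

def mean_relative_bias(actuals, estimate):
    """Average of (actual - estimate) / actual over equally likely actual efforts."""
    return sum((a - estimate) / Fraction(a) for a in actuals) / len(actuals)

# Hypothetical task (an assumption for illustration): effort is 50 h or 150 h, equally likely.
actuals = [50, 150]

# A perfect estimate of the mean effort.
mean_effort = Fraction(sum(actuals), len(actuals))

# The estimate at which the measure averages to zero: the harmonic mean.
harmonic_mean = len(actuals) / sum(Fraction(1, a) for a in actuals)

bias_at_mean = mean_relative_bias(actuals, mean_effort)
bias_at_harmonic = mean_relative_bias(actuals, harmonic_mean)

print(mean_effort, bias_at_mean)        # 100 -1/3
print(harmonic_mean, bias_at_harmonic)  # 75 0
```

In this example the perfect mean estimate of 100 hours scores -1/3, while an estimate of 75 hours, which is below the mean, scores exactly zero. That is consistent with the abstract's statement that the measure rewards under-estimates of the mean effort.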

arXiv:2106.07512 [pdf, other]

Last Layer Marginal Likelihood for Invariance Learning

Authors: Pola Schwöbel, Martin Jørgensen, Sebastian W. Ober, Mark van der Wilk

Abstract: Data augmentation is often used to incorporate inductive biases into models. Traditionally, these are hand-crafted and tuned with cross validation. The Bayesian paradigm for model selection provides a path towards end-to-end learning of invariances using only the training data, by optimising the marginal likelihood. Computing the marginal likelihood is hard for neural networks, but success with tractable approaches that compute the marginal likelihood for the last layer only raises the question of whether this convenient approach might be employed for learning invariances. We show partial success on standard benchmarks, in the low-data regime and on a medical imaging dataset by designing a custom optimisation routine. Introducing a new lower bound to the marginal likelihood allows us to perform inference for a larger class of likelihood functions than before. On the other hand, we demonstrate failure modes on the CIFAR10 dataset, where the last layer approximation is not sufficient due to the increased complexity of our neural network. Our results indicate that once more sophisticated approximations become available the marginal likelihood is a promising approach for invariance learning in neural networks.

Submitted 1 March, 2022; v1 submitted 14 June, 2021; originally announced June 2021.

Comments: AISTATS '22

arXiv:2104.11564 [pdf, other]

Backsourcing of Software Development -- A Systematic Literature Review

Authors: Jefferson Seide Molléri, Casper Lassenius, Magne Jørgensen

Abstract: Context: Backsourcing is the process of insourcing previously outsourced activities. When companies experience environmental or strategic changes, or challenges with outsourcing, backsourcing can be a viable alternative. While outsourcing and related processes have been extensively studied in software engineering, few studies report experiences with backsourcing. Objectives: We intend to summarize the results of the research literature on the backsourcing of IT, with a focus on software development. By identifying practically relevant experience, we aim to present findings that may help companies considering backsourcing. In addition, we aim to identify gaps in the current research literature and point out areas for future work. Method: Our systematic literature review (SLR) started with a search for empirical studies on the backsourcing of software development. From each study we identified the contexts in which backsourcing occurs, the factors leading to the decision to backsource, the backsourcing process itself, and the outcomes of backsourcing. We employed inductive coding to extract textual data from the papers identified and qualitative cross-case analysis to synthesize the evidence from backsourcing experiences. Results: We identified 17 papers that reported 26 cases of backsourcing, six of which were related to software development. The cases came from a variety of contexts. The most common reasons for backsourcing were improving quality, reducing costs, and regaining control of outsourced activities. The backsourcing process can be described as containing five sub-processes: change management, vendor relationship management, competence building, organizational build-up, and transfer of ownership. Furthermore, ...

Submitted 23 April, 2021; originally announced April 2021.

arXiv:2101.10790 [pdf]

The Consequences of the Framing of Machine Learning Risk Prediction Models: Evaluation of Sepsis in General Wards

Authors: Simon Meyer Lauritsen, Bo Thiesson, Marianne Johansson Jørgensen, Anders Hammerich Riis, Ulrick Skipper Espelund, Jesper Bo Weile, Jeppe Lange

Abstract: Objectives: To evaluate the consequences of the framing of machine learning risk prediction models. We evaluate how framing affects model performance and model learning in four different approaches previously applied in published artificial-intelligence (AI) models. Setting and participants: We analysed structured secondary healthcare data from 221,283 citizens from four Danish municipalities who were 18 years of age or older. Results: The four models had similar population level performance (a mean area under the receiver operating characteristic curve of 0.73 to 0.82), in contrast to the mean average precision, which varied greatly from 0.007 to 0.385. Correspondingly, the percentage of missing values also varied between framing approaches. The on-clinical-demand framing, which involved samples for each time the clinicians made an early warning score assessment, showed the lowest percentage of missing values among the vital sign parameters, and this model was also able to learn more temporal dependencies than the others. The Shapley additive explanations demonstrated opposing interpretations of SpO2 in the prediction of sepsis as a consequence of differentially framed models. Conclusions: The profound consequences of framing mandate attention from clinicians and AI developers, as the understanding and reporting of framing are pivotal to the successful development and clinical implementation of future AI technology. Model framing must reflect the expected clinical environment. The importance of proper problem framing is by no means exclusive to sepsis prediction and applies to most clinical risk prediction models.

Submitted 26 January, 2021; originally announced January 2021.

arXiv:2011.12663 [pdf, other]

Bayesian Triplet Loss: Uncertainty Quantification in Image Retrieval

Authors: Frederik Warburg, Martin Jørgensen, Javier Civera, Søren Hauberg

Abstract: Uncertainty quantification in image retrieval is crucial for downstream decisions, yet it remains a challenging and largely unexplored problem. Current methods for estimating uncertainties are poorly calibrated, computationally expensive, or based on heuristics. We present a new method that views image embeddings as stochastic features rather than deterministic features. Our two main contributions are (1) a likelihood that matches the triplet constraint and that evaluates the probability of an anchor being closer to a positive than a negative; and (2) a prior over the feature space that justifies the conventional l2 normalization. To ensure computational efficiency, we derive a variational approximation of the posterior, called the Bayesian triplet loss, that produces state-of-the-art uncertainty estimates and matches the predictive performance of current state-of-the-art methods.

Submitted 17 September, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

Journal ref: 2021 ICCV

arXiv:2008.05552 [pdf, other]

Reparametrization Invariance in non-parametric Causal Discovery

Authors: Martin Jørgensen, Søren Hauberg

Abstract: Causal discovery estimates the underlying physical process that generates the observed data: does X cause Y or does Y cause X? Current methodologies use structural conditions to turn the causal query into a statistical query, when only observational data is available. But what if these statistical queries are sensitive to causal invariants? This study investigates one such invariant: the causal relationship between X and Y is invariant to the marginal distributions of X and Y. We propose an algorithm that uses a non-parametric estimator that is robust to changes in the marginal distributions. This way we may marginalize the marginals, and inspect what relationship is intrinsically there. The resulting causal estimator is competitive with current methodologies and has high emphasis on the uncertainty in the causal query; an aspect just as important as the query itself.

Submitted 12 August, 2020; originally announced August 2020.

arXiv:2007.14474 [pdf]

Construction and Usage of a Human Body Common Coordinate Framework Comprising Clinical, Semantic, and Spatial Ontologies

Authors: Katy Börner, Ellen M. Quardokus, Bruce W. Herr II, Leonard E. Cross, Elizabeth G. Record, Yingnan Ju, Andreas D. Bueckle, James P. Sluka, Jonathan C. Silverstein, Kristen M. Browne, Sanjay Jain, Clive H. Wasserfall, Marda L. Jorgensen, Jeffrey M. Spraggins, Nathan H. Patterson, Mark A. Musen, Griffin M. Weber

Abstract: The National Institutes of Health's (NIH) Human Biomolecular Atlas Program (HuBMAP) aims to create a comprehensive high-resolution atlas of all the cells in the healthy human body. Multiple laboratories across the United States are collecting tissue specimens from different organs of donors who vary in sex, age, and body size. Integrating and harmonizing the data derived from these samples and 'mapping' them into a common three-dimensional (3D) space is a major challenge. The key to making this possible is a 'Common Coordinate Framework' (CCF), which provides a semantically annotated, 3D reference system for the entire body. The CCF enables contributors to HuBMAP to 'register' specimens and datasets within a common spatial reference system, and it supports a standardized way to query and 'explore' data in a spatially and semantically explicit manner. [...] This paper describes the construction and usage of a CCF for the human body and its reference implementation in HuBMAP. The CCF consists of (1) a CCF Clinical Ontology, which provides metadata about the specimen and donor (the 'who'); (2) a CCF Semantic Ontology, which describes 'what' part of the body a sample came from and details anatomical structures, cell types, and biomarkers (ASCT+B); and (3) a CCF Spatial Ontology, which indicates 'where' a tissue sample is located in a 3D coordinate system. An initial version of all three CCF ontologies has been implemented for the first HuBMAP Portal release. It was successfully used by Tissue Mapping Centers to semantically annotate and spatially register 48 kidney and spleen tissue blocks. The blocks can be queried and explored in their clinical, semantic, and spatial context via the CCF user interface in the HuBMAP Portal.

Submitted 28 July, 2020; originally announced July 2020.

Comments: 24 pages with SI, 6 figures, 5 tables

arXiv:2006.14895 [pdf, other]

Stochastic Differential Equations with Variational Wishart Diffusions

Authors: Martin Jørgensen, Marc Peter Deisenroth, Hugh Salimbeni

Abstract: We present a Bayesian non-parametric way of inferring stochastic differential equations for both regression tasks and continuous-time dynamical modelling. The work has high emphasis on the stochastic part of the differential equation, also known as the diffusion, and modelling it by means of Wishart processes. Further, we present a semi-parametric approach that allows the framework to scale to high dimensions. This successfully lead us onto how to model both latent and auto-regressive temporal systems with conditional heteroskedastic noise. We provide experimental evidence that modelling diffusion often improves performance and that this randomness in the differential equation can be essential to avoid overfitting.

Submitted 26 June, 2020; originally announced June 2020.

Comments: ICML 2020

arXiv:2006.11741 [pdf, other]

Isometric Gaussian Process Latent Variable Model for Dissimilarity Data

Authors: Martin Jørgensen, Søren Hauberg

Abstract: We present a probabilistic model where the latent variable respects both the distances and the topology of the modeled data. The model leverages the Riemannian geometry of the generated manifold to endow the latent space with a well-defined stochastic distance measure, which is modeled locally as Nakagami distributions. These stochastic distances are sought to be as similar as possible to observed distances along a neighborhood graph through a censoring process. The model is inferred by variational inference based on observations of pairwise distances. We demonstrate how the new model can encode invariances in the learned manifolds.

Submitted 8 June, 2021; v1 submitted 21 June, 2020; originally announced June 2020.

Comments: ICML 2021

arXiv:2005.09638 [pdf, other]

Physics-informed Neural Networks for Solving Inverse Problems of Nonlinear Biot's Equations: Batch Training

Authors: Teeratorn Kadeethum, Thomas M Jørgensen, Hamidreza M Nick

Abstract: In biomedical engineering, earthquake prediction, and underground energy harvesting, it is crucial to indirectly estimate the physical properties of porous media since the direct measurement of those are usually impractical/prohibitive. Here we apply the physics-informed neural networks to solve the inverse problem with regard to the nonlinear Biot's equations. Specifically, we consider batch training and explore the effect of different batch sizes. The results show that training with small batch sizes, i.e., a few examples per batch, provides better approximations (lower percentage error) of the physical parameters than using large batches or the full batch. The increased accuracy of the physical parameters, comes at the cost of longer training time. Specifically, we find the size should not be too small since a very small batch size requires a very long training time without a corresponding improvement in estimation accuracy. We find that a batch size of 8 or 32 is a good compromise, which is also robust to additive noise in the data. The learning rate also plays an important role and should be used as a hyperparameter.

Submitted 18 May, 2020; originally announced May 2020.

Comments: arXiv admin note: text overlap with arXiv:2002.08235

arXiv:2004.03637 [pdf, other]

Probabilistic Spatial Transformer Networks

Authors: Pola Schwöbel, Frederik Warburg, Martin Jørgensen, Kristoffer H. Madsen, Søren Hauberg

Abstract: Spatial Transformer Networks (STNs) estimate image transformations that can improve downstream tasks by `zooming in' on relevant regions in an image. However, STNs are hard to train and sensitive to mis-predictions of transformations. To circumvent these limitations, we propose a probabilistic extension that estimates a stochastic transformation rather than a deterministic one. Marginalizing transformations allows us to consider each image at multiple poses, which makes the localization task easier and the training more robust. As an additional benefit, the stochastic transformations act as a localized, learned data augmentation that improves the downstream tasks. We show across standard imaging benchmarks and on a challenging real-world dataset that these two properties lead to improved classification performance, robustness and model calibration. We further demonstrate that the approach generalizes to non-visual domains by improving model performance on time-series data.

Submitted 15 June, 2022; v1 submitted 7 April, 2020; originally announced April 2020.

Comments: UAI 2022

arXiv:2002.08235 [pdf, other]

doi 10.1371/journal.pone.0232683

Physics-informed Neural Networks for Solving Nonlinear Diffusivity and Biot's equations

Authors: Teeratorn Kadeethum, Thomas M Jorgensen, Hamidreza M Nick

Abstract: This paper presents the potential of applying physics-informed neural networks for solving nonlinear multiphysics problems, which are essential to many fields such as biomedical engineering, earthquake prediction, and underground energy harvesting. Specifically, we investigate how to extend the methodology of physics-informed neural networks to solve both the forward and inverse problems in relation to the nonlinear diffusivity and Biot's equations. We explore the accuracy of the physics-informed neural networks with different training example sizes and choices of hyperparameters. The impacts of the stochastic variations between various training realizations are also investigated. In the inverse case, we also study the effects of noisy measurements. Furthermore, we address the challenge of selecting the hyperparameters of the inverse model and illustrate how this challenge is linked to the hyperparameters selection performed for the forward one.

Submitted 19 February, 2020; originally announced February 2020.

arXiv:1912.01266 [pdf, other]

Explainable artificial intelligence model to predict acute critical illness from electronic health records

Authors: Simon Meyer Lauritsen, Mads Kristensen, Mathias Vassard Olsen, Morten Skaarup Larsen, Katrine Meyer Lauritsen, Marianne Johansson Jørgensen, Jeppe Lange, Bo Thiesson

Abstract: We developed an explainable artificial intelligence (AI) early warning score (xAI-EWS) system for early detection of acute critical illness. While maintaining a high predictive performance, our system explains to the clinician on which relevant electronic health records (EHRs) data the prediction is grounded. Acute critical illness is often preceded by deterioration of routinely measured clinical parameters, e.g., blood pressure and heart rate. Early clinical prediction is typically based on manually calculated screening metrics that simply weigh these parameters, such as Early Warning Scores (EWS). The predictive performance of EWSs yields a tradeoff between sensitivity and specificity that can lead to negative outcomes for the patient. Previous work on EHR-trained AI systems offers promising results with high levels of predictive performance in relation to the early, real-time prediction of acute critical illness. However, without insight into the complex decisions by such system, clinical translation is hindered. In this letter, we present our xAI-EWS system, which potentiates clinical translation by accompanying a prediction with information on the EHR data explaining it.

Submitted 3 December, 2019; originally announced December 2019.

arXiv:1906.03260 [pdf, other]

Reliable training and estimation of variance networks

Authors: Nicki S. Detlefsen, Martin Jørgensen, Søren Hauberg

Abstract: We propose and investigate new complementary methodologies for estimating predictive variance networks in regression neural networks. We derive a locally aware mini-batching scheme that result in sparse robust gradients, and show how to make unbiased weight updates to a variance network. Further, we formulate a heuristic for robustly fitting both the mean and variance networks post hoc. Finally, we take inspiration from posterior Gaussian processes and propose a network architecture with similar extrapolation properties to Gaussian processes. The proposed methodologies are complementary, and improve upon baseline methods individually. Experimentally, we investigate the impact on predictive uncertainty on multiple datasets and tasks ranging from regression, active learning and generative modeling. Experiments consistently show significant improvements in predictive uncertainty estimation over state-of-the-art methods across tasks and datasets.

Submitted 4 November, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

Comments: Appeared at NeurIPS 2019

arXiv:1906.02956 [pdf, other]

Early detection of sepsis utilizing deep learning on electronic health record event sequences

Authors: Simon Meyer Lauritsen, Mads Ellersgaard Kalør, Emil Lund Kongsgaard, Katrine Meyer Lauritsen, Marianne Johansson Jørgensen, Jeppe Lange, Bo Thiesson

Abstract: The timeliness of detection of a sepsis event in progress is a crucial factor in the outcome for the patient. Machine learning models built from data in electronic health records can be used as an effective tool for improving this timeliness, but so far the potential for clinical implementations has been largely limited to studies in intensive care units. This study will employ a richer data set that will expand the applicability of these models beyond intensive care units. Furthermore, we will circumvent several important limitations that have been found in the literature: 1) Models are evaluated shortly before sepsis onset without considering interventions already initiated. 2) Machine learning models are built on a restricted set of clinical parameters, which are not necessarily measured in all departments. 3) Model performance is limited by current knowledge of sepsis, as feature interactions and time dependencies are hardcoded into the model. In this study, we present a model to overcome these shortcomings using a deep learning approach on a diverse multicenter data set. We used retrospective data from multiple Danish hospitals over a seven-year period. Our sepsis detection system is constructed as a combination of a convolutional neural network and a long short-term memory network. We suggest a retrospective assessment of interventions by looking at intravenous antibiotics and blood cultures preceding the prediction time. Results show performance ranging from AUROC 0.856 (3 hours before sepsis onset) to AUROC 0.756 (24 hours before sepsis onset). We present a deep learning system for early detection of sepsis that is able to learn characteristics of the key factors and interactions from the raw event sequence data itself, without relying on a labor-intensive feature extraction work.

Submitted 7 June, 2019; originally announced June 2019.

arXiv:1902.10501 [pdf, other]

doi 10.1063/1.5108871

Atomistic structure learning

Authors: Mathias S. Jørgensen, Henrik L. Mortensen, Søren A. Meldgaard, Esben L. Kolsbjerg, Thomas L. Jacobsen, Knud H. Sørensen, Bjørk Hammer

Abstract: One endeavour of modern physical chemistry is to use bottom-up approaches to design materials and drugs with desired properties. Here we introduce an atomistic structure learning algorithm (ASLA) that utilizes a convolutional neural network to build 2D compounds and layered structures atom by atom. The algorithm takes no prior data or knowledge on atomic interactions but inquires a first-principles quantum mechanical program for physical properties. Using reinforcement learning, the algorithm accumulates knowledge of chemical compound space for a given number and type of atoms and stores this in the neural network, ultimately learning the blueprint for the optimal structural arrangement of the atoms for a given target property. ASLA is demonstrated to work on diverse problems, including grain boundaries in graphene sheets, organic compound formation and a surface oxide structure. This approach to structure prediction is a first step toward direct manipulation of atoms with artificially intelligent first principles computer codes.

Submitted 27 February, 2019; originally announced February 2019.

arXiv:1804.03919 [pdf, other]

doi 10.1145/3167132.3167293

An Experimental Evaluation of a De-biasing Intervention for Professional Software Developers

Authors: Martin Shepperd, Carolyn Mair, Magne Jørgensen

Abstract: CONTEXT: The role of expert judgement is essential in our quest to improve software project planning and execution. However, its accuracy is dependent on many factors, not least the avoidance of judgement biases, such as the anchoring bias, arising from being influenced by initial information, even when it's misleading or irrelevant. This strong effect is widely documented. OBJECTIVE: We aimed to replicate this anchoring bias using professionals and, novel in a software engineering context, explore de-biasing interventions through increasing knowledge and awareness of judgement biases. METHOD: We ran two series of experiments in company settings with a total of 410 software developers. Some developers took part in a workshop to heighten their awareness of a range of cognitive biases, including anchoring. Later, the anchoring bias was induced by presenting low or high productivity values, followed by the participants' estimates of their own project productivity. Our hypothesis was that the workshop would lead to reduced bias, i.e., work as a de-biasing intervention. RESULTS: The anchors had a large effect (robust Cohen's $d=1.19$) in influencing estimates. This was substantially reduced in those participants who attended the workshop (robust Cohen's $d=0.72$). The reduced bias related mainly to the high anchor. The de-biasing intervention also led to a threefold reduction in estimate variance. CONCLUSIONS: The impact of anchors upon judgement was substantial. Learning about judgement biases does appear capable of mitigating, although not removing, the anchoring bias. The positive effect of de-biasing through learning about biases suggests that it has value.

Submitted 11 April, 2018; originally announced April 2018.

Comments: Presented at: SAC 2018: Symposium on Applied Computing, April 9--13, 2018, Pau, France. ACM, New York, NY, USA, 8 pages

arXiv:1204.4411 [pdf, ps, other]

Solutions to the generalized Towers of Hanoi problem

Authors: Mikael Erik Jörgensen

Abstract: The purpose of this paper is to prove the Frame-Stewart algorithm for the generalized Towers of Hanoi problem as well as finding the number of moves required to solve the problem and studying the multitude of optimal solutions. The main idea is to study how to most effectively move away all but the last disc and use the fact that the total number of moves required to solve the problem is twice this number plus one.

Submitted 21 April, 2012; v1 submitted 19 April, 2012; originally announced April 2012.

Comments: 9 pages

MSC Class: 90C27 (Combinatorial optimization)

Showing 1–28 of 28 results for author: Jørgensen, M