-
How to Leverage Predictive Uncertainty Estimates for Reducing Catastrophic Forgetting in Online Continual Learning
Authors:
Giuseppe Serra,
Ben Werner,
Florian Buettner
Abstract:
Many real-world applications require machine-learning models to be able to deal with non-stationary data distributions and thus learn autonomously over an extended period of time, often in an online setting. One of the main challenges in this scenario is the so-called catastrophic forgetting (CF) for which the learning model tends to focus on the most recent tasks while experiencing predictive degradation on older ones. In the online setting, the most effective solutions employ a fixed-size memory buffer to store old samples used for replay when training on new tasks. Many approaches have been presented to tackle this problem. However, it is not clear how predictive uncertainty information for memory management can be leveraged in the most effective manner and conflicting strategies are proposed to populate the memory. Are the easiest-to-forget or the easiest-to-remember samples more effective in combating CF? Starting from the intuition that predictive uncertainty provides an idea of the samples' location in the decision space, this work presents an in-depth analysis of different uncertainty estimates and strategies for populating the memory. The investigation provides a better understanding of the characteristics data points should have for alleviating CF. Then, we propose an alternative method for estimating predictive uncertainty via the generalised variance induced by the negative log-likelihood. Finally, we demonstrate that the use of predictive uncertainty measures helps in reducing CF in different settings.
Submitted 10 July, 2024;
originally announced July 2024.
-
Provably Better Explanations with Optimized Aggregation of Feature Attributions
Authors:
Thomas Decker,
Ananta R. Bhattarai,
Jindong Gu,
Volker Tresp,
Florian Buettner
Abstract:
Using feature attributions for post-hoc explanations is a common practice to understand and verify the predictions of opaque machine learning models. Despite the numerous techniques available, individual methods often produce inconsistent and unstable results, putting their overall reliability into question. In this work, we aim to systematically improve the quality of feature attributions by combining multiple explanations across distinct methods or their variations. For this purpose, we propose a novel approach to derive optimal convex combinations of feature attributions that yield provable improvements of desired quality criteria such as robustness or faithfulness to the model behavior. Through extensive experiments involving various model architectures and popular feature attribution techniques, we demonstrate that our combination strategy consistently outperforms individual methods and existing baselines.
Submitted 7 June, 2024;
originally announced June 2024.
-
Federated Continual Learning Goes Online: Leveraging Uncertainty for Modality-Agnostic Class-Incremental Learning
Authors:
Giuseppe Serra,
Florian Buettner
Abstract:
Given the ability to model more realistic and dynamic problems, Federated Continual Learning (FCL) has been increasingly investigated recently. A well-known problem encountered in this setting is the so-called catastrophic forgetting, for which the learning model is inclined to focus on more recent tasks while forgetting the previously learned knowledge. The majority of the current approaches in FCL propose generative-based solutions to solve said problem. However, this setting requires multiple training epochs over the data, implying an offline setting where datasets are stored locally and remain unchanged over time. Furthermore, the proposed solutions are tailored for vision tasks solely. To overcome these limitations, we propose a new modality-agnostic approach to deal with the online scenario where new data arrive in streams of mini-batches that can only be processed once. To solve catastrophic forgetting, we propose an uncertainty-aware memory-based approach. In particular, we suggest using an estimator based on the Bregman Information (BI) to compute the model's variance at the sample level. Through measures of predictive uncertainty, we retrieve samples with specific characteristics, and - by retraining the model on such samples - we demonstrate the potential of this approach to reduce the forgetting effect in realistic settings.
Submitted 3 July, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
DomainLab: A modular Python package for domain generalization in deep learning
Authors:
Xudong Sun,
Carla Feistner,
Alexej Gossmann,
George Schwarz,
Rao Muhammad Umer,
Lisa Beer,
Patrick Rockenschaub,
Rahul Babu Shrestha,
Armin Gruber,
Nutan Chen,
Sayedali Shetab Boushehri,
Florian Buettner,
Carsten Marr
Abstract:
Poor generalization performance caused by distribution shifts in unseen domains often hinders the trustworthy deployment of deep neural networks. Many domain generalization techniques address this problem by adding domain-invariant regularization loss terms during training. However, there is a lack of modular software that allows users to combine the advantages of different methods with minimal effort for reproducibility. DomainLab is a modular Python package for training user-specified neural networks with composable regularization loss terms. Its decoupled design allows the separation of neural networks from regularization loss construction. Hierarchical combinations of neural networks, different domain generalization methods, and associated hyperparameters can all be specified together with the rest of the experimental setup in a single configuration file. In addition, DomainLab offers powerful benchmarking functionality to evaluate the generalization performance of neural networks on out-of-distribution data. The package supports running the specified benchmark on an HPC cluster or on a standalone machine. The package is well tested, with over 95 percent coverage, and well documented. From the user perspective, it is closed to modification but open to extension. The package is under the MIT license, and its source code, tutorial and documentation can be found at https://github.com/marrlab/DomainLab.
Submitted 21 March, 2024;
originally announced March 2024.
-
Time-Resolved Imaging Reveals Transiently Chaotic Spin-Orbit-Torque-Driven Dynamics Under Controlled Conditions
Authors:
Lisa-Marie Kern,
Kai Litzius,
Victor Deinhart,
Michael Schneider,
Christopher Klose,
Kathinka Gerlinger,
Riccardo Battistelli,
Dieter Engel,
Christian M. Günther,
Meng-Jie Huang,
Katja Höflich,
Felix Büttner,
Stefan Eisebitt,
Bastian Pfau
Abstract:
Spin-orbit torques (SOTs) act as efficient drivers for nanoscale magnetic systems, such as in magnetic tunnel junctions, nano-oscillators and racetrack geometries. In particular, in combination with materials exhibiting high Dzyaloshinskii--Moriya interaction, SOTs are considered to result in well-controlled deterministic magnetisation dynamics and are, therefore, used as robust drives to move and create magnetic skyrmions. In contrast to these expectations, we here find unpredictable, transiently chaotic dynamics induced by SOT at an artificial anisotropy-engineered defect in a magnetic racetrack. Based on these controlled conditions, we directly observe the nanoscale dynamics with holography-based, time-resolved x-ray imaging. In concert with micromagnetic simulations, we disclose a regime of violent picosecond fluctuations, including topological instabilities that, remarkably, result in deterministic final configurations. In addition, our images expose previously unseen skyrmion shedding and highlight the potential of transiently chaotic pathways for topological switching. Our approach offers new perspectives for the investigation and application of highly non-linear SOT dynamics in spintronics materials.
Submitted 22 January, 2024;
originally announced January 2024.
-
arXiv:2401.04793
cond-mat.mtrl-sci
cond-mat.mes-hall
cond-mat.str-el
cond-mat.supr-con
quant-ph
2024 Roadmap on Magnetic Microscopy Techniques and Their Applications in Materials Science
Authors:
D. V. Christensen,
U. Staub,
T. R. Devidas,
B. Kalisky,
K. C. Nowack,
J. L. Webb,
U. L. Andersen,
A. Huck,
D. A. Broadway,
K. Wagner,
P. Maletinsky,
T. van der Sar,
C. R. Du,
A. Yacoby,
D. Collomb,
S. Bending,
A. Oral,
H. J. Hug,
A. -O. Mandru,
V. Neu,
H. W. Schumacher,
S. Sievers,
H. Saito,
A. A. Khajetoorians,
N. Hauptmann, et al. (28 additional authors not shown)
Abstract:
Considering the growing interest in magnetic materials for unconventional computing, data storage, and sensor applications, there is active research not only on material synthesis but also characterisation of their properties. In addition to structural and integral magnetic characterisations, imaging of magnetization patterns, current distributions and magnetic fields at nano- and microscale is of major importance to understand the material responses and qualify them for specific applications. In this roadmap, we aim to cover a broad portfolio of techniques to perform nano- and microscale magnetic imaging using SQUIDs, spin center and Hall effect magnetometries, scanning probe microscopies, x-ray- and electron-based methods as well as magnetooptics and nanoMRI. The roadmap is aimed as a single access point of information for experts in the field as well as the young generation of students outlining prospects of the development of magnetic imaging technologies for the upcoming decade with a focus on physics, materials science, and chemistry of planar, 3D and geometrically curved objects of different material classes including 2D materials, complex oxides, semi-metals, multiferroics, skyrmions, antiferromagnets, frustrated magnets, magnetic molecules/nanoparticles, ionic conductors, superconductors, spintronic and spinorbitronic materials.
Submitted 9 January, 2024;
originally announced January 2024.
-
Consistent and Asymptotically Unbiased Estimation of Proper Calibration Errors
Authors:
Teodora Popordanoska,
Sebastian G. Gruber,
Aleksei Tiulpin,
Florian Buettner,
Matthew B. Blaschko
Abstract:
Proper scoring rules evaluate the quality of probabilistic predictions, playing an essential role in the pursuit of accurate and well-calibrated models. Every proper score decomposes into two fundamental components -- proper calibration error and refinement -- utilizing a Bregman divergence. While uncertainty calibration has gained significant attention, current literature lacks a general estimator for these quantities with known statistical properties. To address this gap, we propose a method that allows consistent, and asymptotically unbiased estimation of all proper calibration errors and refinement terms. In particular, we introduce Kullback--Leibler calibration error, induced by the commonly used cross-entropy loss. As part of our results, we prove the relation between refinement and f-divergences, which implies information monotonicity in neural networks, regardless of which proper scoring rule is optimized. Our experiments validate empirically the claimed properties of the proposed estimator and suggest that the selection of a post-hoc calibration method should be determined by the particular calibration error of interest.
Submitted 13 December, 2023;
originally announced December 2023.
-
A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models
Authors:
Sebastian G. Gruber,
Florian Buettner
Abstract:
Generative models, like large language models, are becoming increasingly relevant in our daily lives, yet a theoretical framework to assess their generalization behavior and uncertainty does not exist. Particularly, the problem of uncertainty estimation is commonly solved in an ad-hoc and task-dependent manner. For example, natural language approaches cannot be transferred to image generation. In this paper, we introduce the first bias-variance-covariance decomposition for kernel scores. This decomposition represents a theoretical framework from which we derive a kernel-based variance and entropy for uncertainty estimation. We propose unbiased and consistent estimators for each quantity which only require generated samples but not the underlying model itself. Based on the wide applicability of kernels, we demonstrate our framework via generalization and uncertainty experiments for image, audio, and language generation. Specifically, kernel entropy for uncertainty estimation is more predictive of performance on CoQA and TriviaQA question answering datasets than existing baselines and can also be applied to closed-source models.
Submitted 10 July, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Coherent x-ray magnetic imaging with 5 nm resolution
Authors:
Riccardo Battistelli,
Daniel Metternich,
Michael Schneider,
Lisa-Marie Kern,
Kai Litzius,
Josefin Fuchs,
Christopher Klose,
Kathinka Gerlinger,
Kai Bagschik,
Christian M. Günther,
Dieter Engel,
Claus Ropers,
Stefan Eisebitt,
Bastian Pfau,
Felix Büttner,
Sergey Zayko
Abstract:
Soft x-ray microscopy plays an important role in modern spintronics. However, the achievable resolution of most x-ray magnetic imaging experiments is above 10 nm, limiting access to fundamental and technologically relevant length scales. Here, we demonstrate x-ray magnetic microscopy with 5 nm resolution by combining holography-assisted coherent diffractive imaging with heterodyne amplification of the weak magnetic signal. The gain in resolution and contrast allows direct access to key magnetic properties, including domain wall profiles and the position of pinning sites. The ability to detect and map such properties with photons opens new horizons for element-specific, time-resolved, and in-operando research on magnetic materials and beyond.
Submitted 18 September, 2023;
originally announced September 2023.
-
Application-driven Validation of Posteriors in Inverse Problems
Authors:
Tim J. Adler,
Jan-Hinrich Nölke,
Annika Reinke,
Minu Dietlinde Tizabi,
Sebastian Gruber,
Dasha Trofimova,
Lynton Ardizzone,
Paul F. Jaeger,
Florian Buettner,
Ullrich Köthe,
Lena Maier-Hein
Abstract:
Current deep learning-based solutions for image analysis tasks are commonly incapable of handling problems to which multiple different plausible solutions exist. In response, posterior-based methods such as conditional Diffusion Models and Invertible Neural Networks have emerged; however, their translation is hampered by a lack of research on adequate validation. In other words, the way progress is measured often does not reflect the needs of the driving practical application. Closing this gap in the literature, we present the first systematic framework for the application-driven validation of posterior-based methods in inverse problems. As a methodological novelty, it adopts key principles from the field of object detection validation, which has a long history of addressing the question of how to locate and match multiple object instances in an image. Treating modes as instances enables us to perform mode-centric validation, using well-interpretable metrics from the application perspective. We demonstrate the value of our framework through instantiations for a synthetic toy example and two medical vision use cases: pose estimation in surgery and imaging-based quantification of functional tissue parameters for diagnostics. Our framework offers key advantages over common approaches to posterior validation in all three examples and could thus revolutionize performance assessment in inverse problems.
Submitted 18 September, 2023;
originally announced September 2023.
-
X-ray holography of skyrmionic cocoons in aperiodic magnetic multilayers
Authors:
M. Grelier,
R. Battistelli,
H. Popescu,
F. Godel,
A. Vecchiola,
S. Collin,
C. Léveillé,
K. Bouzehouane,
F. Büttner,
V. Cros,
N. Jaouen,
N. Reyren
Abstract:
The development and characterization of three-dimensional (3D) topological magnetic textures has become an important topic in modern magnetism from both fundamental and technological perspectives. Among the novel 3D spin textures, skyrmionic cocoons have been successfully stabilized in magnetic multilayers having a variable thickness of the ferromagnet in the vertical direction of the stack. These ellipsoidal 3D magnetic textures remain vertically confined in a fraction of the total thickness while coexisting with fully columnar skyrmions. Here, we use X-ray holography with about 15 nm lateral resolution to investigate how their properties depend on the field and temperature. We observe circular objects with different contrast amplitudes, which is evidence of different 3D objects located in various vertical parts of the multilayer. Moreover, during out-of-plane field cycling we observe an attractive interaction between cocoons located at various heights, mainly due to the stray field, which affects their horizontal positioning. The X-ray holography measurements also allow us to determine the size of the cocoons at remanence; at room temperature, they have a diameter close to 100 nm on average. Combining this transmission technique with magnetic force microscopy and micromagnetic simulations gives precise insight into the 3D distribution of the magnetization, which demonstrates the 3D nature of skyrmionic cocoons.
Submitted 6 February, 2023;
originally announced February 2023.
-
Understanding metric-related pitfalls in image analysis validation
Authors:
Annika Reinke,
Minu D. Tizabi,
Michael Baumgartner,
Matthias Eisenmann,
Doreen Heckmann-Nötzel,
A. Emre Kavur,
Tim Rädsch,
Carole H. Sudre,
Laura Acion,
Michela Antonelli,
Tal Arbel,
Spyridon Bakas,
Arriel Benis,
Matthew Blaschko,
Florian Buettner,
M. Jorge Cardoso,
Veronika Cheplygina,
Jianxu Chen,
Evangelia Christodoulou,
Beth A. Cimini,
Gary S. Collins,
Keyvan Farahani,
Luciana Ferrer,
Adrian Galdran,
Bram van Ginneken, et al. (53 additional authors not shown)
Abstract:
Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation.
Submitted 23 February, 2024; v1 submitted 3 February, 2023;
originally announced February 2023.
-
Robust scenario for the generation of non-equilibrium topological fluctuation states
Authors:
Kathinka Gerlinger,
Rein Liefferink,
Michael Schneider,
Lisa-Marie Kern,
Christopher Klose,
Daniel Metternich,
Dieter Engel,
Flavio Capotondi,
Dario De Angelis,
Matteo Pancaldi,
Emanuele Pedersoli,
Felix Büttner,
Stefan Eisebitt,
Johan H. Mentink,
Bastian Pfau
Abstract:
The recently discovered topological fluctuation state provides a fascinating new perspective on the ultrafast emergence of topology in condensed matter systems. However, rather little is known about the physics of this state and the origin of the topological fluctuations. Using time-resolved small-angle x-ray scattering, we observe that topological fluctuation states appear after laser excitation even if the final state does not host stable skyrmions. Simulations support these findings and reveal that the fluctuations originate from the competition between spontaneous nucleation and decay of skyrmions, consistent with Arrhenius-like activation over a potential barrier. Stable skyrmions can freeze out of such fluctuations when the effective temperature of the system relaxes faster than the decay time of the skyrmions. Our results reveal a robust scenario for the generation of topological fluctuation states, potentially enabling their study in a wide variety of magnetic systems.
Submitted 6 December, 2022;
originally announced December 2022.
-
Role of substrate clamping on anisotropy and domain structure in the canted antiferromagnet $α$-Fe$_2$O$_3$
Authors:
Angela Wittmann,
Olena Gomonay,
Kai Litzius,
Allison Kaczmarek,
Alexander E. Kossak,
Daniel Wolf,
Axel Lubk,
Tyler N. Johnson,
Elizaveta A. Tremsina,
Alexandra Churikova,
Felix Büttner,
Sebastian Wintz,
Mohamad-Assaad Mawass,
Markus Weigand,
Florian Kronast,
Larry Scipioni,
Adam Shepard,
Ty Newhouse-Illig,
James A Greer,
Gisela Schütz,
Norman O. Birge,
Geoffrey S. D. Beach
Abstract:
Antiferromagnets have recently been propelled to the forefront of spintronics by their high potential for revolutionizing memory technologies. For this, understanding the formation and driving mechanisms of the domain structure is paramount. In this work, we investigate the domain structure in a thin-film canted antiferromagnet $α$-Fe$_2$O$_3$. We find that the internal destressing fields driving the formation of domains do not follow the crystal symmetry of $α$-Fe$_2$O$_3$, but fluctuate due to substrate clamping. This leads to an overall isotropic distribution of the Néel order with locally varying effective anisotropy in antiferromagnetic thin films. Furthermore, we show that the weak ferromagnetic nature of $α$-Fe$_2$O$_3$ leads to a qualitatively different dependence on magnetic field compared to collinear antiferromagnets such as NiO. The insights gained from our work serve as a foundation for further studies of electrical and optical manipulation of the domain structure of antiferromagnetic thin films.
Submitted 28 October, 2022;
originally announced October 2022.
-
Uncertainty Estimates of Predictions via a General Bias-Variance Decomposition
Authors:
Sebastian G. Gruber,
Florian Buettner
Abstract:
Reliably estimating the uncertainty of a prediction throughout the model lifecycle is crucial in many safety-critical applications. The most common way to measure this uncertainty is via the predicted confidence. While this tends to work well for in-domain samples, these estimates are unreliable under domain drift and restricted to classification. Alternatively, proper scores can be used for most predictive tasks but a bias-variance decomposition for model uncertainty does not exist in the current literature. In this work we introduce a general bias-variance decomposition for proper scores, giving rise to the Bregman Information as the variance term. We discover how exponential families and the classification log-likelihood are special cases and provide novel formulations. Surprisingly, we can express the classification case purely in the logit space. We showcase the practical relevance of this decomposition on several downstream tasks, including model ensembles and confidence regions. Further, we demonstrate how different approximations of the instance-level Bregman Information allow reliable out-of-distribution detection for all degrees of domain drift.
Submitted 20 April, 2023; v1 submitted 21 October, 2022;
originally announced October 2022.
-
Metrics reloaded: Recommendations for image analysis validation
Authors:
Lena Maier-Hein,
Annika Reinke,
Patrick Godau,
Minu D. Tizabi,
Florian Buettner,
Evangelia Christodoulou,
Ben Glocker,
Fabian Isensee,
Jens Kleesiek,
Michal Kozubek,
Mauricio Reyes,
Michael A. Riegler,
Manuel Wiesenfarth,
A. Emre Kavur,
Carole H. Sudre,
Michael Baumgartner,
Matthias Eisenmann,
Doreen Heckmann-Nötzel,
Tim Rädsch,
Laura Acion,
Michela Antonelli,
Tal Arbel,
Spyridon Bakas,
Arriel Benis,
Matthew Blaschko, et al. (49 additional authors not shown)
Abstract:
Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. Particularly in automatic biomedical image analysis, chosen performance metrics often do not reflect the domain interest, thus failing to adequately measure scientific progress and hindering translation of ML techniques into practice. To overcome this, our large international expert consortium created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. The framework was developed in a multi-stage Delphi process and is based on the novel concept of a problem fingerprint - a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), data set and algorithm output. Based on the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as a classification task at image, object or pixel level, namely image-level classification, object detection, semantic segmentation, and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool, which also provides a point of access to explore weaknesses, strengths and specific recommendations for the most common validation metrics. The broad applicability of our framework across domains is demonstrated by an instantiation for various biological and medical image analysis use cases.
Submitted 23 February, 2024; v1 submitted 3 June, 2022;
originally announced June 2022.
-
Proximally Sensitive Error for Anomaly Detection and Feature Learning
Authors:
Amogh Gudi,
Fritjof Büttner,
Jan van Gemert
Abstract:
Mean squared error (MSE) is one of the most widely used metrics to express differences between multi-dimensional entities, including images. However, MSE is not locally sensitive as it does not take into account the spatial arrangement of the (pixel) differences, which matters for structured data types like images. Such spatial arrangements carry information about the source of the differences; therefore, an error function that also incorporates the location of errors can lead to a more meaningful distance measure. We introduce Proximally Sensitive Error (PSE), through which we suggest that a regional emphasis in the error measure can 'highlight' semantic differences between images over syntactic/random deviations. We demonstrate that this emphasis can be leveraged for the task of anomaly/occlusion detection. We further explore its utility as a loss function to help a model focus on learning representations of semantic objects instead of minimizing syntactic reconstruction noise.
Submitted 1 June, 2022;
originally announced June 2022.
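The abstract's claim that MSE ignores the spatial arrangement of pixel differences can be checked with a small NumPy example. This is only an illustrative sketch: the 8x8 arrays and the names `clustered` and `scattered` are assumptions made here, and the code does not implement PSE itself, whose definition is not given in the abstract.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

reference = np.zeros((8, 8))

# Sixteen unit errors packed into one 4x4 block (an occlusion-like change).
clustered = reference.copy()
clustered[2:6, 2:6] = 1.0

# The same sixteen unit errors spread over a regular grid (a noise-like change).
scattered = reference.copy()
scattered[::2, ::2] = 1.0

mse_clustered = mse(reference, clustered)
mse_scattered = mse(reference, scattered)

# Both images differ from the reference in 16 of 64 pixels, so MSE cannot
# tell the localized change from the dispersed one.
print(mse_clustered, mse_scattered)
```

Both values are identical because MSE only averages squared differences; any error measure meant to separate the two cases has to use pixel positions as well.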
-
Tailoring Optical Excitation to Control Magnetic Skyrmion Nucleation
Authors:
Lisa-Marie Kern,
Bastian Pfau,
Michael Schneider,
Kathinka Gerlinger,
Victor Deinhart,
Steffen Wittrock,
Themistoklis Sidiropoulos,
Dieter Engel,
Ingo Will,
Christian M. Günther,
Kai Litzius,
Sebastian Wintz,
Markus Weigand,
Felix Büttner,
Stefan Eisebitt
Abstract:
In ferromagnetic multilayers, a single laser pulse with a fluence above an optical nucleation threshold can create magnetic skyrmions, which are randomly distributed over the area of the laser spot. However, in order to study the dynamics of skyrmions and for their application in future data technology, a controllable localization of the skyrmion nucleation sites is crucial. Here, it is demonstrated that patterned reflective masks behind a thin magnetic film can be designed to locally tailor the optical excitation amplitudes reached, leading to spatially controlled skyrmion nucleation on the nanometer scale. Using x-ray microscopy, the influence of nanopatterned back-side aluminum masks on the optical excitation is studied in two sample geometries with varying layer sequence of substrate and magnetic Co/Pt multilayer. Surprisingly, the masks' effect on suppressing or enhancing skyrmion nucleation reverses when changing this sequence. Moreover, optical near-field enhancements additionally affect the spatial arrangement of the nucleated skyrmions. Simulations of the spatial modulation of the laser excitation and the following heat transfer across the interfaces in the two sample geometries are employed to explain these observations. The results demonstrate a reliable approach to add nanometer-scale spatial control to optically induced magnetization processes on ultrafast timescales.
Submitted 11 August, 2022; v1 submitted 26 May, 2022;
originally announced May 2022.
-
Encoding Domain Knowledge in Multi-view Latent Variable Models: A Bayesian Approach with Structured Sparsity
Authors:
Arber Qoku,
Florian Buettner
Abstract:
Many real-world systems are described not only by data from a single source but via multiple data views. In genomic medicine, for instance, patients can be characterized by data from different molecular layers. Latent variable models with structured sparsity are a commonly used tool for disentangling variation within and across data views. However, their interpretability is cumbersome since it requires a direct inspection and interpretation of each factor from domain experts. Here, we propose MuVI, a novel multi-view latent variable model based on a modified horseshoe prior for modeling structured sparsity. This facilitates the incorporation of limited and noisy domain knowledge, thereby allowing for an analysis of multi-view data in an inherently explainable manner. We demonstrate that our model (i) outperforms state-of-the-art approaches for modeling structured sparsity in terms of the reconstruction error and the precision/recall, (ii) robustly integrates noisy domain expertise in the form of feature sets, (iii) promotes the identifiability of factors and (iv) infers interpretable and biologically meaningful axes of variation in a real-world multi-view dataset of cancer patients.
Submitted 15 March, 2023; v1 submitted 13 April, 2022;
originally announced April 2022.
-
Better Uncertainty Calibration via Proper Scores for Classification and Beyond
Authors:
Sebastian G. Gruber,
Florian Buettner
Abstract:
With model trustworthiness being crucial for sensitive real-world applications, practitioners are putting more and more focus on improving the uncertainty calibration of deep neural networks. Calibration errors are designed to quantify the reliability of probabilistic predictions but their estimators are usually biased and inconsistent. In this work, we introduce the framework of proper calibration errors, which relates every calibration error to a proper score and provides a respective upper bound with optimal estimation properties. This relationship can be used to reliably quantify the model calibration improvement. We theoretically and empirically demonstrate the shortcomings of commonly used estimators compared to our approach. Due to the wide applicability of proper scores, this gives a natural extension of recalibration beyond classification.
Submitted 12 March, 2024; v1 submitted 15 March, 2022;
originally announced March 2022.
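For context on the "commonly used estimators" that the abstract above criticizes, the following is a minimal sketch of the widely used equal-width binned estimator of top-label expected calibration error (ECE). It is not the proper calibration error proposed in the paper; the function name `binned_ece`, the default of 10 bins, and the toy inputs are assumptions chosen for illustration.

```python
import numpy as np

def binned_ece(confidences, correct, n_bins=10):
    """Equal-width binned ECE: weighted mean of |accuracy - confidence| per bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges only, so bin indices run from 0 to n_bins - 1.
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # mask.mean() is the fraction of samples in the bin
    return float(ece)

# Toy example: two confident, correct predictions and two less confident ones,
# of which only one is correct.
ece_value = binned_ece([0.95, 0.95, 0.65, 0.65], [1, 1, 1, 0])
print(ece_value)
```

The result depends on the number of bins and on how many samples fall into each bin, which is one way to see why such estimators can be unreliable on finite data.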
-
Deterministic Generation and Guided Motion of Magnetic Skyrmions by Focused He$^+$-Ion Irradiation
Authors:
L. -M. Kern,
B. Pfau,
V. Deinhart,
M. Schneider,
C. Klose,
K. Gerlinger,
S. Wittrock,
D. Engel,
I. Will,
C. M. Günther,
R. Liefferink,
J. H. Mentink,
S. Wintz,
M. Weigand,
M. -J. Huang,
R. Battistelli,
D. Metternich,
F. Büttner,
K. Höflich,
S. Eisebitt
Abstract:
Magnetic skyrmions are quasiparticles with non-trivial topology, envisioned to play a key role in next-generation data technology while simultaneously attracting fundamental research interest due to their emerging topological charge. In chiral magnetic multilayers, current-generated spin-orbit torques or ultrafast laser excitation can be used to nucleate isolated skyrmions on a picosecond timescale. Both methods, however, produce randomly arranged skyrmions, which inherently limits the precision on the location at which the skyrmions are nucleated. Here, we show that nanopatterning of the anisotropy landscape with a He$^+$-ion beam creates well-defined skyrmion nucleation sites, thereby transforming the skyrmion localization into a deterministic process. This approach allows control of individual skyrmion nucleation as well as guided skyrmion motion with nanometer-scale precision, which is pivotal for both future fundamental studies of skyrmion dynamics and applications.
Submitted 29 April, 2022; v1 submitted 24 February, 2022;
originally announced February 2022.
-
Photon correlation spectroscopy with heterodyne mixing based on soft-x-ray magnetic circular dichroism
Authors:
Christopher Klose,
Felix Büttner,
Wen Hu,
Claudio Mazzoli,
Geoffrey S. D. Beach,
Stefan Eisebitt,
Bastian Pfau
Abstract:
Many magnetic equilibrium states and phase transitions are characterized by fluctuations. Such magnetic fluctuations can in principle be detected with scattering-based x-ray photon correlation spectroscopy (XPCS). However, in the established approach of XPCS, the magnetic scattering signal is quadratic in the magnetic scattering cross section, which results not only in often prohibitively small signals but also in a fundamental inability to detect negative correlations (anticorrelations). Here, we propose to exploit the possibility of heterodyne mixing of the magnetic signal with static charge scattering to reconstruct the first-order (linear) magnetic correlation function. We show that the first-order magnetic scattering signal reconstructed from heterodyne scattering now directly represents the underlying magnetization texture. Moreover, we suggest a practical implementation based on an absorption mask rigidly connected to the sample, which not only produces a static charge scattering signal but also eliminates the problem of drift-induced artificial decay of the correlation functions. Our method thereby significantly broadens the range of scientific questions accessible by magnetic x-ray photon correlation spectroscopy.
Submitted 30 May, 2022; v1 submitted 10 October, 2021;
originally announced October 2021.
-
Encoding Domain Information with Sparse Priors for Inferring Explainable Latent Variables
Authors:
Arber Qoku,
Florian Buettner
Abstract:
Latent variable models are powerful statistical tools that can uncover relevant variation between patients or cells, by inferring unobserved hidden states from observable high-dimensional data. A major shortcoming of current methods, however, is their inability to learn sparse and interpretable hidden states. Additionally, in settings where partial knowledge on the latent structure of the data is readily available, a statistically sound integration of prior information into current methods is challenging. To address these issues, we propose spex-LVM, a factorial latent variable model with sparse priors to encourage the inference of explainable factors driven by domain-relevant information. spex-LVM utilizes existing knowledge of curated biomedical pathways to automatically assign annotated attributes to latent factors, yielding interpretable results tailored to the corresponding domain of interest. Evaluations on simulated and real single-cell RNA-seq datasets demonstrate that our model robustly identifies relevant structure in an inherently explainable manner, distinguishes technical noise from sources of biomedical variation, and provides dataset-specific adaptations of existing pathway annotations. Implementation is available at https://github.com/MLO-lab/spexlvm.
Submitted 11 April, 2022; v1 submitted 8 July, 2021;
originally announced July 2021.
-
Multi-output Gaussian Processes for Uncertainty-aware Recommender Systems
Authors:
Yinchong Yang,
Florian Buettner
Abstract:
Recommender systems are often designed based on a collaborative filtering approach, where user preferences are predicted by modelling interactions between users and items. Many common approaches to solve the collaborative filtering task are based on learning representations of users and items, including simple matrix factorization, Gaussian process latent variable models, and neural-network based embeddings. While matrix factorization approaches fail to model nonlinear relations, neural networks can potentially capture such complex relations with unprecedented predictive power and are highly scalable. However, neither of them is able to model predictive uncertainties. In contrast, Gaussian Process based models can generate a predictive distribution, but cannot scale to large amounts of data. In this manuscript, we propose a novel approach combining the representation learning paradigm of collaborative filtering with multi-output Gaussian processes in a joint framework to generate uncertainty-aware recommendations. We introduce an efficient strategy for model training and inference, resulting in a model that scales to very large and sparse datasets and achieves competitive performance in terms of classical metrics quantifying the reconstruction error. In addition to accurately predicting user preferences, our model also provides meaningful uncertainty estimates about that prediction.
Submitted 8 October, 2021; v1 submitted 8 June, 2021;
originally announced June 2021.
-
Common Limitations of Image Processing Metrics: A Picture Story
Authors:
Annika Reinke,
Minu D. Tizabi,
Carole H. Sudre,
Matthias Eisenmann,
Tim Rädsch,
Michael Baumgartner,
Laura Acion,
Michela Antonelli,
Tal Arbel,
Spyridon Bakas,
Peter Bankhead,
Arriel Benis,
Matthew Blaschko,
Florian Buettner,
M. Jorge Cardoso,
Jianxu Chen,
Veronika Cheplygina,
Evangelia Christodoulou,
Beth Cimini,
Gary S. Collins,
Sandy Engelhardt,
Keyvan Farahani,
Luciana Ferrer,
Adrian Galdran,
Bram van Ginneken,
et al. (68 additional authors not shown)
Abstract:
While the importance of automatic image analysis is continuously increasing, recent meta-research revealed major flaws with respect to algorithm validation. Performance metrics are particularly key for meaningful, objective, and transparent performance assessment and validation of the used automatic algorithms, but relatively little attention has been given to the practical pitfalls when using specific metrics for a given image analysis task. These are typically related to (1) the disregard of inherent metric properties, such as the behaviour in the presence of class imbalance or small target structures, (2) the disregard of inherent data set properties, such as the non-independence of the test cases, and (3) the disregard of the actual biomedical domain interest that the metrics should reflect. This dynamically updated living document aims to illustrate important limitations of performance metrics commonly applied in the field of image analysis. In this context, it focuses on biomedical image analysis problems that can be phrased as image-level classification, semantic segmentation, instance segmentation, or object detection tasks. The current version is based on a Delphi process on metrics conducted by an international consortium of image analysis experts from more than 60 institutions worldwide.
Submitted 6 December, 2023; v1 submitted 12 April, 2021;
originally announced April 2021.
-
Parameterized Temperature Scaling for Boosting the Expressive Power in Post-Hoc Uncertainty Calibration
Authors:
Christian Tomani,
Daniel Cremers,
Florian Buettner
Abstract:
We address the problem of uncertainty calibration and introduce a novel calibration method, Parameterized Temperature Scaling (PTS). Standard deep neural networks typically yield uncalibrated predictions, which can be transformed into calibrated confidence scores using post-hoc calibration methods. In this contribution, we demonstrate that the performance of accuracy-preserving state-of-the-art post-hoc calibrators is limited by their intrinsic expressive power. We generalize temperature scaling by computing prediction-specific temperatures, parameterized by a neural network. We show with extensive experiments that our novel accuracy-preserving approach consistently outperforms existing algorithms across a large number of model architectures, datasets and metrics.
Submitted 17 September, 2022; v1 submitted 24 February, 2021;
originally announced February 2021.
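The idea of prediction-specific temperatures described in the abstract above can be sketched in NumPy as follows. This is a hedged illustration, not the authors' implementation: `toy_temperature_net`, its weights `w` and `b`, and the choice of sorted logits as input features are hypothetical stand-ins for a trained network, and no fitting procedure is shown.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def temperature_scale(logits, temperature):
    """Divide logits by a positive temperature (a scalar or one value per sample)."""
    t = np.asarray(temperature, dtype=float).reshape(-1, 1)
    return softmax(logits / t)

def toy_temperature_net(logits, w, b):
    """Hypothetical stand-in for a learned network mapping logits to T > 0."""
    features = np.sort(logits, axis=1)[:, ::-1]  # logits sorted in descending order
    return np.log1p(np.exp(features @ w + b)) + 1e-3  # softplus keeps T positive

logits = np.array([[4.0, 1.0, 0.0],
                   [2.0, 1.5, 1.0]])
w = np.array([0.5, 0.0, 0.0])  # illustrative weights, not fitted to any data
b = 0.0

temps = toy_temperature_net(logits, w, b)  # one temperature per prediction
probs = temperature_scale(logits, temps)
plain = softmax(logits)
print(temps, probs.max(axis=1), plain.max(axis=1))
```

Because each row of logits is divided by a single positive number, the ranking of classes within a prediction does not change, which is what makes this family of calibrators accuracy-preserving. Standard temperature scaling is the special case where `temperature` is one scalar shared by all samples.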
-
Hierarchical Variational Auto-Encoding for Unsupervised Domain Generalization
Authors:
Xudong Sun,
Florian Buettner
Abstract:
We address the task of domain generalization, where the goal is to train a predictive model such that it is able to generalize to a new, previously unseen domain. We choose a hierarchical generative approach within the framework of variational autoencoders and propose a domain-unsupervised algorithm that is able to generalize to new domains without domain supervision. We show that our method is able to learn representations that disentangle domain-specific information from class-label specific information even in complex settings where domain structure is not observed during training. Our interpretable method outperforms previously proposed generative algorithms for domain generalization as well as other non-generative state-of-the-art approaches in several hierarchical domain settings, including sequential, overlapping, near-continuous domain shift. It also achieves competitive performance on the standard domain generalization benchmark dataset PACS compared to state-of-the-art approaches which rely on observing domain-specific information during training, as well as another domain-unsupervised method. Additionally, we propose model selection based purely on the Evidence Lower Bound (ELBO) as well as weak domain supervision, where implicit domain information can be added to the algorithm.
Submitted 14 May, 2021; v1 submitted 23 January, 2021;
originally announced January 2021.
-
Post-hoc Uncertainty Calibration for Domain Drift Scenarios
Authors:
Christian Tomani,
Sebastian Gruber,
Muhammed Ebrar Erdem,
Daniel Cremers,
Florian Buettner
Abstract:
We address the problem of uncertainty calibration. While standard deep neural networks typically yield uncalibrated predictions, calibrated confidence scores that are representative of the true likelihood of a prediction can be achieved using post-hoc calibration methods. However, to date the focus of these approaches has been on in-domain calibration. Our contribution is two-fold. First, we show that existing post-hoc calibration methods yield highly over-confident predictions under domain shift. Second, we introduce a simple strategy where perturbations are applied to samples in the validation set before performing the post-hoc calibration step. In extensive experiments, we demonstrate that this perturbation step results in substantially better calibration under domain shift on a wide range of architectures and modelling tasks.
Submitted 23 June, 2021; v1 submitted 20 December, 2020;
originally announced December 2020.
-
Towards Trustworthy Predictions from Deep Neural Networks with Fast Adversarial Calibration
Authors:
Christian Tomani,
Florian Buettner
Abstract:
To facilitate a wide-spread acceptance of AI systems guiding decision making in real-world applications, trustworthiness of deployed models is key. That is, it is crucial for predictive models to be uncertainty-aware and yield well-calibrated (and thus trustworthy) predictions for both in-domain samples as well as under domain shift. Recent efforts to account for predictive uncertainty include post-processing steps for trained neural networks, Bayesian neural networks as well as alternative non-Bayesian approaches such as ensemble approaches and evidential deep learning. Here, we propose an efficient yet general modelling approach for obtaining well-calibrated, trustworthy probabilities for samples obtained after a domain shift. We introduce a new training strategy combining an entropy-encouraging loss term with an adversarial calibration loss term and demonstrate that this results in well-calibrated and technically trustworthy predictions for a wide range of domain drifts. We comprehensively evaluate previously proposed approaches on different data modalities, a large range of data sets including sequence data, network architectures and perturbation strategies. We observe that our modelling approach substantially outperforms existing state-of-the-art approaches, yielding well-calibrated predictions under domain drift.
Submitted 2 March, 2021; v1 submitted 20 December, 2020;
originally announced December 2020.
-
TIMELY: Improving Labeling Consistency in Medical Imaging for Cell Type Classification
Authors:
Yushan Liu,
Markus M. Geipel,
Christoph Tietz,
Florian Buettner
Abstract:
Diagnosing diseases such as leukemia or anemia requires reliable counts of blood cells. Hematologists usually label and count microscopy images of blood cells manually. In many cases, however, cells in different maturity states are difficult to distinguish, and in combination with image noise and subjectivity, humans are prone to make labeling mistakes. This results in labels that are often not reproducible, which can directly affect the diagnoses. We introduce TIMELY, a probabilistic model that combines pseudotime inference methods with inhomogeneous hidden Markov trees, which addresses this challenge of label inconsistency. We show first on simulation data that TIMELY is able to identify and correct wrong labels with higher precision and recall than baseline methods for labeling correction. We then apply our method to two real-world datasets of blood cell data and show that TIMELY successfully finds inconsistent labels, thereby improving the quality of human-generated labels.
Submitted 10 July, 2020;
originally announced July 2020.
-
arXiv:2004.07763
cond-mat.mes-hall
cond-mat.mtrl-sci
cond-mat.str-el
cond-mat.supr-con
physics.app-ph
A Magnon Scattering Platform
Authors:
Tony X. Zhou,
Joris J. Carmiggelt,
Lisa M. Gächter,
Ilya Esterlis,
Dries Sels,
Rainer J. Stöhr,
Chunhui Du,
Daniel Fernandez,
Joaquin F. Rodriguez-Nieva,
Felix Büttner,
Eugene Demler,
Amir Yacoby
Abstract:
Scattering experiments have revolutionized our understanding of nature. Examples include the discovery of the nucleus, crystallography, and the discovery of the double helix structure of DNA. Scattering techniques differ by the type of the particles used, the interaction these particles have with target materials and the range of wavelengths used. Here, we demonstrate a new 2-dimensional table-top scattering platform for exploring magnetic properties of materials on mesoscopic length scales. Long-lived, coherent magnonic excitations are generated in a thin film of YIG and scattered off a magnetic target deposited on its surface. The scattered waves are then recorded using a scanning NV center magnetometer that allows sub-wavelength imaging and operation under conditions ranging from cryogenic to ambient environment. While most scattering platforms measure only the intensity of the scattered waves, our imaging method allows for spatial determination of both amplitude and phase of the scattered waves, thereby allowing for a systematic reconstruction of the target scattering potential. Our experimental results are consistent with theoretical predictions for such a geometry and reveal several unusual features of the magnetic response of the target, including suppression near the target edges and a gradient in the direction perpendicular to the direction of surface wave propagation. Our results establish magnon scattering experiments as a new platform for studying correlated many-body systems.
Submitted 16 April, 2020;
originally announced April 2020.
-
AAAI FSS-19: Human-Centered AI: Trustworthiness of AI Models and Data Proceedings
Authors:
Florian Buettner,
John Piorkowski,
Ian McCulloh,
Ulli Waltinger
Abstract:
To facilitate the widespread acceptance of AI systems guiding decision-making in real-world applications, it is key that solutions comprise trustworthy, integrated human-AI systems. Not only in safety-critical applications such as autonomous driving or medicine, but also in dynamic open world systems in industry and government it is crucial for predictive models to be uncertainty-aware and yield trustworthy predictions. Another key requirement for deployment of AI at enterprise scale is to realize the importance of integrating human-centered design into AI systems such that humans are able to use systems effectively, understand results and output, and explain findings to oversight committees.
While the focus of this symposium was on AI systems to improve data quality and technical robustness and safety, we welcomed submissions from broadly defined areas also discussing approaches addressing requirements such as explainable models, human trust and ethical aspects of AI.
Submitted 15 January, 2020;
originally announced January 2020.
-
textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior
Authors:
Pankaj Gupta,
Yatin Chaudhary,
Florian Buettner,
Hinrich Schütze
Abstract:
We address two challenges of probabilistic topic modelling in order to better estimate the probability of a word in a given context, i.e., P(word|context): (1) No Language Structure in Context: Probabilistic topic models ignore word order by summarizing a given context as a "bag-of-word" and consequently the semantics of words in the context is lost. The LSTM-LM learns a vector-space representatio…
▽ More
We address two challenges of probabilistic topic modelling in order to better estimate the probability of a word in a given context, i.e., P(word|context): (1) No Language Structure in Context: Probabilistic topic models ignore word order by summarizing a given context as a "bag-of-words" and consequently the semantics of words in the context is lost. The LSTM-LM learns a vector-space representation of each word by accounting for word order in local collocation patterns and models complex characteristics of language (e.g., syntax and semantics), while the TM simultaneously learns a latent representation from the entire document and discovers the underlying thematic structure. We unite two complementary paradigms of learning the meaning of word occurrences by combining a TM (e.g., DocNADE) and an LM in a unified probabilistic framework, named ctx-DocNADE. (2) Limited Context and/or Smaller training corpus of documents: In settings with a small number of word occurrences (i.e., lack of context) in short text or data sparsity in a corpus of few documents, the application of TMs is challenging. We address this challenge by incorporating external knowledge into neural autoregressive topic models via a language modelling approach: we use word embeddings as input of an LSTM-LM with the aim of improving the word-topic mapping on a smaller and/or short-text corpus. The proposed DocNADE extension is named ctx-DocNADEe.
We present novel neural autoregressive topic model variants coupled with neural LMs and embedding priors that consistently outperform state-of-the-art generative TMs in terms of generalization (perplexity), interpretability (topic coherence) and applicability (retrieval and classification) over 6 long-text and 8 short-text datasets from diverse domains.
Submitted 23 February, 2019; v1 submitted 9 October, 2018;
originally announced October 2018.
-
Document Informed Neural Autoregressive Topic Models with Distributional Prior
Authors:
Pankaj Gupta,
Yatin Chaudhary,
Florian Buettner,
Hinrich Schütze
Abstract:
We address two challenges in topic models: (1) Context information around words helps in determining their actual meaning, e.g., "networks" used in the contexts "artificial neural networks" vs. "biological neuron networks". Generative topic models infer topic-word distributions, taking little or no context into account. Here, we extend a neural autoregressive topic model to exploit the full context information around words in a document in a language modeling fashion. The proposed model is named iDocNADE. (2) Due to the small number of word occurrences (i.e., lack of context) in short text and data sparsity in a corpus of few documents, the application of topic models is challenging on such texts. Therefore, we propose a simple and efficient way of incorporating external knowledge into neural autoregressive topic models: we use embeddings as a distributional prior. The proposed variants are named DocNADEe and iDocNADEe.
We present novel neural autoregressive topic model variants that consistently outperform state-of-the-art generative topic models in terms of generalization, interpretability (topic coherence) and applicability (retrieval and classification) over 7 long-text and 8 short-text datasets from diverse domains.
Submitted 14 January, 2019; v1 submitted 15 September, 2018;
originally announced September 2018.
-
Document Informed Neural Autoregressive Topic Models
Authors:
Pankaj Gupta,
Florian Buettner,
Hinrich Schütze
Abstract:
Context information around words helps in determining their actual meaning, for example "networks" used in contexts of artificial neural networks or biological neuron networks. Generative topic models infer topic-word distributions, taking little or no context into account. Here, we extend a neural autoregressive topic model to exploit the full context information around words in a document in a language modeling fashion. This results in improved performance in terms of generalization, interpretability and applicability. We apply our modeling approach to seven data sets from various domains and demonstrate that our approach consistently outperforms state-of-the-art generative topic models. With the learned representations, we show on average a gain of 9.6% (0.57 vs. 0.52) in precision at retrieval fraction 0.02 and 7.2% (0.582 vs. 0.543) in F1 for text categorization.
Submitted 11 August, 2018;
originally announced August 2018.
-
Magnetic Imaging and Microscopy
Authors:
Robert M. Reeve,
Hans-Joachim Elmers,
Felix Büttner,
Mathias Kläui
Abstract:
The magnetic domain configuration of a system reveals a wealth of information about the fundamental magnetic properties of that system and can be a critical factor in the operation of magnetic devices. Not only are the details of the domain structure strongly governed by materials parameters, but in thin films and mesoscopic elements the geometry often has a pivotal effect, providing a convenient handle to tailor desired domain states. Furthermore, a full understanding of a system also requires investigation of the dynamic evolution of the spin state, which is of particular importance for applications relying on, e.g., the switching of magnetic elements. Here we review some of the main modern techniques for magnetic imaging, highlighting their respective advantages and limitations. The methods for imaging domain configurations and spin structures cover various spatial and temporal resolution scales and encompass those based on electron and x-ray microscopy as well as scanning probe techniques. Furthermore, away from the discipline of condensed-matter physics, magnetic effects are instrumental in a number of techniques for medical imaging, some key examples of which we also present.
Submitted 29 June, 2018; v1 submitted 20 June, 2018;
originally announced June 2018.
-
Investigation of the Dzyaloshinskii-Moriya interaction and room temperature skyrmions in W/CoFeB/MgO thin films and microwires
Authors:
S. Jaiswal,
K. Litzius,
I. Lemesh,
F. Buttner,
S. Finizio,
J. Raabe,
M. Weigand,
K. Lee,
J. Langer,
B. Ocker,
G. Jakob,
G. S. D. Beach,
M. Klaeui
Abstract:
Recent studies have shown that material structures that lack structural inversion symmetry and have high spin-orbit coupling can exhibit chiral magnetic textures and skyrmions, which could be a key component for next-generation storage devices. The Dzyaloshinskii-Moriya Interaction (DMI) that stabilizes skyrmions is an anti-symmetric exchange interaction favoring non-collinear orientation of neighboring spins. It has been shown that material systems with high DMI can lead to very efficient domain wall and skyrmion motion by spin-orbit torques. To engineer such devices, it is important to quantify the DMI for a given material system. Here we extract the DMI at the Heavy Metal (HM)/Ferromagnet (FM) interface using two complementary measurement schemes, namely asymmetric domain wall motion and magnetic stripe annihilation. By using the two different measurement schemes, we find for W(5 nm)/Co20Fe60B20(0.6 nm)/MgO(2 nm) the DMI to be 0.68 +/- 0.05 mJ/m2 and 0.73 +/- 0.5 mJ/m2, respectively. Furthermore, we show that this DMI stabilizes skyrmions at room temperature and that there is a strong dependence of the DMI on the relative composition of the CoFeB alloy. Finally, we optimize the layers and the interfaces using different growth conditions and demonstrate that a higher deposition rate leads to a more uniform film with reduced pinning and skyrmions that can be manipulated by spin-orbit torques.
Submitted 19 June, 2017;
originally announced June 2017.
-
Field-free deterministic ultra fast creation of skyrmions by spin orbit torques
Authors:
Felix Büttner,
Ivan Lemesh,
Michael Schneider,
Bastian Pfau,
Christian M. Günther,
Piet Hessing,
Jan Geilhufe,
Lucas Caretta,
Dieter Engel,
Benjamin Krüger,
Jens Viefhaus,
Stefan Eisebitt,
Geoffrey S. D. Beach
Abstract:
Magnetic skyrmions are currently the most promising option to realize current-driven magnetic shift registers. A variety of concepts to create skyrmions were proposed and demonstrated. However, none of the reported experiments show controlled creation of single skyrmions using integrated designs. Here, we demonstrate that skyrmions can be generated deterministically on subnanosecond timescales in magnetic racetracks at artificial or natural defects using spin orbit torque (SOT) pulses. The mechanism is largely similar to SOT-induced switching of uniformly magnetized elements, but due to the effect of the Dzyaloshinskii-Moriya interaction (DMI), external fields are not required. Our observations provide a simple and reliable means for skyrmion writing that can be readily integrated into racetrack devices.
Submitted 4 May, 2017;
originally announced May 2017.
-
Full phase diagram of isolated skyrmions in a ferromagnet
Authors:
Felix Büttner,
Ivan Lemesh,
Geoffrey S. D. Beach
Abstract:
Magnetic skyrmions are topological quasiparticles of great interest for data storage applications because of their small size, high stability, and ease of manipulation via electric current. Theoretically, however, skyrmions are poorly understood since existing theories are not applicable to small skyrmion sizes and finite material thicknesses. Here, we present a complete theoretical framework to determine the energy of any skyrmion in any material, assuming only a circularly symmetric 360$^\circ$ domain wall profile and a homogeneous magnetization profile in the out-of-plane direction. Our model precisely agrees with existing experimental data and micromagnetic simulations. Surprisingly, we can prove that there is no topological protection of skyrmions. We discover and confirm new phases, such as bi-stability, a phenomenon unknown in magnetism so far. The outstanding computational performance and precision of our model allow us to obtain the complete phase diagram of static skyrmions and to tackle the inverse problem of finding materials corresponding to given skyrmion properties, a milestone of skyrmion engineering.
Submitted 27 April, 2017;
originally announced April 2017.
-
Imaging the Spin Texture of a Skyrmion Under Ambient Conditions Using an Atomic-Sized Sensor
Authors:
Yuliya Dovzhenko,
Francesco Casola,
Sarah Schlotter,
Tony X. Zhou,
Felix Büttner,
Ronald L. Walsworth,
Geoffrey S. D. Beach,
Amir Yacoby
Abstract:
Magnetic skyrmions are two-dimensional non-collinear spin textures characterised by an integer topological number. They commonly crystallise at low temperatures in bulk noncentrosymmetric ferromagnets where the lack of inversion symmetry leads to an antisymmetric component of the exchange interaction. Recently, stable room-temperature skyrmions were reported in stacks of thin magnetic films where antisymmetric exchange results from broken symmetry at the interface. Determining the spin structure of these technologically-relevant skyrmions in the presence of external magnetic fields has remained a key experimental challenge. Here, we use the single electron spin of a Nitrogen-Vacancy (NV) centre in diamond to perform scanning magnetometry of skyrmions in Pt/Co/Ta multilayers under ambient conditions. We introduce a novel method to assess the manifold of spin textures compatible with the measured local magnetic fields and identify physically allowed configurations based on the topology of the resulting solution. We determine that the underlying magnetization pattern for the skyrmion is consistent with a cycloid (or Néel)-like spin texture, in agreement with theoretical predictions for interfacial antisymmetric exchange interaction. Our results open up wide-ranging possibilities in quantitative model-free imaging of two-dimensional spin structures using scanning NV magnetometry.
Submitted 8 August, 2017; v1 submitted 2 November, 2016;
originally announced November 2016.
-
Skyrmion Hall Effect Revealed by Direct Time-Resolved X-Ray Microscopy
Authors:
Kai Litzius,
Ivan Lemesh,
Benjamin Krüger,
Pedram Bassirian,
Lucas Caretta,
Kornel Richter,
Felix Büttner,
Koji Sato,
Oleg A. Tretiakov,
Johannes Förster,
Robert M. Reeve,
Markus Weigand,
Iuliia Bykova,
Hermann Stoll,
Gisela Schütz,
Geoffrey S. D. Beach,
Mathias Kläui
Abstract:
Magnetic skyrmions are highly promising candidates for future spintronic applications such as skyrmion racetrack memories and logic devices. They exhibit exotic and complex dynamics governed by topology and are less influenced by defects, such as edge roughness, than conventionally used domain walls. In particular, their finite topological charge leads to a predicted "skyrmion Hall effect", in which current-driven skyrmions acquire a transverse velocity component analogous to charged particles in the conventional Hall effect. Here, we present nanoscale pump-probe imaging that for the first time reveals the real-time dynamics of skyrmions driven by current-induced spin orbit torque (SOT). We find that skyrmions move at a well-defined angle Θ_{SH} that can exceed 30° with respect to the current flow, but in contrast to theoretical expectations, Θ_{SH} increases linearly with velocity up to at least 100 m/s. We explain our observation based on internal mode excitations in combination with a field-like SOT, showing that one must go beyond the usual rigid skyrmion description to unravel the dynamics.
Submitted 28 September, 2016; v1 submitted 25 August, 2016;
originally announced August 2016.
-
Phase-locking in cascaded stimulated Brillouin scattering
Authors:
Thomas F. S. Büttner,
Christopher G. Poulton,
M. J. Steel,
Darren D. Hudson,
Benjamin J. Eggleton
Abstract:
Cascaded stimulated Brillouin scattering (SBS) is a complex nonlinear optical process that results in the generation of several optical waves that are frequency shifted by an acoustic resonance frequency. Four-wave mixing (FWM) between these Brillouin shifted optical waves can create an equally spaced optical frequency comb with a stable spectral phase, i.e. a Brillouin frequency comb (BFC). Here, we investigate phase-locking of the spectral components of BFCs, considering FWM interactions arising from the Kerr-nonlinearity as well as from coupling by the acoustic field. Deriving for the first time the coupled-mode equations that include all relevant nonlinear interactions, we examine the contribution of the various nonlinear processes to phase-locking, and show that different regimes can be obtained that depend on the length scale on which the field amplitudes vary.
Submitted 25 October, 2015;
originally announced October 2015.
-
Accurate calculation of the transverse anisotropy in perpendicularly magnetized multilayers
Authors:
Felix Büttner,
Benjamin Krüger,
Stefan Eisebitt,
Mathias Kläui
Abstract:
The transverse anisotropy constant and the related Döring mass density are key parameters of the one-dimensional model to describe the motion of magnetic domain walls. So far, no general framework is available to determine these quantities from static characterizations such as magnetometry measurements. Here, we derive a universal analytical expression to calculate the transverse anisotropy constant for the important class of perpendicular magnetic multilayers. All the required input parameters of the model, such as the number of repeats, the thickness of a single magnetic layer, and the layer periodicity, as well as the effective perpendicular anisotropy, the saturation magnetization, and the static domain wall width are accessible by static sample characterizations. We apply our model to a widely used multilayer system and find that the effective transverse anisotropy constant differs by a factor of 7 from the value obtained using the conventional approximations, showing the importance of using our analysis scheme.
Submitted 27 February, 2015;
originally announced February 2015.
-
Phase-locking in Multi-Frequency Brillouin Oscillator via Four Wave Mixing
Authors:
Thomas F. S. Buettner,
Irina V. Kabakova,
Darren D. Hudson,
Ravi Pant,
Christopher G. Poulton,
Alexander C. Judge,
Benjamin J. Eggleton
Abstract:
Stimulated Brillouin scattering (SBS) and Kerr-nonlinear four-wave mixing (FWM) are among the most important and widely studied nonlinear effects in optical fibres. At high powers, SBS can be cascaded, producing multiple Stokes waves spaced by the Brillouin frequency shift. Here, we investigate the complex nonlinear interaction of the cascade of Stokes waves, generated in a Fabry-Perot chalcogenide fibre resonator through the combined action of SBS and FWM. We demonstrate the existence of parameter regimes in which pump and Stokes waves attain a phase-locked steady state. Real-time measurements of 40 ps pulses with 8 GHz repetition rate are presented, confirming short- and long-term stability. Numerical simulations qualitatively agree with experiments and show the significance of FWM in phase-locking of pump and Stokes waves. Our findings can be applied to the design of novel picosecond pulse sources with GHz repetition rate for optical communication systems.
Submitted 17 February, 2014;
originally announced February 2014.
-
Analyzing Flowgraphs with ATL
Authors:
Valerio Cosentino,
Massimo Tisi,
Fabian Büttner
Abstract:
This paper presents a solution to the Flowgraphs case study for the Transformation Tool Contest 2013 (TTC 2013). Starting from Java source code, we execute a chain of model transformations to derive a simplified model of the program, its control flow graph and its data flow graph. Finally we develop a model transformation that validates the program flow by comparing it with a set of flow specifications written in a domain specific language. The proposed solution has been implemented using ATL.
Submitted 2 December, 2013;
originally announced December 2013.