Aligning Transformers with Weisfeiler-Leman

Authors: Luis Müller, Christopher Morris

Abstract: Graph neural network architectures aligned with the $k$-dimensional Weisfeiler--Leman ($k$-WL) hierarchy offer theoretically well-understood expressive power. However, these architectures often fail to deliver state-of-the-art predictive performance on real-world graphs, limiting their practical utility. While recent works aligning graph transformer architectures with the $k$-WL hierarchy have shown promising empirical results, employing transformers for higher orders of $k$ remains challenging due to a prohibitive runtime and memory complexity of self-attention as well as impractical architectural assumptions, such as an infeasible number of attention heads. Here, we advance the alignment of transformers with the $k$-WL hierarchy, showing stronger expressivity results for each $k$, making them more feasible in practice. In addition, we develop a theoretical framework that allows the study of established positional encodings such as Laplacian PEs and SPE. We evaluate our transformers on the large-scale PCQM4Mv2 dataset, showing competitive predictive performance with the state-of-the-art and demonstrating strong downstream performance when fine-tuning them on small-scale molecular datasets. Our code is available at https://github.com/luis-mueller/wl-transformers.

Submitted 5 June, 2024; originally announced June 2024.

Comments: Accepted at ICML 2024

arXiv:2405.03689 [pdf, other]

Pose Priors from Language Models

Authors: Sanjay Subramanian, Evonne Ng, Lea Müller, Dan Klein, Shiry Ginosar, Trevor Darrell

Abstract: We present a zero-shot pose optimization method that enforces accurate physical contact constraints when estimating the 3D pose of humans. Our central insight is that since language is often used to describe physical interaction, large pretrained text-based models can act as priors on pose estimation. We can thus leverage this insight to improve pose estimation by converting natural language descriptors, generated by a large multimodal model (LMM), into tractable losses to constrain the 3D pose optimization. Despite its simplicity, our method produces surprisingly compelling pose reconstructions of people in close contact, correctly capturing the semantics of the social and physical interactions. We demonstrate that our method rivals more complex state-of-the-art approaches that require expensive human annotation of contact points and training specialized models. Moreover, unlike previous approaches, our method provides a unified framework for resolving self-contact and person-to-person contact.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2404.16051 [pdf, other]

TimeFlows: Visualizing Process Chronologies from Vast Collections of Heterogeneous Information Objects

Authors: Max Lonysa Muller, Erik Saaman, Jan Martijn E. M. van der Werf, Charles Jeurgens, Hajo A. Reijers

Abstract: In many fact-finding investigations, notably parliamentary inquiries, process chronologies are created to reconstruct how a controversial policy or decision came into existence. Current approaches, like timelines, lack the expressiveness to represent the variety of relations in which historic events may link to the overall chronology. This obfuscates the nature of the interdependence among the events, and the texts from which they are distilled. Based on explorative interviews with expert analysts, we propose an extended, rich set of relationships. We describe how these can be visualized as TimeFlows. We provide an example of such a visualization by illustrating the Childcare Benefits Scandal -- an affair that deeply affected Dutch politics in recent years. This work extends the scope of existing process discovery research into the direction of unveiling non-repetitive processes from unstructured information objects.

Submitted 2 May, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: 16 pages, accepted at RCIS 2024

arXiv:2404.14986 [pdf, other]

$\texttt{MiniMol}$: A Parameter-Efficient Foundation Model for Molecular Learning

Authors: Kerstin Kläser, Błażej Banaszewski, Samuel Maddrell-Mander, Callum McLean, Luis Müller, Ali Parviz, Shenyang Huang, Andrew Fitzgibbon

Abstract: In biological tasks, data is rarely plentiful as it is generated from hard-to-gather measurements. Therefore, pre-training foundation models on large quantities of available data and then transferring them to low-data downstream tasks is a promising direction. However, how to design effective foundation models for molecular learning remains an open question, with existing approaches typically focusing on models with large parameter capacities. In this work, we propose $\texttt{MiniMol}$, a foundational model for molecular learning with 10 million parameters. $\texttt{MiniMol}$ is pre-trained on a mix of roughly 3300 sparsely defined graph- and node-level tasks of both quantum and biological nature. The pre-training dataset includes approximately 6 million molecules and 500 million labels. To demonstrate the generalizability of $\texttt{MiniMol}$ across tasks, we evaluate it on downstream tasks from the Therapeutic Data Commons (TDC) ADMET group, showing significant improvements over the prior state-of-the-art foundation model across 17 tasks. $\texttt{MiniMol}$ will be a public and open-sourced model for future research.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2401.14267 [pdf]

Transformers and Cortical Waves: Encoders for Pulling In Context Across Time

Authors: Lyle Muller, Patricia S. Churchland, Terrence J. Sejnowski

Abstract: The capabilities of transformer networks such as ChatGPT and other Large Language Models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence - for example, all the words in a sentence - into a long "encoding vector" that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, "self-attention" applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity traveling across single cortical areas or multiple regions at the whole-brain scale could implement a similar encoding principle. By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable temporal context to be extracted from sequences of sensory inputs, the same computational principle used in transformers.

Submitted 2 July, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

Comments: 25 pages, 5 figures

arXiv:2401.10119 [pdf, other]

Towards Principled Graph Transformers

Authors: Luis Müller, Daniel Kusuma, Blai Bonet, Christopher Morris

Abstract: Graph learning architectures based on the k-dimensional Weisfeiler-Leman (k-WL) hierarchy offer a theoretically well-understood expressive power. However, such architectures often fail to deliver solid predictive performance on real-world tasks, limiting their practical impact. In contrast, global attention-based models such as graph transformers demonstrate strong performance in practice, but comparing their expressive power with the k-WL hierarchy remains challenging, particularly since these architectures rely on positional or structural encodings for their expressivity and predictive performance. To address this, we show that the recently proposed Edge Transformer, a global attention model operating on node pairs instead of nodes, has at least 3-WL expressive power. Empirically, we demonstrate that the Edge Transformer surpasses other theoretically aligned architectures regarding predictive performance while not relying on positional or structural encodings. Our code is available at https://github.com/luis-mueller/towards-principled-gts

Submitted 24 May, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

arXiv:2312.03043 [pdf, other]

Navigating the Synthetic Realm: Harnessing Diffusion-based Models for Laparoscopic Text-to-Image Generation

Authors: Simeon Allmendinger, Patrick Hemmer, Moritz Queisner, Igor Sauer, Leopold Müller, Johannes Jakubik, Michael Vössing, Niklas Kühl

Abstract: Recent advances in synthetic imaging open up opportunities for obtaining additional data in the field of surgical imaging. This data can provide reliable supplements supporting surgical applications and decision-making through computer vision. Particularly the field of image-guided surgery, such as laparoscopic and robotic-assisted surgery, benefits strongly from synthetic image datasets and virtual surgical training methods. Our study presents an intuitive approach for generating synthetic laparoscopic images from short text prompts using diffusion-based generative models. We demonstrate the usage of state-of-the-art text-to-image architectures in the context of laparoscopic imaging with regard to the surgical removal of the gallbladder as an example. Results on fidelity and diversity demonstrate that diffusion-based models can acquire knowledge about the style and semantics in the field of image-guided surgery. A validation study with a human assessment survey underlines the realistic nature of our synthetic data, as medical personnel detecting actual images in a pool that includes generated images produce a false-positive rate of 66%. In addition, an investigation of a state-of-the-art machine learning model for recognizing surgical actions indicates improvements of up to 5.20% when it is trained with additional generated images. Overall, the achieved image quality contributes to the usage of computer-generated images in surgical applications and enhances its path to maturity.

Submitted 5 December, 2023; originally announced December 2023.

arXiv:2311.16943 [pdf, other]

Image segmentation with traveling waves in an exactly solvable recurrent neural network

Authors: Luisa H. B. Liboni, Roberto C. Budzinski, Alexandra N. Busch, Sindy Löwe, Thomas A. Keller, Max Welling, Lyle E. Muller

Abstract: We study image segmentation using spatiotemporal dynamics in a recurrent neural network where the state of each unit is given by a complex number. We show that this network generates sophisticated spatiotemporal dynamics that can effectively divide an image into groups according to a scene's structural characteristics. Using an exact solution of the recurrent network's dynamics, we present a precise description of the mechanism underlying object segmentation in this network, providing a clear mathematical interpretation of how the network performs this task. We then demonstrate a simple algorithm for object segmentation that generalizes across inputs ranging from simple geometric objects in grayscale images to natural images. Object segmentation across all images is accomplished with one recurrent neural network that has a single, fixed set of weights. This demonstrates the expressive potential of recurrent neural networks when constructed using a mathematical approach that brings together their structure, dynamics, and computation.

Submitted 28 November, 2023; originally announced November 2023.

arXiv:2311.16431 [pdf, other]

An exact mathematical description of computation with transient spatiotemporal dynamics in a complex-valued neural network

Authors: Roberto C. Budzinski, Alexandra N. Busch, Samuel Mestern, Erwan Martin, Luisa H. B. Liboni, Federico W. Pasini, Ján Mináč, Todd Coleman, Wataru Inoue, Lyle E. Muller

Abstract: We study a complex-valued neural network (cv-NN) with linear, time-delayed interactions. We report that the cv-NN displays sophisticated spatiotemporal dynamics, including partially synchronized "chimera" states. We then use these spatiotemporal dynamics, in combination with a nonlinear readout, for computation. The cv-NN can instantiate dynamics-based logic gates, encode short-term memories, and mediate secure message passing through a combination of interactions and time delays. The computations in this system can be fully described in an exact, closed-form mathematical expression. Finally, using direct intracellular recordings of neurons in slices from neocortex, we demonstrate that computations in the cv-NN are decodable by living biological neurons. These results demonstrate that complex-valued linear systems can perform sophisticated computations, while also being exactly solvable. Taken together, these results open future avenues for design of highly adaptable, bio-hybrid computing systems that can interface seamlessly with other neural networks.

Submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.09744 [pdf, other]

Redefining the Laparoscopic Spatial Sense: AI-based Intra- and Postoperative Measurement from Stereoimages

Authors: Leopold Müller, Patrick Hemmer, Moritz Queisner, Igor Sauer, Simeon Allmendinger, Johannes Jakubik, Michael Vössing, Niklas Kühl

Abstract: A significant challenge in image-guided surgery is the accurate measurement of relevant structures such as vessel segments, resection margins, or bowel lengths. While this task is an essential component of many surgeries, it involves substantial human effort and is prone to inaccuracies. In this paper, we develop a novel human-AI-based method for laparoscopic measurements utilizing stereo vision that has been guided by practicing surgeons. Based on a holistic qualitative requirements analysis, this work proposes a comprehensive measurement method, which comprises state-of-the-art machine learning architectures, such as RAFT-Stereo and YOLOv8. The developed method is assessed in various realistic experimental evaluation environments. Our results outline the potential of our method, which achieves high accuracies in distance measurements with errors below 1 mm. Furthermore, on-surface measurements demonstrate robustness when applied in challenging environments with textureless regions. Overall, by addressing the inherent challenges of image-guided surgery, we lay the foundation for a more robust and accurate solution for intra- and postoperative measurements, enabling more precise, safe, and efficient surgical procedures.

Submitted 16 November, 2023; originally announced November 2023.

Comments: 38th AAAI Conference on Artificial Intelligence (AAAI-24)

arXiv:2310.04292 [pdf, other]

Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets

Authors: Dominique Beaini, Shenyang Huang, Joao Alex Cunha, Zhiyi Li, Gabriela Moisescu-Pareja, Oleksandr Dymov, Samuel Maddrell-Mander, Callum McLean, Frederik Wenkel, Luis Müller, Jama Hussein Mohamud, Ali Parviz, Michael Craig, Michał Koziarski, Jiarui Lu, Zhaocheng Zhu, Cristian Gabellini, Kerstin Klaser, Josef Dean, Cas Wognum, Maciej Sypetkowski, Guillaume Rabusseau, Reihaneh Rabbany, Jian Tang, Christopher Morris, et al. (10 additional authors not shown)

Abstract: Recently, pre-trained foundation models have enabled significant advancements in multiple fields. In molecular machine learning, however, where datasets are often hand-curated, and hence typically small, the lack of datasets with labeled features, and codebases to manage those datasets, has hindered the development of foundation models. In this work, we present seven novel datasets categorized by size into three distinct categories: ToyMix, LargeMix and UltraLarge. These datasets push the boundaries in both the scale and the diversity of supervised labels for molecular learning. They cover nearly 100 million molecules and over 3000 sparsely defined tasks, totaling more than 13 billion individual labels of both quantum and biological nature. In comparison, our datasets contain 300 times more data points than the widely used OGB-LSC PCQM4Mv2 dataset, and 13 times more than the quantum-only QM1B dataset. In addition, to support the development of foundational models based on our proposed datasets, we present the Graphium graph machine learning library which simplifies the process of building and training molecular machine learning models for multi-task and multi-level molecular datasets. Finally, we present a range of baseline results as a starting point of multi-task and multi-level training on these datasets. Empirically, we observe that performance on low-resource biological datasets shows improvement by also training on large amounts of quantum data. This indicates that there may be potential in multi-task and multi-level training of a foundation model and fine-tuning it to resource-constrained downstream tasks.

Submitted 18 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

arXiv:2310.03507 [pdf, other]

RL-based Stateful Neural Adaptive Sampling and Denoising for Real-Time Path Tracing

Authors: Antoine Scardigli, Lukas Cavigelli, Lorenz K. Müller

Abstract: Monte-Carlo path tracing is a powerful technique for realistic image synthesis but suffers from high levels of noise at low sample counts, limiting its use in real-time applications. To address this, we propose a framework with end-to-end training of a sampling importance network, a latent space encoder network, and a denoiser network. Our approach uses reinforcement learning to optimize the sampling importance network, thus avoiding explicit numerically approximated gradients. Our method does not aggregate the sampled values per pixel by averaging but keeps all sampled values which are then fed into the latent space encoder. The encoder replaces handcrafted spatiotemporal heuristics by learned representations in a latent space. Finally, a neural denoiser is trained to refine the output image. Our approach increases visual quality on several challenging datasets and reduces rendering times for equal quality by a factor of 1.6x compared to the previous state-of-the-art, making it a promising solution for real-time applications.

Submitted 5 October, 2023; originally announced October 2023.

Comments: Submitted to NeurIPS. https://openreview.net/forum?id=xNyR7DXUzJ

arXiv:2309.08045 [pdf, other]

Traveling Waves Encode the Recent Past and Enhance Sequence Learning

Authors: T. Anderson Keller, Lyle Muller, Terrence Sejnowski, Max Welling

Abstract: Traveling waves of neural activity have been observed throughout the brain at a diversity of regions and scales; however, their precise computational role is still debated. One physically inspired hypothesis suggests that the cortical sheet may act like a wave-propagating system capable of invertibly storing a short-term memory of sequential stimuli through induced waves traveling across the cortical surface, and indeed many experimental results from neuroscience correlate wave activity with memory tasks. To date, however, the computational implications of this idea have remained hypothetical due to the lack of a simple recurrent neural network architecture capable of exhibiting such waves. In this work, we introduce a model to fill this gap, which we denote the Wave-RNN (wRNN), and demonstrate how such an architecture indeed efficiently encodes the recent past through a suite of synthetic memory tasks where wRNNs learn faster and reach significantly lower error than wave-free counterparts. We further explore the implications of this memory storage system on more complex sequence modeling tasks such as sequential image classification and find that wave-based models not only again outperform comparable wave-free RNNs while using significantly fewer parameters, but additionally perform comparably to more complex gated architectures such as LSTMs and GRUs.

Submitted 14 March, 2024; v1 submitted 3 September, 2023; originally announced September 2023.

arXiv:2307.03571 [pdf, other]

Smoothing the Edges: Smooth Optimization for Sparse Regularization using Hadamard Overparametrization

Authors: Chris Kolb, Christian L. Müller, Bernd Bischl, David Rügamer

Abstract: We present a framework for smooth optimization of explicitly regularized objectives for (structured) sparsity. These non-smooth and possibly non-convex problems typically rely on solvers tailored to specific models and regularizers. In contrast, our method enables fully differentiable and approximation-free optimization and is thus compatible with the ubiquitous gradient descent paradigm in deep learning. The proposed optimization transfer comprises an overparametrization of selected parameters and a change of penalties. In the overparametrized problem, smooth surrogate regularization induces non-smooth, sparse regularization in the base parametrization. We prove that the surrogate objective is equivalent in the sense that it not only has identical global minima but also matching local minima, thereby avoiding the introduction of spurious solutions. Additionally, our theory establishes results of independent interest regarding matching local minima for arbitrary, potentially unregularized, objectives. We comprehensively review sparsity-inducing parametrizations across different fields that are covered by our general theory, extend their scope, and propose improvements in several aspects. Numerical experiments further demonstrate the correctness and effectiveness of our approach on several sparse learning problems ranging from high-dimensional regression to sparse neural network training.

Submitted 26 April, 2024; v1 submitted 7 July, 2023; originally announced July 2023.

arXiv:2306.09337 [pdf, other]

Generative Proxemics: A Prior for 3D Social Interaction from Images

Authors: Lea Müller, Vickie Ye, Georgios Pavlakos, Michael Black, Angjoo Kanazawa

Abstract: Social interaction is a fundamental aspect of human behavior and communication. The way individuals position themselves in relation to others, also known as proxemics, conveys social cues and affects the dynamics of social interaction. Reconstructing such interaction from images presents challenges because of mutual occlusion and the limited availability of large training datasets. To address this, we present a novel approach that learns a prior over the 3D proxemics of two people in close social interaction and demonstrate its use for single-view 3D reconstruction. We start by creating 3D training data of interacting people using image datasets with contact annotations. We then model the proxemics using a novel denoising diffusion model called BUDDI that learns the joint distribution over the poses of two people in close social interaction. Sampling from our generative proxemics model produces realistic 3D human interactions, which we validate through a perceptual study. We use BUDDI in reconstructing two people in close proximity from a single image without any contact annotation via an optimization approach that uses the diffusion model as a prior. Our approach recovers accurate and plausible 3D social interactions from noisy initial estimates, outperforming state-of-the-art methods. Our code, data, and model are available at our project website: muelea.github.io/buddi.

Submitted 12 December, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: Project website: muelea.github.io/buddi

arXiv:2303.18246 [pdf, other]

3D Human Pose Estimation via Intuitive Physics

Authors: Shashank Tripathi, Lea Müller, Chun-Hao P. Huang, Omid Taheri, Michael J. Black, Dimitrios Tzionas

Abstract: Estimating 3D humans from images often produces implausible bodies that lean, float, or penetrate the floor. Such methods ignore the fact that bodies are typically supported by the scene. A physics engine can be used to enforce physical plausibility, but these are not differentiable, rely on unrealistic proxy bodies, and are difficult to integrate into existing optimization and learning frameworks. In contrast, we exploit novel intuitive-physics (IP) terms that can be inferred from a 3D SMPL body interacting with the scene. Inspired by biomechanics, we infer the pressure heatmap on the body, the Center of Pressure (CoP) from the heatmap, and the SMPL body's Center of Mass (CoM). With these, we develop IPMAN, to estimate a 3D body from a color image in a "stable" configuration by encouraging plausible floor contact and overlapping CoP and CoM. Our IP terms are intuitive, easy to implement, fast to compute, differentiable, and can be integrated into existing optimization and regression methods. We evaluate IPMAN on standard datasets and MoYo, a new dataset with synchronized multi-view images, ground-truth 3D bodies with complex poses, body-floor contact, CoM and pressure. IPMAN produces more plausible results than the state of the art, improving accuracy for static poses, while not hurting dynamic ones. Code and data are available for research at https://ipman.is.tue.mpg.de.

Submitted 24 July, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

Comments: Accepted in CVPR'23. Project page: https://ipman.is.tue.mpg.de

arXiv:2302.04181 [pdf, other]

Attending to Graph Transformers

Authors: Luis Müller, Mikhail Galkin, Christopher Morris, Ladislav Rampášek

Abstract: Recently, transformer architectures for graphs emerged as an alternative to established techniques for machine learning with graphs, such as (message-passing) graph neural networks. So far, they have shown promising empirical results, e.g., on molecular prediction datasets, often attributed to their ability to circumvent graph neural networks' shortcomings, such as over-smoothing and over-squashing. Here, we derive a taxonomy of graph transformer architectures, bringing some order to this emerging field. We overview their theoretical properties, survey structural and positional encodings, and discuss extensions for important graph classes, e.g., 3D molecular graphs. Empirically, we probe how well graph transformers can recover various graph properties, how well they can deal with heterophilic graphs, and to what extent they prevent over-squashing. Further, we outline open challenges and research directions to stimulate future work. Our code is available at https://github.com/luis-mueller/probing-graph-transformers.

Submitted 28 March, 2024; v1 submitted 8 February, 2023; originally announced February 2023.

arXiv:2302.03022 [pdf, other]

SurgT challenge: Benchmark of Soft-Tissue Trackers for Robotic Surgery

Authors: Joao Cartucho, Alistair Weld, Samyakh Tukra, Haozheng Xu, Hiroki Matsuzaki, Taiyo Ishikawa, Minjun Kwon, Yong Eun Jang, Kwang-Ju Kim, Gwang Lee, Bizhe Bai, Lueder Kahrs, Lars Boecking, Simeon Allmendinger, Leopold Muller, Yitong Zhang, Yueming Jin, Sophia Bano, Francisco Vasconcelos, Wolfgang Reiter, Jonas Hajek, Bruno Silva, Estevao Lima, Joao L. Vilaca, Sandro Queiros, et al. (1 additional author not shown)

Abstract: This paper introduces the "SurgT: Surgical Tracking" challenge, which was organised in conjunction with MICCAI 2022. There were two purposes for the creation of this challenge: (1) the establishment of the first standardised benchmark for the research community to assess soft-tissue trackers; and (2) to encourage the development of unsupervised deep learning methods, given the lack of annotated data in surgery. A dataset of 157 stereo endoscopic videos from 20 clinical cases, along with stereo camera calibration parameters, has been provided. Participants were assigned the task of developing algorithms to track the movement of soft tissues, represented by bounding boxes, in stereo endoscopic videos. At the end of the challenge, the developed methods were assessed on a previously hidden test subset. This assessment uses benchmarking metrics that were purposely developed for this challenge, to verify the efficacy of unsupervised deep learning algorithms in tracking soft tissue. The metric used for ranking the methods was the Expected Average Overlap (EAO) score, which measures the average overlap between a tracker's and the ground truth bounding boxes. Coming first in the challenge was the deep learning submission by ICVS-2Ai with a superior EAO score of 0.617. This method employs ARFlow to estimate unsupervised dense optical flow from cropped images, using photometric and regularization losses. Second, Jmees, with an EAO of 0.583, uses deep learning for surgical tool segmentation on top of a non-deep learning baseline method: CSRT. CSRT by itself scores a similar EAO of 0.563. The results from this challenge show that currently, non-deep learning methods are still competitive. The dataset and benchmarking tool created for this challenge have been made publicly available at https://surgt.grand-challenge.org/.

Submitted 30 August, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

arXiv:2209.13353 [pdf, other]

Suppress with a Patch: Revisiting Universal Adversarial Patch Attacks against Object Detection

Authors: Svetlana Pavlitskaya, Jonas Hendl, Sebastian Kleim, Leopold Müller, Fabian Wylczoch, J. Marius Zöllner

Abstract: Adversarial patch-based attacks aim to fool a neural network with intentionally generated noise, which is concentrated in a particular region of an input image. In this work, we perform an in-depth analysis of different patch generation parameters, including initialization, patch size, and especially positioning a patch in an image during training. We focus on the object vanishing attack and run experiments with YOLOv3 as a model under attack in a white-box setting and use images from the COCO dataset. Our experiments have shown that inserting a patch inside a window of increasing size during training leads to a significant increase in attack strength compared to a fixed position. The best results were obtained when a patch was positioned randomly during training, while patch position additionally varied within a batch.

Submitted 22 December, 2022; v1 submitted 27 September, 2022; originally announced September 2022.

Comments: Accepted for publication at ICECCME 2022

arXiv:2206.07036 [pdf, other]

Accurate 3D Body Shape Regression using Metric and Semantic Attributes

Authors: Vasileios Choutas, Lea Muller, Chun-Hao P. Huang, Siyu Tang, Dimitrios Tzionas, Michael J. Black

Abstract: While methods that regress 3D human meshes from images have progressed rapidly, the estimated body shapes often do not capture the true human shape. This is problematic since, for many applications, accurate body shape is as important as pose. The key reason that body shape accuracy lags pose accuracy is the lack of data. While humans can label 2D joints, and these constrain 3D pose, it is not so easy to "label" 3D body shape. Since paired data with images and 3D body shape are rare, we exploit two sources of information: (1) we collect internet images of diverse "fashion" models together with a small set of anthropometric measurements; (2) we collect linguistic shape attributes for a wide range of 3D body meshes and the model images. Taken together, these datasets provide sufficient constraints to infer dense 3D shape. We exploit the anthropometric measurements and linguistic shape attributes in several novel ways to train a neural network, called SHAPY, that regresses 3D human pose and shape from an RGB image. We evaluate SHAPY on public benchmarks, but note that they either lack significant body shape variation, ground-truth shape, or clothing variation. Thus, we collect a new dataset for evaluating 3D human shape estimation, called HBW, containing photos of "Human Bodies in the Wild" for which we have ground-truth 3D body scans. On this new benchmark, SHAPY significantly outperforms state-of-the-art methods on the task of 3D body shape estimation. This is the first demonstration that 3D body shape regression from images can be trained from easy-to-obtain anthropometric measurements and linguistic shape attributes. Our model and data are available at: shapy.is.tue.mpg.de

Submitted 14 June, 2022; originally announced June 2022.

Comments: First two authors contributed equally

Journal ref: CVPR 2022

arXiv:2205.13080 [pdf, other]

Factorized Structured Regression for Large-Scale Varying Coefficient Models

Authors: David Rügamer, Andreas Bender, Simon Wiegrebe, Daniel Racek, Bernd Bischl, Christian L. Müller, Clemens Stachl

Abstract: Recommender Systems (RS) pervade many aspects of our everyday digital life. Proposed to work at scale, state-of-the-art RS allow the modeling of thousands of interactions and facilitate highly individualized recommendations. Conceptually, many RS can be viewed as instances of statistical regression models that incorporate complex feature effects and potentially non-Gaussian outcomes. Such structured regression models, including time-aware varying coefficients models, are, however, limited in their applicability to categorical effects and inclusion of a large number of interactions. Here, we propose Factorized Structured Regression (FaStR) for scalable varying coefficient models. FaStR overcomes limitations of general regression models for large-scale data by combining structured additive regression and factorization approaches in a neural network-based model implementation. This fusion provides a scalable framework for the estimation of statistical models in previously infeasible data settings. Empirical results confirm that the estimation of varying coefficients of our approach is on par with state-of-the-art regression techniques, while scaling notably better and also being competitive with other time-aware RS in terms of prediction performance. We illustrate FaStR's performance and interpretability on a large-scale behavioral study with smartphone user data.

Submitted 25 May, 2022; originally announced May 2022.

arXiv:2203.09885 [pdf, ps, other]

doi 10.4204/EPTCS.355.5

Formally Modeling Autonomous Vehicles in LNT for Simulation and Testing

Authors: Lina Marsso, Radu Mateescu, Lucie Muller, Wendelin Serwe

Abstract: We present two behavioral models of an autonomous vehicle and its interaction with the environment. Both models use the formal modeling language LNT provided by the CADP toolbox. This paper discusses the modeling choices and the challenges of our autonomous vehicle models, and also illustrates how formal validation tools can be applied to a single component or the overall vehicle.

Submitted 18 March, 2022; originally announced March 2022.

Comments: In Proceedings MARS 2022, arXiv:2203.09299

Journal ref: EPTCS 355, 2022, pp. 60-117

arXiv:2112.08961 [pdf, other]

doi 10.1186/s12868-022-00758-0

Objective hearing threshold identification from auditory brainstem response measurements using supervised and self-supervised approaches

Authors: Dominik Thalmeier, Gregor Miller, Elida Schneltzer, Anja Hurt, Martin Hrabě de Angelis, Lore Becker, Christian L. Müller, Holger Maier

Abstract: Hearing loss is a major health problem and psychological burden in humans. Mouse models offer a possibility to elucidate genes involved in the underlying developmental and pathophysiological mechanisms of hearing impairment. To this end, large-scale mouse phenotyping programs include auditory phenotyping of single-gene knockout mouse lines. Using the auditory brainstem response (ABR) procedure, the German Mouse Clinic and similar facilities worldwide have produced large, uniform data sets of averaged ABR raw data of mutant and wildtype mice. In the course of standard ABR analysis, hearing thresholds are assessed visually by trained staff from series of signal curves of increasing sound pressure level. This is time-consuming and prone to be biased by the reader as well as the graphical display quality and scale. In an attempt to reduce workload and improve quality and reproducibility, we developed and compared two methods for automated hearing threshold identification from averaged ABR raw data: a supervised approach involving two combined neural networks trained on human-generated labels and a self-supervised approach, which exploits the signal power spectrum and combines random forest sound level estimation with a piece-wise curve fitting algorithm for threshold finding. We show that both models work well, outperform human threshold detection, and are suitable for fast, reliable, and unbiased hearing threshold detection and quality control. In a high-throughput mouse phenotyping environment, both methods perform well as part of an automated end-to-end screening pipeline to detect candidate genes for hearing involvement. Code for both models as well as data used for this work are freely available.

Submitted 16 December, 2021; originally announced December 2021.

Comments: 41 pages, 17 figures

Journal ref: BMC Neurosci 23, 81 (2022)

arXiv:2110.00620 [pdf, other]

SPEC: Seeing People in the Wild with an Estimated Camera

Authors: Muhammed Kocabas, Chun-Hao P. Huang, Joachim Tesch, Lea Müller, Otmar Hilliges, Michael J. Black

Abstract: Due to the lack of camera parameter information for in-the-wild images, existing 3D human pose and shape (HPS) estimation methods make several simplifying assumptions: weak-perspective projection, large constant focal length, and zero camera rotation. These assumptions often do not hold and we show, quantitatively and qualitatively, that they cause errors in the reconstructed 3D shape and pose. To address this, we introduce SPEC, the first in-the-wild 3D HPS method that estimates the perspective camera from a single image and employs this to reconstruct 3D human bodies more accurately. First, we train a neural network to estimate the field of view, camera pitch, and roll given an input image. We employ novel losses that improve the calibration accuracy over previous work. We then train a novel network that concatenates the camera calibration to the image features and uses these together to regress 3D body shape and pose. SPEC is more accurate than the prior art on the standard benchmark (3DPW) as well as two new datasets with more challenging camera views and varying focal lengths. Specifically, we create a new photorealistic synthetic dataset (SPEC-SYN) with ground truth 3D bodies and a novel in-the-wild dataset (SPEC-MTP) with calibration and high-quality reference bodies. Both qualitative and quantitative analysis confirm that knowing camera parameters during inference regresses better human bodies. Code and datasets are available for research purposes at https://spec.is.tue.mpg.de.

Submitted 1 November, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

arXiv:2107.10314 [pdf, other]

doi 10.18653/v1/2023.eacl-demo.11

Small-Text: Active Learning for Text Classification in Python

Authors: Christopher Schröder, Lydia Müller, Andreas Niekler, Martin Potthast

Abstract: We introduce small-text, an easy-to-use active learning library, which offers pool-based active learning for single- and multi-label text classification in Python. It features numerous pre-implemented state-of-the-art query strategies, including some that leverage the GPU. Standardized interfaces allow the combination of a variety of classifiers, query strategies, and stopping criteria, facilitating a quick mix and match, and enabling a rapid and convenient development of both active learning experiments and applications. With the objective of making various classifiers and query strategies accessible for active learning, small-text integrates several well-known machine learning libraries, namely scikit-learn, PyTorch, and Hugging Face transformers. The latter integrations are optionally installable extensions, so GPUs can be used but are not required. Using this new library, we investigate the performance of the recently published SetFit training paradigm, which we compare to vanilla transformer fine-tuning, finding that it matches the latter in classification accuracy while outperforming it in area under the curve. The library is available under the MIT License at https://github.com/webis-de/small-text, in version 1.3.0 at the time of writing.

Submitted 7 October, 2023; v1 submitted 21 July, 2021; originally announced July 2021.

Comments: This revision fixes the number of query strategies for modAL, which had remained unchanged from an earlier iteration of the table that did not yet include multi-label strategies

arXiv:2106.11234 [pdf, other]

Instrumental Variable Estimation for Compositional Treatments

Authors: Elisabeth Ailer, Christian L. Müller, Niki Kilbertus

Abstract: Many scientific datasets are compositional in nature. Important biological examples include species abundances in ecology, cell-type compositions derived from single-cell sequencing data, and amplicon abundance data in microbiome research. Here, we provide a causal view on compositional data in an instrumental variable setting where the composition acts as the cause. First, we crisply articulate potential pitfalls for practitioners regarding the interpretation of compositional causes from the viewpoint of interventions and warn against attributing causal meaning to common summary statistics such as diversity indices in microbiome data analysis. We then advocate for and develop multivariate methods using statistical data transformations and regression techniques that take the special structure of the compositional sample space into account while still yielding scientifically interpretable results. In a comparative analysis on synthetic and real microbiome data we show the advantages and limitations of our proposal. We posit that our analysis provides a useful framework and guidance for valid and informative cause-effect estimation in the context of compositional data.

Submitted 28 May, 2024; v1 submitted 21 June, 2021; originally announced June 2021.

Comments: Code available on https://github.com/EAiler/causal-compositions

arXiv:2105.08470 [pdf, other]

Overparametrization of HyperNetworks at Fixed FLOP-Count Enables Fast Neural Image Enhancement

Authors: Lorenz K. Muller

Abstract: Deep convolutional neural networks can enhance images taken with small mobile camera sensors and excel at tasks like demosaicing, denoising and super-resolution. However, for practical use on mobile devices these networks often require too many FLOPs, and reducing the FLOPs of a convolution layer also reduces its parameter count. This is problematic in view of the recent finding that heavily over-parameterized neural networks are often the ones that generalize best. In this paper we propose to use HyperNetworks to break the fixed ratio of FLOPs to parameters of standard convolutions. This allows us to exceed previous state-of-the-art architectures in SSIM and MS-SSIM on the Zurich RAW-to-DSLR (ZRR) data-set at > 10x reduced FLOP-count. On ZRR we further observe generalization curves consistent with 'double-descent' behavior at fixed FLOP-count, in the large image limit. Finally we demonstrate the same technique can be applied to an existing network (VDN) to reduce its computational cost while maintaining fidelity on the Smartphone Image Denoising Dataset (SIDD). Code for key functions is given in the appendix.

Submitted 18 May, 2021; originally announced May 2021.

arXiv:2105.05557 [pdf, other]

doi 10.18653/v1/2021.acl-long.320

Supporting Land Reuse of Former Open Pit Mining Sites using Text Classification and Active Learning

Authors: Christopher Schröder, Kim Bürgl, Yves Annanias, Andreas Niekler, Lydia Müller, Daniel Wiegreffe, Christian Bender, Christoph Mengs, Gerik Scheuermann, Gerhard Heyer

Abstract: Open pit mines left many regions worldwide inhospitable or uninhabitable. To put these regions back into use, entire stretches of land must be renaturalized. For the sustainable subsequent use or transfer to a new primary use, many contaminated sites and soil information have to be permanently managed. In most cases, this information is available in the form of expert reports in unstructured data collections or file folders, which in the best case are digitized. Due to size and complexity of the data, it is difficult for a single person to have an overview of this data in order to be able to make reliable statements. This is one of the most important obstacles to the rapid transfer of these areas to after-use. An information-based approach to this issue supports fulfilling several Sustainable Development Goals regarding environment issues, health and climate action. We use a stack of Optical Character Recognition, Text Classification, Active Learning and Geographic Information System Visualization to effectively mine and visualize this information. Subsequently, we link the extracted information to geographic coordinates and visualize them using a Geographic Information System. Active Learning plays a vital role because our dataset provides no training data. In total, we process nine categories and actively learn their representation in our dataset. We evaluate the OCR, Active Learning and Text Classification separately to report the performance of the system. Active Learning and text classification results are twofold: Whereas our categories about restrictions work sufficiently well ($>$.85 F1), the seven topic-oriented categories were complicated for human coders and hence the results achieved mediocre evaluation scores ($<$.70 F1).

Submitted 22 March, 2022; v1 submitted 12 May, 2021; originally announced May 2021.

Journal ref: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021

arXiv:2104.03176 [pdf, other]

On Self-Contact and Human Pose

Authors: Lea Müller, Ahmed A. A. Osman, Siyu Tang, Chun-Hao P. Huang, Michael J. Black

Abstract: People touch their face 23 times an hour, they cross their arms and legs, put their hands on their hips, etc. While many images of people contain some form of self-contact, current 3D human pose and shape (HPS) regression methods typically fail to estimate this contact. To address this, we develop new datasets and methods that significantly improve human pose estimation with self-contact. First, we create a dataset of 3D Contact Poses (3DCP) containing SMPL-X bodies fit to 3D scans as well as poses from AMASS, which we refine to ensure good contact. Second, we leverage this to create the Mimic-The-Pose (MTP) dataset of images, collected via Amazon Mechanical Turk, containing people mimicking the 3DCP poses with self-contact. Third, we develop a novel HPS optimization method, SMPLify-XMC, that includes contact constraints and uses the known 3DCP body pose during fitting to create near ground-truth poses for MTP images. Fourth, for more image variety, we label a dataset of in-the-wild images with Discrete Self-Contact (DSC) information and use another new optimization method, SMPLify-DC, that exploits discrete contacts during pose optimization. Finally, we use our datasets during SPIN training to learn a new 3D human pose regressor, called TUCH (Towards Understanding Contact in Humans). We show that the new self-contact training data significantly improves 3D human pose estimates on withheld test data and existing datasets like 3DPW. Not only does our method improve results for self-contact poses, but it also improves accuracy for non-contact poses. The code and data are available for research purposes at https://tuch.is.tue.mpg.de.

Submitted 8 April, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

Comments: Accepted in CVPR'21 (oral). Project page: https://tuch.is.tue.mpg.de/

arXiv:2104.02705 [pdf, other]

deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression

Authors: David Rügamer, Chris Kolb, Cornelius Fritz, Florian Pfisterer, Philipp Kopper, Bernd Bischl, Ruolin Shen, Christina Bukas, Lisa Barros de Andrade e Sousa, Dominik Thalmeier, Philipp Baumann, Lucas Kook, Nadja Klein, Christian L. Müller

Abstract: In this paper we describe the implementation of semi-structured deep distributional regression, a flexible framework to learn conditional distributions based on the combination of additive regression models and deep networks. Our implementation encompasses (1) a modular neural network building system based on the deep learning library TensorFlow for the fusion of various statistical and deep learning approaches, (2) an orthogonalization cell to allow for an interpretable combination of different subnetworks, as well as (3) pre-processing steps necessary to set up such models. The software package allows users to define models in a user-friendly manner via a formula interface that is inspired by classical statistical model frameworks such as mgcv. The package's modular design and functionality provide a unique resource for both scalable estimation of complex statistical models and the combination of approaches from deep learning and statistics. This allows for state-of-the-art predictive performance while simultaneously retaining the indispensable interpretability of classical statistical models.

Submitted 10 March, 2022; v1 submitted 6 April, 2021; originally announced April 2021.

arXiv:2102.07621 [pdf, other]

Hit by the Data: a visual data analysis regarding the effects of traffic public policies

Authors: Luana Müller, Camila Moser, Guilherme Paris, Lucas Freitas, Mayara Oliveira, Wagner Signoretti, Isabel Harb Manssour, Milene Selbach Silveira

Abstract: The availability of Open Government Data (OGD) provides means for citizens to understand and follow governmental policies and decisions, showing evidence of how the latter have contributed to both the place they live in and their lives. In such a scenario, one of the proposals is the use of visualizations to support the process of data analysis and interpretation. Herein, we present the use of three different visualization tools, a commercial one and two academic ones, applied to two specific Brazilian cases: the implementation of the Drink Driving Law and the construction of a new overpass in an important city avenue. Our focus was on the analysis of how visualization could help in the identification of the effects of such traffic public policies. As our main contributions, we present details on the effects of the observed policies, as well as new cases showing how visualization tools can assist users to interpret OGD.

Submitted 12 February, 2021; originally announced February 2021.

arXiv:2101.06182 [pdf, other]

STENCIL-NET: Data-driven solution-adaptive discretization of partial differential equations

Authors: Suryanarayana Maddu, Dominik Sturm, Bevan L. Cheeseman, Christian L. Müller, Ivo F. Sbalzarini

Abstract: Numerical methods for approximately solving partial differential equations (PDE) are at the core of scientific computing. Often, this requires high-resolution or adaptive discretization grids to capture relevant spatio-temporal features in the PDE solution, e.g., in applications like turbulence, combustion, and shock propagation. Numerical approximation also requires knowing the PDE in order to construct problem-specific discretizations. Systematically deriving such solution-adaptive discrete operators, however, is a current challenge. Here we present STENCIL-NET, an artificial neural network architecture for data-driven learning of problem- and resolution-specific local discretizations of nonlinear PDEs. STENCIL-NET achieves numerically stable discretization of the operators in an unknown nonlinear PDE by spatially and temporally adaptive parametric pooling on regular Cartesian grids, and by incorporating knowledge about discrete time integration. Knowing the actual PDE is not necessary, as solution data is sufficient to train the network to learn the discrete operators. A once-trained STENCIL-NET model can be used to predict solutions of the PDE on larger spatial domains and for longer times than it was trained for, hence addressing the problem of PDE-constrained extrapolation from data. To support this claim, we present numerical experiments on long-term forecasting of chaotic PDE solutions on coarse spatio-temporal grids. We also quantify the speed-up achieved by substituting base-line numerical methods with equation-free STENCIL-NET predictions on coarser grids with little compromise on accuracy.

Submitted 18 January, 2021; v1 submitted 15 January, 2021; originally announced January 2021.

arXiv:2012.15376 [pdf, other]

Provident Vehicle Detection at Night: The PVDN Dataset

Authors: Lars Ohnemus, Lukas Ewecker, Ebubekir Asan, Stefan Roos, Simon Isele, Jakob Ketterer, Leopold Müller, Sascha Saralajew

Abstract: For advanced driver assistance systems, it is crucial to have information about oncoming vehicles as early as possible. At night, this task is especially difficult due to poor lighting conditions. For that, during nighttime, every vehicle uses headlamps to improve sight and therefore ensure safe driving. As humans, we intuitively assume oncoming vehicles before the vehicles are actually physically visible by detecting light reflections caused by their headlamps. In this paper, we present a novel dataset containing 59746 annotated grayscale images out of 346 different scenes in a rural environment at night. In these images, all oncoming vehicles, their corresponding light objects (e.g., headlamps), and their respective light reflections (e.g., light reflections on guardrails) are labeled. This is accompanied by an in-depth analysis of the dataset characteristics. With that, we are providing the first open-source dataset with comprehensive ground truth data to enable research into new methods of detecting oncoming vehicles based on the light reflections they cause, long before they are directly visible. We consider this as an essential step to further close the performance gap between current advanced driver assistance systems and human behavior.

Submitted 23 January, 2021; v1 submitted 30 December, 2020; originally announced December 2020.

arXiv:2012.08780 [pdf, other]

Analysing the Direction of Emotional Influence in Nonverbal Dyadic Communication: A Facial-Expression Study

Authors: Maha Shadaydeh, Lea Mueller, Dana Schneider, Martin Thuemmel, Thomas Kessler, Joachim Denzler

Abstract: Identifying the direction of emotional influence in a dyadic dialogue is of increasing interest in the psychological sciences with applications in psychotherapy, analysis of political interactions, or interpersonal conflict behavior. Facial expressions are widely described as being automatic and thus hard to overtly influence. As such, they are a perfect measure for a better understanding of unintentional behavior cues about social-emotional cognitive processes. With this view, this study is concerned with the analysis of the direction of emotional influence in dyadic dialogue based on facial expressions only. We exploit computer vision capabilities along with causal inference theory for quantitative verification of hypotheses on the direction of emotional influence, i.e., causal effect relationships, in dyadic dialogues. We address two main issues. First, in a dyadic dialogue, emotional influence occurs over transient time intervals and with intensity and direction that are variant over time. To this end, we propose a relevant interval selection approach that we use prior to causal inference to identify those transient intervals where causal inference should be applied. Second, we propose to use fine-grained facial expressions that are present when strong distinct facial emotions are not visible. To specify the direction of influence, we apply the concept of Granger causality to the time series of facial expressions over selected relevant intervals. We tested our approach on newly, experimentally obtained data. Based on the quantitative verification of hypotheses on the direction of emotional influence, we were able to show that the proposed approach is most promising to reveal the causal effect pattern in various instructed interaction conditions.

Submitted 16 December, 2020; originally announced December 2020.

Comments: arXiv admin note: text overlap with arXiv:1810.12171

arXiv:2012.06391 [pdf, other]

doi 10.1103/PhysRevE.103.042310

Learning physically consistent mathematical models from data using group sparsity

Authors: Suryanarayana Maddu, Bevan L. Cheeseman, Christian L. Müller, Ivo F. Sbalzarini

Abstract: We propose a statistical learning framework based on group-sparse regression that can be used to 1) enforce conservation laws, 2) ensure model equivalence, and 3) guarantee symmetries when learning or inferring differential-equation models from measurement data. Directly learning $\textit{interpretable}$ mathematical models from data has emerged as a valuable modeling approach. However, in areas like biology, high noise levels, sensor-induced correlations, and strong inter-system variability can render data-driven models nonsensical or physically inconsistent without additional constraints on the model structure. Hence, it is important to leverage $\textit{prior}$ knowledge from physical principles to learn "biologically plausible and physically consistent" models rather than models that simply fit the data best. We present a novel group Iterative Hard Thresholding (gIHT) algorithm and use stability selection to infer physically consistent models with minimal parameter tuning. We show several applications from systems biology that demonstrate the benefits of enforcing $\textit{priors}$ in data-driven modeling.

Submitted 11 December, 2020; originally announced December 2020.

Journal ref: Phys. Rev. E 103, 042310 (2021)

arXiv:2011.00898 [pdf, other]

c-lasso -- a Python package for constrained sparse and robust regression and classification

Authors: Léo Simpson, Patrick L. Combettes, Christian L. Müller

Abstract: We introduce c-lasso, a Python package that enables sparse and robust linear regression and classification with linear equality constraints. The underlying statistical forward model is assumed to be of the following form: \[ y = Xβ + σε \qquad \textrm{subject to} \qquad Cβ = 0 \] Here, $X \in \mathbb{R}^{n\times d}$ is a given design matrix and the vector $y \in \mathbb{R}^{n}$ is a continuous or binary response vector. The matrix $C$ is a general constraint matrix. The vector $β \in \mathbb{R}^{d}$ contains the unknown coefficients and $σ$ an unknown scale. Prominent use cases are (sparse) log-contrast regression with compositional data $X$, requiring the constraint $1_d^T β = 0$ (Aitchison and Bacon-Shone 1984) and the Generalized Lasso which is a special case of the described problem (see, e.g., (James, Paulson, and Rusmevichientong 2020), Example 3). The c-lasso package provides estimators for inferring unknown coefficients and scale (i.e., perspective M-estimators (Combettes and Müller 2020a)) of the form \[ \min_{β \in \mathbb{R}^d, σ \in \mathbb{R}_{0}} f\left(Xβ - y, σ\right) + λ \left\lVert β \right\rVert_1 \qquad \textrm{subject to} \qquad Cβ = 0 \] for several convex loss functions $f(\cdot,\cdot)$. This includes the constrained Lasso, the constrained scaled Lasso, and sparse Huber M-estimators with linear equality constraints.

Submitted 2 November, 2020; originally announced November 2020.

arXiv:2010.10176 [pdf]

Individual corpora predict fast memory retrieval during reading

Authors: Markus J. Hofmann, Lara Müller, Andre Rölke, Ralph Radach, Chris Biemann

Abstract: The corpus from which a predictive language model is trained can be considered the experience of a semantic system. We recorded everyday reading of two participants for two months on a tablet, generating individual corpus samples of 300/500K tokens. Then we trained word2vec models from individual corpora and a 70 million-sentence newspaper corpus to obtain individual and norm-based long-term memory structure. To test whether individual corpora can make better predictions for a cognitive task of long-term memory retrieval, we generated stimulus materials consisting of 134 sentences with uncorrelated individual and norm-based word probabilities. For the subsequent eye tracking study 1-2 months later, our regression analyses revealed that individual, but not norm-corpus-based word probabilities can account for first-fixation duration and first-pass gaze duration. Word length additionally affected gaze duration and total viewing duration. The results suggest that corpora representative of an individual's long-term memory structure can better explain reading performance than a norm corpus, and that recently acquired information is lexically accessed rapidly.

Submitted 20 October, 2020; originally announced October 2020.

Comments: Proceedings of the 6th workshop on Cognitive Aspects of the Lexicon (CogALex-VI), Barcelona, Spain, December 12, 2020; accepted manuscript; 11 pages, 2 figures, 4 Tables

arXiv:2007.12229 [pdf, other]

SeismoFlow -- Data augmentation for the class imbalance problem

Authors: Ruy Luiz Milidiú, Luis Felipe Müller

Abstract: In several application areas, such as medical diagnosis, spam filtering, fraud detection, and seismic data analysis, it is very common to find relevant classification tasks where some class occurrences are rare. This is the so-called class imbalance problem, which is a challenge in machine learning. In this work, we propose SeismoFlow, a flow-based generative model to create synthetic samples, aiming to address the class imbalance. Inspired by the Glow model, it uses interpolation on the learned latent space to produce synthetic samples for one rare class. We apply our approach to the development of a seismogram signal quality classifier. We introduce a dataset composed of 5.223 seismograms that are distributed between the good, medium, and bad classes, with respective frequencies of 66.68%, 31.54%, and 1.76%. Our methodology is evaluated on a stratified 10-fold cross-validation setting, using the Miniception model as a baseline, and assessing the effects of adding the generated samples on the training set of each iteration. In our experiments, we achieve an improvement of 13.9% on the rare class F1-score, while not hurting the metric value for the other classes and thus observing the overall accuracy improvement. Our empirical findings indicate that our method can generate high-quality, realistic-looking synthetic seismograms with sufficient plurality to help the Miniception model overcome the class imbalance problem. We believe that our results are a step forward in solving both the task of seismogram signal quality classification and class imbalance.

Submitted 2 September, 2020; v1 submitted 23 July, 2020; originally announced July 2020.

Comments: 10 pages

arXiv:2005.02218 [pdf, ps, other]

On Reachable Assignments in Cycles and Cliques

Authors: Luis Müller, Matthias Bentert

Abstract: The efficient and fair distribution of indivisible resources among agents is a common problem in the field of \emph{Multi-Agent-Systems}. We consider a graph-based version of this problem called Reachable Assignments, introduced by Gourves, Lesca, and Wilczynski [AAAI, 2017]. The input for this problem consists of a set of agents, a set of objects, the agents' preferences over the objects, a graph with the agents as vertices and edges encoding which agents can trade resources with each other, and an initial and a target distribution of the objects, where each agent owns exactly one object in each distribution. The question is then whether the target distribution is reachable via a sequence of rational trades. A trade is rational when the two participating agents are neighbors in the graph and both obtain an object they prefer over the object they previously held. We show that Reachable Assignments is NP-hard even when restricting the input graph to be a clique and develop an $O(n^3)$-time algorithm for the case where the input graph is a cycle with $n$ vertices.

Submitted 5 May, 2020; originally announced May 2020.

arXiv:1907.07810 [pdf, other]

Stability selection enables robust learning of partial differential equations from limited noisy data

Authors: Suryanarayana Maddu, Bevan L. Cheeseman, Ivo F. Sbalzarini, Christian L. Müller

Abstract: We present a statistical learning framework for robust identification of partial differential equations from noisy spatiotemporal data. Extending previous sparse regression approaches for inferring PDE models from simulated data, we address key issues that have thus far limited the application of these methods to noisy experimental data, namely their robustness against noise and the need for manual parameter tuning. We address both points by proposing a stability-based model selection scheme to determine the level of regularization required for reproducible recovery of the underlying PDE. This avoids manual parameter tuning and provides a principled way to improve the method's robustness against noise in the data. Our stability selection approach, termed PDE-STRIDE, can be combined with any sparsity-promoting penalized regression model and provides an interpretable criterion for model component importance. We show that in particular the combination of stability selection with the iterative hard-thresholding algorithm from compressed sensing provides a fast, parameter-free, and robust computational framework for PDE inference that outperforms previous algorithmic approaches with respect to recovery accuracy, amount of data required, and robustness to noise. We illustrate the performance of our approach on a wide range of noise-corrupted simulated benchmark problems, including 1D Burgers, 2D vorticity-transport, and 3D reaction-diffusion problems. We demonstrate the practical applicability of our method on real-world data by considering a purely data-driven re-evaluation of the advective triggering hypothesis for an embryonic polarization system in C. elegans. Using fluorescence microscopy images of C. elegans zygotes as input data, our framework is able to recover the PDE model for the regulatory reaction-diffusion-flow network of the associated proteins.

Submitted 17 July, 2019; originally announced July 2019.

Comments: 20 pages, 10 figures and supplementary material included

arXiv:1907.00770 [pdf, other]

Teaching deep neural networks to localize single molecules for super-resolution microscopy

Authors: Artur Speiser, Lucas-Raphael Müller, Ulf Matti, Christopher J. Obara, Wesley R. Legant, Jonas Ries, Jakob H. Macke, Srinivas C. Turaga

Abstract: Single-molecule localization fluorescence microscopy constructs super-resolution images by sequential imaging and computational localization of sparsely activated fluorophores. Accurate and efficient fluorophore localization algorithms are key to the success of this computational microscopy method. We present a novel localization algorithm based on deep learning which significantly improves upon the state of the art. Our contributions are a novel network architecture for simultaneous detection and localization, and a new loss function which phrases detection and localization as a Bayesian inference problem, and thus allows the network to provide uncertainty estimates. In contrast to standard methods which independently process imaging frames, our network architecture uses temporal context from multiple sequentially imaged frames to detect and localize molecules. We demonstrate the power of our method across a variety of datasets, imaging modalities, signal-to-noise ratios, and fluorophore densities. While existing localization algorithms can achieve optimal localization accuracy at low fluorophore densities, they are confounded by high densities. Our method is the first deep-learning based approach which achieves state-of-the-art performance on the SMLM2016 challenge. It achieves the best scores on 12 out of 12 datasets when comparing both detection accuracy and precision, and excels at high densities. Finally, we investigate how unsupervised learning can be used to make the network robust against mismatch between simulated and real data. The lessons learned here are more generally relevant for the training of deep networks to solve challenging Bayesian inverse problems on spatially extended domains in biology and physics.

Submitted 20 July, 2020; v1 submitted 27 June, 2019; originally announced July 2019.

Comments: Significant improvements of the algorithm, including a novel loss function. Evaluations on multiple real data sets

arXiv:1810.12171 [pdf, other]

Causal Inference in Nonverbal Dyadic Communication with Relevant Interval Selection and Granger Causality

Authors: Lea Müller, Maha Shadaydeh, Martin Thümmel, Thomas Kessler, Dana Schneider, Joachim Denzler

Abstract: Human nonverbal emotional communication in dyadic dialogs is a process of mutual influence and adaptation. Identifying the direction of influence, or cause-effect relation between participants, is a challenging task due to two main obstacles. First, distinct emotions might not be clearly visible. Second, the participants' cause-effect relation is transient and variant over time. In this paper, we address these difficulties by using facial expressions that can be present even when strong distinct facial emotions are not visible. We also propose to apply a relevant interval selection approach prior to causal inference to identify those transient intervals where the adaptation process occurs. To identify the direction of influence, we apply the concept of Granger causality to the time series of facial expressions on the set of relevant intervals. We tested our approach on synthetic data and then applied it to newly, experimentally obtained data. Here, we were able to show that a more sensitive facial expression detection algorithm and a relevant interval detection approach are most promising to reveal the cause-effect pattern for dyadic communication in various instructed interaction conditions.

Submitted 29 October, 2018; originally announced October 2018.

Comments: Nonverbal emotional communication, Granger causality, maximally coherent intervals

arXiv:1807.05128 [pdf, other]

doi 10.1039/C8FD00114F

A neuromorphic systems approach to in-memory computing with non-ideal memristive devices: From mitigation to exploitation

Authors: Melika Payvand, Manu V Nair, Lorenz K. Muller, Giacomo Indiveri

Abstract: Memristive devices represent a promising technology for building neuromorphic electronic systems. In addition to their compactness and non-volatility features, they are characterized by computationally relevant physical properties, such as state-dependence, non-linear conductance changes, and intrinsic variability in both their switching threshold and conductance values, that make them ideal devices for emulating the bio-physics of real synapses. In this paper we present a spiking neural network architecture that supports the use of memristive devices as synaptic elements, and propose mixed-signal analog-digital interfacing circuits which mitigate the effect of variability in their conductance values and exploit their variability in the switching threshold, for implementing stochastic learning. The effect of device variability is mitigated by using pairs of memristive devices configured in a complementary push-pull mechanism and interfaced to a current-mode normalizer circuit. The stochastic learning mechanism is obtained by mapping the desired change in synaptic weight into a corresponding switching probability that is derived from the intrinsic stochastic behavior of memristive devices. We demonstrate the features of the CMOS circuits and apply the architecture proposed to a standard neural network hand-written digit classification benchmark based on the MNIST data-set. We evaluate the performance of the approach proposed on this benchmark using behavioral-level spiking neural network simulation, showing both the effect of the reduction in conductance variability produced by the current-mode normalizer circuit, and the increase in performance as a function of the number of memristive devices used in each synapse.

Submitted 13 July, 2018; originally announced July 2018.

Comments: 13 pages, 12 figures, accepted for Faraday Discussions

arXiv:1805.11472 [pdf, other]

Comparison of 1D and 3D Models for the Estimation of Fractional Flow Reserve

Authors: P. J. Blanco, C. A. Bulant, L. O. Müller, G. D. Maso Talou, C. Guedes Bezerra, P. L. Lemos, R. A. Feijóo

Abstract: In this work we propose to validate the predictive capabilities of one-dimensional (1D) blood flow models with full three-dimensional (3D) models in the context of patient-specific coronary hemodynamics in hyperemic conditions. Such conditions mimic the state of coronary circulation during the acquisition of the Fractional Flow Reserve (FFR) index. Demonstrating that 1D models accurately reproduce FFR estimates obtained with 3D models has implications in the approach to computationally estimate FFR. To this end, a sample of 20 patients was employed from which 29 3D geometries of arterial trees were constructed, 9 obtained from coronary computed tomography angiography (CCTA) and 20 from intra-vascular ultrasound (IVUS). For each 3D arterial model, a 1D counterpart was generated. The same outflow and inlet pressure boundary conditions were applied to both (3D and 1D) models. In the 1D setting, pressure losses at stenoses and bifurcations were accounted for through specific lumped models. Comparisons between 1D models ($\text{FFR}_{\text{1D}}$) and 3D models ($\text{FFR}_{\text{3D}}$) were performed in terms of predicted $\text{FFR}$ value. Compared to $\text{FFR}_{\text{3D}}$, $\text{FFR}_{\text{1D}}$ resulted with a difference of 0.00$\pm$0.03 and overall predictive capability AUC, Acc, Spe, Sen, PPV and NPV of 0.97, 0.98, 0.90, 0.99, 0.82, and 0.99, with an FFR threshold of 0.8. We conclude that inexpensive $\text{FFR}_{\text{1D}}$ simulations can be reliably used as a surrogate of demanding $\text{FFR}_{\text{3D}}$ computations.

Submitted 29 May, 2018; originally announced May 2018.

Comments: 11 pages, 2 figures, 2 tables, submitted to Scientific Reports

arXiv:1709.05484 [pdf, other]

doi 10.1088/2399-1984/aa954a

A differential memristive synapse circuit for on-line learning in neuromorphic computing systems

Authors: Manu V Nair, Lorenz K. Muller, Giacomo Indiveri

Abstract: Spike-based learning with memristive devices in neuromorphic computing architectures typically uses learning circuits that require overlapping pulses from pre- and post-synaptic nodes. This imposes severe constraints on the length of the pulses transmitted in the network, and on the network's throughput. Furthermore, most of these circuits do not decouple the currents flowing through memristive devices from the one stimulating the target neuron. This can be a problem when using devices with high conductance values, because of the resulting large currents. In this paper we propose a novel circuit that decouples the current produced by the memristive device from the one used to stimulate the post-synaptic neuron, by using a novel differential scheme based on the Gilbert normalizer circuit. We show how this circuit is useful for reducing the effect of variability in the memristive devices, and how it is ideally suited for spike-based learning mechanisms that do not require overlapping pre- and post-synaptic pulses. We demonstrate the features of the proposed synapse circuit with SPICE simulations, and validate its learning properties with high-level behavioral network simulations which use a stochastic gradient descent learning rule in two classification tasks.

Submitted 16 September, 2017; originally announced September 2017.

Comments: 18 Pages main text, 9 pages of supplementary text, 19 figures. Patented

arXiv:1505.01139 [pdf, other]

doi 10.1038/ncomms9941

An event-based architecture for solving constraint satisfaction problems

Authors: Hesham Mostafa, Lorenz K. Müller, Giacomo Indiveri

Abstract: Constraint satisfaction problems (CSPs) are typically solved using conventional von Neumann computing architectures. However, these architectures do not reflect the distributed nature of many of these problems and are thus ill-suited to solving them. In this paper we present a hybrid analog/digital hardware architecture specifically designed to solve such problems. We cast CSPs as networks of stereotyped multi-stable oscillatory elements that communicate using digital pulses, or events. The oscillatory elements are implemented using analog non-stochastic circuits. The non-repeating phase relations among the oscillatory elements drive the exploration of the solution space. We show that this hardware architecture can yield state-of-the-art performance on a number of CSPs under reasonable assumptions on the implementation. We present measurements from a prototype electronic chip to demonstrate that a physical implementation of the proposed architecture is robust to practical non-idealities and to validate the theory proposed.

Submitted 4 May, 2015; originally announced May 2015.

Comments: First two authors contributed equally to this work

Journal ref: Nature Communications 6, Article number: 8941 (2015), pg. 1-10

arXiv:1504.05767 [pdf, other]

Rounding Methods for Neural Networks with Low Resolution Synaptic Weights

Authors: Lorenz K. Muller, Giacomo Indiveri

Abstract: Neural network algorithms simulated on standard computing platforms typically make use of high resolution weights, with floating-point notation. However, for dedicated hardware implementations of such algorithms, fixed-point synaptic weights with low resolution are preferable. The basic approach of reducing the resolution of the weights in these algorithms by standard rounding methods incurs drastic losses in performance. To reduce the resolution further, in the extreme case even to binary weights, more advanced techniques are necessary. To this end, we propose two methods for mapping neural network algorithms with high resolution weights to corresponding algorithms that work with low resolution weights and demonstrate that their performance is substantially better than standard rounding. We further use these methods to investigate the performance of three common neural network algorithms under fixed memory size of the weight matrix with different weight resolutions. We show that dedicated hardware systems, whose technology dictates very low weight resolutions (be they electronic or biological) could in principle implement the algorithms we study.

Submitted 22 April, 2015; originally announced April 2015.

arXiv:1409.3367 [pdf, other]

HTML5 WebSocket protocol and its application to distributed computing

Authors: Gabriel L. Muller

Abstract: The HTML5 WebSocket protocol brings real-time communication in web browsers to a new level. Daily, new products are designed to stay permanently connected to the web. WebSocket is the technology enabling this revolution. WebSockets are supported by all current browsers, but it is still a new technology in constant evolution. WebSockets are slowly replacing older client-server communication technologies. As opposed to Comet-like technologies, WebSockets' remarkable performance is a result of the protocol's full-duplex nature and of the fact that it doesn't rely on HTTP communications. To begin with, this paper studies the WebSocket protocol and different WebSocket server implementations. This first theoretical part focuses more deeply on heterogeneous implementations and OpenCL. The second part is a benchmark of a promising new library. The real-time engine used for testing purposes is SocketCluster. SocketCluster provides a highly scalable WebSocket server that makes use of all available CPU cores on an instance. The scope of this work is limited to vertical scaling of SocketCluster.

Submitted 11 September, 2014; originally announced September 2014.

arXiv:1404.7514 [pdf, other]

doi 10.1371/journal.pone.0108590

Characterization and Compensation of Network-Level Anomalies in Mixed-Signal Neuromorphic Modeling Platforms

Authors: Mihai A. Petrovici, Bernhard Vogginger, Paul Müller, Oliver Breitwieser, Mikael Lundqvist, Lyle Muller, Matthias Ehrlich, Alain Destexhe, Anders Lansner, René Schüffny, Johannes Schemmel, Karlheinz Meier

Abstract: Advancing the size and complexity of neural network models leads to an ever increasing demand for computational resources for their simulation. Neuromorphic devices offer a number of advantages over conventional computing architectures, such as high emulation speed or low power consumption, but this usually comes at the price of reduced configurability and precision. In this article, we investigate the consequences of several such factors that are common to neuromorphic devices, more specifically limited hardware resources, limited parameter configurability and parameter variations. Our final aim is to provide an array of methods for coping with such inevitable distortion mechanisms. As a platform for testing our proposed strategies, we use an executable system specification (ESS) of the BrainScaleS neuromorphic system, which has been designed as a universal emulation back-end for neuroscientific modeling. We address the most essential limitations of this device in detail and study their effects on three prototypical benchmark network models within a well-defined, systematic workflow. For each network model, we start by defining quantifiable functionality measures by which we then assess the effects of typical hardware-specific distortion mechanisms, both in idealized software simulations and on the ESS. For those effects that cause unacceptable deviations from the original network dynamics, we suggest generic compensation mechanisms and demonstrate their effectiveness.
Both the suggested workflow and the investigated compensation mechanisms are largely back-end independent and do not require additional hardware configurability beyond the one required to emulate the benchmark networks in the first place. We hereby provide a generic methodological environment for configurable neuromorphic devices that are targeted at emulating large-scale, functional neural networks.

Submitted 10 February, 2015; v1 submitted 29 April, 2014; originally announced April 2014.

Journal ref: PLOS ONE, October 10th 2014

arXiv:1310.5062 [pdf, other]

Aspects of randomness in neural graph structures

Authors: Michelle Rudolph-Lilith, Lyle E. Muller

Abstract: In the past two decades, significant advances have been made in understanding the structural and functional properties of biological networks, via graph-theoretic analysis. In general, most graph-theoretic studies are conducted in the presence of serious uncertainties, such as major undersampling of the experimental data. In the specific case of neural systems, however, a few moderately robust experimental reconstructions do exist, and these have long served as fundamental prototypes for studying connectivity patterns in the nervous system. In this paper, we provide a comparative analysis of these "historical" graphs, both in (unmodified) directed and (often symmetrized) undirected forms, and focus on simple structural characterizations of their connectivity. We find that in most measures the networks studied are captured by simple random graph models; in a few key measures, however, we observe a marked departure from the random graph prediction. Our results suggest that the mechanism of graph formation in the networks studied is not well-captured by existing abstract graph models, such as the small-world or scale-free graph.

Submitted 18 October, 2013; originally announced October 2013.

Comments: 19 pages, 7 figures

Showing 1–50 of 53 results for author: Müller, L