Search | arXiv e-print repository

Topological Attention for Time Series Forecasting

Authors: Sebastian Zeng, Florian Graf, Christoph Hofer, Roland Kwitt

Abstract: The problem of (point) forecasting $ \textit{univariate} $ time series is considered. Most approaches, ranging from traditional statistical methods to recent learning-based techniques with neural networks, directly operate on raw time series observations. As an extension, we study whether $\textit{local topological properties}$, as captured via persistent homology, can serve as a reliable signal t… ▽ More The problem of (point) forecasting $ \textit{univariate} $ time series is considered. Most approaches, ranging from traditional statistical methods to recent learning-based techniques with neural networks, directly operate on raw time series observations. As an extension, we study whether $\textit{local topological properties}$, as captured via persistent homology, can serve as a reliable signal that provides complementary information for learning to forecast. To this end, we propose $\textit{topological attention}$, which allows attending to local topological features within a time horizon of historical data. Our approach easily integrates into existing end-to-end trainable forecasting models, such as $\texttt{N-BEATS}$, and in combination with the latter exhibits state-of-the-art performance on the large-scale M4 benchmark dataset of 100,000 diverse time series from different domains. Ablation experiments, as well as a comparison to a broad range of forecasting methods in a setting where only a single time series is available for training, corroborate the beneficial nature of including local topological information through an attention mechanism. △ Less

Submitted 19 July, 2021; originally announced July 2021.

arXiv:2102.08817 [pdf, other]

Dissecting Supervised Contrastive Learning

Authors: Florian Graf, Christoph D. Hofer, Marc Niethammer, Roland Kwitt

Abstract: Minimizing cross-entropy over the softmax scores of a linear map composed with a high-capacity encoder is arguably the most popular choice for training neural networks on supervised learning tasks. However, recent works show that one can directly optimize the encoder instead, to obtain equally (or even more) discriminative representations via a supervised variant of a contrastive objective. In thi… ▽ More Minimizing cross-entropy over the softmax scores of a linear map composed with a high-capacity encoder is arguably the most popular choice for training neural networks on supervised learning tasks. However, recent works show that one can directly optimize the encoder instead, to obtain equally (or even more) discriminative representations via a supervised variant of a contrastive objective. In this work, we address the question whether there are fundamental differences in the sought-for representation geometry in the output space of the encoder at minimal loss. Specifically, we prove, under mild assumptions, that both losses attain their minimum once the representations of each class collapse to the vertices of a regular simplex, inscribed in a hypersphere. We provide empirical evidence that this configuration is attained in practice and that reaching a close-to-optimal state typically indicates good generalization performance. Yet, the two losses show remarkably different optimization behavior. The number of iterations required to perfectly fit to data scales superlinearly with the amount of randomly flipped labels for the supervised contrastive loss. This is in contrast to the approximately linear scaling previously reported for networks trained with cross-entropy. △ Less

Submitted 2 March, 2023; v1 submitted 17 February, 2021; originally announced February 2021.

Comments: v4 updates: - updated appendix section S1.3 - this includes fixing an oversight in the proofs (Lemma 1 missed an equality condition, which now appears in Lemma 2) - improved figure quality

Journal ref: Proceedings of the 38th International Conference on Machine Learning, PMLR 139:3821-3830, 2021

arXiv:2005.14620 [pdf, ps, other]

Parameterized Complexity of Min-Power Asymmetric Connectivity

Authors: Matthias Bentert, Roman Haag, Christian Hofer, Tomohiro Koana, André Nichterlein

Abstract: We investigate parameterized algorithms for the NP-hard problem Min-Power Asymmetric Connectivity (MinPAC) that has applications in wireless sensor networks. Given a directed arc-weighted graph, MinPAC asks for a strongly connected spanning subgraph minimizing the summed vertex costs. Here, the cost of each vertex is the weight of its heaviest outgoing arc in the chosen subgraph. We present linear… ▽ More We investigate parameterized algorithms for the NP-hard problem Min-Power Asymmetric Connectivity (MinPAC) that has applications in wireless sensor networks. Given a directed arc-weighted graph, MinPAC asks for a strongly connected spanning subgraph minimizing the summed vertex costs. Here, the cost of each vertex is the weight of its heaviest outgoing arc in the chosen subgraph. We present linear-time algorithms for the cases where the number of strongly connected components in a so-called obligatory subgraph or the feedback edge number in the underlying undirected graph is constant. Complementing these results, we prove that the problem is W[2]-hard with respect to the solution cost, even on restricted graphs with one feedback arc and binary arc weights. △ Less

Submitted 29 May, 2020; originally announced May 2020.

ACM Class: F.2.2

arXiv:2002.04805 [pdf, other]

Topologically Densified Distributions

Authors: Christoph D. Hofer, Florian Graf, Marc Niethammer, Roland Kwitt

Abstract: We study regularization in the context of small sample-size learning with over-parameterized neural networks. Specifically, we shift focus from architectural properties, such as norms on the network weights, to properties of the internal representations before a linear classifier. Specifically, we impose a topological constraint on samples drawn from the probability measure induced in that space.… ▽ More We study regularization in the context of small sample-size learning with over-parameterized neural networks. Specifically, we shift focus from architectural properties, such as norms on the network weights, to properties of the internal representations before a linear classifier. Specifically, we impose a topological constraint on samples drawn from the probability measure induced in that space. This provably leads to mass concentration effects around the representations of training instances, i.e., a property beneficial for generalization. By leveraging previous work to impose topological constraints in a neural network setting, we provide empirical evidence (across various vision benchmarks) to support our claim for better generalization. △ Less

Submitted 17 May, 2021; v1 submitted 12 February, 2020; originally announced February 2020.

arXiv:1906.09003 [pdf, other]

Connectivity-Optimized Representation Learning via Persistent Homology

Authors: Christoph Hofer, Roland Kwitt, Mandar Dixit, Marc Niethammer

Abstract: We study the problem of learning representations with controllable connectivity properties. This is beneficial in situations when the imposed structure can be leveraged upstream. In particular, we control the connectivity of an autoencoder's latent space via a novel type of loss, operating on information from persistent homology. Under mild conditions, this loss is differentiable and we present a… ▽ More We study the problem of learning representations with controllable connectivity properties. This is beneficial in situations when the imposed structure can be leveraged upstream. In particular, we control the connectivity of an autoencoder's latent space via a novel type of loss, operating on information from persistent homology. Under mild conditions, this loss is differentiable and we present a theoretical analysis of the properties induced by the loss. We choose one-class learning as our upstream task and demonstrate that the imposed structure enables informed parameter selection for modeling the in-class distribution via kernel density estimators. Evaluated on computer vision data, these one-class models exhibit competitive performance and, in a low sample size regime, outperform other methods by a large margin. Notably, our results indicate that a single autoencoder, trained on auxiliary (unlabeled) data, yields a mapping into latent space that can be reused across datasets for one-class learning. △ Less

Submitted 21 June, 2019; originally announced June 2019.

arXiv:1905.10996 [pdf, other]

Graph Filtration Learning

Authors: Christoph D. Hofer, Florian Graf, Bastian Rieck, Marc Niethammer, Roland Kwitt

Abstract: We propose an approach to learning with graph-structured data in the problem domain of graph classification. In particular, we present a novel type of readout operation to aggregate node features into a graph-level representation. To this end, we leverage persistent homology computed via a real-valued, learnable, filter function. We establish the theoretical foundation for differentiating through… ▽ More We propose an approach to learning with graph-structured data in the problem domain of graph classification. In particular, we present a novel type of readout operation to aggregate node features into a graph-level representation. To this end, we leverage persistent homology computed via a real-valued, learnable, filter function. We establish the theoretical foundation for differentiating through the persistent homology computation. Empirically, we show that this type of readout operation compares favorably to previous techniques, especially when the graph connectivity structure is informative for the learning problem. △ Less

Submitted 17 May, 2021; v1 submitted 27 May, 2019; originally announced May 2019.

arXiv:1707.04041 [pdf, other]

Deep Learning with Topological Signatures

Authors: Christoph Hofer, Roland Kwitt, Marc Niethammer, Andreas Uhl

Abstract: Inferring topological and geometrical information from data can offer an alternative perspective on machine learning problems. Methods from topological data analysis, e.g., persistent homology, enable us to obtain such information, typically in the form of summary representations of topological features. However, such topological signatures often come with an unusual structure (e.g., multisets of… ▽ More Inferring topological and geometrical information from data can offer an alternative perspective on machine learning problems. Methods from topological data analysis, e.g., persistent homology, enable us to obtain such information, typically in the form of summary representations of topological features. However, such topological signatures often come with an unusual structure (e.g., multisets of intervals) that is highly impractical for most machine learning techniques. While many strategies have been proposed to map these topological signatures into machine learning compatible representations, they suffer from being agnostic to the target learning task. In contrast, we propose a technique that enables us to input topological signatures to deep neural networks and learn a task-optimal representation during training. Our approach is realized as a novel input layer with favorable theoretical properties. Classification experiments on 2D object shapes and social network graphs demonstrate the versatility of the approach and, in case of the latter, we even outperform the state-of-the-art by a large margin. △ Less

Submitted 16 February, 2018; v1 submitted 13 July, 2017; originally announced July 2017.

Showing 1–7 of 7 results for author: Hofer, C