-
Early Warning Signals of Social Instabilities in Twitter Data
Authors:
Vahid Shamsaddini,
Henry Kirveslahti,
Raphael Reinauer,
Wallyson Lemes de Oliveira,
Matteo Caorsi,
Etienne Voutaz
Abstract:
The goal of this project is to create and study novel techniques for identifying early warning signals of socially disruptive events, such as riots, wars, or revolutions, using only publicly available social media data. Such techniques need to be robust enough to work on real-time data: to achieve this goal we propose a topological approach together with more standard BERT models. Indeed, topology-based algorithms, being provably stable against deformations and noise, seem to work well in low-data regimes. The general idea is to build a binary classifier that predicts whether a given tweet is related to a disruptive event. The results indicate that the persistent-gradient approach is stable and even outperforms deep-learning-based anomaly detection algorithms. We also benchmark the generalisability of the methodology on out-of-sample tasks, with very promising results.
Submitted 3 March, 2023;
originally announced March 2023.
-
Persformer: A Transformer Architecture for Topological Machine Learning
Authors:
Raphael Reinauer,
Matteo Caorsi,
Nicolas Berkouk
Abstract:
One of the main challenges of Topological Data Analysis (TDA) is to extract features from persistence diagrams that are directly usable by machine learning algorithms. Indeed, persistence diagrams are intrinsically (multi-)sets of points in $\mathbb{R}^2$ and cannot be seen in a straightforward manner as vectors. In this article, we introduce $\texttt{Persformer}$, the first Transformer neural network architecture that accepts persistence diagrams as input. The $\texttt{Persformer}$ architecture significantly outperforms previous topological neural network architectures on classical synthetic and graph benchmark datasets. Moreover, it satisfies a universal approximation theorem. This allows us to introduce the first interpretability method for topological machine learning, which we explore in two examples.
Submitted 26 September, 2022; v1 submitted 30 December, 2021;
originally announced December 2021.
-
ICLR 2021 Challenge for Computational Geometry & Topology: Design and Results
Authors:
Nina Miolane,
Matteo Caorsi,
Umberto Lupo,
Marius Guerard,
Nicolas Guigui,
Johan Mathe,
Yann Cabanes,
Wojciech Reise,
Thomas Davies,
António Leitão,
Somesh Mohapatra,
Saiteja Utpala,
Shailja Shailja,
Gabriele Corso,
Guoxi Liu,
Federico Iuricich,
Andrei Manolache,
Mihaela Nistor,
Matei Bejan,
Armand Mihai Nicolicioiu,
Bogdan-Alexandru Luchian,
Mihai-Sorin Stupariu,
Florent Michel,
Khanh Dao Duc,
Bilal Abdulrahman
, et al. (8 additional authors not shown)
Abstract:
This paper presents the computational challenge on differential geometry and topology that was held as part of the ICLR 2021 workshop "Geometric and Topological Representation Learning". The competition asked participants to provide creative contributions to the fields of computational geometry and topology through the open-source repositories Geomstats and Giotto-TDA. The challenge attracted 16 teams over its two-month duration. This paper describes the design of the challenge and summarizes its main findings.
Submitted 25 August, 2021; v1 submitted 22 August, 2021;
originally announced August 2021.
-
giotto-ph: A Python Library for High-Performance Computation of Persistent Homology of Vietoris-Rips Filtrations
Authors:
Julián Burella Pérez,
Sydney Hauke,
Umberto Lupo,
Matteo Caorsi,
Alberto Dassatti
Abstract:
We introduce giotto-ph, a high-performance, open-source software package for the computation of Vietoris-Rips barcodes. giotto-ph is based on Morozov and Nigmetov's lock-free (multicore) implementation of Ulrich Bauer's Ripser package. It also contains a re-working of the GUDHI library's implementation of Boissonnat and Pritam's Edge Collapser, which can be used as a pre-processing step to dramatically reduce overall run-times in certain scenarios. Our contribution is twofold: on the one hand, we integrate existing state-of-the-art ideas coherently in a single library and provide Python bindings to the C++ code; on the other hand, we increase parallelization opportunities and improve overall performance by adopting more efficient data structures. Our persistent homology backend establishes a new state of the art, surpassing even GPU-accelerated implementations such as Ripser++ when using as few as 5-10 CPU cores. Furthermore, our implementation of Edge Collapser has fewer software dependencies and improved run-times relative to GUDHI's original implementation.
Submitted 2 August, 2021; v1 submitted 12 July, 2021;
originally announced July 2021.
-
giotto-tda: A Topological Data Analysis Toolkit for Machine Learning and Data Exploration
Authors:
Guillaume Tauzin,
Umberto Lupo,
Lewis Tunstall,
Julian Burella Pérez,
Matteo Caorsi,
Wojciech Reise,
Anibal Medina-Mardones,
Alberto Dassatti,
Kathryn Hess
Abstract:
We introduce giotto-tda, a Python library that integrates high-performance topological data analysis with machine learning via a scikit-learn-compatible API and state-of-the-art C++ implementations. The library's ability to handle various types of data is rooted in a wide range of preprocessing techniques, and its strong focus on data exploration and interpretability is aided by an intuitive plotting API. Source code, binaries, examples, and documentation can be found at https://github.com/giotto-ai/giotto-tda.
Submitted 5 March, 2021; v1 submitted 6 April, 2020;
originally announced April 2020.