Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

A Subsequence Approach to Topological Data Analysis for Irregularly-Spaced Time Series

Sixtus Dakurah  
Department of Statistics, University of Wisconsin-Madison
and
Jessi Cisewski-Kehe11footnotemark: 1
Department of Statistics, University of Wisconsin-Madison
The authors were supported by NSF Grant DMS-2038556. Support for this research was also provided by the University of Wisconsin-Madison Office of the Vice Chancellor for Research and Graduate Education with funding from the Wisconsin Alumni Research Foundation. The authors thank Chris Geoga from the University of Wisconsin-Madison for useful discussion related to this work.
Abstract

A time-delay embedding (TDE), grounded in the framework of Takens’s Theorem, provides a mechanism to represent and analyze the inherent dynamics of time-series data. Recently, topological data analysis (TDA) methods have been applied to study this time series representation mainly through the lens of persistent homology. Current literature on the fusion of TDE and TDA are adept at analyzing uniformly-spaced time series observations. This work introduces a novel subsequence embedding method for irregularly-spaced time-series data. We show that this method preserves the original state space topology while reducing spurious homological features. Theoretical stability results and convergence properties of the proposed method in the presence of noise and varying levels of irregularity in the spacing of the time series are established. Numerical studies and an application to real data illustrates the performance of the proposed method.


Keywords: Persistent Homology, Time-Delay Embedding, State Space Reconstruction, Astrostatistics

1 Introduction

A time series measurement x(t)𝑥𝑡x(t)\in\mathbb{R}italic_x ( italic_t ) ∈ blackboard_R at time t𝑡titalic_t can be considered as the outcome of a data-generating space of some dynamical system (i.e., mathematical models that describes the evolution of variables over time) with state vector 𝐬(t)N𝐬𝑡superscript𝑁\mathbf{s}(t)\in\mathbb{R}^{N}bold_s ( italic_t ) ∈ blackboard_R start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT. Constructing a meaningful approximation of this underlying data-generating space when only the scalar time series is observed can uncover latent patterns and structures not readily apparent in the raw time series. Time-delay embeddings (TDEs) are often employed for this state space reconstruction. The TDE transforms the time-series data from the time-domain to an estimate of the state space, which can reveal properties of the system such as periodicity and other structures not apparent in the time domain. The principle underlying TDEs is Takens’s Theorem, which asserts that even if the actual dynamics (i.e., the system’s behavior over time) are not known, a single time series can be treated as a one-dimensional projection of the path traced by the system’s state vector in a multi-dimensional space. An approximation to the actual dynamics can be constructed from this projection (Takens, 2006). Takens proved that assuming uniformly-spaced and noise-free measurements of unlimited length, there exists a diffeomorphism (i.e., a smooth and invertible function) between the true high-dimensional system and the (TDE) reconstructed system. This theorem forms the foundation for much of the discussions on reconstructing the multi-dimensional state of a system from a single time series (Ali et al., 2007).

More recently, there is renewed interest in coupling the TDE method with tools from topological data analysis (TDA) to study the underlying dynamics of time-series data using various geometric and topological features of the TDE such as clusters, loops, voids, and their higher dimensional analogs (Gholizadeh and Zadrozny, 2018). In particular, the reconstructed state space obtained from the TDE reveals consistent patterns and structures that reflect the shape and characteristics of the underlying time-series dynamics (Takens, 2006). TDA is a computational method for studying the shape of data, which can be applied to characterize the topological features of these reconstructed state spaces. The characterization is often carried out using persistent homology, a tool of TDA, which employs a multi-scale approach to quantify certain topological features (Edelsbrunner et al., 2000; Edelsbrunner and Harer, 2022) (see Section 2.1 for more details). TDA and TDE have been successfully applied to quantify periodicity in time-series data (Perea and Harer, 2015), analyze human speech (Brown and Knudson, 2009), detect motion patterns in video (Tralie and Perea, 2018), and in wheeze detection (Emrani et al., 2014).

In the applications noted above, the observed time series is uniformly-spaced. However, time series is often not uniformly-spaced due to measurement lapses (Stark et al., 1997), process errors (Casdagli et al., 1991), or inherent features of the data generating process (Stark et al., 1997; Lekscha and Donner, 2018), etc. The standard Takens’s theorem does not handle irregularly-spaced time series, but several options exist in the literature to address issues related to irregularly-spaced time series observations to make it amenable to TDE. Broadly, these can be classified into imputation or exclusion methods. Imputation methods involve predicting the missing observations, and then the analysis is carried out assuming a uniformly-spaced time series has been observed (Harvey and Pierse, 1984; Casdagli et al., 1991; Lekscha and Donner, 2018). Exclusion methods initially ignore the presence of missing values and assume a uniformly-spaced set. The TDE maps are then constructed and any map with a missing value is excluded (Boker et al., 2018; Johnson and Munch, 2022). If the imputation model is misspecified, it can produce structures in the TDE that do not reflect true properties of the data, and the exclusion method can significantly alter the shape of the TDE space (Huke and Broomhead, 2007; Boker et al., 2018). Since TDA can provide quantification of qualitative properties of the reconstructed state space, the drawbacks of the imputation and exclusion methods may distort topological features constructed from the TDE spaces.

In this work, we propose a subsequence method for carrying out a TDE of irregularly-spaced time-series data that preserves certain properties of the reconstructed state space. The level of irregularity of the time-series data is controlled by the regularity score (defined in Section 3.3). We show that the proposed method preserves the topological features of the original underlying state space of the time series while reducing spurious shape features. Theoretically, we prove stability and convergence results of the proposed subsequence method in the presence of noise and for varying levels of irregularity in the observed time series. Further, we demonstrate the competitiveness of the proposed subsequence method through simulation studies and an application to real data. We view this work as laying a necessary foundation for new statistical TDA methodology that has the potential for characterizing properties of time-series signals beyond period estimation; this point is expanded upon in the Discussion Section of the supplementary material (Section S2) included at the end of this article after the references.

2 Background on Topological Data Analysis

One of the main constructs for studying the topological features of data is persistent homology (Edelsbrunner et al., 2000; Edelsbrunner and Harer, 2022), which detects different dimensional holes in data. Persistent homology provides a multi-scale characterization of the homological features (holes) of a topological space by keeping track of when these topological features first appear (birth) and when they disappear (death) in a filtration. The main topological object in this work is a point cloud resulting from the (M+1)𝑀1(M+1)( italic_M + 1 )-dimensional TDE space of time-series data. The following provide an introduction to persistent homology that are used in the proposed method.

2.1 Vietoris-Rips Filtration and Persistent Homology

Homology is an area of mathematics that looks for holes in a topological space, and persistent homology looks for holes in data. Topological holes are identified using ideas from algebraic topology and are characterized by different dimensional homology groups (Hatcher et al., 2002; Edelsbrunner and Harer, 2022). The zero-dimensional homology group (H0subscript𝐻0{H}_{0}italic_H start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT) contains connected components (clusters), the one-dimensional homology group (H1subscript𝐻1{H}_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT) contains loops, the two-dimensional homology group (H2subscript𝐻2H_{2}italic_H start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT) contains voids like the interior of a balloon, and more generally, the k-dimensional homology group (Hksubscript𝐻𝑘{H}_{k}italic_H start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT) can be thought of as representing k-dimensional holes.

The data for which we use persistent homology in the proposed method are point clouds derived from a timeseries. If we only computed the homology of the point-cloud data, it would not be useful because the data points are not connected resulting in only H0subscript𝐻0H_{0}italic_H start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT features. Instead, an intermediate structure is defined with simplicial complexes. A zero-simplex is a point, a one-simplex is a segment, a two-simplex is a triangle, and, more generally, a k𝑘kitalic_k-simplex σ=(v0,,vk)𝜎subscript𝑣0subscript𝑣𝑘\sigma=(v_{0},\cdots,v_{k})italic_σ = ( italic_v start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , ⋯ , italic_v start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) is a k𝑘kitalic_k-dimensional polytope of k+1𝑘1k+1italic_k + 1 affinely independent points (i.e., zero-simplexes) v0,,vksubscript𝑣0subscript𝑣𝑘v_{0},\cdots,v_{k}italic_v start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , ⋯ , italic_v start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT. A simplicial complex K𝐾{K}italic_K is a finite set of simplices such that for any σ1,σ2Ksubscript𝜎1subscript𝜎2𝐾\sigma_{1},\sigma_{2}\in{K}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ∈ italic_K, σ1σ2subscript𝜎1subscript𝜎2\sigma_{1}\cap\sigma_{2}italic_σ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ∩ italic_σ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is a face (i.e., the convex hull of any non-empty subset of points that define the simplex) of both simplices, or the empty set; and a face of any simplex σK𝜎𝐾\sigma\in{K}italic_σ ∈ italic_K is also a simplex in K𝐾{K}italic_K. To compute the homology, simplicial complexes are built along a sequence of filtration values, which is discussed next.

A common approach to constructing simplicial complexes in TDA is the Vietoris-Rips (VR) complex (Vietoris, 1927; Edelsbrunner and Harer, 2022). A VR complex is constructed over a finite set of points S={v1,v2,,vn}𝑆subscript𝑣1subscript𝑣2subscript𝑣𝑛S=\{v_{1},v_{2},\cdots,v_{n}\}italic_S = { italic_v start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_v start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , italic_v start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT } using a distance parameter δ0𝛿0\delta\geq 0italic_δ ≥ 0. For any subset of k-points {vi1,,vik}subscript𝑣subscript𝑖1subscript𝑣subscript𝑖𝑘\{v_{i_{1}},\cdots,v_{i_{k}}\}{ italic_v start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT , ⋯ , italic_v start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT }, a (k1)𝑘1(k-1)( italic_k - 1 )-dimensional simplex is formed when the pairwise Euclidean distance between all points is at most δ𝛿\deltaitalic_δ. A collection of all such simplices forms the VR complex denoted as VR(S,δ)𝑉𝑅𝑆𝛿VR(S,\delta)italic_V italic_R ( italic_S , italic_δ ). The composition of the VR complex progresses hierarchically as δ𝛿\deltaitalic_δ increases. This leads to the concept of filtration, which defines an inclusion relation between the simplicial complexes for a set of δ𝛿\deltaitalic_δ values. More formally, for an ordered sequence of δ𝛿\deltaitalic_δ values: 0<δ1<δ2<<δq<0subscript𝛿1subscript𝛿2subscript𝛿𝑞0<\delta_{1}<\delta_{2}<\cdots<\delta_{q}<\infty0 < italic_δ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT < italic_δ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT < ⋯ < italic_δ start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT < ∞, the VR complexes admits a nested structure, VR(S,0)VR(S,δ1)VR(S,δq)VR(S,).𝑉𝑅𝑆0𝑉𝑅𝑆subscript𝛿1𝑉𝑅𝑆subscript𝛿𝑞𝑉𝑅𝑆VR(S,0)\subset VR(S,\delta_{1})\subset\cdots\subset VR(S,\delta_{q})\subset VR% (S,\infty).italic_V italic_R ( italic_S , 0 ) ⊂ italic_V italic_R ( italic_S , italic_δ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ⊂ ⋯ ⊂ italic_V italic_R ( italic_S , italic_δ start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ) ⊂ italic_V italic_R ( italic_S , ∞ ) . Figure 1 provides an illustration of a VR complex. The black points (zero-simplices) in Figure 1(a) and Figure 1(b) denotes the data with cyan balls of diameter δ=0.8𝛿0.8\delta=0.8italic_δ = 0.8 and δ=1.5𝛿1.5\delta=1.5italic_δ = 1.5, respectively, along with the resulting one-simplices and two-simplices. The simplicial complex in Figure 1(a) is a subset of the simplicial complex displayed in Figure 1(b).

Refer to caption
(a) Point set with δ=0.8𝛿0.8\delta=0.8italic_δ = 0.8
Refer to caption
(b) Point set with δ=1.5𝛿1.5\delta=1.5italic_δ = 1.5
Refer to caption
(c) Persistence diagram
Figure 1: Illustration of VR complexes and persistence diagram. The zero-simplices (black points) were sampled around three circles. Balls of diameter 0.80.80.80.8 (a) and 1.51.51.51.5 (b) are drawn around the points, resulting in one-simplices (black lines) and two-simplices (yellow triangles). The associated persistence diagram (c) has H0subscript𝐻0H_{0}italic_H start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT features (red points) and H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT features (blue triangles) that represent the three circles.

The inclusion relation between the VR complexes induces a map between the homology groups, Hk(VR(S,0))Hk(VR(S,δ1))Hk(VR(S,)).absentsubscript𝐻𝑘𝑉𝑅𝑆0subscript𝐻𝑘𝑉𝑅𝑆subscript𝛿1absentabsentsubscript𝐻𝑘𝑉𝑅𝑆{H}_{k}(VR(S,0))\xrightarrow{}{H}_{k}(VR(S,\delta_{1}))\xrightarrow{}\cdots% \xrightarrow{}{H}_{k}(VR(S,\infty)).italic_H start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_V italic_R ( italic_S , 0 ) ) start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW italic_H start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_V italic_R ( italic_S , italic_δ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ) start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW ⋯ start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW italic_H start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_V italic_R ( italic_S , ∞ ) ) . The notion of persistent homology is developed through these homology maps by tracking the changes in the generators (i.e., the different dimensional holes) of these nested homology groups. The birth time and death time of homology group generators along this sequence encodes topological changes in the groups. For a homology group Hksubscript𝐻𝑘{H}_{k}italic_H start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT, we denote the birth time and death time of the j𝑗jitalic_j-th homology group generator (i.e., the j𝑗jitalic_j-th hole) by bjsubscript𝑏𝑗b_{j}italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT and djsubscript𝑑𝑗d_{j}italic_d start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT, respectively. The persistence, or lifetime, of the generator is given by djbjsubscript𝑑𝑗subscript𝑏𝑗d_{j}-b_{j}italic_d start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT, and a longer persistence is often considered to be topological signal while shorter persistence often represents topological noise. If we let kjsubscript𝑘𝑗k_{j}italic_k start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT be the homology group dimension of the j𝑗jitalic_j-th generator, and J𝐽{J}italic_J the index set of the generators of the homology groups, then the set Dgm(S)={(bj,dj,kj):jJ}ΔDgm𝑆conditional-setsubscript𝑏𝑗subscript𝑑𝑗subscript𝑘𝑗for-all𝑗𝐽Δ\text{Dgm}(S)=\{(b_{j},d_{j},k_{j}):\forall j\in J\}\cup\DeltaDgm ( italic_S ) = { ( italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_d start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_k start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) : ∀ italic_j ∈ italic_J } ∪ roman_Δ fully characterizes the persistence (life span) of the homology generators, and is used to construct a graphical summary referred to as the persistence diagram; see example in Figure 1(c). The ΔΔ\Deltaroman_Δ represents points where the birth time is equal to the death time, and it included in the definition of persistence diagrams for technical reasons (Mileyko et al., 2011).

A persistence diagram is a summary statistic that characterizes the birth and death times of different dimensional holes detected in a dataset. These diagrams can be useful on their own for visualizing topological information of complex data, but are often used for inference or prediction tasks (e.g., Fasy et al., 2014; Bubenik et al., 2015; Adams et al., 2017; Robinson and Turner, 2017; Xu et al., 2019; Berry et al., 2020; Pun et al., 2022; Glenn et al., 2024). In this work, persistence diagrams are used to characterize the TDE from time series and to quantify the performance of the proposed and comparison methods for reconstructing the state space of irregularly-spaced time-series data.

3 Method

The persistence of topological features forms the basis for most of our analysis on the TDE reconstructed state spaces. A TDE transforms a time series over a sliding window of size Mτ𝑀𝜏M\tauitalic_M italic_τ to a point cloud in M+1superscript𝑀1\mathbb{R}^{M+1}blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT. While Takens’s theorem guarantees exact reconstruction for uniformly-spaced time-series data with appropriate M𝑀Mitalic_M and τ𝜏\tauitalic_τ, no such guarantees exist for irregularly-spaced time series. Here, we propose a new reconstruction method for irregularly-spaced time series that better preserves the topological features of the reconstructed state space compared to existing methods.

3.1 A Note on Terminology

For this work, the term uniformly-spaced time series is used to describe time series that have equally-spaced time intervals between successive observations, and is considered the “true” times series for purposes of evaluating the proposed method. The term irregularly-spaced time series refers to observations with unequally-spaced time interval between successive observations. To characterize how the two forms of time series are related, it is assumed throughout this work that the irregularly-spaced time series is a subset of the uniformly-spaced time series. Hence the irregularly-spaced time series always have fewer time measurements than the corresponding uniformly-spaced time series.

The concept of TDE is also referred to in the literature as a delayed coordinate embedding, a sliding window embedding, or simply a Takens embedding. For this work, only the term TDE is used. For any irregularly-spaced time series, it is assumed there is a “true” underlying uniformly-spaced time series. The use of “TDE” exclusively refers to an embedding constructed from this true underlying uniformly-spaced time series. After applying the proposed subsequence method, the resulting embedding is referred as the subsequence embedding (SSE). In instances where an exposition applies to both the TDE and SSE, the term embedding map is used as a collective reference to the two concepts.

For a given time interval, it is assumed that missing or unobserved values occurs with a given probability. That is, for a given time point in a time interval, a measurement is not observed at that point with some probability. Such probabilistic mechanism governing the observations of time series values is not uncommon in the literature (e.g., Dunsmuir and Robinson, 1981). This probability can be fixed for all time points or it can vary for each time point. This characterization is referred to as the missingness structure of the time series in context.

3.2 Basics of Time-Delay Embeddings

For this work, the discussion on TDEs is restricted to univariate time series. Assume this univariate time series is generated by a system with a state vector 𝐬(t)𝐬𝑡\mathbf{s}(t)bold_s ( italic_t ) on a manifold which is a subset of some N𝑁Nitalic_N-dimensional space Nsuperscript𝑁\mathbb{R}^{N}blackboard_R start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT. The state vector 𝐬(t)𝐬𝑡\mathbf{s}(t)bold_s ( italic_t ) is not directly observable, however some measurement of it, denoted x(t)=h(𝐬(t))𝑥𝑡𝐬𝑡x(t)=h(\mathbf{s}(t))italic_x ( italic_t ) = italic_h ( bold_s ( italic_t ) ), is observed through the measurement function h()h(\cdot)italic_h ( ⋅ ). The measurement function h()h(\cdot)italic_h ( ⋅ ) can be thought of as rule that transforms the high-dimensional state vector into the observed univariate time series x(t)𝑥𝑡x(t)italic_x ( italic_t ). For instance, in astronomy, 𝐬(t)𝐬𝑡\mathbf{s}(t)bold_s ( italic_t ) might include variables such as positions, velocities, and luminosities of various celestial bodies, such as exoplanets, stars, or galaxies. However, the measurement function is specifically designed to extract a single scalar value from this vector. The specific form of h()h(\cdot)italic_h ( ⋅ ) is influenced by many considerations, for example, the limitation of observational tools. For a star, the measurement function could be designed to extract a key observable from the state vector, such as its luminosity. Thus the measurement h(𝐬(t))𝐬𝑡h(\mathbf{s}(t))italic_h ( bold_s ( italic_t ) ) reflects the observed brightness of the star at any given time t𝑡titalic_t.

The scalar value x(t)𝑥𝑡x(t)italic_x ( italic_t ) is the observed time series measurement. Define the function F:NM+1:𝐹absentsuperscript𝑁superscript𝑀1F:\mathbb{R}^{N}\xrightarrow{}\mathbb{R}^{M+1}italic_F : blackboard_R start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT as the embedding map with the form F(𝐬(t))=[x(t),x(t+τ),,x(t+Mτ)]𝐹𝐬𝑡𝑥𝑡𝑥𝑡𝜏𝑥𝑡𝑀𝜏F(\mathbf{s}(t))=\left[x(t),x(t+\tau),\cdots,x(t+M\tau)\right]italic_F ( bold_s ( italic_t ) ) = [ italic_x ( italic_t ) , italic_x ( italic_t + italic_τ ) , ⋯ , italic_x ( italic_t + italic_M italic_τ ) ]. If the measurement function h()h(\cdot)italic_h ( ⋅ ) is noise-free, and the embedding dimension M+1𝑀1M+1italic_M + 1 is chosen to be more than twice the dimension of the attractor (i.e., the N𝑁Nitalic_N-dimensional region toward which the system evolves) of the system’s state space, Takens’s theorem guarantees that the embedding map F(𝐬(t))𝐹𝐬𝑡F(\mathbf{s}(t))italic_F ( bold_s ( italic_t ) ) has a one-to-one correspondence between the original state space of the system (from which the time series is derived) and the reconstructed state space formed by F(𝐬(t))𝐹𝐬𝑡F(\mathbf{s}(t))italic_F ( bold_s ( italic_t ) ) (Takens, 2006). This ensures that the dynamics of the system can be studied in the reconstructed space as if it were being studied in the original space. Figure 2 demonstrates this reconstruction process by mapping the scalar time series to the TDE matrix F𝐹Fitalic_F to reconstruct the state space.

Refer to caption
Figure 2: Illustration of the embedding process. Top-left: the state space, typically not observed. Middle-bottom: the time series obtained via the measurement function h()h(\cdot)italic_h ( ⋅ ). Top-right: the reconstructed space from the TDE matrix F𝐹Fitalic_F, which preserves the topology of the original state space.

The choice of embedding dimension M+1𝑀1M+1italic_M + 1 and step size τ𝜏\tauitalic_τ is a subject of considerable research in the literature (e.g., Cao, 1997; Kim et al., 1999). In this work, the embedding dimension is chosen manually. However, the method of false nearest neighbors is one common method for determining this dimension, which identifies points in a low-dimensional space that appear to be near each other but are not actually neighbors when the data is viewed in a high-dimensional space. By systematically increasing the embedding dimension, and evaluating the percentage of false nearest neighbors, the dimension can be set where this percentage drops significantly, indicating a suitable dimension. More details about this and other procedures for determining M𝑀Mitalic_M and τ𝜏\tauitalic_τ can be found in Cao, (1997); Kim et al., (1999). A large value of M𝑀Mitalic_M is often preferred as it enables the embedding to capture more details inherent in the time series. If M𝑀Mitalic_M is too large, there may be an insufficient number of points in the embedding space. Furthermore, if Mτ𝑀𝜏M\tauitalic_M italic_τ is too small due to a small τ𝜏\tauitalic_τ, relatively fewer points fall in each embedding window. This results in points repeatedly appearing in windows, which can lead to redundant information. If Mτ𝑀𝜏M\tauitalic_M italic_τ is too large due to a large value of τ𝜏\tauitalic_τ, the reconstructed state space can be distorted because relevant periodic behavior of the time series may not be captured (Casdagli et al., 1991). Hence the choice of τ𝜏\tauitalic_τ and M𝑀Mitalic_M is such that Mτ𝑀𝜏M\tauitalic_M italic_τ is not too large or too small, but is application-dependent and requires empirical testing.

3.3 Subsequence Method

The TDE construction in the previous section assumes the observed time series is uniformly-spaced, but a time series is often irregularly-spaced in real data. We propose a method to extract uniformly-spaced subsequences from the observed irregularly-spaced time series and prove its topology-preserving properties, along with consistency and convergence results.

3.3.1 Subsequence construction

Let 𝐱=(x(t):t𝒯)\mathbf{x}=\left(x(t):t\in\mathcal{T}\right)bold_x = ( italic_x ( italic_t ) : italic_t ∈ caligraphic_T ) be a time series of length n𝑛nitalic_n where 𝒯={t1,,tn}𝒯subscript𝑡1subscript𝑡𝑛\mathcal{T}=\{t_{1},\cdots,t_{n}\}\subset\mathbb{N}caligraphic_T = { italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , ⋯ , italic_t start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT } ⊂ blackboard_N. Further assume that this time series is not uniformly spaced, that is, ti+1titi+2ti+1subscript𝑡𝑖1subscript𝑡𝑖subscript𝑡𝑖2subscript𝑡𝑖1t_{i+1}-t_{i}\neq t_{i+2}-t_{i+1}italic_t start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT - italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ≠ italic_t start_POSTSUBSCRIPT italic_i + 2 end_POSTSUBSCRIPT - italic_t start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT, for some ti𝒯subscript𝑡𝑖𝒯t_{i}\in\mathcal{T}italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ caligraphic_T such that ti<ti+1subscript𝑡𝑖subscript𝑡𝑖1t_{i}<t_{i+1}italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT < italic_t start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT. In this work, a subsequence of the set 𝐱𝐱\mathbf{x}bold_x is defined as any subset that omits elements of 𝐱𝐱\mathbf{x}bold_x without changing the order of the remaining elements. This definition does not guarantee that ti+1ti=ti+2ti+1,subscript𝑡𝑖1subscript𝑡𝑖subscript𝑡𝑖2subscript𝑡𝑖1t_{i+1}-t_{i}=t_{i+2}-t_{i+1},italic_t start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT - italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = italic_t start_POSTSUBSCRIPT italic_i + 2 end_POSTSUBSCRIPT - italic_t start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT , ti𝒯for-allsubscript𝑡𝑖𝒯\forall t_{i}\in\mathcal{T}∀ italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ caligraphic_T, which is a condition we want to achieve with the proposed subsequence construction. Let 𝐱p,r𝐱subscript𝐱𝑝𝑟𝐱\mathbf{x}_{p,r}\subseteq\mathbf{x}bold_x start_POSTSUBSCRIPT italic_p , italic_r end_POSTSUBSCRIPT ⊆ bold_x be a subset of the original time series with time index 𝒯p,r𝒯subscript𝒯𝑝𝑟𝒯\mathcal{T}_{p,r}\subseteq\mathcal{T}caligraphic_T start_POSTSUBSCRIPT italic_p , italic_r end_POSTSUBSCRIPT ⊆ caligraphic_T with the condition that tp,i+1tp,i=r,subscript𝑡𝑝𝑖1subscript𝑡𝑝𝑖𝑟t_{p,i+1}-t_{p,i}=r,italic_t start_POSTSUBSCRIPT italic_p , italic_i + 1 end_POSTSUBSCRIPT - italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT = italic_r , tp,i𝒯p,rfor-allsubscript𝑡𝑝𝑖subscript𝒯𝑝𝑟\forall t_{p,i}\in\mathcal{T}_{p,r}∀ italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ∈ caligraphic_T start_POSTSUBSCRIPT italic_p , italic_r end_POSTSUBSCRIPT. The set 𝐱p,rsubscript𝐱𝑝𝑟\mathbf{x}_{p,r}bold_x start_POSTSUBSCRIPT italic_p , italic_r end_POSTSUBSCRIPT is the p𝑝pitalic_p-th subsequence of regularity r𝑟ritalic_r, and it is a uniformly-spaced subsequence. For any non-uniformly spaced time series, we can build a collection of such subsequence for various values of r𝑟ritalic_r. The goal is to first obtain the longest subsequence for a small r𝑟ritalic_r. As the subsequence length reduces for a given r𝑟ritalic_r, the regularity value r𝑟ritalic_r can increase to obtain more uniformly-spaced subsequences. An algorithm for computing this collection of subsequences is displayed in Algorithm 1, which is adapted from an algorithm that finds the longest arithmetic progression in a sequence developed in (Erickson, 1999). In the statement of the algorithm, the following notation is used: (i) the union symbol \cup denotes the addition of a set (element) to a set (vector), (ii) the number of elements in a set or a vector A𝐴Aitalic_A is denoted by |A|𝐴|A|| italic_A |, and (iii) the notation A\B\𝐴𝐵A\backslash Bitalic_A \ italic_B represents subset of elements in set A𝐴Aitalic_A obtained by excluding all elements from set B𝐵Bitalic_B.

Algorithm 1 Uniform subsequence construction
1:Time points 𝒯={t1,,tn}𝒯subscript𝑡1subscript𝑡𝑛\mathcal{T}=\{t_{1},\cdots,t_{n}\}caligraphic_T = { italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , ⋯ , italic_t start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT }, regularity score r𝑟ritalic_r, minimum sequence length m𝑚mitalic_m.
2:rtnt1𝑟subscript𝑡𝑛subscript𝑡1r\leq\ t_{n}-t_{1}italic_r ≤ italic_t start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT - italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, mn𝑚𝑛m\leq nitalic_m ≤ italic_n.
3:𝒯p{}subscript𝒯𝑝\mathcal{T}_{p}\leftarrow\{\dots\}caligraphic_T start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ← { … }, temporary time index, 𝐓reg{}subscript𝐓reg\mathbf{T}_{\text{reg}}\leftarrow\{\}bold_T start_POSTSUBSCRIPT reg end_POSTSUBSCRIPT ← { }, uniformly-spaced subsequences.
4:while number of elements in 𝒯𝒯\mathcal{T}caligraphic_T is greater than m do
5:     for i=1:(|𝒯|1):𝑖1𝒯1i=1:(|\mathcal{T}|-1)italic_i = 1 : ( | caligraphic_T | - 1 ) do
6:         𝒯sub𝒯[i]subscript𝒯sub𝒯delimited-[]𝑖\mathcal{T}_{\text{sub}}\leftarrow\mathcal{T}[i]caligraphic_T start_POSTSUBSCRIPT sub end_POSTSUBSCRIPT ← caligraphic_T [ italic_i ] \triangleright Initialize a subsequence.
7:         for j=(i+1):|𝒯|:𝑗𝑖1𝒯j=(i+1):|\mathcal{T}|italic_j = ( italic_i + 1 ) : | caligraphic_T | do
8:              if 𝒯[j]𝒯sub[ji]=r𝒯delimited-[]𝑗subscript𝒯subdelimited-[]𝑗𝑖𝑟\mathcal{T}[j]-\mathcal{T}_{\text{sub}}[j-i]=rcaligraphic_T [ italic_j ] - caligraphic_T start_POSTSUBSCRIPT sub end_POSTSUBSCRIPT [ italic_j - italic_i ] = italic_r then \triangleright Check the regularity condition.
9:                  𝒯sub𝒯sub𝒯[j]subscript𝒯subsubscript𝒯sub𝒯delimited-[]𝑗\mathcal{T}_{\text{sub}}\leftarrow\mathcal{T}_{\text{sub}}\cup\mathcal{T}[j]caligraphic_T start_POSTSUBSCRIPT sub end_POSTSUBSCRIPT ← caligraphic_T start_POSTSUBSCRIPT sub end_POSTSUBSCRIPT ∪ caligraphic_T [ italic_j ];  if |𝒯sub|>|𝒯p|subscript𝒯subsubscript𝒯𝑝|\mathcal{T}_{\text{sub}}|>|\mathcal{T}_{p}|| caligraphic_T start_POSTSUBSCRIPT sub end_POSTSUBSCRIPT | > | caligraphic_T start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT | then 𝒯p𝒯subsubscript𝒯𝑝subscript𝒯sub\mathcal{T}_{p}\leftarrow\mathcal{T}_{\text{sub}}caligraphic_T start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ← caligraphic_T start_POSTSUBSCRIPT sub end_POSTSUBSCRIPT
10:              elsebreak \triangleright Initialize with the next point in the sequence.                             
11:     if |𝒯p|msubscript𝒯𝑝𝑚|\mathcal{T}_{p}|\geq m| caligraphic_T start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT | ≥ italic_m and 𝒯psubscript𝒯𝑝\mathcal{T}_{p}caligraphic_T start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT is not identical to any other subsequence in 𝐓regsubscript𝐓𝑟𝑒𝑔\mathbf{T}_{reg}bold_T start_POSTSUBSCRIPT italic_r italic_e italic_g end_POSTSUBSCRIPT then
12:         𝐓reg𝐓reg𝒯psubscript𝐓𝑟𝑒𝑔subscript𝐓𝑟𝑒𝑔subscript𝒯𝑝\mathbf{T}_{reg}\leftarrow\mathbf{T}_{reg}\cup\mathcal{T}_{p}bold_T start_POSTSUBSCRIPT italic_r italic_e italic_g end_POSTSUBSCRIPT ← bold_T start_POSTSUBSCRIPT italic_r italic_e italic_g end_POSTSUBSCRIPT ∪ caligraphic_T start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT;  𝒯𝒯\𝒯p𝒯\𝒯subscript𝒯𝑝\mathcal{T}\leftarrow\mathcal{T}\backslash\mathcal{T}_{p}caligraphic_T ← caligraphic_T \ caligraphic_T start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT \triangleright Remove the subset from the sequence.
13:     elsebreak \triangleright No uniformly-spaced subsequence of the required length exist.      
14:return 𝐓regsubscript𝐓reg\mathbf{T}_{\text{reg}}bold_T start_POSTSUBSCRIPT reg end_POSTSUBSCRIPT \triangleright Set of all regularly spaced subsequences each of regularity r𝑟ritalic_r.

Algorithm 1 returns all possible uniformly-spaced time points from the time index set 𝒯𝒯\mathcal{T}caligraphic_T with regularity score r𝑟ritalic_r. Note that for uniformly-spaced 𝒯𝒯\mathcal{T}caligraphic_T, it returns the full sequence. The uniformly-spaced observations can now be obtained by simply matching these observations to the time points in each subsequence. Not all the subsequences returned by Algorithm 1 are required in the SSE (see Remark 1).

3.3.2 Subsequence embedding method

Takens’s theorem guiding the construction of the TDE in Section 3.2 involves a single measurement function h()h(\cdot)italic_h ( ⋅ ), which generates each time series measurement (Takens, 2006). A generalization considers each coordinate in the embedding maps as a measurement function (see Remark 2.92.92.92.9 in Sauer et al., (1991) and Theorem 2222 in Deyle and Sugihara, (2011)). Such generalizations allow for the extension of Takens’s theorem to multiple measurement functions involving multiple time series. This motivates the proposed SSE method where each subsequence is viewed as distinct time series.

To construct the proposed SSE, a single distinct measurement function is defined on each subsequence. Let hp()subscript𝑝h_{p}(\cdot)italic_h start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ( ⋅ ) be the measurement function associated with the p𝑝pitalic_p-th subsequence. Then the p𝑝pitalic_p-th embedding mapping has the form

Fp(𝐬(tp,i))=[xp,r(tp,i),xp,r(tp,i+τp),,xp,r(tp,i+Mτp)],xp,r(tp,i)=hp(𝐬(tp,i)).formulae-sequencesubscript𝐹𝑝𝐬subscript𝑡𝑝𝑖subscript𝑥𝑝𝑟subscript𝑡𝑝𝑖subscript𝑥𝑝𝑟subscript𝑡𝑝𝑖subscript𝜏𝑝subscript𝑥𝑝𝑟subscript𝑡𝑝𝑖𝑀subscript𝜏𝑝subscript𝑥𝑝𝑟subscript𝑡𝑝𝑖subscript𝑝𝐬subscript𝑡𝑝𝑖F_{p}(\mathbf{s}(t_{p,i}))=\left[x_{p,r}(t_{p,i}),x_{p,r}(t_{p,i}+\tau_{p}),% \cdots,x_{p,r}(t_{p,i}+M\tau_{p})\right],\quad x_{p,r}(t_{p,i})=h_{p}(\mathbf{% s}(t_{p,i})).italic_F start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) = [ italic_x start_POSTSUBSCRIPT italic_p , italic_r end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) , italic_x start_POSTSUBSCRIPT italic_p , italic_r end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT + italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) , ⋯ , italic_x start_POSTSUBSCRIPT italic_p , italic_r end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT + italic_M italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) ] , italic_x start_POSTSUBSCRIPT italic_p , italic_r end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) = italic_h start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) . (1)

The delay step τpsubscript𝜏𝑝\tau_{p}italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT is fixed for each subsequence map. The map is also constructed under the assumption that the length of each subsequence np>max(M+1,Mτp)subscript𝑛𝑝𝑀1𝑀subscript𝜏𝑝n_{p}>\max(M+1,M*\tau_{p})italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT > roman_max ( italic_M + 1 , italic_M ∗ italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ). This ensures that there are sufficient observations within each subsequence to construct a point in the embedding space. The embedding matrix from the p𝑝pitalic_p-th subsequence has the form

𝐅p=[F(𝐬(tp,1))F(𝐬(tp,2))F(𝐬(tp,npM))].subscript𝐅𝑝superscriptmatrix𝐹superscript𝐬subscript𝑡𝑝1top𝐹superscript𝐬subscript𝑡𝑝2top𝐹superscript𝐬subscript𝑡𝑝subscript𝑛𝑝𝑀toptop\mathbf{F}_{p}=\begin{bmatrix}F(\mathbf{s}(t_{p,1}))^{\top}&F(\mathbf{s}(t_{p,% 2}))^{\top}&\cdots&F(\mathbf{s}(t_{p,n_{p}-M}))^{\top}\end{bmatrix}^{\top}.bold_F start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT = [ start_ARG start_ROW start_CELL italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , 1 end_POSTSUBSCRIPT ) ) start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT end_CELL start_CELL italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , 2 end_POSTSUBSCRIPT ) ) start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT end_CELL start_CELL ⋯ end_CELL start_CELL italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_M end_POSTSUBSCRIPT ) ) start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT end_CELL end_ROW end_ARG ] start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT . (2)

The embedding matrix of the irregularly-spaced time series is then obtained by vertically stacking the embedding matrices from each of the subsequences. This is denoted as 𝐅=[𝐅1;;𝐅P]\mathbf{F}=\begin{bmatrix}\mathbf{F}_{1};&\cdots&;\mathbf{F}_{P}\end{bmatrix}bold_F = [ start_ARG start_ROW start_CELL bold_F start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ; end_CELL start_CELL ⋯ end_CELL start_CELL ; bold_F start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT end_CELL end_ROW end_ARG ], where P𝑃Pitalic_P is the total number of uniformly-spaced subsequences. The matrix 𝐅𝐅\mathbf{F}bold_F is of dimension (p=1P(npM))×(M+1)superscriptsubscript𝑝1𝑃subscript𝑛𝑝𝑀𝑀1\left(\sum_{p=1}^{P}(n_{p}-M)\right)\times(M+1)( ∑ start_POSTSUBSCRIPT italic_p = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT ( italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_M ) ) × ( italic_M + 1 ). Note that when the original time series is uniformly-spaced, the SSE method is identical to the TDE. To see this, observe that the longest subsequence in the uniformly-spaced time series is the original sequence.

To illustrate the SSE framework, Figure 3(a) shows measurements at 1000 uniformly-spaced time points (orange points and blue diamonds combined) of which about 20%percent2020\%20 % are designated as missing values (blue diamonds), which creates an irregularly-spaced time series (orange points). Both the uniformly-spaced and irregularly-space time series were embedded into 4superscript4\mathbb{R}^{4}blackboard_R start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT using the TDE and SSE methods, respectively. Figure 3(b)-top gives the TDE of the uniformly-spaced 1000 measurements and contains two identical elliptical shapes. Figure 3(b)-bottom shows the proposed SSE of the irregularly-spaced time series and also contains two similar elliptical shapes, however, there is visible non-uniform spacing of the points compared to the TDE space. This is primarily due to the SSE using a subset of the original time series (i.e., it constructs a uniform subsample from the irregularly-spaced time series based on Algorithm 1); the SSE space may be considered as a sparse representation of the TDE space. The persistence diagram for the TDE is shown in Figure 3(c). Since Figure 3(b)-top has two identical elliptical shapes, the H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT features have overlapping birth and death time, hence the appearance of a single blue triangle. Figure 3(d) shows the persistence diagram for the SSE, and correctly identifies the two loops but the birth and death time are non-overlapping due to the non-identical spacing of the points in the two elliptical shapes. In general, the SSE converges to the TDE in terms of the topological similarity of the reconstructed spaces and in the closeness of the persistence diagrams as the time sampling becomes more uniform. A more formal theoretical justification of this assertion, and other technical considerations are discussed in the next section.

Refer to caption
(a) Time series measurements at 1000 time points
Refer to caption
Refer to caption
(b) Reconstructed spaces
Refer to caption
(c) TDE persistence diagram
Refer to caption
(d) SSE persistence diagram
Figure 3: SSE method illustration. (a) One thousand time series measurements (blue and orange points). About 20%percent2020\%20 % were designated as missing (hollow blue diamonds) to obtain irregularly-spaced observations (orange points). The TDE of the full time series ((b)-top) and the SSE of the irregularly-spaced time series ((b)-bottom); both time series were embedded in 4superscript4\mathbb{R}^{4}blackboard_R start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPT and their first three principal components are plotted. The persistence diagram of the TDE (c) and SSE (d).
Remark 1.

The choice of the number of subsequences P𝑃Pitalic_P and the number of subsequences of different regularity scores r𝑟ritalic_r depend on the context and goals of the analysis. To reconstruct an (M+1)𝑀1(M+1)( italic_M + 1 )-dimensional state space, subsequences must satisfy np>max(M+1,Mτp)subscript𝑛𝑝𝑀1𝑀subscript𝜏𝑝n_{p}>\max(M+1,M*\tau_{p})italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT > roman_max ( italic_M + 1 , italic_M ∗ italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ). A set of subsequences with the same regularity score can lead to better reconstruction accuracy as it captures the dominant patterns of the underlying data-generating space of the time series more coherently. Combining subsequences with different regularity scores can improve topological approximations as it captures a wider range of structures of the underlying data-generating space. Thus, there is a trade-off between a better topology approximation and improved reconstruction accuracy. If subsequences with the same regularity score capture most of the time series, combining sequences with different regularity scores may offer limited benefits. While the simulations and real data analysis in this work utilize subsequences with the same regularity score, the methodology and theoretical results apply to subsequences with the same or different regularity scores.

4 Stability and Convergence Results

The reconstructed state space using the proposed SSE is an estimate of the “true” state space based on a uniformly-sampled time series (i.e., the TDE space). Persistence diagrams are used to quantify the stability of the estimate by measuring its closeness to the true state space. The results are stated below, and the proofs are included in the supplementary materials Section S1.

4.1 Stability Theory in Persistence Homology

The set of persistence diagrams from the true state space (TDE) and the reconstructed state space (SSE) can be endowed with a distance measure, such as the popular bottleneck distance. The bottleneck distance gives the minimal bijection between any two diagrams as a measure of the distance between the diagrams. Let 𝐅1superscript𝐅1\mathbf{F}^{1}bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT and 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT be two finite compact subsets of M+1superscript𝑀1\mathbb{R}^{M+1}blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT, with Dgm(𝐅1)Dgmsuperscript𝐅1\text{Dgm}(\mathbf{F}^{1})Dgm ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ) and Dgm(𝐅2)Dgmsuperscript𝐅2\text{Dgm}(\mathbf{F}^{2})Dgm ( bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) as their corresponding persistence diagrams. The bottleneck distance, dBsubscriptdB\text{d}_{\text{B}}d start_POSTSUBSCRIPT B end_POSTSUBSCRIPT, between the two persistence diagrams is defined as

dB(Dgm(𝐅1),Dgm(𝐅2))=infγsupμDgm(𝐅1)μγ(μ),subscriptdBDgmsuperscript𝐅1Dgmsuperscript𝐅2subscriptinfimum𝛾subscriptsupremum𝜇Dgmsuperscript𝐅1subscriptnorm𝜇𝛾𝜇\text{d}_{\text{B}}\left(\text{Dgm}(\mathbf{F}^{1}),\text{Dgm}(\mathbf{F}^{2})% \right)=\inf_{\gamma}\sup_{\mu\in\text{Dgm}(\mathbf{F}^{1})}||\mu-\gamma(\mu)|% |_{\infty},d start_POSTSUBSCRIPT B end_POSTSUBSCRIPT ( Dgm ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ) , Dgm ( bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) ) = roman_inf start_POSTSUBSCRIPT italic_γ end_POSTSUBSCRIPT roman_sup start_POSTSUBSCRIPT italic_μ ∈ Dgm ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ) end_POSTSUBSCRIPT | | italic_μ - italic_γ ( italic_μ ) | | start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT , (3)

where the infimum is taken over all bijections γ:Dgm(𝐅1)Dgm(𝐅2):𝛾absentDgmsuperscript𝐅1Dgmsuperscript𝐅2\gamma:\text{Dgm}(\mathbf{F}^{1})\xrightarrow{}\text{Dgm}(\mathbf{F}^{2})italic_γ : Dgm ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ) start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW Dgm ( bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ). For the metric space (M+1,2)superscript𝑀1subscriptdelimited-∥∥2(\mathbb{R}^{M+1},\lVert\cdot\rVert_{2})( blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT , ∥ ⋅ ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ), where 2subscriptdelimited-∥∥2\lVert\cdot\rVert_{2}∥ ⋅ ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT denotes the l2superscript𝑙2l^{2}italic_l start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT-norm, define the distance function as: d(F,𝐅1)=infF1𝐅1FF12,𝑑𝐹superscript𝐅1subscriptinfimumsuperscript𝐹1superscript𝐅1subscriptdelimited-∥∥𝐹superscript𝐹12d(F,\mathbf{F}^{1})=\inf_{F^{1}\in\mathbf{F}^{1}}\lVert F-F^{1}\rVert_{2},italic_d ( italic_F , bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ) = roman_inf start_POSTSUBSCRIPT italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ∈ bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ∥ italic_F - italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , FM+1.for-all𝐹superscript𝑀1\forall F\in\mathbb{R}^{M+1}.∀ italic_F ∈ blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT . Then the Hausdorff distance, dHsubscript𝑑𝐻d_{H}italic_d start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT, is given by

dH(𝐅1,𝐅2)=max{supF1𝐅1d(F1,𝐅2),supF2𝐅2d(F2,𝐅1)}.subscript𝑑𝐻superscript𝐅1superscript𝐅2subscriptsupremumsuperscript𝐹1superscript𝐅1𝑑superscript𝐹1superscript𝐅2subscriptsupremumsuperscript𝐹2superscript𝐅2𝑑superscript𝐹2superscript𝐅1d_{H}(\mathbf{F}^{1},\mathbf{F}^{2})=\max\left\{\sup_{F^{1}\in\mathbf{F}^{1}}d% (F^{1},\mathbf{F}^{2}),\sup_{F^{2}\in\mathbf{F}^{2}}d(F^{2},\mathbf{F}^{1})% \right\}.italic_d start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT , bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) = roman_max { roman_sup start_POSTSUBSCRIPT italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ∈ bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT italic_d ( italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT , bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) , roman_sup start_POSTSUBSCRIPT italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ∈ bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT italic_d ( italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ) } . (4)

The Hausdorff distance quantifies the level to which two metric spaces represent the same object. A modification of the Hausdorff distance that is blind to isometric transformations is the Gromov-Hausdorff distance. Let ν1:𝐅1M+1:subscript𝜈1superscript𝐅1superscript𝑀1\nu_{1}:\mathbf{F}^{1}\hookrightarrow\mathbb{R}^{M+1}italic_ν start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT : bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ↪ blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT and ν2:𝐅2M+1:subscript𝜈2superscript𝐅2superscript𝑀1\nu_{2}:\mathbf{F}^{2}\hookrightarrow\mathbb{R}^{M+1}italic_ν start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT : bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ↪ blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT be isometric (distance preserving) embeddings into the metric space (M+1,2)superscript𝑀1subscriptdelimited-∥∥2(\mathbb{R}^{M+1},\lVert\cdot\rVert_{2})( blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT , ∥ ⋅ ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ). Then the Gromov-Hausdorff distance between 𝐅1superscript𝐅1\mathbf{F}^{1}bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT and 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT is given by: dGH(𝐅1,𝐅2)=infν1,ν2dH(ν1(𝐅1),ν2(𝐅2))subscript𝑑𝐺𝐻superscript𝐅1superscript𝐅2subscriptinfimumsubscript𝜈1subscript𝜈2subscript𝑑𝐻subscript𝜈1superscript𝐅1subscript𝜈2superscript𝐅2d_{GH}(\mathbf{F}^{1},\mathbf{F}^{2})=\inf_{\nu_{1},\nu_{2}}d_{H}(\nu_{1}(% \mathbf{F}^{1}),\nu_{2}(\mathbf{F}^{2}))italic_d start_POSTSUBSCRIPT italic_G italic_H end_POSTSUBSCRIPT ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT , bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) = roman_inf start_POSTSUBSCRIPT italic_ν start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_ν start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_d start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT ( italic_ν start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ) , italic_ν start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) ). A fundamental result on persistence diagrams is that they are stable summaries in many settings (i.e., a small change in a point cloud result in a small change in the corresponding persistence diagram) (Chazal and Michel, 2021). The stability relation of the persistence diagrams is stated as

dB(Dgm(𝐅1),Dgm(𝐅2))2dGH(𝐅1,𝐅2)2dH(𝐅1,𝐅2).subscriptdBDgmsuperscript𝐅1Dgmsuperscript𝐅22subscriptdGHsuperscript𝐅1superscript𝐅22subscriptdHsuperscript𝐅1superscript𝐅2\text{d}_{\text{B}}\left(\text{Dgm}(\mathbf{F}^{1}),\text{Dgm}(\mathbf{F}^{2})% \right)\leq 2\text{d}_{\text{GH}}(\mathbf{F}^{1},\mathbf{F}^{2})\leq 2\text{d}% _{\text{H}}(\mathbf{F}^{1},\mathbf{F}^{2}).d start_POSTSUBSCRIPT B end_POSTSUBSCRIPT ( Dgm ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ) , Dgm ( bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) ) ≤ 2 d start_POSTSUBSCRIPT GH end_POSTSUBSCRIPT ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT , bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) ≤ 2 d start_POSTSUBSCRIPT H end_POSTSUBSCRIPT ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT , bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) . (5)

In what follows, these stability results are established for the proposed SSE method and a denoising procedure to reduce the noise present in the observed time series.

4.2 Stability of Denoising Procedure

Time-series data is typically observed with noise. The level of noise in the reconstructed space influences the presence and the persistence of the topological features. A Fourier denoising procedure is proposed to filter out noise in the observed time series.

Let 𝐱=[x(t1),x(t2),,x(tn)]\mathbf{x}=\begin{bmatrix}x(t_{1}),&x(t_{2}),&\cdots&,x(t_{n})\end{bmatrix}^{\top}bold_x = [ start_ARG start_ROW start_CELL italic_x ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) , end_CELL start_CELL italic_x ( italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) , end_CELL start_CELL ⋯ end_CELL start_CELL , italic_x ( italic_t start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) end_CELL end_ROW end_ARG ] start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT be an observed time series vector. The first step in the denoising procedure is to transform this observed signal to the frequency domain. The discrete Fourier transform (DFT) of x(tk)𝑥subscript𝑡𝑘x(t_{k})italic_x ( italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ), denoted as x~(tk)~𝑥subscript𝑡𝑘\tilde{x}(t_{k})over~ start_ARG italic_x end_ARG ( italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) is given by

x~(tk)=r=1nx(tr)ej2πwrfk=r=1nx(tr)ϕkr,1kn,formulae-sequence~𝑥subscript𝑡𝑘superscriptsubscript𝑟1𝑛𝑥subscript𝑡𝑟superscript𝑒𝑗2𝜋subscript𝑤𝑟subscript𝑓𝑘superscriptsubscript𝑟1𝑛𝑥subscript𝑡𝑟subscriptitalic-ϕ𝑘𝑟1𝑘𝑛\tilde{x}(t_{k})=\sum_{r=1}^{n}x(t_{r})e^{-j2\pi w_{r}f_{k}}=\sum_{r=1}^{n}x(t% _{r}){\phi}_{kr},\quad 1\leq k\leq n,over~ start_ARG italic_x end_ARG ( italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) = ∑ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_x ( italic_t start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ) italic_e start_POSTSUPERSCRIPT - italic_j 2 italic_π italic_w start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT = ∑ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_x ( italic_t start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ) italic_ϕ start_POSTSUBSCRIPT italic_k italic_r end_POSTSUBSCRIPT , 1 ≤ italic_k ≤ italic_n , (6)

where j𝑗jitalic_j is the imaginary unit (j2=1superscript𝑗21j^{2}=-1italic_j start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = - 1), ϕkr=ej2πwrfksubscriptitalic-ϕ𝑘𝑟superscript𝑒𝑗2𝜋subscript𝑤𝑟subscript𝑓𝑘\phi_{kr}=e^{-j2\pi w_{r}f_{k}}italic_ϕ start_POSTSUBSCRIPT italic_k italic_r end_POSTSUBSCRIPT = italic_e start_POSTSUPERSCRIPT - italic_j 2 italic_π italic_w start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT, 0wr10subscript𝑤𝑟10\leq w_{r}\leq 10 ≤ italic_w start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ≤ 1 are sample points, 0fkn0subscript𝑓𝑘𝑛0\leq f_{k}\leq n0 ≤ italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ≤ italic_n are frequencies, and x~(tk)~𝑥subscript𝑡𝑘\tilde{x}(t_{k})over~ start_ARG italic_x end_ARG ( italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) is the k𝑘kitalic_k-th sample of the power spectrum at fksubscript𝑓𝑘f_{k}italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT.

To filter out noise, the power spectral density is computed for each observation x~(tk)~𝑥subscript𝑡𝑘\tilde{x}(t_{k})over~ start_ARG italic_x end_ARG ( italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ). Then a threshold is chosen, and any x~(tk)~𝑥subscript𝑡𝑘\tilde{x}(t_{k})over~ start_ARG italic_x end_ARG ( italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) with power spectral density less than the threshold is set to zero. In selecting the threshold, the goal is to choose a value that does not smooth out the peaks in the true signal. The derivation that follows assumes the selected threshold preserves the peaks in the true signal. To simplify notations, the thresholded observations are also denoted as x~(tk)~𝑥subscript𝑡𝑘\tilde{x}(t_{k})over~ start_ARG italic_x end_ARG ( italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ). The thresholded x~(tk)~𝑥subscript𝑡𝑘\tilde{x}(t_{k})over~ start_ARG italic_x end_ARG ( italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) are transformed back to the time domain to get the noise-reduced signal. which typically involves multiplying x~(tk)~𝑥subscript𝑡𝑘\tilde{x}(t_{k})over~ start_ARG italic_x end_ARG ( italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) by the inverse of a Fourier transform matrix.

Let 𝐱~~𝐱\tilde{\mathbf{x}}over~ start_ARG bold_x end_ARG be the DFT of the time series vector with corresponding Fourier basis ϕksubscriptbold-italic-ϕ𝑘\boldsymbol{\phi}_{k}bold_italic_ϕ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT, where

𝐱~=[x~(t1),x~(t2),,x~(tn)],ϕk=[ϕk1,ϕk2,,ϕkn],k=1,,n.formulae-sequence~𝐱superscriptmatrix~𝑥subscript𝑡1~𝑥subscript𝑡2~𝑥subscript𝑡𝑛topformulae-sequencesubscriptbold-italic-ϕ𝑘superscriptmatrixsubscriptitalic-ϕ𝑘1subscriptitalic-ϕ𝑘2subscriptitalic-ϕ𝑘𝑛top𝑘1𝑛\tilde{\mathbf{x}}=\begin{bmatrix}\tilde{x}(t_{1}),&\tilde{x}(t_{2}),&\cdots,&% \tilde{x}(t_{n})\end{bmatrix}^{\top},\boldsymbol{\phi}_{k}=\begin{bmatrix}\phi% _{k1},&\phi_{k2},&\cdots,&\phi_{kn}\end{bmatrix}^{\top},k=1,\ldots,n.over~ start_ARG bold_x end_ARG = [ start_ARG start_ROW start_CELL over~ start_ARG italic_x end_ARG ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) , end_CELL start_CELL over~ start_ARG italic_x end_ARG ( italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) , end_CELL start_CELL ⋯ , end_CELL start_CELL over~ start_ARG italic_x end_ARG ( italic_t start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) end_CELL end_ROW end_ARG ] start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT , bold_italic_ϕ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT = [ start_ARG start_ROW start_CELL italic_ϕ start_POSTSUBSCRIPT italic_k 1 end_POSTSUBSCRIPT , end_CELL start_CELL italic_ϕ start_POSTSUBSCRIPT italic_k 2 end_POSTSUBSCRIPT , end_CELL start_CELL ⋯ , end_CELL start_CELL italic_ϕ start_POSTSUBSCRIPT italic_k italic_n end_POSTSUBSCRIPT end_CELL end_ROW end_ARG ] start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT , italic_k = 1 , … , italic_n . (7)

The forward transform in Equation (6) can be vectorized:

𝐱~=𝚽𝐱,𝚽=[ϕ1ϕ2ϕn].formulae-sequence~𝐱𝚽𝐱𝚽superscriptmatrixsubscriptbold-italic-ϕ1subscriptbold-italic-ϕ2subscriptbold-italic-ϕ𝑛top\mathbf{\tilde{x}}=\boldsymbol{\Phi}\mathbf{x},\quad\boldsymbol{\Phi}=\begin{% bmatrix}\boldsymbol{\phi}_{1}&\boldsymbol{\phi}_{2}&\cdots&\boldsymbol{\phi}_{% n}\end{bmatrix}^{\top}.over~ start_ARG bold_x end_ARG = bold_Φ bold_x , bold_Φ = [ start_ARG start_ROW start_CELL bold_italic_ϕ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_CELL start_CELL bold_italic_ϕ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_CELL start_CELL ⋯ end_CELL start_CELL bold_italic_ϕ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_CELL end_ROW end_ARG ] start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT . (8)

The backward transform can then be determined by inverting the matrix 𝚽𝚽\boldsymbol{\Phi}bold_Φ. However, due to the non-uniformity in the spacing of the time series 𝐱𝐱\mathbf{x}bold_x, the columns of 𝚽𝚽\boldsymbol{\Phi}bold_Φ are not orthogonal, and it is not directly invertible, so the pseudo-inverse is used instead. The backward transform then has the form: 𝐱=1n(𝚽H𝚽)𝚽H𝐱~𝐱1𝑛superscriptsuperscript𝚽𝐻𝚽superscript𝚽𝐻~𝐱\mathbf{x}=\frac{1}{n}(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})^{\dagger}% \boldsymbol{\Phi}^{H}\tilde{\mathbf{x}}bold_x = divide start_ARG 1 end_ARG start_ARG italic_n end_ARG ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT over~ start_ARG bold_x end_ARG, where 𝐀Hsuperscript𝐀𝐻\mathbf{A}^{H}bold_A start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT and 𝐀superscript𝐀\mathbf{A}^{\dagger}bold_A start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT denote the complex conjugate transpose and the Moore-Penrose inverse of the matrix 𝐀𝐀\mathbf{A}bold_A, respectively. The matrix (𝚽H𝚽)𝚽Hsuperscriptsuperscript𝚽𝐻𝚽superscript𝚽𝐻(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})^{\dagger}\boldsymbol{\Phi}^{H}( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT projects the frequency vector 𝐱~~𝐱\mathbf{\tilde{x}}over~ start_ARG bold_x end_ARG onto the column space of 𝚽𝚽\boldsymbol{\Phi}bold_Φ. Let Π𝚽(𝐱)subscriptΠ𝚽𝐱\Pi_{\boldsymbol{\Phi}}(\mathbf{x})roman_Π start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x ) denote this projection operation. The modulus of off-diagonal elements of 𝚽H𝚽superscript𝚽𝐻𝚽\boldsymbol{\Phi}^{H}\boldsymbol{\Phi}bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ is bounded as: |k=1nej2π(wl1wl2)fk|k=1n|ej2π(wl1wl2)fk|=nsuperscriptsubscript𝑘1𝑛superscript𝑒𝑗2𝜋subscript𝑤subscript𝑙1subscript𝑤subscript𝑙2subscript𝑓𝑘superscriptsubscript𝑘1𝑛superscript𝑒𝑗2𝜋subscript𝑤subscript𝑙1subscript𝑤subscript𝑙2subscript𝑓𝑘𝑛\left|\sum_{k=1}^{n}e^{j2\pi(w_{l_{1}}-w_{l_{2}})f_{k}}\right|\leq\sum_{k=1}^{% n}\left|e^{j2\pi(w_{l_{1}}-w_{l_{2}})f_{k}}\right|=n| ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT italic_j 2 italic_π ( italic_w start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT - italic_w start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT | ≤ ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT | italic_e start_POSTSUPERSCRIPT italic_j 2 italic_π ( italic_w start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT - italic_w start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT | = italic_n, where the equality follows from the definition of the complex modulus |z|=zz¯𝑧𝑧¯𝑧|{z}|=\sqrt{{z}\bar{z}}| italic_z | = square-root start_ARG italic_z over¯ start_ARG italic_z end_ARG end_ARG, with z¯¯𝑧\bar{z}over¯ start_ARG italic_z end_ARG as the conjugate of the complex value z𝑧zitalic_z. The matrix 𝚽𝚽H𝚽superscript𝚽𝐻\boldsymbol{\Phi}\boldsymbol{\Phi}^{H}bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT has the structure

(𝚽𝚽H)l1,l2=k=1ne(1)δj2π(fl2fl1)wk,δ=𝟙(l1l2),formulae-sequencesubscript𝚽superscript𝚽𝐻subscript𝑙1subscript𝑙2superscriptsubscript𝑘1𝑛superscript𝑒superscript1𝛿𝑗2𝜋subscript𝑓subscript𝑙2subscript𝑓subscript𝑙1subscript𝑤𝑘𝛿1subscript𝑙1subscript𝑙2(\boldsymbol{\Phi}\boldsymbol{\Phi}^{H})_{l_{1},l_{2}}=\sum_{k=1}^{n}e^{(-1)^{% \delta}j2\pi(f_{l_{2}}-f_{l_{1}})w_{k}},\quad\delta=\mathbbm{1}(l_{1}\leq l_{2% }),( bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ) start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT ( - 1 ) start_POSTSUPERSCRIPT italic_δ end_POSTSUPERSCRIPT italic_j 2 italic_π ( italic_f start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT - italic_f start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) italic_w start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT , italic_δ = blackboard_1 ( italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ≤ italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) , (9)

for indicator function 𝟙(l1l2)1subscript𝑙1subscript𝑙2\mathbbm{1}(l_{1}\leq l_{2})blackboard_1 ( italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ≤ italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) , and (𝚽𝚽)l1,l2Hsubscriptsuperscript𝚽𝚽𝐻subscript𝑙1subscript𝑙2(\boldsymbol{\Phi}\boldsymbol{\Phi})^{H}_{l_{1},l_{2}}( bold_Φ bold_Φ ) start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT denotes the value in the l1subscript𝑙1l_{1}italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT-th row and l2subscript𝑙2l_{2}italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT-th column of 𝚽𝚽H𝚽superscript𝚽𝐻\boldsymbol{\Phi}\boldsymbol{\Phi}^{H}bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT. Then 𝚽𝚽H𝚽superscript𝚽𝐻\boldsymbol{\Phi}\boldsymbol{\Phi}^{H}bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT is Toeplitz when the frequency components are uniformly-spaced such that fk=ksubscript𝑓𝑘𝑘f_{k}=kitalic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT = italic_k, and 𝚽𝚽H𝚽superscript𝚽𝐻\boldsymbol{\Phi}\boldsymbol{\Phi}^{H}bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT is fully specified by its first row elements (HuoLiu and YuanTang, 1998).

Using these results, the following proposition asserts that the DFT preserves topological features and is stable with respect to the bottleneck distance. In particular, the bottleneck distance between the persistence diagrams of the embeddings of the noise-free and smoothed (i.e., noise-reduced) time series is bounded above by the embeddings of the observed noisy and noise-free time series.

Proposition 4.1.

Given 𝐱nsuperscript𝐱superscript𝑛\mathbf{x}^{*}\in\mathbb{R}^{n}bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT as a possibly irregularly-spaced scalar time series with additive noise of the form 𝐱=𝐱+εsuperscript𝐱𝐱𝜀\mathbf{x}^{*}=\mathbf{x}+\mathbf{\varepsilon}bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = bold_x + italic_ε, where 𝐱𝐱\mathbf{x}bold_x is a noise-free scalar time series, and ε𝜀\varepsilonitalic_ε is a zero-mean noise term, then let 𝐱superscript𝐱\mathbf{x}^{\prime}bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT be the time series vector after applying the proposed Fourier denoising to 𝐱superscript𝐱\mathbf{x}^{*}bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT, and 𝐅𝐅\mathbf{F}bold_F, 𝐅superscript𝐅\mathbf{F}^{*}bold_F start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT, and 𝐅superscript𝐅\mathbf{F}^{\prime}bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT be the embedding matrices associated with 𝐱𝐱\mathbf{x}bold_x, 𝐱superscript𝐱\mathbf{x}^{*}bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT, and 𝐱superscript𝐱\mathbf{x}^{\prime}bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, respectively. Also, let Dgm(𝐅)Dgm𝐅\text{Dgm}(\mathbf{F})Dgm ( bold_F ) and Dgm(𝐅)Dgmsuperscript𝐅\text{Dgm}({\mathbf{F}^{\prime}})Dgm ( bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) denote the persistence diagrams associated with the Vietoris-Rips complex constructed from 𝐅𝐅\mathbf{F}bold_F and 𝐅superscript𝐅\mathbf{F}^{\prime}bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, respectively. Then the bottleneck distance between these two persistence diagrams is bounded as

dB(Dgm(𝐅),Dgm(𝐅))4n2γ(supi,pF(𝐬(tp,i))F(𝐬(tp,i))2),subscriptdBDgm𝐅Dgmsuperscript𝐅4𝑛2𝛾subscriptsupremum𝑖𝑝subscriptnormsuperscript𝐹𝐬subscript𝑡𝑝𝑖𝐹𝐬subscript𝑡𝑝𝑖2\text{d}_{\text{B}}(\text{Dgm}({\mathbf{F}}),\text{Dgm}({\mathbf{F}^{\prime}})% )\leq\frac{4n-2}{\gamma}\left(\sup\limits_{i,p}||F^{*}(\mathbf{s}(t_{p,i}))-F(% \mathbf{s}(t_{p,i}))||_{2}\right),d start_POSTSUBSCRIPT B end_POSTSUBSCRIPT ( Dgm ( bold_F ) , Dgm ( bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ) ≤ divide start_ARG 4 italic_n - 2 end_ARG start_ARG italic_γ end_ARG ( roman_sup start_POSTSUBSCRIPT italic_i , italic_p end_POSTSUBSCRIPT | | italic_F start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) | | start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) , (10)

where 0<γ1,1inpMτp,1pP.formulae-sequence0𝛾11𝑖subscript𝑛𝑝𝑀subscript𝜏𝑝1𝑝𝑃0<\gamma\leq 1,\quad 1\leq i\leq n_{p}-M\tau_{p},\quad 1\leq p\leq P.0 < italic_γ ≤ 1 , 1 ≤ italic_i ≤ italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_M italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT , 1 ≤ italic_p ≤ italic_P .

Remark 2.

When the samples are uniformly-spaced in both the time and frequency domain, the matrix 𝚽𝚽\boldsymbol{\Phi}bold_Φ is Hermitian with orthogonal columns, hence the factor (2n1)/γ2𝑛1𝛾({2n-1})/{\gamma}( 2 italic_n - 1 ) / italic_γ (not (4n2)/γ(\text{not }({4n-2})/{\gamma}( not ( 4 italic_n - 2 ) / italic_γ, as the scaling by 2222 is still required)))) is not required for the bound to hold. The factor (2n1)/γ2𝑛1𝛾({2n-1})/{\gamma}( 2 italic_n - 1 ) / italic_γ makes the bound conservative. However, if the denoising is well constructed, this has minimal impact on the scale of the bound. Numerical experiments in the supplementary materials Section S2.1 suggest this is the case for the settings considered.

4.3 Stability of the Subsequence Embedding Method

In this section, an error bound and stability results for the SSE approximation of the TDE are established. To simplify the notation, the embedding matrices are represented as sets where the elements of the set are the row vectors of the corresponding TDE or SSE. Also, for an embedding from a single uniformly-spaced time series, the step-size is assumed to be τ𝜏\tauitalic_τ. When constructing from a set of P𝑃Pitalic_P subsequences, a step-size of τpsubscript𝜏𝑝\tau_{p}italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT is assumed for the p𝑝pitalic_p-th subsequence, where 1pP1𝑝𝑃1\leq p\leq P1 ≤ italic_p ≤ italic_P.

Because the SSE can have fewer elements than the TDE, the following lemma addresses how to expand the SSE without affecting its topology by repeating already existing points in the SSE, so distances can be computed between the TDE and the expanded SSE.

Lemma 4.2 (Topology-preserving transform).

Let 𝐅1superscript𝐅1\mathbf{F}^{1}bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT be an embedding matrix from a uniformly-spaced time sequence of length n𝑛nitalic_n with the form:

𝐅1={F1(𝐬(t1)),F1(𝐬(t2)),,F1(𝐬(tnMτ))}M+1.superscript𝐅1superscript𝐹1𝐬subscript𝑡1superscript𝐹1𝐬subscript𝑡2superscript𝐹1𝐬subscript𝑡𝑛𝑀𝜏superscript𝑀1\mathbf{F}^{1}=\left\{F^{1}(\mathbf{s}(t_{1})),F^{1}(\mathbf{s}(t_{2})),\cdots% ,F^{1}(\mathbf{s}(t_{n-M\tau}))\right\}\subset\mathbb{R}^{M+1}.bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT = { italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ) , italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) ) , ⋯ , italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_n - italic_M italic_τ end_POSTSUBSCRIPT ) ) } ⊂ blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT . (11)

Also, let 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT be an embedding matrix from a set of P𝑃Pitalic_P subsequences with the form:

𝐅2={F2(𝐬(t1,1)),,F2(𝐬(t1,n1Mτ1)),,F2(𝐬(tP,1),,F2(𝐬(tP,nPMτP))}M+1,\mathbf{F}^{2}=\left\{F^{2}(\mathbf{s}(t_{1,1})),\cdots,F^{2}(\mathbf{s}(t_{1,% n_{1}-M\tau_{1}})),\cdots,F^{2}(\mathbf{s}(t_{P,1}),\cdots,F^{2}(\mathbf{s}(t_% {P,n_{P}-M\tau_{P}}))\right\}\subset\mathbb{R}^{M+1},bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = { italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT 1 , 1 end_POSTSUBSCRIPT ) ) , ⋯ , italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT 1 , italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - italic_M italic_τ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ) , ⋯ , italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_P , 1 end_POSTSUBSCRIPT ) , ⋯ , italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_P , italic_n start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT - italic_M italic_τ start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ) } ⊂ blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT , (12)

where p=1P(npMτp)nMτsuperscriptsubscript𝑝1𝑃subscript𝑛𝑝𝑀subscript𝜏𝑝𝑛𝑀𝜏\sum_{p=1}^{P}(n_{p}-M\tau_{p})\leq n-M\tau∑ start_POSTSUBSCRIPT italic_p = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT ( italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_M italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) ≤ italic_n - italic_M italic_τ. Consider the set extension 𝐅^2={𝐅2,𝐅k2}superscript^𝐅2superscript𝐅2subscriptsuperscript𝐅2𝑘\widehat{\mathbf{F}}^{2}=\left\{\mathbf{F}^{2},{\mathbf{F}}^{2}_{k}\right\}over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = { bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT }, where 𝐅k2subscriptsuperscript𝐅2𝑘{\mathbf{F}}^{2}_{k}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT is a subset of k𝑘kitalic_k elements from 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. Then the persistence diagrams associated with 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT and 𝐅^2superscript^𝐅2\widehat{\mathbf{F}}^{2}over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT are identical, that is, Dgm(𝐅2)Dgm(𝐅^2)Dgmsuperscript𝐅2Dgmsuperscript^𝐅2\text{Dgm}(\mathbf{F}^{2})\equiv\text{Dgm}(\widehat{\mathbf{F}}^{2})Dgm ( bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) ≡ Dgm ( over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ), and the three embedded spaces are related through the bottleneck distance as follows: dB(Dgm(𝐅1),Dgm(𝐅2))=dB(Dgm(𝐅1),Dgm(𝐅^2)).subscriptdBDgmsuperscript𝐅1Dgmsuperscript𝐅2subscriptdBDgmsuperscript𝐅1Dgmsuperscript^𝐅2\text{d}_{\text{B}}(\text{Dgm}(\mathbf{F}^{1}),\text{Dgm}(\mathbf{F}^{2}))=% \text{d}_{\text{B}}(\text{Dgm}(\mathbf{F}^{1}),\text{Dgm}(\widehat{\mathbf{F}}% ^{2})).d start_POSTSUBSCRIPT B end_POSTSUBSCRIPT ( Dgm ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ) , Dgm ( bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) ) = d start_POSTSUBSCRIPT B end_POSTSUBSCRIPT ( Dgm ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ) , Dgm ( over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) ) .

Lemma 4.2 asserts that a particular set of points can be added to an embedding matrix without changing the SSE’s persistence diagram. This is used to establish a bound on the SSE as an approximation to the TDE in the following proposition. In Lemma 4.2, when k=(nMτ)p=1P(npMτp)𝑘𝑛𝑀𝜏superscriptsubscript𝑝1𝑃subscript𝑛𝑝𝑀subscript𝜏𝑝k=(n-M\tau)-\sum_{p=1}^{P}(n_{p}-M\tau_{p})italic_k = ( italic_n - italic_M italic_τ ) - ∑ start_POSTSUBSCRIPT italic_p = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT ( italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_M italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ), the row dimension of 𝐅^2superscript^𝐅2\widehat{\mathbf{F}}^{2}over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT is the same as that of 𝐅1superscript𝐅1\mathbf{F}^{1}bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT. In such instances, when the interest is in a row-wise comparison of 𝐅^2superscript^𝐅2\widehat{\mathbf{F}}^{2}over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT and 𝐅1superscript𝐅1\mathbf{F}^{1}bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT the subsequence indexing in the time variable for any F^2(𝐬(tp,i))𝐅^2superscript^𝐹2𝐬subscript𝑡𝑝𝑖superscript^𝐅2\widehat{F}^{2}(\mathbf{s}(t_{p,i}))\in\widehat{\mathbf{F}}^{2}over^ start_ARG italic_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) ∈ over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT is ignored and a row is simply written as F^2(𝐬(ti))superscript^𝐹2𝐬subscript𝑡𝑖\widehat{F}^{2}(\mathbf{s}(t_{i}))over^ start_ARG italic_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ), where 1inMτ1𝑖𝑛𝑀𝜏1\leq i\leq n-M\tau1 ≤ italic_i ≤ italic_n - italic_M italic_τ.

Proposition 4.3.

Let 𝐱1superscript𝐱1\mathbf{x}^{1}bold_x start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT be a uniformly-spaced time series vector of length n𝑛nitalic_n with TDE matrix 𝐅1superscript𝐅1\mathbf{F}^{1}bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT. Let 𝐱2𝐱1superscript𝐱2superscript𝐱1\mathbf{x}^{2}\subset\mathbf{x}^{1}bold_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⊂ bold_x start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT be a time series vector where some of the elements are missing or unobserved. Denote the SSE matrix constructed from 𝐱2superscript𝐱2\mathbf{x}^{2}bold_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT as 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. Define the extension 𝐅^2={𝐅2,𝐅k2}superscript^𝐅2superscript𝐅2subscriptsuperscript𝐅2𝑘\widehat{\mathbf{F}}^{2}=\left\{\mathbf{F}^{2},{\mathbf{F}}^{2}_{k}\right\}over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = { bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT }, where 𝐅k2subscriptsuperscript𝐅2𝑘{\mathbf{F}}^{2}_{k}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT is a subset of k=(nMτ)p=1P(npMτp)𝑘𝑛𝑀𝜏superscriptsubscript𝑝1𝑃subscript𝑛𝑝𝑀subscript𝜏𝑝k=(n-M\tau)-\sum_{p=1}^{P}(n_{p}-M\tau_{p})italic_k = ( italic_n - italic_M italic_τ ) - ∑ start_POSTSUBSCRIPT italic_p = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT ( italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_M italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) elements from 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. Then the bottleneck distance between 𝐅1superscript𝐅1\mathbf{F}^{1}bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT and 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT is bounded as: dB(Dgm(𝐅1),Dgm(𝐅2))2sup1inMτF1(𝐬(ti))F^2(𝐬(ti))2.subscriptdBDgmsuperscript𝐅1Dgmsuperscript𝐅22subscriptsupremum1𝑖𝑛𝑀𝜏subscriptdelimited-∥∥superscript𝐹1𝐬subscript𝑡𝑖superscript^𝐹2𝐬subscript𝑡𝑖2\text{d}_{\text{B}}(\text{Dgm}(\mathbf{F}^{1}),\text{Dgm}(\mathbf{F}^{2}))\leq 2% \sup\limits_{1\leq i\leq n-M\tau}\lVert F^{1}(\mathbf{s}(t_{i}))-\widehat{F}^{% 2}(\mathbf{s}(t_{i}))\rVert_{2}.d start_POSTSUBSCRIPT B end_POSTSUBSCRIPT ( Dgm ( bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ) , Dgm ( bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) ) ≤ 2 roman_sup start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_n - italic_M italic_τ end_POSTSUBSCRIPT ∥ italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - over^ start_ARG italic_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT .

The choice of the k𝑘kitalic_k subsets of embedding vectors 𝐅k2subscriptsuperscript𝐅2𝑘\mathbf{F}^{2}_{k}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT in Proposition 4.3 is arbitrary as any subset satisfies the bound. However, since they are chosen to match the subset {𝐅(𝐬tl):p=1P(npMτp)+1lnMτ}conditional-setsuperscript𝐅subscript𝐬subscript𝑡𝑙superscriptsubscript𝑝1𝑃subscript𝑛𝑝𝑀subscript𝜏𝑝1𝑙𝑛𝑀𝜏\{\mathbf{F}^{\prime}(\mathbf{s}_{t_{l}}):\sum_{p=1}^{P}(n_{p}-M\tau_{p})+1% \leq l\leq n-M\tau\}{ bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( bold_s start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) : ∑ start_POSTSUBSCRIPT italic_p = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT ( italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_M italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) + 1 ≤ italic_l ≤ italic_n - italic_M italic_τ } of 𝐅1superscript𝐅1\mathbf{F}^{1}bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT, the bound can be improved. The minimum bound can be attained by choosing a subset in 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT that has the smallest Euclidean distance to the subset {𝐅(𝐬tl):p=1P(npMτp)+1lnMτ}conditional-setsuperscript𝐅subscript𝐬subscript𝑡𝑙superscriptsubscript𝑝1𝑃subscript𝑛𝑝𝑀subscript𝜏𝑝1𝑙𝑛𝑀𝜏\{\mathbf{F}^{\prime}(\mathbf{s}_{t_{l}}):\sum_{p=1}^{P}(n_{p}-M\tau_{p})+1% \leq l\leq n-M\tau\}{ bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( bold_s start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) : ∑ start_POSTSUBSCRIPT italic_p = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT ( italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_M italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) + 1 ≤ italic_l ≤ italic_n - italic_M italic_τ }. This is summarized as a corollary below.

Corollary 4.4.

Let 𝐱1superscript𝐱1\mathbf{x}^{1}bold_x start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT be a uniformly spaced time series vector of length n𝑛nitalic_n with TDE matrix 𝐅1superscript𝐅1\mathbf{F}^{1}bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT. Let 𝐱2𝐱1superscript𝐱2superscript𝐱1\mathbf{x}^{2}\subset\mathbf{x}^{1}bold_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⊂ bold_x start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT be a time series vector where some of the elements are unobserved, and 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT its SSE matrix. Define the extension 𝐅^2={𝐅2,𝐅k2}superscript^𝐅2superscript𝐅2subscriptsuperscript𝐅2𝑘\widehat{\mathbf{F}}^{2}=\left\{\mathbf{F}^{2},{\mathbf{F}}^{2}_{k}\right\}over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = { bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT }, where 𝐅k2subscriptsuperscript𝐅2𝑘{\mathbf{F}}^{2}_{k}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT is a random subset of k=(nMτ)p=1P(npMτp)𝑘𝑛𝑀𝜏superscriptsubscript𝑝1𝑃subscript𝑛𝑝𝑀subscript𝜏𝑝k=(n-M\tau)-\sum_{p=1}^{P}(n_{p}-M\tau_{p})italic_k = ( italic_n - italic_M italic_τ ) - ∑ start_POSTSUBSCRIPT italic_p = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_P end_POSTSUPERSCRIPT ( italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_M italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) elements from 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. For some F1𝐅1superscript𝐹1superscript𝐅1F^{1}\in\mathbf{F}^{1}italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ∈ bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT, define the set 𝐅k,min2subscriptsuperscript𝐅2𝑘min{\mathbf{F}}^{2}_{k,\text{min}}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k , min end_POSTSUBSCRIPT as follows:

𝐅k,min2={Fmin2𝐅2:Fmin2F12F2F12,F2𝐅2, s.t. Fmin2F2}.subscriptsuperscript𝐅2𝑘minconditional-setsubscriptsuperscript𝐹2minsuperscript𝐅2formulae-sequencesubscriptdelimited-∥∥subscriptsuperscript𝐹2minsuperscript𝐹12subscriptdelimited-∥∥superscript𝐹2superscript𝐹12formulae-sequencefor-allsuperscript𝐹2superscript𝐅2 s.t. subscriptsuperscript𝐹2minsuperscript𝐹2{\mathbf{F}}^{2}_{k,\text{min}}=\left\{F^{2}_{\text{min}}\in\mathbf{F}^{2}:% \lVert F^{2}_{\text{min}}-F^{1}\rVert_{2}\leq\lVert F^{2}-F^{1}\rVert_{2},% \forall F^{2}\in\mathbf{F}^{2},\text{ s.t. }{F}^{2}_{\text{min}}\neq{F}^{2}% \right\}.bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k , min end_POSTSUBSCRIPT = { italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT ∈ bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT : ∥ italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT - italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ ∥ italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ∀ italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ∈ bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , s.t. italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT ≠ italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT } . (13)

That is, 𝐅k,min2subscriptsuperscript𝐅2𝑘min{\mathbf{F}}^{2}_{k,\text{min}}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k , min end_POSTSUBSCRIPT is a subset of k𝑘kitalic_k embedding vectors in 𝐅2superscript𝐅2\mathbf{F}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT with minimum distance to some points in 𝐅1superscript𝐅1\mathbf{F}^{1}bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT. Let 𝐅^min2={𝐅2,𝐅k,min2}subscriptsuperscript^𝐅2minsuperscript𝐅2subscriptsuperscript𝐅2𝑘min\widehat{\mathbf{F}}^{2}_{\text{min}}=\left\{\mathbf{F}^{2},{\mathbf{F}}^{2}_{% k,\text{min}}\right\}over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT = { bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k , min end_POSTSUBSCRIPT }, then it follows that

sup1inF1(𝐬(ti))F^min2(𝐬(ti))2sup1inF1(𝐬(ti))F^2(𝐬(ti))2,subscriptsupremum1𝑖𝑛subscriptdelimited-∥∥superscript𝐹1𝐬subscript𝑡𝑖subscriptsuperscript^𝐹2min𝐬subscript𝑡𝑖2subscriptsupremum1𝑖𝑛subscriptdelimited-∥∥superscript𝐹1𝐬subscript𝑡𝑖superscript^𝐹2𝐬subscript𝑡𝑖2\sup\limits_{1\leq i\leq n}\lVert F^{1}(\mathbf{s}(t_{i}))-\widehat{F}^{2}_{% \text{min}}(\mathbf{s}(t_{i}))\rVert_{2}\leq\sup\limits_{1\leq i\leq n}\lVert F% ^{1}(\mathbf{s}(t_{i}))-\widehat{F}^{2}(\mathbf{s}(t_{i}))\rVert_{2},roman_sup start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_n end_POSTSUBSCRIPT ∥ italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - over^ start_ARG italic_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ roman_sup start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_n end_POSTSUBSCRIPT ∥ italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - over^ start_ARG italic_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , (14)

where F1(𝐬(ti))𝐅1superscript𝐹1𝐬subscript𝑡𝑖superscript𝐅1F^{1}(\mathbf{s}(t_{i}))\in\mathbf{F}^{1}italic_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) ∈ bold_F start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT, F^2(𝐬(ti))𝐅^2superscript^𝐹2𝐬subscript𝑡𝑖superscript^𝐅2\widehat{F}^{2}(\mathbf{s}(t_{i}))\in\widehat{\mathbf{F}}^{2}over^ start_ARG italic_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) ∈ over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, and F^min2(𝐬(ti))𝐅^min2subscriptsuperscript^𝐹2min𝐬subscript𝑡𝑖subscriptsuperscript^𝐅2min\widehat{F}^{2}_{\text{min}}(\mathbf{s}(t_{i}))\in\widehat{\mathbf{F}}^{2}_{% \text{min}}over^ start_ARG italic_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) ∈ over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT.

An immediate consequence of Proposition 4.3 and Corollary 4.4 is that the SSE matrix approximates the TDE for varying level of regularity. In particular, for a time series 𝐱={x(t):t𝒯}𝐱conditional-set𝑥𝑡𝑡𝒯\mathbf{x}=\{x(t):t\in\mathcal{T}\}bold_x = { italic_x ( italic_t ) : italic_t ∈ caligraphic_T }, and 𝒯={t1,,tn}𝒯subscript𝑡1subscript𝑡𝑛\mathcal{T}=\{t_{1},\cdots,t_{n}\}\subset\mathbb{N}caligraphic_T = { italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , ⋯ , italic_t start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT } ⊂ blackboard_N, the sequence of embedding matrices for each r𝑟ritalic_r where 1rtnt11𝑟subscript𝑡𝑛subscript𝑡11\leq r\leq t_{n}-t_{1}1 ≤ italic_r ≤ italic_t start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT - italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT is finite. Hence for a fixed n𝑛nitalic_n, the limiting persistence diagram as r1absent𝑟1r\xrightarrow[]{}1italic_r start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW 1 is close to the TDE’s persistence diagram. If r=1𝑟1r=1italic_r = 1, the SSE is exactly the same as the TDE, and the persistence diagrams would be identical. This reinforces the fact that the proposed reconstruction preserves the topological structures more accurately as the level of irregularity in the observed time series decreases. A more formal treatment of these observations is presented next.

4.4 Convergence Results

Throughout this section, it is assumed that the number of missing values increases at a slower rate than the sample size of the time series. Specifically, the missing values grows at a rate of o(logm)𝑜𝑚o(\log m)italic_o ( roman_log italic_m ), for sample size m𝑚mitalic_m. This assumption and others are formalized as follows.

Let 𝐱1,𝐱2,,subscript𝐱1subscript𝐱2\mathbf{x}_{1},\mathbf{x}_{2},\cdots,bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , bold_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , be irregularly-spaced time series vectors where |𝐱i|<|𝐱j|subscript𝐱𝑖subscript𝐱𝑗|\mathbf{x}_{i}|<|\mathbf{x}_{j}|| bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | < | bold_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT |, i<j𝑖𝑗i<jitalic_i < italic_j. Denote by 𝐅msubscript𝐅𝑚\mathbf{F}_{m}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT, the SSE associated with 𝐱{𝐱1,𝐱2,,}\mathbf{x}\in\{\mathbf{x}_{1},\mathbf{x}_{2},\cdots,\}bold_x ∈ { bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , bold_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , }, i.e.,

𝐅m={F(𝐬(t1)),F(𝐬(t2)),,F(𝐬(tm))}.subscript𝐅𝑚𝐹𝐬subscript𝑡1𝐹𝐬subscript𝑡2𝐹𝐬subscript𝑡𝑚\mathbf{F}_{m}=\left\{F(\mathbf{s}(t_{1})),F(\mathbf{s}(t_{2})),\cdots,F(% \mathbf{s}(t_{m}))\right\}.bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = { italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ) , italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) ) , ⋯ , italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) ) } . (15)

Note the correspondence between the subscript m𝑚mitalic_m and the number of points in 𝐅msubscript𝐅𝑚\mathbf{F}_{m}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT. 𝐅msubscript𝐅𝑚\mathbf{F}_{m}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT depends on which time series is selected; however, indexing over this selection is not needed for the following results. Recall that 𝐅msubscript𝐅𝑚\mathbf{F}_{m}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT is a compact subset of (M+1,2)superscript𝑀1subscriptdelimited-∥∥2(\mathbb{R}^{M+1},\lVert\cdot\rVert_{2})( blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT , ∥ ⋅ ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ). Let the space (M+1,2)superscript𝑀1subscriptdelimited-∥∥2(\mathbb{R}^{M+1},\lVert\cdot\rVert_{2})( blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT , ∥ ⋅ ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) be endowed with the unknown probability measure ϑitalic-ϑ\varthetaitalic_ϑ such that the set of m𝑚mitalic_m time points are randomly sampled according to ϑitalic-ϑ\varthetaitalic_ϑ. Let ϑitalic-ϑ\varthetaitalic_ϑ be supported on an embedding 𝐅ϑsubscript𝐅italic-ϑ\mathbf{F}_{\vartheta}bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT, and let φ𝜑\varphiitalic_φ be the associated density function. Consider the following set of assumptions.

  • A1.

    The sample size increases such that 𝐱i𝐱jsubscript𝐱𝑖subscript𝐱𝑗\mathbf{x}_{i}\subset\mathbf{x}_{j}bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⊂ bold_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT whenever i<j𝑖𝑗i<jitalic_i < italic_j.

  • A2.

    Let εm(r)subscript𝜀𝑚𝑟\varepsilon_{m}(r)italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_r ) be a function of m𝑚mitalic_m and the regularity score r𝑟ritalic_r such that εm(1)0absentsubscript𝜀𝑚10\varepsilon_{m}(1)\xrightarrow{}0italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( 1 ) start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW 0 as mabsent𝑚m\xrightarrow{}\inftyitalic_m start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW ∞.

  • A3.

    For any point Fϑ𝐅ϑsubscript𝐹italic-ϑsubscript𝐅italic-ϑF_{\vartheta}\in\mathbf{F}_{\vartheta}italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT, ϑ(B(Fϑ,δ))min(κδM+1,1)italic-ϑ𝐵subscript𝐹italic-ϑ𝛿𝜅superscript𝛿𝑀11\vartheta(B(F_{\vartheta},\delta))\geq\min(\kappa\delta^{M+1},1)italic_ϑ ( italic_B ( italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT , italic_δ ) ) ≥ roman_min ( italic_κ italic_δ start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT , 1 ), where B(Fϑ,δ)𝐵subscript𝐹italic-ϑ𝛿B(F_{\vartheta},\delta)italic_B ( italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT , italic_δ ) is a closed ball of radius δ>0𝛿0\delta>0italic_δ > 0 around Fϑsubscript𝐹italic-ϑF_{\vartheta}italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT, with constant κ>0𝜅0\kappa>0italic_κ > 0.

  • A4.

    It is possible to create joint distributions based on the marginals of 𝐅msubscript𝐅𝑚\mathbf{F}_{m}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT that satisfy

    sup𝐅m|φ(F(𝐬(t1)),,F(𝐬(tm)))φ(F(𝐬(t1)))××φ(F(𝐬(tm)))φ(F(𝐬(t1)))××φ(F(𝐬(tm)))|ηm,subscriptsupremumsubscript𝐅𝑚𝜑𝐹𝐬subscript𝑡1𝐹𝐬subscript𝑡𝑚𝜑𝐹𝐬subscript𝑡1𝜑𝐹𝐬subscript𝑡𝑚𝜑𝐹𝐬subscript𝑡1𝜑𝐹𝐬subscript𝑡𝑚subscript𝜂𝑚\sup_{\mathbf{F}_{m}}\left|\frac{\varphi\left(F(\mathbf{s}(t_{1})),\cdots,F(% \mathbf{s}(t_{m}))\right)-\varphi\left(F(\mathbf{s}(t_{1}))\right)\times\cdots% \times\varphi\left(F(\mathbf{s}(t_{m}))\right)}{\varphi\left(F(\mathbf{s}(t_{1% }))\right)\times\cdots\times\varphi\left(F(\mathbf{s}(t_{m}))\right)}\right|% \leq\eta_{m},roman_sup start_POSTSUBSCRIPT bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_POSTSUBSCRIPT | divide start_ARG italic_φ ( italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ) , ⋯ , italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) ) ) - italic_φ ( italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ) ) × ⋯ × italic_φ ( italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) ) ) end_ARG start_ARG italic_φ ( italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ) ) × ⋯ × italic_φ ( italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) ) ) end_ARG | ≤ italic_η start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ,

    where ηmsubscript𝜂𝑚\eta_{m}italic_η start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT is such that m=1ηmmβlog(m)<superscriptsubscript𝑚1subscript𝜂𝑚superscript𝑚𝛽𝑚\sum_{m=1}^{\infty}\frac{\eta_{m}}{m^{\beta}\log(m)}<\infty∑ start_POSTSUBSCRIPT italic_m = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT divide start_ARG italic_η start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_ARG start_ARG italic_m start_POSTSUPERSCRIPT italic_β end_POSTSUPERSCRIPT roman_log ( italic_m ) end_ARG < ∞ for any β>1𝛽1\beta>1italic_β > 1.

Assumption A4 is to address the possible lack of independence of the vectors in 𝐅msubscript𝐅𝑚\mathbf{F}_{m}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT. Under this assumption, the dependence can be controlled and the vectors in 𝐅msubscript𝐅𝑚\mathbf{F}_{m}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT are regarded as the so-called ηmsubscript𝜂𝑚\eta_{m}italic_η start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT-almost independent samples, which allows for 𝐅msubscript𝐅𝑚\mathbf{F}_{m}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT to converge in Hausdorff distance to 𝐅ϑsubscript𝐅italic-ϑ\mathbf{F}_{\vartheta}bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT (Aaron et al., 2017; Picado and Oliveira, 2020). The SSE matrix 𝐅msubscript𝐅𝑚\mathbf{F}_{m}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT can be regarded as an estimator of 𝐅ϑsubscript𝐅italic-ϑ\mathbf{F}_{\vartheta}bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT and convergence results can be established in the context of assumptions A1-A4. These results are analogous to convergence results established on support estimation of d𝑑ditalic_d-dimensional sets (Cuevas and Rodríguez-Casal, 2004), its generalization to metric spaces, and on the space of persistence diagrams (Mileyko et al., 2011; Chazal et al., 2014). The following result gives the rate of convergence in estimating 𝐅ϑsubscript𝐅italic-ϑ\mathbf{F}_{\vartheta}bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT.

Theorem 4.5.

Let 𝐱1,𝐱2,,subscript𝐱1subscript𝐱2\mathbf{x}_{1},\mathbf{x}_{2},\cdots,bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , bold_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , be a sequence of irregularly-spaced time series vectors satisfying assumption A1, and 𝐅m={F(𝐬(t1)),F(𝐬(t2)),,F(𝐬(tm))}M+1subscript𝐅𝑚𝐹𝐬subscript𝑡1𝐹𝐬subscript𝑡2𝐹𝐬subscript𝑡𝑚superscript𝑀1\mathbf{F}_{m}=\left\{F(\mathbf{s}(t_{1})),F(\mathbf{s}(t_{2})),\cdots,F(% \mathbf{s}(t_{m}))\right\}\subset\mathbb{R}^{M+1}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = { italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ) , italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) ) , ⋯ , italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) ) } ⊂ blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT be the SSE associated with some 𝐱{𝐱1,𝐱2,,}\mathbf{x}\in\{\mathbf{x}_{1},\mathbf{x}_{2},\cdots,\}bold_x ∈ { bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , bold_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , }, satisfying assumption A2. If the probability measure ϑitalic-ϑ\varthetaitalic_ϑ satisfies assumption A3 and A4, then with probability one,

limmsup(εm(r))1M+1dH(𝐅m,𝐅ϑ)K,subscriptabsent𝑚supremumsuperscriptsubscript𝜀𝑚𝑟1𝑀1subscriptd𝐻subscript𝐅𝑚subscript𝐅italic-ϑ𝐾\lim_{m\xrightarrow{}\infty}\sup\left(\varepsilon_{m}(r)\right)^{-\frac{1}{M+1% }}\text{d}_{H}(\mathbf{F}_{m},\mathbf{F}_{\vartheta})\leq K,roman_lim start_POSTSUBSCRIPT italic_m start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW ∞ end_POSTSUBSCRIPT roman_sup ( italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_r ) ) start_POSTSUPERSCRIPT - divide start_ARG 1 end_ARG start_ARG italic_M + 1 end_ARG end_POSTSUPERSCRIPT d start_POSTSUBSCRIPT italic_H end_POSTSUBSCRIPT ( bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT , bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ) ≤ italic_K , (16)

where K𝐾Kitalic_K is a constant depending on κ𝜅\kappaitalic_κ and the embedding dimension M+1𝑀1M+1italic_M + 1.

From the stability relation in Equation (5), the Hausdorff metric can be replaced with the Gromov-Hausdorff metric and the results still holds. This also gives a similar convergence results on the space of persistence diagrams with respect to the bottleneck distance and is summarized as:

Corollary 4.6.

Let 𝐱1,𝐱2,,subscript𝐱1subscript𝐱2\mathbf{x}_{1},\mathbf{x}_{2},\cdots,bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , bold_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , be a sequence of irregularly-spaced time series vectors satisfying assumption A1, and let 𝐅m={F(𝐬(t1)),F(𝐬(t2)),,F(𝐬(tm))}M+1subscript𝐅𝑚𝐹𝐬subscript𝑡1𝐹𝐬subscript𝑡2𝐹𝐬subscript𝑡𝑚superscript𝑀1\mathbf{F}_{m}=\left\{F(\mathbf{s}(t_{1})),F(\mathbf{s}(t_{2})),\cdots,F(% \mathbf{s}(t_{m}))\right\}\subset\mathbb{R}^{M+1}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = { italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ) , italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) ) , ⋯ , italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) ) } ⊂ blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT be the SSE associated with some 𝐱{𝐱1,𝐱2,,}\mathbf{x}\in\{\mathbf{x}_{1},\mathbf{x}_{2},\cdots,\}bold_x ∈ { bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , bold_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , }, satisfying assumption A2. If the probability measure ϑitalic-ϑ\varthetaitalic_ϑ satisfies assumption A3 and A4, then with probability one,

limmsup(εm(r))1M+1dB(Dgm(𝐅m),Dgm(𝐅ϑ))K,subscriptabsent𝑚supremumsuperscriptsubscript𝜀𝑚𝑟1𝑀1subscriptd𝐵Dgmsubscript𝐅𝑚Dgmsubscript𝐅italic-ϑ𝐾\lim_{m\xrightarrow{}\infty}\sup\left(\varepsilon_{m}(r)\right)^{-\frac{1}{M+1% }}\text{d}_{B}(\text{Dgm}(\mathbf{F}_{m}),\text{Dgm}(\mathbf{F}_{\vartheta}))% \leq K,roman_lim start_POSTSUBSCRIPT italic_m start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW ∞ end_POSTSUBSCRIPT roman_sup ( italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_r ) ) start_POSTSUPERSCRIPT - divide start_ARG 1 end_ARG start_ARG italic_M + 1 end_ARG end_POSTSUPERSCRIPT d start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT ( Dgm ( bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) , Dgm ( bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ) ) ≤ italic_K , (17)

where K𝐾Kitalic_K is a a constant depending on κ𝜅\kappaitalic_κ and the embedding dimension M+1𝑀1M+1italic_M + 1.

5 Numerical Studies

This section presents numerical studies that shows the performance of the proposed SSE method.

5.1 Reconstruction Accuracy

This simulation assesses the SSE method’s effectiveness in preserving the original state space geometry using the Hénon map as an illustrative example (Hénon, 2004). The Hénon map recursively maps a point (ht,gt)2subscript𝑡subscript𝑔𝑡superscript2(h_{t},g_{t})\in\mathbb{R}^{2}( italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ∈ blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT as follows: ht+1=1aht2+gtsubscript𝑡11𝑎superscriptsubscript𝑡2subscript𝑔𝑡h_{t+1}=1-ah_{t}^{2}+g_{t}italic_h start_POSTSUBSCRIPT italic_t + 1 end_POSTSUBSCRIPT = 1 - italic_a italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT, gt+1=bhtsubscript𝑔𝑡1𝑏subscript𝑡g_{t+1}=bh_{t}italic_g start_POSTSUBSCRIPT italic_t + 1 end_POSTSUBSCRIPT = italic_b italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT, with a=1.4𝑎1.4a=1.4italic_a = 1.4 and b=0.3𝑏0.3b=0.3italic_b = 0.3, their classical values. The map is initialized at (h0,g0)=(0,0)subscript0subscript𝑔000(h_{0},g_{0})=(0,0)( italic_h start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_g start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) = ( 0 , 0 ), and simulated with 500500500500 points with observations designated as missing with a given probability. Figure 4 shows the 2D Hénon map and the corresponding time series for one dimensions. The measurement function (see Section 3.2) extracts observations along the hhitalic_h-dimension, hence {ht}subscript𝑡\{h_{t}\}{ italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT } are used to reconstruct the space. Observations along the g𝑔gitalic_g-dimension could be used instead.

Refer to caption
(a) Hénon map
Refer to caption
(b) The hhitalic_h dimension
Figure 4: The Hénon map used in assessing reconstruction accuracy. (a) The Hénon map with 500 points (blue and orange) where the blue diamonds are designated as missing. (b) The hhitalic_h-dimension of the Hénon map; only 200 points are displayed for visual clarity.

The correlation dimension is used to assess how well the geometry of the original state space is preserved in the reconstruction. Specifically, for a given ε>0𝜀0\varepsilon>0italic_ε > 0, it measures the probability that two random points in a space are within ε𝜀\varepsilonitalic_ε-distance of each other. To compute the correlation dimension, the correlation sum is computed using the following:

Corr(ε)=limm2m(m1)i=1mj=i+1m𝟙(F(𝐬(ti))F(𝐬(tj))2ε),Corr𝜀subscriptabsent𝑚2𝑚𝑚1superscriptsubscript𝑖1𝑚superscriptsubscript𝑗𝑖1𝑚1subscriptdelimited-∥∥𝐹𝐬subscript𝑡𝑖𝐹𝐬subscript𝑡𝑗2𝜀\text{Corr}(\varepsilon)=\lim_{m\xrightarrow{}\infty}\frac{2}{m(m-1)}\sum_{i=1% }^{m}\sum_{j=i+1}^{m}\mathbbm{1}\left(\lVert F(\mathbf{s}(t_{i}))-F(\mathbf{s}% (t_{j}))\rVert_{2}\leq\varepsilon\right),Corr ( italic_ε ) = roman_lim start_POSTSUBSCRIPT italic_m start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW ∞ end_POSTSUBSCRIPT divide start_ARG 2 end_ARG start_ARG italic_m ( italic_m - 1 ) end_ARG ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j = italic_i + 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT blackboard_1 ( ∥ italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ italic_ε ) , (18)

for some embedding map 𝐅={F(𝐬(t1)),,F(𝐬(tm))}𝐅𝐹𝐬subscript𝑡1𝐹𝐬subscript𝑡𝑚\mathbf{F}=\{F(\mathbf{s}({t_{1}})),\cdots,F(\mathbf{s}({t_{m}}))\}bold_F = { italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ) , ⋯ , italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) ) }. Then the correlation dimension is estimated as: limε0log(Corr(ε))/log(ε).subscriptabsent𝜀0Corr𝜀𝜀\lim_{\varepsilon\xrightarrow{}0}{\log(\text{Corr}(\varepsilon))}/{\log(% \varepsilon)}.roman_lim start_POSTSUBSCRIPT italic_ε start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW 0 end_POSTSUBSCRIPT roman_log ( Corr ( italic_ε ) ) / roman_log ( italic_ε ) . If the reconstructed space preserves relevant geometrical invariants, its correlation dimension should match that of the true state space. Other accuracy measures include the box-counting dimension, Hausdorff dimension, and information dimension. However, the correlation dimension is more robust to sample size, making it less noisy with fewer samples (Grassberger and Procaccia, 1983b ).

The SSE method is compared to common statistical interpolation methods used to impute missing data. A range of methods were considered111The comparison methods considered are available in the the R package, imputeTS (Moritz and Bartz-Beielstein, 2017): linear, spline, and Stineman interpolation methods, Kalman Smoothing with a structural model and autoregressive integrated moving average model, a moving average method with exponential and linear weighting, seasonal decomposition (imputation by interpolation is done on the deseasonalized component), seasonal split (imputation by interpolation is done on each split), imputing with the previous observation (LOCF) or next observation (NOCB), imputing with the mean, median, mode, and by a random point in the dataset. , but only the best three methods are presented, which were implemented using the R package, imputeTS (Moritz and Bartz-Beielstein, 2017):
(1) Kalman Smoothing (KS): This fits a structural time series model via maximum likelihood, using the linear local trend as the structural class (see referenced package for more details).
(2) Last Observation Carried Forward (LOCF): This methods replaces each missing value with the most immediate prior observed value.
(3) Next Observation Carried Backward (NOCB): This is similar to the LOCF, but instead replaces each missing value with the most immediate next observed value.

The results are presented in terms of the correlation dimension, with standard errors generated by applying each method to 100 independently generated instances of the Hénon map. A noise model (with no missing values) served as a baseline, with observations from a normal distribution (mean zero) and standard deviation equal to the probability of observing a missing value.

Figure 5 shows example reconstructed spaces using the proposed method and two imputation methods (the NOCB result is nearly identical to the LOCF and is not shown) with 500500500500 samples and a 0.250.250.250.25 missingness probability. Note that for the comparison methods, after imputation the TDE method is used to estimate the state space. Only the SSE method preserves the original geometry, while the imputation methods introduce extraneous features.

Refer to caption
(a) SSE
Refer to caption
(b) KS imputation
Refer to caption
(c) LOCF imputation
Figure 5: Reconstructed state spaces of the Hénon map for: (a) proposed SSE method, (b) KS imputation, and (c) LOCF imputation.
Refer to caption
Figure 6: Reconstruction accuracy results of the Hénon map based on the correlation dimension. The points in different shapes are the mean correlation dimension after 100 repetitions using the proposed SSE method (pink dotted), the three imputation methods, and a baseline noise model (blue dashed), and the vertical bars represent the corresponding standard errors. The black dashed lines indicate the established empirical bounds of the Hénon map.

Figure 6 shows the correlation dimension versus missingness probability for the SSE method and the three imputation methods. The black dashed lines indicate the established empirical estimate for the Hénon map’s correlation dimension (1.22±0.04)plus-or-minus1.220.04(1.22\pm 0.04)( 1.22 ± 0.04 ) (Grassberger and Procaccia, 1983a ; Sprott and Rowlands, 2001). The SSE method performs well up to a missingness probability of 0.60.60.60.6, staying within or close to the empirical bounds. Beyond 0.60.60.60.6, its comparable to the three imputation methods. However, the SSE method is more variable due to the fewer points used to compute the correlation dimension compared to the other methods (which always have 500 points).

5.2 Periodicity Quantification

The periodicity of a time series can be quantified based on the H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT features in the persistence diagram. This relies on the idea that periodic patterns yields elliptic curves in the reconstructed state space, and the roundness of the curves is an indicator of the periodicity in the time series (Perea and Harer, 2015). The roundness of these ellipses can be quantified by examining the maximum persistence of their associated H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT features. For a time series vector 𝐱𝐱\mathbf{x}bold_x with its embedding map 𝐅𝐅\mathbf{F}bold_F, its periodicity score ps(𝐱)ps𝐱\text{ps}(\mathbf{x})ps ( bold_x ) can be defined as (Perea and Harer, 2015):

ps(𝐱)=max(b,d)Dgm(𝐅)(db)/3,ps𝐱subscript𝑏𝑑Dgm𝐅𝑑𝑏3\text{ps}(\mathbf{x})=\max_{{(b,d)}\in\text{Dgm}(\mathbf{F})}(d-b)/{\sqrt{3}},ps ( bold_x ) = roman_max start_POSTSUBSCRIPT ( italic_b , italic_d ) ∈ Dgm ( bold_F ) end_POSTSUBSCRIPT ( italic_d - italic_b ) / square-root start_ARG 3 end_ARG , (19)

where Dgm(𝐅)Dgm𝐅\text{Dgm}(\mathbf{F})Dgm ( bold_F ) is the persistence diagram, and max(b,d)Dgm(𝐅)(db)subscript𝑏𝑑Dgm𝐅𝑑𝑏\max_{{(b,d)}\in\text{Dgm}(\mathbf{F})}(d-b)roman_max start_POSTSUBSCRIPT ( italic_b , italic_d ) ∈ Dgm ( bold_F ) end_POSTSUBSCRIPT ( italic_d - italic_b ) is restricted to the H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT features. For this calculation, the embedding map 𝐅𝐅\mathbf{F}bold_F is pointwise-centered and scaled. The motivation for the periodicity score is that during the VR filtration for a dataset with a large enough sample size, a unit circle (H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature) dies when an inscribed equilateral triangle appears in the VR complex at filtration value 33\sqrt{3}square-root start_ARG 3 end_ARG, hence the maximum periodicity score of one is realized when either the TDE or SSE spaces has a well-sampled circle; a ps(𝐱)ps𝐱\text{ps}(\mathbf{x})ps ( bold_x ) closer to one indicates a stronger periodic signal in 𝐱𝐱\mathbf{x}bold_x.

5.2.1 Simulation

To evaluate this framework, two different set of signals were generated with sample sizes n{50,100,500,1000}𝑛501005001000n\in\{50,100,500,1000\}italic_n ∈ { 50 , 100 , 500 , 1000 } and missingness probabilities {0,0.1,0.2,0.3,0.4}00.10.20.30.4\{0,0.1,0.2,0.3,0.4\}{ 0 , 0.1 , 0.2 , 0.3 , 0.4 }. The first set follows f(t)=50×cos(πt/4λπ)×sin(πt/2)+50𝑓𝑡50𝜋𝑡4𝜆𝜋𝜋𝑡250f(t)=50\times\cos(\pi t/4-\lambda\pi)\times\sin({\pi t}/{2})+50italic_f ( italic_t ) = 50 × roman_cos ( italic_π italic_t / 4 - italic_λ italic_π ) × roman_sin ( italic_π italic_t / 2 ) + 50 with λ(0,1)𝜆01\lambda\in(0,1)italic_λ ∈ ( 0 , 1 ) and t[0,12π]𝑡012𝜋t\in[0,12\pi]italic_t ∈ [ 0 , 12 italic_π ], having a longest period of 8888. The second set is a non-periodic signals drawn from a N(10,2)𝑁102N(10,2)italic_N ( 10 , 2 ). Figure 7 shows samples of both signals.

Refer to caption
(a) Periodic signal
Refer to caption
(b) Non-periodic signal
Figure 7: Sample periodic (a) and non-periodic (b) signals used in the periodicity quantification simulation of Section 5.2. Each time series include 500 time points.

To construct the embedding from the time series, the time points are rescaled to integers and the step-size of τ=1𝜏1\tau=1italic_τ = 1. The periodicity score ps(𝐱)ps𝐱\text{ps}(\mathbf{x})ps ( bold_x ) is then compared to those obtained from the Lomb-Scargle periodogram method for both uniformly-spaced and irregularly-spaced observations (Lomb, 1976; Scargle, 1982; Ruf, 1999), the sliding windows method (Perea and Harer, 2015), and the JTK_CycleJTK_Cycle\text{JTK}\_\text{Cycle}JTK _ Cycle algorithm for uniformly-spaced samples (Hughes et al., 2010).

The results are summarized in Table 1 (periodic signal) and in Table 2 (non-periodic signal). Table 1 shows that all the methods rate the periodic signals as highly periodic with increasing sample size. The proposed SSE method consistently identifies a distinct H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT across all sample sizes and missing observations despite noisy features in the persistence diagram. The JTK__\__Cycle and the Lomb-Scargle method requires specifying a period search range. The proposed SSE method has the added advantage that its periodicity score has a geometric interpretation (Perea and Harer, 2015).

Table 1: Results for the periodic signal summarized as p-values for JTK_CycleJTK_Cycle\text{JTK}\_\text{Cycle}JTK _ Cycle and Lomb-Scargle with estimated period in brackets, and as periodicity scores for Sliding Windows and SSE methods.
n π𝜋\piitalic_π M𝑀Mitalic_M Sliding Windows JTK_CycleJTK_Cycle\text{JTK}\_\text{Cycle}JTK _ Cycle Lomb-Scargle Proposed SSE Method
50 0.000.000.000.00 2222 0.740.740.740.74 0.000.000.000.00 (7.69)7.69(7.69)( 7.69 ) 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.740.740.740.74
0.100.100.100.10 2222 -- -- 0.000.000.000.00 (8.03)8.03(8.03)( 8.03 ) 0.740.740.740.74
0.200.200.200.20 2222 -- -- 0.000.000.000.00 (8.03)8.03(8.03)( 8.03 ) 0.700.700.700.70
0.300.300.300.30 2222 -- -- 0.000.000.000.00 (8.04)8.04(8.04)( 8.04 ) 0.670.670.670.67
0.400.400.400.40 2222 -- -- 0.000.000.000.00 (8.04)8.04(8.04)( 8.04 ) 0.440.440.440.44
100 0.000.000.000.00 6666 0.530.530.530.53 0.000.000.000.00 (8.00)8.00(8.00)( 8.00 ) 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.530.530.530.53
0.100.100.100.10 6666 -- -- 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.530.530.530.53
0.200.200.200.20 6666 -- -- 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.530.530.530.53
0.300.300.300.30 4444 -- -- 0.000.000.000.00 (8.20)8.20(8.20)( 8.20 ) 0.490.490.490.49
0.400.400.400.40 3333 -- -- 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.430.430.430.43
500 0.000.000.000.00 8888 0.930.930.930.93 0.000.000.000.00 (2.64)2.64(2.64)( 2.64 ) 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.930.930.930.93
0.100.100.100.10 8888 -- -- 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.850.850.850.85
0.200.200.200.20 6666 -- -- 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.710.710.710.71
0.300.300.300.30 2222 -- -- 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.630.630.630.63
0.400.400.400.40 2222 -- -- 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.600.600.600.60
1000 0.000.000.000.00 26262626 0.900.900.900.90 0.000.000.000.00 (1.28)1.28(1.28)( 1.28 ) 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.900.900.900.90
0.100.100.100.10 15151515 -- -- 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.740.740.740.74
0.200.200.200.20 12121212 -- -- 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.690.690.690.69
0.300.300.300.30 6666 -- -- 0.000.000.000.00 (8.02)8.02(8.02)( 8.02 ) 0.440.440.440.44
0.400.400.400.40 4444 -- -- 0.000.000.000.00 (8.01)8.01(8.01)( 8.01 ) 0.430.430.430.43
Table 2: Results for the non-periodic signal are given as p-values for JTK_CycleJTK_Cycle\text{JTK}\_\text{Cycle}JTK _ Cycle and Lomb-Scargle with estimated period in brackets, and as periodicity scores for Sliding Windows and SSE methods.
n π𝜋\piitalic_π M𝑀Mitalic_M Sliding Windows JTK_CycleJTK_Cycle\text{JTK}\_\text{Cycle}JTK _ Cycle Lomb-Scargle Proposed SSE Method
50 0.000.000.000.00 3333 0.300.300.300.30 1.001.001.001.00 (13.29)13.29(13.29)( 13.29 ) 0.150.150.150.15 (2.05)2.05(2.05)( 2.05 ) 0.300.300.300.30
0.100.100.100.10 3333 -- -- 0.160.160.160.16 (2.04)2.04(2.04)( 2.04 ) 0.320.320.320.32
0.200.200.200.20 3333 -- -- 0.280.280.280.28 (2.04)2.04(2.04)( 2.04 ) 0.230.230.230.23
0.300.300.300.30 3333 -- -- 0.170.170.170.17 (2.04)2.04(2.04)( 2.04 ) 0.250.250.250.25
0.400.400.400.40 3333 -- -- 0.410.410.410.41 (32.87)32.87(32.87)( 32.87 ) 0.140.140.140.14
100 0.000.000.000.00 3333 0.260.260.260.26 1.001.001.001.00 (12.88)12.88(12.88)( 12.88 ) 0.100.100.100.10 (1.00)1.00(1.00)( 1.00 ) 0.260.260.260.26
0.100.100.100.10 3333 -- -- 0.080.080.080.08 (1.00)1.00(1.00)( 1.00 ) 0.220.220.220.22
0.200.200.200.20 3333 -- -- 0.270.270.270.27 (1.00)1.00(1.00)( 1.00 ) 0.250.250.250.25
0.300.300.300.30 3333 -- -- 0.550.550.550.55 (16.39)16.39(16.39)( 16.39 ) 0.320.320.320.32
0.400.400.400.40 3333 -- -- 0.330.330.330.33 (16.39)16.39(16.39)( 16.39 ) 0.290.290.290.29
500 0.000.000.000.00 11111111 0.140.140.140.14 0.280.280.280.28 (0.45)0.45(0.45)( 0.45 ) 0.110.110.110.11 (0.15)0.15(0.15)( 0.15 ) 0.140.140.140.14
0.100.100.100.10 9999 -- -- 0.230.230.230.23 (0.80)0.80(0.80)( 0.80 ) 0.160.160.160.16
0.200.200.200.20 00 -- -- 0.130.130.130.13 (0.20)0.20(0.20)( 0.20 ) 0.100.100.100.10
0.300.300.300.30 7777 -- -- 0.290.290.290.29 (0.79)0.79(0.79)( 0.79 ) 0.180.180.180.18
0.400.400.400.40 3333 -- -- 0.190.190.190.19 (0.45)0.45(0.45)( 0.45 ) 0.280.280.280.28
1000 0.000.000.000.00 3333 0.130.130.130.13 1.001.001.001.00 (1.28)1.28(1.28)( 1.28 ) 0.870.870.870.87 (0.25)0.25(0.25)( 0.25 ) 0.130.130.130.13
0.100.100.100.10 3333 -- -- 0.500.500.500.50 (0.40)0.40(0.40)( 0.40 ) 0.140.140.140.14
0.200.200.200.20 3333 -- -- 0.550.550.550.55 (0.26)0.26(0.26)( 0.26 ) 0.150.150.150.15
0.300.300.300.30 3333 -- -- 0.220.220.220.22 (0.26)0.26(0.26)( 0.26 ) 0.160.160.160.16
0.400.400.400.40 3333 -- -- 0.660.660.660.66 (0.26)0.26(0.26)( 0.26 ) 0.200.200.200.20

For the non-periodic signal, all the methods performed reasonably well across all samples and missingness mechanisms. The performance of the SSE method in the non-periodic setting is not surprising. This is because as more observations are missing, the sampled time points appear random, and the resulting time series looks more like random random noise than signal.

5.2.2 Real Data Analysis

Irregularly-spaced times series data are common in astronomy such as those discussed in VanderPlas, (2018) and in exoplanet detection methods (Zhao et al., 2020, 2022). In this section, we examine an asteroid dataset from the Lincoln Near-Earth Asteroid Research (LINEAR) survey, which tracks near-Earth asteroids. The data includes 280280280280 magnitude measurements (brightness) of LINEAR object ID 11375941 over five and a half years. Magnitude measurements are unitless; lower values indicate brighter objects. Further details on the data and preprocessing are in Sesar et al., (2011); Palaversa et al., (2013); VanderPlas, (2018).

Figure 8(a) shows the observed magnitude over time, revealing no visible periodic pattern due to irregular sampling. The TDE method is unsuitable for such data, but the proposed SSE method can construct a geometric representation. Using M=2𝑀2M=2italic_M = 2, and rescaling the time points to integers and taking τ=1𝜏1\tau=1italic_τ = 1, the SSE in Figure 8(b) reveals a circular geometric object, indicating high periodicity. The H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature on the persistence diagram (Figure 8(c)) is at the point (b,d)=(0.31,1.74)𝑏𝑑0.311.74(b,d)=(0.31,1.74)( italic_b , italic_d ) = ( 0.31 , 1.74 ).

Refer to caption
(a) Time series
Refer to caption
(b) SSE
Refer to caption
(c) Persistence diagram
Figure 8: LINEAR object ID 11375941. (a) The time series of the measured magnitudes (orange circles) with error bars (vertical bars). (b) The SSE of the time series. (c) The persistence diagram for the SSE with a single highly persistent H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature as expected.

The periodicity score obtained using Equation (19) is 0.820.820.820.82, indicating high periodicity in he observed magnitude of LINEAR object ID 11375941. Using the Lomb-Scargle method, the optimal period was found to be 2.582.582.582.58 with a p-value of 0.000.000.000.00. These results confirm the SSE method’s periodicity findings and highlight its utility in quantifying and visualizing periodicity. Additional astronomy applications are discussed in the supplementary material Section S2.4.

6 Conclusion

The fusion of TDE with TDA holds significant promise for discerning system dynamics and quantifying properties like periodicity in uniformly-spaced time series. This work introduces a novel subsequence embedding method for irregularly-spaced time-series data. Irregular spacing can obscure patterns and introduce noise (e.g., Figure 8(a)). While data imputation can create uniformly-spaced series, it often fails to accurately represent the true state space (e.g., Figure 5). The proposed SSE method addresses these challenges, preserving the topology of the reconstructed state space and mitigating spurious homological features introduced by irregular spacing.

In Section 5.2, we note the need for statistical inference on periodicity scores to determine if the most persistent H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature is due to a real periodic signal or chance. Existing methods for significance testing of homology generators, such as those using kernel density estimators, allow constructing confidence sets for homology generators (Fasy et al., 2014; Xu et al., 2019). Extending this to homology generators based on direct filtration on the point-cloud space requires bounding the bottleneck-distance with the Hausdorff distance. Initial investigations produced wide confidence sets, indicating the need for a more tailored method, which remains a future research direction.

Finally, Algorithm 1 constructs subsequences with a fixed regularity score r𝑟ritalic_r. Extending this to r±ϵplus-or-minus𝑟italic-ϵr\pm\epsilonitalic_r ± italic_ϵ for small ϵitalic-ϵ\epsilonitalic_ϵ would increase the length of each constructed subsequence, and the number of points in the reconstructed space. Future work will explore the robustness of this method to different ϵitalic-ϵ\epsilonitalic_ϵ values and the trade-off between sample size and reconstruction accuracy.

References

  • Aaron et al., (2017) Aaron, C., Cholaquidis, A., and Cuevas, A. (2017). Detection of low dimensionality and data denoising via set estimation techniques. Electronic Journal of Statistics, 11(2):4596 – 4628.
  • Adams et al., (2017) Adams, H., Emerson, T., Kirby, M., Neville, R., Peterson, C., Shipman, P., Chepushtanova, S., Hanson, E., Motta, F., and Ziegelmeier, L. (2017). Persistence images: A stable vector representation of persistent homology. Journal of Machine Learning Research, 18(8):1–35.
  • Ali et al., (2007) Ali, S., Basharat, A., and Shah, M. (2007). Chaotic invariants for human action recognition. In 2007 IEEE 11th International Conference on Computer Vision, pages 1–8. IEEE.
  • Attouch et al., (1991) Attouch, H., Lucchetti, R., and Wets, R. J.-B. (1991). The topology of the ρ𝜌\rhoitalic_ρ-Hausdorff distance. Annali di Matematica pura ed applicata, 160(1):303–320.
  • Berry et al., (2020) Berry, E., Chen, Y.-C., Cisewski-Kehe, J., and Fasy, B. T. (2020). Functional summaries of persistence diagrams. Journal of Applied and Computational Topology, 4(2):211–262.
  • Boker et al., (2018) Boker, S. M., Tiberio, S. S., and Moulder, R. G. (2018). Robustness of time delay embedding to sampling interval misspecification. Continuous time modeling in the behavioral and related sciences, pages 239–258.
  • Brown and Knudson, (2009) Brown, K. A. and Knudson, K. P. (2009). Nonlinear statistics of human speech data. International Journal of Bifurcation and Chaos, 19(07):2307–2319.
  • Bubenik et al., (2015) Bubenik, P. et al. (2015). Statistical topological data analysis using persistence landscapes. J. Mach. Learn. Res., 16(1):77–102.
  • Cantelli, (1917) Cantelli, F. (1917). Sulla probabilità como limite della frequenza. Atti Reale Academia Nazionale dei Lincei.
  • Cao, (1997) Cao, L. (1997). Practical method for determining the minimum embedding dimension of a scalar time series. Physica D: Nonlinear Phenomena, 110(1-2):43–50.
  • Casdagli et al., (1991) Casdagli, M., Eubank, S., Farmer, J. D., and Gibson, J. (1991). State space reconstruction in the presence of noise. Physica D: Nonlinear Phenomena, 51(1-3):52–98.
  • Chazal et al., (2014) Chazal, F., Glisse, M., Labruère, C., and Michel, B. (2014). Convergence rates for persistence diagram estimation in topological data analysis. In International Conference on Machine Learning, pages 163–171. PMLR.
  • Chazal and Michel, (2021) Chazal, F. and Michel, B. (2021). An introduction to topological data analysis: fundamental and practical aspects for data scientists. Frontiers in artificial intelligence, 4:108.
  • Chung and Erdös, (1952) Chung, K. L. and Erdös, P. (1952). On the application of the Borel-Cantelli lemma. Transactions of the American Mathematical Society, 72(1):179–186.
  • Cuevas and Rodríguez-Casal, (2004) Cuevas, A. and Rodríguez-Casal, A. (2004). On boundary estimation. Advances in Applied Probability, 36(2):340–354.
  • Davis et al., (2017) Davis, A. B., Cisewski, J., Dumusque, X., Fischer, D. A., and Ford, E. B. (2017). Insights on the spectral signatures of stellar activity and planets from PCA. The Astrophysical Journal, 846(1):59.
  • Deyle and Sugihara, (2011) Deyle, E. R. and Sugihara, G. (2011). Generalized theorems for nonlinear state space reconstruction. Plos one, 6(3):e18295.
  • Dumusque, (2016) Dumusque, X. (2016). Radial velocity fitting challenge-I. Simulating the data set including realistic stellar radial-velocity signals. Astronomy & Astrophysics, 593:A5.
  • Dumusque, (2018) Dumusque, X. (2018). Measuring precise radial velocities on individual spectral lines-I. Validation of the method and application to mitigate stellar activity. Astronomy & Astrophysics, 620:A47.
  • Dumusque et al., (2014) Dumusque, X., Boisse, I., and Santos, N. (2014). SOAP 2.0: a tool to estimate the photometric and radial velocity variations induced by stellar spots and plages. The Astrophysical Journal, 796(2):132.
  • Dunsmuir and Robinson, (1981) Dunsmuir, W. and Robinson, P. (1981). Estimation of time series models in the presence of missing data. Journal of the American Statistical Association, 76(375):560–568.
  • Edelsbrunner and Harer, (2022) Edelsbrunner, H. and Harer, J. L. (2022). Computational topology: an introduction. American Mathematical Society.
  • Edelsbrunner et al., (2000) Edelsbrunner, H., Letscher, D., and Zomorodian, A. (2000). Topological persistence and simplification. In Proceedings 41st annual symposium on foundations of computer science, pages 454–463. IEEE.
  • Émile Borel, (1909) Émile Borel, M. (1909). Les probabilités dénombrables et leurs applications arithmétiques. Rendiconti del Circolo Matematico di Palermo (1884-1940), 27(1):247–271.
  • Emrani et al., (2014) Emrani, S., Gentimis, T., and Krim, H. (2014). Persistent homology of delay embeddings and its application to wheeze detection. IEEE Signal Processing Letters, 21(4):459–463.
  • Erickson, (1999) Erickson, J. (1999). Finding longest arithmetic progressions. University of Illinois at Urbana-Champaign.
  • Fasy et al., (2014) Fasy, B. T., Lecci, F., Rinaldo, A., Wasserman, L., Balakrishnan, S., and Singh, A. (2014). Confidence sets for persistence diagrams. The Annals of Statistics, 42:2301–2339.
  • Fischer et al., (2016) Fischer, D. A., Anglada-Escude, G., Arriagada, P., Baluev, R. V., Bean, J. L., Bouchy, F., Buchhave, L. A., Carroll, T., Chakraborty, A., Crepp, J. R., et al. (2016). State of the field: extreme precision radial velocities. Publications of the Astronomical Society of the Pacific, 128(964):066001.
  • Gholizadeh and Zadrozny, (2018) Gholizadeh, S. and Zadrozny, W. (2018). A short survey of topological data analysis in time series and systems analysis. arXiv preprint arXiv:1809.10745.
  • Glenn et al., (2024) Glenn, S., Cisewski-Kehe, J., Zhu, J., and Bement, W. M. (2024). Confidence regions for a persistence diagram of a single image with one or more loops. arXiv preprint arXiv:2405.01651.
  • (31) Grassberger, P. and Procaccia, I. (1983a). Characterization of strange attractors. Physical review letters, 50(5):346.
  • (32) Grassberger, P. and Procaccia, I. (1983b). Measuring the strangeness of strange attractors. Physica D: nonlinear phenomena, 9(1-2):189–208.
  • Hara and Ford, (2023) Hara, N. C. and Ford, E. B. (2023). Statistical methods for exoplanet detection with radial velocities. Annual Review of Statistics and Its Application, 10:623–649.
  • Harvey and Pierse, (1984) Harvey, A. C. and Pierse, R. G. (1984). Estimating missing observations in economic time series. Journal of the American statistical Association, 79(385):125–131.
  • Hatcher et al., (2002) Hatcher, A., Press, C. U., and of Mathematics, C. U. D. (2002). Algebraic Topology. Algebraic Topology. Cambridge University Press.
  • Hénon, (2004) Hénon, M. (2004). A two-dimensional mapping with a strange attractor. The theory of chaotic attractors, pages 94–102.
  • Hertz, (1992) Hertz, D. (1992). Simple bounds on the extreme eigenvalues of Toeplitz matrices. IEEE transactions on information theory, 38(1):175–176.
  • (38) Holzer, P. H., Cisewski-Kehe, J., Fischer, D., and Zhao, L. (2021a). A Hermite-Gaussian based exoplanet radial velocity estimation method. The Annals of Applied Statistics, 15(2):527–555.
  • (39) Holzer, P. H., Cisewski-Kehe, J., Zhao, L., Ford, E. B., Gilbertson, C., and Fischer, D. A. (2021b). A stellar activity F-statistic for exoplanet surveys (SAFE). The Astronomical Journal, 161(6):272.
  • Huélamo et al., (2008) Huélamo, N., Figueira, P., Bonfils, X., Santos, N., Pepe, F., Gillon, M., Azevedo, R., Barman, T., Fernández, M., Di Folco, E., et al. (2008). TW Hydrae: evidence of stellar spots instead of a Hot Jupiter. Astronomy & Astrophysics, 489(2):L9–L13.
  • Hughes et al., (2010) Hughes, M. E., Hogenesch, J. B., and Kornacker, K. (2010). Jtk_cycle: an efficient nonparametric algorithm for detecting rhythmic components in genome-scale data sets. Journal of biological rhythms, 25(5):372–380.
  • Huke and Broomhead, (2007) Huke, J. P. and Broomhead, D. S. (2007). Embedding theorems for non-uniformly sampled dynamical systems. Nonlinearity, 20(9):2205.
  • HuoLiu and YuanTang, (1998) HuoLiu, Q. and YuanTang, X. (1998). Iterative algorithm for nonuniform inverse fast Fourier transform (NU-IFFT). Electronics Letters, 34(20):1913–1914.
  • Johnson and Munch, (2022) Johnson, B. and Munch, S. B. (2022). An empirical dynamic modeling framework for missing or irregular samples. Ecological Modelling, 468:109948.
  • Jones et al., (2022) Jones, D. E., Stenning, D. C., Ford, E. B., Wolpert, R. L., Loredo, T. J., Gilbertson, C., and Dumusque, X. (2022). Improving exoplanet detection power: Multivariate Gaussian process models for stellar activity. The Annals of Applied Statistics, 16(2):652–679.
  • Kim et al., (1999) Kim, H., Eykholt, R., and Salas, J. (1999). Nonlinear dynamics, delay times, and embedding windows. Physica D: Nonlinear Phenomena, 127(1-2):48–60.
  • Lekscha and Donner, (2018) Lekscha, J. and Donner, R. V. (2018). Phase space reconstruction for non-uniformly sampled noisy time series. Chaos: An Interdisciplinary Journal of Nonlinear Science, 28(8):085702.
  • Lomb, (1976) Lomb, N. R. (1976). Least-squares frequency analysis of unequally spaced data. Astrophysics and space science, 39:447–462.
  • Mayor and Queloz, (1995) Mayor, M. and Queloz, D. (1995). A Jupiter-mass companion to a solar-type star. Nature, 378(6555):355–359.
  • Mileyko et al., (2011) Mileyko, Y., Mukherjee, S., and Harer, J. (2011). Probability measures on the space of persistence diagrams. Inverse Problems, 27(12):124007.
  • Moritz and Bartz-Beielstein, (2017) Moritz, S. and Bartz-Beielstein, T. (2017). imputeTS: time series missing value imputation in R. R J., 9(1):207.
  • Palaversa et al., (2013) Palaversa, L., Ivezić, Ž., Eyer, L., Ruždjak, D., Sudar, D., Galin, M., Kroflin, A., Mesarić, M., Munk, P., Vrbanec, D., et al. (2013). Exploring the variable sky with LINEAR. III. Classification of periodic light curves. The Astronomical Journal, 146(4):101.
  • Perea and Harer, (2015) Perea, J. A. and Harer, J. (2015). Sliding windows and persistence: An application of topological methods to signal analysis. Foundations of Computational Mathematics, 15:799–838.
  • Picado and Oliveira, (2020) Picado, N. and Oliveira, P. E. (2020). Denoising and interior detection problems. arXiv preprint arXiv:2010.16360.
  • Pun et al., (2022) Pun, C. S., Lee, S. X., and Xia, K. (2022). Persistent-homology-based machine learning: a survey and a comparative study. Artificial Intelligence Review, 55(7):5169–5213.
  • Rajpaul et al., (2015) Rajpaul, V., Aigrain, S., Osborne, M. A., Reece, S., and Roberts, S. (2015). A Gaussian process framework for modelling stellar activity signals in radial velocity data. Monthly Notices of the Royal Astronomical Society, 452(3):2269–2291.
  • Robinson and Turner, (2017) Robinson, A. and Turner, K. (2017). Hypothesis testing for topological data analysis. Journal of Applied and Computational Topology, 1:241–261.
  • Ruf, (1999) Ruf, T. (1999). The Lomb-Scargle periodogram in biological rhythm research: analysis of incomplete and unequally spaced time-series. Biological Rhythm Research, 30(2):178–201.
  • Sauer et al., (1991) Sauer, T., Yorke, J. A., and Casdagli, M. (1991). Embedology. Journal of statistical Physics, 65:579–616.
  • Scargle, (1982) Scargle, J. D. (1982). Studies in astronomical time series analysis. ii-statistical aspects of spectral analysis of unevenly spaced data. Astrophysical Journal, Part 1, vol. 263, Dec. 15, 1982, p. 835-853., 263:835–853.
  • Sesar et al., (2011) Sesar, B., Stuart, J. S., Ivezić, Ž., Morgan, D. P., Becker, A. C., and Woźniak, P. (2011). Exploring the variable sky with LINEAR. I. Photometric recalibration with the sloan digital sky survey. The Astronomical Journal, 142(6):190.
  • Sprott and Rowlands, (2001) Sprott, J. C. and Rowlands, G. (2001). Improved correlation dimension calculation. International Journal of Bifurcation and Chaos, 11(07):1865–1880.
  • Stark et al., (1997) Stark, J., Broomhead, D. S., Davies, M. E., and Huke, J. (1997). Takens embedding theorems for forced and stochastic systems. Nonlinear Analysis: Theory, Methods & Applications, 30(8):5303–5314.
  • Takens, (2006) Takens, F. (2006). Detecting strange attractors in turbulence. In Dynamical Systems and Turbulence, Warwick 1980: proceedings of a symposium held at the University of Warwick 1979/80, pages 366–381. Springer.
  • Tralie and Perea, (2018) Tralie, C. J. and Perea, J. A. (2018). (Quasi) Periodicity quantification in video data, using topology. SIAM Journal on Imaging Sciences, 11(2):1049–1077.
  • VanderPlas, (2018) VanderPlas, J. T. (2018). Understanding the Lomb-Scargle periodogram. The Astrophysical Journal Supplement Series, 236(1):16.
  • Vietoris, (1927) Vietoris, L. (1927). Über den höheren zusammenhang kompakter räume und eine klasse von zusammenhangstreuen abbildungen. Mathematische Annalen, 97(1):454–472.
  • Xu et al., (2019) Xu, X., Cisewski-Kehe, J., Green, S. B., and Nagai, D. (2019). Finding cosmic voids and filament loops using topological data analysis. Astronomy and Computing, 27:34–52.
  • Zhao et al., (2020) Zhao, L., Fischer, D. A., Ford, E. B., Henry, G. W., Roettenbacher, R. M., and Brewer, J. M. (2020). The EXPRES stellar-signals project. I. description of data. Research Notes of the AAS, 4(9):156.
  • Zhao et al., (2022) Zhao, L. L., Fischer, D. A., Ford, E. B., Wise, A., Cretignier, M., Aigrain, S., Barragan, O., Bedell, M., Buchhave, L. A., Camacho, J. D., et al. (2022). The EXPRES stellar signals project II. state of the field in disentangling photospheric velocities. The Astronomical Journal, 163(4):171.

Supplementary Material

S1 Stability and Convergence Results

S1.1 Proof of Proposition 4.1

Proof.

It suffices to bound the Hausdorff distance between 𝐅𝐅\mathbf{F}bold_F and 𝐅superscript𝐅\mathbf{F}^{\prime}bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, then using the stability theory in persistence homology (Chazal and Michel, 2021), the bound on their persistence diagrams with respect to the Bottleneck distance can be established. The proof proceeds as follows.

There exists a subset 𝐱i𝐱superscriptsubscript𝐱𝑖superscript𝐱\mathbf{x}_{i}^{*}\subset\mathbf{x}^{*}bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ⊂ bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT that exactly equals F(𝐬(tp,i))superscript𝐹𝐬subscript𝑡𝑝𝑖F^{*}(\mathbf{s}(t_{p,i}))italic_F start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) (the i𝑖iitalic_i-th row of the p𝑝pitalic_p-th subsequence of the embedding matrix 𝐅superscript𝐅\mathbf{F}^{*}bold_F start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT). This fact stems from the construction of 𝐅superscript𝐅\mathbf{F}^{*}bold_F start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT, whose rows are uniformly-spaced samples of 𝐱superscript𝐱\mathbf{x}^{*}bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT. The same guarantee holds for the pairs (𝐅,𝐱)𝐅𝐱(\mathbf{F},\mathbf{x})( bold_F , bold_x ), and (𝐅,𝐱)superscript𝐅superscript𝐱(\mathbf{F}^{\prime},\mathbf{x}^{\prime})( bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ). The distance between the projection Π𝚽(𝐱i)subscriptΠ𝚽superscriptsubscript𝐱𝑖\Pi_{\mathbf{\boldsymbol{\Phi}}}(\mathbf{x}_{i}^{*})roman_Π start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) and F(𝐬(tp,i))𝐹𝐬subscript𝑡𝑝𝑖F(\mathbf{s}(t_{p,i}))italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) is given by

Π𝚽(𝐱i)F(𝐬(tp,i))2=Π𝚽(𝐱iF(𝐬(tp,i)))2,subscriptdelimited-∥∥subscriptΠ𝚽superscriptsubscript𝐱𝑖𝐹𝐬subscript𝑡𝑝𝑖2subscriptdelimited-∥∥subscriptΠ𝚽superscriptsubscript𝐱𝑖𝐹𝐬subscript𝑡𝑝𝑖2\lVert\Pi_{\mathbf{\boldsymbol{\Phi}}}(\mathbf{x}_{i}^{*})-F(\mathbf{s}(t_{p,i% }))\rVert_{2}=\lVert\Pi_{\mathbf{\boldsymbol{\Phi}}}\left(\mathbf{x}_{i}^{*}-F% (\mathbf{s}(t_{p,i}))\right)\rVert_{2},∥ roman_Π start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = ∥ roman_Π start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , (20)

where 2subscriptdelimited-∥∥2\lVert\cdot\rVert_{2}∥ ⋅ ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT denotes the l2superscript𝑙2l^{2}italic_l start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT-norm. Equation (20) is under the assumption that the choice of frequency threshold does not smooth out the peaks in the true signal 𝐱𝐱\mathbf{x}bold_x. Observe that each F(𝐬(tp,i))superscript𝐹𝐬subscript𝑡𝑝𝑖F^{\prime}(\mathbf{s}(t_{p,i}))italic_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) is isometric to Π𝚽(𝐱i)subscriptΠ𝚽subscriptsuperscript𝐱𝑖\Pi_{\boldsymbol{\Phi}}(\mathbf{x}^{*}_{i})roman_Π start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ), where 𝐱isubscriptsuperscript𝐱𝑖\mathbf{x}^{*}_{i}bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is a subset of length M+1𝑀1M+1italic_M + 1 of the original noisy scalar time series. Then the Gromov-Hausdorff distance between 𝐅superscript𝐅\mathbf{F}^{\prime}bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and 𝐅𝐅\mathbf{F}bold_F can be expressed as

dGH(𝐅,𝐅)=dGH(Π^𝚽(𝐱),𝐅),subscriptdGHsuperscript𝐅𝐅subscriptdGHsubscript^Π𝚽superscript𝐱𝐅\text{d}_{\text{GH}}\left(\mathbf{F}^{\prime},\mathbf{F}\right)=\text{d}_{% \text{GH}}\left(\widehat{\Pi}_{\boldsymbol{\Phi}}(\mathbf{x}^{*}),\mathbf{F}% \right),d start_POSTSUBSCRIPT GH end_POSTSUBSCRIPT ( bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , bold_F ) = d start_POSTSUBSCRIPT GH end_POSTSUBSCRIPT ( over^ start_ARG roman_Π end_ARG start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) , bold_F ) , (21)

where Π^𝚽(𝐱)subscript^Π𝚽superscript𝐱\widehat{\Pi}_{\boldsymbol{\Phi}}(\mathbf{x}^{*})over^ start_ARG roman_Π end_ARG start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) denotes embedding of the vector Π𝚽(𝐱)subscriptΠ𝚽superscript𝐱{\Pi}_{\boldsymbol{\Phi}}(\mathbf{x}^{*})roman_Π start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ). Using the same isometric property, the Hausdorff distance between 𝐅superscript𝐅\mathbf{F}^{\prime}bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and 𝐅𝐅\mathbf{F}bold_F can be expressed in terms of Equation (20)italic-(20italic-)\eqref{eqn:init-dist}italic_( italic_). This follows from the fact that

Π𝚽(𝐱iF(𝐬(tp,i)))2=Π𝚽(F(𝐬(tp,i))F(𝐬(tp,i)))2F(𝐬(tp,i))F(𝐬(tp,i))2(𝚽H𝚽)𝚽H2.subscriptdelimited-∥∥subscriptΠ𝚽superscriptsubscript𝐱𝑖𝐹𝐬subscript𝑡𝑝𝑖2missing-subexpressionabsentsubscriptdelimited-∥∥subscriptΠ𝚽superscript𝐹𝐬subscript𝑡𝑝𝑖𝐹𝐬subscript𝑡𝑝𝑖2missing-subexpressionabsentsubscriptdelimited-∥∥superscript𝐹𝐬subscript𝑡𝑝𝑖𝐹𝐬subscript𝑡𝑝𝑖2subscriptdelimited-∥∥superscriptsuperscript𝚽𝐻𝚽superscript𝚽𝐻2\displaystyle\lVert\Pi_{\mathbf{\boldsymbol{\Phi}}}\left(\mathbf{x}_{i}^{*}-F(% \mathbf{s}(t_{p,i}))\right)\rVert_{2}\begin{aligned} &=\lVert\Pi_{\mathbf{% \boldsymbol{\Phi}}}\left(F^{*}(\mathbf{s}(t_{p,i}))-F(\mathbf{s}(t_{p,i}))% \right)\rVert_{2}\\ &\leq\lVert F^{*}(\mathbf{s}(t_{p,i}))-F(\mathbf{s}(t_{p,i}))\rVert_{2}\lVert(% \boldsymbol{\Phi}^{H}\boldsymbol{\Phi})^{\dagger}\boldsymbol{\Phi}^{H}\rVert_{% 2}.\end{aligned}∥ roman_Π start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_ROW start_CELL end_CELL start_CELL = ∥ roman_Π start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( italic_F start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL ≤ ∥ italic_F start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ∥ ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT . end_CELL end_ROW (22)

The matrix 𝚽H𝚽superscript𝚽𝐻𝚽\boldsymbol{\Phi}^{H}\boldsymbol{\Phi}bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ has the form:

𝚽H𝚽=[nk=1nej2π(w2w1)fkk=1nej2π(wnw1)fkk=1nej2π(w2w1)fknk=1nej2π(wnw2)fkk=1nej2π(wnw1)fkk=1nej2π(wnw2)fkn].superscript𝚽𝐻𝚽matrix𝑛superscriptsubscript𝑘1𝑛superscript𝑒𝑗2𝜋subscript𝑤2subscript𝑤1subscript𝑓𝑘superscriptsubscript𝑘1𝑛superscript𝑒𝑗2𝜋subscript𝑤𝑛subscript𝑤1subscript𝑓𝑘superscriptsubscript𝑘1𝑛superscript𝑒𝑗2𝜋subscript𝑤2subscript𝑤1subscript𝑓𝑘𝑛superscriptsubscript𝑘1𝑛superscript𝑒𝑗2𝜋subscript𝑤𝑛subscript𝑤2subscript𝑓𝑘superscriptsubscript𝑘1𝑛superscript𝑒𝑗2𝜋subscript𝑤𝑛subscript𝑤1subscript𝑓𝑘superscriptsubscript𝑘1𝑛superscript𝑒𝑗2𝜋subscript𝑤𝑛subscript𝑤2subscript𝑓𝑘𝑛\boldsymbol{\Phi}^{H}\boldsymbol{\Phi}=\begin{bmatrix}n&\sum_{k=1}^{n}e^{j2\pi% (w_{2}-w_{1})f_{k}}&\cdots&\sum_{k=1}^{n}e^{j2\pi(w_{n}-w_{1})f_{k}}\vspace{0.% 5cm}\\ \sum_{k=1}^{n}e^{-j2\pi(w_{2}-w_{1})f_{k}}&n&\cdots&\sum_{k=1}^{n}e^{j2\pi(w_{% n}-w_{2})f_{k}}\vspace{0.5cm}\\ \vdots&\vdots&\cdots&\vdots\vspace{0.5cm}\\ \sum_{k=1}^{n}e^{-j2\pi(w_{n}-w_{1})f_{k}}&\sum_{k=1}^{n}e^{-j2\pi(w_{n}-w_{2}% )f_{k}}&\cdots&n\end{bmatrix}.bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ = [ start_ARG start_ROW start_CELL italic_n end_CELL start_CELL ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT italic_j 2 italic_π ( italic_w start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT - italic_w start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_CELL start_CELL ⋯ end_CELL start_CELL ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT italic_j 2 italic_π ( italic_w start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT - italic_w start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_CELL end_ROW start_ROW start_CELL ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT - italic_j 2 italic_π ( italic_w start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT - italic_w start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_CELL start_CELL italic_n end_CELL start_CELL ⋯ end_CELL start_CELL ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT italic_j 2 italic_π ( italic_w start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT - italic_w start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_CELL end_ROW start_ROW start_CELL ⋮ end_CELL start_CELL ⋮ end_CELL start_CELL ⋯ end_CELL start_CELL ⋮ end_CELL end_ROW start_ROW start_CELL ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT - italic_j 2 italic_π ( italic_w start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT - italic_w start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_CELL start_CELL ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT - italic_j 2 italic_π ( italic_w start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT - italic_w start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_CELL start_CELL ⋯ end_CELL start_CELL italic_n end_CELL end_ROW end_ARG ] . (23)

The modulus of off-diagonal elements of 𝚽H𝚽superscript𝚽𝐻𝚽\boldsymbol{\Phi}^{H}\boldsymbol{\Phi}bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ is bounded as

|k=1nej2π(wl1wl2)fk|k=1n|ej2π(wl1wl2)fk|=n.superscriptsubscript𝑘1𝑛superscript𝑒𝑗2𝜋subscript𝑤subscript𝑙1subscript𝑤subscript𝑙2subscript𝑓𝑘superscriptsubscript𝑘1𝑛superscript𝑒𝑗2𝜋subscript𝑤subscript𝑙1subscript𝑤subscript𝑙2subscript𝑓𝑘𝑛\left|\sum_{k=1}^{n}e^{j2\pi(w_{l_{1}}-w_{l_{2}})f_{k}}\right|\leq\sum_{k=1}^{% n}\left|e^{j2\pi(w_{l_{1}}-w_{l_{2}})f_{k}}\right|=n.| ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT italic_j 2 italic_π ( italic_w start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT - italic_w start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT | ≤ ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT | italic_e start_POSTSUPERSCRIPT italic_j 2 italic_π ( italic_w start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT - italic_w start_POSTSUBSCRIPT italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUPERSCRIPT | = italic_n . (24)

Observe that (𝚽H𝚽)𝚽H2(𝚽H𝚽)2𝚽H2subscriptdelimited-∥∥superscriptsuperscript𝚽𝐻𝚽superscript𝚽𝐻2subscriptdelimited-∥∥superscriptsuperscript𝚽𝐻𝚽2subscriptdelimited-∥∥superscript𝚽𝐻2\lVert(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})^{\dagger}\boldsymbol{\Phi}^{H}% \rVert_{2}\leq\lVert(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})^{\dagger}\rVert_{% 2}\lVert\boldsymbol{\Phi}^{H}\rVert_{2}∥ ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ ∥ ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ∥ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT. First we bound 𝚽H2subscriptdelimited-∥∥superscript𝚽𝐻2\lVert\boldsymbol{\Phi}^{H}\rVert_{2}∥ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, by directly using the definition:

𝚽H2=λmax(𝚽𝚽H),subscriptdelimited-∥∥superscript𝚽𝐻2subscript𝜆max𝚽superscript𝚽𝐻\lVert\boldsymbol{\Phi}^{H}\rVert_{2}=\sqrt{\lambda_{\text{max}}(\boldsymbol{% \Phi}\boldsymbol{\Phi}^{H})},∥ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = square-root start_ARG italic_λ start_POSTSUBSCRIPT max end_POSTSUBSCRIPT ( bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ) end_ARG , (25)

where λmax(𝚽𝚽H)subscript𝜆max𝚽superscript𝚽𝐻\lambda_{\text{max}}(\boldsymbol{\Phi}\boldsymbol{\Phi}^{H})italic_λ start_POSTSUBSCRIPT max end_POSTSUBSCRIPT ( bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ) is the maximum eigenvalue of 𝚽𝚽H𝚽superscript𝚽𝐻\boldsymbol{\Phi}\boldsymbol{\Phi}^{H}bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT. Since 𝚽𝚽H𝚽superscript𝚽𝐻\boldsymbol{\Phi}\boldsymbol{\Phi}^{H}bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT is Toeplitz, a bound on the maximum eigenvalue can be established as follows (Hertz, 1992). Let 𝝍=[ψ1,ψ2,,ψn]𝝍superscriptsubscript𝜓1subscript𝜓2subscript𝜓𝑛top\boldsymbol{\psi}=\left[\psi_{1},\psi_{2},\cdots,\psi_{n}\right]^{\top}bold_italic_ψ = [ italic_ψ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_ψ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , italic_ψ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ] start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT be a vector such that ψ1=1subscript𝜓11\psi_{1}=1italic_ψ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = 1 and

ψk=2cos(π(n1)/(k1)+2),k=2,,n.formulae-sequencesubscript𝜓𝑘2𝜋𝑛1𝑘12𝑘2𝑛\psi_{k}=2*\cos\left(\frac{\pi}{\lfloor(n-1)/(k-1)\rfloor+2}\right),\quad k=2,% \cdots,n.italic_ψ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT = 2 ∗ roman_cos ( divide start_ARG italic_π end_ARG start_ARG ⌊ ( italic_n - 1 ) / ( italic_k - 1 ) ⌋ + 2 end_ARG ) , italic_k = 2 , ⋯ , italic_n . (26)

Also, let 𝜻=[|(𝚽𝚽H)1,1|,|(𝚽𝚽H)1,2|,,|(𝚽𝚽H)1,n|]𝜻superscriptsubscript𝚽superscript𝚽𝐻11subscript𝚽superscript𝚽𝐻12subscript𝚽superscript𝚽𝐻1𝑛top\boldsymbol{\zeta}=\left[|(\boldsymbol{\Phi}\boldsymbol{\Phi}^{H})_{1,1}|,|(% \boldsymbol{\Phi}\boldsymbol{\Phi}^{H})_{1,2}|,\cdots,|(\boldsymbol{\Phi}% \boldsymbol{\Phi}^{H})_{1,n}|\right]^{\top}bold_italic_ζ = [ | ( bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ) start_POSTSUBSCRIPT 1 , 1 end_POSTSUBSCRIPT | , | ( bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ) start_POSTSUBSCRIPT 1 , 2 end_POSTSUBSCRIPT | , ⋯ , | ( bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ) start_POSTSUBSCRIPT 1 , italic_n end_POSTSUBSCRIPT | ] start_POSTSUPERSCRIPT ⊤ end_POSTSUPERSCRIPT, that is, the modulus of the terms in first row of 𝚽𝚽H𝚽superscript𝚽𝐻\boldsymbol{\Phi}\boldsymbol{\Phi}^{H}bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT. Let λksubscript𝜆𝑘\lambda_{k}italic_λ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT be the k𝑘kitalic_k-th eigenvalue of 𝚽𝚽H𝚽superscript𝚽𝐻\boldsymbol{\Phi}\boldsymbol{\Phi}^{H}bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT. Then it follows that (Hertz, 1992):

max1kn(λk)𝜻T𝝍.subscript1𝑘𝑛subscript𝜆𝑘superscript𝜻𝑇𝝍\max_{1\leq k\leq n}(\lambda_{k})\leq\boldsymbol{\zeta}^{T}\boldsymbol{\psi}.roman_max start_POSTSUBSCRIPT 1 ≤ italic_k ≤ italic_n end_POSTSUBSCRIPT ( italic_λ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) ≤ bold_italic_ζ start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT bold_italic_ψ . (27)

Further observe that max1kn(ψk)=2subscript1𝑘𝑛subscript𝜓𝑘2\max_{1\leq k\leq n}(\psi_{k})=2roman_max start_POSTSUBSCRIPT 1 ≤ italic_k ≤ italic_n end_POSTSUBSCRIPT ( italic_ψ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) = 2, hence, together with the bound on the values of 𝜻𝜻\boldsymbol{\zeta}bold_italic_ζ from Equation (24) it follows that

𝜻T𝝍=n+k=2n|(𝚽𝚽H)1,k|ψk(n+nk=2n2)=2n2n.superscript𝜻𝑇𝝍missing-subexpressionabsent𝑛superscriptsubscript𝑘2𝑛subscript𝚽superscript𝚽𝐻1𝑘subscript𝜓𝑘absent𝑛𝑛superscriptsubscript𝑘2𝑛22superscript𝑛2𝑛\displaystyle\boldsymbol{\zeta}^{T}\boldsymbol{\psi}\begin{aligned} &=n+\sum_{% k=2}^{n}|(\boldsymbol{\Phi}\boldsymbol{\Phi}^{H})_{1,k}|\psi_{k}&\leq\left(n+n% \sum_{k=2}^{n}2\right)=2n^{2}-n.\end{aligned}bold_italic_ζ start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT bold_italic_ψ start_ROW start_CELL end_CELL start_CELL = italic_n + ∑ start_POSTSUBSCRIPT italic_k = 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT | ( bold_Φ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ) start_POSTSUBSCRIPT 1 , italic_k end_POSTSUBSCRIPT | italic_ψ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_CELL start_CELL ≤ ( italic_n + italic_n ∑ start_POSTSUBSCRIPT italic_k = 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT 2 ) = 2 italic_n start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - italic_n . end_CELL end_ROW (28)

Hence the norm 𝚽H22n2nsubscriptdelimited-∥∥superscript𝚽𝐻22superscript𝑛2𝑛\lVert{{\boldsymbol{\Phi}}}^{H}\rVert_{2}\leq 2n^{2}-n∥ bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ 2 italic_n start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - italic_n. It now remains to bound the quantity (𝚽H𝚽)2subscriptdelimited-∥∥superscriptsuperscript𝚽𝐻𝚽2\lVert(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})^{\dagger}\rVert_{2}∥ ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT. By computing the singular value decomposition of (𝚽H𝚽)2subscriptdelimited-∥∥superscriptsuperscript𝚽𝐻𝚽2\lVert(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})^{\dagger}\rVert_{2}∥ ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, it follows directly that

(𝚽H𝚽)21σmin2(𝚽),subscriptdelimited-∥∥superscriptsuperscript𝚽𝐻𝚽21subscriptsuperscript𝜎2min𝚽\lVert(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})^{\dagger}\rVert_{2}\leq\frac{1}% {\sigma^{2}_{\text{min}}(\boldsymbol{\Phi})},∥ ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ divide start_ARG 1 end_ARG start_ARG italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT ( bold_Φ ) end_ARG , (29)

where σmin2(𝚽)>0subscriptsuperscript𝜎2min𝚽0\sigma^{2}_{\text{min}}(\boldsymbol{\Phi})>0italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT ( bold_Φ ) > 0 is the smallest non-zero singular value of 𝚽𝚽\boldsymbol{\Phi}bold_Φ. Using the fact that

k=1nλk(𝚽H𝚽)=Tr(𝚽H𝚽)=n2,superscriptsubscript𝑘1𝑛subscript𝜆𝑘superscript𝚽𝐻𝚽Trsuperscript𝚽𝐻𝚽superscript𝑛2\sum_{k=1}^{n}\lambda_{k}(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})=\text{Tr}(% \boldsymbol{\Phi}^{H}\boldsymbol{\Phi})=n^{2},∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_λ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) = Tr ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) = italic_n start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , (30)

where Tr(𝚽H𝚽)Trsuperscript𝚽𝐻𝚽\text{Tr}(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})Tr ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) is the matrix trace, it follows that the smallest non-zero eigenvalue is bounded as 0<λmin(𝚽H𝚽)n0subscript𝜆minsuperscript𝚽𝐻𝚽𝑛0<\lambda_{\text{min}}(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})\leq n0 < italic_λ start_POSTSUBSCRIPT min end_POSTSUBSCRIPT ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) ≤ italic_n which implies that σmin2(𝚽H𝚽)nsubscriptsuperscript𝜎2minsuperscript𝚽𝐻𝚽𝑛\sigma^{2}_{\text{min}}(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})\leq nitalic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) ≤ italic_n. For any such that σmin2(𝚽H𝚽)subscriptsuperscript𝜎2minsuperscript𝚽𝐻𝚽\sigma^{2}_{\text{min}}(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ), we can always find a 0<γ10𝛾10<\gamma\leq 10 < italic_γ ≤ 1 such that σmin2(𝚽H𝚽)>γnsubscriptsuperscript𝜎2minsuperscript𝚽𝐻𝚽𝛾𝑛\sigma^{2}_{\text{min}}(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})>{\gamma n}italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) > italic_γ italic_n. Hence the bound in Equation (29) can be extended to

(𝚽H𝚽)21σmin2(𝚽)1γn.subscriptdelimited-∥∥superscriptsuperscript𝚽𝐻𝚽21subscriptsuperscript𝜎2min𝚽1𝛾𝑛\lVert(\boldsymbol{\Phi}^{H}\boldsymbol{\Phi})^{\dagger}\rVert_{2}\leq\frac{1}% {\sigma^{2}_{\text{min}}(\boldsymbol{\Phi})}\leq\frac{1}{\gamma n}.∥ ( bold_Φ start_POSTSUPERSCRIPT italic_H end_POSTSUPERSCRIPT bold_Φ ) start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ divide start_ARG 1 end_ARG start_ARG italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT min end_POSTSUBSCRIPT ( bold_Φ ) end_ARG ≤ divide start_ARG 1 end_ARG start_ARG italic_γ italic_n end_ARG . (31)

Now using the bound in Equations (28) and  (31), the bound in Equation (22) has the form

Π𝚽(𝐱iF(𝐬(tp,i)))22n1γF(𝐬(tp,i))F(𝐬(tp,i))2subscriptdelimited-∥∥subscriptΠ𝚽superscriptsubscript𝐱𝑖𝐹𝐬subscript𝑡𝑝𝑖22𝑛1𝛾subscriptdelimited-∥∥superscript𝐹𝐬subscript𝑡𝑝𝑖𝐹𝐬subscript𝑡𝑝𝑖2\lVert\Pi_{\mathbf{\boldsymbol{\Phi}}}\left(\mathbf{x}_{i}^{*}-F(\mathbf{s}(t_% {p,i}))\right)\rVert_{2}\leq\frac{2n-1}{\gamma}\lVert F^{*}(\mathbf{s}(t_{p,i}% ))-F(\mathbf{s}(t_{p,i}))\rVert_{2}∥ roman_Π start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ divide start_ARG 2 italic_n - 1 end_ARG start_ARG italic_γ end_ARG ∥ italic_F start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT (32)

The bound on the Hausdorff distance between 𝐅superscript𝐅\mathbf{F}^{\prime}bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and 𝐅𝐅\mathbf{F}bold_F is then expressed as

dH(Π^𝚽(𝐱),𝐅)supi,pΠ𝚽(𝐱iF(𝐬(tp,i)))22n1γsupi,pF(𝐬(tp,i))F(𝐬(tp,i))2,1inpMτp,1pP.subscriptdHsubscript^Π𝚽superscript𝐱𝐅missing-subexpressionabsentsubscriptsupremum𝑖𝑝subscriptdelimited-∥∥subscriptΠ𝚽superscriptsubscript𝐱𝑖𝐹𝐬subscript𝑡𝑝𝑖2missing-subexpressionformulae-sequenceformulae-sequenceabsent2𝑛1𝛾subscriptsupremum𝑖𝑝subscriptdelimited-∥∥superscript𝐹𝐬subscript𝑡𝑝𝑖𝐹𝐬subscript𝑡𝑝𝑖21𝑖subscript𝑛𝑝𝑀subscript𝜏𝑝1𝑝𝑃\displaystyle\text{d}_{\text{H}}(\widehat{\Pi}_{\boldsymbol{\Phi}}(\mathbf{x}^% {*}),\mathbf{F})\begin{aligned} &\leq\sup\limits_{i,p}\lVert\Pi_{\mathbf{% \boldsymbol{\Phi}}}\left(\mathbf{x}_{i}^{*}-F(\mathbf{s}(t_{p,i}))\right)% \rVert_{2}\\ &\leq\frac{2n-1}{\gamma}\sup\limits_{i,p}\lVert F^{*}(\mathbf{s}(t_{p,i}))-F(% \mathbf{s}(t_{p,i}))\rVert_{2},1\leq i\leq n_{p}-M\tau_{p},1\leq p\leq P.\end{aligned}d start_POSTSUBSCRIPT H end_POSTSUBSCRIPT ( over^ start_ARG roman_Π end_ARG start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) , bold_F ) start_ROW start_CELL end_CELL start_CELL ≤ roman_sup start_POSTSUBSCRIPT italic_i , italic_p end_POSTSUBSCRIPT ∥ roman_Π start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL ≤ divide start_ARG 2 italic_n - 1 end_ARG start_ARG italic_γ end_ARG roman_sup start_POSTSUBSCRIPT italic_i , italic_p end_POSTSUBSCRIPT ∥ italic_F start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , 1 ≤ italic_i ≤ italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_M italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT , 1 ≤ italic_p ≤ italic_P . end_CELL end_ROW (33)

From the equality in Equation (21), it holds that

dB(Dgm(𝐅),Dgm(𝐅))2dGH(𝐅,𝐅)=2dGH(Π^𝚽(𝐱),𝐅).subscriptdBDgm𝐅Dgmsuperscript𝐅2subscriptdGH𝐅superscript𝐅2subscriptdGHsubscript^Π𝚽superscript𝐱𝐅\text{d}_{\text{B}}(\text{Dgm}({\mathbf{F}}),\text{Dgm}({\mathbf{F}^{\prime}})% )\leq 2\text{d}_{\textbf{GH}}\left(\mathbf{F},\mathbf{F}^{\prime}\right)=2% \text{d}_{\textbf{GH}}\left(\widehat{\Pi}_{\boldsymbol{\Phi}}(\mathbf{x}^{*}),% \mathbf{F}\right).d start_POSTSUBSCRIPT B end_POSTSUBSCRIPT ( Dgm ( bold_F ) , Dgm ( bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ) ≤ 2 d start_POSTSUBSCRIPT GH end_POSTSUBSCRIPT ( bold_F , bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) = 2 d start_POSTSUBSCRIPT GH end_POSTSUBSCRIPT ( over^ start_ARG roman_Π end_ARG start_POSTSUBSCRIPT bold_Φ end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) , bold_F ) . (34)

The Gromov-Hausdorff distance is further bounded above by the Hausdorff distance. From Equations (33), the bound on the bottleneck distance between Dgm(𝐅)Dgm𝐅\text{Dgm}({\mathbf{F}})Dgm ( bold_F ) and Dgm(𝐅)Dgmsuperscript𝐅\text{Dgm}({\mathbf{F}^{\prime}})Dgm ( bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) is established as

dB(Dgm(𝐅),Dgm(𝐅))4n2γ(supi,pF(𝐬(tp,i))F(𝐬(tp,i))2),subscriptdBDgm𝐅Dgmsuperscript𝐅4𝑛2𝛾subscriptsupremum𝑖𝑝subscriptnormsuperscript𝐹𝐬subscript𝑡𝑝𝑖𝐹𝐬subscript𝑡𝑝𝑖2\text{d}_{\text{B}}(\text{Dgm}({\mathbf{F}}),\text{Dgm}({\mathbf{F}^{\prime}})% )\leq\frac{4n-2}{\gamma}\left(\sup\limits_{i,p}||F^{*}(\mathbf{s}(t_{p,i}))-F(% \mathbf{s}(t_{p,i}))||_{2}\right),d start_POSTSUBSCRIPT B end_POSTSUBSCRIPT ( Dgm ( bold_F ) , Dgm ( bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ) ≤ divide start_ARG 4 italic_n - 2 end_ARG start_ARG italic_γ end_ARG ( roman_sup start_POSTSUBSCRIPT italic_i , italic_p end_POSTSUBSCRIPT | | italic_F start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) - italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) | | start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) , (35)

where 1inpMτp,1pPformulae-sequence1𝑖subscript𝑛𝑝𝑀subscript𝜏𝑝1𝑝𝑃1\leq i\leq n_{p}-M\tau_{p},\quad 1\leq p\leq P1 ≤ italic_i ≤ italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT - italic_M italic_τ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT , 1 ≤ italic_p ≤ italic_P. If the observed time series is ‘noise-free’ such that 𝐱=𝐱superscript𝐱𝐱\mathbf{x}^{*}=\mathbf{x}bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = bold_x, then supi,pF(𝐬(tp,i))F(𝐬(tp,i))2=0,i,psubscriptsupremum𝑖𝑝subscriptnorm𝐹𝐬subscript𝑡𝑝𝑖superscript𝐹𝐬subscript𝑡𝑝𝑖20for-all𝑖𝑝\sup\limits_{i,p}||F(\mathbf{s}(t_{p,i}))-F^{*}(\mathbf{s}(t_{p,i}))||_{2}=0,% \forall i,proman_sup start_POSTSUBSCRIPT italic_i , italic_p end_POSTSUBSCRIPT | | italic_F ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i end_POSTSUBSCRIPT ) ) | | start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 0 , ∀ italic_i , italic_p, and dB(Dgm(𝐅),Dgm(𝐅))=0subscriptdBDgm𝐅Dgmsuperscript𝐅0\text{d}_{\text{B}}(\text{Dgm}({\mathbf{F}}),\text{Dgm}({\mathbf{F}^{\prime}})% )=0d start_POSTSUBSCRIPT B end_POSTSUBSCRIPT ( Dgm ( bold_F ) , Dgm ( bold_F start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ) = 0. ∎

S1.2 Proof of Lemma 4.2

Proof.

By construction, 𝐅2𝐅^2superscript𝐅2superscript^𝐅2\mathbf{F}^{2}\subset\widehat{\mathbf{F}}^{2}bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ⊂ over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, thus for any F2(s(tp,i1))𝐅2superscript𝐹2𝑠subscript𝑡𝑝subscript𝑖1superscript𝐅2F^{2}(s(t_{p,i_{1}}))\in\mathbf{F}^{2}italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ) ∈ bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, F^2(s(tp,i2))𝐅^2superscript^𝐹2𝑠subscript𝑡𝑝subscript𝑖2superscript^𝐅2\exists\widehat{F}^{2}(s(t_{p,i_{2}}))\in\widehat{\mathbf{F}}^{2}∃ over^ start_ARG italic_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ) ∈ over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT such that F2(s(tp,i1))=F^2(s(tp,i2))superscript𝐹2𝑠subscript𝑡𝑝subscript𝑖1superscript^𝐹2𝑠subscript𝑡𝑝subscript𝑖2F^{2}(s(t_{p,i_{1}}))=\widehat{F}^{2}(s(t_{p,i_{2}}))italic_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ) = over^ start_ARG italic_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_s ( italic_t start_POSTSUBSCRIPT italic_p , italic_i start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ). Further observe that |𝐅2|u=|𝐅^2|usubscriptsuperscript𝐅2𝑢subscriptsuperscript^𝐅2𝑢|\mathbf{F}^{2}|_{u}=|\hat{\mathbf{F}}^{2}|_{u}| bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT | start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT = | over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT | start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT, where |.|u|.|_{u}| . | start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT is a measure of the cardinality of unique observations. Hence it follows that Dgm(𝐅2)Dgm(𝐅^2)Dgmsuperscript𝐅2Dgmsuperscript^𝐅2\text{Dgm}(\mathbf{F}^{2})\equiv\text{Dgm}(\widehat{\mathbf{F}}^{2})Dgm ( bold_F start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) ≡ Dgm ( over^ start_ARG bold_F end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ), and the conclusion is a direct consequence of this equivalence. ∎

S1.3 Proof of Theorem 4.5

Proof.

By construction of the subsequence and assumption A2, the Hausdorff distance between 𝐅msubscript𝐅𝑚\mathbf{F}_{m}bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT and 𝐅ϑsubscript𝐅italic-ϑ\mathbf{F}_{\vartheta}bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT has the form

dH(𝐅m,𝐅ϑ)=supFϑ𝐅ϑmin1imF(s(ti))Fϑ2.\text{d}_{\text{H}}(\mathbf{F}_{m},\mathbf{F}_{\vartheta})=\sup_{F_{\vartheta}% \in\mathbf{F}_{\vartheta}}\min_{1\leq i\leq m}\lVert F(s(t_{i}))-F_{\vartheta}% \rVert_{2}.d start_POSTSUBSCRIPT H end_POSTSUBSCRIPT ( bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT , bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ) = roman_sup start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT . (36)

Let 𝐅0𝐅ϑsubscript𝐅0subscript𝐅italic-ϑ\mathbf{F}_{0}\subset\mathbf{F}_{\vartheta}bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⊂ bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT be a set of ball centers such that

𝐅ϑF0𝐅0B(F0,δ),subscript𝐅italic-ϑsubscriptsubscript𝐹0subscript𝐅0𝐵subscript𝐹0𝛿\mathbf{F}_{\vartheta}\subset\bigcup_{F_{0}\in\mathbf{F}_{0}}B(F_{0},\delta),bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ⊂ ⋃ start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_B ( italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_δ ) , (37)

i.e., the minimal covering of 𝐅ϑsubscript𝐅italic-ϑ\mathbf{F}_{\vartheta}bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT consisting of balls of radius δ𝛿\deltaitalic_δ around F0𝐅0subscript𝐹0subscript𝐅0F_{0}\in\mathbf{F}_{0}italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. For any Fϑ𝐅ϑsubscript𝐹italic-ϑsubscript𝐅italic-ϑF_{\vartheta}\in\mathbf{F}_{\vartheta}italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT and F0𝐅0subscript𝐹0subscript𝐅0F_{0}\in\mathbf{F}_{0}italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, the following inequality holds:

min1imF(s(ti))Fϑ2F(s(tj))Fϑ2=F(s(tj))F0+F0Fϑ2F(s(tj))F02+F0Fϑ2,j=1,,m.\displaystyle\min_{1\leq i\leq m}\lVert F(s(t_{i}))-F_{\vartheta}\rVert_{2}% \begin{aligned} &\leq\lVert F(s(t_{j}))-F_{\vartheta}\rVert_{2}\\ &=\lVert F(s(t_{j}))-F_{0}+F_{0}-F_{\vartheta}\rVert_{2}\\ &\leq\lVert F(s(t_{j}))-F_{0}\rVert_{2}+\lVert F_{0}-F_{\vartheta}\rVert_{2},% \quad j=1,\cdots,m.\end{aligned}roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_ROW start_CELL end_CELL start_CELL ≤ ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL = ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT + italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT - italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL ≤ ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT + ∥ italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT - italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_j = 1 , ⋯ , italic_m . end_CELL end_ROW (38)

Observe that F0Fϑ2subscriptdelimited-∥∥subscript𝐹0subscript𝐹italic-ϑ2\lVert F_{0}-F_{\vartheta}\rVert_{2}∥ italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT - italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is bounded by the radius δ𝛿\deltaitalic_δ, hence using Equation (37), it follows that

min1imF(s(ti))Fϑ2δ+maxF0𝐅0min1imF(s(ti))F02ε,\min_{1\leq i\leq m}\lVert F(s(t_{i}))-F_{\vartheta}\rVert_{2}\leq\delta+\max_% {F_{0}\in\mathbf{F}_{0}}\min_{1\leq i\leq m}\lVert F(s(t_{i}))-F_{0}\rVert_{2}% \leq\varepsilon,roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ italic_δ + roman_max start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ italic_ε , (39)

for some ϵ>0italic-ϵ0\epsilon>0italic_ϵ > 0. Further, taking the supremum over all Fϑsubscript𝐹italic-ϑF_{\vartheta}italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT, the relation still holds:

supFϑ𝐅ϑmin1imF(s(ti))Fϑ2δ+maxF0𝐅0min1imF(s(ti))F02ε,\sup_{F_{\vartheta}\in\mathbf{F}_{\vartheta}}\min_{1\leq i\leq m}\lVert F(s(t_% {i}))-F_{\vartheta}\rVert_{2}\leq\delta+\max_{F_{0}\in\mathbf{F}_{0}}\min_{1% \leq i\leq m}\lVert F(s(t_{i}))-F_{0}\rVert_{2}\leq\varepsilon,roman_sup start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ italic_δ + roman_max start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ italic_ε , (40)

Then the probability that supFϑ𝐅ϑmin1imF(s(ti))Fϑ2\sup_{F_{\vartheta}\in\mathbf{F}_{\vartheta}}\min_{1\leq i\leq m}\lVert F(s(t_% {i}))-F_{\vartheta}\rVert_{2}roman_sup start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT exceeds ε𝜀\varepsilonitalic_ε is bounded as

Pr(supFϑ𝐅ϑmin1imF(s(ti))Fϑ2>ε)Pr(δ+maxF0𝐅0min1imF(s(ti))F02>ε)=Pr(maxF0𝐅0min1imF(s(ti))F02>εδ)\displaystyle\text{Pr}\left(\sup_{F_{\vartheta}\in\mathbf{F}_{\vartheta}}\min_% {1\leq i\leq m}\lVert F(s(t_{i}))-F_{\vartheta}\rVert_{2}>\varepsilon\right)% \begin{aligned} &\leq\text{Pr}\left(\delta+\max_{F_{0}\in\mathbf{F}_{0}}\min_{% 1\leq i\leq m}\lVert F(s(t_{i}))-F_{0}\rVert_{2}>\varepsilon\right)\\ &=\text{Pr}\left(\max_{F_{0}\in\mathbf{F}_{0}}\min_{1\leq i\leq m}\lVert F(s(t% _{i}))-F_{0}\rVert_{2}>\varepsilon-\delta\right)\end{aligned}Pr ( roman_sup start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT > italic_ε ) start_ROW start_CELL end_CELL start_CELL ≤ Pr ( italic_δ + roman_max start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT > italic_ε ) end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL = Pr ( roman_max start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT > italic_ε - italic_δ ) end_CELL end_ROW (41)

From Equation (36), supFϑ𝐅ϑmin1imF(s(ti))Fϑ2=dH(𝐅m,𝐅ϑ)\sup_{F_{\vartheta}\in\mathbf{F}_{\vartheta}}\min_{1\leq i\leq m}\lVert F(s(t_% {i}))-F_{\vartheta}\rVert_{2}=\text{d}_{\text{H}}(\mathbf{F}_{m},\mathbf{F}_{% \vartheta})roman_sup start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = d start_POSTSUBSCRIPT H end_POSTSUBSCRIPT ( bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT , bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ), hence a bound on the probability of dH(𝐅m,𝐅ϑ)subscriptdHsubscript𝐅𝑚subscript𝐅italic-ϑ\text{d}_{\text{H}}(\mathbf{F}_{m},\mathbf{F}_{\vartheta})d start_POSTSUBSCRIPT H end_POSTSUBSCRIPT ( bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT , bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ) exceeding ε𝜀\varepsilonitalic_ε can be obtained as

Pr(dH(𝐅m,𝐅ϑ)>ε)Pr(maxF0𝐅0min1imF(s(ti))F02>εδ).\text{Pr}\left(\text{d}_{\text{H}}(\mathbf{F}_{m},\mathbf{F}_{\vartheta})>% \varepsilon\right)\leq\text{Pr}\left(\max_{F_{0}\in\mathbf{F}_{0}}\min_{1\leq i% \leq m}\lVert F(s(t_{i}))-F_{0}\rVert_{2}>\varepsilon-\delta\right).Pr ( d start_POSTSUBSCRIPT H end_POSTSUBSCRIPT ( bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT , bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ) > italic_ε ) ≤ Pr ( roman_max start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT > italic_ε - italic_δ ) . (42)

Observe that 𝐅ϑsubscript𝐅italic-ϑ\mathbf{F}_{\vartheta}bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT endowed with the Hausdorff metric is complete and separable (Attouch et al., 1991). Then for ε𝜀\varepsilonitalic_ε small enough, the following bound holds (Cuevas and Rodríguez-Casal, 2004):

Pr(maxF0𝐅0min1imF(s(ti))F02>εδ)C(1κω(εδ)M+1)m.\text{Pr}\left(\max_{F_{0}\in\mathbf{F}_{0}}\min_{1\leq i\leq m}\lVert F(s(t_{% i}))-F_{0}\rVert_{2}>\varepsilon-\delta\right)\leq C\left(1-\kappa\omega(% \varepsilon-\delta)^{M+1}\right)^{m}.Pr ( roman_max start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT > italic_ε - italic_δ ) ≤ italic_C ( 1 - italic_κ italic_ω ( italic_ε - italic_δ ) start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT . (43)

The constant C𝐶Citalic_C is the number of points in the covering of 𝐅ϑsubscript𝐅italic-ϑ\mathbf{F}_{\vartheta}bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT, i.e., |𝐅0|subscript𝐅0|\mathbf{F}_{0}|| bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT |, and ω𝜔\omegaitalic_ω is the Lebesgue measure of the unit ball in M+1superscript𝑀1\mathbb{R}^{M+1}blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT. Note that since 0κω(εδ)M+110𝜅𝜔superscript𝜀𝛿𝑀110\leq\kappa\omega(\varepsilon-\delta)^{M+1}\leq 10 ≤ italic_κ italic_ω ( italic_ε - italic_δ ) start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT ≤ 1, it follows that (1κω(εδ)M+1)memκω(εδ)M+1superscript1𝜅𝜔superscript𝜀𝛿𝑀1𝑚superscript𝑒𝑚𝜅𝜔superscript𝜀𝛿𝑀1\left(1-\kappa\omega(\varepsilon-\delta)^{M+1}\right)^{m}\leq e^{-m\kappa% \omega(\varepsilon-\delta)^{M+1}}( 1 - italic_κ italic_ω ( italic_ε - italic_δ ) start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ≤ italic_e start_POSTSUPERSCRIPT - italic_m italic_κ italic_ω ( italic_ε - italic_δ ) start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT. This allows for Equations (42) and (43) to be rewritten as

Pr(dH(𝐅m,𝐅ϑ)>ε)Pr(maxF0𝐅0min1imF(s(ti))F02>εδ)Cemκω(εδ)M+1.\text{Pr}\left(\text{d}_{\text{H}}(\mathbf{F}_{m},\mathbf{F}_{\vartheta})>% \varepsilon\right)\leq\text{Pr}\left(\max_{F_{0}\in\mathbf{F}_{0}}\min_{1\leq i% \leq m}\lVert F(s(t_{i}))-F_{0}\rVert_{2}>\varepsilon-\delta\right)\leq Ce^{-m% \kappa\omega(\varepsilon-\delta)^{M+1}}.Pr ( d start_POSTSUBSCRIPT H end_POSTSUBSCRIPT ( bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT , bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ) > italic_ε ) ≤ Pr ( roman_max start_POSTSUBSCRIPT italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ bold_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_min start_POSTSUBSCRIPT 1 ≤ italic_i ≤ italic_m end_POSTSUBSCRIPT ∥ italic_F ( italic_s ( italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ) - italic_F start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT > italic_ε - italic_δ ) ≤ italic_C italic_e start_POSTSUPERSCRIPT - italic_m italic_κ italic_ω ( italic_ε - italic_δ ) start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT . (44)

Choose some K>(2κω)1M+1𝐾superscript2𝜅𝜔1𝑀1K>\left(\frac{2}{\kappa\omega}\right)^{\frac{1}{M+1}}italic_K > ( divide start_ARG 2 end_ARG start_ARG italic_κ italic_ω end_ARG ) start_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG italic_M + 1 end_ARG end_POSTSUPERSCRIPT and let εm(r)=(rmlm)subscript𝜀𝑚𝑟𝑟𝑚𝑙𝑚\varepsilon_{m}(r)=\left(r-\frac{m-l}{m}\right)italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_r ) = ( italic_r - divide start_ARG italic_m - italic_l end_ARG start_ARG italic_m end_ARG ), where l𝑙litalic_l is the number of missing observations in the initial time series and mlmuch-greater-than𝑚𝑙m\gg litalic_m ≫ italic_l, then for m𝑚mitalic_m large enough, it follows that

Pr((εm(r))1M+1dH(𝐅m,𝐅ϑ)>K)Cemκω((εm(r)2κω)1M+1δ)M+1.Prsuperscriptsubscript𝜀𝑚𝑟1𝑀1subscriptdHsubscript𝐅𝑚subscript𝐅italic-ϑ𝐾𝐶superscript𝑒𝑚𝜅𝜔superscriptsuperscriptsubscript𝜀𝑚𝑟2𝜅𝜔1𝑀1𝛿𝑀1\text{Pr}\left(\left(\varepsilon_{m}(r)\right)^{-\frac{1}{M+1}}\text{d}_{\text% {H}}(\mathbf{F}_{m},\mathbf{F}_{\vartheta})>K\right)\leq Ce^{-m\kappa\omega% \left((\varepsilon_{m}(r)\frac{2}{\kappa\omega})^{\frac{1}{M+1}}-\delta\right)% ^{M+1}}.Pr ( ( italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_r ) ) start_POSTSUPERSCRIPT - divide start_ARG 1 end_ARG start_ARG italic_M + 1 end_ARG end_POSTSUPERSCRIPT d start_POSTSUBSCRIPT H end_POSTSUBSCRIPT ( bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT , bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ) > italic_K ) ≤ italic_C italic_e start_POSTSUPERSCRIPT - italic_m italic_κ italic_ω ( ( italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_r ) divide start_ARG 2 end_ARG start_ARG italic_κ italic_ω end_ARG ) start_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG italic_M + 1 end_ARG end_POSTSUPERSCRIPT - italic_δ ) start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT . (45)

The above bound can be obtained by simply substituting (2εm(r)κω)1M+1superscript2subscript𝜀𝑚𝑟𝜅𝜔1𝑀1\left(\frac{2\varepsilon_{m}(r)}{\kappa\omega}\right)^{\frac{1}{M+1}}( divide start_ARG 2 italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_r ) end_ARG start_ARG italic_κ italic_ω end_ARG ) start_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG italic_M + 1 end_ARG end_POSTSUPERSCRIPT for ε𝜀\varepsilonitalic_ε in Equation (44). Now consider the sum

memκω((εm(r)2κω)1M+1δ)M+1,subscript𝑚superscript𝑒𝑚𝜅𝜔superscriptsuperscriptsubscript𝜀𝑚𝑟2𝜅𝜔1𝑀1𝛿𝑀1\sum_{m}e^{-m\kappa\omega\left((\varepsilon_{m}(r)\frac{2}{\kappa\omega})^{% \frac{1}{M+1}}-\delta\right)^{M+1}},∑ start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT - italic_m italic_κ italic_ω ( ( italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_r ) divide start_ARG 2 end_ARG start_ARG italic_κ italic_ω end_ARG ) start_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG italic_M + 1 end_ARG end_POSTSUPERSCRIPT - italic_δ ) start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT , (46)

and observe that it is convergent if εm(r)(δK)M+1subscript𝜀𝑚𝑟superscript𝛿𝐾𝑀1\varepsilon_{m}(r)\geq\left(\frac{\delta}{K}\right)^{M+1}italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_r ) ≥ ( divide start_ARG italic_δ end_ARG start_ARG italic_K end_ARG ) start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT. This condition can always be satisfied given the restriction K>(2κω)1M+1𝐾superscript2𝜅𝜔1𝑀1K>\left(\frac{2}{\kappa\omega}\right)^{\frac{1}{M+1}}italic_K > ( divide start_ARG 2 end_ARG start_ARG italic_κ italic_ω end_ARG ) start_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG italic_M + 1 end_ARG end_POSTSUPERSCRIPT and for an appropriate choice of κ𝜅\kappaitalic_κ and δ𝛿\deltaitalic_δ. Then by the Borel-Cantelli lemma (Émile Borel, 1909; Cantelli, 1917; Chung and Erdös, 1952), since

mPr((εm(r))1M+1dH(𝐅m,𝐅ϑ)>K)<,subscript𝑚Prsuperscriptsubscript𝜀𝑚𝑟1𝑀1subscriptdHsubscript𝐅𝑚subscript𝐅italic-ϑ𝐾\sum_{m}\text{Pr}\left(\left(\varepsilon_{m}(r)\right)^{-\frac{1}{M+1}}\text{d% }_{\text{H}}(\mathbf{F}_{m},\mathbf{F}_{\vartheta})>K\right)<\infty,∑ start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT Pr ( ( italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_r ) ) start_POSTSUPERSCRIPT - divide start_ARG 1 end_ARG start_ARG italic_M + 1 end_ARG end_POSTSUPERSCRIPT d start_POSTSUBSCRIPT H end_POSTSUBSCRIPT ( bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT , bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ) > italic_K ) < ∞ , (47)

it follows that

limmsup(εm(r))1M+1dH(𝐅m,𝐅ϑ)K.subscriptabsent𝑚supremumsuperscriptsubscript𝜀𝑚𝑟1𝑀1subscriptdHsubscript𝐅𝑚subscript𝐅italic-ϑ𝐾\lim_{m\xrightarrow{}\infty}\sup\left(\varepsilon_{m}(r)\right)^{-\frac{1}{M+1% }}\text{d}_{\text{H}}(\mathbf{F}_{m},\mathbf{F}_{\vartheta})\leq K.roman_lim start_POSTSUBSCRIPT italic_m start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW ∞ end_POSTSUBSCRIPT roman_sup ( italic_ε start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_r ) ) start_POSTSUPERSCRIPT - divide start_ARG 1 end_ARG start_ARG italic_M + 1 end_ARG end_POSTSUPERSCRIPT d start_POSTSUBSCRIPT H end_POSTSUBSCRIPT ( bold_F start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT , bold_F start_POSTSUBSCRIPT italic_ϑ end_POSTSUBSCRIPT ) ≤ italic_K . (48)

S2 Discussion

The proposed SSE methodology generalizes TDEs to allow for more realistic time-series data that are not uniformly spaced. The SSE method and TDA analysis of the reconstructed state space can be used to detect periodic signals, but there is potential for this framework to provide more useful information. This discussion highlights this claim by demonstrating that different time-series signals with common frequency domain and periodicity properties can exhibit different structural shapes in their reconstructed state spaces. The reconstruction provides qualitative and quantitative information that may be used for downstream analysis. We begin with artificial time-series data, and then provide a motivating application related to the detection and characterization of exoplanets.

S2.1 Denoising and Stability

To evaluate the performance of denoising procedure presented in in this work, and Proposition 4.14.14.14.1, the time series in Figure 3a of the main article was replicated at varying noise levels and sample sizes. The probability that any value is unobserved at a given time point is fixed at 0.250.250.250.25. Four noise levels {0,0.25,0.5,2}00.250.52\{0,0.25,0.5,2\}{ 0 , 0.25 , 0.5 , 2 } and five samples sizes {50,100,500,1000,5000}5010050010005000\{50,100,500,1000,5000\}{ 50 , 100 , 500 , 1000 , 5000 } were used in the simulations. For each noise level and sample size combination, the denoising method outlined in Section 4.24.24.24.2 of the main article was performed and the bottleneck distance between the corresponding persistence diagrams and the theoretical upper-bound are computed. The upper bound computed does not include the multiplicative factor 2n1γ2𝑛1𝛾\frac{2n-1}{\gamma}divide start_ARG 2 italic_n - 1 end_ARG start_ARG italic_γ end_ARG. Figure S1 shows a noisy time series and the outcome after denoising at various frequency thresholds.

Refer to caption
(a) Noisy original time series
Refer to caption
(b) Smoothed (threshold=5)
Refer to caption
(c) Smoothed (threshold=15)
Refer to caption
(d) Smoothed (threshold=25)
Figure S1: Illustration of the denoising method of Section 4.24.24.24.2 of the main article. The time series was perturbed with noise drawn from a N(0,0.25)𝑁00.25N(0,0.25)italic_N ( 0 , 0.25 ), and the probability of a missing observation is 0.250.250.250.25. (a) The original 500 time series measurements. The round orange points are observed values, while the blue diamonds are the missing values displayed at the true signal value without noise. The other sub-figures display the time series after denoising with a frequency threshold of 5 (b), 15 (c), and 25 (d).

For an appropriate choice of frequency threshold, which is generally application dependent, the true underlying signal can be satisfactorily recovered.

For each combination of the noise-level and sample size, the process is repeated 100 times and standard errors are obtained. The results are presented in Figure S2.

Refer to caption
Figure S2: Stability results of the denoising procedure (see Proposition 4.14.14.14.1 in the main text). The solid points represent the mean values from 100100100100 repetitions, the vertical lines on these points indicate the error bars (which are too small to see in many cases), and the colors and line types indicate the sample size. The vertical axis represents the bottleneck distance for the circle points and the error bound (without the multiplicative factor 2n1γ2𝑛1𝛾\frac{2n-1}{\gamma}divide start_ARG 2 italic_n - 1 end_ARG start_ARG italic_γ end_ARG) for the triangle points.

The bottleneck distance is bounded above by the error bound for all noise levels and sample sizes as expected. At the same noise level, smaller sample sizes tend to have larger bottleneck distances. This can partly be explained by the fact that the reconstructed space is more sparse (i.e., points in the space are more spread out since there are fewer points). The H0subscript𝐻0H_{0}italic_H start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT features are more likely to persist longer in such sparse settings. The reverse is true for the error bound in Proposition 4.14.14.14.1, which for the same noise level, is higher for larger sample sizes. This follows from the fact that larger sample sizes increases the chance of observing highly noisy terms. These results demonstrate the denoising procedure’s efficacy in controlling noise effects on the SSE’s topological features.

Proposition 4.1 establishes a conservative bound on the bottleneck distance between persistence diagrams of a noise-free and a denoised time series using Fourier methods. The factor (2n1)/γ2𝑛1𝛾(2n-1)/\gamma( 2 italic_n - 1 ) / italic_γ reflects the poor-conditioning of the Fourier matrix in non-uniform domains. Empirical evidence suggests this bound could be improved in more restricted settings, a topic for future research.

S2.2 Topology-Informed Features

Traditional time series methods (e.g., Lomb, 1976), such as Fourier analysis and autocorrelation, focus primarily on frequency components and periodicity. The SSE method coupled with TDA through techniques like persistent homology allow for further exploration of the time series beyond the observed scalar values. For instance, consider the square wave in Figure S3-bottom-left, where its abrupt transitions challenges frequency-based analyses (Lomb, 1976). Its sharp edges and discontinuities result in significant spectral leakage when analyzed using Fourier methods (Lomb, 1976). However, its SSE captures these sharp transitions (Figure S4)-bottom-left, while its associated persistence diagram in Figure S5-bottom-left effectively distinguishes between high and low states. Specifically, the high and low states of the square wave result in exactly a single cluster for each homology dimension. The persistence diagrams were computed on point-wise centered and scaled embedding maps.

Refer to caption
Figure S3: Four time-series signals. Top-left: sawtooth wave; Top-right: sine wave; Bottom-left: square wave; Bottom-right: triangular wave. All four signals were generated to have the same period and the observations were sampled at the same frequency.
Refer to caption
Figure S4: The embedding in 2superscript2\mathbb{R}^{2}blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT of the four time series in Figure S3. Each embedding highlights the unique shape of the original time series.

The sawtooth wave (Figure S3-top-left), sine wave (Figure S3-top-right), and the triangular wave (Figure S3-bottom-right) display distinct geometric structures in their embeddings (Figure S4). Note that their persistence diagrams (Figure S5) do not exhibit more significant H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT features compared to the square wave.

Refer to caption
Figure S5: The persistence diagrams for the four time series in Figure S3. The persistence of the most significant H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature in each diagram is noted above the corresponding feature. The square wave has a single highly persistent H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature consistent with its shape.

S2.3 Shape-Agnostic Analysis

Different time-series signals with the same periodicity and frequency information can exhibit unique and additional characteristics when examined through SSE and TDA. In particular, the SSE and TDA analysis pipeline further reveals the presence and longevity of topological features, emphasizing the stability of any cyclic structure in a way that complements traditional time-series methods. For example, the time-series signals in Figure S3 can be embedded in a space that emphasizes the stability of their cyclical structure. If one chooses an embedding dimension M+1𝑀1M+1italic_M + 1 and step-size τ𝜏\tauitalic_τ such that (M+1)τ𝑀1𝜏(M+1)\tau( italic_M + 1 ) italic_τ approximates the period of the time series, the cyclical structure can be well approximated. The four signals were each embedded in 20superscript20\mathbb{R}^{20}blackboard_R start_POSTSUPERSCRIPT 20 end_POSTSUPERSCRIPT, and a scatterplot of their first two principal components are shown in Figure S6. The circular (loops) shapes are indicative of the cyclical structures in the time series. Their corresponding persistence diagrams, computed on the pointwise-centered and scaled SSE matrices are shown in Figure S7 where the persistence of the H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature can be viewed as quantifying the stability and longevity of these cyclical structures.

Refer to caption
Figure S6: The embedding in M+1superscript𝑀1\mathbb{R}^{M+1}blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT of the four time series in Figure S3. M=19𝑀19M=19italic_M = 19 was chosen such that (M+1)τ𝑀1𝜏(M+1)\tau( italic_M + 1 ) italic_τ for τ=1/200𝜏1200\tau=1/200italic_τ = 1 / 200 approximate the period of the signals.
Refer to caption
Figure S7: The persistence diagrams associated with the embeddings Figure S6. All four signals have at least one-significant H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature, which captures the cyclical pattern in the original time series.

S2.4 Exoplanet Time-Series Data

Exoplanets are planets that orbit stars other than our sun. The radial velocity (RV) method is one of the most prolific methods for the detection of exoplanets, and the method used to discover the first exoplanet orbiting a sun-like star (Mayor and Queloz, 1995). The RV method is sometimes referred to as the “wobble method” because an exoplanet is detected through the motion, or the wobble, of its host star. More specifically, an exoplanet is detected by estimating the forward and backward velocity (i.e., the RV) of the star across time, and observing a particular periodic signature in the time series suggesting the RV of the star is induced by an exoplanet. A simulated example of the Keplerian model of the RV of star across time when an exoplanet is orbiting in a circular orbit (i.e., with an eccentricy of zero) is displayed as the red point and line in Figure S8. More background on statistical aspects of the RV method can be found in (Hara and Ford, 2023). Though the RV method has been successful at detecting exoplanets, challenges exist in detecting low-mass exoplanets, such as earth analogs, which, all else equal, produce smaller amplitude signals compared to more massive exoplanets.222The exoplanet detected in Mayor and Queloz, (1995) had an RV time series semi-amplitude around 60 m/s, and the RV signal for an earth-like exoplanet orbiting a sun-like star is around 10 cm/s.

Until the recent development of extreme precision radial velocity (EPRV) spectrographs, the detection of low-mass exoplanet RV signals was not generally feasible (Fischer et al., 2016). Estimation of planetary RV signals is especially challenging in the low-mass regime because activity and variability in the atmosphere of a star (e.g., the formation and evolution of star spots) can produce periodic variation imprinted on the spectra that can hide or mimic exoplanet signals (Huélamo et al., 2008; Dumusque, 2016; Davis et al., 2017). For example, the green triangle line in Figure S8 displays the RV times series induced by a single simulated star spot orbiting a star (discussed in more detail below). While the spot is behind the star, the RV is zero. However, when the spot rotates in view, it produces a sinusoid-like signal that is challenging to distinguish from an exoplanet signal in real RV observations. New statistical methods have been developed to address the challenge of detecting low-mass exoplanets in the presence of stellar activity (e.g., Rajpaul et al., 2015; Dumusque, 2018; Holzer et al., 2021a ; Holzer et al., 2021b ; Jones et al., 2022), but none of these methods fully or generally mitigate the issues (Zhao et al., 2022).

While we do not solve the issue with detecting low-mass exoplanets in the presence of stellar activity, ideas from TDE and TDA may prove to be fruitful. In particular, RV measurements of stars other than our sun are highly non-uniform,333Figure 3 of Zhao et al., (2022) displays examples of the non-uniformity of RV time-series observations. the proposed SSE method is a necessary foundation to use TDEs for exoplanet detection using the RV method.

To illustrate how TDEs and TDA may be used in this setting, we consider a simple simulated example of a planet RV time series, a star spot RV time series, and the combination of a planet and spot RV times series in Figure S8. The host star is set to have a stellar rotation period of 25.05 days, the planet RV time series has an orbital period of 4 days and is on a circular orbit resulting in a semi-amplitude of 0.87 m/s, and the single spot covers 0.05% of the observed surface area of the star and orbits at 30 degrees latitude generated using the Spot Oscillation And Planet (SOAP) 2.0 code (Dumusque et al., 2014), which induces a semi-amplitude of 0.58 m/s. Realistic RV data has noise, irregular and sparse sampling, more complicated stellar variability, and many other complexities; our purpose here is to illustrate how extensions of SSE and TDA may be helpful in this setting.

Refer to caption
Figure S8: Exoplanet time-series data. Simulated RV data of an exoplanet (red circles), a 0.05% spot (blue squares), and the planet and spot combined (green triangles).

The SSE matrices for the three time series are equivalent to the TDE matrices since the observations are uniformly-spaced (i.e., the time difference between observations in each of the three time series is 0.25050.25050.25050.2505 days). To quantify the longevity and stability of any cyclical structures, the embedding window ((M+1)τ)𝑀1𝜏((M+1)\tau)( ( italic_M + 1 ) italic_τ ) was chosen to approximate the period of the time series. For the planet RV time series, choosing M=15𝑀15M=15italic_M = 15 and taking τ=0.2505𝜏0.2505\tau=0.2505italic_τ = 0.2505 approximates the period of 4444 days. Similarly, for the star spot RV time series, taking a M=49𝑀49M=49italic_M = 49 with the same τ=0.2505𝜏0.2505\tau=0.2505italic_τ = 0.2505 approximates the period of 12.7512.7512.7512.75 days. For the combination of planet and spot RV times series, the embedding window was chosen to approximate the dominant period in the two time series, which is choosing M=49𝑀49M=49italic_M = 49 and τ=0.2505𝜏0.2505\tau=0.2505italic_τ = 0.2505. We note that the embedding window can also be chosen to approximate the less dominant period. This can sometimes result in negligible decrease in the persistence of the most persistent H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature. The embedding matrix denoted 𝐅M+1𝐅superscript𝑀1\mathbf{F}\subset\mathbb{R}^{M+1}bold_F ⊂ blackboard_R start_POSTSUPERSCRIPT italic_M + 1 end_POSTSUPERSCRIPT for each time series is pointwise-centered and scaled. Figure S9 shows the plot of the two principal components of the embedding matrices. The planet’s TDE displays a clear circular pattern reflecting the periodic nature of the planetary signal in its RV time series. In contrast, the spot’s TDE, while still appearing somewhat circular, does not form a closed loop. The combined planet and spot RV time series has a TDE that appears to have several closely matched loops. These differences in structure in the TDEs are apparent in the corresponding persistence diagram, which is shown in Figure S10. The planet’s persistence diagram has the most persistent H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature, while the spot’s persistence diagram has a less persistent, though still apparent, H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature. The persistence diagram of the combined planet and spot has a persistent H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature, but also has a cluster of low persistence H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT features.

Refer to caption
Figure S9: The embedding of the three signals. Planet was embedded in 16superscript16\mathbb{R}^{16}blackboard_R start_POSTSUPERSCRIPT 16 end_POSTSUPERSCRIPT whiles Spot and Spot+Planet were each embedded in 50superscript50\mathbb{R}^{50}blackboard_R start_POSTSUPERSCRIPT 50 end_POSTSUPERSCRIPT. The planet signal is prominent in the Planet+Spot embedded space.
Refer to caption
Figure S10: The persistence diagrams associated with the three signals. The text above the most significant H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT feature in each diagram represents the persistence of that feature. The Planet+Spot, as expected has a few noisy H1subscript𝐻1H_{1}italic_H start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT features.

The long-term goal of this sort of RV analysis is to detect and characterize the planetary signal in the presence of stellar activity and other sources of (possibility periodic) variability. Unlike the simplified simulated data used here to illustrate concepts, real RV times series are highly non-uniform and noisy, and contain many complicated and periodic stellar variability signals, along with instrumental biases and other sources of uncertainty. Dominant periodic signals in RV time series can vary widely among planetary, stellar variability, instrumental, and other sources.

S2.5 Summary

Our methodology has implications for real-world applications, such as the detection of exoplanets in the presence of stellar variability. The proposed SSE method addresses challenges posed by non-uniformly sampled time-series data for TDEs. The SSE method preserves the topology of the state space, mitigating the artificial noisy structures introduced by irregular time points, or structural shifts introduced by imputation methods. Coupling the SSE with TDA further enhances its robustness, since TDA is robust to noise and local distortions in the geometry of the state spaces. This advancement ensures that the reconstructed state space accurately reflects the true dynamics of the system, even with non-uniform and noisy sampling. This is particularly relevant in real-world applications where data is often sampled at irregular intervals with an inherent noisy structure.

Future research will further explore the full potential of this methodology, integrating it with conventional statistical techniques, developing more efficient algorithms, and applying these methods to a broader range of real-world data to uncover new insights and structures not readily apparent on scalar time-series data.