Universal approximation on non-geometric rough paths and applications to financial derivatives pricing
Abstract.
We present a novel perspective on the universal approximation theorem for rough path functionals, introducing a polynomial-based approximation class. We extend universal approximation to non-geometric rough paths within the tensor algebra. This development addresses critical needs in finance, where no-arbitrage conditions necessitate Itô integration. Furthermore, our findings motivate a hypothesis for payoff functionals in financial markets, allowing straightforward analysis of signature payoffs proposed in [Arr18].
Key words and phrases:
Signature payoff, derivatives pricing, path dependence, rough paths, signatures, model free finance, universal approximation
2020 Mathematics Subject Classification:
60L10; 60L90; 60H05; 91G20; 91G60
1. Introduction
Financial derivatives allow firms to hedge exposure to diverse financial risks, such as price, currency, or interest rate fluctuations. In commodity markets, participants contend with uncertainties stemming from production volumes, demand variability, and transportation logistics. To address these challenges, a variety of financial derivatives are traded, designed to mitigate risk and stabilize revenues. These instruments often involve complex transactions with multivariate payoffs that replicate potential future cash flows. For example, power producers manage both price and volume risks, while renewable energy producers face weather-driven uncertainties, such as wind or cloud cover. Retailers, on the other hand, contend with temperature-sensitive demand variations. Multivariate derivatives, such as spread and quanto options, are particularly attractive for managing such intertwined risks, although their pricing remains a significant challenge due to the lack of closed-form solutions.
Pricing these products typically relies on numerical approximations of theoretical prices using techniques like Monte Carlo simulations or numerical solutions of partial differential equations. However, the inherent complexity and high dimensionality of such problems demand novel mathematical tools to improve efficiency and accuracy.
In recent years, insights from the theory of rough paths, introduced by Lyons [Lyo98], have emerged as a powerful framework for understanding stochastic processes and their functionals. A central concept in this theory is the signature, a collection of iterated integrals that captures the fundamental characteristics of a path. Initially studied by Chen [Che54] from an algebraic topology perspective, the signature framework has since evolved into a dynamic field of research, offering new methodologies for modelling and computation in various domains. More precisely, for a smooth path $X:[0,T]\to\mathbb{R}^d$ we define $S^{0}(X)_{s,t} := 1$, and for $n\geq 1$
(1.1) $S^{n}(X)_{s,t} := \int_{s<r_1<\cdots<r_n<t} \mathrm{d}X_{r_1}\otimes\cdots\otimes\mathrm{d}X_{r_n} \in (\mathbb{R}^d)^{\otimes n}.$
For $N\in\mathbb{N}$, we define the $N$-truncated signature by
$S^{\leq N}(X)_{s,t} := \big(1, S^{1}(X)_{s,t},\ldots,S^{N}(X)_{s,t}\big) \in T^{N}(\mathbb{R}^d) := \bigoplus_{n=0}^{N}(\mathbb{R}^d)^{\otimes n},$
and the signature $S(X)_{s,t}$ as the limit when $N\to\infty$. The space $T^{N}(\mathbb{R}^d)$ is known as the truncated tensor algebra, and we call $T((\mathbb{R}^d))$, given formally as $\prod_{n=0}^{\infty}(\mathbb{R}^d)^{\otimes n}$, the (extended) tensor algebra. A more detailed introduction will be provided in Section 2. Beyond smooth paths, the theory extends to stochastic paths, where probabilistic tools enable the construction of these integrals under frameworks like Itô or Stratonovich integration (see e.g. [FV10]).
In many ways, the iterated integral signature can be seen as an infinite-dimensional extension of polynomials. In fact, it shares many of the key features of polynomials, but related to one-parameter functions. Some key properties are
(i) Chen’s relation holds, i.e., the signature is a multiplicative functional. More specifically, for any $s\leq u\leq t$, $S(X)_{s,t} = S(X)_{s,u}\otimes S(X)_{u,t}$. The tensor product $\otimes$ is understood in the tensor algebra, which is defined in Section 2. This not only provides computational efficiency but also serves as a fundamental building block in rough integration theory.
(ii) The signature uniquely characterizes the path up to tree-like equivalence [HL10].
(iii) It is re-parametrization invariant: for a monotone increasing function $\phi:[0,T]\to[0,T]$ with $\phi(0)=0$ and $\phi(T)=T$, define $\tilde{X}_t := X_{\phi(t)}$; then $S(\tilde{X})_{0,T} = S(X)_{0,T}$ [FV10].
(iv) The signature is invariant to translation: if $\tilde{X}_t := X_t + a$ for some $a\in\mathbb{R}^d$, then $S(\tilde{X})_{s,t} = S(X)_{s,t}$.
(v) It is associated with a rich algebraic structure (see e.g. [FV10] for an introduction).
(vi) The signature is tightly connected to stochastic integration theory, and naturally encodes information about stochastic integration choices, enriching the understanding of stochastic integration theory beyond the classical martingale theory [FH14].
(vii) The signature characterizes the law of stochastic processes [CO22].
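To make properties (i) and (iv) concrete, the following is a minimal sketch, not taken from the paper, that computes the truncated signature of a piecewise-linear path and checks Chen's relation and translation invariance numerically. The representation of the signature as a dictionary indexed by words, and the names `segment_signature`, `chen_product` and `path_signature`, are our own illustrative choices.

```python
from itertools import product
from math import factorial, isclose, prod


def segment_signature(increment, depth):
    """Signature of a straight line, as a dict {word: coefficient}.

    Words are tuples of letters 0..d-1. The coefficient of i1...in is
    increment[i1] * ... * increment[in] / n!.
    """
    d = len(increment)
    return {
        word: prod(increment[i] for i in word) / factorial(n)
        for n in range(depth + 1)
        for word in product(range(d), repeat=n)
    }


def chen_product(a, b):
    """Truncated tensor product: (a * b)(w) = sum over splits w = uv of a(u) b(v)."""
    return {w: sum(a[w[:k]] * b[w[k:]] for k in range(len(w) + 1)) for w in a}


def path_signature(points, depth):
    """Signature of the piecewise-linear path through `points`, built with Chen's relation."""
    d = len(points[0])
    sig = segment_signature([0.0] * d, depth)  # unit element: 1 on the empty word
    for p, q in zip(points, points[1:]):
        increment = [qi - pi for pi, qi in zip(p, q)]
        sig = chen_product(sig, segment_signature(increment, depth))
    return sig


path = [(0.0, 0.0), (1.0, 0.5), (1.5, 2.0), (3.0, 1.0)]
full = path_signature(path, 3)

# Chen's relation: glue the signature over the first segment to the rest.
glued = chen_product(path_signature(path[:2], 3), path_signature(path[1:], 3))

# Translation invariance: only increments of the path enter the signature.
shifted = path_signature([(x + 5.0, y - 2.0) for x, y in path], 3)

print(max(abs(full[w] - glued[w]) for w in full))
print(max(abs(full[w] - shifted[w]) for w in full))
```

Both printed discrepancies should be at the level of floating-point round-off. The first level of the signature is simply the total increment of the path.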
Importantly, the signature serves as a universal approximation basis for continuous functionals on path space, akin to how polynomials approximate real-valued functions. This universal approximation property allows any continuous functional on the space of Lipschitz paths to be approximated arbitrarily well by a linear functional of the signature [CPSF23]. The universal approximation property is proven through the Stone-Weierstrass theorem, and thus requires a feature set that forms a sub-algebra of the continuous functionals on path space.
The signature framework extends to stochastic functional approximation. A key challenge lies in encoding the choice of stochastic integration—such as Itô or Stratonovich—into the functional representation. While the universal approximation property has been extended to geometric rough paths, which naturally align with Stratonovich integration, this leaves a significant gap for Itô-based financial functionals, which are most prevalent in practice.
This article addresses the challenges of applying signature-based universal approximation to non-geometric rough paths, with a focus on practical implementation in financial markets. By bridging the gap between non-geometric rough paths and universal approximation, we investigate the framework for efficient pricing in the Itô setting for complex financial derivatives, providing several examples throughout the text. The remainder of the paper develops the theoretical foundations and explores applications in detail.
1.1. Main ideas and contribution
In this article we present a universal approximation result for functionals of non-geometric rough paths. The main challenge with non-geometric rough paths and by extension the non-geometric signature, is that the multiplication of two elements in the non-geometric signature does not yield another element contained in the same signature. This implies that just considering the linear span of signature elements as the subset of continuous linear functionals to use for functional approximation is not sufficiently rich to become an algebra; a strict requirement of the Stone-Weierstrass theorem. To overcome this challenge, we enrich the feature set, the linear span of our signature terms, by polynomials of the signature terms. This becomes a very large class of features that we use for universal approximation, but provides the sufficient set which guarantees universal approximation in the non-geometric setting. When applying this theorem to geometric rough paths, the approximation can be written as a linear functional acting on the signature through the shuffle product.
In finance the universal approximation theorem with signatures has been successfully used in the context of pricing complex financial derivatives, see [Arr18, LNPA20, LNPA19]. Such derivatives typically have a payoff of functional form, in practice often a so-called "Asian" structure. There the payoff depends not on the price at a given time, but on the average price over a time period, thus introducing an integral $\int_0^T S_t\,\mathrm{d}t$, with $S$ being the price. For even more complicated structures, one can encounter derivatives in energy markets, where the payoff is given as an integral over the product of two stochastic processes, $\int_0^T S_t Y_t\,\mathrm{d}t$, for example representing electricity price and temperature. In even more complicated models, one could imagine compositions of a finite number of such structures. The point is that payoff functionals, mapping paths to prices, are typically given in a very specific form of (lower-ordered) signature functionals. In the examples above, the payoff functional can be seen as a continuous function acting on a finite number of terms in the signature of the price signals, enriched with a time component.
This motivates a working hypothesis of the paper; we consider functionals that we assume to be given as continuous functions acting on terms from the signature. While being a relevant hypothesis in many practical applications, one can resort to classical function approximation techniques to approximate the continuous function of interest. Approximating this function through some polynomial can under various assumptions yield convergence rates for the approximation, providing theoretical guarantees important for practical implementation. This simplifies the derivatives pricing problem from [Arr18] where machine learning techniques are suggested to find the functional approximation, resulting in an approximation which is difficult to analyze from a practical perspective. In contrast, we believe that our approach, mixing the new universal approximation method, with the hypothesis suited for payoff functionals provides a simple way of using signature features for practical pricing problems arising in financial markets.
Throughout the article, we illustrate our results and contributions through examples, with an emphasis on derivatives pricing in energy markets, where there exist many complex derivative structures. We emphasise deriving rather explicit conditions ensuring convergence rates for the approximations that we introduce.
1.2. Organization of the paper
In Section 2 we provide a basic introduction to the signature and build the algebraic foundation for our analysis. As this article targets an audience in financial mathematics, we have chosen to provide a detailed introduction to the algebraic side. In Section 3 we recall the state of the art in universal approximation, and propose the new universal approximation for non-geometric rough paths. In Section 4 we discuss stochastic price paths and the computation of signature correlators. Section 5 combines our considerations and provides some approximation results for financial derivatives. This is highlighted with a discussion on the applications in energy markets. Finally, we provide a conclusion with an outlook to future developments in Section 6.
1.3. Notation
For a complete metric space , we denote by the space of continuous paths with the uniform topology. The set of continuous paths with finite variation from into is denoted by , and is equipped with the norm
where is the collection of all partitions over . The subset of continuous paths with finite -variation is denoted by . Whenever the interval under consideration is otherwise clear, we simply write . Recall that when $n\in\mathbb{N}$, the gamma function satisfies $\Gamma(n+1)=n!$, and by a slight abuse of notation we will write $x!$ for $\Gamma(x+1)$. Throughout the article, we will occasionally consider Hilbert spaces, denoted by , and where the inner product is then given by and the associated norm is denoted by . If the space under consideration is clear, we dismiss the index and simply write and for the inner product and norm. For a path $X:[0,T]\to\mathbb{R}^d$ we define the time-enhanced path $\hat{X}:[0,T]\to\mathbb{R}^{d+1}$, where $\hat{X}_t := (t, X_t)$. This notation will be used consistently throughout the article. Frequently, we resort to the notation .
2. Basics of words and signatures
In this section, we will provide a fundamental overview of the conventions and concepts related to weighted words, denoted by , and signatures, represented by , along with their pairing .
2.1. Words
Given that we are working with -valued paths , the alphabet of our consideration is . Throughout this article, will denote the set unless otherwise stated.
Definition 2.1.
A word of length is a sequence where for every . We denote by the set of all words of length . For we have that where is the empty word. We further let denote the set of all words.
The algebra is introduced as , representing the vector space generated by .
Definition 2.2.
The algebra of all non-commutative polynomials in is defined to be
We refer to as a weighted word.
forms an algebra with respect to concatenation which is for and given by
Example 2.3.
Suppose we work with an alphabet given by . Then we have that
where . As an example, one element is
Moreover, the concatenation of the weighted words and is
Note that we use alphabets consisting of the natural numbers up to dimensions in this paper. However, to separate clearly between the weights and the words , we used letters for the alphabet in this example.
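As a small illustration of the concatenation product of weighted words, here is a sketch in which a weighted word is stored as a dictionary from words (strings over the alphabet) to real weights. The specific weighted words `p` and `q` are hypothetical and are not the ones used in Example 2.3.

```python
def concatenate(p, q):
    """Concatenation product of two weighted words.

    A weighted word is a dict {word: weight}; the empty string is the empty word.
    The product is bilinear and concatenates the underlying words.
    """
    out = {}
    for u, a in p.items():
        for v, b in q.items():
            out[u + v] = out.get(u + v, 0.0) + a * b
    return out


p = {"a": 2.0, "bc": -1.0}   # 2 a - bc
q = {"": 1.0, "b": 3.0}      # (empty word) + 3 b

pq = concatenate(p, q)       # 2 a + 6 ab - bc - 3 bcb
qp = concatenate(q, p)       # 2 a - bc + 6 ba - 3 bbc
print(pq)
print(qp)
```

The two products differ, which reflects that concatenation is non-commutative, while the empty word acts as the unit.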
Remark 2.4.
In the next Subsection, we will introduce the concept of signatures. However, it is worth noting that the signature of a path can also be defined recursively through the projections of words. Since the path is -valued, let denote the collection of words formed from the alphabet . We can regard the signature as an element of . To formalize this, we define it recursively using the projections for , as follows:
First, for the empty word, we set . For any non-empty word , we define the projection as:
2.2. Signatures
In this Subsection, we introduce the basic concepts of signatures. We begin by presenting key algebras and a Hilbert space that plays a central role in the theory of signatures. Next, we define what a signature is, and finally, we demonstrate how to pair a weighted word with a signature. Let us start with the setup. Let denote a general -Hilbert space, where the field is either or . We need to introduce a triplet of spaces . First of all, by convention we let
Definition 2.5.
For a Hilbert space we call
the extended tensor algebra.
As we will see later, any well-defined signature naturally belongs to the extended tensor algebra . Another space closely linked to the theory of words and signatures is the tensor algebra, which we introduce next:
Definition 2.6.
The tensor algebra is given by the algebraic direct sum
We equip with a sum , scalar multiplication, and a product . These operations are defined as follows: For elements and in , we define the sum of and element-wise, i.e.,
Scalar multiplication is also defined element-wise as
Lastly, the product of and is given by
These operations turn the tensor algebra into an algebra. Moreover, we denote by
the -truncated tensor algebra.
Remark 2.7.
We present next two useful facts about the tensor algebra and the extended tensor algebra when . First, we have the isomorphisms
see e.g. [Lan02]. Thus, every word can be viewed as an element in Secondly, we have a dual pairing between and , see e.g. [LCL07], because of the fact that
where denotes the algebraic dual of . Hence for and we denote by the algebraic dual pairing between and .
Putting these two facts together, we find that
and therefore every weighted word can be viewed as a linear functional on the extended tensor algebra, .
Now we want to introduce a Hilbert space to which a large class of signatures belongs, called the full Fock space. For , we consider an inner product on by
Hence, the norm on becomes . Then for where we can define an inner product by
and hence a norm
The full Fock space over is the Hilbert space given by the topological direct sum, ,
Moreover, is a dense subspace in .
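For finite linear combinations of words the Fock inner product reduces to a finite sum. The sketch below assumes the canonical orthonormal basis of the underlying Euclidean space, so that distinct words correspond to orthonormal tensors; the weighted word `ell` is a hypothetical example and the function names are ours.

```python
from math import sqrt


def fock_inner(p, q):
    """Fock inner product of two weighted words given as dicts {word: weight}.

    Assumes the canonical orthonormal basis, so distinct words are orthogonal
    and every word has norm one.
    """
    return sum(weight * q.get(word, 0.0) for word, weight in p.items())


def fock_norm(p):
    """Fock norm induced by `fock_inner`."""
    return sqrt(fock_inner(p, p))


# Hypothetical weighted word: 1 * (empty word) + 2 * (1) - 2 * (12)
ell = {(): 1.0, (1,): 2.0, (1, 2): -2.0}
print(fock_norm(ell))  # sqrt(1 + 4 + 4)
```

Under this assumption the squared norm is the sum of the squared weights, level by level.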
Example 2.8.
Recall that for we have that and therefore any can be recognized as an element . Moreover, since is a (dense) subspace of , we can compute the Fock norm of a weighted word For example, let so that and consider the weighted word
Then
where is the basis vector in the field while are the basis vectors in . Hence, we get that
Next we recall the definition of a multiplicative functional, a fundamental object in the theory of rough paths [LCL07]:
Definition 2.9.
A multiplicative functional of degree is a continuous map
satisfying Chen’s identity:
For , a multiplicative functional of degree is said to have finite -variation if
and is the collection of all finite partitions over .
Definition 2.10.
Let and let be a multiplicative functional of degree . We call a -rough path if is of finite -variation for all
Theorem 2.11.
[Lyons’ extension theorem [LCL07]] Let be a p-rough path. Then for any there exists a unique continuous map
such that
is a multiplicative functional with finite -variation. We call the signature of . Moreover, for any norm on we have that
where is a constant depending only on .
Definition 2.12 (Geometric rough paths, [LCL07]).
A geometric $p$-rough path is a $p$-rough path that can be expressed as a limit of $1$-rough paths in the $p$-variation distance. The space of geometric -rough paths in is denoted by .
We now present a useful theorem: the signature of a -rough path is an element of the Fock space, instead of the entire extended tensor algebra, . To this end, let
be the Mittag-Leffler function.
Corollary 2.13.
For let be a -rough path. We have that for any .
Proof.
We wish to utilize the dual pairing in Remark 2.7 between weighted words and signatures . Hence, from now on we consider the case when . Given an -valued path for which the signature is well-defined, and a word , we can identify as an element and we have the following relation with the signature
(2.1) |
Here and in the sequel of this paper we shall use the generic notation to signify a (finite) linear combination of elements , or, after identification, a finite linear combination of words (as in the Example 2.8 above). Indeed, if for and for finitely many , then,
with given as in (2.1).
The shuffle product turns the space of weighted words into an algebra, and is very convenient and useful when operating with products of signatures. Indeed, the shuffle product is the key operation in rough path theory that linearises nonlinear functionals, at least approximately, as can be seen in the Universal Approximation Theorem (recalled for the convenience of the reader in Proposition 3.1). We define the shuffle product next. The shuffle product of two words is a weighted word, constructed as the sum of all the different ways to interleave the two words while preserving the order of the letters within each word. As an example, if we shuffle together the words and , we get
Note that and are not in the sum on the right-hand side since they violate the order of .
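The verbal definition of the shuffle product can be turned into a short recursion on the last letter. The following is a minimal sketch with words stored as strings; the example words are our own and need not coincide with the ones in the text.

```python
def shuffle(u, v):
    """Shuffle product of two words, returned as a dict {word: multiplicity}.

    Recursion: the last letter of any interleaving is either the last letter
    of u or the last letter of v.
    """
    if not u:
        return {v: 1}
    if not v:
        return {u: 1}
    out = {}
    for word, count in shuffle(u[:-1], v).items():
        out[word + u[-1]] = out.get(word + u[-1], 0) + count
    for word, count in shuffle(u, v[:-1]).items():
        out[word + v[-1]] = out.get(word + v[-1], 0) + count
    return out


print(shuffle("12", "3"))    # 123 + 132 + 312
print(shuffle("1", "1"))     # 2 * 11
print(len(shuffle("12", "34")))
```

Words such as `2134` never appear in the shuffle of `12` and `34`, because the letter `2` would precede the letter `1`. Counted with multiplicity, the shuffle of words of lengths m and n always has (m+n)!/(m! n!) terms.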
Below follows an important Lemma for products of signatures and the shuffle product:
Lemma 2.14 (Shuffle property, [LCL07]).
Let and let be a geometric rough path, according to Definition 2.12. Then the following identity holds
Interestingly, one can link shuffle product monomials to monomials of the signatures: given we define for every by
Then by the shuffle property we have that
This gives a convenient link between monomials of signatures and monomials of words.
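The shuffle property can be checked numerically on a piecewise-linear path, whose signature is geometric. The sketch below is self-contained and uses our own helper names (`signature`, `shuffle`, `pairing`); the path and the words `u`, `v` are arbitrary illustrative choices.

```python
from itertools import product
from math import factorial, isclose, prod


def signature(points, depth):
    """Truncated signature of the piecewise-linear path through `points`."""
    d = len(points[0])
    words = [w for n in range(depth + 1) for w in product(range(d), repeat=n)]
    sig = {w: float(w == ()) for w in words}
    for p, q in zip(points, points[1:]):
        inc = [b - a for a, b in zip(p, q)]
        seg = {w: prod(inc[i] for i in w) / factorial(len(w)) for w in words}
        # Chen's relation: append the signature of the new linear segment.
        sig = {w: sum(sig[w[:k]] * seg[w[k:]] for k in range(len(w) + 1)) for w in words}
    return sig


def shuffle(u, v):
    """Shuffle product of two words (tuples), as a dict {word: multiplicity}."""
    if not u:
        return {v: 1}
    if not v:
        return {u: 1}
    out = {}
    for w, c in shuffle(u[:-1], v).items():
        out[w + u[-1:]] = out.get(w + u[-1:], 0) + c
    for w, c in shuffle(u, v[:-1]).items():
        out[w + v[-1:]] = out.get(w + v[-1:], 0) + c
    return out


def pairing(ell, sig):
    """Pairing of a weighted word with a signature."""
    return sum(c * sig[w] for w, c in ell.items())


path = [(0.0, 0.0), (1.0, 0.5), (1.5, 2.0), (3.0, 1.0)]
sig = signature(path, 3)
u, v = (0,), (1, 0)

lhs = sig[u] * sig[v]                 # product of two signature terms
rhs = pairing(shuffle(u, v), sig)     # one linear functional of the signature
print(lhs, rhs)
```

The two printed numbers agree up to round-off, which is exactly the linearisation of products that the shuffle property provides in the geometric case.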
3. Functional approximation with signatures
With the goal of approximating complex pricing functionals from financial markets, we recall here some basic properties of universality of the signature. In addition, we provide a new statement of the universal approximation property for functionals acting on general rough paths (not restricted to geometric rough paths).
3.1. Universal approximation for geometric rough paths
The functional approximation setup outlined in [LNPA20], based on ideas also formulated in [Arr18], is strongly based on the universal approximation property of the signature. This property is a consequence of the Stone-Weierstrass theorem, using the fact that the linear span of signatures of geometric rough paths forms an algebra, and that when lifting the underlying path to its time extension , the signature uniquely determines the path, and thus separates points. The universal approximation theorem relied upon in [Arr18, LNPA20] is based on Lipschitz-paths. The following universal approximation theorem is based on the recently proposed extension by Cuchiero et.al. in [CPSF23] to the setting of -valued paths.
Proposition 3.1 (Universal approximation).
For let be a compact subset which is finite in the -variation norm. Suppose is a functional on . Let be a -rough path defined according to Definition 2.10 over the extended path for some . Then for each there exists a linear functional such that
The universal approximation applied to valued paths allows us, in particular, to consider functionals of truly rough signals , that only have finite variation, as long as we have constructed the rough path of over . For instance, we can consider functionals of paths of the Brownian motion. However, in its current form, it is only formulated for Stratonovich lifts of the iterated integral, excluding the more natural choice of rough paths lift for financial applications, namely the Itô lift. One might try to work around this problem by identifying the Itô-Stratonovich correction term and implementing this in the functional and signature. However, under the working hypothesis that will be used in the remainder of the text, we can circumvent this challenge completely.
3.2. General universal approximation on rough paths
The universal approximation theorem in Proposition 3.1 relies heavily upon the geometric structure of the rough paths on which the functionals act. The reason is that when applying the Stone-Weierstrass theorem to check for denseness of the linear span of signature terms in the space , one relies upon being able to multiply two signature terms and to obtain a new signature term . When is geometric (i.e. takes values in ) it follows from Lemma 2.14 that this holds with . However, this is a restrictive class of signatures; in the semi-martingale setting it corresponds to Stratonovich lifts of the rough path. In financial applications, one typically works with functionals that act on non-geometric rough paths, such as Itô lifts of semi-martingales. Thus extending the universal approximation property to any rough path has practical consequences for several applications.
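The failure of the shuffle identity for Itô lifts can be seen already at the second level in one dimension. The sketch below is our own illustration on a simulated random walk approximating Brownian motion: it compares left-point (Itô-type) and midpoint (Stratonovich-type) Riemann sums for the iterated integral of the path against itself. The discretisation, the seed and the function name `second_level` are assumptions made for the example.

```python
import random
from math import isclose


def second_level(xs, rule):
    """Discrete version of the iterated integral of (X_s - X_0) dX_s.

    rule="ito" evaluates the integrand at the left end point of each step,
    rule="stratonovich" at the midpoint.
    """
    total = 0.0
    for a, b in zip(xs, xs[1:]):
        base = a if rule == "ito" else 0.5 * (a + b)
        total += (base - xs[0]) * (b - a)
    return total


random.seed(0)
n, T = 10_000, 1.0
dt = T / n
xs = [0.0]
for _ in range(n):
    xs.append(xs[-1] + random.gauss(0.0, dt ** 0.5))

increment = xs[-1] - xs[0]
ito = second_level(xs, "ito")
strat = second_level(xs, "stratonovich")
quad_var = sum((b - a) ** 2 for a, b in zip(xs, xs[1:]))

# Shuffle identity for a single letter: (first level)^2 = 2 * (second level).
print(increment ** 2 - 2 * strat)   # zero up to round-off
print(increment ** 2 - 2 * ito)     # equals the quadratic variation, close to T
```

For the midpoint sums the identity holds exactly by telescoping, whereas for the left-point sums the defect is the quadratic variation. This is the reason why the linear span of Itô signature terms is not closed under multiplication.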
It turns out that a simple mixing of ideas from the Stone-Weierstrass theorem over polynomials, with the classical signature universal approximation allows one to obtain such a new generalized universal approximation. To this end we will need a so-called separation of points property of the sub-algebra of continuous functionals that we will consider as the basis for functional approximation. We therefore recall the following technical lemma from [Bre11, Cor. 4.24].
Lemma 3.2.
Let be open, and suppose is such that
Then a.e on .
With this lemma at hand we are now ready to prove a generalized version of polynomial universal approximation over rough paths with values in the tensor algebra .
Theorem 3.3 (Generalized universal approximation).
Let be a continuous functional on a compact set . Then for any there exists a finite set , and a polynomial given by
(3.1) |
and a sequence of linear operators with the property that for all then
Proof.
Let . Define for and ,
(3.2) |
Furthermore, let . Clearly, . With the goal of applying Stone-Weierstrass theorem to prove denseness of in we check that the following holds:
i) The set forms a sub-algebra. Indeed, closure under addition is immediate. Furthermore, for two elements and we can choose such that
where denotes the signature of for and for , and is the concatenation of and .
ii) separates points. Since is created over the time-extended paths and , the following two integral functionals exist in :
where we recall that , the time component corresponds to the first component in the dimensional vector, and then
Suppose now that for all . Then since polynomials are dense in it follows by Lemma 3.2 that . As a consequence, it follows that separates points.
iii) The constant function is in , since by choosing we get that and therefore in particular .
From these three properties it follows by the Stone-Weierstrass theorem that is dense in , which concludes the proof. ∎
3.3. Signature associated to price paths
As discussed in the introduction, even the most complex financial derivatives typically have a simple functional structure. By this we mean that the path-dependent nature of the functional comes either through an averaging over the price path (as in Asian-style derivatives) or through products of price paths (quanto-style options), in addition to baskets of different assets with these structures. It is therefore natural to assume that for this purpose, given a price path , and the extended signature , the payoff functional can be written as a function , and different weighted words , such that
This will therefore be the main working hypothesis of the subsequent sections, and we will illustrate several numerical and analytic advantages of using this specific structure. We will also give examples to show exactly how this hypothesis applies for various exotic derivatives. As a first simple example, we have the following:
Example 3.4.
Asian options are contracts that pay the holder an amount of money according to the average price over a period of time. If is the price process, the holder receives at time . Considering the time-enhanced price path defined by , we see (recall Example 2.8 with and ) that for . Another example is a spread option between two assets with price dynamics and , respectively, paying the holder at time , for a conversion constant (here, may convert the currency of the second asset into the currency of the first, say). With , we can (still following the notation in Example 2.8, now with ) express the payoff as
with and , or, more simply,
with the weighted word . Yet another example from energy finance is so-called quanto-options (see e.g. [BLM15]), where the holder receives a payment at exercise time according to a product of two payoffs on the average of the spot energy price and temperature, say. Denoting the energy spot price, the temperature process, and , we have a payoff
where and , and are the payoff functions written on the average of the spot energy price and temperature, resp.
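To illustrate how an Asian-style payoff reduces to a function of a few signature terms, the sketch below evaluates an Asian call on a piecewise-linear price path. It assumes the convention that the time-enhanced path has time as letter 0 and price as letter 1, and that the word i j corresponds to integrating first against component i and then against component j; these conventions, the call-type payoff, and the sample prices are our own assumptions for the example.

```python
from math import isclose


def level_two(points, i, j):
    """Second-level signature term <S, ij> of the piecewise-linear path through `points`.

    It equals the integral of (X^i_t - X^i_0) dX^j_t, computed segment by segment.
    """
    total, start = 0.0, points[0]
    for p, q in zip(points, points[1:]):
        di, dj = q[i] - p[i], q[j] - p[j]
        total += (p[i] - start[i]) * dj + 0.5 * di * dj
    return total


def asian_call(times, prices, strike):
    """Asian call payoff written as a function of signature terms of (t, S_t)."""
    points = list(zip(times, prices))      # letter 0 = time, letter 1 = price
    horizon = times[-1] - times[0]         # first-level term of the time letter
    running = level_two(points, 1, 0)      # integral of (S_t - S_0) dt
    average = prices[0] + running / horizon
    return max(average - strike, 0.0), average


payoff, average = asian_call([0.0, 0.5, 1.0], [100.0, 110.0, 90.0], strike=100.0)
print(average, payoff)
```

For a piecewise-linear path the signature term reproduces the trapezoidal rule for the time integral of the price, so the payoff depends on the path only through two signature terms and the initial price.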
Let us make precise the hypothesis we work under in the remainder of this paper.
Hypothesis 3.5.
Let and assume that for a given continuous functional acting on , there exists a collection of linear operators , and a function such that for any
(3.3) |
Remark 3.6.
Define to be the signature functional
and define . Under the assumption that is a continuous functional in Hypothesis 3.5, it follows that also must be continuous. Indeed, we know that is a continuous mapping according to Lyons’ extension theorem (Theorem 2.11). The right hand side of (3.3) can be written as a composition between and a continuous functional with image . Since is continuous by assumption, and we know that is continuous, it follows that also must be continuous. Of course, the converse is also true: if we start out with a continuous function , then is also a continuous functional, by composition of two continuous maps.
The main advantage of invoking Hypothesis 3.5 is that analytic functions are dense in the space of continuous functions. Thus, if Hypothesis 3.5 holds for a continuous function , then we may approximate the functional by a (multivariate) polynomial in a finite number of signature coefficients. More precisely, if Hypothesis 3.5 holds, there exists an analytic function such that for , and
where for a multi-index and , we write . The coefficients are real numbers labeled by the multi-index . This leads to the following simplified version of the universal approximation theorem, being a consequence of the Stone-Weierstrass theorem and Hypothesis 3.5.
Theorem 3.7.
Let be a continuous functional on , and suppose Hypothesis 3.5 holds with a sequence and a continuous function . Then for any there exists a finite set , a hypercube , and a polynomial given by
with the property that
(3.4) |
for all such that . Furthermore, let be the constant from Theorem 2.11 such that for of the form with ,
where . Suppose the coefficients satisfy for some
(3.5) |
where . Then we have that
(3.6) |
Proof.
This is a simple consequence of the Stone-Weierstrass approximation theorem for continuous functions, using that there always exists a compact subset such that
Restricting the domain of to , we are done showing (3.4). For the convergence rate (3.6), define the remainder term
Invoking the bound on the signature decay and the assumption on , we see that
The right hand side of this inequality corresponds to the remainder term of a multivariate Taylor approximation of the function around up to order . Thus from the multivariate Taylor theorem, it follows that
By an elementary combinatorial argument (using the so-called "stars and bars"-argument) we see that . This concludes the proof.
∎
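The "stars and bars" count used at the end of the proof can be checked by brute force. The sketch below counts multi-indices of a given total degree in a given number of variables and compares with the binomial formula; the function name and the ranges tested are our own, and the precise form of the count used in the proof is not reproduced here.

```python
from itertools import product
from math import comb


def count_multi_indices(k, n):
    """Number of multi-indices alpha in N_0^k with |alpha| = n, by enumeration."""
    return sum(1 for alpha in product(range(n + 1), repeat=k) if sum(alpha) == n)


# Stars and bars: n stars separated by k - 1 bars.
for k in range(1, 5):
    for n in range(0, 7):
        assert count_multi_indices(k, n) == comb(n + k - 1, k - 1)

print(count_multi_indices(3, 4), comb(4 + 3 - 1, 3 - 1))
```

This count is the number of monomials of total degree n in k variables, which is what enters the remainder term of a multivariate polynomial approximation.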
Remark 3.8.
While Hypothesis 3.5 is certainly limiting the class of functionals that we can analyze, universal approximation becomes easier. In addition, the assumption that only acts on compact subsets of -variation paths with values in the space of geometric rough paths, i.e. , is dropped, allowing for an easier verification of universality. For later probabilistic arguments related to financial prices as expected functionals, this point will simplify computations and discussions.
Furthermore, compared with the classical assumption that the approximation holds over compact subsets , as seen in Theorem 3.3, the compactness statement in Theorem 3.7 is significantly simpler. Indeed, describing compact subsets of can be a challenging task, as illustrated in e.g. [Gul24]. In Theorem 3.7 one essentially only needs to choose a bound , and one can consider any such that
The compact subsets of an infinite-dimensional space are therefore replaced by (what may be interpreted as) bounded subsets. This can also make probabilistic statements easier, as will be illustrated in subsequent sections.
Remark 3.9.
It is important to note that in Theorem 3.7 the statement allows for any functional acting on the space of -variation paths with values in the truncated tensor algebra. This is a significant difference from the classical universal approximation theorem for signatures stated in Proposition 3.1, as the space of geometric rough paths limits the possible structure of the under consideration. In particular, a canonical asset pricing model would be constructed from semi-martingales and Itô processes. To preserve a martingale property of the derivative prices, one then uses the Itô integral for computing derivatives prices, an integration choice which in the sense of signatures is not geometric. In contrast, under Hypothesis 3.5 one can easily work with functionals that structurally contain Itô integration, and still obtain a direct and descriptive approximation of the functional in terms of the signature associated to the price path.
Remark 3.10.
Computationally, under Hypothesis 3.5 and for a given , one only requires the computation of terms from the signature, and not the full signature, and with these terms one can achieve as high accuracy as desired for functional approximation. This is in stark contrast to the much more general universal approximation theorems in Proposition 3.1 and Theorem 3.3, where the accuracy of the approximation is dictated by the number of signature terms included. Invoking Hypothesis 3.5 therefore has the potential to reduce computational time significantly.
Remark 3.11.
The condition assumed on the coefficients in (3.5) yields the bounds in (3.6). Different assumptions on will yield different convergence rates. While the condition in (3.5) is seemingly abstract, it can be verified to be weaker than the conditions satisfied by the coefficients in a Taylor expansion. On the other hand, the condition is not satisfied by a much "slower" convergent polynomial series, such as the Bernoulli polynomials. A clearer illustration of this condition will be given by the subsequent examples.
A restriction of the functional approximation in Theorem 3.7 to the case of functionals on geometric rough paths can readily be seen as a special case of the classical universal approximation theorem presented in Proposition 3.1. Indeed, we have the following corollary:
Corollary 3.12.
Let be a continuous functional on the space of -variation (extended) geometric rough paths, , and suppose Hypothesis 3.5 holds with a sequence and a continuous function . Then for any there exists a finite set and a compact , such that
Proof.
Since now is a geometric rough path, it follows from Lemma 2.14 that for there exists a linear functional such that
Thus, inserting this into the polynomial, the result follows. ∎
Remark 3.13.
Note that the linear functionals quickly become very large sums of words, even when the ’s consist of elementary words. As an example, for some single letter , consider the product for some potentially large . Then taking the th shuffle power of yields the word ( repeated times), with a factorial weight in front (the factorial of the number of repetitions), i.e.,
See, e.g., [BB02] and the references therein for a longer exposition of the shuffle product and algebras. It quickly becomes expensive to compute higher order signature terms. However, as long as Hypothesis 3.5 is in place, signature computations can be made much more efficient if what one really needs is only to compute the power of the number . When computing expected values, this is often the situation.
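The factorial weight mentioned in Remark 3.13 can be checked with a small sketch of the shuffle product. Words are represented here as tuples of letters and linear combinations as `Counter` objects; the function names `shuffle` and `shuffle_power` are illustrative choices and do not come from the paper.

```python
from collections import Counter
from math import factorial


def shuffle(u, v):
    """Shuffle product of two words (tuples of letters).

    Returns a Counter mapping each resulting word to its integer coefficient.
    """
    if not u:
        return Counter({v: 1})
    if not v:
        return Counter({u: 1})
    out = Counter()
    # Recursive rule: (u'a) sh (v'b) = (u' sh v'b) a + (u'a sh v') b
    for w, c in shuffle(u[:-1], v).items():
        out[w + (u[-1],)] += c
    for w, c in shuffle(u, v[:-1]).items():
        out[w + (v[-1],)] += c
    return out


def shuffle_power(letter, n):
    """n-th shuffle power of a single letter, as a Counter over words."""
    result = Counter({(): 1})
    for _ in range(n):
        nxt = Counter()
        for w, c in result.items():
            for w2, c2 in shuffle(w, (letter,)).items():
                nxt[w2] += c * c2
        result = nxt
    return result


if __name__ == "__main__":
    n = 6
    # A single word of length n survives, carrying the weight n!
    print(shuffle_power(1, n), factorial(n))
```

For a geometric rough path the pairing with this single long word equals the n-th power of the level-one pairing divided by n!, so raising one number to a power replaces the computation of a depth-n signature term.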
3.4. Examples of functions
In this subsection we consider a few examples of functions that can be approximated, and investigate their convergence properties. As already discussed, most examples of financial payoff functionals only concern the simpler case when Hypothesis 3.5 holds. That is, for each specific payoff functional there exists a finite number of linear operators for such that
From either the universal approximation theorem in Proposition 3.1 or Theorem 3.7 we know there exists an associated approximation in terms of the signature of the (rough path lifted) price path (either as a linear combination of signature terms, or as a polynomial of a finite number of signature terms). However, while there is no standard way of finding and describing the linear functional in Proposition 3.1, there is much theory available to compute potential sequences of to obtain a good approximation in Theorem 3.7.
We provide now three elementary examples of such approximation choices.
Example 3.14 (Taylor polynomials).
Suppose the payoff functional can be identified through Hypothesis 3.5 with an infinitely continuously differentiable function . Then an elementary Taylor expansion of around yields
where denotes the ’th derivative of . Moreover, under less restrictive regularity assumptions, we can truncate this sum at any level and explicitly determine the error we make by the formula
where there exists some between and such that
This yields an analytic expression for the remainder term in Theorem 3.7 when is sufficiently regular.
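As a minimal numerical illustration of the truncation and remainder bound in Example 3.14, the sketch below uses the exponential function, which is a hypothetical choice made only because all its derivatives are known in closed form. The bound is the standard Lagrange form of the remainder.

```python
from math import exp, factorial


def taylor_exp(x, x0, order):
    """Taylor polynomial of exp around x0, truncated at the given order."""
    # Every derivative of exp at x0 equals exp(x0).
    return sum(exp(x0) * (x - x0) ** n / factorial(n) for n in range(order + 1))


def lagrange_bound_exp(x, x0, order):
    """Upper bound on the truncation error from the Lagrange remainder.

    The (order+1)-th derivative of exp is bounded on the segment between
    x0 and x by exp(max(x, x0)).
    """
    sup_derivative = exp(max(x, x0))
    return sup_derivative * abs(x - x0) ** (order + 1) / factorial(order + 1)


if __name__ == "__main__":
    x, x0 = 1.5, 1.0
    for order in (1, 2, 4, 8):
        err = abs(exp(x) - taylor_exp(x, x0, order))
        print(order, err, lagrange_bound_exp(x, x0, order))
```

In the setting of Theorem 3.7 the argument `x` would be the value of the signature functional on the price path, so the same bound controls the payoff approximation path by path.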
Example 3.15 (Hermite polynomials).
Let and and suppose we have an option that pays
The function can be approximated by Hermite polynomials. More precisely, for the ’th Hermite polynomial is given by
is the density of the standard normal distribution . Note that . Moreover, for the Hilbert space with inner product
we obtain an orthonormal basis given by
In particular, any function can be written as
As argued in [Ben21], the function for some constant , belongs to this Hilbert space. From this Hermite polynomial expansion, we have an exact formula for given by
Moreover, we can truncate this sum at any desired level to reach a suitable approximation by
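The Hermite expansion of Example 3.15 can be sketched numerically as follows. The sketch assumes the probabilists' Hermite polynomials, normalised by the square root of n! so that they are orthonormal with respect to the standard normal density, and it computes the coefficients with a plain trapezoidal rule on a truncated interval. The payoff `max(x, 0)` and all function names are illustrative assumptions.

```python
from math import exp, factorial, pi, sqrt


def hermite_e(n, x):
    """Probabilists' Hermite polynomial He_n(x) via He_{k+1} = x He_k - k He_{k-1}."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h


def phi(x):
    """Density of the standard normal distribution."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def hermite_coefficients(f, n_max, lo=-10.0, hi=10.0, steps=4000):
    """Coefficients c_n = <f, He_n / sqrt(n!)> in L^2(phi), n = 0..n_max.

    The integral is approximated by the composite trapezoidal rule on [lo, hi].
    """
    dx = (hi - lo) / steps
    xs = [lo + i * dx for i in range(steps + 1)]
    coeffs = []
    for n in range(n_max + 1):
        vals = [f(x) * hermite_e(n, x) * phi(x) for x in xs]
        integral = dx * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
        coeffs.append(integral / sqrt(factorial(n)))
    return coeffs


def hermite_approx(coeffs, x):
    """Evaluate the truncated Hermite expansion at x."""
    return sum(c * hermite_e(n, x) / sqrt(factorial(n)) for n, c in enumerate(coeffs))


if __name__ == "__main__":
    call = lambda x: max(x, 0.0)
    c = hermite_coefficients(call, 8)
    print(c[:3], hermite_approx(c, 1.0))
```

The expansion converges in the weighted mean-square sense, so pointwise accuracy is best where the normal density carries most of its mass.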
Example 3.16 (Bernstein approximation).
The standard choice of approximation of a continuous function by a polynomial is arguably the Bernstein polynomial. Any continuous function can be approximated arbitrarily well as follows
and . Again, if an option pays , and it satisfies Hypothesis 3.5 with a continuous function and a such that and
we have the Bernstein approximation
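A short sketch of the Bernstein construction in Example 3.16 is given below. It assumes the classical form on the unit interval, with an affine change of variable for a general compact interval; the call-type payoff and the interval endpoints in the usage lines are hypothetical.

```python
from math import comb


def bernstein(f, n, x):
    """Degree-n Bernstein polynomial of f on [0, 1], evaluated at x."""
    return sum(
        f(k / n) * comb(n, k) * x ** k * (1.0 - x) ** (n - k)
        for k in range(n + 1)
    )


def bernstein_on_interval(f, n, x, a, b):
    """Bernstein approximation of f on [a, b] via the map t = (x - a) / (b - a)."""
    t = (x - a) / (b - a)
    return bernstein(lambda s: f(a + s * (b - a)), n, t)


if __name__ == "__main__":
    payoff = lambda x: max(x - 1.0, 0.0)  # hypothetical call-type payoff, strike 1
    for n in (10, 50, 200):
        # The error is largest at the kink x = 1 and decays like n ** -0.5
        print(n, bernstein_on_interval(payoff, n, 1.0, 0.0, 2.0))
```

The slow decay at the kink illustrates why Bernstein polynomials, although universally applicable to continuous functions, typically need a higher degree than Taylor or Hermite expansions for the same accuracy.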
4. Stochastic market prices - lifting to rough paths
While the methodologies for numerical approximation of complex derivative payoffs we propose here will be in the spirit of model free finance, we will also connect the results to classical pricing when the underlying stock is assumed to be a semi-martingale. More specifically, we consider the case of an Itô process of the form
(4.1)
Here and are square integrable processes and adapted to the filtration generated by the Brownian motion . The process may be used to model the log-price dynamics of an asset price, or the absolute price, the volatility or any other relevant stochastic asset dynamic.
In light of the theory for rough paths and signatures presented briefly in the beginning of Section 2, it is only natural to ask whether the signature can be constructed above the stochastic process . More precisely, one wants to make sure that for almost all , the following map exists
Since is a Brownian motion, the path will be of finite -variation for . There is therefore no canonical construction of the iterated integral . However, since is a semi-martingale, we can use this probabilistic structure to construct the iterated integrals as random variables in . As is well-known, there are different ways of constructing this integral as a random variable, with Itô or Stratonovich integration as the most commonly used. It is up to the application at hand which integral to use for the specific task. Typically, in financial models, Itô integration is selected as this preserves adaptedness and a martingale structure, necessary for arbitrage-free pricing. Given the choice of integration, using the Burkholder-Davis-Gundy inequality, one can then apply Kolmogorov’s continuity theorem to identify a subset of full measure such that for each there exists a realization of the iterated integral , see e.g. [FH14, Section 3]. Moreover, one can verify that this object satisfies Chen’s relation
We can therefore conclude that is a -rough path according to Definition 2.10. By applying Theorem 2.11 we know that also the signature exists.
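The difference between the Itô and Stratonovich choices for the second level can be made concrete on a discretised path. The sketch below is a one-dimensional illustration under the assumption that the Itô integral is approximated by left-point sums and the Stratonovich integral by midpoint sums; the function names are hypothetical.

```python
import random


def brownian_increments(n_steps, horizon, seed=0):
    """Independent Gaussian increments of a Brownian motion on [0, horizon]."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    return [rng.gauss(0.0, dt ** 0.5) for _ in range(n_steps)]


def second_level(increments):
    """Discrete versions of the iterated integral of B against itself.

    Returns (ito, stratonovich), using left-point and midpoint sums
    for a path started at zero.
    """
    b = 0.0
    ito = 0.0
    strat = 0.0
    for db in increments:
        ito += b * db
        strat += (b + 0.5 * db) * db
        b += db
    return ito, strat


if __name__ == "__main__":
    incs = brownian_increments(10_000, 1.0)
    ito, strat = second_level(incs)
    b_T = sum(incs)
    # Stratonovich satisfies the shuffle identity B_T^2 = 2 * strat;
    # the Ito version differs by half the quadratic variation (close to T / 2).
    print(b_T ** 2, 2.0 * strat, 2.0 * ito, strat - ito)
```

The midpoint sum telescopes to half the squared terminal value, which is the shuffle identity of a geometric rough path, whereas the left-point sum misses it by half the realised quadratic variation. This is the sense in which the Itô lift is not geometric.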
For practical financial purposes, we are interested in computing the expected value of multivariate monomials of signature functionals, which will be seen to be a crucial component of the pricing approximation. More precisely, the price at time of a contingent claim on a financial asset with payoff at some future time can be written as
where is a pay-off functional, possibly dependent on the whole price path . As previously described, a pay-off functional is defined on the rough path lift, which makes explicit its dependence on the chosen lift, i.e. on the choice of stochastic integration. In the next section we will show how this price, given as the conditional expectation, can be approximated by a sum of different correlators of signature terms of the form
This is in contrast to the functional approximation considered in [LNPA20] where one computes the complete expected signature , or dynamically as , and then considers . In the setting of an Itô process , the latter methodology requires one to solve a (very) high dimensional Kolmogorov equation as described in [LN15]. However, since is an Itô process, one can show that is a real valued Itô process, as proven in the proposition below. Thus computing can be done by standard use of a (low) dimensional Kolmogorov equation, and one must then do this computation for and several different . Indeed, we provide a simple proof of this claim:
Proposition 4.1.
Suppose is an Itô process of the form (4.1) where and are adapted to the filtration generated by the Brownian motion. Consider the Itô lift . Then for any the process is an Itô process.
Proof.
Let be a single word of length , of the form for for all . If , it follows that is an Itô process by definition. When , assume that for all words of length , is square integrable and adapted. We then use the recursive definition of the signature in 2.4 to see that
To see that this process is an Itô process we must verify that it is square integrable. Using that , then by Hölder’s inequality we have that for all
By the inductive hypothesis, and so the product of the two is square integrable. Adaptedness follows immediately by the inductive hypothesis. ∎
For certain choices of and assumptions of the underlying stochastic process , we can compute explicit expressions for the signature moments of the form for . We illustrate this through some common choices in the following examples.
Example 4.2.
Let be a two dimensional Brownian motion with independent components, and consider the path and the word . Recall that this choice of relates to the spread options case considered in Example 3.4. We are interested in computing the -th moment of . We first see that . Note that since is a normally distributed random variable, their difference . A simple argument based on the Itô formula for shows that
where is a standard Brownian motion. Thus, using the Itô isometry, we see that
By Gaussianity, it furthermore follows that,
(4.2)
and for odd . Here, for being an odd number.
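The moment computation in Example 4.2 can be sketched as follows. The display formulas are not reproduced here, so the sketch rests on an explicit assumption: the pairing is taken to be the time integral of the difference of the two Brownian components over a horizon T, which is a centred Gaussian variable with variance 2T³/3. The function names and the Monte Carlo parameters are hypothetical.

```python
import random


def double_factorial(n):
    """n!! for n >= -1, with the convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def centered_gaussian_moment(variance, n):
    """E[Y^n] for Y ~ N(0, variance): zero for odd n, variance^(n/2) (n-1)!! for even n."""
    if n % 2 == 1:
        return 0.0
    return variance ** (n // 2) * double_factorial(n - 1)


def time_integral_variance(horizon):
    """Variance of the time integral of B1 - B2 over [0, horizon] (assumed pairing)."""
    return 2.0 * horizon ** 3 / 3.0


def monte_carlo_moment(horizon, n, n_paths=2000, n_steps=200, seed=1):
    """Crude Monte Carlo estimate of the n-th moment of the same time integral."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    total = 0.0
    for _ in range(n_paths):
        diff = 0.0      # B1 - B2, a Brownian motion with variance 2t
        integral = 0.0
        for _ in range(n_steps):
            integral += diff * dt
            diff += rng.gauss(0.0, (2.0 * dt) ** 0.5)
        total += integral ** n
    return total / n_paths


if __name__ == "__main__":
    T = 1.0
    for n in (2, 4):
        closed = centered_gaussian_moment(time_integral_variance(T), n)
        print(n, closed, monte_carlo_moment(T, n))
```

The closed form needs only the variance of one scalar Gaussian variable, regardless of the order of the moment, which is the computational point of the example.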
The above example shows that the moments of words applied to signatures are very easy to calculate when the path is the time-extended Brownian motion. [CSF21, Section 4.5] derive an explicit formula for the expected signature of the time-extended path of a -dimensional Brownian motion. According to formula (4.18) in [CSF21], one has
(4.3)
for words where , even with and is the th unit vector in . The tensor products are here interpreted as the non-symmetric ones and the iterated integrals in the signature are interpreted in the Stratonovich sense. We recall that in a financial setting however, we are mostly interested in martingale structures, typically guaranteed with the Itô lift, and must therefore also compute certain Itô-Stratonovich corrections that we do not consider further here. When is odd, the expected signature is zero. An extension of (4.3) to correlated Brownian motions is found in [CGSF23, Thm. A.1]. Equation (4.3) is of similar complexity to (4.2) in the example above. However, we see that to compute the expected signature, we need to find the whole representation of the expected signature (up to depth ) before applying it to words, whereas in our approach, we first identify the different terms in the signature which we need according to the words we are given, and then compute the expected moments in question. The words in the former approach might be very long as we have converted the moments into linear representations, and thus becomes big.
Towards signature approximations of complex derivative prices, we will need to investigate correlators of functionals of the signature. More precisely, given a multi-index , and weighted words we are interested in computing the correlator
(4.4)
where we recall that for a vector we define . Correlators appear in statistical turbulence theory as interesting objects to study, see [BNBV18]. We also mention [BL21] for correlators applied to financial derivatives pricing along with polynomial processes. Computing these correlators for some arbitrary sequence can be challenging, and if and get large, might even become infeasible. However, for certain linear functionals , typically given as short words or even single letters, the computation may be analytically tractable by invoking stochastic structures, even for large .
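When no closed form is available, a correlator of the type (4.4) can be estimated by Monte Carlo from simulated values of the few signature functionals involved. The sketch below assumes that these values have already been simulated and stored path by path; the function name `correlator` and the data layout are illustrative.

```python
from math import prod


def correlator(samples, powers):
    """Monte Carlo estimate of E[ Y_1^{n_1} * ... * Y_d^{n_d} ].

    samples: list of tuples, one per simulated path, holding the values
             (Y_1, ..., Y_d) of the d signature functionals on that path.
    powers:  multi-index (n_1, ..., n_d) of non-negative integers.
    """
    if not samples:
        raise ValueError("at least one sample path is required")
    total = 0.0
    for row in samples:
        total += prod(y ** n for y, n in zip(row, powers))
    return total / len(samples)


if __name__ == "__main__":
    # Two hypothetical paths with two functionals each.
    paths = [(1.0, 2.0), (3.0, 4.0)]
    print(correlator(paths, (1, 1)), correlator(paths, (2, 0)))
```

The same simulated values are reused for every multi-index, so increasing the polynomial degree costs only additional scalar multiplications and no further signature computations.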
In contrast, the signature methodologies presented in [LNPA19, LNPA20] would require one to compute the expected signature term for some which potentially becomes a very long sum of very large words. Indeed, given that is a geometric signature, as already discussed in Remark 3.13, using the shuffle product from Lemma 2.14, it is possible to find a such that
Computationally, it will be more expensive to compute the right hand side than the left hand side, since one would need to compute the complete signature up to a very high degree (i.e. the length of the longest single word in ). Furthermore, we have that the correlator , and thus computing these expected signature terms will be challenging. Even in the setting where the underlying path solves an Itô SDE, one is required to solve a Fokker-Planck equation with values in the tensor algebra , which is numerically very challenging (see e.g. [LN15]).
5. Approximation of exotic derivatives
We are now ready to present the core results of this article, namely, an approximation formula for exotic, path dependent, financial derivatives. To this end, we begin with an assumption on the probability spaces we work with for the admissible price paths.
Hypothesis 5.1 (Market prices and probability measures).
We consider a complete stochastic basis supporting market prices as measurable maps from to , with the property that for any there exists a compact subset such that
Remark 5.2.
Let be a separable and complete metric space. Then every Borel probability measure on is tight; that is, for every , there exists a compact set such that , see [Lin86]. Consequently, if were separable, the hypothesis above would be unnecessary.
However, as shown in [FV10, Sec. 8], the space is in general not separable. In contrast, the closure of smooth paths with values in the -step free Lie group in -variation norm, denoted by , is separable. Note also that we have the inclusion of spaces
Since is not separable, we must explicitly assume that all distributions on are tight, or equivalently, that they are Radon measures (finite tight Borel measures). In the non-separable setting of , Radon measures - characterized by their separable image - are the "right" type of measures to consider. For further details, see [Lin86] or [Bil68].
The above market prices and probability measures provide a broad framework for pricing. In the following we will not deal with the problem of risk neutral prices, and rather refer the reader to [Arr18, LNPA20] for a discussion on this point. For the purpose here, the reader may assume that the probability measure chosen is a risk neutral measure in the context of the given pricing problem.
Theorem 5.3.
Let be a filtered probability space satisfying Hypothesis 5.1, and suppose the price of a financial derivative, denoted by , can be represented by an adapted payoff functional acting on the set of random price paths for some sufficiently large (see Remark 5.4), and is given by
(5.1)
Furthermore, suppose for all , for some . Then for any there exists a finite set and a sequence of numbers such that
where denotes the signature correlator from (4.4) for the stochastic process .
Remark 5.4.
Here we consider payoff functionals as functionals acting on random variables with finite -moments. The requirement on will depend on the approximation accuracy that is desired, since finiteness of the correlators is only guaranteed from the moments of the price paths.
Proof.
We begin by observing that for some sufficiently large compact subset we have
In the last inequality we have applied Hölder’s inequality to the product , invoking the bound on the -moment of as well as Hypothesis 5.1 on the probability measure to get that . Now apply Theorem 3.3 to the payoff functional; the expectation acting on the corresponding signature polynomial from Theorem 3.3 then yields the correlators from (4.4) applied to the signature . Since can be chosen arbitrarily small by assumption (just choose larger), and likewise for the universal approximation, the proof is complete. ∎
The next corollary enables further simplifications for pricing approximation by invoking Hypothesis 3.5 on the payoff functional.
Corollary 5.5.
Suppose the price of a financial derivative, denoted by , can be represented by a payoff functional acting on the set of admissible price paths , and is given by (5.1). Furthermore, suppose that for Hypothesis 3.5 holds for some function . Then for any there exists a finite set and a sequence of numbers such that
Proof.
We illustrate the above corollary by considering a specific choice of functional, common in financial practice, namely the max function acting on the signature, which is the functional underlying certain Asian options.
Example 5.6 (Simple Asian option).
We will in this example consider a very basic Asian option, and show how moments of the signature can be used to approximate this price. Of course, there are well known formulas and approximations (see e.g. [GY93]) for the price of an Asian option in the Itô setting, but we believe that this is instructive. In incomplete markets, one could also imagine that is not an Itô process, and therefore the following expansion could still be an interesting pricing technique.
Let us consider the standard Asian call option payoff function
A smooth approximation for the max-function is given by , where is the well-known sigmoid function and for some large enough . The figure below shows how this function, , resembles the max-function.
We have a Maclaurin series for the sigmoid function , and multiplying by we get:
(5.2)
where is the Euler polynomial. Truncating this approximation at level , one gets a price approximation for
Providing further theoretical convergence rates can be done under certain assumptions on the moments , but will not be further dealt with here. However, we believe that the example highlights how moments of signatures can be used in derivative price approximation.
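The smoothing and series steps of Example 5.6 can be sketched numerically. Instead of Euler polynomials, the sketch obtains the Maclaurin coefficients of the sigmoid by inverting the power series of 1 + e^{-x}, which yields the same coefficients; this substitution, the function names and the steepness parameter `k` are assumptions made for illustration.

```python
from fractions import Fraction
from math import exp, factorial


def sigmoid_maclaurin(order):
    """Exact Maclaurin coefficients s_0, ..., s_order of 1 / (1 + e^{-x})."""
    # g(x) = 1 + e^{-x} has coefficients g_0 = 2 and g_n = (-1)^n / n!.
    g = [Fraction(2)] + [Fraction((-1) ** n, factorial(n)) for n in range(1, order + 1)]
    s = [Fraction(1, 2)]
    for n in range(1, order + 1):
        # The product g * s equals 1, so its n-th coefficient vanishes for n >= 1.
        s.append(-sum(g[j] * s[n - j] for j in range(1, n + 1)) / g[0])
    return s


def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))


def smooth_max(x, k):
    """Smooth approximation x * sigmoid(k x) of max(x, 0)."""
    return x * sigmoid(k * x)


def smooth_max_series(x, k, order):
    """Truncated power series of x * sigmoid(k x) around x = 0."""
    s = sigmoid_maclaurin(order)
    return sum(float(c) * k ** n * x ** (n + 1) for n, c in enumerate(s))


if __name__ == "__main__":
    print(sigmoid_maclaurin(5))
    print(smooth_max(0.1, 10.0), smooth_max_series(0.1, 10.0, 15))
```

Two caveats are worth keeping in mind. The gap between `smooth_max` and the true max function is of order 1/k, while the Maclaurin series of the sigmoid converges only for arguments of modulus below π, so the truncated series is reliable only when k times the argument stays inside that radius. A larger k therefore improves the smoothing error but narrows the range on which the power series can be used.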
5.1. Pricing exotic derivatives in electricity markets
We consider here some cases of interest in pricing and valuation in electricity markets.
The so-called quality factor is used to assess the profitability of renewable power production such as solar or wind. It measures the income relative to a plant with fixed base load price producing the same volume. The quality factor is defined as
(5.3)
Here, is the volume power produced at time and is the spot power price. A natural question is to ask what is the expected quality factor, i.e., . The expectation is either under the risk-adjusted probability or the market probability.
The volume produced is given by the installed capacity (measured in megawatt (MW)) times the capacity factor . The process takes values between 0 and 1, measuring the amount of production from solar or wind in a power plant of capacity 1 MW. We also introduce a maximal power price for the market, denoted , which can be an upper limit we believe never will be exceeded in practice. Hence, with and we get
But since , we have that and similarly , which yields that
Introduce the time-enhanced process where . Since , it follows from the shuffle property in Lemma 2.14 that
where
as long as is a geometric rough path. Moreover, we notice that and with and , resp. Hence, after truncation, we compute an approximation of the expected value of the quality factor by
(5.4)
We can simulate the functionals inside the above expectation given stochastic models of and . Notice that we only need to have available simulations of the three functionals in order to compute the expectations by Monte Carlo for any orders of and . If we appeal to Lemma 2.14 we can use the shuffle property to re-state the signature functionals inside the expectation to , however, will depend on and and therefore we would need to consider signature terms of higher and higher order in the calculation. This shows the power of our approach.
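The reciprocal expansion behind the truncated formula (5.4) can be illustrated with a short sketch. It assumes, as the surrounding text suggests but the extracted formulas do not show explicitly, that the two normalised averages lie in the interval (0, 1] and that each reciprocal is expanded as a geometric series in one minus the variable. The names `a`, `b` and `inverse_product_series` are hypothetical.

```python
def inverse_product_series(a, b, order):
    """Truncated double geometric series for 1 / (a * b), for 0 < a, b <= 1.

    Uses 1 / x = sum_{n >= 0} (1 - x)^n, truncated at the given order
    in each variable.
    """
    sum_a = sum((1.0 - a) ** m for m in range(order + 1))
    sum_b = sum((1.0 - b) ** n for n in range(order + 1))
    return sum_a * sum_b


if __name__ == "__main__":
    a, b = 0.5, 0.8   # e.g. average capacity factor and normalised average price
    for order in (2, 5, 20, 60):
        print(order, inverse_product_series(a, b, order), 1.0 / (a * b))
```

The convergence rate is geometric with ratio one minus the variable, so the truncation level needed in practice grows when the average capacity factor or the normalised price is close to zero.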
Remark 5.7.
In the above discussion of the computation of the expected quality factor, we have to assume being a geometric rough path in order to re-express a product between the capacity factor and price processes as a functional on the signature. The capacity factor process is derived from wind speeds or solar irradiation, where empirical studies indicate that higher-order continuous time autoregressive (CAR) processes are suitable (see e.g. [BvB09] for wind and [LGB23] for solar). The CAR-processes are of order 2 or higher, implying that the paths are continuously differentiable and thus of finite 1-variation. On the other hand, different studies point to power spot prices being sums of Ornstein-Uhlenbeck processes (see [LS02]), or even fractional models with Hurst parameters less than 0.5 (see [Ben17]), and therefore may have regularity at most as Brownian motion. We notice however, that power spot prices (which is what the process models) are by definition only available on a discrete time grid (typically of hourly granularity), and hence we can in principle imagine paths which are of high regularity when considering continuous-time paths.
Recall Example 3.4. Quanto options are options with a product payoff on two underlying assets, typically being a call or put on price and a volume variable. The volume variable indicates production indirectly, for example through temperature (which controls the demand for power) or wind speed (which controls the amount of renewable wind power that can be generated). Let now be the volume variable (temperature, wind…). In power, a typical option is a call on average volume over a period, and a put on the price average over the same period (this gives an insurance against too low prices when renewable power production is high).
In general, this can be expressed as
where for some , with words and . Here, the time-enhanced process is . If we have power series expansions of and available, we can find an approximation of the price expressed by the risk-adjusted expectation by computing terms of the kind
These expected values are simplified versions of the expectation for the quality factor in (5.4) discussed above. The payoff of quanto options motivates studying the following general structure: let be some measurable function, and consider the payoff
for words and path . Indeed, the quality factor is itself an example of a specification of an . The time-enhanced price path may also consist of more variables than and . For example, one could have an option settled on temperature, wind and price.
Example 5.8.
In this example we will illustrate how specific structural assumptions on the driving underlying price processes can be used to make price approximations explicit by leveraging moment computations. Consider two correlated Ornstein-Uhlenbeck processes
for two independent Brownian motions and , where is the correlation coefficient. Solving explicitly, we see that the difference is given by
(5.5)
We then see that is normally distributed with mean , and second moment given by
(5.6)
Now, just as in Example 4.2, we again consider the pairing where A simple use of Fubini yields that
(5.7)
Using the Itô isometry, we get that the mean and second moment of are given by
Moreover, since is Gaussian, the higher order moments become
where is the confluent hypergeometric function. Now, the correlator is given by , hence from Corollary 5.5, we know there exist and such that
Consider the setting where the expected functional is an Asian option, as analyzed in Example 5.6. One can then use the coefficients found there to get an explicit formula for the approximation of an Asian spread option, in the case when the underlying processes are assumed to be given by Ornstein-Uhlenbeck processes.
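The higher order moments of a Gaussian variable with non-zero mean, as needed in Example 5.8, can also be generated without special functions. The sketch below uses the elementary two-term recursion for raw Gaussian moments in place of the confluent hypergeometric representation; the mean and variance passed in are placeholders for the quantities computed in the example.

```python
def gaussian_raw_moments(mean, variance, n_max):
    """Raw moments E[X^n], n = 0..n_max, of X ~ N(mean, variance).

    Uses the recursion m_n = mean * m_{n-1} + (n - 1) * variance * m_{n-2},
    which follows from Gaussian integration by parts (Stein's identity).
    """
    moments = [1.0, mean]
    for n in range(2, n_max + 1):
        moments.append(mean * moments[n - 1] + (n - 1) * variance * moments[n - 2])
    return moments[: n_max + 1]


if __name__ == "__main__":
    # Placeholder mean and variance of the Gaussian signature functional.
    print(gaussian_raw_moments(2.0, 3.0, 6))
```

Combined with the series coefficients of a smooth payoff approximation, these moments give the truncated price approximation as a finite weighted sum.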
6. Conclusion
In this work, we develop a new type of universal approximation theorem for non-geometric rough paths, addressing the practical challenges of financial applications that naturally involve Itô integration. The results presented provide a robust framework for advancing derivatives pricing methodologies in financial markets, ensuring both computational efficiency and theoretical rigor. By introducing a polynomial-based approximation framework, we demonstrated how complex payoff functionals for financial derivatives can be efficiently represented and approximated using signature terms. This approach bridges the gap between rough path theory and the practical requirements of financial markets, providing a robust tool for pricing exotic derivatives and path-dependent contracts.
Our results highlight the versatility of signatures in capturing the intricacies of stochastic paths while enabling computational efficiency, and show how this can be used in finance. The proposed framework not only broadens the scope of universal approximation beyond geometric rough paths but also lays a foundation for further exploration of functional approximation in stochastic finance. Specifically, it builds further on the research developed in [LNPA19, LNPA20, Arr18], and provides a new perspective in the Itô setting. The methodology for universal approximation here also seems promising in the context of the Volterra signature induced from the analysis in [HT21], and this is currently something we are working on. Several new applications of this method seem promising.
References
- [Arr18] Imanol Perez Arribas. Derivatives pricing using signature payoffs, 2018.
- [BB02] Douglas Bowman and David M. Bradley. The algebra and combinatorics of shuffles and multiple zeta values. Journal of Combinatorial Theory, Series A, 97(1):43–61, 2002.
- [Ben17] Mikkel Bennedsen. A rough multi-factor model of electricity spot prices. Energy Economics, 63:301–313, 2017.
- [Ben21] Fred Espen Benth. Pricing of commodity and energy derivatives for polynomial processes. Mathematics, 9(2), 2021.
- [Bil68] Patrick Billingsley. Convergence of probability measures. John Wiley & Sons, Inc., New York-London-Sydney, 1968.
- [BL21] Fred Espen Benth and Silvia Lavagnini. Correlators of polynomial processes. SIAM Journal on Financial Mathematics, 12(4):1374–1415, 2021.
- [BLM15] Fred Espen Benth, Nina Lange, and Tor Aage Myklebust. Pricing and hedging quanto options in energy markets. Journal of Energy Markets, 8(1):1–35, 2015.
- [BNBV18] Ole E. Barndorff-Nielsen, Fred Espen Benth, and Almut Veraart. Ambit Stochastics, volume 88 of Probability Theory and Stochastic Modelling. Springer Nature, Cham, 2018.
- [Bre11] Haim Brezis. Functional Analysis, Sobolev Spaces and Partial Differential Equations. Universitext. Springer, New York, 2011.
- [BvB09] Fred Espen Benth and Jurate Šaltytė Benth. Dynamic pricing of wind futures. Energy Economics, 31(1):16–24, 2009.
- [CGSF23] Christa Cuchiero, Guido Gazzani, and Sara Svaluto-Ferro. Signature-based models: theory and calibration. SIAM Journal on Financial Mathematics, 14(3), 2023.
- [Che54] Kuo-Tsai Chen. Iterated integrals and exponential homomorphisms. Proceedings of the London Mathematical Society, s3-4(1):502–512, 1954.
- [CO22] Ilya Chevyrev and Harald Oberhauser. Signature moments to characterize laws of stochastic processes. Journal of Machine Learning Research, 23(176):1–42, 2022.
- [CPSF23] Christa Cuchiero, Francesca Primavera, and Sara Svaluto-Ferro. Universal approximation theorems for continuous functions of càdlàg paths and Lévy-type signature models, 2023.
- [CSF21] Christa Cuchiero and Sara Svaluto-Ferro. Infinite-dimensional polynomial processes. Finance & Stochastics, 25:383–426, 2021.
- [FH14] Peter K. Friz and Martin Hairer. A Course on Rough Paths. Universitext. Springer, Cham, 2014.
- [FV10] Peter K. Friz and Nicolas B. Victoir. Multidimensional Stochastic Processes as Rough Paths: Theory and Applications. Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2010.
- [Gul24] Jacek Gulgowski. Compactness in the spaces of functions of bounded variation. Zeitschrift für Analysis und ihre Anwendungen, 42, 01 2024.
- [GY93] Helyette Geman and Marc Yor. Bessel processes, Asian options, and perpetuities. Mathematical Finance, 3:349–375, 1993.
- [HL10] Ben Hambly and Terry Lyons. Uniqueness for the signature of a path of bounded variation and the reduced path group. Annals of Mathematics, 171(1):109–167, 2010.
- [HT21] Fabian A. Harang and Samy Tindel. Volterra equations driven by rough signals. Stochastic Process. Appl., 142:34–78, 2021.
- [Lan02] Serge Lang. Algebra, volume 211 of Graduate Texts in Mathematics. Springer-Verlag, New York, third edition, 2002.
- [LCL07] Terry J. Lyons, Michael Caruana, and Thierry Lévy. Differential Equations Driven by Rough Paths, volume 1908 of Lecture Notes in Mathematics. Springer, Berlin, 2007.
- [LGB23] Karl Larsson, Rikard Green, and Fred Espen Benth. A stochastic time-series model for solar irradiation. Energy Economics, 117, 2023.
- [Lin86] Werner Linde. Probability in Banach spaces—stable and infinitely divisible distributions. A Wiley-Interscience Publication. John Wiley & Sons, Ltd., Chichester, second edition, 1986.
- [LN15] Terry Lyons and Hao Ni. Expected signature of a Brownian motion up to the first exit time from a bounded domain. The Annals of Probability, 43(5):2729–2762, 2015.
- [LNPA19] Terry Lyons, Sina Nejad, and Imanol Perez Arribas. Numerical method for model-free pricing of exotic derivatives in discrete time using rough path signatures. Applied Mathematical Finance, 26(6):583–597, 2019.
- [LNPA20] Terry Lyons, Sina Nejad, and Imanol Perez Arribas. Non-parametric pricing and hedging of exotic derivatives. Applied Mathematical Finance, 27(6):457–494, 2020.
- [LS02] Julio J. Lucia and Eduardo S. Schwartz. Electricity prices and power derivatives: Evidence from the Nordic power exchange. Review of Derivatives Research, 5(1):5–50, 2002.
- [Lyo98] T. Lyons. Differential equations driven by rough signals. Revista Matemática Iberoamericana, pages 215–310, 1998.