
AVERAGE CONTROLLED AND AVERAGE NATURAL MICRO DIRECT EFFECTS IN SUMMARY CAUSAL GRAPHS

A PREPRINT

Simon Ferreira¹ and Charles K. Assaad¹

¹ Sorbonne Université, INSERM, Institut Pierre Louis d’Epidémiologie et de Santé Publique, F75012, Paris, France

arXiv:2410.23975v1 [cs.AI] 31 Oct 2024

ABSTRACT
In this paper, we investigate the identifiability of average controlled direct effects and average natural
direct effects in causal systems represented by summary causal graphs, which are abstractions of
full causal graphs, often used in dynamic systems where cycles and omitted temporal information
complicate causal inference. In the traditional linear setting, direct effects are typically easier to identify
and estimate. Non-parametric direct effects are much harder to define and identify, yet they are crucial for
handling real-world complexities, particularly in epidemiological contexts where relationships between variables
(e.g., genetic, environmental, and behavioral factors) are often non-linear. In particular, we give sufficient
conditions for identifying the average controlled micro direct effect and the average natural micro direct effect
from summary causal graphs in the presence of hidden confounding. Furthermore, we show that the conditions given
for the average controlled micro direct effect also become necessary in the setting where there is no hidden
confounding and where we are only interested in identifiability by adjustment.

Keywords Summary causal graph, Controlled direct effect, Natural direct effect

1 Introduction

The identification and estimation of direct effects are critical in many applications. For instance, epidemiologists
aim to measure how smoking affects lung cancer risk independently of genetic susceptibility [Zhou et al., 2021]. In
addition, it has been shown that estimating direct effects can help diagnose system failures by comparing the direct
impacts of different components before and after failure [Assaad et al., 2023].
In the framework of Structural Causal Models (SCMs) [Pearl, 2009], there has been significant progress in identifying
direct effects using fully specified causal graphs. However, constructing a fully specified causal graph is a challenging
task, as it requires precise knowledge of the causal relationships between all pairs of observed variables. In complex,
high-dimensional settings such as in medico-administrative databases, this level of detail is often unavailable, limiting
the practical application of causal inference. As a result, there has been growing interest in the use of partially
specified causal graphs in recent years [Perkovic, 2020, Anand et al., 2023, Ferreira and Assaad, 2024a, Assaad et al.,
2024, Wahl et al., 2024]. One notable example is the cluster-ADMG (Acyclic Directed Mixed Graph) introduced by
Anand et al. [2023], which offers a coarser representation of causal relationships. In this graph, vertices represent
clusters of variables and therefore it simplifies the representation of complex systems. However, one main limitation
of this type of graph is its acyclicity assumption which does not necessarily hold even if the underlying fully specified
graph is acyclic. A closely related graph type is the summary causal graph (SCG), which is mainly used in systems
involving time and which relaxes the assumption of acyclicity and represents causal relationships where each cluster
corresponds to a time series or repeated measurements of the same variable in longitudinal studies. This flexibility
makes SCGs particularly useful in scenarios involving temporal data and cycles.
In SCGs, two types of causal effects may be of interest. The first is the macro causal effect, which captures the impact
of one set of time series on another set of time series. The second is the micro causal effect, which concerns the effect
of one specific time point on another. This paper focuses on micro causal effects, with particular emphasis on micro
direct effects. To date, the problem of identifying micro direct effects from SCGs has primarily been addressed under
the assumptions of no hidden confounding and linear SCMs [Ferreira and Assaad, 2024a]. However, many real-world
applications involve hidden confounding and relationships between causes and effects that are not necessarily linear. In
a non-parametric setting, defining direct effects becomes more complex. The causal inference literature primarily
offers two definitions: the controlled direct effect and the natural direct effect [Robins and Greenland, 1992, Pearl,
2001]. The former isolates the impact of the treatment variable by controlling for mediators or for the parents of the
response variable, while the latter captures how the treatment variable influences the response variable
while fixing mediators to their natural values, making it more realistic but harder to identify and estimate.
This paper focuses on the identifiability of direct effects from SCGs in a non-parametric setting. Our contributions
are threefold: first, we present sufficient conditions that characterize cases where the average controlled micro direct
effect is graphically identifiable from an SCG in the presence of hidden confounding. Next, we show that those
conditions become necessary in the absence of hidden confounding and when considering only identifiability by
adjustment. Finally, we establish sufficient conditions for the identifiability of the average natural micro direct effect
from an SCG with hidden confounding.
The remainder of the paper is organized as follows: Section 2 reviews related work, while Section 3 introduces
preliminaries and tools necessary for the subsequent sections. Section 4 provides identifiability conditions for average
controlled micro direct effects, and Section 5 addresses identifiability conditions for average natural micro direct
effects. Finally, Section 6 concludes the paper.

2 Related works
There is a substantial body of work in the literature focused on identifying causal effects using partially specified
graphs. For instance, the studies by Maathuis and Colombo [2013], Perkovic et al. [2016], Perkovic [2020],
Wang et al. [2023] provided conditions for identifying total effects through partially directed graphs that represent
all graphs within the Markov equivalence class of the true causal graph, including CPDAGs, MPDAGs, and PAGs.
Furthermore, Flanagan [2020] provided conditions for identifying average controlled direct effects and average natural
direct effects using CPDAGs and MPDAGs assuming no hidden confounding.
In the context of Cluster-ADMGs (acyclic directed mixed graphs where each vertex represents a group of variables),
Anand et al. [2023] extended Pearl’s do-calculus [Pearl, 1995] to establish necessary and sufficient conditions for
identifying total effects between clusters. Building on this, Ferreira and Assaad [2024b] applied these results to SCGs,
where each cluster represents a time series and which, unlike Cluster-ADMGs, can contain cycles. However, both
studies focus on macro causal effects (i.e., effects between clusters), whereas the present paper is concerned with
micro-level causal effects within clusters.
For micro causal effects, Eichler and Didelez [2007] provided sufficient conditions for identifying micro total effects
in time series, applicable to SCGs under the assumption of no instantaneous relations. When instantaneous relations
are allowed, Assaad et al. [2023] showed that both micro total and micro direct effects are identifiable from SCGs
in a linear setting, assuming no hidden confounders and that cycles in the SCG cannot be of size greater than one.
Subsequently, Ferreira and Assaad [2024a] and Assaad et al. [2024] extended these results, establishing conditions for
identifying micro total effects in a non-parametric setting and micro direct effects in a linear setting, even when cycles
larger than one exist in the SCG. More recently, Assaad [2024] provided sufficient conditions for identifying total
effects in the presence of hidden confounding. However, identifying micro direct effects in a non-parametric setting
or with hidden confounding from SCGs (or any graph with cycles) remains an open challenge. This is not surprising,
as direct effects are inherently more complex than total effects in a non-parametric setting.

3 Preliminaries
In this section, we introduce the tools, terminology, and key theoretical results that will be essential for the remainder
of the paper.
In the remainder, uppercase letters are used to denote variables, lowercase letters represent specific values of those
variables, and blackboard bold letters indicate sets of variables. For any given graph, $Parents(\cdot, \cdot)$, $Ancestors(\cdot, \cdot)$,
and $Descendants(\cdot, \cdot)$ will be used to represent the parents, ancestors, and descendants of a vertex, respectively. The
first argument refers to the specific vertex under consideration, while the second argument corresponds to the graph
in which these relationships are being examined. For instance, $Parents(X, \mathcal{G})$ refers to the parents of vertex $X$ in
graph $\mathcal{G}$. By convention, we will treat any vertex as an ancestor and a descendant of itself. The mutilated graph
$\mathcal{G}_{\overline{\mathbb{W}}\underline{\mathbb{Z}}}$ is obtained from $\mathcal{G}$ by removing all edges with an arrowhead pointing to $\mathbb{W}$
(e.g., $X \rightarrow W$, $X \leftrightarrow W$), as well as all edges with a tail originating from $\mathbb{Z}$ (e.g., $X \leftarrow Z$).
In a graph, a path is said to be activated by $\mathbb{W}$ if every collider $W$
(i.e., a node with structure $\rightarrow W \leftarrow$) on the path, or any of its descendants, is in $\mathbb{W}$, and no non-collider on the path
is in $\mathbb{W}$. A path that is not activated is said to be blocked. If all paths between $X$ and $Y$ are blocked by a set $\mathbb{W}$, then
$X$ is said to be d-separated from $Y$ by $\mathbb{W}$ in the graph $\mathcal{G}$, denoted as $X \perp\!\!\!\perp_{\mathcal{G}} Y \mid \mathbb{W}$. The strongly connected component
of a vertex $X$ in a graph $\mathcal{G}$, denoted by $Scc(X, \mathcal{G})$, is the maximal subset of vertices that includes $X$, such that every
vertex in this subset has a directed path to every other vertex in the subset. By convention, $Scc(X, \mathcal{G}) = \emptyset$ if there is
no cycle in the graph that includes $X$ and $Scc(X, \mathcal{G}) = \{X\}$ if the cycle that includes $X$ is a self loop on $X$. We refer
to $\Pr$ as the probability distribution, and $\Pr(\cdot \mid do(X = x))$ as the interventional distribution, where the $do(\cdot)$ operator
represents the intervention.
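The notion of d-separation defined above can be checked mechanically. The following is a minimal sketch, assuming a plain DAG without bidirected edges (hidden confounders would have to be added as explicit vertices). It uses the classical moralised ancestral graph criterion rather than enumerating paths; the function names `ancestors_of` and `d_separated` are illustrative and do not come from the paper.

```python
from itertools import combinations


def ancestors_of(nodes, parents):
    """Return `nodes` plus all their ancestors (a vertex counts as its own ancestor)."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(edges, xs, ys, ws):
    """True if `ws` blocks every path between `xs` and `ys` in the DAG `edges`."""
    xs, ys, ws = set(xs), set(ys), set(ws)
    parents = {}
    for a, b in edges:  # each pair (a, b) encodes the edge a -> b
        parents.setdefault(b, set()).add(a)
    # Step 1: restrict to the ancestors of all vertices involved.
    keep = ancestors_of(xs | ys | ws, parents)
    # Step 2: moralise, i.e. link parent-child pairs and parents sharing a child.
    nbrs = {v: set() for v in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            nbrs[p].add(child)
            nbrs[child].add(p)
        for p, q in combinations(ps, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # Step 3: look for an undirected path from xs to ys that avoids ws.
    start = xs - ws
    seen, stack = set(start), list(start)
    while stack:
        v = stack.pop()
        if v in ys:
            return False
        for u in nbrs[v] - ws:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return True


chain = [("X", "W"), ("W", "Y")]                  # X -> W -> Y
collider = [("X", "W"), ("Y", "W"), ("W", "D")]   # X -> W <- Y, W -> D
print(d_separated(chain, {"X"}, {"Y"}, {"W"}))     # True: W is a non-collider in the set
print(d_separated(collider, {"X"}, {"Y"}, {"D"}))  # False: D is a descendant of the collider
```

The second call illustrates the clause on descendants of colliders: conditioning on D activates the path through W.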
In this study, we consider an unknown discrete-time dynamic structural causal model, a specific variant of the structural
causal model [Pearl, 2009], designed to account for temporal dynamics.
Definition 1 (Discrete-time dynamic structural causal model (DTDSCM)). A discrete-time dynamic structural causal
model is a tuple $\mathcal{M} = (\mathbb{L}, \mathbb{V}, \mathbb{F}, \Pr(\mathbf{l}))$, where $\mathbb{L} = \bigcup \{\mathbb{L}^{v^i_t} \mid i \in [1, d], t \in [t_0, t_{max}]\}$ is a set of exogenous variables,
which cannot be observed but affect the rest of the model. $\mathbb{V} = \bigcup \{\mathbb{V}^i \mid i \in [1, d]\}$ such that $\forall i \in [1, d]$, $\mathbb{V}^i = \{V^i_t \mid t \in [t_0, t_{max}]\}$,
is a set of endogenous variables, which are observed and every $V^i_t \in \mathbb{V}$ is functionally dependent on some
subset of $\mathbb{L}^{v^i_t} \cup \mathbb{V}_{\le t} \setminus \{V^i_t\}$ where $\mathbb{V}_{\le t} = \{V^j_{t'} \mid j \in [1, d], t' \le t\}$. $\mathbb{F}$ is a set of functions such that for all $V^i_t \in \mathbb{V}$, $f^{v^i_t}$ is a
mapping from $\mathbb{L}^{v^i_t}$ and a subset of $\mathbb{V}_{\le t} \setminus \{V^i_t\}$ to $V^i_t$. $\Pr(\mathbf{l})$ is a joint probability distribution over $\mathbb{L}$.

Some of the findings in this paper consider an unknown, general DTDSCM, while others focus on a specific subfamily
of DTDSCM that are constrained by the following assumption.
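Assumption 2. $\forall \mathbb{L}^{v^i_t}, \mathbb{L}^{v^j_{t'}} \in \mathbb{L}$ such that $\mathbb{L}^{v^i_t} \ne \mathbb{L}^{v^j_{t'}}$, $\Pr(\mathbb{L}^{v^i_t} \cap \mathbb{L}^{v^j_{t'}}) = \Pr(\mathbb{L}^{v^i_t})\Pr(\mathbb{L}^{v^j_{t'}})$, i.e., no hidden confounding.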

Furthermore, we suppose that in every unknown DTDSCM, there exists a known maximal lag between causes and
effects, denoted as $\gamma_{max}$. Additionally, we assume that the probability distribution $\Pr$ over the observed variables is
positive. Finally, we suppose that each DTDSCM induces a full-time acyclic directed mixed graph (FT-ADMG), a
specific type of ADMG [Richardson, 2003], where bidirected dashed arrows represent hidden confounding. Under
Assumption 2, the FT-ADMG reduces to a full-time directed acyclic graph (FT-DAG).
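Definition 3 (Full-Time Acyclic Directed Mixed Graph). Consider a DTDSCM $\mathcal{M}$. The full-time acyclic directed
mixed graph (FT-ADMG) $\mathcal{G} = (\mathbb{V}, \mathbb{E})$ induced by $\mathcal{M}$ is defined in the following way:

$$\mathbb{E}^1 := \{X_{t-\gamma} \rightarrow Y_t \mid \forall Y_t \in \mathbb{V},\ X_{t-\gamma} \in \mathbb{X} \subseteq \mathbb{V}_{\le t} \setminus \{Y_t\} \text{ such that } Y_t := f^{y_t}(\mathbb{X}, \mathbb{L}^{y_t}) \text{ in } \mathcal{M}\},$$
$$\mathbb{E}^2 := \{X_{t-\gamma} \leftrightarrow Y_t \mid \forall Y_t,\ \forall X_{t-\gamma} \in \mathbb{V} \setminus \{Y_t\} \text{ such that } \Pr(\mathbb{L}^{x_{t-\gamma}} \cap \mathbb{L}^{y_t}) \ne \Pr(\mathbb{L}^{x_{t-\gamma}})\Pr(\mathbb{L}^{y_t})\},$$

where $\mathbb{E} = \mathbb{E}^1 \cup \mathbb{E}^2$.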

The findings of this study, when viewed in isolation, do not rely on any additional assumptions. However, following
the identification of the direct effect, the estimation phase involves using actual data. If the data consist of time series
(e.g., various time series describing a specific patient), a stationarity assumption becomes necessary to satisfy the
positivity assumption. Conversely, the stationarity assumption is not required when working with spatio-temporal or
cohort data.
In the context of DTDSCM and FT-ADMG, both macro and micro causal effect questions can be posed
[Ferreira and Assaad, 2024b]. A macro causal effect refers to the influence of an entire time series (or multiple time
series) on another series, whereas a micro causal effect focuses on the influence of specific time points (or multiple
time points) on other time points. This work specifically focuses on micro causal effects, with a particular emphasis
on micro direct effects. In a linear setting (where all functions in $\mathbb{F}$ are linear), the micro direct effect of $X_{t-\gamma}$ on $Y_t$ is
represented by the path coefficient of $X_{t-\gamma}$ in $f^{y_t}$ within the DTDSCM. However, in a non-parametric setting, defining
the direct effect becomes more complex. The literature distinguishes two types of non-parametric direct effects, the
first being the controlled direct effect, which is defined as follows.
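Definition 4 (Average Controlled Micro Direct Effect [Pearl, 2001]). Given an FT-ADMG $\mathcal{G} = (\mathbb{V}, \mathbb{E})$, $Y_t, X_{t-\gamma} \in \mathbb{V}$ and
$\mathbb{Z} = Parents(Y_t, \mathcal{G}) \setminus \{X_{t-\gamma}\}$. The controlled micro direct effect of changing $X_{t-\gamma}$ from $x$ to $x'$ on $Y_t$ while keeping $\mathbb{Z}$
at value $\mathbf{z}$ is defined as

$$CDE(X^{x,x'}_{t-\gamma}, Y_t, \mathbf{z}) = E(Y_t \mid do(X_{t-\gamma} = x'), do(\mathbb{Z} = \mathbf{z})) - E(Y_t \mid do(X_{t-\gamma} = x), do(\mathbb{Z} = \mathbf{z})).$$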

Figure 1: An SCG in (a) with two compatible FT-ADMGs in (b) and (c). Each pair of red and blue vertices represents
the micro direct effect we are interested in. Both the controlled direct effect and the natural direct effect are not
identifiable using the conditions given in this paper.

Figure 2: An SCG in (a) with a compatible FT-ADMG in (b). Each pair of red and blue vertices represents the micro
direct effect we are interested in. Both the controlled direct effect and the natural direct effect are not identifiable using
the conditions given in this paper.

In general, average controlled micro direct effects cannot be used for effect decomposition and are therefore of limited use
in mediation analysis. Specifically, the difference between the total effect and the average controlled micro direct
effect cannot typically be interpreted as an indirect effect [Robins and Greenland, 1992, Pearl, 2001, Kaufman et al.,
2004, Vanderweele, 2011]. This issue can be overcome when we set all parents of $Y_t$ to whatever values
they would have obtained under $do(X_{t-\gamma} = x)$. This is known as the natural direct effect [Pearl, 2001] or the pure
direct effect [Robins and Greenland, 1992].
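Definition 5 (Average Natural Micro Direct Effect [Pearl, 2001]). Given an FT-ADMG $\mathcal{G} = (\mathbb{V}, \mathbb{E})$, $Y_t, X_{t-\gamma} \in \mathbb{V}$ and
$\mathbb{Z} = Parents(Y_t, \mathcal{G}) \setminus \{X_{t-\gamma}\}$. The natural micro direct effect of changing $X_{t-\gamma}$ from $x$ to $x'$ on $Y_t$ is defined as

$$NDE(X^{x,x'}_{t-\gamma}, Y_t) = E(Y_t \mid do(X_{t-\gamma} = x'), do(\mathbb{Z} = \mathbf{z}_x)) - E(Y_t \mid do(X_{t-\gamma} = x))$$

where $\mathbf{z}_x$ is the counterfactual value of $\mathbb{Z}$ under the setting $X_{t-\gamma} = x$.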

Note that in Pearl [2014], $\mathbb{Z}$ is not the parents of $Y_t$ but the set of intermediate variables (i.e., mediators) between $X_{t-\gamma}$
and $Y_t$. In this work we will focus on the set of parents of the effect, as in Pearl [2001], since it allows a better parallel
with the controlled direct effect.
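To make Definitions 4 and 5 concrete, the sketch below computes both effects exactly in a small binary model. The model is entirely hypothetical (it is not taken from the paper): X -> Z -> Y and X -> Y, with Z := X OR U_Z and Y := X AND Z, so that Z is the only parent of Y other than X. The noise distribution `P_UZ` and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical toy model (not from the paper): X -> Z -> Y and X -> Y, all binary.
P_UZ = {0: Fraction(7, 10), 1: Fraction(3, 10)}  # distribution of the exogenous noise of Z


def f_z(x, u_z):
    return max(x, u_z)  # Z := X OR U_Z


def f_y(x, z):
    return x * z  # Y := X AND Z, an interaction and hence non-linear


def cde(x, x_new, z):
    """E(Y | do(X=x_new), do(Z=z)) - E(Y | do(X=x), do(Z=z))."""
    return Fraction(f_y(x_new, z) - f_y(x, z))


def nde(x, x_new):
    """E(Y | do(X=x_new), do(Z=z_x)) - E(Y | do(X=x)), where z_x is Z under X=x."""
    total = Fraction(0)
    for u_z, p in P_UZ.items():
        z_x = f_z(x, u_z)  # counterfactual value of Z under the reference setting X=x
        total += p * (f_y(x_new, z_x) - f_y(x, z_x))
    return total


print(cde(0, 1, 0), cde(0, 1, 1))  # 0 1
print(nde(0, 1))                   # 3/10
```

In this toy model the controlled effect depends on the value at which Z is held, whereas the natural effect averages over the values Z takes under X = 0.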
There has been substantial work on identifying both average controlled and average natural direct effects from observational
data using ADMGs, and by extension, FT-ADMGs. These direct effects are considered identifiable if they
can be expressed solely in terms of conditional probabilities and expectations over observed variables. Notably, the
task of identifying these effects can, in some cases, be fully reduced to identifying interventional distributions alone.
For instance, in the case of the average controlled direct effect, identification relies entirely on identifying the interventional
distribution $\Pr(y_t \mid do(X_{t-\gamma} = x), do(\mathbb{Z} = \mathbf{z}))$. Meanwhile, for the average natural direct effect, it partially
depends on identifying the interventional distribution $\Pr(y_t \mid do(X_{t-\gamma} = x))$ and additionally requires identifying a
counterfactual term. However, as demonstrated by Pearl [2001] and as we will show in Section 5, even this counterfactual
term can sometimes be identified by relating it back to specific interventional distributions. The do-calculus,
introduced by Pearl [1995], provides a symbolic machinery that can be used to identify interventional distributions. It
comprises three rules, each establishing specific graphical criteria that dictate when substitutions can be applied to
an interventional distribution. The most important aspect of the do-calculus is that it is sound and complete: any interventional
distribution is identifiable if and only if there exists a sequence of the rules of the do-calculus that transforms
the interventional distribution into an expression consisting only of conditional probabilities and expectations over observed
variables [Pearl, 1995, Shpitser and Pearl, 2006, Huang and Valtorta, 2006]. The three rules of the do-calculus
are as follows:

R1: $\Pr(\mathbb{Y} = \mathbf{y} \mid do(\mathbb{Z} = \mathbf{z}), \mathbb{X} = \mathbf{x}, \mathbb{W} = \mathbf{w}) = \Pr(\mathbb{Y} = \mathbf{y} \mid do(\mathbb{Z} = \mathbf{z}), \mathbb{W} = \mathbf{w})$ if $\mathbb{Y} \perp\!\!\!\perp_{\mathcal{G}_{\overline{\mathbb{Z}}}} \mathbb{X} \mid \mathbb{Z}, \mathbb{W}$
R2: $\Pr(\mathbb{Y} = \mathbf{y} \mid do(\mathbb{Z} = \mathbf{z}), do(\mathbb{X} = \mathbf{x}), \mathbb{W} = \mathbf{w}) = \Pr(\mathbb{Y} = \mathbf{y} \mid do(\mathbb{Z} = \mathbf{z}), \mathbb{X} = \mathbf{x}, \mathbb{W} = \mathbf{w})$ if $\mathbb{Y} \perp\!\!\!\perp_{\mathcal{G}_{\overline{\mathbb{Z}}\underline{\mathbb{X}}}} \mathbb{X} \mid \mathbb{Z}, \mathbb{W}$
R3: $\Pr(\mathbb{Y} = \mathbf{y} \mid do(\mathbb{Z} = \mathbf{z}), do(\mathbb{X} = \mathbf{x}), \mathbb{W} = \mathbf{w}) = \Pr(\mathbb{Y} = \mathbf{y} \mid do(\mathbb{Z} = \mathbf{z}), \mathbb{W} = \mathbf{w})$ if $\mathbb{Y} \perp\!\!\!\perp_{\mathcal{G}_{\overline{\mathbb{Z}}\,\overline{\mathbb{X}(\mathbb{W})}}} \mathbb{X} \mid \mathbb{Z}, \mathbb{W}$

where $\mathbb{X}(\mathbb{W})$ is the set of vertices in $\mathbb{X}$ that are non-ancestors of any vertex in $\mathbb{W}$ in the mutilated graph $\mathcal{G}_{\overline{\mathbb{Z}}}$.
The interventional distribution is said to be identifiable by adjustment when it can be expressed as a sum of probabilities
of the response conditioned on the treatment and a subset of observed variables $\mathbb{W}$, weighted by the probability of
observing those values of $\mathbb{W}$, i.e., $\sum_{\mathbf{w}} \Pr(Y_t \mid X_{t-\gamma} = x, \mathbb{W} = \mathbf{w})\Pr(\mathbf{w})$.
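As a small numerical illustration of the adjustment expression above, the sketch below evaluates the weighted sum for a single binary adjustment variable. The probability tables are hypothetical numbers chosen for the example, and the function name `adjust` is an assumption, not notation from the paper.

```python
from fractions import Fraction


def adjust(p_y_given_xw, p_w, x):
    """Evaluate sum_w Pr(y | X=x, W=w) * Pr(w) for one fixed outcome value y."""
    return sum(p_y_given_xw[(x, w)] * p for w, p in p_w.items())


# Hypothetical tables: Pr(W=w) and Pr(Y=1 | X=x, W=w), keyed by (x, w).
p_w = {0: Fraction(3, 5), 1: Fraction(2, 5)}
p_y1 = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(1, 2),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 5),
}

print(adjust(p_y1, p_w, x=1))  # 1/2
print(adjust(p_y1, p_w, x=0))  # 13/50
```

Whether such a sum equals the interventional distribution depends on the graphical conditions discussed in the text; the code only evaluates the expression.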
However, it is challenging for practitioners in fields like epidemiology to provide or validate a full-time ADMG, as this
requires detailed knowledge of temporal dynamics at every time point, which can be both complex and impractical.
Moreover, causal discovery algorithms [Spirtes et al., 2000] (i.e., algorithms which infer a causal graph from observational
data under strong untestable assumptions) are still largely ineffective on real-world data [Aït-Bachir et al.,
2023]. Therefore, it is often more feasible to use summary causal graphs, which simplify the causal structure into a
more manageable form. This allows practitioners to focus on the key pathways and relationships. For instance, when
studying the long-term effects of smoking on lung disease, epidemiologists can outline the main causal pathways—like
smoking leading to lung damage—without needing to detail the exact timing of each intermediate step.
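Definition 6 (Summary Causal Graph with possible latent confounding). Consider an FT-ADMG $\mathcal{G} = (\mathbb{V}, \mathbb{E})$. The
summary causal graph (SCG) $\mathcal{G}^s = (\mathbb{S}, \mathbb{E}^s)$ compatible with $\mathcal{G}$ is defined in the following way:

$$\mathbb{S} := \{V^i = (V^i_{t_0}, \cdots, V^i_{t_{max}}) \mid \forall i \in [1, d]\},$$
$$\mathbb{E}^{s1} := \{X \rightarrow Y \mid \forall X, Y \in \mathbb{S},\ \exists t' \le t \in [t_0, t_{max}] \text{ such that } X_{t'} \rightarrow Y_t \in \mathbb{E}\},$$
$$\mathbb{E}^{s2} := \{X \leftrightarrow Y \mid \forall X, Y \in \mathbb{S},\ \exists t', t \in [t_0, t_{max}] \text{ such that } X_{t'} \leftrightarrow Y_t \in \mathbb{E}\},$$

where $\mathbb{E}^s = \mathbb{E}^{s1} \cup \mathbb{E}^{s2}$.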
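The abstraction in Definition 6 amounts to dropping the time index of every edge. A minimal sketch follows, assuming temporal vertices are encoded as `(name, time)` pairs; this encoding and the function name `summary_graph` are illustrative choices, and the example edges are hypothetical.

```python
def summary_graph(directed, bidirected):
    """Collapse temporal edges ((X, t'), (Y, t)) into edges between time series."""
    s_directed = {(x, y) for (x, _), (y, _) in directed}
    # A bidirected edge has no orientation, so it is stored as an unordered pair.
    s_bidirected = {frozenset((x, y)) for (x, _), (y, _) in bidirected}
    return s_directed, s_bidirected


# Hypothetical acyclic FT-ADMG fragment: X_0 -> X_1, X_0 -> Y_1, W_1 -> Y_1, Y_0 -> W_1.
ft_directed = [
    (("X", 0), ("X", 1)),
    (("X", 0), ("Y", 1)),
    (("W", 1), ("Y", 1)),
    (("Y", 0), ("W", 1)),
]
scg_directed, scg_bidirected = summary_graph(ft_directed, [])
print(sorted(scg_directed))  # [('W', 'Y'), ('X', 'X'), ('X', 'Y'), ('Y', 'W')]
```

The example shows how a cycle between W and Y can appear in the SCG even though the underlying temporal graph is acyclic.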

Figures 1-4 present many examples of SCGs with compatible FT-ADMGs. The abstraction of SCGs implies that,
while a given FT-ADMG corresponds to exactly one SCG, multiple FT-ADMGs can be compatible with the same
SCG. As a result, the do-calculus as introduced by Pearl [1995], is not directly applicable to SCGs to identify micro
causal effects [Ferreira and Assaad, 2024b]. The objective of this paper is to establish specific graphical conditions
within an SCG that guarantee the existence of a sequence of do-calculus rules that can be applied to any FT-ADMG
compatible with the SCG, yielding an identical expression involving only conditional probabilities and expectations
over observed variables. A key insight about SCGs that aids in identifying these conditions is that, although there may
not be a strict one-to-one correspondence between the set of parents of a vertex Y in an SCG and the set of parents
of Yt in any particular FT-ADMG, a precise mapping is possible when considering the set of all potential parents
across all FT-ADMGs compatible with the SCG. Specifically, this set includes vertices that act as parents in at least
one FT-ADMG that aligns with the SCG. Formally, this set of possible parents is defined as follows.
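Definition 7 (Possible Parents). Given an SCG $\mathcal{G}^s = (\mathbb{S}, \mathbb{E}^s)$, a maximal lag $\gamma_{max}$ and a vertex $Y \in \mathbb{S}$. The set of
possible parents of the temporal vertex $Y_t$ is the set of temporal vertices (i.e., vertices in compatible FT-ADMGs) which
are parents of $Y_t$ in at least one compatible FT-ADMG. It is written $PP(Y_t)$ and is defined as
$$PP(Y_t) = \{P_{t-\gamma} \mid P \in Parents(Y, \mathcal{G}^s) \setminus \{Y\},\ \gamma \in [0, \gamma_{max}]\} \cup \{Y_{t-\gamma} \mid \gamma \in\ ]0, \gamma_{max}]\} * \mathbb{1}_{Y \in Parents(Y, \mathcal{G}^s)},$$
where the indicator $\mathbb{1}_{Y \in Parents(Y, \mathcal{G}^s)}$ means that the past values of $Y$ are included only when $Y$ has a self-loop in $\mathcal{G}^s$.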
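The set of possible parents can be enumerated directly from the SCG. The sketch below assumes the SCG is given as a mapping from each vertex to its set of parents and that temporal vertices are `(name, time)` pairs; both encodings, and the function name `possible_parents`, are assumptions for illustration. It also ignores the boundary at $t_0$, i.e., it assumes $t - \gamma_{max} \ge t_0$.

```python
def possible_parents(scg_parents, y, t, gamma_max):
    """Temporal vertices that are parents of (y, t) in at least one compatible FT-ADMG."""
    # Other series may act at any lag from 0 (instantaneous) to gamma_max.
    pp = {
        (p, t - g)
        for p in scg_parents.get(y, set())
        if p != y
        for g in range(0, gamma_max + 1)
    }
    # The series itself can only act with a strictly positive lag, and only if
    # the SCG contains a self-loop on y.
    if y in scg_parents.get(y, set()):
        pp |= {(y, t - g) for g in range(1, gamma_max + 1)}
    return pp


# Hypothetical SCG: X -> Y and a self-loop on Y.
parents = {"Y": {"X", "Y"}}
print(sorted(possible_parents(parents, "Y", t=2, gamma_max=1)))
# [('X', 1), ('X', 2), ('Y', 1)]
```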

We close this section with the following property of possible parents, which shows that it is possible to replace the set
of parents in the definitions of the average controlled micro direct effect and the average natural micro direct effect by
the set of possible parents.
Property 8. Given an FT-ADMG $\mathcal{G} = (\mathbb{V}, \mathbb{E})$, $Y_t \in \mathbb{V}$ and $\mathbb{Z} \supseteq Parents(Y_t, \mathcal{G})$ such that $Y_t \notin \mathbb{Z}$, the following equality
holds:
$$\Pr(Y_t = y \mid do(Parents(Y_t, \mathcal{G}) = \mathbf{p})) = \Pr(Y_t = y \mid do(\mathbb{Z} = \mathbf{z})).$$
In particular, this equality holds for $\mathbb{Z} = PP(Y_t)$. Note also that there exists a compatible FT-ADMG $\mathcal{G}'$ in which
$PP(Y_t) = Parents(Y_t, \mathcal{G}')$.

This property results directly from R3 of the do-calculus. A formal proof can be found in the supplementary materials.


4 Identifying Controlled Micro Direct Effects in SCGs


As stated in the previous section, the average controlled micro direct effect of interest is the one where we intervene on
all parents of $Y_t$. In order to identify this effect, in particular using R2 of the do-calculus, one can look for a set of variables
which d-separates $Y_t$ from its parents in $\mathcal{G}_{\underline{Parents(Y_t, \mathcal{G})}}$ for every compatible FT-ADMG $\mathcal{G}$. There are two significant
challenges in this process. The first challenge arises when there exists a cycle involving $Y$ and another vertex in the
SCG. For example, consider the SCG in Figure 1(a), where Assumption 2 holds. In this case, the cycle between $Y$ and
$W$ implies the presence of a compatible FT-ADMG $\mathcal{G}$, as depicted in Figure 1(b), in which $W_t \in Parents(Y_t, \mathcal{G})$. In
this FT-ADMG we would like to ultimately remove all the $do(\cdot)$ operators in the interventional distribution
$\Pr(Y_t = y \mid do(W_t = w), do(Y_{t-1} = y'), do(X_{t-1} = x), do(W_{t-1} = w'))$ but for the sake of illustration let us only focus on
removing the $do(\cdot)$ on $W_t$. Notice that in this FT-ADMG, we have
$Y_t \perp\!\!\!\perp_{\mathcal{G}_{\overline{Y_{t-1} X_{t-1} W_{t-1}}\underline{W_t}}} W_t \mid Y_{t-1}, X_{t-1}, W_{t-1}$, which
means that by using R2 of the do-calculus, it is possible to write the considered interventional distribution as
$\Pr(Y_t = y \mid W_t = w, do(Y_{t-1} = y'), do(X_{t-1} = x), do(W_{t-1} = w'))$. However, there exists another compatible FT-ADMG
$\mathcal{G}'$, shown in Figure 1(c), where $W_t \leftarrow Y_t$ and thus it is impossible to use R2 to remove the $do(\cdot)$ operator on
$W_t$ in the same interventional distribution since $\forall \mathbb{W}$, $Y_t \not\perp\!\!\!\perp_{\mathcal{G}'_{\overline{Y_{t-1} X_{t-1} W_{t-1}}\underline{W_t}}} W_t \mid \mathbb{W}$. This indicates that the relationship
between $W_t$ and $Y_t$ is ambiguous when only the SCG is considered, preventing the removal of the $do(\cdot)$ operator on $W_t$
in the considered interventional distribution.
The second challenge occurs when hidden confounding exists between
$Y$ and another vertex. For instance, consider the SCG in Figure 2(a). In this scenario, by examining the compatible
FT-ADMG $\mathcal{G}$ in Figure 2(b), it becomes clear that, as in the previous example, it is necessary to remove the $do(\cdot)$
on $W_t$ in the interventional distribution $\Pr(Y_t = y \mid do(W_t = w), do(Y_{t-1} = y'), do(X_{t-1} = x), do(W_{t-1} = w'))$ as
$W_t \in Parents(Y_t, \mathcal{G})$. However, hidden confounding exists between $W_t$ and $Y_t$, making R2 and R3 of the do-calculus
inapplicable since $\forall \mathbb{W}$, $Y_t \not\perp\!\!\!\perp_{\mathcal{G}_{\overline{Y_{t-1} X_{t-1} W_{t-1}}\underline{W_t}}} W_t \mid \mathbb{W}$ and
$Y_t \not\perp\!\!\!\perp_{\mathcal{G}_{\overline{Y_{t-1} X_{t-1} W_{t-1}}\,\overline{W_t(\mathbb{W})}}} W_t \mid \mathbb{W}$.¹ Nevertheless, as shown in the
following theorem, it is possible to identify the controlled micro direct effect in any SCG where no cycle involving $Y$
and other vertices exists and where hidden confounding between $Y$ and its ancestors is absent.
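Theorem 9. Given an SCG $\mathcal{G}^s = (\mathbb{S}, \mathbb{E}^s)$, $Y \in \mathbb{S}$ and $X_{t-\gamma} \in PP(Y_t)$. The controlled direct effect of changing $X_{t-\gamma}$
from $x$ to $x'$ on $Y_t$ is identifiable if $Scc(Y, \mathcal{G}^s) \subseteq \{Y\}$ and there does not exist a bidirected dashed arrow between $Y$
and one of its ancestors (i.e., $\nexists Z \in Ancestors(Y, \mathcal{G}^s),\ Z \leftrightarrow Y$), and we have:

$$CDE(X^{x,x'}_{t-\gamma}, Y_t, \mathbf{z}) = E(Y_t \mid X_{t-\gamma} = x', \mathbb{Z} = \mathbf{z}) - E(Y_t \mid X_{t-\gamma} = x, \mathbb{Z} = \mathbf{z})$$

where $\mathbb{Z} = PP(Y_t) \setminus \{X_{t-\gamma}\}$.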
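The two graphical conditions of Theorem 9 can be tested on an SCG with elementary reachability. The sketch below assumes the SCG is given as a list of directed edges `(a, b)` for a -> b and a list of bidirected edges `(a, b)`; the function names and the example graphs are hypothetical and are not the graphs of Figures 1 to 4.

```python
def reachable(start, succ):
    """Vertices reachable from `start` by a directed path of length >= 1."""
    seen, stack = set(), [start]
    while stack:
        for nxt in succ.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def theorem9_conditions_hold(directed, bidirected, y):
    """Check Scc(Y) within {Y} and no bidirected edge between Y and an ancestor of Y."""
    children, parents = {}, {}
    for a, b in directed:
        children.setdefault(a, set()).add(b)
        parents.setdefault(b, set()).add(a)
    ancestors = reachable(y, parents) | {y}  # Y is its own ancestor by convention
    descendants = reachable(y, children)
    # Scc(Y) is contained in {Y} iff no other vertex is both an ancestor and a descendant of Y.
    no_cycle = not ((ancestors & descendants) - {y})
    no_confounding = not any(
        (a == y and b in ancestors) or (b == y and a in ancestors)
        for a, b in bidirected
    )
    return no_cycle and no_confounding


base = [("X", "Y"), ("W", "Y"), ("X", "X"), ("Y", "Y")]
print(theorem9_conditions_hold(base, [], "Y"))                 # True
print(theorem9_conditions_hold(base + [("Y", "W")], [], "Y"))  # False: cycle Y <-> W
print(theorem9_conditions_hold(base, [("W", "Y")], "Y"))       # False: confounded ancestor
```

Note that self-loops on Y are allowed, in line with the convention $Scc(Y, \mathcal{G}^s) = \{Y\}$ for a self-loop.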

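Proof. Using Property 8 one can replace $\mathbb{Z} = Parents(Y_t, \mathcal{G}) \setminus \{X_{t-\gamma}\}$ by $\mathbb{Z} = PP(Y_t) \setminus \{X_{t-\gamma}\}$ in Definition 4.
Suppose that $Scc(Y, \mathcal{G}^s) \subseteq \{Y\}$ and that there does not exist a bidirected dashed arrow between $Y$ and one of its
ancestors in $\mathcal{G}^s$. Let us show that in every compatible FT-ADMG $\mathcal{G}$, we have $Y_t \perp\!\!\!\perp_{\mathcal{G}_{\underline{PP(Y_t)}}} PP(Y_t)$ in order to use R2
of the do-calculus to write $\Pr(Y_t = y \mid do(PP(Y_t) = \mathbf{p}))$ as $\Pr(Y_t = y \mid PP(Y_t) = \mathbf{p})$.
Let $\mathcal{G}$ be a compatible FT-ADMG and $Z_{t_Z} \in PP(Y_t)$ and let us consider a path $\pi$ from $Z_{t_Z}$ to $Y_t$ in $\mathcal{G}_{\underline{PP(Y_t)}}$.

• $\pi$ cannot end with a right arrow (i.e., $\pi \ne \langle Z_{t_Z} \cdots \rightarrow Y_t \rangle$) as no arrow going into $Y_t$ exists in $\mathcal{G}_{\underline{PP(Y_t)}}$.

• $\pi$ can end with a left arrow (i.e., $\pi = \langle Z_{t_Z} \cdots \leftarrow Y_t \rangle$) but it cannot contain only left arrows (i.e., $\pi \ne \langle Z_{t_Z} \leftarrow
\cdots \leftarrow Y_t \rangle$): if $Z = Y$, this would imply having causal arrows going backwards in time, which is forbidden,
and if $Z \ne Y$, this would imply $Z \in Descendants(Y, \mathcal{G}^s)$ and thus $Z \in Scc(Y, \mathcal{G}^s)$, which contradicts the
assumption. Therefore, there exists a collider in $\pi$ ($\pi = \langle Z_{t_Z} \cdots \rightarrow W_{t_W} \leftarrow \cdots \leftarrow Y_t \rangle$ or $\pi = \langle Z_{t_Z} \cdots \leftrightarrow
W_{t_W} \leftarrow \cdots \leftarrow Y_t \rangle$) and $\pi$ is blocked.

• If $\pi$ ends with a bidirected dashed arrow ($\pi = \langle Z_{t_Z} \cdots U_{t_U} \leftrightarrow Y_t \rangle$) then by assumption, $U_{t_U}$ is not a possible
ancestor of $Y_t$ so since $Z_{t_Z} \in PP(Y_t)$, the remaining arrows cannot all be left arrows ($\pi \ne \langle Z_{t_Z} \leftarrow \cdots \leftarrow
U_{t_U} \leftrightarrow Y_t \rangle$). Therefore, there exists a collider in $\pi$ (i.e., $\pi = \langle Z_{t_Z} \cdots \rightarrow W_{t_W} \leftarrow \cdots \leftarrow U_{t_U} \leftrightarrow Y_t \rangle$ or
$\pi = \langle Z_{t_Z} \cdots \leftrightarrow W_{t_W} \leftarrow \cdots \leftarrow U_{t_U} \leftrightarrow Y_t \rangle$) and $\pi$ is blocked.

Therefore, all paths between $Y_t$ and $PP(Y_t)$ are blocked in $\mathcal{G}_{\underline{PP(Y_t)}}$, i.e., $Y_t \perp\!\!\!\perp_{\mathcal{G}_{\underline{PP(Y_t)}}} PP(Y_t)$.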

¹ The subgraph formed by $Y_t$ and $W_t$ in Figure 2(b) is known as a bow-arc graph, and when such a structure exists, the
interventional distribution $\Pr(Y_t = y \mid do(W_t = w))$ is known to be non-identifiable [Shpitser and Pearl, 2006].

Figure 3: An SCG in (a) with two compatible FT-ADMGs in (b) and (c). Each pair of red and blue vertices represents
the micro direct effect we are interested in. The controlled direct effect is identifiable according to our condition but
the natural direct effect is not.

Figure 4: An SCG in (a) with a compatible FT-ADMG in (b). Each pair of red and blue vertices represents the micro
direct effect we are interested in. The controlled direct effect is identifiable according to our condition but the natural
direct effect is not.

For illustration, we give in Figure 5 three SCGs where the average controlled micro direct effect is identifiable by
Theorem 9.
Theorem 9 offers sufficient conditions for identifying the average controlled micro direct effect supposing a general
unknown FT-ADMG. It is important to note, however, that these conditions are not necessary in general. Interestingly,
under Assumption 2 (i.e., supposing an unknown FT-DAG) and by only considering identifiability by adjustment, these
conditions do become necessary, as demonstrated in the following proposition.
Proposition 10. Given an SCG $\mathcal{G}^s = (\mathbb{S}, \mathbb{E}^s)$, $Y \in \mathbb{S}$ and $X_{t-\gamma} \in PP(Y_t)$. Under Assumption 2, if $Scc(Y, \mathcal{G}^s) \not\subseteq \{Y\}$
then the controlled direct effect of changing $X_{t-\gamma}$ from $x$ to $x'$ on $Y_t$ is not identifiable by adjustment.

Proof. Firstly, using Property 8 one can replace $\mathbb{Z} = Parents(Y_t, \mathcal{G})$ by $\mathbb{Z} = PP(Y_t)$ in Definition 4. Let us
suppose that $Scc(Y, \mathcal{G}^s) \not\subseteq \{Y\}$; then there exists a cycle on $Y$ other than the self-loop in the SCG, i.e., $\exists C \in
Cycle(Y, \mathcal{G}^s) \setminus \{\langle Y \rangle\}$. Let us write $C = \langle V^1, \cdots, V^n \rangle$ with $n \ge 3$, $V^1 = V^n = Y$, $\forall 1 \le i < n$, $V^i \ne V^{i+1}$ and
$V^i \rightarrow V^{i+1}$ or $V^i \rightleftarrows V^{i+1}$. Then, $\langle V^{n-1}, \cdots, V^1 \rangle$ is an active non-direct path containing only descendants of $Y$, so
according to Theorem 1 of Ferreira and Assaad [2024a] one cannot be certain to block every non-direct path from $V^{n-1}_t$
(which is a possible parent of $Y_t$) to $Y_t$ without risking inducing a bias by adjusting on descendants of $Y_t$, and thus the
controlled direct effect is not identifiable by adjustment.

5 Identifying Natural Micro Direct Effects in SCGs


Pearl [2014] provided sufficient conditions to reframe the problem of identifying the average natural direct effect from
one involving counterfactuals to one based on interventions, specifically using the do() operator. The formal result
addressed cases where Z represents all mediators. It was informally observed that the same conditions apply when Z
includes all parent variables. Since this paper focuses on parent variables, we restate and formalize this result in the
context of parents.
Lemma 11. Given an SCG $\mathcal{G}^s = (\mathbb{S}, \mathbb{E}^s)$, let $Y \in \mathbb{S}$, $X_{t-\gamma} \in PP(Y_t)$. If $Scc(Y, \mathcal{G}^s) \subseteq \{Y\}$ and there is no bidirected
dashed arrow between $Y$ and its ancestors then the natural direct effect of changing $X_{t-\gamma}$ from $x$ to $x'$ on $Y_t$,
$NDE(X^{x,x'}_{t-\gamma}, Y_t)$, is identifiable from the SCG if

• $\Pr(Y_t = y \mid do(X_{t-\gamma} = x), do(\mathbb{Z} = \mathbf{z}))$ is identifiable from the SCG, and
• $\Pr(\mathbb{Z} = \mathbf{z} \mid do(X_{t-\gamma} = x))$ is identifiable from the SCG,

where $\mathbb{Z} = PP(Y_t) \setminus \{X_{t-\gamma}\}$.

Proof. As in Pearl [2014], the first term E(Yt ∣ do(Xt−γ = x′), do(Z = z_x)) in NDE^{x,x′}(Xt−γ, Yt) can be written as
∑z E(Yt ∣ do(Xt−γ = x′), do(Z = z), Z_x = z) Pr(Z = z ∣ do(Xt−γ = x)). Using the d-separation between Yt and PP(Yt)
in the graph G_{PP(Yt)}, which was shown in the proof of Theorem 9, one can obtain
E(Yt ∣ do(Xt−γ = x′), do(Z = z_x)) = ∑z E(Yt ∣ do(Xt−γ = x′), do(Z = z)) Pr(Z = z ∣ do(Xt−γ = x)).
The second term E(Yt ∣ do(Xt−γ = x)) in NDE^{x,x′}(Xt−γ, Yt) can be modified similarly, since by the law of
composition [Pearl, 2009, Chapter 7, Property 1] E(Yt ∣ do(Xt−γ = x)) = E(Yt ∣ do(Xt−γ = x), do(Z = z_x)).
Therefore, if Pr(Yt = y ∣ do(Xt−γ = x), do(Z = z)) and Pr(Z = z ∣ do(Xt−γ = x)) are identifiable
from the SCG then NDE^{x,x′}(Xt−γ, Yt) is identifiable.
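
Putting the two rewritten terms of the proof side by side gives a single expression. The block below is a sketch of that combination; it assumes (the definition is not restated in this section) that the controlled direct effect of Theorem 9 is the difference between the two doubly interventional expectations.

```latex
\begin{align*}
NDE^{x,x'}(X_{t-\gamma}, Y_t)
  &= \sum_{z} \Big( E\big(Y_t \mid do(X_{t-\gamma} = x'), do(Z = z)\big)
                  - E\big(Y_t \mid do(X_{t-\gamma} = x), do(Z = z)\big) \Big)
     \Pr\big(Z = z \mid do(X_{t-\gamma} = x)\big) \\
  &= \sum_{z} CDE^{x,x'}(X_{t-\gamma}, Y_t, z)\,
     \Pr\big(Z = z \mid do(X_{t-\gamma} = x)\big)
\end{align*}
```

Read this way, the natural direct effect is a weighted average of controlled direct effects, with weights given by the interventional distribution of Z under do(Xt−γ = x).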

Notice that Lemma 11 shows that the identifiability of the average natural micro direct effect can be reduced to
the identifiability of the average controlled micro direct effect and the identifiability of the interventional distribution
Pr(Z = z ∣ do(Xt−γ = x)). The first identifiability is given directly by the result of the previous section, which
means that the constraints given for the average controlled micro direct effect must also be imposed for the average
natural micro direct effect. In addition, we need to add some conditions to ensure the second identifiability. The
SCG in Figure 3(a) represents a case where this second identifiability cannot be achieved. To see this, consider
Pr(Wt−1 = w, Wt = w′ , Yt−1 = y ∣ do(Xt−1 = x)) and notice that, similarly to Figure 1, because of the cycle
X ⇄ W in the SCG, the do(⋅) operator cannot be removed using the same sequence of do-calculus rules in the first
compatible FT-ADMG in Figure 3(b), in which Xt−1 → Wt−1, and in the second compatible FT-ADMG in Figure 3(c),
in which Xt−1 ← Wt−1 . The SCG in Figure 4(a) represents another case where this second identifiability cannot be
achieved, this time due to hidden confounding. To see this, consider Pr(Wt−1 = w, Wt = w′ , Yt−1 = y ∣ do(Xt−1 = x)) and
notice that, similarly to Figure 2, because of the hidden confounding between X and W in the SCG, the do(⋅) operator
cannot be removed from this interventional distribution in the compatible FT-ADMG in Figure 4(b), in which Xt−1 and
Wt−1 are linked by a bidirected dashed arrow.

Theorem 12. Given an SCG G^s, the natural direct effect of changing Xt−γ from x to x′ on Yt, NDE^{x,x′}(Xt−γ, Yt), is
identifiable from the SCG if

1. Scc(Y, G^s) ⊆ {Y},
2. Scc(X, G^s) ⊆ {X},
3. PP(Xt−γ) ∩ PP(Yt) = ∅,
4. ∄Z ∈ Ancestors(Y, G^s) linked to Y by a bidirected dashed arrow, and
5. ∄Z ∈ Ancestors(Y, G^s) linked to X by a bidirected dashed arrow.

When these conditions are satisfied, we have:

$$NDE^{x,x'}(X_{t-\gamma}, Y_t) = \sum_{z} \Big( CDE^{x,x'}(X_{t-\gamma}, Y_t, z) \sum_{a} \Pr(Z = z \mid X_{t-\gamma} = x, A = a) \Pr(A = a) \Big)$$

where Z = PP(Yt) ∖ {Xt−γ}, A = PP(Xt−γ) and a formula for CDE^{x,x′}(Xt−γ, Yt, z) is given in Theorem 9.
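
For discrete Z and A, the formula of Theorem 12 is a pair of nested sums and can be evaluated by direct plug-in. The sketch below is only an illustration: the function name `natural_direct_effect`, the dictionary layout and all numbers are hypothetical, and the values CDE(z) are assumed to have been obtained beforehand from the formula of Theorem 9.

```python
from typing import Dict, Hashable


def natural_direct_effect(
    cde: Dict[Hashable, float],
    p_z_given_x_a: Dict[Hashable, Dict[Hashable, float]],
    p_a: Dict[Hashable, float],
) -> float:
    """Evaluate sum_z CDE(z) * sum_a Pr(Z=z | X=x, A=a) * Pr(A=a).

    cde[z]              : controlled direct effect with Z held at z (assumed given)
    p_z_given_x_a[a][z] : Pr(Z = z | X_{t-gamma} = x, A = a)
    p_a[a]              : Pr(A = a)
    """
    total = 0.0
    for z, cde_z in cde.items():
        # Inner sum: adjustment on A, which stands in for Pr(Z = z | do(X = x))
        # when the conditions of the theorem hold.
        p_z_do_x = sum(p_z_given_x_a[a][z] * p_a[a] for a in p_a)
        total += cde_z * p_z_do_x
    return total


# Hypothetical numbers for a binary Z and a binary A (illustration only).
cde = {0: 0.5, 1: 1.0}
p_a = {0: 0.5, 1: 0.5}
p_z_given_x_a = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.25, 1: 0.75}}

nde = natural_direct_effect(cde, p_z_given_x_a, p_a)
print(nde)
```

In practice Z and A are sets of variables, so the keys z and a would be tuples of joint values, and the probabilities would be estimated from data rather than supplied by hand.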

Proof. Conditions 1 and 4 allow us to use Lemma 11. Therefore, it is sufficient to identify Pr(Yt = y ∣ do(Xt−γ =
x), do(Z = z)) and Pr(Z = z ∣ do(Xt−γ = x)). Theorem 9 states that Pr(Yt = y ∣ do(Xt−γ = x), do(Z = z)) is
identifiable given Conditions 1 and 4, thus what is left to show is that Pr(Z = z ∣ do(Xt−γ = x)) is identifiable. By
the law of total probability, Pr(Z = z ∣ do(Xt−γ = x)) = ∑a Pr(Z = z ∣ do(Xt−γ = x), A = a) Pr(A = a ∣ do(Xt−γ = x)).
Notice that because of Conditions 2 and 5 we have, for every compatible FT-ADMG G, A ⫫ Xt−γ in G_{\overline{Xt−γ}}
(G with the arrows pointing into Xt−γ removed), and thus using Rule 3 of the do-calculus
Pr(A = a ∣ do(Xt−γ = x)) = Pr(A = a). Let us show that in every compatible FT-ADMG G, Xt−γ ⫫ Z ∣ A in
G_{\underline{Xt−γ}} (G with the arrows going out of Xt−γ removed), which will allow us to use Rule 2 of the
do-calculus and get Pr(Z = z ∣ do(Xt−γ = x), A = a) = Pr(Z = z ∣ Xt−γ = x, A = a). Let G be a compatible FT-ADMG
and ZtZ ∈ Z, and suppose there exists a path π = ⟨Xt−γ ⋯ ZtZ⟩ in G_{\underline{Xt−γ}}.

• π cannot start with a right arrow (i.e., π ≠ ⟨Xt−γ → ⋯ ZtZ⟩) as no arrow going out of Xt−γ exists in G_{\underline{Xt−γ}}.

• π cannot start with a left arrow and be of length 2 (i.e., π ≠ ⟨Xt−γ ← ZtZ⟩) as this contradicts Condition 3.
• π can start with a left arrow if it has length at least 3 (i.e., π = ⟨Xt−γ ← UtU ⋯ ZtZ⟩), but then UtU ∈ A and thus A
blocks π.

Figure 5: Three SCGs. Each pair of red and blue vertices represents the micro direct effect we are interested in. The
controlled direct effect is identifiable in all these SCGs using the conditions given in this paper. The natural direct
effect is not identifiable in (a), identifiable if γ = γmax in (b) and identifiable for all γ in (c).

• If π starts with a bidirected dashed arrow, written ↔ here (i.e., π = ⟨Xt−γ ↔ ⋯ ZtZ⟩), then the length of π
cannot be 2 (i.e., π ≠ ⟨Xt−γ ↔ ZtZ⟩) and the remaining arrows cannot all be right arrows
(π ≠ ⟨Xt−γ ↔ UtU → ⋯ → ZtZ⟩), as this would imply U ∈ Ancestors(Z, G^s) ⊆ Ancestors(Y, G^s), which contradicts
Condition 5. Therefore, there exists a collider in π (i.e., π = ⟨Xt−γ ↔ UtU → ⋯ → WtW ← ⋯ ZtZ⟩ or
⟨Xt−γ ↔ UtU → ⋯ → WtW ↔ ⋯ ZtZ⟩) which verifies Descendants(U, G^s) ⊇ Descendants(W, G^s), and Condition 5
implies Parents(X, G^s) ∩ Descendants(U, G^s) = ∅, so Parents(X, G^s) ∩ Descendants(W, G^s) = ∅.
Thus A ∩ Descendants(WtW, G) = ∅ and A blocks π.

These cases are exhaustive and thus, under Conditions 2, 3 and 5, A is a valid adjustment set to identify Pr(Z = z ∣
do(Xt−γ = x)). Therefore, the five conditions together imply the identifiability of NDE^{x,x′}(Xt−γ, Yt).
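
The adjustment step of the proof can be checked numerically on a toy example. The sketch below uses a hypothetical static model with three binary variables and arrows A → X, A → Z and X → Z, with no hidden confounding; it is not an FT-ADMG from the paper, and all probabilities are made up. It compares the interventional probability obtained by truncated factorisation with the adjustment on A computed from the observational joint alone, and with naive conditioning on X.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical mechanisms for the toy model A -> X, A -> Z, X -> Z.
p_a = {0: F(1, 2), 1: F(1, 2)}                          # Pr(A = a)
p_x1_given_a = {0: F(1, 4), 1: F(3, 4)}                 # Pr(X = 1 | A = a)
p_z1_given_xa = {(0, 0): F(1, 10), (0, 1): F(3, 10),
                 (1, 0): F(5, 10), (1, 1): F(9, 10)}    # Pr(Z = 1 | X = x, A = a)


def bern(p1, v):
    """Probability of value v for a binary variable with Pr(1) = p1."""
    return p1 if v == 1 else 1 - p1


# Observational joint Pr(A = a, X = x, Z = z), keyed by (a, x, z).
joint = {(a, x, z): p_a[a] * bern(p_x1_given_a[a], x) * bern(p_z1_given_xa[(x, a)], z)
         for a, x, z in product((0, 1), repeat=3)}


def prob(pred):
    """Probability of the event described by pred(a, x, z)."""
    return sum(p for key, p in joint.items() if pred(*key))


x = 1

# Reference value: truncated factorisation (drop the mechanism of X, fix X = x).
truth = sum(p_a[a] * p_z1_given_xa[(x, a)] for a in (0, 1))

# Adjustment on A, using only quantities read off the observational joint.
adjusted = sum(
    prob(lambda a_, x_, z_: a_ == a and x_ == x and z_ == 1)
    / prob(lambda a_, x_, z_: a_ == a and x_ == x)
    * prob(lambda a_, x_, z_: a_ == a)
    for a in (0, 1)
)

# Naive conditioning on X, which ignores the back-door path through A.
naive = prob(lambda a_, x_, z_: x_ == x and z_ == 1) / prob(lambda a_, x_, z_: x_ == x)

print(truth, adjusted, naive)
```

Exact fractions are used so that the agreement between the adjusted and interventional values, and the disagreement with naive conditioning, are not blurred by floating-point rounding.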

Note that Condition 3 implies either having no self-loop on X in the SCG or choosing γ = γmax. As an illustration,
Figure 5(b) presents an SCG where NDE^{x,x′}(X_{t−γmax}, Yt) is identifiable, though this identifiability does not hold
when γmax is replaced by γ < γmax. In contrast, Figure 5(c) shows an SCG where NDE^{x,x′}(Xt−γ, Yt) is identifiable
for all γ.

6 Conclusion
In this paper, we addressed both average controlled and average natural micro direct effects, providing theoretical
results that extend existing work in causal inference using summary causal graphs. By accounting for cycles and hidden
confounders and by allowing for a non-parametric setting, our results contribute to a broader applicability of causal
inference techniques in real-world settings where full causal specification is impractical. Our findings are particularly
relevant in fields such as epidemiology, where accurate measurement of direct effects is crucial for informing public
health interventions and policy decisions.
In future work, it will be important to validate whether our conditions hold when focusing on other forms of average
controlled and average natural direct effects, such as those that emphasize mediators rather than parents. Additionally,
it would be valuable to establish necessary and sufficient conditions for average controlled direct effects that extend
beyond simple adjustment. Another future direction would be to derive necessary and sufficient conditions for identifying
average controlled direct effects in the presence of hidden confounding, as well as for average natural direct
effects. However, we believe that this is more challenging to achieve than for average controlled direct effects
without hidden confounding. Now that total and direct effects are well understood in the context of summary causal
graphs, it would also be interesting to explore the identification of indirect effects and, more broadly, path-specific
effects within SCGs. Finally, we plan to apply the findings of this work to real-world applications, particularly in the
field of epidemiology.

Acknowledgments
This work was supported by the CIPHOD project (ANR-23-CPJ1-0212-01).

References
Tara V. Anand, Adele H. Ribeiro, Jin Tian, and Elias Bareinboim. Causal effect identification in cluster DAGs.
Proceedings of the AAAI Conference on Artificial Intelligence, 37(10):12172–12179, Jun. 2023.
Charles K. Assaad. Toward identifiability of total effects in summary causal graphs with latent confounders: an
extension of the front-door criterion. arXiv 2406.05805, 2024.


Charles K. Assaad, Imad Ez-Zejjari, and Lei Zan. Root cause identification for collective anomalies in time series
given an acyclic summary causal graph with loops. In Francisco Ruiz, Jennifer Dy, and Jan-Willem van de Meent,
editors, Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, volume 206 of
Proceedings of Machine Learning Research, pages 8395–8404. PMLR, 25–27 Apr 2023.
Charles K. Assaad, Emilie Devijver, Eric Gaussier, Gregor Goessler, and Anouar Meynaoui. Identifiability of total
effects from abstractions of time series causal graphs. In Negar Kiyavash and Joris M. Mooij, editors, Proceedings of
the Fortieth Conference on Uncertainty in Artificial Intelligence, volume 244 of Proceedings of Machine Learning
Research, pages 173–185. PMLR, 15–19 Jul 2024.
Ali Aït-Bachir, Charles K. Assaad, Christophe de Bignicourt, Emilie Devijver, Simon Ferreira, Eric Gaussier, Hosein
Mohanna, and Lei Zan. Case studies of causal discovery from IT monitoring time series, 2023. The History and
Development of Search Methods for Causal Structure Workshop at the 39th Conference on Uncertainty in Artificial
Intelligence.
Michael Eichler and Vanessa Didelez. Causal reasoning in graphical time series models. In Proceedings of the Twenty-
Third Conference on Uncertainty in Artificial Intelligence, UAI’07, page 109–116, Arlington, Virginia, USA, 2007.
AUAI Press. ISBN 0974903930.
Simon Ferreira and Charles K. Assaad. Identifiability of direct effects from summary causal graphs. Proceedings of
the AAAI Conference on Artificial Intelligence, 38(18):20387–20394, Mar. 2024a. doi: 10.1609/aaai.v38i18.30021.
Simon Ferreira and Charles K. Assaad. Identifying macro conditional independencies and macro total effects in
summary causal graphs with latent confounding, 2024b. Causal Inference for Time Series Data Workshop at the
40th Conference on Uncertainty in Artificial Intelligence.
Emily R. Flanagan. Identification and estimation of direct causal effects. Thesis, University of Washington, June 2020.
Yimin Huang and Marco Valtorta. Pearl’s calculus of intervention is complete. In Proceedings of the Twenty-Second
Conference on Uncertainty in Artificial Intelligence, UAI’06, page 217–224, Arlington, Virginia, USA, 2006. AUAI
Press. ISBN 0974903922.
Jay Kaufman, Richard Maclehose, and Sol Kaufman. A further critique of the analytic strategy of adjusting for
covariates to identify biologic mediation. Epidemiologic perspectives & innovations : EP+I, 1:4, 02 2004. doi:
10.1186/1742-5573-1-4.
Marloes Maathuis and Diego Colombo. A generalized backdoor criterion. The Annals of Statistics, 43, 07 2013. doi:
10.1214/14-AOS1295.
Judea Pearl. Causal diagrams for empirical research. Biometrika, 82(4):669–688, 1995.
Judea Pearl. Direct and indirect effects. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial
Intelligence, 2001.
Judea Pearl. Causality: Models, Reasoning and Inference. Cambridge University Press, USA, 2nd edition, 2009.
ISBN 052189560X.
Judea Pearl. Interpretation and identification of causal mediation. Psychological methods, 19, 06 2014. doi: 10.1037/
a0036434.
Emilija Perkovic. Identifying causal effects in maximally oriented partially directed acyclic graphs. In Jonas Peters and
David Sontag, editors, Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), volume
124 of Proceedings of Machine Learning Research, pages 530–539. PMLR, 03–06 Aug 2020.
Emilija Perkovic, Johannes Textor, Markus Kalisch, and Marloes H. Maathuis. Complete graphical characterization
and construction of adjustment sets in Markov equivalence classes of ancestral graphs. J. Mach. Learn. Res., 18:
220:1–220:62, 2016.
Thomas S. Richardson. Markov properties for acyclic directed mixed graphs. Scandinavian Journal of Statistics, 30:
145–157, 2003.
James M. Robins and Sander Greenland. Identifiability and exchangeability for direct and indirect effects.
Epidemiology, 3:143–155, 1992.
Ilya Shpitser and Judea Pearl. Identification of joint interventional distributions in recursive semi-markovian causal
models. In AAAI, pages 1219–1226, 2006.
Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. MIT press, 2nd edition,
2000.
Tyler J. VanderWeele. Controlled direct and mediated effects: Definition, identification and bounds. Scandinavian
Journal of Statistics, 38(3):551–563, 2011. doi: 10.1111/j.1467-9469.2010.00722.x.


Jonas Wahl, Urmi Ninad, and Jakob Runge. Foundations of causal discovery on groups of variables. Journal of Causal
Inference, 12(1):20230041, 2024. doi: 10.1515/jci-2023-0041.
Tian-Zuo Wang, Tian Qin, and Zhi-Hua Zhou. Estimating possible causal effects with latent variables via adjustment.
In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett,
editors, Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of
Machine Learning Research, pages 36308–36335. PMLR, 23–29 Jul 2023.
Wen Zhou, Geoffrey Liu, Rayjean J. Hung, Philip C. Haycock, Melinda C. Aldrich, Angeline S. Andrew,
Susanne M. Arnold, Heike Bickeböller, Stig E. Bojesen, Paul Brennan, Hans Brunnström, Olle Melander, Neil E.
Caporaso, Maria Teresa Landi, Chu Chen, Gary E. Goodman, David C. Christiani, Angela Cox, John K. Field,
Mikael Johansson, Lambertus A. Kiemeney, Stephen Lam, Philip Lazarus, Loïc Le Marchand, Gad Rennert,
Angela Risch, Matthew B. Schabath, Sanjay S. Shete, Adonina Tardón, Shanbeh Zienolddiny, Hongbing Shen,
and Christopher I. Amos. Causal relationships between body mass index, smoking and lung cancer: Univariable
and multivariable Mendelian randomization. International Journal of Cancer, 148(5):1077–1086, 2021. doi:
10.1002/ijc.33292.

A Proofs
Property 8. Take ZtZ ∈ Z ∖ Parents(Yt, G) and notice that Yt ⫫ ZtZ ∣ Parents(Yt, G) in G_{\overline{Z}} (G with the
arrows pointing into Z removed). Indeed, Parents(Yt, G) ⊆ Z, so in G_{\overline{Z}} the vertices in Parents(Yt, G)
have no parents and thus cannot be colliders nor descendants of colliders. Therefore, in G_{\overline{Z}},
an active path starting with Yt must be of the form ⟨Yt → ⋯ →⟩ or ⟨Yt ↔ ⋅ → ⋯ →⟩, where ↔ denotes a bidirected
dashed arrow, and an active path ending with ZtZ must be of the form ⟨← ⋯ ← ZtZ⟩. A path cannot have both forms
without containing a collider, so there is no active path from Yt to ZtZ in G_{\overline{Z}} and the d-separation
holds. Thus, using Rule 3 of the do-calculus [Pearl, 1995] we get Pr(Yt = y ∣ do(Parents(Yt, G) = p)) = Pr(Yt =
y ∣ do(Z = z)).
