Interacting urn models with strong reinforcement
Abstract.
For the interacting urn model with polynomial reinforcement, it has been conjectured in [16] that almost surely one color monopolizes all the urns if the interaction parameter . We disprove the conjecture.
For the case , we give a sufficient condition for monopoly, which improves a result obtained by Launay in [17].
1. General introduction
1.1. Definition of the model
Reinforced processes provide a rich framework for modeling and analyzing systems in physics, economics, and social sciences, where history plays a crucial role in shaping future dynamics. We refer to [23] for a survey of various models of random processes with reinforcement and their applications, a basic model of which is the well-known Pólya urn model. A generalized Pólya urn model is defined as follows.
Let be a positive sequence. Given an urn of black and red balls, let and denote the number of black and red balls in the urn at time , respectively. We assume that and are positive integers. At each time step , we draw a ball from the urn and then return the ball to the urn along with another ball of the same color. The probability of drawing a ball of a certain color from the urn is proportional to where is the number of balls of this color, that is,
Notice that the classical Pólya urn corresponds to the case .
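As a rough illustration of the dynamics just described, the following Python sketch simulates a two-color urn in which the chance of adding a ball of a given color is proportional to a weight of its current count. The names `simulate_urn` and `weight`, and the quadratic weight used in the second call, are assumptions made for this example and are not notation from the paper; a linear weight gives the classical Pólya urn.

```python
import random

def simulate_urn(black, red, weight, steps, rng):
    """Run a two-color urn with feedback; return the final (black, red) counts.

    At each step one ball is added: black with probability
    weight(black) / (weight(black) + weight(red)), red otherwise.
    """
    for _ in range(steps):
        wb, wr = weight(black), weight(red)
        if rng.random() < wb / (wb + wr):
            black += 1
        else:
            red += 1
    return black, red

rng = random.Random(0)
# Linear weight: the classical Polya urn.
print(simulate_urn(1, 1, lambda k: k, 1000, rng))
# Quadratic weight (illustrative choice of strong reinforcement).
print(simulate_urn(1, 1, lambda k: k ** 2, 1000, rng))
```

Passing the weight as a function keeps the sketch agnostic about the reinforcement sequence, so other choices can be tried without changing the loop.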
In [11, 21], where the authors were motivated by economic problems of competition, this model was called a balls-in-bins process with feedback, and the sequence was called the feedback function. It is also referred to as an ordinal dependent Pólya urn [23] since it is equal in law to the following process: at each time step, we draw a ball from the urn uniformly at random with replacement, and add red balls, resp. black balls, if it is the -th time a red ball, resp. a black ball, is drawn.
We denote by the event that eventually only balls of one color are added to the urn. Using Rubin’s exponential embedding, Davis [10] proved that
(1) |
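Rubin's exponential embedding, mentioned above, can be sketched numerically as follows. Each color runs its own sequence of independent exponential clocks: when a color has k balls, its next ball arrives after an exponential waiting time whose rate is the weight of k, and reading the two arrival streams in time order reproduces the order of the urn's draws. The names `embedding_order` and `weight`, the quadratic weight, and the truncation at `n_draws` are assumptions made for this illustration.

```python
import random

def embedding_order(black, red, weight, n_draws, rng):
    """Return the color sequence of the first n_draws additions.

    Each color runs its own clock: with k balls of a color present, the next
    ball of that color arrives after an Exp(weight(k)) waiting time.
    Reading the two arrival streams in time order gives the urn's draws.
    """
    counts = {"B": black, "R": red}
    # Absolute time of the next arrival for each color.
    next_time = {c: rng.expovariate(weight(k)) for c, k in counts.items()}
    order = []
    for _ in range(n_draws):
        c = min(next_time, key=next_time.get)  # the earliest clock rings
        order.append(c)
        counts[c] += 1
        next_time[c] += rng.expovariate(weight(counts[c]))
    return order

rng = random.Random(0)
seq = embedding_order(1, 1, lambda k: float(k) ** 2, 200, rng)
print("".join(seq[:40]))
print("colors in the last 100 draws:", set(seq[100:]))
```

When the reciprocals of the weights are summable, each color's total clock time is almost surely finite, and the color whose clocks accumulate first is the only one added from some time on; the truncated simulation only hints at this.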
Recently, there has been growing interest in the study of systems of interacting urns; see e.g. [1, 3, 6, 7, 9, 13, 14, 20, 25]. We study a model of (strongly) reinforced interacting urns introduced by Launay [16], which can be described as follows.
The model has urns containing black and red balls. Imagine that there are barriers separating different urns. At each step, for the i-th urn, ,
(1) with probability , (all the barriers are removed, and) a ball is drawn from a combined pool of all urns with replacement, see e.g. Figure 1(b) for the case ;
(2) with probability , (the barriers are kept, and) a ball is drawn from the i-th urn with replacement, see e.g. Figure 1(a);
(3) The probability of drawing a ball of a certain color is proportional to (#the number of balls of that color), as in the ordinal-dependent Pólya’s urn;
(4) In either case, we add another ball of the same color as the drawn ball to the i-th urn.
We do the above procedure simultaneously and independently for each urn.
More precisely, let and denote the number of black and red balls in the i-th urn at time , respectively. Then, , resp. , is the total number of black balls, resp. red balls, in the system at time . Write and . The initial composition is given by with . Let be a positive sequence and let be two constants. For any , conditional on , define independent Bernoulli random variables by
(2) |
Now set
(3) |
The process is called the interacting urn mechanism (IUM) with reinforcement sequence and interaction parameter . We denote its law by .
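The mechanism can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `p` stands for the probability that a draw uses the combined pool, the pooled draw is assumed to use the weights of the total black and red counts, and the names `ium_step`, `simulate_ium` and the quadratic weight are hypothetical choices for the example rather than the paper's notation.

```python
import random

def ium_step(black, red, weight, p, rng):
    """One simultaneous update of all urns; returns new (black, red) lists.

    black[i], red[i] are the counts in urn i. For each urn independently:
    with probability p the draw uses the pooled counts, otherwise the
    urn's own counts; a ball of the drawn color is added to urn i.
    """
    tot_b, tot_r = sum(black), sum(red)
    new_b, new_r = list(black), list(red)
    for i in range(len(black)):
        if rng.random() < p:
            b, r = tot_b, tot_r        # barriers removed: combined pool
        else:
            b, r = black[i], red[i]    # barriers kept: urn i alone
        wb, wr = weight(b), weight(r)
        if rng.random() < wb / (wb + wr):
            new_b[i] += 1
        else:
            new_r[i] += 1
    return new_b, new_r

def simulate_ium(black, red, weight, p, steps, seed=0):
    """Iterate ium_step and return the final counts of each urn."""
    rng = random.Random(seed)
    for _ in range(steps):
        black, red = ium_step(black, red, weight, p, rng)
    return black, red

b, r = simulate_ium([1, 1], [1, 1], lambda k: k ** 2, p=0.5, steps=2000)
print([bi / (bi + ri) for bi, ri in zip(b, r)])  # proportions of black balls
```

All urns are updated from the counts at the start of the step, which mirrors the requirement that the procedure be carried out simultaneously and independently for each urn.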
Unless otherwise specified, we assume (i.e. there are two urns) for simplicity. Without loss of generality, we assume that and . Let
(4) |
be the proportions of black balls in the two urns at time , respectively.
For an IUM with , there is a tendency for different components to adopt a common behavior. For example, for IUM with linear reinforcements, i.e. , Dai Pra, Louis and Minelli [9] proved that if and , then for any , almost surely,
(5) |
This phenomenon is called synchronization. Moreover, it has been proved in [6, Theorem 3.2] that this common limit satisfies
(6) |
However, IUMs with strong reinforcement may exhibit very different behaviors. We say that is a strong reinforcement sequence if
(7) |
As we will see later, for some IUMs with strong reinforcement sequences and weak interaction (i.e. is small), one color can maintain its advantage in a single urn while it is at a disadvantage globally, in which case the urns do not synchronize. Indeed, this models a common phenomenon in economic systems: Many companies perform well in their local markets but struggle to replicate that success globally due to weak interactions between regions. On the other hand, for strong interaction, the phenomena of domination and monopoly may occur, as already exhibited in the ordinal dependent Pólya urn.
Definition 1 (Domination and monopoly).
For an IUM, we denote by
the event that eventually the number of balls of one color is negligible with respect to the number of balls of the other color, and call this event domination. Further, we denote by
the event that eventually only balls of one color are added to the urns, and call this event monopoly. Note that .
We will be interested in how and are affected by the parameter and the sequence , especially in the case of power function/polynomial reinforcement sequences, i.e. , , for some real number , or is of the form
(8) |
where is a positive integer. For , we can assume that since in this case the IUM is equal in law to an IUM with and shifted initial condition . In addition, as we will see later, our results do not depend on . Thus, by a slight abuse of notation, in either of the two cases above, we let denote the law of the IUM with reinforcement sequence and parameter . We show that under , domination implies monopoly a.s., and thus, we will not distinguish the two events in the sequel.
Proposition 1.1.
Assume that satisfies (7) and is eventually increasing, i.e., there exists such that for all . If
(9) |
then for any . In particular, for any and , one has .
1.2. Main results
1.2.1. Power function/Polynomial reinforcements
For , define
(11) |
which we call the critical parameter. Our first main result shows that , which disproves Launay’s conjecture. Moreover, for , we prove that (5) holds for general initial conditions, and the domination occurs with probability 0.
Theorem 1.2.
For any and , under , the sequence defined in (4) converges a.s. Moreover,
(i) if , then for any , one has and
(ii) if , then
where is the unique solution of the following equation on
Note that exists if , see Lemma 3.1.
As mentioned in [16, Conclusion], polynomial reinforcements do not behave as exponential reinforcements where , for some . It has been proved in [16] that, for any , one has
Therefore, one can observe a phase transition at .
Remark 1.1.
As tends to , the exponential reinforcement mechanism converges to the “generalized” reinforcement, which was introduced and studied by Launay and Limic in [18].
Our second main result shows that power function/polynomial reinforcements are weaker than exponential reinforcements in the sense that . On the other hand, for large , the reinforcement becomes very strong and behaves “like” the exponential reinforcement: For , if is sufficiently large, then with positive probability, is close to . In particular, .
Theorem 1.3.
(i) For any , one has .
(ii) For , if , then .
(iii) Fix , for sufficiently large , there exists such that
1.2.2. Urns with simultaneous drawing
For general reinforcement sequences, the case was studied in [17] whose main result is the following. Recall (7) for the definition of the strong reinforcement sequences.
Theorem 1.4 (Launay, [17]).
Given urns, if is a non-decreasing strong reinforcement sequence, then .
We see from (1) that for , the monotonicity assumption is not needed. It was conjectured in [17] that this assumption is also redundant for the cases . In this work, we show a generalization of Theorem 1.4 to a larger class of strong reinforcement sequences.
We will not limit ourselves to two-color urns. In this case, it is convenient to assume that we have only a single urn of -color balls where . Let be the number of balls of the -th color in this urn at time , and write . Without loss of generality, we assume that are positive integers. For any , conditional on , the law of is given by the multinomial distribution
One can easily see that the process is an IUM with reinforcement sequence and parameter . The event monopoly is then given by
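The single-urn description can be sketched as follows: at each step a fixed number of balls is added at once, and their colors are sampled independently from one distribution computed from the counts before the step. The names `multinomial_urn`, `weight` and `n_draws`, and the quadratic weight in the usage line, are assumptions made for this illustration; the sketch uses `random.choices` from the Python standard library to perform the multinomial sampling.

```python
import random

def multinomial_urn(counts, weight, n_draws, steps, rng):
    """Evolve a multi-color urn; at each step n_draws balls are added at once.

    All n_draws colors are sampled independently from the same distribution,
    proportional to weight(count), computed from the counts before the step.
    """
    counts = list(counts)
    colors = range(len(counts))
    for _ in range(steps):
        w = [weight(c) for c in counts]   # frozen for the whole step
        for j in rng.choices(colors, weights=w, k=n_draws):
            counts[j] += 1
    return counts

print(multinomial_urn([1, 1, 1], lambda k: k ** 2, 2, 500, random.Random(0)))
```

Freezing the weights for the whole step is what distinguishes simultaneous drawing from adding the balls one at a time.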
Theorem 1.5.
One has if is a strong reinforcement sequence that satisfies one of the following conditions:
(i) There exists a positive constant such that for all ,
(13) |
(ii) For , let . One has
(14) |
Remark 1.3.
(I) If is non-decreasing and satisfies (7), then (13) holds with . Thus, Theorem 1.5 generalizes Theorem 1.4.
(II) In either case, we require that , the total variation of , is relatively small. For example, if or , then neither (13) nor (14) is satisfied. It is worth mentioning that similar conditions and examples have appeared in the study of strongly edge-reinforced random walks, see [19].
(III) Each condition cannot be derived from the other. By (I), satisfies (13) but does not satisfy (14). On the other hand, one can check by the Cauchy-Schwarz inequality that if
then (14) is satisfied, see e.g. the proof of [19, Corollary 4]. In particular, if , then (14) is satisfied but (13) is not satisfied.
2. Introduction to the proofs and the techniques
2.1. Notation
We let denote a positive constant depending only on real variables and let denote a universal positive constant, which usually means that and do not depend on .
For a real-valued function and a -valued function , we write as , resp. , if there exist positive constants and such that for all , resp. .
We let . We let denote the usual Euclidean norm. We write if a random variable has an exponential distribution with rate .
2.2. Stochastic approximation algorithms
Under , we show that defined by (4) is generated by a stochastic approximation algorithm (Robbins-Monro algorithm), and is closely related to the following (deterministic) planar nonlinear system
(15) |
where is a vector function on defined by
(16) |
For an introduction to stochastic approximation algorithms, see e.g. [2, 4, 5, 12].
Proposition 2.1.
For any and , under , defined by (4) satisfies the following recursion:
(17) |
where and are adapted sequences such that for all ,
where is a positive constant.
For the system (15) with initial condition , the existence and uniqueness of the solution follow from the Lipschitz property of and Picard’s theorem. Note that the solution satisfies for all .
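To make the deterministic system more concrete, here is a hedged numerical sketch for two urns with power-law weights. The vector field below is an assumed form, obtained by writing the mean one-step drift of the two black proportions when both urns hold the same number of balls; it is not copied from (16), and the names `g`, `drift`, `euler_flow`, `rho` (the exponent) and `p` (the interaction parameter) are hypothetical.

```python
def g(x, rho):
    """Chance of black from a pool with black proportion x (power weights)."""
    return x ** rho / (x ** rho + (1.0 - x) ** rho)

def drift(x, y, rho, p):
    """Assumed mean drift of the two black proportions (not copied from (16))."""
    pooled = g((x + y) / 2.0, rho)   # combined pool, assuming equal urn sizes
    fx = -x + p * pooled + (1.0 - p) * g(x, rho)
    fy = -y + p * pooled + (1.0 - p) * g(y, rho)
    return fx, fy

def euler_flow(x, y, rho, p, dt=0.01, steps=5000):
    """Explicit Euler integration of the assumed drift, clamped to the unit square."""
    for _ in range(steps):
        fx, fy = drift(x, y, rho, p)
        # Clamping only guards against floating-point round-off at the boundary.
        x = min(1.0, max(0.0, x + dt * fx))
        y = min(1.0, max(0.0, y + dt * fy))
    return x, y

print(euler_flow(0.9, 0.8, rho=2.0, p=0.5))
print(euler_flow(0.45, 0.6, rho=2.0, p=0.5))
```

Under this assumed field each coordinate's drift lies between minus the coordinate and one minus the coordinate, so the Euler iterates stay in the unit square, in line with the invariance noted above.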
We shall study the asymptotic behavior of the solution to the system (15). It turns out that (15) is a gradient system. The proof of the following result is direct and is omitted here.
Proposition 2.2.
For any , define by
(18) |
where
Then, we have .
Example 2.1 ().
One has and
A point is called an equilibrium of (15) if . Let be the set of all equilibrium points. Observe that for any . We prove that is a finite set for . The cases and are plotted in Figure 1 where is the set of intersection points of the two curves.
Proposition 2.3.
includes . Moreover, for , one has,
(i) if , then
(ii) if (this is possible only when ), then
where is defined in Theorem 1.2;
(iii) if , then
(iv) if , then is finite and
Propositions 2.2, 2.3 and Example 2.1 then imply that the solution to (15) converges to . More precisely,
• If and , then and increases to 0 as . In particular, both and converge to .
• If , converges to an equilibrium as .
Definition 2 (Asymptotically stable equilibria).
Define
(19) |
Observe that the Jacobian matrix of the system (15) is given by
which is a real symmetric matrix and thus has two real eigenvalues:
(20) | ||||
Note that for an equilibrium , if , then ; it is called unstable if .
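The stability test via the eigenvalues of a symmetric Jacobian can be illustrated numerically. The sketch below assumes a drift for two equally sized urns with power-law weights, namely the first coordinate is -x + p g((x+y)/2) + (1-p) g(x) with g(x) = x^rho / (x^rho + (1-x)^rho), and symmetrically for y. This form is an assumption made for the example and is not copied from (16) or (20); `dg`, `jacobian_eigs`, `rho` and `p` are hypothetical names.

```python
import math

def dg(x, rho):
    """Derivative of g(x) = x^rho / (x^rho + (1-x)^rho) on [0, 1], for rho > 1."""
    num = rho * (x * (1.0 - x)) ** (rho - 1.0)
    den = (x ** rho + (1.0 - x) ** rho) ** 2
    return num / den

def jacobian_eigs(x, y, rho, p):
    """Eigenvalues (smaller, larger) of the assumed drift's Jacobian at (x, y)."""
    c = 0.5 * p * dg((x + y) / 2.0, rho)    # shared off-diagonal entry
    a = -1.0 + c + (1.0 - p) * dg(x, rho)   # d(fx)/dx
    d = -1.0 + c + (1.0 - p) * dg(y, rho)   # d(fy)/dy
    mid = (a + d) / 2.0
    rad = math.hypot((a - d) / 2.0, c)
    return mid - rad, mid + rad

for point in [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]:
    lo, hi = jacobian_eigs(*point, rho=2.0, p=0.3)
    label = "asymptotically stable" if hi < 0 else "unstable" if hi > 0 else "degenerate"
    print(point, (lo, hi), label)
```

Because the matrix is symmetric, the closed-form expression for the eigenvalues of a 2×2 symmetric matrix suffices and no linear-algebra library is needed.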
Example 2.2.
and are asymptotically stable equilibria since , while is unstable since .
If and is sufficiently small, or and is sufficiently large, we prove the existence of an asymptotically stable equilibrium.
Proposition 2.4.
(i) For and , one has
where is defined in Theorem 1.2.
(ii) For , if , then there exists an asymptotically stable equilibrium in .
(iii) Fix , for sufficiently large , there exists such that
For the cases and , the vector fields generated by (15) are plotted in Figure 2. In Figure 2(a), is inside the red circle. Readers can also find an asymptotically stable equilibrium near in Figure 2(b).
The following result says that if , then all the equilibria except and , are unstable, as is illustrated in Figure 3 for the cases and .
Corollary 2.5.
Assume that and . If , then .
Proof.
We now sketch the proofs of Theorems 1.2 and 1.3 (the details will be given in Section 4):
• Using stochastic approximation techniques, we show that under , as in the deterministic case (15), almost surely, the sequence is convergent and the limit belongs to .
• For , stochastic approximation theory can also be used to show that converges to any asymptotically stable equilibrium with positive probability, and converges to any unstable equilibrium with probability 0. Theorem 1.2 (ii) and Theorem 1.3 then follow from Propositions 2.3, 2.4 and Corollary 2.5.
2.3. Continuous-time construction with time delays
We use a continuous-time embedding technique to prove Theorem 1.5. As we have mentioned, the case was solved by a continuous-time construction. It is natural to consider whether this technique can be generalized.
For the purpose of the proof, we introduce a new time-lines representation which we call continuous-time construction with time delays.
(22) |
We let be a graph as in (22) consisting of a single vertex and self-loops, i.e. the edge set and , . We shall define a continuous-time jump process on . Let us first introduce some preliminary notation.
Let be the hitting times of to . For each and , let
(23) |
be the number of visits to up to time plus with the convention that . Here and should be interpreted respectively as the number of balls of the -th color in the urn at time and the initial number of balls of the -th color (see Proposition 2.6 for a more precise statement). For , let
(24) |
with the convention that . Let be independent Exp(1)-distributed random variables. The law of is defined as follows:
At time , on each edge , we launch a timer with a duration . When the timer of an edge rings, jumps to cross instantaneously. If an edge is crossed at time such that for some , then we launch a new timer on this edge with a duration . For , at time , we update the denominators (i.e. the rates) for all the timers: for , if the timer on has run a time of , then we reset the timer such that the remaining time becomes
(25) |
Remark 2.1.
(i) Just before we reset the timers, the remaining time of the timer on is
(ii) If, for some , is not crossed during the time interval , then so that there is nothing to change for the timer on .
(iii) If jumps to cross for some at time , we will launch a new timer on and thus . We may simply launch a new timer on with a duration rather than .
(iv) The timer which corresponds to may run at different rates as time changes. All the possible denominators (rates) are
due to the time delays. Note that we may update this timer at jumping times but we will never launch a new one until it rings. Recall defined in (24). If , the total time this timer needs to run is simply , in which case we can write
(26) |
where and .
We denote the natural filtration of by , i.e. . Recall the process defined in Section 1.2.2.
Proposition 2.6.
Let be the jump process defined above. Then,
In particular, we may define and the IUM on the same probability space such that a.s.
(27) |
Proof.
As we explained in Remark 2.1, unlike the continuous-time construction in the proof of [23, Theorem 3.6], at time , we keep using the data we collect at time (i.e. ) to launch new timers. This justifies its name continuous-time construction with time delays. It is a powerful technique that allows us to give a very short proof of a multi-color () version of Theorem 1.4.
A new proof of Theorem 1.4 with .
Conditional on , if for some , then the probability that (i.e. we add balls of the relative major color at time ) is lower bounded by since is non-decreasing. By the conditional Borel-Cantelli lemma, see e.g. [8], a.s. such an event occurs for infinitely many , and thus, there is an infinite sequence of finite stopping times such that at time , there exists such that
(28) |
By (26), for , if ,
(29) |
As is mentioned in the proof of Proposition 2.6, conditional on , the remaining time of the timer on has the distribution of an independent copy of where is an Exp(1)-distributed random variable. By a slight abuse of notation, the time remaining is denoted by . By symmetry and (28), conditional on , with probability at least ,
(note that all the sums above have continuous distributions) and in particular, by (29),
That is, the remaining time needed to visit i.o. is strictly less than that needed for any other edge. By (27) in Proposition 2.6, this is equivalent to saying that only balls of color are taken infinitely often (after time ). Therefore, for any ,
We then conclude that by Lévy’s 0-1 law. ∎
2.4. Coupling
Proposition 1.1 is proved by coupling. Let be an IUM with reinforcement sequence and interaction parameter . We define a new urn process as follows, where and () denote the number of black and red balls in the -th urn at time , respectively.
Similarly, we write , and , . The initial composition is given by . For any , at time step , we add a black ball to the first urn with probability
otherwise, we add a red ball to the first urn; at time step , we add a black ball to the second urn with probability
otherwise, we add a red ball to the second urn.
In words, red balls are always drawn from all the urns combined, whereas black balls are always drawn from the urn alone. Compared to the IUM, it is natural to expect that there will be more red balls and fewer black balls if is non-decreasing.
Lemma 2.7.
Assume that is non-decreasing and . Then we can define the two urn processes and above on the same probability space such that
(30) |
Lemma 2.8.
Lemmas 2.7 and 2.8 will be proved in Section 6. Notice that they also hold if we interchange the colors black/red. Using Lemmas 2.7 and 2.8, we can prove Proposition 1.1.
Proof of Proposition 1.1.
Fix , let be as in Lemma 2.8 for . We define an infinite sequence of stopping times as follows. Let and for any ,
with the convention that . If for some , then we set for all . By Lemma 2.7 and Lemma 2.8, for any ,
On the other hand, on the event , either
-
(1)
for all , in which case the monopoly occurs, or
-
(2)
for all large , in which case domination does not occur.
Therefore, . Thus, for any ,
By Lévy’s 0-1 law, which completes the proof since . ∎
2.5. Organization of the remainder of this paper
Section 3 concerns the results on the deterministic nonlinear system (15): Proposition 2.3 and Proposition 2.4 are proved.
3. Results on the deterministic dynamical system
We assume that and . The following two functions will be used frequently:
(31) |
In particular, and where is defined by (19).
Lemma 3.1.
The equation has a solution on if and only if . If one solution exists, then it is unique.
Proof.
The assertion is trivial for . We now assume that . Note that . Observe that is a strictly increasing function on with and .
If , then there exists a unique such that . The function is strictly decreasing on and strictly increasing on . In particular, there exists a unique solution to on . See Figure 4(a) for the case and .
If , then is strictly decreasing on , and thus, there is no solution to on . The case and is plotted in Figure 4(b). ∎
We denote the unique solution by when it exists. Note that only when .
The proof of Proposition 2.3 will need the following three technical lemmas.
Lemma 3.2.
(i) If and , then .
(ii) For and , one has
Proof.
(i) If , then
(32) |
where the equality holds if and only if or . The inequality (32) is reversed if where the equality holds if and only if or . This proves (i).
(ii) If is an equilibrium, then by (32), we have
and similarly,
Thus, or . Similarly, if is an equilibrium, then or . Moreover, it is easy to see that if a boundary point is an equilibrium, then or . (ii) is then proved. ∎
Lemma 3.3.
Define
(33) |
If , then for any , one has .
Proof.
Observe that on the interval , the function defined in (31) is convex and . Thus, if , then
In particular,
Since , one has
Thus, for and , we have
where in the last inequality we used that
If and , then
which completes the proof. ∎
Lemma 3.4.
Let and . One has
Proof.
Now we are ready to prove Proposition 2.3.
Proof of Proposition 2.3.
It is direct to check that includes . The assertions (i) and (ii) follow from Lemma 3.2 and Lemma 3.4, respectively.
By symmetry and Lemma 3.2, to prove (iii) and (iv), we need to show that the set
is finite if and is empty if .
(iii) Assume that and , and in particular, . Recall and defined in (31). We see from the proof of Lemma 3.1 that and
(36) |
which contradicts our assumption, and proves (iii).
(iv) Assume that . Using arguments in the proof of Lemma 3.1, we see that there exists such that is increasing on and decreasing on , see e.g. Figure 4(a). For , define
Since and , there exists an such that for any , there exists a unique such that . By the analytic implicit function theorem, see e.g. [15, Theorem 6.1.2], this unique can be written as where is analytic and increasing on and
(37) |
Observing that decreases as increases, we see that is increasing on . Thus, using the convexity of and that , one has, for any ,
(38) |
Now we consider the analytic function on . It is non-constant since its derivative equals
(39) |
which converges to as (observe that and as ). As a non-constant analytic function, has only finitely many zeros on . Observe that if with , then and . In particular, is a zero of on . Since can take only finitely many values, (iv) is proved. ∎
For the proof of Proposition 2.4, we will need the following auxiliary lemmas.
Lemma 3.5.
Assume that . One has,
(i) if , then ;
(ii) if , then
Proof.
Fix , define
and
Then and . Moreover,
and for ,
where we used that achieves its minimum on at . Therefore, on and on . It remains to notice that (i) and (ii) follow from that and that , respectively. ∎
Lemma 3.6.
For and , let
(40) |
Then , and in particular, .
Proof.
For , one has , and in particular,
(41) |
where the last inequality follows from that . Observe that
Then if and only if
(42) |
(i) If , by Lemma 3.5 (i) and (41),
where we used that . To prove (42), it suffices to show that
Using that for , we see the left-hand side is lower bounded by
which completes the proof for the case .
(ii) If , similarly, by Lemma 3.5 (ii) and (41), it suffices to show that
Using that , we see that the left-hand side (LHS) satisfies
We now prove Proposition 2.4.
Proof of Proposition 2.4.
(i) Notice that and . By (20), we need to prove that
(43) |
The case is trivial since . For , one has . Moreover, using that (and thus ), we have
Therefore, using (19), we can write
Then (43) is equivalent to
which follows from Lemma 3.6.
(ii) Recall defined in (16). If and , then for any , one has
where we used that for and that
If , then for any , one still has
since the left-hand side is an increasing function in . Moreover, for any and , one has
On the other hand, since , one has and for and . Therefore, the maximum of on the square can only be achieved at some interior point, which is asymptotically stable.
(iii) We can assume that and thus . We shall use the functions
and defined in the proof of Proposition 2.3 (iv): For any , we have and where is such that is strictly increasing on and strictly decreasing on . By (32), if , then
whence we have for . Fix such that , then for any ,
as . In particular, we can choose a large such that for all . Therefore, for any ,
(44) |
Note that
(45) |
uniformly on . Recall that . Using (37), (39) and (44), we have
(46) |
Moreover, by the choice of , one has
(47) |
Therefore, for all large , we have and , and by (46), there exists a unique such that . In particular, is an equilibrium. Observe that by (46) and (47),
and thus . Since none of and are in the interval , by (45), we have
which, by (20), implies that . Thus, for all large . ∎
4. Stochastic approximation algorithm
Proof of Proposition 2.1.
Lemma 4.1.
Recall defined by (21). Under , one has:
(i) The process satisfies the following recursion:
where and are adapted sequences such that for all ,
(50) |
where is a positive constant. In particular, converges a.s.
(ii) For , let with the convention that . Then, there exists a positive integer such that for all ,
Proof.
(i) For , let
and
Then
By definition,
(51) |
Using that and , we have
(52) |
Note that conditional on , the two random variables and are independent Bernoulli random variables. Thus, and
which implies (50) in virtue of (52). Now let and
Then, by (50), the process is a -bounded martingale, and thus converges a.s. Moreover,
This shows that converges a.s.
(ii) For any , by (50),
and, similarly, the quadratic variation of satisfies
Choose such that . Then, for all , by the optional stopping theorem, one has
where and we used that on the event ,
This proves (ii) since . ∎
We now set , and define an interpolated process by
By Proposition 2.1 and [2, Proposition 4.2, Remark 4.5], the interpolated process is an asymptotic pseudotrajectory of the flow induced by the vector field . We now prove Theorem 1.2.
Proof of Theorem 1.2.
We first prove the a.s.-convergence of . We first assume that . The case follows from the classical results for the Pólya urn model. If , recall that and by Example 2.1. Since is compact, by [2, Theorem 5.7], the limit set of the interpolated process
is internally chain transitive a.s. Then, by Proposition 2.2 and [2, Proposition 6.4], almost surely, , and thus, the sequence converges to a.s. On the other hand, by Lemma 4.1 (i) and (51), converges a.s., and in particular, converges a.s. For , the proof is similar: By Proposition 2.3, there are only finitely many equilibria of the gradient system (15). Then one can directly apply [2, Corollary 6.6] to conclude that converges a.s. to an equilibrium.
The following auxiliary lemma will be used in the proof of Theorem 1.3.
Lemma 4.2.
If is unstable, i.e. , then
(53) |
Proof.
Since is unstable, by Proposition 2.3 (iv), it is not on the boundary of , and thus, there exists a neighborhood of such that any is bounded away from the boundary. Now we show that there exists a constant such that for any ,
(54) |
where is given in Proposition 2.1 and . By (2) and (49), we can find positive constants such that
Therefore, for any , the left-hand side of (54) is lower bounded by
The cases can be proved similarly.
Proof of Theorem 1.3.
(i) If , then, by Proposition 2.3 and Corollary 2.5, consists of finitely many unstable equilibria. From the proof of Theorem 1.2, we see that converges a.s. to an equilibrium. Lemma 4.2 then implies that , and thus, .
We assume that for some . In particular, for any , there exists such that . By Proposition 2.3 and Lemma 4.2, we can find two sequences and such that for any ,
and
Since is compact, by possibly choosing a subsequence, we may assume that for some . Since and are continuous functions of , we see that and with , which contradicts Corollary 2.5.
5. Continuous-time construction with time-delays
Let be the jump process defined in Section 2.3 and be its natural filtration. By Proposition 2.6, we may define and the IUM on the same probability space such that (27) holds.
The following lemma will be used in the proof of Theorem 1.5. Recall that .
Lemma 5.1.
Assume that satisfies (7) and . Let and be independent Exp(1)-distributed random variables, and for , let
Then, there exists such that for all large ,
Proof.
Our assumptions imply that there exists such that for all . Now fix , let and . Note that . By definition, . Moreover,
By the choice of , one has
Applying the optional stopping theorem to the -bounded martingale , we have
where the quadratic variation of the martingale is given by
By symmetry,
and
These two inequalities imply the desired result. ∎
Proof of Theorem 1.5.
By (26), for any and , if , we can write
where and . Observe that
which belongs to the closed interval
Repeating this procedure times gives
(55) |
Case (i): We assume that (13) holds. Then, for any ,
(56) |
In particular, for any . As in (28), the conditional Borel-Cantelli lemma then implies that a.s. there is an infinite sequence of finite stopping times such that at each time , there exists some ,
(57) |
As in the proof of Theorem 1.4, at time , for any , the remaining time of the timer on has an exponential distribution with rate , which, by a slight abuse of notation, we denote by . We assume that for some . Then,
(58) | ||||
where we used that for each , the term is counted at most times in the sum in the second line, and the last inequality follows from (13).
By Markov’s inequality, for any positive integrable random variable , if denotes its median value, then
and thus . In particular, by (58),
(59) |
By (56) and (57), starting from time , the time needed to visit once more is
(60) |
Again, here should be interpreted as the remaining time of the timer on , which is independent of in virtue of (57). Therefore, by (59) and that ,
where the event is defined by
By symmetry (one may interchange and ),
(61) | ||||
For , we let be the event that
Note that by symmetry, (61) still holds if one replaces by . Thus, . Since are i.i.d., by Hölder’s inequality, one has
In virtue of (55), on , we have and
That is, the remaining time needed to visit i.o. is strictly less than that needed for any other edge. In other words, by Proposition 2.6, only balls of color are taken infinitely often. Therefore,
We conclude that by Lévy’s 0-1 law.
Case (ii): We assume that (14) holds. By a slight abuse of notation, for , we let be such that . For any , by Lemma 5.1,
(62) |
and, by Markov’s inequality and (58), for large ,
(63) | ||||
where we used that converges to by (14). By a slight abuse of notation, we let be the event that
We deduce from (62) and (63) that
The rest of the proof follows the same lines as that of Case (i): Hölder’s inequality and (55) imply that for all large ,
which shows that by Lévy’s 0-1 law. ∎
6. Coupling
Proof of Lemma 2.7.
Let be i.i.d. uniform random variables on . For any , we set , resp. if
otherwise, we set , resp. ; we set , resp. if
otherwise, we set , resp. . Then it is easy to check that and defined above have the desired laws. One can then prove (30) by induction. ∎
Proof of Lemma 2.8.
As in the proof of Theorem 1.5, we use a time-lines construction to prove Lemma 2.8. Let be a directed multigraph with
where and are two arcs from to and to , respectively. We regard as an undirected edge.
Let , , be independent Exp(1)-distributed random variables. We define a continuous-time jump process on :
(i) Define, on each (directed or undirected) edge , independent point processes (alarm times) : for each ,
(64) |
(ii) Each edge has its own clock, denoted by . If (resp. ), runs when is at (resp. ). For , set .
(iii) Set . If at time , the clock of an edge rings, i.e. for some , then jumps to cross instantaneously.
Let be the jumping times of . For , as in (23), let be the number of visits to up to time plus , and let be the number of visits to up to time plus (note that here we distinguish and ). Then, as in Proposition 2.6, one can show by the memoryless property of exponentials that
We may assume that on some probability space. We denote by the natural filtration of .
Now, assume that , and in particular, for . Starting from time , the total time that needs to spend to cross the undirected edge infinitely often is
(65) |
where is the usual ceiling function. Again, the first term should be interpreted as the remaining time of the clock on at time . On the other hand, the total time that needs to spend to cross both and infinitely often is
(66)
Note that up to time , the time spends at , resp. , is upper bounded by , resp. . Now using properties of exponential random variables, we have, by (65) and (66),
and
By Chebyshev’s inequality,
(67)
where we used the monotonicity of to get
On the other hand, Markov’s inequality implies that
(68)
By virtue of (9), the right-hand side of (68) can be made arbitrarily small for all by first choosing a small and then choosing a large . Using (67) and (68), by possibly choosing a smaller and a larger , we have
if and . It remains to observe that on the event , only black edges are crossed infinitely often, that is, only black balls are drawn infinitely often. ∎
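The time-lines construction above extends Rubin’s exponential embedding, which Davis [10] used for the single urn of Section 1.1. As a minimal sketch of that simpler setting only (not of the jump process on the multigraph), the code below attaches to each color its own sequence of alarm times built from independent Exp(1) variables, and adds a ball of the color whose alarm rings first. The power feedback function `w(n) = n**rho` and all names are assumptions made for illustration.

```python
import random


def embedded_urn(b0, r0, n_draws, rho=2.0, seed=0):
    """Simulate a single generalized Polya urn through exponential clocks.

    With k balls of a given color, the waiting time on that color's
    time-line until its next ball is Exp(1) / w(k), where w(k) = k**rho.
    Returns the sequence of drawn colors, the alarm times at which the
    draws occur, and the final ball counts.
    """
    rng = random.Random(seed)
    counts = {"black": b0, "red": r0}
    # Next alarm time on each color's own time-line.
    alarm = {c: rng.expovariate(1.0) / counts[c] ** rho for c in counts}
    draws, times = [], []
    for _ in range(n_draws):
        c = min(alarm, key=alarm.get)  # color whose clock rings first
        draws.append(c)
        times.append(alarm[c])
        counts[c] += 1
        # Schedule the next alarm of this color after the current one.
        alarm[c] += rng.expovariate(1.0) / counts[c] ** rho
    return draws, times, counts


if __name__ == "__main__":
    draws, times, counts = embedded_urn(1, 1, n_draws=2000, rho=2.0, seed=3)
    print(counts, "last draw time:", times[-1])
```

By the memoryless property of exponentials, the color sequence produced this way has the law of the urn with feedback function `w`. When the series of `1 / w(k)` converges, as for `rho > 1`, each time-line accumulates only finitely much total time, and the color whose time-line is exhausted first is the one drawn infinitely often. The proof above compares such total times, in (65) and (66), for the edges of the multigraph.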
7. Some open questions
For the interacting urn mechanism with strong reinforcement, some interesting questions remain unsolved.
- (i) For power function/polynomial reinforcements, an important question is whether there is a phase transition at . In Remark 1.2, we conjecture that . If this is true, is increasing in ? (Intuitively speaking, the reinforcement becomes stronger as grows.) And does the limit of exist as approaches 1 from above? To solve these questions, we may need a better understanding of the system (15), or we need to couple and for .
- (ii)
- (iii)
8. Acknowledgement
I am very grateful to Professor Tarrès, my Ph.D. advisor, for inspiring the choice of this subject.
References
- [1] Raffaele Argiento, Robin Pemantle, Brian Skyrms, and Stanislav Volkov. Learning to signal: analysis of a micro-level reinforcement model. Stochastic Process. Appl., 119(2):373–390, 2009.
- [2] Michel Benaïm. Dynamics of stochastic approximation algorithms. In Séminaire de Probabilités, XXXIII, volume 1709 of Lecture Notes in Math., pages 1–68. Springer, Berlin, 1999.
- [3] Michel Benaïm, Itai Benjamini, Jun Chen, and Yuri Lima. A generalized Pólya’s urn with graph based interactions. Random Structures Algorithms, 46(4):614–634, 2015.
- [4] Albert Benveniste, Michel Métivier, and Pierre Priouret. Adaptive algorithms and stochastic approximations, volume 22 of Applications of Mathematics (New York). Springer-Verlag, Berlin, 1990. Translated from the French by Stephen S. Wilson.
- [5] Vivek S. Borkar. Stochastic approximation. Cambridge University Press, Cambridge; Hindustan Book Agency, New Delhi, 2008. A dynamical systems viewpoint.
- [6] Irene Crimaldi, Paolo Dai Pra, and Ida Germana Minelli. Fluctuation theorems for synchronization of interacting Pólya’s urns. Stochastic Process. Appl., 126(3):930–947, 2016.
- [7] Irene Crimaldi, Pierre-Yves Louis, and Ida G. Minelli. Interacting nonlinear reinforced stochastic processes: synchronization or non-synchronization. Adv. in Appl. Probab., 55(1):275–320, 2023.
- [8] Didier Dacunha-Castelle and Marie Duflo. Probability and statistics. Vol. II. Springer-Verlag, New York, 1986. Translated from the French by David McHale.
- [9] Paolo Dai Pra, Pierre-Yves Louis, and Ida G. Minelli. Synchronization via interacting reinforcement. J. Appl. Probab., 51(2):556–568, 2014.
- [10] Burgess Davis. Reinforced random walk. Probab. Theory Related Fields, 84(2):203–229, 1990.
- [11] Eleni Drinea, Alan Frieze, and Michael Mitzenmacher. Balls and bins models with feedback. In Proceedings of 13th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 308–315. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2002.
- [12] Marie Duflo. Random iterative models, volume 34 of Applications of Mathematics (New York). Springer-Verlag, Berlin, 1997. Translated from the 1990 French original by Stephen S. Wilson and revised by the author.
- [13] Yilei Hu, Brian Skyrms, and Pierre Tarrès. Reinforcement learning in signaling game. arXiv preprint arXiv:1103.5818, 2011.
- [14] Gursharn Kaur and Neeraja Sahasrabudhe. Interacting urns on a finite directed graph. J. Appl. Probab., 60(1):166–188, 2023.
- [15] Steven G. Krantz and Harold R. Parks. The implicit function theorem. Modern Birkhäuser Classics. Birkhäuser/Springer, New York, 2013. History, theory, and applications, Reprint of the 2003 edition.
- [16] Mickaël Launay. Interacting urn models. arXiv preprint arXiv:1101.1410, 2011.
- [17] Mickaël Launay. Urns with simultaneous drawing. arXiv preprint arXiv:1201.3495, 2012.
- [18] Mickaël Launay and Vlada Limic. Generalized interacting urn models. arXiv preprint arXiv:1207.5635, 2012.
- [19] Vlada Limic and Pierre Tarrès. Attracting edge and strongly edge reinforced walks. Ann. Probab., 35(5):1783–1806, 2007.
- [20] Seyedmeghdad Mirebrahimi. Interacting stochastic systems with individual and collective reinforcement. PhD thesis, Université de Poitiers, 2019.
- [21] Roberto Oliveira. Balls-in-bins processes with feedback and Brownian motion. Combin. Probab. Comput., 17(1):87–110, 2008.
- [22] Robin Pemantle. Nonconvergence to unstable points in urn models and stochastic approximations. Ann. Probab., 18(2):698–712, 1990.
- [23] Robin Pemantle. A survey of random processes with reinforcement. Probab. Surv., 4:1–79, 2007.
- [24] Olivier Raimond and Pierre Tarrès. Non-convergence to unstable equilibriums for continuous-time and discrete-time stochastic processes. arXiv preprint arXiv:2311.02978, 2023.
- [25] Neeraja Sahasrabudhe. Synchronization and fluctuation theorems for interacting Friedman urns. J. Appl. Probab., 53(4):1221–1239, 2016.
- [26] Pierre Tarrès. Localization of reinforced random walks. arXiv preprint arXiv:1103.5536, 2011.