The Accumulation of Beneficial Mutations and Convergence to a Poisson Process

Nantawat Udomchatpitak, Department of Mathematics, Mahidol University

Jason Schweinsberg, Department of Mathematics, University of California San Diego

This work (Grant No. RGNS 64-155) was supported by the Office of the Permanent Secretary, Ministry of Higher Education, Science, Research and Innovation (OPS MHESI), Thailand Science Research and Innovation (TSRI), and Mahidol University.

Abstract

We consider a model of a population with fixed size $N$ , which is subjected to an unlimited supply of beneficial mutations at a constant rate $\mu_{N}$ . Individuals with $k$ beneficial mutations have the fitness $(1+s_{N})^{k}$ . Each individual dies at rate 1 and is replaced by a random individual chosen with probability proportional to its fitness. We show that when $\mu_{N}\ll 1/(N\log N)$ and $N^{-\eta}\ll s_{N}\ll 1$ for some $\eta<1$ , large numbers of beneficial mutations are present in the population at the same time, competing against each other, yet the fixation times of beneficial mutations, after a time scaling, converge to the times of a Poisson process.

MSC: Primary 92D15; Secondary 60J27, 60J80, 92D25

Keywords: Population model, mutation, selection, Poisson process

1 Introduction

One of the most important questions in evolutionary biology is to understand how beneficial mutations accumulate in a population. We consider here a simple model of a population which repeatedly acquires beneficial mutations. We assume the population has fixed size $N$ . We assume that, at time zero, no individuals have mutations, but then each individual in the population independently acquires mutations at times of a homogeneous Poisson process with rate $\mu_{N}$ . All mutations are assumed to be beneficial and to increase the individual’s fitness by a factor of $1+s_{N}$ , so that an individual with $k$ mutations has fitness $(1+s_{N})^{k}$ . We assume that each individual independently lives for an exponentially distributed time with rate $1$ . When an individual dies, it gets replaced by a new individual whose parent is chosen at random from the $N$ individuals in the population, with probability proportional to the individual’s fitness. The new individual inherits all of its parent’s mutations.

It is instructive to consider what happens after one individual acquires a beneficial mutation, if we assume that no further mutations can occur. As we will explain in more detail below, the number of individuals with the mutation then evolves like a birth and death chain in which the ratio of the birth rate to the death rate is $1+s_{N}$ . Classical results on asymmetric random walks imply that the probability that this chain reaches $N$ before $0$ is

\frac{s_{N}}{(1+s_{N})(1-(1+s_{N})^{-N})},

which is approximately $s_{N}/(1+s_{N})$ as long as $(1+s_{N})^{N}\rightarrow\infty$ as $N\rightarrow\infty$ . Therefore, the beneficial mutation may quickly disappear, but with probability approximately $s_{N}/(1+s_{N})$ , the beneficial mutation will spread to the entire population, an event known as a selective sweep. One can also show that the duration of a selective sweep, that is, the time required for a beneficial mutation to spread to the entire population, is approximately $(2/s_{N})\log N$ . This question was first investigated by Kimura and Ohta [16], and a rigorous analysis for a population model very similar to the one presented here is given in section 6.1 of [5].
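As a quick numerical illustration of this approximation, the following Python sketch evaluates the displayed fixation probability and compares it with $s_{N}/(1+s_{N})$. The function name `fixation_probability` and the sample values of $N$ and $s_{N}$ are illustrative assumptions and are not taken from the analysis in this paper.

```python
import math

def fixation_probability(N: int, s: float) -> float:
    """Probability that a birth and death chain started at 1 reaches N before 0,
    when the ratio of the birth rate to the death rate is 1 + s."""
    # 1 - (1+s)^(-N), computed stably as -expm1(-N * log(1+s)).
    denom = -math.expm1(-N * math.log1p(s))
    return s / ((1.0 + s) * denom)

# Assumed example: moderate selection with s = N^(-1/2).
N = 10**6
s = N ** -0.5
exact = fixation_probability(N, s)
approx = s / (1.0 + s)
print(exact, approx)

# Nearly neutral case: when N*s is tiny, the probability is close to 1/N.
print(fixation_probability(100, 1e-8))
```

In the first case $(1+s_{N})^{N}$ is astronomically large, so the two printed values agree to floating-point precision. In the second case the correction factor matters and the probability is close to $1/N$.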

Returning now to the original population model, because there are $N$ individuals acquiring mutations at rate $\mu_{N}$ , the total mutation rate for the population is $N\mu_{N}$ . Therefore, the rate of mutations which trigger a selective sweep is approximately $N\mu_{N}s_{N}/(1+s_{N})$ . It follows that the expected time between such mutations is approximately $(1+s_{N})/(N\mu_{N}s_{N})$ . Therefore, the time between selective sweeps is much longer than the duration of a selective sweep, which is approximately $(2/s_{N})\log N$ , provided that $\mu_{N}\ll 1/(N\log N)$ . As a result, when $\mu_{N}\ll 1/(N\log N)$ , we expect to have approximately exponentially distributed waiting times between selective sweeps, so that after a suitable rescaling of time, the times of selective sweeps converge as $N\rightarrow\infty$ to the times of a homogeneous Poisson process. When $s_{N}$ is bounded away from zero as $N\rightarrow\infty$ , which is the case of strong selection, this result is straightforward to prove because with high probability, there will only be one beneficial mutation in the population at any given time that has not already spread to the entire population, which means the selective sweeps can be analyzed individually. However, when $s_{N}\rightarrow 0$ as $N\rightarrow\infty$ , even though the selective sweeps are well separated in time, at any given time there will be many different mutations in the population that will ultimately die out before spreading to a large number of individuals. The presence of these additional mutants leads to what is known as clonal interference and complicates the analysis significantly. In this paper, we demonstrate that nevertheless one can prove the expected result for a range of values of $s_{N}$ which includes the case of moderate selection, where $s_{N}=N^{-b}$ for some $b\in(0,1)$ .
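The separation of time scales described above can be checked numerically. The sketch below divides the approximate expected waiting time $(1+s_{N})/(N\mu_{N}s_{N})$ by the approximate sweep duration $(2/s_{N})\log N$. The name `separation_ratio` and the particular choices $s_{N}=N^{-1/2}$ and $\mu_{N}=1/(N(\log N)^{2})$ are assumptions made only for illustration.

```python
import math

def separation_ratio(N: int, mu: float, s: float) -> float:
    """Approximate expected time between sweep-triggering mutations divided by
    the approximate duration (2/s) log N of a single selective sweep."""
    waiting_time = (1.0 + s) / (N * mu * s)
    sweep_duration = (2.0 / s) * math.log(N)
    return waiting_time / sweep_duration

for N in (10**4, 10**6, 10**8):
    s = N ** -0.5                        # assumed: moderate selection
    mu = 1.0 / (N * math.log(N) ** 2)    # assumed: one choice with mu << 1/(N log N)
    print(N, separation_ratio(N, mu, s))
```

The ratio simplifies to $(1+s_{N})/(2N\mu_{N}\log N)$, so it tends to infinity exactly when $\mu_{N}\ll 1/(N\log N)$. With the assumed mutation rate it equals $(1+s_{N})(\log N)/2$ and grows slowly with $N$.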

Given two sequences of positive numbers $(a_{N})_{N=1}^{\infty}$ and $(b_{N})_{N=1}^{\infty}$ , we write $a_{N}\ll b_{N}$ if $\lim_{N\rightarrow\infty}a_{N}/b_{N}=0$ and $a_{N}\sim b_{N}$ if $\lim_{N\rightarrow\infty}a_{N}/b_{N}=1$ . Throughout the paper, we will assume that

\mu_{N}\ll\frac{1}{N\log N}

(1)

and that

N^{-\eta}\ll s_{N}\ll 1\quad\textup{for some }\eta<1.

(2)

Let $X_{k,N}(t)$ be the number of individuals at time $t$ with $k$ mutations, which we call type $k$ individuals. Let $T_{0,N}=0$ , and let

T_{k,N}=\inf\bigg{\{}t\geq 0:X_{k,N}(t)>\frac{\log N}{s_{N}}\bigg{\}}.

Also, let

\Delta=\left\lfloor\frac{1}{1-\eta}\right\rfloor+1.

The following theorem is the main result of this paper.

Theorem 1.1.

Assume that (1) and (2) hold. Let $(\xi_{k})_{k=1}^{\infty}$ be a sequence of independent random variables having the exponential distribution with mean one. Then for each fixed positive integer $K$ , as $N\rightarrow\infty$ we have the convergence in distribution

\big{(}N\mu_{N}s_{N}(T_{k,N}-T_{k-1,N})\big{)}_{k=1}^{K}\Rightarrow(\xi_{k})_{k=1}^{K}.

(3)

Furthermore, there exist positive constants $C_{1}$ and $C_{2}$ , depending on $\eta$ , such that for all nonnegative integers $k$ , we have

\lim_{N\rightarrow\infty}P\bigg{(}X_{k,N}(t)\geq N-\frac{C_{2}\log N}{s_{N}}\textup{ for all $t$ such that }T_{k,N}+\frac{C_{1}\log N}{s_{N}}\leq t<T_{k+1,N}\bigg{)}=1

(4)

and

\lim_{N\rightarrow\infty}P\bigg{(}\sum_{j=k}^{k+\Delta}X_{j,N}(t)=N\textup{ for all $t$ such that }T_{k,N}+\frac{C_{1}\log N}{s_{N}}\leq t<T_{k+1,N}\bigg{)}=1.

(5)

We can think of $T_{k,N}$ as being approximately the time when type $k$ becomes established in the population. The result (3) demonstrates that the times $T_{k,N}$ , when scaled by $N\mu_{N}s_{N}$ which is approximately the rate at which selective sweeps take place, converge as $N\rightarrow\infty$ to the times of a homogeneous rate one Poisson process. The result (4) shows that shortly after time $T_{k,N}$ , most of the population consists of type $k$ individuals. Furthermore, the result (5) shows that all individuals of types $k-1$ and lower disappear from the population shortly after time $T_{k,N}$ , and then all individuals have types between $k$ and $k+\Delta$ until at least time $T_{k+1,N}$ .

From this result, we obtain the following corollary regarding how the average number of mutations in the population evolves over time.

Corollary 1.2.

Assume that (1) and (2) hold. For all $t\geq 0$ , let

\overline{X}_{N}(t)=\frac{1}{N}\sum_{k=0}^{\infty}kX_{k,N}(t)

denote the average number of mutations carried by the $N$ individuals in the population at time $t$ . Then, the finite-dimensional distributions of the processes $(\overline{X}_{N}(t/(N\mu_{N}s_{N})),t\geq 0)$ converge as $N\rightarrow\infty$ to the finite-dimensional distributions of a homogeneous rate one Poisson process.

To give one indication of why these results are significant, we refer the reader to the award-winning papers [11, 1], which provide a mathematical analysis of the results of the famous Lenski experiments on bacterial evolution. In [11], the authors consider a model very similar to the one in the present paper and assume that $s_{N}\sim N^{-b}$ and $\mu_{N}\sim N^{-(1+a)}$ , where $0<b<1/2$ and $a>3b$ . (In [11], the mutation rate is written as $\mu_{N}\sim N^{-a}$ , but this is because $\mu_{N}$ in [11] refers to the mutation rate for the entire population and therefore corresponds to $N\mu_{N}$ in the present paper.) These restrictions on the parameters are chosen to eliminate all clonal interference on the time scale of interest with high probability. Theorem 1.1 suggests that the same results may still hold if the condition $a>3b$ is replaced by the weaker condition $a>0$ , which is sufficient to eliminate clonal interference among beneficial mutations that do not quickly die out.

Many papers have been devoted to analyzing this population model (or very similar models, perhaps with slightly different selection mechanisms) for different ranges of values for the parameters $\mu_{N}$ and $s_{N}$ . Much of this work has been carried out by statistical physicists and appears in the biology or physics literature; see, for example, [2, 3, 7, 12, 13, 18, 19, 21]. There is also a growing body of mathematically rigorous work on the subject. The case when $\mu_{N}\sim C/(N\log N)$ , where one begins to see overlaps between selective sweeps, was considered by Gerrish and Lenski in [9] and has recently been studied rigorously in [10]. Durrett and Mayberry [6] studied the case in which $s_{N}$ is a constant and $\mu_{N}\sim N^{-a}$ for some $a\in(0,1)$ . Schweinsberg [22, 23] studied slightly faster mutation rates, so that $\mu_{N}$ tends to zero more slowly than any power of $1/N$ . This work made rigorous the analysis in [3, 4]. Rigorous results for the case in which both $s_{N}$ and $\mu_{N}$ are constants were established in [24, 15]. One can also consider the case in which the mutation rate is very fast, but the selective benefit resulting from each mutation is very small. In this case, the fitness of a lineage over time is well approximated by Brownian motion. This parameter regime was studied by Neher and Hallatschek [19]. A branching Brownian motion model that should serve as a good approximation to this population model was studied rigorously in [20, 17]. Finally, we note that the case when both $s_{N}$ and $\mu_{N}$ are on the scale of $1/N$ can be studied using a diffusion approximation, as discussed, for example, in section 8.1 of [5].

The rest of this paper is devoted to proving Theorem 1.1 and Corollary 1.2. An important component of the proof will be a coupling between the population process and a branching process with immigration, which will allow us to bound the number of individuals with a given number of mutations from above and below by branching processes.

2 Transition rates for the population process

For the rest of the paper, to lighten notation, we shall omit the subscript $N$ and simply write $\mu$ , $s$ , $X_{k}(t)$ , and $T_{k}$ in place of $\mu_{N}$ , $s_{N}$ , $X_{k,N}(t)$ , and $T_{k,N}$ . Nevertheless, it is important to keep in mind that these quantities do depend on $N$ .

In this section, we work out the transition rates for the population process. Let

S(t)=\sum_{k=0}^{\infty}(1+s)^{k}X_{k}(t),

(6)

which is the total fitness of the population at time $t$ . Note that $S(t)\geq\sum_{k=0}^{\infty}X_{k}(t)=N$ for all $t\geq 0$ . We need to consider two types of transitions in the population process:

For every pair of non-negative integers $(i,j)$ such that $j\neq i$ and $j\neq i+1$ , $X_{i}$ decreases by $1$ while $X_{j}$ increases by 1 when a type $i$ individual is replaced by a type $j$ individual. Hence, the rate at which $X_{i}$ decreases by $1$ while $X_{j}$ increases by 1 at time $t$ is

X_{i}(t)\cdot\frac{(1+s)^{j}X_{j}(t)}{S(t)}

because type $i$ individuals die at rate $X_{i}(t)$ , and the probability that the new individual born is type $j$ is $(1+s)^{j}X_{j}(t)/S(t)$ .

For every non-negative integer $i$ , $X_{i}$ decreases by $1$ while $X_{i+1}$ increases by 1 when a type $i$ individual is replaced by a type $i+1$ individual, or a type $i$ individual gains a new mutation and becomes a type $i+1$ individual. Hence, the rate at which $X_{i}$ decreases by $1$ while $X_{i+1}$ increases by 1 at time $t$ is

X_{i}(t)\cdot\frac{(1+s)^{i+1}X_{i+1}(t)}{S(t)}+X_{i}(t)\mu.

There are also events in which a type $i$ individual is replaced by another type $i$ individual, but we may ignore these events because they do not change the composition of the population.

From these transition rates, we can see that for $k\geq 0$ , the process $(X_{k}(t),t\geq 0)$ can be viewed as a birth-death process with immigration having the following transition rates:

An immigration event occurs when a type $k-1$ individual becomes a type $k$ individual by acquiring a new mutation, which occurs at rate

m_{k}(t):=X_{k-1}(t)\mu.

(7)

Note that immigration only occurs for $k\geq 1$ . We will call a type $k$ individual a type $k$ immigrant if it arises from a type $k-1$ individual who gains a new mutation.

A given type $k$ individual gives birth when an individual that is not of type $k$ is replaced by a new individual who chooses this type $k$ individual as its parent. This event occurs at rate

b_{k}(t):=(N-X_{k}(t))\cdot\frac{(1+s)^{k}}{S(t)}.

(8)

A given type $k$ individual dies when it is replaced by an individual that is not of type $k$ , or it gains a new beneficial mutation, which occurs at rate

d_{k}(t):=\left(1-\frac{(1+s)^{k}X_{k}(t)}{S(t)}\right)+\mu.

(9)

Note that when discussing births and deaths of type $k$ individuals, we are disregarding events in which a type $k$ individual is replaced in the population by another type $k$ individual. Ignoring these birth and death events does not affect the distribution of types in the population but does alter the genealogy of the population. For the rest of the paper, we will work with this modified genealogy. This affects what is meant when we consider, for example, the set of individuals that are descended from a particular type $k$ immigrant.
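To make the transition rates of this section concrete, here is a minimal Gillespie-style simulation sketch of the type counts $(X_{k}(t))_{k\geq 0}$. It is an illustration under stated assumptions and not part of the proofs: the function name `simulate_type_counts`, the parameter values, and the use of Python's `random` module are our own choices. The sketch also performs replacements of a type $k$ individual by another type $k$ individual, which leave the counts unchanged.

```python
import random

def simulate_type_counts(N, mu, s, t_max, seed=0):
    """Simulate the population process up to time t_max.
    Returns a list counts with counts[k] = X_k(t_max)."""
    rng = random.Random(seed)
    counts = [N]                      # at time zero, no individual has a mutation
    t = 0.0
    total_rate = N * (1.0 + mu)       # N deaths at rate 1 plus N mutations at rate mu
    while True:
        t += rng.expovariate(total_rate)
        if t > t_max:
            return counts
        types = list(range(len(counts)))
        # The affected individual is uniform among the N individuals.
        i = rng.choices(types, weights=counts)[0]
        if rng.random() < mu / (1.0 + mu):
            j = i + 1                 # the individual gains a new mutation
        else:
            # The individual dies; the parent is chosen among all N individuals
            # with probability proportional to fitness (1+s)^k.
            fitness = [(1.0 + s) ** k * x for k, x in enumerate(counts)]
            j = rng.choices(types, weights=fitness)[0]
        if j == len(counts):
            counts.append(0)
        counts[i] -= 1
        counts[j] += 1

counts = simulate_type_counts(N=200, mu=1e-3, s=0.05, t_max=20.0)
print(counts, sum(counts))
```

Because every event removes one individual of type $i$ and adds one of type $j$, the total population size stays equal to $N$, in agreement with the identity $\sum_{k}X_{k}(t)=N$ used above.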

3 Structure of the induction argument

Define $T_{0}^{\prime}=0$ , and for each positive integer $k$ , let

T_{k}^{\prime}=\inf\left\{t\geq T_{k-1}^{\prime}:X_{k-1}(t)=0\right\}.

Note that at time $T_{k}^{\prime}$ , all individuals of types $k-1$ or lower have disappeared from the population. Let

\theta=N\mu\vee\frac{1}{Ns}.

For positive integers $k$ , let

\beta_{k}=\frac{\theta^{1/2}\mu^{k-1}(\log N)^{2k-1}}{s^{k}}.
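To get a feeling for the sizes of these quantities, the following sketch computes $\Delta$, $\theta$, and $\beta_{k}$ for one set of parameters. The function names and the sample values $N=10^{8}$, $\eta=0.6$, $s=N^{-1/2}$, and $\mu=1/(N(\log N)^{2})$ are illustrative assumptions only.

```python
import math

def delta(eta: float) -> int:
    """Delta = floor(1/(1-eta)) + 1."""
    return math.floor(1.0 / (1.0 - eta)) + 1

def theta(N: int, mu: float, s: float) -> float:
    """theta = max(N*mu, 1/(N*s))."""
    return max(N * mu, 1.0 / (N * s))

def beta(k: int, N: int, mu: float, s: float) -> float:
    """beta_k = theta^(1/2) * mu^(k-1) * (log N)^(2k-1) / s^k."""
    return math.sqrt(theta(N, mu, s)) * mu ** (k - 1) * math.log(N) ** (2 * k - 1) / s ** k

# Assumed example parameters.
N, eta = 10**8, 0.6
s = N ** -0.5
mu = 1.0 / (N * math.log(N) ** 2)
print("Delta =", delta(eta))
for k in range(1, delta(eta)):
    print(k, beta(k, N, mu, s))
```

Consecutive values satisfy $\beta_{k+1}/\beta_{k}=\mu(\log N)^{2}/s$, which tends to zero under (1) and (2), so the bounds $\beta_{k}$ decrease rapidly in $k$.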

The following lemma is the key to the proof of our main results. Note that, although for the model described in the introduction, no individuals have mutations at time zero, we present the result here under a slightly more general initial condition, so that the lemma can be applied inductively.

Lemma 3.1.

Suppose $X_{k}(0)\leq\beta_{k}$ for $1\leq k\leq\Delta-1$ and $X_{k}(0)=0$ for $k\geq\Delta$ . Then the following hold:

1. For all $c>0$ , we have

\lim_{N\rightarrow\infty}P(N\mu sT_{1}>c)=e^{-c}.

2. There exists a positive constant $C$ such that

\lim_{N\rightarrow\infty}P\left(0\leq T_{1}^{\prime}-T_{1}<\frac{C\log N}{s}\right)=1.

3. We have

\lim_{N\rightarrow\infty}P(T_{1}^{\prime}<T_{k}\mbox{ for all }k\geq 2)=1.

4. We have

\lim_{N\rightarrow\infty}P(X_{\Delta+1}(t)=0\mbox{ for all }t\in[0,T_{1}^{\prime}])=1.

5. For all positive integers $k$ such that $2\leq k\leq\Delta$ , we have

\lim_{N\rightarrow\infty}P(X_{k}(T_{1}^{\prime})\leq\beta_{k-1})=1.

Part 1 of Lemma 3.1 shows that the number of type $1$ individuals reaches $(\log N)/s$ after a time which is approximately exponentially distributed with rate $N\mu s$ . Then part 2 of the lemma shows that the type 0 individuals completely disappear a short time later. Parts 3 and 4 show that type 0 individuals disappear before the number of type $k$ individuals reaches $(\log N)/s$ for any $k\geq 2$ , and before any individual acquires $\Delta+1$ mutations. Finally, part 5 of the lemma shows that at the time the type $0$ individuals disappear, there are at most $\beta_{k-1}$ individuals of type $k$ for $2\leq k\leq\Delta$ .

Sections 4, 5, and 6 are devoted to the proof of Lemma 3.1. In the rest of this section, we will show how to apply Lemma 3.1 inductively to obtain Lemma 3.2, and then use Lemma 3.2 to obtain Theorem 1.1 and Corollary 1.2. Let $\mathcal{F}_{t}$ be the $\sigma$ -field generated by the random variables $X_{k}(s)$ for nonnegative integers $k$ and $s\in[0,t]$ , so that $(\mathcal{F}_{t},t\geq 0)$ is the natural filtration associated with the population process. Note that this filtration implicitly depends on $N$ .

Lemma 3.2.

For all nonnegative integers $m$ , let $G_{m}$ be the event that $X_{m+k}(T_{m}^{\prime})\leq\beta_{k}$ for $1\leq k\leq\Delta-1$ and $X_{k}(T_{m}^{\prime})=0$ for $k\notin\{m,m+1,\dots,m+\Delta-1\}$ . Then for all positive integers $m$ , the following hold:

1. For all $c>0$ , we have

\lim_{N\rightarrow\infty}\big{|}P\big{(}N\mu s(T_{m}-T_{m-1}^{\prime})>c\,|\,\mathcal{F}_{T_{m-1}^{\prime}}\big{)}-e^{-c}\big{|}\mathds{1}_{G_{m-1}}=0.

2. There exists a positive constant $C$ such that

\lim_{N\rightarrow\infty}P\left(0\leq T_{m}^{\prime}-T_{m}<\frac{C\log N}{s}\right)=1.

3. We have

\lim_{N\rightarrow\infty}P(T_{m}^{\prime}<T_{m+k}\mbox{ for all }k\geq 1)=1.

4. We have

\lim_{N\rightarrow\infty}P(X_{m+\Delta}(t)=0\mbox{ for all }t\in[T_{m-1}^{\prime},T_{m}^{\prime}])=1.

5. We have

\lim_{N\rightarrow\infty}P(G_{m})=1.

Proof.

The result when $m=1$ is equivalent to Lemma 3.1. Suppose $m$ is a positive integer, and the result holds up to $m$ . Then we have $\lim_{N\rightarrow\infty}P(G_{m})=1$ , so we can work on the event $G_{m}$ . On the event $G_{m}$ , every individual in the population must have at least $m$ mutations from time $T_{m}^{\prime}$ onward. For $k\geq m$ and $t\geq T_{m}^{\prime}$ , on the event $G_{m}$ ,

b_{k}(t)=\frac{(1+s)^{k}(N-X_{k}(t))}{S(t)}=\frac{(1+s)^{k}(N-X_{k}(t))}{\sum_{j=m}^{\infty}(1+s)^{j}X_{j}(t)}=\frac{(1+s)^{k-m}(N-X_{k}(t))}{\sum_{i=0}^{\infty}(1+s)^{i}X_{i+m}(t)}

and

d_{k}(t)=1-\frac{(1+s)^{k}X_{k}(t)}{S(t)}+\mu=1-\frac{(1+s)^{k-m}X_{k}(t)}{\sum_{i=0}^{\infty}(1+s)^{i}X_{i+m}(t)}+\mu.

We can see from these formulas that the rates would be unchanged if $m$ were subtracted from the type of each individual, which is a consequence of the fact that subtracting $m$ from the type of each individual multiplies the fitness of each individual by $(1+s)^{-m}$ , without changing the relative fitnesses of the individuals. Therefore, we will shift the type of each individual down by $m$ , so that type $k$ individuals are relabeled as type $k-m$ .

After this relabeling of the types, on the event $G_{m}$ , the distribution of types at time $T_{m}^{\prime}$ satisfies the same conditions as the distribution of types at time zero in Lemma 3.1. Therefore, we can apply the strong Markov property at time $T_{m}^{\prime}$ , and after accounting for the relabeling of types, the five conclusions in Lemma 3.1 are equivalent to the five conclusions in Lemma 3.2 with $m+1$ in place of $m$ . Thus, the result holds for $m+1$ , and the lemma follows by induction. ∎

Proof of Theorem 1.1.

Fix a positive integer $K$ . It follows from part 5 of Lemma 3.2 that

\lim_{N\rightarrow\infty}P(G_{1}\cap\dots\cap G_{K-1})=1.

Therefore, by part 1 of Lemma 3.2, if $c_{1},\dots,c_{K}>0$ , then

\lim_{N\rightarrow\infty}P\big{(}N\mu s(T_{k}-T_{k-1}^{\prime})>c_{k}\mbox{ for }k=1,\dots,K\big{)}=\prod_{k=1}^{K}e^{-c_{k}}.

That is, we have $(N\mu s(T_{k}-T_{k-1}^{\prime}))_{k=1}^{K}\Rightarrow(\xi_{k})_{k=1}^{K}$ . Because part 2 of Lemma 3.2 and (1) imply that $N\mu s(T_{k-1}^{\prime}-T_{k-1})\rightarrow_{p}0$ as $N\rightarrow\infty$ for $k=1,\dots,K$ , the result (3) follows.

Next, note that part 5 of Lemma 3.2 implies that at time $T_{k}^{\prime}$ , with probability tending to one, all individuals have type at least $k$ and at most $k+\Delta-1$ . Parts 2 and 3 of Lemma 3.2 imply that with probability tending to one, we have $T_{k}^{\prime}<T_{k+1}\leq T_{k+1}^{\prime}<T_{k+2}\leq\dots<T_{k+\Delta}$ , so in particular before time $T_{k+1}$ , the number of individuals of type $j$ is less than $(\log N)/s$ for $j\in\{k+1,\dots,k+\Delta\}$ . Part 4 of Lemma 3.2 implies that with probability tending to one as $N\rightarrow\infty$ , no individual of type $k+\Delta+1$ appears before time $T_{k+1}^{\prime}$ . Putting together these observations, we conclude that

\lim_{N\rightarrow\infty}P\bigg{(}X_{k}(t)\geq N-\frac{\Delta\log N}{s}\mbox{ for all }t\mbox{ such that }T_{k}^{\prime}\leq t<T_{k+1}\bigg{)}=1.

In view of part 2 of Lemma 3.2, the result (4) follows with $C_{1}=C$ and $C_{2}=\Delta$ . The result (5) also follows from this same reasoning. ∎

Proof of Corollary 1.2.

For all $u\geq 0$ , let $V_{N}(u)=\sup\{k:T_{k,N}\leq u\}$ . It follows from (3) that the finite-dimensional distributions of the processes $(V_{N}(t/(N\mu s)),t\geq 0)$ converge as $N\rightarrow\infty$ to the finite-dimensional distributions of a homogeneous rate one Poisson process. Therefore, it suffices to show that for each fixed $t>0$ , we have

\left|V_{N}\left(\frac{t}{N\mu s}\right)-\overline{X}_{N}\left(\frac{t}{N\mu s}\right)\right|\rightarrow_{p}0\qquad\mbox{as }N\rightarrow\infty.

(10)

By (3), for any fixed $t>0$ and any $\delta>0$ ,

\limsup_{N\rightarrow\infty}P\left(\frac{t-\delta}{N\mu s}\leq T_{k}\leq\frac{t}{N\mu s}\>\mbox{ for some }k\right)\leq\delta.

Since $(\log N)/s\ll 1/(N\mu s)$ by (1), it follows from part 2 of Lemma 3.2 that for each fixed $t>0$ , we have

\lim_{N\rightarrow\infty}P\left(T_{k}\leq\frac{t}{N\mu s}<T_{k}^{\prime}\>\mbox{ for some }k\right)=0.

(11)

However, as long as, for all $u\in[T_{k}^{\prime},T_{k+1})$ , we have $X_{j}(u)<(\log N)/s$ for $j\in\{k+1,\dots,k+\Delta\}$ and $X_{j}(u)=0$ for all $j<k$ and $j>k+\Delta$ , an event which has probability tending to one as $N\rightarrow\infty$ by Lemma 3.2, we have

k\leq\overline{X}_{N}(u)\leq k+\frac{1}{N}\sum_{j=1}^{\Delta}jX_{k+j}(u)\leq k+\frac{\Delta(\Delta+1)}{2}\cdot\frac{\log N}{Ns}\qquad\mbox{for all }u\in[T_{k}^{\prime},T_{k+1}).

(12)

Because $(\log N)/(Ns)\rightarrow 0$ as $N\rightarrow\infty$ by (2), the result (10) follows from (11) and (12). ∎

4 Following the process until time $T_{1}$

In this section, we study the process between time zero and the time $T_{1}$ when the number of type 1 individuals reaches $(\log N)/s$ . By bounding the process $X_{1}$ from above and below by branching processes with immigration, we will show that $T_{1}$ is asymptotically exponentially distributed. We will also bound the processes $X_{k}$ from above to show that the number of individuals of type $2$ or higher stays small until after time $T_{1}$ .

4.1 Bounding the process $X_{k}$ from above by a branching process

For an interval $I\subseteq[0,\infty)$ , we define $X_{k,I}(t)$ to be the number of type $k$ individuals at time $t$ that descend from type $k$ immigrants who appear during the time interval $I$ . When $0\in I$ , descendants of type $k$ individuals that are in the population at time zero are included; recall that this matters because we are aiming to prove Lemma 3.1 under slightly more general initial conditions to facilitate the induction argument. Recall also that when determining which individuals are descended from a particular immigrant, we are ignoring events in which a type $k$ individual is replaced by another type $k$ individual. For $t\geq 0$ , define

\hat{m}_{1}(t)=\left(\frac{N\mu}{1-N^{\eta-1}\log N}\right)d_{1}(t),

(13)

and for each positive integer $k$ , define

\hat{b}_{k}(t)=(1+s)^{k}d_{k}(t).

(14)

Lemma 4.1.

The following statements hold.

1. For all positive integers $N$ and $k$ , we have $\hat{b}_{k}(t)\geq b_{k}(t)$ for all $t\geq 0$ .

2. For sufficiently large $N$ , we have $\hat{m}_{1}(t)\geq m_{1}(t)$ for all $t\in(0,T_{1})$ .

Proof.

By (8), (9) and (14), for every $t\geq 0$ and $k\geq 1$ ,

\begin{aligned}
\hat{b}_{k}(t)&\geq(1+s)^{k}\left(1-\frac{(1+s)^{k}X_{k}(t)}{S(t)}\right)\\
&=\frac{(1+s)^{k}\left(\sum_{i=0}^{k-1}(1+s)^{i}X_{i}(t)+\sum_{j=k+1}^{\infty}(1+s)^{j}X_{j}(t)\right)}{S(t)}\\
&\geq\frac{(1+s)^{k}\left(\sum_{i=0}^{k-1}X_{i}(t)+\sum_{j=k+1}^{\infty}X_{j}(t)\right)}{S(t)}\\
&=\frac{(1+s)^{k}(N-X_{k}(t))}{S(t)}\\
&=b_{k}(t),
\end{aligned}

which gives part 1 of the lemma. To prove part 2, note that from (9), when $t\in(0,T_{1})$ ,

d_{1}(t)\geq 1-\frac{(1+s)\log N}{S(t)s}+\mu\geq 1-\frac{(1+s)\log N}{Ns}.

Since $s\gg N^{-\eta}$ , for sufficiently large $N$ and $t\in(0,T_{1})$ , we have $d_{1}(t)\geq 1-N^{\eta-1}\log N.$ The second part of the lemma now follows from (13). ∎
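Part 1 of Lemma 4.1 is a deterministic inequality, so it can be sanity-checked on arbitrary configurations of type counts. The sketch below evaluates $b_{k}$, $d_{k}$, and $\hat{b}_{k}$ from (8), (9), and (14) on randomly generated configurations. The helper name `rates`, the random configurations, and the values of $s$ and $\mu$ are assumptions made for illustration; such a check is of course no substitute for the proof.

```python
import random

def rates(counts, k, s, mu):
    """Return (b_k, d_k, b_hat_k) for a configuration counts[j] = X_j."""
    N = sum(counts)
    S = sum((1.0 + s) ** j * x for j, x in enumerate(counts))   # total fitness (6)
    b_k = (N - counts[k]) * (1.0 + s) ** k / S                  # birth rate (8)
    d_k = 1.0 - (1.0 + s) ** k * counts[k] / S + mu             # death rate (9)
    b_hat_k = (1.0 + s) ** k * d_k                              # upper bound (14)
    return b_k, d_k, b_hat_k

rng = random.Random(1)
s, mu = 0.05, 1e-4                      # assumed parameter values
worst_gap = float("inf")
for _ in range(1000):
    counts = [rng.randint(0, 50) for _ in range(5)]
    counts[0] += 1                      # make sure the population is nonempty
    for k in range(1, 5):
        b_k, d_k, b_hat_k = rates(counts, k, s, mu)
        worst_gap = min(worst_gap, b_hat_k - b_k)
print(worst_gap)
```

The smallest observed value of $\hat{b}_{k}-b_{k}$ is positive, as expected, since the chain of inequalities in the proof gives $\hat{b}_{k}(t)-b_{k}(t)\geq(1+s)^{k}\mu$.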

For positive integers $k$ , let $M_{k}(t)$ be equal to $X_{k}(0)$ plus the number of times that a type $k-1$ individual mutates to type $k$ during the time interval $(0,t]$ . For $k\geq 2$ , define the stopping time

\tau_{k}=\inf\left\{t\geq 0:M_{k}(t)>\frac{\beta_{k-1}}{(\log N)^{1/2}}\right\}=\inf\left\{t\geq 0:M_{k}(t)>\frac{\theta^{1/2}\mu^{k-2}(\log N)^{2k-7/2}}{s^{k-1}}\right\}.

(15)

Because $\beta_{k-1}/(\log N)^{1/2}\gg\beta_{k}$ for all $k\geq 2$ , for sufficiently large $N$ we have $\tau_{k}>0$ .

Let $I_{1}=[0,T_{1}]$ , and for $k\geq 2$ , let $I_{k}=[0,\tau_{k})$ . For all positive integers $k$ , we now construct a new process $(\bar{Y}_{k,I_{k}}(t),t\geq 0)$ from the population process as follows.

1. Set $\bar{Y}_{k,I_{k}}(0)=0$ .

2. For $k=1$ and $t\in(0,T_{1})$ , the process $\bar{Y}_{1,I_{1}}$ jumps up by 1 due to immigration at rate $\hat{m}_{1}(t)-m_{1}(t)$ . For $k\geq 2$ , there is no immigration.

3. For all $t>0$ , the process $\bar{Y}_{k,I_{k}}$ jumps up by 1 due to births at rate

\bar{Y}_{k,I_{k}}(t)\hat{b}_{k}(t)+X_{k,I_{k}}(t)(\hat{b}_{k}(t)-b_{k}(t)).

4. For all $t>0$ , the process $\bar{Y}_{k,I_{k}}$ jumps down by 1 due to deaths at rate $\bar{Y}_{k,I_{k}}(t)d_{k}(t)$ .

Lemma 4.1 implies that the prescribed transition rates are nonnegative, so this process is well-defined. Also, once the process hits 0, it cannot jump down. Thus, $\bar{Y}_{k,I_{k}}(t)\geq 0$ for all $t\geq 0$ .

One can carry out this construction formally by defining homogeneous rate one Poisson processes $(N_{i}(t),t\geq 0)$ , $(N_{b,k}(t),t\geq 0)$ , and $(N_{d,k}(t),t\geq 0)$ which are independent of one another and of the population process, and then defining the process $\bar{Y}_{k,I_{k}}$ to satisfy

\begin{aligned}
\bar{Y}_{k,I_{k}}(t)&=N_{i}\left(\int_{0}^{t\wedge T_{1}}\big{(}\hat{m}_{1}(v)-m_{1}(v)\big{)}\>dv\right)\mathds{1}_{\{k=1\}}\\
&\qquad+N_{b,k}\left(\int_{0}^{t}\big{(}\bar{Y}_{k,I_{k}}(v)\hat{b}_{k}(v)+X_{k,I_{k}}(v)(\hat{b}_{k}(v)-b_{k}(v))\big{)}\>dv\right)\\
&\qquad\qquad-N_{d,k}\left(\int_{0}^{t}\bar{Y}_{k,I_{k}}(v)d_{k}(v)\>dv\right).
\end{aligned}

For other similar constructions in this paper, we will simply specify the jump rates without explicitly introducing the Poisson processes.

For all $t\geq 0$ , we define

Y_{k,I_{k}}(t)=X_{k,I_{k}}(t)+\bar{Y}_{k,I_{k}}(t).

Therefore, $Y_{k,I_{k}}(t)\geq X_{k,I_{k}}(t)$ for all $t\geq 0$ . Note that $Y_{k,I_{k}}(t)$ is a birth-death process with immigration with the following rates:

An immigrant appears in the process $Y_{1,I_{1}}$ when an immigrant appears in $X_{1,I_{1}}$ or $\bar{Y}_{1,I_{1}}$ . Therefore, immigrants appear in $Y_{1,I_{1}}$ between times $0$ and $T_{1}$ at rate

m_{1}(t)+(\hat{m}_{1}(t)-m_{1}(t))=\hat{m}_{1}(t).

For $k\geq 2$ , an immigrant appears in the process $Y_{k,I_{k}}$ when an immigrant appears in $X_{k,I_{k}}$ , which occurs only during the time interval $(0,\tau_{k})$ at rate $m_{k}(t)$ .

For all $t\geq 0$ , a birth occurs in the process $Y_{k,I_{k}}$ at rate

X_{k,I_{k}}(t)b_{k}(t)+\bar{Y}_{k,I_{k}}(t)\hat{b}_{k}(t)+X_{k,I_{k}}(t)(\hat{b}_{k}(t)-b_{k}(t))=Y_{k,I_{k}}(t)\hat{b}_{k}(t).

For all $t\geq 0$ , a death occurs in the process $Y_{k,I_{k}}$ at rate

X_{k,I_{k}}(t)d_{k}(t)+\bar{Y}_{k,I_{k}}(t)d_{k}(t)=Y_{k,I_{k}}(t)d_{k}(t).

We shall scale the time so that each individual after the time scaling gives birth at rate $(1+s)^{k}$ and dies at rate $1$ . For all positive integers $k$ and all $t\geq 0$ , define

\lambda_{k}(t)=\int_{0}^{t}d_{k}(v)\>dv,

(16)

and define $\tilde{Y}_{k,I_{k}}(t)=Y_{k,I_{k}}(\lambda_{k}^{-1}(t))$ . Then the process $(\tilde{Y}_{k,I_{k}}(t),t\geq 0)$ is a branching process with immigration with the following rates:

When $k=1$ , immigration occurs at time $t\in(0,\lambda_{1}(T_{1})]$ at rate

\hat{m}_{1}(\lambda_{1}^{-1}(t))\cdot(\lambda_{1}^{-1})^{\prime}(t)=\frac{\hat{m}_{1}(\lambda_{1}^{-1}(t))}{d_{1}(\lambda_{1}^{-1}(t))}=\frac{N\mu}{1-N^{\eta-1}\log N}.

When $k\geq 2$ , immigration occurs at a rate which is not constant in time and depends on how the population has evolved at earlier times.

Each individual produces an offspring at the rate

\hat{b}_{k}(\lambda_{k}^{-1}(t))\cdot(\lambda_{k}^{-1})^{\prime}(t)=\frac{\hat{b}_{k}(\lambda_{k}^{-1}(t))}{d_{k}(\lambda_{k}^{-1}(t))}=(1+s)^{k}.

Each individual dies at the rate

d_{k}(\lambda_{k}^{-1}(t))\cdot(\lambda_{k}^{-1})^{\prime}(t)=\frac{d_{k}(\lambda_{k}^{-1}(t))}{d_{k}(\lambda_{k}^{-1}(t))}=1.

4.2 An upper bound on $P(T_{1}\leq\frac{c}{N\mu s})$

We first record the following elementary result about branching processes, which follows from classical results on asymmetric random walks.

Lemma 4.2.

Consider a continuous-time branching process started from one individual in which each individual gives birth at rate $1+s$ and dies at rate $1$ . The probability that the branching process survives forever is $s/(1+s)$ , and the probability that it goes extinct is $1/(1+s)$ .
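The extinction probability in Lemma 4.2 can also be recovered numerically. In the embedded jump chain, each event affecting a given individual is a birth with probability $(1+s)/(2+s)$ and a death with probability $1/(2+s)$, so the extinction probability $q$ is the smallest solution in $[0,1]$ of $q=\frac{1}{2+s}+\frac{1+s}{2+s}q^{2}$. The sketch below finds this solution by fixed-point iteration started from $0$. The function name, the iteration count, and the value $s=0.1$ are illustrative assumptions.

```python
def extinction_probability(s: float, iterations: int = 5000) -> float:
    """Smallest fixed point in [0, 1] of q -> p_death + p_birth * q^2,
    obtained by iterating from q = 0."""
    p_birth = (1.0 + s) / (2.0 + s)   # next event is a birth (one individual becomes two)
    p_death = 1.0 / (2.0 + s)         # next event is a death
    q = 0.0
    for _ in range(iterations):
        q = p_death + p_birth * q * q
    return q

s = 0.1                               # assumed example value
q = extinction_probability(s)
print(q, 1.0 / (1.0 + s))             # extinction probability
print(1.0 - q, s / (1.0 + s))         # survival probability
```

Iterating from $0$ converges to the smallest fixed point, which is $1/(1+s)$; the other fixed point of the quadratic is $1$.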

Define $A_{1,X}$ and $A_{1,Y}$ to be the events that the processes $X_{1,I_{1}}$ and $Y_{1,I_{1}}$ go extinct, respectively.

Lemma 4.3.

We have

\lim_{N\rightarrow\infty}P(A_{1,Y})=0.

Proof.

On the event $A_{1,Y}$ , all families of individuals at time $T_{1}$ must go extinct. Since individuals in the branching process $\tilde{Y}_{1,I_{1}}$ give birth at rate $1+s$ and die at rate $1$ , the extinction probability of each family is $1/(1+s)$ . Also, at time $T_{1}$ , there are at least $(\log N)/s$ individuals in the process $Y_{1,I_{1}}$ because $X_{1}(T_{1})=X_{1,I_{1}}(T_{1})\leq Y_{1,I_{1}}(T_{1})$ . Hence,

P(A_{1,Y})\leq\left(\frac{1}{1+s}\right)^{\frac{\log N}{s}}=\left((1+s)^{-1/s}\right)^{\log N}.

As $N\rightarrow\infty$ , we have $s\rightarrow 0$ and $(1+s)^{-1/s}\rightarrow e^{-1}$ , which completes the proof. ∎
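For a sense of how quickly the bound in this proof decays, the following sketch evaluates $\left(\frac{1}{1+s}\right)^{(\log N)/s}$ and compares it with $1/N$. The choice $s=N^{-1/2}$ and the function name `extinction_bound` are assumptions for illustration only.

```python
import math

def extinction_bound(N: int, s: float) -> float:
    """The bound (1/(1+s))^(log N / s), computed through logarithms."""
    return math.exp(-(math.log(N) / s) * math.log1p(s))

for N in (10**4, 10**6, 10**8):
    s = N ** -0.5                     # assumed: moderate selection
    print(N, extinction_bound(N, s), 1.0 / N)
```

Since $\log(1+s)/s<1$ for $s>0$, the bound is slightly larger than $1/N$, and it is of order $1/N$ once $s\log N$ is small.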

Lemma 4.4.

For every constant $c>0$ ,

\limsup_{N\rightarrow\infty}P\left(T_{1}\leq\frac{c}{N\mu s}\right)\leq 1-e^{-c}.

Proof.

First, we show that

\limsup_{N\rightarrow\infty}P\left(A_{1,Y}^{c}\cap\left\{T_{1}\leq\frac{c}{N\mu s}\right\}\right)\leq 1-e^{-c}.

(17)

For $t\geq 0$ , let $\tilde{M}_{1,s}(t)$ be the number of immigrants in the process $\tilde{Y}_{1,I_{1}}$ that appear in the time interval $(0,t]$ and whose families do not go extinct. In the process $\tilde{Y}_{1,I_{1}}$ , immigrants appear at rate $N\mu/(1-N^{\eta-1}\log N)$ until the time $\lambda_{1}(T_{1})$ . The family of each immigrant has extinction probability $1/(1+s)$ . Hence, the first immigrant whose family does not go extinct appears at rate

\frac{N\mu}{1-N^{\eta-1}\log N}\cdot\frac{s}{1+s}.

Note that $A_{1,Y}^{c}\cap\{T_{1}\leq\frac{c}{N\mu s}\}\subseteq\{\tilde{M}_{1,s}(\lambda_{1}(T_{1}\wedge\frac{c}{N\mu s}))>0\}$ . Also, by (9) and (16),

\lambda_{1}\left(T_{1}\wedge\frac{c}{N\mu s}\right)=\int_{0}^{T_{1}\wedge\frac{c}{N\mu s}}d_{1}(v)\>dv\leq\int_{0}^{\frac{c}{N\mu s}}(1+\mu)\>dv=\frac{(1+\mu)c}{N\mu s}.

(18)

Hence,

	$\displaystyle P\left(A_{1,Y}^{c}\cap\left\{T_{1}\leq\frac{c}{N\mu s}\right\}\right)$	$\displaystyle\leq 1-\exp\left(-\frac{N\mu}{1-N^{\eta-1}\log N}\cdot\frac{s}{1+s}\cdot\frac{(1+\mu)c}{N\mu s}\right)$
		$\displaystyle=1-\exp\left(-\frac{(1+\mu)c}{(1-N^{\eta-1}\log N)(1+s)}\right).$

We obtain the inequality (17) by taking the $\limsup$ of both sides and using that $\mu\rightarrow 0$ , $s\rightarrow 0$ , and $N^{\eta-1}\log N\rightarrow 0$ as $N\rightarrow\infty$ . Next, note that

P\left(T_{1}\leq\frac{c}{N\mu s}\right)\leq P\left(A_{1,Y}^{c}\cap\left\{T_{1}\leq\frac{c}{N\mu s}\right\}\right)+P(A_{1,Y}).

Thus, the result of this lemma follows by Lemma 4.3 and (17). ∎
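To see how quickly the finite-$N$ bound in this proof approaches $1-e^{-c}$ , the following Python sketch evaluates it numerically. The parameter regime ( $\eta=1/2$ , $s=N^{-1/4}$ , $\mu=1/(N(\log N)^{2})$ ) and the function name `lemma_4_4_bound` are assumptions chosen only for illustration.

```python
import math

def lemma_4_4_bound(N, mu, s, eta, c):
    """Evaluate 1 - exp(-(1+mu) c / ((1 - N^(eta-1) log N)(1+s)))."""
    shrink = 1.0 - N ** (eta - 1.0) * math.log(N)
    return 1.0 - math.exp(-(1.0 + mu) * c / (shrink * (1.0 + s)))

c, eta = 1.0, 0.5
for N in (1e4, 1e8, 1e12):
    mu = 1.0 / (N * math.log(N) ** 2)   # illustrative: mu << 1/(N log N)
    s = N ** -0.25                       # illustrative: N^(-eta) << s << 1
    print(f"N={N:.0e}  bound={lemma_4_4_bound(N, mu, s, eta, c):.5f}  "
          f"limit={1 - math.exp(-c):.5f}")
```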

4.3 Finite and infinite lines of descent

In this subsection, we will use the fact that a branching process that is conditioned to go extinct is still a branching process. Let $(Y(t),t\geq 0)$ be a branching process with $Y(0)=1$ . Let $f(x)$ be the generating function of the offspring distribution of $Y$ . Let $b^{-1}$ be the mean lifetime of an individual in the process $Y$ . We define $u(x)=b(f(x)-x)$ .

An individual in the branching process $Y$ is said to have a finite line of descent if the family of this particular individual goes extinct; otherwise, it is said to have an infinite line of descent. Let $Y^{(F)}(t)$ be the number of individuals at time $t$ that have a finite line of descent, and let $Y^{(I)}(t)$ be the number of individuals at time $t$ that have an infinite line of descent. Gadag and Rajarshi [8] showed that $((Y^{(F)}(t),Y^{(I)}(t)),t\geq 0)$ is a two-type Markov branching process. Let $f^{(F)}(x,y)=\sum_{i=0}^{\infty}\sum_{j=0}^{\infty}p_{ij}^{(F)}x^{i}y^{j}$ where $p_{ij}^{(F)}$ is the probability that an individual with a finite line of descent has $i$ offspring with a finite line of descent and $j$ offspring with an infinite line of descent. Let $f^{(I)}(x,y)=\sum_{i=0}^{\infty}\sum_{j=0}^{\infty}p_{ij}^{(I)}x^{i}y^{j}$ where $p_{ij}^{(I)}$ is the probability that an individual with an infinite line of descent has $i$ offspring with a finite line of descent and $j$ offspring with an infinite line of descent. Also, define $u^{(F)}(x,y)=b(f^{(F)}(x,y)-x)$ and $u^{(I)}(x,y)=b(f^{(I)}(x,y)-y)$ . Gadag and Rajarshi [8] also showed that

u^{(F)}(x,y)=\frac{u(qx)}{q}

(19)

and

u^{(I)}(x,y)=\frac{u(qx+(1-q)y)-u(qx)}{1-q}

(20)

where $q$ is the extinction probability of the branching process $Y$ .

We will apply the following result to immigrant families in the branching process $\tilde{Y}_{1,I_{1}}$ .

Lemma 4.5.

Let $(Y(t),t\geq 0)$ be a continuous-time branching process with $Y(0)=1$ such that each individual gives birth at rate $1+s$ and dies at rate $1$ . Let $A$ be the event that the process goes extinct. Then

E\bigg{[}\int_{0}^{\infty}Y(t)\>dt\Big{|}A\bigg{]}=\frac{1}{s}.

Proof.

We have

u(x)=1+(1+s)x^{2}-(2+s)x.

By Lemma 4.2, the extinction probability is $1/(1+s)$ , which is also the smallest non-negative root of $u(x)$ . Thus,

u^{(F)}(x,y)=\frac{u(x/(1+s))}{1/(1+s)}=x^{2}+(1+s)-(2+s)x

and

u^{(I)}(x,y)=\frac{u((x+sy)/(1+s))-u(x/(1+s))}{s/(1+s)}=sy^{2}+2xy-(2+s)y.

(21)

The coefficients of $u^{(F)}(x,y)$ tell us that an individual with a finite line of descent gives birth at rate $1$ and dies at rate $1+s$ . Hence, for all $t>0$ , we have

E[Y(t)|A]=e^{(1-(1+s))t}=e^{-st}.

The result follows by integrating over $t$ . ∎
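The algebra in this proof can be checked numerically. The Python sketch below evaluates the right-hand sides of (19) and (20) with $q=1/(1+s)$ and compares them with the closed forms stated in the proof; the function names and the test points are illustrative assumptions.

```python
import math

def u(x, s):
    """u(x) = 1 + (1+s) x^2 - (2+s) x for birth rate 1+s and death rate 1."""
    return 1.0 + (1.0 + s) * x * x - (2.0 + s) * x

def u_finite(x, s):
    """Right-hand side of (19) with q = 1/(1+s)."""
    q = 1.0 / (1.0 + s)
    return u(q * x, s) / q

def u_infinite(x, y, s):
    """Right-hand side of (20) with q = 1/(1+s)."""
    q = 1.0 / (1.0 + s)
    return (u(q * x + (1.0 - q) * y, s) - u(q * x, s)) / (1.0 - q)

s = 0.05
for x, y in [(0.0, 0.3), (0.4, 0.9), (1.0, 1.0)]:
    closed_f = x * x + (1.0 + s) - (2.0 + s) * x          # birth rate 1, death rate 1+s
    closed_i = s * y * y + 2.0 * x * y - (2.0 + s) * y    # right-hand side of (21)
    assert math.isclose(u_finite(x, s), closed_f, abs_tol=1e-12)
    assert math.isclose(u_infinite(x, y, s), closed_i, abs_tol=1e-9)

# q = 1/(1+s) is a root of u, matching the extinction probability used in the proof.
assert abs(u(1.0 / (1.0 + s), s)) < 1e-12
```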

Remark 4.6.

If instead each individual gives birth at rate $(1+s)^{k}$ and dies at rate one, then we can apply Lemma 4.5 with $(1+s)^{k}-1$ in place of $s$ to get

E\bigg{[}\int_{0}^{\infty}Y(t)\>dt\Big{|}A\bigg{]}=\frac{1}{(1+s)^{k}-1}\leq\frac{1}{sk}.

Remark 4.7.

Equation (21) shows that an individual with an infinite line of descent gives birth to another individual with an infinite line of descent at rate $s$ . Therefore, conditional on $A^{c}$ , the number of individuals with an infinite line of descent is a Yule process with birth rate $s$ .

4.4 The number of type $k$ immigrants

Lemma 4.8.

For every constant $c>0$ ,

\lim_{N\rightarrow\infty}P\left(M_{2}\left(T_{1}\wedge\frac{c}{N\mu s}\right)\leq\frac{\theta^{3/4}(\log N)^{3/4}}{s}\right)=1

(22)

and

\lim_{N\rightarrow\infty}P\left(T_{1}\wedge\frac{c}{N\mu s}<\tau_{2}\right)=1.

(23)

Proof.

For $t\in[0,T_{1})$ , we have $X_{1}(t)\leq(\log N)/s$ . Because $M_{2}(0)\leq\beta_{2}$ and each type 1 individual can mutate to type 2 at rate $\mu$ , it follows that

E\left[M_{2}\left(T_{1}\wedge\frac{c}{N\mu s}\right)\right]\leq\beta_{2}+\frac{\log N}{s}\cdot\mu\cdot\frac{c}{N\mu s}=\frac{\theta^{1/2}\mu(\log N)^{3}}{s^{2}}+\frac{c\log N}{Ns^{2}}.

Therefore, using Markov’s inequality, and then using $\theta\geq N\mu$ to bound the first term and $\theta\geq\frac{1}{Ns}$ to bound the second term, we get

	$\displaystyle P\left(M_{2}\left(T_{1}\wedge\frac{c}{N\mu s}\right)>\frac{\theta^{3/4}(\log N)^{3/4}}{s}\right)$	$\displaystyle\leq\frac{\mu(\log N)^{9/4}}{\theta^{1/4}s}+\frac{c(\log N)^{1/4}}{\theta^{3/4}Ns}$
		$\displaystyle\leq\frac{(N\mu)^{3/4}(\log N)^{9/4}}{Ns}+c\left(\frac{\log N}{Ns}\right)^{1/4}.$

Because $N\mu\rightarrow 0$ and $(\log N)^{a}/(Ns)\rightarrow 0$ as $N\rightarrow\infty$ for all $a>0$ , the result (22) follows.

Next, note that

\theta\log N<\mu N\log N+\frac{\log N}{Ns}\ll 1.

(24)

In particular, we have $\theta^{3/4}(\log N)^{3/4}\ll\theta^{1/2}(\log N)^{1/2}$ . Therefore, for sufficiently large $N$ ,

\left\{M_{2}\left(T_{1}\wedge\frac{c}{N\mu s}\right)\leq\frac{\theta^{3/4}(\log N)^{3/4}}{s}\right\}\subseteq\left\{\tau_{2}>T_{1}\wedge\frac{c}{N\mu s}\right\},

which gives (23). ∎

For each positive integer $k\geq 2$ , let $A_{k,X}$ and $A_{k,Y}$ be the events that the processes $X_{k,I_{k}}$ and $Y_{k,I_{k}}$ go extinct, respectively. Note that $A_{k,Y}\subseteq A_{k,X}$ since $X_{k,I_{k}}(t)\leq Y_{k,I_{k}}(t)$ for all $t\geq 0$ .

Lemma 4.9.

For every positive integer $k\geq 2$ ,

\lim_{N\rightarrow\infty}\frac{\theta^{1/2}\mu^{k-2}(\log N)^{2k-7/2}}{s^{k-2}}=0.

Proof.

Recall that $\theta=N\mu\vee\frac{1}{Ns}$ . Using that $\mu\ll 1/(N\log N)$ , we have

\frac{(N\mu)^{1/2}\mu^{k-2}(\log N)^{2k-7/2}}{s^{k-2}}=\frac{N^{1/2}\mu^{k-3/2}(\log N)^{2k-7/2}}{s^{k-2}}\ll\left(\frac{\log N}{Ns}\right)^{k-2}\leq 1.

Likewise, we have

\frac{\mu^{k-2}(\log N)^{2k-7/2}}{(Ns)^{1/2}s^{k-2}}=\frac{\mu^{k-2}(\log N)^{2k-7/2}}{N^{1/2}s^{k-3/2}}\leq\left(\frac{\log N}{Ns}\right)^{k-3/2}\ll 1.

The result follows. ∎

Lemma 4.10.

For every positive integer $k\geq 2$ ,

\lim_{N\rightarrow\infty}P(A_{k,Y})=1.

Proof.

In the process $Y_{k,I_{k}}$ , by Lemma 4.2, the family of each immigrant in the process $Y_{k,I_{k}}$ goes extinct with probability $1/(1+s)^{k}$ . By the definition of $\tau_{k}$ in (15), there are at most $\beta_{k-1}/(\log N)^{1/2}$ immigrants during the time interval $[0,\tau_{k})$ . Then,

P(A_{k,Y})\geq\left(\frac{1}{(1+s)^{k}}\right)^{\beta_{k-1}/(\log N)^{1/2}}=\left[(1+s)^{1/s}\right]^{-ks\beta_{k-1}/(\log N)^{1/2}}.

The result follows because $(1+s)^{1/s}\rightarrow e$ as $N\rightarrow\infty$ and $s\beta_{k-1}/(\log N)^{1/2}\rightarrow 0$ as $N\rightarrow\infty$ by Lemma 4.9. ∎
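For a rough sense of scale, the quantity in Lemma 4.9 and the lower bound on $P(A_{k,Y})$ from this proof can be evaluated numerically. The Python sketch below uses $\theta=N\mu\vee\frac{1}{Ns}$ and the identity $s\beta_{k-1}/(\log N)^{1/2}=\theta^{1/2}\mu^{k-2}(\log N)^{2k-7/2}/s^{k-2}$ read off from the displays in this section; the parameter values and function names are illustrative assumptions.

```python
import math

def theta(N, mu, s):
    """theta = max(N mu, 1/(N s))."""
    return max(N * mu, 1.0 / (N * s))

def lemma_4_9_quantity(N, mu, s, k):
    """theta^(1/2) mu^(k-2) (log N)^(2k - 7/2) / s^(k-2), for k >= 2."""
    L = math.log(N)
    return math.sqrt(theta(N, mu, s)) * (mu / s) ** (k - 2) * L ** (2 * k - 3.5)

def extinction_lower_bound(N, mu, s, k):
    """Lower bound (1+s)^(-k beta_{k-1} / (log N)^(1/2)) on P(A_{k,Y})."""
    # beta_{k-1} / (log N)^(1/2) equals the Lemma 4.9 quantity divided by s.
    n_immigrants = lemma_4_9_quantity(N, mu, s, k) / s
    return math.exp(-k * n_immigrants * math.log1p(s))

N, mu, s = 1e12, 1e-16, 1e-3   # illustrative values with mu << 1/(N log N)
for k in (2, 3):
    print(k, lemma_4_9_quantity(N, mu, s, k), extinction_lower_bound(N, mu, s, k))
```

The convergence for $k=2$ is driven only by $\theta\log N\rightarrow 0$ , so it can be slow for moderate $N$ .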

Let

T_{k}^{*}=\inf\{t\geq 0:X_{k}(t)>\beta_{k-1}\}.

Lemma 4.11.

For every positive integer $k\geq 2$ , we have

\lim_{N\rightarrow\infty}P(\tau_{k}\leq T_{k}^{*})=\lim_{N\rightarrow\infty}P(\tau_{k}\leq T_{k})=1.

Proof.

By the assumptions on $\mu$ and $s$ , we have $T_{k}^{*}<T_{k}$ for sufficiently large $N$ . Therefore, it suffices to show that $\lim_{N\rightarrow\infty}P(\tau_{k}\leq T_{k}^{*})=1$ . By applying the result of Remark 4.6 to each immigrant family, we obtain

E\bigg{[}\bigg{(}\int_{0}^{\infty}\tilde{Y}_{k,I_{k}}(t)\>dt\bigg{)}\mathds{1}_{A_{k,Y}}\bigg{]}\leq\frac{\theta^{1/2}\mu^{k-2}(\log N)^{2k-7/2}}{s^{k-1}}\cdot\frac{1}{(1+s)^{k}-1}.

(25)

On the event $T_{k}^{*}<\tau_{k}$ , we have

\tilde{Y}_{k,I_{k}}(\lambda_{k}(T_{k}^{*}))=Y_{k,I_{k}}(T_{k}^{*})\geq\frac{\theta^{1/2}\mu^{k-2}(\log N)^{2k-3}}{s^{k-1}}.

Therefore, applying the strong Markov property at time $\lambda_{k}(T_{k}^{*})$ and Remark 4.6, we get

E\bigg{[}\int_{0}^{\infty}\tilde{Y}_{k,I_{k}}(t)\>dt\Big{|}A_{k,Y}\cap\{T_{k}^{*}<\tau_{k}\}\bigg{]}\geq\frac{\theta^{1/2}\mu^{k-2}(\log N)^{2k-3}}{s^{k-1}}\cdot\frac{1}{(1+s)^{k}-1}.

(26)

Note that if $Z$ is a nonnegative random variable and $A$ and $B$ are events, then

P(A\cap B)=\frac{E[Z\mathds{1}_{A\cap B}]}{E[Z|A\cap B]}\leq\frac{E[Z\mathds{1}_{A}]}{E[Z|A\cap B]}.

Applying this result to (25) and (26), we get

P(A_{k,Y}\cap\{T_{k}^{*}<\tau_{k}\})\leq\frac{1}{(\log N)^{1/2}}.

Because $\lim_{N\rightarrow\infty}P(A_{k,Y})=1$ by Lemma 4.10, it follows that $\lim_{N\rightarrow\infty}P(T_{k}^{*}<\tau_{k})=0$ , which implies the result. ∎

Lemma 4.12.

For every positive integer $k\geq 2$ , we have

\lim_{N\rightarrow\infty}P(\tau_{k}<\tau_{k+1})=1.

Proof.

Define the stopping time

\zeta_{k,Y}=\inf\left\{t\geq 0:\int_{0}^{t}\tilde{Y}_{k,I_{k}}(u)\>du>\frac{\beta_{k-1}}{s}\right\}.

It follows from (25) and the fact that $(1+s)^{k}-1\geq sk$ that

E\bigg{[}\bigg{(}\int_{0}^{\infty}\tilde{Y}_{k,I_{k}}(t)\>dt\bigg{)}\mathds{1}_{A_{k,Y}}\bigg{]}\leq\frac{\theta^{1/2}\mu^{k-2}(\log N)^{2k-7/2}}{ks^{k}}.

Therefore, by Markov’s inequality,

P(\zeta_{k,Y}<\infty)\leq P(A_{k,Y}^{c})+\frac{1}{k(\log N)^{1/2}},

which implies that

\lim_{N\rightarrow\infty}P(\zeta_{k,Y}=\infty)=1.

(27)

When $t\in[0,T_{k})$ ,

d_{k}(t)=1-\frac{(1+s)^{k}X_{k}(t)}{S(t)}+\mu\geq 1-\frac{(1+s)^{k}\log N}{Ns}.

Because $N^{-\eta}\ll s\ll 1$ , it follows that for sufficiently large $N$ , we have $d_{k}(t)\geq 1/2$ for $t\in[0,T_{k})$ , and therefore $\lambda_{k}^{\prime}(t)\geq 1/2$ for $t\in[0,T_{k})$ . Therefore, for $t\in[0,T_{k})$ ,

\int_{0}^{t}X_{k,I_{k}}(u)\>du\leq\int_{0}^{t}Y_{k,I_{k}}(u)\>du=\int_{0}^{t}\tilde{Y}_{k,I_{k}}(\lambda_{k}(u))\>du\leq 2\int_{0}^{\lambda_{k}(t)}\tilde{Y}_{k,I_{k}}(v)\>dv.

Therefore, if we let

\zeta_{k,X}=\inf\left\{t\geq 0:\int_{0}^{t}X_{k,I_{k}}(u)\>du>\frac{2\beta_{k-1}}{s}\right\},

then $\zeta_{k,X}\geq T_{k}$ on the event $\zeta_{k,Y}=\infty$ . It follows from (27) and Lemma 4.11 that

\lim_{N\rightarrow\infty}P(\tau_{k}\leq\zeta_{k,X})=1.

(28)

Because each individual acquires mutations at rate $\mu$ , we have

E[M_{k+1}(\zeta_{k,X}\wedge\tau_{k})]\leq M_{k+1}(0)+\frac{2\mu\beta_{k-1}}{s}\leq\beta_{k+1}+\frac{2\mu\beta_{k-1}}{s}.

Because $\beta_{k+1}\ll\beta_{k}/(\log N)^{1/2}$ and $\mu\beta_{k-1}/s\ll\beta_{k}/(\log N)^{1/2}$ , it follows from Markov’s inequality that

\lim_{N\rightarrow\infty}P\left(M_{k+1}(\zeta_{k,X}\wedge\tau_{k})>\frac{\beta_{k}}{(\log N)^{1/2}}\right)=0,

and therefore

\lim_{N\rightarrow\infty}P(\tau_{k+1}\leq\zeta_{k,X}\wedge\tau_{k})=0.

(29)

The result follows from (28) and (29). ∎

Recall that $\Delta=\lfloor 1/(1-\eta)\rfloor+1$ and $\theta=N\mu\vee\frac{1}{Ns}$ . By (15),

\tau_{\Delta+1}=\inf\left\{t\geq 0:M_{\Delta+1}(t)>\frac{\theta^{1/2}\mu^{\Delta-1}(\log N)^{2\Delta-3/2}}{s^{\Delta}}\right\}.

Since $\mu\ll 1/(N\log N)$ and $s\gg N^{-\eta}$ , we have $\theta\ll 1$ and

\frac{\mu^{\Delta-1}(\log N)^{2\Delta-3/2}}{s^{\Delta}}\ll\frac{(\log N)^{\Delta-\frac{1}{2}}}{N^{\Delta(1-\eta)-1}}\ll 1.

Therefore, for sufficiently large $N$ ,

\tau_{\Delta+1}=\inf\{t\geq 0:M_{\Delta+1,X}(t)\geq 1\}.

(30)

That is, $\tau_{\Delta+1}$ is the first time that an individual of type $\Delta+1$ appears in the population.
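The choice $\Delta=\lfloor 1/(1-\eta)\rfloor+1$ is what makes the exponent $\Delta(1-\eta)-1$ of $N$ in the bound above strictly positive. A minimal Python sketch of this arithmetic, using exact rationals to avoid floating-point issues in the floor (the function name `delta` and the sample values of $\eta$ are illustrative assumptions):

```python
from fractions import Fraction
import math

def delta(eta: Fraction) -> int:
    """Delta = floor(1/(1-eta)) + 1 for a rational eta in (0, 1)."""
    return math.floor(1 / (1 - eta)) + 1

for eta in (Fraction(1, 10), Fraction(1, 2), Fraction(3, 4), Fraction(9, 10)):
    d = delta(eta)
    exponent = d * (1 - eta) - 1   # exponent of N in the denominator; must be > 0
    print(f"eta={eta}  Delta={d}  Delta*(1-eta)-1={exponent}")
```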

Let

T^{(1)}=T_{2}\wedge T_{3}\wedge...\wedge T_{\Delta}\wedge\tau_{\Delta+1}.

By combining Lemmas 4.8, 4.11, and 4.12, we obtain the following result.

Lemma 4.13.

For every constant $c>0$ , we have

\lim_{N\rightarrow\infty}P\left(T_{1}\wedge\frac{c}{N\mu s}<\tau_{2}\leq T^{(1)}\right)=1.

4.5 Bounding the process $X_{1}$ from below by a branching process

Let $\alpha\in(0,1)$ , $\gamma\in(0,1)$ , and $\zeta\in(0,1)$ be constants. Define

T_{k,\alpha}=\inf\left\{t\geq 0:X_{k}(t)>\alpha N\right\}.

Lemma 4.14.

For sufficiently large $N$ , we have

\frac{b_{1}(t)}{d_{1}(t)}\geq 1+\gamma s\quad\mbox{for all }t\in[0,T_{1,\alpha}\wedge T^{(1)}).

Proof.

From (8) and (9),

\frac{b_{1}(t)}{d_{1}(t)}=\frac{(1+s)(N-X_{1}(t))}{(1+\mu)S(t)-(1+s)X_{1}(t)}.

By (30), if $N$ is sufficiently large, then during the time interval $[0,\tau_{\Delta+1})$ , type $\Delta+1$ has never appeared in the population. Hence, when $t\in[0,\tau_{\Delta+1})$ ,

S(t)=X_{0}(t)+(1+s)X_{1}(t)+...+(1+s)^{\Delta}X_{\Delta}(t).

Therefore,

	$\displaystyle(1+\mu)S(t)-(1+s)X_{1}(t)$
	$\displaystyle\qquad=(1+\mu)[X_{0}(t)+(1+s)X_{1}(t)+...+(1+s)^{\Delta}X_{\Delta}(t)]-(1+s)X_{1}(t)$
	$\displaystyle\qquad=(1+\mu)[X_{0}(t)+(1+s)^{2}X_{2}(t)+...+(1+s)^{\Delta}X_{\Delta}(t)]+\mu(1+s)X_{1}(t).$

Because $s\rightarrow 0$ as $N\rightarrow\infty$ , it is not difficult to show that for all $k$ , we have $(1+s)^{k}\leq 1+2ks$ for sufficiently large $N$ . Therefore, when $t\in[0,T^{(1)})$ ,

	$\displaystyle(1+\mu)S(t)-(1+s)X_{1}(t)$
	$\displaystyle\quad\leq(1+\mu)[X_{0}(t)+(1+4s)X_{2}(t)+...+(1+2\Delta s)X_{\Delta}(t)]+\mu(1+s)X_{1}(t)$
	$\displaystyle\quad=(1+\mu)[X_{0}(t)+X_{1}(t)+X_{2}(t)+...+X_{\Delta}(t)]+2s(1+\mu)\bigg{(}\sum_{j=2}^{\Delta}jX_{j}(t)\bigg{)}-(1-\mu s)X_{1}(t)$
	$\displaystyle\quad=(1+\mu)N+2s(1+\mu)\bigg{(}\sum_{j=2}^{\Delta}jX_{j}(t)\bigg{)}-(1-\mu s)X_{1}(t)$
	$\displaystyle\quad\leq(1+\mu)N+2s(1+\mu)\bigg{(}\sum_{j=2}^{\Delta}j\cdot\frac{\log N}{s}\bigg{)}-(1-\mu s)X_{1}(t)$
	$\displaystyle\quad\leq(1+\mu)N+(1+\mu)(\Delta+1)^{2}\log N-(1-\mu s)X_{1}(t).$

Thus, when $t\in[0,T^{(1)})$ , for sufficiently large $N$ ,

\frac{b_{1}(t)}{d_{1}(t)}\geq\frac{(1+s)(N-X_{1}(t))}{(1+\mu)(N+(\Delta+1)^{2}\log N)-(1-\mu s)X_{1}(t)}.

For all positive real numbers $a,b,c$ and $d$ such that $ac<b$ , the function $f(x)=d(a-x)/(b-cx)$ is decreasing on the interval $[0,a]$ because $f^{\prime}(x)=d(ac-b)/(b-cx)^{2}<0$ for all $x\in(0,a)$ . Therefore, for sufficiently large $N$ , when $t\in[0,T_{1,\alpha}\wedge T^{(1)})$ ,

\frac{b_{1}(t)}{d_{1}(t)}\geq\frac{(1+s)(N-\alpha N)}{(1+\mu)(N+(\Delta+1)^{2}\log N)-(1-\mu s)\alpha N}=\frac{(1+s)(1-\alpha)}{(1+\mu)\left(1+\frac{(\Delta+1)^{2}\log N}{N}\right)-(1-\mu s)\alpha}.

Note that when $N\rightarrow\infty$ , because $(\log N)/(Ns)\rightarrow 0$ and $\mu\ll s\ll 1$ ,

	$\displaystyle\frac{1}{s}\left[\frac{(1+s)(1-\alpha)}{(1+\mu)\left(1+\frac{(\Delta+1)^{2}\log N}{N}\right)-(1-\mu s)\alpha}-1\right]$
	$\displaystyle\qquad\qquad=\frac{1}{s}\left[\frac{s-s(1+\mu)\alpha-\frac{(\Delta+1)^{2}\log N}{N}-\mu\left(1+\frac{(\Delta+1)^{2}\log N}{N}\right)}{(1+\mu)\left(1+\frac{(\Delta+1)^{2}\log N}{N}\right)-(1-\mu s)\alpha}\right]$
	$\displaystyle\qquad\qquad=\frac{1-(1+\mu)\alpha-\frac{(\Delta+1)^{2}\log N}{Ns}-\frac{\mu}{s}\left(1+\frac{(\Delta+1)^{2}\log N}{N}\right)}{(1+\mu)\left(1+\frac{(\Delta+1)^{2}\log N}{N}\right)-(1-\mu s)\alpha}$
	$\displaystyle\qquad\qquad\rightarrow 1.$

Therefore, for sufficiently large $N$ ,

\frac{1}{s}\left[\frac{(1+s)(1-\alpha)}{(1+\mu)\left(1+\frac{(\Delta+1)^{2}\log N}{N}\right)-(1-\mu s)\alpha}-1\right]\geq\gamma

and thus $b_{1}(t)/d_{1}(t)\geq 1+\gamma s$ . ∎
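The limit used at the end of this proof can be illustrated numerically. The Python sketch below evaluates the bracketed expression divided by $s$ for growing $N$ ; the regime ( $\eta=1/2$ so that $\Delta=3$ , $s=N^{-1/4}$ , $\mu=1/(N(\log N)^{2})$ , $\alpha=1/2$ ) and the function name `scaled_excess` are assumptions made only for this example.

```python
import math

def scaled_excess(N, mu, s, alpha, Delta):
    """(1/s) * [ (1+s)(1-alpha) / ((1+mu)(1+L) - (1-mu s) alpha) - 1 ],
    where L = (Delta+1)^2 log N / N."""
    L = (Delta + 1) ** 2 * math.log(N) / N
    ratio = (1 + s) * (1 - alpha) / ((1 + mu) * (1 + L) - (1 - mu * s) * alpha)
    return (ratio - 1) / s

alpha, Delta = 0.5, 3
for N in (1e4, 1e8, 1e12):
    mu = 1.0 / (N * math.log(N) ** 2)   # illustrative: mu << 1/(N log N)
    s = N ** -0.25                       # illustrative: N^(-1/2) << s << 1
    print(f"N={N:.0e}  scaled excess={scaled_excess(N, mu, s, alpha, Delta):.6f}")
```

The printed values increase toward $1$ , so they eventually exceed any fixed $\gamma\in(0,1)$ .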

Lemma 4.15.

For sufficiently large $N$ , we have $X_{0}(t)\geq(1-\zeta)Nd_{1}(t)$ for all $t\in[0,T_{1}\wedge T^{(1)})$ .

Proof.

From the definition of $d_{1}$ in (9), $d_{1}(t)\leq 1+\mu$ for all $t\geq 0$ . Also, when $t\in[0,T_{1}\wedge T^{(1)})$ , by (30), for sufficiently large $N$ ,

X_{0}(t)=N-\sum_{i=1}^{\Delta}X_{i}(t)\geq N-\frac{\Delta\log N}{s}.

Hence,

\frac{X_{0}(t)}{d_{1}(t)}\geq\frac{N-\frac{\Delta\log N}{s}}{1+\mu}=\frac{N\left(1-\frac{\Delta\log N}{Ns}\right)}{1+\mu}.

Since $(\log N)/(Ns)\rightarrow 0$ and $\mu\rightarrow 0$ as $N\rightarrow\infty$ , we have $X_{0}(t)/d_{1}(t)\geq(1-\zeta)N$ for all $t\in[0,T_{1}\wedge T^{(1)})$ if $N$ is sufficiently large. ∎

We now construct a new birth-death process with immigration called $Z_{1}$ which will bound the process $X_{1}$ from below. We set $Z_{1}(0)=X_{1}(0)$ .

1.
At time $t\in(0,T_{1}\wedge T^{(1)}]$ , if a birth occurs in $X_{1}$ , then
- •
  
  with probability $\frac{(1+\gamma s)Z_{1}(t)d_{1}(t)}{X_{1}(t)b_{1}(t)}$ , a birth also occurs in $Z_{1}$ ,
- •
  
  with probability $1-\frac{(1+\gamma s)Z_{1}(t)d_{1}(t)}{X_{1}(t)b_{1}(t)}$ , nothing happens in $Z_{1}$ .
2.
At time $t\in(0,T_{1}\wedge T^{(1)}]$ , if a death occurs in $X_{1}$ , then
- •
  
  with probability $\frac{Z_{1}(t)d_{1}(t)}{X_{1}(t)d_{1}(t)}$ , a death also occurs in $Z_{1}$ ,
- •
  
  with probability $1-\frac{Z_{1}(t)d_{1}(t)}{X_{1}(t)d_{1}(t)}$ , nothing happens in $Z_{1}$ .
3.
At time $t\in(0,T_{1}\wedge T^{(1)}]$ , if an immigration event occurs in $X_{1}$ , then
- •
  
  with probability $\frac{(1-\zeta)N\mu d_{1}(t)}{X_{0}(t)\mu}$ , an immigration event also occurs in $Z_{1}$ ,
- •
  
  with probability $1-\frac{(1-\zeta)N\mu d_{1}(t)}{X_{0}(t)\mu}$ , nothing happens in $Z_{1}$ .
4.
For times $t>T_{1}\wedge T^{(1)}$ , the process $Z_{1}$ behaves independently of $X_{1}$ , and
- •
  
  a birth occurs at rate $(1+\gamma s)Z_{1}(t)d_{1}(t)$ ,
- •
  
  a death occurs at rate $Z_{1}(t)d_{1}(t)$ ,
- •
  
  immigration occurs at rate $(1-\zeta)N\mu d_{1}(t)$ .

From this construction, we see that $Z_{1}(t)\leq X_{1}(t)$ for all $t\in[0,T_{1}\wedge T^{(1)}]$ . Also, all of the probabilities in the construction are guaranteed to be in $[0,1]$ by Lemmas 4.14 and 4.15. Hence, $Z_{1}$ is a branching process with immigration where

•

each individual gives birth at rate $(1+\gamma s)d_{1}(t)$ ,
•

each individual dies at rate $d_{1}(t)$ ,
•

immigration occurs at rate $(1-\zeta)N\mu d_{1}(t)$ .

Recall the definition of the time scaling function $\lambda_{1}$ in (16). We define $\tilde{Z}_{1}(t)=Z_{1}(\lambda_{1}^{-1}(t))$ for all $t\geq 0$ . Then the process $\tilde{Z}_{1}$ is a branching process with immigration in which each individual gives birth at rate $1+\gamma s$ , each individual dies at rate $1$ , and immigration occurs at rate $(1-\zeta)N\mu$ .

By Lemma 4.2, the extinction probability of a family of an immigrant is $1/(1+\gamma s)$ . Thus, in the process $\tilde{Z}_{1}$ , an immigrant whose family survives forever appears at rate

(1-\zeta)N\mu\cdot\frac{\gamma s}{1+\gamma s}=\left(\frac{(1-\zeta)\gamma}{1+\gamma s}\right)N\mu s.

We define $\tau_{Z_{1}}$ to be the first time that an immigrant whose family survives forever appears in the process $\tilde{Z}_{1}$ , and define

T_{Z_{1}}=\inf\left\{t\geq 0:Z_{1}(t)>\frac{\log N}{s}\right\}.

(31)

Lemma 4.16.

Let $\kappa\in(0,1)$ be a constant. For sufficiently large $N$ , we have

P\left(\lambda_{1}(T_{Z_{1}})\leq\tau_{Z_{1}}+\frac{1}{\gamma s}\log\left(\frac{\log N}{\kappa s}\right)\right)>1-\kappa.

Proof.

Let $n(t)$ be the number of individuals in the process $\tilde{Z}_{1}$ at time $t+\tau_{Z_{1}}$ who have an infinite line of descent and descend from the first immigrant that has an infinite line of descent. Let

\sigma=\inf\left\{t\geq 0:n(t)>\frac{\log N}{s}\right\}.

Since $\lambda_{1}(T_{Z_{1}})$ is the first time the process $\tilde{Z}_{1}$ goes above $(\log N)/s$ , we have $\lambda_{1}(T_{Z_{1}})\leq\tau_{Z_{1}}+\sigma$ . Therefore,

	$\displaystyle P\left(\lambda_{1}(T_{Z_{1}})\leq\tau_{Z_{1}}+\frac{1}{\gamma s}\log\left(\frac{\log N}{\kappa s}\right)\right)$	$\displaystyle\geq P\left(\sigma\leq\frac{1}{\gamma s}\log\left(\frac{\log N}{\kappa s}\right)\right)$
		$\displaystyle=P\left(n\left(\frac{1}{\gamma s}\log\left(\frac{\log N}{\kappa s}\right)\right)>\frac{\log N}{s}\right).$

It follows from Remark 4.7 that $(n(t),t\geq 0)$ is a Yule process in which each individual gives birth at rate $\gamma s$ . Therefore, $n(t)$ has a geometric distribution with success probability $e^{-\gamma st}$ . Using that $(\log N)/s\rightarrow\infty$ as $N\rightarrow\infty$ , we have

	$\displaystyle P\left(n\left(\frac{1}{\gamma s}\log\left(\frac{\log N}{\kappa s}\right)\right)>\frac{\log N}{s}\right)$	$\displaystyle=\left[1-\exp\left(-\log\left(\frac{\log N}{\kappa s}\right)\right)\right]^{\left\lfloor\frac{\log N}{s}\right\rfloor}$
		$\displaystyle=\left(1-\frac{\kappa s}{\log N}\right)^{\left\lfloor\frac{\log N}{s}\right\rfloor}$
		$\displaystyle\rightarrow e^{-\kappa}$

as $N\rightarrow\infty$ , and the result follows because $e^{-\kappa}>1-\kappa$ . ∎
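The geometric tail computed in this proof is easy to evaluate directly. The Python sketch below computes $(1-\kappa s/\log N)^{\lfloor(\log N)/s\rfloor}$ and compares it with $e^{-\kappa}$ and $1-\kappa$ ; note that $\gamma$ cancels at the chosen time. The parameter values and the name `yule_tail` are illustrative assumptions.

```python
import math

def yule_tail(N, s, kappa):
    """P(n(t*) > (log N)/s) for the Yule process n at
    t* = (1/(gamma s)) log(log N / (kappa s)); gamma cancels."""
    p = kappa * s / math.log(N)        # success probability exp(-gamma s t*)
    m = math.floor(math.log(N) / s)    # exponent floor((log N)/s)
    return (1.0 - p) ** m

kappa = 0.1
for N, s in ((1e4, 1e-1), (1e8, 1e-2), (1e12, 1e-3)):
    print(f"N={N:.0e}  tail={yule_tail(N, s, kappa):.6f}  "
          f"exp(-kappa)={math.exp(-kappa):.6f}  1-kappa={1 - kappa:.6f}")
```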

Lemma 4.17.

For every constant $c>0$ ,

\liminf_{N\rightarrow\infty}P\left(T_{1}\wedge T^{(1)}\leq\frac{c}{N\mu s}\right)\geq 1-e^{-c}.

Proof.

By the definition of $d_{1}(t)$ in (9), for sufficiently large $N$ , when $t\in[0,T_{1})$ , we have

d_{1}(t)\geq 1-\frac{(1+s)X_{1}(t)}{S(t)}\geq 1-\frac{(1+s)\frac{\log N}{s}}{N}\geq 1-\frac{2\log N}{Ns}.

Thus, by (16), for sufficiently large $N$ , when $t\in[0,T_{1}]$ ,

\lambda_{1}(t)\geq\left(1-\frac{2\log N}{Ns}\right)t.

Because $Z_{1}(t)\leq X_{1}(t)$ for $t\in[0,T_{1}\wedge T^{(1)}]$ , we have $T_{1}\wedge T^{(1)}\leq T_{Z_{1}}$ . Also, $\lambda_{1}$ is an increasing function. Hence,

	$\displaystyle P\left(T_{1}\wedge T^{(1)}>\frac{c}{N\mu s}\right)$	$\displaystyle\leq P\left(\lambda_{1}(T_{1}\wedge T^{(1)})>\left(1-\frac{2\log N}{Ns}\right)\frac{c}{N\mu s}\right)$
		$\displaystyle\leq P\left(\lambda_{1}(T_{Z_{1}})>\left(1-\frac{2\log N}{Ns}\right)\frac{c}{N\mu s}\right).$

It follows from Lemma 4.16 that

	$\displaystyle P\left(T_{1}\wedge T^{(1)}\leq\frac{c}{N\mu s}\right)$
	$\displaystyle\qquad\geq P\left(\lambda_{1}(T_{Z_{1}})\leq\left(1-\frac{2\log N}{Ns}\right)\frac{c}{N\mu s}\right)$
	$\displaystyle\qquad\geq P\left(\lambda_{1}(T_{Z_{1}})\leq\tau_{Z_{1}}+\frac{1}{\gamma s}\log\left(\frac{\log N}{\kappa s}\right)\leq\left(1-\frac{2\log N}{Ns}\right)\frac{c}{N\mu s}\right)$
	$\displaystyle\qquad\geq 1-\kappa-P\left(\tau_{Z_{1}}+\frac{1}{\gamma s}\log\left(\frac{\log N}{\kappa s}\right)>\left(1-\frac{2\log N}{Ns}\right)\frac{c}{N\mu s}\right)$
	$\displaystyle\qquad=1-\kappa-P\left(\tau_{Z_{1}}>\frac{c}{N\mu s}\left[\left(1-\frac{2\log N}{Ns}\right)-\frac{N\mu}{c\gamma}\log\left(\frac{\log N}{\kappa s}\right)\right]\right).$		(32)

As $N\rightarrow\infty$ , we have $(\log N)/(Ns)\rightarrow 0$ , and because $(\log N)/(\kappa s)\leq N$ for sufficiently large $N$ by (2), we have

\frac{N\mu}{c\gamma}\log\left(\frac{\log N}{\kappa s}\right)\leq\frac{N\mu}{c\gamma}\log N\rightarrow 0.

Thus, from (32), for sufficiently large $N$ ,

P\left(T_{1}\wedge T^{(1)}\leq\frac{c}{N\mu s}\right)\geq 1-\kappa-P\left(\tau_{Z_{1}}>\frac{c(1-\kappa)}{N\mu s}\right).

Since $\tau_{Z_{1}}$ has an exponential distribution with rate $\frac{(1-\zeta)\gamma}{1+\gamma s}\cdot N\mu s$ ,

P\left(\tau_{Z_{1}}>\frac{c(1-\kappa)}{N\mu s}\right)=\exp\left(-\frac{(1-\zeta)\gamma N\mu s}{1+\gamma s}\cdot\frac{c(1-\kappa)}{N\mu s}\right)=\exp\left(-\frac{c(1-\kappa)(1-\zeta)\gamma}{1+\gamma s}\right).

Thus,

P\left(T_{1}\wedge T^{(1)}\leq\frac{c}{N\mu s}\right)\geq 1-\kappa-\exp\left(-\frac{c(1-\kappa)(1-\zeta)\gamma}{1+\gamma s}\right).

It follows that

\liminf_{N\rightarrow\infty}P\left(T_{1}\wedge T^{(1)}\leq\frac{c}{N\mu s}\right)\geq 1-\kappa-e^{-c(1-\kappa)(1-\zeta)\gamma}.

Since this statement is true for all $\gamma,\zeta,\kappa\in(0,1)$ , the result follows by taking limits as $\gamma\rightarrow 1^{-}$ and $\zeta,\kappa\rightarrow 0^{+}$ . ∎
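The final limiting step can be illustrated by evaluating the lower bound $1-\kappa-\exp(-c(1-\kappa)(1-\zeta)\gamma/(1+\gamma s))$ as the auxiliary constants approach their limits. In the Python sketch below, the function name and the particular sequence of constants are illustrative assumptions.

```python
import math

def lemma_4_17_lower_bound(c, s, gamma, zeta, kappa):
    """1 - kappa - exp(-c (1-kappa)(1-zeta) gamma / (1 + gamma s))."""
    rate_factor = (1 - zeta) * gamma / (1 + gamma * s)  # rate of tau_{Z_1} in units of N mu s
    return 1 - kappa - math.exp(-c * (1 - kappa) * rate_factor)

c, s = 1.0, 1e-3
for eps in (0.5, 0.1, 0.01, 0.001):
    # gamma -> 1 from below and zeta, kappa -> 0 from above
    bound = lemma_4_17_lower_bound(c, s, 1 - eps, eps, eps)
    print(f"eps={eps}  bound={bound:.5f}  target={1 - math.exp(-c):.5f}")
```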

Proposition 4.18.

The following statements hold.

1.

For every $c>0$ , we have

\lim_{N\rightarrow\infty}P(N\mu sT_{1}>c)=e^{-c}.

2.

We have

\lim_{N\rightarrow\infty}P(T_{1}<\tau_{2}\leq T^{(1)})=1.

3.

We have

\lim_{N\rightarrow\infty}P\left(M_{2}(T_{1})\leq\frac{\theta^{3/4}(\log N)^{3/4}}{s}\right)=1.

Proof.

First, note that for every $c>0$ ,

P\left(T_{1}\wedge T^{(1)}\leq\frac{c}{N\mu s}\right)\leq P\left(T_{1}\leq\frac{c}{N\mu s}\right)+P\left(T^{(1)}\leq T_{1}\wedge\frac{c}{N\mu s}\right).

From Lemma 4.13, we have $P(T^{(1)}\leq T_{1}\wedge\frac{c}{N\mu s})\rightarrow 0$ as $N\rightarrow\infty$ . Therefore, by Lemma 4.17,

\liminf_{N\rightarrow\infty}P\left(T_{1}\leq\frac{c}{N\mu s}\right)\geq\liminf_{N\rightarrow\infty}P\left(T_{1}\wedge T^{(1)}\leq\frac{c}{N\mu s}\right)\geq 1-e^{-c}.

Combining this result with Lemma 4.4 gives the first statement in the proposition.

To prove the second statement, note that for every $c>0$ ,

P\left(T_{1}\wedge\frac{c}{N\mu s}<\tau_{2}\leq T^{(1)}\right)\leq P(T_{1}<\tau_{2}\leq T^{(1)})+P\left(T_{1}>\frac{c}{N\mu s}\right).

(33)

By Lemma 4.13, the term on the left hand side converges to $1$ as $N\rightarrow\infty$ . Therefore, using the result of the first statement of the proposition and taking the liminf of both sides of (33), we get

1\leq\liminf_{N\rightarrow\infty}P(T_{1}<\tau_{2}\leq T^{(1)})+e^{-c}.

The result follows because this inequality is true for all $c>0$ .

To prove the last statement, we first observe that

P\left(M_{2}(T_{1})\leq\frac{\theta^{3/4}(\log N)^{3/4}}{s}\right)\geq P\left(M_{2}\left(T_{1}\wedge\frac{c}{N\mu s}\right)\leq\frac{\theta^{3/4}(\log N)^{3/4}}{s}\right)-P\left(T_{1}>\frac{c}{N\mu s}\right).

Taking the liminf of both sides and using Lemma 4.8 and part 1 of this proposition, we have

\liminf_{N\rightarrow\infty}P\left(M_{2}(T_{1})\leq\frac{\theta^{3/4}(\log N)^{3/4}}{s}\right)\geq 1-e^{-c}.

Since this is true for all $c>0$ , we have proved the last statement of the proposition. ∎

5 Following the process until time $T_{1,\alpha}$

In this section, we study the process between the time $T_{1}$ when the number of type 1 individuals first reaches $(\log N)/s$ and the time $T_{1,\alpha}$ , when the number of type $1$ individuals reaches $\alpha N$ . Our goal is to show that the number of type 1 individuals reaches $\alpha N$ quickly after time $T_{1}$ , and that there is not enough time for many mutations to type 2 to occur during this period.

We now construct a branching process which will bound $X_{1}$ from below between times $T_{1}$ and $T_{1,\alpha}$ . For $t\geq T_{1}\wedge T^{(1)}$ , let

X_{1}^{(1)}(t)=X_{1,[0,T_{1}\wedge T^{(1)}]}(t).

That is, $X_{1}^{(1)}(t)$ is the number of type 1 individuals at time $t$ that are descended from type $1$ individuals that are alive at time $T_{1}\wedge T^{(1)}$ . Similar to the way we constructed $Z_{1}$ , we construct a new birth-death process $(Z_{1}^{\prime}(t),t\geq T_{1}\wedge T^{(1)})$ from the process $X_{1}^{(1)}$ as follows:

1.

Set $Z_{1}^{\prime}(T_{1}\wedge T^{(1)})=X_{1}^{(1)}(T_{1}\wedge T^{(1)})$ .
2.
At time $t\in(T_{1}\wedge T^{(1)},T_{1,\alpha}\wedge T^{(1)}]$ , if a birth occurs in $X_{1}^{(1)}$ , then
- •
  
  with probability $\frac{(1+\gamma s)Z^{\prime}_{1}(t)d_{1}(t)}{X_{1}^{(1)}(t)b_{1}(t)}$ , a birth also occurs in $Z^{\prime}_{1}$ ,
- •
  
  with probability $1-\frac{(1+\gamma s)Z^{\prime}_{1}(t)d_{1}(t)}{X_{1}^{(1)}(t)b_{1}(t)}$ , nothing happens in $Z^{\prime}_{1}$ .
3.
At time $t\in(T_{1}\wedge T^{(1)},T_{1,\alpha}\wedge T^{(1)}]$ , if a death occurs in $X_{1}^{(1)}$ , then
- •
  
  with probability $\frac{Z^{\prime}_{1}(t)d_{1}(t)}{X_{1}^{(1)}(t)d_{1}(t)}$ , a death also occurs in $Z^{\prime}_{1}$ ,
- •
  
  with probability $1-\frac{Z^{\prime}_{1}(t)d_{1}(t)}{X_{1}^{(1)}(t)d_{1}(t)}$ , nothing happens in $Z^{\prime}_{1}$ .
4.
For times $t>T_{1,\alpha}\wedge T^{(1)}$ , the process $Z^{\prime}_{1}$ evolves independently of the population, and
- •
  
  a birth occurs at rate $(1+\gamma s)Z^{\prime}_{1}(t)d_{1}(t)$ ,
- •
  
  a death occurs at rate $Z^{\prime}_{1}(t)d_{1}(t)$ .

From this construction, which is well-defined in view of Lemma 4.14, the process $Z^{\prime}_{1}$ is a birth-death process in which each individual gives birth at rate $(1+\gamma s)d_{1}(t)$ and each individual dies at rate $d_{1}(t)$ . Also,

Z^{\prime}_{1}(t)\leq X_{1}^{(1)}(t)\leq X_{1}(t)\quad\mbox{for all }t\in[T_{1}\wedge T^{(1)},T_{1,\alpha}\wedge T^{(1)}).

(34)

For $t\geq 0$ , let $\tilde{Z}_{1}^{\prime}(t)=Z^{\prime}_{1}(\lambda_{1}^{-1}(t+\lambda_{1}(T_{1}\wedge T^{(1)})))$ . Then the process $(\tilde{Z}_{1}^{\prime}(t),t\geq 0)$ is a branching process in which each individual gives birth at rate $1+\gamma s$ and dies at rate 1.

We now review the following standard result for continuous-time branching processes, which can be obtained, for example, from Theorem 6.1 on page 103 of [14].

Lemma 5.1.

Let $(Z(t),t\geq 0)$ be a continuous-time branching process with $Z(0)=1$ such that each individual independently lives for an exponentially distributed time with mean $b^{-1}$ and is then replaced by $k$ offspring with probability $p_{k}$ . For $x\in[0,1]$ , let

f(x)=\sum_{k=0}^{\infty}p_{k}x^{k},\qquad u(x)=b(f(x)-x).

Let $\lambda=u^{\prime}(1)$ . Then $E[Z(t)]=e^{\lambda t}$ , and if $\lambda\neq 0$ , then

\textup{Var}(Z(t))=\bigg{(}\frac{u^{\prime\prime}(1)-\lambda}{\lambda}\bigg{)}(e^{2\lambda t}-e^{\lambda t}).

Let $D$ be the event that $T_{1}<T^{(1)}$ . Note that $P(D)\rightarrow 1$ as $N\rightarrow\infty$ by part 2 of Proposition 4.18, and that on the event $D$ , we have $\tilde{Z}_{1}^{\prime}(0)=\lfloor(\log N)/s\rfloor+1$ . Also, let

t_{0}=\frac{1}{\gamma s}\log(\alpha Ns).

Lemma 5.2.

We have

\lim_{N\rightarrow\infty}P\left(\tilde{Z}_{1}^{\prime}(t_{0})>\alpha N\,\big{|}\,D\right)=1.

Proof.

Since $(\tilde{Z}_{1}^{\prime}(t),t\geq 0)$ is a branching process as described above,

E\left[\tilde{Z}_{1}^{\prime}(t_{0})\big{|}\,D\right]=\left(\left\lfloor\frac{\log N}{s}\right\rfloor+1\right)e^{\gamma st_{0}}=\alpha Ns\left(\left\lfloor\frac{\log N}{s}\right\rfloor+1\right).

Therefore, if $N$ is sufficiently large, $\tilde{Z}_{1}^{\prime}(t_{0})\leq\alpha N$ implies $|\tilde{Z}_{1}^{\prime}(t_{0})-E[\tilde{Z}_{1}^{\prime}(t_{0})|D]|>\frac{1}{2}\alpha N\log N$ . It thus follows from the conditional Chebyshev’s inequality that

P\left(\tilde{Z}_{1}^{\prime}(t_{0})\leq\alpha N\big{|}D\right)\leq\frac{4}{(\alpha N\log N)^{2}}\textup{Var}\left(\tilde{Z}_{1}^{\prime}(t_{0})\big{|}D\right).

(35)

The generating function for this branching process, using the notation of Lemma 5.1, satisfies

u(x)=1+(1+\gamma s)x^{2}-(2+\gamma s)x,\qquad u^{\prime}(1)=\gamma s,\qquad u^{\prime\prime}(1)=2(1+\gamma s).

Therefore, by Lemma 5.1, since the numbers of offspring produced by the $\lfloor(\log N)/s\rfloor+1$ individuals at time zero are independent, we have

\textup{Var}\left(\tilde{Z}_{1}^{\prime}(t_{0})\big{|}D\right)\leq\left(\left\lfloor\frac{\log N}{s}\right\rfloor+1\right)\left(\frac{2+\gamma s}{\gamma s}\right)\left(e^{2\gamma st_{0}}-e^{\gamma st_{0}}\right).

(36)

Note that $e^{2\gamma st_{0}}=(\alpha Ns)^{2}$ and, if $N$ is sufficiently large, $(\lfloor(\log N)/s\rfloor+1)(2+\gamma s)\leq 3(\log N)/s$ . Therefore, it follows from (35) and (36) that for sufficiently large $N$ ,

P\left(\tilde{Z}_{1}^{\prime}(t_{0})\leq\alpha N\big{|}D\right)\leq\frac{4}{(\alpha N\log N)^{2}}\cdot\frac{3\log N}{\gamma s^{2}}(\alpha Ns)^{2}=\frac{12}{\gamma\log N},

which goes to 0 as $N\rightarrow\infty$ . The result of the lemma follows. ∎
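To make the rate of convergence concrete, the Chebyshev bound obtained by combining (35) and (36) can be evaluated numerically and compared with $12/(\gamma\log N)$ . The Python sketch below does this for illustrative parameter values; the function name `chebyshev_bound` and the chosen values of $N$ , $s$ , $\alpha$ , $\gamma$ are assumptions of the example.

```python
import math

def chebyshev_bound(N, s, alpha, gamma):
    """Right-hand side of (35) with the variance bound (36),
    where t0 = log(alpha N s) / (gamma s)."""
    m = math.floor(math.log(N) / s) + 1    # initial population size on the event D
    growth = alpha * N * s                  # e^{gamma s t0}
    var = m * (2 + gamma * s) / (gamma * s) * (growth ** 2 - growth)
    return 4.0 * var / (alpha * N * math.log(N)) ** 2

alpha, gamma = 0.5, 0.5
for N, s in ((1e12, 1e-3), (1e24, 1e-6)):
    print(f"N={N:.0e}  bound={chebyshev_bound(N, s, alpha, gamma):.4f}  "
          f"12/(gamma log N)={12 / (gamma * math.log(N)):.4f}")
```

The bound decays only like $1/\log N$ , so it is informative only for very large $N$ .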

Lemma 5.3.

There is a positive constant $C$ such that

\lim_{N\rightarrow\infty}P\left(T_{1,\alpha}\wedge T^{(1)}\leq T_{1}+\frac{C\log N}{s}\right)=1.

Proof.

By the definition of $d_{1}$ in (9), when $t\in[0,T_{1,\alpha})$ ,

d_{1}(t)\geq 1-\frac{(1+s)X_{1}(t)}{S(t)}\geq 1-\frac{(1+s)\alpha N}{N}=1-(1+s)\alpha.

Let $\alpha^{\prime}$ be a constant such that $\alpha<\alpha^{\prime}<1$ . Since $s\rightarrow 0$ as $N\rightarrow\infty$ , for sufficiently large $N$ we have $d_{1}(t)\geq 1-\alpha^{\prime}$ for all $t\in[0,T_{1,\alpha})$ . By the definition of $\lambda_{1}$ in (16), when $0\leq u\leq t\leq T_{1,\alpha}$ ,

\lambda_{1}(t)-\lambda_{1}(u)=\int_{u}^{t}d_{1}(v)\>dv\geq(1-\alpha^{\prime})(t-u).

(37)

Let $D^{*}$ be the event that $(T_{1}\wedge T^{(1)})+\frac{t_{0}}{1-\alpha^{\prime}}<T_{1,\alpha}\wedge T^{(1)}$ . By (37), on $D^{*}$ we have

\lambda_{1}\left(T_{1}\wedge T^{(1)}+\frac{t_{0}}{1-\alpha^{\prime}}\right)\geq\lambda_{1}(T_{1}\wedge T^{(1)})+t_{0}.

Since $\lambda_{1}$ is an increasing function, it follows that on $D^{*}$ ,

T_{1,\alpha}\wedge T^{(1)}>T_{1}\wedge T^{(1)}+\frac{t_{0}}{1-\alpha^{\prime}}>\lambda_{1}^{-1}\left(\lambda_{1}(T_{1}\wedge T^{(1)})+t_{0}\right).

Define $T_{1,\alpha}^{\prime}=\inf\{t\geq 0:Z_{1}^{\prime}(t)>\alpha N\}$ . By (34), either $X_{1}$ reaches $\alpha N$ before or at the same time as $Z_{1}^{\prime}$ does, or $Z_{1}^{\prime}$ reaches $\alpha N$ after time $T^{(1)}$ . Therefore, we have $T_{1,\alpha}\wedge T^{(1)}\leq T_{1,\alpha}^{\prime}$ , which means that on $D^{*}$ ,

T_{1,\alpha}^{\prime}>\lambda_{1}^{-1}\left(\lambda_{1}(T_{1}\wedge T^{(1)})+t_{0}\right).

By the definition of $\tilde{Z}_{1}^{\prime}$ , it follows that the process $\tilde{Z}_{1}^{\prime}$ does not go above $\alpha N$ until after time $t_{0}$ . That is, on $D^{*}$ we have

\tilde{Z}_{1}^{\prime}(t_{0})\leq\alpha N.

Therefore, recalling that $D$ is the event that $T_{1}<T^{(1)}$ and using Lemma 5.2, we have

P(D^{*}|D)\leq P(\tilde{Z}_{1}^{\prime}(t_{0})\leq\alpha N|D)\rightarrow 0\quad\mbox{as }N\rightarrow\infty.

Because $\lim_{N\rightarrow\infty}P(T_{1}<T^{(1)})=1$ by part 2 of Proposition 4.18, it follows that $P(D^{*})\rightarrow 0$ as $N\rightarrow\infty$ , which means

\lim_{N\rightarrow\infty}P\left(T_{1,\alpha}\wedge T^{(1)}\leq T_{1}+\frac{t_{0}}{1-\alpha^{\prime}}\right)=1.

Because $\alpha<1$ and $s\rightarrow 0$ , for sufficiently large $N$ we have $t_{0}\leq(\log N)/(\gamma s)$ . The result follows if we choose $C>1/(\gamma(1-\alpha^{\prime}))$ . ∎
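The last step of the proof can be illustrated numerically. The identity $e^{2\gamma st_{0}}=(\alpha Ns)^{2}$ used in the proof of Lemma 5.2 gives $t_{0}=\log(\alpha Ns)/(\gamma s)$, and the sketch below compares $t_{0}/(1-\alpha^{\prime})$ with $C(\log N)/s$ at the boundary value $C=1/(\gamma(1-\alpha^{\prime}))$. The function names and sample values are hypothetical, and the computation assumes $\alpha Ns>1$ so that $t_{0}>0$.

```python
import math

def t0(N, s, alpha, gamma):
    """Solve exp(gamma * s * t0) = alpha * N * s for t0 (requires alpha * N * s > 1)."""
    return math.log(alpha * N * s) / (gamma * s)

def lemma_5_3_window(N, s, alpha, alpha_prime, gamma):
    """Return (t0 / (1 - alpha'), C * log(N) / s) with C = 1 / (gamma * (1 - alpha'))."""
    C = 1.0 / (gamma * (1.0 - alpha_prime))
    time_needed = t0(N, s, alpha, gamma) / (1.0 - alpha_prime)
    time_allowed = C * math.log(N) / s
    return time_needed, time_allowed
```

Because $\alpha s<1$, the first returned value is smaller than the second, which is why any $C>1/(\gamma(1-\alpha^{\prime}))$ works in the lemma.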

Lemma 5.4.

For all positive constants $C$ , we have

\lim_{N\rightarrow\infty}P\left(\tau_{2}>T_{1}+\frac{C\log N}{s}\right)=1.

Proof.

The number of type 1 individuals at any time is bounded above by $N$ , so the rate of mutations to type 2 is bounded above by $N\mu$ . Therefore, the expected number of mutations to type 2 between times $T_{1}$ and $T_{1}+(C\log N)/s$ is bounded above by $(C\mu N\log N)/s$ . By Markov’s inequality and the fact that $\theta\geq N\mu$ ,

P\left(M_{2}\left(T_{1}+\frac{C\log N}{s}\right)-M_{2}(T_{1})>\frac{\theta^{3/4}(\log N)^{3/4}}{s}\right)\leq\frac{C\mu N\log N}{\theta^{3/4}(\log N)^{3/4}}\leq C(\mu N\log N)^{1/4}.

Since $\mu N\log N\ll 1$ , we have

\lim_{N\rightarrow\infty}P\left(M_{2}\left(T_{1}+\frac{C\log N}{s}\right)-M_{2}(T_{1})\leq\frac{\theta^{3/4}(\log N)^{3/4}}{s}\right)=1.

(38)

By Proposition 4.18 and (38),

\lim_{N\rightarrow\infty}P\left(M_{2}\left(T_{1}+\frac{C\log N}{s}\right)\leq\frac{2\theta^{3/4}(\log N)^{3/4}}{s}\right)=1.

(39)

Since $\theta\log N\ll 1$ by (24), we have $\theta^{3/4}(\log N)^{3/4}\ll\theta^{1/2}(\log N)^{1/2}$ . Thus, this lemma follows from (39) and the definition of $\tau_{2}$ in (15). ∎
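The Markov inequality step in this proof reduces to simple arithmetic, which the following sketch reproduces. It divides the bound on the expected number of mutations by the threshold and compares the result with $C(\mu N\log N)^{1/4}$. The function names and the numerical values are illustrative assumptions; the only property of $\theta$ used is $\theta\geq N\mu$.

```python
import math

def markov_bound(N, mu, s, C, theta):
    """Expected-count bound divided by the threshold theta^{3/4} (log N)^{3/4} / s."""
    expected_mutations = C * mu * N * math.log(N) / s
    threshold = theta ** 0.75 * math.log(N) ** 0.75 / s
    return expected_mutations / threshold

def markov_bound_simplified(N, mu, C):
    """Upper bound C (mu N log N)^{1/4}, valid whenever theta >= N mu."""
    return C * (mu * N * math.log(N)) ** 0.25
```

When $\theta=N\mu$ the two quantities coincide, and a larger $\theta$ only makes the first one smaller.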

Proposition 5.5.

We have

\lim_{N\rightarrow\infty}P(T_{1}<T_{1,\alpha}<\tau_{2}\leq T^{(1)})=1.

(40)

Also, there exists a positive constant $C$ such that

\lim_{N\rightarrow\infty}P\left(T_{1,\alpha}-T_{1}\leq\frac{C\log N}{s}\right)=1.

(41)

Proof.

By Lemmas 5.3 and 5.4, we have $\lim_{N\rightarrow\infty}P(T_{1,\alpha}\wedge T^{(1)}<\tau_{2})=1$ . By part 2 of Proposition 4.18, we have $\lim_{N\rightarrow\infty}P(T_{1}<\tau_{2}\leq T^{(1)})=1$ . Because $T_{1}<T_{1,\alpha}$ for sufficiently large $N$ by definition, these two equations imply (40). Combining the result $\lim_{N\rightarrow\infty}P(T_{1,\alpha}<T^{(1)})=1$ with Lemma 5.3 yields (41). ∎

6 Following the process until type 0 vanishes

In this section, we will prove that after the time $T_{1,\alpha}$ when the number of type 1 individuals reaches $\alpha N$ , the type 0 population quickly goes extinct. In particular, we will show that with probability tending to one as $N\rightarrow\infty$ , we have the inequality $0<T_{1}^{\prime}-T_{1,\alpha}\leq C^{\prime}(\log N)/s$ for some positive constant $C^{\prime}$ . We begin by showing that $T_{1,\alpha}<T_{1}^{\prime}$ with probability going to 1 as $N\rightarrow\infty$ .

Lemma 6.1.

We have $\lim_{N\rightarrow\infty}P(T_{1,\alpha}<T_{1}^{\prime})=1$ .

Proof.

For $t<T^{(1)}$ , we have $X_{k}(t)\leq(\log N)/s$ for $k=2,3,\dots,\Delta$ . Also, by (30), when $N$ is sufficiently large, $\tau_{\Delta+1}$ is the time at which the first type $\Delta+1$ individual appears, so there are no individuals of type $\Delta+1$ or above for $t<T^{(1)}$ . Therefore, as long as $T_{1,\alpha}<T^{(1)}$ , we have for sufficiently large $N$ ,

X_{0}(T_{1,\alpha})=N-\sum_{j=1}^{\Delta}X_{j}(T_{1,\alpha})\geq N-(\alpha N+1)-(\Delta-1)\cdot\frac{\log N}{s}=N\left(1-\alpha-\frac{1}{N}-\frac{(\Delta-1)\log N}{Ns}\right).

Because $(\log N)/(Ns)\rightarrow 0$ as $N\rightarrow\infty$ , when $N$ is large enough, $X_{0}(T_{1,\alpha})>0$ if $T_{1,\alpha}<T^{(1)}$ . Because $\lim_{N\rightarrow\infty}P(T_{1,\alpha}<T^{(1)})=1$ by Proposition 5.5, the result follows. ∎

We now bound the process $X_{0}$ from above by a branching process. For $t\geq 0$ , we define

\hat{b}_{0}(t)=\frac{d_{0}(t)}{1+s}.

(42)

We construct a new process $(\bar{W}_{0}(t),t\geq T_{1,\alpha}\wedge T_{1}^{\prime})$ from the population process as follows.

1. Set $\bar{W}_{0}(T_{1,\alpha}\wedge T_{1}^{\prime})=0$ .
2. The process $\bar{W}_{0}$ jumps up by 1 at rate $\bar{W}_{0}(t)\hat{b}_{0}(t)+X_{0}(t)(\hat{b}_{0}(t)-b_{0}(t))$ for all $t\geq T_{1,\alpha}\wedge T_{1}^{\prime}$ .
3. The process $\bar{W}_{0}$ jumps down by 1 at rate $\bar{W}_{0}(t)d_{0}(t)$ for all $t\geq T_{1,\alpha}\wedge T_{1}^{\prime}$ .

Once this process hits 0, it cannot jump down. Therefore, $\bar{W}_{0}(t)\geq 0$ for all $t\geq T_{1,\alpha}\wedge T_{1}^{\prime}$ . It remains to check that the rate at which the process $\bar{W}_{0}$ jumps up by 1 is non-negative, which follows from the lemma below.

Lemma 6.2.

For $t\geq 0$ , we have $b_{0}(t)\leq\hat{b}_{0}(t)$ .

Proof.

By the definitions of $b_{0}(t)$ and $d_{0}(t)$ in (8) and (9), we have that for all $t\geq 0$ ,

\hat{b}_{0}(t)=\frac{d_{0}(t)}{1+s}\geq\frac{1}{1+s}\left(1-\frac{X_{0}(t)}{S(t)}\right)=\frac{\sum_{j=1}^{\infty}(1+s)^{j-1}X_{j}(t)}{S(t)}

and

b_{0}(t)=\frac{N-X_{0}(t)}{S(t)}=\frac{\sum_{j=1}^{\infty}X_{j}(t)}{S(t)}.

Hence, $b_{0}(t)\leq\hat{b}_{0}(t)$ for all $t\geq 0$ . ∎
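The inequality of Lemma 6.2 can be spot-checked with exact rational arithmetic. The sketch below takes $S=\sum_{j}(1+s)^{j}X_{j}$, $b_{0}=(N-X_{0})/S$ as in the display above, and $d_{0}=1-X_{0}/S+\mu$ as in the integrand used in the proof of Lemma 6.4, then verifies $b_{0}\leq\hat{b}_{0}$ on randomly generated type counts. The function names, the parameter values, and the random configurations are assumptions made for illustration.

```python
from fractions import Fraction
import random

def type0_rates(counts, s, mu):
    """Return (b_0, b_0_hat) for type counts X_0, X_1, ... given as a list."""
    N = sum(counts)
    S = sum((1 + s) ** j * x for j, x in enumerate(counts))  # total fitness
    b0 = (N - counts[0]) / S
    d0 = 1 - counts[0] / S + mu
    return b0, d0 / (1 + s)

def check_lemma_6_2(trials, seed):
    """Check b_0 <= b_0_hat on random configurations; return True if all pass."""
    rng = random.Random(seed)
    s, mu = Fraction(1, 50), Fraction(1, 10**6)  # illustrative values only
    for _ in range(trials):
        counts = [rng.randint(0, 40) for _ in range(6)]
        counts[0] += 1  # keep the population non-empty so that S > 0
        b0, b0_hat = type0_rates(counts, s, mu)
        if b0 > b0_hat:
            return False
    return True
```

Using `Fraction` avoids rounding, so a failure would indicate a genuine counterexample rather than floating-point noise.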

Let $W_{0}(t)=X_{0}(t)+\bar{W}_{0}(t)$ for $t\geq T_{1,\alpha}\wedge T_{1}^{\prime}$ . Clearly, $W_{0}(t)\geq X_{0}(t)$ for all $t\geq T_{1,\alpha}\wedge T_{1}^{\prime}$ . Also, $W_{0}$ is a birth-death process in which, for $t\geq T_{1,\alpha}\wedge T_{1}^{\prime}$ , a birth occurs at rate $W_{0}(t)\hat{b}_{0}(t)$ and a death occurs at rate $W_{0}(t)d_{0}(t)$ . Next, for $t\geq T_{1,\alpha}\wedge T_{1}^{\prime}$ , define

\lambda_{0}(t)=\int_{T_{1,\alpha}\wedge T_{1}^{\prime}}^{t}d_{0}(v)\>dv.

(43)

Then, we define $\tilde{W}_{0}(t)=W_{0}(\lambda_{0}^{-1}(t))$ for $t\geq 0$ . It follows that $\tilde{W}_{0}$ is a subcritical branching process in which each individual gives birth at rate $1/(1+s)$ and dies at rate $1$ .
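To make the time-changed process concrete, the following sketch simulates a continuous-time branching process in which each individual gives birth at rate $1/(1+s)$ and dies at rate $1$, using a standard Gillespie-style scheme. It is only an illustration of the law of $\tilde{W}_{0}$ under assumed parameter values, not the coupling with the population process constructed above; the function name and arguments are hypothetical.

```python
import random

def simulate_subcritical(initial, s, t_max, rng):
    """Simulate until extinction or time t_max; return a list of (time, size) pairs."""
    t, size = 0.0, initial
    path = [(t, size)]
    birth, death = 1.0 / (1.0 + s), 1.0  # per-individual rates
    while size > 0:
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(size * (birth + death))
        if t > t_max:
            break
        # The event is a birth with probability birth / (birth + death) = 1 / (2 + s).
        if rng.random() < birth / (birth + death):
            size += 1
        else:
            size -= 1
        path.append((t, size))
    return path
```

A call such as `simulate_subcritical(20, 0.1, 50.0, random.Random(1))` returns one sample path; since the process is subcritical, paths eventually reach $0$ and stay there.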

We define

\tau=\inf\{t\geq T_{1,\alpha}\wedge T_{1}^{\prime}:W_{0}(t)=0\}

(44)

and

\tau^{\prime}=\inf\left\{t\geq T_{1,\alpha}\wedge T_{1}^{\prime}:W_{0}(t)>\left(1-\frac{\alpha}{2}\right)N\right\}.

(45)

Lemma 6.3.

We have $\lim_{N\rightarrow\infty}P(\tau^{\prime}=\infty)=1$ .

Proof.

Consider the branching process $\tilde{W}_{0}$ . Since each individual in this process gives birth at rate $1/(1+s)$ and dies at rate $1$ , we know that at any time, the next event is a birth with probability $1/(2+s)$ and a death with probability $(1+s)/(2+s)$ . Therefore, if we evaluate the process $\tilde{W}_{0}$ at the time of each birth or death event, we obtain an asymmetric random walk. Note that if $T_{1,\alpha}<T_{1}^{\prime}$ , then $\tilde{W}_{0}(0)=W_{0}(T_{1,\alpha})=X_{0}(T_{1,\alpha})\leq N-X_{1}(T_{1,\alpha})<(1-\alpha)N$ . Also, if $T_{1}^{\prime}\leq T_{1,\alpha}$ , then $\tilde{W}_{0}(0)=0$ . Thus, in both cases, we have $\tilde{W}_{0}(0)\leq(1-\alpha)N$ .

Given that there are $k$ individuals of type $0$ at time $T_{1,\alpha}\wedge T_{1}^{\prime}$ and $k\leq(1-\alpha)N$ , the probability that this random walk reaches $0$ before $\lfloor(1-\frac{\alpha}{2})N\rfloor+1$ is

\begin{aligned}1-\frac{(1+s)^{k}-1}{(1+s)^{\lfloor(1-\frac{\alpha}{2})N\rfloor+1}-1}&\geq 1-(1+s)^{k-\lfloor(1-\frac{\alpha}{2})N\rfloor-1}\\ &\geq 1-(1+s)^{(1-\alpha)N-\lfloor(1-\frac{\alpha}{2})N\rfloor-1}\\ &\geq 1-(1+s)^{-N\alpha/2}.\end{aligned}

The bound on the right-hand side does not depend on $k$ . Therefore, given that $\tilde{W}_{0}(0)\leq(1-\alpha)N$ , the probability that $\tilde{W}_{0}$ hits $0$ before $\lfloor(1-\frac{\alpha}{2})N\rfloor+1$ is bounded from below by $1-(1+s)^{-N\alpha/2}$ . By the definitions of $\tau$ and $\tau^{\prime}$ in (44) and (45), we have

P(\tau<\tau^{\prime}\,|\,\tilde{W}_{0}(0)\leq(1-\alpha)N)\geq 1-(1+s)^{-N\alpha/2}.

Therefore,

P(\tau<\tau^{\prime})\geq\left(1-(1+s)^{-N\alpha/2}\right)P(\tilde{W}_{0}(0)\leq(1-\alpha)N).

(46)

Since $s\rightarrow 0$ and $Ns\alpha/2\rightarrow\infty$ as $N\rightarrow\infty$ , we have $(1+s)^{1/s}\rightarrow e$ and $(1+s)^{-N\alpha/2}=[(1+s)^{1/s}]^{-Ns\alpha/2}\rightarrow 0$ as $N\rightarrow\infty$ . Thus, from (46), $\lim_{N\rightarrow\infty}P(\tau<\tau^{\prime})=1$ . Lastly, note that after time $\tau$ , the process $W_{0}$ will stay at $0$ forever. Hence, $\tau<\tau^{\prime}$ implies that $\tau^{\prime}=\infty$ . ∎
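The hitting probability used in this proof is the classical gambler's ruin formula for a walk that steps up with probability $p=1/(2+s)$ and down with probability $q=(1+s)/(2+s)$, so that $q/p=1+s$. As a sanity check, the sketch below verifies with exact rational arithmetic that $h(k)=1-\frac{(1+s)^{k}-1}{(1+s)^{M}-1}$ satisfies the boundary conditions $h(0)=1$, $h(M)=0$ and the one-step equation $h(k)=p\,h(k+1)+q\,h(k-1)$. The names `prob_hit_zero_first`, `satisfies_one_step_equation`, and the barrier `M` are notation introduced here for illustration.

```python
from fractions import Fraction

def prob_hit_zero_first(k, M, s):
    """Probability that the walk started at k reaches 0 before M."""
    r = 1 + s  # ratio q / p of down-step to up-step probabilities
    return 1 - (r ** k - 1) / (r ** M - 1)

def satisfies_one_step_equation(M, s):
    """Check boundary values and h(k) = p h(k+1) + q h(k-1) for 0 < k < M."""
    p = 1 / (2 + s)
    q = (1 + s) / (2 + s)
    h = [prob_hit_zero_first(k, M, s) for k in range(M + 1)]
    boundary_ok = h[0] == 1 and h[M] == 0
    interior_ok = all(h[k] == p * h[k + 1] + q * h[k - 1] for k in range(1, M))
    return boundary_ok and interior_ok
```

Passing `s` as a `Fraction` keeps every comparison exact; with a float `s` the equalities would only hold approximately.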

Lemma 6.4.

We have

\lim_{N\rightarrow\infty}P\left(T_{1}^{\prime}\leq T_{1,\alpha}+\frac{4(1+s)\log N}{\alpha s}\right)=1.

Proof.

Define $\tau$ and $\tau^{\prime}$ as in (44) and (45). Since $X_{0}(t)\leq W_{0}(t)$ for all $t\geq T_{1,\alpha}\wedge T_{1}^{\prime}$ , the process $X_{0}$ must reach $0$ before or at the same time as the process $W_{0}$ does, which implies that $T^{\prime}_{1}\leq\tau$ . It is therefore enough to show that

\lim_{N\rightarrow\infty}P\left(\tau\leq(T_{1,\alpha}\wedge T_{1}^{\prime})+\frac{4(1+s)\log N}{\alpha s}\right)=1.

(47)

Consider the process $\tilde{W}_{0}$ , which is a branching process in which each individual gives birth at rate $1/(1+s)$ and dies at rate $1$ . For all $t\geq 0$ ,

E[\tilde{W}_{0}(t)|\tilde{W}_{0}(0)]=\tilde{W}_{0}(0)e^{(\frac{1}{1+s}-1)t}=\tilde{W}_{0}(0)e^{-\frac{st}{1+s}}.

Since $\tilde{W}_{0}(0)\leq N$ , we have $E[\tilde{W}_{0}(t)]\leq Ne^{-\frac{st}{1+s}}$ . Let $t_{1}=\frac{2(1+s)\log N}{s}$ . Then $E[\tilde{W}_{0}(t_{1})]\leq 1/N$ . By Markov’s inequality,

P(\tilde{W}_{0}(t_{1})=0)=1-P(\tilde{W}_{0}(t_{1})\geq 1)\geq 1-\frac{1}{N}.

Hence, $\lim_{N\rightarrow\infty}P(\tilde{W}_{0}(t_{1})=0)=1$ . Because $\tau$ is the first time that $W_{0}$ hits $0$ , we have $\lim_{N\rightarrow\infty}P(\tau\leq\lambda_{0}^{-1}(t_{1}))=1$ . By Lemma 6.3, $\lim_{N\rightarrow\infty}P(\tau\leq\lambda_{0}^{-1}(t_{1})<\tau^{\prime})=1$ .

Finally, we will show that if $\lambda_{0}^{-1}(t_{1})<\tau^{\prime}$ , then $\lambda_{0}^{-1}(t_{1})\leq(T_{1,\alpha}\wedge T_{1}^{\prime})+\frac{2t_{1}}{\alpha}$ , which will imply (47). By the definitions of $d_{0}$ and $\lambda_{0}$ , for all $t\geq T_{1,\alpha}\wedge T_{1}^{\prime}$ , we have

\lambda_{0}(t)=\int_{T_{1,\alpha}\wedge T_{1}^{\prime}}^{t}\left(1-\frac{X_{0}(v)}{S(v)}+\mu\right)dv\geq\int_{T_{1,\alpha}\wedge T_{1}^{\prime}}^{t}\left(1-\frac{X_{0}(v)}{N}\right)dv.

By the definition of $\tau^{\prime}$ in (45) and the fact that $X_{0}(v)\leq W_{0}(v)$ , if $T_{1,\alpha}\wedge T_{1}^{\prime}\leq v<\tau^{\prime}$ , we have $X_{0}(v)\leq(1-\frac{\alpha}{2})N$ . Hence, when $T_{1,\alpha}\wedge T_{1}^{\prime}\leq t\leq\tau^{\prime}$ ,

\lambda_{0}(t)\geq\int_{T_{1,\alpha}\wedge T_{1}^{\prime}}^{t}\left(1-\frac{X_{0}(v)}{N}\right)dv\geq\frac{\alpha}{2}(t-(T_{1,\alpha}\wedge T_{1}^{\prime})).

Therefore, if $\lambda_{0}^{-1}(t_{1})<\tau^{\prime}$ , then

t_{1}=\lambda_{0}(\lambda_{0}^{-1}(t_{1}))\geq\frac{\alpha}{2}(\lambda_{0}^{-1}(t_{1})-(T_{1,\alpha}\wedge T_{1}^{\prime})),

which implies that $\lambda_{0}^{-1}(t_{1})\leq(T_{1,\alpha}\wedge T_{1}^{\prime})+\frac{2t_{1}}{\alpha}$ . ∎
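The constants in Lemma 6.4 can be traced through a short computation. The sketch below evaluates $t_{1}=2(1+s)(\log N)/s$, the mean bound $Ne^{-st/(1+s)}$ at $t=t_{1}$, and the real-time bound $2t_{1}/\alpha$ obtained after undoing the time change. The function names and sample values are assumptions introduced for illustration.

```python
import math

def extinction_time_scale(N, s):
    """t_1 = 2 (1 + s) log N / s, measured in the time-changed clock."""
    return 2.0 * (1.0 + s) * math.log(N) / s

def mean_bound(N, s, t):
    """Upper bound N exp(-s t / (1 + s)) on the expected size at time t."""
    return N * math.exp(-s * t / (1.0 + s))

def real_time_bound(N, s, alpha):
    """Bound 2 t_1 / alpha on the original-clock time, valid before tau'."""
    return 2.0 * extinction_time_scale(N, s) / alpha
```

At $t=t_{1}$ the mean bound equals $1/N$, which is what drives the Markov inequality step, and `real_time_bound` reproduces the quantity $4(1+s)(\log N)/(\alpha s)$ in the statement of the lemma.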

Proof of Lemma 3.1.

Part 1 of Lemma 3.1 is part 1 of Proposition 4.18. Part 2 of Lemma 3.1 follows from (41) and Lemmas 6.1 and 6.4. To prove parts 3 and 4 of Lemma 3.1, it suffices to show that $\lim_{N\rightarrow\infty}P(T_{1}^{\prime}<T^{(1)})=1$ . This result holds because $\lim_{N\rightarrow\infty}P(\tau_{2}\leq T^{(1)})=1$ by Lemma 4.13 and $\lim_{N\rightarrow\infty}P(T_{1}^{\prime}<\tau_{2})=1$ by Lemma 5.4 and part 2 of Lemma 3.1. Finally, to prove part 5 of Lemma 3.1, it is enough to establish that $\lim_{N\rightarrow\infty}P(T_{1}^{\prime}<T_{k}^{*})=1$ for $2\leq k\leq\Delta$ . However, we have already seen that $\lim_{N\rightarrow\infty}P(T_{1}^{\prime}<\tau_{2})=1$ , and Lemmas 4.11 and 4.12 imply that $\lim_{N\rightarrow\infty}P(\tau_{2}\leq\tau_{k}\leq T_{k}^{*})=1$ . ∎

References

[1] E. Baake, A. González Casanova, S. Probst, and A. Wakolbinger (2019). Modelling and simulating Lenski’s long-term evolution experiment. Theor. Pop. Biol. 127, 58-74.
[2] É. Brunet, I. M. Rouzine, and C. O. Wilke (2008). The stochastic edge in adaptive evolution. Genetics 179, 603-620.
[3] M. M. Desai and D. S. Fisher (2007). Beneficial mutation-selection balance and the effect of linkage on positive selection. Genetics 176, 1759-1798.
[4] M. M. Desai, A. M. Walczak, and D. S. Fisher (2013). Genetic diversity and the structure of genealogies in rapidly adapting populations. Genetics 193, 565-585.
[5] R. Durrett (2008). Probability Models for DNA Sequence Evolution. 2nd ed. Springer, New York.
[6] R. Durrett and J. Mayberry (2011). Traveling waves of selective sweeps. Ann. Appl. Probab. 21, 699-744.
[7] D. S. Fisher (2013). Asexual evolution waves: fluctuations and universality. Journal of Statistical Mechanics: Theory and Experiment, P01011.
[8] V. G. Gadag and M. B. Rajarshi (1992). On processes associated with a super-critical Markov branching process. Serdica. 18, 173-178.
[9] P. J. Gerrish and R. E. Lenski (1998). The fate of competing beneficial mutations in an asexual population. Genetica 102/103, 127-144.
[10] F. Hermann, A. González Casanova, R. Soares dos Santos, A. Tobiás, and A. Wakolbinger. From clonal interference to Poissonian interacting trajectories. arXiv:2407.00793.
[11] A. González Casanova, N. Kurt, A. Wakolbinger, and L. Yuan (2016). An individual-based model for the Lenski experiment, and the deceleration of the relative fitness. Stochastic Process. Appl. 126, 2211-2252.
[12] B. H. Good, I. M. Rouzine, D. J. Balick, O. Hallatschek, and M. M. Desai (2012). Distribution of fixed beneficial mutations and the rate of adaptation in asexual populations. Proc. Natl. Acad. Sci. USA 109, 4950-4955.
[13] B. H. Good, A. M. Walczak, R. A. Neher, and M. M. Desai (2014). Genetic diversity in the interference selection limit. PLOS Genetics 10, e1004222.
[14] T. E. Harris (1963). The Theory of Branching Processes. Springer-Verlag, Berlin.
[15] M. Kelly (2013). Upper bound on the rate of adaptation in an asexual population. Ann. Appl. Probab. 23, 1377-1408.
[16] M. Kimura and T. Ohta (1969). The average number of generations until the fixation of a mutant gene in a finite population. Genetics 61, 763-771.
[17] J. Liu and J. Schweinsberg (2021). Particle configurations for branching Brownian motion with an inhomogeneous branching rate. ALEA Lat. Am. J. Probab. Math. Stat. 20, 731-803.
[18] M. J. Melissa, B. H. Good, D. S. Fisher, and M. M. Desai (2022). Population genetics of polymorphism and divergence in rapidly evolving populations. Genetics 221, iyac053.
[19] R. A. Neher and O. Hallatschek (2013). Genealogies in rapidly adapting populations. Proc. Natl. Acad. Sci. USA 110, 437-442.
[20] M. I. Roberts and J. Schweinsberg (2020). A Gaussian particle distribution for branching Brownian motion with an inhomogeneous branching rate. Electron. J. Probab. 26, no. 103, 1-76.
[21] I. M. Rouzine, É. Brunet, and C. O. Wilke (2008). The traveling-wave approach to asexual evolution: Muller’s ratchet and speed of adaptation. Theor. Pop. Biol 73, 24-46.
[22] J. Schweinsberg (2017). Rigorous results for a population model with selection I: evolution of the fitness distribution. Electron. J. Probab. 22, no. 37, 1-94.
[23] J. Schweinsberg (2017). Rigorous results for a population model with selection II: genealogy of the population. Electron. J. Probab. 22, no. 38, 1-54.
[24] F. Yu, A. Etheridge, and C. Cuthbertson (2010). Asymptotic behavior of the rate of adaptation. Ann. Appl. Probab. 20, 978-1004.
