Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

Exact path-integral representation of the Wright-Fisher model with mutation and selection

David Waxman Centre for Computational Systems Biology, ISTBI, Fudan University, 220 Handan Road, Shanghai 200433, China
Abstract

The Wright-Fisher model describes a biological population containing a finite number of individuals. In this work we consider a Wright-Fisher model for a randomly mating population, where selection and mutation act at an unlinked locus. The selection acting has a general form, and the locus may have two or more alleles. We determine an exact representation of the time dependent transition probability of such a model in terms of a path integral. Path integrals were introduced in physics and mathematics, and have found numerous applications in different fields, where a probability distribution, or closely related object, is represented as a ‘sum’ of contributions over all paths or trajectories between two points. Path integrals provide alternative calculational routes to problems, and may be a source of new intuition and suggest new approximations. For the case of two alleles, we relate the exact Wright-Fisher path-integral result to the path-integral form of the transition density under the diffusion approximation. We determine properties of the Wright-Fisher transition probability for multiple alleles. We show how, in the absence of mutation, the Wright-Fisher transition probability incorporates phenomena such as fixation and loss.

keywords:
random genetic drift, frequency trajectories, multiple alleles, transition probability, Markov chain, allele frequency

Introduction

The Wright-Fisher model describes a biological population containing a finite number of individuals[1, 2]. It represents a fundamental model (or class of models) within population genetics that continues to be of relevance [3]. Such a model, at heart, describes the stochastic fluctuations of allele frequencies that occur in finite populations. The fluctuations arise because the parents of one generation do not all make the same contribution to the next generation. The Wright-Fisher model thus describes random genetic drift.

In addition to random genetic drift, the model can also incorporate the dynamical effects of selection, mutation and other evolutionary forces, and has been used in many different situations. For example, the model lies at the heart of forward simulations[4], as well as in inference[5], in evolutionary games[6], and it connects intimately with the coalescent[7].

The Wright-Fisher model is a discrete state, discrete time, Markov chain. The discrete states correspond to possible allele frequencies (or sets of allele frequencies), and the discrete time corresponds to generations. The model has been analysed under the diffusion approximation[8, 3], where states and times are treated as taking continuous values. Recently, the Wright-Fisher model of a biallelic locus, subject to multiplicative selection but in the absence of mutation, has been considered under the diffusion approximation, and the time-dependent transition density has been represented as a path-integral[9].

Path integrals were introduced into quantum theory by Feynman[10], and into the mathematics of diffusion by Wiener[11]. Generally, a probability distribution (or closely related object), associated with a diffusion-like process, is represented as a ‘sum’ of contributions over all paths or trajectories between two points or states. The ‘sum’ over paths is often an integral, and this is the origin of the name ‘path integral’. An alternative name is ‘functional integral’, since integration over paths/trajectories can also be thought of as integration over functions - namely the trajectories themselves. Related approaches, termed functional methods, have a large variety of applications[12].

The introduction of path integrals in physics has provided alternative calculational routes to problems. Beyond this, path integrals, because they focus on visualisable trajectories, may be a source of new intuition, and may suggest new ways to proceed and new approximations[10].

In the present paper we work fully within the framework of a Wright-Fisher model where both states and time are discrete; we do not employ the diffusion approximation, although we relate some results to this approximation. We consider a randomly mating sexual population, where a general scheme of selection acts at an unlinked locus, which is also subject to mutation. We derive an exact path-integral representation of the time-dependent transition probability of this model.

We first consider a locus with two alleles, and then generalise to the locus having n𝑛nitalic_n alleles, where n𝑛nitalic_n can take the values 2222, 3333, 4444, … . There are numerous examples of studies of loci with two alleles, and there are increasing numbers of examples where multiple (>2absent2>2> 2) alleles exist at a locus[13, 14, 15, 16].

Theoretical background for the case of two alleles

Consider a population that is diploid and sexual, and reproduces by random mating. We assume there is an equal sex ratio and no sexual dimorphism. Time is measured in generations, and labelled by t=0,1,2,3,𝑡0123t=0,1,2,3,\ldotsitalic_t = 0 , 1 , 2 , 3 , ….

The organisms in the population are subject to mutation and selection at a single unlinked locus. The locus has two alleles that we refer to as A1subscript𝐴1A_{1}italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and A2subscript𝐴2A_{2}italic_A start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT. We shall focus on just one of the two alleles, say A1subscript𝐴1A_{1}italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, and often call it the focal allele.

With two alleles, the sum of their frequencies is unity. Thus specification of the frequency of the focal allele at any time determines the frequency of the other allele at this time. The state of the population can thus be described just in terms of the frequency of the focal allele, which we will often just call the frequency.

Effectively infinite population

We first consider a very large (effectively infinite) population.

We take a generation to begin with adults, who sexually reproduce via random mating and then die. Each mating yields the same very large number of offspring.

If the frequency of the focal allele in adults is x𝑥xitalic_x, then the frequency of the focal allele in offspring is x=fmut(x)superscript𝑥subscript𝑓𝑚𝑢𝑡𝑥x^{\ast}=f_{mut}(x)italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = italic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) where fmut(x)subscript𝑓𝑚𝑢𝑡𝑥f_{mut}(x)italic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) takes into account frequency changes caused by mutation. This function is given by

fmut(x)=(1u)x+v(1x)subscript𝑓𝑚𝑢𝑡𝑥1𝑢𝑥𝑣1𝑥f_{mut}(x)=\left(1-u\right)x+v(1-x)italic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) = ( 1 - italic_u ) italic_x + italic_v ( 1 - italic_x ) (1)

where u𝑢uitalic_u is the probability that an A1subscript𝐴1A_{1}italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT allele in a parent mutates to an A2subscript𝐴2A_{2}italic_A start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT allele in an offspring, and v𝑣vitalic_v is the corresponding A2subscript𝐴2A_{2}italic_A start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT to A1subscript𝐴1A_{1}italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT mutation probability. In the absence of mutation fmut(x)=xsubscript𝑓𝑚𝑢𝑡𝑥𝑥f_{mut}(x)=xitalic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) = italic_x which is equivalent to x=xsuperscript𝑥𝑥x^{\ast}=xitalic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = italic_x.

We assume viability selection determines the probability of different offspring surviving to maturity. If the frequency of the focal allele in offspring (i.e., after mutation) is xsuperscript𝑥x^{\ast}italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT, then the frequency of this allele after viability selection has acted is fsel(x)subscript𝑓𝑠𝑒𝑙superscript𝑥f_{sel}(x^{\ast})italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ), where fsel(x)subscript𝑓𝑠𝑒𝑙superscript𝑥f_{sel}(x^{\ast})italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) takes into account frequency changes of xsuperscript𝑥x^{\ast}italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT due to selection. Some non selective thinning may occur at this point, but providing the population size remains very large, this does not cause any further changes in allele frequencies.

Selection acts on variation in the population, and when there is no variation there are no effects of selection. There is no variation when carriers of only one allele are present in the population, which corresponds to x=0𝑥0x=0italic_x = 0 and x=1𝑥1x=1italic_x = 1. We take

fsel(x)=x+σ(x)x(1x)subscript𝑓𝑠𝑒𝑙𝑥𝑥𝜎𝑥𝑥1𝑥f_{sel}(x)=x+\sigma(x)x(1-x)italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( italic_x ) = italic_x + italic_σ ( italic_x ) italic_x ( 1 - italic_x ) (2)

where the function σ(x)𝜎𝑥\sigma(x)italic_σ ( italic_x ) (with |σ(x)|<𝜎𝑥|\sigma(x)|<\infty| italic_σ ( italic_x ) | < ∞) is determined by the particular scheme of selection that is operating, and the effect of selection in fsel(x)subscript𝑓𝑠𝑒𝑙𝑥f_{sel}(x)italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( italic_x ), namely σ(x)x(1x)𝜎𝑥𝑥1𝑥\sigma(x)x(1-x)italic_σ ( italic_x ) italic_x ( 1 - italic_x ), has the required property of vanishing at both x=0𝑥0x=0italic_x = 0 and x=1𝑥1x=1italic_x = 1.

A few examples of σ(x)𝜎𝑥\sigma(x)italic_σ ( italic_x ) are as follows.

  1. (i)

    If the relative fitnesses of the three genotypes A1A1subscript𝐴1subscript𝐴1A_{1}A_{1}italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, A1A2subscript𝐴1subscript𝐴2A_{1}A_{2}italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_A start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT and A2A2subscript𝐴2subscript𝐴2A_{2}A_{2}italic_A start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT italic_A start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT are 1+s1𝑠1+s1 + italic_s, 1+hs1𝑠1+hs1 + italic_h italic_s and 1111, respectively, then σ(x)=s×[(12h)x+h]/[1+sx2+2hsx(1x)]𝜎𝑥𝑠delimited-[]12𝑥delimited-[]1𝑠superscript𝑥22𝑠𝑥1𝑥\sigma(x)=s\times\left[\left(1-2h\right)x+h\right]/[1+sx^{2}+2hsx(1-x)]italic_σ ( italic_x ) = italic_s × [ ( 1 - 2 italic_h ) italic_x + italic_h ] / [ 1 + italic_s italic_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + 2 italic_h italic_s italic_x ( 1 - italic_x ) ]. To leading order in s𝑠sitalic_s (assuming |s|1much-less-than𝑠1|s|\ll 1| italic_s | ≪ 1), we have σ(x)=s×[(12h)x+h]𝜎𝑥𝑠delimited-[]12𝑥\sigma(x)=s\times\left[\left(1-2h\right)x+h\right]italic_σ ( italic_x ) = italic_s × [ ( 1 - 2 italic_h ) italic_x + italic_h ], in which case any h1/212h\neq 1/2italic_h ≠ 1 / 2 will lead to σ(x)𝜎𝑥\sigma(x)italic_σ ( italic_x ) varying with x𝑥xitalic_x.

  2. (ii)

    If selection is additive, and the relative fitnesses of the three genotypes are 1+2s12𝑠1+2s1 + 2 italic_s, 1+s1𝑠1+s1 + italic_s and 1111, respectively, then σ(x)=s/(1+2sx)𝜎𝑥𝑠12𝑠𝑥\sigma(x)=s/\left(1+2sx\right)italic_σ ( italic_x ) = italic_s / ( 1 + 2 italic_s italic_x ), and to leading order in s𝑠sitalic_s we have σ(x)=s𝜎𝑥𝑠\sigma(x)=sitalic_σ ( italic_x ) = italic_s, i.e., a constant.

  3. (iii)

    If selection is multiplicative, and the relative fitnesses of the three genotypes are (1+s)2superscript1𝑠2\left(1+s\right)^{2}( 1 + italic_s ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, 1+s1𝑠1+s1 + italic_s and 1111, respectively, then σ(x)=s/(1+sx)𝜎𝑥𝑠1𝑠𝑥\sigma(x)=s/(1+sx)italic_σ ( italic_x ) = italic_s / ( 1 + italic_s italic_x ), and to leading order in s𝑠sitalic_s we have σ(x)=s𝜎𝑥𝑠\sigma(x)=sitalic_σ ( italic_x ) = italic_s., i.e., a constant. Thus weak multiplicative selection is very similar, in effect, to weak additive selection.

We note that while small selection coefficients (i.e., small values of s𝑠sitalic_s, and more generally small values of σ(x)𝜎𝑥\sigma(x)italic_σ ( italic_x )) are common in nature [17], strongly selected alleles do sometimes occur, for example alleles that are appreciably deleterious [13]. Accordingly, we will not make the assumption that |σ(x)|𝜎𝑥|\sigma(x)|| italic_σ ( italic_x ) | is small, and will simply assume that σ(x)𝜎𝑥\sigma(x)italic_σ ( italic_x ) follows, without any approximation, from a selection scheme.

The frequency of the A1subscript𝐴1A_{1}italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT allele in offspring, after selection and mutation have acted, can be expressed in terms of the frequency, x𝑥xitalic_x, in adults, as fsel(x)fsel(fmut(x))subscript𝑓𝑠𝑒𝑙superscript𝑥subscript𝑓𝑠𝑒𝑙subscript𝑓𝑚𝑢𝑡𝑥f_{sel}(x^{\ast})\equiv f_{sel}(f_{mut}(x))italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ≡ italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( italic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) ) and we write

F(x)=fsel(fmut(x)).𝐹𝑥subscript𝑓𝑠𝑒𝑙subscript𝑓𝑚𝑢𝑡𝑥F(x)=f_{sel}(f_{mut}(x)).italic_F ( italic_x ) = italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( italic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) ) . (3)

Let X(t)𝑋𝑡X(t)italic_X ( italic_t ) denote the frequency of the A1subscript𝐴1A_{1}italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT allele in the adults of generation t𝑡titalic_t. Then in a very large population, the frequency obeys the deterministic equation

X(t+1)=F(X(t)).𝑋𝑡1𝐹𝑋𝑡X(t+1)=F(X(t)).italic_X ( italic_t + 1 ) = italic_F ( italic_X ( italic_t ) ) . (4)

Finite population

We now consider a finite population, where N𝑁Nitalic_N adults are present at the start of each generation. The processes of reproduction, mutation and viability selection occur as in an effectively infinite population. However, after viability selection there is a round of non selective sampling/number regulation of the mature offspring, that leads to N𝑁Nitalic_N individuals being present in the population. These become the N𝑁Nitalic_N adults of the next generation. The behaviour of this population can be described by a Wright-Fisher model, as is shown in textbooks[18]. We will now use such a model (which can, like the diffusion approximation, incorporate an effective population size[19]).

For the population under consideration, let 𝐌𝐌\mathbf{M}bold_M denote the transition matrix of the Wright-Fisher model. We write the (i,j)𝑖𝑗(i,j)( italic_i , italic_j ) element of 𝐌𝐌\mathbf{M}bold_M as Mi,jsubscript𝑀𝑖𝑗M_{i,j}italic_M start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT, where i𝑖iitalic_i and j𝑗jitalic_j can take the values 0,1,2,,2N0122𝑁0,1,2,\ldots,2N0 , 1 , 2 , … , 2 italic_N. Then for a population where the focal allele has the frequency j/(2N)𝑗2𝑁j/(2N)italic_j / ( 2 italic_N ) in one generation, the probability that the focal allele will have the frequency i/(2N)𝑖2𝑁i/(2N)italic_i / ( 2 italic_N ) in the next generation is Mi,jsubscript𝑀𝑖𝑗M_{i,j}italic_M start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT. With (ab)=a!(ab)!b!binomial𝑎𝑏𝑎𝑎𝑏𝑏\binom{a}{b}=\tfrac{a!}{(a-b)!b!}( FRACOP start_ARG italic_a end_ARG start_ARG italic_b end_ARG ) = divide start_ARG italic_a ! end_ARG start_ARG ( italic_a - italic_b ) ! italic_b ! end_ARG a binomial coefficient, we have[18]

Mi,j=(2Ni)[F(j2N)]i[1F(j2N)]2Ni.subscript𝑀𝑖𝑗binomial2𝑁𝑖superscriptdelimited-[]𝐹𝑗2𝑁𝑖superscriptdelimited-[]1𝐹𝑗2𝑁2𝑁𝑖M_{i,j}=\binom{2N}{i}\left[F\left(\frac{j}{2N}\right)\right]^{i}\left[1-F\left% (\frac{j}{2N}\right)\right]^{2N-i}.italic_M start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG italic_i end_ARG ) [ italic_F ( divide start_ARG italic_j end_ARG start_ARG 2 italic_N end_ARG ) ] start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT [ 1 - italic_F ( divide start_ARG italic_j end_ARG start_ARG 2 italic_N end_ARG ) ] start_POSTSUPERSCRIPT 2 italic_N - italic_i end_POSTSUPERSCRIPT . (5)

The transition matrix is always normalised in the sense that i=02NMi,j=1superscriptsubscript𝑖02𝑁subscript𝑀𝑖𝑗1\sum_{i=0}^{2N}M_{i,j}=1∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 italic_N end_POSTSUPERSCRIPT italic_M start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT = 1 for all j𝑗jitalic_j, and invoking normalisation can resolve ambiguities (for example, when F(x)=0𝐹𝑥0F(x)=0italic_F ( italic_x ) = 0, normalisation ensures M0,0=1subscript𝑀001M_{0,0}=1italic_M start_POSTSUBSCRIPT 0 , 0 end_POSTSUBSCRIPT = 1 so i=02NM0,j=1superscriptsubscript𝑖02𝑁subscript𝑀0𝑗1\sum_{i=0}^{2N}M_{0,j}=1∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 italic_N end_POSTSUPERSCRIPT italic_M start_POSTSUBSCRIPT 0 , italic_j end_POSTSUBSCRIPT = 1 for all j𝑗jitalic_j).

The transition matrix is key to many calculations. If 𝐏(t)𝐏𝑡\mathbf{P}(t)bold_P ( italic_t ) is a column vector containing the probabilities of all 2N+12𝑁12N+12 italic_N + 1 possible frequency states of the population in generation t𝑡titalic_t, i.e., the probability distribution for generation t𝑡titalic_t, then using the transition matrix we can determine the probability distribution for generation t+1𝑡1t+1italic_t + 1, namely 𝐏(t+1)=𝐌𝐏(t)𝐏𝑡1𝐌𝐏𝑡\mathbf{P}(t+1)=\mathbf{MP}(t)bold_P ( italic_t + 1 ) = bold_MP ( italic_t ). Furthermore, using the elements of the transition matrix, we can determine the probability that the population passes through a particular set of frequency states over time, i.e., displays a particular frequency trajectory. For example, if the population starts with frequency l/(2N)𝑙2𝑁l/(2N)italic_l / ( 2 italic_N ) in one generation, then the probability that in the next 3333 generations the population will have the frequencies k/(2N)𝑘2𝑁k/(2N)italic_k / ( 2 italic_N ), j/(2N)𝑗2𝑁j/(2N)italic_j / ( 2 italic_N ) and i/(2N)𝑖2𝑁i/(2N)italic_i / ( 2 italic_N ), respectively, is given by Mi,jMj,kMk,lsubscript𝑀𝑖𝑗subscript𝑀𝑗𝑘subscript𝑀𝑘𝑙M_{i,j}M_{j,k}M_{k,l}italic_M start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT italic_M start_POSTSUBSCRIPT italic_j , italic_k end_POSTSUBSCRIPT italic_M start_POSTSUBSCRIPT italic_k , italic_l end_POSTSUBSCRIPT.

Alternative notation for the transition matrix

We shall now write elements of 𝐌𝐌\mathbf{M}bold_M in a different notation that will be useful for our purposes. We introduce the notion of an allowed frequency of an allele which is given by

allowed frequency=integer2Nallowed frequencyinteger2𝑁\text{allowed frequency}=\frac{\text{integer}}{2N}allowed frequency = divide start_ARG integer end_ARG start_ARG 2 italic_N end_ARG (6)

where the integer can take any of the values 0,1,2,,2N0122𝑁0,1,2,\ldots,2N0 , 1 , 2 , … , 2 italic_N.

To keep the notation as simple as possible, we shall, for a locus with two alleles, reserve the use of a𝑎aitalic_a, x𝑥xitalic_x, xsuperscript𝑥x^{\prime}italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, x(r)𝑥𝑟x(r)italic_x ( italic_r ) (for various integral r𝑟ritalic_r) and z𝑧zitalic_z, as values of allowed frequencies. In terms of the allowed frequencies xsuperscript𝑥x^{\prime}italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and x𝑥xitalic_x, we write the elements of 𝐌𝐌\mathbf{M}bold_M as

M(x|x)=(2N2Nx)[F(x)]2Nx[1F(x)]2N(1x)𝑀conditionalsuperscript𝑥𝑥binomial2𝑁2𝑁superscript𝑥superscriptdelimited-[]𝐹𝑥2𝑁superscript𝑥superscriptdelimited-[]1𝐹𝑥2𝑁1superscript𝑥M(x^{\prime}|x)=\binom{2N}{2Nx^{\prime}}\left[F(x)\right]^{2Nx^{\prime}}\left[% 1-F(x)\right]^{2N(1-x^{\prime})}italic_M ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) [ italic_F ( italic_x ) ] start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT [ 1 - italic_F ( italic_x ) ] start_POSTSUPERSCRIPT 2 italic_N ( 1 - italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT (7)

which gives the probability of a transition from the population state (i.e., frequency) x𝑥xitalic_x, in one generation, to state xsuperscript𝑥x^{\prime}italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT in the next generation. Thus if x=i/(2N)superscript𝑥𝑖2𝑁x^{\prime}=i/(2N)italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_i / ( 2 italic_N ) and x=j/(2N)𝑥𝑗2𝑁x=j/(2N)italic_x = italic_j / ( 2 italic_N ), with i𝑖iitalic_i and j𝑗jitalic_j any two of the integers 0,1,2,,2N0122𝑁0,1,2,\ldots,2N0 , 1 , 2 , … , 2 italic_N, then M(x|x)𝑀conditionalsuperscript𝑥𝑥M(x^{\prime}|x)italic_M ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) coincides with Mi,jsubscript𝑀𝑖𝑗M_{i,j}italic_M start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT in Eq. (5).

We shall refer to a locus that is not subject to selection (but which may be subject to mutation), as a neutral locus. The transition matrix of a neutral locus, written M(0)(x|x)superscript𝑀0conditionalsuperscript𝑥𝑥M^{(0)}(x^{\prime}|x)italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ), is obtained from M(x|x)𝑀conditionalsuperscript𝑥𝑥M(x^{\prime}|x)italic_M ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) by setting σ(x)𝜎𝑥\sigma(x)italic_σ ( italic_x ) to zero for all x𝑥xitalic_x. With

x=fmut(x)superscript𝑥subscript𝑓𝑚𝑢𝑡𝑥x^{\ast}=f_{mut}(x)italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = italic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) (8)

this leads to

M(0)(x|x)=(2N2Nx)(x)2Nx(1x)2N(1x).superscript𝑀0conditionalsuperscript𝑥𝑥binomial2𝑁2𝑁superscript𝑥superscriptsuperscript𝑥2𝑁superscript𝑥superscript1superscript𝑥2𝑁1superscript𝑥M^{(0)}(x^{\prime}|x)=\binom{2N}{2Nx^{\prime}}\left(x^{\ast}\right)^{2Nx^{% \prime}}\left(1-x^{\ast}\right)^{2N(1-x^{\prime})}.italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT ( 1 - italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 italic_N ( 1 - italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT . (9)

A central aspect of the analysis we present is that the form of F(x)𝐹𝑥F(x)italic_F ( italic_x ) in Eq. (3) allows us to write the transition matrix M(x|x)𝑀conditionalsuperscript𝑥𝑥M(x^{\prime}|x)italic_M ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) of Eq. (7) as the exact product of two factors:

M(x|x)=M(0)(x|x)×eC(x|x)𝑀conditionalsuperscript𝑥𝑥superscript𝑀0conditionalsuperscript𝑥𝑥superscript𝑒𝐶conditionalsuperscript𝑥𝑥M(x^{\prime}|x)=M^{(0)}(x^{\prime}|x)\times e^{C(x^{\prime}|x)}italic_M ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) = italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) × italic_e start_POSTSUPERSCRIPT italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) end_POSTSUPERSCRIPT (10)

where M(0)(x|x)superscript𝑀0conditionalsuperscript𝑥𝑥M^{(0)}(x^{\prime}|x)italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) is the neutral result given in Eq. (9), while, with xsuperscript𝑥x^{\ast}italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT given by Eq. (8), we have

C(x|x)=2N×[xln(1+σ(x)(1x))+(1x)ln(1σ(x)x)]𝐶conditionalsuperscript𝑥𝑥2𝑁delimited-[]superscript𝑥1𝜎superscript𝑥1superscript𝑥1superscript𝑥1𝜎superscript𝑥superscript𝑥C(x^{\prime}|x)=2N\times\big{[}x^{\prime}\ln\big{(}1+\sigma(x^{\ast})\left(1-x% ^{\ast}\right)\big{)}+(1-x^{\prime})\ln\big{(}1-\sigma(x^{\ast})x^{\ast}\big{)% }\big{]}italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) = 2 italic_N × [ italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT roman_ln ( 1 + italic_σ ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ( 1 - italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ) + ( 1 - italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_ln ( 1 - italic_σ ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ] (11)

- see the first subsection of Methods for details.

The factorisation in Eq. (10) says the transition matrix M(x|x)𝑀conditionalsuperscript𝑥𝑥M(x^{\prime}|x)italic_M ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) depends on a ‘core’ neutral/mutational part, M(0)(x|x)superscript𝑀0conditionalsuperscript𝑥𝑥M^{(0)}(x^{\prime}|x)italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ), and a factor eC(x|x)superscript𝑒𝐶conditionalsuperscript𝑥𝑥e^{C(x^{\prime}|x)}italic_e start_POSTSUPERSCRIPT italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) end_POSTSUPERSCRIPT that is ‘selectively controlled’ in the sense that if there is no selection (i.e., if σ(x)𝜎𝑥\sigma(x)italic_σ ( italic_x ) vanishes for all x𝑥xitalic_x) then C(x|x)𝐶conditionalsuperscript𝑥𝑥C(x^{\prime}|x)italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) vanishes and the factor eC(x|x)superscript𝑒𝐶conditionalsuperscript𝑥𝑥e^{C(x^{\prime}|x)}italic_e start_POSTSUPERSCRIPT italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) end_POSTSUPERSCRIPT is simply unity.

We note that while C(x|x)𝐶conditionalsuperscript𝑥𝑥C(x^{\prime}|x)italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) depends on selection, and precisely vanishes in the absence of selection, it also depends on the population size, N𝑁Nitalic_N, and the mutation rates u𝑢uitalic_u and v𝑣vitalic_v.

Trajectories and path integral

We now consider a trajectory of the frequency, which starts at frequency x(0)𝑥0x(0)italic_x ( 0 ) in generation 00, has frequency x(1)𝑥1x(1)italic_x ( 1 ) in generation 1111,…, and frequency x(t)𝑥𝑡x(t)italic_x ( italic_t ) in generation t𝑡titalic_t. To represent such a trajectory, which runs from time 00 to time t𝑡titalic_t, we use the notation

[x]0t=(x(0),x(1),,x(t)).superscriptsubscriptdelimited-[]𝑥0𝑡𝑥0𝑥1𝑥𝑡[x]_{0}^{t}=\left(x(0),x(1),...,x(t)\right).[ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT = ( italic_x ( 0 ) , italic_x ( 1 ) , … , italic_x ( italic_t ) ) . (12)

This expresses the trajectory as a row vector with t+1𝑡1t+1italic_t + 1 elements, all of which are allowed frequencies.

The probability of occurrence of the trajectory [x]0tsuperscriptsubscriptdelimited-[]𝑥0𝑡[x]_{0}^{t}[ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT in Eq. (12) is obtained by multiplying together the appropriate M(x|x)𝑀conditionalsuperscript𝑥𝑥M(x^{\prime}|x)italic_M ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) and is given by r=1tM(x(r)|x(r1)=r=1tM(0)(x(r)|x(r1))×er=1tC(x(r)|x(r1)){\textstyle\prod\limits_{r=1}^{t}}M(x(r)|x(r-1)={\textstyle\prod\limits_{r=1}^% {t}}M^{(0)}(x(r)|x(r-1))\times e^{\sum_{r=1}^{t}C(x(r)|x(r-1))}∏ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_M ( italic_x ( italic_r ) | italic_x ( italic_r - 1 ) = ∏ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_x ( italic_r ) | italic_x ( italic_r - 1 ) ) × italic_e start_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_C ( italic_x ( italic_r ) | italic_x ( italic_r - 1 ) ) end_POSTSUPERSCRIPT. We write this as

probability of [x]0t=W(0)([x]0t)×eC([x]0t)probability of superscriptsubscriptdelimited-[]𝑥0𝑡superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡superscript𝑒𝐶superscriptsubscriptdelimited-[]𝑥0𝑡\text{probability of }[x]_{0}^{t}=W^{(0)}\left([x]_{0}^{t}\right)\times e^{C% \left([x]_{0}^{t}\right)}probability of [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT = italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) × italic_e start_POSTSUPERSCRIPT italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT (13)

where

W(0)([x]0t)=r=1tM(0)(x(r)|x(r1))superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡superscriptsubscriptproduct𝑟1𝑡superscript𝑀0conditional𝑥𝑟𝑥𝑟1W^{(0)}\left([x]_{0}^{t}\right)={\textstyle\prod_{r=1}^{t}}M^{(0)}(x(r)|x(r-1))italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) = ∏ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_x ( italic_r ) | italic_x ( italic_r - 1 ) ) (14)

and

C([x]0t)=r=1tC(x(r)|x(r1)).𝐶superscriptsubscriptdelimited-[]𝑥0𝑡superscriptsubscript𝑟1𝑡𝐶conditional𝑥𝑟𝑥𝑟1C\left([x]_{0}^{t}\right)=\sum_{r=1}^{t}C(x(r)|x(r-1)).italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) = ∑ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_C ( italic_x ( italic_r ) | italic_x ( italic_r - 1 ) ) . (15)

Equation (13) says that the trajectory [x]0tsuperscriptsubscriptdelimited-[]𝑥0𝑡[x]_{0}^{t}[ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT, in the presence of selection, has a probability which we can write as the product of the probability of the trajectory under neutrality, W(0)([x]0t)superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡W^{(0)}\left([x]_{0}^{t}\right)italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ), and the factor eC([x]0t)superscript𝑒𝐶superscriptsubscriptdelimited-[]𝑥0𝑡e^{C\left([x]_{0}^{t}\right)}italic_e start_POSTSUPERSCRIPT italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT which is selectively controlled. The presence of eC([x]0t)superscript𝑒𝐶superscriptsubscriptdelimited-[]𝑥0𝑡e^{C\left([x]_{0}^{t}\right)}italic_e start_POSTSUPERSCRIPT italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT in Eq. (13) indicates how non-zero selection (in combination with other forces, via its N𝑁Nitalic_N, u𝑢uitalic_u and v𝑣vitalic_v dependence), modifies the probability of occurrence of an entire trajectory under neutrality.

Let K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ) denote the overall probability of going from an initial allowed frequency of a𝑎aitalic_a at time 00 to the allowed frequency of z𝑧zitalic_z at time t𝑡titalic_t. In conventional (Markov chain) language, K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ) is determined from matrix elements of 𝐌tsuperscript𝐌𝑡\mathbf{M}^{t}bold_M start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT, where 𝐌𝐌\mathbf{M}bold_M is the transition matrix that was introduced above (with elements given in Eq. (5)). By contrast, in trajectory language, all possible trajectories, between the end-points (a𝑎aitalic_a at time 00 and z𝑧zitalic_z at time t𝑡titalic_t) contribute to K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ). We can thus write K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ) as sum of the trajectory probability, W(0)[𝐱]×eC([x]0t)superscript𝑊0delimited-[]𝐱superscript𝑒𝐶superscriptsubscriptdelimited-[]𝑥0𝑡W^{(0)}[\mathbf{x}]\times e^{C\left([x]_{0}^{t}\right)}italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT [ bold_x ] × italic_e start_POSTSUPERSCRIPT italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT, over all possible trajectories. That is

K(z,t|a,0)=x(0)=ax(t)=zW(0)([x]0t)eC([x]0t).𝐾𝑧conditional𝑡𝑎0𝑥𝑡𝑧𝑥0𝑎superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡superscript𝑒𝐶superscriptsubscriptdelimited-[]𝑥0𝑡K(z,t|a,0)=\overset{x(t)=z}{\underset{x(0)=a}{\sum\cdots\sum}}W^{(0)}\left([x]% _{0}^{t}\right)e^{C\left([x]_{0}^{t}\right)}.italic_K ( italic_z , italic_t | italic_a , 0 ) = start_OVERACCENT italic_x ( italic_t ) = italic_z end_OVERACCENT start_ARG start_UNDERACCENT italic_x ( 0 ) = italic_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) italic_e start_POSTSUPERSCRIPT italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT . (16)

The notation in Eq. (16) denotes a sum over all trajectories whose end points, x(0)𝑥0x(0)italic_x ( 0 ) and x(t)𝑥𝑡x(t)italic_x ( italic_t ), have the specific allowed frequencies a𝑎aitalic_a and z𝑧zitalic_z, respectively, while x(1)𝑥1x(1)italic_x ( 1 ), x(2)𝑥2x(2)italic_x ( 2 ), ,x(t1)𝑥𝑡1...,x(t-1)… , italic_x ( italic_t - 1 ), which give the state of the population at intermediate times, take values that cover all allowed frequencies. In Figure 1 we illustrate two trajectories that contribute to a transition probability.

Refer to caption
Figure 1: Contributing trajectories. An illustration of two trajectories (i.e., frequency as a function of time) that contribute to the transition probability K(z,T|a,0)𝐾𝑧conditional𝑇𝑎0K(z,T|a,0)italic_K ( italic_z , italic_T | italic_a , 0 ), where the initial frequency is a=0.2𝑎0.2a=0.2italic_a = 0.2 at time 00, and the final frequency is z=0.8𝑧0.8z=0.8italic_z = 0.8 at time T=50𝑇50T=50italic_T = 50. All trajectories that contribute to K(z,T|a,0)𝐾𝑧conditional𝑇𝑎0K(z,T|a,0)italic_K ( italic_z , italic_T | italic_a , 0 ) take only allowed frequencies.

Equation (16) is an exact ‘path integral’ or ‘sum over paths’ representation of the finite time transition probability in a two allele Wright-Fisher model where states and times are discrete, and the model incorporates mutation and a general form of selection.

Since, by construction, W(0)([x]0t)superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡W^{(0)}\left([x]_{0}^{t}\right)italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) is independent of selection, and since C([x]0t)𝐶superscriptsubscriptdelimited-[]𝑥0𝑡C\left([x]_{0}^{t}\right)italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) vanishes when selection vanishes, the transition probability corresponding to that in Eq. (16), when there is no selection, namely the neutral probability, is written as K(0)(z,t|a,0)superscript𝐾0𝑧conditional𝑡𝑎0K^{(0)}(z,t|a,0)italic_K start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_z , italic_t | italic_a , 0 ) and given by

K(0)(z,t|a,0)=x(0)=ax(t)=zW(0)([x]0t).superscript𝐾0𝑧conditional𝑡𝑎0𝑥𝑡𝑧𝑥0𝑎superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡K^{(0)}(z,t|a,0)=\overset{x(t)=z}{\underset{x(0)=a}{\sum\cdots\sum}}W^{(0)}% \left([x]_{0}^{t}\right).italic_K start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_z , italic_t | italic_a , 0 ) = start_OVERACCENT italic_x ( italic_t ) = italic_z end_OVERACCENT start_ARG start_UNDERACCENT italic_x ( 0 ) = italic_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) . (17)

The quantity K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ) in Eq. (16) can also be interpreted as the probability distribution of the frequency (of the A1subscript𝐴1A_{1}italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT allele) at time t𝑡titalic_t, which is a random variable that we write as X(t)𝑋𝑡X(t)italic_X ( italic_t ). In particular, K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ) is the value of the distribution of X(t)𝑋𝑡X(t)italic_X ( italic_t ), when evaluated at frequency z𝑧zitalic_z, given that the frequency X(0)𝑋0X(0)italic_X ( 0 ) had the definite value a𝑎aitalic_a. Thus, for example, the expected value of X(t)𝑋𝑡X(t)italic_X ( italic_t ), given X(0)=a𝑋0𝑎X(0)=aitalic_X ( 0 ) = italic_a, is E[X(t)|X(0)=a]=zz×K(z,t|a,0)𝐸delimited-[]conditional𝑋𝑡𝑋0𝑎subscript𝑧𝑧𝐾𝑧conditional𝑡𝑎0E[X(t)|X(0)=a]=\sum_{z}z\times K(z,t|a,0)italic_E [ italic_X ( italic_t ) | italic_X ( 0 ) = italic_a ] = ∑ start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT italic_z × italic_K ( italic_z , italic_t | italic_a , 0 ) where the sum runs over all allowed values of z𝑧zitalic_z.

Approximation when there is no mutation and selection is weak

We now consider a special case of the distribution K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ). We proceed under the following assumptions.

  1. (i)

    There is no mutation (u=v=0𝑢𝑣0u=v=0italic_u = italic_v = 0).

    Equation (9), with no mutation, entails replacing xsuperscript𝑥x^{\ast}italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT by x𝑥xitalic_x and we obtain the no mutation, neutral (no selection) form of the transition matrix that we write as

    M(0,0)(x|x)=(2N2Nx)x2Nx(1x)2N(1x).superscript𝑀00conditionalsuperscript𝑥𝑥binomial2𝑁2𝑁superscript𝑥superscript𝑥2𝑁superscript𝑥superscript1𝑥2𝑁1superscript𝑥M^{(0,0)}(x^{\prime}|x)=\binom{2N}{2Nx^{\prime}}x^{2Nx^{\prime}}\left(1-x% \right)^{2N(1-x^{\prime})}.italic_M start_POSTSUPERSCRIPT ( 0 , 0 ) end_POSTSUPERSCRIPT ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) italic_x start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT ( 1 - italic_x ) start_POSTSUPERSCRIPT 2 italic_N ( 1 - italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT . (18)
  2. (ii)

    Selection is multiplicative.

    We take the A1A1subscript𝐴1subscript𝐴1A_{1}A_{1}italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, A1A2subscript𝐴1subscript𝐴2A_{1}A_{2}italic_A start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_A start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT and A2A2subscript𝐴2subscript𝐴2A_{2}A_{2}italic_A start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT italic_A start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT genotypes to have relative fitnesses of (1+s)2superscript1𝑠2\left(1+s\right)^{2}( 1 + italic_s ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, 1+s1𝑠1+s1 + italic_s and 1111, respectively. We then have σ(x)=s/(1+sx)𝜎𝑥𝑠1𝑠𝑥\sigma(x)=s/(1+sx)italic_σ ( italic_x ) = italic_s / ( 1 + italic_s italic_x ) and from Eq. (11) obtain

    C(x|x)=2N[xln(1+s)ln(1+sx)].𝐶conditionalsuperscript𝑥𝑥2𝑁delimited-[]superscript𝑥1𝑠1𝑠𝑥C(x^{\prime}|x)=2N\left[x^{\prime}\ln\left(1+s\right)-\ln\left(1+sx\right)% \right].italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) = 2 italic_N [ italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT roman_ln ( 1 + italic_s ) - roman_ln ( 1 + italic_s italic_x ) ] . (19)
  3. (iii)

    Selection is weak (|s|1much-less-than𝑠1|s|\ll 1| italic_s | ≪ 1)

    In terms of the scaled strength of selection

    R=2Ns𝑅2𝑁𝑠R=2Nsitalic_R = 2 italic_N italic_s (20)

    which, unlike s𝑠sitalic_s need not be small, the expansion of C(x|x)𝐶conditionalsuperscript𝑥𝑥C(x^{\prime}|x)italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) in s𝑠sitalic_s is given by C(x|x)R(xx)R24N(xx2)similar-to-or-equals𝐶conditionalsuperscript𝑥𝑥𝑅superscript𝑥𝑥superscript𝑅24𝑁superscript𝑥superscript𝑥2C(x^{\prime}|x)\simeq R\cdot\left(x^{\prime}-x\right)-\frac{R^{2}}{4N}\left(x^% {\prime}-x^{2}\right)italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) ≃ italic_R ⋅ ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - italic_x ) - divide start_ARG italic_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 4 italic_N end_ARG ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - italic_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) with corrections of order s3superscript𝑠3s^{3}italic_s start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT. This yields

    C([x]0t)r=1tC(x(r)|x(r1)(RR24N)[x(t)x(0)]r=0t1U(x(r))C\left([x]_{0}^{t}\right)\equiv\sum_{r=1}^{t}C(x(r)|x(r-1)\simeq\left(R-\frac{% R^{2}}{4N}\right)\left[x(t)-x(0)\right]-\sum_{r=0}^{t-1}U\left(x(r)\right)italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) ≡ ∑ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_C ( italic_x ( italic_r ) | italic_x ( italic_r - 1 ) ≃ ( italic_R - divide start_ARG italic_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 4 italic_N end_ARG ) [ italic_x ( italic_t ) - italic_x ( 0 ) ] - ∑ start_POSTSUBSCRIPT italic_r = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t - 1 end_POSTSUPERSCRIPT italic_U ( italic_x ( italic_r ) ) (21)

    where

    U(x)=R24Nx(1x).𝑈𝑥superscript𝑅24𝑁𝑥1𝑥U\left(x\right)=\frac{R^{2}}{4N}x\left(1-x\right).italic_U ( italic_x ) = divide start_ARG italic_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 4 italic_N end_ARG italic_x ( 1 - italic_x ) . (22)

Thus in the absence of mutation, but with weak selection, we have the approximation

K(z,t|a,0)e[RR2/(4N)](za)x(0)=ax(t)=zW(0,0)([x]0t)er=0t1U(x(r)).similar-to-or-equals𝐾𝑧conditional𝑡𝑎0superscript𝑒delimited-[]𝑅superscript𝑅24𝑁𝑧𝑎𝑥𝑡𝑧𝑥0𝑎superscript𝑊00superscriptsubscriptdelimited-[]𝑥0𝑡superscript𝑒superscriptsubscript𝑟0𝑡1𝑈𝑥𝑟K(z,t|a,0)\simeq e^{\left[R-R^{2}/(4N)\right]\left(z-a\right)}\overset{x(t)=z}% {\underset{x(0)=a}{\sum\cdots\sum}}W^{(0,0)}\left([x]_{0}^{t}\right)e^{-\sum_{% r=0}^{t-1}U\left(x(r)\right)}.italic_K ( italic_z , italic_t | italic_a , 0 ) ≃ italic_e start_POSTSUPERSCRIPT [ italic_R - italic_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT / ( 4 italic_N ) ] ( italic_z - italic_a ) end_POSTSUPERSCRIPT start_OVERACCENT italic_x ( italic_t ) = italic_z end_OVERACCENT start_ARG start_UNDERACCENT italic_x ( 0 ) = italic_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 , 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) italic_e start_POSTSUPERSCRIPT - ∑ start_POSTSUBSCRIPT italic_r = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t - 1 end_POSTSUPERSCRIPT italic_U ( italic_x ( italic_r ) ) end_POSTSUPERSCRIPT . (23)

The path integral representation of the transition probability density, under the diffusion approximation, which involves continuous frequencies and continuous time, can be written as

Kdiffusion(z,t|a,0)=eR(za)x(0)=ax(t)=zP([x]0t)e0tU(x(r))𝑑rd[x]subscript𝐾diffusion𝑧conditional𝑡𝑎0superscript𝑒𝑅𝑧𝑎superscriptsubscript𝑥0𝑎𝑥𝑡𝑧𝑃superscriptsubscriptdelimited-[]𝑥0𝑡superscript𝑒superscriptsubscript0𝑡𝑈𝑥𝑟differential-d𝑟𝑑delimited-[]𝑥K_{\text{\text{diffusion}}}(z,t|a,0)=e^{R\cdot\left(z-a\right)}\int_{x(0)=a}^{% x(t)=z}P([x]_{0}^{t})e^{-\int_{0}^{t}U(x(r))dr}d[x]italic_K start_POSTSUBSCRIPT diffusion end_POSTSUBSCRIPT ( italic_z , italic_t | italic_a , 0 ) = italic_e start_POSTSUPERSCRIPT italic_R ⋅ ( italic_z - italic_a ) end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT italic_x ( 0 ) = italic_a end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_x ( italic_t ) = italic_z end_POSTSUPERSCRIPT italic_P ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) italic_e start_POSTSUPERSCRIPT - ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_U ( italic_x ( italic_r ) ) italic_d italic_r end_POSTSUPERSCRIPT italic_d [ italic_x ] (24)

where the integration is over all continuous trajectories that start at frequency a𝑎aitalic_a at time 00 and end at frequency z𝑧zitalic_z at time t𝑡titalic_t, with P([x]0t)𝑃superscriptsubscriptdelimited-[]𝑥0𝑡P([x]_{0}^{t})italic_P ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) a ‘weight’ associated with neutral trajectories, and d[x]𝑑delimited-[]𝑥d[x]italic_d [ italic_x ] the measure of the path integral[9].

A comparison of the approximate Wright-Fisher transition probability in Eq. (23) and the diffusion transition probability density in Eq. (24) indicates that the two results are similar. In particular, corresponding to the expressions e[RR2/(4N)](za)superscript𝑒delimited-[]𝑅superscript𝑅24𝑁𝑧𝑎e^{\left[R-R^{2}/(4N)\right]\left(z-a\right)}italic_e start_POSTSUPERSCRIPT [ italic_R - italic_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT / ( 4 italic_N ) ] ( italic_z - italic_a ) end_POSTSUPERSCRIPT and er=1tU(x(r))superscript𝑒superscriptsubscript𝑟1𝑡𝑈𝑥𝑟e^{-\sum_{r=1}^{t}U\left(x(r)\right)}italic_e start_POSTSUPERSCRIPT - ∑ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_U ( italic_x ( italic_r ) ) end_POSTSUPERSCRIPT that are present in the Wright-Fisher result are, respectively, the expressions eR(za)superscript𝑒𝑅𝑧𝑎e^{R\cdot\left(z-a\right)}italic_e start_POSTSUPERSCRIPT italic_R ⋅ ( italic_z - italic_a ) end_POSTSUPERSCRIPT and e0tU(x(r))𝑑rsuperscript𝑒superscriptsubscript0𝑡𝑈𝑥𝑟differential-d𝑟e^{-\int_{0}^{t}U(x(r))dr}italic_e start_POSTSUPERSCRIPT - ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_U ( italic_x ( italic_r ) ) italic_d italic_r end_POSTSUPERSCRIPT in the diffusion result. The analogue of the Wright-Fisher neutral, mutation-free, trajectory probability, W(0,0)([x]0t)superscript𝑊00superscriptsubscriptdelimited-[]𝑥0𝑡W^{(0,0)}\left([x]_{0}^{t}\right)italic_W start_POSTSUPERSCRIPT ( 0 , 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ), that is present in Eq. (23), is the neutral weight, P([x]0t)𝑃superscriptsubscriptdelimited-[]𝑥0𝑡P([x]_{0}^{t})italic_P ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ), that is present in Eq. (24).

Theoretical background for multiple alleles

We shall now generalise the above. We again consider a population that is diploid, reproduces sexually by random mating, has an equal sex ratio, exhibits no sexual dimorphism, and evolves in discrete generations. Selection again occurs at a single unlinked locus, but now there are n𝑛nitalic_n alleles at the locus, where n𝑛nitalic_n is arbitrary (i.e., n=2,3,4,𝑛234n=2,3,4,...italic_n = 2 , 3 , 4 , …) and we write allele i𝑖iitalic_i (for i=1,2,,n𝑖12𝑛i=1,2,...,nitalic_i = 1 , 2 , … , italic_n) as Aisubscript𝐴𝑖A_{i}italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT.

When there are three or more alleles, the difference, compared with two alleles, is that knowledge of the frequency of one allele is not enough to specify the state of the population. In fact, we need to follow the behaviour of n1𝑛1n-1italic_n - 1 allele frequencies, while one allele frequency can be treated as being determined by all other allele frequencies (since allele frequencies sum to unity). However, we shall not proceed in this way; we shall treat all alleles as being on an equal footing, and follow the behaviour of all n𝑛nitalic_n allele frequencies.

Effectively infinite population

We first consider a very large (effectively infinite) population.

In what follows, we shall use 𝐱𝐱\mathbf{x}bold_x to denote an n𝑛nitalic_n component column vector whose i𝑖iitalic_i’th element, xisubscript𝑥𝑖x_{i}italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, is the frequency of allele Aisubscript𝐴𝑖A_{i}italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT in adults (i=1,2,,n𝑖12𝑛i=1,2,...,nitalic_i = 1 , 2 , … , italic_n).

The frequency of all alleles in offspring is then 𝐱=𝐐𝐱superscript𝐱𝐐𝐱\mathbf{x}^{\ast}=\mathbf{Q}\mathbf{x}bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = bold_Qx where 𝐐𝐐\mathbf{Q}bold_Q is an n×n𝑛𝑛n\times nitalic_n × italic_n matrix whose (i,j)𝑖𝑗(i,j)( italic_i , italic_j ) element, Qi,jsubscript𝑄𝑖𝑗Q_{i,j}italic_Q start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT, is the probability that an Ajsubscript𝐴𝑗A_{j}italic_A start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT allele mutates to an Aisubscript𝐴𝑖A_{i}italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT allele. Elements of 𝐐𝐐\mathbf{Q}bold_Q are non-negative, and satisfy i=1nQi,j=1superscriptsubscript𝑖1𝑛subscript𝑄𝑖𝑗1\sum_{i=1}^{n}Q_{i,j}=1∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_Q start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT = 1 for all j𝑗jitalic_j (so the sum of all mutated frequencies is unity).

We next assume that viability selection acts and determines the probability of different offspring surviving to maturity. The frequencies, after viability selection, are given by 𝐟sel(𝐱)subscript𝐟𝑠𝑒𝑙superscript𝐱\mathbf{f}_{sel}(\mathbf{x}^{\ast})bold_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ), where 𝐟sel(𝐱)subscript𝐟𝑠𝑒𝑙superscript𝐱\mathbf{f}_{sel}(\mathbf{x}^{\ast})bold_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) takes into account frequency changes of 𝐱superscript𝐱\mathbf{x}^{\ast}bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT due to selection, and is an n𝑛nitalic_n component column vector.

We shall shortly exploit a property of 𝐟sel(𝐱)subscript𝐟𝑠𝑒𝑙𝐱\mathbf{f}_{sel}(\mathbf{x})bold_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( bold_x ), that follows because selection acts on variation in a population. In particular, if the vector of allele frequencies, 𝐱𝐱\mathbf{x}bold_x, has an i𝑖iitalic_i’th element which is zero (xi=0subscript𝑥𝑖0x_{i}=0italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = 0), then the i𝑖iitalic_i’th element of 𝐟sel(𝐱)subscript𝐟𝑠𝑒𝑙𝐱\mathbf{f}_{sel}(\mathbf{x})bold_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( bold_x ), which we write, as fsel,i(𝐱)subscript𝑓𝑠𝑒𝑙𝑖𝐱f_{sel,i}(\mathbf{x})italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l , italic_i end_POSTSUBSCRIPT ( bold_x ), also vanishes, since selection alone cannot salvage an allele after its absence from a population. This motivates us to take fsel,i(𝐱)subscript𝑓𝑠𝑒𝑙𝑖𝐱f_{sel,i}(\mathbf{x})italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l , italic_i end_POSTSUBSCRIPT ( bold_x ) in the form

fsel,i(𝐱)=xi×[1+Gi(𝐱)]subscript𝑓𝑠𝑒𝑙𝑖𝐱subscript𝑥𝑖delimited-[]1subscript𝐺𝑖𝐱f_{sel,i}(\mathbf{x})=x_{i}\times[1+G_{i}(\mathbf{x})]italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l , italic_i end_POSTSUBSCRIPT ( bold_x ) = italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT × [ 1 + italic_G start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x ) ] (25)

where Gi(𝐱)subscript𝐺𝑖𝐱G_{i}(\mathbf{x})italic_G start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x ) is finite (|Gi(𝐱)|<subscript𝐺𝑖𝐱\left|G_{i}(\mathbf{x})\right|<\infty| italic_G start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x ) | < ∞) and is determined by the specific form of selection acting. Generally, i=1nxiGi(𝐱)=0superscriptsubscript𝑖1𝑛subscript𝑥𝑖subscript𝐺𝑖𝐱0\sum_{i=1}^{n}x_{i}G_{i}(\mathbf{x})=0∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_G start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x ) = 0 and Gi(𝐱)1subscript𝐺𝑖𝐱1G_{i}(\mathbf{x})\geq-1italic_G start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x ) ≥ - 1 (ensuring that after selection, the sum of all allele frequencies is unity, and all alleles frequencies are non-negative).

The set of allele frequencies in offspring, after selection and mutation have acted, can be expressed in terms of the set of frequencies 𝐱𝐱\mathbf{x}bold_x, in adults, as 𝐟sel(𝐐𝐱)subscript𝐟𝑠𝑒𝑙𝐐𝐱\mathbf{f}_{sel}(\mathbf{Qx})bold_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( bold_Qx ) and we write

𝐅(𝐱)=𝐟sel(𝐐𝐱).𝐅𝐱subscript𝐟𝑠𝑒𝑙𝐐𝐱\mathbf{F}(\mathbf{x})=\mathbf{f}_{sel}(\mathbf{Qx}).bold_F ( bold_x ) = bold_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( bold_Qx ) . (26)

We now consider dynamics. Let 𝐗(t)𝐗𝑡\mathbf{X}(t)bold_X ( italic_t ) denote an n𝑛nitalic_n component column vector containing the set of allele frequencies in generation t𝑡titalic_t. Because we have an effectively infinite population, 𝐗(t)𝐗𝑡\mathbf{X}(t)bold_X ( italic_t ) obeys the deterministic equation

𝐗(t+1)=𝐅(𝐗(t)).𝐗𝑡1𝐅𝐗𝑡\mathbf{X}(t+1)=\mathbf{F}(\mathbf{X}(t)).bold_X ( italic_t + 1 ) = bold_F ( bold_X ( italic_t ) ) . (27)

Finite population

Consider now a finite population, where N𝑁Nitalic_N adults are present in each generation. The quantity 𝐱𝐱\mathbf{x}bold_x is still an n𝑛nitalic_n component vector whose i𝑖iitalic_i’th element, xisubscript𝑥𝑖x_{i}italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, is the frequency of allele Aisubscript𝐴𝑖A_{i}italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT in adults, but it has the added feature that all elements have values which are allowed frequencies (Eq. (6)). That is, xi0subscript𝑥𝑖0x_{i}\geq 0italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ≥ 0, i=1nxi=1superscriptsubscript𝑖1𝑛subscript𝑥𝑖1\sum_{i=1}^{n}x_{i}=1∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = 1, and each xisubscript𝑥𝑖x_{i}italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is an integer divided by 2N2𝑁2N2 italic_N. We shall call a vector that has this property an allowed set of allele frequencies. In the multiallele case we shall reserve the use of 𝐚𝐚\mathbf{a}bold_a, 𝐱𝐱\mathbf{x}bold_x, 𝐱superscript𝐱\mathbf{x}^{\prime}bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, 𝐱(r)𝐱𝑟\mathbf{x}(r)bold_x ( italic_r ) (for various r𝑟ritalic_r), and 𝐳𝐳\mathbf{z}bold_z, for allowed sets of allele frequencies. We now write the transition matrix element for the probability of a transition from state 𝐱𝐱\mathbf{x}bold_x to state 𝐱superscript𝐱\mathbf{x}^{\prime}bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT as

M(𝐱|𝐱)=(2N2N𝐱)i=1n[Fi(𝐱)]2Nxi𝑀conditionalsuperscript𝐱𝐱binomial2𝑁2𝑁superscript𝐱superscriptsubscriptproduct𝑖1𝑛superscriptdelimited-[]subscript𝐹𝑖𝐱2𝑁superscriptsubscript𝑥𝑖M(\mathbf{x}^{\prime}|\mathbf{x})=\binom{2N}{2N\mathbf{x}^{\prime}}\prod% \limits_{i=1}^{n}\left[F_{i}(\mathbf{x})\right]^{2Nx_{i}^{\prime}}italic_M ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT [ italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x ) ] start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT (28)

where (2N𝐦)binomial2𝑁𝐦\binom{2N}{\mathbf{m}}( FRACOP start_ARG 2 italic_N end_ARG start_ARG bold_m end_ARG ), with 𝐦𝐦\mathbf{m}bold_m an n𝑛nitalic_n component column vector with integer elements, denotes a multinomial coefficient for n𝑛nitalic_n categories. We note that the transition matrix element, M(𝐱|𝐱)𝑀conditionalsuperscript𝐱𝐱M(\mathbf{x}^{\prime}|\mathbf{x})italic_M ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ), in its conventional matrix form, is an element of a matrix with vector indices, not scalars[20].

The transition matrix of a neutral locus has elements which are the zero selection limit of M(𝐱|𝐱)𝑀conditionalsuperscript𝐱𝐱M(\mathbf{x}^{\prime}|\mathbf{x})italic_M ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ), which we write as M(0)(𝐱|𝐱)superscript𝑀0conditionalsuperscript𝐱𝐱M^{(0)}(\mathbf{x}^{\prime}|\mathbf{x})italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ), and which is given by

M(0)(𝐱|𝐱)=(2N2N𝐱)i=1n(xi)2Nxisuperscript𝑀0conditionalsuperscript𝐱𝐱binomial2𝑁2𝑁superscript𝐱superscriptsubscriptproduct𝑖1𝑛superscriptsuperscriptsubscript𝑥𝑖2𝑁superscriptsubscript𝑥𝑖M^{(0)}(\mathbf{x}^{\prime}|\mathbf{x})=\binom{2N}{2N\mathbf{x}^{\prime}}\prod% \limits_{i=1}^{n}\left(x_{i}^{\ast}\right)^{2Nx_{i}^{\prime}}italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT (29)

where 𝐱superscript𝐱\mathbf{x}^{\ast}bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT is given by

𝐱=𝐐𝐱.superscript𝐱𝐐𝐱\mathbf{x}^{\ast}=\mathbf{Qx}.bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = bold_Qx . (30)

As for the case of two alleles, a factorisation is possible; the form of 𝐟sel(𝐱)subscript𝐟𝑠𝑒𝑙𝐱\mathbf{f}_{sel}(\mathbf{x})bold_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( bold_x ) in Eq. (25) allows us to write the transition matrix, M(𝐱|𝐱)𝑀conditionalsuperscript𝐱𝐱M(\mathbf{x}^{\prime}|\mathbf{x})italic_M ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) of Eq. (28), as the exact product of two factors:

M(𝐱|𝐱)=M(0)(𝐱|𝐱)×eC(𝐱|𝐱)𝑀conditionalsuperscript𝐱𝐱superscript𝑀0conditionalsuperscript𝐱𝐱superscript𝑒𝐶conditionalsuperscript𝐱𝐱M(\mathbf{x}^{\prime}|\mathbf{x})=M^{(0)}(\mathbf{x}^{\prime}|\mathbf{x})% \times e^{C(\mathbf{x}^{\prime}|\mathbf{x})}italic_M ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) = italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) × italic_e start_POSTSUPERSCRIPT italic_C ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) end_POSTSUPERSCRIPT (31)

where M(0)(𝐱|𝐱)superscript𝑀0conditionalsuperscript𝐱𝐱M^{(0)}(\mathbf{x}^{\prime}|\mathbf{x})italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) is given in Eq. (29) and

C(𝐱|𝐱)=2Ni=1nxiln(1+Gi(𝐱))𝐶conditionalsuperscript𝐱𝐱2𝑁superscriptsubscript𝑖1𝑛superscriptsubscript𝑥𝑖1subscript𝐺𝑖superscript𝐱C(\mathbf{x}^{\prime}|\mathbf{x})=2N\sum_{i=1}^{n}x_{i}^{\prime}\ln\big{(}1+G_% {i}(\mathbf{x}^{\ast})\big{)}italic_C ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) = 2 italic_N ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT roman_ln ( 1 + italic_G start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ) (32)

- see the second subsection of Methods for details.

Trajectories and path integral

We now write a trajectory as

[𝐱]0t=(𝐱(0),𝐱(1),,𝐱(t))superscriptsubscriptdelimited-[]𝐱0𝑡𝐱0𝐱1𝐱𝑡[\mathbf{x}]_{0}^{t}=\left(\mathbf{x}(0),\mathbf{x}(1),...,\mathbf{x}(t)\right)[ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT = ( bold_x ( 0 ) , bold_x ( 1 ) , … , bold_x ( italic_t ) ) (33)

in which each 𝐱(r)𝐱𝑟\mathbf{x}(r)bold_x ( italic_r ) is an n𝑛nitalic_n component column vector containing an allowed set of allele frequencies, which gives the state of the population at time r𝑟ritalic_r. It follows that the trajectory [𝐱]0tsuperscriptsubscriptdelimited-[]𝐱0𝑡[\mathbf{x}]_{0}^{t}[ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT in Eq. (33), is an n×(t+1)𝑛𝑡1n\times(t+1)italic_n × ( italic_t + 1 ) matrix. The probability of this trajectory is r=1tM(𝐱(r)|𝐱(r1)=r=1tM(0)(𝐱(r)|𝐱(r1))×exp(r=1tC(𝐱(r)|𝐱(r1))){\textstyle\prod\limits_{r=1}^{t}}M(\mathbf{x}(r)|\mathbf{x}(r-1)={\textstyle% \prod\limits_{r=1}^{t}}M^{(0)}(\mathbf{x}(r)|\mathbf{x}(r-1))\times\exp\left(% \sum_{r=1}^{t}C(\mathbf{x}(r)|\mathbf{x}(r-1))\right)∏ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_M ( bold_x ( italic_r ) | bold_x ( italic_r - 1 ) = ∏ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_x ( italic_r ) | bold_x ( italic_r - 1 ) ) × roman_exp ( ∑ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_C ( bold_x ( italic_r ) | bold_x ( italic_r - 1 ) ) ). We write this as

probability of[𝐱]0t=W(0)([𝐱]0t)×eC([𝐱]0t)probability ofsuperscriptsubscriptdelimited-[]𝐱0𝑡superscript𝑊0superscriptsubscriptdelimited-[]𝐱0𝑡superscript𝑒𝐶superscriptsubscriptdelimited-[]𝐱0𝑡\text{probability of}\ [\mathbf{x}]_{0}^{t}=W^{(0)}\left([\mathbf{x}]_{0}^{t}% \right)\times e^{C\left([\mathbf{x}]_{0}^{t}\right)}probability of [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT = italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) × italic_e start_POSTSUPERSCRIPT italic_C ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT (34)

where

W(0)([𝐱]0t)=r=1tM(0)(𝐱(r)|𝐱(r1))superscript𝑊0superscriptsubscriptdelimited-[]𝐱0𝑡superscriptsubscriptproduct𝑟1𝑡superscript𝑀0conditional𝐱𝑟𝐱𝑟1W^{(0)}\left([\mathbf{x}]_{0}^{t}\right)={\textstyle\prod_{r=1}^{t}}M^{(0)}(% \mathbf{x}(r)|\mathbf{x}(r-1))italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) = ∏ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_x ( italic_r ) | bold_x ( italic_r - 1 ) ) (35)

and

C([𝐱]0t)=r=1tC(𝐱(r)|𝐱(r1)).𝐶superscriptsubscriptdelimited-[]𝐱0𝑡superscriptsubscript𝑟1𝑡𝐶conditional𝐱𝑟𝐱𝑟1C\left([\mathbf{x}]_{0}^{t}\right)=\sum_{r=1}^{t}C(\mathbf{x}(r)|\mathbf{x}(r-% 1)).italic_C ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) = ∑ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_C ( bold_x ( italic_r ) | bold_x ( italic_r - 1 ) ) . (36)

Let K(𝐳,t|𝐚,0)𝐾𝐳conditional𝑡𝐚0K(\mathbf{z},t|\mathbf{a},0)italic_K ( bold_z , italic_t | bold_a , 0 ) denote the overall probability of going from an initial state of the population corresponding to the allowed set of frequencies, 𝐚𝐚\mathbf{a}bold_a at time 00, to state 𝐳𝐳\mathbf{z}bold_z at time t𝑡titalic_t, which is an another allowed set of frequencies. All possible trajectories between these end-points contribute to K(𝐳,t|𝐚,0)𝐾𝐳conditional𝑡𝐚0K(\mathbf{z},t|\mathbf{a},0)italic_K ( bold_z , italic_t | bold_a , 0 ). We thus write K(𝐳,t|𝐚,0)𝐾𝐳conditional𝑡𝐚0K(\mathbf{z},t|\mathbf{a},0)italic_K ( bold_z , italic_t | bold_a , 0 ) as sum of the probabilities W(0)([𝐱]0t)×eC([𝐱]0t)superscript𝑊0superscriptsubscriptdelimited-[]𝐱0𝑡superscript𝑒𝐶superscriptsubscriptdelimited-[]𝐱0𝑡W^{(0)}\left([\mathbf{x}]_{0}^{t}\right)\times e^{C\left([\mathbf{x}]_{0}^{t}% \right)}italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) × italic_e start_POSTSUPERSCRIPT italic_C ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT over all possible trajectories. That is

K(𝐳,t|𝐚,0)=𝐱(0)=𝐚𝐱(t)=𝐳W(0)([𝐱]0t)eC([𝐱]0t).𝐾𝐳conditional𝑡𝐚0𝐱𝑡𝐳𝐱0𝐚superscript𝑊0superscriptsubscriptdelimited-[]𝐱0𝑡superscript𝑒𝐶superscriptsubscriptdelimited-[]𝐱0𝑡K(\mathbf{z},t|\mathbf{a},0)=\overset{\mathbf{x}(t)=\mathbf{z}}{\underset{% \mathbf{x}(0)=\mathbf{a}}{\sum\cdots\sum}}W^{(0)}\left([\mathbf{x}]_{0}^{t}% \right)e^{C\left([\mathbf{x}]_{0}^{t}\right)}.italic_K ( bold_z , italic_t | bold_a , 0 ) = start_OVERACCENT bold_x ( italic_t ) = bold_z end_OVERACCENT start_ARG start_UNDERACCENT bold_x ( 0 ) = bold_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) italic_e start_POSTSUPERSCRIPT italic_C ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT . (37)

The notation in Eq. (37) denotes a sum over all trajectories whose end points, 𝐱(0)𝐱0\mathbf{x}(0)bold_x ( 0 ) and 𝐱(t)𝐱𝑡\mathbf{x}(t)bold_x ( italic_t ), have the specific (allowed set) values 𝐚𝐚\mathbf{a}bold_a and 𝐳𝐳\mathbf{z}bold_z, respectively, while 𝐱(1)𝐱1\mathbf{x}(1)bold_x ( 1 ), 𝐱(2)𝐱2\mathbf{x}(2)bold_x ( 2 ), ,𝐱(t1)𝐱𝑡1...,\mathbf{x}(t-1)… , bold_x ( italic_t - 1 ), which give the state of the population at intermediate times, take values that cover all allowed sets of frequencies.

Equation (37) is an exact ‘path integral’ representation of the finite time transition probability in a multiple (n𝑛nitalic_n) allele Wright-Fisher model where states and times are discrete.

Since, by construction, W(0)([𝐱]0t)superscript𝑊0superscriptsubscriptdelimited-[]𝐱0𝑡W^{(0)}\left([\mathbf{x}]_{0}^{t}\right)italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) is independent of selection, and since C([𝐱]0t)𝐶superscriptsubscriptdelimited-[]𝐱0𝑡C\left([\mathbf{x}]_{0}^{t}\right)italic_C ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) vanishes when there is no selection, the probability of going from state 𝐚𝐚\mathbf{a}bold_a at time 00 to state 𝐳𝐳\mathbf{z}bold_z at time t𝑡titalic_t, when there is no selection, is K(0)(𝐳,t|𝐚,0)=𝐱(0)=𝐚𝐱(t)=𝐳W(0)([𝐱]0t)superscript𝐾0𝐳conditional𝑡𝐚0𝐱𝑡𝐳𝐱0𝐚superscript𝑊0superscriptsubscriptdelimited-[]𝐱0𝑡K^{(0)}(\mathbf{z},t|\mathbf{a},0)=\overset{\mathbf{x}(t)=\mathbf{z}}{% \underset{\mathbf{x}(0)=\mathbf{a}}{\sum\cdots\sum}}W^{(0)}\left([\mathbf{x}]_% {0}^{t}\right)italic_K start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_z , italic_t | bold_a , 0 ) = start_OVERACCENT bold_x ( italic_t ) = bold_z end_OVERACCENT start_ARG start_UNDERACCENT bold_x ( 0 ) = bold_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ).

Approximation when there is no mutation and selection is weak

We now consider a special case of the distribution K(𝐳,t|𝐚,0)𝐾𝐳conditional𝑡𝐚0K(\mathbf{z},t|\mathbf{a},0)italic_K ( bold_z , italic_t | bold_a , 0 ) of Eq. (37), when there is no mutation and selection is multiplicative and weak.

When there is no mutation the matrix 𝐐𝐐\mathbf{Q}bold_Q becomes the n×n𝑛𝑛n\times nitalic_n × italic_n identity matrix.

Under multiplicative selection, we take the AiAjsubscript𝐴𝑖subscript𝐴𝑗A_{i}A_{j}italic_A start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_A start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT genotype to have a fitness proportional to (1+si)(1+sj)1subscript𝑠𝑖1subscript𝑠𝑗(1+s_{i})(1+s_{j})( 1 + italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ( 1 + italic_s start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ). It follows that Fi(𝐱)=xi(sij=1nsjxj)/(1+j=1nsjxj)subscript𝐹𝑖𝐱subscript𝑥𝑖subscript𝑠𝑖superscriptsubscript𝑗1𝑛subscript𝑠𝑗subscript𝑥𝑗1superscriptsubscript𝑗1𝑛subscript𝑠𝑗subscript𝑥𝑗F_{i}(\mathbf{x})=x_{i}\left(s_{i}-{\sum\nolimits_{j=1}^{n}}s_{j}x_{j}\right)/% \left(1+{\sum\nolimits_{j=1}^{n}}s_{j}x_{j}\right)italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x ) = italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_s start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) / ( 1 + ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_s start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) hence Gi(𝐱)=(sij=1nsjxj)/[1+j=1nsjxj]subscript𝐺𝑖𝐱subscript𝑠𝑖superscriptsubscript𝑗1𝑛subscript𝑠𝑗subscript𝑥𝑗delimited-[]1superscriptsubscript𝑗1𝑛subscript𝑠𝑗subscript𝑥𝑗G_{i}(\mathbf{x})=\left(s_{i}-{\sum\nolimits_{j=1}^{n}}s_{j}x_{j}\right)/\left% [1+{\sum\nolimits_{j=1}^{n}}s_{j}x_{j}\right]italic_G start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x ) = ( italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_s start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) / [ 1 + ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_s start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ] and

C(𝐱|𝐱)=2N[i=1nxiln(1+si)ln(1+i=1nsixi)].𝐶conditionalsuperscript𝐱𝐱2𝑁delimited-[]superscriptsubscript𝑖1𝑛superscriptsubscript𝑥𝑖1subscript𝑠𝑖1superscriptsubscript𝑖1𝑛subscript𝑠𝑖subscript𝑥𝑖C(\mathbf{x}^{\prime}|\mathbf{x})=2N\left[\sum_{i=1}^{n}x_{i}^{\prime}\ln\left% (1+s_{i}\right)-\ln\left(1+\sum_{i=1}^{n}s_{i}x_{i}\right)\right].italic_C ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) = 2 italic_N [ ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT roman_ln ( 1 + italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) - roman_ln ( 1 + ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ] . (38)

We take weak selection to correspond to |si|1much-less-thansubscript𝑠𝑖1|s_{i}|\ll 1| italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | ≪ 1 for all i𝑖iitalic_i, then similar to the case of two alleles, we obtain approximate results by expanding C(𝐱|𝐱)𝐶conditionalsuperscript𝐱𝐱C(\mathbf{x}^{\prime}|\mathbf{x})italic_C ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) in the sisubscript𝑠𝑖s_{i}italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, and discarding third and higher order terms. We shall express results in terms of scaled selection strengths that are given by

𝐑=2N𝐬 or Ri=2Nsi𝐑2𝑁𝐬 or subscript𝑅𝑖2𝑁subscript𝑠𝑖\mathbf{R}=2N\mathbf{s}\text{ or }R_{i}=2Ns_{i}bold_R = 2 italic_N bold_s or italic_R start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = 2 italic_N italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (39)

where 𝐬𝐬\mathbf{s}bold_s is a column vector of the sisubscript𝑠𝑖s_{i}italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT.

With δi,jsubscript𝛿𝑖𝑗\delta_{i,j}italic_δ start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT denoting a Kronecker delta, a T𝑇Titalic_T superscript denoting the transpose of a vector, and 𝐕(𝐱)𝐕𝐱\mathbf{V}(\mathbf{x})bold_V ( bold_x ) denoting an n×n𝑛𝑛n\times nitalic_n × italic_n matrix with elements

Vi,j(𝐱)=xiδi,jxixjsubscript𝑉𝑖𝑗𝐱subscript𝑥𝑖subscript𝛿𝑖𝑗subscript𝑥𝑖subscript𝑥𝑗V_{i,j}(\mathbf{x})=x_{i}\delta_{i,j}-x_{i}x_{j}italic_V start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT ( bold_x ) = italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_δ start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT - italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT (40)

we obtain

C([𝐱]0t)r=1tC(𝐱(r)|𝐱(r1)𝐑T[𝐱(t)𝐱(0)]+ϕ(𝐱(t))ϕ(𝐱(0))r=0t1U(𝐱(r))C\left([\mathbf{x}]_{0}^{t}\right)\equiv\sum_{r=1}^{t}C(\mathbf{x}(r)|\mathbf{% x}(r-1)\simeq\mathbf{R}^{T}\left[\mathbf{x}(t)-\mathbf{x}(0)\right]+\phi\left(% \mathbf{x}(t)\right)-\phi\left(\mathbf{x}(0)\right)-\sum_{r=0}^{t-1}U\left(% \mathbf{x}(r)\right)italic_C ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) ≡ ∑ start_POSTSUBSCRIPT italic_r = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_C ( bold_x ( italic_r ) | bold_x ( italic_r - 1 ) ≃ bold_R start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT [ bold_x ( italic_t ) - bold_x ( 0 ) ] + italic_ϕ ( bold_x ( italic_t ) ) - italic_ϕ ( bold_x ( 0 ) ) - ∑ start_POSTSUBSCRIPT italic_r = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t - 1 end_POSTSUPERSCRIPT italic_U ( bold_x ( italic_r ) ) (41)

where

ϕ(𝐱)=i=1nRi24Nxiitalic-ϕ𝐱superscriptsubscript𝑖1𝑛superscriptsubscript𝑅𝑖24𝑁subscript𝑥𝑖\phi\left(\mathbf{x}\right)=-\sum_{i=1}^{n}\frac{R_{i}^{2}}{4N}x_{i}italic_ϕ ( bold_x ) = - ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT divide start_ARG italic_R start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 4 italic_N end_ARG italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (42)

and

U(𝐱)=14N𝐑T𝐕(𝐱)𝐑.𝑈𝐱14𝑁superscript𝐑𝑇𝐕𝐱𝐑U\left(\mathbf{x}\right)=\frac{1}{4N}\mathbf{R}^{T}\mathbf{V}(\mathbf{x})% \mathbf{R}.italic_U ( bold_x ) = divide start_ARG 1 end_ARG start_ARG 4 italic_N end_ARG bold_R start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT bold_V ( bold_x ) bold_R . (43)

Using Eqs. (41), (42) and (43) in Eq. (37), combined with W(0,0)([𝐱]0t)superscript𝑊00superscriptsubscriptdelimited-[]𝐱0𝑡W^{(0,0)}\left([\mathbf{x}]_{0}^{t}\right)italic_W start_POSTSUPERSCRIPT ( 0 , 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ), which denotes the probability of trajectory [𝐱]0tsuperscriptsubscriptdelimited-[]𝐱0𝑡[\mathbf{x}]_{0}^{t}[ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT in the absence of mutation and selection (W(0,0)([𝐱]0t)superscript𝑊00superscriptsubscriptdelimited-[]𝐱0𝑡W^{(0,0)}\left([\mathbf{x}]_{0}^{t}\right)italic_W start_POSTSUPERSCRIPT ( 0 , 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) is constructed from a product of terms of the form (2N2N𝐱)i=1nxi2Nxibinomial2𝑁2𝑁superscript𝐱superscriptsubscriptproduct𝑖1𝑛superscriptsubscript𝑥𝑖2𝑁superscriptsubscript𝑥𝑖\binom{2N}{2N\mathbf{x}^{\prime}}\prod\limits_{i=1}^{n}x_{i}^{2Nx_{i}^{\prime}}( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT - cf. Eq. (35)). we obtain the approximation

K(𝐳,t|𝐚,0)e𝐑T(𝐳𝐚)+ϕ(𝐳)ϕ(𝐚)𝐱(0)=𝐚𝐱(t)=𝐳W(0,0)([𝐱]0t)er=0t1U(𝐱(r)).similar-to-or-equals𝐾𝐳conditional𝑡𝐚0superscript𝑒superscript𝐑𝑇𝐳𝐚italic-ϕ𝐳italic-ϕ𝐚𝐱𝑡𝐳𝐱0𝐚superscript𝑊00superscriptsubscriptdelimited-[]𝐱0𝑡superscript𝑒superscriptsubscript𝑟0𝑡1𝑈𝐱𝑟K(\mathbf{z},t|\mathbf{a},0)\simeq e^{\mathbf{R}^{T}\left(\mathbf{z}-\mathbf{a% }\right)+\phi\left(\mathbf{z}\right)-\phi\left(\mathbf{a}\right)}\overset{% \mathbf{x}(t)=\mathbf{z}}{\underset{\mathbf{x}(0)=\mathbf{a}}{\sum\cdots\sum}}% W^{(0,0)}\left([\mathbf{x}]_{0}^{t}\right)e^{-\sum_{r=0}^{t-1}U\left(\mathbf{x% }(r)\right)}.italic_K ( bold_z , italic_t | bold_a , 0 ) ≃ italic_e start_POSTSUPERSCRIPT bold_R start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_z - bold_a ) + italic_ϕ ( bold_z ) - italic_ϕ ( bold_a ) end_POSTSUPERSCRIPT start_OVERACCENT bold_x ( italic_t ) = bold_z end_OVERACCENT start_ARG start_UNDERACCENT bold_x ( 0 ) = bold_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 , 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) italic_e start_POSTSUPERSCRIPT - ∑ start_POSTSUBSCRIPT italic_r = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t - 1 end_POSTSUPERSCRIPT italic_U ( bold_x ( italic_r ) ) end_POSTSUPERSCRIPT . (44)

Discussion

In this work we have derived an exact ‘path integral’ representation of the time-dependent transition probability in a Wright-Fisher model. We have explicitly considered the case of two alleles, where the population’s description is in terms of a focal allele, and the case of an arbitrary number of n𝑛nitalic_n alleles, where the description is in terms of all n𝑛nitalic_n allele frequencies, with all frequencies treated as having the same status.

For the case of two alleles, we have compared the Wright-Fisher transition probability with a path integral representation of the corresponding quantity (a transition density) under the diffusion approximation. The result for the diffusion approximation result was derived for multiplicative selection, in the absence of mutation, and we have established the relation of this with the exact Wright-Fisher result in this case.

The Wright-Fisher path integral, derived in this work for two alleles, applies for a wider class of fitness functions than just multiplicative fitness, and can incorporate mutation. The general form of the path integral, for two alleles is given in Eq. (16), and takes the form of a sum over trajectories of a product the two terms: (i) a ‘weight’ W(0)([𝐱]0t)superscript𝑊0superscriptsubscriptdelimited-[]𝐱0𝑡W^{(0)}\left([\mathbf{x}]_{0}^{t}\right)italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) which gives the probability of a trajectory under neutrality, i.e., when only random genetic drift and mutation are operating, and (ii) the factor eC([𝐱]0t)superscript𝑒𝐶superscriptsubscriptdelimited-[]𝐱0𝑡e^{C\left([\mathbf{x}]_{0}^{t}\right)}italic_e start_POSTSUPERSCRIPT italic_C ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT which while depending on parameters such as mutation rates, is primarily determined by selection - this factor incorporates all effects of selection, and C([𝐱]0t)𝐶superscriptsubscriptdelimited-[]𝐱0𝑡C\left([\mathbf{x}]_{0}^{t}\right)italic_C ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) vanishes in the absence of selection. This separation into two factors represents an underlying property of the transition probability, K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ), that we know from other analyses, namely that at long times (t𝑡t\rightarrow\inftyitalic_t → ∞) the quantity K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ) is a smooth function of selection, but the long time properties are very different for zero and non-zero mutation rates. For non-zero mutation rates, the long-time form of K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ) is non-zero for all possible values of z𝑧zitalic_z (i.e., all allowed frequencies), and independent of the initial frequency, a𝑎aitalic_a. By contrast, for vanishing mutation rates, only the terminal frequency classes (00 and 1111) have non-zero probabilities at long times, and furthermore, these probabilities depend on the initial frequency, a𝑎aitalic_a. Thus K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ), as t𝑡t\rightarrow\inftyitalic_t → ∞, behaves discontinuously, as a function of mutation rates, in the sense that allowing mutation rates to tend to zero, and having mutation rates exactly equal to zero, yield different results. A diffusion analysis shows this most clearly, where singular spikes (Dirac delta functions) at the terminal frequencies are generally present in the transition probability density when mutation rates are zero, and are absent when mutation rates are non-zero [21]. The separation of a probability of a trajectory into the product of W(0)([𝐱]0t)superscript𝑊0superscriptsubscriptdelimited-[]𝐱0𝑡W^{(0)}\left([\mathbf{x}]_{0}^{t}\right)italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) and eC([𝐱]0t)superscript𝑒𝐶superscriptsubscriptdelimited-[]𝐱0𝑡e^{C\left([\mathbf{x}]_{0}^{t}\right)}italic_e start_POSTSUPERSCRIPT italic_C ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT is thus natural and a reflection of different behaviours arising from different features of the dynamics.

On the matters of fixation and loss, we note that since a Wright-Fisher model can describe these phenomena (in the absence of mutation), an exact path integral representation associated with this model can also, generally, describe features such as fixation and loss. This will also carry over to a path integral representation, based on the diffusion approximation, since the diffusion approximation is also known to encompass fixation and loss, albeit in a singular form [21]. Such singular behaviour seems likely to make the analysis of the path integral representation, based on the diffusion approximation, to be more complex, than in its absence.

As an elementary illustration of how fixation is incorporated into the path integral representation of the transition probability, K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ), we note the when all mutation rates are zero, the probability of ultimate fixation of the focal allele is limtK(1,t|a,0)subscript𝑡𝐾1conditional𝑡𝑎0\lim_{t\rightarrow\infty}K(1,t|a,0)roman_lim start_POSTSUBSCRIPT italic_t → ∞ end_POSTSUBSCRIPT italic_K ( 1 , italic_t | italic_a , 0 ). Let us revisit the case considered above where there is no mutation and weak multiplicative selection acting. We can expand K(1,t|a,0)𝐾1conditional𝑡𝑎0K(1,t|a,0)italic_K ( 1 , italic_t | italic_a , 0 ) in s𝑠sitalic_s by first expanding K(1,t|a,0)𝐾1conditional𝑡𝑎0K(1,t|a,0)italic_K ( 1 , italic_t | italic_a , 0 ) in C([x0t])𝐶delimited-[]superscriptsubscript𝑥0𝑡C\left(\left[x_{0}^{t}\right]\right)italic_C ( [ italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ] ), and then expanding C([x0t])𝐶delimited-[]superscriptsubscript𝑥0𝑡C\left(\left[x_{0}^{t}\right]\right)italic_C ( [ italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ] ) in s𝑠sitalic_s. To linear order in s𝑠sitalic_s we obtain (from Eq. (23)) K(1,t|a,0)[1+2Ns(1a)]×x(0)=ax(t)=1W(0,0)([x]0t)similar-to-or-equals𝐾1conditional𝑡𝑎0delimited-[]12𝑁𝑠1𝑎𝑥𝑡1𝑥0𝑎superscript𝑊00superscriptsubscriptdelimited-[]𝑥0𝑡K(1,t|a,0)\simeq[1+2Ns(1-a)]\times\overset{x(t)=1}{\underset{x(0)=a}{\sum% \cdots\sum}}W^{(0,0)}\left([x]_{0}^{t}\right)italic_K ( 1 , italic_t | italic_a , 0 ) ≃ [ 1 + 2 italic_N italic_s ( 1 - italic_a ) ] × start_OVERACCENT italic_x ( italic_t ) = 1 end_OVERACCENT start_ARG start_UNDERACCENT italic_x ( 0 ) = italic_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 , 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ). Since limtx(0)=ax(t)=1W(0,0)subscript𝑡𝑥𝑡1𝑥0𝑎superscript𝑊00\lim_{t\rightarrow\infty}\overset{x(t)=1}{\underset{x(0)=a}{\sum\cdots\sum}}W^% {(0,0)}roman_lim start_POSTSUBSCRIPT italic_t → ∞ end_POSTSUBSCRIPT start_OVERACCENT italic_x ( italic_t ) = 1 end_OVERACCENT start_ARG start_UNDERACCENT italic_x ( 0 ) = italic_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 , 0 ) end_POSTSUPERSCRIPT is the probability of fixation ultimately occurring, under neutrality, this limit thus coincides with the initial frequency, a𝑎aitalic_a. In this way, we arrive at a fixation probability of Pfix(a)a+2Nsa(1a)similar-to-or-equalssubscript𝑃𝑓𝑖𝑥𝑎𝑎2𝑁𝑠𝑎1𝑎P_{fix}(a)\simeq a+2Nsa(1-a)italic_P start_POSTSUBSCRIPT italic_f italic_i italic_x end_POSTSUBSCRIPT ( italic_a ) ≃ italic_a + 2 italic_N italic_s italic_a ( 1 - italic_a ), which contains the neutral result and a term which is first order in s𝑠sitalic_s, which is the leading correction due to selection. Expansion of K(z,t|a,0)𝐾𝑧conditional𝑡𝑎0K(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ) (and related quantities) to higher order in s𝑠sitalic_s, can be achieved, again by exploiting the factorisation between drift/mutation and selection that occurs in Eq. (16). Expansions in s𝑠sitalic_s beyond linear order involve more complicated calculations than that of the linear case.

In the case of two alleles, we have seen the relation between the path integral of the ‘fully discrete’ Wright-Fisher model and the path integral of the diffusion approximation, for this model. For the case of an arbitrary number of n𝑛nitalic_n alleles there is, at the present time, no such path integral for the diffusion approximation. However, from the lessons learned for two alleles we can infer this some of the properties of the general n𝑛nitalic_n case, under the diffusion approximation. In particular, when selection is multiplicative, and in the absence of mutation. we infer from Eq. (44) that

Kdiffusion(𝐳,t|𝐚,0)=e𝐑T(𝐳𝐚)𝐱(0)=𝐚𝐱(t)=𝐳P([𝐱]0t)e0tU(𝐱(r))𝑑rd[𝐱]subscript𝐾diffusion𝐳conditional𝑡𝐚0superscript𝑒superscript𝐑𝑇𝐳𝐚superscriptsubscript𝐱0𝐚𝐱𝑡𝐳𝑃superscriptsubscriptdelimited-[]𝐱0𝑡superscript𝑒superscriptsubscript0𝑡𝑈𝐱𝑟differential-d𝑟𝑑delimited-[]𝐱K_{\text{diffusion}}(\mathbf{z},t|\mathbf{a},0)=e^{\mathbf{R}^{T}\left(\mathbf% {z}-\mathbf{a}\right)}\int_{\mathbf{x}(0)=\mathbf{a}}^{\mathbf{x}(t)=\mathbf{z% }}P([\mathbf{x}]_{0}^{t})e^{-\int_{0}^{t}U\left(\mathbf{x}(r)\right)dr}d[% \mathbf{x}]italic_K start_POSTSUBSCRIPT diffusion end_POSTSUBSCRIPT ( bold_z , italic_t | bold_a , 0 ) = italic_e start_POSTSUPERSCRIPT bold_R start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_z - bold_a ) end_POSTSUPERSCRIPT ∫ start_POSTSUBSCRIPT bold_x ( 0 ) = bold_a end_POSTSUBSCRIPT start_POSTSUPERSCRIPT bold_x ( italic_t ) = bold_z end_POSTSUPERSCRIPT italic_P ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) italic_e start_POSTSUPERSCRIPT - ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_U ( bold_x ( italic_r ) ) italic_d italic_r end_POSTSUPERSCRIPT italic_d [ bold_x ] (45)

where 𝐑𝐑\mathbf{R}bold_R is a column vector containing the set of scaled selection strengths (Eq. (39)), the quantity P([𝐱]0t)𝑃superscriptsubscriptdelimited-[]𝐱0𝑡P([\mathbf{x}]_{0}^{t})italic_P ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) is the analogue of the neutral, mutation-free, probability of a trajectory in a Wright-Fisher model, W(0,0)([𝐱]0t)superscript𝑊00superscriptsubscriptdelimited-[]𝐱0𝑡W^{(0,0)}\left([\mathbf{x}]_{0}^{t}\right)italic_W start_POSTSUPERSCRIPT ( 0 , 0 ) end_POSTSUPERSCRIPT ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ), while U(𝐱)𝑈𝐱U\left(\mathbf{x}\right)italic_U ( bold_x ) is given by Eq. (43). An interesting feature is the way selection enters Eq. (45), in both the prefactor, e𝐑T(𝐳𝐚)superscript𝑒superscript𝐑𝑇𝐳𝐚e^{\mathbf{R}^{T}\left(\mathbf{z}-\mathbf{a}\right)}italic_e start_POSTSUPERSCRIPT bold_R start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( bold_z - bold_a ) end_POSTSUPERSCRIPT and within U(𝐱)𝑈𝐱U\left(\mathbf{x}\right)italic_U ( bold_x ) in forms that involve the vectors and matrices that occur in the problem. Additionally, a diffusion analysis would suggest that all occurrences of the population size, N𝑁Nitalic_N, are replaced by the effective population size, Nesubscript𝑁𝑒N_{e}italic_N start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT.

In the special cases considered above, of no mutation and weak selection, the ‘selectively controlled’ quantities C([x]0t)𝐶superscriptsubscriptdelimited-[]𝑥0𝑡C\left([x]_{0}^{t}\right)italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) and C([𝐱]0t)𝐶superscriptsubscriptdelimited-[]𝐱0𝑡C\left([\mathbf{x}]_{0}^{t}\right)italic_C ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ), for two and n𝑛nitalic_n alleles, respectively, both naturally split into two terms (see Eqs. (21) and (41)). One of the terms has dependence on only the initial and final frequencies of the transition probability, and has no dependence of the frequencies taken by trajectories at intermediate times; it is natural to call this a boundary term. To leading order in selection coefficients, the boundary term changes sign when the sign of all selection coefficients are reversed (for two alleles reversal entails ss𝑠𝑠s\rightarrow-sitalic_s → - italic_s; for n𝑛nitalic_n alleles, 𝐬𝐬𝐬𝐬\mathbf{s}\rightarrow-\mathbf{s}bold_s → - bold_s). The boundary term is thus the primary place that the deleterious or beneficial effect of a mutation manifests itself. The other term (U([x]0t)𝑈superscriptsubscriptdelimited-[]𝑥0𝑡U\left([x]_{0}^{t}\right)italic_U ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) and U([𝐱]0t)𝑈superscriptsubscriptdelimited-[]𝐱0𝑡U\left([\mathbf{x}]_{0}^{t}\right)italic_U ( [ bold_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ), respectively) depends on the frequencies taken by trajectories at all times, from the initial time to the final time. The U𝑈Uitalic_U terms, when large, have the effect of suppressing the contribution of a trajectory. They are a manifestation of the ‘probabilistic cost of selection’ of an entire trajectory. Interestingly, the U𝑈Uitalic_U terms cannot take negative values and remain unaltered when the sign of all selection coefficients are reversed.

In summary, we have presented an exact representation of the transition probability of a Wright-Fisher model in terms of a path integral (in reality a sum over paths/trajectories). Let us conclude with some possible ways that the path integral representation may be of use. We shall restrict our considerations to the case of two alleles, where the main result is given in Eq. (16), since very similar considerations apply to the n𝑛nitalic_n allele case in Eq. (37).

  1. 1.

    The path integral representation may make it easy to carry out an expansion in a small parameter, such as a selection coefficient. This has been carried out for the transition density at intermediate frequencies, under the diffusion approximation[9]. In the present work we have shown that expansion in selection coefficients can also be applied to phenomena such as fixation and loss. There may be many other applications of expansion in a small parameter..

  2. 2.

    A path integral involves trajectories whose contributions generally have different probabilities of occurrence. A possible approximation is where the most probable trajectory, along with trajectories that have with small fluctuations around the most probable trajectory, are used to estimate the path integral. The most probable trajectory may be of interest in its own right, since it may typify the way the population makes a transition between two states of the population over time.

  3. 3.

    The path integral representation involves a fundamental separation of mutation and drift from the process of what is primarily selection, as manifested by the two factors W(0)([x]0t)superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡W^{(0)}\left([x]_{0}^{t}\right)italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) and eC([x]0t)superscript𝑒𝐶superscriptsubscriptdelimited-[]𝑥0𝑡e^{C\left([x]_{0}^{t}\right)}italic_e start_POSTSUPERSCRIPT italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT in Eq. (16). To exploit this separation, we note that while, in this work, we have implicitly assumed that all parameters are independent of the time, a straightforward generalisation of the exact results allows parameters to be time dependent. Then one immediate case of application occurs when just selection fluctuates over time, with selection coefficients drawn each generation from a given distribution, or generated by a random process. In the absence of further knowledge, it is plausibly the case that the relevant transition probability follows from an average over all such selection coefficients. With the average denoted by an overbar, the average of Eq. (16) reads K(z,t|a,0)¯=x(0)=ax(t)=zW(0)([x]0t)eC([x]0t)¯¯𝐾𝑧conditional𝑡𝑎0𝑥𝑡𝑧𝑥0𝑎superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡¯superscript𝑒𝐶superscriptsubscriptdelimited-[]𝑥0𝑡\overline{K(z,t|a,0)}=\overset{x(t)=z}{\underset{x(0)=a}{\sum\cdots\sum}}W^{(0% )}\left([x]_{0}^{t}\right)\overline{e^{C\left([x]_{0}^{t}\right)}}over¯ start_ARG italic_K ( italic_z , italic_t | italic_a , 0 ) end_ARG = start_OVERACCENT italic_x ( italic_t ) = italic_z end_OVERACCENT start_ARG start_UNDERACCENT italic_x ( 0 ) = italic_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) over¯ start_ARG italic_e start_POSTSUPERSCRIPT italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT end_ARG. Thus only the selectively controlled factor is averaged, and this may lead to an effective theory that has new/modified selective terms, compared with the case where selection coefficients are simply set equal to the time-averaged value[22, 23].

  4. 4.

    A different approach, compared to the above three approaches, is to rewrite Eq. (16) in the form K(z,t|a,0)=K(0)(z,t|a,0)×D(z,t|a,0)𝐾𝑧conditional𝑡𝑎0superscript𝐾0𝑧conditional𝑡𝑎0𝐷𝑧conditional𝑡𝑎0K(z,t|a,0)=K^{(0)}(z,t|a,0)\times D(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ) = italic_K start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_z , italic_t | italic_a , 0 ) × italic_D ( italic_z , italic_t | italic_a , 0 ) where K(0)(z,t|a,0)superscript𝐾0𝑧conditional𝑡𝑎0K^{(0)}(z,t|a,0)italic_K start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_z , italic_t | italic_a , 0 ) is the neutral result (Eq. (17)) and

    D(z,t|a,0)=x(0)=ax(t)=zW(0)([x]0t)eC([x]0t)/x(0)=ax(t)=zW(0)([x]0t).𝐷𝑧conditional𝑡𝑎0/𝑥𝑡𝑧𝑥0𝑎superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡superscript𝑒𝐶superscriptsubscriptdelimited-[]𝑥0𝑡𝑥𝑡𝑧𝑥0𝑎superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡D(z,t|a,0)=\left.\overset{x(t)=z}{\underset{x(0)=a}{\sum\cdots\sum}}W^{(0)}% \left([x]_{0}^{t}\right)e^{C\left([x]_{0}^{t}\right)}\right/\overset{x(t)=z}{% \underset{x(0)=a}{\sum\cdots\sum}}W^{(0)}\left([x]_{0}^{t}\right).italic_D ( italic_z , italic_t | italic_a , 0 ) = start_OVERACCENT italic_x ( italic_t ) = italic_z end_OVERACCENT start_ARG start_UNDERACCENT italic_x ( 0 ) = italic_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) italic_e start_POSTSUPERSCRIPT italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT / start_OVERACCENT italic_x ( italic_t ) = italic_z end_OVERACCENT start_ARG start_UNDERACCENT italic_x ( 0 ) = italic_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) . (46)

    We can interpret D(z,t|a,0)𝐷𝑧conditional𝑡𝑎0D(z,t|a,0)italic_D ( italic_z , italic_t | italic_a , 0 ) as an average of the quantity eC([x]0t)superscript𝑒𝐶superscriptsubscriptdelimited-[]𝑥0𝑡e^{C\left([x]_{0}^{t}\right)}italic_e start_POSTSUPERSCRIPT italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT over all neutral trajectories that start at allowed frequency a𝑎aitalic_a at time 00 and end at allowed frequency z𝑧zitalic_z at time t𝑡titalic_t. Applying Jensen’s inequality[24] to Eq. (46) yields D(z,t|a,0)DJ(z,t|a,0)𝐷𝑧conditional𝑡𝑎0subscript𝐷𝐽𝑧conditional𝑡𝑎0D(z,t|a,0)\geq D_{J}(z,t|a,0)italic_D ( italic_z , italic_t | italic_a , 0 ) ≥ italic_D start_POSTSUBSCRIPT italic_J end_POSTSUBSCRIPT ( italic_z , italic_t | italic_a , 0 ) with

    DJ(z,t|a,0)=exp(x(0)=ax(t)=zW(0)([x]0t)C([x]0t)/x(0)=ax(t)=zW(0)([x]0t)).subscript𝐷𝐽𝑧conditional𝑡𝑎0/𝑥𝑡𝑧𝑥0𝑎superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡𝐶superscriptsubscriptdelimited-[]𝑥0𝑡𝑥𝑡𝑧𝑥0𝑎superscript𝑊0superscriptsubscriptdelimited-[]𝑥0𝑡D_{J}(z,t|a,0)=\exp\left(\left.\overset{x(t)=z}{\underset{x(0)=a}{\sum\cdots% \sum}}W^{(0)}\left([x]_{0}^{t}\right)C\left([x]_{0}^{t}\right)\right/\overset{% x(t)=z}{\underset{x(0)=a}{\sum\cdots\sum}}W^{(0)}\left([x]_{0}^{t}\right)% \right).italic_D start_POSTSUBSCRIPT italic_J end_POSTSUBSCRIPT ( italic_z , italic_t | italic_a , 0 ) = roman_exp ( start_OVERACCENT italic_x ( italic_t ) = italic_z end_OVERACCENT start_ARG start_UNDERACCENT italic_x ( 0 ) = italic_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) / start_OVERACCENT italic_x ( italic_t ) = italic_z end_OVERACCENT start_ARG start_UNDERACCENT italic_x ( 0 ) = italic_a end_UNDERACCENT start_ARG ∑ ⋯ ∑ end_ARG end_ARG italic_W start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) ) . (47)

    Thus we find K(z,t|a,0)K(0)(z,t|a,0)×DJ(z,t|a,0)𝐾𝑧conditional𝑡𝑎0superscript𝐾0𝑧conditional𝑡𝑎0subscript𝐷𝐽𝑧conditional𝑡𝑎0K(z,t|a,0)\geq K^{(0)}(z,t|a,0)\times D_{J}(z,t|a,0)italic_K ( italic_z , italic_t | italic_a , 0 ) ≥ italic_K start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_z , italic_t | italic_a , 0 ) × italic_D start_POSTSUBSCRIPT italic_J end_POSTSUBSCRIPT ( italic_z , italic_t | italic_a , 0 ), where the exponent of DJ(z,t|a,0)subscript𝐷𝐽𝑧conditional𝑡𝑎0D_{J}(z,t|a,0)italic_D start_POSTSUBSCRIPT italic_J end_POSTSUBSCRIPT ( italic_z , italic_t | italic_a , 0 ) is a conditional average of C([x]0t)𝐶superscriptsubscriptdelimited-[]𝑥0𝑡C\left([x]_{0}^{t}\right)italic_C ( [ italic_x ] start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT ) over all neutral trajectories that start at frequency a𝑎aitalic_a at time 00 and end at frequency z𝑧zitalic_z at time t𝑡titalic_t

Methods

Here we give details of the calculations underlying Eqs. (10) and (31).

Factorisation of the transition matrix: two alleles

For the case of two alleles, the transition matrix can be expressed as a product of two factors, one of which is independent of selection.

We begin with Eq. (7) for the transition matrix, which we reproduce here for convenience. We have

M(x|x)=(2N2Nx)[F(x)]2Nx[1F(x)]2N(1x)𝑀conditionalsuperscript𝑥𝑥binomial2𝑁2𝑁superscript𝑥superscriptdelimited-[]𝐹𝑥2𝑁superscript𝑥superscriptdelimited-[]1𝐹𝑥2𝑁1superscript𝑥M(x^{\prime}|x)=\binom{2N}{2Nx^{\prime}}\left[F(x)\right]^{2Nx^{\prime}}\left[% 1-F(x)\right]^{2N(1-x^{\prime})}italic_M ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) [ italic_F ( italic_x ) ] start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT [ 1 - italic_F ( italic_x ) ] start_POSTSUPERSCRIPT 2 italic_N ( 1 - italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT (48)

where

F(x)=fsel(fmut(x)).𝐹𝑥subscript𝑓𝑠𝑒𝑙subscript𝑓𝑚𝑢𝑡𝑥F(x)=f_{sel}(f_{mut}(x)).italic_F ( italic_x ) = italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( italic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) ) . (49)

In the absence of selection, F(x)𝐹𝑥F(x)italic_F ( italic_x ) reduces to fmut(x)subscript𝑓𝑚𝑢𝑡𝑥f_{mut}(x)italic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) and M(x|x)𝑀conditionalsuperscript𝑥𝑥M(x^{\prime}|x)italic_M ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) reduces M(0)(x|x)superscript𝑀0conditionalsuperscript𝑥𝑥M^{(0)}(x^{\prime}|x)italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ), as given by

M(0)(x|x)=(2N2Nx)[fmut(x)]2Nx[1fmut(x)]2N(1x)(2N2Nx)(x)2Nx(1x)2N(1x)superscript𝑀0conditionalsuperscript𝑥𝑥binomial2𝑁2𝑁superscript𝑥superscriptdelimited-[]subscript𝑓𝑚𝑢𝑡𝑥2𝑁superscript𝑥superscriptdelimited-[]1subscript𝑓𝑚𝑢𝑡𝑥2𝑁1superscript𝑥binomial2𝑁2𝑁superscript𝑥superscriptsuperscript𝑥2𝑁superscript𝑥superscript1superscript𝑥2𝑁1superscript𝑥M^{(0)}(x^{\prime}|x)=\binom{2N}{2Nx^{\prime}}\left[f_{mut}(x)\right]^{2Nx^{% \prime}}\left[1-f_{mut}(x)\right]^{2N(1-x^{\prime})}\equiv\binom{2N}{2Nx^{% \prime}}\left(x^{\ast}\right)^{2Nx^{\prime}}\left(1-x^{\ast}\right)^{2N(1-x^{% \prime})}italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) [ italic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) ] start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT [ 1 - italic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) ] start_POSTSUPERSCRIPT 2 italic_N ( 1 - italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT ≡ ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT ( 1 - italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 italic_N ( 1 - italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT (50)

where we have set

x=fmut(x).superscript𝑥subscript𝑓𝑚𝑢𝑡𝑥x^{\ast}=f_{mut}(x).italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = italic_f start_POSTSUBSCRIPT italic_m italic_u italic_t end_POSTSUBSCRIPT ( italic_x ) . (51)

To establish factorisation we use the adopted form of selection in Eq. (2), namely fsel(x)=x+σ(x)x(1x)subscript𝑓𝑠𝑒𝑙𝑥𝑥𝜎𝑥𝑥1𝑥f_{sel}(x)=x+\sigma(x)x(1-x)italic_f start_POSTSUBSCRIPT italic_s italic_e italic_l end_POSTSUBSCRIPT ( italic_x ) = italic_x + italic_σ ( italic_x ) italic_x ( 1 - italic_x ) to write F(x)=x[1+σ(x)(1x)]𝐹𝑥superscript𝑥delimited-[]1𝜎superscript𝑥1superscript𝑥F(x)=x^{\ast}\left[1+\sigma(x^{\ast})(1-x^{\ast})\right]italic_F ( italic_x ) = italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT [ 1 + italic_σ ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ( 1 - italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ]. Similarly we have 1F(x)=(1x)[1σ(x)x]1𝐹𝑥1superscript𝑥delimited-[]1𝜎superscript𝑥superscript𝑥1-F(x)=\left(1-x^{\ast}\right)\left[1-\sigma(x^{\ast})x^{\ast}\right]1 - italic_F ( italic_x ) = ( 1 - italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) [ 1 - italic_σ ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ]. These allow us to write Eq. (48) as

M(x|x)𝑀conditionalsuperscript𝑥𝑥\displaystyle M(x^{\prime}|x)italic_M ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) =(2N2Nx){x[1+σ(x)(1x)]}2Nx{(1x)[1σ(x)x]}2N(1x)absentbinomial2𝑁2𝑁superscript𝑥superscriptsuperscript𝑥delimited-[]1𝜎superscript𝑥1superscript𝑥2𝑁superscript𝑥superscript1superscript𝑥delimited-[]1𝜎superscript𝑥superscript𝑥2𝑁1superscript𝑥\displaystyle=\binom{2N}{2Nx^{\prime}}\left\{x^{\ast}\left[1+\sigma(x^{\ast})(% 1-x^{\ast})\right]\right\}^{2Nx^{\prime}}\left\{\left(1-x^{\ast}\right)\left[1% -\sigma(x^{\ast})x^{\ast}\right]\right\}^{2N(1-x^{\prime})}= ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) { italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT [ 1 + italic_σ ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ( 1 - italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ] } start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT { ( 1 - italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) [ 1 - italic_σ ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ] } start_POSTSUPERSCRIPT 2 italic_N ( 1 - italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT
=(2N2Nx)(x)2Nx(1x)2N(1x)×[1+σ(x)(1x)]2Nx[1σ(x)x]2N(1x)absentbinomial2𝑁2𝑁superscript𝑥superscriptsuperscript𝑥2𝑁superscript𝑥superscript1superscript𝑥2𝑁1superscript𝑥superscriptdelimited-[]1𝜎superscript𝑥1superscript𝑥2𝑁superscript𝑥superscriptdelimited-[]1𝜎superscript𝑥superscript𝑥2𝑁1superscript𝑥\displaystyle=\binom{2N}{2Nx^{\prime}}\left(x^{\ast}\right)^{2Nx^{\prime}}% \left(1-x^{\ast}\right)^{2N(1-x^{\prime})}\times\left[1+\sigma(x^{\ast})(1-x^{% \ast})\right]^{2Nx^{\prime}}\left[1-\sigma(x^{\ast})x^{\ast}\right]^{2N(1-x^{% \prime})}= ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT ( 1 - italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 italic_N ( 1 - italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT × [ 1 + italic_σ ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ( 1 - italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ] start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT [ 1 - italic_σ ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ] start_POSTSUPERSCRIPT 2 italic_N ( 1 - italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT
M(0)(x|x)×eC(x|x)absentsuperscript𝑀0conditionalsuperscript𝑥𝑥superscript𝑒𝐶conditionalsuperscript𝑥𝑥\displaystyle\equiv M^{(0)}(x^{\prime}|x)\times e^{C(x^{\prime}|x)}≡ italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) × italic_e start_POSTSUPERSCRIPT italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) end_POSTSUPERSCRIPT (52)

where

C(x|x)=2N×[xln(1+σ(x)(1x))+(1x)ln(1σ(x)x)].𝐶conditionalsuperscript𝑥𝑥2𝑁delimited-[]superscript𝑥1𝜎superscript𝑥1superscript𝑥1superscript𝑥1𝜎superscript𝑥superscript𝑥C(x^{\prime}|x)=2N\times\big{[}x^{\prime}\ln\big{(}1+\sigma(x^{\ast})\left(1-x% ^{\ast}\right)\big{)}+(1-x^{\prime})\ln\big{(}1-\sigma(x^{\ast})x^{\ast}\big{)% }\big{]}.italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) = 2 italic_N × [ italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT roman_ln ( 1 + italic_σ ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ( 1 - italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ) + ( 1 - italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_ln ( 1 - italic_σ ( italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ] . (53)

Equation (52) represents an exact decomposition of the transition matrix, M(x|x)𝑀conditionalsuperscript𝑥𝑥M(x^{\prime}|x)italic_M ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ), into the product of a transition matrix, M(0)(x|x)superscript𝑀0conditionalsuperscript𝑥𝑥M^{(0)}(x^{\prime}|x)italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ), that is independent of selection, and a factor eC(x|x)superscript𝑒𝐶conditionalsuperscript𝑥𝑥e^{C(x^{\prime}|x)}italic_e start_POSTSUPERSCRIPT italic_C ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_x ) end_POSTSUPERSCRIPT which depends on selection, and is unity in the absence of selection.

Factorisation of the transition matrix: n alleles

For the case of n𝑛nitalic_n alleles, the transition matrix can again be expressed as a product of two factors, one of which is independent of selection.

We begin with Eqs. (28) and (25), which we reproduce here for convenience:

M(𝐱|𝐱)=(2N2N𝐱)i=1n[Fi(𝐱)]2Nxi𝑀conditionalsuperscript𝐱𝐱binomial2𝑁2𝑁superscript𝐱superscriptsubscriptproduct𝑖1𝑛superscriptdelimited-[]subscript𝐹𝑖𝐱2𝑁superscriptsubscript𝑥𝑖M(\mathbf{x}^{\prime}|\mathbf{x})=\binom{2N}{2N\mathbf{x}^{\prime}}\prod% \nolimits_{i=1}^{n}\left[F_{i}(\mathbf{x})\right]^{2Nx_{i}^{\prime}}italic_M ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT [ italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x ) ] start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT (54)

and

Fi(𝐱)=xi×[1+Gi(𝐱)]subscript𝐹𝑖𝐱superscriptsubscript𝑥𝑖delimited-[]1subscript𝐺𝑖superscript𝐱F_{i}(\mathbf{x})=x_{i}^{\ast}\times[1+G_{i}(\mathbf{x}^{\ast})]italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x ) = italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT × [ 1 + italic_G start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ] (55)

in which

𝐱=𝐐𝐱.superscript𝐱𝐐𝐱\mathbf{x}^{\ast}=\mathbf{Qx}.bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = bold_Qx . (56)

In the absence of selection, 𝐅(𝐱)𝐅𝐱\mathbf{F}(\mathbf{x})bold_F ( bold_x ) reduces to 𝐱superscript𝐱\mathbf{x}^{\ast}bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPTand M(𝐱|𝐱)𝑀conditionalsuperscript𝐱𝐱M(\mathbf{x}^{\prime}|\mathbf{x})italic_M ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) reduces to M(0)(𝐱|𝐱)superscript𝑀0conditionalsuperscript𝐱𝐱M^{(0)}(\mathbf{x}^{\prime}|\mathbf{x})italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ), as given by

M(0)(𝐱|𝐱)=(2N2N𝐱)i=1n(xi)2Nx.superscript𝑀0conditionalsuperscript𝐱𝐱binomial2𝑁2𝑁superscript𝐱superscriptsubscriptproduct𝑖1𝑛superscriptsuperscriptsubscript𝑥𝑖2𝑁superscript𝑥M^{(0)}(\mathbf{x}^{\prime}|\mathbf{x})=\binom{2N}{2N\mathbf{x}^{\prime}}\prod% \nolimits_{i=1}^{n}\left(x_{i}^{\ast}\right)^{2Nx^{\prime}}.italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT . (57)

To establish a factorisation we note that Eq. (55) allows us to write Eq. (54) as

M(𝐱|𝐱)𝑀conditionalsuperscript𝐱𝐱\displaystyle M(\mathbf{x}^{\prime}|\mathbf{x})italic_M ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) =(2N2N𝐱)i=1n{xi[1+Gi(𝐱)]}2Nxi=(2N2N𝐱)i=1n(xi)2Nxi×i=1n[1+Gi(𝐱)]2Nxiabsentbinomial2𝑁2𝑁superscript𝐱superscriptsubscriptproduct𝑖1𝑛superscriptsuperscriptsubscript𝑥𝑖delimited-[]1subscript𝐺𝑖superscript𝐱2𝑁superscriptsubscript𝑥𝑖binomial2𝑁2𝑁superscript𝐱superscriptsubscriptproduct𝑖1𝑛superscriptsuperscriptsubscript𝑥𝑖2𝑁superscriptsubscript𝑥𝑖superscriptsubscriptproduct𝑖1𝑛superscriptdelimited-[]1subscript𝐺𝑖superscript𝐱2𝑁superscriptsubscript𝑥𝑖\displaystyle=\binom{2N}{2N\mathbf{x}^{\prime}}\prod\nolimits_{i=1}^{n}\left\{% x_{i}^{\ast}\left[1+G_{i}(\mathbf{x}^{\ast})\right]\right\}^{2Nx_{i}^{\prime}}% =\binom{2N}{2N\mathbf{x}^{\prime}}\prod\nolimits_{i=1}^{n}\left(x_{i}^{\ast}% \right)^{2Nx_{i}^{\prime}}\times\prod\nolimits_{i=1}^{n}\left[1+G_{i}(\mathbf{% x}^{\ast})\right]^{2Nx_{i}^{\prime}}= ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT { italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT [ 1 + italic_G start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ] } start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT × ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT [ 1 + italic_G start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ] start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT
=M(0)(𝐱|𝐱)×eC(𝐱|𝐱)absentsuperscript𝑀0conditionalsuperscript𝐱𝐱superscript𝑒𝐶conditionalsuperscript𝐱𝐱\displaystyle=M^{(0)}(\mathbf{x}^{\prime}|\mathbf{x})\times e^{C(\mathbf{x}^{% \prime}|\mathbf{x})}= italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) × italic_e start_POSTSUPERSCRIPT italic_C ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) end_POSTSUPERSCRIPT (58)

where

M(0)(𝐱|𝐱)=(2N2N𝐱)i=1n(xi)2Nxisuperscript𝑀0conditionalsuperscript𝐱𝐱binomial2𝑁2𝑁superscript𝐱superscriptsubscriptproduct𝑖1𝑛superscriptsuperscriptsubscript𝑥𝑖2𝑁superscriptsubscript𝑥𝑖M^{(0)}(\mathbf{x}^{\prime}|\mathbf{x})=\binom{2N}{2N\mathbf{x}^{\prime}}\prod% \nolimits_{i=1}^{n}\left(x_{i}^{\ast}\right)^{2Nx_{i}^{\prime}}italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) = ( FRACOP start_ARG 2 italic_N end_ARG start_ARG 2 italic_N bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ) ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 italic_N italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT (59)

and

C(𝐱|𝐱)=2Ni=1nxiln(1+Gi(𝐱)).𝐶conditionalsuperscript𝐱𝐱2𝑁superscriptsubscript𝑖1𝑛superscriptsubscript𝑥𝑖1subscript𝐺𝑖superscript𝐱C(\mathbf{x}^{\prime}|\mathbf{x})=2N\sum\nolimits_{i=1}^{n}x_{i}^{\prime}\ln% \left(1+G_{i}(\mathbf{x}^{\ast})\right).italic_C ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) = 2 italic_N ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT roman_ln ( 1 + italic_G start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( bold_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ) . (60)

Equation (58) represents an exact decomposition of the transition matrix, M(𝐱|𝐱)𝑀conditionalsuperscript𝐱𝐱M(\mathbf{x}^{\prime}|\mathbf{x})italic_M ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ), into the product of a transition matrix, M(0)(𝐱|𝐱)superscript𝑀0conditionalsuperscript𝐱𝐱M^{(0)}(\mathbf{x}^{\prime}|\mathbf{x})italic_M start_POSTSUPERSCRIPT ( 0 ) end_POSTSUPERSCRIPT ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ), that is independent of selection, and a factor eC(𝐱|𝐱)superscript𝑒𝐶conditionalsuperscript𝐱𝐱e^{C(\mathbf{x}^{\prime}|\mathbf{x})}italic_e start_POSTSUPERSCRIPT italic_C ( bold_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | bold_x ) end_POSTSUPERSCRIPT which depends on selection, and is unity in the absence of selection.

References

  • [1] Fisher, R. A. On the dominance ratio. \JournalTitleProceedings of the Royal Society of Edinburgh 42, 321–341 (1922).
  • [2] Wright, S. G. Evolution in Mendelian populations. \JournalTitleGenetics 16, 97–159 (1931).
  • [3] Ewens, W. Mathematical Population Genetics I. Theoretical Introduction, 2nd Edition (Springer-Verlag, New York, 2004).
  • [4] Haller, B. C. & Messer, P. W. Forward genetic simulations beyond the Wright-Fisher model. \JournalTitleMolecular Biology and Evolution 36, 632–637 (2019).
  • [5] Tataru, P., Simonsen, M., Bataillon, T. & Hobolth, A. Inference in the Wright-Fisher model using allele frequency aata. \JournalTitleSystematic Biology 66, e30–e46 (2017).
  • [6] Imhof, L. & Nowak, M. Evolutionary game dynamics in a Wright-Fisher process. \JournalTitleJournal of Mathematical Biology 52, 667–681 (2006).
  • [7] Fu, Y.-X. Exact coalescent for the Wright-Fisher model. \JournalTitleTheoretical Population Biology 69, 385–394 (2006).
  • [8] Kimura, M. Stochastic processes and distribution of gene frequencies under natural selection. \JournalTitleCold Spring Harbor Symposium in Quantative Biology 20, 33–53 (1955).
  • [9] Schraiber, J. G. A path integral formulation of the Wright–Fisher process with genic selection. \JournalTitleTheoretical Population Biology 92, 30–35 (2014).
  • [10] Feynman, R. P. & Hibbs, A. Quantum Mechanics and Path-Integrals (McGraw-Hill, New York, 1965).
  • [11] Koval’chik, I. M. The Wiener integral. \JournalTitleRussian Mathematical Surveys 18(1), 97–134, DOI: DOI:10.1070/RM1963v018n01ABEH004126 (1963).
  • [12] Kleinert, H. Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets, 5th Edition (World Scientifc, Singapore, 2009).
  • [13] Okay, T. S., Oliveira, W. P., Raiz-J´unior, R., Rodrigues, J. C. & Negro, G. M. B. D. Frequency of the deltaf508 mutation in 108 cystic fibrosis patients in Sao Paulo: Comparison with reported Brazilian data. \JournalTitleClinics 60(2), DOI: 10.1590/S1807-59322005000200009 (2005).
  • [14] Hodgkinson, A. & Eyre-Walker, A. Human triallelic sites: Evidence for a new mutational mechanism? \JournalTitleGenetics 184(1), DOI: 10.1534/genetics.109.110510 (2010).
  • [15] Campbell, I. M. et al. Multiallelic positions in the human genome: Challenges for genetic analyses. \JournalTitleHuman Mutation 37(3), 231–234, DOI: 10.1002/humu.22944 (2016).
  • [16] Phillips, C. et al. A compilation of tri-allelic snps from 1000 genomes and use of the most polymorphic loci for a large-scale human identification panel. \JournalTitleForensic Science International: Genetics 46:102232, DOI: 10.1016/j.fsigen.2020.102232 (2020).
  • [17] Eyre-Walker, A. & Keightley, P. D. Estimating the rate of adaptive molecular evolution in the presence of slightly deleterious mutations and population size change. \JournalTitleMolecular Biology and Evolution 26, 2097–2108, DOI: 10.1093/molbev/msp119 (2009).
  • [18] Hoppensteadt, F. C. Mathematical Methods of Population Biology (Cambridge University Press, Cambridge, 1982).
  • [19] Zhao, L., Gossmann, T. I. & Waxman, D. A modified Wright–Fisher model that incorporates Nesubscript𝑁𝑒{N}_{e}italic_N start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT: A variant of the standard model with increased biological realism and reduced computational complexity. \JournalTitleJournal of Theoretical Biology 393, 218–228 (2016).
  • [20] Waxman, D. Fixation at a locus with multiple alleles: Structure and solution of the wright fisher model. \JournalTitleJournal of Theoretical Biology 257, 245–251 (2009).
  • [21] McKane, A. & Waxman, D. Singular solutions of the diffusion equation of population genetics. \JournalTitleJournal of Theoretical Biology 247, 849–858 (2007).
  • [22] Takahata, N., Ishii, K. & Matsuda, H. Effect of temporal fluctuation of selection coefficient on gene frequency in a population. \JournalTitleGenetics 72, 4541–4545 (1975).
  • [23] Waxman, D. A unified treatment of the probability of fixation when population size and the strength of selection change over time. \JournalTitleGenetics 188, 907–913 (2011).
  • [24] Ross, S. M. Introduction to Probability Models (13th ed.) (Academic Press, Oxford, 2024).