1. INTRODUCTION

that is of order $\sigma^m$.

For the discovery time, there is no miracle: before discovering

the master sequence, the process is likely to explore a significant portion of the

genotype space, hence the discovery time should be of order

$$\operatorname{card}\,\{A,T,G,C\}^{\ell} = 4^{\ell}.$$

These simple heuristics indicate that the persistence time depends on the selection

drift, while the discovery time depends on the spatial entropy. Suppose that we send

$m$ and $\ell$ to $\infty$ simultaneously. If the discovery time is much larger than the persistence

time, then the population will be neutral most of the time and the fraction of

the master sequence at equilibrium will be null. If the persistence time is much

larger than the discovery time, then the population will be invaded by the master

sequence most of the time and the fraction of the master sequence at equilibrium

will be positive. Thus the master sequence vanishes in the regime

$$m, \ell \to +\infty, \qquad \frac{m}{\ell} \to 0,$$

while a quasispecies might be formed in the regime

$$m, \ell \to +\infty, \qquad \frac{m}{\ell} \to +\infty.$$

This leads to an interesting feature, namely the existence of a critical population size

for the emergence of a quasispecies. For chromosomes of length $\ell$, a quasispecies

can be formed only if the population size $m$ is such that the ratio $m/\ell$ is large

enough. To go further, we must put these heuristics on firmer ground, and

we should take the mutations into account when estimating the persistence time.
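
The comparison of the two heuristic time scales can be sketched numerically. The snippet below is only an illustration of the crude orders of magnitude $\sigma^m$ (persistence) and $4^\ell$ (discovery) quoted in the text; the function names `log_time_gap` and `heuristic_regime` are hypothetical, and the comparison ignores mutations, so it should not be read as the refined criterion.

```python
import math

def log_time_gap(m: int, ell: int, sigma: float) -> float:
    """Return log(sigma^m) - log(4^ell), working in logs to avoid overflow.

    Assumption: persistence time ~ sigma^m and discovery time ~ 4^ell,
    i.e. only the crude heuristics, with mutations ignored.
    """
    if sigma <= 1.0:
        raise ValueError("the heuristic assumes a selective advantage sigma > 1")
    return m * math.log(sigma) - ell * math.log(4.0)

def heuristic_regime(m: int, ell: int, sigma: float) -> str:
    """Say which heuristic time scale is larger for the given parameters."""
    gap = log_time_gap(m, ell, sigma)
    return "persistence dominates" if gap > 0 else "discovery dominates"

if __name__ == "__main__":
    for m, ell in [(10, 1000), (100, 100), (1000, 10)]:
        print(m, ell, heuristic_regime(m, ell, sigma=2.0))
```

With these crude orders, the two time scales balance when $m \ln\sigma = \ell \ln 4$, which is one way to see why the ratio $m/\ell$ is the relevant quantity.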

The main problem is to obtain finer estimates on the persistence and discovery

times. We cannot compute explicitly the laws of these random times, so we will

compare the Moran model with simpler processes.

In the non-neutral populations, we shall compare the process with a birth and death

process $(Z_n)_{n\geq 0}$ on $\{0,\dots,m\}$, which is precisely the one introduced by Nowak

and Schuster [32]. The value $Z_n$ approximates the number of copies of the master

sequence present in the population. For birth and death processes, explicit formulas

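are available and we obtain that, if $\ell, m \to +\infty$, $q \to 0$, $\ell q \to a \in\, ]0,+\infty[$, then

$$\text{persistence time} \sim \exp\big(m\,\varphi(a)\big),$$

where

$$\varphi(a) = \frac{\sigma(1-e^{-a})\,\ln\dfrac{\sigma(1-e^{-a})}{\sigma-1} + \ln(\sigma e^{-a})}{1-\sigma(1-e^{-a})}.$$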
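
To get a feel for the exponent, the expression for φ(a) can be evaluated numerically. The sketch below assumes the formula reads φ(a) = [σ(1−e^{−a}) ln(σ(1−e^{−a})/(σ−1)) + ln(σe^{−a})] / (1 − σ(1−e^{−a})); the function name `phi` and the sample value σ = 4 are illustrative choices, not taken from the text. It only evaluates this expression and makes no claim about the range of a on which the asymptotic estimate holds.

```python
import math

def phi(a: float, sigma: float) -> float:
    """Evaluate the displayed expression for phi(a), with sigma > 1 and a > 0."""
    if sigma <= 1.0 or a <= 0.0:
        raise ValueError("expected sigma > 1 and a > 0")
    # x = sigma * (1 - e^{-a}); expm1 keeps precision when a is small.
    x = -sigma * math.expm1(-a)
    denom = 1.0 - x
    if abs(denom) < 1e-12:
        # The expression is 0/0 at e^{-a} = 1 - 1/sigma; this sketch does not
        # implement the limiting value there.
        raise ZeroDivisionError("denominator vanishes at this value of a")
    # ln(sigma * e^{-a}) is written as ln(sigma) - a.
    numer = x * math.log(x / (sigma - 1.0)) + math.log(sigma) - a
    return numer / denom

if __name__ == "__main__":
    sigma = 4.0
    for a in (1e-6, 0.1, 0.5, 1.0, math.log(sigma)):
        print(f"a = {a:.6f}   phi(a) = {phi(a, sigma):.6f}")
```

Two sanity checks follow from the formula itself: as a → 0 the value tends to ln σ, which recovers the heuristic order σ^m for the persistence time, and at a = ln σ the numerator vanishes, so the exponent is zero there.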

In the neutral populations, we shall replace the process by a random walk on

the genotype space $\{A,T,G,C\}^{\ell}$, which has $4^{\ell}$ elements. The lumped version of this random walk behaves like an

Ehrenfest process $(Y_n)_{n\geq 0}$ on $\{0,\dots,\ell\}$ (see [5] for a nice review). The value $Y_n$

represents the distance of the walker to the master sequence. A celebrated theorem

of Kac from 1947 [21], which helped to resolve a famous paradox of statistical

mechanics, yields that, when $\ell \to \infty$,

$$\text{discovery time} \sim 4^{\ell}.$$
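
The order 4^ℓ can be checked exactly on a small model. The sketch below makes an assumption the text does not spell out: at each step the walker picks one of the ℓ sites uniformly at random and replaces its letter by one of the three other letters, chosen uniformly. Under that assumption the distance to the master sequence is a birth and death chain on {0, ..., ℓ}, and the code computes the mean return time to distance 0 in exact rational arithmetic. The function name `mean_return_time` is hypothetical, and the mean return time is a related quantity, not necessarily the discovery time as defined in the text.

```python
from fractions import Fraction

def mean_return_time(ell: int) -> Fraction:
    """Mean return time to distance 0 for the lumped walk on {0, ..., ell}.

    Assumed dynamics (not stated in the text): from distance d, the walk moves
    to d - 1 with probability d / (3 * ell), to d + 1 with probability
    (ell - d) / ell, and stays put otherwise.
    """
    if ell < 1:
        raise ValueError("ell must be a positive integer")
    # t holds the expected time to go from d + 1 down to d; it is multiplied
    # by zero at d = ell, since the walk cannot move up from there.
    t = Fraction(0)
    for d in range(ell, 0, -1):
        p_down = Fraction(d, 3 * ell)
        p_up = Fraction(ell - d, ell)
        # First-step analysis: t_d = (1 + p_up * t_{d+1}) / p_down.
        t = (1 + p_up * t) / p_down
    # From distance 0 the first step always leads to distance 1.
    return 1 + t

if __name__ == "__main__":
    for ell in range(1, 7):
        print(ell, mean_return_time(ell), 4 ** ell)
```

For this walk the mean return time equals 4^ℓ exactly, in agreement with the stationary weight 4^{−ℓ} of the master sequence under the uniform distribution on the genotype space.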