# Evolution in Finite Populations

Modelling stochastic evolutionary dynamics from first principles (MS thesis project)

*June 2022 - Present. Supervisors: Dr. Vishwesha Guttal, Prof. Rohini Balakrishnan*

# Part 1: Noise-induced selection and its effects in finite populations of non-constant size

Many central questions of population biology relate to the number of distinct variants of an entity that can emerge and coexist in a closed population. Examples include the study of standing genetic variation (alleles), polymorphisms (genotypes/phenotypes), and sympatric speciation (species). Historically, these questions have often been studied through phenomenological models formulated for infinitely large populations. However, we increasingly realize that demographic stochasticity can have important consequences for evolution. For example, in adaptive diversification models, demographic stochasticity can delay or prevent evolutionary branching in finite populations.

While we understand natural selection very generally through frameworks such as the Price equation, finite population models in population genetics, such as the Wright-Fisher or Moran models, often assume a fixed population size, which greatly limits their generality. For my Master's dissertation, I worked with Prof. Vishwesha Guttal and Prof. Rohini Balakrishnan at the Centre for Ecological Sciences at IISc Bangalore to build a more general analytical (mathematical) theory for evolution in finite populations from first principles, using ideas from statistical physics and stochastic processes. Starting from a density-dependent 'birth-death process' describing a population of individuals with discrete traits, I derive stochastic differential equations (SDEs) for how the relative population sizes and trait frequencies change over time. In the infinite population limit, these SDEs recover well-known results such as the replicator-mutator equation, the Price equation, and Fisher's fundamental theorem. This illustrates consistency with known formal descriptions of evolution and shows that these equations are 'universal', in the sense that almost any sufficiently large population will satisfy them, no matter how complicated its biology. For finite populations, the same SDEs generically reveal a directional evolutionary force, 'noise-induced selection', that is particular to finite, fluctuating populations and is present even when all types have the same fitness. The strength of noise-induced selection is proportional to the difference in turnover rates between types and inversely proportional to the total population size. Noise-induced selection can reverse the direction of evolution predicted by infinite-population frameworks. This general derivation of evolutionary dynamics helps unify and organize several previous studies (typically performed for specific evolutionary and ecological contexts) under a single set of equations.
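
To see where this force comes from, here is a minimal two-type sketch. The notation is chosen purely for illustration and is not necessarily the notation used in the thesis: $x_i$ is the density of type $i$ (number of individuals divided by a system-size parameter $K$), $w_i$ and $\tau_i$ are assumed to be its per-capita growth rate (births minus deaths) and per-capita turnover rate (births plus deaths), and the noise terms $W_1$, $W_2$ are assumed to be independent Wiener processes. Applying Itô's formula to the frequency $p = x_1/(x_1 + x_2)$, with total density $N = x_1 + x_2$, gives:

```latex
\begin{align}
dx_i &= x_i w_i \, dt + \sqrt{\frac{x_i \tau_i}{K}} \, dW_i, \qquad i = 1, 2, \\
dp &= p(1-p)\left[ (w_1 - w_2) - \frac{\tau_1 - \tau_2}{K N} \right] dt
      + \sqrt{\frac{p(1-p)\left[(1-p)\,\tau_1 + p\,\tau_2\right]}{K N}} \, dW .
\end{align}
```

Here the two noise contributions to $dp$ have been combined into a single effective Wiener process $W$ with the same variance. The first drift term is classical selection. The second comes from the second-order Itô correction: it survives when $w_1 = w_2$, favors the type with the lower turnover rate, and scales as the difference in turnover rates divided by the total population size $KN$.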

The formalism of birth-death processes and their description via master equations is well known to physicists and mathematicians. My thesis focuses on what the master equation and system-size expansion formalism implies for biological populations, an application that is relatively new. In particular, two biological observations have major implications for the behavior of these systems: (i) birth and death rates in biological populations must always admit per-capita descriptions, and (ii) ecological interactions are described in terms of individuals and densities, whereas evolutionary processes are described in terms of frequencies, which calls for a non-linear change of variables that requires Itô's formula. Remarkably, we find that we can (asymptotically) recover standard equations of theoretical biology such as the replicator equation and the Price equation with just these observations. Previous studies have illustrated the presence of noise-induced selection in specific model systems from diverse areas of biology such as epidemiology, heterogamety, life-history evolution, and social evolution. Our equations clarify and unite these studies by capturing their results in a single set of very general stochastic differential equations. Our equations also take the form of generalizations of the replicator-mutator and Price equations, which are familiar to biologists, and thus illustrate the effects of noise-induced selection more clearly, without resorting to tools such as slow manifold analysis that are ubiquitous in physics circles. Lastly, our equations make concrete biological predictions, such as systematic deviations from true neutrality despite equal fitness, that can in principle be measured directly. This work is currently being revised for
*The American Naturalist*. A preprint is available on bioRxiv (Bhat and Guttal, 2024). I was invited to give a 40-minute talk about this work at a monthly seminar organized by the Drosophila Ecology and Evolution supergroup in India, and you can listen to that talk here:

# Part 2: A stochastic field theory for modelling the dynamics of populations bearing quantitative traits

Phenotypic traits such as human height are often under the influence of a very large number of genes. Due to the complex genetic and epigenetic factors affecting the expression of such traits, they take on so many values that they can be said to vary approximately 'continuously' over some interval (trait values may take any value in [0,1], for example). Since infinitely many distinct trait values may arise in populations bearing such 'quantitative' traits, these populations cannot be characterized by a vector or matrix containing the number of individuals bearing each trait value. Instead, the population is best characterized by a function or distribution, and the object describing the state of the system at any given point in time is thus, in general, infinite-dimensional. While the mathematics of birth-death processes described above works well for discrete traits, it does not extend easily to the infinite-dimensional stochastic processes that arise when modelling quantitative traits.

Mathematicians model such processes using measure-valued branching processes (MVBPs). However, the theory of MVBPs is still very much in development and is not very accessible to people without formal training in measure theory and functional analysis. In my thesis, I used heuristic ideas from statistical physics to show that the general theory we built for discrete traits above can still be extended to one-dimensional quantitative traits through a 'stochastic field theory'. This leads to field equations that yield standard equations of population genetics, such as the continuous replicator-mutator equation, the Price equation, Kimura's continuum-of-alleles model, and Lande's selection gradient dynamics, in the infinite population limit, and that generalize them to finite populations of non-constant size. The approach consists of describing the population as a stochastic 'field' (a function over space and time), assuming that an ecological carrying capacity exists, and then using ideas from statistical physics to derive stochastic equations that describe how this field changes over time when the carrying capacity is not too small. My framework largely uses only tools from calculus, the calculus of variations, and some heuristics for spacetime white noise. As such, it complements the rigorous measure-theoretic framework presented in previous studies with a formalism that may be more accessible to those without a background in measure theory. The formulation of field equations also means that the powerful heuristic tools of field theories in physics, such as the path integral formalism, can now be used to attack questions about the evolution of quantitative traits in finite populations, though I do not do this in my own work.
The formulation of stochastic field equations for populations bearing quantitative traits using tools from statistical physics is, to the best of our knowledge, also mathematically original and may be of independent interest to applied mathematicians and physicists. The work covering quantitative traits is currently in review in
*Theoretical Population Biology*. I have spoken about the mathematical ideas and steps involved at a lab meeting, and a recording of this talk can be viewed here (this talk assumes that you understand the mathematical ideas used in the discrete trait case, *i.e.* the mathematics of Part 1 above):

If any of this interests you, feel free to read my MS thesis by clicking this link (warning: links to a 175-page PDF). Please don't hesitate to reach out to me if you have any questions or just want to chat about this work!