Coalescent theory
In genetics, coalescent theory is a retrospective model of population
genetics. It employs a sample of individuals from a population to trace all
alleles of a gene shared by all members of the population to a single ancestral
copy, known as the most recent common ancestor (MRCA; sometimes also termed the
co-ancestor to emphasize the coalescent relationship). The inheritance
relationships between alleles are typically represented as a gene genealogy,
similar in form to a phylogenetic tree. This gene
genealogy is also known as the coalescent; understanding the statistical
properties of the coalescent under different assumptions forms the basis of
coalescent theory. The coalescent runs models of genetic drift backward in time
to investigate the genealogy of antecedents. In the
simplest case, coalescent theory assumes no recombination, no natural
selection, and no gene flow or population structure. Advances in coalescent
theory, however, allow extensions of the basic coalescent that can include
recombination, selection, and virtually any arbitrarily complex evolutionary or
demographic model in population genetic analysis. The mathematical theory of
the coalescent was originally developed in the early 1980s by John Kingman.
Theory
Consider two distinct haploid organisms that
differ at a single nucleotide. Tracing the ancestry of these two individuals
backwards in time, one eventually reaches their most recent common ancestor (MRCA),
the point at which the two lineages coalesce.
Time to coalescence
A useful analysis based on coalescent theory seeks to predict the
amount of time elapsed between the introduction of a mutation and the emergence
of a particular allele or gene distribution in a population. This time period
is equal to how long ago the most recent common ancestor existed.
The probability that two lineages coalesce in the immediately preceding
generation is the probability that they share a parent. In a diploid population
of constant size with 2N copies of each locus, there are 2N "potential
parents" in the previous generation, so the probability that two alleles
share a parent is 1/(2N) and, correspondingly, the probability that they do not
coalesce is 1 − 1/(2N).
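The per-generation probability above can be explored numerically. The following is a minimal Python sketch, not a definitive implementation: it assumes that generations are independent, so that the number of generations back to the first coalescence follows a geometric distribution with success probability 1/(2N) and mean 2N. The function names and the value N = 50 are hypothetical choices made for illustration.

```python
import random


def coalescence_probability(t, n_diploid):
    """Probability that two lineages first coalesce exactly t generations back.

    Assumes independent generations, each with coalescence
    probability 1/(2N), i.e. a geometric distribution.
    """
    p = 1.0 / (2 * n_diploid)
    return (1.0 - p) ** (t - 1) * p


def simulate_coalescence_time(n_diploid, rng):
    """Trace two lineages backwards until they pick the same parental copy."""
    copies = 2 * n_diploid  # 2N potential parents per generation
    t = 0
    while True:
        t += 1
        # Each lineage chooses its parent copy uniformly at random.
        if rng.randrange(copies) == rng.randrange(copies):
            return t


rng = random.Random(1)  # fixed seed so the run is repeatable
N = 50                  # hypothetical diploid population size
times = [simulate_coalescence_time(N, rng) for _ in range(20000)]
mean_time = sum(times) / len(times)
print(mean_time)  # should be close to 2N = 100 generations
```

Under these assumptions the simulated mean should approach 2N as the number of replicates grows, which illustrates why pairwise coalescence times scale with population size.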
Graphical representation
Coalescents can be visualised using dendrograms, which show the
relationship of branches of the population to each other. The point where
two branches meet indicates a coalescent event.
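As an illustration of such a dendrogram, the sketch below simulates one random genealogy for a small sample and writes it in Newick notation, a text format that most tree-drawing tools can read. It rests on assumptions not stated in the text above: the standard continuous-time approximation in which, with k lineages remaining, the waiting time to the next coalescent event is exponentially distributed with rate k(k − 1)/2, with time measured in units of 2N generations. The function name and sample labels are hypothetical.

```python
import random


def simulate_genealogy(sample_size, rng):
    """Return (newick_string, tree_height) for one random genealogy.

    Assumption of this sketch: time is measured in units of 2N
    generations and pairs of lineages merge one at a time.
    """
    # Each active lineage is (newick_label, time at which its node sits).
    lineages = [(f"s{i}", 0.0) for i in range(1, sample_size + 1)]
    now = 0.0
    while len(lineages) > 1:
        k = len(lineages)
        # Waiting time to the next coalescent event with k lineages.
        now += rng.expovariate(k * (k - 1) / 2)
        # Choose two distinct lineages uniformly at random to merge.
        i, j = rng.sample(range(k), 2)
        (a, time_a), (b, time_b) = lineages[i], lineages[j]
        # Branch lengths are the time elapsed since each child node.
        merged = f"({a}:{now - time_a:.3f},{b}:{now - time_b:.3f})"
        lineages = [x for idx, x in enumerate(lineages) if idx not in (i, j)]
        lineages.append((merged, now))
    return lineages[0][0] + ";", now


newick, height = simulate_genealogy(5, random.Random(0))
print(newick)   # nested parentheses; each "(" marks one coalescent event
print(height)   # time back to the MRCA of the sample
```

Each pair of parentheses in the output corresponds to a point where two branches meet, so a sample of n sequences yields n − 1 coalescent events.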
Applications
Disease gene mapping
The utility of coalescent theory in the mapping of disease genes is
gradually gaining appreciation. Although the application of the theory is still in
its infancy, a number of researchers are actively developing
algorithms for the analysis of human genetic data that utilise
coalescent theory.
History
Coalescent theory is a natural extension of the more classical
population genetics concept of neutral evolution and is an approximation to the
Fisher–Wright (or Wright–Fisher) model for large populations. It was discovered
independently by several researchers in the 1980s, but the definitive
formalisation is attributed to Kingman. Major contributions
to the development of coalescent theory have been made by Peter Donnelly,
Robert Griffiths, Richard R. Hudson and Simon Tavaré. These
contributions include incorporating variation in population size, recombination and
selection. In 1999 Jim Pitman and Serik Sagitov independently introduced coalescent processes with
multiple collisions of ancestral lineages. Shortly afterwards, the full class of
exchangeable coalescent processes with simultaneous multiple mergers of
ancestral lineages was discovered by Martin Möhle and
Serik Sagitov, and by Jason
Schweinsberg.