Comparative analysis of nucleotide substitutions
(Yves Clement, Paz Polak, Peter Arndt)
Mammalian genomes show large-scale regional variations of GC-content (the isochors), but the substitution processes responsible for this structure are poorly understood. We have shown that meiotic recombination has a major impact on substitution patterns in humans and mice driving the evolution of GC-content. Furthermore also other cellular processes have an influence on nucleotide substitutions on a more local scale.
A regional analysis of nucleotide substitution rates along human genes and their flanking regions allowed us to quantify the effect of mutational mechanisms associated with transcription in germ line cells. Our results revealed three distinct patterns of substitution rates. First, a sharp decline in the deamination rate of methylated CpG dinucleotides, which is observed in the vicinity of the 5' end of genes. Second, a strand asymmetry in complementary substitution rates, which extends from the 5' end to 1 kbp downstream from the 3' end, associated with transcription-coupled repair. Finally, a localized strand asymmetry, i.e. an excess of C->T over G->A substitution in the non-template strand confined to the first 1-2 kbp downstream of the 5' end of genes.
Mathematics of evolutionary models
(Barbara Wilhelm, Federico Squartini, Peter Arndt)
Markov models describing the evolution of the nucleotide substitution process are widely used in phylogeny reconstruction. They usually assume the stationarity and time reversibility of this process. Although corresponding models give meaningful results when applied to biological data, it is not clear if the two assumptions hold and, if not, how much sequence evolution processes deviate from them. To this end, we introduced two sets of indices to quantify violations of the above two assumptions using the Kolmogorov cycle conditions for time reversibility.
In the future we try to answer questions about the limitations of parameter estimations in comparative genomics. Especially we want to explore whether the addition of more species improves the estimation of nucleotide substitution rates along a given branch in a phylogeny.
Models of genome evolution
(Florian Massip, Peter Arndt)
In the recent past it has become clear that besides nucleotide substitutions also the insertion and deletion of short pieces of DNA as well as the insertion of repetitive elements have a substantial influence on the evolution of GC isochors in mammals. We found that in the case where insertions happen to be segmental duplications of adjacent sequences, this process is able to generate correlations of the GC-content that fall off like a power law. We have shown that simple expansion randomization systems (ERS) are able to generate long-range correlation of the GC content, which is one of the hallmarks of isochors. A wide range of such ERSs fall within one universality class and the characteristic decay exponent of the correlation function can easily be calculated from the rates of the underlying processes. This result gives us also a simple method to simulate long-range correlated sequences and recently we were able to quantify the influence of such correlations on the alignment statistics of sequence, which turned out to be quite substantial.
Currently we are working on models that describe the evolution of segmental duplications of neutral genomic sequences. These duplications are thought to have no direct function and therefore dissolve into the genomic background by random mutations. However this process carries some fascinating statistical properties, which we are analyzing at the moment.
Spatial and temporal dynamics of the immune repertoire in mice
(Irina Czogiel, Peter Arndt)
In a joint project with immunologists from the Max Planck Institute for Infection Biology (Hedda Wardemann, Christian Busse) we try to accurately quantify the overall size, clonality, and histoanatomical distribution of the immune globulin gene repertoire in mice. Our wet-lab partners have developed a novel experimental approach for acquiring the necessary data that moves away from sequencing bulk isolated B cells. B cells will be isolated from different histoanatomical locations (i.e. from all lymphoid tissues and several non-lymphoid tissues) so that the acquired dataset will contain information of previously inaccessible detail. For the first time, we will be able to quantify the diversity of the Ig gene repertoire on a monoclonal level. Moreover, we will develop a model for the underlying evolutionary phylodynamics of the B cell populations that will increase our understanding of the selection processes that constantly shape the antibody repertoire during B cell development and differentiation.
In vitro selection
(Barbara Wilhelm, Peter Arndt)
The advancements of next generation sequencing technologies give us a novel tool for the quantitative analysis of Systematic Evolution of Ligands by Exponential Enrichment (SELEX) experiments. Such experiments are conducted in close collaboration by the Glökler group (Dept. Lehrach). Starting from a highly diverse pool of DNA sequences, ligands to particular molecules, e.g. transcription factors or other cellular relevant molecules, are enriched through subsequent rounds of selection. In house sequencing capabilities give us the opportunity to sequence the DNA pools after each round of selection. This way we are going to study the dynamics of selection for strong binding ligands in lieu of a highly diverse background of unspecific ligands. Since very high diversities can be charted using Illumina sequencing we will also be able to study non-dominant secondary clones and follow the dynamics of their frequency in the population during rounds of selections. New approaches to cluster and analyze the clonal structure of synthetic sequence pools have to be developed.
(Brian Cusack, Peter Arndt)
Recent studies have hinted at the importance of ‘phenotypic mutations’ (errors made in transcription and translation) in molecular evolution. These are thought to facilitate positive selection for adaptations that require multiple-substitutions but the generality of this phenomenon has yet to be explored. Our research in this area focuses on the importance of phenotypic mutations to negative selection and to the maintenance of genomic robustness by selective constraint. We initially approached this in the context of Nonsense Mediated Decay (NMD)-based surveillance of human gene transcription. We have discovered a pattern of codon usage in human genes that compensates for the variable NMD efficiency by minimizing nonsense errors during transcription. Our future work will focus on whether phenotypic mutations due to other types of mis-transcription constitute a similar selective force.