Contour map for genetic enhancers
Inventory and prediction of enhancer sequences in the mouse genome
How strongly does a genetic switch affect its gene? A Berlin research team created a registry of genetic enhancers, their location in the genome, and their activation strength in mouse stem cells. In the process, they discovered DNA patterns that had not previously been recognized as switches. Based on these data, the scientists developed a new algorithm that predicts whether a given DNA sequence can act as a gene enhancer in stem cells.
Only about two percent of the mammalian genome contains blueprints for proteins, while some of the remaining DNA is able to control the activity of these genes. Hereditary diseases can therefore arise not only from mutations in the genes themselves, but also when such regulatory sequences are affected.
Among the most important regulators are enhancers. They determine in which tissue and under which circumstances genes are switched on or off. A team of researchers led by Sebastiaan Meijsing of the Max Planck Institute for Molecular Genetics (MPIMG) and the Max Planck Research Unit for the Science of Pathogens (MPUSP) developed a new experimental method to create a map containing the positions of all enhancers in the genome of mouse stem cells. They published their results in the scientific journal Nucleic Acids Research.
More a dimmer than a switch
The new map not only shows which DNA segments are able to activate genes. It also indicates the strength of the enhancers – like a topographic contour map that shows mountains and valleys as well as towns and villages.
“Enhancers don't just flick genes on or off like toggle switches,” Meijsing explains. “They behave more like dimmers that adjust exactly how much a gene needs to be turned on to make the right amount of protein.”
The new genome-wide map thus reveals both the position of an enhancer relative to a gene and its “dimmer” level. The map is intended as a resource for researchers who want to study the relationship between a particular enhancer and its neighboring gene.
Analysis of thousands of genetic snippets
The researchers set out on their mapping expedition with the molecular biology method “STARR-seq,” which requires inserting a genetic snippet into a ring-shaped DNA molecule (a plasmid). This artificial mini-chromosome has the special ability to become active with the help of an enhancer.
The researchers then introduced the plasmids into the cells. If a DNA snippet had enhancer activity, the cell transcribed it, and the stronger the enhancer, the more often the plasmid was transcribed. The strength of each enhancer was then quantified by sequencing the resulting RNA molecules. In this way, enhancers can be identified in parallel from a mixture of countless DNA fragments, which makes the method exceptionally powerful.
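The idea of reading enhancer strength from RNA counts can be illustrated with a small sketch. It assumes a common way of scoring such experiments, namely comparing how often a fragment appears among the sequenced RNA with how often it was present in the plasmid pool. The function name, the pseudocount, and the toy read counts are illustrative assumptions and are not taken from the study.

```python
import math

def enhancer_activity(rna_counts, dna_counts, pseudocount=1.0):
    """Score each fragment as log2 of its RNA share over its plasmid (DNA) share.

    Both libraries are scaled by their total read count so that sequencing
    depth does not bias the ratio. The pseudocount avoids log(0) for
    fragments that were never transcribed.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    if rna_total == 0 or dna_total == 0:
        raise ValueError("both libraries need at least one read")

    scores = {}
    for fragment, dna in dna_counts.items():
        rna = rna_counts.get(fragment, 0)
        rna_share = (rna + pseudocount) / rna_total
        dna_share = (dna + pseudocount) / dna_total
        scores[fragment] = math.log2(rna_share / dna_share)
    return scores

# Made-up read counts for three hypothetical fragments.
dna = {"fragA": 100, "fragB": 100, "fragC": 100}
rna = {"fragA": 400, "fragB": 100, "fragC": 10}

for fragment, score in sorted(enhancer_activity(rna, dna).items()):
    print(fragment, round(score, 2))
```

A higher score means the fragment was transcribed more often than its abundance in the plasmid pool would suggest, which corresponds to the “dimmer” being turned up further.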
Detecting hidden patterns in enhancers
The researchers applied an improved variant of the STARR-seq method to mouse stem cells: “For our refined method, we pre-selected regions where we took a closer look,” says Laura Glaser, scientist on Meijsing's team. “We only analyzed the parts of the genome that were already decompressed, in particular where the cell has been accessing the DNA particularly frequently.” Inactive parts of the genome were dropped from the analysis, increasing accuracy and reducing the error rate.
“Additionally, we optimized the quantification capabilities of the method, enabling us to track down exactly which enhancer was read from the plasmid and how often,” Glaser adds. The team dubbed the improved protocol "FAIRE-STARR-seq."
While mapping, the scientists noticed previously unknown patterns in the DNA sequences of the enhancers. Such sequence motifs are recognized by specialized proteins that dock onto the DNA and drive the transcription of genes: transcription factors, some of the most important regulatory molecules in the cell.
The next step was to induce changes in the stem cells by stimulating them to differentiate into more specialized counterparts. Upon differentiation, new genetic programs are started, and genes that are only important for the stem cell state are switched off. Consequently, some enhancers lose activity while others gain it. “We were able to identify constellations of certain transcription factor motifs that lead to particularly strong enhancer activity, either in pluripotency or in the early differentiation stage,” says Glaser.
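What it means to look for a sequence motif in an enhancer can be shown with a minimal sketch. It assumes the simplest possible representation, a consensus string in IUPAC nucleotide code that is matched on both DNA strands. Motifs are more often described as position weight matrices, and the source does not say which representation the team used. The motif and sequences in the usage lines are arbitrary examples, not motifs reported in the study.

```python
# IUPAC nucleotide codes: each symbol stands for a set of allowed bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def reverse_complement(seq):
    """Return the opposite strand, read in 5' to 3' direction."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_motif(seq, motif):
    """Return (start, strand) for every motif match on either strand.

    The start position is always given in forward-strand coordinates.
    The motif is expected in upper-case IUPAC code.
    """
    seq = seq.upper()
    k = len(motif)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - k + 1):
            if all(s[i + j] in IUPAC[m] for j, m in enumerate(motif)):
                start = i if strand == "+" else len(seq) - k - i
                hits.append((start, strand))
    return hits

# Arbitrary example motif; Y matches either C or T.
print(find_motif("AAATGACGTTT", "TGAYG"))
print(find_motif("AAACGTCATTT", "TGAYG"))
```

Counting which motifs occur together in strong enhancers, and which are missing from weak ones, is one way such “constellations” could be examined.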
Prediction without additional epigenetic data
Together with scientists from the bioinformatics department at the MPIMG, the team then trained an algorithm that can predict whether a given DNA segment can serve as an enhancer in stem cells based on the DNA sequence alone.
“It is very convenient that our algorithm needs to be fed only enhancer activity and DNA sequence to get a prediction,” Glaser says. “Other algorithms usually require a large amount of additional epigenetic data from the lab.”
Typically, it is necessary to test DNA fragments with suspected enhancer properties individually in the lab. With the exceptionally large number of presumed active enhancers in stem cells, the examination of individual elements is particularly tedious.
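How a prediction can be made from the DNA sequence alone is sketched below. This is not the team's algorithm, which the text does not describe in detail. The sketch assumes one generic approach: a sequence is turned into numbers by counting short words (k-mers), and a plain logistic regression is trained on sequences labeled as active or inactive. All function names, parameters, and the toy sequences are hypothetical.

```python
import math
from itertools import product

def kmer_features(seq, k=2):
    """Return the frequency of every possible k-mer, in alphabetical order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    valid = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing N or other symbols
            counts[kmer] += 1
            valid += 1
    return [counts[km] / valid if valid else 0.0 for km in kmers]

def predict(weights, bias, features):
    """Return the predicted probability that a sequence is an enhancer."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=1.0, epochs=200):
    """Fit logistic regression by full-batch gradient descent."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for features, label in zip(X, y):
            error = predict(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias

# Made-up training set: label 1 = active, 0 = inactive.
train_seqs = ["ACACACACAC", "CACACACACA", "GTGTGTGTGT", "TGTGTGTGTG"]
labels = [1, 1, 0, 0]
weights, bias = train_logistic([kmer_features(s) for s in train_seqs], labels)

print(predict(weights, bias, kmer_features("ACACAC")) > 0.5)
```

The point of the sketch is only that measured activity and sequence are the sole inputs. A realistic model would be trained on many thousands of measured fragments and evaluated on sequences it has not seen.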
First steps toward a comprehensive understanding
“The method underlying FAIRE-STARR-seq can theoretically be used for almost all cell types to find enhancers in the genome and quantify their activity,” Glaser says. “But it was particularly exciting to analyze stem cells that begin to develop and specialize.”
Meijsing considers the genome-wide map of genes and enhancers, with its detailed landscape of high and low activity, an important starting point for further research. “Which enhancers activate which genes and when, and the logic behind it, is one of the most important but largely unresolved questions in the study of gene regulation,” Meijsing says. “Our results can serve as a reference for future functional studies and help refine prediction methods.”