Stripes in the genome

Scientists at the Max Planck Institute for Molecular Genetics describe tissue-specific formation of architectural stripes driven by enhancer activity and CTCF elements

February 12, 2019

Rearrangements of DNA can cause genes to be activated at the wrong time or in the wrong place. This can lead to malformations during embryonic development or to diseases such as cancer. Researchers at the Max Planck Institute for Molecular Genetics in Berlin have studied the effects of genomic rearrangements on the 3D structure of DNA. They now report that inversions between gene-poor regions and gene-dense regions without clear TAD structures can lead to the formation of asymmetric regions of contact called architectural stripes. The stripes correlate with the position of active enhancers and play a role in gene regulation.

Heat map (capture Hi-C) showing a part of the structure of the Epha4 locus. Clearly recognizable is the Epha4 TAD, which contains the gene Epha4 and its associated enhancers. The TAD is flanked by CTCF elements on the right and left side, representing its boundaries. Next to the Epha4 TAD on the left is a gene-dense region without clear TAD structures. Here, an asymmetric pattern of contacts can be observed, forming an architectural stripe (arrow).

Each cell of an organism contains the organism's complete DNA. But a thumb looks different from an index finger; a hand looks different from a foot. So how does a cell know at what time during embryonic development it should grow and what tissue it should develop into? Crucial for the differentiation of cells is gene activity, a process controlled by many different mechanisms and at different levels. Well-known regulators include enhancers and promoters, specific regions within the DNA with regulatory function. They usually bind other molecules (transcription factors) that are required for reading the genes. But the three-dimensional structure of DNA also plays an important role. Among the better-understood structures are the TADs (topologically associating domains). These are "loops" in the genome containing one or a few genes and their regulatory elements (enhancers, promoters).

TADs have a clear structure with defined boundaries, which insulate the regulatory activity within the TAD and define the genes an enhancer can act on. They are formed by ring-shaped protein complexes called cohesin, which extrude a DNA loop by translocating along the DNA strand in both directions until they are stopped by the CTCF elements representing the boundaries of the TADs. But this stop mechanism functions only in one direction. If the cohesin comes from the opposite side or the CTCF element is inverted, the cohesin ring slides over it without being stopped at that position.
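
The direction dependence described above can be illustrated with a toy model. The following Python sketch is only an illustration under simplifying assumptions, not the method used in the study: the DNA is a line of numbered positions, cohesin has two legs that each move one position per step, and the function name `extrude` and the orientation symbols '>' and '<' are hypothetical conventions chosen for this example.

```python
def extrude(load_site, ctcf_sites, length):
    """Toy loop extrusion: return the (left, right) anchors of the final loop.

    Assumed convention (hypothetical, for illustration only):
    ctcf_sites maps a position to an orientation symbol.
      '>' stops only the leg that moves to the left,
      '<' stops only the leg that moves to the right.
    A leg also stops when it reaches the end of the region.
    """
    left = right = load_site
    left_stopped = right_stopped = False
    while not (left_stopped and right_stopped):
        if not left_stopped:
            if left == 0 or ctcf_sites.get(left) == '>':
                left_stopped = True   # blocked by a correctly oriented CTCF element
            else:
                left -= 1             # slide one position further to the left
        if not right_stopped:
            if right == length - 1 or ctcf_sites.get(right) == '<':
                right_stopped = True
            else:
                right += 1            # slide one position further to the right
    return left, right


# Two CTCF elements facing each other delimit the loop.
wild_type = extrude(40, {20: '>', 60: '<'}, 100)

# Inverting the element at position 60 removes its stop function for the
# right-moving leg, which now runs on to the end of the region.
inverted = extrude(40, {20: '>', 60: '>'}, 100)

print(wild_type, inverted)
```

In this toy model, inverting a single element leaves one anchor fixed while the other leg keeps moving, which is the kind of orientation-dependent behavior the paragraph above describes.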

"But not all regions of the genome are organized in distinct TADs," explains Stefan Mundlos, head of the research group Development & Disease at the Max Planck Institute for Molecular Genetics in Berlin. "In addition to the TADs, we also find gene-dense regions in the genome where no clear TAD structures can be discerned." Mundlos is a human geneticist. He is particularly interested in how changes in the genome can cause congenital malformations or diseases such as cancer. In recent years, he has studied the formation of TADs to find out what happens when congenital rearrangements break the architectural configuration, for example by shifting the defined boundary of a TAD. Now, together with scientists from the Department of Computational Molecular Biology, also at the MPIMG, his team has studied the Epha4 locus. This is a region in the genome that contains many developmentally important genes. Analysis of the 3D structure of the region showed that three genes (Epha4, Pax3, Pinc) reside in clearly defined TADs. In contrast, a gene-dense region between the Epha4 and Pinc TADs shows no clear structure.

Inversions influence timing of gene expression

The scientists analyzed the functional effects of genomic inversions in this region. To do so, they cut four segments of different lengths out of the DNA and re-inserted them at the same place, but in the opposite orientation (inverted). All inversions started at the same point of the DNA and included an enhancer within the Epha4 TAD, which is important for limb development, as well as the CTCF element at this end of the TAD. Each inversion ended at the promoter of one of four different genes in the gene-dense region. The team focused on day 11.5 of mouse development. In wild-type embryos, the Epha4 gene is active at this time, while the other four genes are inactive. The researchers showed that, at day 11.5, the inversions activated the genes that they had moved into the vicinity of the Epha4 enhancer. The mice with inversions 1 and 2 also showed polydactyly, that is, they developed more than five toes on their paws. "We believe that this is due to the activation of the Ihh gene, which belongs to the genes that have come into the scope of action of the enhancer through the inversion," explains Mundlos. "Ihh is important for the formation of the limbs, but physiologically, it is expressed later in development. Its activation before day 12.5 is known to cause polydactyly."

The scientists were particularly interested in which elements of the DNA interact with each other. To find out more about this, they investigated the DNA of the limb buds of the modified mouse embryos with a set of methods called Hi-C. Hi-C is used to analyze the spatial organization of the DNA inside the nucleus by quantifying the number of interactions between genomic loci that are nearby in 3D space, but may be separated by many nucleotides on the linear genome. All contact points are represented graphically on heat maps (see figure), where the number of contacts is encoded as intensity on a red color scale. Hi-C maps of the genome quantitatively reflect the interactions of the nucleotides within the analyzed regions of the genome. On such a map, TADs appear as triangles, on a genome-wide scale as well as in different species or cell types. "Around the inversion breakpoints, we observed an asymmetric pattern of contacts," explains Verena Heinrich, who performed the bioinformatics analysis. "It was formed between the inverted CTCF element and a contiguous genomic interval spanning over about 20 genes and leading to a stripe-like structure on the map."
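
How a list of contacts becomes a heat map can be sketched in a few lines of Python. This is a minimal illustration, not the analysis pipeline used in the study: the function name `contact_matrix`, the coordinates, and the bin size are hypothetical, and real Hi-C data additionally require read mapping, filtering, and normalization.

```python
def contact_matrix(pairs, region_start, region_end, bin_size):
    """Count contacts between genomic bins in a region (toy example).

    pairs: iterable of (position_a, position_b) tuples, one per observed
    contact. The result is a symmetric square matrix in which entry [i][j]
    is the number of contacts between bin i and bin j.
    """
    n_bins = (region_end - region_start + bin_size - 1) // bin_size
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for a, b in pairs:
        # Ignore contacts with an end outside the region of interest.
        if not (region_start <= a < region_end and region_start <= b < region_end):
            continue
        i = (a - region_start) // bin_size
        j = (b - region_start) // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix


# Made-up contact positions, for illustration only.
example_pairs = [(100, 900), (150, 2500), (2100, 2900), (120, 950), (5000, 100)]
example_matrix = contact_matrix(example_pairs, 0, 3000, 1000)

for row in example_matrix:
    print(row)
```

Plotting such a matrix with a color scale gives a heat map like the one in the figure. A region whose bins mostly contact each other shows up as a block along the diagonal, which appears as a triangle when only half of the symmetric matrix is displayed.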

Formation of architectural stripes

Such structures have already been found by other groups and have been described as architectural stripes. The team was able to show that the genes under the stripe were activated or up-regulated compared to wild-type mice. Among other things, this was reflected in the formation of the additional digits. However, deletion of the CTCF elements from the inversion in the genome led to a complete loss of the stripe. "We assume that the 3D structure of the DNA in this region gets changed by the deletion of the CTCF anchor in a way that specific contacts, e.g., between enhancer and gene, are no longer possible. As a result, the activity of the respective genes decreases. Interestingly, after deletion of the CTCF anchor, the genetically modified mice develop normal fingers without polydactyly," says Mundlos.

To find out whether architectural stripes also occur physiologically during development, the scientists analyzed Hi-C data of the entire genome using bioinformatics algorithms that had been developed especially for this purpose. They also detected the stripes in regions of the genome that had not been genetically modified. Active enhancer regions were found to cluster under the stripes, especially close to the stripe anchor. "Our results show very clearly that the genome does not only consist of the clearly defined TADs. In particular during embryonic development, there are also less structured regions that can be visualized as architectural stripes," explains Mundlos. "The stripes correlate with the position of active enhancers. We assume that their formation is important for the regulation of specific genes. In addition, DNA rearrangements, such as inversions, can provoke an activation of genes at the wrong time or place in the organism, thus causing congenital diseases or cancer."
