Comprehensive mapping of long-range interactions reveals folding principles of the human genome
Entry by Leon Furchtgott, APP 225 Fall 2010.
Erez Lieberman-Aiden*, Nynke L. van Berkum*, et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326 (2009).
This paper introduces Hi-C, a method that probes the three-dimensional architecture of whole genomes. The authors use Hi-C to construct a spatial proximity map of the human genome at a resolution of 1 megabase. The map shows that the genome is spatially segregated into two genome-wide compartments corresponding to open and closed chromatin. At the megabase scale, the chromatin conformation is consistent with a fractal globule rather than an equilibrium globule.
Understanding how chromosomes fold can provide insight into the complex relationships between chromatin structure, gene activity, and the functional state of the cell. Yet beyond the scale of nucleosomes, little is known about chromatin organization. Until this work, long-range interactions could be probed only for specific pairs of loci, not for every combination of loci at once. This paper introduces a method that provides this genome-wide coverage and draws out some implications from the resulting data sets.
Hi-C method: cells are first cross-linked with formaldehyde, which covalently links DNA segments that are spatially close in the nucleus. The DNA is then digested with a restriction enzyme that leaves a 5′ overhang; the 5′ overhang is filled in, incorporating a biotinylated residue; and the resulting blunt-end fragments are ligated under dilute conditions that favor ligation between cross-linked DNA fragments. The resulting DNA sample contains ligation products consisting of fragments that were originally in close spatial proximity in the nucleus, marked with biotin at the junction. A Hi-C library is created by shearing the DNA and selecting the biotin-containing fragments with streptavidin beads. The library is then analyzed by massively parallel DNA sequencing, producing a catalog of interacting fragments (Fig. 1).
Analysis of these data gives long-range contacts between segments more than 20 kb apart. The authors construct a genome-wide contact matrix by dividing the genome into 1-Mb regions and defining the matrix entry m(i,j) as the number of ligation products between loci i and j. (Because a ligation product between i and j is also one between j and i, the contact matrices are symmetric.) The contact matrices are highly reproducible (Fig. 1B, C, D).
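The binning step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `build_contact_matrix`, the input format (a list of base-pair coordinate pairs on a single chromosome), and the toy coordinates are all assumptions made for the example.

```python
BIN_SIZE = 1_000_000  # 1-Mb bins, matching the resolution quoted in the text


def build_contact_matrix(pairs, chrom_length, bin_size=BIN_SIZE):
    """Count ligation products between bins of one chromosome.

    `pairs` is a hypothetical list of (pos_a, pos_b) base-pair coordinates,
    one tuple per sequenced ligation junction. Every coordinate is assumed
    to be smaller than `chrom_length`.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    m = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        m[i][j] += 1
        if i != j:
            m[j][i] += 1  # mirror the count so that m stays symmetric
    return m


# Toy input: three made-up ligation junctions on a 3-Mb chromosome.
pairs = [(200_000, 2_500_000), (2_700_000, 150_000), (1_100_000, 1_900_000)]
matrix = build_contact_matrix(pairs, chrom_length=3_000_000)
```

The first two junctions both connect bin 0 with bin 2, regardless of the order in which the two ends were read, so they are counted in the same pair of mirrored entries.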
Using these matrices, the authors can also compute statistics about the 3-D structure of DNA, such as the contact probability <math>I_n(s)</math> for pairs of loci separated by a genomic distance s on chromosome n (Fig. 3A). <math>I_n(s)</math> decreases monotonically on every chromosome, suggesting polymer-like behavior. However, even at distances greater than 200 Mb, <math>I_n(s)</math> is always much greater than the average contact probability between different chromosomes. This confirms the existence of chromosome territories (chromosomes are not all intertwined).
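As a rough sketch of how such a curve can be read off a contact matrix, the snippet below averages each off-diagonal of a single-chromosome matrix. The function name `contact_vs_distance` and the toy matrix are assumptions for illustration; the result is a mean count per pair of bins, which is proportional to <math>I_n(s)</math> rather than a normalized probability.

```python
def contact_vs_distance(matrix, bin_size=1_000_000):
    """Mean contact count at each genomic separation s.

    Entries on the d-th off-diagonal all describe pairs of bins separated
    by d * bin_size base pairs, so their mean is proportional to the
    contact probability at that separation.
    """
    n = len(matrix)
    curve = []
    for d in range(1, n):  # skip d = 0, the within-bin contacts
        diagonal = [matrix[i][i + d] for i in range(n - d)]
        curve.append((d * bin_size, sum(diagonal) / len(diagonal)))
    return curve


# Toy symmetric matrix with counts that fall off away from the diagonal.
toy = [
    [9, 6, 3, 2],
    [6, 9, 6, 3],
    [3, 6, 9, 6],
    [2, 3, 6, 9],
]
curve = contact_vs_distance(toy)
```

For the toy matrix the mean count falls from 6 at 1 Mb to 3 at 2 Mb and 2 at 3 Mb, mimicking the monotonic decrease described above.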
When the authors zoomed into a single chromosome, they found large blocks of enriched and depleted interactions, generating a plaid pattern (Fig. 2). This suggests that each chromosome can be decomposed into two sets of loci such that contacts within each set are enriched and contacts between sets are depleted. The authors find that one compartment is associated with open, accessible, actively transcribed chromatin, whereas the other corresponds to closed, less active chromatin.
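One way to recover such a two-set decomposition from a contact matrix is sketched below: divide each entry by the mean count at its genomic separation, correlate the rows of the resulting ratio matrix, and split the bins by the sign of the leading eigenvector of the correlation matrix. The paper reports using principal component analysis for this step; the code here is a simplified stand-in, and the function names, the power-iteration eigenvector solver, and the toy matrix are assumptions for illustration. It also assumes that no diagonal is empty and that no row of the ratio matrix is constant.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def split_into_compartments(matrix, iterations=200):
    """Label each bin 0 or 1 by the sign of the leading eigenvector."""
    n = len(matrix)
    # Expected count at separation d: the mean of the d-th diagonal.
    expected = [sum(matrix[i][i + d] for i in range(n - d)) / (n - d)
                for d in range(n)]
    # Observed/expected ratio removes the overall decay with distance.
    ratio = [[matrix[i][j] / expected[abs(i - j)] for j in range(n)]
             for i in range(n)]
    # Bins in the same compartment have similar interaction profiles.
    corr = [[pearson(ratio[i], ratio[j]) for j in range(n)] for i in range(n)]
    # Power iteration converges to the leading eigenvector of corr.
    v = [1.0] + [0.0] * (n - 1)
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [0 if x >= 0 else 1 for x in v]


# Toy plaid matrix: bins 0-2 and bins 3-5 form two sets, with contacts
# inside a set doubled relative to the distance-dependent baseline.
decay = [64, 32, 16, 8, 4, 2]
toy = [[decay[abs(i - j)] * (2 if (i < 3) == (j < 3) else 1) for j in range(6)]
       for i in range(6)]
labels = split_into_compartments(toy)
```

On the toy matrix, bins 0 to 2 receive one label and bins 3 to 5 the other. Which set is called 0 and which 1 is arbitrary; assigning "open" or "closed" to a label requires external data such as gene density or accessibility.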
Finally, the authors examine chromatin structure within compartments. They observe a power-law scaling of the intra-chromosomal contact probability over roughly 500 kb to 7 Mb: the contact probability scales approximately as the inverse of genomic distance, <math>I(s) \sim s^{-1}</math>.
Various authors have proposed that chromosomal regions can be modeled as an “equilibrium globule”: a compact, densely knotted configuration originally used to describe a polymer in a poor solvent at equilibrium. Grosberg et al. proposed an alternative model, theorizing that polymers, including interphase DNA, can self-organize into a long-lived, nonequilibrium conformation that they described as a “fractal globule.” This highly compact state is formed by an unentangled polymer when it crumples into a series of small globules in a “beads-on-a-string” configuration. These beads serve as monomers in subsequent rounds of spontaneous crumpling until only a single globule-of-globules-of-globules remains. In a fractal globule, contiguous regions of the genome tend to form spatial sectors whose size corresponds to the length of the original region (Fig. 3C). In contrast, an equilibrium globule is highly knotted and lacks such sectors; instead, linear and spatial positions are largely decorrelated after, at most, a few megabases (Fig. 3C). The fractal globule had not previously been observed.
The authors perform Monte Carlo simulations to generate ensembles of fractal globules and equilibrium globules, and find that the properties of the fractal globules match those of the Hi-C data. In equilibrium globules the contact probability scales as <math>s^{-3/2}</math>, whereas in fractal globules it scales as <math>s^{-1}</math>, in agreement with the observed data.
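Telling the two exponents apart amounts to measuring the slope of the contact probability on a log-log plot. The sketch below fits that slope by ordinary least squares. The function name `fit_power_law_exponent` and the synthetic, noise-free curves are assumptions for illustration; real Hi-C curves are noisy and follow a power law only over a limited range of distances.

```python
import math


def fit_power_law_exponent(distances, probabilities):
    """Least-squares slope of log(probability) against log(distance)."""
    xs = [math.log(s) for s in distances]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic curves sampled at 0.5, 1, 2, 4, and 8 Mb.
distances = [0.5e6 * 2 ** k for k in range(5)]
fractal_like = [1.0 / s for s in distances]          # exponent -1
equilibrium_like = [s ** -1.5 for s in distances]    # exponent -3/2

slope_fractal = fit_power_law_exponent(distances, fractal_like)
slope_equilibrium = fit_power_law_exponent(distances, equilibrium_like)
```

Because the synthetic curves are exact power laws, the fit recovers slopes of -1 and -1.5 up to floating-point error. Applied to measured data, the same fit would be restricted to the distance range over which the log-log plot is approximately linear.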
Discussion / Relation to Soft Matter
This is an interesting paper. The Hi-C experimental method is powerful and novel. The discussion of polymer scaling is particularly relevant to the course: it shows that determining how a real polymer scales can be quite complicated.