Eight Challenges of Genome Replication
- lscole
Each time a cell divides, it copies both of its genomes so that each daughter cell receives a full set. That's about three billion nucleotides of DNA from each parent, or more than six billion total. The cell must do this with extremely high accuracy (or "fidelity" in molecular biology parlance). Human cells copy their genomes with 99.99999999% accuracy. That equates to one error every 10 billion nucleotides! Really. Stop and think about that. One error every 10 billion nucleotides. In this post, we're going to quickly review eight reasons why replicating the human genome is so challenging.
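If you'd like to check the conversion from a percentage accuracy to an errors-per-nucleotide figure yourself, here is a minimal Python sketch. The variable names are my own, and the six-billion figure is just the rounded genome size used in this post.

```python
from fractions import Fraction

# Accuracy figure quoted in this post, kept as an exact fraction
# so that floating-point rounding doesn't blur the result.
accuracy_percent = Fraction("99.99999999")

# Fraction of copied nucleotides that come out wrong.
error_rate = 1 - accuracy_percent / 100

# Average number of nucleotides copied per error.
nucleotides_per_error = 1 / error_rate

# Rounded size of the two parental genomes together (from this post).
diploid_genome = 6_000_000_000

# Average number of errors expected in one full copy of both genomes.
expected_errors = diploid_genome * error_rate

print(f"One error every {int(nucleotides_per_error):,} nucleotides")
print(f"Expected errors per cell division: {float(expected_errors)}")
```

Swap in a different percentage to see how quickly the expected number of errors per division grows as fidelity drops.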
Challenge 1: The genome is very large
As I've mentioned, one human genome consists of more than three billion nucleotides. Human cells hold two versions of the genome, one from each parent, and both must be replicated. Thus, the challenge: making a nearly exact copy of six billion nucleotides in eight hours. Six billion of anything is hard for humans to grasp. So let's convert to a human scale.
What would a genetic code six billion nucleotides long look like on a human scale? Consider a popular biology textbook: Campbell’s Biology. It comes in at about 1,250 pages. If, instead of biology lessons, the book were filled with only the letters A, G, C, and T, using the same font as the actual textbook, it would take about 1,400 textbooks to hold the human genome. Stacked up, the tower would be over 200 feet high! So. First thing. The cell has an enormous amount of DNA to copy.
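Here's a rough sketch of the textbook arithmetic in Python. The page count and genome size come from this post; the letters-per-page and book-thickness values are my own assumptions, chosen as plausible round numbers, so treat the output as a ballpark.

```python
import math

GENOME_NUCLEOTIDES = 6_000_000_000  # both parental genomes (from this post)
PAGES_PER_BOOK = 1_250              # approximate textbook length (from this post)
CHARS_PER_PAGE = 3_400              # ASSUMPTION: letters that fit on one page
BOOK_THICKNESS_INCHES = 2.0         # ASSUMPTION: thickness of one textbook

# Letters that fit in a single textbook.
chars_per_book = PAGES_PER_BOOK * CHARS_PER_PAGE

# Whole textbooks needed to hold every nucleotide.
books_needed = math.ceil(GENOME_NUCLEOTIDES / chars_per_book)

# Height of the stack, converted from inches to feet.
stack_height_feet = books_needed * BOOK_THICKNESS_INCHES / 12

print(f"Textbooks needed: {books_needed:,}")
print(f"Stack height: {stack_height_feet:.0f} feet")
```

With those assumptions the numbers land close to the figures above: roughly 1,400 books and a stack well over 200 feet tall.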

Challenge 2: The genome is packaged
Replicating the genome would be difficult enough if the DNA existed in its naked double-stranded form. But that’s not the case. Genomic DNA in the nucleus undergoes several levels of packaging (see "DNA Packaging") that restrict access to the DNA. To perform replication, though, the DNA must be naked. That means the 30-nm fiber will have to be unwound. And once it's unwound, the histone proteins that serve as the spools of the nucleosomes must be removed and then put back almost immediately.
To get the DNA out of its highly compacted 30-nm fiber form, the cell employs special proteins ("pioneer transcription factors") that are uniquely able to access highly packaged DNA. The pioneer transcription factors open up the DNA a little, but their second critical role is to recruit other proteins that help open up the 30-nm fiber, including chromatin remodeling complexes and histone-modifying enzymes. These proteins and others disrupt the 30-nm fiber, leaving the DNA in its "beads-on-a-string" form. That brings us to the next problem I mentioned: the replication machinery can't just plow through nucleosomes.
Thus, during replication, the nucleosomes just in front of the forward-moving point of replication (the replication fork) must be disassembled. The histone proteins must be removed from the DNA that's wrapped around them. And again, after the DNA is copied, the cell has to reconstruct the nucleosomes.
The second challenge, then, is that the genomic DNA that's about to be replicated must be unpackaged and then repackaged after replication at the same rate that the replication fork moves, which is about 50 bases per second. So, yes, a lot of DNA, but also DNA that isn't readily accessible.
Challenge 3: Epigenetic markers on histones must be preserved
Let's stay on the topic of histone proteins. As I reviewed in an earlier post, cells chemically mark the tails of histone proteins in very specific ways. These chemical marks have meaning to the cell, producing critical effects. For example, some chemical marks promote transcription of a gene or set of genes. Others impede transcription. Some identify regions to be stored as tightly packaged heterochromatin... or as loosely packaged euchromatin. Still others call DNA repair enzymes to a site of damage. Scientists refer to a "histone code" when discussing these kinds of meaningful chemical modifications. Deciphering the histone code is a very active area of research.
In addition to histone tails, the nucleotides that make up DNA are also chemically marked in ways that have meaning to the cell. These markings are usually so-called methyl groups. You don't need to know what a methyl group is. You just need to know that, like the markings on histone tails, DNA methylation also has meaning to the cell.

I mentioned that after a replication fork passes, nucleosomes must be reconstructed. In fact, it's more challenging than that. Not only do nucleosomes have to be reconstructed, but the epigenetic markings on the original histone tails also have to be recreated on the new histones. And the methyl groups attached to the original double-stranded DNA must be recreated on the newly synthesized DNA. So, the cell not only has to disassemble and then quickly reassemble nucleosomes as the replication fork progresses, it also has to recreate all of the original chemical marks, both on histone tails and on the new DNA strand itself.
Challenge 4: Single-stranded DNA must be protected
For replication to proceed, once the cell has unwrapped the double-stranded DNA from the histone spools, it must separate that DNA into two single strands. Single-stranded DNA has two potential problems: (1) the formation of secondary structures, and (2) enzymatic cleavage.
First, because of complementary base pairing, single-stranded DNA tends to form secondary structures, which are regions of unwanted double-strandedness. For example, if two neighboring regions along a short stretch of single-stranded DNA happen to be complementary (or close) when one is read in the reverse direction, then the strand will double back on itself and self-hybridize, creating an unwanted double-stranded region that can impede the replication machinery.
Second, single-stranded DNA is much more susceptible than double-stranded DNA to enzymatic cleavage by the cell’s own nucleases (enzymes that cut DNA and RNA). Cleavage of single-stranded DNA is difficult for the cell to repair. How the cell addresses the challenge of exposed single-stranded DNA is detailed in __.
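To make the first problem (self-hybridization) more concrete, here is a small Python sketch that scans a single strand for two nearby regions that could pair with each other and form a hairpin. The function names, the toy sequence, and the stem and loop lengths are all my own illustrative choices, and the search only looks for perfect matches. It's a toy model, not a real structure-prediction tool.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the sequence that would pair with `seq` when the strand folds back."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_hairpin_stems(strand, stem_len=5, min_loop=3):
    """Find pairs of regions on one strand that could base-pair with each other.

    Returns a list of (start_of_first_region, start_of_second_region, first_region).
    Only perfect matches of exactly `stem_len` bases are reported, and the two
    regions must be separated by at least `min_loop` unpaired bases.
    """
    hits = []
    last_start = len(strand) - stem_len
    for i in range(last_start + 1):
        stem = strand[i:i + stem_len]
        partner = reverse_complement(stem)
        # The partner region must begin after the stem plus a minimal loop.
        for j in range(i + stem_len + min_loop, last_start + 1):
            if strand[j:j + stem_len] == partner:
                hits.append((i, j, stem))
    return hits


# Toy strand: "GCGAT" and "ATCGC" can pair, with "TTTT" left over as the loop.
toy_strand = "AGCGATTTTTATCGCA"
print(find_hairpin_stems(toy_strand))

# A strand with no self-complementary regions gives an empty list.
print(find_hairpin_stems("AAAAAAAAAAAAAAAA"))
```

The first call reports one candidate stem, the region starting at position 1 pairing with the region starting at position 10. The all-A strand reports none.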
Challenge 5: The genome must only be replicated once
As we will see in a future post, one of the tricks the cell uses to copy a huge genome in a short period of time is to replicate many hundreds of different stretches simultaneously and then stitch them together. In fact, in any given dividing cell, there are typically about 1,500 replications happening at a time. And by the end of the process, 10,000 to 20,000 replication forks will have been created.
Given the parallelism of the replication process, the cell somehow has to keep track of which parts of the genome have already been replicated and which parts still need to be replicated. Without that ability, the cell would replicate sections of the genome more than once, which would, put simply, create a mess.
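A quick back-of-the-envelope calculation shows why the cell has no choice but to replicate in parallel. This Python sketch uses figures from this post (six billion nucleotides, about 50 bases per second per fork, eight hours) plus one simplifying assumption of mine: every fork runs nonstop at full speed for the entire eight hours. Real forks start and stop at different times, so this is only an idealized lower bound.

```python
import math

GENOME_NUCLEOTIDES = 6_000_000_000  # both parental genomes (from this post)
FORK_SPEED = 50                     # bases per second per fork (from this post)
S_PHASE_HOURS = 8                   # time available for copying (from this post)

SECONDS_PER_YEAR = 3600 * 24 * 365

# How long would one lone fork need to copy everything?
seconds_single_fork = GENOME_NUCLEOTIDES / FORK_SPEED
years_single_fork = seconds_single_fork / SECONDS_PER_YEAR

# ASSUMPTION: each fork runs continuously for the full eight hours.
bases_per_fork = FORK_SPEED * S_PHASE_HOURS * 3600
forks_needed = math.ceil(GENOME_NUCLEOTIDES / bases_per_fork)

print(f"One fork alone: about {years_single_fork:.1f} years")
print(f"Forks needed to finish in {S_PHASE_HOURS} hours: {forks_needed:,}")
```

Under these assumptions a single fork would take a few years, and several thousand forks would have to run for the whole eight hours.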
Challenge 6: Stretches of DNA repeats must be replicated
Some DNA sequences are more difficult to copy accurately than others. One of the main challenges is regions that contain repeated sequences. Much of the genome that doesn’t code for genes contains such repeats. For example, so-called “CpG islands” are extremely common, especially near the starting points of genes. (The "p" stands for the phosphate group that connects neighboring nucleotides.) CpG islands are enriched for two nucleotides, C and G, and can be from 300 to 3,000 nucleotides long. Other repeated sequences are found in centromeres and in telomeres. For example, human telomeres repeat the sequence TTAGGG thousands of times at the ends of chromosomes. Repeats are hard for the cell to replicate accurately.
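To show what a tandem repeat like the telomere sequence looks like in practice, here is a short Python sketch that counts how many back-to-back copies of a repeat unit appear in a sequence. The function name and the toy sequence are my own inventions for illustration. Real telomeres are far longer.

```python
import re


def longest_tandem_run(seq, unit):
    """Return the largest number of back-to-back copies of `unit` found in `seq`."""
    # Match one or more consecutive copies of the repeat unit.
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    # Each match is a run of whole units, so its length divides evenly.
    runs = [len(match.group(0)) // len(unit) for match in pattern.finditer(seq)]
    return max(runs, default=0)


# Toy sequence: five copies of TTAGGG, a short interruption, then two more.
toy_sequence = "ACGT" + "TTAGGG" * 5 + "CCA" + "TTAGGG" * 2

print(longest_tandem_run(toy_sequence, "TTAGGG"))
print(longest_tandem_run("ACGTACGT", "TTAGGG"))
```

The toy sequence contains a longest run of five copies, while the second sequence contains none.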
Challenge 7: Chemical lesions on the DNA must be dealt with
DNA is a molecule. It is susceptible to chemical damage in the cellular environment. For example, sometimes two DNA strands can become “crosslinked,” or chemically attached to each other in a relatively permanent way. The replication machinery needs a way to address such lesions. Other lesions involve random proteins attaching chemically to DNA. Again, the replication machinery needs a way to deal with these errantly bound proteins. We’ll discuss how the cell deals with DNA lesions in __.
Challenge 8: Replication errors must be corrected
Replication basically involves pairing complementary nucleotides with single-stranded DNA through weak chemical bonds, then linking those nucleotides together to create double-stranded DNA. In doing so, the replication machinery makes rare mistakes. Sometimes the wrong base is attached. For example, the replication machinery might attach a C across from a T (instead of an A). Sometimes an RNA nucleotide is attached to the growing DNA chain rather than a DNA nucleotide. During its first pass, the replication machinery makes very few (but still too many) errors. Thus, the cell needs ways to both identify errors and correct them.
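To make the idea of a mismatch concrete, here is a small Python sketch that compares a template strand with a newly made strand, position by position, and reports any base that isn't the proper partner. The function name and the sequences are made up for illustration, and strand direction is ignored to keep things simple. This shows what counts as an error, not how the cell actually finds one.

```python
# Correct pairing partners: A with T, and C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def find_mismatches(template, new_strand):
    """Return (position, template_base, new_base) for every incorrect pairing.

    The two strands are compared position by position, so they must be the
    same length. Strand direction is ignored for simplicity.
    """
    if len(template) != len(new_strand):
        raise ValueError("strands must be the same length")
    return [
        (position, t_base, n_base)
        for position, (t_base, n_base) in enumerate(zip(template, new_strand))
        if PAIRS[t_base] != n_base
    ]


template = "ATGCTTAC"
perfect_copy = "TACGAATG"  # every base correctly paired
flawed_copy = "TACGACTG"   # a C placed across from the T at position 5

print(find_mismatches(template, perfect_copy))
print(find_mismatches(template, flawed_copy))
```

The perfect copy yields an empty list, while the flawed copy reports a single mismatch: a C sitting across from a T, just like the example above.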
These are some of the challenges the cell faces when it replicates its genome. I’ll call out even more challenges as we dive deeper into genome replication.