23. Errors I. Proofreading and MMR (1,117)

lscole
Oct 31, 2025
5 min read

Updated: Apr 16

DNA replication is astonishingly accurate — but not perfect.

Let me quantify that.

Before proofreading, replicative polymerases misincorporate roughly once every 100,000 to one million nucleotides. That’s too many. One misincorporation every 100,000 nucleotides would produce tens of thousands of errors in a replicated human genome. In reality, replicated human genomes contain only a handful.
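The arithmetic behind that estimate can be sketched in a few lines of Python. The error rates are the ones quoted above. The genome size (about 6.4 billion base pairs for a diploid human genome) is an assumption added for illustration, as is the function name.

```python
# Rough estimate of misincorporations per genome replication, before proofreading.
# ASSUMPTION: a diploid human genome of ~6.4 billion base pairs (not stated in the text).
GENOME_SIZE_BP = 6.4e9


def expected_errors(genome_size_bp: float, nucleotides_per_error: float) -> float:
    """Expected error count if one error occurs every `nucleotides_per_error` nucleotides."""
    return genome_size_bp / nucleotides_per_error


# The two ends of the quoted range: one error per 100,000 and per 1,000,000 nucleotides.
for nucleotides_per_error in (1e5, 1e6):
    errors = expected_errors(GENOME_SIZE_BP, nucleotides_per_error)
    print(f"1 error per {nucleotides_per_error:,.0f} nt -> about {errors:,.0f} errors")
```

Under this assumption, both ends of the range leave thousands of errors or more per replication, far above the handful actually observed.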

Clearly, additional mechanisms must be at work.

Cells thus deploy repair pathways to correct polymerase errors. That is the topic of this chapter and the next.

Types of Errors

A misincorporation--inserting the wrong nucleotide opposite a template base--is called a substitution error. For example, given an “A” on the template strand, the polymerase might pair it with a “C” instead of the correct “T.” A substitution is essentially a misspelling.

But substitutions are not the only mistakes polymerases make.

There are three other categories: deletions, insertions, and ribonucleotide incorporations.

Occasionally, the polymerase slips while copying repetitive sequences. It may skip one or more nucleotides (a deletion) or insert extra ones (an insertion). Collectively, insertions and deletions are called indels. They occur less frequently than substitutions, but they too must be corrected.

The most common polymerase error, however--even more common than substitutions--is the incorporation of a ribonucleotide instead of a deoxyribonucleotide. Ribonucleotides are RNA building blocks.

A polymerase may occasionally insert an RNA “A” opposite a DNA “T.” These ribonucleotides destabilize the DNA backbone and increase susceptibility to strand breaks, so they must also be removed.

Why DNA repair

Why devote time to DNA repair in a book about DNA replication?

Because the two are inseparable.

Replication itself generates polymerase errors, and the correction of many of those errors occurs immediately after nucleotide incorporation--right inside the replication fork. Repair is spatially and temporally coordinated with synthesis.

But replication errors are only part of the story.

The genome is under constant assault from endogenous chemicals within the cell and exogenous environmental agents. These insults alter bases, distort the helix, and break the DNA backbone. We’ll refer to such damage as lesions, distinguishing them from polymerase errors.

Lesions are relevant here because they create replication roadblocks. They interfere with polymerases and, at times, with the CMG helicase itself. A lesion can halt a replisome in its tracks.

Thus, successful replication depends on repair. We cannot ignore it--and we wouldn’t want to. DNA repair is one of the most remarkable capabilities of living cells.

In this chapter and the next, we will focus on the correction of polymerase errors. Later chapters will address the repair of lesions.

DNA polymerase "proofreading"

The first line of defense against polymerase errors is proofreading.

DNA polymerases delta and epsilon possess a second enzymatic activity (a second so-called "active site") in addition to polymerization: a 3′→5′ exonuclease activity. This allows them to remove incorrectly inserted nucleotides.

If a polymerase incorporates a non-complementary nucleotide, the distorted geometry of the mismatched base pair alters the enzyme’s conformation. The polymerase pauses. The 3′ end of the nascent strand is then transferred from the polymerase active site to the exonuclease active site.

There, the mismatched nucleotide--and occasionally one or two additional bases--is ejected. The corrected 3′ end is then repositioned back into the polymerase site, and synthesis resumes.
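The shuttle between the two active sites can be illustrated with a toy model. This is only a sketch under simplifying assumptions: strands are plain strings compared position by position (strand polarity is ignored), proofreading always succeeds, and only the single mismatched nucleotide is removed. The `misincorporations` input and the function name are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize(template: str, misincorporations: dict, proofread: bool = True) -> str:
    """Copy `template` one base at a time.

    `misincorporations` maps a template position to the wrong base that the
    polymerase first inserts there (a hypothetical input for illustration).
    """
    nascent = []
    for position, template_base in enumerate(template):
        correct = COMPLEMENT[template_base]
        # Polymerase site: add a nucleotide to the 3' end (possibly the wrong one).
        nascent.append(misincorporations.get(position, correct))
        if proofread and nascent[-1] != correct:
            # Exonuclease site: remove the mismatched 3' nucleotide.
            nascent.pop()
            # Back in the polymerase site: insert the correct nucleotide.
            nascent.append(correct)
    return "".join(nascent)


template = "ATGC"
error = {0: "C"}  # a "C" inserted opposite the template "A" instead of "T"
print(synthesize(template, error, proofread=False))
print(synthesize(template, error, proofread=True))
```

With proofreading switched off, the misincorporated base stays in the new strand. With it switched on, the toy polymerase backs up one nucleotide and tries again.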

Proofreading dramatically improves fidelity. The baseline mismatch rate of 1 in 10⁵–10⁶ nucleotides is reduced by 100- to 1000-fold. After proofreading, errors occur roughly once every 10 million to 100 million nucleotides.

Even so, that would still leave on the order of hundreds of errors per genome replication. Still not good enough.
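A short calculation shows where those numbers come from. In this sketch, a 100-fold improvement applied to the quoted baseline reproduces the quoted post-proofreading range; a larger improvement would push the rate lower still. The genome size of about 6.4 billion base pairs (diploid) is an assumption added for illustration, as is the function name.

```python
# ASSUMPTION: a diploid human genome of ~6.4 billion base pairs (not stated in the text).
GENOME_SIZE_BP = 6.4e9


def after_proofreading(nucleotides_per_error: float, fold_improvement: float) -> float:
    """Nucleotides copied per error once fidelity improves by `fold_improvement`."""
    return nucleotides_per_error * fold_improvement


for baseline in (1e5, 1e6):
    improved = after_proofreading(baseline, fold_improvement=100)
    remaining = GENOME_SIZE_BP / improved  # errors left per genome replication
    print(
        f"baseline 1 in {baseline:,.0f} -> 1 in {improved:,.0f} "
        f"-> about {remaining:,.0f} errors per replication"
    )
```

Under these assumptions, tens to hundreds of errors would still remain after every replication, which is why a second correction pathway is needed.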

Cells therefore employ a backup pathway: mismatch repair (MMR).

Mismatch repair (MMR)

MMR corrects substitution errors and small indels that escape proofreading.

Like proofreading, MMR operates within the replication fork. Its detector protein is physically linked to the PCNA sliding clamp.

MMR--and several other repair pathways--follow a common logic known as “cut-and-patch” repair. The five general steps are:

1. Detect the error
2. Nick the strand containing the error
3. Excise the erroneous DNA
4. Fill in the gap
5. Ligate the strands
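The cut-and-patch logic can be sketched as a toy program that operates on strings. This is an illustration only, not a model of the real enzymes: it handles a single substitution, compares strands position by position (ignoring strand polarity), and uses a hypothetical `flank` parameter to decide how much DNA to excise around the error.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def detect(template: str, nascent: str):
    """Step 1: return the index of the first mismatch, or None if there is none."""
    for index, (template_base, new_base) in enumerate(zip(template, nascent)):
        if COMPLEMENT[template_base] != new_base:
            return index
    return None


def cut_and_patch(template: str, nascent: str, flank: int = 2) -> str:
    """Repair the first mismatch in `nascent` using `template` as the reference."""
    site = detect(template, nascent)
    if site is None:
        return nascent  # nothing to repair

    # Step 2: nick the new strand a little to one side of the error.
    nick = max(0, site - flank)
    end = min(len(nascent), site + flank + 1)

    # Step 3: excise the stretch of new strand that contains the error.
    left, right = nascent[:nick], nascent[end:]

    # Step 4: fill the gap by copying the intact template strand.
    patch = "".join(COMPLEMENT[base] for base in template[nick:end])

    # Step 5: ligate the patch to the flanking DNA.
    return left + patch + right


template = "ATGCATGC"
nascent = "TACCTACG"  # position 3 carries a "C" where a "G" belongs
print(cut_and_patch(template, nascent))
```

The key design point mirrors the biology: the template strand is trusted, so only the new strand is ever cut and rewritten.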

Let’s walk through these steps in the context of MMR.

Step 1. Detecting

The core structural protein of MMR is MSH2. MSH2 pairs with one of two partners to form distinct complexes with different detection specificities.

When paired with its primary partner, the complex recognizes single-base mismatches and small insertion–deletion loops of one or two nucleotides. These represent most replication errors. A different partner detects larger indels.

Within the replisome, the MSH2-containing complex is tethered to the back face of PCNA via a PIP box. As DNA is synthesized, the complex continuously scans for distortions.

Mismatches create small kinks in the helix. These distortions widen the double helix slightly. Upon detection, the partner protein inserts a specific amino acid into the gap, stabilizing the mismatch.

ATP binding induces a conformational change that closes the MSH2-containing complex around the DNA, converting it into a sliding clamp--distinct from PCNA--that marks the site of the error. It then recruits the next key factor: MutL-alpha.

Step 2. Nicking the error strand

MutL-alpha performs two critical functions: it introduces a single-strand nick in the newly synthesized strand, and it recruits the enzymes required for excision, resynthesis, and ligation.

But how does MutL-alpha know which strand is the new one--the one that contains the error?

On the lagging strand, the answer is straightforward. Unligated Okazaki fragment junctions provide natural strand discrimination signals. The strand with the gaps between Okazaki fragments is the newly synthesized strand. Nick that one.

The leading strand lacks nicks. Here, strand identity is inferred from the orientation of PCNA. Because polymerases synthesize 5′→3′, the strand exiting the back face of PCNA with its 5′ end is the nascent strand--the one to cut.

Step 3. Excising

Once nicked, the erroneous strand is degraded by Exonuclease 1 (EXO1), a 5′→3′ exonuclease.

EXO1 begins at the nick and chews back the strand, removing the mismatched region. The presence of the MSH2-containing sliding clamp at the mismatch helps define the region to be excised and prevents excessive degradation.

Step 4. Filling

With the "bad" DNA removed, DNA polymerase delta fills in the gap, synthesizing new DNA using the intact complementary strand as template.

Step 5. Ligating

When polymerase delta reaches the 5′ end of the adjacent fragment, synthesis stops. DNA polymerases cannot join adjacent fragments. That task belongs to DNA ligase 1, which seals the remaining nick and restores strand continuity.

The substitution or indel is now corrected.

Mismatch repair takes minutes--typically somewhere between a few and thirty--and occurs while the replication fork continues to move. The earliest moments are critical, because only then are the strand-discrimination signals still available.

In the next chapter, we’ll turn to a polymerase error that is even more common than mismatches: ribonucleotide incorporation.
