Insights into DNA repeat expansions among 900,000 biobank participants

Depienne, C. & Mandel, J.-L. 30 years of repeat expansion disorders: what have we learned and what are the remaining challenges? Am. J. Hum. Genet. 108, 764–785 (2021).
Hannan, A. J. Tandem repeats mediating genetic plasticity in health and disease. Nat. Rev. Genet. 19, 286–298 (2018).
Ziaei Jam, H. et al. A deep population reference panel of tandem repeat variation. Nat. Commun. 14, 6711 (2023).
English, A. C. et al. Analysis and benchmarking of small and large genomic variants across tandem repeats. Nat. Biotechnol. https://doi.org/10.1038/s41587-024-02225-z (2024).
Fotsing, S. F. et al. The impact of short tandem repeat variation on gene expression. Nat. Genet. 51, 1652–1659 (2019).
Margoliash, J. et al. Polymorphic short tandem repeats make widespread contributions to blood and serum traits. Cell Genom. 3, 100458 (2023).
Manigbas, C. A. et al. A phenome-wide association study of tandem repeat variation in 168,554 individuals from the UK Biobank. Nat. Commun. 15, 10521 (2024).
Mitra, I. et al. Patterns of de novo tandem repeat mutations and their role in autism. Nature 589, 246–250 (2021).
Kristmundsdottir, S. et al. Sequence variants affecting the genome-wide rate of germline microsatellite mutations. Nat. Commun. 14, 3855 (2023).
Porubsky, D. et al. Human de novo mutation rates from a four-generation pedigree reference. Nature 643, 427–436 (2025).
Gymrek, M., Willems, T., Reich, D. & Erlich, Y. Interpreting short tandem repeat variations in humans using mutational constraint. Nat. Genet. 49, 1495–1501 (2017).
Steely, C. J., Watkins, W. S., Baird, L. & Jorde, L. B. The mutational dynamics of short tandem repeats in large, multigenerational families. Genome Biol. 23, 253 (2022).
Rajagopal, S., Donaldson, J., Flower, M., Hensman Moss, D. J. & Tabrizi, S. J. Genetic modifiers of repeat expansion disorders. Emerg. Top. Life Sci. 7, 325–337 (2023).
Genetic Modifiers of Huntington’s Disease (GeM-HD) Consortium. Identification of genetic factors that modify clinical onset of Huntington’s disease. Cell 162, 516–526 (2015).
Lee, J.-M. et al. A modifier of Huntington’s disease onset at the MLH1 locus. Hum. Mol. Genet. 26, 3859–3867 (2017).
Genetic Modifiers of Huntington’s Disease (GeM-HD) Consortium. CAG repeat not polyglutamine length determines timing of Huntington’s disease onset. Cell 178, 887–900 (2019).
Lee, J.-M. et al. Genetic modifiers of Huntington disease differentially influence motor and cognitive domains. Am. J. Hum. Genet. 109, 885–899 (2022).
Genetic Modifiers of Huntington’s Disease (GeM-HD) Consortium. Genetic modifiers of somatic expansion and clinical phenotypes in Huntington’s disease highlight shared and tissue-specific effects. Nat. Genet. 57, 1426–1436 (2025).
Moss, D. J. H. et al. Identification of genetic variants associated with Huntington’s disease progression: a genome-wide association study. Lancet Neurol. 16, 701–711 (2017).
Handsaker, R. E. et al. Long somatic DNA-repeat expansion drives neurodegeneration in Huntington disease. Cell 188, 623–639 (2025).
The UK Biobank Whole-Genome Sequencing Consortium. Whole-genome sequencing of 490,640 UK Biobank participants. Nature 645, 692–701 (2025).
The All of Us Research Program Genomics Investigators. Genomic data in the All of Us Research Program. Nature 627, 340–346 (2024).
Tanudisastro, H. A., Deveson, I. W., Dashnow, H. & MacArthur, D. G. Sequencing and characterizing short tandem repeats in the human genome. Nat. Rev. Genet. 25, 460–475 (2024).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Khristich, A. N. & Mirkin, S. M. On the wrong DNA track: molecular mechanisms of repeat-mediated genome instability. J. Biol. Chem. 295, 4134–4170 (2020).
Lundström, O. S. et al. WebSTR: a population-wide database of short tandem repeat variation in humans. J. Mol. Biol. 435, 168260 (2023).
Palamara, P. F. et al. Leveraging distant relatedness to quantify human mutation and gene-conversion rates. Am. J. Hum. Genet. 97, 775–789 (2015).
Tian, X., Browning, B. L. & Browning, S. R. Estimating the genome-wide mutation rate with three-way identity by descent. Am. J. Hum. Genet. 105, 883–893 (2019).
Tian, X., Cai, R. & Browning, S. R. Estimating the genome-wide mutation rate from thousands of unrelated individuals. Am. J. Hum. Genet. 109, 2178–2184 (2022).
Chung, M. et al. Evidence for a mechanism predisposing to intergenerational CAG repeat instability in spinocerebellar ataxia type I. Nat. Genet. 5, 254–258 (1993).
Eichler, E. E. et al. Length of uninterrupted CGG repeats determines instability in the FMR1 gene. Nat. Genet. 8, 88–94 (1994).
Matuszek, Z. et al. Base editing of trinucleotide repeats that cause Huntington’s disease and Friedreich’s ataxia reduces somatic repeat expansions in patient cells and in mice. Nat. Genet. 57, 1437–1451 (2025).
Shinde, D., Lai, Y., Sun, F. & Arnheim, N. Taq DNA polymerase slippage mutation rates measured by PCR and quasi-likelihood analysis: (CA/GT)n and (A/T)n microsatellites. Nucleic Acids Res. 31, 974–980 (2003).
Raz, O. et al. Short tandem repeat stutter model inferred from direct measurement of in vitro stutter noise. Nucleic Acids Res. 47, 2436–2445 (2019).
Sehgal, A., Ziaei Jam, H., Shen, A. & Gymrek, M. Genome-wide detection of somatic mosaicism at short tandem repeats. Bioinformatics 40, btae485 (2024).
Goodwin, S., McPherson, J. D. & McCombie, W. R. Coming of age: ten years of next-generation sequencing technologies. Nat. Rev. Genet. 17, 333–351 (2016).
Ashizawa, T., Dubel, J. R. & Harati, Y. Somatic instability of CTG repeat in myotonic dystrophy. Neurology 43, 2674 (1993).
Mouro Pinto, R. et al. Patterns of CAG repeat instability in the central nervous system and periphery in Huntington’s disease and in spinocerebellar ataxia type 1. Hum. Mol. Genet. 29, 2551–2567 (2020).
Morales, F. et al. Individual-specific levels of CTG•CAG somatic instability are shared across multiple tissues in myotonic dystrophy type 1. Hum. Mol. Genet. 32, 621–631 (2023).
Kacher, R. et al. CAG repeat mosaicism is gene specific in spinocerebellar ataxias. Am. J. Hum. Genet. 111, 913–926 (2024).
Zarouchlioti, C. et al. Tissue-specific TCF4 triplet repeat instability revealed by optical genome mapping. EBioMedicine 108, 105328 (2024).
Laabs, B.-H. et al. Identifying genetic modifiers of age-associated penetrance in X-linked dystonia-parkinsonism. Nat. Commun. 12, 3216 (2021).
Maza, A. M. et al. MSH3 is a genetic modifier of somatic repeat instability in X-linked dystonia parkinsonism. Preprint at bioRxiv https://doi.org/10.1101/2025.05.14.653432 (2025).
Dolzhenko, E. et al. Detection of long repeat expansions from PCR-free whole-genome sequence data. Genome Res. 27, 1895–1903 (2017).
Hafford-Tear, N. J. et al. CRISPR/Cas9-targeted enrichment and long-read sequencing of the Fuchs endothelial corneal dystrophy–associated TCF4 triplet repeat. Genet. Med. 21, 2092–2102 (2019).
Arab, K. et al. GADD45A binds R-loops and recruits TET1 to CpG island promoters. Nat. Genet. 51, 217–223 (2019).
Kim, K.-H. et al. Genetic and functional analyses point to FAN1 as the source of multiple Huntington disease modifier effects. Am. J. Hum. Genet. 107, 96–110 (2020).
Wieben, E. D. et al. A common trinucleotide repeat expansion within the transcription factor 4 (TCF4, E2-2) gene predicts Fuchs corneal dystrophy. PLoS ONE 7, e49083 (2012).
Fautsch, M. P. et al. TCF4-mediated Fuchs endothelial corneal dystrophy: Insights into a common trinucleotide repeat-associated disease. Prog. Retin. Eye Res. 81, 100883 (2021).
Gorman, B. R. et al. A multi-ancestry GWAS of Fuchs corneal dystrophy highlights the contributions of laminins, collagen, and endothelial cell regulation. Commun. Biol. 7, 418 (2024).
Verma, A. et al. Diversity and scale: genetic architecture of 2068 traits in the VA Million Veteran Program. Science 385, eadj1182 (2024).
Palombo, F. et al. hMutSβ, a heterodimer of hMSH2 and hMSH3, binds to insertion/deletion loops in DNA. Curr. Biol. 6, 1181–1184 (1996).
Genschel, J., Littman, S. J., Drummond, J. T. & Modrich, P. Isolation of MutSβ from human cells and comparison of the mismatch repair specificities of MutSβ and MutSα. J. Biol. Chem. 273, 19895–19901 (1998).
Hazra, T. K. et al. Identification and characterization of a novel human DNA glycosylase for repair of cytosine-derived lesions. J. Biol. Chem. 277, 30417–30420 (2002).
Costelloe, T. et al. The yeast Fun30 and human SMARCAD1 chromatin remodellers promote DNA end resection. Nature 489, 581–584 (2012).
Mouro Pinto, R. et al. In vivo CRISPR–Cas9 genome editing in mice identifies genetic modifiers of somatic CAG repeat instability in Huntington’s disease. Nat. Genet. 57, 314–322 (2025).
Jadhav, B. et al. A phenome-wide association study of methylated GC-rich repeats identifies a GCC repeat expansion in AFF3 associated with intellectual disability. Nat. Genet. 56, 2322–2332 (2024).
Van Kuilenburg, A. B. P. et al. Glutaminase deficiency caused by short tandem repeat expansion in GLS. N. Engl. J. Med. 380, 1433–1441 (2019).
Fazal, S. et al. Repeat expansions nested within tandem CNVs: a unique structural change in GLS exemplifies the diagnostic challenges of non-coding pathogenic variation. Hum. Mol. Genet. 32, 46–54 (2023).
Rumping, L. et al. Identification of a loss-of-function mutation in the context of glutaminase deficiency and neonatal epileptic encephalopathy. JAMA Neurol. 76, 342 (2019).
Malik, I., Kelley, C. P., Wang, E. T. & Todd, P. K. Molecular mechanisms underlying nucleotide repeat expansion disorders. Nat. Rev. Mol. Cell Biol. 22, 589–607 (2021).
Ciosi, M. et al. A genetic association study of glutamine-encoding DNA sequence structures, somatic CAG expansion, and DNA repair gene variants, with Huntington disease clinical outcomes. EBioMedicine 48, 568–580 (2019).
Hujoel, M. L. A. et al. Code and pheWAS data from ‘Insights into DNA repeat expansions among 900,000 biobank participants’. Zenodo https://doi.org/10.5281/zenodo.17419996 (2025).




