Insights into DNA repeat expansions among 900,000 biobank participants





