STRavinsky STR database and PGTailor PGT tool demonstrate superiority of CHM13-T2T over hg38 and hg19 for STR-based applications.


Journal

European journal of human genetics : EJHG
ISSN: 1476-5438
Abbreviated title: Eur J Hum Genet
Country: England
NLM ID: 9302235

Publication information

Publication date:
Jul 2023
History:
received: 2023-01-08
accepted: 2023-03-23
revised: 2023-03-18
pmc-release: 2024-07-01
medline: 2023-07-10
pubmed: 2023-04-14
entrez: 2023-04-13
Status: ppublish

Abstract

Short tandem repeats (STRs) have long been studied for their possible roles in biological phenomena and are used in applications such as forensics, evolutionary studies and preimplantation genetic testing (PGT). The two reference genomes most used by clinicians and researchers are GRCh37/hg19 and GRCh38/hg38. Both were constructed mainly by short-read sequencing (SRS), in which reads consisting entirely of STR sequence cannot be assembled into the reference genome. With the introduction of long-read sequencing (LRS) methods and the generation of the CHM13 reference genome, also known as T2T, many previously unmapped STRs were finally localized within the human genome. We generated STRavinsky, a compact STR database covering three reference genomes, including T2T. Using it, we demonstrated the advantages of T2T over hg19 and hg38, identifying nearly double the number of STRs across all chromosomes. STRavinsky resolves STRs down to specific genomic coordinates, which allowed us to demonstrate a marked enrichment of TGGAA repeats in the p arms of acrocentric chromosomes. This finding substantially corroborates early molecular studies suggesting a possible role for these repeats in the formation of Robertsonian translocations. We also delineated a distinctive enrichment of TGGAA repeats in chromosome regions 16q11.2 and 9q12. Finally, we harnessed the capabilities of T2T and STRavinsky to generate PGTailor, a novel web application that enables STR-based PGT tests to be designed within minutes.

Identifiers

pubmed: 37055538
doi: 10.1038/s41431-023-01352-6
pii: 10.1038/s41431-023-01352-6
pmc: PMC10325972

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

738-743

Copyright information

© 2023. The Author(s), under exclusive licence to European Society of Human Genetics.

Authors

Noam Hadar (N)

Morris Kahn Laboratory of Human Genetics, NIBN and Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel.

Ginat Narkis (G)

Morris Kahn Laboratory of Human Genetics, NIBN and Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel.
Genetics Institute, Soroka Medical Center, Beer Sheva, Israel.

Shirly Amar (S)

Genetics Institute, Soroka Medical Center, Beer Sheva, Israel.

Marina Varnavsky (M)

Genetics Institute, Soroka Medical Center, Beer Sheva, Israel.

Glenda Calniquer Palti (GC)

Genetics Institute, Soroka Medical Center, Beer Sheva, Israel.

Amit Safran (A)

Morris Kahn Laboratory of Human Genetics, NIBN and Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel.

Ohad S Birk (OS)

Morris Kahn Laboratory of Human Genetics, NIBN and Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, Israel.
Genetics Institute, Soroka Medical Center, Beer Sheva, Israel.
Correspondence: obirk@bgu.ac.il
