A complete pedigree-based graph workflow for rare candidate variant analysis.


Journal

Genome research
ISSN: 1549-5469
Abbreviated title: Genome Res
Country: United States
NLM ID: 9518021

Publication information

Publication date:
05 2022
History:
received: 24 11 2021
accepted: 24 03 2022
pubmed: 29 4 2022
medline: 18 5 2022
entrez: 28 4 2022
Status: ppublish

Abstract

Methods that use a linear genome reference for genome sequencing data analysis are reference-biased. In the field of clinical genetics for rare diseases, a resulting reduction in genotyping accuracy in some regions has likely prevented the resolution of some cases. Pangenome graphs embed population variation into a reference structure. Although pangenome graphs have helped to reduce reference mapping bias, further performance improvements are possible. We introduce VG-Pedigree, a pedigree-aware workflow based on the pangenome-mapping tool Giraffe and the variant-calling tool DeepTrio, using a specially trained model for Giraffe-based alignments. We demonstrate mapping and variant-calling improvements in both single-nucleotide variants (SNVs) and insertion and deletion (indel) variants over those produced by alignments created using BWA-MEM to a linear reference and Giraffe mapping to a pangenome graph containing data from the 1000 Genomes Project. We have also adapted and upgraded deleterious-variant (DV) detecting methods and programs into a streamlined workflow. We used these workflows in combination to detect small lists of candidate DVs among 15 family quartets and quintets of the Undiagnosed Diseases Program (UDP). All candidate DVs that were previously diagnosed using the Mendelian models covered by the previously published methods were recapitulated by these workflows. The results of these experiments indicate that a slightly greater absolute count of DVs is detected in the proband population than in their matched unaffected siblings.
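
The abstract refers to Mendelian models for selecting candidate DVs in family quartets and quintets but does not spell them out. The sketch below is only an illustration of the general idea, not the VG-Pedigree implementation: it assumes biallelic autosomal sites, genotypes given as allele pairs (0 = reference, 1 = alternate), and two simplified models (de novo and homozygous recessive). All function names are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical representation: an unphased allele pair, 0 = reference, 1 = alternate.
Genotype = Tuple[int, int]


def alt_count(gt: Genotype) -> int:
    """Number of non-reference alleles in a genotype."""
    return sum(1 for allele in gt if allele != 0)


def is_mendelian_consistent(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """True if the child could inherit one allele from each parent."""
    a, b = child
    return (a in mother and b in father) or (a in father and b in mother)


def classify_site(child: Genotype, mother: Genotype, father: Genotype) -> Optional[str]:
    """Assign a simplified inheritance model to a proband genotype, or None."""
    c, m, f = alt_count(child), alt_count(mother), alt_count(father)
    if c == 0:
        return None  # proband carries no alternate allele
    if m == 0 and f == 0:
        return "de_novo"  # alternate allele absent from both parents
    if c == 2 and m == 1 and f == 1:
        return "homozygous_recessive"  # both parents are carriers
    return None


def segregates(model: str, unaffected_siblings: Iterable[Genotype]) -> bool:
    """Keep a candidate only if no unaffected sibling matches the disease genotype."""
    sibs = list(unaffected_siblings)
    if model == "de_novo":
        return all(alt_count(s) == 0 for s in sibs)
    if model == "homozygous_recessive":
        return all(alt_count(s) < 2 for s in sibs)
    return False


if __name__ == "__main__":
    proband, mother, father, sibling = (1, 1), (0, 1), (0, 1), (0, 1)
    model = classify_site(proband, mother, father)
    print(model, segregates(model, [sibling]))
```

A real workflow would also need to handle sex chromosomes, compound heterozygosity, genotype quality, and population allele frequency, none of which are modeled here.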

Identifiers

pubmed: 35483961
pii: gr.276387.121
doi: 10.1101/gr.276387.121
pmc: PMC9104704

Publication types

Journal Article
Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

893-903

Copyright information

© 2022 Markello et al.; Published by Cold Spring Harbor Laboratory Press.

Authors

Charles Markello (C)

UC Santa Cruz Genomics Institute, Santa Cruz, California 95060, USA.

Charles Huang (C)

Undiagnosed Diseases Program, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA.

Alex Rodriguez (A)

Undiagnosed Diseases Program, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA.

Andrew Carroll (A)

Google Incorporated, Mountain View, California 94043, USA.

Pi-Chuan Chang (PC)

Google Incorporated, Mountain View, California 94043, USA.

Jordan Eizenga (J)

UC Santa Cruz Genomics Institute, Santa Cruz, California 95060, USA.

Thomas Markello (T)

Undiagnosed Diseases Program, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA.

David Haussler (D)

UC Santa Cruz Genomics Institute, Santa Cruz, California 95060, USA.
Howard Hughes Medical Institute, University of California, Santa Cruz, California 95064, USA.

Benedict Paten (B)

UC Santa Cruz Genomics Institute, Santa Cruz, California 95060, USA.
