LinkedSV for detection of mosaic structural variants from linked-read exome and genome sequencing data.


Journal

Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
6 December 2019
History:
received: 12 March 2019
accepted: 7 November 2019
entrez: 8 December 2019
pubmed: 8 December 2019
medline: 3 March 2020
Status: epublish

Abstract

Linked-read sequencing provides long-range information on short-read sequencing data by barcoding reads originating from the same DNA molecule, and can improve detection and breakpoint identification for structural variants (SVs). Here we present LinkedSV for SV detection on linked-read sequencing data. LinkedSV considers barcode overlapping and enriched fragment endpoints as signals to detect large SVs, while it leverages read depth, paired-end signals and local assembly to detect small SVs. Benchmarking studies demonstrate that LinkedSV outperforms existing tools, especially on exome data and on somatic SVs with low variant allele frequencies. We demonstrate clinical cases where LinkedSV identifies disease-causal SVs from linked-read exome sequencing data missed by conventional exome sequencing, and show examples where LinkedSV identifies SVs missed by high-coverage long-read sequencing. In summary, LinkedSV can detect SVs missed by conventional short-read and long-read sequencing approaches, and may resolve negative cases from clinical genome/exome sequencing studies.
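The barcode-overlap signal mentioned in the abstract can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not LinkedSV's implementation: the function names, the fixed 10 kb window size, and the sample reads are all hypothetical. The idea is that two genomic windows far apart on the reference normally share few barcodes, so an unexpectedly high number of shared barcodes hints that the two regions are adjacent in the sample genome.

```python
from collections import defaultdict


def barcodes_by_window(reads, window_size=10_000):
    """Group read barcodes by fixed-size genomic window.

    `reads` is an iterable of (chrom, position, barcode) tuples.
    The window size is an illustrative assumption, not a LinkedSV default.
    """
    windows = defaultdict(set)
    for chrom, pos, barcode in reads:
        windows[(chrom, pos // window_size)].add(barcode)
    return windows


def shared_barcode_count(windows, window_a, window_b):
    """Count barcodes observed in both windows."""
    return len(windows.get(window_a, set()) & windows.get(window_b, set()))


# Hypothetical reads: barcodes BX01 and BX02 appear in two distant windows.
reads = [
    ("chr1", 1_500, "BX01"),
    ("chr1", 2_300, "BX02"),
    ("chr1", 950_100, "BX01"),
    ("chr1", 951_700, "BX02"),
    ("chr1", 952_000, "BX03"),
]
windows = barcodes_by_window(reads)
print(shared_barcode_count(windows, ("chr1", 0), ("chr1", 95)))  # prints 2
```

A real caller would compare such counts against a background expectation for the given distance and molecule length before reporting a candidate SV. That statistical step is omitted here.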

Identifiers

pubmed: 31811119
doi: 10.1038/s41467-019-13397-7
pii: 10.1038/s41467-019-13397-7
pmc: PMC6898185

Chemical substances

NF1 protein, human 0
Neurofibromin 1 0

Publication types

Journal Article
Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

5585

Grants

Agency: NIGMS NIH HHS
ID: R01 GM132713
Country: United States
Agency: NICHD NIH HHS
ID: U54 HD086984
Country: United States

Authors

Li Fang (L)

Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.

Charlly Kao (C)

Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.

Michael V Gonzalez (MV)

Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.

Fernanda A Mafra (FA)

Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.

Renata Pellegrino da Silva (R)

Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.

Mingyao Li (M)

Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA.

Sören-Sebastian Wenzel (SS)

Institute of Human Genetics, Department for Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria.

Katharina Wimmer (K)

Institute of Human Genetics, Department for Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria.

Hakon Hakonarson (H)

Department of Pediatrics, University of Pennsylvania, Philadelphia, PA, 19104, USA.

Kai Wang (K)

Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA. wangk@email.chop.edu.
Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA. wangk@email.chop.edu.
