An assessment of bioinformatics tools for the detection of human endogenous retroviral insertions in short-read genome sequencing data.

Keywords: benchmarking; bioinformatics; HERV-K; retrovirus; whole-genome sequencing

Journal

Frontiers in bioinformatics
ISSN: 2673-7647
Abbreviated title: Front Bioinform
Country: Switzerland
NLM ID: 9918227263306676

Publication information

Publication date:
2022
History:
received: 05 10 2022
accepted: 12 12 2022
entrez: 27 2 2023
pubmed: 28 2 2023
medline: 28 2 2023
Status: epublish

Abstract

Interest in human endogenous retroviruses (HERVs) is growing, given the substantial body of evidence implicating them in many human diseases. Although their genomic characterization presents numerous technical challenges, next-generation sequencing (NGS) has shown potential to detect HERV insertions and their polymorphisms in humans. A number of computational tools now exist to detect them in short-read NGS data. Designing optimal analysis pipelines requires an independent evaluation of the available tools. We evaluated the performance of a set of such tools using a variety of experimental designs and datasets. These included 50 human short-read whole-genome sequencing samples, matching long- and short-read sequencing data, and simulated short-read NGS data. Our results highlight great variability in tool performance across the datasets and suggest that different tools might suit different study designs. However, specialized tools designed exclusively to detect human endogenous retroviruses consistently outperformed generalist tools that detect a wider range of transposable elements. We suggest that, if sufficient computing resources are available, using multiple HERV detection tools to obtain a consensus set of insertion loci may be ideal. Furthermore, given that the false positive discovery rate of the tools varied between 8% and 55% across tools and datasets, we recommend wet-lab validation of predicted insertions if DNA samples are available.
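The consensus idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the tool names, the 500 bp merging window, the two-tool support threshold, and the function names are all hypothetical choices made for illustration. The sketch also assumes that the "false positive discovery rate" is the share of predicted insertions that turn out to be false, FP / (FP + TP).

```python
def consensus_loci(calls_by_tool, window=500, min_tools=2):
    """Merge nearby insertion calls and keep loci supported by >= min_tools tools.

    calls_by_tool maps a tool name to a list of (chromosome, position) calls.
    The window size and support threshold are illustrative assumptions.
    """
    # Flatten all calls and sort by chromosome, then position.
    flat = sorted(
        (chrom, pos, tool)
        for tool, calls in calls_by_tool.items()
        for chrom, pos in calls
    )

    # Chain calls on the same chromosome that lie within `window` bp
    # of the previous call into one cluster.
    clusters = []
    for chrom, pos, tool in flat:
        last = clusters[-1] if clusters else None
        if last and last["chrom"] == chrom and pos - last["positions"][-1] <= window:
            last["positions"].append(pos)
            last["tools"].add(tool)
        else:
            clusters.append({"chrom": chrom, "positions": [pos], "tools": {tool}})

    # Report each sufficiently supported cluster at its integer mean position.
    return [
        (c["chrom"], sum(c["positions"]) // len(c["positions"]), sorted(c["tools"]))
        for c in clusters
        if len(c["tools"]) >= min_tools
    ]


def false_discovery_rate(false_positives, true_positives):
    """Fraction of predicted insertions that are false: FP / (FP + TP)."""
    total = false_positives + true_positives
    if total == 0:
        raise ValueError("no predictions to evaluate")
    return false_positives / total


if __name__ == "__main__":
    # Hypothetical calls from three placeholder tools.
    calls = {
        "tool_a": [("chr1", 1000), ("chr2", 5000)],
        "tool_b": [("chr1", 1200), ("chr3", 700)],
        "tool_c": [("chr1", 900)],
    }
    print(consensus_loci(calls))
    print(false_discovery_rate(false_positives=11, true_positives=9))
```

Raising `min_tools` trades sensitivity for fewer false positives; which threshold is appropriate would depend on the tools and study design, which this sketch does not address.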

Identifiers

pubmed: 36845320
doi: 10.3389/fbinf.2022.1062328
pii: 1062328
pmc: PMC9945273

Publication types

Journal Article

Languages

eng

Pagination

1062328

Grants

Agency: Motor Neurone Disease Association
ID: ALCHALABI-DOBSON/APR14/829-791
Country: United Kingdom
Agency: Medical Research Council
ID: MR/R024804/1
Country: United Kingdom

Copyright information

Copyright © 2023 Bowles, Kabiljo, Al Khleifat, Jones, Quinn, Dobson, Swanson, Al-Chalabi and Iacoangeli.

Conflict of interest statement

AC is the Principal Investigator of the Lighthouse 2 trial of Triumeq in ALS. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors

Harry Bowles (H)

Department of Basic and Clinical Neuroscience, King's College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom.

Renata Kabiljo (R)

Department of Basic and Clinical Neuroscience, King's College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom.
Department of Biostatistics and Health Informatics, King's College London, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom.

Ahmad Al Khleifat (A)

Department of Basic and Clinical Neuroscience, King's College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom.

Ashley Jones (A)

Department of Basic and Clinical Neuroscience, King's College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom.

John P Quinn (JP)

Department of Pharmacology and Therapeutics, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom.

Richard J B Dobson (RJB)

Department of Biostatistics and Health Informatics, King's College London, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom.
NIHR Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London, London, United Kingdom.
Institute of Health Informatics, University College London, London, United Kingdom.
NIHR Biomedical Research Centre, University College London Hospitals NHS Foundation Trust, London, United Kingdom.

Chad M Swanson (CM)

Department of Infectious Diseases, School of Immunology and Microbial Sciences, King's College London, London, United Kingdom.

Ammar Al-Chalabi (A)

Department of Basic and Clinical Neuroscience, King's College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom.
Department of Neurology, King's College Hospital, London, United Kingdom.

Alfredo Iacoangeli (A)

Department of Basic and Clinical Neuroscience, King's College London, Maurice Wohl Clinical Neuroscience Institute, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom.
Department of Biostatistics and Health Informatics, King's College London, Institute of Psychiatry, Psychology and Neuroscience, London, United Kingdom.
NIHR Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London, London, United Kingdom.

MeSH classifications