ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language.

Keywords: Julia language; aneuploidy; genotype likelihoods; high-throughput sequencing data; polyploidy; pooled sequencing; population genetics

Journal

F1000Research
ISSN: 2046-1402
Abbreviated title: F1000Res
Country: England
NLM ID: 101594320

Publication information

Publication date:
2022
History:
accepted: 29 06 2023
pubmed: 25 9 2023
medline: 26 9 2023
entrez: 25 9 2023
Status: epublish

Abstract

A sound analysis of DNA sequencing data is important to extract meaningful information and infer quantities of interest. Sequencing and mapping errors, coupled with low and variable coverage, hamper the identification of genotypes and variants and the estimation of population genetic parameters. Currently available methods and implementations for estimating population genetic parameters from sequencing data are either suitable only for the analysis of genomes from model organisms, require moderate sequencing coverage, or are not easily adaptable to specific applications. To address these issues, we introduce ngsJulia, a collection of templates and functions in the Julia language to process short-read sequencing data for population genetic analysis. We further describe two implementations, ngsPool and ngsPloidy, for the analysis of pooled sequencing data and polyploid genomes, respectively. Through simulations, we illustrate the performance of these implementations in estimating various population genetic parameters with both established and novel statistical methods. These results inform optimal experimental design and demonstrate the applicability of the methods in ngsJulia to estimate parameters of interest even from low-coverage sequencing data. ngsJulia provides users with a flexible and efficient framework for ad hoc analysis of sequencing data. ngsJulia is available from: https://github.com/mfumagalli/ngsJulia.
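
The keywords list genotype likelihoods, the usual way to carry sequencing error and low coverage through to downstream estimates instead of calling a single genotype. The following is a minimal Python sketch of that idea under a simple textbook error model. It is not ngsJulia's API or the authors' implementation; the function name `genotype_log_likelihoods`, the `(base, Phred quality)` read format, and the assumption that an erroneous read is equally likely to show any of the three other bases are all illustrative assumptions.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"


def genotype_log_likelihoods(reads, ploidy=2):
    """Return {genotype: log-likelihood} for one site (hypothetical helper).

    reads  : list of (base, phred_quality) tuples observed at the site
    ploidy : number of chromosome copies (2 = diploid, 4 = tetraploid, ...)

    Assumed model: each read samples one allele of the genotype uniformly
    at random, then reports it correctly with probability 1 - err, or as
    one of the three other bases with probability err / 3 each.
    """
    log_liks = {}
    for genotype in combinations_with_replacement(BASES, ploidy):
        log_lik = 0.0
        for base, qual in reads:
            err = 10 ** (-qual / 10)  # Phred score -> error probability
            # Average the emission probability over the genotype's alleles.
            p = sum((1 - err) if base == allele else err / 3
                    for allele in genotype) / ploidy
            log_lik += math.log(p)
        log_liks["".join(genotype)] = log_lik
    return log_liks


if __name__ == "__main__":
    # Three reads at low coverage: two A, one G, all with Phred quality 30.
    site = [("A", 30), ("A", 30), ("G", 30)]
    log_liks = genotype_log_likelihoods(site)
    for genotype, value in sorted(log_liks.items(), key=lambda kv: -kv[1])[:3]:
        print(genotype, round(value, 2))
```

With only three reads, several genotypes keep non-negligible likelihood, which is why likelihood-based methods retain the full vector per site at low coverage. Changing `ploidy` enumerates the unordered genotypes for higher ploidy under the same assumed model.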

Identifiers

pubmed: 37745626
doi: 10.12688/f1000research.104368.2
pmc: PMC10514575

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

126

Copyright information

Copyright: © 2023 Mas-Sandoval A et al.

Conflict of interest statement

No competing interests were disclosed.

Authors

Alex Mas-Sandoval (A)

Department of Life Sciences, Imperial College London, London, UK.

Chenyu Jin (C)

Department of Life Sciences, Imperial College London, London, UK.
Institute of Population Genetics, University of Veterinary Medicine Vienna, Vienna, Austria.

Marco Fracassetti (M)

Department of Ecology, Environment and Plant Sciences, Stockholm University, Stockholm, Sweden.

Matteo Fumagalli (M)

Department of Life Sciences, Imperial College London, London, UK.
School of Biological and Behavioural Science, Queen Mary, University of London, London, UK.
