slimr: An R package for tailor-made integrations of data in population genomic simulations over space and time.
application
ecology
evolution
evolutionary ecology
landscape genomics
population genomics
simulation
software
Journal
Molecular ecology resources
ISSN: 1755-0998
Abbreviated title: Mol Ecol Resour
Country: England
NLM ID: 101465604
Publication information
Publication date:
20 Dec 2023
History:
revised: 20 Nov 2023
received: 18 Mar 2022
accepted: 30 Nov 2023
medline: 21 Dec 2023
pubmed: 21 Dec 2023
entrez: 21 Dec 2023
Status:
aheadofprint
Abstract
Software for realistically simulating complex population genomic processes is revolutionizing our understanding of evolutionary processes, and providing novel opportunities for integrating empirical data with simulations. However, the integration between standalone simulation software and R is currently not well developed. Here, we present slimr, an R package designed to create a seamless link between standalone software SLiM >3.0, one of the most powerful population genomic simulation frameworks, and the R development environment, with its powerful data manipulation and analysis tools. We show how slimr facilitates smooth integration between genetic data, ecological data and simulation in a single environment. The package enables pipelines that begin with data reading, cleaning and manipulation, proceed to constructing empirically based parameters and initial conditions for simulations, then to running numerical simulations and finally to retrieving simulation results in a format suitable for comparisons with empirical data - aided by advanced analysis and visualization tools provided by R. We demonstrate the use of slimr with an example from our own work on the landscape population genomics of desert mammals, highlighting the advantage of having a single integrated tool for both data analysis and simulation. slimr makes the powerful simulation ability of SLiM directly accessible to R users, allowing integrated simulation projects that incorporate empirical data without the need to switch between software environments. This should provide more opportunities for evolutionary biologists and ecologists to use realistic simulations to better understand the interplay between ecological and evolutionary processes.
Identifiers
pubmed: 38124500
doi: 10.1111/1755-0998.13916
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
e13916
Grants
Agency: Australian Research Council
ID: DP180103844
Copyright information
© 2023 John Wiley & Sons Ltd.