Journeying towards best practice data management in biodiversity genomics.

CARE principles for indigenous data governance; FAIR guiding principles; data lifecycle; data management plans; digital sequence information; indigenous data sovereignty

Journal

Molecular ecology resources
ISSN: 1755-0998
Abbreviated title: Mol Ecol Resour
Country: England
NLM ID: 101465604

Publication information

Publication date:
24 Oct 2023
History:
received: 04 05 2023
revised: 15 09 2023
accepted: 03 10 2023
medline: 24 10 2023
pubmed: 24 10 2023
entrez: 24 10 2023
Status: aheadofprint

Abstract

Advances in sequencing technologies and declining costs are increasing the accessibility of large-scale biodiversity genomic datasets. To maximize the impact of these data, a careful, considered approach to data management is essential. However, challenges associated with the management of such datasets remain, exacerbated by uncertainty among the research community as to what constitutes best practice. As an interdisciplinary team with diverse data management experience, we recognize the growing need for guidance on comprehensive data management practices that minimize the risks of data loss, maximize efficiency for stand-alone projects, enhance opportunities for data reuse, facilitate Indigenous data sovereignty and uphold the FAIR and CARE Guiding Principles. Here, we describe four fictional personas reflecting differing user experiences with data management to identify data management challenges across the biodiversity genomics research ecosystem. We then use these personas to demonstrate realistic considerations, compromises and actions for biodiversity genomic data management. We also launch the Biodiversity Genomics Data Management Hub (https://genomicsaotearoa.github.io/data-management-resources/), containing tips, tricks and resources to support biodiversity genomics researchers, especially those new to data management, in their journey towards best practice. The Hub also provides an opportunity for biodiversity researchers whose expertise lies beyond genomics and who are keen to advance their data management journey. We aim to support the biodiversity genomics community in embedding data management throughout the research lifecycle to maximize research impact and outcomes.

Identifiers

pubmed: 37873890
doi: 10.1111/1755-0998.13880

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Copyright information

© 2023 Manaaki Whenua - Landcare Research and The Authors. Molecular Ecology Resources published by John Wiley & Sons Ltd.

Authors

Natalie J Forsdick (NJ)

Manaaki Whenua-Landcare Research, Lincoln, New Zealand.
Genomics Aotearoa, Dunedin, New Zealand.

Jana Wold (J)

Genomics Aotearoa, Dunedin, New Zealand.
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.

Anton Angelo (A)

Library, University of Canterbury, Christchurch, New Zealand.

François Bissey (F)

Digital Services, University of Canterbury, Christchurch, New Zealand.

Jamie Hart (J)

Digital Services, University of Canterbury, Christchurch, New Zealand.

Mitchell Head (M)

Ngaati Mahuta, Waikato, New Zealand.
Ngaati Naho, Waikato, New Zealand.
Te Kotahi Research Institute, University of Waikato, Hamilton, New Zealand.

Libby Liggins (L)

Genomics Aotearoa, Dunedin, New Zealand.
School of Natural Sciences, Massey University, Palmerston North, New Zealand.

Dinindu Senanayake (D)

New Zealand eScience Infrastructure, Auckland, New Zealand.

Tammy E Steeves (TE)

Genomics Aotearoa, Dunedin, New Zealand.
School of Biological Sciences, University of Canterbury, Christchurch, New Zealand.

MeSH classifications