G2PDeep: a web-based deep-learning framework for quantitative phenotype prediction and discovery of genomic markers.


Journal

Nucleic acids research
ISSN: 1362-4962
Abbreviated title: Nucleic Acids Res
Country: England
NLM ID: 0411011

Publication information

Publication date:
02 07 2021
History:
accepted: 03 05 2021
revised: 28 04 2021
received: 02 03 2021
pubmed: 27 5 2021
medline: 15 7 2021
entrez: 26 5 2021
Status: ppublish

Abstract

G2PDeep is an open-access web server that provides a deep-learning framework for quantitative phenotype prediction and discovery of genomic markers. It takes zygosity or single nucleotide polymorphism (SNP) information from plants and animals as input and predicts the quantitative phenotype of interest together with the genomic markers associated with that phenotype. It offers a one-stop platform where researchers can create deep-learning models through an interactive web interface and train them on uploaded data, using high-performance computing resources at the backend. G2PDeep also provides a series of informative interfaces for monitoring the training process and comparing performance among the trained models. The trained models can then be deployed automatically. Quantitative phenotypes and genomic markers are predicted with a user-selected trained model, and the results are visualized. Our state-of-the-art model has been benchmarked by other researchers and has demonstrated competitive performance in quantitative phenotype prediction. In addition, the server integrates the soybean nested association mapping (SoyNAM) dataset with five phenotypes: grain yield, height, moisture, oil, and protein. A publicly available dataset for seed protein and oil content has also been integrated into the server. The G2PDeep server is publicly available at http://g2pdeep.org. The Python-based deep-learning model is available at https://github.com/shuaizengMU/G2PDeep_model.
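
The abstract describes feeding zygosity or SNP information to a deep-learning model and comparing trained models by their performance. As a rough illustration only, the sketch below shows one common way such input could be prepared and scored. The genotype coding (0, 1, 2, with -1 for missing), the one-hot layout, the function names, and the use of Pearson correlation as the comparison metric are all assumptions made for this example; none of them is taken from the G2PDeep server or its repository.

```python
import math

# Hypothetical genotype coding (an assumption for this sketch):
# 0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate,
# -1 = missing call.
VALID_CALLS = (-1, 0, 1, 2)


def one_hot_zygosity(genotypes):
    """Turn a samples x SNPs matrix of calls into nested one-hot vectors.

    Each call becomes a length-3 vector; a missing call (-1) becomes
    all zeros so that it carries no signal.
    """
    encoded = []
    for sample in genotypes:
        row = []
        for call in sample:
            if call not in VALID_CALLS:
                raise ValueError(f"unexpected genotype code: {call!r}")
            vector = [0.0, 0.0, 0.0]
            if call >= 0:
                vector[call] = 1.0
            row.append(vector)
        encoded.append(row)
    return encoded


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted phenotype values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_p)


# Two samples, three SNPs; the second sample has one missing call.
encoded = one_hot_zygosity([[0, 1, 2], [2, -1, 0]])
```

An encoded matrix of this shape (samples x SNPs x 3) is the kind of tensor a convolutional or dense network could consume, and a correlation between observed and predicted phenotype values is one plausible way to rank trained models against each other.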

Identifiers

pubmed: 34037802
pii: 6284176
doi: 10.1093/nar/gkab407
pmc: PMC8262736

Chemical substances

Genetic Markers 0

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

W228-W236

Grants

Agency: NIGMS NIH HHS
ID: R35 GM126985
Country: United States

Copyright information

© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.

Authors

Shuai Zeng (S)

Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA.
Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.

Ziting Mao (Z)

Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA.
Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.

Yijie Ren (Y)

Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA.
Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.

Duolin Wang (D)

Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA.
Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.

Dong Xu (D)

Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA.
Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.
MU Institute for Data Science and Informatics, University of Missouri, Columbia, MO 65211, USA.

Trupti Joshi (T)

Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA.
Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.
MU Institute for Data Science and Informatics, University of Missouri, Columbia, MO 65211, USA.
Department of Health Management and Informatics, University of Missouri, Columbia, MO 65211, USA.
