Identifying stress responsive genes using overlapping communities in co-expression networks.
Co-expression network
LASSO
Oryza sativa
Overlapping communities
Phenotypic traits
Rice
Salinity
Stress-responsive genes
Journal
BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194
Publication information
Publication date:
07 Nov 2021
History:
received: 17 Dec 2020
accepted: 26 Oct 2021
entrez: 8 Nov 2021
pubmed: 9 Nov 2021
medline: 10 Nov 2021
Status:
epublish
Abstract
BACKGROUND
This paper proposes a workflow to identify genes that respond to specific treatments in plants. The workflow takes as input RNA sequencing read counts and phenotypic data of different genotypes, measured under control and treatment conditions. It outputs a reduced group of genes marked as relevant for treatment response. Technically, the proposed approach is both a generalization and an extension of WGCNA (weighted gene co-expression network analysis). It aims to identify specific modules of overlapping communities underlying the co-expression network of genes. Module detection is achieved using Hierarchical Link Clustering. Such modules can capture the overlapping nature of the regulatory domains of the systems that generate co-expression. LASSO regression is employed to analyze the phenotypic responses of modules to treatment.
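As an illustration of the network and module-detection steps described above, the following is a minimal sketch, not the authors' implementation. It assumes an unsigned hard-thresholded Pearson correlation network (the threshold and the use of absolute correlation are assumptions), and it computes the edge-to-edge Jaccard similarity that hierarchical link clustering then merges into overlapping communities. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(expr, threshold):
    """Link two genes when |r| reaches the threshold (assumed hard cut-off).

    expr maps a gene name to its expression profile across samples.
    """
    return {
        frozenset((g, h))
        for g, h in combinations(sorted(expr), 2)
        if abs(pearson(expr[g], expr[h])) >= threshold
    }


def neighbourhoods(edges):
    """Inclusive neighbourhood of each node: its neighbours plus itself."""
    nb = {}
    for edge in edges:
        a, b = tuple(edge)
        nb.setdefault(a, {a}).add(b)
        nb.setdefault(b, {b}).add(a)
    return nb


def link_similarity(e1, e2, nb):
    """Jaccard similarity of two edges that share exactly one node.

    Edges without a single shared node get similarity 0. Clustering edges
    (rather than nodes) by this score lets a gene belong to several modules.
    """
    shared = e1 & e2
    if len(shared) != 1:
        return 0.0
    (i,) = e1 - shared
    (j,) = e2 - shared
    return len(nb[i] & nb[j]) / len(nb[i] | nb[j])
```

For example, in a graph with edges a-b, a-c, b-c and c-d, the edges a-b and a-c score 0.75 while a-c and c-d score 0.25, so the triangle's links merge first and node c can end up in two communities.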
RESULTS
The workflow is applied to rice (Oryza sativa), a major food source known to be highly sensitive to salt stress. The workflow identifies 19 rice genes that appear relevant to the response to salt stress. They are distributed across 6 modules: 3 modules of 3 genes each are associated with shoot K content; 2 modules of 3 genes each are associated with shoot biomass; and 1 module of 4 genes is associated with root biomass. These genes are target genes for improving salinity tolerance in rice.
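The module-to-trait associations reported above come from LASSO regression, whose L1 penalty sets the coefficients of uninformative predictors exactly to zero. The following is a minimal sketch of that selection effect using cyclic coordinate descent. It assumes that each predictor column is a centred per-module summary and that the response is a centred phenotypic trait; these choices, the penalty value, and the function names are illustrative assumptions and not the authors' code.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    X is a list of rows (samples) with centred columns; y is centred.
    Returns the coefficient list b, one entry per column of X.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # y - X @ beta, kept up to date
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            # Correlation of column j with the residual that ignores column j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Toy data: the trait depends only on the first (hypothetical) module summary.
X = [[-1.5, 1.0], [-0.5, -1.0], [0.5, -1.0], [1.5, 1.0]]
y = [-3.0, -1.0, 1.0, 3.0]
beta = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
```

On this toy input only the first column is selected; its coefficient is shrunk from 2 to about 1.92 by the penalty, and the second coefficient is exactly zero.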
CONCLUSIONS
The paper introduces a more effective framework for reducing the search space of target genes that respond to a specific treatment. It facilitates experimental validation by restricting efforts to a smaller subset of genes of high potential relevance.
Identifiers
pubmed: 34743699
doi: 10.1186/s12859-021-04462-4
pii: 10.1186/s12859-021-04462-4
pmc: PMC8574028
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
541
Grants
Agency: World Bank Group
ID: FP44842-217-2018
Copyright information
© 2021. The Author(s).