Representation and quantification of module activity from omics data with rROMA.
Journal
NPJ systems biology and applications
ISSN: 2056-7189
Abbreviated title: NPJ Syst Biol Appl
Country: England
NLM ID: 101677786
Publication information
Publication date:
19 Jan 2024
History:
received: 31 May 2023
accepted: 03 Jan 2024
medline: 20 Jan 2024
pubmed: 20 Jan 2024
entrez: 19 Jan 2024
Status:
epublish
Abstract
The efficiency of analyzing high-throughput data in systems biology has been demonstrated in numerous studies, where molecular data, such as transcriptomics and proteomics, offer great opportunities for understanding the complexity of biological processes. One important aspect of data analysis in systems biology is the shift from a reductionist approach that focuses on individual components to a more integrative perspective that considers the system as a whole, with the emphasis moving from the differential expression of individual genes to the activity of gene sets. Here, we present the rROMA software package for fast and accurate computation of the activity of gene sets with coordinated expression. The rROMA package incorporates significant improvements in the calculation algorithm, along with the implementation of several functions for statistical analysis and visualization of results. These additions greatly expand the package's capabilities and offer valuable tools for data analysis and interpretation. The package is open source and available on GitHub at www.github.com/sysbio-curie/rROMA. Based on publicly available transcriptomic datasets, we applied rROMA to cystic fibrosis, highlighting biological mechanisms potentially involved in the establishment and progression of the disease and the associated genes. Results indicate that rROMA can detect disease-related active signaling pathways using transcriptomic and proteomic data. The results notably identified a significant mechanism relevant to cystic fibrosis, raised awareness of a possible bias related to cell culture, and uncovered an intriguing gene that warrants further investigation.
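The abstract does not spell out how the activity of a gene set is computed. As a rough illustration only, the sketch below assumes the general ROMA idea of summarizing a gene set by its first principal component and comparing the variance it explains against random gene sets of the same size. It is written in Python with NumPy, is not the rROMA API, and every name in it (`module_activity`, `n_null`, the synthetic data) is hypothetical.

```python
import numpy as np


def module_activity(expr, gene_idx, n_null=200, seed=0):
    """Illustrative PC1-based activity score (not the rROMA implementation).

    expr     : genes x samples expression matrix
    gene_idx : row indices of the genes in the set
    Returns the fraction of variance explained by PC1, the per-sample
    activity scores, and an empirical p-value against random gene sets.
    """
    rng = np.random.default_rng(seed)

    def pc1(sub):
        # Center each gene across samples, then take the leading singular triplet.
        centered = sub - sub.mean(axis=1, keepdims=True)
        _, s, vt = np.linalg.svd(centered, full_matrices=False)
        explained = s[0] ** 2 / np.sum(s ** 2)
        return explained, s[0] * vt[0]

    l1, scores = pc1(expr[gene_idx])

    # Null distribution: variance explained by PC1 in random gene sets of equal size.
    null = np.array([
        pc1(expr[rng.choice(expr.shape[0], size=len(gene_idx), replace=False)])[0]
        for _ in range(n_null)
    ])
    p_value = (1 + np.sum(null >= l1)) / (1 + n_null)
    return l1, scores, p_value


# Synthetic example: the first 20 of 500 genes share a common hidden factor.
data_rng = np.random.default_rng(1)
expr = data_rng.normal(size=(500, 30))
factor = data_rng.normal(size=30)
expr[:20] += 3.0 * factor

l1, scores, p_value = module_activity(expr, np.arange(20))
```

In this toy setting the coordinated gene set should explain far more variance along its first component than random sets do, and the per-sample scores should track the hidden factor up to sign, since the orientation of a principal component is arbitrary.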
Identifiers
pubmed: 38242871
doi: 10.1038/s41540-024-00331-x
pii: 10.1038/s41540-024-00331-x
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
8
Grants
Agency: Association Vaincre la Mucoviscidose (Vaincre la Mucoviscidos)
ID: 20190502488
Copyright information
© 2024. The Author(s).