netDx: Software for building interpretable patient classifiers by multi-'omic data integration using patient similarity networks.
classification
data integration
genomics
networks
precision medicine
supervised learning
Journal
F1000Research
ISSN: 2046-1402
Abbreviated title: F1000Res
Country: England
NLM ID: 101594320
Publication information
Publication date:
2020
History:
accepted: 2020-09-28
entrez: 2021-02-25
pubmed: 2021-02-26
medline: 2021-04-29
Status:
epublish
Abstract
Patient classification based on clinical and genomic data will further the goal of precision medicine. Interpretability is of particular relevance for models based on genomic data, where sample sizes are relatively small (in the hundreds), increasing overfitting risk. netDx is a machine learning method to integrate multi-modal patient data and build a patient classifier. Patient data are converted into networks of patient similarity, a representation that is intuitive to clinicians, who also use patient similarity for medical diagnosis. Features passing selection are integrated, and new patients are assigned to the class with the greatest profile similarity. netDx has excellent performance, outperforming most machine-learning methods in binary cancer survival prediction. It handles missing data, a common problem in real-world data, without requiring imputation. netDx also has excellent interpretability, with native support to group genes into pathways for mechanistic insight into predictive features. The netDx Bioconductor package provides multiple workflows for users to build custom patient classifiers. It provides turnkey functions for one-step predictor generation from multi-modal data, including feature selection over multiple train/test data splits. Workflows offer versatility through custom feature design and choice of similarity metric; speed is improved by parallel execution. Built-in functions and examples allow users to compute model performance metrics such as AUROC, AUPR, and accuracy. netDx uses RCy3 to visualize top-scoring pathways and the final integrated patient network in Cytoscape. Advanced users can build more complex predictor designs with the functional building blocks used in the default design. Finally, the netDx Bioconductor package provides a novel workflow for pathway-based patient classification from sparse genetic data.
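The core idea in the abstract, converting patient profiles into a similarity network and assigning a new patient to the class with the greatest profile similarity, can be illustrated with a minimal Python sketch. This is not the netDx implementation or its API (netDx is a Bioconductor package): the function names, the toy data, the choice of Pearson correlation as the similarity metric, and the use of mean similarity as the class score are all assumptions made for illustration. Feature selection, network integration, and missing-data handling are omitted.

```python
import numpy as np

def pearson_similarity(X):
    """Patient-by-patient Pearson correlation; each row of X is one patient."""
    return np.corrcoef(X)

def classify_by_similarity(train_X, train_y, new_x):
    """Assign new_x to the class whose training patients are most similar on average.

    Hypothetical helper for illustration only, not a netDx function.
    """
    train_y = np.asarray(train_y)
    # Append the new patient as the last row, then read off its similarities
    # to every training patient from the last row of the similarity matrix.
    stacked = np.vstack([train_X, new_x])
    sim_to_train = pearson_similarity(stacked)[-1, :-1]
    classes = sorted(set(train_y.tolist()))
    scores = {c: float(sim_to_train[train_y == c].mean()) for c in classes}
    return max(scores, key=scores.get), scores

# Toy data (invented): four patients, four measurements each.
train_X = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 6.0],
    [4.0, 3.0, 2.0, 1.0],
    [6.0, 4.0, 3.0, 1.0],
])
train_y = ["good", "good", "poor", "poor"]
new_patient = np.array([1.0, 2.0, 4.0, 5.0])

label, scores = classify_by_similarity(train_X, train_y, new_patient)
print(label, scores)
```

In this toy case the new patient's profile rises across the four measurements, as do the "good" profiles, so its mean similarity to that class is higher than to the "poor" class.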
Identifiers
pubmed: 33628435
doi: 10.12688/f1000research.26429.1
pmc: PMC7883323
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1239
Grants
Agency: NIMH NIH HHS
ID: R01 MH085542
Country: United States
Agency: NIMH NIH HHS
ID: R01 MH093725
Country: United States
Agency: NIMH NIH HHS
ID: P50 MH066392
Country: United States
Agency: NIMH NIH HHS
ID: R01 MH097276
Country: United States
Agency: NIMH NIH HHS
ID: R01 MH075916
Country: United States
Agency: NIMH NIH HHS
ID: P50 MH096891
Country: United States
Agency: NIMH NIH HHS
ID: P50 MH084053
Country: United States
Agency: NIMH NIH HHS
ID: R37 MH057881
Country: United States
Agency: NIMH NIH HHS
ID: R01 MH110921
Country: United States
Agency: NIMH NIH HHS
ID: R01 MH109677
Country: United States
Agency: NIMH NIH HHS
ID: R01 MH109897
Country: United States
Agency: NIMH NIH HHS
ID: U01 MH103392
Country: United States
Agency: NIDA NIH HHS
ID: HHSN271201300031C
Country: United States
Agency: NIGMS NIH HHS
ID: P41 GM103504
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG009979
Country: United States
Copyright information
Copyright: © 2020 Pai S et al.
Conflict of interest statement
No competing interests were disclosed.