Machine Learning of Plasma Proteomics Classifies Diagnosis of Interstitial Lung Disease.

Connective tissue disease associated interstitial lung disease; Differential diagnosis; Idiopathic pulmonary fibrosis; Machine learning model; Plasma proteomics

Journal

American journal of respiratory and critical care medicine
ISSN: 1535-4970
Abbreviated title: Am J Respir Crit Care Med
Country: United States
NLM ID: 9421642

Publication information

Publication date:
29 Feb 2024
History:
medline: 29 Feb 2024
pubmed: 29 Feb 2024
entrez: 29 Feb 2024
Status: aheadofprint

Abstract

Distinguishing connective tissue disease associated interstitial lung disease (CTD-ILD) from idiopathic pulmonary fibrosis (IPF) can be clinically challenging. The objective was to identify proteins that separate and classify CTD-ILD from IPF patients. Four registries with 1247 IPF and 352 CTD-ILD patients were included in the analyses. Plasma samples were subjected to high-throughput proteomics assays. Protein features were prioritized using Recursive Feature Elimination (RFE) to construct a proteomic classifier. Multiple machine learning models, including Support Vector Machine, LASSO regression, Random Forest (RF), and imbalanced-RF, were trained and tested in independent cohorts. The validated models were used to classify each case iteratively in external datasets. A classifier with 37 proteins (PC37) was enriched in the biological processes of bronchiole development, smooth muscle proliferation, and immune responses. Four machine learning models used PC37 together with sex and age to generate continuous classification scores. Receiver-operating-characteristic curve analyses of these scores demonstrated a consistent Area-Under-Curve of 0.85-0.90 in the test cohort and 0.94-0.96 in the single-sample dataset. Binary classification demonstrated 78.6%-80.4% sensitivity and 76%-84.4% specificity in the test cohort, and 93.5%-96.1% sensitivity and 69.5%-77.6% specificity in the single-sample classification dataset. Composite analysis of all machine learning models confirmed 78.2% (194/248) accuracy in the test cohort and 82.9% (208/251) in the single-sample classification dataset. Multiple machine learning models trained with large cohort proteomic datasets consistently distinguished CTD-ILD from IPF. The identified proteins are involved in immune pathways. We further developed a novel approach for single-sample classification, which could facilitate honing the differential diagnosis of ILD in challenging cases and improve clinical decision-making.
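The accuracy figures quoted in the abstract can be checked directly from the reported counts. The sketch below is only an illustration of the standard metric definitions, not the authors' analysis code; the function names are hypothetical, and treating one diagnosis as the "positive" class is an assumption, since the abstract does not say which class was used.

```python
def accuracy(correct: int, total: int) -> float:
    """Fraction of cases classified correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def sensitivity(true_pos: int, false_neg: int) -> float:
    """True positives divided by all actual positives."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """True negatives divided by all actual negatives."""
    return true_neg / (true_neg + false_pos)


# Counts taken from the abstract (composite analysis of all models).
test_cohort = accuracy(194, 248)
single_sample = accuracy(208, 251)

print(f"test cohort accuracy:   {test_cohort:.1%}")    # 78.2%
print(f"single-sample accuracy: {single_sample:.1%}")  # 82.9%
```

The abstract does not report per-class confusion counts, so `sensitivity` and `specificity` are shown only as definitions and are not applied to study data here.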

Identifiers

pubmed: 38422478
doi: 10.1164/rccm.202309-1692OC

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Grants

Agency: NHLBI NIH HHS
ID: R01 HL166290
Country: United States
Agency: NHLBI NIH HHS
ID: R01 HL169166
Country: United States

Authors

Yong Huang (Y)

University of Virginia School of Medicine, 12349, Charlottesville, Virginia, United States.

Shwu-Fan Ma (SF)

University of Virginia School of Medicine, 12349, Division of Pulmonary & Critical Care Medicine, Charlottesville, Virginia, United States.
University of Virginia.

Justin M Oldham (JM)

University of California Davis, 8789, Pulmonary and Critical Care Medicine, Davis, California, United States.

Ayodeji Adegunsoye (A)

University of Chicago, Section of Pulmonary and Critical Care, Dept. of Medicine, Chicago, Illinois, United States.

Daisy Zhu (D)

University of Virginia School of Medicine, 12349, Medicine, Charlottesville, Virginia, United States.

Susan Murray (S)

University of Michigan, 1259, Ann Arbor, Michigan, United States.

John S Kim (JS)

University of Virginia, 2358, Medicine, Charlottesville, Virginia, United States.
Charlottesville, Virginia, United States.

Catherine Bonham (C)

University of Virginia, 2358, Pulmonary & Critical Care Medicine, Charlottesville, Virginia, United States.

Emma Strickland (E)

University of Virginia, 2358, Charlottesville, Virginia, United States.

Angela L Linderholm (AL)

University of California Davis, Sacramento, California, United States.

Cathryn T Lee (CT)

The University of Chicago, 2462, Department of Medicine, Chicago, Illinois, United States.

Tessy Paul (T)

University of Virginia, 2358, Medicine, Charlottesville, Virginia, United States.

Hannah Mannem (H)

University of Virginia, Medicine, Charlottesville, Virginia, United States.

Toby M Maher (TM)

University of Southern California Keck School of Medicine, 12223, PCCSM, Los Angeles, California, United States.

Philip L Molyneaux (PL)

Imperial College London, National Heart and Lung Institute, London, United Kingdom of Great Britain and Northern Ireland.

Mary E Strek (ME)

University of Chicago Hosp, Department of Medicine, Chicago, Illinois, United States.

Fernando J Martinez (FJ)

Cornell Medical College, New York, New York, United States.

Imre Noth (I)

University of Virginia, Division of Pulmonary & Critical Care & Sleep Medicine, Department of Medicine, Charlottesville, Virginia, United States; IN2C@hscmail.mcc.virginia.edu.

MeSH classifications