Extending Classification Algorithms to Case-Control Studies.

Keywords: Diabetes; biomarker discovery; machine learning; support vector machines; variable selection

Journal

Biomedical engineering and computational biology
ISSN: 1179-5972
Abbreviated title: Biomed Eng Comput Biol
Country: United States
NLM ID: 101633089

Publication information

Publication date:
2019
History:
received: 2019-01-21
accepted: 2019-04-26
entrez: 2019-07-20
pubmed: 2019-07-20
medline: 2019-07-20
Status: epublish

Abstract

Classification is a common technique applied to 'omics data to build predictive models and identify potential markers of biomedical outcomes. Despite the prevalence of case-control studies, very few classification methods are available to analyze the data they generate. Conditional logistic regression is the most commonly used technique, but its modeling assumptions limit its ability to identify a large class of sufficiently complicated 'omic signatures. We propose a data preprocessing step that makes any linear or nonlinear classification algorithm, even those not typically appropriate for matched-design data, usable for modeling case-control data and identifying relevant biomarkers in these study designs. We demonstrate on simulated case-control data that both the classification accuracy and the variable selection accuracy of each method improve after applying this preprocessing step, and that the proposed methods are comparable to or outperform existing variable selection methods. Finally, we demonstrate the impact of conditional classification algorithms on a large cohort study of children with islet autoimmunity.

Identifiers

pubmed: 31320812
doi: 10.1177/1179597219858954
pii: 10.1177_1179597219858954
pmc: PMC6630079

Publication types

Journal Article

Languages

eng

Pagination

1179597219858954

Grants

Agency: NIDDK NIH HHS
ID: U01 DK063821
Country: United States
Agency: NIDDK NIH HHS
ID: UC4 DK063863
Country: United States
Agency: NIDDK NIH HHS
ID: U01 DK063861
Country: United States
Agency: NIDDK NIH HHS
ID: U01 DK063790
Country: United States
Agency: NCATS NIH HHS
ID: UL1 TR001082
Country: United States
Agency: NCATS NIH HHS
ID: UL1 TR000064
Country: United States
Agency: NLM NIH HHS
ID: HHSN267200700014C
Country: United States
Agency: NIDDK NIH HHS
ID: U01 DK063836
Country: United States
Agency: NIDDK NIH HHS
ID: U01 DK063829
Country: United States
Agency: NIDDK NIH HHS
ID: U01 DK063865
Country: United States
Agency: NIDDK NIH HHS
ID: UC4 DK095300
Country: United States
Agency: NIDDK NIH HHS
ID: UC4 DK063861
Country: United States
Agency: NIDDK NIH HHS
ID: UC4 DK063829
Country: United States
Agency: NIDDK NIH HHS
ID: UC4 DK063821
Country: United States
Agency: NIDDK NIH HHS
ID: UC4 DK117483
Country: United States
Agency: NIDDK NIH HHS
ID: UC4 DK063836
Country: United States
Agency: NIDDK NIH HHS
ID: UC4 DK112243
Country: United States
Agency: NIDDK NIH HHS
ID: UC4 DK063865
Country: United States
Agency: NIDDK NIH HHS
ID: U01 DK063863
Country: United States
Agency: NIDDK NIH HHS
ID: UC4 DK106955
Country: United States
Agency: NIDDK NIH HHS
ID: UC4 DK100238
Country: United States

Conflict of interest statement

Declaration of conflicting interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Authors

Bryan Stanfill (B)

Computing and Analytics Division, National Security Directorate, Pacific Northwest National Laboratory, Richland, WA, USA.

Sarah Reehl (S)

Computing and Analytics Division, National Security Directorate, Pacific Northwest National Laboratory, Richland, WA, USA.

Lisa Bramer (L)

Computing and Analytics Division, National Security Directorate, Pacific Northwest National Laboratory, Richland, WA, USA.

Ernesto S Nakayasu (ES)

Biological Sciences Division, Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, USA.

Stephen S Rich (SS)

Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA.

Thomas O Metz (TO)

Biological Sciences Division, Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, USA.

Marian Rewers (M)

Barbara Davis Center for Childhood Diabetes, University of Colorado Denver, Aurora, CO, USA.

Bobbie-Jo Webb-Robertson (BJ)

Biological Sciences Division, Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, USA.

MeSH classifications