Building more accurate decision trees with the additive tree.

CART; additive tree; decision tree; gradient boosting; interpretable machine learning

Journal

Proceedings of the National Academy of Sciences of the United States of America
ISSN: 1091-6490
Abbreviated title: Proc Natl Acad Sci U S A
Country: United States
NLM ID: 7505876

Publication information

Publication date:
2019-10-01
History:
pubmed: 2019-09-19
medline: 2020-04-09
entrez: 2019-09-19
Status: ppublish

Abstract

The expansion of machine learning to high-stakes application domains such as medicine, finance, and criminal justice, where making informed decisions requires clear understanding of the model, has increased the interest in interpretable machine learning. The widely used Classification and Regression Trees (CART) have played a major role in health sciences, due to their simple and intuitive explanation of predictions. Ensemble methods like gradient boosting can improve the accuracy of decision trees, but at the expense of the interpretability of the generated model. Additive models, such as those produced by gradient boosting, and full interaction models, such as CART, have been investigated largely in isolation. We show that these models exist along a spectrum, revealing previously unseen connections between these approaches. This paper introduces a rigorous formalization for the additive tree, an empirically validated learning technique for creating a single decision tree, and shows that this method can produce models equivalent to CART or gradient boosted stumps at the extremes by varying a single parameter. Although the additive tree is designed primarily to provide both the model interpretability and predictive performance needed for high-stakes applications like medicine, it also can produce decision trees represented by hybrid models between CART and boosted stumps that can outperform either of these approaches.
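The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the paper's formalization. It assumes squared-error regression and a hypothetical weight `lam` in [0, 1] (not necessarily the parameter used in the paper). At each node a stump is fitted to the current residuals, and samples that fall on the other side of the split keep a weight of `lam` in the child node. All function names here are illustrative.

```python
def fit_stump(X, r, w):
    """Weighted least-squares stump fitted to residuals r.
    Returns (feature, threshold, left_value, right_value), or None if no valid split."""
    best = None
    n = len(X)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X})[:-1]:   # candidate thresholds
            left = [i for i in range(n) if X[i][j] <= t]
            right = [i for i in range(n) if X[i][j] > t]
            wl = sum(w[i] for i in left)
            wr = sum(w[i] for i in right)
            if wl <= 0 or wr <= 0:
                continue                                # one side carries no weight
            cl = sum(w[i] * r[i] for i in left) / wl
            cr = sum(w[i] * r[i] for i in right) / wr
            sse = (sum(w[i] * (r[i] - cl) ** 2 for i in left)
                   + sum(w[i] * (r[i] - cr) ** 2 for i in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, cl, cr)
    return None if best is None else best[1:]


def grow(X, r, w, depth, lam):
    """Grow one tree. lam is the (assumed) weight kept by samples on the other
    side of a split: 0 discards them, 1 keeps them at full weight."""
    stump = fit_stump(X, r, w) if depth > 0 else None
    if stump is None:
        return None
    j, t, cl, cr = stump
    goes_left = [row[j] <= t for row in X]
    # Every sample's residual is updated by the stump, as in boosting.
    r_new = [ri - (cl if g else cr) for ri, g in zip(r, goes_left)]
    w_left = [wi * (1.0 if g else lam) for wi, g in zip(w, goes_left)]
    w_right = [wi * (lam if g else 1.0) for wi, g in zip(w, goes_left)]
    return {"j": j, "t": t, "cl": cl, "cr": cr,
            "left": grow(X, r_new, w_left, depth - 1, lam),
            "right": grow(X, r_new, w_right, depth - 1, lam)}


def fit_tree(X, y, depth, lam):
    """Start from the mean of y, then grow a tree on the residuals."""
    base = sum(y) / len(y)
    root = grow(X, [yi - base for yi in y], [1.0] * len(y), depth, lam)
    return base, root


def predict(model, X):
    """Sum the stump values along each sample's root-to-leaf path."""
    base, root = model
    preds = []
    for x in X:
        out, node = base, root
        while node is not None:
            if x[node["j"]] <= node["t"]:
                out, node = out + node["cl"], node["left"]
            else:
                out, node = out + node["cr"], node["right"]
        preds.append(out)
    return preds
```

In this sketch, `lam=0.0` makes each child see only its own region, so the path sums reduce to the leaf means of an ordinary depth-limited regression tree. With `lam=1.0` both children fit the same stump to the same residuals, so the tree collapses to a sum of `depth` stumps, which is boosting with a learning rate of 1. Intermediate values give hybrids between the two.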

Identifiers

pubmed: 31527280
pii: 1816748116
doi: 10.1073/pnas.1816748116
pmc: PMC6778203

Publication types

Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

19887-19893

Grants

Agency: NIBIB NIH HHS
ID: K08 EB026500
Country: United States

Comments and corrections

Type: CommentIn
Type: CommentIn

Copyright information

Copyright © 2019 the Author(s). Published by PNAS.

Conflict of interest statement

Conflict of interest statement: J.M.L., E.E., L.H.U., C.B.S., T.D.S., and G.V. have a patent titled “Systems and methods for generating improved decision trees,” pending status.

Authors

José Marcio Luna (JM)

Department of Radiation Oncology, University of Pennsylvania, Philadelphia, PA 19104; jose.luna@pennmedicine.upenn.edu

Efstathios D Gennatas (ED)

Department of Radiation Oncology, University of California, San Francisco, CA 94115.

Lyle H Ungar (LH)

Department of Computing and Information Science, University of Pennsylvania, Philadelphia, PA 19104.

Eric Eaton (E)

Department of Computing and Information Science, University of Pennsylvania, Philadelphia, PA 19104.

Eric S Diffenderfer (ES)

Department of Radiation Oncology, University of Pennsylvania, Philadelphia, PA 19104.

Shane T Jensen (ST)

Department of Statistics, University of Pennsylvania, Philadelphia, PA 19104.

Charles B Simone (CB)

Department of Radiation Oncology, New York Proton Center, New York, NY 10035.

Jerome H Friedman (JH)

Department of Statistics, Stanford University, Stanford, CA 94305; jhf@stanford.edu

Timothy D Solberg (TD)

Department of Radiation Oncology, University of California, San Francisco, CA 94115.

Gilmer Valdes (G)

Department of Radiation Oncology, University of California, San Francisco, CA 94115; gilmer.valdes@ucsf.edu
