A review of spline function procedures in R.


Journal

BMC medical research methodology
ISSN: 1471-2288
Titre abrégé: BMC Med Res Methodol
Pays: England
ID NLM: 100968545

Informations de publication

Date de publication:
06 03 2019
Historique:
received: 06 07 2018
accepted: 18 01 2019
entrez: 8 3 2019
pubmed: 8 3 2019
medline: 25 1 2020
Statut: epublish

Résumé

With progress on both the theoretical and the computational fronts the use of spline modelling has become an established tool in statistical regression analysis. An important issue in spline modelling is the availability of user friendly, well documented software packages. Following the idea of the STRengthening Analytical Thinking for Observational Studies initiative to provide users with guidance documents on the application of statistical methods in observational research, the aim of this article is to provide an overview of the most widely used spline-based techniques and their implementation in R. In this work, we focus on the R Language for Statistical Computing which has become a hugely popular statistics software. We identified a set of packages that include functions for spline modelling within a regression framework. Using simulated and real data we provide an introduction to spline modelling and an overview of the most popular spline functions. We present a series of simple scenarios of univariate data, where different basis functions are used to identify the correct functional form of an independent variable. Even in simple data, using routines from different packages would lead to different results. This work illustrate challenges that an analyst faces when working with data. Most differences can be attributed to the choice of hyper-parameters rather than the basis used. In fact an experienced user will know how to obtain a reasonable outcome, regardless of the type of spline used. However, many analysts do not have sufficient knowledge to use these powerful tools adequately and will need more guidance.

Sections du résumé

BACKGROUND
With progress on both the theoretical and the computational fronts the use of spline modelling has become an established tool in statistical regression analysis. An important issue in spline modelling is the availability of user friendly, well documented software packages. Following the idea of the STRengthening Analytical Thinking for Observational Studies initiative to provide users with guidance documents on the application of statistical methods in observational research, the aim of this article is to provide an overview of the most widely used spline-based techniques and their implementation in R.
METHODS
In this work, we focus on the R Language for Statistical Computing which has become a hugely popular statistics software. We identified a set of packages that include functions for spline modelling within a regression framework. Using simulated and real data we provide an introduction to spline modelling and an overview of the most popular spline functions.
RESULTS
We present a series of simple scenarios of univariate data, where different basis functions are used to identify the correct functional form of an independent variable. Even in simple data, using routines from different packages would lead to different results.
CONCLUSIONS
This work illustrate challenges that an analyst faces when working with data. Most differences can be attributed to the choice of hyper-parameters rather than the basis used. In fact an experienced user will know how to obtain a reasonable outcome, regardless of the type of spline used. However, many analysts do not have sufficient knowledge to use these powerful tools adequately and will need more guidance.

Identifiants

pubmed: 30841848
doi: 10.1186/s12874-019-0666-3
pii: 10.1186/s12874-019-0666-3
pmc: PMC6402144
doi:

Types de publication

Journal Article Review

Langues

eng

Sous-ensembles de citation

IM

Pagination

46

Références

Stat Med. 2007 Dec 30;26(30):5512-28
pubmed: 18058845
Am J Epidemiol. 2002 Aug 1;156(3):193-203
pubmed: 12142253
Stat Methods Med Res. 1995 Sep;4(3):187-96
pubmed: 8548102
Stat Med. 2014 Dec 30;33(30):5413-32
pubmed: 25074480
J Biopharm Stat. 2011 Nov;21(6):1177-86
pubmed: 22023685
Biom J. 2015 Jul;57(4):531-55
pubmed: 25501529
Stat Med. 1992 Jul;11(10):1305-19
pubmed: 1518992

Auteurs

Aris Perperoglou (A)

Department of Mathematical Sciences, University of Essex, Colchester, UK. aperpe@essex.ac.uk.

Willi Sauerbrei (W)

Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany.

Michal Abrahamowicz (M)

McGill University Health Centre, McGill University, Montreal, Canada.

Matthias Schmid (M)

Medical Biometry, Informatics and Epidemiology, Faculty of Medicine, University of Bonn, Bonn, Germany.

Articles similaires

[Redispensing of expensive oral anticancer medicines: a practical application].

Lisanne N van Merendonk, Kübra Akgöl, Bastiaan Nuijen
1.00
Humans Antineoplastic Agents Administration, Oral Drug Costs Counterfeit Drugs

Smoking Cessation and Incident Cardiovascular Disease.

Jun Hwan Cho, Seung Yong Shin, Hoseob Kim et al.
1.00
Humans Male Smoking Cessation Cardiovascular Diseases Female
Humans United States Aged Cross-Sectional Studies Medicare Part C
1.00
Humans Yoga Low Back Pain Female Male

Classifications MeSH