Systematic Error Removal Using Random Forest for Normalizing Large-Scale Untargeted Lipidomics Data.


Journal

Analytical chemistry
ISSN: 1520-6882
Abbreviated title: Anal Chem
Country: United States
NLM ID: 0370536

Publication information

Publication date:
05 03 2019
History:
pubmed: 14 2 2019
medline: 4 9 2020
entrez: 14 2 2019
Status: ppublish

Abstract

Large-scale untargeted lipidomics experiments involve the measurement of hundreds to thousands of samples. Such data sets are usually acquired on one instrument over days or weeks of analysis time. Such extensive data acquisition processes introduce a variety of systematic errors, including batch differences, longitudinal drifts, or even instrument-to-instrument variation. Technical data variance can obscure the true biological signal and hinder biological discoveries. To combat this issue, we present a novel normalization approach based on using quality control pool samples (QC). This method is called systematic error removal using random forest (SERRF) for eliminating the unwanted systematic variations in large sample sets. We compared SERRF with 15 other commonly used normalization methods using six lipidomics data sets from three large cohort studies (832, 1162, and 2696 samples). SERRF reduced the average technical errors for these data sets to 5% relative standard deviation. We conclude that SERRF outperforms other existing methods and can significantly reduce the unwanted systematic variation, revealing biological variance of interest.
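The abstract reports technical error as relative standard deviation (RSD) across quality control pool samples. As a minimal sketch of how that metric is typically computed (not the SERRF algorithm itself), the Python below calculates the RSD per lipid as the sample standard deviation divided by the mean, then averages over lipids. The function names, the dictionary layout, and the intensity values are hypothetical and chosen only for illustration; they do not come from the study.

```python
from statistics import mean, stdev


def relative_standard_deviation(intensities):
    """Return the RSD (in percent) of one lipid's intensities across QC injections."""
    if len(intensities) < 2:
        raise ValueError("need at least two QC measurements")
    avg = mean(intensities)
    if avg == 0:
        raise ValueError("mean intensity is zero; RSD is undefined")
    # Sample standard deviation (n - 1 denominator) relative to the mean.
    return 100.0 * stdev(intensities) / avg


def average_qc_rsd(qc_table):
    """Average the per-lipid RSD over a {lipid_name: [intensities]} mapping."""
    return mean(relative_standard_deviation(v) for v in qc_table.values())


# Hypothetical QC pool intensities (made-up numbers, not data from the study).
qc_before = {
    "lipid_A": [100.0, 120.0, 80.0, 100.0],
    "lipid_B": [50.0, 55.0, 45.0, 50.0],
}

print(f"average QC RSD: {average_qc_rsd(qc_before):.2f}%")
```

Running the same function on QC intensities before and after normalization gives a simple way to compare methods: a lower average QC RSD indicates less residual technical variation, since the QC pool samples are nominally identical.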

Identifiers

pubmed: 30758187
doi: 10.1021/acs.analchem.8b05592
pmc: PMC9652764
mid: NIHMS1848076

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

3590-3596

Grants

Agency: NHLBI NIH HHS
ID: P20 HL113452
Country: United States
Agency: NHLBI NIH HHS
ID: R01 HL091357
Country: United States
Agency: NHLBI NIH HHS
ID: U01 HL072524
Country: United States
Agency: NIEHS NIH HHS
ID: U2C ES030158
Country: United States

Authors

Sili Fan (S)

West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California 95616, United States.

Tobias Kind (T)

West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California 95616, United States.

Tomas Cajka (T)

West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California 95616, United States.
Department of Metabolomics, Institute of Physiology CAS, Videnska 1083, 14220 Prague, Czech Republic.

Rima Kaddurah-Daouk (R)

Department of Psychiatry and Behavioral Sciences, Department of Medicine and the Duke Institute for Brain Sciences, Duke University, Durham, North Carolina 27708, United States.

Marguerite R Irvin (MR)

Department of Epidemiology, University of Alabama at Birmingham, 1720 Second Avenue South, Birmingham, Alabama 35294, United States.

Donna K Arnett (DK)

College of Public Health, University of Kentucky, 121 Washington Avenue, Lexington, Kentucky 40508, United States.

Dinesh K Barupal (DK)

West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California 95616, United States.

Oliver Fiehn (O)

West Coast Metabolomics Center, UC Davis Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California 95616, United States.
