Effects of Using Different Indirect Techniques on the Calculation of Reference Intervals: Observational Study.
clinical
clinical decision-making
comparative study
complete blood count
data transformation
indirect method
laboratory
outliers
platelets
red blood cells
reference interval
white blood cells
Journal
Journal of Medical Internet Research
ISSN: 1438-8871
Abbreviated title: J Med Internet Res
Country: Canada
NLM ID: 100959882
Publication information
Publication date: 2023-07-17
History:
received: 2023-01-11
revised: 2023-05-28
accepted: 2023-06-14
medline: 2023-07-19
pubmed: 2023-07-17
entrez: 2023-07-17
Status: epublish
Abstract
Abstract sections
BACKGROUND
Reference intervals (RIs) play an important role in clinical decision-making. However, because establishing RIs by direct means is costly in time, labor, and money, indirect methods, which rely on big data previously obtained from clinical laboratories, are attracting increasing attention. Different indirect techniques, combined with different data transformation and outlier removal methods, might produce different RIs. However, few studies have evaluated this systematically.
OBJECTIVE
This study used data derived from direct methods as reference standards and evaluated the accuracy of combinations of different data transformation, outlier removal, and indirect techniques in establishing complete blood count (CBC) RIs for large-scale data.
METHODS
The CBC data of populations aged ≥18 years undergoing physical examination from January 2010 to December 2011 were retrieved from the First Affiliated Hospital of China Medical University in northern China. After excluding repeated individuals, we applied 5 indirect methods (parametric, nonparametric, Hoffmann, Bhattacharya, and truncation points and Kolmogorov-Smirnov distance [kosmic]), combined with log or Box-Cox transformation and with Reed-Dixon, Tukey, or iterative mean (3SD) outlier removal, to derive the RIs of 8 CBC parameters, and compared the results with RIs previously established by the direct method. Furthermore, bias ratios (BRs) were calculated to assess which combination of indirect technique, data transformation pattern, and outlier removal method is preferable.
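One of the combinations named above (log transformation, Tukey outlier removal, nonparametric percentiles) can be sketched as follows. This is a minimal illustration, not the authors' code. The function names are hypothetical, and the abstract does not give the BR formula: the sketch assumes a commonly used definition in which the difference between the indirect and direct limit is divided by the direct RI width over 3.92.

```python
import math
import statistics

def tukey_filter(values, k=1.5):
    """Keep values inside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def nonparametric_ri(values):
    """Return the 2.5th and 97.5th percentiles (central 95% interval)."""
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]

def indirect_ri(values, k=1.5):
    """Log-transform, drop Tukey outliers, take percentiles, back-transform."""
    logs = [math.log(v) for v in values]
    lower, upper = nonparametric_ri(tukey_filter(logs, k))
    return math.exp(lower), math.exp(upper)

def bias_ratios(indirect, direct):
    """Assumed BR definition (not stated in the abstract):
    (indirect limit - direct limit) / SD_RI,
    where SD_RI = (direct upper - direct lower) / 3.92."""
    sd_ri = (direct[1] - direct[0]) / 3.92
    return ((indirect[0] - direct[0]) / sd_ri,
            (indirect[1] - direct[1]) / sd_ri)

# Hypothetical limits, for illustration only (not values from the study).
lower_br, upper_br = bias_ratios(indirect=(4.3, 9.7), direct=(4.0, 10.0))
print(f"BR lower = {lower_br:.3f}, BR upper = {upper_br:.3f}")
```

A cutoff on |BR| then decides whether an indirect limit qualifies. The abstract does not state the cutoff used, so any threshold applied to this sketch is the reader's own assumption.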
RESULTS
Raw data showed that the skewness of the white blood cell (WBC) count, platelet (PLT) count, mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), and mean corpuscular volume (MCV) was much more pronounced than that of the other CBC parameters. After log or Box-Cox transformation combined with Tukey or iterative mean (3SD) processing, the distributions of these data were close to Gaussian. Tukey-based outlier removal removed the largest number of outliers. The lower-limit bias of WBC (male), PLT (male), hemoglobin (HGB; male), MCH (male/female), and MCV (female) was greater than that of the corresponding upper limit for more than half of the 30 indirect methods. The best-performing indirect methods for a given CBC parameter differed between males and females. The RIs of MCHC established by the direct method for females were narrow. For this parameter, the kosmic method was markedly superior, in contrast to the RI calculation of CBC parameters with high |BR| qualification rates for males. Among the top 10 methodologies for the WBC count, PLT count, HGB, MCV, and MCHC with a high |BR| qualification rate among males, the Bhattacharya, Hoffmann, and parametric methods were superior to the other 2 indirect methods.
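The effect of transformation on skewness, and the parametric RI that becomes usable once the data are close to Gaussian, can be illustrated with a short sketch. The helper names, the simulated lognormal data, and its parameters are assumptions for illustration. The study does not report which skewness statistic or Box-Cox parameter it used.

```python
import math
import random
import statistics

def skewness(values):
    """Moment-based sample skewness g1 = m3 / m2**1.5."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def box_cox(values, lam):
    """Box-Cox transform; lam == 0 is the log transform."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def parametric_ri_log(values):
    """Mean +/- 1.96 SD on the log scale, back-transformed."""
    logs = box_cox(values, 0)
    mean = statistics.fmean(logs)
    sd = statistics.stdev(logs)
    return math.exp(mean - 1.96 * sd), math.exp(mean + 1.96 * sd)

# Simulated right-skewed data (hypothetical, not CBC results from the study).
rng = random.Random(0)
sample = [rng.lognormvariate(1.8, 0.3) for _ in range(5000)]
print("skewness, raw:", round(skewness(sample), 3))
print("skewness, log:", round(skewness(box_cox(sample, 0)), 3))
print("parametric RI on log scale:", parametric_ri_log(sample))
```

For data that are only mildly skewed, the raw and transformed limits would be expected to lie close together, which is consistent with the reported negligible effect of transformation outside obviously skewed parameters.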
CONCLUSIONS
Compared to results derived by the direct method, outlier removal methods and indirect techniques markedly influence the final RIs, whereas data transformation has negligible effects, except for obviously skewed data. Specifically, the outlier removal efficiency of Tukey and iterative mean (3SD) methods is almost equivalent. Furthermore, the choice of indirect techniques depends more on the characteristics of the studied analyte itself. This study provides scientific evidence for clinical laboratories to use their previous data sets to establish RIs.
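The iterative mean (3SD) procedure mentioned above can be sketched as repeated trimming until no value lies more than 3 SD from the mean. This is one plausible reading of the method name, written as an assumption. The abstract does not specify the stopping rule, and `iterative_3sd` is a hypothetical helper.

```python
import statistics

def iterative_3sd(values, n_sd=3.0):
    """Repeatedly drop values farther than n_sd SDs from the mean
    until a pass removes nothing (assumed stopping rule)."""
    kept = list(values)
    while len(kept) > 2:
        mean = statistics.fmean(kept)
        sd = statistics.stdev(kept)
        trimmed = [v for v in kept if abs(v - mean) <= n_sd * sd]
        if len(trimmed) == len(kept):
            break
        kept = trimmed
    return kept

# Hypothetical data: 20 plausible values plus one gross outlier.
data = list(range(10, 30)) + [500]
cleaned = iterative_3sd(data)
print(len(data) - len(cleaned), "value(s) removed")
```

Running the same data through a Tukey fence and comparing the counts of removed values gives a simple way to check, on a laboratory's own data set, whether the two methods behave as similarly as reported here.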
Identifiers
pubmed: 37459170
pii: v25i1e45651
doi: 10.2196/45651
pmc: PMC10390978
Publication types
Observational Study
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e45651
Copyright information
©Dan Yang, Zihan Su, Runqing Mu, Yingying Diao, Xin Zhang, Yusi Liu, Shuo Wang, Xu Wang, Lei Zhao, Hongyi Wang, Min Zhao. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 17.07.2023.