Empirical likelihood inference for area under the receiver operating characteristic curve using ranked set samples.

Humans; ROC Curve; Area Under Curve; Likelihood Functions; Computer Simulation

AUC; Mann-Whitney statistic; diagnostic test; profile empirical likelihood; ranked set sampling

Journal

Pharmaceutical statistics

ISSN: 1539-1612

Abbreviated title: Pharm Stat

Country: England

NLM ID: 101201192

Publication information

Publication date:
11 2022

History:

received: 09 07 2021

revised: 17 03 2022

accepted: 02 05 2022

pubmed: 21 5 2022

medline: 18 11 2022

entrez: 20 5 2022

Status: ppublish

Abstract

The area under a receiver operating characteristic curve (AUC) is a useful tool to assess the performance of continuous-scale diagnostic tests on binary classification. In this article, we propose an empirical likelihood (EL) method to construct confidence intervals for the AUC from data collected by ranked set sampling (RSS). The proposed EL-based method enables inferences without assumptions required in existing nonparametric methods and takes advantage of the sampling efficiency of RSS. We show that for both balanced and unbalanced RSS, the EL-based point estimate is the Mann-Whitney statistic, and confidence intervals can be obtained from a scaled chi-square distribution. Simulation studies and two case studies on diabetes and chronic kidney disease data suggest that using the proposed method and RSS enables more efficient inference on the AUC.
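The abstract states that the EL-based point estimate of the AUC is the Mann-Whitney statistic computed from ranked set samples. The sketch below illustrates only that point estimate, not the confidence interval, whose scaling constant is not given in this record. It assumes balanced RSS with perfect ranking and normally distributed test scores; the function names `balanced_rss` and `mann_whitney_auc`, the set size, the number of cycles, and the distributions are all illustrative choices, not taken from the article.

```python
import random


def balanced_rss(draw, set_size, cycles, rng):
    """Balanced ranked set sample, assuming perfect ranking (illustrative).

    In each cycle, for every rank r = 1..set_size, draw a fresh set of
    set_size units, rank them, and keep only the r-th smallest unit.
    """
    sample = []
    for _ in range(cycles):
        for r in range(set_size):
            ranked = sorted(draw(rng) for _ in range(set_size))
            sample.append(ranked[r])
    return sample


def mann_whitney_auc(diseased, healthy):
    """Mann-Whitney estimate of AUC = P(Y > X), counting ties as 1/2."""
    total = 0.0
    for y in diseased:
        for x in healthy:
            if y > x:
                total += 1.0
            elif y == x:
                total += 0.5
    return total / (len(diseased) * len(healthy))


# Hypothetical populations: healthy ~ N(0, 1), diseased ~ N(1, 1).
rng = random.Random(0)
healthy = balanced_rss(lambda r: r.gauss(0.0, 1.0), set_size=3, cycles=10, rng=rng)
diseased = balanced_rss(lambda r: r.gauss(1.0, 1.0), set_size=3, cycles=10, rng=rng)

auc_hat = mann_whitney_auc(diseased, healthy)
print(f"Mann-Whitney AUC estimate from RSS: {auc_hat:.3f}")
```

Each measured unit costs `set_size` ranked draws, which is where the sampling design differs from simple random sampling: ranking is assumed to be cheap relative to measurement.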

Identifiers

DOI: 10.1002/pst.2230; PMID: 35593451

Publication types

Journal Article

Languages

eng

Pagination

1219-1245

Authors

Chul Moon

Xinlei Wang

Johan Lim
