Empirical likelihood inference for area under the receiver operating characteristic curve using ranked set samples.

Keywords: AUC; Mann-Whitney statistic; diagnostic test; profile empirical likelihood; ranked set sampling

Journal

Pharmaceutical statistics
ISSN: 1539-1612
Abbreviated title: Pharm Stat
Country: England
NLM ID: 101201192

Publication information

Publication date:
11 2022
History:
received: 09 07 2021
revised: 17 03 2022
accepted: 02 05 2022
pubmed: 21 5 2022
medline: 18 11 2022
entrez: 20 5 2022
Status: ppublish

Abstract

The area under a receiver operating characteristic curve (AUC) is a useful tool to assess the performance of continuous-scale diagnostic tests on binary classification. In this article, we propose an empirical likelihood (EL) method to construct confidence intervals for the AUC from data collected by ranked set sampling (RSS). The proposed EL-based method enables inferences without assumptions required in existing nonparametric methods and takes advantage of the sampling efficiency of RSS. We show that for both balanced and unbalanced RSS, the EL-based point estimate is the Mann-Whitney statistic, and confidence intervals can be obtained from a scaled chi-square distribution. Simulation studies and two case studies on diabetes and chronic kidney disease data suggest that using the proposed method and RSS enables more efficient inference on the AUC.
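As a rough illustration of the point estimate named in the abstract, the sketch below computes the Mann-Whitney statistic from two samples drawn by balanced RSS. It is not the authors' implementation and does not construct the EL confidence interval. The function names `mann_whitney_auc` and `balanced_rss`, the normal populations, the set size, and the number of cycles are all hypothetical choices made for the example, and ranking is assumed to be perfect (units are ranked by their true values).

```python
import random


def mann_whitney_auc(diseased, healthy):
    """Mann-Whitney estimate of the AUC: the share of (diseased, healthy)
    pairs in which the diseased score is larger, with ties counted as 1/2."""
    total = 0.0
    for y in diseased:
        for x in healthy:
            if y > x:
                total += 1.0
            elif y == x:
                total += 0.5
    return total / (len(diseased) * len(healthy))


def balanced_rss(draw, set_size, cycles, rng):
    """Balanced ranked set sample under an assumed perfect ranking.

    In each cycle and for each rank r, draw `set_size` units, order them,
    and keep only the unit with rank r. Every rank is measured `cycles`
    times, so the sample has set_size * cycles measurements.
    """
    sample = []
    for _ in range(cycles):
        for r in range(set_size):
            units = sorted(draw(rng) for _ in range(set_size))
            sample.append(units[r])
    return sample


# Hypothetical populations: healthy ~ N(0, 1), diseased ~ N(1, 1).
rng = random.Random(0)
healthy = balanced_rss(lambda g: g.gauss(0.0, 1.0), set_size=3, cycles=10, rng=rng)
diseased = balanced_rss(lambda g: g.gauss(1.0, 1.0), set_size=3, cycles=10, rng=rng)

auc_hat = mann_whitney_auc(diseased, healthy)
print(f"Mann-Whitney AUC estimate from balanced RSS: {auc_hat:.3f}")
```

In practice the ranking within each set would rely on an inexpensive judgment or auxiliary variable rather than on the measured value itself, so the perfect-ranking step here is a simplification.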

Identifiers

pubmed: 35593451
doi: 10.1002/pst.2230

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

1219-1245

Copyright information

© 2022 John Wiley & Sons Ltd.

Authors

Chul Moon (C)

Department of Statistical Science, Southern Methodist University, Dallas, Texas, USA.

Xinlei Wang (X)

Department of Statistical Science, Southern Methodist University, Dallas, Texas, USA.

Johan Lim (J)

Department of Statistics, Seoul National University, Seoul, Republic of Korea.
