True verification probabilities should not be used in estimating the area under receiver operating characteristic curve.
area under a ROC curve
inverse probability weighting
two-phase design
verification bias
verification probability
Journal
Statistics in medicine
ISSN: 1097-0258
Abbreviated title: Stat Med
Country: England
NLM ID: 8215016
Publication information
Publication date:
30 11 2020
History:
received: 11 02 2020
revised: 03 05 2020
accepted: 21 06 2020
pubmed: 30 07 2020
medline: 22 06 2021
entrez: 30 07 2020
Status:
ppublish
Abstract
In medical research, a two-phase study is often used to estimate the area under the receiver operating characteristic curve (AUC) of a diagnostic test. However, such a design introduces verification bias. One method of correcting verification bias is inverse probability weighting (IPW). Since the probability that a subject is selected into phase 2 of the study for disease verification is known, either the true or the estimated verification probabilities can be used to form an IPW estimator of the AUC. In this article, we derive explicit variance formulas for both IPW AUC estimators and show that the estimator using the true verification probabilities, even when they are known, is less efficient than its counterpart using the estimated values. Our simulation results show that the efficiency loss can be substantial, especially when the variance of the test result in the diseased population is small relative to its counterpart in the nondiseased population.
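The abstract does not state the estimator itself, so the following is only a minimal sketch of one common form of an IPW AUC estimator: a weighted Mann-Whitney statistic over verified diseased/nondiseased pairs, with each verified subject weighted by the inverse of its verification probability. The function names `ipw_auc` and `estimated_verification_prob`, the tie-handling rule (ties count 0.5), and the assumption that verification depends only on a discrete stratum are illustrative choices, not taken from the article.

```python
import numpy as np


def estimated_verification_prob(strata, verified):
    """Estimate verification probabilities as the observed fraction of
    verified subjects within each stratum (assumes selection into
    phase 2 depends only on a discrete stratum; hypothetical helper)."""
    strata = np.asarray(strata)
    verified = np.asarray(verified, dtype=float)
    prob = np.empty(len(strata), dtype=float)
    for s in np.unique(strata):
        mask = strata == s
        prob[mask] = verified[mask].mean()
    return prob


def ipw_auc(test, disease, verified, prob):
    """Weighted Mann-Whitney estimate of the AUC (illustrative form).

    test     : test results for all phase-1 subjects
    disease  : disease status (1/0); only read where verified is true
    verified : 1 if the subject was selected into phase 2, else 0
    prob     : verification probability per subject (true or estimated)
    """
    verified = np.asarray(verified, dtype=bool)
    t = np.asarray(test, dtype=float)[verified]
    d = np.asarray(disease, dtype=float)[verified]
    w = 1.0 / np.asarray(prob, dtype=float)[verified]

    w_dis = w * d          # weights of verified diseased subjects
    w_non = w * (1.0 - d)  # weights of verified nondiseased subjects

    # kernel[i, j] = 1 if t_i > t_j, 0.5 if tied, 0 otherwise
    diff = t[:, None] - t[None, :]
    kernel = (diff > 0) + 0.5 * (diff == 0)

    # pair weights vanish on the diagonal because d * (1 - d) = 0
    pair_w = w_dis[:, None] * w_non[None, :]
    return float((pair_w * kernel).sum() / pair_w.sum())
```

Calling `ipw_auc` with the design probabilities gives the "true" version, and calling it with the output of `estimated_verification_prob` gives the "estimated" version that the abstract reports as more efficient. The sketch does not compute the variance formulas derived in the article.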
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
3937-3946
Copyright information
© 2020 John Wiley & Sons, Ltd.