Estimation of cut-off points under complex-sampling design data

  1. Amaia Iparragirre 1
  2. Irantzu Barrio 1
  3. Jorge Aramendi 2
  4. Inmaculada Arostegui 1
  1. 1 Universidad del País Vasco/Euskal Herriko Unibertsitatea, Lejona, Spain (ROR https://ror.org/000xsnr85)
  2. 2 Instituto Vasco de Estadística
Journal:
Sort: Statistics and Operations Research Transactions

ISSN: 1696-2281

Year of publication: 2022

Volume: 46

Issue: 1

Pages: 137-158

Type: Article

Abstract

In the context of logistic regression models, a cut-off point is usually selected to dichotomize the predicted probabilities estimated by the model. The techniques proposed in the literature to estimate optimal cut-off points are commonly developed for simple random samples, and their applicability to complex sampling designs could be limited. Therefore, in this work we propose a methodology to incorporate sampling weights into the estimation of optimal cut-off points, and we evaluate its performance using a simulation study based on real data. The results suggest that it is advisable to consider sampling weights when estimating optimal cut-off points.
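
To illustrate what incorporating sampling weights into cut-off estimation can mean in practice, the following is a minimal Python sketch and not the authors' implementation. The abstract does not say which optimality criterion is used, so the sketch assumes the Youden index (sensitivity + specificity - 1), with sensitivity and specificity computed as weighted proportions. The function name `weighted_youden_cutoff` and the toy data are hypothetical.

```python
def weighted_youden_cutoff(probs, labels, weights):
    """Return (cutoff, J) maximizing the weighted Youden index.

    Assumption: a unit is classified as positive when its predicted
    probability is >= the cut-off. Sensitivity and specificity are
    weighted proportions, so each unit counts as many times as its
    sampling weight.
    """
    w_pos = sum(w for y, w in zip(labels, weights) if y == 1)
    w_neg = sum(w for y, w in zip(labels, weights) if y == 0)
    if w_pos == 0 or w_neg == 0:
        raise ValueError("both classes need a positive total weight")

    best_cutoff, best_j = None, float("-inf")
    # Candidate cut-offs are the observed predicted probabilities.
    for c in sorted(set(probs)):
        tp = sum(w for p, y, w in zip(probs, labels, weights)
                 if y == 1 and p >= c)
        tn = sum(w for p, y, w in zip(probs, labels, weights)
                 if y == 0 and p < c)
        j = tp / w_pos + tn / w_neg - 1
        if j > best_j:  # ties keep the smallest cut-off
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Hypothetical toy data: predicted probabilities, observed outcomes,
# and sampling weights (one heavily weighted negative unit at 0.7).
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
weights = [1, 1, 1, 1, 1, 5, 1, 1]

unweighted = weighted_youden_cutoff(probs, labels, [1] * len(probs))
weighted = weighted_youden_cutoff(probs, labels, weights)
print("unweighted:", unweighted)
print("weighted:  ", weighted)
```

In this toy example the selected cut-off moves when the weights are taken into account, because the heavily weighted negative unit changes the weighted specificity. This shows only the mechanics of the weighting, not the performance reported in the article.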
