Research Article

Influence of residuals on Cook's distance for Beta regression model: Simulation and application

Year 2025, Volume: 54 Issue: 2, 618 - 632, 28.04.2025
https://doi.org/10.15672/hujms.1544513

Abstract

Cook's distance is a classic and widely used tool for detecting influential observations. In this article, we propose using Cook's distance with different residuals in the Beta regression model, which is appropriate when the response variable is a proportion. We examine the influence of outlying observations on the estimated parameters and on the mean squared error, and compare the performance of the residuals. Based on the simulation results and the empirical application, the deviance and weighted residuals perform better than the other residuals at detecting influential observations. The observations deleted on the basis of the deviance residuals have a large impact on the regression coefficients and on the mean squared error of the Beta regression model.
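
To illustrate how the choice of residual can change which observations are flagged, the following is a minimal sketch, assuming the common one-step approximation C_i = h_i r_i^2 / (p (1 - h_i)^2), where r_i is a standardized residual, h_i a leverage value and p the number of regression coefficients. The exact expression used in the article may differ. The function name `cooks_distance` and all numbers are hypothetical and chosen only for illustration; they are not taken from the article's simulation or data.

```python
def cooks_distance(residuals, leverages, n_params):
    """Approximate Cook's distance for each observation.

    Uses C_i = h_i * r_i**2 / (p * (1 - h_i)**2), an assumed one-step
    approximation; swapping the residual type changes r_i only.
    """
    if len(residuals) != len(leverages):
        raise ValueError("residuals and leverages must have the same length")
    distances = []
    for r, h in zip(residuals, leverages):
        if not 0.0 <= h < 1.0:
            raise ValueError("each leverage must lie in [0, 1)")
        distances.append(h * r ** 2 / (n_params * (1.0 - h) ** 2))
    return distances


# Hypothetical leverages and two hypothetical residual vectors for the
# same three observations (illustrative values, not from the article).
leverages = [0.10, 0.20, 0.50]
residuals_a = [0.5, -1.0, 2.0]   # e.g. one residual type
residuals_b = [0.4, -1.2, 1.5]   # e.g. another residual type

d_a = cooks_distance(residuals_a, leverages, n_params=2)
d_b = cooks_distance(residuals_b, leverages, n_params=2)

# Index of the observation with the largest distance under each residual.
most_influential_a = max(range(len(d_a)), key=d_a.__getitem__)
most_influential_b = max(range(len(d_b)), key=d_b.__getitem__)
```

With the same leverages, the two residual vectors give different distance values for every observation, which is why a comparison across residual types is informative even when the top-ranked observation is the same.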

References

  • [1] M. Amin, M. Amanullah, M. Aslam, and M. Qasim, Influence diagnostics in gamma ridge regression model, J. Stat. Comput. Simul. 89, 536-556, 2019.
  • [2] T. Anholeto, C.M. Sandoval, and D.A. Botter, Adjusted Pearson residuals in beta regression models, J. Stat. Comput. Simul. 84, 999-1014, 2014.
  • [3] A.C. Atkinson, Two graphical displays for outlying and influential observations in regression, Biometrika 68, 13-20, 1981.
  • [4] S. Chatterjee and A.S. Hadi, Influential observations, high leverage points, and outliers in linear regression, Stat. Sci., 379-393, 1986.
  • [5] S.W. Choi, The effect of outliers on regression analysis: regime type and foreign direct investment, Q. J. Polit. Sci. 4, 153-165, 2009.
  • [6] R.D. Cook and S. Weisberg, Residuals and influence in regression, New York: Chapman and Hall, 1982.
  • [7] R.D. Cook, Detection of influential observation in linear regression, Technometrics 42, 65-68, 2000.
  • [8] F. Cribari-Neto and A. Zeileis, Beta Regression in R, J. Stat. Softw. 34, 1-24, 2010.
  • [9] A. Davison, Residuals and diagnostics, Stat. Theory. Mod., 83, 1991.
  • [10] P.L. Espinheira, S.L.P. Ferrari, and F. Cribari-Neto, On beta regression residuals, J. Appl. Stat. 35, 407-419, 2008.
  • [11] P.L. Espinheira, S.L.P. Ferrari, and F. Cribari-Neto, Influence diagnostics in beta regression, Comput. Statist. Data Anal. 52, 4417-4431, 2008.
  • [12] P.L. Espinheira, L.C.M. da Silva, A.O. Silva, and R. Ospina, Model selection criteria on beta regression for machine learning, Mach. Learn. Knowl. Extr. 1, 26, 2019.
  • [13] C. Forbes, M. Evans, N. Hastings, and B. Peacock, Statistical distributions, John Wiley & Sons, 2011.
  • [14] S. Ferrari and F. Cribari-Neto, Beta regression for modelling rates and proportions, J. Appl. Stat. 31, 799-815, 2004.
  • [15] A.M. Garay, E.M. Hashimoto, E.M.M. Ortega, and V.H. Lachos, On estimation and influence diagnostics for zero-inflated negative binomial regression models, Comput. Statist. Data Anal. 55, 1304-1318, 2011.
  • [16] J.W. Hardin, J.M. Hilbe, and J. Hilbe, Generalized linear models and extensions, Stata Press, 2007.
  • [17] K. Venezuela, M.D.A. Botter, and M.C. Sandoval, Diagnostic techniques in generalized estimating equations, J. Stat. Comput. Simul. 77, 879-888, 2007.
  • [18] S. Liu, S.E. Ahmed, and L.Y. Ma, Influence diagnostics in the linear regression model with stochastic linear restrictions, Pak. J. Statist. 25, 647-662, 2009.
  • [19] P. McCullagh and J.A. Nelder, Generalized linear models, London: Chapman and Hall, 1989.
  • [20] K. Pammer and A. Kevan, The contribution of visual sensitivity, phonological processing, and nonverbal IQ to children’s reading, Sci. Stud. Read. 11, 33-53, 2007.
  • [21] G.H.A. Pereira, On quantile residuals in beta regression, Commun. Stat. Simul. Comput. 48, 302-316, 2019.
  • [22] D. Pregibon, Logistic regression diagnostics, Ann. Statist. 9, 705-724, 1981.
  • [23] J.S. Preisser and B.F. Qaqish, Deletion diagnostics for generalised estimating equations, Biometrika 83, 551-562, 1996.
  • [24] A.V. Rocha and A.B. Simas, Influence diagnostics in a general class of beta regression models, Test 20, 95-119, 2011.
  • [25] M. Smithson and J. Verkuilen, A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables, Psychol. Methods 11, 54, 2006.
  • [26] W. Thomas and R.D. Cook, Assessing influence on regression coefficients in generalized linear models, Biometrika 76, 741-749, 1989.
  • [27] M.A. Ullah and G.R. Pasha, The origin and developments of influence measures in regression, Pak. J. Stat. 25, 2009.
  • [28] D.A. Williams, Generalized linear model diagnostics using the deviance and single case deletions, J. R. Stat. Soc. Ser. C. Appl. Stat. 36, 181-191, 1987.
  • [29] F.C. Xie and B.C. Wei, Diagnostics analysis in censored generalized Poisson regression model, J. Stat. Comput. Simul. 77, 695-708, 2007.
  • [30] L. Xu, S.Y. Lee, and W.Y. Poon, Deletion measures for generalized linear mixed effects models, Comput. Statist. Data Anal. 51, 1131-1146, 2006.
  • [31] H.T. Zhu and S.Y. Lee, Local influence for incomplete data models, J. R. Stat. Soc. Ser. B. Stat. Methodol. 63, 111-126, 2001.
  • [32] M. Qasim, K. Månsson, and B.M.G. Kibria, On Some Beta Ridge Regression Estimators: Method, Simulation and Application, J. Stat. Comput. Simul. 91, 1699-1712, 2021.
  • [33] A.B. Simas, W. Barreto-Souza, and A.V. Rocha, Improved estimators for a general class of beta regression models, Comput. Statist. Data Anal. 54, 48-66, 2010.
  • [34] C.M. Hurvich and C.L. Tsai, Regression and time series model selection in small samples, Biometrika 76 (2), 297-307, 1989.
  • [35] F. Cribari-Neto, J. J. Santana-e-Silva, and K. L. P. Vasconcellos, Beta regression misspecification tests, J. Stat. Plan. Inference 233, 106193, 2024.
  • [36] J. A. Khan, A. Akbar, and B. M. G. Kibria, Behavior of Residuals in Cook’s Distance for Beta Ridge Regression Model (BRRM), Int. J. Appl. Math. Comput. Sci. Syst. Eng. 5, 202-208, 2023.
  • [37] N.H. Prater, Estimate gasoline yields from crudes, Pet. Refin. 35, 236-238, 1956.
  • [38] A.C. Atkinson, Plots, Transformations and Regression: An Introduction to Graphical Methods of Diagnostic Regression Analysis, New York: Oxford University Press, 1985.
  • [39] A.J. Lemonte, S.L. Ferrari, and F. Cribari-Neto, Improved likelihood inference in Birnbaum-Saunders regressions, Comput. Stat. Data Anal. 54, 1307-1316, 2010.
  • [40] S. Ferrari and F. Cribari-Neto, Beta regression for modelling rates and proportions, J. Appl. Stat. 31, 799-815, 2004.
  • [41] R. Ospina, P. L. Espinheira, L. A. Arias, C. M. Xavier, V. Leiva, and C. Castro, New Statistical Residuals for Regression Models in the Exponential Family: Characterization, Simulation, Computation, and Applications, Mathematics 12(20), 3196, 2024. https://doi.org/10.3390/math12203196.
There are 41 citations in total.

Details

Primary Language English
Subjects Soft Computing, Computational Statistics, Statistical Analysis, Applied Statistics, Statistics (Other)
Journal Section Statistics
Authors

Javaria Ahmad Khan 0000-0002-0733-2011

Atif Akbar 0000-0003-3227-9731

B M Golam Kibria 0000-0002-6073-1978

Early Pub Date January 30, 2025
Publication Date April 28, 2025
Submission Date September 6, 2024
Acceptance Date January 22, 2025
Published in Issue Year 2025 Volume: 54 Issue: 2

Cite

APA Ahmad Khan, J., Akbar, A., & Kibria, B. M. G. (2025). Influence of residuals on Cook’s distance for Beta regression model: Simulation and application. Hacettepe Journal of Mathematics and Statistics, 54(2), 618-632. https://doi.org/10.15672/hujms.1544513
AMA Ahmad Khan J, Akbar A, Kibria BMG. Influence of residuals on Cook’s distance for Beta regression model: Simulation and application. Hacettepe Journal of Mathematics and Statistics. April 2025;54(2):618-632. doi:10.15672/hujms.1544513
Chicago Ahmad Khan, Javaria, Atif Akbar, and B M Golam Kibria. “Influence of Residuals on Cook’s Distance for Beta Regression Model: Simulation and Application”. Hacettepe Journal of Mathematics and Statistics 54, no. 2 (April 2025): 618-32. https://doi.org/10.15672/hujms.1544513.
EndNote Ahmad Khan J, Akbar A, Kibria BMG (April 1, 2025) Influence of residuals on Cook’s distance for Beta regression model: Simulation and application. Hacettepe Journal of Mathematics and Statistics 54 2 618–632.
IEEE J. Ahmad Khan, A. Akbar, and B. M. G. Kibria, “Influence of residuals on Cook’s distance for Beta regression model: Simulation and application”, Hacettepe Journal of Mathematics and Statistics, vol. 54, no. 2, pp. 618–632, 2025, doi: 10.15672/hujms.1544513.
ISNAD Ahmad Khan, Javaria et al. “Influence of Residuals on Cook’s Distance for Beta Regression Model: Simulation and Application”. Hacettepe Journal of Mathematics and Statistics 54/2 (April 2025), 618-632. https://doi.org/10.15672/hujms.1544513.
JAMA Ahmad Khan J, Akbar A, Kibria BMG. Influence of residuals on Cook’s distance for Beta regression model: Simulation and application. Hacettepe Journal of Mathematics and Statistics. 2025;54:618–632.
MLA Ahmad Khan, Javaria et al. “Influence of Residuals on Cook’s Distance for Beta Regression Model: Simulation and Application”. Hacettepe Journal of Mathematics and Statistics, vol. 54, no. 2, 2025, pp. 618-32, doi:10.15672/hujms.1544513.
Vancouver Ahmad Khan J, Akbar A, Kibria BMG. Influence of residuals on Cook’s distance for Beta regression model: Simulation and application. Hacettepe Journal of Mathematics and Statistics. 2025;54(2):618-32.