Influence of residuals on Cook's distance for Beta regression model: Simulation and application
Year 2025, Volume: 54 Issue: 2, 618 - 632, 28.04.2025
Javaria Ahmad Khan, Atif Akbar, B M Golam Kibria
Abstract
Cook's distance is a classic and widely used tool for detecting influential observations. In this article, we propose using Cook's distance with different residuals in the Beta regression model, which is appropriate when the response variable is a proportion. The influence of outlying observations on the estimated parameters and on the mean squared error is examined, and the performance of the residuals is compared. Based on the simulation results and the empirical application, the deviance and weighted residuals perform better than the other residuals at detecting influential observations. The observations deleted on the basis of the deviance residuals have a large impact on the regression coefficients and on the mean squared error of the Beta regression model.
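To make the idea of pairing Cook's distance with different residuals concrete, the following is a minimal sketch and not the authors' implementation. It assumes the commonly used approximation D_i = r_i^2 h_i / (p (1 - h_i)^2), where r_i is whichever residual is plugged in, h_i is the leverage of observation i, and p is the number of regression parameters. The function names, the residual and leverage values, and the cutoff of 1 are all hypothetical and chosen only for illustration.

```python
def cooks_distance(residuals, leverages, n_params):
    """Approximate Cook's distance, D_i = r_i^2 * h_i / (p * (1 - h_i)^2).

    `residuals` can hold any residual type (deviance, weighted, Pearson, ...);
    swapping the residual type changes the resulting distances.
    """
    if len(residuals) != len(leverages):
        raise ValueError("residuals and leverages must have the same length")
    distances = []
    for r, h in zip(residuals, leverages):
        if not 0.0 <= h < 1.0:
            raise ValueError("each leverage must lie in [0, 1)")
        distances.append(r * r * h / (n_params * (1.0 - h) ** 2))
    return distances


def flag_influential(distances, threshold):
    """Return the indices whose distance exceeds the chosen cutoff."""
    return [i for i, d in enumerate(distances) if d > threshold]


# Hypothetical values, not taken from the article.
leverages = [0.10, 0.20, 0.50, 0.20]
deviance_like = [0.5, -1.0, 2.0, 0.3]   # stand-in for one residual type
pearson_like = [0.4, -0.9, 0.9, 0.2]    # stand-in for another residual type

d_dev = cooks_distance(deviance_like, leverages, n_params=2)
d_pea = cooks_distance(pearson_like, leverages, n_params=2)

# The cutoff of 1.0 is an assumed rule of thumb, not the article's criterion.
print(flag_influential(d_dev, threshold=1.0))
print(flag_influential(d_pea, threshold=1.0))
```

With these made-up inputs, the same observation can be flagged under one residual type and not under another, which is the kind of difference the comparison in the abstract is concerned with.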