Research Article

Robust multiple regression based on shrinkage $\mathbf{S}_{n}$ estimator

Year 2025, Volume: 54 Issue: 3, 1062-1093, 24.06.2025
https://doi.org/10.15672/hujms.1580654

Abstract

Regression analysis is used to model data statistically. However, both the fitted model and its interpretation are affected by outliers and significant points. Robust regression analysis offers an alternative. In this study, the parameters that define the linear regression problem are estimated using a robust approach based on the concept of shrinkage, which has previously been investigated for outlier detection in multivariate data. A comprehensive simulation analysis examines the breakdown value of the regression estimator, its affine equivariance, its robustness against contamination, and its efficiency under normal errors. The simulation results and real-world data examples demonstrate the advantages of the suggested robust estimator in regression. All simulations and analyses are carried out in R.
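
To make the notion of robustness against contamination concrete, the following minimal sketch compares ordinary least squares with a simple robust fit on a small contaminated data set. It is only an illustration under stated assumptions: the Theil-Sen estimator is used as a stand-in robust method and is not the shrinkage $\mathbf{S}_{n}$ estimator studied in the article; the code is in Python rather than the R software used by the authors; and the data and the function names `ols_fit` and `theil_sen_fit` are hypothetical.

```python
import numpy as np
from itertools import combinations


def ols_fit(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]


def theil_sen_fit(x, y):
    """Theil-Sen fit: median of pairwise slopes, then median residual offset.

    Stand-in robust estimator for illustration only; it is NOT the
    shrinkage S_n estimator proposed in the article.
    """
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    slope = np.median(slopes)
    intercept = np.median(y - slope * x)
    return intercept, slope


# Hypothetical data: exact line y = 1 + 2x, with 20% of the responses
# (the last 4 of 20 points) replaced by a gross outlying value.
x = np.arange(20, dtype=float)
y = 2.0 * x + 1.0
y[16:] = -50.0

ols_intercept, ols_slope = ols_fit(x, y)
ts_intercept, ts_slope = theil_sen_fit(x, y)

print("OLS:       intercept = %.3f, slope = %.3f" % (ols_intercept, ols_slope))
print("Theil-Sen: intercept = %.3f, slope = %.3f" % (ts_intercept, ts_slope))
```

In this constructed example the least squares slope is pulled far away from the true value of 2 by the four contaminated responses, while the median-based fit recovers the underlying line. A simulation study of the kind described in the abstract repeats such comparisons over many random samples and contamination levels.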


Details

Primary Language: English
Subjects: Computational Statistics, Statistical Analysis, Applied Statistics
Section: Statistics
Authors

Lakshmi R 0000-0001-8441-7329

Sajesh T A 0000-0002-9037-2013

Early View Date: May 16, 2025
Publication Date: June 24, 2025
Submission Date: November 7, 2024
Acceptance Date: April 22, 2025
Published in Issue: Year 2025, Volume: 54, Issue: 3

Cite

APA R, L., & T A, S. (2025). Robust multiple regression based on shrinkage $\mathbf{S}_{n}$ estimator. Hacettepe Journal of Mathematics and Statistics, 54(3), 1062-1093. https://doi.org/10.15672/hujms.1580654
AMA R L, T A S. Robust multiple regression based on shrinkage $\mathbf{S}_{n}$ estimator. Hacettepe Journal of Mathematics and Statistics. June 2025;54(3):1062-1093. doi:10.15672/hujms.1580654
Chicago R, Lakshmi, and Sajesh T A. “Robust Multiple Regression Based on Shrinkage $\mathbf{S}_{n}$ Estimator”. Hacettepe Journal of Mathematics and Statistics 54, no. 3 (June 2025): 1062-93. https://doi.org/10.15672/hujms.1580654.
EndNote R L, T A S (01 June 2025) Robust multiple regression based on shrinkage $\mathbf{S}_{n}$ estimator. Hacettepe Journal of Mathematics and Statistics 54 3 1062–1093.
IEEE L. R and S. T A, “Robust multiple regression based on shrinkage $\mathbf{S}_{n}$ estimator”, Hacettepe Journal of Mathematics and Statistics, vol. 54, no. 3, pp. 1062–1093, 2025, doi: 10.15672/hujms.1580654.
ISNAD R, Lakshmi - T A, Sajesh. “Robust Multiple Regression Based on Shrinkage $\mathbf{S}_{n}$ Estimator”. Hacettepe Journal of Mathematics and Statistics 54/3 (June 2025), 1062-1093. https://doi.org/10.15672/hujms.1580654.
JAMA R L, T A S. Robust multiple regression based on shrinkage $\mathbf{S}_{n}$ estimator. Hacettepe Journal of Mathematics and Statistics. 2025;54:1062–1093.
MLA R, Lakshmi and Sajesh T A. “Robust Multiple Regression Based on Shrinkage $\mathbf{S}_{n}$ Estimator”. Hacettepe Journal of Mathematics and Statistics, vol. 54, no. 3, 2025, pp. 1062-93, doi:10.15672/hujms.1580654.
Vancouver R L, T A S. Robust multiple regression based on shrinkage $\mathbf{S}_{n}$ estimator. Hacettepe Journal of Mathematics and Statistics. 2025;54(3):1062-93.