Research Article

Eksik Meteorolojik Verilerin Zaman Serisi ve Makine Öğrenmesi Yöntemleri Kullanılarak Tahmin Edilmesi: Batı Karadeniz Havzası Örneği

Year 2025, Volume: 15, Issue: 2, pp. 54-67, 21.07.2025

Abstract

The development of artificial intelligence applications continues at a rapid pace. Across a broad range of uses, these applications turn existing data into meaningful information, which raises the importance of data for every branch of artificial intelligence. The accuracy, continuity, and meaningfulness of data matter greatly for training and testing models. Data obtained from long-term physical measurements are likely to contain discontinuities or errors, and gaps are unavoidable in meteorological data, where external factors have a strong effect. This undermines the reliability of analyses built on meteorological data, which hold an important place in climate change and hydrological modelling. In this study, monthly precipitation measurements from the meteorological observation stations in the city centers of seven provinces in the Western Black Sea basin were examined. Gaps in the monthly precipitation data measured between 2000 and 2023 were estimated using time series, statistical, and machine learning approaches. ARIMA, SARIMA, ARIMAX, XGBoost, and mean imputation were used in the modelling. Performance metrics showed that SARIMA models, which take seasonal effects into account, gave more consistent results. The completed data provide a basis for advanced drought analyses, so that future drought analyses are affected as little as possible by deviations caused by data loss.
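
For reference, the seasonal model class named in the abstract is conventionally written in backshift notation as shown below. This is the standard textbook form of SARIMA$(p,d,q)(P,D,Q)_s$, not the specific orders fitted in the study, which the abstract does not report. The seasonal period $s = 12$ is an assumption that follows from the monthly resolution of the data.

```latex
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,y_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t,
\qquad s = 12
```

Here $B$ is the backshift operator ($B y_t = y_{t-1}$), $y_t$ is the monthly precipitation, $\varepsilon_t$ is white noise, $\phi_p$ and $\theta_q$ are the non-seasonal autoregressive and moving-average polynomials, and $\Phi_P$ and $\Theta_Q$ are their seasonal counterparts. Setting $P = D = Q = 0$ reduces the model to ARIMA$(p,d,q)$.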

Time Series and Machine Learning Methods for Estimating Missing Meteorological Data: The Western Black Sea Basin Case

Abstract

Artificial intelligence applications continue to develop rapidly and, across a wide range of uses, make it possible to extract meaning from existing data. This increases the importance of data for every subfield of artificial intelligence. The accuracy, continuity, and meaningfulness of data are essential for training and testing models. Data obtained from long-term physical measurements are likely to contain discontinuities or errors, and gaps are inevitable in meteorological data, which are strongly influenced by external factors. Such gaps reduce the reliability of analyses based on meteorological data, which play a significant role in climate change and hydrological modelling. In this study, monthly precipitation measurements from meteorological observation stations in the city centers of seven provinces in the Western Black Sea basin were examined. Gaps in the monthly precipitation data measured between 2000 and 2023 were estimated using time series, statistical, and machine learning approaches. ARIMA, SARIMA, ARIMAX, XGBoost, and mean imputation were used in the modelling. Performance metrics showed that SARIMA models, which account for seasonal effects, gave more consistent results. The completed data form a basis for advanced drought analyses, so that the effect of deviations caused by data loss on future drought analyses is minimized.
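
The mean imputation baseline and the metric-based comparison mentioned in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure. The abstract says only "mean imputation" and "performance metrics", so the calendar-month grouping, the choice of RMSE and MAE, the helper names (`monthly_mean_impute`, `rmse`, `mae`), and the synthetic series are all assumptions made for the example.

```python
import math


def monthly_mean_impute(values, start_month=1):
    """Fill None gaps with the mean of observed values from the same calendar month.

    Hypothetical helper: the calendar-month grouping is an assumption, not
    necessarily the mean imputation variant used in the study.
    """
    sums = [0.0] * 12
    counts = [0] * 12
    for i, v in enumerate(values):
        if v is not None:
            m = (start_month - 1 + i) % 12
            sums[m] += v
            counts[m] += 1
    if sum(counts) == 0:
        raise ValueError("series has no observed values")
    overall_mean = sum(sums) / sum(counts)  # fallback for a month with no data

    filled = []
    for i, v in enumerate(values):
        if v is None:
            m = (start_month - 1 + i) % 12
            filled.append(sums[m] / counts[m] if counts[m] else overall_mean)
        else:
            filled.append(v)
    return filled


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic monthly precipitation (mm) for three years; NOT data from the study.
base = [90.0, 80.0, 70.0, 50.0, 40.0, 30.0, 20.0, 25.0, 45.0, 70.0, 85.0, 95.0]
series = [b + 6.0 * year for year in range(3) for b in base]

# Hide two known values to imitate gaps, then score the reconstruction.
gaps = [13, 30]
masked = [None if i in gaps else v for i, v in enumerate(series)]
filled = monthly_mean_impute(masked)

actual = [series[i] for i in gaps]
predicted = [filled[i] for i in gaps]
print("RMSE:", rmse(actual, predicted))
print("MAE:", mae(actual, predicted))
```

Masking values that are actually known and scoring only those positions allows any other gap-filling method to be compared on the same footing by swapping out the imputation function.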

Details

Primary Language: English
Subjects: Civil Engineering (Other)
Section: Research Article
Authors

Yusuf Kaya 0000-0002-0923-2217

Berna Aksoy 0000-0001-6925-1594

İsmail Hakkı Özölçer 0000-0002-8404-0522

Publication Date: July 21, 2025
Submission Date: February 10, 2025
Acceptance Date: March 18, 2025
Published in Issue: Year 2025, Volume: 15, Issue: 2

Cite This Article

APA Kaya, Y., Aksoy, B., & Özölçer, İ. H. (2025). Time Series and Machine Learning Methods for Estimating Missing Meteorological Data: The Western Black Sea Basin Case. Karaelmas Fen ve Mühendislik Dergisi, 15(2), 54-67.
AMA Kaya Y, Aksoy B, Özölçer İH. Time Series and Machine Learning Methods for Estimating Missing Meteorological Data: The Western Black Sea Basin Case. Karaelmas Fen ve Mühendislik Dergisi. July 2025;15(2):54-67.
Chicago Kaya, Yusuf, Berna Aksoy, and İsmail Hakkı Özölçer. “Time Series and Machine Learning Methods for Estimating Missing Meteorological Data: The Western Black Sea Basin Case”. Karaelmas Fen ve Mühendislik Dergisi 15, no. 2 (July 2025): 54-67.
EndNote Kaya Y, Aksoy B, Özölçer İH (01 July 2025) Time Series and Machine Learning Methods for Estimating Missing Meteorological Data: The Western Black Sea Basin Case. Karaelmas Fen ve Mühendislik Dergisi 15 2 54–67.
IEEE Y. Kaya, B. Aksoy, and İ. H. Özölçer, “Time Series and Machine Learning Methods for Estimating Missing Meteorological Data: The Western Black Sea Basin Case”, Karaelmas Fen ve Mühendislik Dergisi, vol. 15, no. 2, pp. 54–67, 2025.
ISNAD Kaya, Yusuf et al. “Time Series and Machine Learning Methods for Estimating Missing Meteorological Data: The Western Black Sea Basin Case”. Karaelmas Fen ve Mühendislik Dergisi 15/2 (July 2025), 54-67.
JAMA Kaya Y, Aksoy B, Özölçer İH. Time Series and Machine Learning Methods for Estimating Missing Meteorological Data: The Western Black Sea Basin Case. Karaelmas Fen ve Mühendislik Dergisi. 2025;15:54–67.
MLA Kaya, Yusuf, et al. “Time Series and Machine Learning Methods for Estimating Missing Meteorological Data: The Western Black Sea Basin Case”. Karaelmas Fen ve Mühendislik Dergisi, vol. 15, no. 2, 2025, pp. 54-67.
Vancouver Kaya Y, Aksoy B, Özölçer İH. Time Series and Machine Learning Methods for Estimating Missing Meteorological Data: The Western Black Sea Basin Case. Karaelmas Fen ve Mühendislik Dergisi. 2025;15(2):54-67.