Research Article

YAPAY ZEKA VE ÇEVİRİ ETİĞİ: DOĞAL DİL İŞLEME MODELLERİNDEKİ ÖNYARGI PROBLEMLERİ VE ÇÖZÜM ÖNERİLERİ

Year 2025, Issue: 68, 241 - 262, 12.05.2025
https://doi.org/10.30794/pausbed.1611253

Abstract

This study investigates how social biases in natural language processing (NLP) models can be reduced through data diversity and data augmentation methods. In particular, translation performance and bias rates for less widely used languages are analyzed using data diversity and Generative Adversarial Network (GAN)-based data augmentation techniques. Experiments conducted with popular translation tools such as Google Translate and DeepL showed that data diversity significantly increases translation accuracy and raises BLEU scores by 40.8%. In addition, gender-based biases, especially those concerning professions, were found to decrease by 33-41%. The effect of data diversity and data augmentation on translation quality was statistically confirmed through regression analysis. The study shows that data diversity supports not only translation accuracy but also ethical neutrality.


KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE


Abstract

This study examines how social biases in natural language processing (NLP) models can be reduced through data diversity and data augmentation techniques. In particular, translation performance and bias rates in less widely used languages are analyzed using data diversity and GAN-based data augmentation. Experiments with common translation tools such as Google Translate and DeepL show that more diverse datasets substantially improve translation accuracy and raise BLEU scores by up to 40.8%. In addition, gender-specific biases, particularly with regard to occupations, were reduced by 33-41%. A regression analysis statistically confirmed the effects of data diversity and data augmentation on translation quality and on bias reduction. The results show that greater data diversity in low-resource languages not only improves translation accuracy but also promotes ethical neutrality.


ARTIFICIAL INTELLIGENCE AND TRANSLATION ETHICS: BIAS PROBLEMS IN NATURAL LANGUAGE PROCESSING MODELS AND SOLUTION SUGGESTIONS


Abstract

This study explores how social biases in natural language processing (NLP) models can be mitigated through data diversity and data augmentation techniques. Specifically, translation performance and bias rates for less widely spoken languages are analyzed using data diversity and Generative Adversarial Networks (GAN)-based data augmentation methods. Experiments conducted with popular translation tools like Google Translate and DeepL show that data diversity significantly improves translation accuracy, increasing BLEU scores by 40.8%. Additionally, gender-based biases, especially regarding professions, were found to decrease by 33-41%. The impact of data diversity and augmentation on translation quality was statistically validated through regression analysis. The study demonstrates that data diversity not only enhances translation accuracy but also supports ethical neutrality.
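The abstract reports its results as percentage changes without stating how they were computed. The sketch below shows one plausible reading, assuming the BLEU figure is a relative increase over a baseline score and the bias figure is a relative reduction in how far gendered outputs deviate from an even split. The function names, the label scheme, and every number here are hypothetical and chosen only for illustration; none of them come from the study.

```python
from collections import Counter


def relative_change(before: float, after: float) -> float:
    """Percentage change of `after` relative to the baseline `before`."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (after - before) / before * 100.0


def masculine_share(labels: list[str]) -> float:
    """Share of 'male' among outputs labeled 'male' or 'female'.

    Outputs with any other label (for example 'neutral') are ignored.
    """
    counts = Counter(labels)
    gendered = counts["male"] + counts["female"]
    return counts["male"] / gendered if gendered else 0.0


def skew(labels: list[str]) -> float:
    """Absolute distance of the masculine share from an even 0.5 split."""
    return abs(masculine_share(labels) - 0.5)


if __name__ == "__main__":
    # Hypothetical BLEU scores before and after adding more diverse data.
    print(f"BLEU change: {relative_change(20.0, 28.0):.1f}%")

    # Hypothetical gender labels assigned to translated occupation sentences.
    baseline = ["male"] * 8 + ["female"] * 2
    augmented = ["male"] * 6 + ["female"] * 4
    reduction = -relative_change(skew(baseline), skew(augmented))
    print(f"Skew reduction: {reduction:.1f}%")
```

Under this reading, a BLEU score that rises from 20 to 28 is a 40% relative gain, not a gain of 40 BLEU points, which is the distinction a reader should keep in mind when interpreting the reported figures.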


Details

Primary Language German
Subjects Translation and Interpretation Studies, Comparative Language Studies
Section Research Article
Authors

Gülfidan Aytaş 0000-0003-1566-1592

Early View Date May 2, 2025
Publication Date May 12, 2025
Submission Date January 1, 2025
Acceptance Date March 4, 2025
Published in Issue Year 2025 Issue: 68

Cite

APA Aytaş, G. (2025). KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE. Pamukkale Üniversitesi Sosyal Bilimler Enstitüsü Dergisi(68), 241-262. https://doi.org/10.30794/pausbed.1611253
AMA Aytaş G. KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE. PAUSBED. May 2025;(68):241-262. doi:10.30794/pausbed.1611253
Chicago Aytaş, Gülfidan. “KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE”. Pamukkale Üniversitesi Sosyal Bilimler Enstitüsü Dergisi, no. 68 (May 2025): 241-62. https://doi.org/10.30794/pausbed.1611253.
EndNote Aytaş G (May 1, 2025) KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE. Pamukkale Üniversitesi Sosyal Bilimler Enstitüsü Dergisi 68 241–262.
IEEE G. Aytaş, “KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE”, PAUSBED, no. 68, pp. 241–262, May 2025, doi: 10.30794/pausbed.1611253.
ISNAD Aytaş, Gülfidan. “KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE”. Pamukkale Üniversitesi Sosyal Bilimler Enstitüsü Dergisi 68 (May 2025), 241-262. https://doi.org/10.30794/pausbed.1611253.
JAMA Aytaş G. KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE. PAUSBED. 2025;(68):241–262.
MLA Aytaş, Gülfidan. “KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE”. Pamukkale Üniversitesi Sosyal Bilimler Enstitüsü Dergisi, no. 68, 2025, pp. 241-62, doi:10.30794/pausbed.1611253.
Vancouver Aytaş G. KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE. PAUSBED. 2025(68):241-62.