KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE

Gülfidan Aytaş

doi:10.30794/pausbed.1611253

Research Article

YAPAY ZEKA VE ÇEVİRİ ETİĞİ: DOĞAL DİL İŞLEME MODELLERİNDEKİ ÖNYARGI PROBLEMLERİ VE ÇÖZÜM ÖNERİLERİ

Year 2025, Issue: 68, pp. 241-262, 12.05.2025

Abstract

This study investigates how social biases in natural language processing (NLP) models can be reduced through data diversity and data augmentation methods. In particular, translation performance and bias rates for less widely spoken languages are analyzed using data diversity and Generative Adversarial Network (GAN)-based data augmentation techniques. Experiments with popular translation tools such as Google Translate and DeepL showed that data diversity significantly increases translation accuracy and raises BLEU scores by 40.8%. In addition, gender-based biases, particularly concerning occupations, were found to decrease by 33-41%. The effect of data diversity and data augmentation on translation quality was statistically confirmed by regression analysis. The study shows that data diversity supports not only translation accuracy but also ethical neutrality.

Keywords

Natural Language Processing (NLP), Bias Problems, Generative Adversarial Networks (GAN), Translation Performance

KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE

Abstract

This study examines how social biases in natural language processing (NLP) models can be reduced through data diversity and data augmentation techniques. In particular, translation performance and bias rates in less widely spoken languages are analyzed using data diversity and GAN-based data augmentation. Experiments with common translation tools such as Google Translate and DeepL show that more diverse datasets considerably improve translation accuracy and raise BLEU scores by up to 40.8%. Furthermore, gender-specific biases, particularly with regard to occupations, were reduced by 33-41%. A regression analysis statistically confirmed the effects of data diversity and data augmentation on translation quality and bias reduction. The results show that greater data diversity in low-resource languages not only improves translation accuracy but also promotes ethical neutrality.

Keywords

Natural Language Processing (NLP), Bias Problems, Generative Adversarial Networks (GAN), Translation Performance.

ARTIFICIAL INTELLIGENCE AND TRANSLATION ETHICS: BIAS PROBLEMS IN NATURAL LANGUAGE PROCESSING MODELS AND SOLUTION SUGGESTIONS

Abstract

This study explores how social biases in natural language processing (NLP) models can be mitigated through data diversity and data augmentation techniques. Specifically, translation performance and bias rates for less widely spoken languages are analyzed using data diversity and Generative Adversarial Network (GAN)-based data augmentation methods. Experiments conducted with popular translation tools such as Google Translate and DeepL show that data diversity significantly improves translation accuracy, increasing BLEU scores by 40.8%. Additionally, gender-based biases, especially regarding professions, were found to decrease by 33-41%. The impact of data diversity and augmentation on translation quality was statistically validated through regression analysis. The study demonstrates that data diversity not only enhances translation accuracy but also supports ethical neutrality.
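The percentages quoted above are relative changes. As a minimal sketch of that arithmetic, assuming a bias measure defined as the share of translations that render a gender-neutral source pronoun as "he" (the abstract does not state how the study measured bias), the calculation could look as follows. The function names and all input numbers are hypothetical and were chosen only to illustrate the computation; they are not data from the study.

```python
from collections import Counter


def relative_change(before: float, after: float) -> float:
    """Return the percentage change from `before` to `after`."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (after - before) / before * 100.0


def masculine_default_rate(pronouns: list[str]) -> float:
    """Share of outputs that resolve a gender-neutral pronoun to 'he'.

    This bias measure is an assumption made for illustration only.
    """
    counts = Counter(p.lower() for p in pronouns)
    total = sum(counts.values())
    return counts["he"] / total if total else 0.0


# Hypothetical BLEU scores before and after adding more diverse data.
baseline_bleu, augmented_bleu = 25.0, 35.2
print(round(relative_change(baseline_bleu, augmented_bleu), 1))

# Hypothetical pronoun choices for occupation sentences, before and after.
before = ["he"] * 8 + ["she"] * 2
after = ["he"] * 5 + ["she"] * 5
print(round(relative_change(masculine_default_rate(before),
                            masculine_default_rate(after)), 1))
```

A negative result from `relative_change` indicates a reduction, which is how a decrease in the bias rate would show up under this definition.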

Keywords

Natural Language Processing (NLP), Bias Issues, Generative Adversarial Networks (GAN), Translation Performance.

There are 16 references in total.

Details

Primary Language	German
Subjects	Translation and Interpretation Studies, Comparative Language Studies
Section	Research Article
Authors	Gülfidan Aytaş 0000-0003-1566-1592
Early Pub Date	May 2, 2025
Publication Date	May 12, 2025
Submission Date	January 1, 2025
Acceptance Date	March 4, 2025
Published in Issue	Year 2025 Issue: 68

Cite

APA	Aytaş, G. (2025). KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE. Pamukkale Üniversitesi Sosyal Bilimler Enstitüsü Dergisi, (68), 241-262. https://doi.org/10.30794/pausbed.1611253
AMA	Aytaş G. KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE. PAUSBED. May 2025;(68):241-262. doi:10.30794/pausbed.1611253
Chicago	Aytaş, Gülfidan. “KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE”. Pamukkale Üniversitesi Sosyal Bilimler Enstitüsü Dergisi, no. 68 (May 2025): 241-62. https://doi.org/10.30794/pausbed.1611253.
EndNote	Aytaş G (May 1, 2025) KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE. Pamukkale Üniversitesi Sosyal Bilimler Enstitüsü Dergisi 68 241–262.
IEEE	G. Aytaş, “KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE”, PAUSBED, no. 68, pp. 241–262, May 2025, doi: 10.30794/pausbed.1611253.
ISNAD	Aytaş, Gülfidan. “KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE”. Pamukkale Üniversitesi Sosyal Bilimler Enstitüsü Dergisi 68 (May 2025), 241-262. https://doi.org/10.30794/pausbed.1611253.
JAMA	Aytaş G. KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE. PAUSBED. 2025;:241–262.
MLA	Aytaş, Gülfidan. “KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE”. Pamukkale Üniversitesi Sosyal Bilimler Enstitüsü Dergisi, no. 68, 2025, pp. 241-62, doi:10.30794/pausbed.1611253.
Vancouver	Aytaş G. KÜNSTLICHE INTELLIGENZ UND ÜBERSETZUNGSETHIK: VERZERRUNGSPROBLEME BEI MODELLEN ZUR VERARBEITUNG NATÜRLICHER SPRACHE UND LÖSUNGSVORSCHLÄGE. PAUSBED. 2025(68):241-62.
