Research Article

Large-scale impact analysis on large language models for Turkish question-answering

Year 2025, Volume: 40, Issue: 3, 1787-1796
https://doi.org/10.17341/gazimmfd.1538022

Abstract

Large language models (LLMs) have recently become popular in many natural language processing tasks. There are fewer studies on LLMs for low-resource languages such as Turkish. Therefore, the performance of the BERT, ALBERT, DistilBERT, mDeBERTa, and mT5 models was analyzed for the Turkish question-answering task, using the Turkish version of the SQuAD benchmark as the dataset. After fine-tuning, mDeBERTa was the most successful model, with 74.50% accuracy. In addition, the effect of a threshold on each model's answer probability and the effect of the semantic similarity between the predicted and actual answers were examined. Applying the threshold increased accuracy by up to 0.13%. Incorporating semantic similarity increased accuracy by between 0.7% and 6.59%, and the most successful model was again mDeBERTa, with 79.09% accuracy. These results show that analyzing the threshold value and the semantic similarity of LLMs has a positive effect.
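As a minimal sketch of the two analyses above (not the authors' exact pipeline), the following Python shows how an answer-probability threshold and an embedding-based semantic-similarity check can be layered on an extractive question-answering model. The checkpoint path and the 0.5 probability / 0.8 similarity cutoffs are illustrative assumptions, not values from the paper; the Turkish sentence encoder is the one cited in reference 29.

from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

# Extractive QA model fine-tuned on the Turkish SQuAD data; the path below
# is a hypothetical placeholder for any of the five fine-tuned models.
qa = pipeline("question-answering", model="path/to/finetuned-turkish-qa")

# Turkish sentence encoder used for semantic similarity (reference 29).
encoder = SentenceTransformer("emrecan/bert-base-turkish-cased-mean-nli-stsb-tr")

def evaluate_pair(question, context, gold_answer,
                  prob_threshold=0.5, sim_threshold=0.8):
    """Return (kept, exact_match, semantic_match) for one QA pair."""
    pred = qa(question=question, context=context)
    # Threshold analysis: discard answers whose probability is too low.
    if pred["score"] < prob_threshold:
        return False, False, False
    exact = pred["answer"].strip().lower() == gold_answer.strip().lower()
    # Semantic-similarity analysis: also accept near-paraphrase answers,
    # scored by the cosine similarity of their sentence embeddings.
    emb = encoder.encode([pred["answer"], gold_answer])
    similar = util.cos_sim(emb[0], emb[1]).item() >= sim_threshold
    return True, exact, exact or similar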

References

  • 1. Karanikolas N., Manga E., Samaridi N., Tousidou E., Vassilakopoulos M., Large language models versus natural language understanding and generation, Proceedings of the 27th Pan-Hellenic Conference on Progress in Computing and Informatics, 278-290, 2023.
  • 2. Dwivedi S.K., Singh V., Research and reviews in question answering system, Procedia Technology, 10, 417–424, 2013.
  • 3. Agarwal A., Sachdeva N., Yadav R.K., Udandarao V., Mittal V., Gupta A., Mathur A., Eduqa: Educational domain question answering system using conceptual network mapping, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing – Proceedings, 2019-May, 8137–8141, 2019.
  • 4. Jin D., Pan E., Oufattole N., Weng W.H., Fang H., Szolovits P., What disease does this patient have? a large-scale open domain question answering dataset from medical exams, Applied Sciences 2021, 11, 6421, 2021.
  • 5. Etezadi R., Shamsfard M., The state of the art in open domain complex question answering: a survey, Applied Intelligence, 53, 4124–4144, 2023.
  • 6. Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A.N., Kaiser L., Polosukhin I., Attention is all you need, Advances in Neural Information Processing Systems, 2017-December, 5999–6009, 2017.
  • 7. Devlin J., Chang M.W., Lee K., Toutanova K., Bert: Pre-training of deep bidirectional transformers for language understanding, NAACL HLT 2019-2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies Proceedings of the Conference, 1, 4171–4186, 2018.
  • 8. Improving language understanding with unsupervised learning, OpenAI. https://openai.com/index/language-unsupervised/. Accessed July 20, 2024.
  • 9. Angelis L.D., Baglivo F., Arzilli G., Privitera G.P., Ferragina P., Tozzi A.E., Rizzo C., Chatgpt and the rise of large language models: the new ai-driven infodemic threat in public health, Frontiers in Public Health, 11, 2023.
  • 10. Alzubi J.A., Jain R., Singh A., Parwekar P., Gupta M., Cobert: Covid-19 question answering system using bert, Arabian Journal for Science and Engineering, 48, 11003–11013, 2023.
  • 11. Kierszbaum S., Lapasset L., Applying distilled bert for question answering on asrs reports, Proceedings of the 22nd International Conference on New Trends in Civil Aviation, NTCA 2020, 33–38, 2020.
  • 12. Singhal K., Azizi S., Tu T., Mahdavi S.S., Wei J., Chung H.W., Scales N., Tanwani A., Cole-Lewis H., Pfohl S., Payne P., Seneviratne M., Gamble P., Kelly C., Babiker A., Scharli N., Chowdhery A., Mansfield P., DemnerFushman D., Arcas B.A., Webster D., Corrado G.S., Matias Y., Chou K., Gottweis J., Tomasev N., Liu Y., Rajkomar A., Barral J., Semturs C., Karthikesalingam A., Natarajan V., Large language models encode clinical knowledge, Nature, 620, 172–180, 2023.
  • 13. Abdelhay M., Mohammed A., Hefn H.A., Deep learning for arabic healthcare: Medicalbot, Social Network Analysis and Mining, 13, 1–17, 2023.
  • 14. Wang Z., Ng P., Ma X., Nallapati R., Xiang B., Multi-passage bert: A globally normalized bert model for open-domain question answering, EMNLP-IJCNLP 2019-2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, 5878–5882, 2019.
  • 15. Qu C., Yang L., Qiu M., Croft W.B., Zhang Y., Iyyer M., Bert with history answer embedding for conversational question answering, SIGIR 2019-Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 1133–1136, 2019.
  • 16. Zhang Z., Wu Y., Zhou J., Duan S., Zhao H., Wang R., Sg-net: Syntax-guided machine reading comprehension, Proceedings of the AAAI Conference on Artificial Intelligence, 34, 9636–9643, 2020.
  • 17. Zhang Z., Wu Y., Zhao H., Li Z., Zhang S., Zhou X., Zhou X., Semantics-aware bert for language understanding, Proceedings of the AAAI Conference on Artificial Intelligence, 2020.
  • 18. Lan Z., Chen M., Goodman S., Gimpel K., Sharma P., Soricut R., ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, arXiv preprint arXiv:1909.11942, 2019.
  • 19. Soygazi F., Çiftçi O., Kök U., Cengiz S., Thquad: Turkish historic question answering dataset for reading comprehension, Proceedings-6th International Conference on Computer Science and Engineering, UBMK 2021, 215–220, 2021.
  • 20. Akyön F., Çavuşoğlu D., Cengiz C., Altinuç S.O., Temizel A., Automated question generation and question answering from Turkish texts, Turkish Journal of Electrical Engineering and Computer Sciences, 30, 1931–1940, 2022.
  • 21. İncidelen M., Aydoğan M., Developing Question-Answering Models in Low-Resource Languages: A Case Study on Turkish Medical Texts Using Transformer-Based Approaches, 2024 8th International Artificial Intelligence and Data Processing Symposium (IDAP), 1-4, 2024.
  • 22. Tutar K., Yıldız O.T., Turkish Question-Answer Dataset Evaluated with Deep Learning, 2024 9th International Conference on Computer Science and Engineering (UBMK), 101-105, 2024.
  • 23. Kahraman S.Y., Durmuşoğlu A., Dereli T., Patent classification with pre-trained Bert model, Journal of the Faculty of Engineering and Architecture of Gazi University, 39 (4), 2484-2496, 2024.
  • 24. Tepecik A., Demir E., Analysis of Turkish audio recording data labeled with three emotions with popular machine learning algorithms, Journal of the Faculty of Engineering and Architecture of Gazi University, 39 (2), 709-716, 2024.
  • 25. Budur E., Ozçelik R., Soylu D., Khattab O., Güngör T., Potts C., Building Efficient and Effective OpenQA Systems for Low-Resource Languages, Knowledge-Based Systems, 302, 112243, 2024.
  • 26. Sanh V., Debut L., Chaumond J., Wolf T., Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter, arXiv preprint arXiv:1910.01108, 2019.
  • 27. He P., Liu X., Gao J., Chen W., Deberta: Decoding-enhanced bert with disentangled attention, ICLR 2021-9th International Conference on Learning Representations, 2021.
  • 28. Xue L., Constant N., Roberts A., Kale M., Al-Rfou R., Siddhant A., Barua A., Raffel C., mt5: A massively multilingual pre-trained text-to-text transformer, NAACL-HLT 2021-2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 483–498, 2020.
  • 29. Hugging Face. emrecan/bert-base-turkish-cased-mean-nli-stsb-tr. https://huggingface.co/emrecan/bert-base-turkish-cased-mean-nli-stsb-tr. Accessed December 25, 2024.


Details

Primary Language Turkish
Subjects Deep Learning, Natural Language Processing, Speech Generation
Section Articles
Authors

Zekeriya Anıl Güven 0000-0002-7025-2815

Early View Date June 4, 2025
Publication Date
Submission Date August 24, 2024
Acceptance Date February 1, 2025
Published in Issue Year 2025, Volume: 40, Issue: 3

How to Cite

APA Güven, Z. A. (2025). Türkçe soru cevaplama için büyük dil modelleri üzerinde geniş ölçekli etki analizi. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, 40(3), 1787-1796. https://doi.org/10.17341/gazimmfd.1538022
AMA Güven ZA. Türkçe soru cevaplama için büyük dil modelleri üzerinde geniş ölçekli etki analizi. GUMMFD. Haziran 2025;40(3):1787-1796. doi:10.17341/gazimmfd.1538022
Chicago Güven, Zekeriya Anıl. “Türkçe Soru Cevaplama için büyük Dil Modelleri üzerinde Geniş ölçekli Etki Analizi”. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi 40, sy. 3 (Haziran 2025): 1787-96. https://doi.org/10.17341/gazimmfd.1538022.
EndNote Güven ZA (01 Haziran 2025) Türkçe soru cevaplama için büyük dil modelleri üzerinde geniş ölçekli etki analizi. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi 40 3 1787–1796.
IEEE Z. A. Güven, “Türkçe soru cevaplama için büyük dil modelleri üzerinde geniş ölçekli etki analizi”, GUMMFD, c. 40, sy. 3, ss. 1787–1796, 2025, doi: 10.17341/gazimmfd.1538022.
ISNAD Güven, Zekeriya Anıl. “Türkçe Soru Cevaplama için büyük Dil Modelleri üzerinde Geniş ölçekli Etki Analizi”. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi 40/3 (Haziran 2025), 1787-1796. https://doi.org/10.17341/gazimmfd.1538022.
JAMA Güven ZA. Türkçe soru cevaplama için büyük dil modelleri üzerinde geniş ölçekli etki analizi. GUMMFD. 2025;40:1787–1796.
MLA Güven, Zekeriya Anıl. “Türkçe Soru Cevaplama için büyük Dil Modelleri üzerinde Geniş ölçekli Etki Analizi”. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, c. 40, sy. 3, 2025, ss. 1787-96, doi:10.17341/gazimmfd.1538022.
Vancouver Güven ZA. Türkçe soru cevaplama için büyük dil modelleri üzerinde geniş ölçekli etki analizi. GUMMFD. 2025;40(3):1787-96.