Research Article

Year 2025, Volume: 30, Issue: 2, 262-267, 29.05.2025
https://doi.org/10.21673/anadoluklin.1593858

Evaluation of language differences on the effectiveness of ChatGPT-3.5, Copilot and Gemini in pediatric ophthalmology and strabismus multiple choice questions

Abstract

Aim: This study aimed to investigate the effects of language differences on the success levels of artificial intelligence programs in answering multiple-choice questions related to pediatric ophthalmology and strabismus.

Methods: Forty-four questions related to pediatric ophthalmology and strabismus were included in the study. After the questions were translated into Turkish by a certified translator who is a native speaker, both the English and Turkish versions were submitted to the ChatGPT-3.5, Copilot, and Gemini chatbots. The chatbots' answers were compared with the answer key and classified as correct or incorrect.

Results: ChatGPT-3.5, Copilot, and Gemini answered the English questions correctly at a rate of 56.8%, 72.7%, and 56.8%, respectively (p = 0.206). ChatGPT-3.5, Copilot, and Gemini answered the Turkish questions correctly at a rate of 45.5%, 68.2%, and 56.8%, respectively (p = 0.099). Artificial intelligence programs had similar levels of success in answering the English and Turkish versions of the questions (p>0.05).
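The reported figures can be checked with a short calculation. The sketch below is illustrative only: the abstract does not name the statistical test, so a Pearson chi-square test on a 3 x 2 table (chatbot by correct/incorrect) is an assumption, and the correct-answer counts are inferred from the rounded percentages and the 44 questions per language. All function and variable names are hypothetical.

```python
import math

N_QUESTIONS = 44  # questions per language (from Methods)

# Reported accuracy in percent (from Results).
REPORTED = {
    "English": {"ChatGPT-3.5": 56.8, "Copilot": 72.7, "Gemini": 56.8},
    "Turkish": {"ChatGPT-3.5": 45.5, "Copilot": 68.2, "Gemini": 56.8},
}


def correct_count(percent, n=N_QUESTIONS):
    """Infer the number of correct answers from a rounded percentage."""
    return round(percent / 100 * n)


def chi_square_three_groups(correct, n=N_QUESTIONS):
    """Pearson chi-square for three groups of n questions each.

    Assumption: this is the test behind the reported p-values.
    With three groups df = 2, so the p-value is exactly exp(-stat / 2).
    """
    if len(correct) != 3:
        raise ValueError("closed-form p-value assumes exactly 3 groups (df = 2)")
    expected_correct = sum(correct) / 3
    expected_incorrect = n - expected_correct
    stat = sum(
        (c - expected_correct) ** 2 / expected_correct
        + ((n - c) - expected_incorrect) ** 2 / expected_incorrect
        for c in correct
    )
    return stat, math.exp(-stat / 2)


results = {}
for language, rates in REPORTED.items():
    counts = [correct_count(p) for p in rates.values()]
    stat, p_value = chi_square_three_groups(counts)
    results[language] = (counts, stat, p_value)
    print(f"{language}: correct={counts}, chi2={stat:.2f}, p={p_value:.3f}")
```

Under these assumptions the inferred counts are 25, 32, and 25 correct answers in English and 20, 30, and 25 in Turkish, and the computed p-values round to 0.206 and 0.099. This agrees with the reported values, which suggests, but does not confirm, that a chi-square test was used.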

Conclusion: Although the chatbots performed similarly overall, a question-by-question review showed that they could give different answers to the same question. This inconsistency may undermine users' trust in the chatbots' accuracy. The language performance of chatbots needs to be improved.

Details

Primary Language: Turkish
Subjects: Clinical Sciences (Other)
Section: Original Article
Authors

Eyüpcan Şensoy 0000-0002-4401-8435

Melike Şensoy 0000-0002-8273-3851

Mehmet Çıtırık 0000-0002-0558-5576

Publication Date: 29 May 2025
Submission Date: 30 November 2024
Acceptance Date: 28 December 2024
Published in Issue: Year 2025, Volume: 30, Issue: 2

Cite

Vancouver Şensoy E, Şensoy M, Çıtırık M. Dil farklılıklarının ChatGPT-3.5, Copilot ve Gemini’nin pediatrik oftalmoloji ve şaşılık çoktan seçmeli sorulardaki etkinliğinin değerlendirilmesi. Anadolu Klin. 2025;30(2):262-7.

This journal is licensed under a CC BY-NC (Creative Commons Attribution-NonCommercial 4.0) International License.