Clinical Research

Artificial intelligence exercise recommendations in knee osteoarthritis rehabilitation: ChatGPT-4o and Gemini Advanced example

Year 2025, Volume: 35 Issue: 3, 487 - 492, 30.06.2025
https://doi.org/10.54005/geneltip.1634118

Abstract

Aim: This study aimed to compare how readily the large language models ChatGPT-4o and Gemini Advanced recommend personalized exercises based on patients' assessment data in knee osteoarthritis (OA) rehabilitation.
Methods: This observational study included 40 patients diagnosed with knee OA according to the American College of Rheumatology criteria. Demographic data, pain levels, range of motion, muscle strength, functional status, and balance were assessed using standardized clinical tests. ChatGPT-4o and Gemini Advanced generated three-phase rehabilitation programs based on these assessments. Exercise recommendations were analyzed across 12 parameters, and statistical comparisons were conducted using the Mann-Whitney U test and Spearman’s correlation, with significance set at p<0.05.
Results: ChatGPT-4o demonstrated statistically significant differences in 7 parameters: Phase 1 (quadriceps muscle strength, knee flexion angle, knee extension angle, and four-square step test; p=0.017, p=0.012, p=0.033, p=0.043), Phase 2 (quadriceps muscle strength and Lysholm scale; p=0.032, p=0.040), and Phase 3 (quadriceps muscle strength; p=0.007). In contrast, Gemini Advanced exhibited significant differences in only 2 parameters: Phase 1 (Lysholm scale score; p=0.044) and Phase 3 (quadriceps strengthening exercise; p=0.047). ChatGPT-4o appeared to integrate patient assessment data more effectively, but both models showed limitations in personalization.
Conclusions: While ChatGPT-4o and Gemini Advanced show potential for designing personalized knee OA rehabilitation programs, their recommendations remain constrained. Further improvements in dataset quality, real-time medical knowledge integration, and domain-specific training are needed to enhance their clinical utility.
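
The Methods above name the Mann-Whitney U test and Spearman's correlation but do not state the software used or how patients were grouped. The following is a minimal sketch of both statistics in plain Python, assuming that an assessment value (for example, quadriceps strength) is compared between patients who did and did not receive a given exercise recommendation. The function names, the grouping, and the sample values are illustrative assumptions, not the authors' analysis or data. The U test here uses the normal approximation with tie correction and no continuity correction, so p-values may differ slightly from those of a statistics package.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    # Tie correction: sum of (t^3 - t) over groups of t tied observations.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every observation identical: no evidence of a shift
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Made-up illustrative values, NOT data from the study.
    strength_recommended = [3, 4, 3, 2, 4, 3]
    strength_not_recommended = [5, 4, 5, 5, 4, 5]
    u, p = mann_whitney_u(strength_recommended, strength_not_recommended)
    print(f"U = {u:.1f}, two-sided p = {p:.3f}")

    pain_scores = [7, 5, 6, 3, 4, 2]
    exercise_counts = [2, 4, 3, 6, 5, 7]
    print(f"Spearman rho = {spearman_rho(pain_scores, exercise_counts):.2f}")
```

With only 40 patients and 12 parameters tested per phase, an exact test or a correction for multiple comparisons may be worth considering. The abstract does not say whether either was applied.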

Ethical Statement

All participants provided written and verbal informed consent, and the study was approved by the Ethics Committee of Kirsehir Ahi Evran University Faculty of Medicine Health Sciences Scientific Research (2024-13/110).

Supporting Institution

This study received no financial support.

Thanks

The authors express their gratitude to all participants.


Diz osteoartriti rehabilitasyonunda yapay zeka egzersiz önerileri: ChatGPT-4o ve Gemini Advanced örneği

Abstract

Aim: This study aimed to comparatively evaluate the tendency of the large language models ChatGPT-4o and Gemini Advanced to recommend personalized exercises based on patients' assessment data in knee osteoarthritis rehabilitation.
Methods: This observational study included 40 patients diagnosed with knee osteoarthritis according to the American College of Rheumatology criteria. Demographic data, pain level, joint range of motion, muscle strength, functional status, and balance were assessed with standard clinical tests. ChatGPT-4o and Gemini Advanced generated three-phase rehabilitation programs based on these assessments. Exercise recommendations were analyzed across 12 parameters, and statistical comparisons were made with the Mann-Whitney U test and Spearman correlation (p<0.05).
Results: ChatGPT-4o showed statistically significant differences in 7 parameters: quadriceps muscle strength, knee flexion angle, knee extension angle, and the four-square step test in Phase 1 (p=0.017, p=0.012, p=0.033, p=0.043); quadriceps muscle strength and the Lysholm scale in Phase 2 (p=0.032, p=0.040); and quadriceps muscle strength in Phase 3 (p=0.007). Gemini Advanced, by contrast, showed statistically significant differences in only 2 parameters: the Lysholm score in Phase 1 (p=0.044) and quadriceps strengthening exercise in Phase 3 (p=0.047). ChatGPT-4o was observed to integrate patient assessment data more effectively, although both models have limitations in personalization.
Conclusion: Although ChatGPT-4o and Gemini Advanced have the potential to design personalized knee osteoarthritis rehabilitation programs, their recommendations remain limited. To increase their clinical utility, they need to be supported by improved dataset quality, real-time medical knowledge integration, and domain-specific training.


Details

Primary Language English
Subjects Orthopaedics, Rheumatology and Arthritis
Journal Section Original Article
Authors

Ömer Alperen Gürses 0000-0001-6564-7428

Anıl Özüdoğru 0000-0002-7507-9863

Figen Tuncay 0000-0002-0886-2006

Caner Karartı 0000-0002-4655-0986

Publication Date June 30, 2025
Submission Date February 5, 2025
Acceptance Date April 18, 2025
Published in Issue Year 2025 Volume: 35 Issue: 3

Cite

Vancouver Gürses ÖA, Özüdoğru A, Tuncay F, Karartı C. Artificial intelligence exercise recommendations in knee osteoarthritis rehabilitation: ChatGPT-4o and Gemini Advanced example. Genel Tıp Derg. 2025;35(3):487-92.

The Journal of General Medicine is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY NC).