Publication | Open Access
Evaluating the Performance of ChatGPT in Ophthalmology: An Analysis of its Successes and Shortcomings
Citations: 103
References: 13
Year: 2023
Venue: Unknown
Ocular Disease, Ophthalmology, Experimental Ophthalmology, Language Testing, Domain-specific Pre-training, Clinical Specialties, Surgical Training, Large Language Model, Ophthalmology Question-answering Space, Psycholinguistics, Glaucoma, Language Studies, Optometry, Ocular Pathology, Medicine
ABSTRACT We tested the accuracy of ChatGPT, a large language model (LLM), in the ophthalmology question-answering space using two popular multiple-choice question banks for the high-stakes Ophthalmic Knowledge Assessment Program (OKAP) exam. The test sets were of easy-to-moderate difficulty and covered a mix of recall, interpretation, practical, and clinical decision-making problems. ChatGPT achieved 55.8% and 42.7% accuracy on the two 260-question simulated exams. Its performance varied across subspecialties, with the best results in general medicine and the worst in neuro-ophthalmology and in ophthalmic pathology and intraocular tumors. These results are encouraging but suggest that specialising LLMs through domain-specific pre-training may be necessary to improve their performance in ophthalmic subspecialties.
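For a sense of scale, the reported accuracies can be converted back into approximate numbers of correct answers. The abstract gives only the percentages and the exam length, so the counts below are inferred, not reported figures; the helper name `implied_correct` is hypothetical and used only for illustration.

```python
def implied_correct(accuracy_pct: float, n_questions: int) -> int:
    """Nearest whole number of correct answers consistent with a reported accuracy."""
    return round(accuracy_pct / 100 * n_questions)


N_QUESTIONS = 260        # questions per simulated exam (from the abstract)
REPORTED = [55.8, 42.7]  # reported accuracies in percent (from the abstract)

for pct in REPORTED:
    k = implied_correct(pct, N_QUESTIONS)  # inferred count, not a reported value
    recomputed = 100 * k / N_QUESTIONS     # check that the count rounds back to pct
    print(f"{pct}% of {N_QUESTIONS} is about {k} correct ({recomputed:.1f}% recomputed)")
```

Under this reading, the two exams correspond to roughly 145 and 111 correct answers out of 260, and both counts round back to the reported percentages.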