Concepedia

Publication | Open Access

Unveiling GPT-4V's hidden challenges behind high accuracy on USMLE questions: Observational Study

Citations: 18
References: 43
Year: 2025

Abstract

GPT-4V achieved high accuracy on multiple-choice questions with images, highlighting its potential in medical assessments. However, significant shortcomings were observed in the quality of explanations when questions were answered incorrectly, particularly in the interpretation of images, which could not be efficiently resolved through expert interaction. These findings reveal hidden flaws in the image interpretation capabilities of GPT-4V, underscoring the need for more comprehensive evaluations beyond multiple-choice questions before integrating GPT-4V into clinical settings.
