
Publication | Open Access

Evaluating accuracy and reproducibility of large language model performance on critical care assessments in pharmacy education

Citations: 13

References: 25

Year: 2025

Abstract

ChatGPT-4 was the most accurate LLM on critical care pharmacy questions, and few-shot chain-of-thought (CoT) prompting improved accuracy the most. Average student accuracy was similar to that of the LLMs overall and higher on knowledge application questions. These findings support the need for future assessment of customized training for the type of output needed. Reliance on LLMs is supported only for recall-based questions.
