Publication | Open Access
Evaluating accuracy and reproducibility of large language model performance on critical care assessments in pharmacy education
Citations: 13
References: 25
Year: 2025
ChatGPT-4 was the most accurate LLM on critical care pharmacy questions, and few-shot chain-of-thought (CoT) prompting improved accuracy the most. Average student accuracy was similar to that of the LLMs overall and higher on knowledge application questions. These findings support the need for future assessment of customized training for the type of output needed. Reliance on LLMs is supported only for recall-based questions.