Benchmarking Human–AI collaboration for common evidence appraisal tools


Publication | Open Access


Year: 2024

Tim Woelfle, Julian Hirt, Perrine Janiaud, Ludwig Kappos, John P. A. Ioannidis, Lars G. Hemkens

Journal of Clinical Epidemiology

Abstract

Current LLMs alone appraised evidence worse than humans. Human-AI collaboration may reduce workload for the second human rater for the assessment of reporting (PRISMA) and methodological rigor (AMSTAR) but not for complex tasks such as PRECIS-2.
