Systems for grading the quality of evidence and the strength of recommendations I: Critical appraisal of existing approaches

The GRADE Working Group

TLDR

Many different methods are used to grade levels of evidence and the strength of recommendations, and the lack of a unified approach hampers clear communication and informed decision-making. This study critically appraised six prominent grading systems as a basis for agreeing on the characteristics of a common, sensible approach. Twelve assessors independently evaluated each system against twelve sensibility criteria. The assessors agreed poorly on the sensibility of the systems. Only one system was suitable for all four question types (effectiveness, harm, diagnosis and prognosis), none was usable by all target groups (professionals, patients and policy makers), and judgements made with every system showed low reproducibility. All current approaches have important shortcomings.

Abstract

A number of approaches have been used to grade levels of evidence and the strength of recommendations. The use of many different approaches detracts from one of the main reasons for having explicit approaches: to concisely characterise and communicate this information so that it can easily be understood and thereby help people make well-informed decisions. Our objective was to critically appraise six prominent systems for grading levels of evidence and the strength of recommendations as a basis for agreeing on characteristics of a common, sensible approach to grading levels of evidence and the strength of recommendations.

Six prominent systems for grading levels of evidence and strength of recommendations were selected and someone familiar with each system prepared a description of each of these. Twelve assessors independently evaluated each system based on twelve criteria to assess the sensibility of the different approaches. Systems used by 51 organisations were compared with these six approaches.

There was poor agreement about the sensibility of the six systems. Only one of the systems was suitable for all four types of questions we considered (effectiveness, harm, diagnosis and prognosis). None of the systems was considered usable for all of the target groups we considered (professionals, patients and policy makers). The raters found low reproducibility of judgements made using all six systems. Systems used by 51 organisations that sponsor clinical practice guidelines included a number of minor variations of the six systems that we critically appraised.

All of the currently used approaches to grading levels of evidence and the strength of recommendations have important shortcomings.
