Concepedia

Abstract

Although the root‐mean‐squared deviation (RMSD) is a popular statistical measure for evaluating country‐specific item‐level misfit (i.e., differential item functioning [DIF]) in international large‐scale assessment, this paper shows that its sensitivity to misfit may depend strongly on the proficiency distribution of the countries considered. Specifically, items for which most respondents in a country have a very low (or very high) probability of providing a correct answer will rarely be flagged by the RMSD as showing misfit, even if very strong DIF is present. With many international large‐scale assessment initiatives moving toward covering a more heterogeneous group of countries, this raises concerns about the ability of the RMSD to detect item‐level misfit, especially in low‐performing countries that are not well aligned with the overall difficulty level of the test. This puts one at risk of incorrectly assuming that measurement invariance holds, and may also inflate estimated between‐country differences in proficiency. The degree to which the RMSD is able to detect DIF in low‐performing countries is studied using both an empirical example from PISA 2015 and a simulation study.
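The dependence on the proficiency distribution can be illustrated with a minimal sketch. It assumes a two‐parameter logistic (2PL) item response function, a normally distributed country proficiency, and an RMSD defined as the square root of the squared gap between the country‐specific and international item curves, averaged over the country's proficiency density. The function names, parameter values, and the size of the DIF shift are hypothetical and chosen only for illustration; they are not taken from the paper or from PISA 2015.

```python
import math

def icc(theta, a, b):
    """2PL item characteristic curve: probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def normal_pdf(theta, mean, sd):
    """Density of a normal proficiency distribution."""
    z = (theta - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def rmsd(country_mean, b_international, dif_shift,
         a=1.0, sd=1.0, n_points=2001, half_width=8.0):
    """Density-weighted RMSD between a country-specific item curve and the
    international curve (illustrative assumption: uniform DIF, i.e. the item
    difficulty is shifted by dif_shift in this country)."""
    lo = country_mean - half_width * sd
    step = 2.0 * half_width * sd / (n_points - 1)
    total = 0.0
    weight_sum = 0.0
    for i in range(n_points):
        theta = lo + i * step
        w = normal_pdf(theta, country_mean, sd)
        gap = (icc(theta, a, b_international + dif_shift)
               - icc(theta, a, b_international))
        total += w * gap * gap
        weight_sum += w
    # Normalise by the summed weights so the grid approximates an expectation.
    return math.sqrt(total / weight_sum)

# Same item, same DIF (one logit harder), two hypothetical countries.
aligned = rmsd(country_mean=0.0, b_international=0.0, dif_shift=1.0)
low_performing = rmsd(country_mean=-3.0, b_international=0.0, dif_shift=1.0)

print(f"RMSD, well-aligned country:  {aligned:.3f}")
print(f"RMSD, low-performing country: {low_performing:.3f}")
```

Under these assumptions the identical difficulty shift produces a much smaller RMSD in the low‐performing country, because both curves are close to zero across most of that country's proficiency range and therefore cannot differ by much there.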
