
Abstract

In classical test theory, a test is regarded as a sample of items from a domain defined by generating rules or by content, process, and format specifications. If the items are a random sample of the domain, then the percent-correct score on the test estimates the domain score, that is, the expected percent correct for all items in the domain. When the domain is represented by a large set of calibrated items, as in item-banking applications, item response theory (IRT) provides an alternative estimator of the domain score, obtained by transforming the IRT scale score on the test. This estimator has the advantages of not requiring the test items to be a random sample of the domain and of having a simple standard error. We present resampling results on real data demonstrating, for both unidimensional and multidimensional models, that the IRT estimator is also a more accurate predictor of the domain score than the classical percent-correct score. These results have implications for reporting outcomes of educational qualification testing and assessment.
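The contrast between the two estimators can be sketched in code. The following is a minimal illustration, not the paper's method: it assumes a hypothetical bank of two-parameter logistic (2PL) items with made-up parameters, scores a short test form both ways, and forms the IRT domain-score estimate as the average of the item response functions over the whole bank evaluated at a grid-search maximum-likelihood ability estimate.

```python
import math
import random

def p_2pl(theta, a, b):
    """2PL item response function: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

random.seed(1)

# Hypothetical calibrated item bank representing the domain:
# 200 items with discrimination a and difficulty b (illustrative values).
domain = [(random.uniform(0.8, 2.0), random.gauss(0.0, 1.0)) for _ in range(200)]

# A 30-item test form; it need not be a random sample of the domain
# for the IRT estimator to apply.
test_items = domain[:30]

# Simulate one examinee's responses at a known ability.
theta_true = 0.5
responses = [1 if random.random() < p_2pl(theta_true, a, b) else 0
             for a, b in test_items]

# Classical estimator: percent correct on the test form.
percent_correct = sum(responses) / len(responses)

# ML ability estimate by grid search over a fixed theta range.
def log_likelihood(theta):
    return sum(u * math.log(p_2pl(theta, a, b))
               + (1 - u) * math.log(1.0 - p_2pl(theta, a, b))
               for (a, b), u in zip(test_items, responses))

theta_hat = max((t / 100.0 for t in range(-400, 401)), key=log_likelihood)

# IRT estimator of the domain score: expected percent correct
# over ALL domain items at the estimated ability.
irt_domain_score = sum(p_2pl(theta_hat, a, b) for a, b in domain) / len(domain)

print(f"percent correct on test: {percent_correct:.3f}")
print(f"IRT domain-score estimate: {irt_domain_score:.3f}")
```

Because the IRT estimate averages the response functions over every item in the bank, it refers to the domain even when the test form over- or under-represents parts of it, which is the property the abstract highlights.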

