Publication | Closed Access

7. Respondent-Driven Sampling: An Assessment of Current Methodology

Citations: 624

References: 29

Year: 2010

TLDR

Respondent-driven sampling (RDS) uses link-tracing in social networks to sample hard-to-reach populations, expanding the sample along social ties to reduce its dependence on the initial convenience sample. Its estimators, however, rely on strong assumptions in order to treat the data as a probability sample. The study evaluates how seed bias, respondent referral behavior, and without-replacement sampling affect estimator accuracy, and recommends methodological improvements. The analysis indicates that a convenience sample of seeds, preferential referral, and sampling a large fraction of the population can each introduce substantial bias. The claimed asymptotic unbiasedness of RDS therefore depends on often unrealistic assumptions, and the estimates should be used with caution.

Abstract

Respondent-driven sampling (RDS) employs a variant of a link-tracing network sampling strategy to collect data from hard-to-reach populations. By tracing the links in the underlying social network, the process exploits the social structure to expand the sample and reduce its dependence on the initial (convenience) sample. The current estimators of population averages make strong assumptions in order to treat the data as a probability sample. We evaluate three critical sensitivities of the estimators: (1) to bias induced by the initial sample, (2) to uncontrollable features of respondent behavior, and (3) to the without-replacement structure of sampling. Our analysis indicates: (1) that the convenience sample of seeds can induce bias, and the number of sample waves typically used in RDS is likely insufficient for the type of nodal mixing required to obtain the reputed asymptotic unbiasedness; (2) that preferential referral behavior by respondents leads to bias; (3) that when a substantial fraction of the target population is sampled the current estimators can have substantial bias. This paper sounds a cautionary note for the users of RDS. While current RDS methodology is powerful and clever, the favorable statistical properties claimed for the current estimates are shown to be heavily dependent on often unrealistic assumptions. We recommend ways to improve the methodology.
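
To make the first sensitivity concrete, here is a minimal sketch of referral-chain sampling from a single seed, followed by an inverse-degree-weighted estimate. The abstract does not name the estimator it evaluates. The weighting below follows the commonly used Volz-Heckathorn form and is an assumption here, as are the function names, the toy network, and the with-replacement, single-referral walk (an idealization of the kind the abstract cautions against).

```python
import random


def inverse_degree_estimate(values, degrees):
    """Inverse-degree-weighted mean of a trait (Volz-Heckathorn style).

    values  -- trait for each sampled respondent (e.g. 0/1)
    degrees -- reported network size for each respondent (must be > 0)
    """
    if not values or len(values) != len(degrees):
        raise ValueError("values and degrees must be non-empty and aligned")
    if any(d <= 0 for d in degrees):
        raise ValueError("degrees must be positive")
    weights = [1.0 / d for d in degrees]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def referral_walk(adjacency, seed, n_waves, rng):
    """Idealized recruitment: one uniformly chosen referral per wave,
    with replacement. Real RDS branches and samples without replacement."""
    node, chain = seed, []
    for _ in range(n_waves):
        node = rng.choice(adjacency[node])
        chain.append(node)
    return chain


# Hypothetical toy network: two tight groups joined by a single tie (2-3).
ADJACENCY = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
TRAIT = {0: 1, 1: 1, 2: 1, 3: 0, 4: 0, 5: 0}  # true population mean is 0.5

rng = random.Random(0)
chain = referral_walk(ADJACENCY, seed=0, n_waves=4, rng=rng)
estimate = inverse_degree_estimate(
    [TRAIT[n] for n in chain], [len(ADJACENCY[n]) for n in chain]
)
print(chain, estimate)
```

With only a few waves, a chain started inside one group tends to remain there, so the estimate can sit far from the true mean of 0.5. Raising `n_waves` or varying `seed` gives a rough feel for how slowly the dependence on the seed fades in a clustered network.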
