Know What You Don’t Know: Unanswerable Questions for SQuAD

TLDR

Extractive reading comprehension systems often guess an answer even when the context does not contain one, and existing datasets either cover only answerable questions or rely on automatically generated unanswerable questions that are easy to identify. This work introduces SQuADRUn, a dataset that combines SQuAD with over 50,000 unanswerable questions written adversarially by crowdworkers to resemble answerable ones. Systems must answer when possible and abstain when the paragraph supports no answer. A strong neural model that achieves 86% F1 on SQuAD reaches only 66% F1 on SQuADRUn, demonstrating the task’s difficulty. The dataset is released to the community.

Abstract

Extractive reading comprehension systems can often locate the correct answer to a question in a context document, but they also tend to make unreliable guesses on questions for which the correct answer is not stated in the context. Existing datasets either focus exclusively on answerable questions, or use automatically generated unanswerable questions that are easy to identify. To address these weaknesses, we present SQuADRUn, a new dataset that combines the existing Stanford Question Answering Dataset (SQuAD) with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuADRUn, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering. SQuADRUn is a challenging natural language understanding task for existing models: a strong neural system that gets 86% F1 on SQuAD achieves only 66% F1 on SQuADRUn. We release SQuADRUn to the community as the successor to SQuAD.
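
To make the answer-or-abstain setting concrete, the following is a minimal sketch of how a prediction might be scored when some questions have no answer. It is not the official evaluation script. The function names (`token_f1`, `answer_or_abstain`), the use of an empty string to represent abstention, the whitespace tokenization, and the score-difference threshold rule are all assumptions made for illustration.

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer string.

    Assumption: an empty string stands for "no answer". If either side is
    empty, the score is 1.0 only when both are empty (a correct abstention)
    and 0.0 otherwise.
    """
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def answer_or_abstain(best_span: str, span_score: float,
                      no_answer_score: float, threshold: float = 0.0) -> str:
    """Hypothetical decision rule: abstain (return "") when the model's
    no-answer score exceeds its best span score by more than `threshold`."""
    if no_answer_score - span_score > threshold:
        return ""
    return best_span


if __name__ == "__main__":
    # Unanswerable question: guessing a span scores 0, abstaining scores 1.
    print(token_f1("Paris", ""))
    print(token_f1("", ""))
    # Answerable question: partial overlap gets partial credit.
    print(token_f1("in Paris", "Paris"))
```

Under this scoring, a system that always outputs a span gets no credit on any unanswerable question, which is one way to see why a model tuned only on answerable questions can lose a large share of its F1.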
