Publication | Closed Access
The relationship between Recall and Precision
Citations: 940 | References: 6 | Year: 1994
Memory Retrieval, Engineering, Databases, Accuracy And Precision, Overall Retrieval Performance, Cognition, Social Sciences, Text Mining, Information Retrieval, Data Science, Data Mining, Relevance Feedback, Memory, Adaptive Memory, Retrieval Technique, Cognitive Science, Retrieval Performance, Tangent Parabola, Knowledge Retrieval, Mnemonic
Empirical studies show that Precision tends to decline as Recall increases. The article investigates how Precision and Recall relate to each other and to the number of documents retrieved, under various assumptions about retrieval performance. The authors show that a Precision–Recall tradeoff is unavoidable whenever retrieval is consistently better than retrieval at random. More generally, avoiding the tradeoff as the number of retrieved documents grows requires retrieval performance to match or exceed overall retrieval performance up to that point. Analyzing the mathematical link between the two measures, they note that a quadratic Recall curve can resemble empirical Recall–Precision behavior when transformed into a tangent parabola. For very large databases or systems with limited retrieval capabilities, they describe retrieval in two stages: an initial retrieval emphasizing high Recall, followed by more detailed searching of the initially retrieved set, can improve both Recall and Precision simultaneously, although a tradeoff between the two remains. © 1994 John Wiley & Sons, Inc.
Empirical studies of retrieval performance have shown a tendency for Precision to decline as Recall increases. This article examines the nature of the relationship between Precision and Recall. The relationships between Recall and the number of documents retrieved, between Precision and the number of documents retrieved, and between Precision and Recall are described in the context of different assumptions about retrieval performance. It is demonstrated that a tradeoff between Recall and Precision is unavoidable whenever retrieval performance is consistently better than retrieval at random. More generally, for the Precision–Recall trade-off to be avoided as the total number of documents retrieved increases, retrieval performance must be equal to or better than overall retrieval performance up to that point. Examination of the mathematical relationship between Precision and Recall shows that a quadratic Recall curve can resemble empirical Recall–Precision behavior if transformed into a tangent parabola. With very large databases and/or systems with limited retrieval capabilities there can be advantages to retrieval in two stages: Initial retrieval emphasizing high Recall, followed by more detailed searching of the initially retrieved set, can be used to improve both Recall and Precision simultaneously. Even so, a tradeoff between Precision and Recall remains. © 1994 John Wiley & Sons, Inc.
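The quantities discussed in the abstract can be illustrated with a small sketch. It uses the standard definitions: Recall is the number of relevant documents retrieved divided by the number of relevant documents in the collection, and Precision is the number of relevant documents retrieved divided by the number of documents retrieved. The 10-document ranking, the function names, and the variable names below are hypothetical and chosen only for illustration; they are not taken from the article.

```python
from fractions import Fraction


def recall_precision_curve(ranked_relevance):
    """Return a list of (recall, precision) pairs, one per retrieval cutoff.

    ranked_relevance: booleans in retrieval order, True meaning "relevant".
    """
    total_relevant = sum(ranked_relevance)
    curve = []
    hits = 0
    for n, is_relevant in enumerate(ranked_relevance, start=1):
        hits += is_relevant                       # relevant documents seen so far
        recall = Fraction(hits, total_relevant)   # share of all relevant documents
        precision = Fraction(hits, n)             # share of retrieved documents
        curve.append((recall, precision))
    return curve


def random_baseline_precision(total_relevant, collection_size):
    """Expected Precision of random retrieval, the same at every cutoff."""
    return Fraction(total_relevant, collection_size)


# Hypothetical collection: 10 documents, 4 relevant, with the relevant
# documents ranked mostly early (better than random retrieval).
ranking = [True, True, False, True, False, False, True, False, False, False]
curve = recall_precision_curve(ranking)
baseline = random_baseline_precision(sum(ranking), len(ranking))

for n, (recall, precision) in enumerate(curve, start=1):
    print(f"n={n:2d}  recall={float(recall):.2f}  precision={float(precision):.2f}")
print(f"random baseline precision = {float(baseline):.2f}")
```

In this toy ranking Precision starts at 1 while Recall is 1/4, and by the time Recall reaches 1 Precision has fallen toward the random baseline of 4/10, which it equals once the whole collection has been retrieved. Precision is not strictly monotone along the way (it rises briefly at the fourth document), which is consistent with the abstract describing a tendency rather than a strict rule.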