Publication | Closed Access
Identifying Sections in Scientific Abstracts using Conditional Random Fields
Citations: 143 | References: 16 | Year: 2008
Prior knowledge of abstract rhetorical structure aids text-mining tasks such as extraction, retrieval, and summarization. This paper introduces a method to classify abstract sentences into objective, methods, results, and conclusions sections. The authors formulate the task as a sequential labeling problem and use Conditional Random Fields trained on automatically extracted Medline abstracts. The CRF approach achieved 95.5% per-sentence accuracy and 68.8% per-abstract accuracy, outperforming previous methods.
OBJECTIVE: Prior knowledge of the rhetorical structure of scientific abstracts is useful for various text-mining tasks such as information extraction, information retrieval, and automatic summarization. This paper presents a novel approach to categorizing sentences in scientific abstracts into four sections: objective, methods, results, and conclusions.
METHOD: Formalizing the categorization task as a sequential labeling problem, we employ Conditional Random Fields (CRFs) to assign section labels to abstract sentences. The training corpus is acquired automatically from Medline abstracts.
RESULTS: The proposed method outperformed the previous approaches, achieving 95.5% per-sentence accuracy and 68.8% per-abstract accuracy.
CONCLUSION: The experimental results show that CRFs model the rhetorical structure of abstracts more suitably than the previous approaches.
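The gap between the two reported figures is easier to see with a small worked example. The sketch below is not the authors' evaluation code. It assumes that per-sentence accuracy is the fraction of sentences labeled correctly across all abstracts, and that per-abstract accuracy is the fraction of abstracts in which every sentence is labeled correctly. The function name and the toy label sequences are hypothetical.

```python
from typing import List, Tuple

Labels = List[str]  # one section label per sentence of an abstract


def sentence_and_abstract_accuracy(
    gold: List[Labels], pred: List[Labels]
) -> Tuple[float, float]:
    """Return (per-sentence accuracy, per-abstract accuracy).

    Assumption: an abstract counts as correct only if all of its
    sentences receive the gold label.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")

    total_sentences = 0
    correct_sentences = 0
    correct_abstracts = 0
    for gold_seq, pred_seq in zip(gold, pred):
        if len(gold_seq) != len(pred_seq):
            raise ValueError("each abstract needs one prediction per sentence")
        matches = sum(1 for g, p in zip(gold_seq, pred_seq) if g == p)
        total_sentences += len(gold_seq)
        correct_sentences += matches
        if matches == len(gold_seq):
            correct_abstracts += 1

    if total_sentences == 0:
        raise ValueError("no sentences to evaluate")
    return correct_sentences / total_sentences, correct_abstracts / len(gold)


# Toy data (hypothetical): two four-sentence abstracts, one label wrong.
gold = [
    ["OBJECTIVE", "METHOD", "RESULTS", "CONCLUSION"],
    ["OBJECTIVE", "METHOD", "RESULTS", "CONCLUSION"],
]
pred = [
    ["OBJECTIVE", "METHOD", "RESULTS", "CONCLUSION"],
    ["OBJECTIVE", "METHOD", "METHOD", "CONCLUSION"],
]
per_sentence, per_abstract = sentence_and_abstract_accuracy(gold, pred)
print(per_sentence, per_abstract)
```

Under these assumptions a single mislabeled sentence costs one eighth of the per-sentence score but half of the per-abstract score, which is why the per-abstract figure is the stricter of the two.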