Publication | Closed Access

Identifying Sections in Scientific Abstracts using Conditional Random Fields

Citations: 143

References: 16

Year: 2008

TLDR

Prior knowledge of the rhetorical structure of abstracts aids text-mining tasks such as extraction, retrieval, and summarization. This paper introduces a method to classify abstract sentences into objective, methods, results, and conclusions sections. The authors formulate the task as a sequential labeling problem and use Conditional Random Fields (CRFs) trained on a corpus acquired automatically from Medline abstracts. The CRF approach achieved 95.5% per-sentence accuracy and 68.8% per-abstract accuracy, outperforming previous methods.
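The two reported figures measure different things. The sketch below assumes that per-sentence accuracy is the fraction of sentences given the correct label, and that per-abstract accuracy is the fraction of abstracts in which every sentence is labeled correctly. The text here does not spell out either definition, so both are assumptions. The function names and toy labels are illustrative and do not come from the paper.

```python
def per_sentence_accuracy(gold, pred):
    """Fraction of individual sentences whose predicted label matches gold.

    gold and pred are lists of abstracts; each abstract is a list of labels.
    """
    total = sum(len(abstract) for abstract in gold)
    correct = sum(
        g == p
        for gold_abs, pred_abs in zip(gold, pred)
        for g, p in zip(gold_abs, pred_abs)
    )
    return correct / total


def per_abstract_accuracy(gold, pred):
    """Fraction of abstracts whose whole label sequence is correct (assumed definition)."""
    exact = sum(gold_abs == pred_abs for gold_abs, pred_abs in zip(gold, pred))
    return exact / len(gold)


# Toy data: two abstracts, one mislabeled sentence in the second.
gold = [["objective", "methods", "results"], ["objective", "conclusions"]]
pred = [["objective", "methods", "results"], ["objective", "results"]]

sentence_acc = per_sentence_accuracy(gold, pred)
abstract_acc = per_abstract_accuracy(gold, pred)
```

Under these definitions a single wrong sentence fails the whole abstract, which would explain why the per-abstract figure is much lower than the per-sentence one.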

Abstract

OBJECTIVE: Prior knowledge of the rhetorical structure of scientific abstracts is useful for text-mining tasks such as information extraction, information retrieval, and automatic summarization. This paper presents a novel approach to categorizing sentences in scientific abstracts into four sections: objective, methods, results, and conclusions. METHOD: Formalizing the categorization task as a sequential labeling problem, we employ Conditional Random Fields (CRFs) to assign section labels to abstract sentences. The training corpus is acquired automatically from Medline abstracts. RESULTS: The proposed method outperformed previous approaches, achieving 95.5% per-sentence accuracy and 68.8% per-abstract accuracy. CONCLUSION: The experimental results showed that CRFs model the rhetorical structure of abstracts more suitably than the previous approaches.
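One plausible way to acquire such a training corpus automatically is to start from abstracts that already carry section headings, like the one above, and turn the headings into sentence labels. The sketch below is an assumption about how that could work, not the authors' documented procedure. The heading map, the regular expressions, and the naive sentence splitter are all hypothetical.

```python
import re

# Hypothetical mapping from heading variants to the four section labels.
HEADING_TO_LABEL = {
    "OBJECTIVE": "objective",
    "OBJECTIVES": "objective",
    "METHOD": "methods",
    "METHODS": "methods",
    "RESULTS": "results",
    "CONCLUSION": "conclusions",
    "CONCLUSIONS": "conclusions",
}

# Matches an upper-case heading followed by a colon, e.g. "METHODS: ".
HEADING_RE = re.compile(
    r"\b(" + "|".join(sorted(HEADING_TO_LABEL, key=len, reverse=True)) + r"):\s*"
)

# Naive splitter: break after ., ! or ? when whitespace follows.
SENTENCE_RE = re.compile(r"(?<=[.!?])\s+")


def label_sentences(abstract):
    """Return (label, sentence) pairs for a structured abstract."""
    # Splitting on a capturing group yields [prefix, heading, body, heading, body, ...].
    parts = HEADING_RE.split(abstract)
    labeled = []
    for heading, body in zip(parts[1::2], parts[2::2]):
        label = HEADING_TO_LABEL[heading]
        for sentence in SENTENCE_RE.split(body.strip()):
            if sentence:
                labeled.append((label, sentence))
    return labeled


sample = (
    "OBJECTIVE: We study X. It matters. METHODS: We use Y. "
    "RESULTS: Accuracy was 95.5%. CONCLUSIONS: Y works."
)
pairs = label_sentences(sample)
labels = [label for label, _ in pairs]
```

Each abstract then yields one label sequence, which is the unit a sequential labeler such as a CRF is trained on. A real pipeline would need a more robust sentence splitter and a wider set of heading variants.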
