Concepedia

Publication | Closed Access

Extracting Opinion Expressions with semi-Markov Conditional Random Fields

Citations: 140

References: 25

Year: 2012

TLDR

Opinion expression extraction is commonly treated as token‑level sequence labeling with CRFs, which cannot easily capture segment‑level syntactic structure. This work proposes a semi‑CRF approach that performs sequence labeling at the segment level. The authors extend the semi‑CRF model to handle arbitrarily long expressions and incorporate syntactic parse information at segment boundaries, then evaluate it on two opinion extraction tasks. Experiments show the semi‑CRF method outperforms state‑of‑the‑art approaches on both tasks.

Abstract

Extracting opinion expressions from text is usually formulated as a token-level sequence labeling task tackled using Conditional Random Fields (CRFs). CRFs, however, do not readily model potentially useful segment-level information like syntactic constituent structure. Thus, we propose a semi-CRF-based approach to the task that can perform sequence labeling at the segment level. We extend the original semi-CRF model (Sarawagi and Cohen, 2004) to allow the modeling of arbitrarily long expressions while accounting for their likely syntactic structure when modeling segment boundaries. We evaluate performance on two opinion extraction tasks, and, in contrast to previous sequence labeling approaches to the task, explore the usefulness of segment-level syntactic parse features. Experimental results demonstrate that our approach outperforms state-of-the-art methods for both opinion expression tasks.
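To make the idea of labeling at the segment level concrete, here is a minimal sketch of semi-Markov Viterbi decoding in the style of the original semi-CRF formulation. It is not the authors' model: it keeps a fixed maximum segment length (`max_len`) rather than the extension to arbitrarily long expressions, and the label names (`OPINION`, `O`), the phrase lexicon, and the hand-written `toy_score` function are all assumptions made for illustration. A trained model would replace `toy_score` with a learned weighted sum of segment features, which is where segment-level syntactic parse features could enter.

```python
from typing import Callable, List, Optional, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, label)


def semi_markov_viterbi(
    n: int,
    labels: List[str],
    max_len: int,
    score: Callable[[int, int, str, str], float],
) -> Tuple[float, List[Segment]]:
    """Find the highest-scoring segmentation of tokens 0..n-1.

    score(start, end, label, prev_label) scores one whole segment covering
    tokens start..end-1; prev_label is "<s>" for the first segment.
    """
    neg_inf = float("-inf")
    # best[i][y]: best score over the first i tokens, last segment labeled y.
    best = [{y: neg_inf for y in labels} for _ in range(n + 1)]
    back: List[dict] = [{y: None for y in labels} for _ in range(n + 1)]

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            for y in labels:
                if start == 0:
                    candidates = [(score(0, end, y, "<s>"), None)]
                else:
                    candidates = [
                        (best[start][p] + score(start, end, y, p), p)
                        for p in labels
                        if best[start][p] > neg_inf
                    ]
                for value, prev in candidates:
                    if value > best[end][y]:
                        best[end][y] = value
                        back[end][y] = (start, prev)

    # Follow back-pointers from the best final label.
    label: Optional[str] = max(labels, key=lambda y: best[n][y])
    total = best[n][label]
    segments: List[Segment] = []
    end = n
    while end > 0:
        start, prev = back[end][label]
        segments.append((start, end, label))
        end, label = start, prev
    segments.reverse()
    return total, segments


# Toy example (hypothetical sentence, lexicon, and scores).
tokens = ["critics", "have", "strongly", "condemned", "the", "plan"]
opinion_phrases = {("strongly", "condemned"), ("condemned",)}


def toy_score(start: int, end: int, label: str, prev_label: str) -> float:
    length = end - start
    if label == "OPINION":
        # Reward segments that match a known phrase as a whole unit.
        matched = tuple(tokens[start:end]) in opinion_phrases
        return 2.0 * length if matched else -1.0 * length
    # Non-opinion segments are kept to a single token.
    return 0.0 if length == 1 else -10.0


total, segments = semi_markov_viterbi(len(tokens), ["OPINION", "O"], 3, toy_score)
for start, end, label in segments:
    print(label, tokens[start:end])
```

Because the score is assigned to a whole segment, the two-token phrase "strongly condemned" is recovered as a single OPINION segment rather than being assembled from per-token tags. The toy score ignores `prev_label`; a fuller model would also score label transitions between adjacent segments.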
