
Publication | Open Access

Bidirectional LSTM-CRF Models for Sequence Tagging

3.3K citations · 21 references · Published 2015

TLDR

The paper proposes a range of LSTM-based models for sequence tagging. The authors implement several architectures: LSTM, BI-LSTM, LSTM-CRF, and BI-LSTM-CRF, the last combining a bidirectional LSTM with a CRF layer to capture both past/future context and sentence-level tag dependencies. The BI-LSTM-CRF achieves accuracy at or near the state of the art on POS tagging, chunking, and NER benchmarks, and proves robust with reduced reliance on word embeddings.

Abstract

In this paper, we propose a variety of Long Short-Term Memory (LSTM) based models for sequence tagging. These models include LSTM networks, bidirectional LSTM (BI-LSTM) networks, LSTM with a Conditional Random Field (CRF) layer (LSTM-CRF) and bidirectional LSTM with a CRF layer (BI-LSTM-CRF). Our work is the first to apply a bidirectional LSTM CRF (denoted as BI-LSTM-CRF) model to NLP benchmark sequence tagging data sets. We show that the BI-LSTM-CRF model can efficiently use both past and future input features thanks to a bidirectional LSTM component. It can also use sentence level tag information thanks to a CRF layer. The BI-LSTM-CRF model can produce state of the art (or close to) accuracy on POS, chunking and NER data sets. In addition, it is robust and has less dependence on word embedding as compared to previous observations.
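The CRF layer mentioned in the abstract is what lets the model use sentence-level tag information: instead of picking each token's tag independently, decoding maximizes the sum of per-token emission scores (from the LSTM) plus learned tag-transition scores. A minimal sketch of that decoding step (Viterbi) is below; the tag set, scores, and function name are illustrative assumptions, not taken from the paper.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions:   list of dicts {tag: score}, one per token (e.g. LSTM outputs)
    transitions: dict {(prev_tag, tag): score} of tag-transition scores
    """
    tags = list(emissions[0].keys())
    # Best path score ending in each tag at the first token.
    best = {t: emissions[0][t] for t in tags}
    backptr = []
    for em in emissions[1:]:
        new_best, ptr = {}, {}
        for t in tags:
            # Pick the previous tag that maximizes path score into t.
            prev, score = max(
                ((p, best[p] + transitions[(p, t)]) for p in tags),
                key=lambda x: x[1],
            )
            new_best[t] = score + em[t]
            ptr[t] = prev
        best = new_best
        backptr.append(ptr)
    # Backtrack from the best final tag to recover the full sequence.
    last = max(best, key=best.get)
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return list(reversed(path))


# Toy example (hypothetical scores): the second token locally prefers "I",
# and the first locally prefers "O", but the transition O -> I is heavily
# penalized, so the CRF flips the first tag to "B" for a consistent sequence.
tags = ["B", "I", "O"]
trans = {(p, t): (-10.0 if (p, t) == ("O", "I") else 0.0)
         for p in tags for t in tags}
ems = [{"B": 1.0, "I": 0.0, "O": 2.0},
       {"B": 1.0, "I": 3.0, "O": 0.0}]
print(viterbi_decode(ems, trans))  # -> ['B', 'I']
```

This illustrates the point made in the abstract: per-token argmax would output ["O", "I"], an invalid chunking, while the transition scores steer decoding toward a globally consistent tag sequence.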

