Publication | Open Access
Incorporating non-local information into information extraction systems by Gibbs sampling
Citations: 3K
References: 21
Year: 2005
Venue: Unknown
Topics: Syntactic Parsing, Engineering, Knowledge Extraction, Semantic Web, Corpus Linguistics, Language Processing, Text Mining, Natural Language Processing, Information Retrieval, Data Science, Information Extraction Systems, Computational Linguistics, Long Distance Structure, Gibbs Sampling, Language Studies, Named-entity Recognition, Machine Translation, NLP Task, Knowledge Discovery, Terminology Extraction, Information Extraction, Semantic Parsing, Information Extraction Tasks, Data Extraction, Linguistics
TL;DR: Most current statistical NLP models rely on local features to enable dynamic programming inference, which limits their ability to capture the long-distance structure common in language. The study proposes using Gibbs sampling to incorporate non-local information into statistical NLP models. By replacing Viterbi decoding with simulated annealing in sequence models such as HMMs, CMMs, and CRFs, the authors augment a CRF-based information extraction system with long-distance dependency models that enforce label and template consistency while keeping inference tractable. This technique achieves an error reduction of up to 9% over state-of-the-art systems on two established information extraction tasks.
Abstract: Most current statistical natural language processing models use only local features so as to permit dynamic programming in inference, but this makes them unable to fully account for the long distance structure that is prevalent in language use. We show how to solve this dilemma with Gibbs sampling, a simple Monte Carlo method used to perform approximate inference in factored probabilistic models. By using simulated annealing in place of Viterbi decoding in sequence models such as HMMs, CMMs, and CRFs, it is possible to incorporate non-local structure while preserving tractable inference. We use this technique to augment an existing CRF-based information extraction system with long-distance dependency models, enforcing label consistency and extraction template consistency constraints. This technique results in an error reduction of up to 9% over state-of-the-art systems on two established information extraction tasks.
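The core idea in the abstract — resampling one label at a time from a factored model that combines local sequence scores with a non-local consistency term, while lowering the sampling temperature so that Gibbs sampling anneals toward a high-scoring labeling — can be illustrated with a minimal sketch. This is not the authors' implementation: the toy scoring functions, label set, and consistency bonus below are illustrative assumptions, not details from the paper.

```python
import math
import random

def gibbs_annealed_decode(tokens, labels, local_score, global_score,
                          sweeps=50, t0=2.0, cooling=0.9, seed=0):
    """Annealed Gibbs sampling for sequence labeling (a sketch).

    Each pass resamples the label at every position i from
    P(y_i | y_-i, x) proportional to exp(score / T); cooling T toward 0
    turns sampling into near-greedy maximization (simulated annealing),
    so non-local terms in `global_score` can influence decoding without
    dynamic programming over the whole joint model.
    """
    rng = random.Random(seed)
    y = [rng.choice(labels) for _ in tokens]  # random initial labeling
    temp = t0
    for _ in range(sweeps):
        for i in range(len(tokens)):
            # Score every candidate label at position i, holding the rest fixed.
            scores = []
            for lab in labels:
                y[i] = lab
                scores.append((local_score(tokens, y, i) + global_score(y)) / temp)
            # Sample a label in proportion to exp(score), numerically stabilized.
            m = max(scores)
            weights = [math.exp(s - m) for s in scores]
            r = rng.random() * sum(weights)
            acc = 0.0
            for lab, w in zip(labels, weights):
                acc += w
                if r <= acc:
                    y[i] = lab
                    break
        temp = max(temp * cooling, 1e-3)  # cool once per sweep
    return y

# Toy example (hypothetical scores): local evidence only identifies the first
# "Boston" as a location; a label-consistency bonus propagates that label to
# the second, ambiguous occurrence — the non-local effect the paper exploits.
TOKENS = ["Boston", "visited", "Boston"]
LABELS = ["LOC", "O"]

def toy_local(tokens, y, i):
    if i == 0 and y[i] == "LOC":
        return 3.0  # strong local evidence at the first mention
    if tokens[i] == "visited" and y[i] == "O":
        return 3.0
    return 0.0

def toy_global(y):
    # Reward identical tokens that share a label (label consistency).
    bonus = 0.0
    for i in range(len(TOKENS)):
        for j in range(i + 1, len(TOKENS)):
            if TOKENS[i] == TOKENS[j] and y[i] == y[j]:
                bonus += 2.0
    return bonus

decoded = gibbs_annealed_decode(TOKENS, LABELS, toy_local, toy_global)
```

At the final low temperature each resampling step is effectively an argmax, so the chain settles on the consistent labeling `["LOC", "O", "LOC"]` even though the second "Boston" has no local evidence of its own.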
| Year | Citations |
|---|---|
| 1983 | 44K |
| 1989 | 22.6K |
| 1984 | 17.9K |
| 2001 | 13K |
| 1988 | 3K |
| 2003 | 2.4K |
| 1997 | 1K |
| 2002 | 634 |
| 1999 | 467 |
| 1999 | 427 |