Publication | Closed Access
FASTUS: A Finite-state Processor for Information Extraction from Real-world Text.
Citations: 256
References: 4
Year: 1993
Syntactic Parsing, Engineering, Part-of-speech Tagging, Several Blind Tests, Text Mining, Natural Language Processing, Syntax, Information Retrieval, Data Science, Finite-state Machine, Computational Linguistics, Language Engineering, Grammar, Language Studies, Machine Translation, Massive Ambiguity, NLP Task, Linguistics, Computer Science, Information Extraction, Shallow Parsing, Parsing, Data Extraction, Text Processing, Finite-state Processor
Approaches to text processing that rely on parsing the text with a context-free grammar tend to be slow and error-prone because of the massive ambiguity of long sentences. In contrast, FASTUS employs a nondeterministic finite-state language model that produces a phrasal decomposition of a sentence into noun groups, verb groups and particles. Another finite-state machine recognizes domain-specific phrases based on combinations of the heads of the constituents found in the first pass. FASTUS has been evaluated on several blind tests that demonstrate that state-of-the-art performance on information-extraction tasks is obtainable with surprisingly little computational effort.
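
The two-pass idea in the abstract can be illustrated with a small sketch. This is not the FASTUS implementation: the toy lexicon, the one-letter tag set, the `chunk` and `extract` functions, and the attack-event pattern are all assumptions made up for illustration, and Python regular expressions stand in for the finite-state machines (without modelling the nondeterminism the abstract mentions). The first pass groups tagged words into noun groups, verb groups and particles; the second pass matches a domain pattern over the heads of those groups.

```python
import re

# Hypothetical toy lexicon: one tag character per word.
# D = determiner, A = adjective, N = noun, X = auxiliary, V = verb, P = particle
LEXICON = {
    "the": "D", "a": "D",
    "armed": "A", "local": "A",
    "guerrillas": "N", "mayor": "N", "office": "N",
    "have": "X", "has": "X", "was": "X",
    "attacked": "V", "bombed": "V",
    "in": "P", "of": "P",
}

# Pass 1: a regular expression over the tag string plays the role of the
# phrasal finite-state machine (noun group, verb group, particle).
GROUP_RE = re.compile(r"(?P<NG>D?A*N+)|(?P<VG>X*V)|(?P<P>P)")


def chunk(tokens):
    """Return (kind, words, head) triples; the head is the last word of a group."""
    # One tag character per token, so string offsets equal token offsets.
    tags = "".join(LEXICON.get(t.lower(), "?") for t in tokens)
    groups = []
    for m in GROUP_RE.finditer(tags):
        words = tokens[m.start():m.end()]
        groups.append((m.lastgroup, words, words[-1].lower()))
    return groups


# Pass 2: a second pattern over the sequence of group kinds, filtered by the
# head of the verb group (an assumed, simplified domain pattern).
ATTACK_VERBS = {"attacked", "bombed"}
KIND_CODE = {"NG": "n", "VG": "v", "P": "p"}


def extract(groups):
    """Find noun-group / verb-group / noun-group triples with an attack verb head."""
    kinds = "".join(KIND_CODE[kind] for kind, _, _ in groups)
    events = []
    for m in re.finditer(r"nvn", kinds):
        subj, verb, obj = groups[m.start():m.end()]
        if verb[2] in ATTACK_VERBS:
            events.append(
                {"perpetrator": subj[2], "action": verb[2], "target": obj[2]}
            )
    return events


if __name__ == "__main__":
    sentence = "The armed guerrillas have attacked the mayor in the local office"
    groups = chunk(sentence.split())
    for kind, words, head in groups:
        print(kind, words, "head:", head)
    print(extract(groups))
```

Both passes run in a single left-to-right scan with no recursive parse, which is the property the abstract credits for the low computational cost. A real system would need a proper tagger or lexicon and many more patterns than this sketch assumes.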