Publication | Open Access
Challenges in building a DBMS Resource Advisor
Citations: 103
References: 10
Year: 2006
Data Annotation, Engineering, Knowledge Extraction, Business Intelligence, Project Management, Semantic Web, Text Mining, DBMS Resource Advisor, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Information Extraction Systems, Database System, Computational Linguistics, Management, Data Integration, Database Construction, Named-entity Recognition, Data Management, Knowledge Discovery, Computer Science, Information Management, Database Technology, Information Extraction, Knowledge Management, Data Extraction, Data Modeling
The AVATAR Information Extraction System (IES) at the IBM Almaden Research Center enables high-precision, rule-based information extraction from text documents. Drawing on our experience, we propose the use of probabilistic database techniques as the formal underpinnings of information extraction systems, so as to maintain high precision while increasing recall. This involves building a framework in which rule-based annotators can be mapped to queries in a database system. We use examples from AVATAR IES to describe the challenges in achieving this goal. Finally, we show that deriving precision estimates in such a database system presents a significant challenge for probabilistic database systems.
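The idea of mapping a rule-based annotator to a database query can be illustrated with a minimal sketch. It is not taken from AVATAR IES: the `person` and `phone` tables, their columns, the sample rows, and the probabilities are all hypothetical, and multiplying probabilities assumes the base annotations are independent, a simplification that sidesteps the precision-estimation difficulty the abstract points to.

```python
import sqlite3


def build_db():
    """Create an in-memory database holding hypothetical base annotations."""
    conn = sqlite3.connect(":memory:")
    # Each row is one extracted annotation with an assumed confidence.
    conn.execute("CREATE TABLE person (doc_id INTEGER, name TEXT, prob REAL)")
    conn.execute("CREATE TABLE phone (doc_id INTEGER, number TEXT, prob REAL)")
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        [(1, "Alice", 0.9), (2, "Bob", 0.6)],
    )
    conn.executemany(
        "INSERT INTO phone VALUES (?, ?, ?)",
        [(1, "555-0100", 0.8), (2, "555-0199", 0.5)],
    )
    return conn


def person_phone(conn, min_prob=0.0):
    """Derived annotator expressed as a join over the base annotations.

    The rule "a person and a phone number in the same document form a
    contact" becomes a query; the joint probability is the product of the
    two base probabilities (independence is an assumption of this sketch).
    """
    return conn.execute(
        "SELECT p.name, ph.number, p.prob * ph.prob AS joint_prob "
        "FROM person AS p JOIN phone AS ph ON p.doc_id = ph.doc_id "
        "WHERE p.prob * ph.prob >= ? "
        "ORDER BY joint_prob DESC",
        (min_prob,),
    ).fetchall()
```

Raising `min_prob` keeps only the more confident derived annotations, while lowering it admits more candidates; this is the precision versus recall trade-off in its simplest form.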