Publication | Closed Access
Automatic ontology-based knowledge extraction from Web documents
Citations: 459
References: 7
Year: 2003
Natural Language Processing, Engineering, Information Retrieval, Knowledge Extraction, Extended Ontology Terminology, Ontology Engineering, Artequakt Project, Semantic Knowledge Management, Ontology Learning, Ontology Language, Semantics, Semantic Web, Text Mining, Web Documents
The Semantic Web demands efficient access to knowledge from Web documents, yet annotations are scarce, manual annotation is impractical, and existing automatic tools are underdeveloped, creating a need for ontology‑guided extraction. This paper develops the Artequakt system, linking a knowledge extraction tool with an ontology to continuously support specialized knowledge services that harvest specific knowledge directly from unstructured Web text. Artequakt searches online documents, extracts ontology‑matched knowledge, outputs it in a machine‑readable format for automatic maintenance in a knowledge base, and expands terminology through a lexicon‑based term expansion mechanism.
To bring the Semantic Web to life and provide advanced knowledge services, we need efficient ways to access and extract knowledge from Web documents. Although Web page annotations could facilitate such knowledge gathering, annotations are rare and will probably never be rich or detailed enough to cover all the knowledge these documents contain. Manual annotation is impractical and unscalable, and automatic annotation tools remain largely undeveloped. Specialized knowledge services therefore require tools that can search and extract specific knowledge directly from unstructured text on the Web, guided by an ontology that details what type of knowledge to harvest. An ontology uses concepts and relations to classify domain knowledge. Other researchers have used ontologies to support knowledge extraction, but few have explored their full potential in this domain. This paper considers the Artequakt project, which links a knowledge extraction tool with an ontology to achieve continuous knowledge support and guide information extraction. The extraction tool searches online documents and extracts knowledge that matches the given classification structure. It provides this knowledge in a machine-readable format that will be automatically maintained in a knowledge base (KB). Knowledge extraction is further enhanced using a lexicon-based term expansion mechanism that provides extended ontology terminology.
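The pipeline the abstract describes (ontology relations, lexicon-based term expansion, matching against text, machine-readable output) can be illustrated with a toy sketch. This is not the Artequakt implementation: the ontology fragment, the lexicon entries, the function names, and the sample sentences are all invented for illustration, and simple keyword matching stands in for whatever extraction techniques the actual system uses.

```python
import json
import re

# Hypothetical ontology fragment: relation name -> seed terms that signal it.
ONTOLOGY = {
    "date_of_birth": ["born"],
    "place_of_work": ["worked"],
}

# Hypothetical lexicon standing in for a real lexical database.
LEXICON = {
    "born": ["birth"],
    "worked": ["employed", "painted"],
}


def expand_terms(ontology, lexicon):
    """Extend each relation's seed terms with related terms from the lexicon."""
    expanded = {}
    for relation, seeds in ontology.items():
        terms = set(seeds)
        for seed in seeds:
            terms.update(lexicon.get(seed, []))
        expanded[relation] = terms
    return expanded


def extract(text, subject, expanded):
    """Return one record per (sentence, relation) pair where the sentence
    mentions the subject and contains a term associated with the relation."""
    facts = []
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if subject.lower() not in sentence.lower():
            continue
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for relation, terms in expanded.items():
            if words & terms:
                facts.append(
                    {"subject": subject, "relation": relation, "evidence": sentence}
                )
    return facts


if __name__ == "__main__":
    sample = (
        "Rembrandt was born in Leiden in 1606. "
        "Rembrandt was employed in Amsterdam. "
        "The weather was fine."
    )
    facts = extract(sample, "Rembrandt", expand_terms(ONTOLOGY, LEXICON))
    # Machine-readable output that a knowledge base loader could consume.
    print(json.dumps(facts, indent=2))
```

In this sketch the second sample sentence is matched only because the lexicon adds "employed" to the terms for `place_of_work`, which is the kind of gain in coverage that term expansion is meant to provide.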