Data Mining on Symbolic Knowledge Extracted from the Web

Abstract

Information extractors and classifiers operating on unrestricted unstructured texts are an errorful source of large amounts of potentially useful information, especially when combined with a crawler which automatically augments the knowledge base from the world-wide web. At the same time, there is much structured information on the world-wide web. Wrapping the web-sites which provide this kind of information provide us with a second source of information; possibly less up-to-date, but reliable as facts. We give a case study of combining information from these two kinds of sources in the context of learning facts about companies. We provide results of association rules, propositional and relational learning, which demonstrate that data-mining can help us improve our extractors, and that using information from two kinds of sources improves the reliability of data-mined rules.

References

Page 1

	Year	Citations

Page 1