Publication | Closed Access
THALIA: Test Harness for the Assessment of Legacy Information Integration Approaches
Citations: 48
References: 10
Year: 2005
Unknown Venue
Software Maintenance, Test Harness, Engineering, Business Intelligence, Software Engineering, Semantic Web, Corpus Linguistics, Information Infrastructure, Natural Language Processing, Data Sources, Information Retrieval, Data Science, Information Technology Management, Computational Linguistics, Management, Legacy System, Integration Testing, Data Integration, Enterprise Information System, Available Testbed, Semantic Integration, Learning Analytics, Information Management, Software Design, Software Testing, Advanced Information System, Data-driven Learning, Test Collection, Semantic Interoperability
We introduce THALIA (Test Harness for the Assessment of Legacy Information Integration Approaches), a new, publicly available testbed and benchmark for testing and evaluating integration technologies. THALIA provides researchers with a collection of 40 downloadable data sources representing university course catalogs from computer science departments worldwide. In addition, THALIA currently provides a set of twelve challenge queries as well as a scoring function for ranking the performance of an integration system. A second contribution is a systematic classification of the types of syntactic and semantic heterogeneities, which directly lead to the twelve challenge queries. We chose course information as our domain of discourse because it is well known and easy to understand. Furthermore, the abundance of publicly available data sources allowed us to develop a testbed exhibiting all of the syntactic and semantic heterogeneities that we have identified.