Concepedia

Publication | Closed Access

A Metadata Catalog Service for Data Intensive Applications

Citations: 224

References: 12

Year: 2003

TLDR

Advances in computational, storage, and network technologies, together with middleware such as the Globus Toolkit, enable data‑intensive applications that generate terabytes to petabytes of distributed data. Managing these data sets efficiently requires metadata management, and Grid environments are likely to host cataloguing services specialized for particular types of metadata. The paper presents the design of a Metadata Catalog Service (MCS) that stores, accesses, and queries descriptive metadata for data‑intensive applications. The authors implemented the MCS, applied it to several applications, and performed a scalability study of the service. According to that study, the MCS handles large data sets efficiently and supports scalable metadata queries.

Abstract

Advances in computational, storage and network technologies as well as middleware such as the Globus Toolkit allow scientists to expand the sophistication and scope of data-intensive applications. These applications produce and analyze terabytes and petabytes of data that are distributed in millions of files or objects. To manage these large data sets efficiently, metadata or descriptive information about the data needs to be managed. There are various types of metadata, and it is likely that a range of metadata services will exist in Grid environments that are specialized for particular types of metadata cataloguing and discovery. In this paper, we present the design of a Metadata Catalog Service (MCS) that provides a mechanism for storing and accessing descriptive metadata and allows users to query for data items based on desired attributes. We describe our experience in using the MCS with several applications and present a scalability study of the service.
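
The idea of storing descriptive metadata and querying for data items by attribute can be illustrated with a minimal sketch. This is not the MCS implementation or its API: the class `MetadataCatalog`, its methods, and the sample item names and attributes are all hypothetical, and the toy version keeps everything in memory rather than in a Grid service.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataCatalog:
    """Hypothetical in-memory catalog: logical item name -> attribute dict."""

    _items: dict = field(default_factory=dict)

    def add(self, name, **attributes):
        # Store (or overwrite) the descriptive attributes of one data item.
        self._items[name] = dict(attributes)

    def get(self, name):
        # Return a copy of one item's attributes; raises KeyError if unknown.
        return dict(self._items[name])

    def query(self, **wanted):
        # Return the sorted names of items that carry every requested
        # attribute with exactly the requested value.
        return sorted(
            name
            for name, attrs in self._items.items()
            if all(k in attrs and attrs[k] == v for k, v in wanted.items())
        )


# Illustrative data only; these names and attributes are made up.
catalog = MetadataCatalog()
catalog.add("run-001.dat", experiment="demo", year=2003)
catalog.add("run-002.dat", experiment="demo", year=2002)
catalog.add("sky-17.fits", experiment="survey", year=2003)

matches = catalog.query(experiment="demo", year=2003)
```

A linear scan like this would not scale to millions of items; it only shows the shape of an attribute-based lookup, not how a production catalog would index or distribute its metadata.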
