Publication | Closed Access
A Metadata Best Practice for a Scientific Data Repository
Year: 2009
Citations: 80
References: 16
Engineering, Metadata, Semantic Web, DCMI Metadata Terms, Information Retrieval, Data Science, Management, Data Integration, Dryad Repository, Data Management, Metadata Best Practice, Metadata Integration, Metadata Management, Software Design, Metadata Interoperability, Metadata Schema, British Columbia, Data Modeling, Semantic Interoperability
Abstract
Digital data repositories ought to support immediate operational needs and long-term project goals. This paper presents the Dryad repository's metadata best practice, which balances these two needs. The paper reviews background work exploring the meaning of science, characterizing data, and highlighting data curation metadata challenges. The Dryad repository is introduced, and the initiative's metadata best practice and underlying rationales are described. Dryad's metadata approach includes two prongs: one addressing the long-term goal to align with the Semantic Web via a metadata application profile, and another addressing the immediate need to make content available in DSpace via an Extensible Markup Language (XML) schema. The conclusion summarizes limitations and advantages of the two prongs underlying Dryad's metadata effort.

KEYWORDS: metadata, scientific data, Dublin Core Application Profile, Singapore Framework, Semantic Web

ACKNOWLEDGMENT
This work is supported by National Science Foundation Grant # EF-0423641. We would like to acknowledge contributions by the Dryad team members Hilmar Lapp and Todd Vision of NESCent, and Michael Whitlock, University of British Columbia. We would also like to thank Stuart Weibel, OCLC, for his thoughtful comments and support of this work.

Notes
1. DOE (Department of Energy) Data Explorer (DDE): http://www.osti.gov/dataexplorer/
2. Knowledge Network for Biocomplexity Data (KNB): http://knb.ecoinformatics.org/
3. The Dublin Core comprises both the 15 core properties from the Dublin Core Metadata Element Set (DCMES), Version 1.1, Reference Description: http://dublincore.org/documents/2004/12/20/dces/ and a set of additional properties registered in the DCMI (Dublin Core Metadata Initiative) Metadata Terms namespace: http://dublincore.org/documents/dcmi-terms/
4. Dublin Core Abstract Model (DCAM): http://dublincore.org/documents/abstract-model/
5. Dublin Core Application Profile Guidelines: http://dublincore.org/usage/documents/profile-guidelines/
6. Dryad repository: http://www.datadryad.org/repo/
7. Dryad repository Partners: http://www.datadryad.org/repo/themes/Dryad/pages/partners.html
8. Joint Data Archiving Policy: http://www.datadryad.org/repo/
9. Interoperability Levels for Dublin Core Metadata: http://dublincore.org/documents/interoperability-levels/
10. Dryad Workshop: https://www.datadryad.org/wiki/Dec_5_Workshop_Minutes
11. Collectively the DCMES (http://dublincore.org/documents/2004/12/20/dces/) and DCMI Metadata Terms (http://dublincore.org/documents/dcmi-terms/), as explained in note 3.
12. Darwin Core (DwC), Version 1.3: http://digir.sourceforge.net/schema/conceptual/darwin/core/2.0/darwincoreWithDiGIRv1.3.xsd; Version 1.4 being reviewed, see: http://wiki.tdwg.org/twiki/bin/view/DarwinCore/DarwinCoreVersions
13. Publishing Requirements for Industry Standard Metadata (PRISM): http://www.prismstandard.org/specifications/
14. Journal Publishing Tag Set Tag Library, Version 3.0, November 2008: http://dtd.nlm.nih.gov/publishing/tag-library/
15. Data Documentation Initiative (DDI): http://webapp.icpsr.umich.edu/cocoon/DDI-LIBRARY/Version2-1.xsd?section=all
16. Ecological Metadata Language (EML): http://knb.ecoinformatics.org/software/eml/eml-2.0.1/index.html
17. PREMIS Editorial Committee. PREMIS Data Dictionary for Preservation Metadata Version 2.0, 2008: http://www.loc.gov/standards/premis/v2/premis-2-0.pdf
18. Status Element - Dryad: http://www.purl.org/dryad/terms/status
19. Dryad Domain: http://www.purl.org/dryad
20. Text Encoding Initiative (TEI) Header, Chapter 2 (P5: Guidelines for Electronic Text Encoding and Interchange): http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html
21. Tim Berners-Lee on the next Web (TED Conferences, LLC): http://www.ted.com/index.php/talks/tim_berners_lee_on_the_next_web.html
22. GenBank database: http://www.psc.edu/general/software/packages/genbank/genbank.php
23. TreeBASE: http://www.treebase.org
24. Long Term Ecological Research (LTER) Network's Metacat data catalog: http://metacat.lternet.edu/knb
25. Gleaning Resource Descriptions from Dialects of Languages (GRDDL): http://www.w3.org/TR/grddl-primer/