Publication | Closed Access

Addressing big data issues in Scientific Data Infrastructure

Citations: 479

References: 8

Year: 2013

TLDR

Big Data, characterized by volume, velocity, variety, value, and veracity, are increasingly central to science and industry. The paper addresses the challenges that Big Data pose for Scientific Data Infrastructure (SDI) by proposing the Scientific Data Lifecycle Management (SDLM) model and a generic SDI architecture to guide the development of interoperable data-centric or project-centric SDI. The authors derive requirements from diverse scientific communities, design the SDLM and SDI models, and explain how both can be implemented using cloud-based infrastructure services, identifying the major infrastructure components needed for Big Data.

Abstract

Big Data are becoming a new technology focus both in science and in industry. This paper discusses the challenges that Big Data impose on modern and future Scientific Data Infrastructure (SDI). The paper discusses the nature and definition of Big Data, which include features such as Volume, Velocity, Variety, Value and Veracity. The paper refers to different scientific communities to define requirements for data management, access control and security. The paper introduces the Scientific Data Lifecycle Management (SDLM) model, which includes all the major stages and reflects the specifics of data management in modern e-Science. The paper proposes a generic SDI architecture model that provides a basis for building interoperable data-centric or project-centric SDI using modern technologies and best practices. The paper explains how the proposed SDLM and SDI models can be naturally implemented using the modern cloud-based infrastructure services provisioning model, and suggests the major infrastructure components for Big Data.
