Publication | Closed Access
RAMP
72
Citations
6
References
2011
Year
Cluster ComputingEngineeringData ScienceMap ProvenanceCloud ComputingManagementProvenance CaptureData IntegrationComputer ScienceMap-reduceSemantic WebProvenance AnalysisData ManagementData ProvenanceSentiment AnalysisMassive Data ProcessingBig Data
RAMP ( Reduce And Map Provenance ) is an extension to Hadoop that supports provenance capture and tracing for workflows of MapReduce jobs. RAMP uses a wrapper-based approach, requiring little if any user intervention in most cases, while retaining Hadoop's parallel execution and fault tolerance. We demonstrate RAMP on a real-world MapReduce workflow generated from a Pig script that performs sentiment analysis over Twitter data. We show how RAMP's automatic provenance capture and tracing capabilities provide a convenient and efficient means of drilling-down and verifying output elements.
| Year | Citations | |
|---|---|---|
Page 1
Page 1