Publication | Closed Access
MAD skills
Citations: 466
References: 18
Year: 2009
Massive Data Acquisition, Data Modeling, Engineering, Database System, Data Science, Business Intelligence, Big Data Analytics, Management, Data Integration, Computer Science, Massive Data Processing, Map-reduce, Data Management, Data-intensive Computing, Big Data, High-performance Data Analytics
Massive data acquisition and storage are becoming affordable, prompting many enterprises to employ statisticians for sophisticated data analysis. This paper introduces Magnetic, Agile, Deep (MAD) data analysis as a radical departure from traditional enterprise data warehousing and business intelligence, outlining database design methods that support agile analyst workflows and highlighting system features that enable flexible algorithm development via SQL and MapReduce. The authors describe a design philosophy and techniques for MAD analytics at Fox Audience Network using the Greenplum parallel database, including data‑parallel algorithms for advanced statistical density methods. They conclude that database system features such as SQL and MapReduce interfaces over diverse storage mechanisms support agile design and flexible algorithm development.
As massive data acquisition and storage become increasingly affordable, a wide variety of enterprises are employing statisticians to engage in sophisticated data analysis. In this paper we highlight the emerging practice of Magnetic, Agile, Deep (MAD) data analysis as a radical departure from traditional Enterprise Data Warehouses and Business Intelligence. We present our design philosophy, techniques, and experience providing MAD analytics for one of the world's largest advertising networks at Fox Audience Network, using the Greenplum parallel database system. We describe database design methodologies that support the agile working style of analysts in these settings. We present data-parallel algorithms for sophisticated statistical techniques, with a focus on density methods. Finally, we reflect on database system features that enable agile design and flexible algorithm development using both SQL and MapReduce interfaces over a variety of storage mechanisms.