Concepedia

Publication | Closed Access

MLbase: A Distributed Machine-learning System

Citations: 294

References: 19

Year: 2013

TLDR

Machine learning and statistical techniques are essential for extracting actionable knowledge from big data, yet their complexity and the inaccessibility of scalable systems make them difficult for many users and researchers. The paper proposes MLbase, a system designed to make machine learning accessible to both end‑users and researchers. MLbase offers a declarative interface for ML tasks, a dynamic optimizer that selects and adapts learning algorithms, high‑level operators that let researchers implement diverse methods without deep systems knowledge, and a runtime optimized for the data‑access patterns of these operators.

Abstract

Machine learning (ML) and statistical techniques are key to transforming big data into actionable knowledge. In spite of the modern primacy of data, the complexity of existing ML algorithms is often overwhelming: many users do not understand the trade-offs and challenges of parameterizing and choosing between different learning techniques. Furthermore, existing scalable systems that support machine learning are typically not accessible to ML researchers without a strong background in distributed systems and low-level primitives. In this work, we present our vision for MLbase, a novel system harnessing the power of machine learning for both end-users and ML researchers. MLbase provides (1) a simple declarative way to specify ML tasks, (2) a novel optimizer to select and dynamically adapt the choice of learning algorithm, (3) a set of high-level operators to enable ML researchers to scalably implement a wide range of ML methods without deep systems knowledge, and (4) a new run-time optimized for the data-access patterns of these high-level operators.
