Publication | Closed Access
Towards Resource-Elastic Machine Learning
Citations: 16
References: 9
Year: 2013
Unknown Venue
Artificial Intelligence, Cluster Computing, Engineering, Machine Learning, Distributed Algorithms, Machine Learning Tool, Big Data Analytics, Map-reduce, Information Retrieval, Data Science, Data Mining, Data-intensive Platform, Computing Systems, Parallel Computing, Data Management, Iterative Computations, Computational Learning Theory, Machine Learning Model, Predictive Analytics, Knowledge Discovery, Distributed Systems, Computer Science, Distributed Data Storage, Data-intensive Computing, Scalable Computing, Cloud Computing, Data Platforms, Massive Data Processing
The availability of powerful distributed data platforms and the widespread success of Machine Learning (ML) have led to a virtuous cycle wherein organizations are investing in gathering a wider range of (even bigger!) datasets and addressing an even broader range of tasks. The Hadoop Distributed File System (HDFS) is being provisioned to capture and durably store these datasets. Alongside HDFS, resource managers like Mesos [10], Corona [8], and YARN [16] enable the allocation of compute resources “near the data,” where frameworks like REEF [3] can cache it and support fast iterative computations. Unfortunately, most ML algorithms are not tuned to operate on these new cloud platforms, where two new challenges arise: 1) scale-up: the need to acquire more resources dedicated to a particular algorithm, and 2) scale-down: the need to react to resource preemption. This paper focuses on the scale-down challenge, since it poses the most stringent requirement for executing on cloud platforms like YARN, which reserves the right to preempt compute resources dedicated to a job (tenant) [16].