Publication | Closed Access
Applying reinforcement learning towards automating resource allocation and application scalability in the cloud
Citations: 213
References: 28
Year: 2012
Artificial Intelligence, Engineering, Dynamic Resource Allocation, Education, Markov Decision Processes, Reinforcement Learning (Educational Psychology), Cloud Resource Management, Reinforcement Learning (Computer Engineering), Systems Engineering, Resource Optimization, Auto-scaling, Cloud Scheduling, Reinforcement Learning Towards, Application Scalability, Summary Public Infrastructure, Computer Science, Cloud Service Adaptation, Games, Cloud Automation, Virtualisation Technologies, Queueing Systems, Deep Reinforcement Learning, Cloud Computing, Virtual Resource Partitioning, Resource Allocation, Big Data
Public IaaS clouds provide on‑demand virtualized resources, but dynamic scaling is complicated by performance interference between co‑located virtual machines and by non‑stationary environments, which makes optimal resource allocation difficult. The study derives scaling policies with Q‑learning and proposes a parallel Q‑learning variant to mitigate state‑space explosion and speed up online learning. © 2012 John Wiley & Sons, Ltd.
SUMMARY Public Infrastructure as a Service (IaaS) clouds such as Amazon, GoGrid and Rackspace deliver computational resources by means of virtualisation technologies. These technologies allow multiple independent virtual machines to reside in apparent isolation on the same physical host. Dynamically scaling applications running on IaaS clouds can lead to varied and unpredictable results because of the performance interference effects associated with co‐located virtual machines. Determining appropriate scaling policies in a dynamic non‐stationary environment is non‐trivial. One principal advantage exhibited by IaaS clouds over their traditional hosting counterparts is the ability to scale resources on‐demand. However, a problem arises concerning resource allocation as to which resources should be added and removed when the underlying performance of the resource is in a constant state of flux. Decision theoretic frameworks such as Markov Decision Processes are particularly suited to decision making under uncertainty. By applying a temporal difference reinforcement learning algorithm known as Q‐learning, optimal scaling policies can be determined. Additionally, reinforcement learning techniques typically suffer from the curse of dimensionality, where the state space grows exponentially with each additional state variable. To address this challenge, we also present a novel parallel Q‐learning approach aimed at reducing the time taken to determine optimal policies whilst learning online. Copyright © 2012 John Wiley & Sons, Ltd.
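The abstract does not give the state variables, reward function, or the way parallel learners share what they learn, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical toy environment (`step`) in which the state is a pair (number of VMs, load level), the actions add, hold, or remove one VM, and a noisy capacity term stands in for performance interference. Each agent runs standard tabular Q-learning; `merge` then combines the agents' tables with a visit-weighted average, which is one possible sharing scheme chosen here for illustration. All names, constants, and the cost model are assumptions.

```python
import random
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

ACTIONS = (-1, 0, 1)          # remove a VM, hold, add a VM
MAX_VMS, LOAD_LEVELS = 5, 3   # hypothetical toy sizes

def step(state, action, rng):
    """Hypothetical toy environment: returns (next_state, reward)."""
    vms, load = state
    vms = min(MAX_VMS, max(1, vms + action))
    load = min(LOAD_LEVELS - 1, max(0, load + rng.choice((-1, 0, 1))))
    # Noisy per-VM capacity stands in for performance interference.
    capacity = vms * rng.uniform(0.7, 1.0)
    demand = 1.5 * (load + 1)
    sla_penalty = 10.0 if capacity < demand else 0.0
    # Reward is the negative of (VM cost + penalty for unmet demand).
    return (vms, load), -(1.0 * vms + sla_penalty)

def q_learn(episodes, steps, seed, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning with epsilon-greedy exploration for one agent."""
    rng = random.Random(seed)
    q = defaultdict(float)     # (state, action) -> value estimate
    visits = defaultdict(int)  # (state, action) -> number of updates
    for _ in range(episodes):
        state = (1, 0)
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward = step(state, action, rng)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            # Temporal difference update toward reward + discounted best next value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            visits[(state, action)] += 1
            state = nxt
    return q, visits

def merge(results):
    """Visit-weighted average of several agents' Q-tables (assumed scheme)."""
    total, count = defaultdict(float), defaultdict(int)
    for q, visits in results:
        for key, n in visits.items():
            total[key] += n * q[key]
            count[key] += n
    return {key: total[key] / count[key] for key in count}

def train_parallel(n_agents=4, episodes=50, steps=200):
    """Run independent agents concurrently, one seed each, then merge."""
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        results = list(pool.map(lambda seed: q_learn(episodes, steps, seed),
                                range(n_agents)))
    return merge(results)

def greedy_policy(q):
    """Best visited action per state; unvisited actions are ignored."""
    states = {s for s, _ in q}
    return {s: max(ACTIONS, key=lambda a: q.get((s, a), float("-inf")))
            for s in states}
```

Calling `greedy_policy(train_parallel())` yields a mapping from each visited (VMs, load) state to a scaling action. Threads are used only to keep the sketch short; in CPython they would not give a real speedup for this workload, so separate processes or machines would be the natural choice in practice.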