An analytical model for multi-tier internet services and its applications

TLDR

Many Internet applications use multi-tier architectures, and this paper develops an analytical model of their behavior. The authors build a model based on a network of queues in which each queue represents a tier; it captures tiers with significantly different performance characteristics as well as session-based workloads, concurrency limits, and caching at intermediate tiers. They validate the model with real multi-tier applications running on a Linux server cluster. Across a variety of scenarios, the predicted average response times fall within the 95% confidence intervals of the observed average response times. The model also supports dynamic capacity provisioning, performance prediction, bottleneck identification, and session policing: in one scenario where the request arrival rate rose from less than 1500 to nearly 4200 requests/min, provisioning driven by the model maintained response time targets by increasing the capacity of two tiers by factors of 2 and 3.5, respectively.

Abstract

Since many Internet applications employ a multi-tier architecture, in this paper, we focus on the problem of analytically modeling the behavior of such applications. We present a model based on a network of queues, where the queues represent different tiers of the application. Our model is sufficiently general to capture (i) the behavior of tiers with significantly different performance characteristics and (ii) application idiosyncrasies such as session-based workloads, concurrency limits, and caching at intermediate tiers. We validate our model using real multi-tier applications running on a Linux server cluster. Our experiments indicate that our model faithfully captures the performance of these applications for a number of workloads and configurations. For a variety of scenarios, including those with caching at one of the application tiers, the average response times predicted by our model were within the 95% confidence intervals of the observed average response times. Our experiments also demonstrate the utility of the model for dynamic capacity provisioning, performance prediction, bottleneck identification, and session policing. In one scenario, where the request arrival rate increased from less than 1500 to nearly 4200 requests/min, a dynamic provisioning technique employing our model was able to maintain response time targets by increasing the capacity of two of the application tiers by factors of 2 and 3.5, respectively.
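
To make the idea of a network of queues with one queue per tier concrete, the following is a minimal sketch rather than the model from the paper. The abstract does not state how the network is solved, so the sketch assumes a closed network with a fixed number of concurrent sessions, a user think time, and one single-server queue per tier, solved with the standard exact Mean Value Analysis (MVA) recursion. Concurrency limits and caching are not modeled. The names `mva`, `bottleneck`, `demands`, `think_time`, and `sessions` are hypothetical, and the numbers in the usage lines are made up for illustration.

```python
from typing import List, Tuple


def mva(demands: List[float], think_time: float,
        sessions: int) -> Tuple[float, float, List[float]]:
    """Exact Mean Value Analysis for a closed network of single-server queues.

    demands[i] is the assumed average service demand (seconds) that one
    request places on tier i; think_time is the assumed average user think
    time between requests; sessions is the number of concurrent sessions.
    Returns (mean response time, throughput, per-tier utilization).
    """
    if sessions < 1:
        raise ValueError("sessions must be at least 1")
    queue_len = [0.0] * len(demands)  # mean queue length per tier
    response = 0.0
    throughput = 0.0
    for n in range(1, sessions + 1):
        # An arriving request sees the queue left by a network of n - 1 sessions.
        per_tier = [d * (1.0 + q) for d, q in zip(demands, queue_len)]
        response = sum(per_tier)
        # Little's law applied to the whole network, including think time.
        throughput = n / (think_time + response)
        # Little's law applied to each tier.
        queue_len = [throughput * r for r in per_tier]
    utilization = [throughput * d for d in demands]
    return response, throughput, utilization


def bottleneck(utilization: List[float]) -> int:
    """Index of the tier with the highest utilization."""
    return max(range(len(utilization)), key=utilization.__getitem__)


# Illustrative three-tier application; the demands are invented values.
resp, tput, util = mva(demands=[0.005, 0.020, 0.012], think_time=1.0, sessions=50)
print(f"mean response time: {resp:.4f} s, throughput: {tput:.2f} req/s")
print(f"bottleneck tier index: {bottleneck(util)}")
```

In this simplified setting, the tier with the largest service demand saturates first, so a provisioning policy would add capacity there and re-run the recursion to check the predicted response time against its target.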
