WSMeter: A Fast, Accurate, and Low-Cost Performance Evaluation for Warehouse-Scale Computers
Abstract
A warehouse-scale computer (WSC) is a vast collection
of tightly networked computers providing modern internet
services, and it is becoming increasingly popular as the most
cost-effective approach to serving users at global scale. It is,
however, extremely difficult to accurately measure the holistic
performance of a WSC. The existing load-testing benchmarks
are tailored towards a dedicated machine model and do not
address shared infrastructure environments. Evaluating the
performance of a live, shared production WSC environment
presents many challenges due to the lack of holistic performance
metrics, high evaluation costs, and the potential service
disruptions that evaluation may cause. WSC providers and customers
need a cost-effective methodology to accurately evaluate
the holistic performance of their platforms and hosted
services.
To address these challenges, we propose WSMeter, a
cost-effective framework and methodology to accurately evaluate
the holistic performance of a WSC in a live production environment.
We define a new performance metric to accurately reflect
the holistic performance of a WSC running a wide variety of
unevenly distributed jobs. We propose a model that statistically
accounts for the performance variance amplified by co-located
jobs, so that holistic performance can be evaluated at minimum cost.
To validate our approach, we analyze two real-world
use cases and show that WSMeter accurately discerns 7% and
1% performance improvements, using only 0.9% and 6.6% of
the machines in the WSC, respectively. We also present a
Cloud customer case study in which WSMeter helped quantify
the performance benefits of a service software optimization at
minimal cost.