- Krzysztof Ostrowski
- Gideon Mann
- Mark Sandler
5th Workshop on Large Scale Distributed Systems and Middleware (LADIS 2011) (to appear)
As multi-tier cloud applications become pervasive, we need better tools for understanding their performance. This paper presents a system that analyzes observed or desired changes to the end-to-end latency profile of a large distributed application, and identifies their underlying causes. It recognizes changes to system configuration, workload, or performance of individual services that lead to the observed or desired outcome. Experiments on an industrial datacenter demonstrate the utility of the system.