Krzysztof Ostrowski

Authored Publications
    Abstract: End-to-end latency of serving jobs in distributed and shared environments, such as a Cloud, is an important metric for jobs' owners and infrastructure providers. Yet it is notoriously challenging to model precisely, since it is affected by a large collection of unrelated moving pieces, from the software design to the job schedulers' strategies. In this work we present a novel approach to modeling latency by tracking how it varies with CPU usage. We train a classifier to automatically assign the latency behavior of methods to one of three classes: constant latency regardless of CPU, uncorrelated latency and CPU, and predictable latency as a function of CPU. We apply our model to a random sample of serving jobs running on the Google infrastructure, and illustrate unexpected and insightful patterns of latency variation with CPU. The visualization of latency-CPU variations and the corresponding class may be used by both jobs' owners and infrastructure providers for a variety of applications, such as smarter latency alerting, latency-aware configuration of jobs, and automated detection of changes in behavior, whether over time, during pre-release testing, or across data centers.
    Recursion in Scalable Protocols via Distributed Data Flows
    Languages for Distributed Algorithms (2012) (to appear)
    Abstract: This paper proposes a new approach to representing scalable hierarchical distributed multi-party protocols and reasoning about their behavior. The established endpoint-to-endpoint message-passing abstraction provides little support for modeling distributed algorithms in hierarchical systems, in which the hierarchy and membership evolve dynamically. This paper explains how, with our new Distributed Data Flow (DDF) abstraction, hierarchical architecture can be modeled via recursion in the language. This facilitates more concise code and enables automated generation of scalable hierarchical implementations for heterogeneous network environments.
    Diagnosing Latency in Multi-Tier Black-Box Services
    Gideon Mann
    5th Workshop on Large Scale Distributed Systems and Middleware (LADIS 2011) (to appear)
    Abstract: As multi-tier cloud applications become pervasive, we need better tools for understanding their performance. This paper presents a system that analyzes observed or desired changes to the end-to-end latency profile of a large distributed application and identifies their underlying causes. It recognizes changes to system configuration, workload, or performance of individual services that lead to the observed or desired outcome. Experiments on an industrial datacenter demonstrate the utility of the system.