Alexander Totok
Alexander Totok was born in Moscow, Russia (then part of the USSR). He received his Ph.D. in Computer Science from New York University in 2006, having earlier earned an M.Sc. in Mathematics from Moscow State University. Alexander works as a Software Engineer at Google. Before that, he was a Research Staff Member at the IBM T.J. Watson Research Center. His interests span several areas of distributed computing systems and the World Wide Web. He has authored more than a dozen research papers in Computer Science and Mathematics. Alexander resides in New York City. Find him at www.totok.info.
Authored Publications
Exploiting Service Usage Information for Optimizing Server Resource Management
Vijay Karamcheti
ACM Transactions on Internet Technology (TOIT), 11 (2011), pp. 1-26
It is often difficult to tune the performance of modern component-based Internet services because: (1) component middleware are complex software systems that expose several independently tuned server resource management mechanisms; (2) session-oriented client behavior with complex data access patterns makes it hard to predict what impact tuning these mechanisms has on application behavior; and (3) component-based Internet services themselves exhibit complex structural organization with requests of different types having widely ranging execution complexity. In this article we show that exposing and using detailed information about how clients use Internet services enables mechanisms that achieve two interconnected goals: (1) providing improved QoS to the service clients, and (2) optimizing server resource utilization. To differentiate among levels of service usage (service access) information, we introduce the notion of the service access attribute and identify four related groups of service access attributes, encompassing different aspects of service usage information, ranging from the high-level structure of client web sessions to low-level fine-grained information about utilization of server resources by different requests. To show how the identified service usage information can be collected, we implement a request profiling infrastructure in the JBoss Java application server. In the context of four representative service management problems, we show how collected service usage information is used to improve service performance, optimize server resource utilization, or to achieve other problem-specific service management goals.
Optimizing Utilization of Resource Pools in Web Application Servers
Vijay Karamcheti
Concurrency and Computation: Practice and Experience, 22 (2010), pp. 2421-2444
Among web application server resources, the most critical for performance are those that are held exclusively by a service request for the duration of its execution (or some significant part of it). Such exclusively-held server resources become performance bottleneck points, with failures to obtain such a resource constituting a major portion of request rejections under server overload conditions. In this paper, we propose a methodology that computes the optimal pool sizes for two such critical resources: web server threads and database connections. Our methodology uses information about incoming request flow and about fine-grained server resource utilization by service requests of different types, obtained through offline and online request profiling. We advocate, and show the benefits of, a database connection pooling mechanism that caches database connections for the duration of a service request execution (so-called request-wide database connection caching). We evaluate our methodology by testing it on the TPC-W web application. Our method is able to accurately compute the optimal number of server threads and database connections, and the value of sustainable request throughput computed by the method always lies within a 5% margin of the actual value determined experimentally.
RDRP: Reward-Driven Request Prioritization for e-Commerce Web Sites
Vijay Karamcheti
Electronic Commerce Research and Applications, 9 (2010), pp. 549-561
Meeting client Quality-of-Service (QoS) expectations proves to be a difficult task for the providers of e-Commerce services, especially when web servers experience overload conditions, which cause increased response times and request rejections, leading to user frustration, lowered usage of the service and reduced revenues. In this paper, we propose a server-side request scheduling mechanism that addresses these problems. Our Reward-Driven Request Prioritization (RDRP) algorithm gives higher execution priority to client web sessions that are likely to bring more service profit (or any other application-specific reward). The method works by predicting future session structure by comparing its requests seen so far with aggregated information about recent client behavior, and using these predictions to preferentially allocate web server resources. Our experiments using the TPC-W benchmark application with an implementation of the RDRP techniques in the JBoss web application server show that RDRP can significantly boost profit attained by the service, while providing better QoS to clients that bring more profit.