- Artyom Sharov
- Alexander Shraer
- Arif Merchant
- Murray Stokely
Abstract
The configuration of a distributed storage system with multiple data replicas typically includes the set of servers and their roles in the replication protocol. The configuration can usually be changed manually, but in most cases, system administrators have to determine a good configuration by trial and error. We describe a new workload-driven optimization framework that dynamically determines the optimal configuration at run time. Applying the framework to a large-scale distributed storage system used internally in Google resulted in halving the operation latency in 17% of the tested databases, and reducing it by more than 90% in some cases.
Research Areas
Learn more about how we do research
We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work