Xiaozhou Li
Xiaozhou (Steve) Li has been a software engineer at Google since 2011. Before joining Google, he was a storage systems researcher at HP Labs. His main interests are in distributed computing/systems. He received his PhD in computer science from the University of Texas at Austin. He has published at Communications of the ACM, SIAM Journal on Computing, PODC, SIGCOMM, and WWW (full list).
Research Areas
Authored Publications
Sort By
Learned Indexes for a Google-scale Disk-based Database
Hussam Abu-Libdeh
Alex Beutel
Lyric Pankaj Doshi
Tim Klas Kraska
Chris Olston
(2020)
Preview abstract
There is great excitement about learned index structures, but understandable skepticism about the practicality of a new method uprooting decades of research on B-Trees. In this paper, we work to remove some of that uncertainty by demonstrating how a learned index can be integrated in a distributed, disk-based database system: Google’s Bigtable. We detail several design decisions we made to integrate learned indexes in Bigtable. Our results show that integrating learned index significantly improves the end-to-end read latency and throughput for Bigtable.
View details
Computing weak consistency in polynomial time
Wojciech Golab
Alejandro López-Ortiz
Naomi Nishimura
Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, ACM, New York, NY, USA, pp. 395-404
Preview abstract
The k-atomicity property can be used to describe the consistency of data operations in large distributed storage systems. The weak consistency guarantees offered by such systems are seen as a necessary compromise in view of Brewer's CAP principle. The k-atomicity property requires that every read operation obtains a value that is at most k updates (writes) old, and becomes a useful way to quantify weak consistency if k is treated as a variable that can be computed from a history of operations. Specifically, the value of k quantifies how far the history deviates from Lamport's atomicity property for read/write registers. We address the problem of computing k indirectly by solving the k-atomicity verification problem (k-AV): given a history of read/write operations and a positive integer k, decide whether the history is k-atomic. Gibbons and Korach showed that in general this problem is NP-complete when k = 1, and hence not solvable in polynomial time unless P = NP. In this paper we present two algorithms that solve the k-AV problem for any k >= 2 in special cases. Similarly to known solutions for k = 1 and k = 2, both algorithms assume that all the values written to a given object are distinct. The first algorithm places an additional restriction on the structure of the input history and solves k-AV in O(n^2 + n (k log k) time. The second algorithm does not place any additional restrictions on the input but is efficient only when k is small and when concurrency among write operations is limited. Its time complexity is O(n^2) if both k and our particular measure of write concurrency are bounded by constants.
View details
Eventually consistent: Not what you were expecting?
Preview
Wojciech Golab
Muntasir R. Rahman
Alvin AuYoung
Kimberly Keeton
Communications of the ACM, 57, no. 3 (2014), pp. 38-44
On the k-atomicity-verification problem
Wojciech Golab
The 33rd International Conference on Distributed Computing Systems, IEEE (2013)
Preview abstract
Modern Internet-scale storage systems often provide weak consistency in exchange for better perfor- mance and resilience. An important weak consistency prop- erty is k-atomicity, which bounds the staleness of values returned by read operations. The k-atomicity-verification problem (or k-AV for short) is the problem of deciding whether a given history of operations is k-atomic. The 1-AV problem is equivalent to verifying atomicity/linearizability, a well-known and solved problem. However, for k ? 2, no polynomial-time k-AV algorithm is known.
This paper makes the following contributions towards solving the k-AV problem. First, we present a simple 2- AV algorithm called LBT, which is likely to be efficient (quasilinear) for histories that arise in practice, although it is less efficient (quadratic) in the worst case. Second, we present a more involved 2-AV algorithm called FZF, which runs efficiently (quasilinear) even in the worst case. To our knowledge, these are the first algorithms that solve the 2-AV problem fully. Third, we show that the weighted k-AV problem, a natural extension of the k-AV problem, is NP-complete.
View details