
John Cieslewicz
Authored Publications
F1 Lightning: HTAP as a Service
Kelvin Lau
Jiacheng Yang
Zhan Yuan
Jeff Naughton
Ziyang Chen
Jeremy David Wood
Yuan Gao
Junxiong Zhou
Qiang Zeng
Xi Zhao
Jun Xu
Jun Ma
Ian James Rae
VLDB, VLDB Endowment (2020)
The ongoing and increasing interest in HTAP (Hybrid Transactional and Analytical Processing) systems documents the intense interest from data owners in simultaneously running transactional and analytical workloads over the same data set. Much of the reported work on HTAP has arisen in the context of “green field” systems, answering the question “if we could design a system for HTAP from scratch, what would it look like?” While there is great merit in such an approach, and a lot of valuable technology has been developed with it, we found ourselves facing a different challenge: one in which there is a great deal of transactional data already existing in several transactional systems, heavily queried by an existing federated engine that does not “own” the transactional systems, supporting both new and legacy applications that demand transparent fast queries and transactions from this combination. This paper reports on our design and experiences with F1 Lightning, a system we built and deployed to meet this challenge. We describe our design decisions, some details of our implementation, and our experience with the system in production for some of Google's most demanding applications.
F1 Query: Declarative Querying at Scale
Bart Samwel
Ahmed Aly
Thanh Do
Somayeh Sardashti
Jiexing Li
Jiacheng Yang
Chanjun Yang
Jason Govig
Andrew Harn
Zhan Yuan
Daniel Tenedorio
Colin Zheng
Allen Yan
Orri Erling
Yang Xia
Qiang Zeng
Divy Agrawal
Jun Xu
Mohan Yang
Andrey Gubichev
Felix Weigel
Yiqun Wei
Ben Handy
Anurag Biyani
Ian Rae
Amr El-Helw
Shivakumar Venkataraman
David G Wilhite
PVLDB (2018), pp. 1835-1848
F1 Query is a stand-alone, federated query processing platform that executes SQL queries against data stored in different file-based formats as well as different storage systems (e.g., BigTable, Spanner, Google Spreadsheets, etc.). F1 Query eliminates the need to maintain the traditional distinction between different types of data processing workloads by simultaneously supporting: (i) OLTP-style point queries that affect only a few records; (ii) low-latency OLAP querying of large amounts of data; and (iii) large ETL pipelines transforming data from multiple data sources into formats more suitable for analysis and reporting. F1 Query has also significantly reduced the need for developing hard-coded data processing pipelines by enabling declarative queries integrated with custom business logic. F1 Query satisfies key requirements that are highly desirable within Google: (i) it provides a unified view over data that is fragmented and distributed over multiple data sources; (ii) it leverages datacenter resources for performant query processing with high throughput and low latency; (iii) it provides high scalability for large data sizes by increasing computational parallelism; and (iv) it is extensible and uses innovative approaches to integrate complex business logic in declarative query processing. This paper presents the end-to-end design of F1 Query. Evolved out of F1, the distributed database that Google uses to manage its advertising data, F1 Query has been in production for multiple years at Google and serves the querying needs of a large number of users and systems.
F1: A Distributed SQL Database That Scales
Ben Handy
David Menestrina
Traian Stancescu
Mircea Oancea
Ian Rae
Kyle Littlefield
Stephan Ellner
Bart Samwel
Chad Whipkey
VLDB (2013)
F1 is a distributed relational database system built at Google to support the AdWords business. F1 is a hybrid database that combines high availability, the scalability of NoSQL systems like Bigtable, and the consistency and usability of traditional SQL databases. F1 is built on Spanner, which provides synchronous cross-datacenter replication and strong consistency. Synchronous replication implies higher commit latency, but we mitigate that latency by using a hierarchical schema model with structured data types and through smart application design. F1 also includes a fully functional distributed SQL query engine and automatic change tracking and publishing.