Online, Asynchronous Schema Change in F1
Abstract
We introduce a protocol for schema evolution in a globally
distributed database management system with shared data,
stateless servers, and no global membership. Our protocol
is asynchronous—it allows different servers in the database
system to transition to a new schema at different times—and
online—all servers can access and update all data during a
schema change. We provide a formal model for determining
the correctness of schema changes under these conditions,
and we demonstrate that many common schema changes can
cause anomalies and database corruption. We avoid these
problems by replacing corruption-causing schema changes
with a sequence of schema changes that is guaranteed to
avoid corrupting the database so long as all servers are no
more than one schema version behind at any time. Finally,
we discuss a practical implementation of our protocol in
F1, the database management system that stores data for
Google AdWords.
distributed database management system with shared data,
stateless servers, and no global membership. Our protocol
is asynchronous—it allows different servers in the database
system to transition to a new schema at different times—and
online—all servers can access and update all data during a
schema change. We provide a formal model for determining
the correctness of schema changes under these conditions,
and we demonstrate that many common schema changes can
cause anomalies and database corruption. We avoid these
problems by replacing corruption-causing schema changes
with a sequence of schema changes that is guaranteed to
avoid corrupting the database so long as all servers are no
more than one schema version behind at any time. Finally,
we discuss a practical implementation of our protocol in
F1, the database management system that stores data for
Google AdWords.