Engineering Reliability into Web Sites: Google SRE

Proceedings of LinuxWorld (2007)


This talk introduces Site Reliability Engineering (SRE) at Google, explaining its purpose and describing the challenges it addresses. SRE teams in Mountain View, Zürich, New York, Santa Monica, Dublin, and Kirkland manage Google's many services and websites. They draw on Linux-based computing resources distributed across data centers around the world.