
SRE Antipatterns in Everyday Life and What They Teach Us

Abstract

Real-world experience and things that go wrong are two of life’s best teachers. This talk explores key elements of scalable large-system design and Site Reliability Engineering (SRE) principles* through anti-patterns encountered in real life. Find out what lessons can be gleaned from watching the dynamics in a crowded cafe or dealing with a security issue during a hotel stay. Learn about fundamental site reliability engineering principles and practices, including:

- Avoiding cascading failures
- Not feeding the machines with human toil
- Writing blameless postmortems
- Engineering solutions to eliminate classes of errors rather than implementing point fixes

These principles will be framed through a lens of the suboptimal while demonstrating the impact of SRE anti-patterns on user trust.

* SRE is often thought of as a specific implementation of the DevOps interface.

Research Areas