Real-world experience and things that go wrong are two of life’s best teachers. This talk will explore key elements of scalable large-system design and Site Reliability Engineering (SRE) principles* through anti-patterns encountered in real life. Find out what lessons can be gleaned from watching the dynamics in a crowded cafe or dealing with a security issue during a hotel stay. Learn about fundamental SRE principles and practices, including:
- Avoiding cascading failures
- Not feeding the machines with human toil
- Writing blameless postmortems
- Engineering solutions to eliminate classes of errors rather than implementing point fixes
These principles will be framed through a lens of the suboptimal while demonstrating the impact of SRE anti-patterns on user trust.
*SRE is often thought of as a specific implementation of the DevOps interface.