A role-playing game for incident management training
Updated Feb 27, 2024 - HTML
Site reliability engineering (SRE) is a set of principles and practices that applies aspects of software engineering to infrastructure and operations problems. Its main goal is to create scalable and highly reliable software systems. SRE is closely related to DevOps, a set of practices that combines software development and IT operations, and it has also been described as a specific implementation of DevOps.
Calculate how much downtime should be permitted in your Service Level Agreement or Objective
Technical blog posts on Kubernetes, GitOps, CI/CD, and SRE in general. Created with ❤️ using Markdown.
Calculate the tolerable downtime of your service
A resource website dedicated to Reliability Engineering
👨‍💻 Blog using GitHub Pages | About SRE | PT-BR 🇧🇷
A collection of open-source runbooks/playbooks
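
Two of the projects listed above calculate how much downtime a Service Level Objective tolerates. The arithmetic behind such calculators can be sketched roughly as follows. This is a minimal illustration, not code from either repository: the function name `allowed_downtime` and its parameters are hypothetical, and it assumes the SLO is an availability percentage measured over a fixed time window.

```python
from datetime import timedelta


def allowed_downtime(slo_percent: float, window: timedelta) -> timedelta:
    """Return the downtime budget implied by an availability SLO.

    Hypothetical helper: slo_percent is the availability target
    (for example 99.9) and window is the measurement period.
    """
    if not 0.0 <= slo_percent <= 100.0:
        raise ValueError("slo_percent must be between 0 and 100")
    # The error budget is the fraction of the window that may be unavailable.
    error_budget = 1.0 - slo_percent / 100.0
    return timedelta(seconds=window.total_seconds() * error_budget)


if __name__ == "__main__":
    # Print the downtime budget for a few common targets over 30 days.
    for target in (99.0, 99.9, 99.99):
        budget = allowed_downtime(target, timedelta(days=30))
        print(f"{target}% over 30 days allows {budget}")
```

Under these assumptions, a 99.9% target over a 30-day window leaves a budget of about 43 minutes. Real calculators may differ, for example by using calendar months or by excluding planned maintenance.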