Lead Site Reliability Engineer
Central London (Hybrid)
Up to £95k plus Car Allowance & Bonus

TRIA are working with a leading hospitality client that is hiring a Lead SRE and investing heavily in the performance, stability, and reliability of its digital platforms.

This is a hands-on leadership role - you won't just guide others, you'll be the go-to expert when systems are under pressure. You'll lead incident response, own root cause analysis, and solve performance issues such as memory leaks, outages, and flaky services.

Your focus will include:
Leading incident management, post-mortems, and blameless RCAs
Building scalable, resilient microservices with the dev teams
Uplifting observability
Improving alerting, monitoring, and system-level metrics
Driving better SLOs, SLIs, and overall uptime

The stack includes Kubernetes, Terraform, AWS, Python, and modern CI/CD tools, and it's evolving.

If you're confident in a crisis, understand what good SRE practice looks like, and want to leave systems in a better place than you found them, please apply to be considered and learn more!

What you'll bring:
Experience in high-traffic digital or eCommerce platforms
5 years in SRE/DevOps roles; strong background in incident response
Observability, automation, and infrastructure as code expertise
Leadership skills - mentoring others or leading from the front