The Dark Side of SRE

Site Reliability Engineering has emerged as one of the hottest career paths in tech in recent years. SREs get to tackle technical challenges on complex systems at scale, and are well compensated for their specialized skill set.
From the outside, the life of an SRE might seem prestigious and full of opportunity. But behind the curtain you can often find a reality full of chronic stress, career stagnation, and occupational hazards.
By exploring the flip side of SRE, we can make more informed decisions about our engineering careers and set realistic expectations. Whether you're an aspiring or current SRE, let's discuss the darker aspects of the role.
The High-Stress Life of an SRE
Like firefighters constantly on call, SREs live a life of high-stakes pressure and urgency. They maintain constant readiness to quickly resolve production incidents. While tackling system outages can provide adrenaline-fueled troubleshooting challenges, the unpredictable on-call schedules and urgency of incidents can lead to chronic stress over time.
Below is a stress graph shared by an SRE on r/sre.
The 24/7 on-call expectation at many companies can extend SRE work hours well beyond a typical business day. Being woken up by pager alerts in the middle of the night, handling issues early in the morning before arriving at the office, and working overtime during evenings and weekends further stretch SREs thin and strain work-life balance.
SREs risk the effects of chronic stress from being constantly on call, even as their expertise allows them to repeatedly resolve critical incidents. This stress residue can accumulate silently over time before reaching unhealthy levels. Left unchecked, it can reduce quality of life and even shorten life expectancy.
Jack of All Trades, Master of None
The typical SRE workday involves continual task switching between incidents, projects, technologies, and teams. This fragments focus. Interruptions are frequent, and few tasks can be driven to completion before the next pager alert demands a context swap. Such a hectic environment makes it impossible to achieve flow states where real development of talent occurs.
Of course, modern infrastructure requires broadly skilled engineers to integrate its many interdependent parts. But SREs seeking long-term career growth may find their knowledge plateauing. At a certain point, professional advancement requires deep specialized expertise. Successfully navigating the generalist-specialist tradeoff remains an ongoing balancing act for ambitious SREs.
Operational Treadmill
SREs spend most of their time reacting to operational issues rather than focusing on strategic initiatives. Putting out fires day-to-day leaves little room for long-term projects. This constant reactive work can lead to skill stagnation and career flattening over time.
Incident response draws on SRE expertise, but the repetitive firefighting takes away from creativity and intellectual stimulation. Without opportunities to architect systems or pursue multi-quarter roadmaps, some SREs may feel unfulfilled.
Of course, SREs understand that infrastructure reliability remains the top priority. But many grow weary of endless reactive cycles without opportunities to build transformative capabilities that demonstrate their full talents. A career fueled solely by adrenaline is destined to lead to burnout. For SRE leaders, providing resources for strategic initiatives separates those who energize their teams from those who extinguish them.
The Limited Career Progression
Because the SRE discipline is still relatively new, most companies have small SRE teams. This means there are fewer intermediate and senior-level positions between entry-level SRE and the SRE manager role. Early to mid-career SREs may feel stalled if they want to be promoted but the next-level role just isn't available due to team size.
While the larger tech companies that pioneered the SRE concept have more well-defined job ladders, smaller companies that adopted SRE more recently often struggle to provide advancement for those employees.
Engineers who want to continue specializing in SRE but also want career growth may find the options lacking at some organizations.
Navigating the Inconsistent SRE Role
You cannot take all of your habits of work and expect to successfully transplant them to another company unchanged.
You’d expect the skills of an SRE to transfer directly between departments and companies. But in reality, each company defines and implements the role differently. An expert SRE at one organization may find much of their knowledge irrelevant at another. This document is a great example that illustrates the point: it prepares ex-Google SREs for what they should expect in the outside world.
Some key areas where the SRE role varies between companies:
Scope of responsibility - At some companies, SREs are responsible for operational work only. At others, they share ownership of product code as well.
Level of software engineering work - The ratio of software development to operations work fluctuates. At some firms SRE is more like a specialized sysadmin role while at others it is much closer to a regular software engineer.
Team structure - SRE can either be its own independent team or be embedded in product engineering teams (a.k.a. embedded SREs). Both models have tradeoffs.
On-call expectations - The frequency and intensity of being on-call as an SRE ranges widely. At some companies it is relatively light and infrequent, while at others it is a heavy burden.
Technical vs soft skills - While strong technical skills are always needed, some SRE roles emphasize soft skills like influencing product teams more than others.
For the good of the profession, the SRE community still needs to coalesce around more consistent job ladders, expectations, and competencies. Only then can top talent build their careers across organizations rather than starting from scratch at each new environment.
Conclusion
Like any profession, SRE comes with tradeoffs. Its good pay and technical complexity bring great reward but also chronic stress, uneven career progression, and unclear skill development paths. However, by understanding these issues, both aspiring and current SREs can make more informed decisions and proactively mitigate risks.
SRE will likely continue advancing infrastructure frontiers, although sometimes through immense human effort. By openly discussing the role's dark sides, we want to empower SREs to craft intentional careers.