
New research: what sets top-performing DevOps teams apart


DevOps practices have matured a lot in just a few years, as IT teams work toward building resilient systems that scale easily and deliver positive user experiences. DevOps is an important component of fast-moving businesses and IT teams: Using these best practices means teams can develop, deliver, and manage software faster as well as respond to customer problems and regulatory changes. At Google Cloud, we’re always working to bring best practices from what we’ve learned making Google engineers happy and productive to developers and operators everywhere.

Today we’re pleased to share with you Accelerate: State of DevOps 2018: Strategies for a New Economy, a report from DevOps Research and Assessment (DORA). Based on a survey of almost 1,900 DevOps professionals, the report offers a comprehensive look at how software teams are using DevOps, and what’s working for the highest performers among them. We sponsored this year’s report and are glad to bring its insights to you, to help your team understand and apply practices to be more agile and efficient while maintaining stability and security.

The report is brimming with data on modern-day DevOps, examining the practices of the highest-performing DevOps teams to help you prioritize your own team’s work. One particular point of inspiration: the research shows that high performers are found in any industry, in organizations of all sizes. Whatever your software development process, DevOps can help you improve it.

Here are some of the takeaways of the DORA report.

1. We’re all collectively getting better at DevOps. A key finding is that DevOps as a practice is maturing. DevOps practitioners are getting better at what they do and pushing the pace for everyone else. The technical practices that high-performing teams use really matter. The report’s authors looked at new capabilities this year—including continuous testing, using monitoring tools, and adding data and an app’s underlying database into the software delivery pipeline—and found that they all positively affect software delivery.

2. Maturing DevOps practices require new measurements. This is the first year that the DORA research team examined availability as a key measurement of software performance, both in terms of knowing exactly what software will be available and when, and making sure it’s accessible by end users. This availability measurement also includes how well DevOps teams define their availability targets and learn from outages. The survey found that software delivery performance that includes these availability measurements offers a more accurate model to understand organizational performance, with elite performers 3.55 times more likely to have strong availability practices. We see this mirrored in our site reliability engineering (SRE) practice here at Google, which starts with the idea of availability and lays out the metrics you can use to tie it to business objectives.

3. Focus on outcomes, not output. The DORA team coined the term software delivery and operational performance (SDO) to include this availability measurement along with the throughput and stability metrics that DORA typically measures. The need for this new SDO performance metric reflects that advanced DevOps teams have become more sophisticated. They’re developing and delivering software quickly and thus are able to lend their attention to delivering availability assurances. SDO brings the focus to global outcomes, not output. So a high-performing DevOps team today is providing positive global outcomes for the business, like throughput and stability in tandem.

4. Don’t be too cautious. Failure in software development is a given, and that can lead DevOps teams to deploy new code less frequently while they do more testing and quality checks. But this report finds that large-batch, less frequent software deployments lead to bigger failures that take longer to fix.

The top-performing DevOps teams surveyed in the report deploy on demand, multiple times a day. That group’s average time to restore service is less than one hour, and their change failure rate is between zero and 15 percent. The low-performing groups, by contrast, have an average time to restore service of between a week and a month, and a change failure rate of 46 to 60 percent.

5. How you use the cloud is more important than whether you use the cloud. Simply adopting cloud services won’t automatically translate to business and IT success. The report studied some common cloud infrastructure usage patterns and found that respondents who have adopted cloud computing practices, such as self-service functions, resource pooling, and automatic scaling, were 23 times more likely to be on a high-performing DevOps team. Lower-performing DevOps teams are more likely to have to jump through hoops to access cloud systems for critical work, like opening a ticket, or to run into other roadblocks. High-performing respondents also tend to use new and emerging technologies more than lower-performing respondents. These elite DevOps shops are more likely to use infrastructure as code, more likely to use containers, more likely to adopt cloud-native design and more likely to extensively use open source components, libraries and platforms.
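To see where your own team sits against the figures in takeaway 4, you can compute the same two stability metrics from a deployment log. The Python sketch below is only an illustration, not the method DORA uses to score survey responses. The `Deployment` record, its field names, and the sample data are all hypothetical, and it assumes that time to restore is measured from the moment of the failed deployment to the moment service was restored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    deployed_at: datetime
    failed: bool = False                    # did this change degrade service?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def change_failure_rate(deployments: List[Deployment]) -> float:
    """Fraction of deployments that caused a failure (0.0 if there were none)."""
    if not deployments:
        return 0.0
    failures = sum(1 for d in deployments if d.failed)
    return failures / len(deployments)


def mean_time_to_restore(deployments: List[Deployment]) -> Optional[timedelta]:
    """Average time from a failed deployment to restored service.

    Assumption: the clock starts at deployment time. Returns None when
    no failed deployment has a recorded restore time.
    """
    durations = [
        d.restored_at - d.deployed_at
        for d in deployments
        if d.failed and d.restored_at is not None
    ]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


# Made-up sample: nine clean deployments and one failure restored after 40 minutes.
start = datetime(2018, 9, 3, 9, 0)
SAMPLE = [Deployment(start + timedelta(hours=h)) for h in range(9)]
SAMPLE.append(
    Deployment(
        deployed_at=start + timedelta(hours=9),
        failed=True,
        restored_at=start + timedelta(hours=9, minutes=40),
    )
)

print(f"Change failure rate: {change_failure_rate(SAMPLE):.0%}")
print(f"Mean time to restore: {mean_time_to_restore(SAMPLE)}")
```

With this sample, the change failure rate is 10 percent and the mean time to restore is 40 minutes, both within the ranges the report gives for top performers. In practice you would populate the records from your deployment pipeline and incident tracker rather than by hand.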

DevOps teams that perform at a high level are more likely to meet or exceed their organizational goals—so not just improving IT measurements, but customer satisfaction and market share as well. These high-performing DevOps teams are using cloud infrastructure wisely, including availability in measuring software and team performance, and deploying frequent updates. When a software team implements DevOps, everyone benefits. Whether you’re a seasoned DevOps pro or just getting started, there’s a lot to learn—and a lot to accomplish for your entire organization. You can find the full report here for more details and insights, and register for our upcoming webinar to learn more about the report and how we scale DevOps at Google.