How do we improve?
How do we improve SDO and organizational performance? Our
research provides evidence-based guidance to help you focus
on the capabilities that drive performance.
This year’s report examined the impact of cloud, SRE
practices, security, technical practices, and culture.
Throughout this section we introduce each of these
capabilities and note their impact on a variety of outcomes.
For those of you who are familiar with DORA’s State of
DevOps research models, we’ve created an online resource
that hosts this year’s model and all previous models.
Consistent with Accelerate State of DevOps 2019, an
increasing number of organizations are choosing multicloud
and hybrid cloud solutions. In our survey, respondents were
asked where their primary service or application was hosted,
and public cloud usage is on the rise. 56% of respondents
indicated using a public cloud (including multiple public
clouds), a 5% increase from 2019. This year we also asked
specifically about multicloud usage, and 21% of respondents
reported deploying to multiple public clouds. 21% of
respondents indicated not using the cloud, and instead used
a data center or on-premises solution. Finally, 34% of
respondents report using a hybrid cloud and 29% report using
a private cloud.
Accelerating business outcomes with hybrid and multicloud
This year we see growth in use of hybrid and multicloud,
with significant impact on the outcomes businesses care
about. Respondents who use hybrid or multicloud were 1.6
times more likely to exceed their organizational performance
targets than those who did not. We also saw strong effects
on SDO, with users of hybrid and multicloud 1.4 times more
likely to excel in terms of deployment frequency, lead time
for changes, time to recover, change failure rate, and
reliability.
Similar to our 2018 assessment, we asked respondents to
report their rationale for leveraging multiple public cloud
providers. Instead of selecting all that apply, this year we
asked respondents to report their primary reason for using
multiple providers. Over a quarter (26%) of respondents did
so to leverage the unique benefits of each cloud provider.
This suggests that when respondents select an additional
provider, they look for differentiation between their
current provider and alternatives. The second most common
reason for moving to multicloud was availability (22%).
Unsurprisingly, respondents who have adopted multiple cloud
providers were 1.5 times more likely to meet or exceed their
organizational performance targets.
Primary reason for using multiple providers:
- Leverage unique benefits for each provider
- Negotiation tactic or procurement requirement
- Lack of trust in one provider
How you implement cloud infrastructure matters
Historically, we find that not all respondents adopt cloud
in the same way. This leads to variation in how effective
cloud adoption is for driving business outcomes. We
addressed this limitation by focusing on the essential
characteristics of cloud computing—as defined by the
National Institute of Standards and Technology (NIST)—and
using that as our guide. Using the NIST Definition of Cloud
Computing, we investigated the impact of essential practices
on SDO performance rather than just investigating cloud
adoption’s impact on SDO.
For the third time, we find that what really matters is how
teams implement their cloud services, not just that they are
using cloud technologies. Elite performers were 3.5 times
more likely to have met all essential NIST cloud
characteristics. Only 32% of respondents who said they were
using cloud infrastructure agreed or strongly agreed that
they met all five of the essential characteristics of cloud
computing defined by NIST, an increase of 3% from 2019.
Overall, usage of NIST’s characteristics of cloud computing
has increased by 14%–19%, with rapid elasticity showing the
largest increase.
On-demand self-service
Consumers can provision computing resources as needed,
automatically, without any human interaction required on the
part of the provider.
73% of respondents used on-demand self-service, a 16%
increase from 2019.
Broad network access
Capabilities are widely available and can be accessed
through multiple clients such as mobile phones, tablets,
laptops, and workstations.
74% of respondents used broad network access, a 14%
increase from 2019.
Resource pooling
Provider resources are pooled in a multi-tenant model, with
physical and virtual resources dynamically assigned and
reassigned on demand. The customer generally has no direct
control over the exact location of the provided resources,
but can specify location at a higher level of abstraction,
such as country, state, or data center.
73% of respondents used resource pooling, a 15% increase
from 2019.
Rapid elasticity
Capabilities can be elastically provisioned and released to
rapidly scale outward or inward with demand. Consumer
capabilities available for provisioning appear to be
unlimited and can be appropriated in any quantity at any
time.
77% of respondents used rapid elasticity, an 18% increase
from 2019.
Measured service
Cloud systems automatically control and optimize resource
use by leveraging a metering capability at a level of
abstraction appropriate to the type of service, such as
storage, processing, bandwidth, and active user accounts.
Resource usage can be monitored, controlled, and reported,
providing transparency for both the provider and the
consumer of the service.
78% of respondents used measured service, a 16% increase
from 2019.
SRE and DevOps
While the DevOps community was emerging at public
conferences and conversations, a like-minded movement was
forming inside Google: site reliability engineering (SRE).
SRE, and similar approaches, like the Facebook production
engineering discipline, embrace many of the same goals and
techniques that motivate DevOps. In 2016, SRE officially
joined the public discourse when the first book4
on site reliability engineering was published. The movement
has grown since then, and today a global community of SRE
practitioners collaborates on practices for technical
operations.
Perhaps inevitably, confusion arose. What’s the difference
between SRE and DevOps? Do I need to choose one or the
other? Which one is better? In truth, there’s no
conflict here; SRE and DevOps are highly complementary, and
our research demonstrates their alignment. SRE is a learning
discipline that prioritizes cross-functional communication
and psychological safety, the same values that are at the
core of the performance-oriented generative culture typical
of elite DevOps teams. Extending from its core principles,
SRE provides practical techniques, including the service
level indicator/service level objective (SLI/SLO) metrics
framework. Just as the lean product framework specifies how
to achieve the rapid customer feedback cycles supported by
our research, the SRE framework offers definition on
practices and tooling that can improve a team’s ability to
consistently keep promises to their users.
In 2021, we broadened our inquiry into operations,
expanding from an analysis of service availability into the
more general category of reliability. This year’s survey
introduced several items inspired by SRE practices, to
assess the degree to which teams:
- Employ the SLI/SLO metrics framework to prioritize work
according to error budgets
- Use automation to reduce manual work and disruptive alerts
- Define protocols and preparedness drills for incident
response
- Incorporate reliability principles throughout the
software delivery lifecycle (“shift left on reliability”)
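The SLI/SLO framework named in the first item can be made concrete with a little arithmetic: an SLO implies an error budget, and teams prioritize reliability work when the budget runs low. A minimal sketch, with a hypothetical 99.9% availability SLO and illustrative numbers (none of this comes from the survey itself):

```python
def error_budget_remaining(slo: float, good_events: int, total_events: int) -> float:
    """Return the fraction of the error budget left for the window.

    slo: target fraction of good events, e.g. 0.999 for "three nines".
    """
    if total_events == 0:
        return 1.0  # no traffic consumes no budget
    allowed_bad = (1.0 - slo) * total_events          # budget, in events
    actual_bad = total_events - good_events
    return 1.0 - (actual_bad / allowed_bad) if allowed_bad else 0.0

# A 99.9% SLO over 1,000,000 requests allows 1,000 failures;
# 250 failures leaves 75% of the budget for risky changes.
remaining = error_budget_remaining(0.999, 999_750, 1_000_000)
print(round(remaining, 2))  # 0.75
```

A negative result means the budget is exhausted, which is the signal to shift work from features to reliability.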
In analyzing the results, we found evidence that teams who
excel at these modern operational practices are 1.4 times
more likely to report greater SDO performance, and 1.8 times
more likely to report better business outcomes.
SRE practices have been adopted by a majority of teams in
our study: 52% of respondents reported the use of
these practices to some extent, although the
depth of adoption varies substantially between teams. The
data indicates that the use of these methods predicts
greater reliability and greater overall SDO performance: SRE
drives DevOps success.
Additionally, we found that a shared responsibility model
of operations, reflected in the degree to which developers
and operators are jointly empowered to contribute to
reliability, also predicts better reliability outcomes.
Beyond improving objective measures of performance, SRE
improves technical practitioners’ experience of work.
Typically, individuals with a heavy load of operations tasks
are prone to burnout, but SRE has a positive effect. We
found that the more a team employs SRE practices, the less
likely its members are to experience burnout. SRE might also
help in optimizing resources: teams that meet their
reliability targets through the application of SRE practices
report that they spend more time writing code than teams
that don’t practice SRE.
Our research reveals that teams at any level of SDO
performance—from low through elite—are likely to see
benefits from the increased use of SRE practices. The better
a team’s performance is, the greater the likelihood that
they employ modern modes of operations: elite performers are
2.1 times as likely to report the use of SRE
practices as their low-performing counterparts. But even
teams operating at the highest levels have room for
growth: only 10% of elite respondents indicated that their
teams have fully implemented every SRE practice we
investigated. As SDO performance across
industries continues to advance, each team’s approach to
operations is a critical driver of ongoing DevOps success.
4. Betsy Beyer et al., eds., Site Reliability
Engineering (O’Reilly Media, 2016).
Documentation and security
This year, we looked at the quality of internal
documentation, which is documentation—such as
manuals, READMEs, and even code comments—for the services
and applications that a team works on. We measured
documentation quality by the degree to which the
- Helps readers accomplish their goals
- Is accurate, up-to-date, and comprehensive
- Is findable, well organized, and clear5
Recording and accessing information about internal systems
is a critical part of a team’s technical work. We found that
about 25% of respondents have good-quality documentation,
and the impact of this documentation work is clear: teams
with higher quality documentation are 2.4 times more likely
to see better software delivery and operational (SDO)
performance. Teams with good documentation deliver software
faster and more reliably than those with poor documentation.
Documentation doesn’t have to be perfect. Our research shows
that any improvement in documentation quality has a positive
and direct impact on performance.
Today’s tech environment has increasingly complex systems,
as well as experts and specialized roles for different
aspects of these systems. From security to testing,
documentation is a key way to share specialized knowledge
and guidance both between these specialized sub-teams and
with the wider team.
We found that documentation quality predicts teams’ success
at implementing technical practices. These practices in turn
predict improvements to the system’s technical capabilities,
such as observability, continuous testing, and deployment
automation. We found that teams with quality documentation
are:
- 3.8 times more likely to implement security practices
- 2.4 times more likely to meet or exceed their
reliability targets
- 3.5 times more likely to implement site reliability
engineering (SRE) practices
- 2.5 times more likely to fully leverage the cloud
How to improve documentation quality
Technical work involves finding and using information, but
quality documentation relies on people writing and
maintaining the content. In 2019, our research found that
access to internal and external information sources supports
productivity. This year’s research takes this investigation
a step further to look at the quality of the documentation
that is accessed, and at practices that have an impact on
this documentation quality.
Our research shows the following practices have significant
positive impact on documentation quality:
Document critical use cases for your products and
services. What you document about a system is
important, and use cases allow your readers to put the
information, and your systems, to work.
Create clear guidelines for updating and editing
existing documentation. Much of documentation
work is maintaining existing content. When team members know
how to make updates or remove inaccurate or out-of-date
information, the team can maintain documentation quality
even as the system changes over time.
Define owners. Teams with quality
documentation are more likely to have clearly defined
ownership of documentation. Ownership allows for explicit
responsibilities for writing new content and updating or
verifying changes to existing content. Teams with quality
documentation are more likely to state that documentation is
written for all major features of the applications they work
on, and clear ownership helps create this broad coverage.
Include documentation as part of the software
development process. Teams that created
documentation and updated it as the system changed have
higher quality documentation. Like testing, documentation
creation and maintenance is an integral part of a
high-performing software development process.
Recognize documentation work during performance
reviews and promotions. Recognition is correlated
with overall documentation quality. Writing and maintaining
documentation is a core part of software engineering work,
and treating it as such improves its quality.
Other resources that we found to support quality
documentation:
- Training on how to write and maintain documentation
- Automated testing for code samples or incomplete
documentation
- Guidelines, such as documentation style guides and
guides for writing for a global audience
Documentation is foundational for successfully implementing
DevOps capabilities. Higher quality documentation amplifies
the results of investments in individual DevOps capabilities
like security, reliability, and fully leveraging the cloud.
Implementing practices to support quality documentation pays
off through stronger technical capabilities and higher SDO
performance.
5. Quality metrics informed by existing research on
technical documentation, such as:
— Aghajani, E. et al. (2019). Software Documentation Issues
Unveiled. Proceedings of the 2019 IEEE/ACM 41st
International Conference on Software Engineering, 1199-1210.
— Plösch, R., Dautovic, A., & Saft, M. (2014). The
Value of Software Documentation Quality. Proceedings of the
International Conference on Quality Software, 333-342.
— Zhi, J. et al. (2015). Cost benefits and quality of
software development documentation: A systematic mapping.
Journal of Systems and Software, 99(C), 175-198.
Shift left and integrate throughout
As technology teams continue to accelerate and evolve, so
do the quantity and sophistication of security threats. In
2020, more than 22 billion records of confidential personal
information or business data were exposed, according to
Tenable’s 2020 Threat Landscape Retrospective
Report.6 Security can’t be an afterthought or the
final step before delivery; it must be integrated throughout
the software development process.
To securely deliver software, security practices must
evolve faster than the techniques used by malicious actors.
During the 2020 SolarWinds and Codecov software supply chain
attacks, hackers compromised SolarWinds’s build system and
Codecov’s bash uploader script7 to covertly embed
themselves into the infrastructure of thousands of customers
of those companies. Given the widespread impact of these
attacks, the industry must shift from a preventive to a
diagnostic approach, where software teams should assume that
their systems are already compromised and build security
into their supply chain.
Consistent with previous reports, we found that elite
performers excel at implementing security practices. This
year, elite performers who met or exceeded their
reliability targets were twice as likely to have security
integrated in their software development process.
This suggests that teams who have accelerated delivery while
maintaining their reliability standards have found a way to
integrate security checks and practices without compromising
their ability to deliver software quickly or reliably.
In addition to exhibiting high delivery and operational
performance, teams who integrate security practices
throughout their development process are 1.6 times more
likely to meet or exceed their organizational goals.
Development teams that embrace security see significant
value driven to the business.
How to get it right
It’s easy to emphasize the importance of security and
suggest that teams need to prioritize it, but doing so
requires several changes from traditional information
security methods. You can integrate security, improve
software delivery and operational performance, and improve
organizational performance by leveraging the following
practices:
Test for security. Test security
requirements as a part of the automated testing process,
including areas where pre-approved code should be used.
Integrate security review into every
phase. Integrate information security (InfoSec)
into the daily work of the entire software delivery
lifecycle. This includes having the InfoSec team provide
input during the design and architecture phases of the
application, attend software demos, and provide feedback.
Security reviews. Conduct a security
review for all major features.
Build pre-approved code. Have the InfoSec
team build pre-approved, easy-to-consume libraries,
packages, toolchains, and processes for developers and IT
operators to use in their work.
Invite InfoSec early and often. Include
InfoSec during planning and all subsequent phases of
application development, so that they can spot
security-related weaknesses early, which gives the team
ample time to fix them.
As we’ve noted previously, high-quality documentation
drives the success of a variety of capabilities and security
is no exception. We found that teams with high-quality
documentation were 3.8 times as likely to integrate security
throughout their development process. Not everyone in an
organization has expertise in cryptography. The expertise of
those who do is best shared in an organization through
documented security practices.
Technical DevOps capabilities
Our research shows that organizations that undergo a DevOps
transformation by adopting continuous delivery are more
likely to have processes that are high-quality, low-risk,
and cost-effective.
Specifically, we measured the following technical
capabilities:
- Loosely coupled architecture
- Trunk-based development
- Continuous testing
- Continuous integration
- Use of open source technologies
- Monitoring and observability practices
- Management of database changes
- Deployment automation
We found that while all of these practices improve
continuous delivery, loosely coupled architecture and
continuous testing have the greatest impact. For example,
this year we found that elite performers who meet their
reliability targets are three times more likely to employ a
loosely coupled architecture than their low-performing
counterparts.
Loosely coupled architecture
Our research continues to show that you can improve IT
performance by working to reduce fine-grained dependencies
between services and teams. In fact, this is one of the
strongest predictors of successful continuous delivery.
Using a loosely coupled architecture, teams can scale, fail,
test, and deploy independently of one another. Teams can
move at their own pace, work in smaller batches, accrue less
technical debt, and recover faster from failure.
Continuous testing and continuous integration
Similar to our findings from previous years, we show that
continuous testing is a strong predictor of successful
continuous delivery. Elite performers who meet their
reliability targets are 3.7 times more likely to leverage
continuous testing. By incorporating early and frequent
testing throughout the delivery process, with testers
working alongside developers throughout, teams can iterate
and make changes to their product, service, or application
more quickly. You can use this feedback loop to deliver
value to your customers while also easily incorporating
practices like automated testing and continuous
integration.
Continuous integration also improves continuous delivery.
Elite performers who meet their reliability targets are 5.8
times more likely to leverage continuous integration. In
continuous integration, each commit triggers a build of the
software and runs a series of automated tests that provide
feedback in a few minutes. With continuous integration, you
decrease the manual and often complex coordination needed
for a successful integration.
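The feedback loop described above can be sketched as a sequence of fast, automated checks that either all pass or fail the build. The stage names and outcomes below are purely illustrative, not a real pipeline configuration:

```python
from typing import Callable

# Each stage is a named check that returns True on success.
Stage = tuple[str, Callable[[], bool]]

def run_pipeline(stages: list[Stage]) -> tuple[bool, list[str]]:
    """Run stages in order, stopping at the first failure.

    Returns (passed, log), so the committer gets feedback
    within minutes of pushing a change.
    """
    log: list[str] = []
    for name, check in stages:
        ok = check()
        log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            return False, log  # fail fast: the build goes red immediately
    return True, log

# Illustrative stages for a commit-triggered build.
stages = [
    ("compile", lambda: True),
    ("unit tests", lambda: True),
    ("lint", lambda: False),            # a single failing check fails the build
    ("integration tests", lambda: True),
]
passed, log = run_pipeline(stages)
print(passed)   # False: the build stays red until lint is fixed
print(log[-1])  # lint: FAILED
```

The point of the sketch is the contract, not the tooling: every commit runs the same ordered checks, and a red result blocks integration until it is fixed.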
Continuous integration, as defined by Kent Beck and the
Extreme Programming community, where it originated, also
includes the practice of trunk-based development, discussed
next.
Trunk-based development
Our research has consistently shown that high-performing
organizations are more likely to have implemented
trunk-based development, in which developers work in small
batches and merge their work into a shared trunk frequently.
In fact, elite performers who meet their reliability targets
are 2.3 times more likely to use trunk-based development.
Low performers are more likely to use long-lived branches
and to delay merging.
Teams should merge their work at least once a day—multiple
times a day if possible. Trunk-based development is closely
related to continuous integration, so you should implement
these two technical practices concurrently, because they
have more impact when you use them together.
Deployment automation
In ideal work environments, computers perform repetitive
tasks while humans focus on solving problems. Implementing
deployment automation helps your teams get closer to this
ideal.
When you move software from testing to production in an
automated way, you decrease lead time by enabling faster and
more efficient deployments. You also reduce the likelihood
of deployment errors, which are more common in manual
deployments. When your teams use deployment automation, they
receive immediate feedback, which can help you improve your
service or product at a much faster rate. While you don’t
have to implement continuous testing, continuous
integration, and automated deployments simultaneously, you
are likely to see greater improvements when you use these
three practices together.
Database change management
Tracking changes through version control is a crucial part
of writing and maintaining code, and it is just as crucial
for managing databases.
Our research found that elite performers who meet their
reliability targets are 3.4 times more likely to exercise
database change management compared to their low-performing
counterparts. Furthermore, the keys to successful database
change management are collaboration, communication, and
transparency across all relevant teams. While you can choose
from among several approaches, we recommend that whenever
you need to make changes to your database, your teams get
together and review the changes before updating the
database.
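One common way to put database changes under the same version-control discipline as code is ordered, recorded migrations: each change is a versioned script, and a record of applied versions keeps deploys repeatable. A minimal sketch of that bookkeeping (the schema changes shown are hypothetical):

```python
def apply_migrations(migrations: dict[int, str], applied: set[int]) -> list[int]:
    """Apply pending migrations in ascending version order.

    migrations: version -> SQL (or a description of the change),
                kept in version control alongside the code.
    applied:    versions already recorded as run against this database.
    Returns the versions applied in this run.
    """
    ran = []
    for version in sorted(migrations):
        if version in applied:
            continue  # already recorded; skipping makes deploys repeatable
        # In a real system this would execute the SQL inside a transaction
        # and persist the version to a schema-history table.
        applied.add(version)
        ran.append(version)
    return ran

applied = {1, 2}  # this environment already ran versions 1 and 2
migrations = {
    1: "CREATE TABLE users (...)",
    2: "ALTER TABLE users ADD COLUMN email",
    3: "CREATE INDEX idx_users_email ON users(email)",
}
print(apply_migrations(migrations, applied))  # [3]
```

Because the scripts and the history live in version control, every environment converges on the same schema by replaying the same ordered changes, which is what makes review and collaboration across teams practical.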
8. Beck, K. (2000). Extreme Programming Explained: Embrace
Change. Addison-Wesley Professional.
Monitoring and observability
As with previous years, we found that monitoring and
observability practices support continuous delivery. Elite
performers who successfully meet their reliability targets
are 4.1 times more likely to have solutions that incorporate
observability into overall system health. Observability
practices give your teams a better understanding of your
systems, which decreases the time it takes to identify and
troubleshoot issues. Our research also indicates that teams
with good observability practices spend more time coding.
One possible explanation for this finding is that
implementing observability practices helps shift developer
time away from searching for causes of issues toward
troubleshooting and eventually back to coding.
Open source technologies
Many developers already leverage open source technologies,
and their familiarity with these tools is a strength for the
organization. A primary weakness of closed source
technologies is that they limit your ability to transfer
knowledge in and out of the organization. For instance, you
cannot hire someone who is already familiar with your
organization’s tools, and developers cannot transfer the
knowledge they have accumulated to other organizations. In
contrast, most open source technologies have a community
around them that developers can use for support. Open source
technologies are more widely accessible, relatively low
cost, and customizable. Elite performers who meet their
reliability targets are 2.4 times more likely to leverage
open source technologies. We recommend that you shift to
using more open source software as you implement your DevOps
transformation.
For more information about technical DevOps
capabilities, see DORA capabilities at
COVID-19 and culture
This year we investigated the factors that influenced how
teams performed during the COVID-19 pandemic. Specifically,
has the COVID-19 pandemic negatively impacted software
delivery and operational (SDO) performance? Do teams
experience more burnout as a result? Finally, what factors
are promising for mitigating burnout?
First, we sought to understand the impact the pandemic had
on delivery and operational performance. Many organizations
prioritized modernization to accommodate dramatic market
changes (for example, the shift from purchasing in-person to
online). In the “How do we compare?” chapter, we discuss how
performance in the software industry has accelerated
significantly and continues to accelerate. Higher performing
teams are now the majority of our sample, and elite
performers continue to raise the bar, deploying more often
with shorter lead times, faster recovery times, and better
change failure rates. Similarly, a study by GitHub
researchers showed an increase in developer activity (that
is, pushes, pull requests, reviewed pull requests, and
commented issues per user9) through the year
2020. Arguably, the industry has continued to accelerate
despite the pandemic, rather than because of it, but it’s
noteworthy that we did not see a downward trend in SDO
performance during this dire period.
The pandemic changed how we work, and for many it changed
where we work. For this reason, we look at the impact of
working remotely as a result of the pandemic. We found that
89% of respondents worked from home due to the
pandemic. Only 20% reported having ever worked
from home prior to the pandemic. Shifting to a remote work
environment had significant implications for how we develop
software, run business, and work together. For many, working
from home eliminated the ability to connect through
impromptu hallway conversations or collaborate in person.
What reduced burnout?
Despite this, we did find a factor that had a large effect
on whether or not a team struggled with burnout as a result
of working remotely: culture. Teams with a
generative team culture, composed of people who felt
included and like they belonged on their team, were half
as likely to experience burnout during the
pandemic. This finding reinforces the importance
of prioritizing team and culture. Teams that do better are
equipped to weather more challenging periods that put
pressure on both the team as well as on individuals.
Broadly speaking, culture is the inescapable interpersonal
undercurrent of every organization. It is anything that
influences how employees think, feel, and behave towards the
organization and one another. All organizations have their
own unique culture, and our findings consistently show that
culture is one of the top drivers of organizational and IT
performance. Specifically, our analyses indicate that a
generative culture—measured using the
Westrum organizational culture typology, and people’s sense
of belonging and inclusion within the organization—predicts
higher software delivery and operational (SDO) performance.
For example, we find that elite performers that meet their
reliability targets are 2.9 times more likely to have a
generative team culture than their low-performing
counterparts. Similarly, a generative culture predicts
higher organizational performance and lower rates of
employee burnout. In short, culture really matters.
Fortunately, culture is fluid, multifaceted, and
always in flux, making it something you can change.
The successful execution of DevOps requires your
organization to have teams that work collaboratively and
cross-functionally. In 2018 we found that high-performing
teams were twice as likely to develop and deliver software
in a single, cross-functional team. This reinforces that
collaboration and cooperation are paramount to the success
of any organization. One key question is: what factors
contribute to creating an environment that encourages and
celebrates cross-functional collaboration?
Over the years, we have tried to make the construct of
culture tangible and to provide the DevOps community with a
better understanding of the impact of culture on
organizational and IT performance. We began this journey by
operationally defining culture using Westrum’s
organizational culture typology. He identified three types
of organizations: power-oriented, rule-oriented, and
performance-oriented. We used this framework in our own
research and found that a performance-oriented
organizational culture that optimizes for information flow,
trust, innovation, and risk-sharing is predictive of high
performance.
As our understanding of culture and DevOps evolves, we have
worked to expand our initial definition of culture to
include other psycho-social factors such as psychological
safety. High-performing organizations are more likely to
have a culture that encourages employees to take calculated
and moderate risks without fear of negative consequences.
Belonging and inclusion
Given the consistently strong impact culture has
on performance, this year we expanded our model to explore
whether employees’ sense of belonging and inclusion
contribute to the beneficial effect of culture on
performance.
Psychological research has shown that people are inherently
motivated to form and maintain strong and stable
relationships with others.10 We are motivated to
feel connected to others and to feel accepted within the
various groups we inhabit. Feelings of belonging lead to a
wide range of favorable physical and psychological outcomes.
For example, research indicates that feelings of belonging
positively impact motivation and lead to improvements in
performance.
A component of this sense of connectedness is the idea that
people should feel comfortable bringing their whole self to
work and that their unique experiences and background are
valued and celebrated.12 Focusing on creating
inclusive cultures of belonging within organizations helps
create a thriving, diverse, and motivated workforce.
Our results indicate that performance-oriented
organizations that value belonging and inclusion are more
likely to have lower levels of employee burnout compared to
organizations with less positive organizational cultures.
Given the evidence showing how psycho-social factors affect
SDO performance and levels of burnout among employees, we
recommend that if you’re seeking to go through a successful
DevOps transformation, you invest in addressing
culture-related issues as part of your transformation
efforts.
10. Baumeister & Leary, 1995. The need to belong:
Desire for interpersonal attachments as a fundamental human
motivation. Psychological Bulletin, 117(3), 497–529.
11. Walton et al., 2012. Mere belonging: The power of
social connections. Journal of Personality and Social
Psychology.
12. Mor Barak &amp; Daya, 2014. Managing diversity: Toward
a globally inclusive workplace. Sage; Shore, Cleveland,
&amp; Sanchez, 2018. Inclusive workplaces: A review and
model. Human Resource Management Review.