DORA goes to SPACE?? DORA Metrics

What are DORA Metrics and Why Do They Matter?

DevOps Research and Assessment (DORA) is a startup created by Gene Kim, Jez Humble, and Dr. Nicole Forsgren. Gene Kim and Jez Humble are known for best-selling books such as The DevOps Handbook, and Dr. Nicole Forsgren joined the pair to co-author Accelerate in 2018.

DORA was created out of frustration with the silos between development and operations teams. The DevOps philosophy promotes trust, collaboration, and multidisciplinary teams, and DORA aimed to understand the practices and processes that lead to high software delivery velocity and performance. The trio identified four key measures, the "DORA metrics", that engineering teams can use to gauge their performance. These let engineering leaders compare their teams with the industry, identify areas for improvement, and make the necessary changes.

DORA metrics are a set of four measurements, identified by DORA, that DevOps teams can use to evaluate their performance: Deployment Frequency, Mean Lead Time for Changes, Mean Time to Recover (also called time to restore service), and Change Failure Rate. They were determined by surveying over 31,000 professionals worldwide over six years. DORA also established performance benchmarks that divide teams into four categories for each metric: Elite, High, Medium, and Low performers.

Deployment frequency: How often is your team successfully deploying code to production?
Lead time for changes: How much time does it take for committed code to reach production?
Change failure rate: What percentage of changes to code result in deployment failures or bugs requiring patches, rollbacks, or other hands-on fixes?
Time to restore service: How long does it take to restore regular services in the event of incidents that impair users?

How Can Your Team Leverage DORA Metrics?

DORA benchmarks provide two crucial elements for improvement: goals to work towards and evidence of progress. Tracking progress produces evidence that can motivate teams to keep working towards their goals. In addition, the benchmarks give engineering leaders clear objectives, which can be broken down further into metrics that track key results.

The DORA metrics provide insight into team performance. By monitoring the Change Failure Rate and Mean Time to Recover, leaders can ensure their teams build reliable services with minimal downtime. Similarly, tracking Deployment Frequency and Mean Lead Time for Changes assures leaders that the team is working efficiently. Together, the metrics give a comprehensive view of how the team balances speed and quality.

Deployment Frequency

Deployment Frequency, as the name implies, measures how often a company successfully releases software to production for a specific application. Based on the concept of controlling inventory batch size in manufacturing, this metric uses the total number of deployments per day as a reference.

High-performing companies tend to make smaller, more frequent deployments; in the benchmark below, high performers deploy between once per day and once per week. However, the typical cadence varies depending on the product. For example, mobile applications typically make one or two releases per quarter, while SaaS solutions can deploy multiple times a day. At the upper end, a high-performing company can deploy as many as seven times a day.

Question it answers	Elite performers	High performers	Medium performers	Low performers
How often does your organization deploy code to production or release it to end-users?	On-demand (multiple deployments per day)	Between once per day and once per week	Between once per week and once per month	Between once per month and once every six months

Source: 2019 Accelerate State of DevOps, Google
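As a rough illustration, the sketch below averages deployments per day over a reporting window and maps the result onto the tiers in the table above. The function names and the sample dates are hypothetical, and the exact cut-offs (more than one per day, at least one per 7 days, at least one per 30 days) are an assumed reading of the table's wording, not official DORA thresholds.

```python
from datetime import date


def deployments_per_day(deploy_dates, window_start, window_end):
    """Average number of successful production deployments per calendar day."""
    days = (window_end - window_start).days + 1  # inclusive window
    if days <= 0:
        raise ValueError("window_end must not precede window_start")
    in_window = [d for d in deploy_dates if window_start <= d <= window_end]
    return len(in_window) / days


def frequency_tier(per_day):
    """Map a daily deployment rate to a tier (assumed cut-offs, see lead-in)."""
    if per_day > 1:          # multiple deployments per day
        return "Elite"
    if per_day >= 1 / 7:     # at least once per week
        return "High"
    if per_day >= 1 / 30:    # at least once per month (30-day month assumed)
        return "Medium"
    return "Low"


# Hypothetical data: one deployment per week during January 2024.
deploys = [date(2024, 1, 2), date(2024, 1, 9), date(2024, 1, 16),
           date(2024, 1, 23), date(2024, 1, 30)]
rate = deployments_per_day(deploys, date(2024, 1, 1), date(2024, 1, 31))
print(frequency_tier(rate))  # High
```

Counting only successful production deployments matters here: failed or rolled-back releases would otherwise inflate the rate.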

When DevOps teams discover they are in a low-performing category based on DORA metrics, they can improve by automating more of the testing and validation of new code and by shortening the time between recovering from an error and delivering again. These measures increase efficiency and overall performance.

Lead Time for Changes

Mean Lead Time for Changes measures how long committed code takes to get from development to production. It captures the speed at which the DevOps team delivers software and indicates how efficiently the team handles a growing volume of change requests. The metric is calculated by taking the difference between the commit time and the deployment time of each change and averaging the results. A lower lead time for changes means that the DevOps team is more efficient at deploying code.
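The calculation described above can be sketched in a few lines. This is a minimal example, assuming each change is available as a (commit time, deployment time) pair; the function name and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def mean_lead_time(changes):
    """Average of (deployment time - commit time) over all changes."""
    if not changes:
        raise ValueError("need at least one change")
    total = sum((deployed - committed for committed, deployed in changes),
                timedelta())
    return total / len(changes)


# Hypothetical (commit time, deployment time) pairs.
changes = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 15, 0)),   # 6 hours
    (datetime(2024, 3, 2, 10, 0), datetime(2024, 3, 3, 10, 0)),  # 24 hours
    (datetime(2024, 3, 4, 8, 0), datetime(2024, 3, 4, 14, 0)),   # 6 hours
]
print(mean_lead_time(changes))  # 12:00:00
```

A single slow change can dominate a mean like this one, so some teams may prefer to look at the median alongside it.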

Question it answers	Elite performers	High performers	Medium performers	Low performers
How long does it take to go from code committed to code successfully running in production?	Less than one day	Between one day and one week	Between one week and one month	Between one month and six months

Source: 2019 Accelerate State of DevOps, Google

Automating more of the testing and delivery process helps streamline it and reduce the time taken to move code from development to production. Additionally, breaking down larger code bases into smaller, more manageable units makes it easier to identify and fix errors, resulting in a faster Mean Time to Recover.

Mean Time to Recover

The Mean Time to Recover (MTTR) metric measures the time it takes for a system or application to bounce back from a failure. A short MTTR is crucial for business success as it gives leadership the confidence to innovate and experiment, resulting in a competitive advantage and increased revenue. To calculate MTTR, the average time between a bug report and the deployment of a bug fix is tracked. This metric encourages engineers to build robust systems and highlights the importance of fast and effective recovery from failures, which are inevitable in any DevOps process.
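A minimal sketch of that calculation follows, assuming each incident records when the bug was reported and when the fix was deployed. The `Incident` class, its field names, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Incident:
    reported_at: datetime      # when the bug or outage was reported
    fix_deployed_at: datetime  # when the fix reached production


def mean_time_to_recover_hours(incidents):
    """Average time from bug report to deployed fix, in hours."""
    if not incidents:
        raise ValueError("need at least one incident")
    total_seconds = sum(
        (i.fix_deployed_at - i.reported_at).total_seconds() for i in incidents
    )
    return total_seconds / len(incidents) / 3600


# Hypothetical incidents.
incidents = [
    Incident(datetime(2024, 5, 1, 12, 0), datetime(2024, 5, 1, 12, 30)),  # 0.5 h
    Incident(datetime(2024, 5, 7, 8, 0), datetime(2024, 5, 7, 9, 30)),    # 1.5 h
]
print(mean_time_to_recover_hours(incidents))  # 1.0
```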

Question it answers	Elite performers	High performers	Medium performers	Low performers
How long does it take to restore service when a service incident or a defect that impacts users occurs?	Less than an hour	Less than one day	Less than one day	Between one week and one month

Source: 2019 Accelerate State of DevOps, Google

MTTR can be improved by continuously monitoring systems, having a clear action plan for responding to failures, and prioritizing recovery when a failure occurs. This reduces the time it takes to recover from an outage and minimizes the impact on users and customers.

Change Failure Rate

Change Failure Rate provides a measure of the quality of software and helps track the efficiency of the development and deployment processes. A low Change Failure Rate indicates a well-functioning DevOps pipeline and a high-quality software product. Therefore, DevOps teams need to track and monitor this metric over time to identify trends and areas for improvement. Staying within the range of 0-15% is considered high performance, but the actual target should be set based on each organization's specific requirements and constraints.
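Following the definition given earlier (the percentage of changes that result in deployment failures or bugs requiring hands-on fixes), the rate can be computed as failed changes divided by total changes. The sketch below assumes both counts are already known; the function name and the sample numbers are illustrative only.

```python
def change_failure_rate(total_deployments, failed_deployments):
    """Percentage of deployments that needed a patch, rollback, or other fix."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    if not 0 <= failed_deployments <= total_deployments:
        raise ValueError("failed_deployments must be between 0 and the total")
    return 100 * failed_deployments / total_deployments


# Hypothetical month: 40 deployments, 3 of which required a fix.
print(change_failure_rate(40, 3))  # 7.5
```

A result of 7.5% would sit inside the 0-15% range mentioned above.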

Question it answers	Elite performers	High performers	Medium performers	Low performers
What percentage of changes to production or end-users results in degraded service?	0-15%	0-15%	0-15%	46-60%

Source: 2019 Accelerate State of DevOps, Google