Engineering Metrics That Matter (AI Era)

What to track at team and org level—DORA, flow, reliability—and how to avoid gaming and vanity metrics as AI changes how work gets done.

Good metrics guide decisions and improve outcomes. Bad metrics get gamed and distort behavior. In the AI era, output can rise while outcomes stay flat or degrade, so what you measure matters more than ever. This resource focuses on what to track and what to avoid.

What to track at team and org level

DORA-style delivery metrics (from Accelerate):

  • Deployment frequency — How often you ship to production. Higher is better if quality is maintained.
  • Lead time for changes — Time from commit to production. Shorter is better.
  • MTTR (mean time to recover) — How quickly you recover from incidents. Lower is better.
  • Change failure rate — Percentage of deployments that cause incidents or rollbacks. Lower is better.
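
As a rough illustration, the four metrics above can be computed from a deployment log. The sketch below assumes a hypothetical Deployment record (commit time, deploy time, a failure flag, and a restore time). None of these names come from Accelerate or from any particular tool, and real pipelines will define "failed" and "restored" in their own way.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, median
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape; adapt to whatever your CI/CD and incident tools export.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # caused an incident or rollback
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours_between(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def dora_summary(deploys: list[Deployment], window_days: int) -> dict[str, float]:
    """Summarize the four delivery metrics over a window of `window_days` days."""
    if not deploys or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deploys if d.failed]
    recoveries = [
        hours_between(d.deployed_at, d.restored_at)
        for d in failures
        if d.restored_at is not None
    ]
    lead_times = [hours_between(d.committed_at, d.deployed_at) for d in deploys]
    return {
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time_hours": median(lead_times),
        "change_failure_rate": len(failures) / len(deploys),
        # NaN when nothing failed, or nothing has been restored yet.
        "mttr_hours": mean(recoveries) if recoveries else float("nan"),
    }


# Made-up sample data: four deployments in one week, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
deploys = [
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=8), failed=True,
               restored_at=t0 + timedelta(hours=9)),
    Deployment(t0, t0 + timedelta(hours=12)),
    Deployment(t0, t0 + timedelta(hours=30)),
]
summary = dora_summary(deploys, window_days=7)
```

The median is used for lead time here because a single slow change can dominate a mean; that is a design choice in this sketch, not something the source prescribes.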

Flow. Cycle time, throughput, and work in progress (WIP) against agreed limits. These help you see bottlenecks and overload.
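
One way to see how these flow measures relate is Little's law: for a roughly stable system, average cycle time equals average work in progress divided by average throughput. The sketch below is illustrative only; the function name and the sample figures are made up.

```python
def expected_cycle_time_days(avg_wip: float, throughput_per_day: float) -> float:
    """Little's law: average cycle time = average WIP / average throughput.

    Holds only as a long-run average for a roughly stable system.
    """
    if throughput_per_day <= 0:
        raise ValueError("throughput must be positive")
    return avg_wip / throughput_per_day


# Made-up figures: a team that finishes 2 items per day.
print(expected_cycle_time_days(avg_wip=12, throughput_per_day=2))  # 6.0
print(expected_cycle_time_days(avg_wip=6, throughput_per_day=2))   # 3.0
```

At the same throughput, halving work in progress halves the expected cycle time, which is the usual argument for WIP limits.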

Reliability. Uptime, error rates, and performance against SLOs/SLAs. Align targets with what the business and users depend on.
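
An SLO is often easier to act on when expressed as an error budget: the amount of unreliability the target allows over a period. A minimal sketch, assuming a request-based availability SLO; the function name and the numbers are hypothetical.

```python
def error_budget_remaining(slo_target: float, total_requests: int,
                           failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative once the SLO is missed)."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.999")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = (1 - slo_target) * total_requests
    return 1 - failed_requests / allowed_failures


# Made-up figures: a 99.9% target over 1,000,000 requests allows about 1,000 failures.
remaining = error_budget_remaining(0.999, 1_000_000, 250)  # roughly 0.75
overspent = error_budget_remaining(0.999, 1_000_000, 1_500)  # roughly -0.5
```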

Outcome vs output

Prefer outcome metrics (did we improve the user or business result?) over output metrics (story points, PRs, lines of code). Output can go up while outcomes stay flat or worsen, especially when AI generates more code or tickets. Tie metrics to business goals (e.g. “time to value for feature X,” “incident impact,” “developer productivity as perceived by the team”).

Avoiding gaming and vanity metrics

  • Don’t optimize for the number. If you reward deployment frequency alone, people will split work into trivial deploys. Balance it with change failure rate and user impact.
  • No “lines of code” or “PRs merged” as goals. AI can inflate both; they don’t measure value or quality.
  • Use metrics as a signal, not a target. When a measure becomes a target, it ceases to be a good measure (Goodhart’s law). Review trends and outliers; don’t make a single number the sole basis for compensation or promotion.
  • Include qualitative feedback. Surveys, retros, and 1:1s surface what metrics miss—especially wellbeing and sustainable pace. See Inclusive, sustainable pace.
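
The idea of balancing one metric against another can be sketched as a simple paired check: treat a rise in deployment frequency as suspect when the failure rate rises with it. This is a hypothetical helper, and the default tolerance is an arbitrary placeholder to tune for your own context.

```python
def frequency_gain_is_suspect(prev_deploys: int, prev_failures: int,
                              curr_deploys: int, curr_failures: int,
                              tolerance: float = 0.02) -> bool:
    """True when deploys went up but the failure rate rose by more than `tolerance`."""
    if prev_deploys <= 0 or curr_deploys <= 0:
        raise ValueError("each period needs at least one deployment")
    prev_rate = prev_failures / prev_deploys
    curr_rate = curr_failures / curr_deploys
    return curr_deploys > prev_deploys and (curr_rate - prev_rate) > tolerance


# Made-up figures: deploys doubled from 40 to 80 between two periods.
print(frequency_gain_is_suspect(40, 4, 80, 16))  # True: failure rate went from 10% to 20%
print(frequency_gain_is_suspect(40, 4, 80, 8))   # False: failure rate held at 10%
```

A flag like this is a prompt for a conversation in a retro, not a verdict on the team.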

Further reading

  • Accelerate (Forsgren, Humble, Kim) — What the evidence says about high-performing organizations, and the research behind the DORA metrics.
