Tools / Monitoring and Observability Interview Questions

What are MTTR and MTTD and why do they matter to SRE teams?

MTTR (Mean Time To Recover) and MTTD (Mean Time To Detect) are reliability engineering metrics that quantify two key phases of an incident lifecycle.

MTTD (Mean Time To Detect) is the average time between when a failure actually begins and when the monitoring system (or a customer) first detects it. A low MTTD means your alerting and observability systems are working well: issues are caught quickly, before they affect many users or burn a large share of the error budget.

MTTR (Mean Time To Recover) is the average time from detection to full service restoration. MTTR encompasses diagnosis time (finding the root cause), mitigation time (deploying a fix or rollback), and verification time (confirming recovery). A low MTTR reflects good runbooks, good observability (fast diagnosis), fast deployment pipelines, and practiced incident response processes. Some teams measure MTTR from the start of the failure rather than from detection, so state which convention you use and apply it consistently.
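As a minimal sketch of the two definitions above, the following computes MTTD and MTTR from a list of incident records. The `Incident` class, its field names, and the sample timestamps are hypothetical and exist only for illustration; MTTR is measured from detection, as defined above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record shape; field names are illustrative only.
    started: datetime    # when the failure actually began
    detected: datetime   # when monitoring (or a customer) first noticed it
    recovered: datetime  # when service was fully restored


def mean_duration(durations: list[timedelta]) -> timedelta:
    """Arithmetic mean of a non-empty list of durations."""
    return sum(durations, timedelta()) / len(durations)


def mttd(incidents: list[Incident]) -> timedelta:
    """Mean time from failure start to detection."""
    return mean_duration([i.detected - i.started for i in incidents])


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time from detection to full recovery."""
    return mean_duration([i.recovered - i.detected for i in incidents])


# Made-up sample data: detection took 4 and 16 minutes,
# recovery took 30 and 50 minutes.
incidents = [
    Incident(datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 4),
             datetime(2024, 1, 5, 10, 34)),
    Incident(datetime(2024, 2, 9, 22, 0), datetime(2024, 2, 9, 22, 16),
             datetime(2024, 2, 9, 23, 6)),
]

print("MTTD:", mttd(incidents))  # MTTD: 0:10:00
print("MTTR:", mttr(incidents))  # MTTR: 0:40:00
```

Keeping the three timestamps separate per incident makes it possible to switch MTTR to the "from failure start" convention later without re-collecting data.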

These metrics directly reflect observability maturity. If MTTD is high, alerts are too slow or missing entirely. If MTTR is high despite fast detection, then the debugging experience is poor (missing traces or logs), deployments are slow, or on-call engineers lack the knowledge to diagnose the system. Observability improvements such as better traces, correlated logs, and runbooks linked from alerts directly reduce MTTR.
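The reasoning above can be sketched as a small decision function. The function name `diagnose` and the default targets (5 minutes for MTTD, 60 minutes for MTTR) are assumptions chosen for illustration, not recommended thresholds; a real team would derive targets from its own SLOs.

```python
def diagnose(mttd_minutes: float, mttr_minutes: float,
             mttd_target: float = 5.0, mttr_target: float = 60.0) -> str:
    """Map an MTTD/MTTR pair to the likely weak spot.

    The targets are hypothetical defaults, not industry standards.
    """
    slow_detection = mttd_minutes > mttd_target
    slow_recovery = mttr_minutes > mttr_target
    if slow_detection and slow_recovery:
        return "detection and recovery both need work"
    if slow_detection:
        return "alerting gap: issues go unnoticed too long"
    if slow_recovery:
        return "response gap: check debugging data, deploy speed, on-call knowledge"
    return "within targets"


print(diagnose(mttd_minutes=45, mttr_minutes=20))
# alerting gap: issues go unnoticed too long
print(diagnose(mttd_minutes=2, mttr_minutes=180))
# response gap: check debugging data, deploy speed, on-call knowledge
```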

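DORA (DevOps Research and Assessment) research identifies time to restore service, commonly reported as MTTR, as one of its four key software delivery performance metrics, alongside deployment frequency, lead time for changes, and change failure rate. DORA uses these metrics to group teams into performance tiers, with "elite" as the top tier.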
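To make the four metrics concrete, here is a rough sketch that derives them from a list of deployment records. The `Deployment` class, the `dora_summary` function, and the sample data are hypothetical. The sketch uses simple means and measures restore time from the moment of deployment; both are simplifying assumptions rather than DORA's official methodology.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape; field names are illustrative only.
    committed: datetime                  # when the change was committed
    deployed: datetime                   # when it reached production
    restored: Optional[datetime] = None  # set only if the deployment caused a failure


def dora_summary(deployments: list[Deployment], window_days: int) -> dict:
    """Summarize the four metrics over a window (assumes a non-empty list)."""
    failures = [d for d in deployments if d.restored is not None]
    lead_times = [d.deployed - d.committed for d in deployments]
    # Assumption: restore time is measured from deployment, for simplicity.
    restore_times = [d.restored - d.deployed for d in failures]
    return {
        "deployment_frequency_per_day": len(deployments) / window_days,
        "mean_lead_time": sum(lead_times, timedelta()) / len(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_restore": (
            sum(restore_times, timedelta()) / len(restore_times)
            if restore_times else None
        ),
    }


# Made-up sample: four deployments over two days, one of which failed.
deployments = [
    Deployment(datetime(2024, 3, 1, 9), datetime(2024, 3, 1, 11)),
    Deployment(datetime(2024, 3, 1, 10), datetime(2024, 3, 1, 14),
               restored=datetime(2024, 3, 1, 14, 30)),
    Deployment(datetime(2024, 3, 2, 9), datetime(2024, 3, 2, 10)),
    Deployment(datetime(2024, 3, 2, 10), datetime(2024, 3, 2, 15)),
]

summary = dora_summary(deployments, window_days=2)
for name, value in summary.items():
    print(f"{name}: {value}")
```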

If MTTD is high but MTTR is low, what does this suggest about the team's observability?

Incident response is slow but detection is fast

✗ Try again — you have the interpretation reversed.

The team fixes issues quickly once detected, but alerting is too slow and issues go unnoticed for too long

✓ Well done — high MTTD = slow detection; low MTTR = fast recovery once the issue is known.

Deployment pipelines are the bottleneck

✗ Try again — slow pipelines would increase MTTR, not MTTD.

According to DORA research, MTTR is one of how many key software delivery metrics?

Two

✗ Try again.

Four

✓ Well done — DORA tracks deployment frequency, lead time for changes, change failure rate, and MTTR.

Eight

✗ Try again.


About Javapedia.net: Javapedia.net is for Java and J2EE developers, technologists, and college students preparing for interviews. The site also includes many practical examples. It was developed using J2EE technologies by Steve Antony, a senior developer/lead at a logistics company.
Contact: javatutorials2016[at]gmail[dot]com
Copyright © 2026, javapedia.net, all rights reserved.
