Disaster Recovery Plan Template

A disaster recovery plan template on one wiki page: what you protect, recovery targets, the runbook, and who is called. Copy it in and keep it current.

Updated June 11, 2026

TL;DR. A disaster recovery plan is a set of recovery targets, a runbook, and a contact list, all of which have been tested. Copy the body of this page into a wiki page, set RTO and RPO with the business, write the runbook so a tired engineer can follow it, and schedule a drill.

A disaster recovery plan earns its keep at 3am, when the person reading it is stressed and the systems are down. Write for that reader: numbered steps, named commands, no judgement calls. The single most common failure is a plan that was written once and never tested — by the time it is needed, the systems have changed and the steps no longer work.

What a disaster recovery plan includes

  • Scope. The system this plan covers and what depends on it.
  • Targets. RTO (recovery time objective: how quickly service must be back) and RPO (recovery point objective: how much data loss is tolerable).
  • The runbook. Numbered recovery steps, with commands.
  • Contacts. Who is called, who declares the incident, who talks to customers.
  • Test record. When it was last drilled and what broke.
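
To make the two targets concrete, here is a minimal Python sketch of checking a backup schedule and a drill's measured restore time against RTO and RPO. It assumes periodic backups and a simplified worst-case model: a failure just before a backup finishes loses one full interval plus that backup's run time. The function names and sample numbers are illustrative assumptions, not part of the template.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta,
                         backup_duration: timedelta) -> timedelta:
    """Assumed model: the newest usable backup reflects the state at its start,
    so a failure just before the next backup completes loses a full interval
    plus the time that backup takes to run."""
    return backup_interval + backup_duration


def meets_targets(rto: timedelta, rpo: timedelta,
                  measured_restore: timedelta,
                  backup_interval: timedelta,
                  backup_duration: timedelta) -> dict:
    """Compare what a drill measured against the agreed targets."""
    return {
        "rto_met": measured_restore <= rto,
        "rpo_met": worst_case_data_loss(backup_interval, backup_duration) <= rpo,
    }


# Hypothetical numbers: 4 h RTO, 1 h RPO, hourly backups that take 10 minutes,
# and a drill in which the restore took 3 hours.
result = meets_targets(
    rto=timedelta(hours=4),
    rpo=timedelta(hours=1),
    measured_restore=timedelta(hours=3),
    backup_interval=timedelta(hours=1),
    backup_duration=timedelta(minutes=10),
)
print(result)
```

With these numbers the restore fits the RTO, but hourly backups miss a one-hour RPO under this model, because the worst-case loss is 70 minutes. That gap is the kind of finding worth raising when the targets are set with the business.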

How to use this template

  1. Copy the body below into a new wiki page — one per system.
  2. Set RTO and RPO with the business, not just engineering.
  3. Write the runbook as steps a tired on-call engineer can follow.
  4. List who is called and how.
  5. Schedule a drill and record the result.

The template — copy from here

Summary

  • System: <name>
  • Owner: <team / role>
  • RTO: <target time to restore>
  • RPO: <tolerable data loss>
  • Last tested: <date>
  • Next test: <date>

What this protects

<The system, the data it holds, and what else breaks if it is down.>

Recovery runbook

  1. <Detect and declare — how the failure is confirmed and who declares the incident.>
  2. <First action — the command or console step, named exactly.>
  3. <Restore data — from which backup, with the command.>
  4. <Verify — how you confirm service is actually back.>
  5. <Stand down — who calls the all-clear.>

Contacts

Role | Name | Contact path
Incident commander | <name> | <path>
On-call engineer | <name> | <path>
Customer comms | <name> | <path>

Dependencies and backups

  • Backups: <where, how often, retention.>
  • Depends on: <upstream systems, vendors.>

Test record

Date | Scenario | What broke | Fixed
<date> | <drill> | <finding> | <yes / link>

Common questions

What should it include? Scope, targets, a runbook, contacts, and a test record.

Disaster recovery versus business continuity? DR restores systems; business continuity keeps the business running while that happens. A combined plan covers both.

How often should it be tested? At least annually, and after any major architecture change.
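
The testing rule above can be turned into a small check, for example in a scheduled job that reads the "Last tested" date from each plan. This is a minimal Python sketch: the function name, the 365-day reading of "annually", and the idea of recording the date of the last major architecture change are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: "at least annually" is read as no more than 365 days between drills.
MAX_DRILL_AGE = timedelta(days=365)


def drill_due(last_tested: date, today: date,
              last_major_change: Optional[date] = None) -> bool:
    """Return True if the plan should be drilled again.

    A drill is due when the last one is more than a year old, or when a
    major architecture change happened after the last drill.
    """
    if today - last_tested > MAX_DRILL_AGE:
        return True
    return last_major_change is not None and last_major_change > last_tested


# Hypothetical dates for illustration.
print(drill_due(date(2025, 1, 10), today=date(2026, 6, 11)))
print(drill_due(date(2026, 1, 10), today=date(2026, 6, 11),
                last_major_change=date(2026, 3, 1)))
```

Both calls report that a drill is due: the first because the last test is more than a year old, the second because the architecture changed after the last test.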

Keep the plan in a wiki the on-call team can reach fast and search under pressure, with version history showing what changed after each drill. For a section-by-section walkthrough of a filled-in plan, see the disaster recovery plan example. Pair this with the Service Runbook and the Incident Postmortem, or browse the full template library.