OPERATIONS / DEVOPS / SRE TEMPLATE
Disaster Recovery Plan Template
DR plan: recovery tiers, system inventory, activation criteria, recovery procedures, and testing schedule.
What's inside
| Field | Details |
|---|---|
| Scope | What systems and environments this plan covers |
| Owner | Name / team responsible for DR |
| Last Tested | |
| Next Test | |
| Last Updated | |
Recovery Tiers
Not everything is equally critical. Classify systems so the team knows what to recover first.
| Tier | RTO | RPO | Definition | Examples |
|---|---|---|---|---|
| Tier 1: Critical | < 1 hour | < 5 min | Business cannot function without these | Auth, database, payment processing |
| Tier 2: Important | < 4 hours | < 1 hour | Significant impact but workarounds exist | Search, notifications, analytics |
| Tier 3: Standard | < 24 hours | < 24 hours | Can tolerate a day of downtime | Internal tools, batch jobs, reporting |
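The tier list is most useful when it directly drives recovery order. A minimal sketch of that idea in Python; the system names and tier assignments below are illustrative placeholders, not part of this plan.

```python
# Illustrative only: placeholder systems and tier assignments.
# The point is that tier data, kept as data, can produce the recovery order.
SYSTEM_TIERS = {
    "auth-service": 1,
    "primary-database": 1,
    "payment-processing": 1,
    "search": 2,
    "notifications": 2,
    "internal-tools": 3,
}

def recovery_order(tiers: dict[str, int]) -> list[str]:
    """Sort systems so Tier 1 is recovered first, then Tier 2, then Tier 3."""
    return sorted(tiers, key=tiers.get)

print(recovery_order(SYSTEM_TIERS))
```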
System Inventory
| System | Tier | Backup Method | Backup Frequency | Restore Tested? | Restore Time |
|---|---|---|---|---|---|
| Primary database | Tier 1 | WAL streaming + daily snapshot | Continuous + daily | Yes | X minutes |
| Object storage | Tier 1 | Cross-region replication | Real-time | Yes | N/A — automatic |
| Application config | Tier 2 | Git + infra-as-code | On every change | Yes | X minutes |
| Search index | Tier 2 | Rebuild from primary DB | N/A | No | X hours |
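One way to keep the Backup Frequency column honest is a scheduled freshness check: the newest completed backup for each system must be younger than that system's RPO. A hedged Python sketch, where the system names are placeholders and `latest_backup_time` is a stub standing in for whatever your backup tooling exposes.

```python
# Hypothetical RPO check driven by the inventory above.
from datetime import datetime, timedelta, timezone

RPO_TARGETS = {
    # system: maximum tolerable data loss (placeholder values)
    "primary-database": timedelta(minutes=5),
    "application-config": timedelta(hours=1),
}

def latest_backup_time(system: str) -> datetime:
    # Stub: replace with a query against your backup system.
    return datetime.now(timezone.utc) - timedelta(minutes=2)

def rpo_violations() -> list[str]:
    """Return the systems whose most recent backup is older than their RPO."""
    now = datetime.now(timezone.utc)
    return [
        system
        for system, rpo in RPO_TARGETS.items()
        if now - latest_backup_time(system) > rpo
    ]

print(rpo_violations())
```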
Activation Criteria
When do we activate the DR plan? Be specific so there's no debate during a crisis.
- Primary region is unreachable for > X minutes (measured as sketched below)
- Primary database is unrecoverable and failover is needed
- Data corruption detected that requires point-in-time recovery
- Cloud provider declares a regional outage
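The first criterion only works if "unreachable for > X minutes" is measured the same way every time. A rough Python probe illustrating one interpretation; the health URL, window, and probe interval are placeholders to adapt to your monitoring.

```python
# Hypothetical reachability probe behind the "unreachable for > X minutes" criterion.
import time
import requests

PRIMARY_HEALTH_URL = "https://primary.example.internal/healthz"  # placeholder
UNREACHABLE_MINUTES = 15  # the "X" from the criterion above

def primary_unreachable_for(minutes: int, interval: int = 30) -> bool:
    """Return True only if every probe fails for the whole window."""
    deadline = time.time() + minutes * 60
    while time.time() < deadline:
        try:
            if requests.get(PRIMARY_HEALTH_URL, timeout=5).status_code == 200:
                return False  # primary answered; criterion not met
        except requests.RequestException:
            pass  # treat connection errors and timeouts as failed probes
        time.sleep(interval)
    return True
```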
Recovery Procedures
Database Recovery
- Assess: determine the type of failure (hardware, corruption, human error, region down)
- Choose recovery method: failover to replica / restore from snapshot / point-in-time recovery
- Execute recovery (include exact commands or link to runbook)
- Validate: run data integrity checks, compare row counts, verify critical records (a validation sketch follows this list)
- Reconnect: update application connection strings, restart services
- Verify: confirm application is healthy, run smoke tests
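For the validation step, a scripted comparison is faster and less error-prone than eyeballing tables during an incident. A minimal sketch assuming a PostgreSQL database reached via psycopg2; the table names, baseline counts, and tolerance are placeholders.

```python
# Hypothetical post-restore validation: compare row counts on the restored
# database against the last known-good counts captured before the incident.
import psycopg2

CRITICAL_TABLES = ["users", "orders", "payments"]  # placeholder table names

def row_counts(dsn: str) -> dict[str, int]:
    """Return row counts for the critical tables on one database."""
    counts = {}
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        for table in CRITICAL_TABLES:  # fixed allowlist, safe to interpolate
            cur.execute(f"SELECT count(*) FROM {table}")
            counts[table] = cur.fetchone()[0]
    return counts

def validate_restore(restored_dsn: str, expected: dict[str, int], tolerance: float = 0.01) -> bool:
    """Flag any table whose restored count deviates more than `tolerance` from the baseline."""
    ok = True
    for table, count in row_counts(restored_dsn).items():
        baseline = expected.get(table, 0)
        if baseline and abs(count - baseline) / baseline > tolerance:
            print(f"WARN {table}: restored={count}, expected~{baseline}")
            ok = False
    return ok
```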
Application Recovery
- Deploy application to DR environment (or activate standby)
- Update DNS / load balancer to route traffic to DR
- Verify: health checks, smoke tests, monitor error rates (a smoke-test sketch follows this list)
- Notify: update status page, inform stakeholders
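For the verification step, a small scripted smoke test gives a repeatable pass/fail answer instead of ad-hoc checks. A sketch assuming HTTP services probed with the requests library; the base URL and endpoint paths are placeholders.

```python
# Hypothetical smoke test run after traffic is pointed at the DR environment.
import sys
import requests

DR_BASE_URL = "https://dr.example.internal"    # placeholder
SMOKE_CHECKS = ["/healthz", "/api/v1/status"]  # placeholder endpoints

def smoke_test(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if every smoke endpoint answers HTTP 200 within the timeout."""
    failures = []
    for path in SMOKE_CHECKS:
        try:
            resp = requests.get(base_url + path, timeout=timeout)
            if resp.status_code != 200:
                failures.append(f"{path} -> HTTP {resp.status_code}")
        except requests.RequestException as exc:
            failures.append(f"{path} -> {exc}")
    for failure in failures:
        print("FAIL", failure)
    return not failures

if __name__ == "__main__":
    sys.exit(0 if smoke_test(DR_BASE_URL) else 1)
```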
Returning to Normal (Failback)
- Confirm primary environment is healthy and has capacity
- Sync data from DR back to primary (if needed)
- Gradually shift traffic back (canary → 50/50 → 100%; see the ramp sketch after this list)
- Deactivate DR environment or return to standby
- Post-incident review: what worked, what didn't, update this plan
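The gradual traffic shift is easier to execute consistently if the stages and abort condition are written down as code rather than decided on the fly. A hedged sketch where `set_primary_weight` and `error_rate` are stubs for your load balancer and metrics integrations.

```python
# Hypothetical failback ramp: shift traffic back to the primary in stages,
# pausing at each stage to watch error rates before continuing.
import time

STAGES = [5, 50, 100]  # percent of traffic on the primary at each stage (canary -> 50/50 -> 100%)
SOAK_SECONDS = 600     # how long to observe each stage
ERROR_BUDGET = 0.01    # abort failback if error rate exceeds 1%

def set_primary_weight(percent: int) -> None:
    """Placeholder: update weighted DNS / load balancer configuration here."""
    print(f"Routing {percent}% of traffic to primary")

def error_rate() -> float:
    """Placeholder: query your metrics system for the current error rate."""
    return 0.0

def failback() -> bool:
    for percent in STAGES:
        set_primary_weight(percent)
        time.sleep(SOAK_SECONDS)
        if error_rate() > ERROR_BUDGET:
            print(f"Error rate too high at {percent}%, rolling back to DR")
            set_primary_weight(0)
            return False
    return True
```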
Communication
| Audience | Channel | Who Sends | Template |
|---|---|---|---|
| Engineering team | Slack #incidents | IC | DR activated for [system]. Recovery in progress. ETA: X hours. |
| Leadership | Email / Slack DM | Eng lead | DR activated. Impact: [X]. Recovery timeline: [X]. Next update: [time]. |
| Customers | Status page | Comms lead | We are experiencing a service disruption. We are working on recovery. |
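To avoid wording messages mid-incident, the templates above can also live as strings that get filled in programmatically. A minimal sketch; delivery to Slack, email, or the status page is left to your existing integrations.

```python
# Keep the notification wording fixed; only the incident-specific fields vary.
NOTIFICATIONS = {
    "engineering": "DR activated for {system}. Recovery in progress. ETA: {eta}.",
    "leadership": "DR activated. Impact: {impact}. Recovery timeline: {eta}. Next update: {next_update}.",
    "customers": "We are experiencing a service disruption. We are working on recovery.",
}

def render(audience: str, **fields: str) -> str:
    """Fill the template for one audience with incident-specific values."""
    return NOTIFICATIONS[audience].format(**fields)

# Example:
# render("engineering", system="primary database", eta="2 hours")
```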
Testing Schedule
A DR plan that isn't tested is fiction. Schedule regular tests and document results.
| Test Type | Frequency | Last Tested | Result | Next Test |
|---|---|---|---|---|
| Backup restore (Tier 1 systems) | Monthly | | Pass | |
| Failover drill (full DR activation) | Quarterly | | Pass | |
| Tabletop exercise (walk through plan) | Bi-annually | | Pass | |