It’s Not a Disaster Recovery Plan if It’s Written After a Disaster!

by | Jul 7, 2020 | ZAG Standards

Developing a Disaster Recovery strategy is one of those tasks that tends to stay on the to-do list longer than it should. Whether the cause is the effort required to put one together, a lack of resources or funding, or simply too many other priorities, it shouldn’t be ignored. Not having a disaster recovery plan doesn’t hurt you as long as everything is running perfectly, but the minute a disaster strikes, it can be the difference between a speedy recovery and a costly failure.

Start with a Plan

The following may sound basic, but you must start somewhere. Creating a Disaster Recovery (DR) plan from scratch can be a daunting task, but don’t allow that to paralyze you. There are plenty of templates available to get you started and several professional services organizations that can provide help. If you can’t outsource the work, break it down into smaller, actionable items to make it easier to accomplish. You don’t have to get it all done at once. Set up a team, establish a meeting cadence, and assign small tasks. You will be amazed at how easily it will come together.

Be sure to include key stakeholders and business leaders from your organization in the planning stages.

Include the Basics

At a minimum, there are a few things that can be used as an outline to get your plan started.

Invoking the Plan & Declaring an Emergency

It is imperative to determine who in the company has the authority to declare an emergency, how that gets communicated out to the entire staff, and the details of how and when to invoke the plan.

Critical Hardware & Software Identified

Create a list of critical hardware and software to ensure that the right services are part of the DR process. Doing this allows you to establish priority on these items for restoration and business continuity.
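One way to keep such a list is as structured data that can be sorted into a restoration order. The sketch below is illustrative only: the system names, the priority tiers, and the `CriticalSystem` and `restoration_order` names are hypothetical and would be replaced by your own inventory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriticalSystem:
    name: str      # hypothetical system name
    kind: str      # "hardware" or "software"
    priority: int  # 1 = restore first


def restoration_order(systems):
    """Return systems sorted by priority, then by name for a stable list."""
    return sorted(systems, key=lambda s: (s.priority, s.name))


# Example inventory; every entry here is made up for illustration.
INVENTORY = [
    CriticalSystem("file-server", "hardware", 2),
    CriticalSystem("erp-application", "software", 1),
    CriticalSystem("email", "software", 3),
    CriticalSystem("core-switch", "hardware", 1),
]

if __name__ == "__main__":
    for system in restoration_order(INVENTORY):
        print(f"{system.priority}: {system.name} ({system.kind})")
```

Keeping the priority as an explicit field makes it easy to review with business stakeholders and to re-sort when priorities change.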

Application Flow Diagrams

Building out application flow diagrams for your critical applications makes it easier to understand the complex interactions between all systems involved in the application. The visual representation of the components ensures that you don’t miss any key elements when planning your disaster recovery options.

RPO & RTO: Recovery Point and Recovery Time Objectives

These are two critical metrics to consider when putting your plan together. They are often confused or interchanged, but each one should be planned out carefully.

  • Recovery Point Objective: this is the maximum amount of data loss you can tolerate, measured as time looking back BEFORE the disaster. Determining this metric upfront will help you plan your backup frequency. For example, if your RPO is 4 hours and your backups are nightly, you will not meet your objective.
  • Recovery Time Objective: this is the maximum amount of time AFTER a disaster that your business can tolerate before your systems are fully restored.
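The nightly-backup example can be expressed as a simple check. This is a minimal sketch, assuming that worst-case data loss is roughly one full backup interval (a disaster striking just before the next backup runs). The function names `meets_rpo` and `meets_rto` are hypothetical.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True if the gap between backups does not exceed the RPO.

    Assumption: worst-case data loss is about one backup interval.
    """
    return backup_interval <= rpo


def meets_rto(estimated_restore_time: timedelta, rto: timedelta) -> bool:
    """True if the estimated time to restore fits within the RTO."""
    return estimated_restore_time <= rto


if __name__ == "__main__":
    rpo = timedelta(hours=4)
    nightly = timedelta(hours=24)
    every_two_hours = timedelta(hours=2)

    print("Nightly backups meet a 4-hour RPO:", meets_rpo(nightly, rpo))
    print("2-hour backups meet a 4-hour RPO:", meets_rpo(every_two_hours, rpo))
```

With a 4-hour RPO, the nightly schedule fails the check and the 2-hour schedule passes, which matches the example above.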

Illustration of RPO and RTO timeline

Backups

Backup and Replication to DR Site

Whether you decide to replicate your backup data to the cloud or an offsite data center, make sure you consider geographic location when making your selection. Choosing a site far enough from your primary location reduces the risk of both sets of data being impacted by the same disaster.

Sufficient Bandwidth for Backups

Ensure that you have adequate bandwidth to support transferring data from the offsite location at rates that will allow you to meet your recovery time objectives.
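A rough estimate of transfer time can show whether a link is adequate before a disaster forces the question. The sketch below is back-of-the-envelope arithmetic, not a sizing tool. The `transfer_hours` function and its `efficiency` parameter (an assumed fraction of the nominal link speed actually achieved) are hypothetical, and the figures in the example are made up.

```python
def transfer_hours(data_gb: float, bandwidth_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move data_gb gigabytes over a link.

    bandwidth_mbps is the nominal link speed in megabits per second.
    efficiency is an assumed usable fraction of that speed (0 < efficiency <= 1).
    Decimal units are used: 1 GB = 1000 MB = 8000 megabits.
    """
    if data_gb < 0 or bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid data size, bandwidth, or efficiency")
    megabits = data_gb * 8 * 1000
    seconds = megabits / (bandwidth_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    rto_hours = 24  # hypothetical recovery time objective
    hours = transfer_hours(data_gb=2000, bandwidth_mbps=100)
    print(f"Estimated transfer time: {hours:.1f} hours")
    print("Fits within RTO:", hours <= rto_hours)
```

In this made-up case, pulling 2,000 GB over a 100 Mbps link takes more than two days, so a 24-hour RTO would be missed on transfer time alone, before any restore or validation work begins.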

Backup Procedures

Procedures should be formally documented, and the backup schedule published.
Frequency and retention of backups should be aligned with RPO.

Test Restores

Do you regularly run test restore jobs? Backups are only valuable if you can restore the data. ZAG recommends that test restores be done monthly or quarterly. An untested backup should never be trusted.
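Part of a test restore can be automated by comparing checksums of restored files against a reference copy. The sketch below is one possible approach, assuming you have a reference directory whose contents are known to match what was backed up. The `verify_restore` and `sha256_of` helpers are hypothetical and not tied to any particular backup product.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(reference_dir: Path, restored_dir: Path) -> list[str]:
    """List files that are missing from, or differ in, the restored copy."""
    problems = []
    for ref in sorted(reference_dir.rglob("*")):
        if not ref.is_file():
            continue
        rel = ref.relative_to(reference_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif sha256_of(ref) != sha256_of(restored):
            problems.append(f"changed: {rel.as_posix()}")
    return problems
```

An empty list from `verify_restore` means every reference file was found in the restored copy with identical contents. A check like this covers file integrity only, so application-level validation (for example, confirming a restored database actually starts) would still be done separately.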

SQL

  • Maintenance Plans: You should establish a database maintenance plan to be implemented and maintained by an application developer.
  • SQL Aware Backup: Backup software that is SQL aware can properly back up logs and other live file services while they are running, and can restore individual portions of databases.
  • Transaction Logs: SQL transaction logs should be truncated after each successful backup so that they do not grow unchecked.

Document and Test

It’s not enough to have a plan; it needs to be documented and tested. Key considerations include creating runbooks and setting a formal annual testing schedule.

DR Documentation and Procedures

  • Runbooks: A runbook documents the procedures required to get your organization back up and running in the event of a disaster.
  • Change Management: Include runbook updates in your organization’s change management process to ensure they are kept up to date.
  • Storage: In addition to electronic copies in multiple locations, you may also consider making hard copies and distributing them to key individuals quarterly. Doing this will ensure that at least one copy is readily available during a crisis.

Plan Updates/Testing

  • Annual Testing of DR Plan: Running through a mock disaster once a year can uncover missing components of your plan. Just reading through your plan doesn’t have the same impact as actually walking through it.
  • Quarterly Updates to the DR Plan: Processes and procedures change constantly, and the DR plan should be updated as well. Putting a quarterly review on the calendar ensures that your plan remains relevant.

In summary, if you are one of the lucky ones with both budget and resources, then kick off a project and make a concentrated effort to complete your Disaster Recovery plan immediately. Accomplish this by either outsourcing the work or setting up a dedicated in-house team. If you don’t have that luxury, then at least get the ball rolling. Start the discussions, set up regularly scheduled checkpoints, and break the work down into realistic chunks. In time, your Disaster Recovery plan will come together. Taking no action now will only make recovering from a disaster more costly, more lengthy, and more detrimental to your business.

 
