Modern Disaster Recovery: Preparing For The Worst

As a follow-up to my blog post on Commvault’s site about a webinar I was privileged to join recently, here is the post I promised, taking a more in-depth look at the tips I shared.

In the webinar, “Mastering Modern Disaster Recovery,” Jeff Harbert, Director of Technical Services Programs at Commvault, and I discussed disaster recovery (DR) and how it has changed over the years. DR, as I first experienced it, meant trucking tapes from our off-site storage location to a DR facility such as Sungard or Comdisco, if you’re “mature” enough to remember that company. Today, we have moved to a cloud recovery model, one with more agility, speed, and, hopefully, efficiency.

The purpose of this post is to cover some of the topics we didn’t have time for during our one-hour time slot, which is usually the case with a topic as broad as DR.

What is Business-Centric IT?

Business-Centric IT is a standard I developed years ago to focus the technology solutions that IT offers or recommends on the business outcomes, goals, objectives, and key operational assets that drive the core business. This philosophy is the bridge that spans the gap between IT’s “technese” and real business value. It is the enabling factor that allows IT to be viewed in a much more strategic light and earn the proverbial “seat at the table.” When executed, a Business-Centric IT approach helps turn complex technology concepts into clear, comprehensive business values. Over the years I have stressed this principle as one of the most critical assets an IT organization can possess: the alignment of IT strategy with the goals and objectives of the business. If you’d like to learn more about this concept, please visit my SlideShare on Business-Centric IT.

A “Business-Centric IT” approach draws the business unit managers and executives into a conversation about the mission of the core business, its top priorities, and its expectations. At the same time, it gives IT space to identify complementary technology solutions that help achieve these business aspirations. While this may not sound like a typical IT conversation on the front end, the back-end conversation is 100% IT as you look to deploy solutions to support the front end. It is from this perspective that I typically address all of the challenges within IT, and DR is a perfect topic for this philosophy because it touches every aspect of the business.

Where to start

I wish there were a one-size-fits-all DR plan. Unfortunately, no such plan exists. Each DR plan is unique to the individual organization; however, there are a few things all DR plans can have in common, which I would call the bones or structure. Here is a short list of the items I focus on when working with clients to develop a modern DR plan.

  1. Identify the mission of the organization

In the past, we would often simply conduct a business impact analysis (BIA) session to gather critical information about the cost of downtime, revenue impact, customer impact, and so on. I still believe the BIA is an essential component, but identifying the mission of the organization can fast-track the discussion and often generates similar results, though without all of the in-depth detail. If you understand the mission of the organization (“the business”), you can begin to identify the data, systems, applications, and platforms that support this “mission.” If you are part of a larger company, the mission may be divided among business units, which in basic terms are functional groups that together make up the greater mission. In either case, the process is the same; it is the basis for how you prioritize the risks.

  2. Prioritize the risks

As mentioned in the previous item, it is essential to know how the functional groups or business units work together to support the core mission of the company. This knowledge will help you prioritize the recovery steps and the order in which the various groups are recovered. For example, if your business is an airline, your primary mission is keeping planes in the air, delivering cargo and people safely, and minimizing costs as much as possible. There may be several other functional groups in your organization as well, such as loyalty club membership, accounting, commercial sales, and so on. The picture will become clearer once you define the mission and begin to ask questions about the systems, applications, data, and platforms required to support it. This is the point where the interdependencies start to emerge, and you start to see how to sort the functional groups into risk categories. Start simply, with high, medium, and low. Define what those risk categories mean and their implications when you declare a disaster. These definitions will help executive management set expectations going forward. Remember, it is always a good idea to include executive sponsorship when creating your modern DR plan.
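To make the high, medium, and low idea concrete, here is a minimal sketch of how the categories could be recorded and turned into a recovery order. It is only an illustration: the group names, their assigned categories, the listed dependencies, and the names `Risk`, `FunctionalGroup`, and `recovery_order` are all assumptions of mine, loosely based on the airline example, and your own plan will look different.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Tuple


class Risk(IntEnum):
    """Risk categories; a lower value means the group is recovered first."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass(frozen=True)
class FunctionalGroup:
    name: str
    risk: Risk
    depends_on: Tuple[str, ...] = ()  # systems or platforms the group needs


def recovery_order(groups):
    """Sort groups by risk category, then by name so the order is stable."""
    return sorted(groups, key=lambda g: (g.risk, g.name))


# Hypothetical airline example; names and categories are illustrative only.
GROUPS = [
    FunctionalGroup("accounting", Risk.MEDIUM, ("erp",)),
    FunctionalGroup("flight operations", Risk.HIGH, ("scheduling", "crew systems")),
    FunctionalGroup("loyalty club", Risk.LOW, ("crm",)),
    FunctionalGroup("commercial sales", Risk.MEDIUM, ("crm", "reservations")),
]

for group in recovery_order(GROUPS):
    print(f"{group.risk.name:<6} {group.name}")
```

Keeping the dependencies next to each group is a deliberate choice: it is where the interdependencies become visible, for example two groups that both rely on the same platform.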

  3. Define what Declaration means to you

I can’t stress this point enough: define what a disaster declaration means to your organization. In other words, document how long the business can tolerate one of the functional groups, or business units, being down before it becomes potentially damaging to the company, its customers, its brand, and so on. You should take all of these into account. If you have ever gone through a business impact analysis, this is the financial portion of that analysis, which many use as the metric or KPI for declaring a disaster. You need to identify a pre-determined amount of time the business can be down and document it in your execution plan. This pre-determined time can align with the risk categories we discussed earlier, or it can be a single target across all of the functional groups. The decision depends on how the business must respond, which is why taking a “Business-Centric IT” approach is so important. While IT is the steward of the data, the business unit is the owner and will have much better insight into the level of response required when a potential disaster strikes.

I should explain that last phrase, “when a potential disaster strikes,” because until you reach the pre-determined amount of time, it is not a disaster, right? If there is a “smoking hole,” you can declare a disaster immediately, but in most cases we face some other type of event, since there is a continuum of events that could result in what you define as a disaster. I call any such event, short of the smoking hole, a “business interruption.” A business interruption is just what it sounds like: an event or situation that prevents business-as-usual operations. Once it is identified, the IT team has the pre-determined time to find a remedy before declaring a disaster.

This key metric in your plan helps prevent the “running around with your hair on fire” scenario; it allows you to assess the current situation and decide whether you can recover within that period. So, once again, defining what a declaration means to your organization is one of the top priorities.
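The declaration logic described above can be sketched in a few lines. Treat this as a rough illustration rather than a prescription: the tolerance values are hypothetical placeholders that the business, not IT, would supply, and the function name `assess`, its parameters, and the rule "declare when the estimated fix no longer fits in the window" are my assumptions about how such a check might be written.

```python
from datetime import timedelta

# Hypothetical tolerances per risk category; real values come from the business.
TOLERANCE = {
    "high": timedelta(hours=4),
    "medium": timedelta(hours=24),
    "low": timedelta(hours=72),
}


def assess(risk, elapsed, estimated_fix, smoking_hole=False):
    """Classify an event as a business interruption or a declared disaster.

    risk          -- "high", "medium", or "low"
    elapsed       -- time since the interruption was identified
    estimated_fix -- IT's estimate of the remaining time to remedy
    smoking_hole  -- True when the site is obviously lost
    """
    limit = TOLERANCE[risk]
    if smoking_hole or elapsed >= limit:
        return "declare disaster"
    if elapsed + estimated_fix > limit:
        # Assumption: declare early if the remedy cannot land inside the window.
        return "declare disaster"
    return "business interruption"


# One hour into an outage of a high-risk group, with a two-hour fix expected.
print(assess("high", timedelta(hours=1), timedelta(hours=2)))
```

The value of writing it down this way is that the threshold is agreed on before anything breaks, so the decision during an event is a comparison, not a debate.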

  4. Schedule DR Tests

Testing, testing, testing. Without it, you cannot validate that your solution will work. One of the things Jeff brought up during the webinar was that Commvault can test your DR by looking at the data in the CommServe and validating the recovery without having to go through an entire recovery test. Testing used to be a whole-day effort, whereas today the elasticity of the cloud allows you to spin up compute and storage to test and validate, and then spin it down when you complete testing. You want to make sure you are testing at least every quarter, and more often if your environment changes between quarterly tests. And don’t let the word “test” give you any anxiety. This is not the time to worry about whether you get an “A+” or an “F”; it is about validating your plan. If you are going to fail, you want to fail during a test so you can expose the gaps you need to address in your overall plan. Testing is essential and should be a leverage point for improving your strategy and execution plan.
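The "at least every quarter, and more if the environment changes" rule is simple enough to automate as a reminder. The sketch below is one hedged way to express it; the function name `dr_test_due`, its parameters, and the 91-day approximation of a quarter are my own assumptions, not part of any product.

```python
from datetime import date, timedelta

QUARTER = timedelta(days=91)  # assumed approximation of one quarter


def dr_test_due(last_test, today, last_environment_change=None):
    """Return True when a DR test should be scheduled."""
    if last_environment_change is not None and last_environment_change > last_test:
        return True  # the environment changed since the last test
    return today - last_test >= QUARTER


# A change made after the last test triggers a new test before the quarter is up.
print(dr_test_due(date(2024, 1, 1), date(2024, 2, 1), date(2024, 1, 15)))
```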

  5. Review with Executive Management

I mentioned engaging with executive management and getting executive sponsorship only briefly earlier, so it is worth expanding on here. My belief, based on a “Business-Centric IT” approach, is that you should engage executive management and request executive sponsorship when you are planning your DR strategy. Why? When you have an executive sponsoring your work, it is that much easier to get the other business leaders to respond and work collectively with IT during the initial discovery phases of DR planning. Looked at by itself, DR is really about bringing technology back to business-as-usual operations. It makes sense to engage executive management in this process to ensure you and your IT teams are moving things forward in alignment with the primary mission of the business.

In closing

Disaster recovery is a big topic, and it is clear that we only scratched the surface during the webinar, and even here in this post. My suggestion is to get in touch with the partners you work with and ask for assistance as you go forward with a DR strategy and plan. I would certainly recommend a conversation with Commvault to hear how their platform can support you and your IT organization as it supports the mission of your business. My company is also available to consult with you and your team to review your plan and strategy, or to spend a day going through a DR workshop and digging deeper into some of the things I mentioned in this post. Do not feel you must go this alone; DR is a critical component of your business. There are DR/BC professionals on staff with Commvault who have gone through this process and understand the questions to ask, so whether you are working with The CTE Group, Commvault, or your reseller partner, reach out to us. We are here to help.

 

David A Chapa, Founder, Chief Analyst, The CTE Group
