#! /usr/bin/ksh – Backup Admin Life: Testing

In a blog, far far away…

In my previous post on Automation in this series I’m calling “Backup Admin Life”, I talked about what we used to do with scripting to “get $#!+” done, and how the more complex the systems got, the more important it was to make sure things were automated and easy to use.

Restore v. Recover

In this blog, I’m going to wrestle with the question that most backup admins hate, or hated, having to answer: “Have we tested our restores? Do we know we can recover if we ever needed to?”

If you’re a recovering backup admin, you have probably had this thought come to mind:

“My manager just used two different words to ask two very different questions.  I wonder if she knows?  How should I answer?”

“Uh, restores should work.  We’ve gotten all successful backup jobs, so there’s no reason why it shouldn’t work.”

Then you think, “Damn, I seriously did not want to hear that question today of all days (Friday).” For an admin of legacy systems, there really was never a good answer. Restores? Sure, I’m sure we can restore. But recover? That’s a horse of a different color.

If the manager asked it deliberately, then two things are going through her mind as she ponders the backup admin’s response:

  1. That didn’t answer my question about recovery; it was only about “restores”.
  2. He’s probably thinking that response will quell the rest of my curiosity.

And you, as the backup admin, know full well that was the lamest answer ever, because it didn’t stop the questioning. In fact, it prolonged the conversation and led to the inevitable statement: “We should test it.”


TEST THE RECOVERY? WHAT? I’m a backup admin, not a recovery admin!! But what’s the purpose of backup if you never test recoveries, let alone validate that restores work? Without testing, backup gets relegated to being that ‘insurance policy’ we have to have and never use, and the higher value of data protection gets sent back to the dark ages.

Well, fortunately for me, I was never one to shy away from that question. In fact, as a data protection consultant, I was the one asking that horrible question, and I would suggest, with enthusiasm, that we TEST IT!!

Seriously? He just suggested, TEST IT?

Yes, I was that guy. The problem was that, “back in the day”, asking this question made me the friend of management and the enemy of the backup admin, but that quickly reversed when I asked management for a sufficient test environment on which to conduct our recoverability testing. In my last blog, Backup Admin Life: Automation, I talked about the capacity planning automation script I wrote to manage an undersized tape library, the result of a client’s unwillingness to purchase a right-sized solution to keep up with data growth. This particular client was in a similar boat, not because they were unwilling to invest but because they were unable to invest in the sizeable expansion of hardware needed to properly test recoverability. So, as with many others, this customer had to be satisfied with just validating that the data backed up to tape or disk could actually be read from the backup medium, without actually restoring it. This is a far cry from being able to test recovery.

I documented one customer’s plight, and the stark differences they uncovered between restore and recovery, on my LinkedIn profile in a piece called “Take A Business-Centric Approach to IT”. Life would have been much easier if we could have decided, at random, to test the recovery of a particular application without having to do very much heavy lifting. The sad fact is that, in most cases, the best we could do was read the data on tape or disk and compare it to some flat file the legacy app had stored.
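To make that read-and-compare check concrete, here is a minimal Python sketch of what it might look like. It is an illustration under assumptions, not any backup product's method: the function names `sha256_of` and `verify_restore`, and the idea of a manifest that maps each file's relative path to a SHA-256 checksum recorded at backup time, are all hypothetical. Notice that it only proves the bytes came back intact, which is restore validation, not recovery.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(manifest, restore_root):
    """Compare restored files against checksums recorded at backup time.

    manifest: dict of relative path -> expected SHA-256 hex digest
              (a hypothetical format, not a real backup catalog).
    Returns a dict of relative path -> "ok", "missing", or "mismatch".
    """
    results = {}
    for rel_path, expected in manifest.items():
        restored = Path(restore_root) / rel_path
        if not restored.is_file():
            results[rel_path] = "missing"    # never came back from the medium
        elif sha256_of(restored) != expected:
            results[rel_path] = "mismatch"   # came back, but the bytes differ
        else:
            results[rel_path] = "ok"         # readable and byte-identical
    return results
```

Even a clean run of a check like this says nothing about whether the application starts, whether its data is consistent, or how long it takes to get users working again. That gap is the difference between restore and recover.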

Good Enough is Not Enough

I used to work for a company whose motto was, “Good Enough is Not Enough”. Some people thought it was over-the-top perfectionism at its best, while those of us who worked at that company knew it meant that we deliver the absolute best to our customers each and every day. It had nothing to do with perfection, but everything to do with NOT settling. So, if you are still using a legacy data protection solution and you can’t easily, and at random, decide to test or validate your recoverability, why? Don’t settle. You can be doing so much more with your data, for your organization, and for yourself. I’ve never been a fan of “dead-ending” data to disk or tape, but that’s what most people do with backup applications. Extract the real value out of your valuable data without impacting your production environment, and do it without having to invest in a completely separate hardware infrastructure.

Because Good Enough is NOT Enough.

Get the t-shirt.

-Chapa, signing off
