Thursday, August 4, 2011

You DO have a backup, right?

As a professional system administrator who has dealt with many potential data-loss situations both at work and among my extended family, and who has personally lived through a house fire, I have a somewhat uncommon level of expertise when it comes to preparing for the inevitable loss of electronic data, whether it comes from hardware failure, ill-advised modification, theft, or natural disaster. I often hear people mention how lost they would be if something happened to the data on their computer. Sometimes they ask my advice on the subject. More often, I give it without being asked because it's one of my pet peeves.


The following is the general advice that I give. It is presented here from the perspective of someone who's asked about the wisdom of buying a second drive to set up a live mirror of the primary drive on their home desktop computer, because that's the question I received the day before penning this page.

QUESTION: Should I get a Western Digital 2TB green drive to put in my home desktop computer (running Windows Vista) to mirror my main drive?
I bought a WD 2TB green drive a while back. So far, so good.  WD's green series gets good reviews.  WD as a manufacturer is certainly no worse than the other big names, and depending on which year it is, is occasionally the most reliable brand out there.  I bought mine to put into an external enclosure without a fan, so the low power requirement (meaning low heat) was important for me.  I also bought a Samsung EcoGreen 2TB drive at the same time. That's my second Samsung drive, and both have worked flawlessly.  (I also own a few Seagate drives and have had no problems with them, either.)  I bought a pair of Rosewill RX35-AT-SC external enclosures in which to put those drives, and they, too, have worked just fine so far.  The drawback today is that this particular model doesn't support larger than 2TB drives.  I'll explain my usage for those items below.


I only own one Windows computer (most of my computers run Linux), so I've never set up mirrored drives on Windows.  I can't help you with the process any more than a Google search could.  On Linux, mirroring drives is pretty easy with LVM (the Logical Volume Manager).

However, my opinion on mirrored drives for general purpose home computers is that your money can be better spent elsewhere.  Exactly which problem(s) are you trying to solve with mirrored drives?  If you use your computer for something that requires it to be constantly accessible (like a media PC that acts as a DVR, or for running your home phone through it), then mirrored drives (aka "RAID 1") are a good thing.  However, if you can tolerate a day or so of down time (with no data loss) in the event of a disk failure, then RAID 1 is not the best answer.

In case you were wondering, "RAID" stands for "Redundant Array of Inexpensive Disks."  There are six RAID configurations (plus combinations thereof and a couple related configurations that offer no redundancy), each offering different methods for providing increased space, speed, or reliability beyond that of which a single drive is capable.  RAID 1 means duplicating everything to a second drive in real time.  RAID has been used on servers for decades, but is fairly new to the desktop world.

Why not use mirroring at home? Because RAID 1 only protects you against hard drive failure.   (Well, it can also give you better read speeds, but that's rarely an issue for home users.)  RAID 1 won't protect you if your computer gets stolen or if your house burns down or floods, because both copies of your data are still inside the same box.  RAID 1 also won't protect you if you accidentally delete an entire folder and then wish you could get it back (or if a virus does that for you), because any changes to the first drive are instantly duplicated on the second drive.  Granted, mechanical drive failure is by far the most common form of data loss, but the other failures aren't so rare that they should be ignored.  My standard mantra is, "Never let the same thief, flood, or fire take out every copy of your data."

QUESTION: I burn my data to a DVD on the first of every month and then put that in a safety deposit box.  Isn't that good enough?
You tell me if that's good enough for you.  How bad would it be if your computer crashed or disappeared on the 31st of the month and you lost the last 30 days of changes?  If you have very little important data on your computer, and if your backup DVD is burned just hours after you balance the checkbook for that month, then maybe the potential of losing 30 days of changes is something you can live with.  I sure couldn't.  I practically live online and store lots of important info on my computer, and a month of photography could mean dozens of gigabytes of irreplaceable photos.  For me, losing more than a couple days of changes would be very painful.
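To put a number on that exposure, here is a minimal Python sketch. It simply measures the gap between the most recent backup and the moment of failure; the function name and the example dates are made up for illustration.

```python
from datetime import datetime, timedelta


def exposure(last_backup: datetime, failure: datetime) -> timedelta:
    """Return how much work is lost: everything changed since the last backup."""
    if failure < last_backup:
        raise ValueError("failure cannot happen before the last backup")
    return failure - last_backup


# Monthly DVD burned on the 1st, computer dies on the 31st.
monthly = exposure(datetime(2011, 8, 1), datetime(2011, 8, 31))
print(monthly.days)  # 30

# Daily backup at 6am, computer dies just before the next one runs.
daily = exposure(datetime(2011, 8, 30, 6, 0), datetime(2011, 8, 31, 5, 59))
print(daily)  # 23:59:00
```

The worst case is always roughly one full backup interval, so the question to ask is how long an interval you can stand to lose.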

QUESTION: So how do you handle this problem at your house?
On my primary desktop computer -- the one where I house 100,000 digital photos and all our financial records -- I don't use mirrored drives.  Instead, I make daily backup copies of my entire drive onto secondary hard drives.  I have one backup drive located in my computer that gets automatically synchronized every morning at 6am.  This drive contains an exact copy of my primary drive as of 6am.  If I accidentally change or delete something (as I occasionally do), I can just grab a day-old version of it off my backup drive.  If the primary drive fails, I can simply remove it and boot off the backup drive, because it's an exact duplicate.

On Unix, including Apple's Mac OS X, the software I use to sync my data is "rsync(1)".  On Windows, Microsoft's "SyncToy" software works well and is (surprisingly) free.
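As a rough illustration of a daily rsync job (not the exact script used here), the Python sketch below builds and runs an rsync command. The paths /home/user and /mnt/backup are placeholders, and the choice of the -a (archive) and --delete (remove files from the copy that no longer exist on the source) options is an assumption about what a simple mirror-style sync needs. Copying files this way does not by itself make the backup drive bootable.

```python
import subprocess
from typing import List


def build_rsync_command(source: str, destination: str, dry_run: bool = False) -> List[str]:
    """Build an rsync command that makes `destination` match `source`."""
    # A trailing slash tells rsync to copy the directory's contents,
    # not the directory itself.
    src = source.rstrip("/") + "/"
    cmd = ["rsync", "-a", "--delete"]
    if dry_run:
        # Show what would change without touching the backup drive.
        cmd.append("--dry-run")
    cmd += [src, destination]
    return cmd


def run_backup(source: str, destination: str, dry_run: bool = False) -> int:
    """Run rsync and return its exit code (0 means success)."""
    result = subprocess.run(build_rsync_command(source, destination, dry_run))
    return result.returncode


# Placeholder paths; print the command rather than running it here.
print(" ".join(build_rsync_command("/home/user", "/mnt/backup", dry_run=True)))
```

A scheduler such as cron can then call run_backup every morning; a crontab time field of `0 6 * * *` corresponds to 6am daily. Trying it with dry_run=True first is a cheap way to make sure --delete isn't about to remove something you wanted.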

In order to address the thief/flood/fire problem, I also have two USB hard drives (mentioned above) that I store at my office across town.  I alternate which one I bring home each evening to sync up with my home computer.  The next morning, I take it back to work and bring the other drive home the following night.  Because I've got two USB drives, there's always at least one of them located off site, so I'm still protected if my house burns down while I've got one of them at home to be synced.  Although I have two different models of hard drives so they don't both die at the same time, the two external enclosures are the same model, so I can leave one set of power cords at each location and don't have to cart them back and forth with me every day.  This setup provides me with a 2-day-old copy of my data in case I do something bad and don't discover it until the next day.  Both of these hard drives are encrypted (using LUKS on Linux, but Windows can do the same), because I don't trust the cleaning crew at work with my financial records.  Identity theft will make your life hell.
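The alternating schedule is simple enough to sketch in a few lines of Python. This is only an illustration: the drive labels are made up, and it assumes the swap happens every calendar day, weekends included.

```python
from datetime import date, timedelta

# Hypothetical labels for the two off-site USB drives.
DRIVES = ("USB drive A", "USB drive B")


def drive_to_bring_home(day: date) -> str:
    """Pick tonight's drive by alternating on the calendar day number."""
    return DRIVES[day.toordinal() % 2]


def drive_staying_offsite(day: date) -> str:
    """The other drive stays at the office that night."""
    return DRIVES[(day.toordinal() + 1) % 2]


start = date(2011, 8, 4)
for offset in range(4):
    day = start + timedelta(days=offset)
    print(day.isoformat(),
          "home:", drive_to_bring_home(day),
          "| off site:", drive_staying_offsite(day))
```

Whichever drive comes home, the other one is across town that night, which is the whole point of owning two.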

Since I mentioned digital photos (I'm a semi-pro photographer), I never delete photos off the flash card until after they've been copied to one of the off-site backup drives and removed from my home... unless, of course, all of my flash cards are full.  At this writing, I only own about 54GB of flash cards, and sometimes that's not enough to make it through a busy weekend.  In that case, I at least store my USB drive out in the garage so it's somewhat physically separated from the computer in my basement.  Yeah, I'm paranoid.

For those not keeping score at home, this means that by the time a file is 48 hours old, it exists on four separate hard drives in two separate locations.  Some say I'm pedantic, but I've lived through one house fire and have had to help many friends and relatives recover from dead hard drives that caused them to lose every bit of electronic info they owned.  Ask yourself:  in the unfortunate event that your house was reduced to smouldering rubble, how gladly would you write a check for $100 to get back all of your family digital photos and your financial records?  Well, $100 won't help you after the fact.  Today, $100 is all it takes to get an external USB hard drive which will save you the trouble.
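The "four drives in two locations" arrangement can be checked against the thief/flood/fire rule with a small Python sketch. The drive and location names below are labels made up for illustration; the check simply asks whether at least one copy survives if any single location is wiped out.

```python
from typing import List, Tuple

Copy = Tuple[str, str]  # (drive label, location)


def survives_loss_of(copies: List[Copy], lost_location: str) -> bool:
    """True if at least one copy lives somewhere other than the lost location."""
    return any(location != lost_location for _, location in copies)


def survives_any_single_site_loss(copies: List[Copy]) -> bool:
    """True if no single thief, flood, or fire can take out every copy."""
    locations = {location for _, location in copies}
    return bool(copies) and all(survives_loss_of(copies, loc) for loc in locations)


# Both USB drives at the office.
normal_day = [
    ("primary drive", "home"),
    ("internal backup drive", "home"),
    ("USB drive A", "office"),
    ("USB drive B", "office"),
]

# One USB drive at home overnight for syncing.
sync_night = [
    ("primary drive", "home"),
    ("internal backup drive", "home"),
    ("USB drive A", "home"),
    ("USB drive B", "office"),
]

# Mirrored drives only: both copies in the same box.
mirror_only = [("primary drive", "home"), ("mirror drive", "home")]

print(survives_any_single_site_loss(normal_day))   # True
print(survives_any_single_site_loss(sync_night))   # True
print(survives_any_single_site_loss(mirror_only))  # False
```

The mirror-only case fails for the same reason given earlier: every copy sits in one place.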

Interested in other basic computer skills for photographers?  Read about my strategy for basic photo organization.

Edit: This strategy was updated in 2012 due to increased storage needs.  You can read about the new (but still similar) methodology in this post.

If you have any questions or suggestions, please speak up in the comments section below.

4 comments:

  1. Nearly a year after originally posting this article, I ran across some videos produced by big name editorial photographer/videographer Chase Jarvis outlining how he handles workflow and backups. While the videos are a couple years old now, Chase says they're still accurate. It's nice to see that somebody like him has essentially the same backup philosophy that I advocate above.

    The first video is only 10 minutes long. The second one is 90 minutes.

    http://blog.ChaseJarvis.com/blog/2010/06/workflow-and-backup-for-photo-video/

    http://blog.ChaseJarvis.com/blog/2010/10/photo-workflow-backup-chasejarvi/

  2. I finally found someone as paranoid as I am. It also amazes me how small companies, not just photographers, take this so lightly.
    I have run into many individuals who are pleased as punch with how organized they are, always backing up to a second drive that is either inside the computer(!) or sits next to it, permanently.
    Something I haven't seen mentioned much in any of this is the importance of having EVERYTHING that is connected to the main computer on a surge protector. I was developing on-site for a company and had set their system up completely on good surge protectors, etc. Several months after I had finished, I got a call that their machine was dead. I went in and found someone had moved the console (ok...I'm old) to another area and plugged it into the wall. I opened up the box and it was real evident where the surge came in. I expect a hurriedly connected USB, FireWire device, or monitor would cause the same thing.

  3. Do you mind if I quote a few of your posts as long as I provide credit and sources back to
    your weblog? My blog site is in the very same area of interest
    as yours and my users would really benefit from some of the information you present here.
    Please let me know if this ok with you. Thanks a lot!


    My web blog

    Replies
    1. I know the above was written by a spammer who was just too stupid to link back to his site with the "My web blog" line, but I wanted to take this opportunity anyway to talk about quoting. Copying one or two sentences is generally fine, as long as you give attribution back to the original author. However, anything beyond that constitutes copyright infringement, regardless of whether you attribute it back to the original author.

