13 Sep 2013 My Data Backup Strategy – An Exercise in Paranoia (Podcast 386)
Today I’m going to walk you through my data backup strategies at home and in the field. This is in response to a question from a listener who heard me talking about this briefly on This Week in Photo. I should preface this with the disclaimer that I’m perhaps a little bit paranoid with my backups, but I should also add that I’ve never lost an image in 18 years of digital imaging, and that includes scans of slide film from way back when.
I’ve been running behind a little this week too, hoping to release an Iceland update a little earlier, but I’ve been struggling with some new software that I had hoped to use to show you my new Iceland portfolio images with. It’s almost ready now, and you can see my selected 50 shots on my site already, although I need a little more time to finesse the presentation. If you want a sneak preview, go to http://mbp.ac/iceland2013, and click the little arrow at the bottom right of the gallery to view the images full-screen.
So, before we jump into my backup strategies, I’d like to talk about how I synchronize my files between the various hard drives that I’ll discuss. When I first switched to the Mac OS, I initially used a product called Sync, Sync, Sync, but a couple of serious bugs introduced via upgrades and a few corrupted files that were possibly caused by this application got me looking for a new solution about six months ago, and what I decided on was an application that I’ve been very happy with, called ChronoSync.
ChronoSync is very powerful, and can be used to create all kinds of synchronization jobs. I’m not going to go into details today, but you can specify various types of Synchronization jobs, some of which could result in your deleting files by mistake if you don’t understand how a specific synchronization type works, so you have to be careful with applications like this, but as long as you read the help and set things up correctly, ChronoSync will serve you well. Another great thing about ChronoSync though is that you can click the Trial Sync button, and it will throw up a dialog that will tell you exactly what will be copied, what will be deleted and everything like that, so you can check your sync jobs before you actually run them.
Here’s a screenshot of one of my Sync jobs, which I use to synchronize my September 2013 raw files from my MacBook Pro drive to my Drobo 5D. I’ve selected Mirror Left-to-Right as the synchronization type, so that anything I delete from my local hard drive as I edit, is also deleted from my backup when I sync the two. You can also select to move deleted files to an archive folder if you don’t want to risk automatically deleting something by mistake.
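To make the idea of a one-way mirror and a trial run a little more concrete, here is a minimal Python sketch. It is not how ChronoSync works internally; the function name `plan_mirror` and the choice to compare files by size and modification time are my own assumptions for illustration. It only reports what would be copied and what would be deleted, without touching any files.

```python
import os


def plan_mirror(source, target):
    """Dry run of a left-to-right mirror: report changes, touch nothing.

    Returns (to_copy, to_delete) as sorted lists of relative paths.
    Files are compared by size and whole-second modification time,
    which is an assumption made for this sketch.
    """
    def list_files(root):
        found = {}
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                info = os.stat(full)
                rel = os.path.relpath(full, root)
                found[rel] = (info.st_size, int(info.st_mtime))
        return found

    left = list_files(source)
    right = list_files(target)

    # New or changed on the left: would be copied to the right.
    to_copy = sorted(rel for rel, sig in left.items() if right.get(rel) != sig)
    # Present only on the right: a mirror would delete these.
    to_delete = sorted(rel for rel in right if rel not in left)
    return to_copy, to_delete
```

Checking the `to_delete` list before running a real mirror is the step that guards against removing something from the backup by mistake.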
The other good thing about ChronoSync is that you can create Containers that are basically batches of Sync jobs, so for example, I have one Container that holds all of the Sync jobs that I run between my two Drobos, so when I’m ready to sync, I can run individual jobs, or just run the entire batch, syncing everything that I’ve built individual sync jobs for.
On Windows, I used to use Robocopy, which is included in the operating system, to sync files. I created text-based command files that I would just double-click to run the sync jobs, and the result was very much what ChronoSync is doing, but without the interface and ease of configuration. With Robocopy, you basically have to read the help, figure out all the commands you need, and write the scripts yourself. It’s not difficult, but it’s too complicated to try and cover here, especially as I no longer use it.
So, when I talk about running a sync job, or synchronizing between drives today, just understand that I’m talking about running a ChronoSync task, but this could be whatever you chose to use to synchronize between your own storage solutions. Another thing to note though, is that I do not recommend manually moving files around.
Manually copying files is OK for an initial backup, say when you get home from a trip and just copy an entire directory structure to your main hard drive. But once you start doing incremental backups, where you have to carry changed images, deletions, and any new files you create for black and white versions or other edits over to your backup drive, doing this manually quickly becomes a pain, and it’s error prone. Do find and use a good synchronization solution for this part of your workflow.
Let’s first touch on how I organize and access my images at home, and then how I back up my images. Initially, every photograph I shoot is first copied to the solid state drive in my MacBook Pro. Because of this, I always get the largest internal drive I can afford when I buy a new computer, so my MacBook Pro Retina has a 750GB solid state drive installed. This generally enables me to save all of each month’s images locally, before I clean that out as I move into the following month.
It’s totally up to you how you organize your images, but for me, having everything in a three layer year, month and day structure works well, especially when backing up. I’ve seen people that use location based folder names, but that relies on you remembering what you’ve already backed up and what you have not, or manually comparing your backups etc. which is error prone, not to mention a pain.
If everything for this year is under 2013, and everything for September say is in a directory named 09, each in their own day numbered folders, it’s really easy to check that you are backed up. With the keywording and collections that we now have in Lightroom and other media management software, it really isn’t necessary to include the location or shoot name in the folder structure, unless you have some sort of process imposed on you at a workplace or something.
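As a small illustration of that year, month and day layout, here is a Python sketch that builds the folder for a given shooting date. The function name `shoot_folder` and the `/Photos` root are hypothetical, used only for this example.

```python
from datetime import date
from pathlib import PurePosixPath


def shoot_folder(root, shot_on):
    """Return root/YYYY/MM/DD for the given shooting date."""
    return (
        PurePosixPath(root)
        / f"{shot_on:%Y}"   # four-digit year, e.g. 2013
        / f"{shot_on:%m}"   # zero-padded month, e.g. 09
        / f"{shot_on:%d}"   # zero-padded day, e.g. 13
    )


# Example: images shot on 13 September 2013
print(shoot_folder("/Photos", date(2013, 9, 13)))  # /Photos/2013/09/13
```

Because the month and day are zero-padded, the folders sort chronologically by name, which makes it easy to compare a source drive and a backup side by side.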
Anyway, I shoot away for the current month, and create a Sync job for that month that I can run after each shoot, and that automatically copies all of my images, including any changes or deletions, to my main storage, which is now a Drobo 5D. This is connected to my MacBook Pro, which is currently my main computer, via Thunderbolt. The Drobo 5D with an mSATA SSD Accelerator drive installed is lightning fast. It’s almost as fast as the internal SSD drive in my MacBook Pro, so backups are very fast, even with 100+ Gigabytes of data.
Now, I know that Drobos provide a certain amount of redundancy/fault tolerance, in that with my current configuration, if one hard drive fails, I can pull it out, and replace it, and my data will be safe, but I’ve had a drive fail, and with the amount of data I have stored, which is currently 5.12TB, it takes two to three days to rebuild the data once you put the new hard drive in. That means there is a two to three day window in which a second failed drive would cost me all of my data, and I don’t want to take that risk.
So, I have two Drobos, basically one is a copy of the other, but there is a second very important reason for having that second Drobo, which is Cloud Storage. My second Drobo, a 2nd Generation USB connected four bay Drobo, is connected to my old MacBook Pro, on which I have Backblaze installed. Backblaze currently costs just $50 per year, or $95 for two years, and that is for unlimited storage. As I say, I have over 5 terabytes of data and every byte of that is uploaded to Backblaze, so if I lose any files, I can download them from Backblaze at any time, and that has happened in the past.
I did a portrait shoot for a client and when I came to work on some prints for them, I found one of the images was corrupted. I believe, though I can’t prove, that this was caused by my last synchronization application during a few back-and-forth synchronizations, so all of my backups were corrupted. Of course, the Backblaze copy was also corrupt, but Backblaze keeps up to four weeks of versions of files, and because I found this issue within a few weeks, I was able to roll back to an uncorrupted version of the image file.
Had I gone past the four weeks, I’d have lost the file and had to deal with the embarrassing task of telling my client that I lost one of their precious portrait images, so it’s important to ensure that things don’t get corrupted, and that is why I switched synchronization software and touch-wood, nothing has been corrupted since.
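One way to notice this kind of silent corruption sooner is to record a checksum for each file when it is first imported, and re-check it later. This is not something I described doing above, and it is not a ChronoSync or Backblaze feature as far as this sketch is concerned; it is just a minimal Python illustration, with hypothetical function names, of how a changed file could be detected. It only makes sense for files you expect to stay byte-for-byte identical, such as original raw files.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record the current checksum of each file (e.g. right after import)."""
    return {path: sha256_of(path) for path in paths}


def changed_files(manifest):
    """Return the files whose contents no longer match the manifest."""
    return sorted(
        path for path, expected in manifest.items()
        if sha256_of(path) != expected
    )
```

A file that shows up in `changed_files` is not necessarily corrupt, since an intentional edit changes the checksum too, but an unexpected entry is a prompt to look at the file while an older version can still be restored.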
So, to recap, my main workflow at home is to shoot for a month, keeping everything on my MacBook Pro, and then after each shoot everything gets backed up first to a Drobo 5D connected via Thunderbolt, and then a second backup is done over the network to a second Drobo connected via USB to my old MacBook Pro. As long as my Backblaze backup is up to date, I can usually back up around 20GB of data per day, so unless I have done a really big shoot, I’m usually backed up in the cloud too within about 24 hours.
Once I’ve finished processing each month’s images, usually within the first week or so of the following month, I run one last synchronization from my local MacBook Pro drive to the Drobos, then I delete the images from the local drive. At that point, Lightroom sees that the local images are missing, and I point it to the new month directory on the Drobo 5D, and continue to access my images as normal.
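The arithmetic behind that cloud backup window is simple enough to sketch in Python. The 20GB per day figure is the rate mentioned above; the function name and the example shoot sizes are just for illustration, and real upload speeds will vary.

```python
def days_to_upload(size_gb, rate_gb_per_day=20.0):
    """Estimate how many days an upload takes at a steady daily rate."""
    if rate_gb_per_day <= 0:
        raise ValueError("rate_gb_per_day must be positive")
    return size_gb / rate_gb_per_day


# A 20GB shoot fits in about a day; a 100GB shoot takes about five.
print(days_to_upload(20))   # 1.0
print(days_to_upload(100))  # 5.0
```

This is why a big trip can leave the cloud copy several days behind the local backups.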
I also catalog the images on my old Drobo over the network, so if I need anything while I’m not at my desk with the Drobo 5D plugged in, I can still access it over the network from anywhere in the house. This is important because, if you recall, my office and studio are on the 3rd floor of our apartment, and our living space is on the second floor. So that I don’t spend too much time in the studio, I work from our living room or dining table for a while after breakfast, and then in the evenings, and it’s nice to be able to get to stuff over the network if I need to.
In the Field
That’s my basic home/office workflow, but now let’s look at what I do when I’m traveling. Right now I use four portable hard drives in the following way. I have two 2TB Western Digital My Passport Studio drives that are my main backups in the field. These are the two drives stacked together in this image (below).
Every day when I get to the hotel, I transfer all of my images to my local hard drive first. Then, I synchronize that to my first 2TB hard disk. I usually try to do at least this much before dinner, and I put the 2TB drive in my pocket before I leave the room. If I don’t have time for that, I still put the drive in my pocket because it contains all the previous days backups, but I also put the compact flash cards from that day in my pocket too, rather than leaving them in the hotel room.
Once I get back to the room after dinner, if I have shot any video on my GoPros that day, I back them up to a third 1TB hard drive, that I connect to the computer via Thunderbolt. This is the white Buffalo drive that you can see to the left in this photo. This isn’t much faster than the Firewire Drives though, because it’s a slow 2.5in hard drive. Thunderbolt is kind of wasted on standard 2.5in hard drives, which I guessed would be the case, but I bought this to try it anyway. I won’t buy any more unless they boast very fast hard drive speeds to keep up with Thunderbolt. In fact, I think if I buy anything else for portable backups, it will probably be a Drobo Mini, because they have the same SSD acceleration that the Drobo 5D uses, and that screams along.
Once I have my video backed up to my 1TB hard drive, I run a synchronization between that drive and my first 2TB hard drive, and then, I synchronize my first 2TB hard drive with the second 2TB hard drive. That gives me my two backups in the field, so if I need to, I can delete the images from the local hard drive, although I try to avoid this if at all possible. As I mentioned earlier, I have a 750GB internal solid state drive so I can usually shoot for around three to four weeks before I have to start deleting stuff. This is also why I back up my video straight to an external drive, as I’d fill up my local drive too quickly otherwise.
Finally, usually before I go to sleep, I plug my fourth portable hard drive, which you can see at the back in this photo (above), into my computer. This is my portable Time Machine backup. This means that until I delete anything from the local hard drive, I actually have four copies of everything while traveling. This is enough to keep me happy.
Note that for the last few versions of the Mac OS, you can now have multiple Time Machine backup drives. In the photo here (above) you can also see a Belkin Thunderbolt hub, into which I basically plug everything, including my Drobo 5D, which chains to my external monitor via Thunderbolt. All of my other USB3.0 and USB2.0 devices, as well as my Firewire card reader, Wacom tablet and speakers etc. plug into this too, so when I sit at my desk, I actually only have to plug in the power to the MacBook Pro, and one Thunderbolt cable, and everything just connects. The reason I mention this is because I also have a USB3.0 external hard drive attached to this hub, which continuously updates a Time Machine backup of my computer when I’m at my desk. The portable Time Machine copy is only used when traveling.
The main thing to note about these hard disks now, is that I always carry my main 2TB backup disk with me everywhere when I’m traveling. It not only goes to dinner with me, but it stays in my photographer’s vest all day long. When possible, I also carry my 1TB hard drive that you can see at the back of the photo. This disk is very tough, even withstanding a bit of a dunk in water if necessary, so as long as I haven’t deleted my local copy of my images, they are all in there, inside my Time Machine backup, so if I lost my computer, I could get all my information back, including mail and other personal data to the point of the last backup.
Note too that my 2TB drives are large enough for me to keep a backup of all of what I call my Final images. These are images that I have selected for my portfolios, or stuff that I feel is good enough to show people. If I have done a black and white conversion in Silver Efex Pro for example, I will have the original RAW file, and the converted TIFF or PSD file in my Finals folders too. These are organized by year, so I basically end each year with a new folder, with all of my best shots and original RAW files for that year. This means if I’m traveling and someone needs a few images from me, the chances are I can get them to them from on the road. I can also access all of my RAW files for my best work to give demonstrations of software etc.
I also keep all of my RAW files from every shoot that I do during any given year on this 2TB drive, because when I get home from a big trip, it will take a while for the backups to upload to Backblaze, especially if I have video to upload too. This means that I can carry my hard drive around with me for a while after I get home, and if anything should happen to my house while I’m out, I don’t lose all my recent work.
So, one last summary here, I have all of my images, and all of my documents, email, music and everything that I value, all on my Drobo 5D, which is my main storage. That is backed up to a second Drobo and that gets backed up to the cloud using Backblaze. This is three copies of all of my data, which is currently 5.12TB and counting. When I travel, I have at least two external backups of my work, as well as a Time Machine backup, in case I lose my computer.
Off Site Backup
Now that I have this much redundancy in my backups, including the cloud backup, I don’t do off-site backups as much as I used to. When I was still in my old day job, I would keep a backup of all my data on a few 3.5 inch hard disks that I would load into an external bay occasionally, and sync from my main data, then take that copy back to the office and just leave it in a drawer. That copy was still in Tokyo though, so every year or so, I would also copy my entire library to a series of old hard drives, and send them to my brother in the UK, who would just store the hard disks somewhere for me.
This is less important to me now that all of my data is in two places at home and the cloud, but when I can, I still like to do this. It’s just one more backup that could save my ass if something really nasty happened here in Japan, at the same time as Backblaze turning pear-shaped, although I can never see that happening. Realistically though, if anything did happen to my local backups, I’d probably request my data to be sent to me on hard disks from Backblaze rather than my brother, as the copies he has are never going to include my latest work.
As I said, I might be a little bit paranoid about my backups, but if even a part of what I do gives you a hint on how you might improve your own backup strategy, that’s great. The most important thing to remember is that all hard drives fail at some point, so you should never trust your images in just one place. The minimum you should do is back up to an external hard drive, and if possible, make a backup of that to keep away from your home, or sign up for a Backblaze account or a similar service, and ensure that your precious images are also backed up in the cloud.
Music by UniqueTracks