My Data Backup Strategy – An Exercise in Paranoia (Podcast 386)

13 Sep

Today I’m going to walk you through my data backup strategies at home and in the field. This is in response to a question from a listener who heard me talking about this briefly on This Week in Photo. I should preface this with the disclaimer that I’m perhaps a little bit paranoid with my backups, but I should also add that I’ve never lost an image in 18 years of digital imaging, and that includes scans of slide film from way back when.

I’ve been running a little behind this week too. I was hoping to release an Iceland update earlier, but I’ve been struggling with some new software that I had hoped to use to show you my new Iceland portfolio images. It’s almost ready now, and you can already see my selected 50 shots on my site, although I need a little more time to finesse the presentation. If you want a sneak preview, go to http://mbp.ac/iceland2013, and click the little arrow at the bottom right of the gallery to view the images full-screen.

Synchronization Software

So, before we jump into my backup strategies, I’d like to talk about how I synchronize my files between the various hard drives that I’ll discuss. When I first switched to the Mac OS, I used a product called Sync, Sync, Sync, but a couple of serious bugs introduced via upgrades, and a few corrupted files that were possibly caused by this application, got me looking for a new solution about six months ago. What I decided on was an application that I’ve been very happy with, called ChronoSync.

ChronoSync is very powerful, and can be used to create all kinds of synchronization jobs. I’m not going to go into details today, but some synchronization types could result in you deleting files by mistake if you don’t understand how they work, so you have to be careful with applications like this. As long as you read the help and set things up correctly though, ChronoSync will serve you well. Another great thing about ChronoSync is that you can click the Trial Sync button, and it will show a dialog that tells you exactly what will be copied and what will be deleted, so you can check your sync jobs before you actually run them.

Here’s a screenshot of one of my Sync jobs, which I use to synchronize my September 2013 raw files from my MacBook Pro drive to my Drobo 5D. I’ve selected Mirror Left-to-Right as the synchronization type, so that anything I delete from my local hard drive as I edit is also deleted from my backup when I sync the two. You can also choose to move deleted files to an archive folder if you don’t want to risk automatically deleting something by mistake.

ChronoSync Screenshot
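
If you are curious what a Mirror Left-to-Right job does in principle, the idea can be sketched in a few lines of Python. This is not how ChronoSync is implemented, just an illustration under simple assumptions: the names plan_mirror and run_mirror are made up for this example, a file counts as changed if its size differs or the source copy is newer, and the optional archive folder is assumed to sit outside the destination. plan_mirror plays the role of a trial sync, reporting what would be copied and deleted without touching anything.

```python
import shutil
from pathlib import Path
from typing import Dict, List, Optional


def _files_under(root: Path) -> set:
    """Relative paths of every file below root (empty if root is missing)."""
    if not root.is_dir():
        return set()
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def plan_mirror(left: Path, right: Path) -> Dict[str, List[Path]]:
    """Trial run: list what a left-to-right mirror would copy and delete."""
    left_files = _files_under(left)
    right_files = _files_under(right)

    to_copy = []
    for rel in sorted(left_files):
        if rel not in right_files:
            to_copy.append(rel)  # new file on the left
            continue
        src, dst = (left / rel).stat(), (right / rel).stat()
        # Assumption: a different size or a newer source means "changed".
        if src.st_size != dst.st_size or src.st_mtime > dst.st_mtime:
            to_copy.append(rel)

    # Anything that exists only on the right was deleted on the left.
    to_delete = sorted(right_files - left_files)
    return {"copy": to_copy, "delete": to_delete}


def run_mirror(left: Path, right: Path,
               archive: Optional[Path] = None) -> Dict[str, List[Path]]:
    """Apply the plan. With an archive folder, deletions are moved, not removed."""
    plan = plan_mirror(left, right)
    for rel in plan["copy"]:
        target = right / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(left / rel, target)  # copy2 also preserves timestamps
    for rel in plan["delete"]:
        victim = right / rel
        if archive is None:
            victim.unlink()
        else:
            kept = archive / rel
            kept.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(victim), str(kept))
    # Note: emptied folders on the right are left in place in this sketch.
    return plan
```

Calling plan_mirror first and reading its "delete" list before calling run_mirror is the scripted equivalent of checking a trial sync before running the real job.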

The other good thing about ChronoSync is that you can create Containers, which are basically batches of Sync jobs. For example, I have one Container that holds all of the Sync jobs that I run between my two Drobos, so when I’m ready to sync, I can run individual jobs, or just run the entire batch to sync everything that I’ve built individual sync jobs for.

On Windows, I used to use Robocopy, which is included in the operating system, to sync files between drives. I created text-based command files that I would just double-click to run the sync jobs, and the result was very much what ChronoSync is doing, but without the interface and ease of configuration. With Robocopy you basically have to read the help, figure out all the commands you need, and write the scripts yourself. It’s not difficult, but it’s too complicated to try and cover here, especially as I no longer use it.

So, when I talk about running a sync job, or synchronizing between drives today, just understand that I’m talking about running a ChronoSync task, but this could be whatever you choose to use to synchronize between your own storage solutions. Another thing to note though, is that I do not recommend manually moving files around.

Manually copying files is OK for an initial backup, like say when you get home from a trip and just copy an entire directory structure to your main hard drive. Once you have to start doing incremental backups though, copying changed images, deletions, and any new files you create for black and white versions or other edits to your backup drive, it quickly becomes a pain to do this manually, and it’s error prone. Do find and use a good synchronization solution for this part of your workflow.

At Home/Office

Let’s first touch on how I organize and access my images at home, and then how I back up my images. Initially, every photograph I shoot is first copied to the solid state drive in my MacBook Pro. Because of this, I always get the largest internal drive I can afford when I buy a new computer, so my MacBook Pro Retina has a 750GB solid state drive installed. This generally enables me to save all of each month’s images locally, before I clean that out as I move into the following month.

It’s totally up to you how you organize your images, but for me, having everything in a three-layer year, month and day structure works well, especially when backing up. I’ve seen people who use location-based folder names, but that relies on you remembering what you’ve already backed up and what you have not, or manually comparing your backups etc. which is error prone, not to mention a pain.

If everything for this year is under 2013, and everything for September say is in a directory named 09, each in their own day numbered folders, it’s really easy to check that you are backed up. With the keywording and collections that we now have in Lightroom and other media management software, it really isn’t necessary to include the location or shoot name in the folder structure, unless you have some sort of process imposed on you at a workplace or something.
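
To make the structure concrete, here is a minimal Python sketch of the year/month/day layout and of the kind of quick check it makes possible. The function names shoot_folder and missing_day_folders, and the root folder names in the usage note, are invented for this illustration and are not part of any tool mentioned here.

```python
from datetime import date
from pathlib import Path
from typing import List


def shoot_folder(root: Path, shot_on: date) -> Path:
    """Return root/YYYY/MM/DD for the day a shoot took place."""
    return (root / f"{shot_on.year:04d}"
                 / f"{shot_on.month:02d}"
                 / f"{shot_on.day:02d}")


def missing_day_folders(source: Path, backup: Path) -> List[Path]:
    """Day folders (YYYY/MM/DD) that exist under source but not under backup."""
    pattern = "[0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]"
    days = sorted(p.relative_to(source)
                  for p in source.glob(pattern) if p.is_dir())
    return [day for day in days if not (backup / day).is_dir()]
```

For example, shoot_folder(Path("Photos"), date(2013, 9, 13)) gives Photos/2013/09/13, and an empty list from missing_day_folders means every shooting day has at least a matching folder on the backup. It only compares folder names, so it complements a proper sync job rather than replacing one.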

Anyway, I shoot away for the current month, and create a Sync job for that month that I can run after each shoot, and that automatically copies all of my images, including any changes or deletions, to my main storage, which is now a Drobo 5D. This is connected to my MacBook Pro, which is currently my main computer, via Thunderbolt. The Drobo 5D with an mSATA SSD Accelerator drive installed is lightning fast. It’s almost as fast as the internal SSD drive in my MacBook Pro, so backups are very fast, even with 100+ Gigabytes of data.

Drobo 5D Screenshot

Now, I know that Drobos provide a certain amount of redundancy/fault tolerance, in that with my current configuration, if one hard drive fails, I can pull it out and replace it, and my data will be safe. But I’ve had a drive fail, and with the amount of data I have stored, which is currently 5.12TB, it takes two to three days to rebuild the data once you put the new hard drive in. That means there is a two to three day window in which a second failed drive would cost me all of my data, and I don’t want to take that risk.

Cloud Storage

So, I have two Drobos, and one is basically a copy of the other, but there is a second very important reason for having that second Drobo, which is Cloud Storage. My second Drobo, a 2nd Generation USB connected four bay Drobo, is connected to my old MacBook Pro, on which I have Backblaze installed. Backblaze currently costs just $50 per year, or $95 for two years, and that is for unlimited storage. As I say, I have over 5 terabytes of data and every byte of that is uploaded to Backblaze, so if I lose any files, I can download them from Backblaze at any time, and that has happened in the past.

Drobos

I did a portrait shoot for a client and when I came to work on some prints for them, I found one of the images was corrupted. I believe, though I can’t prove, that this was caused by my last synchronization application during a few back-and-forth synchronizations, so all of my backups were corrupted. Of course, the Backblaze copy was also corrupt, but Backblaze keeps up to four weeks of versions of files, and because I found this issue within a few weeks, I was able to roll back to an uncorrupted version of the image file.

Had I gone past the four weeks, I’d have lost the file and had to deal with the embarrassing task of telling my client that I lost one of their precious portrait images, so it’s important to ensure that things don’t get corrupted. That is why I switched synchronization software and, touch wood, nothing has been corrupted since.
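
One general way to catch this kind of silent corruption early, which is not part of the workflow described here, is to record a checksum for each file while it is known to be good and compare the backups against that record later. The Python sketch below uses SHA-256 from the standard library; the function names file_digest, build_manifest and find_mismatches are made up for this example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large files don't fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {p.relative_to(root).as_posix(): file_digest(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def find_mismatches(manifest: Dict[str, str], backup_root: Path) -> List[str]:
    """Files that are missing from the backup or whose contents have changed."""
    bad = []
    for rel, expected in manifest.items():
        copy = backup_root / rel
        if not copy.is_file() or file_digest(copy) != expected:
            bad.append(rel)
    return bad
```

The manifest only helps if it is built while the files are still good and is kept somewhere the sync job does not overwrite. Files that are edited on purpose will also show up as mismatches, so a check like this suits finished folders better than a month that is still being worked on.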

So, to recap, my main workflow at home is to shoot for a month, keeping everything on my MacBook Pro, and then after each shoot everything gets backed up first to a Drobo 5D connected via Thunderbolt, and then a second backup is done over the network to a second Drobo connected via USB to my old MacBook Pro. As long as my Backblaze backup is up to date, I can usually back up around 20GB of data per day, so unless I have done a really big shoot, I’m usually backed up in the cloud too within about 24 hours.
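
The arithmetic behind that 24-hour figure is simple enough to sketch in Python. The 20GB per day rate is the one quoted above; the function name days_to_upload is invented for this example, and real upload speeds will of course vary.

```python
import math


def days_to_upload(new_data_gb: float, gb_per_day: float = 20.0) -> int:
    """Whole days needed to push new_data_gb to the cloud at gb_per_day."""
    if gb_per_day <= 0:
        raise ValueError("gb_per_day must be positive")
    return math.ceil(new_data_gb / gb_per_day)
```

At that rate a 16GB shoot is in the cloud within a day, while a 100GB shoot needs five days, which is the window during which only the local copies exist.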

Once I’ve finished processing each month’s images, usually within the first week or so of the following month, I run one last synchronization from my local MacBook Pro drive to the Drobos, then I delete the images from the local drive. At that point, Lightroom sees that the local images are missing, and I point it to the new month directory on the Drobo 5D, and continue to access my images as normal.

I also catalog the images on my old Drobo over the network, so if I need anything while I’m not at my desk with the Drobo 5D plugged in, I can still access it from anywhere in the house. This is important because, if you recall, my office and studio are on the 3rd floor of our apartment, and our living space is on the second floor. So that I don’t spend too much time in the studio, I work from our living room or dining table for a while after breakfast, and then in the evenings, and it’s nice to be able to get to stuff over the network if I need to.

In the Field

That’s my basic home/office workflow, but now let’s look at what I do when I’m traveling. Right now I use four portable hard drives in the following way. I have two 2TB Western Digital My Passport Studio drives that are my main backups in the field. These are the two drives stacked together in this image (below).

Every day when I get to the hotel, I transfer all of my images to my local hard drive first. Then, I synchronize that to my first 2TB hard disk. I usually try to do at least this much before dinner, and I put the 2TB drive in my pocket before I leave the room. If I don’t have time for that, I still put the drive in my pocket because it contains all the previous days’ backups, but I also put the compact flash cards from that day in my pocket too, rather than leaving them in the hotel room.

Portable Backups

Once I get back to the room after dinner, if I have shot any video on my GoPros that day, I back it up to a third, 1TB hard drive that I connect to the computer via Thunderbolt. This is the white Buffalo drive that you can see to the left in this photo. This isn’t much faster than the Firewire drives though, because it’s a slow 2.5in hard drive. Thunderbolt is kind of wasted on standard 2.5in hard drives, which I guessed would be the case, but I bought this to try it anyway. I won’t buy any more unless they boast very fast hard drive speeds to keep up with Thunderbolt. In fact, I think if I buy anything else for portable backups, it will probably be a Drobo Mini, because they have the same SSD acceleration that the Drobo 5D uses, and that screams along.

Once I have my video backed up to my 1TB hard drive, I run a synchronization between that drive and my first 2TB hard drive, and then, I synchronize my first 2TB hard drive with the second 2TB hard drive. That gives me my two backups in the field, so if I need to, I can delete the images from the local hard drive, although I try to avoid this if at all possible. As I mentioned earlier, I have a 750GB internal solid state drive so I can usually shoot for around three to four weeks before I have to start deleting stuff. This is also why I backup my video straight to an external drive, as I’d fill up my local drive too quickly otherwise.

Finally, usually before I go to sleep, I plug my fourth portable hard drive into my computer. This is the drive that you can see at the back in this photo (above), and it holds my portable Time Machine backup. This means that until I delete anything from the local hard drive, I actually have four copies of everything while traveling. This is enough to keep me happy. :)

Note that for the last few versions of the Mac OS, you can have multiple Time Machine backup disks. In the photo (above) you can also see a Belkin Thunderbolt hub, into which I basically plug everything. My Drobo 5D, which chains to my external monitor via Thunderbolt, all of my other USB3.0 and USB2.0 devices, my Firewire card reader, Wacom tablet and speakers etc. all plug into this, so when I sit at my desk, I only have to plug in the power to the MacBook Pro and one Thunderbolt cable, and everything just connects. The reason I mention this is because I also have a USB3.0 external hard drive attached to this hub, which continuously updates a Time Machine backup of my computer when I’m at my desk. The portable Time Machine copy is only used when traveling.

The main thing to note about these hard disks is that I always carry my main 2TB backup disk with me everywhere when I’m traveling. It not only goes to dinner with me, but it stays in my photographer’s vest all day long. When possible, I also carry my 1TB hard drive that you can see at the back of the photo. This disk is very tough, even withstanding a bit of a dunk in water if necessary. As long as I haven’t deleted my local copy of my images, they are all in there, inside my Time Machine backup, so if I lost my computer, I could get all my information back, including mail and other personal data, up to the point of the last backup.

Note too that my 2TB drives are large enough for me to keep a backup of all of what I call my Final images. These are images that I have selected for my portfolios, or stuff that I feel is good enough to show people. If I have done a black and white conversion in Silver Efex Pro for example, I will have the original RAW file, and the converted TIFF or PSD file in my Finals folders too. These are organized by year, so I basically end each year with a new folder, with all of my best shots and original RAW files for that year. This means if I’m traveling and someone needs a few images from me, the chances are I can get them to them from on the road. I can also access all of my RAW files for my best work to give demonstrations of software etc.

I also keep all of my RAW files from every shoot that I do during any given year on this 2TB drive, because when I get home from a big trip, it will take a while for the backups to upload to Backblaze, especially if I have video to upload too. This means that I can carry my hard drive around with me for a while after I get home, and if anything should happen to my house while I’m out, I don’t lose all my recent work.

Summary

So, one last summary here, I have all of my images, and all of my documents, email, music and everything that I value, all on my Drobo 5D, which is my main storage. That is backed up to a second Drobo and that gets backed up to the cloud using Backblaze. This is three copies of all of my data, which is currently 5.12TB and counting. When I travel, I have at least two external backups of my work, as well as a Time Machine backup, in case I lose my computer.

Off Site Backup

Now that I have this much redundancy in my backups, including the cloud backup, I don’t do off-site backups as much as I used to. When I was still in my old day job, I would keep a backup of all my data on a few 3.5 inch hard disks that I would occasionally load into an external bay and sync from my main data, then take that copy back to the office and just leave it in a drawer. That was still in Tokyo though, so every year or so I would also copy my entire library to a series of old hard drives and send them to my brother in the UK, who would just store the hard disks somewhere for me.

This is less important to me now that all of my data is in two places at home and the cloud, but when I can, I still like to do this. It’s just one more backup that could save my ass if something really nasty happened here in Japan, at the same time as Backblaze turning pear-shaped, although I can never see that happening. Realistically though, if anything did happen to my local backups, I’d probably request my data to be sent to me on hard disks from Backblaze rather than my brother, as the copies he has are never going to include my latest work.

As I said, I might be a little bit paranoid about my backups, but if even a part of what I do gives you a hint on how you might improve your own backup strategy, that’s great. The most important thing to remember is that all hard drives fail at some point, so you should never trust your images to just one place. The minimum you should do is back up to an external hard drive, and if possible, make a backup of that to keep away from your home, or sign up for a Backblaze account or a similar service, and ensure that your precious images are also backed up in the cloud.


Show Notes

ChronoSync: http://www.econtechnologies.com/pages/cs/chrono_overview.html

Backblaze: http://www.backblaze.com/

Music by UniqueTracks


14 Comments
  • Mark Vandenwauver
    Posted at 11:45h, 24 September

    Great point about the caching SSD. No idea it would go down so quickly. Are your 2TB and 3TB drives 7200 rpm? For mobile backup, I use the LaCie Rugged 7200 rpm with a USB 3.0 connection. They get me around 110-120, which I find pleasantly fast for a portable lightweight solution.

    Cheers,

    Mark

    • Martin Bailey
      Posted at 12:07h, 24 September

      The two 3TB drives and two of the 2TB drives are 7200 rpm, but the third 2TB drive is a Western Digital Green drive, which I believe is 5400 rpm. I would get faster speeds if this one were faster too, but I’ll switch that out soon enough, when I start to replace these 2TB drives with 4TB ones for more space.

      I’m still using the 2TB FW800 drives for my mobile backups, as mentioned above. I like the fact that they are 2TB, and the speed is OK-ish, so I’ll keep them for a while, but I’m seriously considering the Drobo Mini as my main mobile backup. The only downside is that it needs power, whereas all of my current portable drives don’t. Plus, the Drobo Mini is just a bit too big to realistically drop into my photographer’s vest and carry with me all the time. I could just carry my 2nd backup around instead though.

      Cheers,
      Martin.

  • Mark Vandenwauver
    Posted at 08:42h, 24 September

    Martin, a couple of questions from a fellow paranoid photographer who also loves screaming disk access speed.

    1. Do you have any actual data for read and write to your Drobo 5d? I’m getting 330 MB/s on my raid solution and wanted to know if the 5d can approach this.

    2. Do you have one Lightroom catalog for all your shoots, every shoot? Do you keep your edits in the same directory as your raws? Sorry if you’ve discussed this before.

    Cheers,

    Mark

    • Martin Bailey
      Posted at 10:52h, 24 September

      Hi Mark,

      I’m not quite getting those speeds from the Drobo 5D. When it was new and empty, I was getting 275 MB/s write and 313 MB/s read speeds. This is with 64GB Crucial mSATA SSD Acceleration.

      Now, with over 5TB of data loaded, it has dropped slightly to 193 MB/s write and 294 MB/s read speeds, when connected directly to my MacBook Pro Retina. Connected through the Belkin Thunderbolt hub, I get 187 MB/s write and 270 MB/s read speeds.

      These speeds don’t really change when I daisy chain a 24″ external display from the back of the Drobo 5D or whatever else I connect to the Belkin hub. The difference is small enough that I prefer to just put everything through the Belkin hub.

      Yes, I have just one Lightroom catalog for everything. All shoots, and multiple copies of the same photos on different drives. And yes, I keep all of my initial edited images in the same directory as my RAW files. Once I have finished my editing, I copy all the final selection images and their original RAW file (if I made a copy for a black and white conversion etc.) to a new directory that I call my Finals. This directory contains folders for each year, so I’m currently at 2013. I might continue to add keywords or change titles etc. on my Finals copy, and I don’t go back to the original folder to update that, once I have my final selection images in my Finals folder.

      I hope that helps.

      Cheers,
      Martin.

      • Mark Vandenwauver
        Posted at 10:57h, 24 September

        Hi Martin. Thanks for your quick and extensive answer. I’m happy to see the 5d perform at this speed. That’s a tremendous improvement over the previous generation. I can only imagine what speed the mini with nothing but SSD drives can obtain then.

        Wrt Lightroom and edits, I’m doing exactly the same and I guess I was kinda looking for confirmation by one of the industry leaders. :)

        Cheers,

        Mark

        • Martin Bailey
          Posted at 11:05h, 24 September

          I don’t know about the industry leader part, but you’re welcome Mark. :) I too was pleasantly surprised by the speed of the Drobo 5D.

          One other thing to note. The Drobo Tech Support team told me that there is no reason to buy larger than a 64GB SSD for acceleration, but as I use the 64GB SSD, I see the life dropping already to 82%. With the price difference between 64GB and 128GB being so small, I kind of wish I’d gone for larger now, if only to give me more sectors and extend the expected lifespan of the SSD.

          Having said that, the 64GB are literally so cheap, that it might be better to just replace this when the life drops close to 0%. I guess we have to think of this SSD as a consumable, like printer ink or something.

          Cheers,
          Martin.

  • Aaron Thomas
    Posted at 01:56h, 20 September

    Martin, thanks for the great podcast! This episode couldn’t have come at a better time as I am trying to get my backup system figured out. I am mostly a wedding photographer in which case a lot of the time I do the couple’s engagement shoot one year and then their wedding is the following year. You mentioned how you keep your photos organized by year, month, and then day. If I wanted to keep the couple’s engagement photos together with the wedding photos and one is one year and the other is the next year, do you have any suggestions for an organizational system in this case? Or what would you do if you were in my shoes?

    • Martin Bailey
      Posted at 09:17h, 20 September

      Hi Aaron,

      I’m pleased this helped.

      That’s a good question. I would probably approach it in one of two ways, or maybe even both. Assuming the wedding is the more important of the two shoots, I would probably move the engagement shoot to a sub-folder under the wedding folder once the wedding is complete. Then they could be stored together.

      Or, you could manage the linkage with Lightroom Collections, if you use Lightroom. This way the files could stay where they are, and you could build a list of all your weddings and engagement shoots in Lightroom, but this would mean that you’d have to also be especially careful about backing up your Lightroom catalog, because Collections only live in the catalog.

      You might also want to check what other wedding photographers do though. It almost feels like one of those exceptions that might take you out of my year/month/day structure. Maybe you could use year/month/day for your general shooting, but have all of your wedding shoot files in a different location, and maybe even just index them by couple name or something.

      If I were to do that though, I’d probably still leave the original raw files in my year/month/day structure, and copy all the final select images to a second, special folder, just for that couple. This is like what I do with my Originals folder and my Finals folder. I end up with two copies of the most important images, even inside of a single backup.

      Not really a definitive answer, but this is probably what I’d consider.

      Hope it helps!

      Martin.

  • Nick Nieto (@NickNieto)
    Posted at 01:42h, 14 September

    This is really useful – thanks for this post – it’s very well timed for me. I actually just lost about a terabyte of data on one of my Media Drives. Luckily I have a copy in the cloud at Backblaze that I am able to recover. I’m going to use your workflow to help refine my current system.

    Thanks for all the information!

    • Martin Bailey
      Posted at 08:44h, 14 September

      Thank heavens for that Backblaze account Nick! Congratulations on getting that set up BEFORE your disaster.

      I’m pleased this episode will help you think through your extended strategy.

  • Dan Dilloway
    Posted at 21:33h, 13 September

    I’ve just seen this post via twitter, and think it’s really solid advice. I work in a data recovery lab, so see lots of failed drives and this is how I would approach a backup system. It sounds like you back up Mac->Drobo1->Drobo2 but maybe Mac->Drobo1 & Mac->Drobo2 would be safer. If issues happen during the first backup, they wouldn’t be transferred to the second Drobo. Of course this is just nitpicking and I think you’re safer than anyone I’ve ever heard of!

    • Martin Bailey
      Posted at 08:43h, 14 September

      Thanks Dan!

      For some of my syncs that’s actually what I do, but then once my local copy of my images is deleted at the start of the next month, Drobo1 becomes the master copy, so most of my sync jobs are set up to sync from Drobo1 to Drobo2. I can certainly see the value in your suggestion though. If I kept stuff on my local machine longer, I’d probably do that more.

      Thanks for stopping by and for the advice!

  • Mylan
    Posted at 19:57h, 13 September

    Martin, when I play the audio from this page it appears to be the audio from a previous blog post (your review of various inkjet papers). Nonetheless, great post and certainly gives me some ideas to improve my own backup system.

    • Martin Bailey
      Posted at 20:24h, 13 September

      Doh! Forgot to update the link. Thanks for letting me know Mylan!
