Photography for Non-Photographers – Storage
This is the third in a planned series of thoughts on a workflow intended for the non-photographer who takes pictures anyway (like me). There’s lots of advice on the web and in print from a more professional perspective, where the goals might include getting a set of contact prints to a client on the day of the shoot, or having the right color space management all the way through the print process. I think the goals of a non-photographer often differ quite substantially from those of a pro, and that the workflow consequently should too. I’ve collected my thoughts on this as follows:
- Shooting
- Processing
- Storage (this page)
- Backup and Sharing
I also decided that “amateur photographer” wasn’t nearly as apt a term as “non-photographer”; after all, an “amateur” athlete might go to the Olympics and win a gold medal even if it’s not their full time paid job, and I’m most certainly not the photography equivalent of that! Still, I think these topics are important for anyone who takes a decent number of pictures.
What’s the Big Deal with Storage?
In the first page in this series, I mainly tried to convince you to shoot RAW, so that in the second article in the series, I could try and convince you to spend more time processing those RAW files into slightly nicer pictures (or more precisely, using the latitude in RAW files to fix our non-photographer errors). So you’re probably guessing that in this one, I’m going to try and talk you into some ridiculously complex storage scheme.
Well, you’re part right – I am going to suggest that you do more than just copy files to your hard drive. But this one should be pretty painless (and short) compared to past suggestions; I’ll share the pretty simple scheme that I use for storing and organizing things, and you can decide if it works for you.
But why even have a page on this? Modern hard drives store so much data – $100 will store an entire lifetime of photos – that this should be pretty simple, no? Almost – but a couple of things complicate the picture:
- Hard drives fail. This isn’t the page about backup; I’ll talk separately about why I like SmugMug (the service that I use) so much. But you definitely want a storage approach that doesn’t cost you days upon days to restore your data.
- Organization. If you keep 1,000 photos a year, then when you’re looking back in your old age at all your memories, it wouldn’t be surprising for you to have 30,000 photos or more. And by that time, you’ll have no recollection of which photos are stored where, no matter how good visual search might become by then. So it’s important to organize now.
I’ll briefly mention the options and what I chose for physical storage of my photos, and then talk a little about how I organize things.
Planning for Disk Failure
In the last year alone, I’ve suffered the complete failure of two disks in equipment that I use – one at home, and one at work. Neither of these was due to knocking over a machine or any kind of accident – the drives just died. Fortunately, having learnt my lesson once the hard way in the past, both of these recent incidents were trivial to recover from, because I typically have many copies of any non-replaceable data. In any case, hard drives fail.
Unfortunately, drive failure isn’t the only risk: you may lose everything in your home to theft, fire, or other disasters. So no matter how you store things, off-site backup is important and that’s the next topic I’ll get to. Still, if you have 30,000 photos, it’s a real pain to recover them all from most off-site backups (especially if you’re bandwidth capped like me and almost every other Canadian!).
The obvious solution is to keep at least two copies of photos (and other non-replaceable data). But there’s lots of ways to do this, many of which I’ve actually tried over the years. Here’s what I’ve done, and what I do now.
Two disks in my main PC + nightly backup
This is the simplest solution, and one I’d still recommend if you have only one copy of your photos. Just get a second hard drive that’s at least big enough to store everything of value, keep it permanently attached to your PC, and keep a copy of all your important files on that second hard drive. Easy! It doesn’t matter if the second hard drive is internal or external, and it doesn’t have to be identical to your main drive – it just has to be big enough. Ironically, because manufacturing defects in hard drives come in clumps, it’s actually better not to have two identical drives!
Setting up the nightly backup is easy. There’s lots of free backup software that you can find with a quick Google search; I’ve set two separate programs up for my mom on different occasions, both of which worked just fine.
Personally, I use SyncToy, a free tool from Microsoft (which is always a good assurance against picking a tool that might have spyware or other nasty stuff embedded). It’s very basic, and essentially just keeps two folders in sync. It can also be scheduled to run at a certain time every day, and since I leave my main machine on all the time, this works for me. SyncToy is available as a download from Microsoft.
The only real downside to this approach? A trojan/virus/issue that deletes your files could affect the 2nd hard drive, and if your main PC is stolen the 2nd hard drive would likely go along with it. Plus this is a little less optimal for sharing.
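To make the idea of a nightly copy concrete, here is a minimal sketch in Python of a one-way mirror between two folders. This is not how SyncToy works internally, and the function name is my own invention for illustration; it simply copies anything that is new or changed onto the second drive and never deletes anything there, so a file accidentally removed from the main drive survives in the backup.

```python
import filecmp
import shutil
from pathlib import Path


def mirror_new_and_changed(source: Path, backup: Path) -> list[Path]:
    """Copy files that are missing or different in `backup`; never delete.

    Returns the relative paths of the files that were copied.
    (Hypothetical helper for illustration, not part of any backup tool.)
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue  # folders are created on demand below
        rel = src.relative_to(source)
        dst = backup / rel
        # filecmp.cmp first compares size and modification time, and only
        # reads the file contents when those differ.
        if dst.exists() and filecmp.cmp(src, dst):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also tries to preserve timestamps
        copied.append(rel)
    return copied
```

Calling `mirror_new_and_changed(Path("D:/Photos"), Path("E:/Backup/Photos"))` from a scheduled task would be one way to run it every night; both paths are placeholders. Because the timestamps are carried over with each copy, repeat runs can skip unchanged photos quickly instead of re-reading every file.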
Backup to a Network Drive
If you have more than one PC in your house, or you have a consumer-oriented Network Attached Storage (NAS) device like the Drobo (http://drobo.com/), you can keep the second copy there. This is an improvement over the above, because there’s a little more isolation with the second copy. What I do today is essentially a variation on this. You can use the same backup software as you would have above.
Windows Home Server
Windows Home Server (WHS) is a product from Microsoft that I’ve used since it first came out. It has a number of features and capabilities, but to me, it distinguishes itself from other options in two ways:
- It does an awesome job of backing up PCs attached to your home network. It’s awesome because it’s a single image store (which means files duplicated across PCs, like the OS itself and most installed programs, get stored only once, saving space), all opened files are fully backed up, you can browse backups and restore any file from any backup (very handy for restoring the one file you killed), and it can completely restore a PC over the network by booting from a provided CD.
- It really makes the most of your disks. You can (or rather, could – more on that below) give it any combination of internal and external disk drives, of any size, and it would turn them into one big pool of storage. It could serve up any media you placed in its storage, and it had automatic replication (so if you had 3 disks, it would make sure your data was on at least two of them).
On top of the above, it offered some additional features around things like remote access, but the two items above were the major selling points.
I was a big WHS cheerleader, and recommended it wholeheartedly – whether you bought it as a pre-built system, or put a server together yourself. It wasn’t perfect – recovering from the loss of the primary hard drive was still a pain, and backups didn’t appear to be stored redundantly, but all in all it was a great product.
Unfortunately, with the latest/upcoming release, Microsoft effectively dropped feature #2 (despite the most nearly unanimous criticism I’ve ever seen of any product decision). I wholeheartedly recommend the original, but have serious reservations about the new version.
What Choice Did I Make?
Great as WHS is, I actually went with a bit of a different approach from what’s recommended. Here’s the scheme I use:
- The master copy of all photos (and other media) is stored on my main PC. This was necessary, because it’s a lot slower accessing all the necessary content over the network when processing images, or otherwise managing the collection.
- I use the “backup to a network drive” step above to back up all media files on a nightly basis, to the storage pool managed by WHS. This provides triple protection; there’s one copy on my PC, and as mentioned above, WHS ensures that there are multiple copies of anything stored on the server. So even if two disks failed, I’d still have a copy of everything.
- WHS backs up my main PC, except for media files (photos, videos, music, etc). The backup functionality is still phenomenal, but I wanted the extra redundancy on media files. The other advantage of backing up media this way is that other PCs (like my wife’s) can access all the media – and through read-only shares that prevent any accidental changes.
For in-home storage, this works great for me. The cost of putting a home server together was only around $250 for me, for a license of WHS plus some extra hard drive space – I re-purposed an old PC for this.
Organization
Keeping files organized is a simpler thing for me, though I’ve seen many friends/family who have their pictures spread out all over the place! I only have a few suggestions in this area, reflecting how I organize my content. This is one domain where if you have something that works for you, stick with it!
Use the Directory Structure on Disk
Some photo management systems have an interesting way of organizing your catalog of photos, in which you import photos into a magical database that’s maintained by the tool you’re using, and export photos from that magical database as you need to. I don’t like those schemes, as I find it handy to always have access to files. Plus, if your installed software all gets wiped out or you’re trying to recover things later, you really want your individual files – not some big database that’s in a proprietary format. So I strongly recommend tools like Lightroom or Picasa that leave your files alone on disk (unless you want them not to). They still have their own databases/catalogs, and that’s fine.
I use a very simple means of organizing things on disk:
- I have one folder for my original RAW images, and another folder for “final” JPEGs (after all processing, etc). Within each of these folders, there’s an identical set of folders as described below, with \RAW\A\B\C.JPG always having a corresponding \JPG\A\B\C.JPG (though my actual naming is a little different).
- The next level is organized by year, based on when the pictures were taken – 2010, 2011, etc.
- The level below that is grouped into folders based on event. So a birthday party, or a vacation, or a trip to some park/playground will each result in a discrete folder. The folders are named YYYY-MM-EventName, with YYYY being a four-digit year and MM being the month. If I take just a handful of shots, or there’s no real theme, I have a YYYY-MM-Misc folder for miscellaneous pictures taken in a given month.
This is straightforward, but works for me, and is a good bit better than dumping lots of mixed photos together and hoping to find things afterwards!
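As a rough illustration of the layout described above, the following Python sketch builds the folder for an event and maps a RAW file to its JPG counterpart. The function names, the "Photos" root, the "IMG_0001.CR2" file name, and the ".jpg" extension are all assumptions for the example (my actual naming differs a little, as noted above).

```python
from datetime import date
from pathlib import Path
from typing import Optional


def event_folder(root: Path, kind: str, taken: date,
                 event: Optional[str] = None) -> Path:
    """Return root/kind/YYYY/YYYY-MM-EventName for a set of pictures.

    `kind` is the top-level folder ("RAW" or "JPG"). Shots with no real
    theme fall into that month's "Misc" folder.
    """
    name = event or "Misc"
    return root / kind / f"{taken:%Y}" / f"{taken:%Y-%m}-{name}"


def jpg_counterpart(raw_file: Path, raw_root: Path, jpg_root: Path) -> Path:
    """Map a file in the RAW tree to the same relative spot in the JPG tree.

    Assumes finished files use a ".jpg" extension (an assumption for
    this sketch).
    """
    return (jpg_root / raw_file.relative_to(raw_root)).with_suffix(".jpg")
```

For example, `event_folder(Path("Photos"), "RAW", date(2011, 3, 5), "BirthdayParty")` gives `Photos/RAW/2011/2011-03-BirthdayParty`, and leaving out the event name gives the month's `Misc` folder instead. Since the month is zero-padded, an alphabetical sort of the folders is also a chronological one.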
Archive both the RAW and the JPG?
To cut down on the amount of archived data, I used to process photos, export the JPGs, and then delete the RAW files. My reasoning was that with a catalog of more than 10,000 photos, I was never going to go back and see if I could do just that little bit better in processing the picture.
Of course, I felt pretty stupid when Adobe Photoshop Lightroom 3 came out, with significantly improved noise reduction algorithms. You could basically just re-export all your existing photos, and they’d all look better (if noise reduction was used) due to the improved algorithms. At that point I was kicking myself for deleting the RAW files and making that impossible!
Learning from that, and also realizing that you can learn as much about processing as you do about taking pictures (allowing you to go back and improve on your favorite pictures later), I switched to keeping both the original RAW and the final JPG. It takes more storage space, but storage is so cheap now that it’s a non-issue.
However, I don’t create off-site backups of the RAW files. If my home burns down and both my PC and home server are lost, then I’ll sadly lose the RAW files. Protecting the RAW files against catastrophic events just wasn’t worth the extra bandwidth and storage. Maybe if I get unmetered Internet some day and cheap cloud storage!
Tag and Caption Photos Where Possible
It’s kind of a pain to tag photos with descriptive tags like “food” or “playground” or whatever, but I’ve gotten myself in the habit of doing this recently. I also tag people in the photo if they are the primary subject of the photograph.
When you combine this with a catalog management tool like Picasa or Lightroom, it allows you to very rapidly find the pictures you’re looking for after the fact.
Picasa offers face detection, but I found that it was sufficiently inaccurate (especially on kids growing up over the course of a few years) that I never considered using it seriously. Perhaps one day it will get to a level where manually tagging people isn’t necessary!
I use captions sparingly, but if I think there’s something about the context of the photo that I’ll forget 10 years from now, then I use captions to record that context.
That’s It! Organized, and Protected!
The above might seem like a lot of work, but it really boils down to a few simple things – make sure you’re automatically backing up all your photos to a second hard drive or to another computer in your home, come up with a sane folder structure on disk, and consider captioning and tagging your photos. You’ll thank yourself down the road when you experience a drive failure or you’re searching for that one picture you don’t remember the location of.
I’m not totally done on the “protected” thing, and when I next get some time I’ll talk about backing things up to the cloud for even more safety!