Google as last-resort backup
25th of May, 2007. Tags: WWW, WordPress, Google
Not long ago, at work, we came painfully close to losing 2.5TB of our users' data because two disks in the RAID had failed. With two faulty disks in the array, you've basically lost everything, and with that amount of data the problem is not whether you have a recent enough backup (we did), but restoring it: 2.5TB of data would take days to restore. Fortunately, some of my co-workers were able to keep one of the disks alive long enough to restore redundancy. The first thing I did when I came home that day was to make an additional backup of all my valuable data and store it in a safe place.
A few days later, a blog post was deleted by accident from one of the blogs I administer. "No worries, this is why we do backups", I said, in good faith that restoring the post from backup would take me five minutes or less. The blog was running the bundled WordPress Database Backup plugin and WP-cron to ensure that backups were taken on a regular basis and would require no manual labour on my part. The daily backups were sent to two different e-mail accounts hosted on two different servers, in case of a disk crash. Needless to say, I didn't worry much when I was told that the blog post had been deleted.
As it turned out, every single backup taken since the last time I updated WordPress was corrupted. Three or four months' worth of backups were of absolutely no use whatsoever. What's worse, I had been living under the illusion that everything was working as it should, and that I had a daily snapshot in case something should happen, like a blog post being deleted by accident...
What saved me was Google and its cache function. With some cut-and-paste magic everything was back to normal, and the post even got its original timestamp back. Comments had been turned off, but had there been any, I could have restored them manually as well.
This approach will work in most cases, given that a search engine has cached your site recently. If you've just posted the entry, chances are that it hasn't been indexed yet and thus no cache has been created. This is likely to be the case for recently posted comments.
Although losing a blog post isn't the worst thing that could happen, it provides a good example of what might go wrong with a backup. With information that's published on the web, there's usually a cached copy of it somewhere, so you're not entirely lost at sea. That's not the case with backups of local files. Even if the backup program does its job without any warnings, be it Backup Exec or a homemade script, that doesn't necessarily mean the backup can actually be restored.
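To make that concrete, here is a minimal sketch of the kind of check that would have caught my corrupted backups months earlier. It assumes the backup is a gzip-compressed SQL dump that contains "CREATE TABLE" statements with backtick-quoted table names. The function name verify_sql_backup, the file name and the table names are made up for illustration, so adjust them to whatever your own backup actually produces.

```python
import gzip
import zlib


def verify_sql_backup(path, required_tables):
    """Return a list of problems found in a gzip-compressed SQL dump.

    An empty list means the file decompressed cleanly and mentions
    every table in required_tables. Assumption: the dump quotes table
    names with backticks, as in CREATE TABLE `wp_posts`.
    """
    try:
        # Reading to the very end forces gzip to verify its checksum trailer,
        # so a truncated or damaged attachment fails here.
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as dump:
            text = dump.read()
    except (OSError, EOFError, zlib.error) as exc:
        return [f"cannot decompress {path}: {exc}"]

    problems = []
    if not text.strip():
        problems.append("dump is empty")
    for table in required_tables:
        if f"CREATE TABLE `{table}`" not in text:
            problems.append(f"missing table definition: {table}")
    return problems


if __name__ == "__main__":
    # Hypothetical file and table names, for illustration only.
    issues = verify_sql_backup("blog-backup.sql.gz", ["wp_posts", "wp_comments"])
    print(issues or "backup looks restorable")
```

A check like this only proves that the file is readable and roughly complete. The real test is still to restore it into a scratch database every now and then.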
In school we performed fire drills quarterly, which always seemed like a hassle. Maybe it's not such a bad idea after all, and it's certainly something I'll start doing with my backups. As the saying goes, "he who forgets history is destined to repeat it".
