Have you ever sat back and watched a pretty little cloud slowly working its way across the sky, only to see it break apart and evaporate, leaving almost no trace? Something that seems solid and substantial becomes just a memory. That’s part of the danger we face as we surrender our computing and data management to “the cloud”. If we’re not careful, we could be watching one day as our data just disappears into thin air!
That we refer to “the cloud” as this incredibly new thing is somewhat amusing. It just means that our data, and the tools we use to work with it, exist outside of our own personal computers. They live on servers that are beyond our physical control and are shared with others in a way that means we don’t have to deal with individual hardware. We’ve all been doing certain things “in the cloud” for a long time. My first email account was accessed via Telnet. It lived on a university computer that I couldn’t have managed if my life depended on it. That was way back in 1993.
Now suddenly, the cloud is the future of computing. From webmail to web-based document editing to web-based accounting, we’re doing more and more without actual installed software. Google’s even trying to convince us to forget about software altogether and live entirely through webapps via Chrome OS. As long as we have access to a decent Internet connection, we now have the freedom to roam far and wide but still have instant access to everything we need.
There’s no doubt that accessibility, scalability and reduction in our own investment of time into maintenance and upgrade activities are a huge boon to productivity, but this carefree lifestyle comes with its own unique costs, beyond just the subscription fees for the online tools we use. Ultimately, somebody else holds the keys to our data. They make decisions that affect when and how we can use it and we generally have little choice but to go along with whatever changes they make. When it comes to most cloud services, we have very little data portability and limited-to-no ability to easily back up our own data. This is the dark lining to the silver clouds.
Why should we care about being able to back up our own cloud data offline? After all, one of the reasons we use these services is so that they can worry about backing up our data and we won’t have to. Not so fast! There’s more than one reason to back up data, and more than one situation in which to use those backups. In a traditional software world, we back up copies of our data to:
- Protect against data loss due to hardware failure or file corruption, or
- Save a snapshot of data that we can revert to or reference in the future if we make changes that either don’t work the way we intended or turn out not to be what we actually wanted.
In the cloud software world, the backup processes of the services themselves usually protect us against hardware faults or corruption (although not always), but they do very little to protect us from accidental damage we may do just by using the software. Most provide limited or no ability to “undo” an action, and only a very few provide automatic versioning and rollback (Google Docs being one of the most capable I’ve seen so far). How many times have you felt a little pang of concern or anxiety before hitting a submit button on one of these services, out of fear that you might make an irreparable mistake?
There’s also the risk that the service could be discontinued or the company providing it could go under. Finally, with free services there’s the risk that the provider could alter its data retention policies without clear warning. Yahoo lost me forever as a customer when they deleted the contents of my Yahoo! Mail account because I hadn’t logged in for 6 months. Somehow the 100MB of space they cleared was worth more to them than my potential as a customer!
What can we do to mitigate this danger and make cloud computing the computing heaven we dream of? I don’t think it would take much. I think all we really need is for cloud services to support a standard offline-backup interface.
The way I envision it, each service would package up the user data as XML files and zipped folders of media files, and feed out a master XML file referencing the files to be downloaded. It’s a little like a Sitemap Index file for Google Sitemaps. The master file would be a standard format, but the individual XML files could vary from provider to provider (and would have to since everyone stores different data). A piece of desktop backup software, or even a cloud-backup service, could then connect to each of our cloud accounts and download backup copies of our data according to whatever schedule we want.
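To make the idea a little more concrete, here is a rough sketch of how a backup client might read such a master file. No such standard exists, so everything here is my own assumption for illustration: the `backupindex`, `file`, `loc` and `lastmod` names (borrowed loosely from the Sitemap style), the example.com URLs, and the helper functions. It only parses the index and works out which files have changed since the last backup; the actual downloading and authentication are left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical master index a provider might publish for one account.
# Element and attribute names are invented for this sketch.
SAMPLE_INDEX = """
<backupindex service="example-webmail">
  <file type="data">
    <loc>https://backup.example.com/u/123/messages.xml</loc>
    <lastmod>2010-01-14T22:10:00Z</lastmod>
  </file>
  <file type="media">
    <loc>https://backup.example.com/u/123/attachments.zip</loc>
    <lastmod>2010-01-10T08:00:00Z</lastmod>
  </file>
</backupindex>
"""

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_backup_index(xml_text):
    """Return one dict per <file> entry listed in the master index."""
    root = ET.fromstring(xml_text.strip())
    entries = []
    for node in root.findall("file"):
        entries.append({
            "type": node.get("type"),
            "loc": node.findtext("loc").strip(),
            "lastmod": datetime.strptime(
                node.findtext("lastmod").strip(), TIMESTAMP_FORMAT
            ),
        })
    return entries


def files_changed_since(entries, last_backup):
    """Keep only the entries modified after the previous backup run."""
    return [entry for entry in entries if entry["lastmod"] > last_backup]


if __name__ == "__main__":
    index = parse_backup_index(SAMPLE_INDEX)
    for entry in files_changed_since(index, datetime(2010, 1, 12)):
        print("would download:", entry["loc"])
```

Because only the master file needs a fixed shape, the same client could walk the index of any provider and simply store whatever XML or zip files it points to, without understanding their contents.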
The freedom of do-anything-anywhere cloud services with the security and peace-of-mind that comes from holding our own backups. That’s my vision of cloud nine!