Cloud Data Storage: It'll Still Cost You, So Give It Some Thought


With the ever-growing amounts of data flowing into organizations, I've always thought the information lifecycle management (ILM) approach makes intuitive sense. (I can't say the same for every technology trend, complete with accompanying acronym, that comes down the pike.) More of a strategy than a technology, ILM involves moving data to different tiers of storage based on its importance to an organization and the organization's need to access it.


As I've written before, ILM would impose some much-needed discipline around storage management.


But why go down the ILM path if you can just offload your data-storage problems to someone else? I fear that may be the approach many companies are taking in this emerging era of cloud computing.


John Engates, CTO of cloud computing provider Rackspace, recently told Forbes his company has more than 100 employees devoted to managing its 15 petabytes of data. (A petabyte is hard for me to picture. It's 1,000 terabytes, or 1 million gigabytes. To put it into perspective, Forbes points out the entire Library of Congress takes up about 10 terabytes of digital space.)
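
For anyone else who struggles to picture the scale, the figures above can be worked through in a few lines of Python. This is only a back-of-the-envelope sketch: it assumes decimal units (1 PB = 1,000 TB), takes the 15-petabyte and 10-terabyte numbers as quoted, and the variable names are my own.

```python
# Unit conversions, using decimal units as in the text above.
TB_PER_PB = 1_000
GB_PER_TB = 1_000

# Figures as quoted from Forbes.
rackspace_pb = 15             # data Rackspace manages, in petabytes
library_of_congress_tb = 10   # approximate digital size, in terabytes

rackspace_tb = rackspace_pb * TB_PER_PB
rackspace_gb = rackspace_tb * GB_PER_TB

# How many Library-of-Congress-sized collections would fit in that footprint.
libraries = rackspace_tb / library_of_congress_tb

print(f"{rackspace_tb:,} TB, or {rackspace_gb:,} GB")
print(f"Roughly {libraries:,.0f} times the Library of Congress figure")
```

On those numbers, Rackspace is looking after something on the order of 1,500 Libraries of Congress.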


The company's storage needs are growing by one petabyte a month. So, not surprisingly, Rackspace is adopting a tiered approach to data storage, putting some data on tape, some on disk, and integrating all of it with a management software layer developed by CommVault to determine what's being backed up and how much it's going to cost customers.


The last four words of that sentence are key. Said Engates:

That's a big difference in the cloud service provider world vs. doing it internally. There is metering and there are charge-backs.

You're paying for the storage. It's almost certainly less expensive than storing data internally, but it will still cost you. So wouldn't it make sense for companies to make some decisions about the relative importance of their data before sending it to the cloud? If your go-to method of dealing with a problem is sweeping it under the rug, you're eventually going to end up with one big damned mess.


George Crump's InformationWeek article, in which he discusses four tiers of data storage for the new decade, is well worth a read. The four tiers: SSD, which offers the quickest access to data but can be costly, so should be reserved for active data; SAS-based mechanical drives for storing near-active data of a somewhat less critical nature; archived storage, which should be more scalable and less expensive than SSD or SAS-based drives; and finally cloud storage, which Crump says can play a role as a possible permanent resting place for data.
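
To show what a placement rule over those four tiers might look like, here is a minimal Python sketch. It is not Crump's method: the idea of keying on days since last access, the cut-off values, and the names `DataSet` and `pick_tier` are all hypothetical, chosen only for illustration. A real policy would presumably weigh business importance as well as access patterns.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical cut-offs (days since last access); not taken from Crump's article.
TIER_THRESHOLDS = [
    (30, "ssd"),       # active data
    (180, "sas"),      # near-active data
    (730, "archive"),  # rarely touched data
]
FALLBACK_TIER = "cloud"  # possible permanent resting place


@dataclass
class DataSet:
    name: str
    last_accessed: date


def pick_tier(item: DataSet, today: date) -> str:
    """Return the first tier whose age limit covers the item, else the cloud tier."""
    age_days = (today - item.last_accessed).days
    for max_age, tier in TIER_THRESHOLDS:
        if age_days <= max_age:
            return tier
    return FALLBACK_TIER


if __name__ == "__main__":
    today = date(2010, 1, 10)
    for item in [
        DataSet("order-db", date(2010, 1, 1)),
        DataSet("2008-audit-logs", date(2008, 1, 1)),
    ]:
        print(item.name, "->", pick_tier(item, today))
```

Even a toy rule like this forces the question the tiers raise: someone has to decide what counts as active, near-active and archival before any data moves.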


IT Business Edge's Arthur Cole and Mike Vizard both looked at some of the cloud storage options in October, with Vizard noting that cloud providers are beginning to couple services to their cloud offerings that help customers dynamically manage data both locally and in the cloud. Earlier this year the Storage Networking Industry Association announced the formation of the Cloud Storage Initiative to foster growth and success for both the commercial and consumer cloud storage markets.


Again, to adopt this kind of tiered approach to data storage, companies will have to apply more discipline to storage management than they have in the past. Deciding which set of data should go where isn't a simple process. Crump asks whether it should be a manual process or an automated one and promises to follow up this piece with discussions of the available options. (I am betting the answer will be processes that combine automation with some manual intervention, though I am no storage expert.)


As promised, Crump wrote about automated tiering options yesterday. In this article, he says automated tiering can simplify ILM by moving the data-movement decision away from the server or client and closer to the storage. Automated tiering can be implemented through file virtualization, storage virtualization, smart storage controllers or a cache-like implementation, he adds. He writes:

The promise of automated tiering is that it will remove one of the challenges and time consuming tasks from storage managers; data placement. For many organizations it may be the only practical way to fully leverage all the new tiers of storage.

If you need inspiration on switching to a tiered approach to data management, there's a tiered storage savings calculator in IT Business Edge's Knowledge Network.