The cloud is already making gains in enterprise circles as an effective means to boost storage capacity, so it shouldn't be any surprise that it is heading in the next logical direction: a full-blown data warehousing solution.
A few weeks ago, I blogged about comments by several leading thinkers that cloud-based warehousing is an idea whose time has not yet come. That may still be the case, but it's certainly not for lack of trying. And while serious privacy and policy/governance issues still exist when you start talking about analyzing corporate and consumer data in the cloud, there's no reason to think these can't be worked out once functioning warehouse architectures are up and running.
In fact, it's important to note that one of the experts downplaying the idea of warehousing in the cloud was Stephen Brobst, CTO of Teradata, who argued that warehousing applications are probably best left to private clouds. Well, surprise, Teradata this week announced a new strategy aimed at bringing its warehouse capabilities to both public and private clouds. The push features a new set of products called the Agile Analytics Cloud that can be used to establish data marts on an internal cloud, but there is also the Teradata Express software for Amazon's EC2 service and the VMware Player.
Now, to be fair to Brobst, Teradata is still wading into the public cloud very carefully. The Express system runs on only one node and supports only 1 TB of data, which isn't very much as far as the cloud goes. But there is something to be said for not expecting too much performance over public networks until I/O and service-level capabilities ramp up a bit more.
Other vendors, meanwhile, aren't so hesitant. Greenplum Software, which bills itself as the "pioneer" of enterprise data cloud solutions, recently beefed up its Greenplum Database with a new column-oriented feature designed to leverage the system's parallel processing capabilities by allowing for mixed row and column analytics. The company is also offering a free version of the software, although only as a single-node edition.
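For readers unfamiliar with the row-versus-column distinction, here is a minimal Python sketch of the general idea. It is not Greenplum's implementation or API; the sales records, field names and the `to_columns` helper are all made up for illustration. Row storage keeps each record together, which suits fetching a whole record, while column storage keeps each field together, which suits aggregating one field across many records.

```python
# Hypothetical sales records; the field names are illustrative only.
rows = [
    {"store_id": 1, "sku": "A100", "amount": 4.50},
    {"store_id": 2, "sku": "B200", "amount": 12.00},
    {"store_id": 1, "sku": "C300", "amount": 7.25},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row-oriented access: fetch one complete record.
second_sale = rows[1]

# Column-oriented access: aggregate one field without touching the others.
total_amount = sum(columns["amount"])
```

A system that supports both layouts can, in principle, pick whichever one fits a given query, which is roughly what "mixed row and column analytics" suggests.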
Making the technology available is one thing, but actual user experience is quite another. To date, only a handful of enterprises have deployed cloud-based warehousing, and even then only on a limited basis. That's why it might be a good idea to keep track of Dollar General Corp., which recently signed with service provider 1010data to offload its entire warehouse, more than 70 billion records, to the cloud. As reported by Information Management's Julie Langenkamp, the deal calls for 1010data to provide the warehouse as well as all front-end analytic tools and support services for DG's central operations, nine distribution centers and more than 8,500 stores, which cumulatively generate more than 5 billion records a year.
Traditional warehousing is probably the prime example of an enterprise function that can only be truly effective if it is backed by substantial investment. The value of the knowledge that comes out depends largely on the storage capacity and the level of analytics that you put in.
Putting it all in the cloud cuts much of that investment, but the jury is still out as to whether it can provide the return that makes it worthwhile.
But if it can't do it just yet, it probably will be able to very soon.