Big Data is quickly moving from concept to reality in many enterprises, and with that comes the realization that organizations need to build and provision infrastructure capable of handling extremely large data volumes, and fast.
So it is no wonder that the cloud is emerging as the go-to solution for Big Data, both as a means to support the data itself and the advanced database and analytics platforms that will hopefully make sense of it all.
According to a recent survey from Unisphere Research, more than half of all enterprises are already using cloud-based services, while the number of Big Data projects is set to triple over the next year or so. This points to the basic conundrum the business world faces with Big Data: the need to ramp up infrastructure and services quickly, and at minimal cost, in order to maintain a competitive edge in the rapidly expanding data economy. The convergence of Big Data and the cloud, therefore, is a classic example of technology enabling a new way to conduct business, which in turn fuels demand for the technology and for the means to optimize it.
Cloud providers are already jumping on the Big Data bandwagon, issuing a slew of new services in recent months that seek to capitalize on the gap between Big Data's immediate demands and the time it will take most enterprises to build out their own cloud infrastructure to sufficient scale. Datapipe, for example, recently acquired GoGrid in a move to automate the provisioning process for Big Data cloud environments. GoGrid's toolkit provides a full suite of automation and integration systems that enable rapid deployment and scale, plus the orchestration and management components to combine disparate environments into a cohesive whole.
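To make that provisioning story concrete, here is a minimal sketch of what automated Big Data cluster provisioning looks like in code. It uses AWS's boto3 EMR API purely for illustration, not GoGrid's own toolkit; the cluster name, instance types and counts are hypothetical placeholders, and the call assumes an AWS account with the default EMR roles already configured.

```python
# Minimal sketch: provisioning a Hadoop/Spark cluster through a cloud API.
# Illustrative only -- this uses AWS's boto3 EMR client, not GoGrid's toolkit.
# Cluster name, instance types and counts are hypothetical placeholders,
# and valid AWS credentials plus the default EMR IAM roles are assumed.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="bigdata-analytics-cluster",
    ReleaseLabel="emr-5.36.0",                # distribution bundling Hadoop/Spark
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 4},
        ],
        "KeepJobFlowAliveWhenNoSteps": True,  # keep the cluster alive for ad hoc jobs
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)

print("Provisioned cluster:", response["JobFlowId"])
```

The specific provider matters less than the pattern: a few dozen templated lines replace weeks of hardware procurement, which is exactly the rapid deployment and scale these automation toolkits are meant to package.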
Through these kinds of targeted services, many cloud providers are seeking to de-emphasize the cloud aspect of what they do in favor of the solutions they provide – Big Data in this case, but also storage, backup/recovery and other vital functions. At the moment, this is an uphill struggle because, as a recent Capgemini survey outlined, most enterprises are disappointed with the cloud's ability to handle Big Data so far. Part of this is due to high expectations: most pitches describe a Big Data nirvana in the cloud that simply does not exist yet. But part can be attributed to the fact that few organizations have figured out how to make Big Data vital to their core business. So in the end, you have a perfect storm: a technology still largely in its infancy making promises about a service that few people know how to properly leverage.
Take a step back from all this, however, and you notice a bigger movement than just Big Data on the cloud, says DataHero CEO Chris Neumann. The emergence of cloud data is having as profound an effect on enterprise software as bring-your-own-device had on hardware – that is, the decentralization of data and the loss of the control that enterprises typically exert over data through the IT department. This can be seen in the emergence of new data management models like self-service analytics and service-based rules engines, in which activity in one application triggers actions in another. This will not only affect the way we work, but will call for an entirely new data infrastructure that stresses productivity and cooperation rather than control, as business processes begin to span widely disparate and geographically dispersed resource sets.
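As a rough illustration of the rules-engine model Neumann points to, the sketch below shows the pattern of activity in one application triggering actions in another. The engine, event names and handlers are all hypothetical, invented for this example rather than drawn from any particular product.

```python
# Minimal sketch of a service-based rules engine: events emitted by one
# application trigger registered actions in another. The event names and
# handlers are hypothetical, invented purely for illustration.
from collections import defaultdict
from typing import Callable

class RulesEngine:
    def __init__(self):
        self._rules = defaultdict(list)

    def on(self, event: str, condition: Callable[[dict], bool] = lambda e: True):
        """Register an action to run when `event` fires and `condition` holds."""
        def register(action: Callable[[dict], None]):
            self._rules[event].append((condition, action))
            return action
        return register

    def emit(self, event: str, payload: dict):
        """Called by the producing application; fans out to matching actions."""
        for condition, action in self._rules[event]:
            if condition(payload):
                action(payload)

engine = RulesEngine()

# Rule: when the CRM logs a large deal, the analytics app reacts.
@engine.on("crm.deal_closed", condition=lambda e: e["amount"] > 100_000)
def refresh_dashboard(event: dict):
    print(f"Analytics: refreshing revenue dashboard for {event['account']}")

# One application's activity ("a deal closed") triggers another's action.
engine.emit("crm.deal_closed", {"account": "Acme", "amount": 250_000})
```

In a real deployment the two applications would live in separate services connected by a message bus, but the decentralizing effect is the same: the data producer never needs to know which downstream consumers act on its events.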
It seems clear at this point, though, that the enterprise will not be able to leverage Big Data to a significant degree without the cloud. And it is also clear that creating a private cloud capable of handling Big Data volumes will take both time and money, which are in short supply for the vast majority of organizations out there.
This means greater trust will have to be placed in third-party infrastructure in order to maintain the competitive advantage that Big Data is supposed to provide. This does not have to be multi-tenant public cloud infrastructure, of course, but can take the form of hosted private services, integrated hybrid clouds or a range of other solutions.
In all cases, however, it will be up to the enterprise to ensure that the supporting infrastructure is appropriate for the class of data on hand, and then to supplement the entire process with the tools and strategies to turn large collections of data into working knowledge.
Arthur Cole writes about infrastructure for IT Business Edge. Cole has been covering the high-tech media and computing industries for more than 20 years, having served as editor of TV Technology, Video Technology News, Internet News and Multimedia Weekly. His contributions have appeared in Communications Today and Enterprise Networking Planet and as web content for numerous high-tech clients like TwinStrata, Carpathia and NetMagic.