It’s been said that Big Data is not a storage problem, but a management problem. That may be true up to a point, but the fact is that worldwide data loads are expected to at least double every couple of years for the remainder of the decade. So the question of where all of it will physically reside is more than just an interesting thought experiment.
In 2012, global data volumes measured some 2.4 zettabytes, brought on largely by converging trends like mobility, broadband connectivity, social networks and the cloud. By 2020, we could be looking at 34 zettabytes or more as the exchange of information between businesses and individuals becomes increasingly common. Clearly, this will exceed the capacity of even the largest enterprise data center, so it is inevitable that demand for outsourced capacity will only increase, whether it is in the form of traditional hosting, cloud computing or a combination of both.
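Taken together, those figures imply a growth rate worth making explicit. A quick back-of-the-envelope calculation (using the 2.4-zettabyte and 34-zettabyte figures above; the exact numbers are illustrative projections, not measurements) shows what compound annual growth connects them:

```python
import math

# Figures from the text: 2.4 ZB in 2012, a projected 34 ZB by 2020.
start_zb, end_zb = 2.4, 34.0
years = 2020 - 2012

# Compound annual growth rate implied by those endpoints.
cagr = (end_zb / start_zb) ** (1 / years) - 1

# How long that rate takes to double the total.
doubling_time = math.log(2) / math.log(1 + cagr)

print(f"Implied growth: {cagr:.1%} per year")
print(f"Data doubles roughly every {doubling_time:.1f} years")
```

The result, roughly 39 percent annual growth and a doubling time of about two years, is why even generous in-house capacity planning falls behind so quickly.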
But that does not change the fact that, even in the cloud, stored data still needs a physical home. And even if storage costs are lower in the cloud, it is conceivable that, as volumes grow, even leased capacity will start to take a toll on profitability relatively quickly.
Is it possible, then, that some new type of storage will emerge — one that can dramatically lower costs no matter where data resides? Researchers at the A*STAR Data Storage Institute in Singapore are working on several techniques using non-volatile memory and traditional magnetic media that promise high-speed, high-capacity solutions costing less than the HDD and SSD systems currently on the market. The group is still refining the algorithms that would govern data handling and other functions, but evidence to date suggests that such a system can be both longer-lasting and more adept at storing and retrieving metadata than existing technology.
This isn’t to say that management of these immense volumes of data is of little concern. As Veeam Software’s Doug Hazelman notes, simply finding new places to put data is of little help if it cannot be accessed and put to good use. For distributed architectures to be truly effective, the enterprise will need an integrated and broadly interoperable management infrastructure — and the best way to get there is through virtualization. Not only does virtualization greatly enhance deployment and scalability, but it works around nearly all of the roadblocks that plague traditional physical-layer backup and recovery platforms. Virtual backup and recovery, however, requires specialized tools, meaning that any CIO who expects to simply repurpose traditional software for the virtual world is asking for trouble.
Underlying all of these issues, however, is the fact that, with very few exceptions, Big Data volumes remain cost centers for the enterprise, not revenue generators. But as eWeek’s Nathan Eddy pointed out recently, many organizations are looking to monetize their stored data through trade, barter or even outright sale. Already, the industry is seeing the rise of “information brokers” who act as intermediaries, matching data sets with people and organizations looking to buy them. Gartner, for one, says the rise of data monetization is inevitable, if only to defray the cost of managing and maintaining storage infrastructure.
Is this the best we can hope for, then? Continuing rear-guard actions designed to blunt the impact that Big Data will have on the bottom line? Perhaps, but it is also possible that the key to future success will come from both the storage and the management of that data — that somewhere in those massive volumes lie the keys to new business opportunities, or even entirely new industries that could dramatically alter life in the technological age and produce the next Microsoft or Google in the bargain.
That’s still quite an investment for an extremely fuzzy return, but if the alternative is to simply ignore Big Data to focus on more immediate concerns, then don’t be surprised to find yourself at a disadvantage going into the next technology cycle.