While technologies such as Hadoop have given IT organizations the ability to more easily access large amounts of data, the infrastructure needed to process all that data is still beyond the means of most organizations.
Yahoo, for example, has more than 40,000 nodes running Hadoop in order to manage 180 to 200 petabytes of data. Facebook has 2,000 compute nodes with 20 petabytes of data. By contrast, the average cluster within most IT organizations might scale up to about 30 servers, while Hadoop clusters consist of 200 servers on average.
In an IBM webcast, Leon Katsnelson, IBM program director for information management at the IBM Cloud Computing Center for Competence, makes the case that the cost of Hadoop is going to push these deployments into the cloud. Simply put, Katsnelson says most companies can't make the upfront capital investment in the compute, storage and networking resources required.
Unless organizations find a way to access Hadoop resources in the cloud, adds Katsnelson, they are never going to develop the skills needed to use Hadoop. Meanwhile, a recent study conducted by the Ventana Group found that 94 percent of Hadoop users now perform analytics on volumes of data that were not possible before, 88 percent analyze data in greater detail, and 82 percent can now retain more of their data. In effect, that means these organizations are starting to view Hadoop as a strategic technology they need to gain a competitive advantage.
Katsnelson argues that the only way the analytics playing field is going to remain level is if organizations take advantage of Hadoop as a cloud service. That is why IBM launched BigInsights on SmartCloud Enterprise, which makes Hadoop available to customers in less than 30 minutes at a cost that starts at 30 cents per hour. At that price point, IBM figures that most organizations can't afford to ignore the potential impact that analytics applications based on Hadoop might have on their business.
IBM is not the only cloud service provider making Hadoop available. What's most interesting about all this is how cloud computing should substantially reduce the amount of time it takes for organizations to gain access to new and emerging technologies. That might not only reduce the time it normally takes for new technologies to reach mainstream adoption; it might ultimately have a major impact on the rate at which companies of all sizes compete with each other.