Partnerships Key to Expanding Big Data's Use

Loraine Lawson

Forrester analyst James Kobielus recently pondered what could be a multi-million-dollar question:

When will one-stop-shop data analytic tool vendors emerge to field integrated development environments (IDEs) for all or most of the following advanced analytics capabilities at the heart of Big Data?

He checks off a list of must-haves for this "nirvana-grade Big Data IDE," including data integration, governance, metadata management, MDM and so on. Now that kind of offering could really make Big Data a reality for organizations. The problem is, it's still wishful thinking, as Kobielus acknowledges.


"The only vendors whose current product portfolios span most of this functional range are SAS Institute, IBM, and Oracle," he writes. "I haven't seen any push by any of them to coalesce what they each have into unified Big Data tools."


As my great-grandmother told my father when he rushed two-year-old me to put on my winter coat: Give her time, give her time.


Certainly, new partnerships and solutions are announced every day. Most recently, Teradata revealed it will be working closely with Hadoop company Hortonworks. The goal is to integrate Hadoop with Teradata data and Teradata Aster analytics technologies; in other words, to put some wings on Big Data by turning it into usable analytics information.


As Gartner analyst Merv Adrian explained to Computerworld, it's all about integration. DBMS vendors want to offer customers a way into Hadoop stores, and Big Data startups want to tap into the enterprise connectors already offered by enterprise database vendors.


These partnerships help both sides win.


"Many already have disconnected data marts and don't want another wave of fragmentation as they begin to leverage big data," Adrian told Computerworld.


So what kind of deals are in the works? And who are the players to watch?


From the Big Data side, the big players are: Cloudera, EMC, Hortonworks and IBM. The names you're most likely to hear in any announcements, though, are Cloudera and Hortonworks, because other enterprise players are opting to partner with them instead of offering their own Hadoop distributions. Let's look at their core offerings:


Cloudera released the first distribution of Apache's Hadoop stack for enterprises. The company's chief architect happens to be Doug Cutting, a co-creator of Hadoop and the man who named it. Cloudera's version of Hadoop powers Amazon Web Services and Rackspace clouds. It, too, is a Teradata partner. Its most recent big partner? Oracle.


EMC has its own distribution of Hadoop, thanks to its 2010 acquisition of Greenplum. EMC offers Greenplum DCA, basically a Hadoop appliance. This month, it also announced that its Isilon networked storage system will include native support for the Hadoop Distributed File System (HDFS).


Hortonworks is a Yahoo spinoff company that formed last July. It does not have its own distribution of Hadoop, but does provide services to other companies. Right now, its partnership with Teradata is generating a lot of buzz in the press, in part because it appears to offer deeper, more strategic integration with the analytic tool, according to GigaOm. But Hortonworks also secured a similar partnership with Microsoft last fall.


IBM is the reigning champ of big money from Big Data, making just over $1 billion in revenue from it, according to the research company Wikibon. It offers a Hadoop platform called InfoSphere BigInsights. Basically, what you're getting here is software, including a host of open source tools such as Apache Hadoop (obviously), MapReduce, Pig, Hive and HBase, plus IBM software such as a spreadsheet-based analytic tool - all ready to work together, once you've set it up on the hardware of your choice. This is a heavyweight choice, it seems to me, and, indeed, in his assessment of the Big Data options, Ovum's Tony Baer suggests you hire a systems integrator to set it up.


In addition to the platform, IBM also offers Netezza, an appliance for processing Big Data.


Tomorrow, I'll share how other big tech vendors are entering the Big Data market.

Mar 2, 2012 12:18 PM DataH says:

I am not sure I agree that it is still wishful thinking to expect data integration, governance, metadata management, etc. in Big Data tools. Unlike most other multimillion-dollar Big Data vendor offerings, the open source and free offering from HPCC Systems is both mature and simple to use. It also provides all the 'nirvana-grade' features without the need to break the bank by spending millions of dollars on something like Hadoop and then engaging a high-end vendor to support it. The HPCC tool is simple to set up and use. The data-flow-oriented ECL language is inherently parallel (no need to Map and Reduce) and provides the flexibility to represent your data in multiple models: relational tables (like an RDBMS), XML, RDF, key-value pairs, etc.

