Two Major Hadoop Vendors Call Foul on New Industry Consortium

Written By
Loraine Lawson
Apr 27, 2015

Hadoop distributor MapR declined an invitation to be part of a vendor consortium centered on the Hadoop stack. MapR CEO John Schroeder explained the company’s decision in a post last week, deriding the consortium definition of Hadoop’s core as “vendor-biased.”

The consortium calls itself the Open Data Platform, which is a bit of an odd title given how platform is generally defined in the industry. In February, the group revealed its founding Platinum members as GE, Hortonworks, IBM, Infosys, Pivotal, SAS and an undisclosed international telecom. Gold members include Altiscale, Capgemini, CenturyLink, EMC, Splunk, Verizon Enterprise Solutions, Teradata and VMware.

It’s not unusual for vendors to join forces to create interoperability standards, and that is the stated intent in the group’s press release:

“A key benefit of the ODP will be for members to collaborate across various Apache projects as well as other open source-licensed big data projects with a goal toward meeting enterprise class requirements. The ODP is expected to promote a set of standard open source technologies and versions that will increase compatibility among big data solutions and simplify the process for applications and tools to integrate with and run on any compliant system.”

You would think that wouldn’t be an issue, since all vendors draw on Apache’s open source version of Hadoop. Members of the consortium say otherwise. Sunny Madra, head of data products at Pivotal, pointed out that the distributions are “dissimilar by design.” In an InfoWorld interview, he compared the Hadoop situation to the early days of Unix distributions.

“The Unix ecosystem was quite fragmented; everyone had their own things going on, and you couldn’t be sure if something ran in one place or the other,” Madra told InfoWorld. “Then Linux comes around and standardizes that. So if you take a look at RHEL or CentOS or Oracle, you know that if you have something that runs on any one of those, it’ll run on all of them.”

Hadoop’s lack of standardization makes it hard to certify software that is developed for it, Madra added.

But there are a few things that make this consortium a bit … awkward. First, MapR isn’t the only major Hadoop distributor missing from that list. The top two commercial Hadoop companies, Amazon Web Services EMR and Cloudera, are also absent. Cloudera declined to join early on. Chief Strategy Officer Mike Olson didn’t mince words about why.

“I have an engineer’s disdain for industry consortia in general, and for vendor-driven consortia in particular. Far too often, these organizations aim not at promoting, but rather at slowing, innovation in the technology industry,” he wrote. “Pivotal and Hortonworks claim that the ODP is driven by an industry-wide longing for standardization in the Apache Hadoop ecosystem. I don’t believe them.”

That’s significant since Cloudera is the leader in customer deployments, with more than 200 customers in March 2014. Schroeder writes that together, MapR and Cloudera run nearly 75 percent of Hadoop implementations.

Also missing are Intel and Microsoft, which offers Microsoft Windows Azure HDInsight Service, one of two Hadoop distributions that run on Windows, according to CIO.com. A version of Hortonworks Data Platform also runs on Windows.

So, you have to wonder what good standardization does if the largest distributors aren’t involved — a point that Schroeder succinctly raises in his post, referring specifically to the idea of “platinum” and “gold” memberships that bestow different rights.

“The Open Data Platform is not open unless equal voting rights are provided to the leading Hadoop distributions,” he writes. “The Open Data Platform has not disclosed how governance is done, but it is a different model than the preferred and fair meritocracy used by the Apache Software Foundation.”

This is one of three concerns Schroeder outlines in his post, the other two being:

The Open Data Platform is redundant with Apache Software Foundation Governance.

The Open Data Platform is ‘solving’ problems that don’t need solving. “Companies implementing Hadoop applications do not need to be concerned about vendor lock-in or interoperability issues,” he writes. “Applications built on one distribution can be migrated with virtually zero switching costs to the other distributions.”

Gartner’s informal findings from a webinar back that up. In a joint post, Gartner analysts Nick Heudecker and Merv Adrian reveal that less than 1 percent of attendees indicated that vendor lock-in or interoperability was a concern.

Where does Gartner stand amid this vendor bickering? You’ll note that Heudecker and Adrian’s post is written as a dialogue between Muppet malcontents Statler and Waldorf, which I think softens their criticism of the ODP:

“This simply institutionalizes a dichotomy in favor of a few favored players. Who wants it? As Cloudera suggests, the paying members, and it’s not clear who else. It’s ironic that Hortonworks is one of the founders of an organization that wants to add an anchor slowing innovation in the open source free-for-all it has been the flag-bearer for.”

Loraine Lawson is a veteran technology reporter and blogger. She currently writes the Integration blog for IT Business Edge, which covers all aspects of integration technology, including data governance and best practices. She has also covered IT/Business Alignment and IT Security for IT Business Edge. Before becoming a freelance writer, Lawson worked at TechRepublic as a site editor and writer, covering mobile, IT management, IT security and other technology trends. Previously, she was a webmaster at the Kentucky Transportation Cabinet and a newspaper journalist. Follow Lawson at Google+ and on Twitter.
