Why Big Data Needs Better Middleware Instead of Data Scientists

Loraine Lawson

If Lars George thinks there’s a gap between technology and application with Hadoop, he’s probably right.

George is now Cloudera’s chief architect for EMEA, but he began with Hadoop in 2007, using it to build scalable Web solutions. He wrote the O’Reilly book on HBase. To sum up: He knows his stuff. In a recent Datanami article, George acknowledges that such a gap exists, particularly when it comes to middleware.

“We’re customers who want to use it now,” George is quoted as saying. “But I think we’re still not there yet.”

Often, when I write about middleware, I focus on the integration aspects. And, indeed, we’ve been talking about the integration layer for some time now, going back at least to 2010. Big Data and data in general have certainly spurred investments in middleware. And most, if not all, of the major integration players now support Hadoop.

Lavastorm Analytics CEO Drew Rockwell explains in the same article that the Big Data middleware layer needs an analytic orchestration and assembly environment. Some vendors are focusing on Hadoop and some vendors are innovating for the visualization layer, he said, but there’s little in the way of a middle layer to assemble or build analytic applications.

Rockwell sees that as a real sticking point for broad adoption.

Here’s one reason why both Rockwell and George see a need for a stronger middleware layer: There aren’t enough data scientists to go around, and both say there never will be. They’re quite adamant about this, as this quote from George shows:

“If we think we can hire or train more data scientists to accelerate the adoption of Hadoop, I think we are mistaken. We really need to do this differently.”

They recommend a different model. Data scientists should work for software vendors developing algorithms for specific business use cases. Then, using a middleware layer, Hadoop techies and business analysts would handle the in-house work of applying those algorithms to their own data.

It’s not an unprecedented idea. As I shared last week, some startups are already doing just that.

Loraine Lawson is a veteran technology reporter and blogger. She currently writes the Integration blog for IT Business Edge, which covers all aspects of integration technology, including data governance and best practices. She has also covered IT/Business Alignment and IT Security for IT Business Edge. Before becoming a freelance writer, Lawson worked at TechRepublic as a site editor and writer, covering mobile, IT management, IT security and other technology trends. Previously, she was a webmaster at the Kentucky Transportation Cabinet and a newspaper journalist.

Mar 28, 2015 5:57 AM Ilya Geller says:
There is no Big Data. Language has its own internal parsing, indexing and statistics. For instance, take two sentences: a) ‘Fire!’ b) ‘In this amazing city of Rome some people sometimes may cry in agony: ‘Fire!’’ Evidently, the phrase ‘Fire!’ has a different importance in each sentence because of the extra information in the second. This distinction is reflected in the phrase weights: the first has a weight of 1, the second 0.12; the greater weight signifies stronger emotional ‘acuteness’. First you parse the text, obtaining phrases from clauses, sentences and paragraphs. Next, you calculate the internal statistics, the weights, where a weight refers to the frequency with which a context phrase occurs in relation to other context phrases. After that, you index each word of each phrase by dictionary and annotate it with subtexts.
