Do Data Lakes Need a ‘Refinement Layer?’

Written By
Loraine Lawson
Nov 28, 2014

Okay, sure, maybe Gartner has a point about this whole “data lake becoming a data swamp” problem. But a recent Information Age piece proposes that organizations can get around all that — and the need for data scientists — with a “data refinery layer.”

Haven’t heard of such a thing? Neither have I, and Google seems to only have heard of it twice, including this article and an unsourced Word document.

“As data is consolidated, the refinement layer would process, evaluate, correlate and learn from the information passing through it, essentially generating additional insights and information from the data, and also linking to the aforementioned applications to drive value,” the article explains.

That sounds wonderful. Let’s do it! The problem is, after reading the article, I’m still not exactly sure what it is or if it exists or if it could exist.
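
For what it's worth, here is one way to read that quoted description: a pass-through stage that touches each record on its way into the lake and attaches whatever it has derived so far. The sketch below is purely illustrative and is not taken from the Information Age piece or from Comptel. The class name `RefinementLayer`, the record fields (`device`, `value`, `source`) and the choice of a running mean as the "learning" step are all assumptions made up for this example.

```python
from collections import defaultdict


class RefinementLayer:
    """Toy pass-through stage; every name here is hypothetical."""

    def __init__(self):
        self.seen_sources = defaultdict(set)  # key -> sources that reported on it
        self.count = defaultdict(int)         # key -> number of readings so far
        self.mean = defaultdict(float)        # key -> running mean of readings

    def refine(self, record):
        # Process: normalize the raw record into a common shape.
        key = str(record["device"]).strip().lower()
        value = float(record["value"])

        # Learn: update a running mean per key (incremental average).
        self.count[key] += 1
        self.mean[key] += (value - self.mean[key]) / self.count[key]

        # Correlate: remember which sources have reported on this key.
        self.seen_sources[key].add(record["source"])

        # Return the record enriched with the derived fields.
        return {
            "device": key,
            "value": value,
            "source": record["source"],
            "running_mean": self.mean[key],
            "sources_seen": sorted(self.seen_sources[key]),
        }


layer = RefinementLayer()
lake = [layer.refine(r) for r in [
    {"device": " Router-1 ", "value": "10", "source": "netflow"},
    {"device": "router-1", "value": 20, "source": "syslog"},
]]
```

After the second record passes through, the stored entry for `router-1` carries a running mean of 15.0 and lists both `netflow` and `syslog` as sources. Whether a real product could do this usefully at data lake scale is exactly the open question.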

This piece says it’s by Ben Rossi, but if you read to the bottom it’s “sourced from Matti Aksela, Comptel.” That’s a niche company that focuses on building customer interaction automation systems for telecoms. Apparently, data refinery layers won’t be on discount during today’s sales.

So why am I sharing this? I’ve been researching data lakes and talking with numerous experts (more on that another day), and I realize there’s actually a really good point here.

While there’s some heated debate over the usability and merit of data lakes, large companies are building them for legitimate use cases. Sensor and other Internet of Things data will need to go somewhere, and there are already successful use cases for network intrusion detection and security.

So the real question is how we make them useful more broadly and, ultimately, that’s going to require abstraction layers. One of the benefits of a data lake is supposed to be less data integration work, but wherever there are layers, there seems to be middleware.

We’ll also probably hear a lot of different names for the same tools along the way. That’s just how tech happens.

If data tech history teaches us anything, we can expect that industry-specific vendors will pioneer the first drafts of these tools. So, the piece is worth a read to see what technologists are thinking as they try to solve this problem.

In the meantime, a more practical read for the weekend might be Forbes’ recent article, “3 Major Mistakes Companies Make With Big Data And How To Fix Them.” It’s written by Erik Severinghaus, founder and CEO of digital marketing personalization company SimpleRelevance. Severinghaus discusses more immediate solutions for squeezing business value from Big Data, including the four roles you must have on your Big Data team.

Webinars and Events:

“Postgres – The NoSQL Cake You Can Eat,” Tuesday, Dec. 2, at 2 p.m. ET. Do you have to use NoSQL to achieve goals like managing transactional system data? This webinar discusses an alternative: Postgres, aka PostgreSQL, an object-relational database management system (ORDBMS). Marc Linster, SVP, Products & Services at EnterpriseDB, will discuss ETL, foreign data wrappers and other techniques you can use with this open source solution.

“When Is a Document-Oriented Database the Right Tool for You?” Tuesday, Dec. 2, at 4 p.m. ET. Next week’s edition of The Briefing Room with Dr. Robin Bloor will dig into the challenges of scaling applications and databases. He’ll be briefed by Cloudant Chief Scientist Mike Miller, who will demonstrate denormalizing data into documents for better data management across distributed infrastructure.

Loraine Lawson is a veteran technology reporter and blogger. She currently writes the Integration blog for IT Business Edge, which covers all aspects of integration technology, including data governance and best practices. She has also covered IT/Business Alignment and IT Security for IT Business Edge. Before becoming a freelance writer, Lawson worked at TechRepublic as a site editor and writer, covering mobile, IT management, IT security and other technology trends. Previously, she was a webmaster at the Kentucky Transportation Cabinet and a newspaper journalist. Follow Lawson at Google+ and on Twitter.
