
Hortonworks Launches Hadoop Data Governance Initiative


Written by Mike Vizard | Jan 30, 2015

Recognizing that data governance concerns are holding back deployments of Hadoop in production environments, Hortonworks announced this week that it is partnering with Aetna, Merck, Target and SAS to launch the Data Governance Initiative (DGI).

Andrew Ahn, director of product management for governance at Hortonworks, says DGI will create a customizable metadata framework on top of Hadoop that IT organizations will be able to invoke using a rules-based policy engine.
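To make the idea of a rules-based policy engine concrete, here is a minimal sketch in Python of how metadata tags attached to a data set might drive access decisions. All of the class, tag, and group names are hypothetical assumptions for illustration; nothing below is an actual Hortonworks or DGI interface.

# Hypothetical sketch of a rules-based policy engine evaluating metadata tags.
# None of these names come from Hortonworks; they only illustrate the concept.
from dataclasses import dataclass, field


@dataclass
class DataSet:
    name: str
    tags: set = field(default_factory=set)  # e.g. {"PII", "healthcare"}


@dataclass
class Rule:
    required_tag: str      # tag the rule applies to
    allowed_groups: set    # groups permitted to read data carrying that tag


class PolicyEngine:
    def __init__(self, rules):
        self.rules = rules

    def can_access(self, dataset, user_groups):
        """Allow access only if every tag on the data set is covered by a rule
        that includes at least one of the caller's groups."""
        for tag in dataset.tags:
            matching = [r for r in self.rules if r.required_tag == tag]
            if not any(r.allowed_groups & set(user_groups) for r in matching):
                return False
        return True


if __name__ == "__main__":
    claims = DataSet("claims_2015", tags={"PII", "healthcare"})
    rules = [
        Rule("PII", {"compliance", "actuarial"}),
        Rule("healthcare", {"actuarial"}),
    ]
    engine = PolicyEngine(rules)
    print(engine.can_access(claims, ["actuarial"]))   # True
    print(engine.can_access(claims, ["marketing"]))   # False

The point of the sketch is that the engine itself stays generic; only the tags and rules change from customer to customer.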

In addition, DGI will create an audit store that makes it simpler to discover how data was used in a particular context, and will work to integrate the metadata framework with the existing Apache Falcon data lifecycle management and Apache Ranger data security projects.
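As a rough illustration of what an audit store enables, the following hypothetical Python sketch records access events against a data set and then answers the question of how that data was used in a particular context. It is a toy built on assumed names, not the DGI design.

# Hypothetical audit store sketch: record who touched a data set, when, and why,
# then query those events by data set and context. Not an actual DGI interface.
import datetime
from collections import defaultdict


class AuditStore:
    def __init__(self):
        self._events = defaultdict(list)  # data set name -> list of event dicts

    def record(self, dataset, user, action, context):
        self._events[dataset].append({
            "time": datetime.datetime.utcnow().isoformat(),
            "user": user,
            "action": action,
            "context": context,  # e.g. "quarterly-risk-model"
        })

    def usage(self, dataset, context=None):
        events = self._events.get(dataset, [])
        if context is None:
            return events
        return [e for e in events if e["context"] == context]


if __name__ == "__main__":
    store = AuditStore()
    store.record("claims_2015", "analyst1", "read", "quarterly-risk-model")
    store.record("claims_2015", "etl_job", "write", "nightly-load")
    for event in store.usage("claims_2015", context="quarterly-risk-model"):
        print(event)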

Ahn says DGI will also lay the groundwork for longer-term initiatives: once the metadata framework is established, it becomes more feasible, for example, to expose particular data sets via an application programming interface (API) that analytics applications can invoke.
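From an analytics application's point of view, that kind of API might look something like the hypothetical client below. The endpoint, host name, and response fields are invented purely for illustration and do not reflect any announced interface.

# Hypothetical example of an analytics client discovering data sets through a
# metadata API. The host and response shape are assumptions, not a real service.
import json
import urllib.request

METADATA_API = "http://metadata.example.com/api/v1/datasets"  # placeholder host


def find_datasets(tag):
    """Ask the (hypothetical) metadata service for all data sets carrying a tag."""
    with urllib.request.urlopen(f"{METADATA_API}?tag={tag}") as resp:
        # Assume the service returns a JSON list of {"name": ..., "location": ...}
        return json.load(resp)


if __name__ == "__main__":
    for ds in find_datasets("customer-transactions"):
        print(ds["name"], "->", ds["location"])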

Data Governance

The most critical aspect of DGI, says Ahn, is that each instance of the metadata framework can be customized to meet the specific needs of each customer. Rather than imposing a data governance framework from the top down, the effort is intended to let customers create their own taxonomies for tagging data in a way that can be managed consistently via the rules-based engine.
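A minimal sketch of that bottom-up approach, assuming a simple Python representation: each customer defines its own taxonomy of terms and tags data sets against it, so the rules-based engine always works from a controlled vocabulary. The terms and validation logic here are illustrative only.

# Hypothetical sketch of a customer-defined taxonomy used to tag data sets.
# The term names are invented; the idea is that each customer supplies its own
# vocabulary rather than adopting one imposed from the top down.
class Taxonomy:
    def __init__(self, name, terms):
        self.name = name
        self.terms = set(terms)

    def validate(self, tag):
        if tag not in self.terms:
            raise ValueError(f"'{tag}' is not a term in taxonomy '{self.name}'")
        return tag


if __name__ == "__main__":
    # One insurer's taxonomy; a retailer would define entirely different terms.
    insurer = Taxonomy("insurer-data-classes",
                       ["PII", "claims", "underwriting", "actuarial"])
    dataset_tags = [insurer.validate(t) for t in ("PII", "claims")]
    print(dataset_tags)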

Ahn concedes that none of these ideas are new from a traditional data warehousing perspective; they simply haven't yet been applied to Hadoop. What will be more interesting to watch is how broadly they are applied across the enterprise as Hadoop continues to emerge as a "data lake" that all applications wind up drinking from.


Michael Vizard is a seasoned IT journalist, with nearly 30 years of experience writing and editing about enterprise IT issues. He is a contributor to publications including Programmableweb, IT Business Edge, CIOinsight and UBM Tech. He formerly was editorial director for Ziff-Davis Enterprise, where he launched the company’s custom content division, and has also served as editor in chief for CRN and InfoWorld. He also has held editorial positions at PC Week, Computerworld and Digital Review.
