Big Data Security Risk in the Enterprise: The Pitfalls of Hadoop

Written By
ITBE Staff
May 20, 2013

Big Data wasn’t designed with security in mind; in fact, security has never really been the focus of distributed computing. These mountains of data inform businesses about critical buyer decisions, habits, and countless other minutiae, and with them comes a pressing need to keep this valuable information secure. This is sensitive information, after all, and the more of it there is, the greater the risk of breaches.

Data volumes are doubling annually, and roughly 80 percent of that captured data is unstructured and must be processed with a technology like Hadoop before it can be mined for information. Given this growth, security concerns clearly won’t be going away anytime soon. Quite the opposite, actually.

As Hadoop becomes more widely adopted in the enterprise, its security limitations are becoming more apparent. Brian Christian, co-founder and CTO of secure Big Data management vendor Zettaset, explains the biggest Big Data security challenges facing the enterprise today and his thoughts on creating a unified security model for Big Data.


Hadoop was originally adopted to manage publicly available information, not enterprise data.

Like many ground-breaking technologies (for instance, TCP/IP or UNIX), Hadoop wasn’t originally built with the enterprise in mind, let alone security. Its original purpose was to manage publicly available information like Web links, and it was designed to process large amounts of unstructured data within a distributed computing environment modeled on Google’s. It was not written to support hardened security, compliance, encryption, policy enablement, or risk management.

Companies leveraging Big Data are mostly on their own when it comes to security. Best practices for companies with Hadoop clusters include implementing additional access controls and limiting the number of personnel allowed to access the Hadoop cluster.

Hadoop’s security relies entirely on Kerberos

Hadoop does use Kerberos for authentication. However, the protocol can be difficult to implement, and it doesn’t cover a number of other enterprise security requirements, such as role-based access control and integration with LDAP and Active Directory for policy enablement. Hadoop also doesn’t support encryption of data stored on nodes or of data in transit between nodes.
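
To make the gap concrete: Kerberos establishes who a user is, but it does not decide what that user may do. The following is a minimal sketch, in Python, of the kind of role-based check an enterprise would have to layer on top of an already authenticated identity. The role names, the `ROLE_PERMISSIONS` table, and the `Principal` and `is_allowed` helpers are all hypothetical and are not part of Hadoop or Kerberos.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission table; an assumption for illustration,
# not something Hadoop or Kerberos provides.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "engineer": {"read", "write"},
    "admin": {"read", "write", "delete"},
}


@dataclass
class Principal:
    """An identity that is assumed to be authenticated already (e.g. via Kerberos)."""
    name: str
    roles: set = field(default_factory=set)


def is_allowed(principal: Principal, action: str) -> bool:
    """Return True if at least one of the principal's roles grants the action."""
    return any(
        action in ROLE_PERMISSIONS.get(role, set())
        for role in principal.roles
    )


# Example principals; the names and realm are made up.
alice = Principal("alice@EXAMPLE.COM", {"analyst"})
bob = Principal("bob@EXAMPLE.COM", {"engineer"})
```

Here `is_allowed(alice, "write")` is false even though Alice has authenticated successfully, which is the distinction between authentication and authorization that the requirements above turn on.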

As a data store, a Hadoop cluster is not a single physical entity, but consists of hundreds, sometimes thousands, of nodes.

Traditional data security technologies have been built on the concept of protecting a single physical entity (like a database or server), not the uniquely distributed Big Data computing environments characterized by Hadoop clusters. Traditional security technologies are not effective in this type of distributed, large-scale environment.

Side Note: The distributed nature of Hadoop clusters also renders many traditional backup and recovery methods and policies ineffective. Companies leveraging Hadoop need to replicate, back up and store data in a separate, secured environment.

To reap the benefits of Big Data, enterprises use Hadoop in conjunction with other technologies such as Hive, HBase or Pig. While these tools make Big Data accessible and usable, most also lack any real enterprise-grade security. Hardening Hadoop itself is only one part of the Big Data security challenge.

So far, no one has been able to put an accurate number on how much a security breach can cost an organization. Without thoroughly evaluating its security coverage, an enterprise can neither assess its security weaknesses nor determine exactly how much to spend on that coverage.
