
Sorting Through What's Really Going on in the Hadoop Stack


Click through for an overview of the Hadoop stack and gain a better understanding of its capabilities, as identified by Loraine Lawson.


Loraine Lawson

Everyone tends to focus on the “big” in Big Data, so much so that it’s easy to lose sight of the fact that Hadoop is really about data. Let’s regroup for a minute and look at what’s actually going on with the data on Hadoop.

First, there’s the core. When people say “Hadoop,” they’re usually referring to the Hadoop core, which Loraine Lawson explains has two parts:

The Hadoop Distributed File System. What’s it doing with the data? It’s distributing it on nodes and storing it there.

MapReduce. This does the real work in the Hadoop core. If you want to run a process or computation on the data, it “maps” the work out to the nodes, runs the process on each node’s share of the data, and then “reduces” the partial results to your answer. So, it’s processing the data.
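For readers who want to see the map-then-reduce idea in concrete terms, here is a minimal sketch in plain Python. It is not Hadoop code and does not use any Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample text are made up for illustration, and the "nodes" are simply items in a list. It counts words, a common teaching example for this pattern.

```python
from collections import defaultdict

def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one node's chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Group the emitted values by key so each word's counts sit together."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Collapse each word's list of partial counts into a single total."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(chunks):
    """Run map on every chunk, then shuffle and reduce the combined output."""
    mapped = []
    for chunk in chunks:  # on a real cluster, each chunk would live on a different node
        mapped.extend(map_phase(chunk))
    return reduce_phase(shuffle(mapped))

# Hypothetical sample data: two chunks standing in for data stored on two nodes.
chunks = ["big data is big", "data on hadoop"]
counts = word_count(chunks)
print(counts)
```

In this toy version everything runs in one process. The point of the real thing is that the map step runs where each piece of data is stored, and only the much smaller intermediate results travel across the network to be reduced.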

Now, if you’re familiar with data at all, you’ll notice there are a whole lot of things missing from that equation, such as:

  • Modeling
  • Metadata
  • Job scheduling
  • Workflow
  • Data management

This is where the growing list of Apache Hadoop-related projects comes into play.

These projects go by an odd assortment of names, such as Pig, Hive, Flume and ZooKeeper, but they’re often short-changed when we talk about Hadoop. Loraine has seen them referred to as the “Hadoop stack,” though some programmers prefer “Hadoop ecosystem.” Forrester refers to them as “functional layers.”

For the most part, they’re of interest to developers more than executives, but hopefully a high-level view of these solutions will add some depth to your understanding of Hadoop and its capabilities.

Here are a few of the more common names you’ll hear.

 
