Big Data: You Need What Skills?

Susan Hall

Just as data scientist is becoming one of the up-and-coming job descriptions in Big Data, so, too, are the roles of Big Data architects and engineers. My colleague Loraine Lawson wrote yesterday about the people required for Big Data implementations, starting with the CIO. It seems, though, that the various roles are still evolving.


There seem to be more job postings for Big Data engineers than Big Data architects, though a post at Silicon Angle points out that at least for the engineers, companies seem to want the world:

Employers are looking for people who know MapReduce, Hadoop and related frameworks such as HBase, Pig and Hive. Programming languages in demand include: Java, Ruby, or C++. It really covers the gamut, which is part of the issue with using Big Data in a job description. Do you have MongoDB expertise? Yes, that's applicable. Practical hands-on experience with Bayesian models and neural networks? Yes, the job may be right for you.

Any wonder that companies say they can't find Big Data talent? The Silicon Angle post lists the requirements from a couple of ads:


  • The Amazon subsidiary is looking for a Big Data engineer to:
    • Design, develop and support a MapReduce-based data-aggregation pipeline for processing billions of events a day
    • Support data-mining and machine-learning algorithms using behavioral data
    • Study state-of-the-art techniques in massively parallel frameworks and apply them to advertising problems
    • Help other engineers get the most out of the platform you own
  • Climate Corp.: This San Francisco-based company helps companies adjust to climate change. Its ad asks for:
    • Experience with Lisp and/or Clojure (functional programming languages)
    • Experience with large-scale machine learning techniques (examples: Google PageRank, Netflix Prize, genome sequence assembly, computational finance)
    • Experience with Amazon Web Services (EC2, S3, SQS, etc.)
    • Deep knowledge of the Hadoop ecosystem
    • Git version control
    • Frequent contributor to open source projects (show us your work on GitHub!)

There's a shortage not only of analytics talent for Big Data, but of engineering talent as well. Andy Mendelsohn, Oracle senior vice president of database server technologies, talked about that in an article at businesscloud9:

... [Hadoop] is a development platform for very sophisticated Java developers to build parallel applications, so one of the big problems around Hadoop is a skill-set problem. I talk to customers all the time, and they just don't have developers who know how to write these MapReduce programs. And so one of the big challenges of Hadoop is sort of to raise the level of discourse around Hadoop so you don't have to have rocket-scientist Java parallel programming developers, but you can code at a higher level.

So it appears the staff shortage will continue until vendors make the tools easier to use or more people are trained. In the meantime, companies would do well to stop looking for a purple squirrel and really zero in on the essential skills they need.

Apr 27, 2012 6:18 AM Ben Brumm says:

I agree with the statement above that outlines the lack of people with the relevant skills. It's still a very new area of data management and database, so it will take some time. I majored in database at Uni, and if it was being taught when I was at uni I would have learnt it!

Jun 3, 2014 2:36 AM Abhishek Gowlikar says:

I too agree with the above statement. Learners here are unsure how to start learning Hadoop and its ecosystem. Primarily, they need some programming skill set, such as Java or Perl. But soon Hadoop will take a major place in the business/IT structure.
