Cambridge Semantics this week launched a new tool that allows business analysts to take over those little ETL jobs that consume so much of the IT backlog.
It’s not a data integration platform — I want to be clear on that. Instead, it’s an add-on layer that will automatically generate ETL jobs for ETL engines.
It’s been interesting to watch Cambridge Semantics over the past four or so years, because the company has always been a little bit unusual. This is not just my opinion: In 2011, Gartner named the firm one of its “Cool Vendors to Watch.” As you can guess from its name, the company takes a semantic web approach to common IT problems such as integration and reporting.
Its new offering, software called Anzo Smart Data Integration (ASDI), is yet another example of the company’s unusual approach. According to the Cambridge Semantics press release, it uses common, conceptual business models to automate traditional data integration coding:
“It enables business analysts to map source and target systems to the common, conceptual model via an easy-to-use, familiar spreadsheet interface. Systems are only ever mapped once, rather than once per project. ASDI then automatically generates extract, transform and load (ETL) jobs that run on popular ETL engines such as Informatica or Pentaho.”
So, you are quite literally cutting out the middle man — or in this case, an already overloaded programmer.
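To make the "map once" idea concrete, here is a minimal Python sketch. It is not ASDI's actual format or implementation, which the press release does not describe; the system names, field names and both functions are hypothetical. It only shows why mapping each system to a shared conceptual model lets a source-to-target mapping be derived rather than hand-coded per project.

```python
# Hypothetical example: each system is mapped ONCE to shared business concepts.
# Keys are system-specific field names, values are conceptual-model terms.
CRM_TO_MODEL = {
    "cust_nm": "customer_name",
    "cust_email": "email",
}
WAREHOUSE_TO_MODEL = {
    "CUSTOMER_NAME": "customer_name",
    "EMAIL_ADDR": "email",
    "REGION": "region",
}


def derive_field_mapping(source_to_model, target_to_model):
    """Compose two system-to-model mappings into a source-to-target mapping."""
    # Invert the target mapping so we can look up a target field by concept.
    model_to_target = {concept: field for field, concept in target_to_model.items()}
    return {
        source_field: model_to_target[concept]
        for source_field, concept in source_to_model.items()
        if concept in model_to_target  # skip concepts the target does not hold
    }


def build_job_spec(source_name, target_name, source_to_model, target_to_model):
    """Return a plain dict describing a copy job (an assumed, engine-neutral shape)."""
    return {
        "source": source_name,
        "target": target_name,
        "columns": derive_field_mapping(source_to_model, target_to_model),
    }


job = build_job_spec("crm", "warehouse", CRM_TO_MODEL, WAREHOUSE_TO_MODEL)
print(job["columns"])
```

In this sketch, adding a third system means writing one new mapping to the model, not a new mapping to every other system.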
Obviously, this is not a new concept: pushing minor data integration work out to the business is a huge trend. It works for both parties, since these jobs are often a hassle for IT, and waiting on them keeps business users from vital data.
What’s really different here is that this tool works as an add-on to your existing ETL solution. Most of the similar solutions I’ve read about are stand-alone tools that use APIs, tools that draw in data without actually integrating it (“simple integration”), or tools that evolved out of a specific solution.
Vendors and others are welcome to correct me in the comments below.
The Who’s Who of Running Big Data on AWS
Part of Hadoop’s appeal is that it gives you a way to handle Big Data sets on commodity hardware. But sometimes, according to one CIO, you just really need to put the pedal to the metal, metaphorically speaking.
"Most of us wanted Big Data to run on very commoditized servers, but the reality is the more metal that you can give to it the better it is," HubSpot CIO Jim O’Neill told TechTarget in a recent article on Big Data companies that use Amazon Web Services as the engine for their computations.
HubSpot is a digital marketing SaaS company that analyzes between 500 million and 1 billion new data points each month. In addition to running its analytics and processing on AWS, HubSpot also runs a contact management system built on Hadoop and Apache HBase.
I’ve long heard that companies were using AWS for BI and Big Data, but this piece helps fill in the gaps with very real examples. In addition to HubSpot, the site talked with leaders from Yelp, Illumina, Inc. and LexisNexis.
You may recall that three years ago, LexisNexis open sourced its own internal Big Data tool and created a subsidiary to manage it called High Performance Computing Cluster Systems (HPCC Systems). HPCC Systems now offers a customizable data analytics platform that can run on top of AWS, according to the article.