Talend Simplifies Transition from Hadoop to Apache Spark

On the journey to embrace Hadoop as the centerpiece of a modern data warehouse platform, a funny thing happened: IT organizations discovered an interest in the Apache Spark framework for processing Big Data analytics in real time.

To help make that transition less of a grind, Talend today unveiled Talend 6, the latest version of its data integration platform, at the Strata + Hadoop World 2015 conference. The release uses a set of graphical tools to make it simpler to move code developed for Hadoop over to Apache Spark.

Talend CEO Mike Tuchen says that rather than requiring developers to manually recode everything, Talend 6 takes advantage of the company’s core data integration capabilities to convert MapReduce jobs to Spark with the click of a button. The end result is applications that run in memory and are typically five times faster than when they were executing as MapReduce jobs on Hadoop.

Because no recoding is required, Tuchen says, IT organizations are discovering that Talend 6 essentially provides a layer of isolation between their code and disruptive technologies such as Hadoop and Apache Spark, which enables them to adopt emerging technologies more quickly. Given the current pace of innovation, Tuchen notes that Talend 6 will provide the same future-proofing for any code written today that needs to run on the next great platform that comes along tomorrow.

In the meantime, IT organizations should take comfort in the idea that the code they are writing today can live on, however suddenly the application development winds happen to shift.