As the platforms on which enterprise IT organizations can store data proliferate, getting data in and out of those platforms quickly has become a major IT challenge.
To address that issue, Syncsort has released an update to its suite of data integration offerings. The update adds an “Intelligent Execution Layer” that lets users visually design a data transformation once and then run it anywhere, whether on Hadoop, Linux, Windows, or Unix, on premises or in the cloud.
Tendü Yoğurtçu, general manager for Big Data at Syncsort, says version 8.0 of the company’s DMX Software is designed to provide not only a consistent approach for collecting, transforming and distributing data across multiple platforms, but also one that embeds algorithms that automatically select the optimal execution path based on the type of platform, the attributes of the data and the condition of the cluster.
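To make the idea of automatic path selection concrete, here is a minimal Python sketch of a rule-based selector that weighs the same three inputs: platform type, data attributes, and cluster condition. It is purely illustrative. The names `JobContext` and `choose_execution_path`, the 10 GiB threshold, and the path labels are all hypothetical and are not drawn from DMX, whose actual selection logic is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobContext:
    """Hypothetical description of one transformation run."""
    platform: str            # e.g. "hadoop", "linux", "windows", "unix"
    input_bytes: int         # a data attribute: total input size
    free_cluster_slots: int  # cluster condition: idle task slots


def choose_execution_path(ctx: JobContext,
                          large_input_bytes: int = 10 * 1024**3) -> str:
    """Pick an execution path from simple, assumed rules.

    The same job definition is passed in regardless of platform;
    only the chosen path differs.
    """
    if ctx.platform == "hadoop":
        # Distribute only when the input is large and the cluster has capacity.
        if ctx.input_bytes >= large_input_bytes and ctx.free_cluster_slots > 0:
            return "distributed"
        return "single-node"
    # Non-cluster platforms run the job on the local machine.
    return "local"
```

A real product would presumably consider far more signals than this, but the sketch shows the general shape: the user describes the transformation once, and a separate layer decides how to run it.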
The goal, says Yoğurtçu, is to give business users and data scientists a run-time environment in which they can transform data in flight in a single step.
Other new capabilities available in version 8.0 of DMX include support for NoSQL data stores such as Apache Cassandra, HBase and MongoDB, and the ability to load data in parallel into Hive, Vertica, and Greenplum databases.
Users can now also manage Apache Hadoop transformations with customized dashboards based on operational metadata and RESTful application programming interfaces (APIs) that are embedded in Docker containers. They can also load Apache Spark engines with legacy mainframe data sets, including VSAM and binary sequential files with COBOL copybook metadata, using a new mainframe connector.
There is probably more data moving between platforms today than at any time in the history of IT. Depending on an IT specialist or developer to keep that data flowing only slows the enterprise down. By automating the process, enterprise IT organizations can turn mundane data integration and transformation tasks into a self-service capability, freeing IT staff to work on other important data tasks.