Hitachi Vantara announced today at the PentahoWorld 2017 conference that it will deliver an update to its Pentaho data integration and analytics software that lets users interact natively with Big Data applications based on the Apache Spark in-memory computing framework and the Apache Kafka messaging system for sharing streams of data. Pentaho was acquired by Hitachi in 2015 and is now part of the new Hitachi Vantara business unit, which also includes the servers and storage systems once offered by other units of Hitachi.
Arik Pelkey, senior director of Pentaho product marketing at Hitachi Vantara, says Pentaho 8.0 can now use Kafka to ingest Big Data processed on an instance of Apache Spark without requiring any intervention from a developer. Pentaho 8.0 also adds support for the Knox Gateway to authenticate users accessing Big Data repositories.
At the same time, Hitachi Vantara is making it simpler to scale out analytics applications with a Worker Node capability. Pelkey says this new capability, which is based on Docker containers and an implementation of the open source Mesos container orchestration software, makes it much simpler to add capacity to an analytics application.
In general, Pelkey says, the amount of data that organizations analyze is set to increase by a factor of 10 in the years ahead, with about a quarter of that data streaming into applications in real time. To deal with that expanded data pipeline, organizations are going to require analytics applications that can process massive amounts of data regardless of the original source.
“We’re taking the mystery out of Big Data,” says Pelkey.
In fact, there may come a day soon when organizations no longer distinguish between Big Data and any other type of data. Instead, the focus will shift to being able to access the right data at the right time.