As part of an effort to create a comprehensive approach to building Big Data environments, SAP this week announced that it will not only resell Hadoop distributions from Intel and Hortonworks but also put data virtualization hooks into Hadoop to better integrate the open source data management framework with the SAP in-memory computing platform.
According to Byron Banks, vice president of Big Data marketing for SAP, the goal is to bring two Big Data platforms with distinctly different attributes together in a single integrated portfolio. SAP also announced that it is setting up a service organization to help customers develop Big Data projects by drawing on the data science skills that SAP has developed internally.
The “hooks” that SAP is putting into Hadoop come in the form of the company’s Smart Data Access data virtualization platform. By virtualizing all the data sources, SAP allows organizations to launch queries using, for example, SQL, against any data source regardless of the format it is in. Hadoop is the first of several data sources to which SAP intends to extend its data virtualization capabilities as part of an architecture for managing data across the enterprise at a higher level of abstraction.
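To make the idea of querying differently formatted sources through one SQL interface concrete, here is a minimal sketch in Python. It is not SAP's Smart Data Access API, which the announcement does not describe. The sample data, table names and helper functions are all hypothetical, and the sketch copies rows into an in-memory SQLite database, whereas a real data virtualization layer would typically leave the data in place and push the query down to each source.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data standing in for two differently formatted sources.
ORDERS_CSV = "order_id,customer_id,amount\n1,10,250.0\n2,11,75.5\n3,10,120.0\n"
CUSTOMERS_JSONL = (
    '{"customer_id": 10, "name": "Acme"}\n'
    '{"customer_id": 11, "name": "Globex"}\n'
)


def read_csv_rows(text):
    """Parse CSV text into a list of dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def read_jsonl_rows(text):
    """Parse newline-delimited JSON into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def register_source(conn, table, rows):
    """Expose a list of dict rows as a SQL table.

    Assumption: this copies the rows into SQLite for simplicity; it only
    imitates the single-query-surface aspect of data virtualization.
    """
    columns = list(rows[0].keys())
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_list})')
    conn.executemany(
        f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})',
        [tuple(row[c] for c in columns) for row in rows],
    )


def total_by_customer():
    """Run one SQL query that joins the CSV source with the JSON source."""
    conn = sqlite3.connect(":memory:")
    register_source(conn, "orders", read_csv_rows(ORDERS_CSV))
    register_source(conn, "customers", read_jsonl_rows(CUSTOMERS_JSONL))
    query = """
        SELECT c.name, SUM(CAST(o.amount AS REAL)) AS total
        FROM orders AS o
        JOIN customers AS c
          ON CAST(o.customer_id AS INTEGER) = c.customer_id
        GROUP BY c.name
        ORDER BY c.name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for name, total in total_by_customer():
        print(name, total)
```

The caller writes a single SQL statement and never deals with the fact that one source is CSV and the other is JSON, which is the level of abstraction the article attributes to data virtualization.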
Banks says that customers are making it clear they want a unified approach to Big Data that encompasses Hadoop, existing data warehouse investments and emerging in-memory database platforms such as HANA. The end result, says Banks, is a more federated approach to managing Big Data that leverages existing investments alongside platforms such as HANA, which can process data in real time, while Hadoop is used for more batch-oriented applications. Banks adds that SAP sees Hadoop as a primary source of data that will feed complementary Big Data applications running on HANA.
The end goal is to make enterprise data readily available to any number of applications regardless of where it happens to be stored. Of course, where it’s stored will be determined by what format it’s in and how critical access to that information is to the organization at any given moment.