IBM Embraces Spark and Gives Open Source Access to Its Machine Learning Technology

Written By
Mike Vizard
Jun 15, 2015

To build on the momentum behind the open source Apache Spark in-memory computing framework, IBM today announced that it is making a major commitment to Spark. That commitment takes the form of IBM SystemML machine learning software, which IBM will donate to a project that 3,500 IBM researchers in a dozen labs are already working on.

Joel Horwitz, director of portfolio marketing for the IBM Analytics Platform, says that IBM views the in-memory cluster computing framework as a foundational component of an emerging “insight economy” where analytics are processed in real time alongside transactions. As such, IBM will embed Apache Spark software into all of its analytics and e-commerce software, says Horwitz.

In addition, Horwitz says that IBM will offer Spark on its SoftLayer cloud. It will also offer an instance of Spark that can be invoked as a service on the IBM Bluemix platform-as-a-service (PaaS) environment and provisioned in as little as 10 minutes. One of the things that makes this possible, says Horwitz, is that the application programming interfaces (APIs) created for Apache Spark are already well defined.

Horwitz says IBM is committed to making additional contributions to the project as it continues to invest in machine learning applications designed to, for example, advance gene sequencing or optimize transportation routes using data collected from millions of Internet of Things (IoT) endpoints. Horwitz adds that IBM is committed to extending the number of programming languages that can be used to create Spark applications. Spark itself, notes Horwitz, is written in Scala, a language that runs on the Java Virtual Machine (JVM).

IBM will also open a Spark Technology Center in San Francisco. The company is pledging to educate at least 1 million data scientists and data engineers on Spark through extensive partnerships with AMPLab, DataCamp, MetiStream, Galvanize and Big Data University MOOC. 

Spark is an in-memory compute engine, typically layered on top of Hadoop, and does not actually store data itself. Even so, Horwitz says Spark is becoming part of the logical data warehouse that is starting to emerge in Big Data environments. In fact, Spark can not only run some workloads as much as 100 times faster than standard Hadoop MapReduce, according to the Spark project, it can also sharply reduce the number of machines needed in a cluster to process Big Data.

Spark is a top-level Apache open source project that was originally developed at the AMPLab at the University of California, Berkeley, and is now backed commercially by Databricks, a company founded by Spark's creators. Horwitz says that IBM views Spark today as being as significant an open source project as Linux itself. The challenge, of course, is turning what is clearly still an emerging, immature technology into something that can be deployed in support of production applications across the enterprise.

Michael Vizard is a seasoned IT journalist, with nearly 30 years of experience writing and editing about enterprise IT issues. He is a contributor to publications including Programmableweb, IT Business Edge, CIOinsight and UBM Tech. He formerly was editorial director for Ziff-Davis Enterprise, where he launched the company’s custom content division, and has also served as editor in chief for CRN and InfoWorld. He also has held editorial positions at PC Week, Computerworld and Digital Review.
