Databricks Marries Apache Spark to Serverless Computing

Written By

Mike Vizard

Jun 7, 2017

At the Spark Summit 2017 conference this week, Databricks announced that it will make an instance of the Apache Spark in-memory computing framework available as a managed cloud service running on top of a serverless computing environment.

In addition, Databricks revealed that it will make curated instances of machine learning algorithms and tools available via its cloud service, along with an application programming interface (API) through which IT organizations can stream data into Spark five times faster.

Databricks CTO Matei Zaharia says all three announcements are intended to reduce the amount of time it takes for organizations to start getting a return on their investments in Apache Spark. Rather than having to acquire, configure and deploy all the infrastructure needed to run an instance of Apache Spark, Zaharia says it’s much simpler for the average end user to invoke a cloud service.

“In the early days of Spark, most of the end users were very technical,” says Zaharia. “Now we’re seeing a lot more other types of users that just want to use the data.”

The Databricks serverless computing framework, says Zaharia, is based on an event-driven architecture that makes infrastructure resources instantly available and makes it simpler to scale Apache Spark usage up and down as required using infrastructure provided by Amazon Web Services (AWS).

Zaharia notes that as usage of Apache Spark has expanded, the number of data sources it’s being tied into now extends well beyond Apache Hadoop clusters. In addition, Zaharia notes that SQL is now the primary interface being used to launch queries against data stored in Apache Spark systems.

In general, serverless computing frameworks are still in their infancy. But as they continue to evolve, it’s already clear organizations of all sizes are about to experience much less friction in terms of how any given application consumes IT infrastructure resources.

Mike Vizard

Michael Vizard is a seasoned IT journalist with nearly 30 years of experience writing about and editing coverage of enterprise IT issues. He is a contributor to publications including ProgrammableWeb, IT Business Edge, CIOinsight and UBM Tech. He formerly was editorial director for Ziff-Davis Enterprise, where he launched the company’s custom content division, and has also served as editor in chief for CRN and InfoWorld. He has also held editorial positions at PC Week, Computerworld and Digital Review.
