Databricks Marries Apache Spark to Serverless Computing

Written By
Mike Vizard
Jun 7, 2017

At the Spark Summit 2017 conference this week, Databricks announced that it will make an instance of the Apache Spark in-memory computing framework available as a managed cloud service running on top of a serverless computing environment.

In addition, Databricks revealed it will make curated instances of machine learning algorithms and tools available via its cloud service, along with an application programming interface (API) through which IT organizations can stream data into Spark five times faster.

Databricks CTO Matei Zaharia says all three announcements are intended to reduce the time it takes organizations to start seeing a return on their investments in Apache Spark. Rather than having to acquire, configure and deploy all the infrastructure needed to run an instance of Apache Spark, Zaharia says it’s much simpler for the average end user to invoke a cloud service.

“In the early days of Spark, most of the end users were very technical,” says Zaharia. “Now we’re seeing a lot more other types of users that just want to use the data.”

The Databricks serverless computing framework, says Zaharia, is based on an event-driven architecture that makes infrastructure resources instantly available and makes it simpler to scale Apache Spark usage up and down as required using infrastructure provided by Amazon Web Services (AWS).

Zaharia notes that as usage of Apache Spark has expanded, the range of data sources it is tied into now extends well beyond Apache Hadoop clusters. In addition, Zaharia notes that SQL is now the primary interface used to launch queries against data stored in Apache Spark systems.

In general, serverless computing frameworks are still in their infancy. But as they continue to evolve, it’s already clear organizations of all sizes are about to experience much less friction in terms of how any given application consumes IT infrastructure resources.

Michael Vizard is a seasoned IT journalist, with nearly 30 years of experience writing and editing about enterprise IT issues. He is a contributor to publications including Programmableweb, IT Business Edge, CIOinsight and UBM Tech. He formerly was editorial director for Ziff-Davis Enterprise, where he launched the company’s custom content division, and has also served as editor in chief for CRN and InfoWorld. He also has held editorial positions at PC Week, Computerworld and Digital Review.