While there is no shortage of ways to invoke SQL against Hadoop, the Presto SQL engine developed by Facebook has proven it can scale. This week, Teradata threw its weight behind the open source Presto project, pledging not only to provide support services but also to develop additional drivers and security technologies.
Justin Borgman, vice president and general manager of the Teradata Center for Hadoop, says that beyond supporting Hadoop, Presto has the added benefit of being able to work with additional relational and NoSQL databases. For example, Teradata plans to support Presto running against the Cassandra NoSQL database, which many organizations are starting to adopt in transaction processing environments.
One of the things that sets Presto apart from other SQL engines for Hadoop, Borgman adds, is that it supports both real-time and batch-oriented applications. Most existing SQL engines assume that the only use cases they need to support are Hadoop applications running in batch mode.
In addition to backing Presto, Teradata this week unveiled management tools for Hadoop-based “data lakes,” along with RainStor 7.0, a SQL-compatible data archiving platform for Big Data that now supports Teradata QueryGrid software alongside several distributions of Hadoop.
Teradata clearly views Hadoop and other sources of Big Data as a natural extension of its data warehouse platform. Given the number of data warehousing applications most organizations already have in place, that’s probably a safe assumption. What is far less clear is which SQL engine will be used to logically unify all those data sources.