For all the hype surrounding Big Data, the one constant has been SQL. Despite some early chatter about replacing SQL, it remains the lingua franca for accessing data sources big and small. The only real question now is the speed at which SQL queries can be made to run against multiple data sources.
To address that issue, MapR Technologies created Apache Drill, an open source SQL query engine designed to generate schemas on demand. Available with MapR's distribution of Hadoop, Drill is ultimately meant to work against multiple distributed data sources, says Jack Norris, chief marketing officer for MapR.
Because users no longer have to wait for IT to define schemas before analyzing data, Norris says Drill enables self-service exploration of massive amounts of data, using a SQL engine that he claims is orders of magnitude faster than any other SQL alternative. And because Drill is compatible with a variety of data formats, including JavaScript Object Notation (JSON), Norris says it allows SQL to be used at scale without forcing IT organizations to replace existing data repositories.
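To make the idea of generating a schema on demand more concrete, here is a minimal Python sketch. It is not Drill's implementation or API. It only illustrates the general schema-on-read approach, assuming a handful of made-up JSON records and two hypothetical helpers, `infer_schema` and `select`, that discover the fields at query time instead of relying on a schema defined in advance.

```python
import json


def infer_schema(records):
    """Collect every field seen in the records and the Python type names observed for it."""
    schema = {}
    for record in records:
        for field, value in record.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema


def select(records, columns, where=None):
    """Project the requested columns, returning None for fields a record lacks."""
    rows = []
    for record in records:
        if where is None or where(record):
            rows.append(tuple(record.get(col) for col in columns))
    return rows


# Made-up sample data: records do not all share the same fields.
raw_lines = [
    '{"name": "Ann", "age": 34}',
    '{"name": "Bob", "city": "Austin"}',
    '{"name": "Cy", "age": 51, "city": "Boston"}',
]
records = [json.loads(line) for line in raw_lines]

# The schema is discovered from the data itself at read time.
schema = infer_schema(records)

# Roughly: SELECT name, age FROM records WHERE age > 40
older = select(records, ["name", "age"], where=lambda r: r.get("age", 0) > 40)
```

The point of the sketch is that nobody had to declare the `city` or `age` columns before the query ran. A real engine would also have to handle nested data, conflicting types and distributed execution, none of which is attempted here.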
That last point is critical, because no matter how popular Hadoop or any other emerging data repository proves to be, most IT organizations are not going to simply abandon existing data warehouse investments. Instead, Norris says it’s clear that most IT organizations are on an extended journey that will involve melding Hadoop with a variety of database formats to create a next-generation data warehouse that is more of a logical entity than a physical construct. For that very reason, Information Builders, Jinfonet Software, MicroStrategy, Qlik, SAP, Simba, Tableau and TIBCO have all signaled their support for Drill.
Obviously, different IT organizations will begin that journey from different starting points and at different times. But the one thing they can count on is that some form of SQL will be their constant companion.