With the rise of a variety of so-called NoSQL databases, there’s no doubt that data management has become more complex. What is surprising is how constant the use of SQL has remained.
Looking to extend the reach of SQL into new data management frameworks, Oracle today announced Oracle Big Data SQL, an implementation of Hadoop, based on the Cloudera distribution that Oracle resells, that can now support joins across tables stored in an Oracle database.
Oracle Big Data SQL is available only on the Oracle Big Data Appliance. Neil Mendleson, vice president of product development, Big Data and Analytics, says it represents the unification of structured data stored in an Oracle database with the massive amounts of unstructured data stored in Hadoop. As far as the average SQL query is concerned, Hadoop appears to be just another Oracle database table, says Mendleson.
Mendleson says the ability to join data across an Oracle RDBMS and Hadoop is made possible by two things: the way Oracle manages storage nodes inside an Oracle Big Data Appliance, and a capability Oracle developed previously that lets a SQL query access external database tables.
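The general idea of one SQL query joining a table in the database with a table held in a separate store can be sketched in a few lines. This is only an analogy and not Oracle Big Data SQL itself: it uses Python's built-in sqlite3 module and SQLite's ATTACH DATABASE, and the names customers, external_store and clicks are hypothetical stand-ins for an Oracle table and data kept in Hadoop.

```python
import sqlite3


def join_across_stores():
    """Join a table in the main database with a table in an attached store."""
    # Main store: stands in for the relational database (an assumption).
    conn = sqlite3.connect(":memory:")

    # Separate store: stands in for data held outside the database.
    # It is attached first, before any data is written.
    conn.execute("ATTACH DATABASE ':memory:' AS external_store")

    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE external_store.clicks (customer_id INTEGER, url TEXT)"
    )

    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO external_store.clicks VALUES (?, ?)",
        [(1, "/home"), (1, "/pricing"), (2, "/home")],
    )

    # To the query, the attached table looks like any other table.
    rows = conn.execute(
        """
        SELECT c.name, COUNT(*) AS click_count
        FROM customers AS c
        JOIN external_store.clicks AS k ON k.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(join_across_stores())
```

The point of the sketch is that the query text does not change depending on where the second table lives, which is the property Mendleson describes. How Oracle achieves that across its storage nodes is not shown here.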
SQL is the language of data in which organizations of all sizes have invested billions of dollars, and Mendleson says that despite the rise of many alternatives, customers have been reluctant to give it up. In fact, when organizations do employ an alternative to a relational database, more often than not they use SQL to access it.
Oracle Big Data SQL, adds Mendleson, gives IT organizations a way to keep using SQL across both environments while applying a consistent set of data governance and security policies.
Mendleson says Oracle Big Data SQL runs on an x86-based instance of the Oracle Big Data Appliance, part of a deliberate effort to make the instance of Cloudera that Oracle provides affordable to organizations of any size. As usage of Cloudera expands, Mendleson says, the Oracle Big Data Appliance is designed to scale both up and out.
Now that organizations are getting comfortable with multiple types of engines for managing different classes of data, the issue going forward isn’t so much replacing one with another as figuring out how to employ them effectively in concert. In that context, SQL remains the best place to start.