DBAs Need to Evolve into Data Scientists

Michael Vizard

Database administrators optimize database performance largely by managing how data is distributed across a complex environment. That generally requires a lot of arcane skills, not the least of which is partitioning a database horizontally, also known as "sharding," so that the database can more easily scale across multiple processors and systems.

As database performance has become a bigger issue in the era of the cloud, the need to shard the database has grown sharply. But sharded databases are not only complex to manage; they are also prone to errors, simply because there is more that can go wrong.
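To make the idea concrete, here is a minimal sketch of application-level sharding, written in Python with in-memory dictionaries standing in for separate database servers. The names `shard_for` and `ShardedStore` are hypothetical and chosen only for illustration; the hash-modulo routing shown is one common scheme, not how any particular product works.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (illustrative scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy store: each dict stands in for one database server (assumption)."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, row: dict) -> None:
        # Single-key writes go to exactly one shard.
        self.shards[shard_for(key, len(self.shards))][key] = row

    def get(self, key: str):
        # Single-key reads also touch exactly one shard.
        return self.shards[shard_for(key, len(self.shards))].get(key)

    def count_all(self) -> int:
        # A query spanning all data must visit every shard and merge results.
        return sum(len(shard) for shard in self.shards)


def keys_that_move(keys, old_count: int, new_count: int) -> int:
    """Count keys whose shard changes when the number of shards changes."""
    return sum(1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count))


store = ShardedStore(num_shards=4)
for i in range(1000):
    store.put(f"user:{i}", {"id": i})

print(store.get("user:42"))
print(store.count_all())
print(keys_that_move([f"user:{i}" for i in range(1000)], 4, 5))
```

Even in this toy version, the application has to know the routing rule, any query that spans keys has to fan out across shards, and adding a shard forces a share of the existing keys to be relocated. Those are the places where extra management work and errors tend to creep in.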

But what if there were never any need to shard a database or, for that matter, perform almost any other arcane database optimization task? That's the thinking that went into the development of the Clustrix distributed database system, which allows MySQL applications to run transparently on a database appliance loaded with solid-state disks (SSDs), eliminating the need to shard the database.

As part of an effort to increase exposure to its technology, Clustrix this week made available a free software-only version that developers can use to create applications for the Clustrix appliance. According to Mark Sarbiewski, chief marketing officer for Clustrix, building applications with the Clustrix Development Kit lets developers get familiar with the technology without having to commit to the expense of buying the Clustrix platform until the application is ready to be deployed in production.

Clustrix is part of an emerging "NewSQL" movement, which calls for the continued use of SQL to access Big Data by replacing existing database engines with platforms built on a parallel database architecture. It can invoke sophisticated algorithms for managing data that allow applications to scale without relying on sharding.

That level of automation raises the specter of widespread DBA unemployment. But in reality, Sarbiewski says, most businesses don't want DBAs who concentrate mainly on database maintenance tasks. Instead, they want DBAs to evolve into data scientists and chief data officers who focus on generating the most business value possible out of all the data they collect. DBAs are not the only group within IT that aspires to take on that role, but they do have the inside track when it comes to understanding how the company's data is currently managed.

In the coming Big Data era, it's clear that IT organizations are going to need to automate the management of massive amounts of data more than ever. There's no place better for that to happen than within the actual database itself.

Feb 10, 2012 10:59 AM Doron Levari says:

Disclaimer: I work for ScaleBase.

I agree that scale challenges are enormous, and that self-implementing sharding is very difficult and not recommended for non-expert hands.

However, solutions do exist to enable out-of-the-box, automatic, transparent sharding++. This way anyone can benefit from the advantages of sharding while avoiding self-implementation hell, enjoy good old MySQL databases, and leverage existing skills, management, etc.

ScaleBase is a leading solution for transparent sharding. Try it out!

