Big Data is changing things, and not just because it requires shiny, new solutions such as Hadoop or Apache whatsit-of-the-week. The more organizations use and assimilate Big Data, the more obvious it becomes that IT will need to reimagine some old standards in the data toolbox.
Why? The obvious reason is that standard data tools aren’t designed to handle unstructured or high-velocity data. But other issues unique to Big Data will require us to rethink the tools we’re using to manage, analyze and present the data. Here are two that have been in the news recently:
The Executive Dashboard
Executive dashboards were created over a decade ago to help leaders visualize specific enterprise metrics, such as key performance indicators. Not a lot has changed since then. That’s a problem in the era of Big Data, when insight is gained not so much through reporting as it is through exploration.
Their "simplicity makes them too brittle for today's high-speed data and advanced analytics," according to this IT World article explaining why the dashboard should be replaced. The piece quotes experts who spoke at the recent O'Reilly Strata + Hadoop World conference, including Sharmila Shahani-Mulligan, CEO and co-founder of Big Data startup ClearStory Data. Dashboards restrict the data, and so restrict what you can see, which makes them ill-suited to leveraging Big Data assets, she said.
“You can’t really dig in and see what is happening underneath the visuals,” Shahani-Mulligan said.
Data Migration Tools
For decades, moving data has been a relatively simple, if time-consuming and sometimes expensive, process. But the standard tools, including ETL tools, are not designed for moving large data sets, so data migration becomes a much longer conversation when you're dealing with Big Data.
Of course, your options will vary by what data you're moving and where you're moving it — for instance, on-premises data replication or migration will look different than a full-scale cloud migration. Even something as straightforward as, say, moving data to Hadoop can be overwhelming for many organizations, according to this recent InfoWorld article:
In theory, getting data into and out of Hadoop is well within the capacity of both the software and its users. Apache’s Sqoop project was created to deal with Hadoop import and export, with native support for the usual suspects: MySQL, Oracle, PostgreSQL, and HSQLDB. But not everyone is comfortable doing the work themselves, so vendors are offering polished import/export solutions that require less manual labor.
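To make the Sqoop workflow mentioned above a little more concrete, here is a rough sketch of what a do-it-yourself import and export might look like at the command line. The connection string, credentials, table names and HDFS paths below are hypothetical placeholders, not details from the article, and real migrations would need tuning (mapper counts, split columns, incremental loads) that this leaves out:

```shell
# Pull a relational table (hypothetical MySQL "orders" table) into HDFS.
# -P prompts for the password instead of putting it on the command line.
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/warehouse/orders \
  --num-mappers 4

# And the reverse: push results computed in Hadoop back out to the database.
sqoop export \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table order_summaries \
  --export-dir /data/warehouse/order_summaries
```

Even this simple case assumes a JDBC driver on the classpath, network access from the cluster to the database, and matching schemas on both ends — a taste of the manual labor the polished vendor tools are selling their way around.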
You can also hire the migration out to an IT services firm, or obtain help from one of the Hadoop distributors. Vendors such as Syncsort, Attunity and Diyotta also have developed options to help organizations. Even so, some argue it’s better not to move the data — although, as more data moves to the cloud, that may prove impractical.
Regardless of how you solve it, data migration is now a problem to be solved, and that's a far cry from the good old days when it was just what IT did. Experts such as Tom Davenport say that even with a data scientist on staff, Big Data needs better tools just to provide meaningful results.
On the bright side, at least there are early adopters to follow and implementations from which to learn. For a fun flashback read, check out this 2011 account from then-Facebook engineer Paul Yang about how the data infrastructure team migrated the (then) largest Hadoop cluster in the world.
Loraine Lawson is a veteran technology reporter and blogger. She currently writes the Integration blog for IT Business Edge, which covers all aspects of integration technology, including data governance and best practices. She has also covered IT/Business Alignment and IT Security for IT Business Edge. Before becoming a freelance writer, Lawson worked at TechRepublic as a site editor and writer, covering mobile, IT management, IT security and other technology trends. Previously, she was a webmaster at the Kentucky Transportation Cabinet and a newspaper journalist. Follow Lawson at Google+ and on Twitter.