Editor’s Note: This is part of a series on the factors changing data analytics and integration. The first post covered cloud infrastructure.
It’s a truism that technology changes quickly and ages fast, and yet, despite massive evolution in networks and computing, not much changed for data until Big Data came along.
To be fair, Big Data was first seen as a natural extension of the relational database, just with larger volumes of data and faster processing. Almost immediately, though, vendors like IBM and research firms like Gartner pushed the definition of Big Data to include other data types: semi-structured and unstructured data, delivered at high speed, which can mean real-time, near-time and streaming data or, as I privately call it, all-time data.
“These days, unstructured data is not contained in the simple raw data storage systems from years ago, nor is it all binary data, such as videos or audio,” wrote David Linthicum, a cloud and data integration consultant, in a recent whitepaper. “The growth pattern is in unstructured data that is also complex data. This means that we’re dealing with massive amounts of data that’s missing metadata. Moreover, that data is typically related to other structured or unstructured data, but those relationships are not tracked within the data storage systems.”
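To make Linthicum’s point concrete, here is a small, purely hypothetical sketch (the records and field names below are invented for illustration, not taken from any vendor’s system): a semi-structured support-chat event that refers to the same customer as a relational row, with nothing in either store recording that relationship.

```python
# Hypothetical illustration of "complex data": a relational customer row and a
# schemaless chat event that mention the same company, with no tracked link.

customers_row = {
    "customer_id": 4417,
    "name": "Acme Manufacturing",
    "region": "US-East",
}

chat_event = {
    # No schema, little metadata -- just whatever the chat widget captured.
    "ts": "2014-09-18T14:02:11Z",
    "channel": "support-chat",
    "body": "Acme Manufacturing reports the nightly load failed again.",
    "attachments": ["screenshot-004.png"],
}

# The relationship between chat_event and customers_row exists only implicitly,
# in the free text itself -- exactly the untracked relationship described above.
```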
These complex data types are richer and more diverse than traditional data. The problem is that most organizations are still running networks, systems and applications designed for relational data.
Why it matters:
That disconnect is why Informatica CEO Sohaib Abbasi says these new data types are disrupting both the traditional data and analytics infrastructure. He’s not alone; most experts agree that how we interact with data — from integration to network capacities — will require a significant overhaul if organizations are going to stay competitive.
Mark Shilling, a principal at Deloitte Consulting LLP and leader of its US Information Management practice, summed up the problem nicely when I spoke with him this week about the future of analytics.
“If you think about it, if the volume of information is doubling every two to three years, that means that the CIO’s infrastructure is quickly becoming outdated,” Shilling said. “It’s likely there are processing bottlenecks in that environment, it’s likely there are significant cost pressures and that it’s a burden to manage, govern and secure in different places within the scope of their responsibility.”
“They have some foundational infrastructure that needs to be addressed that becomes almost non-negotiable to run the business.”
Loraine Lawson is a veteran technology reporter and blogger. She currently writes the Integration blog for IT Business Edge, which covers all aspects of integration technology, including data governance and best practices. She has also covered IT/Business Alignment and IT Security for IT Business Edge. Before becoming a freelance writer, Lawson worked at TechRepublic as a site editor and writer, covering mobile, IT management, IT security and other technology trends. Previously, she was a webmaster at the Kentucky Transportation Cabinet and a newspaper journalist. Follow Lawson at Google+ and on Twitter.