Data is everywhere. Close inspection of a runner’s shoes, a dog’s collar, a prescription bottle, or even a keg of beer could reveal a flow of hidden data. And if your local pub isn’t analyzing beer consumption through connected kegs today, it likely will be tomorrow. Analysts predict that the Internet of Things (IoT) will grow to encompass as many as 200 billion connected devices by 2020.
Couple the exploding IoT with other fast-growing data sources such as specialized cloud applications, social media feeds and mobile devices, and it’s clear that there is no shortage of data for businesses to mine. Unfortunately, just as businesses have more data than ever to monetize, the data itself is becoming increasingly uncooperative as it ramps up in volume, variety and velocity, with no end in sight.
As the availability of data exponentially increases, unprecedented opportunities exist to do all kinds of amazing things: monitor illegal deforestation, improve population health, reduce traffic jams — and, yes, sell more beer. But with these unprecedented opportunities come unprecedented data wrangling challenges.
So how do you turn opportunity into insight? In this slideshow, Rob Consoli, senior vice president of sales and marketing, North America at Liaison Technologies, has compiled five tips on how to wrangle uncontrolled data flow in your enterprise.
How to Deal with a Data Avalanche
Step 1: Stock Your Data Lake
You may not yet know the value of all the data available to your organization, but it’s time to start capturing as much as possible for future initiatives. Data lakes are purpose-built to store and process large volumes and varieties of data in today’s Big Data world. A data lake’s efficiency comes from its inverse approach to data storage. Rather than enforce rigid schemas upon arrival as traditional data warehouses and relational databases do, data lakes allow all types of data, structured and unstructured, to exist, unconstrained, in raw format. This approach defers the time-consuming task of data modeling until a time when the enterprise has a clear idea of what questions it would like to ask of the data.
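To make the "store raw now, model later" idea concrete, here is a minimal sketch of applying a schema at read time. The event records, field names and the `read_pours` helper are all hypothetical, invented for illustration, and a real data lake would hold files in distributed storage rather than a Python list.

```python
import json

# Hypothetical raw events, landed in the lake exactly as they arrived.
# No schema was enforced on write, so shapes and types vary.
RAW_EVENTS = [
    '{"device": "keg-17", "poured_ml": 473, "ts": "2016-03-01T18:04:00"}',
    '{"device": "collar-3", "steps": 1200}',
    '{"device": "keg-17", "poured_ml": "355", "ts": "2016-03-01T18:09:00"}',
]


def read_pours(raw_lines):
    """Apply a schema at read time: yield only pour records, with typed fields."""
    for line in raw_lines:
        record = json.loads(line)
        if "poured_ml" not in record:
            continue  # irrelevant to this question; the raw record stays in the lake
        yield {
            "device": str(record["device"]),
            "poured_ml": int(record["poured_ml"]),  # normalizes 355 and "355" alike
        }


def total_poured_ml(raw_lines):
    """Answer one specific question using the schema defined above."""
    return sum(r["poured_ml"] for r in read_pours(raw_lines))


print(total_poured_ml(RAW_EVENTS))
```

The modeling work (which fields matter, what types they should have) happens only in `read_pours`, once the question is known. A different question could define a different reader over the same raw records.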
Step 2: Manage Your Metadata
There’s the data itself and then there’s the data about data, called metadata. Metadata has always been an important component of data governance, but in today’s world of unstructured data, it’s become even more critical. Without schemas or relational databases to give data built-in context and consistency, metadata must serve as the bonding agent between widely disparate data types. This requires building a robust metadata repository to store and catalog descriptive metadata (used to find and identify data content), structural metadata (which defines how the components of an object are organized), and administrative metadata (technical information such as file type, size, or creation date).
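The three kinds of metadata can be pictured as fields on a catalog entry. The sketch below is an in-memory toy, assuming hypothetical class names (`MetadataEntry`, `MetadataCatalog`), file paths and keywords; a production repository would persist entries and offer far richer search.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataEntry:
    """One catalog record for a raw object; all field contents are illustrative."""
    path: str
    descriptive: dict = field(default_factory=dict)     # helps find and identify content
    structural: dict = field(default_factory=dict)      # how the object is organized
    administrative: dict = field(default_factory=dict)  # file type, size, creation date


class MetadataCatalog:
    """A toy in-memory metadata repository keyed by object path."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.path] = entry

    def find_by_keyword(self, keyword):
        """Return the paths whose descriptive keywords include `keyword`."""
        wanted = keyword.lower()
        return sorted(
            path
            for path, entry in self._entries.items()
            if wanted in (k.lower() for k in entry.descriptive.get("keywords", []))
        )


catalog = MetadataCatalog()
catalog.register(MetadataEntry(
    path="raw/kegs/2016-03.jsonl",
    descriptive={"keywords": ["beer", "consumption"]},
    structural={"layout": "one JSON object per line"},
    administrative={"file_type": "jsonl", "size_bytes": 18234, "created": "2016-03-31"},
))
catalog.register(MetadataEntry(
    path="raw/collars/2016-03.csv",
    descriptive={"keywords": ["pets", "activity"]},
    structural={"layout": "CSV with header row"},
    administrative={"file_type": "csv", "size_bytes": 9021, "created": "2016-03-31"},
))

print(catalog.find_by_keyword("beer"))
```

Because the raw files carry no schema of their own, the catalog entry is the only place that records what each file contains and how to read it.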
Step 3: Stop Muddling with Middleware
The decades-old middleware approach to integration (e.g., ESBs, EAI, ETL) came of age when ERPs were the heavyweights of enterprise software. The middleware model was well matched to the integration needs of its time. Today, however, its large installation footprint, heavy reliance on network processing, and use of older languages and protocols make it a less than ideal integration tool, because agility above all else is required to accommodate ever-changing cloud endpoints.
Given the long implementation cycles and high infrastructure and personnel costs, IT organizations should no longer invest in keeping middleware implementations current. While it’s probably unrealistic to rip and replace, steps should be taken to begin migrating integration functionality to more modern cloud-based models such as data platform as a service (dPaaS).
Step 4: Outsource Where Possible
The end goal of all data operations is to uncover insights that drive revenue. But all too often, organizations divide their resources among all facets of the undertaking — including the more mundane operations of integration and data storage. At a time when the cloud is proving to be a superior delivery method for technology and infrastructure, and more IT functions than ever are being delivered as managed services (e.g., network monitoring, backup operations, email), there’s no need to reinvent the wheel. Leverage what other experts have to offer so that your experts are given the breathing room to take your organization to new heights.
Step 5: Avoid Analysis Paralysis
You’re all set to discover insights, but where to start? Avoid the paralyzing paradox of choice by starting small — and smart. Identify low-hanging fruit that can offer confidence — and an arsenal of lessons learned — as you progress to more complex initiatives.