Do you really need to be told the volume of data grew last year? Probably not, although the fact that it grew 56 percent is a nice data point. That’s what the Aberdeen Group found.
What’s more interesting, according to David Linthicum, is that the Aberdeen Group report discovered business leaders want to access that data in real time.
“The report noted that 89 percent of enterprises that use real-time integration have the power to provide managers with accurate information when it is needed, as opposed to only 73 percent of organizations that do not use real-time services,” Linthicum writes in a recent blog post.
And of course, that’s where things can get pretty darn tricky, particularly when you’re talking about integration. Why? Because you’re no longer moving data out of back-end storage systems, where it’s (hopefully) been cleaned and gussied up for reporting. You’re now moving the data from its core transactional systems, and that introduces all kinds of new concerns, he explains.
“… because the data is real time, there are data externalization issues that also must be dealt with,” he writes. “This includes updating the real time data that is flowing to the decision maker, as the data changes over a given period of time. For instance, the ability to track factory production over an afternoon, as the production data changes minute-to-minute.
“In other words, we are moving from a report-oriented mentality to a dashboard-oriented mentality.”
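To make that shift a little more concrete, here is a minimal Python sketch of the factory example. It is only an illustration, not anything from Linthicum's post or the Aberdeen report: the `ProductionDashboard` class, the `batch_report` function and the sample events are all hypothetical. The report-style function only has totals once the whole batch has been read, while the dashboard-style object keeps its per-minute totals current as each event arrives.

```python
from collections import defaultdict
from datetime import datetime


class ProductionDashboard:
    """Hypothetical dashboard model: per-minute totals updated per event."""

    def __init__(self):
        self.units_per_minute = defaultdict(int)

    def record(self, timestamp: datetime, units: int) -> None:
        # Truncate to the minute so events in the same minute share a bucket.
        minute = timestamp.replace(second=0, microsecond=0)
        self.units_per_minute[minute] += units

    def snapshot(self) -> dict:
        # The view a dashboard would redraw after every update.
        return dict(sorted(self.units_per_minute.items()))


def batch_report(events):
    """Report-style alternative: totals exist only after the full batch is read."""
    totals = defaultdict(int)
    for timestamp, units in events:
        totals[timestamp.replace(second=0, microsecond=0)] += units
    return dict(sorted(totals.items()))


# Made-up factory events: (time the count was logged, units produced).
EVENTS = [
    (datetime(2013, 2, 20, 14, 0, 5), 3),
    (datetime(2013, 2, 20, 14, 0, 40), 2),
    (datetime(2013, 2, 20, 14, 1, 10), 4),
]

dashboard = ProductionDashboard()
for ts, units in EVENTS:
    dashboard.record(ts, units)
    # A usable view is available after every single event.
    for minute, total in dashboard.snapshot().items():
        print(minute.strftime("%H:%M"), total)
    print("---")
```

Both approaches end up with the same totals. The difference is when the numbers become available, and that is exactly where latency and resiliency start to matter.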
This is where data integration technology can make or break you, Linthicum warns.
I’ve no doubt that’s the case, but I will add a pinch of salt: He’s writing this for Pervasive’s Data Integration blog. Still, it’s hard to argue with the fact that old considerations take on a new urgency when you’re performing real-time data integration, including:
- Latency
- Resiliency
- Governance
- Security
You can also check out the full Aberdeen report, “Ever Harder and Faster: Managing the New Demands of Data Integration,” which is currently available for free download with registration.
Red Hat and Hortonworks Buddy Up on Hadoop File System Ecosystem
Now here’s an interesting partnership: Red Hat and Hortonworks engineers are working together to “accelerate the enablement of the broader file system ecosystem to be used with Apache Hadoop.”
I translate that as “They’re going to make it easier to use different file systems with Hadoop.”
The press release explains that they’ll be focusing on three areas. First, they’ll work on enhancing Apache Ambari, the open source project for provisioning and managing Hadoop clusters, so that it can also manage Hadoop-compatible file systems, such as GlusterFS. So, basically, you’d be able to manage alternative file systems for Hadoop through Ambari.
The second focus area is to create “generic test suites to validate compatibility between Hadoop and alternative file systems.” And finally, they’ll work on integrating Hortonworks Data Platform with Red Hat Storage so customers can process the data stored there.
“Since Red Hat Storage is POSIX-compliant, it makes it easy to connect to the enterprise applications and run Hadoop analytics on enterprise data to reduce duplication of data and save costs,” the press release states.
Bloor Group Explores ‘What Is Data Science’
There’s just a ton of interest in Data Science, both as a discipline and a possible career choice. If you’re interested in exploring both aspects of Data Science, you might want to check out the Bloor Group’s recent webinar, “What is Data Science.”
It features data scientist Dr. Geoffrey Malafsky, along with Bloor Group CEO Dr. Robin Bloor; Kevin Moran, the COO of Phasic Systems, Inc.; Jerry Best, an information technology and services consultant; and Anand Rao, a partner at PwC.
And as always, the event is hosted by Eric Kavanagh. You can learn more and register for an on-demand viewing on Information Management’s site. The slides can also be downloaded as a PDF at Inside Analysis.