Managing massive amounts of Big Data on a regular basis is a challenge that, at this point, most IT organizations are not going to be able to overcome.
To help IT organizations rise to the challenge, IBM recently moved to acquire Daeja Image Systems, a provider of software that makes it simpler to view large documents and files even when the application in which the document was created is not installed on the user’s device.
According to IBM Enterprise Content Management Business Leader Doug Hunt, the viewer software that IBM is gaining through its acquisition of Daeja is part of a larger enterprise content management (ECM) strategy for managing Big Data that IBM is in the process of forming.
At the core of that strategy, says Hunt, is first giving organizations the tools they need to access Big Data in any format, then providing the analytics tools needed to analyze that unstructured data and, ultimately, the governance tools needed to manage it all.
Hunt says ECM will provide the foundation through which all policies for managing Big Data at scale will be applied. Much as they do today for other data in the enterprise, Hunt says, ECM systems will be extended to apply policies and access controls to Big Data.
In fact, ECM working in close cooperation with text analytics applications, says Hunt, will provide both the insights that identify what data has value and the mechanism through which data that has no value can be disposed of in a defensible manner. That means that instead of having to save every piece of Big Data, organizations can implement policies across massive amounts of data that not only meet compliance guidelines, but also automate much of the process.
Arguably, one of the best ways to manage Big Data is to keep the essential data and then get rid of as much of the rest as possible. As Big Data governance becomes a more significant issue across the enterprise, what matters is not every piece of data, but rather the analyses that the business derives from the Big Data. As such, what needs to be saved, shared and ultimately made recoverable are the identifiable patterns in the reams of Big Data that, in aggregate, have a lot of value to the business but, in isolation, seem to be of little consequence.