Automation the Answer to Big Data Question

David Gibson
David Gibson is Vice President of Strategy at Varonis.

Most of the world's data is unstructured and is stored in continuously changing, user-defined directory structures with few rules about what types of data are stored, where they are stored and what each file contains. Ensuring this data is adequately managed and protected is an enormous challenge that requires a significant investment in resources, and IT departments are increasingly looking for practical Big Data solutions they can apply to it.

With the explosion of data growth and the demand for rapid, ubiquitous digital collaboration, IT departments know traditional data management methods can no longer keep pace, so they are looking for advanced solutions to protect their data. The key with Big Data is to get past the hype and learn more about the practical benefits, like finding exposed sensitive data, flagging malicious activity and identifying excessive access.

Big Data a Strategic Priority

A research survey conducted by Varonis recently revealed that more than two thirds of IT professionals think Big Data should be a strategic priority. More than half of the respondents expect Big Data to be a strategic initiative over the next five years, but fewer than half felt there was a clear definition of Big Data, and even fewer felt they had adequate knowledge of Big Data products. When asked how they would like to use Big Data, the respondents had clear ideas - the top three most selected applications were: find at-risk sensitive data, identify possible malicious activity and find users with excessive access rights.

The results indicate that while the majority of organizations in this study will engage in Big Data strategic initiatives over the next five years, a broad understanding of Big Data remains elusive. With over 80 percent of respondents rating their knowledge of Big Data products as low, but identifying how they'd like to use Big Data (fixing access control and protecting data), two things are clear: IT needs more specific information about the practical applications of Big Data, and when they get this information, the majority will be ready to engage in a project.

Increasing complexity widens an already sizable information gap between end users, data and IT. Unstructured data is becoming more valuable and bigger, and uses more sophisticated formats: spreadsheets, presentations, video, email, audio, images. Every process is becoming digital, and each file tells a more complete story.

IT's Collaborative Data Management Role

In order to derive maximum value from these files, organizations need to collaborate. The result of collaboration is that organizations are flattening, becoming horizontal. With so many digitized processes, multiple teams need to access many data sets for the organization to function. Consequently, data is growing at 50 percent year-over-year, and the metadata - the data about the data - is growing at 100 percent year-over-year. The complexity of managing the data is growing faster than the resources available to manage it, so organizations need processes and technologies to identify and remediate exposed sensitive content, excessive authorization, and abuse of access - Big Data analytics for human-generated content.
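To make the growth rates above concrete, here is a minimal sketch of the compounding arithmetic. The starting volumes (100 TB of data, 1 TB of metadata) and the five-year horizon are illustrative assumptions, not figures from the survey; only the 50 percent and 100 percent annual rates come from the text.

```python
def project(initial, annual_growth, years):
    """Return the projected volume for year 0 through `years`, compounding annually."""
    return [initial * (1 + annual_growth) ** year for year in range(years + 1)]

# Hypothetical starting volumes in terabytes (assumptions for illustration only).
data_tb = project(100.0, 0.50, 5)      # data grows 50 percent year-over-year
metadata_tb = project(1.0, 1.00, 5)    # metadata grows 100 percent year-over-year

# Metadata as a share of data volume, year by year.
metadata_share = [m / d for m, d in zip(metadata_tb, data_tb)]

for year, (d, m, share) in enumerate(zip(data_tb, metadata_tb, metadata_share)):
    print(f"year {year}: data {d:8.1f} TB, metadata {m:5.1f} TB, share {share:.1%}")
```

Under these assumptions the data volume grows roughly 7.6-fold in five years while the metadata grows 32-fold, so the metadata to be collected and analyzed becomes a steadily larger share of the total.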

Virtually all attempts by IT to manage and protect unstructured data have resulted in only partial success. The proposed solutions are manual and fundamentally static, while the data, its usage and location change all the time. At best, manual data and access reviews result in a series of 'snapshots' that try to capture the inherently dynamic nature of an organization's business owners, users, groups, permissions and data. These snapshots require a lot of time to create with manual methods - they are often stale before they are completed and reviewed.

Until recently, most unstructured data protection activities, such as entitlement reviews, data usage audits, data owner confirmation and stale data identification, have been manual and error-prone, or not done at all for lack of controls and resources. In many cases, IT is unable to reliably identify business owners of data or involve them in the governance process. Determining who has access to a data set, which folders a user or group can access, and which permissions are unneeded can be a challenge, and often IT is completely unable to answer questions such as, 'Who accessed or deleted my data?'

The trend for IT departments is to implement data governance software automation to reduce the burden on their personnel and budgets, and to ensure they are proactively protecting their critical data.

For software automation to provide full management and protection capabilities, it needs to non-intrusively collect critical metadata about unstructured data, such as who has access to data, who is using their access, who shouldn't have access, who owns the data, and what data is sensitive. Then once this metadata has been collected, processed, analyzed and presented, IT and data owners have the ability to make informed authorization and permissions maintenance decisions that are then programmatically executed - dramatically reducing IT overhead and manual backend processes.
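As a rough illustration of how collected metadata can drive such decisions, the sketch below compares who has access with who is actually using it and flags the difference for review. The folder paths, user names, record layout and the 90-day idle threshold are all hypothetical; this is not any particular product's data model or API.

```python
from datetime import datetime, timedelta

# Hypothetical permissions metadata: users permitted on each folder.
permissions = {
    "/finance/payroll": {"alice", "bob", "carol"},
    "/hr/reviews": {"alice", "dave"},
}

# Hypothetical access activity: (user, folder, time of an observed access).
access_events = [
    ("alice", "/finance/payroll", datetime(2012, 5, 1)),
    ("bob", "/finance/payroll", datetime(2011, 1, 15)),
    ("alice", "/hr/reviews", datetime(2012, 4, 20)),
]

def unused_permissions(permissions, access_events, as_of, max_idle_days=90):
    """Return {folder: [users]} for users who hold access but have not used it recently."""
    cutoff = as_of - timedelta(days=max_idle_days)

    # Record the most recent access per (user, folder) pair.
    last_seen = {}
    for user, folder, timestamp in access_events:
        key = (user, folder)
        if key not in last_seen or timestamp > last_seen[key]:
            last_seen[key] = timestamp

    # Users with no recorded access are treated as idle.
    report = {}
    for folder, users in permissions.items():
        idle = sorted(
            user for user in users
            if last_seen.get((user, folder), datetime.min) < cutoff
        )
        if idle:
            report[folder] = idle
    return report

report = unused_permissions(permissions, access_events, as_of=datetime(2012, 5, 23))
for folder, users in report.items():
    print(folder, "-> review access for:", ", ".join(users))
```

In this sketch the output is only a list of candidates. Consistent with the process described above, the data owner would confirm each recommendation before any permission change is executed programmatically.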

With the Big Data analytics for human-generated content that a metadata framework provides, organizations can effectively and automatically manage data access control, ownership, classification, entitlements and authorization processes on the platforms that host unstructured data. Data governance software automation enables organizations to expand digital collaboration boundaries safely while significantly increasing IT workforce productivity for daily data protection and management tasks.
