Most of the world's data is unstructured and is stored in continuously changing, user-defined directory structures with few rules about what types of data are stored, where they are stored and what each file contains. Ensuring this data is adequately managed and protected is an enormous challenge for IT departments and requires a significant investment in resources, so IT departments are increasingly looking for practical Big Data solutions they can apply.
With the explosion of data growth and the demand for rapid, ubiquitous digital collaboration, IT knows traditional data management methods can no longer keep pace, so they are looking for advanced solutions to protect their data. The key with Big Data is to get past all the hype and to learn more about the practical benefits, like finding exposed sensitive data, flagging malicious activity and identifying excessive access.
A research survey conducted by Varonis recently revealed that more than two thirds of IT professionals think Big Data should be a strategic priority. More than half of the respondents expect Big Data to be a strategic initiative over the next five years, but fewer than half felt there was a clear definition of Big Data, and even fewer felt they had adequate knowledge of Big Data products. When asked how they would like to use Big Data, the respondents had clear ideas - the top three most selected applications were: find at-risk sensitive data, identify possible malicious activity and find users with excessive access rights.
The results indicate that while the majority of organizations in this study will engage in Big Data strategic initiatives over the next five years, a broad understanding of Big Data remains elusive. With over 80 percent of respondents rating their knowledge of Big Data products as low, but identifying how they'd like to use Big Data (fixing access control and protecting data), two things are clear: IT needs more specific information about the practical applications of Big Data, and when they get this information, the majority will be ready to engage in a project.
Increasing complexity widens an already sizable information gap between end users, data and IT. Unstructured data is becoming more valuable and bigger, and uses more sophisticated formats: spreadsheets, presentations, video, email, audio, images. Every process is becoming digital, and each file tells a more complete story.
To derive maximum value from these files, organizations need to collaborate. As a result of collaboration, organizations are flattening and becoming horizontal. With so many digitized processes, multiple teams need to access many data sets for the organization to function. Consequently, data is growing at 50 percent year-over-year, and the metadata - the data about the data - is growing at 100 percent year-over-year. The complexity of managing the data is growing faster than the resources available to manage it, which calls for processes and technologies that identify and remediate exposed sensitive content, excessive authorization and abuse of access - Big Data analytics for human-generated content.
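To make the growth figures concrete, the short sketch below compounds the two rates quoted above over five years. It assumes, purely for illustration, that both rates stay constant and that data and metadata start at the same normalized volume of 1.0; the function name `project` is hypothetical.

```python
def project(initial: float, annual_growth: float, years: int) -> float:
    """Compound an initial volume at a fixed annual growth rate."""
    return initial * (1 + annual_growth) ** years


# Assumption: both volumes start at 1.0 (normalized) and the rates never change.
DATA_GROWTH = 0.50      # 50 percent year-over-year, as quoted in the article
METADATA_GROWTH = 1.00  # 100 percent year-over-year, as quoted in the article

for year in range(6):
    data = project(1.0, DATA_GROWTH, year)
    metadata = project(1.0, METADATA_GROWTH, year)
    print(f"year {year}: data x{data:.2f}, metadata x{metadata:.2f}")
```

Under these assumptions, after five years the data has grown to roughly 7.6 times its starting volume while the metadata has grown to 32 times, which is one way to see why manual processes fall behind.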
Until recently, most unstructured data protection activities, such as entitlement reviews, data usage audits, data owner confirmation and stale data identification, have been manual and error-prone, or not done at all for lack of controls and resources. In many cases, IT is unable to reliably identify business owners of data or involve them in the governance process. Determining who has access to a data set, which folders a user or group can access, and which permissions are unneeded can be a challenge, and often IT is completely unable to answer questions such as, "Who accessed or deleted my data?"
The trend for IT departments is to implement data governance software automation to reduce the burden on their personnel and budgets, and to ensure they are proactively protecting their critical data.
For software automation to provide full management and protection capabilities, it needs to non-intrusively collect critical metadata about unstructured data, such as who has access to data, who is using their access, who shouldn't have access, who owns the data, and what data is sensitive. Then once this metadata has been collected, processed, analyzed and presented, IT and data owners have the ability to make informed authorization and permissions maintenance decisions that are then programmatically executed - dramatically reducing IT overhead and manual backend processes.
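As a rough illustration of how such metadata can support a permissions decision, the sketch below compares who has access with who is actually using it and flags permissions with no recent activity. It is a minimal example under stated assumptions, not any product's actual method: the `Permission` and `AccessEvent` records, the `find_unused_permissions` function and the 90-day window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Permission:
    """Hypothetical record: a user is allowed to access a folder."""
    user: str
    folder: str


@dataclass(frozen=True)
class AccessEvent:
    """Hypothetical record: a user actually touched a folder on a given day."""
    user: str
    folder: str
    when: date


def find_unused_permissions(permissions, events, today, window_days=90):
    """Return permissions with no access activity inside the review window.

    Assumption: "no activity for window_days" is treated as a hint that access
    may be excessive; a data owner would still review each candidate.
    """
    cutoff = today - timedelta(days=window_days)
    recently_used = {(e.user, e.folder) for e in events if e.when >= cutoff}
    unused = [p for p in permissions if (p.user, p.folder) not in recently_used]
    return sorted(unused, key=lambda p: (p.folder, p.user))


# Illustrative data only.
permissions = [
    Permission("alice", "/finance"),
    Permission("bob", "/finance"),
    Permission("carol", "/hr"),
]
events = [
    AccessEvent("alice", "/finance", date(2024, 5, 20)),
    AccessEvent("bob", "/finance", date(2023, 11, 2)),
]

for p in find_unused_permissions(permissions, events, today=date(2024, 6, 1)):
    print(f"review candidate: {p.user} on {p.folder}")
```

With this sample data, bob's permission on /finance and carol's permission on /hr are flagged, because neither user has recorded activity in the window. The output is deliberately a list of review candidates rather than an automatic revocation, reflecting the point above that IT and data owners make the decision before it is executed.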
With the Big Data analytics for human-generated content that a metadata framework provides, organizations can effectively and automatically manage data access control, ownership, classification, entitlements and authorization processes on the platforms that host unstructured data. Data governance software automation enables organizations to expand digital collaboration boundaries safely while significantly increasing IT workforce productivity for daily data protection and management tasks.