Banks may be undermining their own Big Data efforts, according to a recent Information Week column.
“When faced with the requirements of a new big data initiative, banks too often only draw on prior experience and attempt to leverage familiar technologies and software-development-lifecycle (SDLC) methodologies for deployment,” writes Michael Flynn, managing director in AlixPartners’ Information Management Services Community.
The problem: Those technologies enforce structure and focus on optimizing processing performance. That means the data is aggregated and normalized in an environment that works against Big Data sets in three ways, Flynn explains:
- An inability to respond to changes in the data stream due to rigid schemas (see the sketch after this list)
- Potential problems with tracing data lineage
- Data governance problems because multiple data “constituents retain responsibility for an extended, multi-stage data flow”
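To make the rigid-schema point concrete, here is a minimal sketch (not from Flynn's column) contrasting schema-on-write with schema-on-read when an upstream feed adds a field. The table, field names and records (trade_id, counterparty and so on) are invented for illustration, and it uses only the Python standard library.

```python
import json
import sqlite3

# Schema-on-write: every incoming record must fit the predefined columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (trade_id TEXT, amount REAL, currency TEXT)")

records = [
    {"trade_id": "T1", "amount": 100.0, "currency": "USD"},
    # The upstream feed starts emitting a field the schema never anticipated.
    {"trade_id": "T2", "amount": 250.0, "currency": "EUR", "counterparty": "ACME"},
]

for rec in records:
    # The new field is silently dropped unless the table is migrated first --
    # the "failed adaptation" risk Flynn describes.
    conn.execute(
        "INSERT INTO trades VALUES (?, ?, ?)",
        (rec["trade_id"], rec["amount"], rec["currency"]),
    )

# Schema-on-read: store the raw record as-is and interpret it at query time,
# so the new attribute survives with no upfront schema change.
raw_store = [json.dumps(rec) for rec in records]
print([json.loads(r).get("counterparty", "n/a") for r in raw_store])
# ['n/a', 'ACME']
```

The point is not the specific tools but the posture: the second approach absorbs a changing data stream without a schema migration, which is exactly the flexibility Flynn argues legacy environments lack.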
It makes sense that legacy data environments might hinder Big Data, but Flynn goes several steps further, criticizing traditional approaches to project management, implementation and change management.
These traditional techniques for managing data projects are all built on the premise that the platform won't change much in the foreseeable future. Big Data platforms are the opposite, Flynn explains: their requirements are expected to keep evolving.
“Therefore, any rigidity in the deployment approach can pose immediate risks of stagnation and failed adaptation,” he writes.
Flynn contends that this accounts for the results of a recent study by the Said Business School at the University of Oxford, in which 91 percent of banks said they lacked the key skills necessary to execute Big Data more efficiently, and only 3 percent reported that their organizations had deployed Big Data on an ongoing basis.
It’s not because banks don’t care about Big Data, either. The same study found that 64 percent of those surveyed agreed that Big Data proficiency provides a competitive advantage.
So what’s the solution for banks or any organization?
Obviously, the infrastructure needs to change, and Flynn recommends the usual: distributed computing systems such as Hadoop or NoSQL databases.
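Flynn names the platforms only at a high level. As a rough, hypothetical illustration of the divide-and-recombine pattern behind Hadoop-style distributed computing (not anything from the column), here is a toy MapReduce in plain Python; the partitions and record text are made up.

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool

# Toy data partitions standing in for record slices spread across cluster nodes.
partitions = [
    ["payment received", "payment failed"],
    ["payment received", "transfer received"],
]

def map_partition(lines):
    # Map step: each worker counts terms in its own slice of the data.
    return Counter(word for line in lines for word in line.split())

def combine(total, partial):
    # Reduce step: partial counts are merged into a single result.
    total.update(partial)
    return total

if __name__ == "__main__":
    with Pool(processes=2) as pool:  # two local workers stand in for cluster nodes
        partials = pool.map(map_partition, partitions)
    print(reduce(combine, partials, Counter()).most_common())
    # e.g. [('payment', 3), ('received', 3), ('failed', 1), ('transfer', 1)]
```

The real frameworks add fault tolerance, data locality and storage on top of this, but the design choice is the same: push computation out to where each partition of the data lives rather than forcing everything through one rigidly modeled warehouse.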
When it comes to managing the actual implementations, he favors an Agile development methodology, which lets IT deliver a project rapidly, in iterative increments.
Neither suggestion is new advice; I've seen other experts recommend Agile development, too. It makes sense when you think about the data paradigm shift both IT and the business must make with Big Data.