Here’s Infochimps CEO Jim Kaskade’s rule of thumb on Big Data projects: 100 percent go over schedule, 100 percent go over budget, and half fail.
That’s an alarming statement, but it’s based on an Infochimps survey of both customer and non-customer organizations. During a recent IT Business Edge Q&A with Kaskade, he shared the three primary reasons for failure identified in the survey:
- Inaccurate scope
- Technical roadblocks
- Political silos
Inaccurate scope is a huge issue for companies, he said, and often happens because organizations just jump in without a real plan. Infochimps is a cloud-based analytics company, and Kaskade said clients often come to the company after a poorly scoped Big Data project. But the problem is larger: When the company surveyed over 300 CIOs from outside its customer list, it found that 58 percent cited inaccurate scope as the cause of their Big Data failures.
So how can you avoid the inaccurate scope pitfall? Kaskade shared 12 steps for ensuring that a Big Data project is scoped well:
1. Recruiting and training. Figure out the team you’ll need and the skills it must have. “If you’re not mindful of the team that you need to make it successful, you better learn what it is you need,” he said.
2. Business discovery. You need to understand the broader use cases for Big Data, and which use case you’re trying to address with your project. “You’re prioritizing them not just based on revenue impact, but political friction,” he added.
3. Information discovery and architecture. This is the nitty-gritty technology work of discovering which data sources you need to support your target use cases. Then you develop an architecture that supports them.
4. Design. After you’ve established the information architecture, you need to design the Big Data part, including the hardware, software, analytics and application stacks. A note from me: design is not the same as building, and people tend to forget that.
5. Procurement. Do you have the right hardware? If you’re going to the cloud, shop around. Buy the software and whatever else you identified in the design step.
6. Installation. Install everything you just bought.
7. Catalog. You know how bands do a sound check, testing every mike and instrument? That’s similar to what Kaskade means by catalog: Go through your systems and all the processes you’ve put in place to ensure you can integrate the required data into your Big Data systems, and make sure everything works as intended. (There’s a small sketch of this step after the list.)
8. Ingestion. Now you’re getting into the real work by actually bringing your data sets into your Big Data platform of choice.
9. Analytics. Apply the analytics to your data sets. (The second sketch after this list walks through a toy version of steps 8 and 9.)
10. Development. Build the applications that will use the data, whether that’s a BI report or a custom application.
11. Test. Test all of it.
12. Quality assurance.
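To make Kaskade’s “sound check” analogy in step 7 concrete, here’s a minimal sketch of what a catalog pass might look like in plain Python. The source definitions, file paths and column names are hypothetical placeholders; the point is only the pattern: touch every source your architecture depends on and confirm it answers with the schema you expect.

```python
# A minimal "sound check" for the catalog step: touch every data source
# and confirm it exposes the columns the pipeline expects.
# All source names, paths and columns below are hypothetical.
import csv
import sqlite3

CATALOG = [
    {"name": "web_logs", "kind": "csv", "path": "web_logs.csv",
     "expected_columns": {"timestamp", "user_id", "url"}},
    {"name": "orders", "kind": "sqlite", "path": "orders.db",
     "table": "orders", "expected_columns": {"order_id", "user_id", "total"}},
]

def check_csv(source):
    """Read only the header row and compare it to the expected schema."""
    with open(source["path"], newline="") as f:
        header = set(next(csv.reader(f)))
    return source["expected_columns"] <= header

def check_sqlite(source):
    """Ask the database for the table's columns without reading any rows."""
    with sqlite3.connect(source["path"]) as conn:
        rows = conn.execute(f"PRAGMA table_info({source['table']})").fetchall()
    return source["expected_columns"] <= {row[1] for row in rows}  # row[1] = column name

CHECKS = {"csv": check_csv, "sqlite": check_sqlite}

for source in CATALOG:
    try:
        ok = CHECKS[source["kind"]](source)
        print(f"{source['name']}: {'OK' if ok else 'MISSING COLUMNS'}")
    except Exception as exc:  # unreachable source, empty file, bad query
        print(f"{source['name']}: FAILED ({exc})")
```

In a real project the catalog would live in shared config and the checks would run on a schedule, but even this toy version catches the “we assumed that feed had a user ID” surprises before ingestion starts.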
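Steps 8 and 9 are also easier to scope once you’ve written even a toy version of them. Here’s a hedged sketch that uses a local SQLite database as a stand-in for whatever Big Data platform you actually choose; the CSV file and its columns are again hypothetical.

```python
# Toy ingestion-and-analytics pass: load a CSV extract into a SQLite
# table (standing in for your Big Data platform), then run one aggregate.
# "orders.csv" and its columns are hypothetical placeholders.
import csv
import sqlite3

conn = sqlite3.connect("warehouse.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS orders (order_id TEXT, user_id TEXT, total REAL)"
)

# Ingestion: stream the source rows into the platform's table.
with open("orders.csv", newline="") as f:
    rows = [(r["order_id"], r["user_id"], float(r["total"]))
            for r in csv.DictReader(f)]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
conn.commit()

# Analytics: one aggregate applied to the freshly ingested data set.
query = ("SELECT user_id, SUM(total) AS revenue FROM orders "
         "GROUP BY user_id ORDER BY revenue DESC")
for user_id, revenue in conn.execute(query):
    print(user_id, revenue)
conn.close()
```

Swap in Hadoop, a cloud warehouse or whatever your design step settled on; the scoping value is in discovering, early, what the ingest format and the first queries actually require.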
At that point, you’ve scoped your project well and you’re ready to go live. If that sounds familiar, it should: It’s very much a traditional design pattern, Kaskade said.
If you’d like to learn more about Big Data failure, check out my full interview with Kaskade. The company also offers a working template, “How to Do a Big Data Project,” that looks interesting, although I haven’t downloaded it to read in full.