EMC and IBM Accidentally Partner on Big Data Analytics

I was at the annual IBM analyst event this week when news broke of EMC's effort to expand education and certification for data scientists. Interestingly enough, IBM was pitching its new Watson division, which, at its core, is trying to solve the same problem. Both companies share the same vision: turning Big Data into actionable information. What is interesting is that while the two firms are clearly not partnered, and we more often speak of them as rivals, the two efforts dovetail nicely - the ultimate goals of one will likely depend heavily on the success of the other.

Let me explain.

Big Data Is a Big Problem

We talk a lot about Big Data, which is kind of the enterprise version of hoarding. Big Data is focused on the massive growth, containment and management of information, which is increasingly being captured by the work we do, the devices we carry and the equipment we use. Before long, everything from our home appliances to our children will be capturing data. And, like a hoarder, we appear to focus most of the visible effort on making sure we have enough room in our virtual homes, attics and basements to store it all, and that it is cataloged so that we don't lose any of it and, from time to time, can actually find it.

But the reality is that all of this effort, by itself, provides little more value than we'd get if we simply turned off the processes that create this data. Given that we aren't doing much more than storing Big Data and maybe looking at a small fraction of it (when it can be found), wouldn't you think the easiest fix would be to turn off much of the data capture, much as you would stop a hoarder from bringing more home?

But if you can take this massive and growing amount of data and actually use it to model the world and tell you the things you need to know about it, then it stops being just a problem and truly starts being the asset we'd hoped it would be. You go from being a hoarder to being a collector of valuable assets because you can now get value out of your collection.

Data Scientists

EMC, which is one of the big drivers of the Big Data message, is approaching this as a people problem - which it is. There aren't enough people trained to analyze these massive piles of stinking data and turn them into the spun gold that they could be. Data scientists are the alchemists of the current age, with one difference: They can, if trained properly, actually turn this problem into an asset. But if there aren't enough of them, this transformation may not occur, and the world is left with unmet potential to solve big problems like curing cancer.

However, data scientists, while they can be trained to understand systems, aren't experts in the fields that need the answers. In short, they could look right at the cure for cancer and not see it. In effect, they represent only half of the solution; the other half is connecting the people who can understand the answers to the data being mined, without forcing them to be both experts in their own fields and data scientists.


Enter the Watson Division

IBM is developing Watson, the analytical engine that won "Jeopardy!" This tool is now being aimed at industries like medicine, so that doctors can ask questions and the system will generate likely answers. For instance, there are over 12,000 known diseases, and no one doctor knows all the symptoms for all of them. We also already know, through argumentative theory, that people generally start with an assumption and then use confirmation bias to build arguments around it. While the science is interesting, this screwy way our heads are wired, which leaves us unable to see our own mistakes, also likely contributes the most to wrong diagnoses.

What Watson will be able to do is mine, in real time, the massive amount of information that has been collected and help people make decisions based on facts. The best part is that it forces decision makers to consider alternatives that may have higher probabilities of being right. The example given on stage at the IBM event by a medical doctor involved a woman whose health was declining and who had become unable to take care of her young son. He was an intern at the time, and after months of research and a bit of luck, he figured out what she had after more senior doctors had given up. When this problem was put in front of the Watson medical prototype, it identified this rare diagnosis as likely and indicated that one simple, inexpensive test was all that would need to be ordered to put her on the path to a cure. Something that took months - and only then happened by luck - happened in minutes with near-absolute surety with Watson. It could quite literally save your life.

But Watson is kind of like the first IBM mainframe: It will provide answers for only a few specialties unless it can be rapidly expanded and its capabilities distributed so that you and I (and hopefully a few politicians) can use it to avoid mistakes. Getting there will take a massive increase in the number of data scientists, who will then help build Watsons for the rest of us.

Wrapping Up: Working Together by Accident

Companies don't have to compete on everything, and this showcases how one company's initiatives can dovetail with another's to help create what will likely be world-changing, and maybe lifesaving, solutions. We need people with the right training to mine these massive and growing Big Data systems, and we need new systems to make the results available to everyone who needs them in order to avoid expensive mistakes and make better decisions. It is nice to see IBM and EMC working together to solve world problems - even if it is by accident.