Machine Learning Emerges in the Enterprise to Tackle Big Data

Don Tennant

I have to admit I’m one of those people whose eyes glaze over at the mention of “predictive analytics,” but offer to tell me about “machine learning,” and I’m all over it. After all, the concept of a machine being able to learn stuff is pretty cool, right? But according to Alex Gray, a machine learning guru who got his start working with astronomers in NASA’s Machine Learning Systems Group in the 1990s, predictive analytics and machine learning are basically the same thing. It’s just that “machine learning” is the term all the scientists and researchers use, Gray says, and it’s finally beginning to catch on.

In fact, it’s even catching on in the corporate enterprise, the bastion of relational databases and SQL queries. Gray went on to co-found Skytree, a machine learning technology provider in San Jose, and now serves as the company’s CTO. In a recent interview in that capacity, he explained that machine learning is a natural next step for the enterprise:

Predictive analytics was relatively obscure until very recently. Now everyone’s interested in big data. We know how to store, manage, and query data with SQL queries, but that very next level has to do with finding patterns and using them to make predictions—whether you call that machine learning or pattern recognition or data mining, that is the end of the value chain—that’s how you finally connect data to dollars. … There are many things that effectively are predictive problems, and problems where you’re trying to find patterns from data. People are approaching it today using SQL, even though they are prediction and data mining problems. … Machine learning is designed to directly optimize prediction accuracy. It’s always going to do better than any SQL rule a human is going to make, because it can automatically think about hundreds of variables, and how they may interact in some complex way that a human can’t even write down in a clear way. … Within a handful of years, a company that’s not doing machine learning for some part of its business is going to be handicapped, and its competitors are going to beat it. I think there’s an inevitable movement toward machine learning.
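Gray’s point about hand-written SQL rules versus learned models can be made concrete with a small sketch. Everything below is invented for illustration (a hypothetical churn problem with made-up thresholds, not anything from Skytree): a single-variable rule of the kind a human might write, compared with a tiny logistic regression, trained by gradient descent, that is given an interaction feature and can learn how two variables combine.

```python
import math

# Hypothetical churn scenario (all names and numbers invented): a customer
# churns only when BOTH tenure is short AND spend is low -- an interaction
# between two variables, the kind of pattern a one-variable rule misses.
data = []
for tenure in range(0, 24, 2):          # months with the company
    for spend in range(0, 60, 5):       # monthly spend in dollars
        churned = 1 if (tenure < 12 and spend < 30) else 0
        data.append((tenure, spend, churned))

# A hand-written "SQL-style" rule looks at one variable in isolation.
def rule_predict(tenure, spend):
    return 1 if spend < 30 else 0

rule_acc = sum(rule_predict(t, s) == y for t, s, y in data) / len(data)

# Logistic regression features: bias, each signal, and their interaction.
def features(tenure, spend):
    x1 = 1.0 if tenure < 12 else 0.0
    x2 = 1.0 if spend < 30 else 0.0
    return [1.0, x1, x2, x1 * x2]

# Train by plain batch gradient descent on the logistic loss.
w = [0.0] * 4
for _ in range(2000):
    grad = [0.0] * 4
    for t, s, y in data:
        x = features(t, s)
        p = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
        for i in range(4):
            grad[i] += (p - y) * x[i]
    w = [wi - gi / len(data) for wi, gi in zip(w, grad)]

def model_predict(tenure, spend):
    x = features(tenure, spend)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

model_acc = sum(model_predict(t, s) == y for t, s, y in data) / len(data)
print(f"hand-written rule: {rule_acc:.2f}  learned model: {model_acc:.2f}")
```

On this toy data the hand-written rule tops out at 75 percent accuracy, because a single threshold on spend cannot express the interaction with tenure; the learned model, which is free to weight the interaction term, recovers the full pattern. Scale that up to hundreds of variables and the gap Gray describes is easy to believe.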

Gray acknowledged that there’s a lot of hype around big data, but he insisted that machine learning is the real deal:

That database doesn’t go away—it just gets added to. You’re always going to need a database, because you’re always going to need to store, manage, and query data. It’s just that there is another level of analysis and automation on top of that that is going to be needed. And it’s going to translate to billions of dollars across all kinds of companies. So people like database administrators should learn about this stuff. This is the next wave—that’s not an exaggeration. There is a lot of hype, of course. But this is one of those cases where it’s for real.

Incidentally, I mentioned to Gray that I had co-written a book about how to tell when someone is lying, and I asked him if he had any thoughts on how machine learning might be employed to expose deception. His response provided a perspective on machine learning that’s worth sharing here:

Generally speaking, the only limit to machine learning is the input data, and these days, those limitations are almost non-existent. Machine learning can work on images, videos, text, and audio, so it has access to all of the cues that a human can use, whatever we look for as highly trained humans. With enough data and training examples, I don’t think I’ve ever seen a problem where, eventually, machine learning doesn’t beat humans. In some cases the kinds of data are hard to obtain. So to make machine learning beat humans in that kind of setting or problem, you would need to give it lots and lots of examples of people who are deceiving, and people who aren’t deceiving, in a whole range of circumstances. And then, yes, machine learning will eventually capture more variables and subtleties than a human can pay attention to.

Comments
Aug 13, 2013, 6:22 PM, DataSolution says:
Good article, Don. One open source technology to mention is HPCC Systems from LexisNexis, a data-intensive supercomputing platform for processing and solving big data analytical problems. Its open source Machine Learning Library and matrix processing algorithms assist data scientists and developers with business intelligence and predictive analytics. More at http://hpccsystems.com
Aug 23, 2013, 3:55 AM, Anthony Lavia says:
A software solution or a multi-core/multi-processor solution is not the answer to the big data problem/opportunity. In an environment of exponential data growth, linear solutions will max out sooner rather than later. (Check out Amdahl's Law.) What is needed, I believe, is a hardware-based approach, employing massively parallel cores that combine memory and processing functions, with size-transparent scalability. Neuromorphic technology fits the bill, and its time has come.
