Starbucks Brews up Business Case for Mobile BI

Loraine Lawson

It's a boy! But do you know why?

 

Having a male child isn't as simple as a 50/50 chance. Did you know, for instance, that the birth of boys has been on a downward trend since 1971? Or that black and Native American mothers have fewer male children than Asian and Caucasian mothers? Or that the chance of having a boy goes down as the mother or father ages? And what's up with those 3,316 boys born to 89-year-old men in 1989?

 

Sue Ranney knows why. The VP of product development at Revolution Analytics found the answer by analyzing 70 gigabytes of raw data spanning 22 years - on her laptop - to demonstrate how RevoScaleR, the Big Data analysis package in Revolution R Enterprise, works.

 


Revolution Analytics is a predictive analytics company that takes the open-source R language and adds enterprise support for it. Data scientists are something of a rare breed - much more so than, say, .NET programmers. So one reason companies use Revolution Analytics is to have a data scientist build apps that process the data, which can then be embedded in a BI dashboard or even an Excel spreadsheet.

 

The app can be stored on the Revolution R Enterprise server and made accessible through a Web services API. A .NET programmer can then embed it using a .NET client API that the solution provides, explained David Smith, the company's vice president of marketing, in a recent interview.

 

Smith says the newest release, Revolution R Enterprise 5.0, does even more to make the R programming language enterprise-friendly and easier for IT to manage. New features include Hadoop integration certified with the Cloudera CDH3 distribution, integration with the Microsoft HPC server platform for high-performance distributed computing, and LDAP support for better security.

 

But perhaps most significantly, version 5.0 supports multiple nodes for distributed/parallel computing, which can be used for high-performance statistical modeling.

 

"If somebody wanted to do that on 10 billion rows of data, they could farm that problem out to a cluster of five, 10, 20 or 50 machines running on the Microsoft HPC server framework and really reduce the processing time required to do those types of computations," Smith said. "We did a test where we ran a regression on 10 billion rows of data using just five machines. These are just off-the-shelf machines, not a very high hardware cost, but we were able to do that computation in just 90 seconds. You can do that same kind of thing in SAS, but you'd have to do it with hardware that costs up toward a million dollars or so."
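
For readers wondering how a regression can be "farmed out" at all, here is a minimal Python sketch of the general idea. It is not RevoScaleR's actual algorithm, which the article does not describe. It assumes a simple one-variable least-squares fit, and the names `Partial`, `summarize` and `fit` are hypothetical. Each chunk of rows (standing in for one machine's share of the data) is reduced to a handful of running sums, and only those sums need to be combined to get the final answer.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Running sums for one chunk of (x, y) rows (hypothetical helper)."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def __add__(self, other):
        # Sums from separate chunks combine by simple addition.
        return Partial(self.n + other.n,
                       self.sx + other.sx,
                       self.sy + other.sy,
                       self.sxx + other.sxx,
                       self.sxy + other.sxy)


def summarize(chunk):
    """Reduce one chunk of rows to its sums; this is the per-machine work."""
    p = Partial()
    for x, y in chunk:
        p.n += 1
        p.sx += x
        p.sy += y
        p.sxx += x * x
        p.sxy += x * y
    return p


def fit(partials):
    """Combine per-chunk sums and solve for slope and intercept."""
    total = Partial()
    for p in partials:
        total = total + p
    denom = total.n * total.sxx - total.sx ** 2
    slope = (total.n * total.sxy - total.sx * total.sy) / denom
    intercept = (total.sy - slope * total.sx) / total.n
    return slope, intercept


# Made-up data: y = 3x + 2, split into five chunks as if on five machines.
rows = [(float(x), 3.0 * x + 2.0) for x in range(100)]
chunks = [rows[i:i + 20] for i in range(0, len(rows), 20)]
slope, intercept = fit(summarize(c) for c in chunks)
```

Because each chunk is summarized independently, the per-chunk step could in principle run on separate machines, with only five numbers per chunk sent back for the final solve.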

 

The company reports impressive performance benefits. Researchers at Michigan State University managed to cut a three-and-a-half-month analytical project to a little over one week using Revolution R on Microsoft HPC. Smith expects that level of performance benefit to carry over into the commercial sector.


