Another Tool for Analyzing Big Data


The concept of "Big Data" is catching on quickly in business, and what's interesting to me is learning about the tools and technologies that can help companies use Big Data to push the boundaries of business analytics.


"R" is one of those tools, and by tools I mean only in the loosest sense of the word. R is actually an open source programming language and software environment that's used for statistical computing and graphics. It is-to quote Wikipedia-"a de facto standard among statisticians for developing statistical software," and therefore widely used for data analysis.


Its main competition is the proprietary SAS (Statistical Analysis System), which is offered by the SAS Institute.


R is also one of those tech things that's found a home in academia but not in the enterprise. That's quickly starting to change, in part because software vendors are starting to offer tools designed for enterprise use.


Revolution Analytics is one of the vendors working hard to bring R into the enterprise. The company is led by Norman Nie, who, interestingly enough, co-invented a statistical analysis programming language called SPSS.


Recently, Revolution Analytics announced a new partnership with IBM Netezza. The plan is to integrate Revolution R Enterprise with the IBM Netezza TwinFin Data Warehouse Appliance. The press release notes the integration will be able to "deliver 10-100x performance improvements at a fraction of the cost compared to traditional analytics vendors."


Last month, I interviewed Revolution COO Jeff Erhardt about R and the company's new partnership. We discussed the business uses for R, how it ties in with Hadoop and so on. Of course, being an integration blogger, I asked about the integration issues (you can read the full interview here). I mentioned that I had noticed that the CTO, David Champagne, had recently presented a webinar on integrating R into third-party and Web applications.


As luck would have it, Champagne was lurking on the call and spoke up, giving me a chance to ask him directly about the integration angle.


Champagne explained that R is designed to do analysis through a command line, which, obviously, makes for some challenges when you want to integrate it with software. It has interfaces that allow it to be extended, he said, but there's no set of APIs. Revolution's RevoDeployR solves that problem:

What we've built is a collection of Web services that allows an application to tap into the power of R being able to execute predefined scripts that do some analysis and return artifacts of execution like plots or tables or summary information. So the stack that we've built really enables interactive Web applications to take advantage of R in a consistent way, or desktop applications like Excel to integrate and take advantage of R or any other kind of business intelligence applications that need advanced analytics and being able to use R and bring those results back to their end users so that they can get more meaning from their data.

The tool separates the R coding from the application coding, so the R coder can focus on developing the analytic and writing the R code, he explained. RevoDeployR allows you to do Web-based or desktop-based application integration. It also includes client libraries in JavaScript, Java and .NET, as well as a specific library for JasperSoft, so you can integrate the R analysis into your business reports.
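To make that separation a bit more concrete, here's a minimal sketch in Python of the general pattern Champagne describes: an application asks a service to run a predefined script by name and gets back the results. To be clear, this is not RevoDeployR's actual API. Every name in it (handle_request, run_script, summary_stats, the JSON fields) is something I made up for illustration, the "script" is a plain Python function standing in for an R script, and the Web service call is faked with an in-process function so the example runs on its own.

```python
import json


# Hypothetical stand-in for a predefined analysis script. In the setup
# described above this would be R code written by the R coder; a plain
# Python function is used here only so the sketch is self-contained.
def summary_stats(params):
    values = params["values"]
    n = len(values)
    return {"n": n, "mean": sum(values) / n, "min": min(values), "max": max(values)}


# Hypothetical registry of scripts the service is willing to execute.
PREDEFINED_SCRIPTS = {"summary_stats": summary_stats}


def handle_request(body):
    """Pretend Web service endpoint: JSON request in, JSON response out."""
    request = json.loads(body)
    script = PREDEFINED_SCRIPTS.get(request["script"])
    if script is None:
        return json.dumps({"ok": False, "error": "unknown script"})
    artifacts = script(request.get("params", {}))
    return json.dumps({"ok": True, "artifacts": artifacts})


def run_script(name, params, transport=handle_request):
    """Application-side call: it knows only a script name and its parameters."""
    payload = json.dumps({"script": name, "params": params})
    response = json.loads(transport(payload))
    if not response["ok"]:
        raise RuntimeError(response["error"])
    return response["artifacts"]


result = run_script("summary_stats", {"values": [3, 5, 10]})
print(result["mean"])  # prints 6.0
```

The point of the sketch is the division of labor: the application code never touches the analysis itself, and the analysis code never knows what kind of application is calling it. In a real deployment, the transport argument would be replaced by an actual HTTP call to whatever service hosts the scripts.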


The webinar is geared toward developers and gives specific examples of integration using Java, JavaScript, .NET and Web services. It's available for free on-demand viewing.