As an open source tool for building analytics applications, the R programming language has gained a lot of traction in academia, and that momentum is now carrying over into the business world. As new graduates enter the workforce, they naturally want to bring their R programming skills with them. The challenge is that R holds the data it works on in memory, which makes it difficult to build R applications that scale beyond the RAM of a single machine.
Teradata today moved to lift that limitation with the launch of Teradata Aster R, an instance of R that runs on a massively parallel processing (MPP) database. Teradata Aster R will be available in the fourth quarter.
Arlene Zaima, a product marketing manager within Teradata Labs, says the primary benefit of this approach is that developers of R applications no longer have to artificially limit the amount of data they work with so that it fits in the memory available on a desktop or a server. Instead, the simplicity of the R programming environment can now be applied to an almost unlimited amount of data.
Teradata Aster R consists of an Aster R Parallel Library and an Aster R Parallel Constructor that allows programmers to run instances of an R engine on multiple nodes running in the Aster MPP database. In addition, Teradata is making use of connectors from SnapLogic to allow programmers to access data residing in SQL databases, Hadoop or another R engine.
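To make the idea of running an engine on multiple nodes more concrete, the general pattern can be sketched as splitting the data, computing a small partial result on each piece, and then combining the partials. The sketch below is written in Python purely for illustration and is not Aster R's API; the function names (`partition`, `partial_stats`, `parallel_mean`) are hypothetical, and worker threads stand in for database nodes. In a real MPP system each chunk would presumably already reside on its own node rather than being split from one in-memory list.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_workers):
    """Split rows into n_workers chunks, round-robin (stand-in for data spread across nodes)."""
    return [rows[i::n_workers] for i in range(n_workers)]


def partial_stats(chunk):
    """Work done by one hypothetical node: it sees only its own chunk."""
    return sum(chunk), len(chunk)


def parallel_mean(rows, n_workers=4):
    """Compute a mean by combining per-chunk (sum, count) pairs."""
    chunks = partition(rows, n_workers)
    # Threads play the role of nodes here; this is an assumption of the sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_stats, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    if count == 0:
        raise ValueError("cannot compute the mean of an empty dataset")
    return total / count


values = list(range(1, 101))
result = parallel_mean(values)
```

The point of the design is that only the small `(sum, count)` pairs travel back to the coordinator, so no single worker ever needs the full dataset in memory.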
As an alternative to commercial environments from SAS Institute and IBM SPSS for building analytic applications, the R programming language has already gained a significant following both inside and outside academia. The challenge now is finding a way to put those skills to work so that R programmers can build enterprise-class applications that truly scale.