IBM Creates Data Science Experience for Analytics Apps Using Apache Spark


    Looking to make it significantly simpler for the average organization to embrace advanced analytics, IBM today announced a cloud-based integrated development environment (IDE) for building analytics applications that is aimed at data scientists of almost any skill level.

    Ritika Gunnar, vice president, Offering Management, Data and Analytics at IBM, says the Data Science Experience from IBM provides not only access to development tools, but also 250 curated data sets that data scientists can employ to jumpstart an application development project using the R programming language.

    At the same time, Gunnar says, IBM is providing a venue on its Bluemix platform-as-a-service (PaaS) environment, which runs on the IBM SoftLayer cloud, where multiple data scientists can collaborate on developing these applications by sharing the artifacts they create.

    The Data Science Experience, says Gunnar, leverages over $300 million in investments IBM has made in the open source Apache Spark framework for building advanced analytics applications. Last year, IBM committed to turning Apache Spark into the de facto operating environment for building these types of applications. As part of that effort, IBM revealed that it is joining the R Consortium to accelerate usage of Apache Spark.

    As part of a push to create a massive analytics ecosystem, IBM has been making analytics-related contributions to the Apache Toree, EclairJS, Apache Quarks, Apache Mesos, and Tachyon (now called Alluxio) projects, as well as Apache Spark sub-projects such as Spark SQL, SparkR, MLlib, and PySpark. All told, IBM says it has made 3,000 contributions to those projects in the last year.

    IBM has built Spark into the core of 30 of its offerings, including IBM BigInsights for Apache Hadoop, IBM Analytics on Apache Spark, Spark Power Systems, Watson Analytics, SPSS Modeler and Stream Computing. IBM also open-sourced its SystemML machine learning technology last year.

    In general, Gunnar says, IBM has seen a marked shift in where the budget for advanced analytics applications comes from inside organizations. Rather than that budget being the sole province of internal IT organizations, line-of-business units are now directly funding development of analytics applications. IBM’s goal is to make it easier for those business units to tap their own internal expertise to build those applications.

    Of course, once the applications are created, they almost inevitably come back into the province of the internal IT organization to manage. But, in the meantime, IBM hopes that by letting thousands of advanced analytics projects bloom, those internal IT organizations will soon find themselves adding more value to the business in ways many of them would not previously have thought possible.

    Mike Vizard
    Michael Vizard is a seasoned IT journalist with nearly 30 years of experience writing about and editing coverage of enterprise IT issues. He is a contributor to publications including ProgrammableWeb, IT Business Edge, CIOinsight and UBM Tech. He formerly was editorial director for Ziff-Davis Enterprise, where he launched the company’s custom content division, and has also served as editor in chief for CRN and InfoWorld. He has also held editorial positions at PC Week, Computerworld and Digital Review.
