These days, it’s all about the citizen integrator. That’s why it’s not so surprising that ClearStory Data, a start-up still primarily making headlines for fundraising, focused on business users out of the gate at the Gartner Business Intelligence & Analytics Summit, held this week in Las Vegas.
That’s pretty cool, and Gartner says it makes ClearStory Data a cool vendor, but I recently spoke with company co-founder and chief architect Vaibhav Nivargi, and I have to say, that’s a bit like saying anyone can drive a Mustang. Sure, it’s true, but who cares about that? What people really want to know about is the muscle in the engine. Let’s look under the hood, and I think you’ll see why that’s true for ClearStory Data, too, even for business users.
ClearStory Data uses what it calls a Data Harmonization Engine to power its analytics. Here’s why that matters:
- It’s in-memory. In-memory gets short-changed in Big Data discussions, but loading the data into memory makes the integration and analytics work fast, really fast, because you avoid the time spent reading from and writing to disk. The data is loaded into memory, where it can be explored for new connections, and it remains untouched at the source. That also means, of course, that this isn’t a replacement for a data warehouse or database, Nivargi added.
- It’s built on Apache Spark, a fast engine used for large-scale data processing. ClearStory Data is one of the earliest companies to embrace Spark, and it’s not hard to see why. In a TechCrunch column, Nivargi explains Spark’s beginnings at Berkeley’s AMPLab and notes that its “in-memory, parallel processing power runs programs 100X faster than Hadoop MapReduce in memory and 10X faster on disk.” That’s important because it “allows dozens of data sources to be blended and harmonized at once,” he adds.
- The back-end is cloud-based, which means users can access its drag-and-drop interface through a browser.
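For readers who want to see the in-memory idea in miniature, here is a rough sketch in plain Python. It is not ClearStory Data’s engine and it does not use the Spark API; the two tiny data sets, the column names and the `load_into_memory` and `blend` helpers are all made up for illustration. It only shows the general pattern described above: read each source once, blend the copies in memory and leave the sources untouched.

```python
import csv
import io

# Hypothetical stand-ins for two external sources. In practice these would be
# files, databases or feeds that are read but never modified.
SALES_CSV = "store_id,revenue\n1,1200\n2,850\n3,430\n"
WEATHER_CSV = "store_id,avg_temp_f\n1,71\n2,64\n"


def load_into_memory(csv_text):
    """Read a source once and keep its rows in memory as a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def blend(left_rows, right_rows, key):
    """Inner-join two in-memory row sets on a shared key."""
    # Index the right-hand rows by key so each lookup is a dictionary hit
    # rather than another pass over the source.
    index = {row[key]: row for row in right_rows}
    return [{**row, **index[row[key]]} for row in left_rows if row[key] in index]


sales = load_into_memory(SALES_CSV)
weather = load_into_memory(WEATHER_CSV)
blended = blend(sales, weather, "store_id")

for row in blended:
    print(row)
```

Only the in-memory copies are combined here; the original strings are never written to. An engine such as Spark applies the same basic idea in parallel across many machines and many sources.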
With ClearStory Data, there’s no coding, modeling or building, Nivargi said.
“Customers access with their browsers, so there’s no installation required, it works on any platform and they can get to any of their internal data,” Nivargi explained. “The interface is highly intuitive, visual in nature, drag and drop, pinch and zoom, so it’s accessible to a wide range of users. And it scales for really large volumes of data because of the in-memory Spark technology.”
Loraine Lawson is a veteran technology reporter and blogger. She currently writes the Integration blog for IT Business Edge, which covers all aspects of integration technology, including data governance and best practices. She has also covered IT/Business Alignment and IT Security for IT Business Edge. Before becoming a freelance writer, Lawson worked at TechRepublic as a site editor and writer, covering mobile, IT management, IT security and other technology trends. Previously, she was a webmaster at the Kentucky Transportation Cabinet and a newspaper journalist. Follow Lawson at Google+ and on Twitter.