
DataTorrent Brings Real-Time Streaming Analytics to Hadoop


Written By
Mike Vizard
Jun 4, 2014

When it comes to Hadoop, the popular perception is that the open source data management framework is mainly limited to batch-oriented applications. By that reasoning, running any serious real-time analytics application requires deploying expensive, massively parallel database systems.

Looking to lay that misperception to rest, DataTorrent this week announced the general availability of DataTorrent RTS, a real-time stream processing platform that runs on Hadoop. It can be deployed on just about any flavor of Hadoop.

DataTorrent CEO Phu Hoang says that with the advent of Yet Another Resource Negotiator (YARN) in Hadoop 2.0, it’s now feasible to run all kinds of real-time applications on Hadoop. In fact, DataTorrent claims its platform can process more than one billion data events per second, which the company says is equivalent to processing 46 cumulative hours of streaming Twitter data in one second.
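As a rough sanity check on that comparison, the short Python sketch below works out the average Twitter rate the claim implies. It assumes one data event corresponds to one tweet, which the company does not state, and it uses only the two figures quoted above; the variable names are illustrative.

```python
# Back-of-the-envelope check of the vendor's comparison.
# Assumption (not stated in the article): one "data event" equals one tweet.
CLAIMED_EVENTS_PER_SECOND = 1_000_000_000  # "more than one billion" per second
TWITTER_HOURS_PER_SECOND = 46              # hours of Twitter data handled per second
SECONDS_PER_HOUR = 3600

# Convert 46 hours of Twitter traffic into seconds of Twitter traffic.
twitter_seconds = TWITTER_HOURS_PER_SECOND * SECONDS_PER_HOUR

# Average tweets per second that would make the two figures agree.
implied_tweets_per_second = CLAIMED_EVENTS_PER_SECOND / twitter_seconds

print(f"Seconds of Twitter data per second: {twitter_seconds}")
print(f"Implied average tweets per second: {implied_tweets_per_second:.0f}")
```

Under that assumption, the claim corresponds to an average Twitter stream of roughly 6,000 tweets per second.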

The implications of that for data warehouse environments are particularly profound. If Hadoop becomes the platform where those applications wind up residing, the cost of delivering a bevy of analytics applications will drop substantially.


Rather than bringing data to the application, DataTorrent and other Hadoop proponents say it makes a lot more sense to bring Java applications to where the data resides. That not only eliminates the need for complex extract, transform and load (ETL) processes that often introduce errors, it also reduces the amount of IT infrastructure that needs to be managed.

Quite a number of different types of processing engines can now be deployed on top of Hadoop using YARN. While no one knows which ones will ultimately gain market traction, it is pretty clear at this point that Hadoop has arrived as an application development platform.


Michael Vizard is a seasoned IT journalist, with nearly 30 years of experience writing and editing about enterprise IT issues. He is a contributor to publications including Programmableweb, IT Business Edge, CIOinsight and UBM Tech. He formerly was editorial director for Ziff-Davis Enterprise, where he launched the company’s custom content division, and has also served as editor in chief for CRN and InfoWorld. He also has held editorial positions at PC Week, Computerworld and Digital Review.
