One of the first things any organization is going to want to do once it accumulates a mass of Big Data is search it. That usually means finding and deploying an enterprise search engine.
To eliminate the friction involved in that process, Cloudera this week announced that it plans to bundle the open source Apache Solr enterprise search engine with the company’s distribution of Hadoop.
According to Charles Zedlewski, vice president of products at Cloudera, the goal is to make Hadoop more accessible to organizations that don’t have a lot of programming skills. While roughly 120,000 people in the world know Hadoop, and one to two million know how to program in SQL, Zedlewski notes that billions of people are familiar with how to use search.
Zedlewski says Cloudera estimates that 20 percent of its customers have already integrated an enterprise search engine with Hadoop.
Zedlewski says Cloudera Search, which is available in beta, is tightly integrated with the Hadoop Distributed File System (HDFS) and Apache HBase and includes an application programming interface that can be used to tie Cloudera Search to legacy systems.
As a storage platform, Hadoop presents a unique opportunity to store massive amounts of data affordably, but that only makes sense if people actually have some way of getting at the data. Once tools such as search are in place, the number of Hadoop projects being deployed across the enterprise should steadily increase.