At this juncture, it is only a matter of time before IT organizations find themselves looking for a way to replicate data between two or more different implementations of Hadoop.
To address that issue, WANdisco this week announced it has begun shipping WANdisco Fusion, replication software that enables data to be synchronously copied between two active implementations of Hadoop.
WANdisco CEO David Richards says that data sovereignty requirements will soon force the issue of Hadoop replication. Countries around the globe are legally requiring that data be available locally, which generally means replicating data between multiple data centers. Prior to the arrival of WANdisco Fusion, Richards says, there was no way to replicate data between two active implementations of Hadoop.
Richards adds that WANdisco Fusion is also the only way to replicate data between heterogeneous implementations of Hadoop. Given that it’s unlikely that multiple organizations will be able to standardize on a specific implementation of Hadoop, Richards says the only way those organizations will be able to synchronously replicate data is via WANdisco Fusion software, which he says takes only 10 minutes to implement.
As a company, WANdisco has been selling data storage replication software for years. It is now applying that technology to Hadoop platforms, which will over time become the data hubs and data lakes through which most enterprise data will flow.
Big Data management is clearly one of the major challenges that enterprise IT organizations will need to address in the months and years ahead. As part of those efforts, there’s almost no doubt that data replication at unprecedented scale will eventually be part of that conversation.