Does the Future Belong to Dedupe?
It should be clear to everyone by now that the days of bigger, better storage are over. It's no longer a simple matter of adding capacity to suit peak data loads because a) storage maintenance and operation is still fairly expensive even if the actual storage is not, and b) the over-provisioning practices of the past simply do not cut the mustard in these efficiency-focused times.

That's why many of the leading vendors are turning away from big storage and are starting to focus on smart storage, with technologies like thin provisioning and capacity management to make more efficient use of available resources.

A key tool in this effort is deduplication, and by the looks of things, dedupe is emerging as the primary weapon as enterprises increasingly turn to remote and cloud-based storage facilities.
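At its core, deduplication stores each unique block of data only once and replaces every repeat with a reference to the stored copy. A minimal sketch of the idea in Python (the `ChunkStore` class, fixed-size chunking, and SHA-256 fingerprints are illustrative assumptions, not any particular vendor's design):

```python
import hashlib

class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept once,
    and duplicates become references to the stored copy."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes

    def write(self, data):
        """Split data into fixed-size chunks and return a 'recipe' of
        digests; only previously unseen chunks consume new space."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Reassemble the original data from its recipe."""
        return b"".join(self.chunks[d] for d in recipe)

store = ChunkStore(chunk_size=4)
r1 = store.write(b"ABCDABCDABCD")  # three identical 4-byte chunks
r2 = store.write(b"ABCDEFGH")      # one duplicate chunk, one new
# 20 bytes written, but only 2 unique chunks (8 bytes) are stored
```

Real systems refine this with variable-size chunking and on-disk indexes, but the space savings all come from that one hash lookup.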

NetApp is shoring up its dedupe capabilities in a big way, with the purchase of Data Domain for an estimated $1.5 billion. The merger is expected to play a significant role in NetApp's virtual tape library (VTL) and heterogeneous disk-based backup offerings as the company tries to wean enterprises off tape-based systems. There is, however, the risk that NetApp might not pursue dedupe as aggressively as it could, since too much efficiency could start to eat into its general storage appliance sales.

Nonetheless, the deal propels NetApp into the upper echelons of the data storage market, closing in on number 4 vendor Dell, according to eWEEK's Chris Preimsberger. And if dedupe does become a primary driver of future storage sales, the NetApp/Data Domain union could really shake things up, considering that the top vendors, which thrive on selling capacity, tend to downplay their own dedupe capabilities to a certain extent.

Evidence of that pressure may already be surfacing among the major vendors. EMC, for example, promoted dedupe heavily, along with cloud storage and SSDs, at its EMC World show in Orlando this week. The company added a number of dedupe capabilities to its Avamar backup system, including source-based dedupe for platforms like Microsoft SharePoint and IBM Lotus Domino. Source-based dedupe is considered more effective than target-based schemes in some architectures because it removes redundant data before it leaves the client, reducing the load on the network as well as on the backup target. EMC also extended dedupe to its Disk Library and NetWorker products.
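The source-based approach boils down to a simple exchange: the client fingerprints its chunks, asks the backup target which fingerprints it has never seen, and ships only those. A toy sketch (the `BackupTarget` and `source_side_backup` names are hypothetical; real products negotiate this over a network protocol rather than in-process calls):

```python
import hashlib

def split(data, size=4096):
    """Cut data into fixed-size chunks (a simplifying assumption)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

class BackupTarget:
    """Server side: stores unique chunks and answers
    'which of these digests are new to you?' queries."""
    def __init__(self):
        self.chunks = {}

    def missing(self, digests):
        return [d for d in digests if d not in self.chunks]

    def store(self, digest, chunk):
        self.chunks[digest] = chunk

def source_side_backup(data, target, size=4096):
    """Hash chunks at the source, ask the target which are new,
    and transmit only those. Returns the bytes actually sent."""
    chunks = {hashlib.sha256(c).hexdigest(): c for c in split(data, size)}
    needed = target.missing(list(chunks))
    for d in needed:
        target.store(d, chunks[d])
    return sum(len(chunks[d]) for d in needed)

target = BackupTarget()
# First backup: everything is new, so every unique chunk is sent.
sent1 = source_side_backup(b"A" * 8192 + b"B" * 4096, target, size=4096)
# Second backup: mostly unchanged data, so only the new chunk crosses
# the wire -- which is why source-based dedupe saves network bandwidth.
sent2 = source_side_backup(b"A" * 8192 + b"C" * 4096, target, size=4096)
```

Target-based dedupe, by contrast, would ship the full data stream and discard duplicates only after arrival, saving disk but not bandwidth.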

The thing to remember, though, is that dedupe is still anyone's game, with a large number of small backup vendors eager to use the technology to tap into the larger firms' market shares. A company called Sepaton, for example, recently bolstered the DeltaRemote replication software for its S2100 VTL with the DeltaStor dedupe system, which the company claims reduces bandwidth requirements by some 97 percent. The system is packaged into a single management console that runs on existing nodes, cutting the need for numerous dedicated appliances while still providing backup, dedupe and replication for about 25 TB per node per day.

As more and more storage is made available over the cloud, it may not be too long before provisioning and capacity issues become a thing of the past for enterprise managers. For cloud providers, though, those issues could take center stage as they try to juggle the competing storage needs of multiple clients. For them, any tool that extends available storage across existing resources will be welcome, and data deduplication fills that role nicely.