Does the Future Belong to Dedupe?

Arthur Cole

It should be clear to everyone by now that the days of bigger, better storage are over. It's no longer a simple matter of adding capacity to suit peak data loads because a) storage maintenance and operation are still fairly expensive even if the actual storage is not, and b) the over-provisioning practices of the past simply do not cut the mustard in these efficiency-focused times.

That's why many of the leading vendors are turning away from big storage and are starting to focus on smart storage, with technologies like thin provisioning and capacity management to make more efficient use of available resources.

A key tool in this effort is deduplication, and by the looks of things, dedupe is emerging as the primary weapon as enterprises increasingly turn to remote and cloud-based storage facilities.

NetApp is shoring up its dedupe capabilities in a big way with the purchase of Data Domain for an estimated $1.5 billion. The merger is expected to play a significant role in NetApp's virtual tape library (VTL) and heterogeneous disk-based backup offerings as the company tries to wean enterprises away from tape-based systems. There is, however, the risk that NetApp might not pursue dedupe as aggressively as it could, considering that too much efficiency could start to eat into its general storage appliance sales.

Nonetheless, the deal propels NetApp into the upper echelons of the data storage market, closing in on number 4 vendor Dell, according to eWEEK's Chris Preimesberger. And if dedupe does become a primary driver of future storage sales, the NetApp/Data Domain union could really shake things up, considering that all of the top vendors that thrive on capacity tend to downplay their dedupe capabilities to a certain extent.

Evidence of that pressure may already be surfacing at some organizations. EMC, for example, promoted dedupe pretty heavily, along with cloud storage and SSDs, at its EMC World show in Orlando this week. The company expanded the dedupe capabilities of its Avamar backup system, bringing source-based dedupe to platforms like Microsoft SharePoint and IBM Lotus Domino. Source-based dedupe is considered more effective than target-based schemes in some architectures because it removes redundant data at the client, before the backup is transmitted, which reduces data loads across networks as well as storage requirements. The company added dedupe to its Disk Library and NetWorker products as well.
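To make the source-based idea concrete, here is a minimal sketch in Python. It is not how Avamar or any other product mentioned here works internally; the fixed 4 KB chunk size, the SHA-256 fingerprints, and the `Target` class are all assumptions chosen for illustration. The client fingerprints each chunk, asks the target which fingerprints it lacks, and ships only those chunks.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products often use variable-size chunks


def chunk(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class Target:
    """Hypothetical backup target that keeps one copy of each unique chunk."""

    def __init__(self):
        self.store = {}          # fingerprint -> chunk bytes
        self.bytes_received = 0  # chunk payload that actually crossed the "network"

    def missing(self, digests):
        """Return the set of fingerprints the target does not hold yet."""
        return {d for d in digests if d not in self.store}

    def put(self, digest, payload):
        self.bytes_received += len(payload)
        self.store[digest] = payload


def source_side_backup(data: bytes, target: Target):
    """Fingerprint chunks on the client and send only chunks the target lacks.

    Returns the ordered list of fingerprints (the "recipe") needed to restore.
    """
    chunks = chunk(data)
    digests = [hashlib.sha256(c).hexdigest() for c in chunks]
    needed = target.missing(digests)
    for digest, payload in zip(digests, chunks):
        if digest in needed:
            target.put(digest, payload)
            needed.discard(digest)  # do not resend a duplicate within the same backup
    return digests


def restore(recipe, target: Target) -> bytes:
    """Rebuild the original data from its recipe."""
    return b"".join(target.store[d] for d in recipe)
```

In this sketch, backing up the same data a second time sends no chunk payload at all, only fingerprints (which the byte counter ignores). A target-based scheme would receive every byte first and deduplicate afterward, which saves disk space but not network bandwidth.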

The thing to remember, though, is that dedupe is still anyone's game, with a large number of small backup vendors eager to use the technology to tap into the larger firms' market shares. A company called Sepaton, for example, recently bolstered its DeltaRemote replication software for its S2100 VTL with the DeltaStor dedupe system, which the company claims reduces bandwidth requirements by some 97 percent. The system is packaged into a single management console that runs on existing nodes, cutting the need for numerous dedicated appliances while still providing backup, dedupe and replication for about 25 TB per node each day.
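As a rough back-of-the-envelope illustration of what a 97 percent reduction would mean, the Python sketch below applies the vendor's claimed figure to a full 25 TB day. Treating the whole daily volume as replication traffic is an assumption made here for simplicity, and the function name is hypothetical.

```python
def replicated_tb(daily_backup_tb: float, reduction_percent: float) -> float:
    """Data still sent over the wire after a given percentage reduction."""
    return daily_backup_tb * (100 - reduction_percent) / 100


# Assumption: all 25 TB per node per day would otherwise be replicated in full.
remaining = replicated_tb(25, 97)
print(f"{remaining} TB per node per day still crosses the network")
```

Under those assumptions, a node would replicate well under 1 TB a day instead of 25 TB, which is the kind of difference that decides whether remote replication fits on an existing WAN link.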

As more and more storage is made available over the cloud, it may not be too long before provisioning and capacity issues become a thing of the past for enterprise managers. For cloud providers, though, those issues could take center stage as they try to juggle the competing storage needs of multiple clients. For them, any tool that extends available storage across existing resources will be welcome, and data deduplication fills that role nicely.

May 22, 2009 11:15 AM Emerson Lima says:

Good article. Congratulations. Let's just wait and see what the IT industry's next move on this topic will be. In my opinion, it is a better technical approach for storage systems.

Greetings from Brazil

Emerson Lima

Intelligent Networks engineer

May 22, 2009 12:37 PM Kimberley says:

Great article. You raise a good point that there's so much opportunity to reclaim storage, de-dupe and manage capacity better. And all this will be an even greater issue with cloud. Here's a recent video on thin provisioning and storage reclamation, but there's lots of great info available on the subject from both hardware and software vendors. It would be great to hear how some storage managers are dealing with this, as well.

Jun 15, 2009 11:12 AM CT says:

De-dupe is significantly over-hyped. And it sure is over-priced. There is very little point in spending on clever de-dupe software to save on hardware, if it costs virtually as much as the hardware. Take a look at Data Domain and Quantum's approach to this market. I can't believe what EMC and NetApp are paying for Data Domain.

What users really want in backup is CDP - continuous data protection, backing up without taking a backup. Importantly, anyone who can combine de-dupe with CDP, and offer it affordably, is really meeting the market need. No one seems to understand this except perhaps FalconStor.

