The Data Deduplication Revolution - Slide 2


As with many things in the world of IT, there are numerous techniques in use for deduplicating data. Some are unique to specific vendors, who guard their technology behind patents and copyrights; others use more open methods. The goal of all of them is to identify the maximum amount of duplicate data using the minimum of resources.

The most common technique in use is that of “chunking” the data. Deduplication takes place by splitting the data stream into “chunks” and then comparing the chunks with each other. Some implementations use fixed chunk sizes; others use variable chunk sizes. The latter tends to offer a higher success rate in identifying duplicate data, as it is able to adapt to different data types and environments. The smaller the chunk size, the more duplicates will be found; however, the performance of the backup and, more importantly, the restore is affected. Vendors therefore spend a lot of time identifying the optimal size for different data types and environments, and the use of variable chunk sizes often allows tuning to occur, sometimes automatically.
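The difference between fixed and variable chunk sizes can be illustrated with a small sketch. This is a toy example, not any vendor's method: the function names, the 16-byte window, the bit mask, and the size limits are all assumptions chosen for illustration, and a CRC32 over a sliding window stands in for the rolling hashes that real products typically use.

```python
import zlib


def fixed_chunks(data: bytes, size: int = 64) -> list:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data: bytes, window: int = 16, mask: int = 0x3F,
                    min_size: int = 16, max_size: int = 256) -> list:
    """Cut a chunk wherever a checksum of the last `window` bytes has its
    low bits all zero, so boundaries follow the content, not the offset."""
    assert min_size >= window, "the window must fit inside the current chunk"
    chunks, start = [], 0
    for i in range(len(data)):
        length = i - start + 1
        if length < min_size:
            continue
        fingerprint = zlib.crc32(data[i - window + 1:i + 1])
        # Cut on a fingerprint match, or force a cut at the maximum size.
        if (fingerprint & mask) == 0 or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def surviving_chunks(chunker, data: bytes) -> int:
    """Count distinct original chunks that still appear after one byte
    is inserted at the front of the data."""
    return len(set(chunker(data)) & set(chunker(b"!" + data)))
```

Calling `surviving_chunks(fixed_chunks, data)` and `surviving_chunks(variable_chunks, data)` on the same input gives a rough feel for the trade-off: a one-byte insertion shifts every fixed-size boundary, whereas content-defined boundaries tend to realign after the change. A smaller `mask` yields smaller average chunks, at the cost of more chunks to hash and index.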

During deduplication every chunk of data is processed using a hash algorithm and assigned a unique identifier, which is then compared to an index. If that hash number is already in the index, the piece of data is considered a duplicate and does not need to be stored again, and instead a link is made to the original data. Otherwise the new hash number is added to the index and the new data is stored on the disk. When the data is read back, if a link is found, the system simply replaces that link with the referenced data chunk. The deduplication process is intended to be transparent to end users and applications.
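The index lookup described above can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions: the `DedupStore` class and its method names are hypothetical, SHA-256 is one possible choice of hash algorithm, and a Python dictionary stands in for both the index and the disk. Note that a hash is only unique in practice, not in theory; this sketch ignores collisions.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store: one stored copy per distinct chunk."""

    def __init__(self):
        # Index mapping hash -> chunk bytes (stands in for index + disk).
        self.index = {}

    def write(self, chunks):
        """Store the chunks and return a list of hash links to read them back."""
        links = []
        for chunk in chunks:
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.index:
                # New data: add the hash to the index and store the chunk.
                self.index[digest] = chunk
            # Duplicate or not, the stream only records a link to the data.
            links.append(digest)
        return links

    def read(self, links):
        """Replace each link with the chunk it references."""
        return b"".join(self.index[digest] for digest in links)

    def stored_bytes(self):
        """Total size of the unique chunks actually kept."""
        return sum(len(chunk) for chunk in self.index.values())
```

For example, writing the four chunks `[b"aaaa", b"bbbb", b"aaaa", b"aaaa"]` keeps only two of them (8 bytes instead of 16), while `read` on the returned links reproduces the original 16 bytes, which is what makes the process transparent to the reader of the data.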

The term data deduplication increasingly refers to the technique of data reduction by breaking streams of data down into very granular components, such as blocks or bytes, storing only the first instance of each item on the destination media, and adding all other occurrences to an index. Because it works at a more granular level than single-instance storage, the resulting savings in space are much higher, thus delivering more cost-effective solutions. The savings in space translate directly to reduced acquisition, operation, and management costs.

Data deduplication technologies are deployed in many forms and in many places within the backup and recovery infrastructure. Deduplication has evolved from being delivered within specially designed disk appliances offering post-processing deduplication to being a distributed technology found as an integrated part of backup and recovery software. According to CA Technologies, along the way solution suppliers have identified the good and bad points of each evolution and developed what today are high-performance, efficient technologies.

This slideshow looks at data deduplication and five areas, identified by CA Technologies, that you should consider carefully when approaching a data deduplication project.
