Data deduplication and compression promise often-staggering reductions in data storage capacity requirements, shrinking backup data by 90 percent or more. Reducing this backup load achieves a corresponding reduction in the cost of hardware and management. Small and midsized businesses and departmental computing operations, however, can struggle to achieve optimal data reduction because of a number of factors, including a lack of hands-on experience with these technologies.
Poway, Calif.-based BridgeSTOR LLC, a provider of data reduction technology, has outlined the following Q&A based on its most frequently asked questions about backup data reduction.
Should you deduplicate or compress backup data? Both. Deduplication and compression work independently and are complementary technologies that together can provide data reduction of up to 90 percent. Used alone, either can be effective at reducing disk backup capacity requirements, but an overall data reduction strategy includes both deduplication and compression. If you achieve 2:1 compression of backup data that has already been deduplicated by a 10:1 ratio, the result is a total data reduction ratio of 20:1.
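The arithmetic behind combining the two ratios can be sketched as follows (an illustrative example, not any vendor's tool; the function names are our own):

```python
# Illustrative sketch: deduplication and compression ratios multiply
# into a total data reduction ratio when applied in sequence.

def total_reduction(dedup_ratio: float, compression_ratio: float) -> float:
    """Combined ratio when compression is applied to already-deduplicated data."""
    return dedup_ratio * compression_ratio

def percent_saved(ratio: float) -> float:
    """Capacity saved, as a percentage of the original data size."""
    return (1 - 1 / ratio) * 100

# The example from the text: 10:1 deduplication followed by 2:1 compression.
ratio = total_reduction(10, 2)   # 20.0, i.e. 20:1
saved = percent_saved(ratio)     # 95.0 percent of capacity saved
```

Note that the savings compound: 10:1 deduplication alone saves 90 percent, and the additional 2:1 compression halves what remains, bringing the total to 95 percent.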
There are reasons to consider both, but compression in particular requires processing power, so it impacts backup performance least when it is deployed as a hardware-based solution. Hardware-based data reduction appliances also offer the advantage of compatible preconfigured applications, rather than a piecemeal collection of products that you must purchase, install, configure and manage yourself. Look for data reduction appliances that do not require a separate backup server, which adds complexity and increases the overall cost of the backup infrastructure. An ‘all-in-one’ backup data reduction appliance takes the guesswork out of implementing multiple solutions.
Disk compression has traditionally been file-based. Hardware compression works on data blocks, not files. Until recently, the challenges of implementing block-based data compression on disk have been insurmountable. That is now changing with the introduction of data-reducing backup appliances equipped with hardware-based disk compression.
Backup data reduction must address backing up the servers at HQ as well as remote offices, and sending backup data to a disaster recovery (DR) site. Backup should be a single, continuous process, centrally and conveniently managed, which is possible with data reduction appliances.
Typically, deduplication and compression are applied to primary data in very different ways. Primary storage deduplication works on blocks of data aligned at the disk’s boundaries. Windows and Linux file systems align the beginning of each file at the beginning of a block. This means that primary storage deduplication will always identify duplicate blocks within files. Also, databases read and write on fixed-size pages, so duplicate data within a single database or across databases can be detected. Backup applications, on the other hand, create files that are the equivalent of .tar or .zip files in which the blocks are not always aligned the same way, so backup deduplication applications have a very different job to do than primary data deduplication.
Just comparing prices won’t provide a true apples-to-apples evaluation. A better metric is the cost per terabyte of backup-to-disk capacity. How many concurrent backup streams are supported? What is the impact on backup performance, if any? The wrong answers to these questions can actually cost you more, even at a lower entry price point.
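The cost-per-terabyte comparison can be sketched with hypothetical numbers (all prices, capacities, and ratios below are invented for illustration):

```python
# Hedged sketch: compare appliances on effective cost per terabyte of
# post-reduction capacity, not sticker price. All figures are hypothetical.

def cost_per_effective_tb(price: float, raw_tb: float, reduction_ratio: float) -> float:
    """Price divided by usable (post-reduction) backup capacity in TB."""
    return price / (raw_tb * reduction_ratio)

# A cheaper appliance with weaker data reduction...
cheap = cost_per_effective_tb(price=10_000, raw_tb=10, reduction_ratio=5)    # 200.0 per TB
# ...can cost more per effective terabyte than a pricier one.
better = cost_per_effective_tb(price=15_000, raw_tb=10, reduction_ratio=20)  # 75.0 per TB
```

On these made-up numbers, the appliance with the higher entry price delivers usable capacity at well under half the cost per terabyte, which is the point of the metric.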