Over the last few years I have had a ridiculous number of conversations about deduplication. The benefits are abundant (up to 90% reduction in storage), and for the most part purchasing is a no-brainer: you can get rid of tapes, get faster restores, and see limited to no impact on performance.
One obstacle that tends to stand in the way, however, is cost. An appliance can start around $30,000, and for a proper solution you would need two of them (primary and secondary).
With that said, I did come across an open source option today, and if you are brave enough you can have your very own deduplication appliance with nothing more than some time and old server equipment!
I would note that this is “use at your own risk.” Data (in my humble opinion) is the most valuable asset any company can have, and while the idea of a free solution is nice, I would not trust tier 1 data (or tier 2, as a matter of fact) to this until you have really, really tested it. Remember, there are no support numbers and no one to call if something goes wrong. You are on your own!
What is it?
Opendedup makes a file system called SDFS that is not only free but also does legitimate inline deduplication! Below are the highlights from the website:
- Cross Platform Support – Works on Linux or Windows.
- Reduced Storage Utilization – SDFS Deduplication can reduce storage utilization by up to 90%-95%
- Scalability – SDFS can dedup a Petabyte or more of data. Over 3TB per gig of memory at 128k chunk size.
- Speed – SDFS can perform deduplication/redup at line speed 290 MB/S+
- VMware support – Works with VMs and can dedup at a 4k block size, which is required to dedup virtual machines effectively.
- Flexible storage – deduplicated data can be stored locally, on the network across multiple nodes, or in the cloud.
- Inline and Batch Mode deduplication – The file system can dedup inline or periodically based on needs. This can be changed on the fly
- File and Folder Snapshot support – Support for file or folder level snapshots.
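To put the scalability figure in perspective, here is a quick back-of-the-envelope sketch in Python. It is my own arithmetic, not anything from the Opendedup documentation, and it rests on two assumptions: that "3TB per gig of memory" is meant in binary units, and that memory use grows in proportion to the number of chunks being tracked. The function names are made up for illustration.

```python
# Rough sizing estimate derived from the "3TB per gig of memory at 128k" claim.
# ASSUMPTION: memory use is proportional to the number of chunks tracked.
# ASSUMPTION: TB/GB in the claim are treated as binary units (TiB/GiB).
KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4


def bytes_per_chunk_entry(data_per_gib_of_ram, chunk_size):
    """RAM implied per tracked chunk, given how much data 1 GiB of RAM covers."""
    chunks = data_per_gib_of_ram // chunk_size
    return GIB / chunks


def data_per_gib_ram(entry_bytes, chunk_size):
    """Amount of data 1 GiB of RAM could cover at a given chunk size."""
    return (GIB / entry_bytes) * chunk_size


# Work out the implied per-chunk cost from the quoted 128k figure...
entry = bytes_per_chunk_entry(3 * TIB, 128 * KIB)
print(f"Implied RAM per chunk entry: ~{entry:.1f} bytes")

# ...then project it onto the other supported chunk sizes.
for size_kib in (128, 64, 32, 4):
    capacity = data_per_gib_ram(entry, size_kib * KIB)
    print(f"{size_kib:>3}k chunks: ~{capacity / GIB:,.0f} GiB of data per GiB of RAM")
```

If those assumptions hold, the smaller the chunk size, the less data each gig of memory can cover, so a box meant to dedup VMs at 4k would need far more RAM per terabyte than one running at 128k. Treat the numbers as a rough guide for sizing old hardware, not a spec.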
The flexibility and performance look good, and it supports 128k, 64k, 32k, and 4k block sizes. Not bad! Below are some of the graphs showing performance:
Setup looks pretty basic, and there is a pretty extensive administrator guide. One concern I do have is that the project looks like it may be slowing down, but the product still works as it is.
I say give it a try, it's free!