PowerShell How-To: Enabling and Managing Server 2012 Volume-Level Deduplication
Today I’m going to talk about one of my personal favorite features in Windows Server 2012: volume-level deduplication.
For those not familiar with volume-level deduplication (or dedupe for short), it is a file services technology baked into Server 2012 that can provide significant storage savings on data volumes by storing identical data only once.
For example: let’s say I have 3 copies of a 4 GB .ISO file. Server 2012 dedupe breaks files into chunks and is intelligent enough to recognize that all three copies are made up of exactly the same chunks. Under the hood it keeps a single copy of each chunk in a chunk store on the volume and replaces the files themselves with pointers (reparse points) that reference those shared chunks.
This all happens without the end user even knowing about it as the file structure appears to remain unchanged.
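To put rough numbers on the ISO example, here is a back-of-the-envelope sketch. The variable names are made up for illustration, and the estimate assumes the three copies are byte-for-byte identical and ignores chunk store metadata and compression, so treat the result as a ballpark only.

```powershell
# Rough estimate for the three-ISO example.
# Assumes identical copies; ignores chunk store overhead and compression.
$copies     = 3
$fileSizeGB = 4

$logicalGB  = $copies * $fileSizeGB     # what users see on the volume
$storedGB   = $fileSizeGB               # identical data is kept roughly once
$savedGB    = $logicalGB - $storedGB    # space reclaimed, approximately
$savingsPct = [math]::Round(($savedGB / $logicalGB) * 100, 1)

"Logical: $logicalGB GB, stored: ~$storedGB GB, saved: ~$savedGB GB ($savingsPct%)"
```

In other words, about two thirds of the space those files appear to occupy could be reclaimed in this idealized case.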
It is not without its downsides, however. Your backup application has to support 2012 volume-level dedupe, and you also need to take into consideration the added CPU cycles the deduplication process consumes, although these are greatly reduced after the initial pass. So, plan accordingly!
Further information regarding planning and limitations can be found here.
So, how do we enable it? Since this is a PowerShell how-to, the Server Manager way wouldn’t be very relevant now, would it? Read on to see how we accomplish this via PowerShell, which can be very beneficial knowledge for Server Core administrators.
The first thing we need to do is verify the necessary roles are installed. We can do this with the Get-WindowsFeature cmdlet.
As we can see in the image above, I have the needed roles installed.
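A check along these lines narrows the output to just the two features this guide relies on. The feature names are the same ones used in the install command in this guide; the column selection is only my choice for readability.

```powershell
# List only the File Server and Data Deduplication features and their state.
# "Installed" in the InstallState column means the feature is already present.
Get-WindowsFeature -Name FS-FileServer, FS-Data-Deduplication |
    Format-Table Name, DisplayName, InstallState -AutoSize
```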
If you need to install them, however, you would do so with the following command:
Install-WindowsFeature FS-FileServer, FS-Data-Deduplication
Now we need to determine which volumes we want to dedupe. Deduplication is not allowed on system or boot volumes and can only be enabled on data volumes. To get a list of volumes on my system I simply issue a Get-Volume command.
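If the raw Get-Volume output is noisy, a filter like the following can help shortlist candidates. This is a sketch that assumes you only care about NTFS volumes with a drive letter (Server 2012 dedupe works on NTFS data volumes); you still need to rule out the system and boot volume yourself.

```powershell
# Show NTFS volumes that have a drive letter, with size and free space in GB.
# This does NOT exclude the system/boot volume; skip that one manually.
Get-Volume |
    Where-Object { $_.DriveLetter -and $_.FileSystem -eq 'NTFS' } |
    Select-Object DriveLetter, FileSystemLabel, FileSystem,
        @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB, 1) } },
        @{ Name = 'FreeGB'; Expression = { [math]::Round($_.SizeRemaining / 1GB, 1) } } |
    Format-Table -AutoSize
```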
In my case, I want dedupe applied only to the volume that contains all my file shares, which is my DATA volume, E:. To do so I will use the cmdlets located in the Deduplication module in PowerShell.
The most basic way of enabling dedupe is by way of the Enable-DedupVolume cmdlet.
Enable-DedupVolume -Volume E:
This cmdlet by default will enable dedupe on the specified volume and will set the optimization type to background, meaning files will be deduped slowly in the background.
From what I’ve seen myself, this works pretty well in most cases, but it can put a strain on file servers with modest CPU resources. So, take that into account.
To verify that dedupe has been enabled on the stated volume simply run the Get-DedupStatus cmdlet.
As you can see, I’ve already saved some space since enabling dedupe. Depending on your file contents, your mileage may vary.
If you want even more detailed information, you can pipe the Get-DedupStatus command to the Format-List command (fl for short) as shown below:
Get-DedupStatus | fl
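The full list can be a lot to scan. As a sketch, you could limit the output to one volume and a handful of fields; the selection below is just the set I would look at first, and E: is my data volume from earlier.

```powershell
# Status for a single volume, trimmed to the fields that matter day to day.
Get-DedupStatus -Volume E: |
    Format-List Volume, FreeSpace, SavedSpace, OptimizedFilesCount,
                InPolicyFilesCount, LastOptimizationTime
```

InPolicyFilesCount versus OptimizedFilesCount gives a quick feel for how much eligible data has actually been processed so far.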
Now… onto schedules
If the background optimization doesn’t fit your needs you can schedule when the deduplication actually occurs. This is especially useful if you have a highly utilized file server and can’t spare the CPU cycles during production hours.
You can create and modify custom schedules via the New-DedupSchedule and Set-DedupSchedule cmdlets.
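As a rough illustration, a custom optimization window might look like this. The schedule name, days, start time, and duration are all made-up values for the example, so adjust them to your own quiet hours.

```powershell
# Create a weeknight optimization window: starts at 11 PM, runs up to 6 hours.
# "NightlyOptimization" is a hypothetical name chosen for this example.
New-DedupSchedule -Name "NightlyOptimization" `
                  -Type Optimization `
                  -Days Monday, Tuesday, Wednesday, Thursday, Friday `
                  -Start 23:00 `
                  -DurationHours 6
```

An existing schedule can later be adjusted with Set-DedupSchedule using the same parameter names, for example to change -Start or -DurationHours.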
At this point you’ll want to disable the default BackgroundOptimization schedule. The command to do so is not the most intuitive command. You would do so by running:
Set-DedupSchedule -Name BackgroundOptimization -Enabled $false
This will disable the background optimization schedule and your volume will now be deduped on your schedule only.
To review the current status of your dedupe schedules you would simply issue:
Get-DedupSchedule | fl
You’ll also see a couple of other schedules that are created by default: a garbage collection schedule and a scrubbing schedule.
The scrubbing schedule is simply a data integrity schedule. It goes through and verifies that the deduped files have the correct pointers in place, that the necessary metadata is intact, and so on.
As for the garbage collection schedule: when a deduped file is deleted from the volume, some chunks are left behind in the chunk store. The garbage collection job simply identifies the chunks that are no longer referenced and removes them.
One other thing to note: by default, only files older than 5 days will be deduped. This setting can be easily modified. For example, if I wanted to set it so only files older than 15 days on the E: volume get deduped, I would issue:
Set-DedupVolume E: -MinimumFileAgeDays 15
Also, if you want all files to be deduped automatically no matter the age you would just set the -MinimumFileAgeDays option to 0.
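To confirm the change took effect, something like the following should do. It is a sketch against my E: volume, and the property list is only a convenient subset.

```powershell
# Review the per-volume dedupe settings after changing the minimum file age.
Get-DedupVolume -Volume E: |
    Format-List Volume, Enabled, MinimumFileAgeDays, SavedSpace, SavingsRate
```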
Utilizing the Server 2012 and 2012 R2 volume-level deduplication features allows you to achieve substantial space savings, depending on the makeup of your files. I highly recommend enabling this feature on any and every Windows file server in your environment.
I hope this guide has helped you get up and running with this great feature.
Thanks for reading!