The following article describes how to use Windows Server data deduplication on a solid-state drive (SSD) that holds active Hyper-V virtual machines.
Coloring Outside the Lines Statement:
This configuration is not supported by Microsoft. See Plan to Deploy Data Deduplication for more information. Use these procedures at your own risk. That said, it works great for me. Your mileage may vary.
A while back I decided to add another 224GB SATA III SSD to my blistering Windows Server 2012 Hyper-V server for my active VMs. The performance is outstanding and it makes the server dead silent. I moved my primary always-on Hyper-V VM workloads to this new SSD:
- Domain Controller on WS2012
- Exchange 2010 multi-role server on WS2012
- TMG server on WS2008 R2
These VMs took 134GB, or 60%, of the capacity of the drive which was
fine at the time. Later, I added a multi-role Exchange 2013 server
which took up another 60GB of space. That left me with only 13% free
space, which didn't leave much room for VHD expansion and certainly not
enough to host any other VMs. Rather than buy another larger and more
expensive SSD, I decided to see how data deduplication performs in
Windows Server 2012.
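The capacity figures above can be checked with a few lines of PowerShell. This is only a sketch of the arithmetic; the variable names are illustrative and the sizes are the ones quoted in this article.

```powershell
# Sizes in GB, taken from the article text.
$capacityGB     = 224   # SSD capacity
$initialVMsGB   = 134   # DC, Exchange 2010, and TMG VMs
$exchange2013GB = 60    # Exchange 2013 VM added later

$usedGB     = $initialVMsGB + $exchange2013GB
$pctInitial = [math]::Round(100 * $initialVMsGB / $capacityGB)
$pctUsed    = [math]::Round(100 * $usedGB / $capacityGB)
$pctFree    = 100 - $pctUsed

"Initial VMs: {0} GB ({1}% of the drive)" -f $initialVMsGB, $pctInitial
"After Exchange 2013: {0} GB used ({1}%), {2}% free" -f $usedGB, $pctUsed, $pctFree
```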
Add the Data Deduplication Feature
Data Deduplication is a feature of the File and Storage Services role in
Windows Server 2012. It's not installed by default, so you need to
install it using the Add Roles and Features Wizard (as above) or by
using the following PowerShell commands:
PS C:\> Import-Module ServerManager
PS C:\> Add-WindowsFeature -Name FS-Data-Deduplication
PS C:\> Import-Module Deduplication
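As a quick check after installation, something like the following should confirm that the feature is present and that the deduplication cmdlets are available. It is a minimal sketch that assumes the same feature name used in the commands above.

```powershell
# Confirm the Data Deduplication feature is installed on this server.
$feature = Get-WindowsFeature -Name FS-Data-Deduplication

if ($feature.Installed) {
    # List the cmdlets exposed by the Deduplication module.
    Get-Command -Module Deduplication | Select-Object -ExpandProperty Name
} else {
    Write-Warning "FS-Data-Deduplication is not installed yet."
}
```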
Next, you need to enable data deduplication on the volume. Use the File and Storage Services node of Server Manager and click Volumes. Then right-click the drive you want to configure for deduplication and select Configure Data Deduplication, as shown below:
Configuring Data Deduplication on Volume X:
From here on I'm going to customize deduplication for my Hyper-V SSD.
In the Configure Data Deduplication Settings for the SSD, select Enable data deduplication and configure it to deduplicate files older than 0 days. Click the Set Deduplication Schedule button and uncheck Enable background optimization, Enable throughput optimization, and Create a second schedule for throughput optimization.
Enable Data Deduplication for Files Older Than 0 Days
Disable Background Optimization and Throughput Optimization Schedules
You can also configure these data deduplication settings from PowerShell using the following commands:
PS C:\> Enable-DedupVolume X:
PS C:\> Set-DedupVolume X: -MinimumFileAgeDays 0
PS C:\> Set-DedupSchedule -Name "BackgroundOptimization", "ThroughputOptimization", "ThroughputOptimization-2" -Enabled $false
This configuration mitigates the reason why Microsoft does not support data deduplication on drives that host Hyper-V VMs. Mounted VMs are always open for writing and have a fairly large change rate. This is the reason Microsoft says, "Deduplication is not supported for files that are open and constantly changing for extended periods of time or that have high I/O requirements."
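To confirm that the settings took effect, the volume and schedule configuration can be read back. This is a sketch that assumes the same X: drive letter; any other schedules that appear in the list (garbage collection or scrubbing, for example) are not touched by the commands above.

```powershell
# Show whether deduplication is enabled on X: and the minimum file age in effect.
Get-DedupVolume -Volume X: | Format-List Volume, Enabled, MinimumFileAgeDays

# List every deduplication schedule with its enabled state. The optimization
# schedules disabled earlier should report Enabled as False.
Get-DedupSchedule | Format-Table Name, Type, Enabled -AutoSize
```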
To deduplicate the files and recover substantial disk space, you need to shut down the VMs hosted on the volume and then run deduplication manually with this command:
PS C:\> Start-DedupJob -Volume X: -Type Optimization
This manual deduplication job can take some time to run depending on the amount of data and the speed of your drive. In my environment it took about 90 minutes to deduplicate a 224GB SATA III SSD that was 87% full. You can monitor the progress of the deduplication job at any time using the Get-DedupJob cmdlet. The cmdlet shows the percentage of progress, but does not return any output once the job finishes.
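Rather than re-running Get-DedupJob by hand, a small polling loop can report progress until the job disappears from the queue. This is a sketch, not the only way to do it: the 30-second interval is arbitrary, and Start-DedupJob also offers a -Wait switch if you would rather block until the job completes.

```powershell
# Poll the jobs on X: every 30 seconds (the interval is an arbitrary choice).
# Get-DedupJob returns nothing once the job has finished, which ends the loop.
while ($jobs = Get-DedupJob -Volume X: -ErrorAction SilentlyContinue) {
    foreach ($job in $jobs) {
        "{0:T}  {1}  {2}  {3}%" -f (Get-Date), $job.Type, $job.State, $job.Progress
    }
    Start-Sleep -Seconds 30
}
"No deduplication jobs remain on X:"
```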
You can also monitor the job using Resource Monitor, as shown below:
Resource Monitor During Deduplication
Once deduplication completes, you can restart your VMs and check the level of deduplication and how much space has been recovered. From the File and Storage Services console, right-click the volume and select Properties:
Properties of Deduplicated SSD Volume
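The same savings figures should also be available from PowerShell. The sketch below assumes the X: drive letter used throughout; Format-List is used for the status output so that every reported property is shown, rather than relying on a particular set of column names.

```powershell
# Space saved and savings rate for the volume.
Get-DedupVolume -Volume X: | Select-Object Volume, SavedSpace, SavingsRate

# Detailed status, including free space and optimized file counts.
Get-DedupStatus -Volume X: | Format-List
```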
The drive above now actually holds more reconstituted data than the capacity of the drive itself with no noticeable degradation in performance. It currently hosts the following active Hyper-V VMs:
- Domain Controller on WS2012
- Exchange 2010 multi-role server on WS2012
- TMG server on WS2008 R2
- Exchange 2013 multi-role server on WS2012
- Exchange 2013 CAS on WS2012
- Exchange 2013 Mailbox Server on WS2012
Caveats:
- Because real-time optimization is not being performed, the VMs will grow over time as changes are made and data is added. The manual deduplication job needs to be rerun periodically to recover space.
- Since the SSD now holds more data than its raw capacity, I'm unable to disable deduplication without moving some data off the volume first.
- Even though more VMs can be added to this volume, you have to be sure that there is sufficient free space on the volume to perform deduplication.
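The periodic maintenance described in the caveats could be scripted along these lines. Treat it as a hedged sketch rather than a tested procedure: the drive letter and the 10% free-space threshold are assumptions of mine (not a Microsoft recommendation), and it assumes the VMs have integration services installed so that Stop-VM can shut them down cleanly.

```powershell
# Hypothetical values: adjust the drive letter and threshold for your environment.
$driveLetter = 'X'
$minFreePct  = 10   # assumed safety margin before starting optimization

# Check free space first (Get-Volume is part of the Storage module).
$vol     = Get-Volume -DriveLetter $driveLetter
$freePct = [math]::Round(100 * $vol.SizeRemaining / $vol.Size, 1)

if ($freePct -lt $minFreePct) {
    Write-Warning "Only $freePct% free on ${driveLetter}:. Free up space before optimizing."
} else {
    # Find running VMs that have at least one virtual disk on the volume.
    $vms = @(Get-VM | Where-Object {
        $_.State -eq 'Running' -and ($_.HardDrives.Path -like "${driveLetter}:\*")
    })

    # Shut the VMs down, optimize, then start the same VMs again.
    # try/finally restarts the VMs even if the deduplication job fails.
    if ($vms) { $vms | Stop-VM }
    try {
        Start-DedupJob -Volume "${driveLetter}:" -Type Optimization -Wait
    } finally {
        if ($vms) { $vms | Start-VM }
    }
}
```

Run it from an elevated PowerShell session on the Hyper-V host during a maintenance window, since every VM on the volume is offline while the job runs.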
For even more information about Windows Server 2012 data deduplication, I encourage you to read Step-by-Step: Reduce Storage Costs with Data Deduplication in Windows Server 2012!
I hope you find this article useful in your own deployments and I'm
interested to know what your experience is. Please leave a comment
below!