Using WIM files to speed up application deployment
Deploying large applications can be a soul-crushing job. The time it takes to install a huge application is bad enough, but it is often made worse by how long it takes to copy tens of thousands of files around the network.
Occasionally an idea appears out of nowhere that makes you wonder why no one thought of it before. I discovered this method via a blog post from Justin Holloman, who in turn was inspired by Aaron Y (@_aarony).
- Background
- Example - how to speed up Solidworks deployment time by an hour
- Configuration Manager adds to the misery
- WVD packages over a VPN
- Some real life data
- Adding WIM files to a packaging workflow
- Expert level WIM files - multiple index files
- Final thoughts
Background
The idea here is to use a WIM file as a data disk to hold the source files. This results in one large file instead of a folder containing a large number of files. This matters because copying large numbers of small files over SMB is slow.
During file transfer, file creation causes both high protocol overhead and high file system overhead. For large file transfers, these costs occur only one time. When a large number of small files are transferred, the cost is repetitive and causes slow transfers.
Over the past week I’ve run a number of tests to see how using this idea would work for large problematic apps. The method has worked so well I’m about to transition all of the packages detailed in this test to production.
The results are nothing short of amazing. To see why, consider an app such as Solidworks from Dassault Systèmes.
Example - how to speed up Solidworks deployment time by an hour
This application is updated annually, is 22GB in size and the installer has over 60,000 files. Copying the source files from a network drive to a workstation takes over an hour.
Copying the same files encapsulated in a WIM takes 3 minutes 40 seconds.
Just placing the source files in a WIM has saved one hour and 13 minutes of deployment time!
Configuration Manager adds to the misery
If SCCM (MEMCM) is used for software distribution, the problem of deploying a large application with lots of files is compounded by the Configuration Manager Content Library.
In a nutshell, the content library stores all the Configuration Manager content efficiently on the disk. If the same file is part of two different packages, it stores only one copy in the content library. However, references are kept indicating that the file is part of both packages.
The Content Library was first introduced in SCCM 2012 and has been one of the few components of SCCM that regularly causes me headaches. The general concept is sound, but it struggles with very large files and packages with lots of smaller files.
When an app is imported into SCCM every single file needs to be hashed then copied into the SCCMContentLib. This takes a very long time because SMS_DISTRIBUTION_MANAGER not only copies the files from the source location into SCCMContentLib, it also creates a signature for each file.
This can take days to finish. It’s not much fun watching thousands of lines of distmgr.log scrolling past in cmtrace when something inevitably goes wrong with the process.
Adding a package to SCCMContentLib still takes time with a single WIM file, but with an application like Solidworks only one hash is needed instead of 60,000.
WVD packages over a VPN
My workflow for WVD is to prepare packages locally then copy to Azure. Copying tens of thousands of files over a VPN connection is not a pleasant experience, so our solution was to compress the install package in a zip file.
Using a WIM file is a subtle but welcome change to our WVD workflow. We can still package locally and compress into a single file, but since the WIM file has already been created to improve SCCM deployment times, there is no extra work to be done. And because the resulting file is mounted and not decompressed, there is very little overhead when installing the app on a WVD desktop.
Some real life data
To measure the improvement a WIM file can bring, the following steps were followed:
- Copy the source files from a network drive to a test workstation
Measure-Command { Copy-Item R:\Applications\Solidworks_2020_SP3_x64_R01 -Destination C:\users\Public\solidworksource -Recurse -Container }
- Create a WIM file from the source
Measure-Command { New-WindowsImage -CapturePath C:\users\public\solidworksource -ImagePath "C:\Users\Public\Solidworks_2020_SP3_x64_R01.wim" -Name "Solidworks_2020_SP3_x64_R01" }
- Copy the resulting WIM file from a network drive to a test workstation
Measure-Command { Copy-Item R:\Applications\_WIM\Solidworks_2020_SP3_x64_R01.wim C:\users\Public\solidworksWIM }
This was repeated with some of the other larger apps that we have that get updated on an annual basis. The results are impressive:
App | files and folders | size on disk | WIM size | file copy | WIM copy | time saved |
---|---|---|---|---|---|---|
Solidworks_2020_SP3_x64_R01 | 60143 | 22.1GB | 9.16GB | 76m40s | 3m40s | 73m0s |
Matlab_2020a_x64_R01 | 25133 | 17.8GB | 16.8GB | 46m47s | 5m22s | 41m25s |
Revit_2020_x64_R01 | 17970 | 15.6GB | 12.5GB | 21m13s | 4m13s | 17m0s |
Robot_StructuralAP_2020_x64_R01 | 9767 | 2.78GB | 1.10GB | 12m14s | 29s | 11m45s |
AutoCAD_2020_x64_R01 | 5739 | 3.65GB | 1.76GB | 11m0s | 49s | 10m11s |
Using WIM files for these 5 apps has saved over 2 hours 30 minutes in file copy time. Solidworks has shown the most dramatic improvement, but saving 40 minutes on Matlab is also impressive considering the size has only shrunk from 17.8GB to 16.8GB.
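The "time saved" figures can be rechecked with a few lines of PowerShell. This is only a sketch that re-enters the copy times from the table above; the variable names are made up for illustration.

```powershell
# Copy times re-entered from the results table (file copy vs WIM copy).
$results = @(
    [pscustomobject]@{ App = 'Solidworks_2020_SP3_x64_R01';     FileCopy = (New-TimeSpan -Minutes 76 -Seconds 40); WimCopy = (New-TimeSpan -Minutes 3 -Seconds 40) }
    [pscustomobject]@{ App = 'Matlab_2020a_x64_R01';            FileCopy = (New-TimeSpan -Minutes 46 -Seconds 47); WimCopy = (New-TimeSpan -Minutes 5 -Seconds 22) }
    [pscustomobject]@{ App = 'Revit_2020_x64_R01';              FileCopy = (New-TimeSpan -Minutes 21 -Seconds 13); WimCopy = (New-TimeSpan -Minutes 4 -Seconds 13) }
    [pscustomobject]@{ App = 'Robot_StructuralAP_2020_x64_R01'; FileCopy = (New-TimeSpan -Minutes 12 -Seconds 14); WimCopy = (New-TimeSpan -Seconds 29) }
    [pscustomobject]@{ App = 'AutoCAD_2020_x64_R01';            FileCopy = (New-TimeSpan -Minutes 11);             WimCopy = (New-TimeSpan -Seconds 49) }
)

# Subtract the WIM copy time from the file copy time and keep a running total.
$totalSaved = [timespan]::Zero
foreach ($r in $results) {
    $saved = $r.FileCopy - $r.WimCopy
    $totalSaved += $saved
    '{0,-32} saved {1}' -f $r.App, $saved
}

"Total saved: $totalSaved"   # Total saved: 02:33:21
```

The total comes to 2 hours 33 minutes 21 seconds, which matches the "over 2 hours 30 minutes" quoted above.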
Two parts of the WIM deployment process haven’t been measured empirically:
- how long it takes to import to SCCMContentLib
- how long it takes to mount the WIM file
After years of pain I just couldn’t bring myself to add these apps the old-fashioned way again, so there’s no quantitative data on import times. Even without real numbers, the import was much quicker than normal; all five of the above WIM files imported in a few hours compared to the standard week or more.
Mount times weren’t formally recorded either. For each of the five packages, mounting took less than 20 seconds, which, given the massive gains made elsewhere, does not negatively impact the process too much.
Adding WIM files to a packaging workflow
All of the above is great in theory, but how can this be put into production?
Once the WIM file is created from the package source files it is placed in its own folder. We then copy two generic files to start the install process. The first is install.cmd, containing a single line.
The second generic file is a barely customised copy of the code from Justin Holloman’s post, but the key pieces of logic are below
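The script itself is not reproduced in this copy of the post, so what follows is only a minimal sketch of the mount, install, dismount pattern such a script might follow. It is not Justin Holloman's code. The function name `Invoke-WimInstall`, the temp mount folder, and the assumption that the app's own install.cmd sits at the root of the mounted image are all illustrative. `Mount-WindowsImage` and `Dismount-WindowsImage` come from the DISM PowerShell module and need an elevated session.

```powershell
function Invoke-WimInstall {
    [CmdletBinding()]
    param(
        # Path to the WIM that holds the package source files.
        [Parameter(Mandatory)]
        [string]$WimPath,

        # Image index to mount; a single-app WIM only has index 1.
        [int]$Index = 1,

        # Installer expected at the root of the mounted image (assumption).
        [string]$InstallerName = 'install.cmd'
    )

    if (-not (Test-Path -LiteralPath $WimPath -PathType Leaf)) {
        throw "WIM file not found: $WimPath"
    }

    # Hypothetical mount point: a unique, empty folder under the temp directory.
    $mountDir = Join-Path ([System.IO.Path]::GetTempPath()) ('WimMount_' + [guid]::NewGuid().ToString('N'))
    New-Item -ItemType Directory -Path $mountDir -Force | Out-Null

    $mounted = $false
    try {
        # Read-only mount: the files are used in place, nothing is extracted.
        Mount-WindowsImage -ImagePath $WimPath -Index $Index -Path $mountDir -ReadOnly -ErrorAction Stop | Out-Null
        $mounted = $true

        # Run the installer from the mounted image and wait for it to finish.
        $installer = Join-Path $mountDir $InstallerName
        $process = Start-Process -FilePath 'cmd.exe' -ArgumentList ('/c "{0}"' -f $installer) -WorkingDirectory $mountDir -Wait -PassThru
        return $process.ExitCode
    }
    finally {
        # Always clean up, even if the installer failed.
        if ($mounted) {
            Dismount-WindowsImage -Path $mountDir -Discard -ErrorAction SilentlyContinue | Out-Null
        }
        Remove-Item -LiteralPath $mountDir -Recurse -Force -ErrorAction SilentlyContinue
    }
}
```

A wrapper script sitting next to the WIM could then call something like `exit (Invoke-WimInstall -WimPath (Join-Path $PSScriptRoot 'Solidworks_2020_SP3_x64_R01.WIM'))` so the installer's exit code is passed back to the deployment tool.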
From the example above, the resulting Solidworks folder contains the following files:
install.cmd
Install-FromWIM.ps1
Solidworks_2020_SP3_x64_R01.WIM
Every app that has been through the WIM packaging process has identical supporting files in a unique folder alongside its WIM file. For example, here are the Matlab folder contents:
install.cmd
Install-FromWIM.ps1
Matlab_2020a_x64_R01.WIM
Since the two supporting files are identical and can be reused over and over again, it makes the repackaging of installers as WIM files easy.
In case there’s any confusion about how the install is kicked off, every application that we package has an install.cmd file at the root of the package folder that has the command line (or lines) needed to install the software. This consistency makes the transition to WIM very easy as we know this file will always be present to install the software.
One thing that caught me out was that since we are mounting the package into a different folder, the relative paths in my cmd files didn’t work. This was easily fixed by using %~dp0, which expands to the drive and folder of the running batch file.
Expert level WIM files - multiple index files
Three of the above applications are from the same vendor - Autodesk. WIM files can deduplicate multiple copies of the same file, so why not try combining the apps into a single WIM with multiple indexes?
These three apps have a combined size of 15.36GB when we put them into separate WIM files. Adding Robot Structural Analysis Professional and Revit to the AutoCAD WIM was a case of running the following two commands:
Add-WindowsImage -CapturePath C:\users\public\robotsource -Name "Robot_StructuralAP_2020_x64_R01" -ImagePath "C:\Users\Public\AutoCAD_2020_x64_R01.wim"
Add-WindowsImage -CapturePath C:\users\public\revitsource -Name "Revit_2020_x64_R01" -ImagePath "C:\Users\Public\AutoCAD_2020_x64_R01.wim"
The resulting WIM file is 13.6GB in size, a further reduction of 1.76GB, and it now contains three apps from the same vendor.
To install each application we can modify our PowerShell installer script to loop through the WIM indexes.
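The modified script is not shown in this copy, so here is a hedged sketch of one way the loop could look. `Get-WindowsImage -ImagePath` lists the images in a WIM and returns `ImageIndex` and `ImageName` for each. The function name `Invoke-WimInstallAllIndexes`, the temp mount folder, and the assumption that every index has its own install.cmd at its root are illustrative, not taken from the original script.

```powershell
function Invoke-WimInstallAllIndexes {
    [CmdletBinding()]
    param(
        # Multi-index WIM, e.g. the combined Autodesk WIM built above.
        [Parameter(Mandatory)]
        [string]$WimPath,

        # Installer expected at the root of every index (assumption).
        [string]$InstallerName = 'install.cmd'
    )

    if (-not (Test-Path -LiteralPath $WimPath -PathType Leaf)) {
        throw "WIM file not found: $WimPath"
    }

    # One entry per image in the WIM, each carrying ImageIndex and ImageName.
    $images = Get-WindowsImage -ImagePath $WimPath -ErrorAction Stop

    foreach ($image in $images) {
        # Hypothetical mount point, unique per index.
        $mountDir = Join-Path ([System.IO.Path]::GetTempPath()) ('WimMount_' + [guid]::NewGuid().ToString('N'))
        New-Item -ItemType Directory -Path $mountDir -Force | Out-Null

        $mounted = $false
        try {
            Mount-WindowsImage -ImagePath $WimPath -Index $image.ImageIndex -Path $mountDir -ReadOnly -ErrorAction Stop | Out-Null
            $mounted = $true

            $installer = Join-Path $mountDir $InstallerName
            $process = Start-Process -FilePath 'cmd.exe' -ArgumentList ('/c "{0}"' -f $installer) -WorkingDirectory $mountDir -Wait -PassThru

            # Emit one result object per index so the caller can check exit codes.
            [pscustomobject]@{
                Index    = $image.ImageIndex
                Name     = $image.ImageName
                ExitCode = $process.ExitCode
            }
        }
        finally {
            if ($mounted) {
                Dismount-WindowsImage -Path $mountDir -Discard -ErrorAction SilentlyContinue | Out-Null
            }
            Remove-Item -LiteralPath $mountDir -Recurse -Force -ErrorAction SilentlyContinue
        }
    }
}
```

Mounting one index at a time keeps only a single mount point active, at the cost of a mount and dismount per app. To install just one app from the combined WIM, the same loop could be filtered on `ImageName`.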
Final thoughts
One thing that is missing from the tests run above is an indication of a cutoff point; the point at which it is more efficient to leave the package as a collection of files instead of sticking them in a WIM file.
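The tests above do not establish where that cutoff lies, but file count and average file size are the obvious numbers to look at when judging a new package. The helper below is a small sketch for gathering them so a package can be compared against the apps in the results table. The name `Get-PackageStats` and its output fields are hypothetical, and it deliberately makes no recommendation.

```powershell
function Get-PackageStats {
    [CmdletBinding()]
    param(
        # Root folder of the package source files.
        [Parameter(Mandatory)]
        [string]$Path
    )

    if (-not (Test-Path -LiteralPath $Path -PathType Container)) {
        throw "Folder not found: $Path"
    }

    # Every file under the package root, including hidden ones.
    $files = @(Get-ChildItem -LiteralPath $Path -Recurse -File -Force)

    [long]$totalBytes = 0
    foreach ($file in $files) {
        $totalBytes += $file.Length
    }

    $averageKB = 0
    if ($files.Count -gt 0) {
        $averageKB = [math]::Round(($totalBytes / $files.Count) / 1KB, 1)
    }

    [pscustomobject]@{
        Path          = $Path
        FileCount     = $files.Count
        TotalBytes    = $totalBytes
        TotalSizeGB   = [math]::Round($totalBytes / 1GB, 2)
        AverageFileKB = $averageKB
    }
}
```

Running `Get-PackageStats -Path C:\users\Public\solidworksource` before and after a test copy gives a consistent set of figures to record next to the measured copy times.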
It’s amazing to think that this is a brand new deployment method to many people (including me!) when WIM data images have been available using stock Windows tools for nearly 15 years.