Deploying large applications can be a soul crushing job. The time to install a huge application can be bad enough but it is often made worse by how long it takes to copy tens of thousands of files around the network.

Occasionally an idea will appear out of nowhere that makes you wonder why no-one thought of it before. I discovered this method via a blog post from Justin Holloman, who in turn was inspired by Aaron Y (@_aarony).

Background

The idea here is to use a WIM file as a data disk to hold the source files. This results in one large file instead of a folder with a large number of files. This is important because copying large numbers of small files over SMB is not a good idea

During file transfer, file creation causes both high protocol overhead and high file system overhead. For large file transfers, these costs occur only one time. When a large number of small files are transferred, the cost is repetitive and causes slow transfers.

docs.microsoft.com

Over the past week I’ve run a number of tests to see how using this idea would work for large problematic apps. The method has worked so well I’m about to transition all of the packages detailed in this test to production.

The results are nothing short of amazing. To see why, consider an app such as Solidworks from Dassault Systèmes.

Example - how to speed up Solidworks deployment time by an hour

This application is updated annually, is 22GB in size and the installer has over 60,000 files. Copying the source files from a network drive to a workstation takes over an hour

alt text

Copying the same files encapsulated in a WIM takes 3 minutes 40 seconds alt text

Just by placing the source files in a WIM has saved one hour and 13 minutes deployment time!

Configuration Manager adds to the misery

If SCCM (MEMCM) is used for software distribution the problem of deploying a large application with lots of files is compounded by the Configuration Manager Content Library

In a nut-shell, the content library stores all the Configuration Manager content efficiently on the disk. If the same file is part of two different packages, it stores only one copy in the content library. However, references are kept indicating that the file is part of both the packages.

Understanding the Configuration Manager Content Library

The Content Library was first introduced in SCCM 2012 and has been one of the few components of SCCM that regularly causes me headaches. The general concept is sound, but it struggles with very large files and packages with lots of smaller files.

When an app is imported into SCCM every single file needs to be hashed then copied into the SCCMContentLib. This takes a very long time because SMS_DISTRIBUTION_MANAGER not only copies the files from the source location into SCCMContentLib, it also creates a signature for each file.

This can take days to finish. It’s not much fun watching thousands of lines of distmgr.log scrolling past in cmtrace when something inevitably goes wrong with the process.

Adding a package to SCCMContentLib still takes time with a single WIM file, but with an application like Solidworks only one hash is needed instead of 60,000.

WVD packages over a VPN

My workflow for WVD is to prepare packages locally then copy to Azure. Copying tens of thousands of files over a VPN connection is not a pleasant experience, so our solution was to compress the install package in a zip file.

Using a WIM file is a subtle but welcome change to our WVD workflow. We still can package locally and compress into a single file. But since the WIM file has already been created to improve SCCM deployment times, there is no extra work to be done. And because the resulting file is mounted and not decompressed, there is very little overhead when installing the app on a WVD desktop.

Some real life data

To measure the improvement a WIM file can bring, the following steps were followed

  1. Copy the source files from a network drive to a test workstation

     Measure-Command {Copy-Item R:\Applications\Solidworks_2020_SP3_x64_R01 -Destination C:\users\Public\solidworksource -recurse -Container }
    
  2. Create a WIM file from the source

     Measure-Command { New-WindowsImage -CapturePath C:\users\public\solidworksource -ImagePath "C:\Users\Public\Solidworks_2020_SP3_x64_R01.wim" -Name "Solidworks_2020_SP3_x64_R01" }
    
  3. Copy the resulting WIM file from a network drive to a test workstation

     Measure-Command {Copy-Item R:\Applications\_WIM\Solidworks_2020_SP3_x64_R01.wim C:\users\Public\solidworksWIM }
    

This was repeated with some of the other larger apps that we have that get updated on an annual basis. The results are impressive

App files and folders size on disk WIM size file copy WIM copy time saved
Solidworks_2020_SP3_x64_R01 60143 22.1GB 9.16GB 76m40s 3m40s 73m0s
Matlab_2020a_x64_R01 25133 17.8GB 16.8GB 46m47s 5m22s 41m25s
Revit_2020_x64_R01 17970 15.6GB 12.5GB 21m13s 4m13s 17m0s
Robot_StructuralAP_2020_x64_R01 9767 2.78GB 1.10GB 12m14s 29s 11m45s
AutoCAD_2020_x64_R01 5739 3.65GB 1.76GB 11m0s 49s 10m11s

Using WIM files for these 5 apps has saved over 2 hours 30 minutes in file copy time. Solidworks has shown the most dramatic improvement, but saving 40 minutes on Matlab is also impressive considering the size has only shrunk from 17.8GB to 16.8GB.

Two parts of the WIM deployment process haven’t been measured empirically and those are

  • how long it takes to import to SCCMContentLib
  • how long it takes to mount the WIM file

After years of pain I just couldn’t bring myself to add these apps the old fashioned way again, so there’s no quantitive data on import times. Even without real numbers, the import was much quicker than normal; all five of the above WIM files imported in a few hours compared to the standard week or more.

Another part of the process that wasn’t measured was the time taken to mount each WIM file. For each of the five packages it was less than 20 seconds per packge which, given the massive gains made elsewhere, does not negatively impact the process too much.

Adding WIM files to a packaging workflow

All of the above is great in theory, but how can this be put into production?

Once the WIM file is created from the package source files it is placed in its own folder. We then copy two generic files to start the install process. The first is install.cmd containing a single line

The second generic file is a barely customised copy of the code from Justin Hollman’s post, but the key pieces of logic are below

From the example above, the resulting Solidworks folder contains the following files

  • install.cmd
  • Install-FromWIM.ps1
  • Solidworks_2020_SP3_x64_R01.WIM

Every app that has been through the WIM packaging process has the same identical supporting files in a unique folder alongside its WIM file. For example, here’s the Matlab folder contents

  • install.cmd
  • Install-FromWIM.ps1
  • Matlab_2020a_x64_R01.WIM

Since the two supporting files are identical and can be reused over and over again, it makes the repackaging of installers as WIM files easy.

In case there’s any confusion about how the install is kicked off, every application that we package has an install.cmd file at the root of the package folder that has the command line (or lines) needed to install the software. This consistency makes the transition to WIM very easy as we know this file will always be present to install the software.

One thing that caught me out was that since we are mounting the package into a different folder, the relative paths in my cmd files didn’t work. This was easily fixed by using %~dp0

alt text alt text

Expert level WIM files - multiple index files

Three of the above applications are from the same vendor - AutoDesk. WIM files can deduplicate multiple copies of the same file, so why not try combining the apps into a single WIM with multiple indexes?

These three apps have a combined size of 15.36GB when we put them into separate WIM files. Adding Robot Structural Analysis Professional and Revit to the AutoCAD WIM was a case of running the following two commands

Add-WindowsImage -CapturePath C:\users\public\robotsource -Name "Robot_StructuralAP_2020_x64_R01" -ImagePath "C:\Users\Public\AutoCAD_2020_x64_R01.wim" 

Add-WindowsImage -CapturePath C:\users\public\revitsource -Name "Revit_2020_x64_R01" -ImagePath "C:\Users\Public\AutoCAD_2020_x64_R01.wim"

The resulting WIM file is 13.6GB in size, a further reduction in size of 1.76GB, and we can see that it contains three apps from the same vendor

alt text

To install each application we can modify our powershell installer script to loop through the WIM indexes

Final thoughts

One thing that is missing from the tests run above is an indication of a cutoff point; the point at which it is more efficient to leave the package as a collection of files instead of sticking them in a WIM file.

It’s amazing to think that this is a brand new deployment method to many people (including me!) when WIM data images have been available using stock Windows tools for nearly 15 years.