From the remaining 880GB of data in the database, 350GB was being used to store PDF files in varchar(max) columns. These files were also worthy of attention.

Working with PDF files is clearly outside the area of SQL Server, but it’s worth noting the changes we made to those files:

  • Reduced text resolution from 1200x1200 DPI to 150x150 DPI. Even at 75x75 DPI, I could not find a user who could see the difference.
  • Replaced existing TIFF images in new files with smaller (and higher-quality) PNG images. This not only reduced the size but improved the quality of scaled output. This change could not be made for older existing files. For those, I reduced the quality of the stored images using functions in the PDF library, until I achieved a good balance of size and quality.
  • Removed unnecessary fonts that were being stored in each and every PDF file.

The outcome of this was that the 350GB of PDF data was reduced to 120GB of data.