Optimizing a PDF document can significantly reduce its size, making it faster to load, read, and share. DsPdf provides several options for optimizing PDF documents without compromising their quality or integrity. The following sections describe these options.
DsPdf allows you to reduce the size of a document using the RemoveDuplicateImages method of the GcPdfDocument class. This method finds identical images that are stored more than once in the document and keeps a single instance that is shared by every location where the image appears, which reduces the size of the document.
Refer to the following example code, which demonstrates how to reduce the file size of a document using the RemoveDuplicateImages method:
```csharp
// Initialize GcPdfDocument.
GcPdfDocument doc = new GcPdfDocument();

// Open the PDF document in a file stream; the using block closes it after saving.
using (FileStream fs = File.OpenRead("Invoice.pdf"))
{
    // Load the PDF document.
    doc.Load(fs);

    // Remove duplicate images.
    doc.RemoveDuplicateImages();

    // Save the PDF document.
    doc.Save("RemovedDuplicateImages.pdf");
}
```
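To check how much space an optimization step actually saved, you can compare the sizes of the input and output files. The following is a minimal sketch that uses only System.IO; the SizeReport class and its methods are hypothetical helpers written for this illustration and are not part of DsPdf. The file names are taken from the example above.

```csharp
using System;
using System.IO;

static class SizeReport
{
    // Returns the size reduction as a percentage of the original size.
    // A negative value means the output is larger than the input.
    public static double PercentSaved(long originalBytes, long optimizedBytes)
    {
        if (originalBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(originalBytes));
        return 100.0 * (originalBytes - optimizedBytes) / originalBytes;
    }

    // Builds a one-line summary for two files on disk.
    public static string Describe(string originalPath, string optimizedPath)
    {
        long before = new FileInfo(originalPath).Length;
        long after = new FileInfo(optimizedPath).Length;
        return $"{before:N0} -> {after:N0} bytes ({PercentSaved(before, after):F1}% saved)";
    }
}

static class Program
{
    static void Main()
    {
        // File names follow the example above; adjust them to your own paths.
        const string input = "Invoice.pdf";
        const string output = "RemovedDuplicateImages.pdf";

        if (File.Exists(input) && File.Exists(output))
            Console.WriteLine(SizeReport.Describe(input, output));
        else
            Console.WriteLine("Run the example above first to produce both files.");
    }
}
```

The saving depends entirely on how many identical images the source document contains, so a document without duplicates may show little or no change.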
DsPdf enables you to optimize font usage by merging subsets of the same font and removing duplicate and unused fonts using the OptimizeFonts method of the GcPdfDocument class. DsPdf also provides the OptimizeFontsOptions class, whose properties let you control the behavior of the OptimizeFonts method.
Refer to the following example code to optimize font usage:
```csharp
static void Main(string[] args)
{
    // Create a 5 page non-optimal PDF.
    var tmpInput = MakeInputFile("CompleteJavaScriptBook.pdf");
    var fiInput = new FileInfo(tmpInput);

    // Create a new PDF, load the source PDF into it, and optimize it.
    var tmpOutput = Path.GetTempFileName();
    var tmpDoc = new GcPdfDocument();
    using (var fs = File.OpenRead(tmpInput))
    {
        tmpDoc.Load(fs);
        // By default GcPdfDocument uses CompressionLevel.Fastest when saving a PDF.
        // Set CompressionLevel to Optimal to reduce the size of the PDF.
        tmpDoc.CompressionLevel = CompressionLevel.Optimal;
        // Optimize the font usage.
        tmpDoc.OptimizeFonts();
        tmpDoc.Save(tmpOutput);
    }
    var fiOutput = new FileInfo(tmpOutput);

    // Record the input and output file sizes in the resultant PDF.
    var doc = new GcPdfDocument();
    Common.Util.AddNote(String.Format(
        "Using the GcPdfDocument.OptimizeFonts() method reduced the size of a 5-page PDF from {0:N0} to {1:N0} bytes " +
        "by merging duplicate and removing unused font data.\n" +
        "To reproduce these results locally, download and run this sample. You may also modify the sample code to keep the temporary " +
        "input and output files and compare their sizes using a file manager.",
        fiInput.Length, fiOutput.Length),
        doc.NewPage());
    doc.Save("OptimizeFonts.Pdf");

    // Delete the temp files.
    File.Delete(tmpInput);
    File.Delete(tmpOutput);
}

// Create a method to make input file.
static string MakeInputFile(string inFn)
{
    // Initialize GcPdfDocument.
    var indoc = new GcPdfDocument();

    // Load the PDF document.
    using var fs = File.OpenRead(inFn);
    indoc.Load(fs);

    // Create 5 PDFs from the first 5 pages of the source document.
    var pageCount = 5;
    var docs = new List<GcPdfDocument>(pageCount);
    for (int i = 0; i < pageCount; ++i)
    {
        var outdoc = new GcPdfDocument();
        outdoc.MergeWithDocument(indoc, new MergeDocumentOptions() { PagesRange = new OutputRange(i + 1, i + 1) });
        docs.Add(outdoc);
    }

    // Merge the PDFs into a single document.
    var doc = new GcPdfDocument();
    foreach (var d in docs)
        doc.MergeWithDocument(d);

    // Save the resultant PDF in a temp file.
    var outFn = Path.GetTempFileName();
    doc.Save(outFn);
    return outFn;
}
```
By default, DsPdf uses the one-byte encoding format, Type0AutoOneByteEncoding. This format usually produces smaller PDF content than Type0IdentityEncoding. However, if only a small amount of text (fewer than about 1,000 characters) is rendered with a font, Type0AutoOneByteEncoding can produce larger content than Type0IdentityEncoding, because it must store additional information about the encoding. This additional information takes about 1 KB, depending on the number of unique characters used in the text.
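The trade-off can be illustrated with some back-of-the-envelope arithmetic. The sketch below is a simplified model built only from the approximate figures quoted above (one byte versus two bytes per character, and roughly 1 KB of extra encoding data). It ignores stream compression and treats the overhead as fixed, so it does not reflect exact DsPdf output sizes. The EncodingSizeEstimate class is a hypothetical helper, not a DsPdf API.

```csharp
using System;

// Rough size model based on the approximate figures in the text above.
static class EncodingSizeEstimate
{
    // Assumed fixed size of the extra encoding information (~1 KB).
    public const int OneByteOverheadBytes = 1024;

    // One byte per character plus the assumed encoding overhead.
    public static int OneByteEncoding(int charCount) => charCount + OneByteOverheadBytes;

    // Two bytes per character and no extra encoding information.
    public static int IdentityEncoding(int charCount) => 2 * charCount;
}

static class Program
{
    static void Main()
    {
        foreach (int n in new[] { 200, 1024, 5000 })
        {
            Console.WriteLine(
                $"{n} chars: one-byte ~{EncodingSizeEstimate.OneByteEncoding(n)} B, " +
                $"identity ~{EncodingSizeEstimate.IdentityEncoding(n)} B");
        }
    }
}
```

Under this model, 200 characters cost about 1,224 bytes with one-byte encoding but only 400 bytes with identity encoding, the two estimates meet at 1,024 characters, and 5,000 characters cost about 6,024 versus 10,000 bytes. This matches the rule of thumb that one-byte encoding pays off once a font renders more than roughly 1,000 characters.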
DsPdf allows you to set the encoding type used to represent a font in a PDF document through the PdfFontFormat property of the GcPdfDocument and FontHandler classes. This property uses the PdfFontFormat enumeration to define the encoding type.
The PdfFontFormat enumeration provides the following options:
Option | Description |
---|---|
Type0AutoOneByteEncoding | Saves the font as one or more Type0 PDF fonts, where each character is encoded by one byte. |
Type0IdentityEncoding | Saves the font as a single Type0 font with Identity encoding, where each character is encoded with two bytes. |
Refer to the following example code to define the encoding type:
```csharp
// GCTEXT is assumed to be an alias for the GrapeCity.Documents.Text namespace:
// using GCTEXT = GrapeCity.Documents.Text;

// Load the font from file.
var gabriola = GCTEXT.Font.FromFile(Path.Combine("Resources", "Fonts", "Gabriola.ttf"));
if (gabriola == null)
    throw new Exception("Could not load font Gabriola");

// Render the text using the font.
var tf = new TextFormat() { Font = gabriola, FontSize = 16 };

// Initialize GcPdfDocument.
var doc = new GcPdfDocument();
var g = doc.NewPage().Graphics;

// Set PdfFontFormat to Type0IdentityEncoding.
doc.PdfFontFormat = PdfFontFormat.Type0IdentityEncoding;

// Draw the string.
g.DrawString($"Sample text drawn with font {gabriola.FontFamilyName}.", tf, new PointF(72, 72));

// Change the font size.
tf.FontSize += 4;

// Draw the string.
g.DrawString("The quick brown fox jumps over the lazy dog.", tf, new PointF(72, 72 * 2));

// Emulate bold or italic style with a non-bold (non-italic) font.
tf.FontStyle = GCTEXT.FontStyle.Bold;

// Draw the string.
g.DrawString("This line prints with the same font, using emulated bold style.", tf, new PointF(72, 72 * 3));

// Set bold italic font and print a line with it.
var timesbi = GCTEXT.Font.FromFile(Path.Combine("Resources", "Fonts", "timesbi.ttf"));
tf.Font = timesbi ?? throw new Exception("Could not load font timesbi");
tf.FontStyle = GCTEXT.FontStyle.Regular;

// Draw the string.
g.DrawString($"This line prints with {timesbi.FullFontName}.", tf, new PointF(72, 72 * 4));

// Save the PDF document.
doc.Save("OptimizeFontFormat.pdf");
```
Limitations
If a font is not embedded, DsPdf uses Type0IdentityEncoding regardless of the format you select, because Acrobat Reader otherwise renders such PDFs with noticeable distortion.
DsPdf lets you use object streams when saving a PDF document through the UseObjectStreams property of the SavePdfOptions class. This property takes a value of the UseObjectStreams enumeration, which specifies whether object streams are used and, if so, which type.
An object stream is a stream that holds a sequence of indirect objects instead of storing each of them at the file's outermost level. Because the stream content is compressed according to the CompressionLevel property, the objects take up less space. In most cases this reduces the size of the PDF document, sometimes significantly.
The SavePdfOptions class gives you precise control over how a PDF is saved. You can pass an instance of this class to the Save, Sign, and TimeStamp methods of the GcPdfDocument class. The SavePdfOptions class provides the following properties:
Property | Description |
---|---|
PdfStreamHandling | Controls how existing PDF streams are handled when the document is saved. Takes a value of the PdfStreamHandling enumeration. |
Mode | Specifies the PDF save mode. Takes a value of the SaveMode enumeration. |
UseObjectStreams | Indicates whether object streams are used when saving the PDF. Takes a value of the UseObjectStreams enumeration. |
Refer to the following example code to use multiple object streams to reduce the PDF document size:
```csharp
// GCTEXT is assumed to be an alias for the GrapeCity.Documents.Text namespace:
// using GCTEXT = GrapeCity.Documents.Text;

static void Main(string[] args)
{
    // Create a 5 page non-optimal PDF.
    var tmpInput = MakeInputFile();
    var fiInput = new FileInfo(tmpInput);

    // Create a new PDF, load the source PDF into it, and optimize it.
    var tmpOutput = Path.GetTempFileName();
    var tmpDoc = new GcPdfDocument();
    using (var fs = File.OpenRead(tmpInput))
    {
        tmpDoc.Load(fs);
        // By default GcPdfDocument uses CompressionLevel.Fastest when saving a PDF.
        // Set CompressionLevel to Optimal.
        tmpDoc.CompressionLevel = CompressionLevel.Optimal;
        // Minimize stream sizes using object streams.
        tmpDoc.Save(tmpOutput, new SavePdfOptions(SaveMode.Default, PdfStreamHandling.MinimizeSize, UseObjectStreams.Multiple));
    }
    var fiOutput = new FileInfo(tmpOutput);

    // Record the input and output file sizes in the resultant PDF.
    var doc = new GcPdfDocument();
    Common.Util.AddNote(String.Format(
        "Using the UseObjectStreams.Multiple option when saving a PDF will in most cases reduce the resulting file size, " +
        "sometimes significantly. In this case the size of the PDF generated by the 'Large Document' sample decreased " +
        "from {0:N0} to {1:N0} bytes, without any loss in fidelity or PDF opening speed.\n" +
        "Using the UseObjectStreams.Single option yields an even slightly smaller PDF size at the cost of slower opening in PDF viewers.\n" +
        "To reproduce these results locally, download and run this sample, specifying a valid license key " +
        "(otherwise loading is limited to 5 pages, and the size reduction may be too small). " +
        "You may also modify the sample code to keep the temporary " +
        "input and output files, and compare their sizes using a file manager.",
        fiInput.Length, fiOutput.Length),
        doc.NewPage());

    // Save the resultant PDF document.
    doc.Save("ObjectStreams.pdf");

    // Delete the temp files.
    File.Delete(tmpInput);
    File.Delete(tmpOutput);
}

// Create method to make input file.
static string MakeInputFile()
{
    // Set number of pages to generate.
    const int N = Common.Util.LargeDocumentIterations;
    var start = Common.Util.TimeNow();
    var doc = new GcPdfDocument();

    // Create a TextLayout to hold/format the text.
    var tl = new TextLayout(72)
    {
        MaxWidth = doc.PageSize.Width,
        MaxHeight = doc.PageSize.Height,
        MarginAll = 72,
        FirstLineIndent = 36,
    };
    tl.DefaultFormat.Font = StandardFonts.Times;
    tl.DefaultFormat.FontSize = 12;

    // Generate the PDF document.
    for (int pageIdx = 0; pageIdx < N; ++pageIdx)
    {
        tl.Append(Common.Util.LoremIpsum(1));
        tl.PerformLayout(true);
        doc.NewPage().Graphics.DrawTextLayout(tl, PointF.Empty);
        tl.Clear();
    }

    // Insert a title page (cannot be done if using StartDoc/EndDoc).
    tl.FirstLineIndent = 0;
    var fnt = GCTEXT.Font.FromFile(Path.Combine("Resources", "Fonts", "yumin.ttf"));
    var tf0 = new TextFormat() { Font = fnt, FontSize = 24, FontBold = true };
    tl.Append(string.Format("Large Document\n{0} Pages of Lorem Ipsum\n\n", N), tf0);
    var tf1 = new TextFormat(tf0) { FontSize = 14, FontItalic = true };
    tl.Append(string.Format("Generated on {0} in {1:m\\m\\ s\\s\\ fff\\m\\s}.", Common.Util.TimeNow().ToString("R"), Common.Util.TimeNow() - start), tf1);
    tl.TextAlignment = TextAlignment.Center;
    tl.PerformLayout(true);
    doc.Pages.Insert(0).Graphics.DrawTextLayout(tl, PointF.Empty);

    // Save the resultant PDF in a temp file with UseObjectStreams.None (it is the default).
    var outFn = Path.GetTempFileName();
    doc.Save(outFn, new SavePdfOptions(SaveMode.Default, PdfStreamHandling.Copy, UseObjectStreams.None));
    return outFn;
}
```
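To see how the object stream modes compare on your own files, you could save the same document once per mode and print the resulting sizes. The sketch below uses only the DsPdf calls that appear in the example above, namely the three-argument SavePdfOptions constructor and the UseObjectStreams values None, Multiple, and Single. The ObjectStreamComparison class and its Run method are hypothetical names introduced for this illustration, and the GrapeCity.Documents.Pdf namespace is an assumption about where these types live in your version of the library.

```csharp
using System;
using System.IO;
using GrapeCity.Documents.Pdf; // Assumed namespace of GcPdfDocument and SavePdfOptions.

static class ObjectStreamComparison
{
    // Saves the PDF at inputPath once per UseObjectStreams mode and prints
    // the size of each result. The temporary outputs are deleted afterwards.
    public static void Run(string inputPath)
    {
        var modes = new[] { UseObjectStreams.None, UseObjectStreams.Multiple, UseObjectStreams.Single };
        foreach (var mode in modes)
        {
            var tmpOutput = Path.GetTempFileName();
            try
            {
                var doc = new GcPdfDocument();
                using (var fs = File.OpenRead(inputPath))
                {
                    doc.Load(fs);
                    doc.Save(tmpOutput, new SavePdfOptions(SaveMode.Default, PdfStreamHandling.MinimizeSize, mode));
                }
                Console.WriteLine($"{mode,-8} {new FileInfo(tmpOutput).Length,12:N0} bytes");
            }
            finally
            {
                File.Delete(tmpOutput);
            }
        }
    }
}
```

Calling ObjectStreamComparison.Run with the path of an existing PDF prints one line per mode. According to the note in the example above, Single tends to give a slightly smaller file than Multiple at the cost of slower opening in PDF viewers, so the actual numbers are worth checking against your own documents before choosing a mode.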
Limitations
DsPdf does not save the document using object streams: