Product Architecture

Packaging

DsPdf is a cross-platform .NET library written in C# that provides an API to create, load, analyze, and modify PDF files efficiently.

DsPdf is a document processing API that helps organizations add document processing workflows to business applications.

DsPdf is compatible with .NET Core 2.x/3.x, .NET Standard 2.x, .NET Framework 4.6.2 or higher, and .NET 6 or higher.

DsPdf and supporting packages are available on nuget.org:

  • DS.Documents.Pdf

  • DS.Documents.Barcode

  • DS.Documents.Imaging

  • DS.Documents.Imaging.Windows

  • DS.Documents.DX.Windows

To use DsPdf in an application, simply reference the DS.Documents.Pdf package. NuGet automatically installs all additional packages required by DsPdf.

To render barcodes, install the DS.Documents.Barcode package (DsBarcode for short). It provides extension methods for drawing barcodes with DsPdf.

DS.Documents.DX.Windows gives DsPdf access to the native imaging APIs when it runs on a Windows system.

DsPdf API Overview

Classes and other types in the DsPdf and related libraries expose a PDF object model that closely follows the Adobe PDF specification version 2.0 published by Adobe. DsPdf provides direct access to all PDF features, including low-level features, whenever feasible. DsPdf also provides a powerful, platform-independent text layout engine and other high-level features that make document creation easy and convenient.

Namespaces

| Namespace | Description |
| --- | --- |
| GrapeCity.Documents.Drawing | Framework for drawing on the abstract GcGraphics surface. |
| GrapeCity.Documents.Pdf | Types used to create, process, and modify PDF documents, including GcPdfGraphics. Nested namespaces contain types supporting specific PDF spec areas. |
| GrapeCity.Documents.Text | Text processing sub-system. |

GcPdfDocument

A PDF document in DsPdf corresponds to a GrapeCity.Documents.Pdf.GcPdfDocument class instance. To create a new PDF, create an instance of GcPdfDocument, add content to it and then call one of the GcPdfDocument.Save() overloads to write the document to a file. You can call the Save() method multiple times on a GcPdfDocument instance to create multiple (possibly different) PDF documents.
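The create-then-save flow described above can be sketched as follows. This is a minimal example rather than a definitive one: it assumes the DS.Documents.Pdf package is referenced, and the file names, text, and coordinates are arbitrary placeholders.

```csharp
using System.Drawing;
using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Text;

// Create an empty document and add its first page.
var doc = new GcPdfDocument();
var page = doc.NewPage();

// Draw a line of text at (72, 72), measured from the top-left corner of the page.
var format = new TextFormat { Font = StandardFonts.Times, FontSize = 12 };
page.Graphics.DrawString("Hello, World!", format, new PointF(72, 72));

// First Save(): writes the document as it is now.
doc.Save("hello.pdf");

// Add more content and save again under a different (placeholder) name.
page.Graphics.DrawString("A second line.", format, new PointF(72, 100));
doc.Save("hello-two-lines.pdf");
```

Both files come from the same GcPdfDocument instance; the second one simply contains the content added between the two Save() calls.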

GcPdfDocument also provides a Load() method, allowing the analysis or modification of an existing PDF. Calling the Load() method clears the GcPdfDocument instance first. The Load() method accepts a caller-opened, readable PDF Stream that must remain open for the lifetime of the loaded document. Load() loads only what’s needed on demand, reducing memory use and improving performance. Note that Load() operates in read-only mode. GcPdfDocument doesn't try to write back to the loaded stream. To save changes, call the Save() method and specify an output file or stream for the new document.


A number of properties and collections on the GcPdfDocument provide access to the content and properties of the document. The most important collection is Pages (see The Pages collection); others include Outlines, AcroForm, Security, and more.

The Pages collection

The Pages collection represents the collection of a document's pages. When you create a new GcPdfDocument, the collection starts empty. You can use the usual collection methods to fetch, add, insert, remove, or move pages. When you load an existing PDF into a GcPdfDocument, it fills the Pages collection with the document’s pages. You can then modify it the same way as a document created from scratch.

Modifying existing documents

The GcPdfDocument.Load() method lets you inspect and modify existing documents. The possible modifications include:

  • Change the writable properties of the loaded document and its elements.

  • Add arbitrary new content. You can add to a loaded document anything you can add to a new one: pages, page content, annotations, fields, and more.

  • Modify collections on the document and its pages. You can move, remove, or add elements in the following collections:

    • At the document level:

      • Pages

      • NamedDestinations

      • Outlines

      • AcroForm.Fields

    • At the page level:

      • ContentStreams

      • Annotations

You can't make other modifications at this time. For example, you cannot replace existing text or graphics except by removing them and adding new content streams.


When you load a document into a GcPdfDocument, it reads content from the source without modifying it. To save any changes, you must explicitly call GcPdfDocument.Save().
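As a rough illustration of loading, modifying, and saving, the sketch below first writes a small input file so that it is self-contained, then loads it, appends a page, and saves the result under a new name. The file names are placeholders, and the sketch assumes the DS.Documents.Pdf package is referenced.

```csharp
using System.Drawing;
using System.IO;
using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Text;

var format = new TextFormat { Font = StandardFonts.Helvetica, FontSize = 14 };

// Write a one-page input file so the sketch does not depend on an existing PDF.
var original = new GcPdfDocument();
original.NewPage().Graphics.DrawString("Original page", format, new PointF(72, 72));
original.Save("input.pdf");

// The source stream must stay open for as long as the loaded document is used.
using (var source = File.OpenRead("input.pdf"))
{
    var doc = new GcPdfDocument();
    doc.Load(source);

    // Append a page and draw on it, just as with a document created from scratch.
    var page = doc.Pages.Add();
    page.Graphics.DrawString("Appended page", format, new PointF(72, 72));

    // The source is never written to; changes go to a separate output file.
    doc.Save("output.pdf");
}
```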

Sequential (StartDoc/EndDoc) mode

In addition to the Save() method mentioned earlier, GcPdfDocument provides a sequential mode for creating a PDF. To use this mode, start by calling the StartDoc() method on the document, specifying a writable Stream as the method's only parameter. After that, you can add content to the document as usual, subject to the limitations below. When done, call the EndDoc() method, which completes writing the document.

The limitations of the sequential method are as follows:

  • The only allowed modification of the Pages collection is adding a page to the end of it. Removing, inserting or moving pages isn't allowed.

  • You can draw only on the last page in the Pages collection. After you add a new page, you can't modify any earlier pages.

  • Certain features (for example, linearization) aren't available in this mode.

Sequential mode writes pages to the stream as soon as they finish, reducing memory usage, especially when creating large PDF documents.
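A minimal sketch of sequential mode, assuming the DS.Documents.Pdf package is referenced. The output file name and the page count are arbitrary; the loop only ever appends a page and draws on that newest page, which keeps it within the limitations listed above.

```csharp
using System.Drawing;
using System.IO;
using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Text;

var format = new TextFormat { Font = StandardFonts.Helvetica, FontSize = 12 };

using (var output = File.Create("sequential.pdf"))
{
    var doc = new GcPdfDocument();

    // Begin sequential writing to the output stream.
    doc.StartDoc(output);

    for (int i = 1; i <= 100; i++)
    {
        // Only appending is allowed, and only the last page can be drawn on.
        var page = doc.NewPage();
        page.Graphics.DrawString($"Page {i}", format, new PointF(72, 72));
    }

    // Complete the document; no Save() call is needed in this mode.
    doc.EndDoc();
}
```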

Text

A specialized set of classes in the GrapeCity.Documents.Text namespace supports text measuring and layout. These classes provide a rich object model that lets you access text elements from high-level paragraphs down to individual fonts and glyph features. Text processing is completely platform-independent and doesn't rely on any operating system-provided APIs.

The most important class in the GrapeCity.Documents.Text namespace is TextLayout. It represents one or more paragraphs of text and supports the following features:

  • Layout of paragraphs in an arbitrary rectangular area using a specified text flow direction

  • Line wrapping according to the Unicode standard recommendations

  • OpenType, TrueType and WOFF fonts, including extensions for handling national languages

  • Individual formatting of text fragments using different fonts, font styles and colors (see TextFormat class)

  • Typography features such as tabs, text alignment, char and line spacing, etc.

  • Text flow around rectangular areas

  • Inline and anchored objects

  • Kashida text justification in Arabic scripts

  • Splitting of large bodies of text into several layouts (columns or pages), including support for column balancing and control over widow/orphan lines

All features are fully supported for vertical (Chinese or Japanese) and RTL/bidirectional text.

After TextLayout processes text, it generates a font-based glyph representation, allowing retrieval of any fragment’s coordinates within the resulting layout.

A TextLayout instance can also be directly rendered onto GcGraphics (see Graphics) using the DrawTextLayout method. Simple MeasureString/DrawString methods on GcGraphics are also provided for convenience.
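The following sketch shows one way to lay out and render mixed-format text with TextLayout, assuming the DS.Documents.Pdf package is referenced. The margins, fonts, sample text, and file name are illustrative choices only.

```csharp
using System.Drawing;
using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Text;

var doc = new GcPdfDocument();
var page = doc.NewPage();
var g = page.Graphics;

// Create a layout that matches the graphics it will be drawn on.
var tl = g.CreateTextLayout();
tl.DefaultFormat.Font = StandardFonts.Times;
tl.DefaultFormat.FontSize = 12;

// Wrap lines within the page width minus a 72-point margin on each side.
tl.MaxWidth = page.Size.Width - 144;

// Fragments can use the default format or carry their own TextFormat.
tl.Append("This fragment uses the default format. ");
tl.Append("This fragment is bold.",
    new TextFormat { Font = StandardFonts.TimesBold, FontSize = 12 });

// Calculate glyph positions, then draw the layout at (72, 72).
tl.PerformLayout(true);
g.DrawTextLayout(tl, new PointF(72, 72));

doc.Save("text-layout.pdf");
```

For a single unwrapped string, the simpler DrawString method on the graphics is usually enough.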

Graphics

DsPdf provides a graphics surface to draw on, represented by a GcPdfGraphics class, which is an implementation of the abstract GcGraphics base class. GcPdfGraphics provides a flexible and rich object model for measuring, stroking, and filling the usual graphic primitives such as lines, rectangles, polygons, ellipses, and more. You can draw (stroke) using solid or dashed lines and fill shapes with solid or gradient brushes. For an example of shape rendering methods, see GcPdfGraphics.DrawEllipse() or GcPdfGraphics.FillEllipse() method. You can use graphic paths to create and render complex shapes. For example, see GcPdfGraphics.DrawPath() method.

Graphics transformations using 3x2 matrices are fully supported (including text). For more information, see the GcPdfGraphics.Transform property.

Units of measurement

The default units of measurement used by GcPdfGraphics and TextLayout are printer points (1/72 of an inch). You can change them to any desired resolution using the Resolution property in both GcPdfGraphics and TextLayout classes.

Coordinates

GcPdfGraphics measures all graphic object coordinates from the top-left corner of the graphics surface (usually a page), and you can use GcPdfGraphics.Transform to change that.
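To make the default units, the top-left origin, and transformations concrete, here is a short sketch that assumes the DS.Documents.Pdf package is referenced. The `Inch` constant is a local helper introduced for this example, not part of the library, and the shapes and colors are arbitrary.

```csharp
using System;
using System.Drawing;
using System.Numerics;
using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Text;

// Local helper: at the default resolution, one inch is 72 units (points).
const float Inch = 72f;

var doc = new GcPdfDocument();
var g = doc.NewPage().Graphics;
var format = new TextFormat { Font = StandardFonts.Helvetica, FontSize = 12 };

// A 2 x 1 inch rectangle placed one inch right of and below the page's top-left corner.
var rect = new RectangleF(Inch, Inch, 2 * Inch, Inch);
g.FillRectangle(rect, Color.LightYellow);
g.DrawRectangle(rect, Color.DarkBlue, 1);

// An ellipse inscribed in the same rectangle, stroked with a 2-point line.
g.DrawEllipse(rect, Color.Red, 2);

// Rotate subsequent drawing by 30 degrees around the point (4 in, 4 in).
var center = new Vector2(4 * Inch, 4 * Inch);
g.Transform = Matrix3x2.CreateRotation((float)(Math.PI / 6), center);
g.DrawString("Rotated text", format, new PointF(4 * Inch, 4 * Inch));

// Restore the identity transform for any later drawing.
g.Transform = Matrix3x2.Identity;

doc.Save("graphics.pdf");
```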

Page Graphics

Use an instance of GcPdfGraphics to draw on each page of a PDF document. Each page in the GcPdfDocument.Pages collection has the Graphics property that fetches the graphics for that page. You can simply get that property and draw on the returned graphics instance. Initially each page has just one graphics associated with it. If the page contains multiple content streams, each stream has its own graphics, and the Page.Graphics property returns the graphics of the last (top-most) stream. (You can use the ContentStreams collection to access all content streams of the page.)

DsHtml API Overview

DsHtml is a utility library that renders HTML to a PDF file or to an image in PNG, JPEG, or WebP format. DsHtml uses a Chrome or Edge browser (already installed on the current system, or downloaded from a public website) in headless mode. It doesn’t matter whether your .NET application targets the x64, x86, or AnyCPU platform. The browser runs continuously in a separate process.

The DS.Documents.Html library consists of a platform-independent main package that exposes the HTML rendering functionality. The main package contains the following namespaces:

| Namespace | Description |
| --- | --- |
| GrapeCity.Documents.Pdf | Provides the extension methods for rendering HTML to a PDF file and the formatting attributes used for that rendering. |
| GrapeCity.Documents.Html | Provides methods for converting HTML to PDF or images and defines parameters for the PDF or image output. |
| GrapeCity.Documents.Drawing | Provides the extension methods and formatting attributes for rendering HTML to an image. |

GrapeCity.Documents.Html.BrowserFetcher

The BrowserFetcher class has two static methods: GetSystemChromePath() and GetSystemEdgePath(). They return the path to the executable file of the Chrome or Edge browser, respectively. Another option is to download and install Chromium into a local folder. To do so, create an instance of BrowserFetcher, passing information such as the host, platform, revision, and destination folder if needed. Then call the BrowserFetcher.GetDownloadedPath() method, which downloads Chromium if required and returns the path to the Chromium executable.

GrapeCity.Documents.Html.GcHtmlBrowser

The GcHtmlBrowser class provides methods for converting HTML to PDF and images. With a path to a Chromium or Edge executable discovered by the BrowserFetcher class, you can create a GcHtmlBrowser instance that runs the browser process in the background. GcHtmlBrowser also accepts another parameter of LaunchOptions type. The LaunchOptions class provides various settings specific to launching the browser.

The class has two important methods: NewPage(Uri uri) and NewPage(string html). Both return an instance of the HtmlPage class, which represents a browser tab after navigating to the specified web address or file, or after loading arbitrary HTML content. Both methods also accept a second parameter of the PageOptions type. Its properties, such as the username and password for HTTP authentication, disabling JavaScript, and lazy loading, are applied to the new browser page.

Note:

  • Use the GcHtmlBrowser class with the Chrome browser, as some DevTools features are implemented differently in Edge.

  • It's important to dispose every instance of the GcHtmlBrowser and HtmlPage classes after use.

GrapeCity.Documents.Html.HtmlPage

The HtmlPage class represents a browser tab after navigating to the specified web address or file, or after loading arbitrary HTML content. The class has methods such as SaveAsPdf, SaveAsPng, SaveAsJpeg, and SaveAsWebp that save the current page as a PDF or as a raster image in PNG, JPEG, or WebP format, respectively. The first parameter of these methods specifies the destination file or stream. The second parameter passes additional options, such as rendering the HTML page as a single PDF page or setting the page size, margins, header, and footer.

The HtmlPage class also contains methods that help you interact with the HTML page content. For example, you can obtain the full HTML content of the page using the GetContent method. The SetContent method updates the HTML markup. You can reload the web page with the Reload method or execute a script in the browser context using the EvaluateExpression method. The WaitForNetworkIdle method helps with loading asynchronous web content.

GrapeCity.Documents.Html.PdfOptions

The PdfOptions class defines output settings and parameters used by the Chromium PDF exporter to render HTML content as a PDF. PDF output doesn’t support transparency.

If the PageWidth and PageHeight properties aren't set, the Letter paper size (8.5 x 11 in) is applied by default. The Landscape property of the class indicates the paper orientation but it is ignored when the FullPage property is set to true.

The Margins property specifies the page margins in inches; its default value is 0. The Scale property scales the PDF content by a factor from 0.1 to 2.0. You might also need to provide scaled values for the PageWidth and PageHeight properties to keep the relative size of the resulting pages unchanged.

Use the PageRanges property to limit the pages included in the output PDF file. Specify the desired page numbers as a string, for example '1-5, 8, 11-13'. Invalid page ranges (for example, '9-5') are ignored.

Setting the FullPage property to true exports the whole HTML as a single PDF page. All other layout settings (except Scale) are ignored in that case.
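The pieces described so far (BrowserFetcher, GcHtmlBrowser, HtmlPage, and PdfOptions) might fit together as in the sketch below. It assumes the DS.Documents.Html package is referenced and that Chrome is installed on the machine; the HTML string, option values, and file name are placeholders.

```csharp
using System;
using GrapeCity.Documents.Html;

// Locate the installed Chrome executable.
var chromePath = BrowserFetcher.GetSystemChromePath();

// Placeholder HTML content for the example.
var html = "<html><body><h1>Report</h1><p>Generated from HTML.</p></body></html>";

// Dispose both the browser and the page after use.
using (var browser = new GcHtmlBrowser(chromePath))
using (var htmlPage = browser.NewPage(html))
{
    var options = new PdfOptions
    {
        Landscape = true,        // ignored if FullPage is set to true
        PrintBackground = true,  // include background graphics
        Scale = 0.8f,            // must be within 0.1 to 2.0
        PageRanges = "1-2",      // keep only the first two pages
    };

    // Page size is left unset, so the Letter default applies.
    htmlPage.SaveAsPdf("report.pdf", options);
}
```

Passing a Uri to NewPage instead of an HTML string renders a web address or local file in the same way.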

GrapeCity.Documents.Pdf.HtmlToPdfFormat

The HtmlToPdfFormat class contains the formatting attributes for rendering HTML on a GcPdfGraphics using the DrawHtml extension methods of the GcPdfGraphicsExt class. First, a temporary PDF is created from the HTML, either as a single page (if FullPage is true) or with the specified page size (MaxPageWidth, MaxPageHeight), Scale, and DefaultBackgroundColor. It is then loaded into a GcPdfDocument and trimmed to the actual size of the HTML content. The result is rendered on a GcPdfGraphics as a PDF Form XObject.

If the MaxPageWidth or MaxPageHeight property isn't set explicitly, it is assumed to be 200 inches. DefaultBackgroundColor is Color.White by default.

Other properties of HtmlToPdfFormat are mapped to the corresponding properties of the PageOptions/PdfOptions class:

| HtmlToPdfFormat Property | PageOptions/PdfOptions Property |
| --- | --- |
| WindowSize | PageOptions.WindowSize |
| DefaultBackgroundColor | PageOptions.DefaultBackgroundColor |
| FullPage | PdfOptions.FullPage |
| DisplayBackgroundGraphics | PdfOptions.PrintBackground |
| Scale | PdfOptions.Scale |
| MaxPageWidth | PdfOptions.PageWidth |
| MaxPageHeight | PdfOptions.PageHeight |

GcPdfGraphics Extension Methods

DsHtml provides four methods that extend GcPdfGraphics and allow rendering or measuring HTML text or an HTML page:

  • Draws an HTML text on this GcPdfGraphics at a specified position:

    bool GcPdfGraphics.DrawHtml(GcHtmlBrowser browser, string html, float x, float y, HtmlToPdfFormat format, out SizeF size, bool loadLazyImages = false)

  • Draws an HTML page specified by a URI on this GcPdfGraphics at a specified position:

    bool GcPdfGraphics.DrawHtml(GcHtmlBrowser browser, Uri htmlUri, float x, float y, HtmlToPdfFormat format, out SizeF size, bool loadLazyImages = false)

  • Measures an HTML text for this GcPdfGraphics:

    SizeF GcPdfGraphics.MeasureHtml(GcHtmlBrowser browser, string html, HtmlToPdfFormat format, bool loadLazyImages = false)

  • Measures an HTML page specified by a URI for this GcPdfGraphics:

    SizeF GcPdfGraphics.MeasureHtml(GcHtmlBrowser browser, Uri htmlUri, HtmlToPdfFormat format, bool loadLazyImages = false)
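As an illustration of the first DrawHtml overload listed above, the sketch below draws an HTML fragment onto a PDF page. It assumes the DS.Documents.Pdf and DS.Documents.Html packages are referenced and that Chrome is installed. The HtmlToPdfFormat constructor argument (taken here to control header and footer display) is an assumption, since the constructor isn't described on this page, so check it against the API reference. The HTML, sizes, and file name are placeholders.

```csharp
using System;
using System.Drawing;
using GrapeCity.Documents.Html;
using GrapeCity.Documents.Pdf;

var doc = new GcPdfDocument();
var g = doc.NewPage().Graphics;

using (var browser = new GcHtmlBrowser(BrowserFetcher.GetSystemChromePath()))
{
    // Assumption: the bool constructor argument turns header/footer display off.
    // MaxPageWidth and MaxPageHeight are given in inches.
    var format = new HtmlToPdfFormat(false)
    {
        MaxPageWidth = 6.5f,
        MaxPageHeight = 9f,
    };

    // Draw the HTML with its top-left corner at (72, 72) on the page.
    bool ok = g.DrawHtml(browser, "<p>Hello from <b>HTML</b>.</p>",
        72, 72, format, out SizeF size);

    // 'size' receives the size of the rendered HTML content.
    Console.WriteLine($"Rendered: {ok}, size: {size.Width} x {size.Height}");
}

doc.Save("html-on-page.pdf");
```

MeasureHtml takes the same browser, content, and format arguments and returns the size without drawing, which can help when positioning the HTML relative to other page content.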

Note: In DsImaging release version 6.0.0, the GcHtmlRenderer class has been marked obsolete and has been replaced by the new GcHtmlBrowser class. This avoids using GPL- or LGPL-licensed software that the custom Chromium build previously required. For tips on migrating from the obsolete GcHtmlRenderer class, see Tips to Migrate from Obsolete GcHtmlRenderer class.