
Export

Exporting Word documents to PDF and image formats is a common requirement in business workflows. The advantages of these formats include security, cross-platform compatibility, freely available readers, and reduced file size.

DsWord exports Word documents to PDF through the GcWordLayout class, which keeps the layout of the source Word document separate from the exported format. You can also use the WordLayoutSettings class to customize the set of pages being exported.

Export to PDF

Use the SaveAsPdf method of the GcWordLayout class to export Word documents to PDF. You can set PDF options such as compression level, back color, metadata, and PDF version through the properties of the PdfOutputSettings class.
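
As a minimal sketch of the export described above, the following builds a one-paragraph document in memory and saves it as a PDF with default settings. The output file name is a placeholder. Passing null for the page range follows the other samples on this page and is assumed here to export all pages.

```csharp
using GrapeCity.Documents.Word;
using GrapeCity.Documents.Word.Layout;

// Build a small document in memory so the sketch needs no input file.
var doc = new GcWordDocument();
doc.Body.Sections.First.GetRange().Paragraphs.Add("Hello, PDF export.");

// GcWordLayout lays out the document pages; dispose it once the export is done.
using (var layout = new GcWordLayout(doc))
{
    // null range (assumed: all pages) and default PDF output settings.
    layout.SaveAsPdf("HelloExport.pdf", null, new PdfOutputSettings());
}
```

To change the output, set properties on the PdfOutputSettings instance before passing it to SaveAsPdf.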

Structured Tags

When exporting, DsWord supports structured tags that follow the PDF standards, which helps the output meet accessibility and other compliance requirements. Structured tags map document elements to the PDF logical structure, so the exported file preserves the document hierarchy and metadata.

Export to PDF/UA compliant PDF

DsWord supports exporting Word documents to PDF/UA-1 compliant PDF. To do this, make sure that:

  • The document has a title.

  • The structure of heading styles is hierarchical, starting from "Heading 1" without skipping intermediate outline levels.

  • All images and figures are accompanied by appropriate alternative text.

  • All hyperlinks include screen tips.

  • The document contains no blank pages.

  • The exported document conforms to the ISO 14289-1 standard.
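
The heading requirement in the list above can be checked before export. The sketch below is a standalone illustration, not part of the DsWord API: IsHeadingOrderValid is a hypothetical helper that takes heading outline levels in document order (how you collect them from a document is left out) and reports whether they start at 1 and never skip a level.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: true when levels start at 1 and never increase
// by more than one step (for example, 1 -> 3 skips level 2).
static bool IsHeadingOrderValid(IReadOnlyList<int> levels)
{
    int previous = 0; // 0 means no heading has been seen yet
    foreach (int level in levels)
    {
        if (level < 1 || level > previous + 1)
            return false;
        previous = level;
    }
    return true;
}

Console.WriteLine(IsHeadingOrderValid(new[] { 1, 2, 3, 2, 3 })); // True
Console.WriteLine(IsHeadingOrderValid(new[] { 1, 3 }));          // False: level 2 is skipped
Console.WriteLine(IsHeadingOrderValid(new[] { 2 }));             // False: does not start at 1
```

Returning to a shallower level (such as 3 back to 2) is allowed; only jumps to a deeper level are restricted.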

When required fields are missing from the document, use the MissingDocumentTitle, MissingShapeAlternativeText and MissingHyperlinkScreenTip properties of the PdfOutputSettings class to provide placeholder values during export.

The following code sample marks the exported document as PDF/UA compliant:

using GrapeCity.Documents.Word;
using GrapeCity.Documents.Word.Layout;
DocToPdfUa("SimpleDocument");

static void DocToPdfUa(string name)
{
    var doc = new GcWordDocument();
    doc.Load($"{name}.docx");
    using var wl = new GcWordLayout(doc);
    var settings = new PdfOutputSettings
    {
        ExportStructureTags = true,
        MarkAsPdfUa = true,
        MissingDocumentTitle = "(no title)",
        MissingShapeAlternativeText = "(no alternative text)",
        MissingHyperlinkScreenTip = "(no content)"
    };
    wl.SaveAsPdf($"{name}PdfUa.pdf", null, settings);
}

Export to PDF/A compliant PDF

DsWord supports exporting Word documents to PDF/A-3 compliant PDF. To do this, make sure that:

  • The document includes a title in its properties.

  • All fonts are embedded.

  • The document is not encrypted or password-protected.

  • The document contains no multimedia content (audio or video).

  • The document contains no JavaScript.

  • All colors use device-independent color spaces (RGB, CMYK, or Grayscale).

  • Document metadata is included.

  • The exported document conforms to the ISO 19005-3 standard.

When the document title is missing, use the MissingDocumentTitle property of the PdfOutputSettings class to provide a placeholder value during export.

The following code sample exports a document with structure tags and a placeholder title for PDF/A output:

using GrapeCity.Documents.Word;
using GrapeCity.Documents.Word.Layout;
DocToPdfA("SimpleDocument");

static void DocToPdfA(string name)
{
    var doc = new GcWordDocument();
    doc.Load($"{name}.docx");
    using var wl = new GcWordLayout(doc);
    var settings = new PdfOutputSettings
    {
        ExportStructureTags = true,
        MissingDocumentTitle = "(no title)"
    };
    wl.SaveAsPdf($"{name}PdfA.pdf", null, settings);
}

Export to Image

You can save Word documents to TIFF, PNG, and JPEG image formats by using the SaveAsTiff, SaveAsPng, and SaveAsJpeg methods, respectively. Use the properties of the ImageOutputSettings class to set options such as zoom, back color, and resolution when saving to image files.

To export a Word document to TIFF image format:

  1. Load a Word document in a GcWordLayout instance.

  2. Save the Word document as a TIFF image by using the SaveAsTiff method of the GcWordLayout class.

  3. Use the ImageOutputSettings class to specify the Zoom, and the TiffFrameSettings class to specify the Compression as Deflate.

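    using GrapeCity.Documents.Common;   // OutputRange
    using GrapeCity.Documents.Imaging;  // TiffFrameSettings, TiffCompression
    using GrapeCity.Documents.Word;
    using GrapeCity.Documents.Word.Layout;

    var wordDoc = new GcWordDocument();
    wordDoc.Load("JsFrameworkExcerpt.docx");
    
    using (var layout = new GcWordLayout(wordDoc))
    {
        // Save pages 2, 6 and 7 of the Word document as a TIFF file
        layout.SaveAsTiff("JsFrameworkExcerpt.tiff", new OutputRange("2, 6-7"),
            new ImageOutputSettings() { Zoom = 2f },
            new TiffFrameSettings() { Compression = TiffCompression.Deflate });
    }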

To export a single page of a Word document to JPEG image format:

  1. Create a new Word document by instantiating the GcWordDocument class.

  2. Add content to the document by using the Add method of the ParagraphCollection class, applying built-in styles through the BuiltInStyleId enumeration.

  3. Load the document in a GcWordLayout instance.

  4. Save the first page of the document as a JPEG image by using the SaveAsJpeg method of the page.

  5. Use the ImageOutputSettings class to specify the Zoom and BackColor properties.

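    using System.Drawing;               // Color
    using GrapeCity.Documents.Word;
    using GrapeCity.Documents.Word.Layout;

    var doc = new GcWordDocument();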
    var sec = doc.Body.Sections.First;
    var pars = sec.GetRange().Paragraphs;
    
    // Title
    pars.Add("Some Common Built-in Styles (Title)", doc.Styles[BuiltInStyleId.Title]);
    
    // Subtitle
    pars.Add("Demonstration of some of the built-in styles. (Subtitle)", doc.Styles[BuiltInStyleId.Subtitle]);
    
    // Headings 1-4
    var heading1 = pars.Add("Heading 1", doc.Styles[BuiltInStyleId.Heading1]);
    var heading2 = pars.Add("Heading 2", doc.Styles[BuiltInStyleId.Heading2]);
    var heading3 = pars.Add("Heading 3", doc.Styles[BuiltInStyleId.Heading3]);
    var heading4 = pars.Add("Heading 4", doc.Styles[BuiltInStyleId.Heading4]);
    
    // Character styles
    var p = pars.Add("In this paragraph we demonstrate some of the built-in character styles. ");
    var runs = p.GetRange().Runs;
    runs.Add("This run uses the 'Strong' style. ", doc.Styles[BuiltInStyleId.Strong]);
    runs.Add("A run of normal text. ");
    runs.Add("This run uses 'Emphasis' style. ", doc.Styles[BuiltInStyleId.Emphasis]);
    runs.Add("A run of normal text. ");
    runs.Add("This run uses 'Intense Emphasis' style. ", doc.Styles[BuiltInStyleId.IntenseEmphasis]);
    runs.Add("A run of normal text. ");
    runs.Add("This run uses 'Subtle Emphasis' style. ", doc.Styles[BuiltInStyleId.SubtleEmphasis]);
    
    pars.Add("The End.");
    
    // Save the first page as a JPEG image
    using (var layout = new GcWordLayout(doc))
    {
        layout.Pages[0].SaveAsJpeg("example.jpg", new ImageOutputSettings() { Zoom = 2f, BackColor = Color.Yellow });
    }
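
The sample above saves only the first page. As a sketch of exporting every page to its own JPEG file, the following loops over the layout pages and reuses one ImageOutputSettings instance. It assumes that the Pages collection can be enumerated with foreach. The file name pattern is a placeholder.

```csharp
using GrapeCity.Documents.Word;
using GrapeCity.Documents.Word.Layout;

// Build a small document in memory so the sketch needs no input file.
var doc = new GcWordDocument();
var pars = doc.Body.Sections.First.GetRange().Paragraphs;
pars.Add("First paragraph.");
pars.Add("Second paragraph.");

using (var layout = new GcWordLayout(doc))
{
    var settings = new ImageOutputSettings() { Zoom = 2f };
    int pageNumber = 1;

    // Assumption: layout.Pages supports foreach enumeration.
    foreach (var page in layout.Pages)
    {
        page.SaveAsJpeg($"AllPages_page{pageNumber}.jpg", settings);
        pageNumber++;
    }
}
```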

For more information on how to convert a Word document into PDF and image formats using DsWord, see the DsWord sample browser.

Export to SVG

In addition to the common image formats described above, DsWord lets you save Word document pages as SVG or in its compressed form, SVGZ. Use the SaveAsSvg method of the GrapeCity.Documents.Word.Layout.Page class to export a page to an SVG file or stream (.svg), and the ToSvgz method to export it to a byte array (.svgz).

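using System.IO;                        // File
using System.Xml;                       // XmlWriterSettings
using GrapeCity.Documents.Word;
using GrapeCity.Documents.Word.Layout;

var wordDoc = new GcWordDocument();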
wordDoc.Load("StatementOfWork.docx");
using (var la = new GcWordLayout(wordDoc))
{
    var page = la.Pages[0];
     
    // Render a Word page to the .svg file
    page.SaveAsSvg("StatementOfWork.svg", new ImageOutputSettings() { Zoom = 1f }, new XmlWriterSettings() { Indent = true });
 
    // Render a Word page to the byte array with compressed data in SVGZ format
    var svgData = page.ToSvgz();
    File.WriteAllBytes("StatementOfWork.svgz", svgData);
}
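
SVGZ is SVG markup compressed with gzip, so the bytes can be unpacked with the standard .NET GZipStream class when plain SVG text is needed. The sketch below is a standalone illustration: SvgzToSvgText is a hypothetical helper, and it assumes the markup is UTF-8 encoded.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper: unpacks gzip-compressed SVG bytes into SVG markup.
static string SvgzToSvgText(byte[] svgz)
{
    using var input = new MemoryStream(svgz);
    using var gzip = new GZipStream(input, CompressionMode.Decompress);
    using var reader = new StreamReader(gzip, Encoding.UTF8); // assumed encoding
    return reader.ReadToEnd();
}

// Usage: unpack a previously saved .svgz file, if one is present.
if (File.Exists("StatementOfWork.svgz"))
{
    string svgText = SvgzToSvgText(File.ReadAllBytes("StatementOfWork.svgz"));
    File.WriteAllText("StatementOfWork.unpacked.svg", svgText);
}
```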

Limitations

DsWord has some limitations when exporting to PDF and image formats, including:

  • When exporting to SVG, text is always rendered as graphics using paths. As a result, .svg files for text-heavy pages are large, and text cannot be selected or copied when the SVG image is opened in a browser.

  • Objects that are not supported by the DsWord OM are not exported.

  • Footnotes are not exported.

  • Export of text boxes is supported, but linked text boxes, text outlines, gradients, font fill and line effect, and the "Do not rotate text" option are not supported.

  • When exporting shapes, custom shapes, ink shapes, sketched lines, gradient lines, and compound lines are not supported.

  • Only stored values in fields are exported (i.e. no recalculation), with the following exceptions:

    • Page numbering is supported.

    • Hyperlink fields are partially supported.

  • Comments can optionally be exported, but their formatting is not fully retained.