
.NET PDF AI Assistant: How to Summarize PDFs, Extract Tables, and Build Outlines Automatically

Quick Start Guide
Tutorial Concept

Learn how to use the .NET PDF AI Assistant (DsPdfAI) to summarize long reports, extract structured tables, and build navigable outlines from simple natural-language prompts, all programmatically in C# .NET apps.

What You Will Need

NuGet packages:

Controls Referenced

Document Solutions for PDF - AI Assistant

Long PDF reports, like financial statements or compliance documents, are packed with valuable data but notoriously hard to work with. PDFs were never designed as structured databases, so extracting tables, creating summaries, or adding outlines reliably can be time-consuming and hard to automate in workflows.

Developers are turning to .NET PDF AI assistants, like the Document Solutions for PDF AI (DsPdfAI) package, to help automate these tasks. With a PDF AI Assistant, .NET developers can do all of that quickly and programmatically with natural-language prompts, using C# or VB. Whether you need a quick abstract, structured tables, or a navigable outline tree, DsPdfAI turns bulky PDFs into actionable data in just a few lines of code.

Download a trial of Document Solutions for PDF Today!

.NET Developers Guide: Automating PDF Summaries, Tables, and Outlines

  1. Getting Started with .NET PDF AI: DsPdfAI
  2. Summarizing a PDF – GetAbstract & GetSummary Methods
  3. Extracting PDF Tables using AI – GetTable Method
  4. Building PDF Outlines using .NET PDF AI – BuildOutlines Method
  5. Best Practices
  6. Beyond .NET: PDF AI Use in the Browser
  7. Conclusion

Download a Finished Sample App to Follow Along!


Getting Started with .NET PDF AI: DsPdfAI

To use the .NET DsPdfAI, install these two NuGet packages:

Then add your OpenAI or Azure OpenAI credentials:

using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Pdf.AI;

var openAiToken = "YOUR_OPENAI_TOKEN";
// Initialize the .NET PDF AI Assistant
var assistant = new OpenAIDocumentAssistant(openAiToken);

Load your PDF into a GcPdfDocument instance using the .NET PDF API’s Load method.

// Create a .NET PDF document instance
var doc = new GcPdfDocument();
// Load an existing PDF (the stream must stay open while the document is in use)
using var fs = File.OpenRead("./resource/DsPdfAI_LargeReport.pdf");
doc.Load(fs);

The application is now ready to send natural-language prompts to the AI Assistant! To learn more, see our Setting Up the DsPdfAI Environment documentation.
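
The token in the snippet above is hard-coded for brevity. A minimal sketch of one alternative, reading it from an environment variable so it stays out of source control, is shown here. The helper name `ReadToken` and the variable name `OPENAI_API_KEY` are assumptions for illustration and are not part of DsPdfAI.

```csharp
using System;

// Hypothetical helper: reads a token from an environment variable and fails
// fast with a clear message when the variable is missing or blank.
static string ReadToken(string variableName)
{
    var token = Environment.GetEnvironmentVariable(variableName);
    if (string.IsNullOrWhiteSpace(token))
        throw new InvalidOperationException(
            $"Environment variable '{variableName}' is not set.");
    return token.Trim();
}

// "OPENAI_API_KEY" is an assumed name; use whatever your deployment defines.
// The result would be passed to the assistant constructor shown above:
//   var assistant = new OpenAIDocumentAssistant(ReadToken("OPENAI_API_KEY"));
```

Failing at startup when the variable is absent tends to be easier to diagnose than an authentication error returned later by the AI service.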


Summarizing a PDF using AI in .NET Apps

Large reports often start with dozens of pages of narrative. Instead of reading line by line, use the .NET DsPdfAI to generate a summary or abstract easily by invoking the GetAbstract and GetSummary methods.

  • Abstracts → one-paragraph overview (great for cover pages).
  • Summaries → multi-paragraph digest of specific page ranges.

// Set the abstract message
assistant.GetAbstractMessage = "Return a brief abstract suitable for executives.";
string abs = await assistant.GetAbstract(doc);

// Set the summary message
assistant.GetSummaryMessage = "Summarize the first five pages of this report.";
// Set the page range to be summarized, in this case pages 1 to 5
OutputRange o1 = new OutputRange("1-5");
var summary = await assistant.GetSummary(doc, pageRange: o1);

Developers can then take these AI responses and write them to a PDF page. In this case, we will add a summary page to the beginning of the loaded PDF:

// Requires: using System.Drawing; using GrapeCity.Documents.Text;
// Insert a new page at the front of the PDF
var summaryPage = doc.Pages.Insert(0);

float margin = 72f; // 1 inch
var pageSize = summaryPage.Size;

// Build a layout that wraps within the margins
var tl = new TextLayout(72); // 72 dpi
tl.MaxWidth = pageSize.Width - margin * 2;
tl.MaxHeight = pageSize.Height - margin * 2;
tl.ParagraphSpacing = 8;
tl.Append("The AI Generated Summary: " + summary); // Append the summary response
tl.PerformLayout(true);

// Draw wrapped text
summaryPage.Graphics.DrawTextLayout(tl, new PointF(margin, margin));

See the interactive Get AI Abstract or Get AI Summary online demos, or learn more by reading the PDF AI documentation.


Extracting PDF Tables using AI 

PDF tables hold key numbers in financial statements, like income statements, KPIs, or segment breakdowns. However, PDF tables are different from tables in programs like Excel or Word. These PDF "tables" lack a deeper structure; nothing in the file defines rows, columns, or cells. They're created by arranging text and shapes to look like tables.

Using the .NET PDF AI Assistant, developers can extract them with the GetTable method:

// GetTableMessageFmt is the general request to the AI, the following is its default value
assistant.GetTableMessageFmt = "Please analyze the PDF. {0}. Return the table with its header row included, without additional explanation.";
// This argument is the tableRequest
var table = await assistant.GetTable(doc, "Extract the table from the chapter named \"3.1 Record\" including the headers.");

foreach (var row in table.Rows)
{
    Console.WriteLine(string.Join("\t", row.Cells.Select(c => c.Text)));
}
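
To see how the two strings above might combine, here is a minimal sketch that fills the `{0}` placeholder with `string.Format`. That the library composes its prompt this way is an assumption suggested by the placeholder, not something the article confirms, and `ComposePrompt` is a hypothetical helper. It also shows why a request ending in a period would double up with the period the template already places after `{0}`.

```csharp
using System;

// Default template quoted above; {0} is replaced by the table request.
const string template =
    "Please analyze the PDF. {0}. Return the table with its header row included, without additional explanation.";

// Hypothetical helper: trims whitespace and trailing periods from the request
// so the template's own "." after {0} is not doubled.
static string ComposePrompt(string format, string request) =>
    string.Format(format, request.Trim().TrimEnd('.'));

string prompt = ComposePrompt(template,
    "Extract the table from the chapter named \"3.1 Record\" including the headers.");

// Print the composed prompt to inspect what would be sent.
Console.WriteLine(prompt);
```

Printing the composed text while tuning a prompt makes it easier to spot stray punctuation or a request that contradicts the template.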

This gives developers structured data that they can:

  • Export to CSV or Excel (.xlsx) using Document Solutions for Excel
  • Re-render into a new PDF
  • Feed into dashboards or analysis tools
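
For the CSV option, a minimal sketch using only the .NET base library might look like the following. It assumes the cell text has already been copied into plain string lists (for example with `row.Cells.Select(c => c.Text)` as in the loop above); the sample rows, the helper names, and the output file name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

// Quote a field when it contains a comma, a double quote, or a line break,
// doubling any embedded quotes (RFC 4180 style).
static string EscapeCsv(string field)
{
    field ??= string.Empty;
    bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
    return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
}

// Join each row's cells with commas and the rows with CRLF.
static string ToCsv(IEnumerable<IEnumerable<string>> rows) =>
    string.Join("\r\n", rows.Select(r => string.Join(",", r.Select(EscapeCsv))));

// Hypothetical sample rows standing in for the extracted cell text.
var rows = new List<List<string>>
{
    new() { "Region", "Revenue" },
    new() { "North America", "1,250" },
    new() { "EMEA", "980" },
};

File.WriteAllText("ExtractedTable.csv", ToCsv(rows), Encoding.UTF8);
```

Quoting matters here because figures extracted from financial tables often contain thousands separators, which would otherwise split one value across two CSV columns.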

Bonus: Export the Extracted PDF Table Data to Excel

Using a .NET Excel library, like Document Solutions for Excel, in combination with DsPdfAI, .NET developers can take the table data extracted from the PDF and export it to an Excel file with formatted data.

using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Pdf.AI;
using GrapeCity.Documents.Excel;
...
var wb = new Workbook();
var ws = wb.Worksheets[0];
// 1) Read headers from the AI table data
var headers = table.Cols?.Select(c => c.Name).ToList() ?? new List<string>();
bool hasHeaders = headers.Count > 0;
// 2) Compute sizes
int colCount = Math.Max(headers.Count, table.Rows.Max(r => r.Cells.Count));
int startRow = hasHeaders ? 1 : 0;
int totalRows = startRow + table.Rows.Count;
// 3) Write header row at row 0
if (hasHeaders)
{
    for (int c = 0; c < headers.Count; c++)
        ws.Cells[0, c].Value = headers[c];
}
// 4) Write data rows (start at row 1 if headers exist)
for (int r = 0; r < table.Rows.Count; r++)
{
    for (int c = 0; c < colCount; c++)
    {
        var text = (c < table.Rows[r].Cells.Count)
            ? table.Rows[r].Cells[c].Text
            : string.Empty;
        ws.Cells[startRow + r, c].Value = text;
    }
}
// 5) Create the Excel table over the full range (including header row)
var rng = ws.Range[0, 0, totalRows, colCount];
ws.Tables.Add(rng, hasHeaders);
ws.Columns.AutoFit();
// 6) Save to XLSX
wb.Save("ExtractedTable.xlsx");


To learn more about programmatically extracting table data from a PDF using AI, check out our online interactive demos or read the PDF AI documentation.


Building PDF Outlines using .NET PDF AI

Large PDFs are hard to navigate without bookmarks. Using the .NET DsPdfAI, developers can automatically generate a PDF outline tree in the document with the BuildOutlines method:

await assistant.BuildOutlines(doc);
doc.Save("Report_WithOutlines.pdf");

Now readers can click through sections like Executive Overview, Financials, Notes, and Risk Factors right in their PDF viewer.


See the Generate Outlines documentation or try out the interactive online demos.


Best Practices

  • Reference headings or chapters instead of page numbers.
  • Keep prompts specific (e.g., “Extract the Revenue by Geography table”).
  • Validate AI outputs before using in production.
  • Use summaries + outlines together to deliver digestible reports.
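
As one way to act on the validation advice, the sketch below runs basic shape checks on extracted table text before it is passed downstream. It works on plain string collections rather than the DsPdfAI table type, and the helper name, the specific checks, and the sample values are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns a list of human-readable problems.
// An empty list means every check passed.
static List<string> ValidateTable(
    IReadOnlyList<string> headers,
    IReadOnlyList<IReadOnlyList<string>> rows)
{
    var problems = new List<string>();

    if (rows.Count == 0)
        problems.Add("Table has no data rows.");

    if (headers.Any(string.IsNullOrWhiteSpace))
        problems.Add("One or more header cells are blank.");

    // Every data row should have exactly one cell per header.
    for (int i = 0; i < rows.Count; i++)
    {
        if (rows[i].Count != headers.Count)
            problems.Add($"Row {i + 1} has {rows[i].Count} cells; expected {headers.Count}.");
    }

    return problems;
}

// Hypothetical sample: the second row is missing a cell.
var headers = new[] { "Region", "Revenue" };
var rows = new[] { new[] { "EMEA", "980" }, new[] { "APAC" } };

var problems = ValidateTable(headers, rows);
foreach (var problem in problems)
    Console.WriteLine(problem);
```

Checks like these only catch structural problems; spot-checking values against the source PDF is still needed to catch figures the model misread.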

Beyond .NET: PDF AI Use in the Browser

While DsPdfAI brings AI into server-side .NET apps, the Document Solutions JavaScript PDF Viewer lets you offer AI features to end users in the browser. Developers can customize the toolbar to integrate AI, allowing users to ask questions, request summaries, or extract insights interactively. Try the online PDF Viewer demo here.


The Client-Side PDF Viewer is Included in the Trial of Document Solutions for PDF. Try It Today!


Conclusion

The .NET PDF API library Document Solutions for PDF makes it easy to transform bulky PDFs into actionable, developer-friendly outputs. With just a few lines of code, the AI Assistant methods can summarize long reports, extract structured tables, and build navigable outlines.

Together, these features save time, reduce manual work, and let you deliver smarter PDFs inside your .NET applications.
