How to Extract PDF Form Data and Convert it to XML Using C#

October 30, 2025

Quick Start Guide
Tutorial Concept	This tutorial shows how to extract PDF form fields and save them as XML using C# and a .NET PDF API in just a few lines of code.
What You Will Need	NuGet package: DS.Documents.Pdf
Controls Referenced	Document Solutions for PDF - .NET PDF API Library Online Demo Explore | Documentation

PDFs are one of the most common formats for creating and distributing forms. From job applications and legal agreements to surveys and banking documents, organizations rely on fillable PDF forms to gather information efficiently. Once a form has been filled out, developers often need a way to extract its data for processing or storage.

In this short guide, you’ll learn how to use Document Solutions for PDF, a .NET PDF API, to extract form data from a PDF and save it as XML with just a few lines of C# code.

Download a free trial of Document Solutions for PDF and start automating your PDF workflows today.

Tutorial: Convert PDF Form Data to XML in .NET Apps

Why Export PDF Form Data to XML
Step 1: Import Required Namespaces
Step 2: Load the PDF Form
Step 3: Export Form Data to a Memory Stream
Step 4: Export Form Data to an XML File
Step 5: Verify the XML Output

Download a finished sample application to follow along with!

Why Export PDF Form Data to XML

XML is one of the most flexible formats for exchanging structured data. Exporting PDF form fields as XML makes it easy to:

Integrate PDF responses into databases or analytics systems.
Automate workflows like surveys, HR applications, or billing.
Maintain field structure and consistency across tools.

Import Required Namespaces

For this tutorial, we will use a .NET 8 console application. To begin, add Document Solutions for PDF to the project by installing the latest release of the DS.Documents.Pdf package from NuGet.

Import the needed namespaces:

using GrapeCity.Documents.Pdf;
using System.IO;

To learn more about getting started with the .NET PDF API, see the documentation.

Load the PDF Form

Create a document object with the GcPdfDocument constructor, then call its Load method to read the PDF form you want to extract data from. For this example, we’ll use a sample file named filledForm.pdf.

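// Create the PDF document object
GcPdfDocument doc = new GcPdfDocument();

// Open the filled form for reading; 'using' disposes the stream at the end of the scope.
// Keep the stream open for as long as the document is in use.
using var fs = new FileStream("filledForm.pdf", FileMode.Open, FileAccess.Read);

// Load the document
doc.Load(fs);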

Export Form Data to a Memory Stream

If you want to process the data programmatically, for example by sending it to another system or API, you can export the PDF form data into a MemoryStream using the .NET PDF API’s ExportFormDataToXML method.

// Create a new memory stream
MemoryStream stream = new MemoryStream();

// Export form data into the stream
doc.ExportFormDataToXML(stream);

At this point, the stream contains the XML-formatted form data, which you can read or manipulate in your code.
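
One detail that is easy to miss when reading the stream back: writing normally leaves a MemoryStream's position at the end of the data, so it has to be rewound first. The sketch below shows one way to do that with XDocument from System.Xml.Linq. So that it runs without the library or a PDF, it fills the stream with a small hand-written sample in place of the doc.ExportFormDataToXML(stream) call; the sample's shape is an assumption based on the output shown in this tutorial.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

// Stand-in for the tutorial's export step. In the real application this stream
// would be filled by doc.ExportFormDataToXML(stream); the XML below is a
// hand-written sample so the sketch runs on its own.
const string sampleXml =
    "<fields xmlns:xfdf=\"http://ns.adobe.com/xfdf-transition/\">" +
    "<fldAllergies>Tree Nuts</fldAllergies>" +
    "<radiogroupPainRating>2</radiogroupPainRating>" +
    "</fields>";

using var stream = new MemoryStream();
byte[] bytes = Encoding.UTF8.GetBytes(sampleXml);
stream.Write(bytes, 0, bytes.Length);

// Writing leaves the position at the end of the data, so rewind before reading.
stream.Position = 0;

// Parse the XML; each child of the root element is one form field.
XDocument xml = XDocument.Load(stream);
XElement root = xml.Root
    ?? throw new InvalidDataException("The stream contains no XML root element.");

foreach (XElement field in root.Elements())
{
    Console.WriteLine($"{field.Name.LocalName} = {field.Value}");
}
```

From here, the parsed elements can be filtered, validated, or forwarded without writing anything to disk.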

Export PDF Form Data to an XML File

If you simply want to save the extracted data to disk, use the same ExportFormDataToXML method with a file path:

// Export form data directly to an XML file
doc.ExportFormDataToXML("sampleFormData.xml");

This will create an XML file containing all form field names and their values from the PDF.
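
When more than one form is exported, a fixed file name such as sampleFormData.xml would be overwritten on every run. One option, sketched below, is to derive the XML path from the PDF's own file name. The helper name BuildXmlPath and the output folder are hypothetical and not part of the library; the resulting path would then be passed to ExportFormDataToXML.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of Document Solutions for PDF): builds an
// output path that reuses the PDF's file name with an .xml extension.
static string BuildXmlPath(string pdfPath, string outputDir)
{
    // Create the output folder if it does not exist yet (no-op otherwise).
    Directory.CreateDirectory(outputDir);

    string xmlName = Path.ChangeExtension(Path.GetFileName(pdfPath), ".xml");
    return Path.Combine(outputDir, xmlName);
}

// Assumed output location for this sketch: a folder under the temp directory.
string outputDir = Path.Combine(Path.GetTempPath(), "form-exports");
string xmlPath = BuildXmlPath("filledForm.pdf", outputDir);

Console.WriteLine(xmlPath);

// In the tutorial's code, the export call would then be:
// doc.ExportFormDataToXML(xmlPath);
```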


Verify the XML Output

The exported XML will look similar to this:

<fields xmlns:xfdf="http://ns.adobe.com/xfdf-transition/">
<fldEmail>John Smith</fldEmail>
<fld1586527433974>j.smith@abc.inc</fld1586527433974>
<fldAddress>1234 Wilber Lane</fldAddress>
<fldDateOfBirth>01/02/1995</fldDateOfBirth>
<fldBestPhoneNumbers>123-456-7890</fldBestPhoneNumbers>
<fldCityStateZip>12312 WA</fldCityStateZip>
<fldEmergencyContact>Susan Smith / Mother</fldEmergencyContact>
<fldEmergencyPhoneNumbers>123-456-7891</fldEmergencyPhoneNumbers>
<radiogroupPainRating>2</radiogroupPainRating>
<radiogroupStressLevel>5</radiogroupStressLevel>
<fldParentGuardianName>John Smith</fldParentGuardianName>
<fldParentGuardianDate>10/27/2025</fldParentGuardianDate>
<fldPrescriptionsList>n/a</fldPrescriptionsList>
<fldAllergies>Tree Nuts</fldAllergies>
<fldTraumaticEvent>No</fldTraumaticEvent>
<fldDizzyness>No</fldDizzyness>
<fldBackPain>No</fldBackPain>
<fldKneePain>Yes left due to prior injuries</fldKneePain>
<fldShoulderPain>No</fldShoulderPain>
<fldInjuries>Tore ACL in 2014</fldInjuries>
<fldSurgeries>Left Leg</fldSurgeries>
<fldHighBloodPressureOrClots>No</fldHighBloodPressureOrClots>
<fldAsthmaCOPDEmphysema>No</fldAsthmaCOPDEmphysema>
<fldDiabetes>No</fldDiabetes>
<fldHeartCondition>No</fldHeartCondition>
<fldDifficultySleeping>No</fldDifficultySleeping>
<fldMentalHealth>No</fldMentalHealth>
<fldOtherHealthConditions>No</fldOtherHealthConditions>
<fldExerciseActivities>No</fldExerciseActivities>
</fields>

You can now load this XML into other systems, databases, or analytics tools.
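
As a rough illustration of that hand-off, the sketch below turns the XML into a dictionary and then into JSON, which many web APIs and databases accept. It assumes a flat list of uniquely named fields, as in the sample output above, and uses a shortened copy of that output as input; only standard .NET types are involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Xml.Linq;

// A shortened copy of the exported XML shown above, used as sample input.
const string exportedXml = @"<fields xmlns:xfdf=""http://ns.adobe.com/xfdf-transition/"">
<fldAllergies>Tree Nuts</fldAllergies>
<fldKneePain>Yes left due to prior injuries</fldKneePain>
<radiogroupPainRating>2</radiogroupPainRating>
</fields>";

XElement root = XDocument.Parse(exportedXml).Root
    ?? throw new InvalidOperationException("No root element found.");

// Map each field name (element name) to its value (element text).
// ToDictionary throws if two elements share a name, so this assumes unique field names.
Dictionary<string, string> fields = root.Elements()
    .ToDictionary(e => e.Name.LocalName, e => e.Value);

// Serialize the dictionary as JSON, for example to post it to a web API.
string json = JsonSerializer.Serialize(fields);
Console.WriteLine(json);
```

All values arrive as strings, including radio group selections such as the pain rating, so any numeric or date conversion is left to the consuming system.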


Conclusion

Extracting and converting PDF form data into XML is a common requirement in many enterprise workflows, from HR systems to customer feedback management. With Document Solutions for PDF, developers can easily automate this process using just a few lines of C#.

This just scratches the surface of what developers can automate with this .NET PDF API. Check out our online demos and documentation to learn more!

Mackenzie Albitz

Product Marketing Specialist