How to Extract PDF Form Data and Convert it to XML Using C#
| Quick Start Guide | |
|---|---|
| Tutorial Concept | This tutorial shows how to extract PDF form fields and save them as XML using C# and a .NET PDF API in just a few lines of code. | 
| What You Will Need | NuGet package: Ds.Documents.Pdf | 
| Controls Referenced | Document Solutions for PDF - .NET PDF API Library | 
PDFs are one of the most common formats for creating and distributing forms. From job applications and legal agreements to surveys and banking documents, organizations rely on fillable PDF forms to gather information efficiently. Once a form has been filled in, developers often need a way to extract its data for processing or storage.
In this short guide, you’ll learn how to use Document Solutions for PDF, a .NET PDF API, to extract form data from a PDF and save it as XML using just a few lines of C# code.
Download a free trial of Document Solutions for PDF and start automating your PDF workflows today.
Tutorial: Convert PDF Form Data to XML in .NET Apps
- Why Export PDF Form Data to XML
- Step 1: Import Required Namespaces
- Step 2: Load the PDF Form
- Step 3: Export Form Data to a Memory Stream
- Step 4: Export Form Data to an XML File
- Step 5: Verify the XML Output
Download a finished sample application to follow along with!
Why Export PDF Form Data to XML
XML is one of the most flexible formats for exchanging structured data. Exporting PDF form fields as XML makes it easy to:
- Integrate PDF responses into databases or analytics systems.
- Automate workflows like surveys, HR applications, or billing.
- Maintain field structure and consistency across tools.
Import Required Namespaces
For this tutorial, we will use a .NET 8 Console application. To begin, add Document Solutions for PDF to the application by installing the latest release of the Ds.Documents.Pdf package from NuGet.

Import the needed namespaces:
using GrapeCity.Documents.Pdf;
using System.IO;
To learn more about getting started with the .NET PDF API, see the documentation.
Load the PDF Form
Initialize the .NET PDF instance using the GcPdfDocument constructor. Then invoke the Load method to read the PDF form you want to extract data from. For this example, we’ll use a sample file named filledForm.pdf.
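GcPdfDocument doc = new GcPdfDocument();
// Open the PDF as a read-only file stream; "using" disposes it when it goes out of scope,
// so the stream stays open for as long as the document is in use below
using var fs = new FileStream("filledForm.pdf", FileMode.Open, FileAccess.Read);
// Load the document from the stream
doc.Load(fs);
Export Form Data to a Memory Stream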
If you want to process the data programmatically, for example by sending it to another system or API, you can export the PDF form data into a MemoryStream using the .NET PDF API’s ExportFormDataToXML method.
// Create a new memory stream
MemoryStream stream = new MemoryStream();
// Export form data into the stream
doc.ExportFormDataToXML(stream);
At this point, the stream contains the XML-formatted form data, which you can read or manipulate in your code.
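As a minimal sketch of reading that stream back as text, the snippet below uses only standard .NET types. To stay runnable without a PDF, it fills a MemoryStream with a short, hypothetical XML sample as a stand-in for the stream produced by the export call. It also assumes the XML is UTF-8 encoded and that the stream position sits at the end after writing, which is why the stream is rewound before reading.

```csharp
using System;
using System.IO;
using System.Text;

// Stand-in for the stream filled by doc.ExportFormDataToXML(stream).
// The XML below is a shortened, hypothetical sample, not real export output.
var stream = new MemoryStream();
byte[] sample = Encoding.UTF8.GetBytes(
    "<fields><fldAllergies>Tree Nuts</fldAllergies></fields>");
stream.Write(sample, 0, sample.Length);

// Writing leaves the position at the end of the stream, so rewind before reading.
stream.Position = 0;

// Read the whole stream as text. UTF-8 is an assumption here;
// leaveOpen keeps the MemoryStream usable after the reader is disposed.
string xmlText;
using (var reader = new StreamReader(stream, Encoding.UTF8,
           detectEncodingFromByteOrderMarks: true, bufferSize: 1024, leaveOpen: true))
{
    xmlText = reader.ReadToEnd();
}

Console.WriteLine(xmlText);
```

From here the string can be posted to a web service, logged, or handed to an XML parser. If the receiving code expects raw bytes instead, stream.ToArray() returns the full buffer regardless of the current position.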
Export PDF Form Data to an XML File
If you simply want to save the extracted data to disk, use the same ExportFormDataToXML method with a file path:
// Export form data directly to an XML file
doc.ExportFormDataToXML("sampleFormData.xml");
This will create an XML file containing all form field names and their values from the PDF.

Verify the XML Output
The exported XML will look similar to this:
<fields xmlns:xfdf="http://ns.adobe.com/xfdf-transition/">
<fldEmail>John Smith</fldEmail>
<fld1586527433974>j.smith@abc.inc</fld1586527433974>
<fldAddress>1234 Wilber Lane</fldAddress>
<fldDateOfBirth>01/02/1995</fldDateOfBirth>
<fldBestPhoneNumbers>123-456-7890</fldBestPhoneNumbers>
<fldCityStateZip>12312 WA</fldCityStateZip>
<fldEmergencyContact>Susan Smith / Mother</fldEmergencyContact>
<fldEmergencyPhoneNumbers>123-456-7891</fldEmergencyPhoneNumbers>
<radiogroupPainRating>2</radiogroupPainRating>
<radiogroupStressLevel>5</radiogroupStressLevel>
<fldParentGuardianName>John Smith</fldParentGuardianName>
<fldParentGuardianDate>10/27/2025</fldParentGuardianDate>
<fldPrescriptionsList>n/a</fldPrescriptionsList>
<fldAllergies>Tree Nuts</fldAllergies>
<fldTraumaticEvent>No</fldTraumaticEvent>
<fldDizzyness>No</fldDizzyness>
<fldBackPain>No</fldBackPain>
<fldKneePain>Yes left due to prior injuries</fldKneePain>
<fldShoulderPain>No</fldShoulderPain>
<fldInjuries>Tore ACL in 2014</fldInjuries>
<fldSurgeries>Left Leg</fldSurgeries>
<fldHighBloodPressureOrClots>No</fldHighBloodPressureOrClots>
<fldAsthmaCOPDEmphysema>No</fldAsthmaCOPDEmphysema>
<fldDiabetes>No</fldDiabetes>
<fldHeartCondition>No</fldHeartCondition>
<fldDifficultySleeping>No</fldDifficultySleeping>
<fldMentalHealth>No</fldMentalHealth>
<fldOtherHealthConditions>No</fldOtherHealthConditions>
<fldExerciseActivities>No</fldExerciseActivities>
</fields>
You can now load this XML into other systems, databases, or analytics tools.
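To illustrate one way of consuming this output, here is a small sketch that turns the exported XML into a name-to-value lookup with System.Xml.Linq. It parses a three-field excerpt of the XML shown above so that it runs on its own. In a real application you would likely call XDocument.Load("sampleFormData.xml") instead. The sketch assumes every field name appears only once in the export.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Excerpt of the exported XML shown above, embedded so the sketch is self-contained.
const string exportedXml = @"<fields xmlns:xfdf=""http://ns.adobe.com/xfdf-transition/"">
<fldAddress>1234 Wilber Lane</fldAddress>
<radiogroupPainRating>2</radiogroupPainRating>
<fldAllergies>Tree Nuts</fldAllergies>
</fields>";

XDocument xml = XDocument.Parse(exportedXml);
XElement root = xml.Root
    ?? throw new InvalidOperationException("The XML has no root element.");

// Each child element is one form field: element name = field name, text = value.
// ToDictionary throws on duplicate keys, so this assumes unique field names.
Dictionary<string, string> fields = root.Elements()
    .ToDictionary(e => e.Name.LocalName, e => e.Value);

// Look up an individual field by name.
Console.WriteLine(fields["fldAllergies"]);

// Values are exported as text, so convert them where a number is expected.
int painRating = int.Parse(fields["radiogroupPainRating"]);
Console.WriteLine(painRating);
```

A dictionary keeps the lookup simple for flat forms like this one. If a form can contain repeated or nested field names, iterating over root.Elements() directly is the safer choice.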
Conclusion
Extracting and converting PDF form data into XML is a common requirement in many enterprise workflows, from HR systems to customer feedback management. With Document Solutions for PDF, developers can easily automate this process using just a few lines of C#.
This just scratches the surface of what developers can automate with this .NET PDF API. Check out our online demos and documentation to learn more!
