Skip to main content Skip to footer

How to Programmatically Parse Formulas in Excel Using C#

Quick Start Guide
Tutorial Concept

Learn how to parse and analyze Excel formulas using C# and .NET. Programmatically inspect cell formulas, extract references, and understand Excel calculation logic to build powerful spreadsheet automation tools.

What You Will Need .NET 8 or higher, and the DS.Documents.Excel NuGet package.
Controls Referenced

Document Solutions for Excel, .NET API

Documentation | Demos

Excel formulas are the core of many spreadsheet workflows. From financial modeling to operational reporting, formulas drive the calculations that transform raw data into meaningful insights.

However, when working with Excel programmatically, formulas are often treated as simple strings. This makes them difficult to analyze, modify, or reuse safely—especially when formulas become complex.

Document Solutions for Excel (DsExcel) provides a Formula Parsing API that allows developers to convert Excel formulas into a syntax tree representation. This syntax tree exposes the semantic structure of the formula, allowing applications to inspect functions, operators, and references programmatically.

Using this approach, developers can reliably analyze and modify formulas without fragile string manipulation.

In this tutorial, you will learn how to parse and modify Excel formulas using C# and .NET by building a simple automated sales reporting scenario.

Download a free trial of Document Solutions for Excel and start automating your formula parsing workflows today.

.NET Developer Guide to Programmatically Parsing Excel Formulas Using C#


Parse and Analyze Excel Formulas in C# with .NET

Developers often need to analyze or modify Excel formulas programmatically when generating reports, validating spreadsheets, or building automation workflows.

Using the Document Solutions for Excel formula parser, applications can:

  • Parse Excel formulas into a syntax tree
  • Inspect functions, operators, and references
  • Programmatically modify formulas in C#
  • Generate formulas dynamically in .NET applications

Instead of manipulating formulas as raw strings, the parser exposes a structured semantic model that makes formula analysis reliable and scalable.


What You Will Need

Before beginning this tutorial, make sure you have the following installed:

  • Visual Studio 2022 or later
  • .NET 8 SDK or later
  • Document Solutions for Excel (DsExcel)

Install the NuGet package:

Install-Package DS.Documents.Excel

Or download the latest release of the .NET Excel API as a trial here.

The formula parsing API is available in the namespace:

using DS.Documents.Excel.Expressions;

Understanding Excel Formula Syntax Trees

When a formula is parsed using DsExcel, it is converted into a syntax tree that represents the semantic structure of the formula.

Each part of the formula becomes a node within the tree.

Common node types include:

Node Type Description
FunctionNode Represents Excel functions
OperatorNode Represents mathematical operators
ReferenceNode Represents cell or range references
TextNode Represents text literals

In total, the library includes 16 different node classes. The FormulaSyntaxTree class is the entry point to the formula parsing API.

To parse a formula, call:

FormulaSyntaxTree.Parse(formula)

Important: The formula string must not include the leading "=" character when parsing.

Once parsed, the Root property gives access to the root node of the syntax tree.


Create the C# Console Application

Start by creating a .NET console application and importing the required namespaces.

using System;
using System.Collections.Generic;
using GrapeCity.Documents.Excel;
using GrapeCity.Documents.Excel.Expressions;

Next, define the program entry point.

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Excel formula parsing demo.");
    }
}

Example Scenario: Automating a Sales Report

In this tutorial, we will build a simple sales analysis report from raw sales data.

The spreadsheet contains the following columns:

  • Sales Representative
  • Region
  • Product
  • Units Sold

The goal is to categorize sales performance using the following rules:

Units Sold Result
< 2999 Below Target
> 3000 On Target
> 5000 Above Target

This classification is generated using a formula that combines several Excel functions:

  • IF
  • ISNUMBER
  • FILTER
  • IFS

Example formula:

=IF(ISNUMBER(FILTER(A2:D19,A2:A19="Fritz")),
IFS(
FILTER(A2:D19,A2:A19="Fritz")>5000,"Above Target",
FILTER(A2:D19,A2:A19="Fritz")>3000,"On Target",
FILTER(A2:D19,A2:A19="Fritz")<2500,"Below Target"
),
FILTER(A2:D19,A2:A19="Fritz"))

Instead of manually rewriting this formula for each sales representative, we will parse the formula and dynamically replace the representative name.


Step 1: Initialize a Workbook with Sample Data and a Formula to Parse

Start by creating a new workbook, adding sample sales data, and inserting the formula that will later be parsed and modified.

var workbook = new Workbook();
workbook.AllowDynamicArray = true;

var worksheet = workbook.Worksheets[0];
worksheet.Name = "Sales Report";

// Headers
worksheet.Range["A1"].Value = "Sales Representative";
worksheet.Range["B1"].Value = "Region";
worksheet.Range["C1"].Value = "Product";
worksheet.Range["D1"].Value = "Units Sold";

// Sample sales data
object[,] salesData =
{
    { "Fritz", "North", "Laptop", 5200 },
    { "Fritz", "South", "Monitor", 3400 },
    { "Fritz", "East", "Keyboard", 2100 },
    { "Xi", "North", "Laptop", 6100 },
    { "Xi", "West", "Mouse", 3200 },
    { "Xi", "South", "Monitor", 1800 },
    { "Maria", "East", "Laptop", 4700 },
    { "Maria", "North", "Keyboard", 2300 },
    { "Maria", "West", "Monitor", 5100 },
    { "James", "South", "Laptop", 2800 },
    { "James", "East", "Mouse", 3900 },
    { "James", "North", "Monitor", 1500 }
};

worksheet.Range["A2:D13"].Value = salesData;

// Section labels
worksheet.Range["F1"].Value = "Unique Rep";
worksheet.Range["H1"].Value = "Sales Analysis";

// Starter formula for Fritz
worksheet.Range["H3"].Formula =
    "=IF(ISNUMBER(FILTER(A2:D13,A2:A13=\"Fritz\"))," +
    "IFS(" +
    "FILTER(A2:D13,A2:A13=\"Fritz\")>5000,\"Above Target\"," +
    "FILTER(A2:D13,A2:A13=\"Fritz\")>3000,\"On Target\"," +
    "FILTER(A2:D13,A2:A13=\"Fritz\")<2999,\"Below Target\"" +
    ")," +
    "FILTER(A2:D13,A2:A13=\"Fritz\"))";

// Basic formatting
worksheet.Range["A1:D1,F1,H1"].Font.Bold = true;
worksheet.Range["A1:D1,F1,H1"].Interior.Color = System.Drawing.Color.FromArgb(68, 114, 196);
worksheet.Range["A1:D1,F1,H1"].Font.Color = System.Drawing.Color.White;
worksheet.Range["A:D"].EntireColumn.AutoFit();
worksheet.Range["F:H"].EntireColumn.AutoFit();

At this point, the workbook contains raw sales data and an Excel formula in cell H3 that is ready to be parsed, as shown below:

Initialize a .NE Excel Workbook with Sample Data and a Formula to Parse


Step 2: Extract the Formula

Next, retrieve the formula from the worksheet.

Because the parser expects the formula text without the leading equals sign, remove it before parsing.

var originalFormula = worksheet.Range["H3"].Formula.Substring(1);

This gives you a clean formula string that can be passed to the parser.


Step 3: Parse the Formula

Now generate a syntax tree using FormulaSyntaxTree.Parse().

var syntaxTree = FormulaSyntaxTree.Parse(originalFormula);

To inspect the structure of the formula, recursively walk through the syntax tree and print each node.

void Flatten(SyntaxNode node, int level)
{
    Console.WriteLine($"{new string(' ', level * 2)}{node.GetType().Name}: {node}");

    foreach (var child in node.Children)
    {
        Flatten(child, level + 1);
    }
}

Flatten(syntaxTree.Root, 0);

This produces a hierarchical representation of the formula and reveals the functions, operators, references, and text values it contains.

For this formula, you will see node types such as:

  • FunctionNode(IF)
  • FunctionNode(FILTER)
  • OperatorNode(>)
  • TextNode("Fritz")
  • TextNode("Above Target")

This syntax tree is what makes safe programmatic formula modification possible. The output in your console window will appear as below:

Parse the Formulas from XLSX using C Sharp


Step 4: Modify the Formula Using Syntax Nodes

Once the formula has been parsed, locate the text value you want to replace.

In this example, the sales representative name "Fritz" appears as a TextNode. We can replace it with another representative name such as "Xi".

var findNode = new TextNode("Fritz");
var replaceNode = new TextNode("Xi");

Next, traverse the syntax tree and replace every matching node.

void ReplaceNode(SyntaxNode lookIn, SyntaxNode find, SyntaxNode replacement)
{
    var children = lookIn.Children;
    for (int i = 0; i < children.Count; i++)
    {
        var child = children[i];
        if (child.Equals(find))
        {
            children[i] = replacement;
        }
        else
        {
            ReplaceNode(child, find, replacement);
        }
    }
}
ReplaceNode(syntaxTree.Root, findNode, replaceNode);

After the replacement is complete, rebuild the formula text from the modified syntax tree.

var updatedFormula = "=" + syntaxTree.ToString();

This gives you a new valid Excel formula with the updated representative name.


Step 5: Apply the Updated Formula for All Sales Representatives

Now expand the example so the formula is reused for every representative in the dataset.

First, generate a unique list of representative names using Excel’s UNIQUE function.

worksheet.Range["F2"].Formula = "=UNIQUE(A2:A13)";
var uniqueRep = worksheet.Range["F2#"];

Then parse the original formula once, replace the representative name for each unique value, and write each updated formula into the worksheet.

var baseSyntaxTree = FormulaSyntaxTree.Parse(originalFormula);
var currentFind = new TextNode("Fritz");

for (int r = 0, resultRow = 3; r < uniqueRep.Cells.Count; r++, resultRow += 4)
{
    var repName = uniqueRep.Cells[r].Value?.ToString();

    if (string.IsNullOrWhiteSpace(repName))
        continue;

    var workingTree = FormulaSyntaxTree.Parse(originalFormula);
    var replacement = new TextNode(repName);

    ReplaceNode(workingTree.Root, new TextNode("Fritz"), replacement);

    var resultCell = "H" + resultRow;
    worksheet.Range[resultCell].Formula = "=" + workingTree.ToString();
    worksheet.Range[resultCell + "#"].Borders.LineStyle = BorderLineStyle.Thin;
    worksheet.Range[resultCell + "#"].Borders.Color = System.Drawing.Color.FromArgb(91, 155, 213);
}

This automatically generates a sales analysis block for each representative using the same formula structure.


Step 6: Save the Excel File

Finally, save the workbook as an Excel file.

workbook.Save("ModifiedFormula.xlsx", SaveFileFormat.Xlsx);

When you open the saved workbook, you will see:

  • programmatically generated sample sales data
  • the original formula
  • dynamically generated analysis results for each sales representative
  • formulas created and modified through the syntax tree API

The workbook will appear as below:

Save a Generated Excel .xlsx file from a .NET Server-Side Application


Additional Formula Parser Capabilities

The DsExcel formula parser supports several advanced scenarios beyond simple modifications.

Generating Formulas Programmatically

Developers can construct formulas entirely from syntax nodes.

Example:

var multiply = new OperatorNode(OperatorKind.Multiply);

multiply.Children.Add(new ReferenceNode(new Reference { Row = 0, Column = 0 }));
multiply.Children.Add(new ReferenceNode(new Reference { Row = 1, Column = 0 }));

var tree = new FormulaSyntaxTree { Root = multiply };

This generates the formula:

A1*A2

Converting Reference Styles

The parser supports converting formulas between R1C1 and A1 reference styles using parsing contexts.

Inspecting Formula Structure

Developers can analyze formulas programmatically to identify:

  • Function usage
  • Range references
  • Operators
  • Named ranges
  • Array literals

Download a Free Trial of this .NET Excel API Library Today!

Summary

In this tutorial, you learned how to parse and modify Excel formulas programmatically using Document Solutions for Excel and C#.

We covered how to:

  • Extract formulas from worksheets
  • Parse formulas into syntax trees
  • Inspect formula nodes
  • Replace nodes programmatically
  • Rebuild and apply modified formulas

Using the FormulaSyntaxTree API, developers can treat Excel formulas as structured objects rather than raw text.

This enables powerful automation scenarios such as:

  • Dynamic report generation
  • Spreadsheet auditing
  • Intelligent formula modification
  • Automated Excel workflows

To explore additional capabilities, review the Document Solutions for Excel documentation and demos.

Tags:

comments powered by Disqus