How to Parse and Extract HL7 Data in C# .NET

February 24 2025

Quick Start Guide
What You Will Need	ComponentOne Data Services Edition, Visual Studio 2022
Controls Referenced	C1TextParser
Tutorial Concept	Shows how to parse a healthcare file in the HL7 v2 format and display its contents in a C# .NET WPF desktop application.

HL7 data refers to the Health Level Seven standard used by healthcare providers around the world. An HL7 message is a long ASCII string that is difficult to read, and because it doesn’t follow a widely used general-purpose format such as JSON or XML, there is little off-the-shelf tooling to support it.

This article will demonstrate how to parse a file in the HL7 version 2 (V2) format and show its contents in a TreeView. It uses ComponentOne TextParser to parse the HL7 file and the WPF TreeView for visualization, making it much easier to read and interpret. We will cover:

What is HL7
Prerequisites
Setting Up the C1_hl7 Solution
Interpreting the HL7 V2 File
Presenting the HL7 File

What is HL7

Health Level Seven (HL7) is a set of international standards for transferring clinical and administrative data between software applications used by various healthcare providers. The standards focus on the OSI model’s application layer, layer 7. Health Level Seven International, an international standards organization, produces the HL7 standards, which other standards bodies, such as the American National Standards Institute and the International Organization for Standardization, have adopted.

HL7 provides three standards for exchanging electronic health care (e-health) records: V2, CDA, and FHIR. This article focuses solely on the HL7 V2 format. Here is a sample V2 message:

MSH|^~\&|REGAD1|MCM|IFENG||199901110500||ADT^A02|000001|P|2.4|||

EVN|A02|199901110520||01||199901110500

PID|||12345^^^MEDCOM^MR~123456^^^USSSA^SS|253763|JOHN^SMITH||19560129|M|||677 DELAWARE AVENUE^^EVERETT^MA^02149||(555)753-1298

PV1||I|SICU^0001^01^GENHOS|||6N^1234^A^GENHOS|0200^JONES, GEORGE|0148^ADDISON,JAMES||MED|||||||0148^ANDERSON,CARL|S|1400|A

Seeing an HL7 V2 message for the first time can feel intimidating. The message is an ASCII string made up of segments separated by carriage returns (\r).

The high-level rules for parsing the format are as follows:

Each segment starts with three characters that indicate the segment type. There are more than 120 segment types, and it is possible to define your own.
Segments contain fields (composites) separated by a vertical bar (|). Each segment type has its own set of fields.
A field can contain components (sub-fields), each separated by a caret (^).
A field can repeat, with repetitions separated by a tilde (~), as in the PID segment above.

A message starts with a message header (MSH) segment. The header contains some metadata about the message, including its type.
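
The rules above can be sketched in a few lines of plain C#. This is only an illustration of the delimiters, not the approach used in the rest of the article; the class and method names are hypothetical. It also ignores one special case: in the MSH segment, the second field (^~\&) declares the delimiter characters themselves, so it should not be split into components.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class Hl7SplitSketch
{
    // Splits a raw HL7 V2 message into segments.
    // Each segment becomes an array of fields; element 0 is the segment type.
    public static List<string[]> SplitSegments(string message)
    {
        return message
            // Segments end with \r; \n is accepted too, for files saved with other line endings.
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(segment => segment.Split('|'))
            .ToList();
    }

    // Splits one field into its components (the parts separated by ^).
    public static string[] SplitComponents(string field) => field.Split('^');
}
```

For example, splitting the PID segment of the sample message yields an array whose first element is "PID", and passing the field JOHN^SMITH to SplitComponents yields the two components JOHN and SMITH.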

As you can see, this is a challenging format to read. Unfortunately, it isn’t in a commonly used format, such as JSON, XML, YAML, or CSV, so parsing an HL7 file can be complex without the help of a quality parser. Next, we will cover the technical implementation of parsing this file.

Prerequisites

WPF is only available on Windows. If you want to try the code, you’ll need a Windows machine.

You should also know your way around Visual Studio (VS) 2022 and C#. The (free) community edition of VS is sufficient for this article. This article uses VS 2022 version 17.5.1.

You can download the project that accompanies this article to follow along.

Setting Up the C1_hl7 Solution

Start by creating a new solution named C1_hl7. Run the following dotnet CLI commands in an empty folder.

dotnet new sln -n C1_hl7 -o C1_hl7		# create an empty solution
cd .\C1_hl7\
dotnet new wpf -n C1_hl7.wpf -o C1_hl7.wpf	# create a new WPF project
dotnet sln .\C1_hl7.sln add .\C1_hl7.wpf\C1_hl7.wpf.csproj	  # Add to the solution

dotnet new classlib -n C1_hl7.data -o C1_hl7.data	# create a new assembly
dotnet sln .\C1_hl7.sln add .\C1_hl7.data\C1_hl7.data.csproj

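md Tests
dotnet new xunit -n C1_hl7.data.tests -o Tests\C1_hl7.data.tests	# create the test project
dotnet sln .\C1_hl7.sln add .\Tests\C1_hl7.data.tests\C1_hl7.data.tests.csproj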

# Add the necessary project references
dotnet add .\C1_hl7.wpf\C1_hl7.wpf.csproj reference .\C1_hl7.data\C1_hl7.data.csproj
dotnet add .\Tests\C1_hl7.data.tests\C1_hl7.data.tests.csproj reference .\C1_hl7.data\C1_hl7.data.csproj

# Add all the necessary packages
dotnet add .\C1_hl7.wpf\C1_hl7.wpf.csproj package C1.WPF.TreeView -s https://api.nuget.org/v3/index.json
dotnet add .\C1_hl7.data\C1_hl7.data.csproj package C1.TextParser -s https://api.nuget.org/v3/index.json

When you open the C1_hl7.sln solution file in Visual Studio, Solution Explorer lists the projects you just created.

The C1_hl7.data project contains the code and templates for interpreting the HL7 files. This is why you have already added the C1.TextParser NuGet package to it. The C1_hl7.wpf project is the presentation project. It displays the data in a C1 tree view.

And, of course, all the tests go into the Tests project directory. This article isn’t about testing, but you should test as much as you can!

Interpreting the HL7 V2 File

The file you use contains data about a doctor’s appointment. It has information about the patient (and his family), the doctor, some observations, and a diagnosis. To present it in a tree view, you need a recursive structure:

Right-click the data project, click Add, then click New Folder and name it Model.
Add a new interface called ISegment under the Model folder and a new Segment class to implement this interface.

In ISegment.cs, add the following code block:

public interface ISegment
{
    string Segmenttype { get; init; }
    string Name { get; init; }
    string Data { get; init; }
    List<ISegment> Subsegments { get; init; }
}

The class Segment implements this interface. Add it in Segment.cs.

public class Segment : ISegment
{
    public Segment(string segmenttype, string name, string data)
    {
        Segmenttype = segmenttype;
        Name = name;
        Data = data;
    }
    public string Segmenttype { get; init; }
    public string Name { get; init; }
    public string Data { get; init; }
    public List<ISegment> Subsegments { get; init; } = new();
}
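
To show why the structure is recursive, here is a minimal sketch that builds a small tree and renders it as indented text. It repeats the two types from above so that it compiles on its own; SegmentTreePrinter and the sample values are hypothetical and not part of the accompanying project.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public interface ISegment
{
    string Segmenttype { get; init; }
    string Name { get; init; }
    string Data { get; init; }
    List<ISegment> Subsegments { get; init; }
}

public class Segment : ISegment
{
    public Segment(string segmenttype, string name, string data)
    {
        Segmenttype = segmenttype;
        Name = name;
        Data = data;
    }
    public string Segmenttype { get; init; }
    public string Name { get; init; }
    public string Data { get; init; }
    public List<ISegment> Subsegments { get; init; } = new();
}

// Hypothetical helper: walks the tree the same way a tree view would.
public static class SegmentTreePrinter
{
    public static string Render(IEnumerable<ISegment> segments, int depth = 0)
    {
        var sb = new StringBuilder();
        foreach (ISegment segment in segments)
        {
            // Two spaces of indentation per nesting level.
            sb.Append(new string(' ', depth * 2))
              .Append(segment.Segmenttype).Append(": ")
              .Append(segment.Name).Append(' ')
              .Append(segment.Data).Append('\n');
            // Recurse into the children; an empty list ends the recursion.
            sb.Append(Render(segment.Subsegments, depth + 1));
        }
        return sb.ToString();
    }
}
```

A parent segment such as "Next of Kin" can hold any number of NK1 children in Subsegments, and each child could in turn hold its own children, which is the shape a hierarchical data template expects.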

Reading the HL7 File

Now, you’ll write a separate class to read the file. This class needs a template file that describes the HL7 V2 format. Because this is a class library, you add this file as a resource in the assembly.

As before, you create a new folder for this template. If you need more templates later, you know where to store them.

Right-click the data project and add a new folder named Templates.
Right-click the Templates folder and add a new file called HL7Template.xml.
In the properties of this file, set the Build Action to Embedded resource.

The ComponentOne TextParser library provides three extractors for different scenarios: a plain-text extractor, a specialized HTML extractor, and a template-based extractor. The template-based extractor is the most generic because it parses data structures according to a declarative XML template. Because the template is kept separate from the source text, you supply both the template and the text to parse.

You use the TemplateBasedExtractor class to parse the HL7 file. To use this class, you must describe a template to extract the data from the file. Here’s the template:

<?xml version="1.0" ?>
 
<template rootElement="HL7Segment" >
 
  <element name="common" >
    <element name="Type" extractFormat="regex:[A-Z]{2}[A-Z\d]"  />
    <element name="Id" startingRegex="\|" extractFormat="regex:.*?\|" />
  </element >
 
  <element name="HL7Segment" >
    <element template="common" />
    <element name="Fields" extractFormat="regex:.*\r" />
  </element>
 
</template>

This XML file has the <HL7Segment> element as its root element. The template recursively describes the elements. In this case, the elements are the following:

The common element: Each HL7 line has at least a type and an optional identifier, both described by the common element.
The HL7Segment element: Each segment contains the common element and a variable number of fields. These fields depend on the segment type, so you handle them as one long string.
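
To get a feel for what the three patterns in the template capture, the sketch below combines them into a single .NET regular expression. It assumes the patterns are applied one after another from the start of a segment; it does not use TextParser and makes no claim about how the library evaluates the template internally. TemplateRegexSketch is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper that mirrors the template's regexes with System.Text.RegularExpressions.
public static class TemplateRegexSketch
{
    // Type:   two uppercase letters followed by an uppercase letter or digit.
    // Id:     starts after the first |, and lazily runs up to and including the next |.
    // Fields: everything that remains, up to the carriage return that ends the segment.
    private static readonly Regex SegmentPattern = new Regex(
        @"(?<Type>[A-Z]{2}[A-Z\d])\|(?<Id>.*?\|)(?<Fields>.*\r)");

    public static (string Type, string Id, string Fields) Parse(string segment)
    {
        var match = SegmentPattern.Match(segment);
        if (!match.Success)
        {
            throw new FormatException("Not an HL7 segment: " + segment);
        }
        return (match.Groups["Type"].Value,
                match.Groups["Id"].Value,
                match.Groups["Fields"].Value);
    }
}
```

Note that under this reading the Id capture includes its trailing vertical bar, and the Fields string begins with the third field of the segment.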

Add a new class to the data project called HL7Datareader. This class is responsible for transforming the HL7 string to a Segment (with its Subsegments). It takes a string as its argument and returns the corresponding List<ISegment>.

First, write a method to read the HL7Template.xml file from the assembly resources and create a TemplateBasedExtractor.

Add a constant string, _templateResource, to the class
Add the method, ReadTemplateBasedExtractor

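using System.IO;
using System.Reflection;
using C1.TextParser;

public class HL7Datareader
{
    // Resource name = default namespace + folder name + file name
    private const string _templateResource = "C1_hl7.data.Templates.HL7Template.xml";
 
    internal TemplateBasedExtractor ReadTemplateBasedExtractor()
    {
        var assembly = Assembly.GetExecutingAssembly();
        // GetManifestResourceStream returns null when the name does not match
        using Stream stream = assembly.GetManifestResourceStream(_templateResource)
            ?? throw new FileNotFoundException($"Embedded resource not found: {_templateResource}");
        return new TemplateBasedExtractor(stream);
    }
 
    // The rest of the class follows
}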
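
A common stumbling block with embedded resources is a mismatch between the name in code and the name in the assembly. By default the resource name is the project's default namespace, followed by the folder path and the file name, joined with dots. The following sketch, with the hypothetical name EmbeddedResourceSketch, lists the names that are actually embedded when a lookup fails, which usually makes the mismatch obvious.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical diagnostic helper, not part of the accompanying project.
public static class EmbeddedResourceSketch
{
    public static Stream OpenOrThrow(Assembly assembly, string resourceName)
    {
        // Returns null (rather than throwing) when no resource has this exact name.
        var stream = assembly.GetManifestResourceStream(resourceName);
        if (stream is null)
        {
            // List what the assembly really contains to help spot typos.
            string available = string.Join(", ", assembly.GetManifestResourceNames());
            throw new FileNotFoundException(
                $"Embedded resource '{resourceName}' not found. Available: {available}");
        }
        return stream;
    }
}
```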

Now you add the ReadData method to the HL7Datareader class. This method receives a string in HL7 format and transforms it into a list of Segments.

Add the following code block to the class that you started:

public List<ISegment> ReadData(string hl7String)
{
    // the Extract method expects a stream
    using MemoryStream hl7 = new MemoryStream(Encoding.ASCII.GetBytes(hl7String));
    TemplateBasedExtractor templateBasedExtractor = ReadTemplateBasedExtractor();
    IExtractionResult extractionResult = templateBasedExtractor.Extract(hl7);
    // The final result of the extraction is a json string 
    string json = extractionResult.ToJsonString();
 
    return null;
    // deserialize the json string
    //HL7Message? messages = JsonSerializer.Deserialize<HL7Message>(json);
 
    //return InterpretExtractedData(messages);
}

The HL7Message class does not exist yet, so the deserialization code is commented out for now.

To see the output of extractionResult.ToJsonString(), set a breakpoint on the return null; line and run the application.

Inspect the JSON string in the debugger and copy the whole string to the clipboard.

To deserialize this string, you must create all the necessary classes.

Add a new class called Message in the Model folder in the C1_hl7.data project.

Delete the generated Message class from the file, leaving the file itself in place. With the JSON string from the previous step on the clipboard, use Edit > Paste Special > Paste JSON as Classes instead of a plain paste.

Visual Studio now generates the classes needed to deserialize the JSON string, with Rootobject as the root class.

To be closer to the domain model, rename this class to HL7Message.

Now you can uncomment the last two lines in the ReadData method and remove the return null; line.
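
As a rough guide to what the generated model and the deserialization step look like, here is a minimal sketch. The class shapes are inferred from how the article's code reads the model (message.Result.HL7Segment, segment.common.Type, segment.Fields); the helper classes SegmentDto, Common, and JsonModelSketch are hypothetical names, and the JSON that ToJsonString actually produces may contain more properties or differ in detail.

```csharp
using System;
using System.Text.Json;

// Assumed shape of the model; Visual Studio's generated names may differ.
public class HL7Message
{
    public Result Result { get; set; } = new();
}

public class Result
{
    public SegmentDto[] HL7Segment { get; set; } = Array.Empty<SegmentDto>();
}

public class SegmentDto
{
    // Lowercase on purpose: System.Text.Json matches property names case-sensitively by default.
    public Common common { get; set; } = new();
    public string Fields { get; set; } = "";
}

public class Common
{
    public string Type { get; set; } = "";
    public string Id { get; set; } = "";
}

public static class JsonModelSketch
{
    public static HL7Message Parse(string json)
    {
        // Deserialize returns null for the JSON literal "null", so guard against it.
        var message = JsonSerializer.Deserialize<HL7Message>(json);
        return message ?? throw new JsonException("The JSON did not contain a message.");
    }
}
```

If a property comes back empty after deserialization, compare its name and casing with the JSON you copied from the debugger.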

The only thing left is to convert the HL7Message into a List<ISegment>. Here’s the function, which goes inside the same HL7Datareader class.

private List<ISegment> InterpretExtractedData(HL7Message? message)
{
    List<ISegment> segments = new();
    ISegment nextOfKin = new Segment("Next of Kin", string.Empty, string.Empty);
    ISegment observations = new Segment("Observations", string.Empty, string.Empty); 
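    // Nothing to interpret if deserialization produced no segments
    if (message?.Result?.HL7Segment is null)
    {
        return segments;
    }
    foreach (var segment in message.Result.HL7Segment)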
    {
        string segmentType = segment.common.Type;
        string[] fields = segment.Fields.Split('|');
        switch (segmentType)
        {
            case "MSH":     // Message header
                segments.Add(new Segment(segmentType, fields[1], 
						ToDate(fields[4])));
                break;
            case "EVN":     // Event type
                segments.Add(new Segment(segmentType, "Encounter", 
						  ToDate(fields[1])));
                break;
            case "PID":     // Patient Identification
                segments.Add(new Segment(segmentType, ToName(fields[3]), 
						  ToDate(fields[5])));
                break;
            case "PV1":     // Patient Visit
                segments.Add(new Segment(segmentType, "Dr.", 
						   ToPratitionerName(fields[7])));
                break;
            case "NK1":     // Next of Kin
                nextOfKin.Subsegments.Add(new Segment(segmentType, 
						   ToName(fields[0]), fields[1]));
                break;
            case "OBX":     // ObservationResult
                string obs = fields[1].Split('^').Last();
                observations.Subsegments.Add(new Segment(segmentType, obs, 
						fields[3] + " " + fields[4]));
                break;
            case "DG1":     // Diagnosis Information
                segments.Add(new Segment(segmentType, fields[2], fields[4]));
                break;
        }
    }
 
    segments.Add(nextOfKin);
    segments.Add(observations);
 
    return segments;
 
    string ToName(string humanName)
    {
        string[] nameparts = humanName.Split('^');
 
        return nameparts.Length switch
        {
            0 => string.Empty,
            1 => nameparts[0],
            _ => $"{nameparts[0]} {nameparts[1]}"
        };
    }
 
    string ToPratitionerName(string humanName)
    {
        string[] nameparts = humanName.Split('^');
 
        return nameparts.Length switch
        {
            0 => string.Empty,
            1 => nameparts[0],
            2 => nameparts[1],
            3 => nameparts[2],    // nameparts[3] does not exist for three parts
            _ => $"{nameparts[2]} {nameparts[3]}"
        };
    }
 
 
    string ToDate(string dt)
    {
        if (DateTime.TryParseExact(dt, new string[] { "yyyyMMddHHmm", "yyyyMMdd" }, null, DateTimeStyles.None, out DateTime d))
            return d.ToString();
        else return dt;
    }
}

In this case, you only want to represent some fields in the tree view. In an actual project, you can use more fields and store them in a database. If you plan to do more HL7 projects, move the helper functions into an HL7Helper class. For the sake of simplicity, they are implemented here as local functions.
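
A minimal sketch of such a helper class might look like this. It covers only ToName and ToDate; the fixed output format and the use of the invariant culture are choices made for this sketch so that the result does not depend on the machine's regional settings, and they differ from the local functions above, which use the current culture.

```csharp
using System;
using System.Globalization;

// Sketch of a reusable helper; the output format is an assumption, not the article's.
public static class HL7Helper
{
    // The same two input formats the local ToDate function accepts.
    private static readonly string[] DateFormats = { "yyyyMMddHHmm", "yyyyMMdd" };

    // Joins the first two components of a name field, e.g. "JOHN^SMITH" becomes "JOHN SMITH".
    public static string ToName(string humanName)
    {
        string[] parts = humanName.Split('^');
        return parts.Length == 1 ? parts[0] : $"{parts[0]} {parts[1]}";
    }

    // Converts an HL7 timestamp to a readable string; returns the input unchanged if it cannot be parsed.
    public static string ToDate(string dt)
    {
        return DateTime.TryParseExact(dt, DateFormats, CultureInfo.InvariantCulture,
                                      DateTimeStyles.None, out DateTime parsed)
            ? parsed.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture)
            : dt;
    }
}
```

Because the methods are static and have no dependencies, they are also easy to cover with unit tests in the Tests project.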

Presenting the HL7 File

You represent this file in a C1TreeView. The data project already defines the ViewModel. The WPF project only presents it.

Here is the XAML for the MainWindow:

<Window
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        xmlns:local="clr-namespace:C1_hl7.wpf"
        xmlns:c1="http://schemas.componentone.com/winfx/2006/xaml" x:Class="C1_hl7.wpf.MainWindow"
        mc:Ignorable="d"
        Title="HL7 Viewer" Height="450" Width="800">
    <DockPanel LastChildFill="True">
        <DockPanel.Resources>
            <!-- styles for the textblocks in the treeview -->
            <Style TargetType="TextBlock" x:Key="segmentType">
                <Setter Property="Foreground" Value="White" />
                <Setter Property="Background" Value="CadetBlue" />
                <Setter Property="Margin" Value="0 0 5 0" />
                <Setter Property="Padding" Value="5" />
                <Setter Property="FontWeight" Value="DemiBold" />
            </Style>
            <Style TargetType="TextBlock" x:Key="nameType">
                <Setter Property="Margin" Value="0 0 5 0" />
                <Setter Property="Padding" Value="5" />
            </Style>
            <Style TargetType="TextBlock" x:Key="dataType">
                <Setter Property="Margin" Value="0 0 5 0" />
                <Setter Property="Padding" Value="5" />
            </Style>
        </DockPanel.Resources>
 
        <ToolBar DockPanel.Dock="Top">
            <Button Content="Test TemplateBasedExtractor" Click="TestExtractor_Click" />
        </ToolBar>
 
        <!-- The TreeView is referred to as "tree" in the code -->
        <c1:C1TreeView x:Name="tree"
                       ItemsSource="{Binding}" 
                       SelectionMode="Single"
                       SnapsToDevicePixels="True" 
                       HorizontalContentAlignment="Stretch" 
                       Margin="5">
 
            <!-- representation of one tree item -->
            <c1:C1TreeView.ItemTemplate>
                <!-- Bound to the Subsegments property (a List) of the Segment -->
                <c1:C1HierarchicalDataTemplate ItemsSource="{Binding Subsegments}" >
                    <StackPanel Orientation="Horizontal" >
                        <TextBlock HorizontalAlignment="Left" Text="{Binding Segmenttype}" Style="{StaticResource segmentType}" />
                        <TextBlock HorizontalAlignment="Left" Text="{Binding Name}" Style="{StaticResource nameType}"/>
                        <TextBlock HorizontalAlignment="Left" Text="{Binding Data}" Style="{StaticResource dataType}"/>
                    </StackPanel>
                </c1:C1HierarchicalDataTemplate>
            </c1:C1TreeView.ItemTemplate>
        </c1:C1TreeView>
    </DockPanel>
</Window>

The interesting part is the definition of the C1TreeView:

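The ItemsSource is set to {Binding}, so the tree binds directly to its DataContext, which the code-behind sets to the List<ISegment>.
The ItemTemplate determines what is shown in the tree view and how. Its ItemsSource points at Subsegments, which is what makes the tree hierarchical.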

All that is left now is to implement MainWindow.xaml.cs:

using C1_hl7.data.Model;
using C1_hl7.data;
using System.Collections.Generic;
using System.Windows;
using System.IO;
 
namespace C1_hl7.wpf
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow : Window
    {
        public MainWindow()
        {
            InitializeComponent();
            LoadData();
        }
 
        private void LoadData()
        {
            string hl7 = File.ReadAllText(@"Data\ADT_A04.hl7");
            HL7Datareader hL7Datareader = new HL7Datareader();
            List<ISegment> message = hL7Datareader.ReadData(hl7);
 
            tree.DataContext = message;
        }
 
        private void TestExtractor_Click(object sender, RoutedEventArgs e)
        {
            TestExtractor wnd = new TestExtractor();
 
            wnd.Show();
        }
    }
}

You may have noticed that there is also a function called TestExtractor_Click. This function opens a window where you can enter a data file and a template file to test the TemplateBasedExtractor class. It is part of the .NET solution that accompanies this article, and after reading this article, the code should be straightforward for you. The code is grayed out.

Conclusion

HL7 V2 is not a simple data format. But thanks to the TemplateBasedExtractor class, which relies on the XML template, you managed to parse the file so that further processing became easier. Representing the data in a tree view was made easy thanks to the c1:C1TreeView component. The data binding is straightforward, and you can easily style everything.

Using the TemplateBasedExtractor class simplifies the process because it reads the file and makes it available in an easy-to-use object model. In the InterpretExtractedData method, you need only interpret the different fields according to their segment type. If you elaborate on the XML template further, extending the object model and simplifying the InterpretExtractedData method is possible. Adding a button to the application allows you to test this conveniently.
