How to Parse and Extract HL7 Data in C# .NET
Quick Start Guide | |
---|---|
What You Will Need |
ComponentOne Data Services Edition Visual Studio 2022 |
Controls Referenced | |
Tutorial Concept | Shows how to parse a healthcare file in the HL7 v2 format and show its contents within in a C# .NET WPF desktop application. |
HL7 data refers to the Health Level Seven standard used by healthcare providers around the world. The HL7 data format is a long ASCII string that is difficult to read, and since it doesn’t follow some broad format like JSON or XML, there isn’t a lot of tooling to support it.
This article will demonstrate how to parse a file in the HL7 version 2 (V2) format and show its contents in a TreeView. It uses ComponentOne TextParser to parse the HL7 file and the WPF TreeView for visualization, making it much easier to read and interpret. We will cover:
- What is HL7
- Prerequisites
- Setting Up the C1_hl7 Solution
- Interpreting the HL7 V2 File
- Presenting the HL7 File
Ready to Try It Out? Download ComponentOne Today!
What is HL7
Health Level Seven (HL7) refers to international criteria for transferring clinical and administrative data between software applications used by various healthcare providers. The standards focus on the OSI model’s application layer, layer 7. Health Level Seven International, an international standards organization, produces the HL7 standards, which other standards issuing bodies, such as the American National Standards Institute and the International Organization for Standardization, have adopted.
HL7 provides three standards for exchanging electronic health care (e-health) records: V2, CDA, and FHIR. In this article, you use the HL7 V2 format. This article focuses solely on the HL7 V2 format:
Seeing an HL7 V2 message for the first time can feel intimidating. The HL7 message is an ASCII string with segments that are \r separated.
The high-level rules for parsing the format are as follows:
- Each segment starts with three characters that indicate the type of record. There are more than 120 types of segments, and it is possible to create your own types.
- Segments contain composites (fields) separated by a vertical line (|). Each segment type has its own set of fields.
- A field can contain sub-fields, each separated by the ^.
- Each segment type has its own set of fields.
A message starts with a message header (MSH) segment. The header contains some metadata about the message, including its type.
As you can see, this is a challenging format to read. Unfortunately, it isn’t in a commonly used format, such as JSON, XML, YAML, or CSV, so parsing an HL7 file can be complex without the help of a quality parser. Next, we will cover the technical implementation of parsing this file.
Prerequisites
WPF is only available on Windows. If you want to try the code, you’ll need a Windows machine.
You should also know your way around Visual Studio (VS) 2022 and C#. The (free) community edition of VS is sufficient for this article. This article uses VS 2022 version 17.5.1.
You can download the project that accompanies this article to follow along.
Setting Up the C1_hl7 Solution
Start creating a new solution named C1_hl7 using the following dotnet CLI commands in an empty folder.
When you open the C1_hl7.sln solution file in Visual Studio, the solution explorer looks like this:
The C1_hl7.data project contains the code and templates for interpreting the HL7 files. This is why you have already added the C1.TextParser NuGet package to it. The C1_hl7.wpf file is the data project. It presents the data in a C1 tree view.
And, of course, all the tests go into the Tests project directory. This article isn’t about testing, but you should test as much as you can!
Interpreting the HL7 V2 File
The file you use contains data about a doctor’s appointment. It has information about the patient (and his family), the doctor, some observations, and a diagnosis. To present it in a tree view, you need a recursive structure:
- Right-click the data project, click Add, then click New Folder and name it Model.
- Add a new interface called ISegment under the Model folder and a new Segment class to implement this interface.
In ISegment.cs, add the following code block:
The class Segment implements this interface. Add it in Segment.cs.
Reading the HL7 File
Now, you’ll write a separate class to read the file. This class needs a template file that describes the HL7 V2 format. Because this is a class library, you add this file as a resource in the assembly.
As before, you create a new folder for this template. If you need more templates later, you know where to store them.
- Right-click the data project and add a new folder named Templates.
- Right-click the Templates folder and add a new file called xml.
- In the properties of this file, set the Build Action to Embedded resource.
The ComponentOne TextParser library supports three different extractors for different scenarios, including plain text, a specialized HTML extractor, and a template-based extractor. The template-based extractor is the most generic, as it allows users to parse data structures following a declarative XML template. Since the template can be provided as a separate file, it allows you to provide both the template and source to parse.
You use the TemplateBasedExtractor class to parse the HL7 file. To use this class, you must describe a template to extract the data from the file. Here’s the template:
This XML file has the <HL7Segment> element as its root element. The template recursively describes the elements. In this case, the elements are the following:
- The common element — Each HL7 line has at least a type and an optional identifier described by the common element.
- The HL7Segment element — A message contains the common element and a variable number of fields. These fields depend on the message type, so you handle them as a big string.
Add a new class to the data project called HL7Datareader. This class is responsible for transforming the HL7 string to a Segment (with its Subsegments). It takes a string as its argument and returns the corresponding List<ISegment>.
First, write a method to read the HL7Template.xml file from the assembly resources and create a TemplateBasedExtractor.
- Add a constant string, _templateResource, to the class
- Add the method, ReadTemplateResource
Now you add the ReadData method to the HL7Datareader class. This method receives a string in HL7 format and transforms it into a list of Segments.
Add the following code block to the class that you started:
The class HL7Message has yet to exist, so you cannot deserialize it.
To see the result of the extractionResult.ToJsonString method, you put a breakpoint on the line, return null;.
Now you inspect the JSON string in the debugger and copy the whole string to the clipboard.
To deserialize this string, you must create all the necessary classes.
Add a new class called Message in the Model folder in the C1_hl7.data project.
Remove the actual Message class and copy the JSON file produced in the previous step. Instead of just copying the contents, use Edit > Paste Special > Paste JSON as Classes.
Visual Studio now generates the classes for your model to deserialize the JSON file, with Rootobject as its root.
To be closer to the domain model, rename this class to HL7Message.
Now you can uncomment the two last lines in the method and remove the return null; line.
The only thing left is to convert the HL7Message into a List<ISegment>. Here’s the function, which goes inside the same HL7Datareader class.
In this case, you only want to represent some fields in the tree view. In an actual project, you can use more fields and store them in a database. If you plan to do more HL7 projects, move the helper functions into a HL7Helperclass. For the sake of simplicity, you have implemented them as nested functions.
Presenting the HL7 File
You represent this file in a C1TreeView. The data project already defines the ViewModel. The WPF project only presents it.
Here is the XAML for the MainWindow:
The interesting part is the definition of the C1TreeView:
- The ItemsSource is set to {Binding}.
- The ItemTemplate determines what is shown in the Treeview and how.
All that is left now is to implement MainWindow.xaml.cs:
You may have noticed that there is also a function called TestExtractor_Click. This function opens a window where you can enter a data file and a template file to test the TemplateBasedExtractor class. It is part of the .NET solution that accompanies this article, and after reading this article, the code should be straightforward for you. The code is grayed out.
Conclusion
HL7 V2 is not a simple data format. But thanks to the TemplateBasedExtractor class, which relies on the XML template, you managed to parse the file so that further processing became easier. Representing the data in a tree view was made easy thanks to the c1:C1TreeView component. The data binding is straightforward, and you can easily style everything.
Using the TemplateBasedExtractor class simplifies the process because it reads the file and makes it available in an easy-to-use object model. In the InterpretExtractedData method, you need only interpret the different fields according to their segment type. If you elaborate on the XML template further, extending the object model and simplifying the InterpretExtractedData method is possible. Adding a button to the application allows you to test this conveniently.
Talk to MESCIUS to learn how ComponentOne can help you start creating your own e-health applications.