RichTextBox for WPF | ComponentOne
    Parsing URLs into Hyperlinks

    Long URLs are hard to read when they appear verbatim in a document. Automatically converting them into short, meaningful hyperlinks improves the document's readability.

    You can add this parsing capability to the standard RichTextBox control to give your users a seamless reading and writing experience. The conversion uses the C1RichTextBox control together with the C1TextParser library, a .NET text-parsing library that is used here to extract the URLs so they can be converted to hyperlinks automatically.

    This enhancement helps you build a smarter, more capable text editor on top of the standard RichTextBox control.

    Hyperlink Parsing

    To create an application for parsing URLs into hyperlinks, follow these steps.

    Set up the Application UI

    1. Create a new WPF App (.NET Framework) project in Visual Studio.
    2. In the Solution Explorer, right-click Dependencies (shown as References in .NET Framework projects) and select Manage NuGet Packages.
    3. In NuGet Package Manager, select nuget.org as the Package source.
    4. Search for and install the following packages:
      • C1.XAML.WPF.RichTextbox
      • C1.TextParser
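    5. In XAML view, add a C1RichTextBox control, a Button control, and a CheckBox control to the grid by adding the following code inside the <Grid> tags. The c1 prefix used below must be mapped to the ComponentOne XAML namespace on the root Window element.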
      XAML
      <Grid.RowDefinitions>
          <RowDefinition Height="1*"/>
          <RowDefinition Height="10*"/>
      </Grid.RowDefinitions>
      <Grid.ColumnDefinitions>
          <ColumnDefinition Width="5*"/>
          <ColumnDefinition Width="13*"/>
      </Grid.ColumnDefinitions>
      <Button x:Name="btnPasteDoc" Content="Paste Text with hyperlinks" Width="144" HorizontalAlignment="Left" Click="btnPasteDoc_Click" Grid.Row="0" Grid.Column="0" Margin="20,0,0,0"/>
      <CheckBox Name="chkAutoConversion" Grid.Row="0" Grid.Column="1" Margin="23,3,7,7" Content="Allow Auto Hyperlink Conversion" FontSize="13" IsChecked="False" Checked="chkAutoConversion_Checked"/>
      <c1:C1RichTextBox Name="smartRichTextBox" Grid.Row="1" Grid.Column="0" Grid.ColumnSpan="2" Margin="20,20,0,16" Height="200" Width="550" HorizontalAlignment="left" VerticalAlignment="Top"/>
      

    Create a Hyperlink Parser 

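    1. Add a class file to your application and name it HyperlinkParser.cs.
    2. Declare the following IHyperlinkParser interface, which contains two methods: ExtractURLs and GetDisplayText. The HyperlinkParser class implements this interface, and the methods in the remaining steps are members of that class.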
      CS
      public interface IHyperlinkParser
      {
          IEnumerable<string> ExtractURLs(string text);
          string GetDisplayText(string uri);
      }
      
      VB
      Public Interface IHyperlinkParser
          Function ExtractURLs(text As String) As IEnumerable(Of String)
          Function GetDisplayText(uri As String) As String
      End Interface
      
    3. Create a custom data extraction method named ExtractData to pull the relevant portions out of a given text based on specific starting and ending criteria. In this method, the TextParser's Starts-After-Continues-Until extractor extracts all the text found between the StartsAfter and ContinuesUntil (ends-before) phrases. The extractor's JSON output is then deserialized into a list of ExtractedData objects.
      CS
      private List<ExtractedData> ExtractData(string text, string startsAfter, string continueUntil)
      {
          //Run the Starts-After-Continues-Until extractor over the complete text
          var parser = new StartsAfterContinuesUntil(startsAfter, continueUntil);
          var result = parser.Extract(new MemoryStream(Encoding.UTF8.GetBytes(text)));
      
          //Parse the extractor's JSON output into a JObject
          var jObject = Newtonsoft.Json.Linq.JObject.Parse(result.ToJsonString());
      
          //Retrieve the value associated with the key "Result" from the JObject
          var jToken = jObject.GetValue("Result");
      
          //Convert the token to a list of ExtractedData objects
          var extractedData = jToken.ToObject<List<ExtractedData>>();
          return extractedData;
      }
      
      VB
      Private Function ExtractData(ByVal text As String, ByVal startsAfter As String, ByVal continueUntil As String) As List(Of ExtractedData)
          'Run the Starts-After-Continues-Until extractor over the complete text
          Dim parser = New StartsAfterContinuesUntil(startsAfter, continueUntil)
          Dim result = parser.Extract(New MemoryStream(Encoding.UTF8.GetBytes(text)))
          'Parse the extractor's JSON output into a JObject
          Dim jObject = Newtonsoft.Json.Linq.JObject.Parse(result.ToJsonString)
          'Retrieve the value associated with the key "Result" from the JObject
          Dim jToken = jObject.GetValue("Result")
          'Convert the token to a list of ExtractedData objects
          Dim extractedData As List(Of ExtractedData) = jToken.ToObject(Of List(Of ExtractedData))()
          Return extractedData
      End Function
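      The ExtractData method deserializes into an ExtractedData type that this topic does not define. A minimal sketch of such a class is shown here, assuming each entry of the extractor's "Result" array exposes the matched text under the name ExtractedText, which is the only member the later steps read. The class shape is an assumption inferred from that usage, not taken from the C1TextParser documentation.

      ```csharp
      // Hypothetical data class: its shape is inferred from how the later steps
      // use it (x.ExtractedText), not from the C1TextParser documentation.
      public class ExtractedData
      {
          // Text captured between the StartsAfter and ContinuesUntil phrases.
          // Initialized to an empty string so a fresh instance never exposes null.
          public string ExtractedText { get; set; } = string.Empty;
      }
      ```

      If the JSON produced by your version of the library uses a different field name, rename the property to match it.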
      
    4. Define the ExtractURLs method to extract URLs from the given text. This method collects URLs that start with "http" or "https". It calls the ExtractData method to find the text that follows each protocol prefix up to the next whitespace, puts the prefix back in front of each fragment, validates the result with the Uri.IsWellFormedUriString method, and adds each valid URL that is not already present to a list. The final list of valid URLs is then returned.
      CS
      public IEnumerable<string> ExtractURLs(string text)
      {
          //Extract URLs starting with these protocols
          var _protocols = new List<string> { "http", "https" };
          List<string> urls = new List<string>();
          //Append a space so that a URL at the very end of the text is still followed by whitespace
          text += " ";
          foreach (var protocol in _protocols)
          {
              //extract the URL from the whole text
              var links = ExtractData(text, protocol, @"\s+");
              foreach (var link in links.Select(x => x.ExtractedText))
              {
                  //if hyperlink is correct, add to list of URLs
                  string hyperlink = $"{protocol}{link}";
                  if (!Uri.IsWellFormedUriString(hyperlink, UriKind.Absolute))
                      continue;
                  if (!urls.Contains(hyperlink)) urls.Add(hyperlink);
              }
          }
          return urls;
      }
      
      VB
      Public Function IHyperlinkParser_ExtractURLs(ByVal text As String) As IEnumerable(Of String) Implements IHyperlinkParser.ExtractURLs
          'Extract URLs starting with these protocols
          Dim _protocols As New List(Of String) From {"http", "https"}
          Dim urls As New List(Of String)()
          'Append a space so that a URL at the very end of the text is still followed by whitespace
          text &= " "
      
          For Each protocol In _protocols
              ' Extract the URL from the whole text
              Dim links = ExtractData(text, protocol, "\s+")
      
              For Each link In links.Select(Function(x) x.ExtractedText)
                  ' If hyperlink is correct, add to list of URLs
                  Dim hyperlink As String = $"{protocol}{link}"
      
                  If Not Uri.IsWellFormedUriString(hyperlink, UriKind.Absolute) Then
                      Continue For
                  End If
      
                  If Not urls.Contains(hyperlink) Then
                      urls.Add(hyperlink)
                  End If
              Next
          Next
          Return urls
      End Function
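      To try the extract, validate, and de-duplicate logic without the C1TextParser dependency, the following sketch swaps the Starts-After-Continues-Until extractor for a plain regular expression. It is a stand-in for experimentation under that assumption, not the method used in this topic; the class name UrlExtractionSketch is hypothetical.

      ```csharp
      using System;
      using System.Collections.Generic;
      using System.Text.RegularExpressions;

      public static class UrlExtractionSketch
      {
          // Same filtering and de-duplication as ExtractURLs, but the candidates
          // come from a regex instead of the C1TextParser extractor.
          public static List<string> ExtractUrls(string text)
          {
              var urls = new List<string>();
              // Candidate: "http" or "https", then "://", then everything up to whitespace.
              foreach (Match match in Regex.Matches(text ?? string.Empty, @"https?://\S+"))
              {
                  string candidate = match.Value;
                  // Skip fragments that are not well-formed absolute URIs.
                  if (!Uri.IsWellFormedUriString(candidate, UriKind.Absolute))
                      continue;
                  // Keep the first occurrence only.
                  if (!urls.Contains(candidate))
                      urls.Add(candidate);
              }
              return urls;
          }
      }
      ```

      Because the regex matches both protocols in one pass, no prefix has to be put back in front of the extracted fragment.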
      
    5. Create a method named ExtractDomainName(Uri uri) that takes a Uri object as input and returns the domain name as a string. This method works on the host part of the URI: it extracts the fragments that sit next to the dots in the host and selects the first non-empty fragment as the domain name. If no valid domain name is found, it returns the original host.
      CS
      private string ExtractDomainName(Uri uri)
      {
          IEnumerable<string> data = null;
          int dotCount = uri.Host.Count(x => x == '.');
          // Host contains more than one dot ('.'), for example www.google.com: take the text between two dots.
          if (dotCount > 1)
              data = ExtractData(uri.Host, @"\.", @"\.").Select(x => x.ExtractedText);
      
          // Host contains exactly one dot ('.'), for example youtube.com: take the text before the dot.
          if (dotCount == 1)
              data = ExtractData($" {uri.Host}", @" ", @"\.").Select(x => x.ExtractedText);
      
          // data stays null when the host has no dot (for example localhost),
          // and FirstOrDefault returns null when no non-empty fragment exists.
          var domainName = data?.FirstOrDefault(x => !string.IsNullOrWhiteSpace(x));
          
          //return domain name, falling back to the original host
          return string.IsNullOrEmpty(domainName) ? uri.Host : domainName;
      }
      
      VB
      Private Function ExtractDomainName(uri As Uri) As String
          Dim data As IEnumerable(Of String) = Nothing
          Dim dotCount As Integer = uri.Host.Count(Function(x) x = "."c)
      
          ' Host contains more than one dot ('.'), for example www.google.com: take the text between two dots.
          If dotCount > 1 Then
              data = ExtractData(uri.Host, "\.", "\.").Select(Function(x) x.ExtractedText)
          End If
      
          ' Host contains exactly one dot ('.'), for example youtube.com: take the text before the dot.
          If dotCount = 1 Then
              data = ExtractData($" {uri.Host}", " ", "\.").Select(Function(x) x.ExtractedText)
          End If
      
          ' data stays Nothing when the host has no dot (for example localhost),
          ' and FirstOrDefault returns Nothing when no non-empty fragment exists.
          Dim domainName As String = data?.FirstOrDefault(Function(x) Not String.IsNullOrWhiteSpace(x))
      
          'return domain name, falling back to the original host
          Return If(String.IsNullOrEmpty(domainName), uri.Host, domainName)
      End Function
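      The selection rule in this step can be checked in isolation. The sketch below reproduces it with string.Split instead of the TextParser extractor, which is an assumption made only to keep the example self-contained; DomainNameSketch and DomainLabel are hypothetical names.

      ```csharp
      using System;

      public static class DomainNameSketch
      {
          // Picks the same host label the step describes, using Split
          // rather than the C1TextParser extractor.
          public static string DomainLabel(Uri uri)
          {
              string[] labels = uri.Host.Split('.');
              if (labels.Length > 2) return labels[1];   // www.google.com -> google
              if (labels.Length == 2) return labels[0];  // youtube.com -> youtube
              return uri.Host;                           // localhost -> localhost
          }
      }
      ```

      Note that for hosts with several subdomains or multi-part suffixes the label chosen this way is simply the second one, which may not be the registered domain name.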
      
    6. Create a method named ChooseDisplayText that filters out irrelevant segments, reverses the list so that the most specific segment comes first, and selects the first remaining segment as the display text. The method then converts the selected text to title case and returns it. If the list of segments is empty, it returns an empty string.
      CS
      protected virtual string ChooseDisplayText(List<string> words)
      {
          //get the correct words from the segments
          if (words.Count == 0) return string.Empty;
          string displayText = string.Empty;
          var correctWords = words.Where(x => !x.Equals("/")).ToList();
          if (correctWords.Count > 0)
          {
              //Choose the word to be used as display text for the hyperlink:
              //after reversing, the most specific (last) segment comes first
              correctWords.Reverse();
              displayText = correctWords[0];
          }
          //return the chosen word in title case
          return CultureInfo.CurrentCulture.TextInfo.ToTitleCase(displayText);
      }
      
      VB
      Protected Overridable Function ChooseDisplayText(ByVal words As List(Of String)) As String
          'get the correct words from the segments
          If (words.Count = 0) Then
              Return String.Empty
          End If
      
          Dim displayText As String = String.Empty
          Dim correctWords = words.Where(Function(x) Not x.Equals("/")).ToList()
          If correctWords.Count > 0 Then
              ' Choose the word to be used as display text for the hyperlink:
              ' after reversing, the most specific (last) segment comes first
              correctWords.Reverse()
              displayText = correctWords(0)
          End If
      
          'return the chosen word in title case
          Return CultureInfo.CurrentCulture.TextInfo.ToTitleCase(displayText)
      End Function
      
    7. Define the GetDisplayText(string uri) method to generate user-friendly display text from a given URI. This method extracts the domain name and the URI segments and passes them to ChooseDisplayText to form readable text. If an error occurs during the process, or if the generated display text is empty, the original URI is returned.
      CS
      public string GetDisplayText(string uri)
      {
          try
          {
              //break URL into segments
              var uriObject = new Uri(uri);
              var words = new List<string>() { ExtractDomainName(uriObject) };
              words.AddRange(uriObject.Segments);
      
              //send these segments to get the chosen word for display
              var displayText = ChooseDisplayText(words);
      
              //return display text
              return string.IsNullOrEmpty(displayText) ? uri : displayText;
          }
          catch
          {
              //In case of errors, return whole URL
              return uri;
          }
      }
      
      VB
      Public Function IHyperlinkParser_GetDisplayText(ByVal uri As String) As String Implements IHyperlinkParser.GetDisplayText
          Try
              'break URL into segments
              Dim uriObject = New Uri(uri)
              Dim words As New List(Of String) From {ExtractDomainName(uriObject)}
      
              words.AddRange(uriObject.Segments)
              'send these segments to get the chosen word for display
              Dim displayText = ChooseDisplayText(words)
              'return display text
              Return If(String.IsNullOrEmpty(displayText), uri, displayText)
          Catch
              'In case of errors, return whole URL
              Return uri
          End Try
      
      End Function
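      Uri.Segments returns each path segment with its trailing slash (for example "/", "docs/", "overview/"), so only the bare root segment is equal to "/". The sketch below, which uses only the .NET base class library, shows one possible variation that trims the slashes before picking the last non-empty segment. The names DisplayTextSketch and LastSegmentTitle are hypothetical, and the invariant culture is used only to keep the result reproducible.

      ```csharp
      using System;
      using System.Globalization;
      using System.Linq;

      public static class DisplayTextSketch
      {
          // Returns the last non-empty path segment in title case,
          // or an empty string when the URL has no path beyond "/".
          public static string LastSegmentTitle(string url)
          {
              var uri = new Uri(url);
              // Strip the slashes that Uri.Segments keeps on every segment.
              string last = uri.Segments
                  .Select(s => s.Trim('/'))
                  .LastOrDefault(s => s.Length > 0);
              if (last == null) return string.Empty;
              return CultureInfo.InvariantCulture.TextInfo.ToTitleCase(last);
          }
      }
      ```

      With this variation, https://example.com/docs/overview/ yields Overview, whereas an untrimmed segment such as overview/ keeps its trailing slash in the display text.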
      

    Set RichTextBox for Hyperlinks Conversion

    1. In the MainWindow.xaml.cs file, add the following code to the Click event handler of the Button control. It pastes sample text, stored under the Document key in the Resources.resx file, into the RichTextBox control:
      CS
      private async void btnPasteDoc_Click(object sender, RoutedEventArgs e)
      {
          //clear the RichTextBox
          smartRichTextBox.Text = string.Empty;
          //copy the sample document from the resources to the clipboard
          var manager = new ResourceManager(@"SmartRichTextBox_NET48.Resources", Assembly.GetExecutingAssembly());
          Clipboard.SetText(manager.GetString("Document"), TextDataFormat.Text);
          await Task.Delay(50);
          smartRichTextBox.ClipboardPaste();
      }
      
      VB
      Private Async Sub btnPasteDoc_Click(sender As Object, e As RoutedEventArgs)
          ' Clear the RichTextBox
          smartRichTextBox.Text = String.Empty
          ' Copy the sample document from the resources to the clipboard
          Dim manager As New ResourceManager("SmartRichTextBox_NETFW_VB.Resourcesvb", Assembly.GetExecutingAssembly())
          Clipboard.SetText(manager.GetString("Document"), TextDataFormat.Text)
          Await Task.Delay(50)
          smartRichTextBox.ClipboardPaste()
      End Sub
      
    2. Add the following code to the Checked event handler of the chkAutoConversion checkbox. It uses the methods of the HyperlinkParser class, called on an instance of that class, to display short text links in the RichTextBox control in place of the long URLs.
      CS
      private void chkAutoConversion_Checked(object sender, RoutedEventArgs e)
      {
          try
          {
              //Create parser object
              var hyperlinkParser = new HyperlinkParser();
      
              //Get pasted data from clipboard
              string text = Clipboard.GetText();
      
              //Convert the URLs into meaningful text using the methods of the HyperlinkParser class
              if (!string.IsNullOrEmpty(text))
              {
                  var links = hyperlinkParser.ExtractURLs(text).ToList();
                  foreach (var link in links)
                  {
                      var displayText = hyperlinkParser.GetDisplayText(link);
                      var anchor = $"<a href=\"{link}\">{displayText}</a>";
                      //Escape the URL so that characters such as '.' and '?' are matched literally
                      var pattern = $@"(^|\s){Regex.Escape(link)}(\s|$)";
                      Regex rgx = new Regex(pattern, RegexOptions.Compiled);
                      text = rgx.Replace(text, $" {anchor} ");
                  }
                  Clipboard.SetData(DataFormats.Html, text);
                  smartRichTextBox.Text = "";
                  smartRichTextBox.ClipboardPaste();
              }
          }
          catch (Exception ex)
          {
              MessageBox.Show($"{ex.Message}{Environment.NewLine}{ex.StackTrace}");
          }
      }
      
      VB
      Private Sub chkAutoConversion_Checked(sender As Object, e As RoutedEventArgs)
          Try
              'Create parser object
              Dim hyperlinkParser As New HyperlinkParser()
      
              'Get pasted data from clipboard
              Dim text As String = Clipboard.GetText()
      
              'Convert the URLs into meaningful text using the methods of the HyperlinkParser class
              If Not String.IsNullOrEmpty(text) Then
                  Dim links As List(Of String) = hyperlinkParser.IHyperlinkParser_ExtractURLs(text).ToList()
                  For Each link As String In links
                      Dim displayText As String = hyperlinkParser.IHyperlinkParser_GetDisplayText(link)
                      Dim anchor As String = $"<a href=""{link}"">{displayText}</a>"
                      'Escape the URL so that characters such as "." and "?" are matched literally
                      Dim pattern As String = $"(^|\s){Regex.Escape(link)}(\s|$)"
                      Dim rgx As New Regex(pattern, RegexOptions.Compiled)
                      text = rgx.Replace(text, $" {anchor} ")
                  Next
                  Clipboard.SetData(DataFormats.Html, text)
                  smartRichTextBox.Text = ""
                  smartRichTextBox.ClipboardPaste()
              End If
          Catch ex As Exception
              MessageBox.Show($"{ex.Message}{Environment.NewLine}{ex.StackTrace}")
          End Try
      
      
      End Sub
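      A URL placed inside a regular expression needs escaping, because characters such as "?" and "." are regex operators: in an unescaped pattern, the "?" in a query string makes the preceding character optional, so the literal URL no longer matches. The following sketch isolates the replacement step so it can be tested without the UI. AnchorSketch and ReplaceWithAnchor are hypothetical names, and HTML-encoding the link and display text is an added precaution rather than part of this topic's code.

      ```csharp
      using System.Net;
      using System.Text.RegularExpressions;

      public static class AnchorSketch
      {
          // Replaces each whitespace-delimited occurrence of link with an HTML anchor.
          public static string ReplaceWithAnchor(string text, string link, string displayText)
          {
              // Quote the href and encode both parts so the markup stays valid.
              string anchor = $"<a href=\"{WebUtility.HtmlEncode(link)}\">{WebUtility.HtmlEncode(displayText)}</a>";
              // Regex.Escape makes '.', '?', '+' and similar characters match literally.
              string pattern = $@"(^|\s){Regex.Escape(link)}(\s|$)";
              // A MatchEvaluator avoids '$' in the URL being read as a substitution token.
              return Regex.Replace(text, pattern, m => $" {anchor} ");
          }
      }
      ```

      For example, replacing https://example.com/a?b=1 in "Go to https://example.com/a?b=1 now" with the display text A produces "Go to <a href="https://example.com/a?b=1">A</a> now".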
      
    3. Run the application and click the button to paste the sample text into the RichTextBox control. Then, select the checkbox to convert the URLs into meaningful text links in the RichTextBox.

    Transforming a standard RichTextBox into a smart RichTextBox can greatly enhance user interaction and content creation efficiency.