Get Text "under" Annot rectangle

Posted by: aweber on 28 May 2024, 5:32 pm EST

    With the ITextMap and the annotation giving us a Rect and PdfRect, this would appear to be easy to do, but I’m having trouble.

    First, the code example for GetFragment appears to have a typo or two, and one thing in particular is tripping me up: the HitTest rectangle definition.

    Also, if I’m reading it correctly, the HitTest/TextMap uses a top-left origin rather than PDF coordinates? Do I still need to multiply the dpi into these coordinates? There’s a lot of missing information in the docs.

    Still, this should be pretty easy since the annotation supplies both Rect and PdfRect information.

    Can anyone help me with a code snippet using an annotation’s coordinates to see if the Page’s TextMap has any text within the boundaries?

    Thanks.

  • Posted 29 May 2024, 7:52 am EST

    Hi,

    Apologies for any inconvenience.

    Rect and PdfRect are both RectangleF values, but they use different origins. Rect’s origin is the top-left corner of the PDF page, whereas PdfRect’s origin is the bottom-left corner.

    Yes, HitTest/TextMap uses a top-left origin, and you need to multiply the coordinates by the dpi when working with it.
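
    As a rough illustration of the two conventions, here is a minimal sketch of converting between them. PdfCoords and its methods are hypothetical helpers, not library API. The sketch assumes that "multiplying by the dpi" means converting a length in inches to PDF points (72 per inch), and that the page height is known in points.

```csharp
using System;
using System.Drawing;

// Hypothetical helper for illustration; not part of any PDF library.
public static class PdfCoords
{
    // PDF default user space has 72 units (points) per inch.
    public const float PointsPerInch = 72f;

    public static float InchesToPoints(float inches) => inches * PointsPerInch;

    // Takes the edges of a rectangle measured from the page's bottom-left
    // corner (y grows upward) and returns the same rectangle measured from
    // the top-left corner (y grows downward).
    public static RectangleF FromBottomLeftEdges(
        float left, float bottom, float right, float top, float pageHeight)
    {
        if (right < left || top < bottom)
            throw new ArgumentException("Edges are not in ascending order.");

        // The upper edge sits (pageHeight - top) below the top of the page.
        return new RectangleF(left, pageHeight - top, right - left, top - bottom);
    }
}
```

    For example, on a page 792 points tall, bottom-left edges of left 72, bottom 600, right 172, top 650 become a top-left rectangle at (72, 142) that is 100 wide and 50 high.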

    TextMap can’t find text under an annotation. It can only be used to find text that is set directly on the PDF page. You need to find the annotations within a specific Rect region; then you can get the text set on each annotation (see the code snippet below).

    // `page` and `searchRect` are assumed to be defined by the calling code.
    var annotations = FindAnnotations(page, searchRect);
    foreach (var annotation in annotations)
    {
        Console.WriteLine(annotation.Contents);
    }
    // ...

    // Returns every annotation on the page whose Rect lies entirely inside searchRect.
    List<AnnotationBase> FindAnnotations(Page page, RectangleF searchRect)
    {
        var annotationList = new List<AnnotationBase>();
        foreach (var annotation in page.Annotations)
        {
            var annoRect = annotation.Rect;
            // RectangleF.Contains(RectangleF) is true only for full containment.
            if (searchRect.Contains(annoRect))
            {
                annotationList.Add(annotation);
            }
        }
        return annotationList;
    }

    Please refer to the attached sample: AnnotationDemo.zip

    If you have another requirement, then please let us know. We will try our best to fulfill your requirements.

    Best Regards,

    Nitin

  • Posted 29 May 2024, 8:00 am EST

    Thank you for the reply.

    The Annotation does not contain any text itself. I have already checked, and the Annotation.Contents == null.

    It appears the user (or automation) is simply drawing a box around a sentence or paragraph to identify it. The box is an annotation in the PDF itself, and the text IS found if I dump the entire page’s text (Page.GetText). So, as in my original question, when .Contents == null I’d like to look in the TextMap for any text that lies within the annotation’s drawn rectangle. Finding the annotations and checking their contents is not an issue. Matching the rectangle from the annotation’s coordinates to the ITextMap GetFragment call is the current problem. Nothing I’ve tried matches up.
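
    To make the goal concrete, here is a minimal sketch of the kind of matching I mean. TextRun and RegionText are hypothetical stand-ins, not GcPdf types. It assumes each piece of page text is available with bounds in the same top-left-origin units as the annotation’s Rect, which is exactly the part I have not been able to confirm.

```csharp
using System.Collections.Generic;
using System.Drawing;
using System.Linq;

// Hypothetical stand-in for one positioned piece of page text.
public sealed record TextRun(string Text, RectangleF Bounds);

public static class RegionText
{
    // Joins the text of every run whose center point falls inside the
    // region, ordered top to bottom and then left to right.
    public static string TextInside(IEnumerable<TextRun> runs, RectangleF region)
    {
        var hits = runs
            .Where(r => region.Contains(Center(r.Bounds)))
            .OrderBy(r => r.Bounds.Top)
            .ThenBy(r => r.Bounds.Left)
            .Select(r => r.Text);
        return string.Join(" ", hits);
    }

    private static PointF Center(RectangleF r) =>
        new PointF(r.Left + r.Width / 2f, r.Top + r.Height / 2f);
}
```

    The center-point test is a deliberate choice in this sketch: a hand-drawn box often clips the edges of the text it surrounds, so requiring full containment would likely miss runs the user meant to include.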

  • Posted 29 May 2024, 8:03 am EST

    Hi,

    Could you please provide a sample PDF so that we can investigate this and assist you better?

    We will implement a solution to fetch text from the annotation rectangle, but we need to investigate this issue first.

    Regards,

    Nitin

  • Posted 29 May 2024, 8:29 am EST

    I cannot add a sample to the public forum. Is there a way I can PM you an example for review?

    Since the text DOES appear in the Page.GetText(), and I have coordinates from the Annotation-Rectangle (drawn), this should be straightforward for someone who understands how the Annotation Rectangle/Coordinates should “map” or convert to ITextMap coordinates.

    The type of the “empty” annotations is SquareAnnotation, FWIW.

  • Posted 29 May 2024, 11:51 am EST

    The Annotation Rect is documented as relative to the MediaBox.

    I don’t see any documentation about the TextMap (or relevant HitTestInfo/GetFragment). Is there possibly some offset I need to apply to correlate the Annotation coordinates to the TextMap coordinates for a specific page?
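
    If such an offset does exist, applying it should amount to a plain translation. The sketch below is only that idea written out: BoxOffset.Rebase is a hypothetical helper, and it assumes both boxes are known in one shared top-left-origin coordinate space. Whether this library needs any offset at all is the open question.

```csharp
using System.Drawing;

// Hypothetical helper for illustration; not part of any PDF library.
public static class BoxOffset
{
    // Re-expresses rect, given relative to fromBox's top-left corner,
    // relative to toBox's top-left corner. Size is unchanged.
    public static RectangleF Rebase(RectangleF rect, RectangleF fromBox, RectangleF toBox)
    {
        float dx = fromBox.Left - toBox.Left;
        float dy = fromBox.Top - toBox.Top;
        return new RectangleF(rect.X + dx, rect.Y + dy, rect.Width, rect.Height);
    }
}
```

    When the two boxes share the same top-left corner, the rectangle comes back unchanged, so comparing results before and after would be a quick way to see whether an offset is the culprit.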

    I am shocked this isn’t more straightforward, I guess.

  • Posted 30 May 2024, 3:57 am EST

    Hi,

    Apologies for the inconvenience.

    You can create a support ticket here: https://developer.mescius.com/my-account/my-support/newcase

    This will be a private support thread that only you and the Mescius Support team can access. I will follow up on the ticket once you create it.

    Please attach the PDF file to the support ticket and we will investigate this issue. After investigation, we will provide you with a resolution.

    Also, please let us know which parts of the documentation you had trouble understanding so that we can review and improve them.

    Regards,

    Nitin
