Posted 4 August 2024, 10:22 am EST
I have some marketing PDF pages to analyze. Many of them separate sections of the page with different background colors: rectangles or other drawn shapes fill an area of the page so that the text there stands out.
As a basic example, imagine this forum page were a PDF: there is a “section”/block on the right side in a shade of green for “Need extra support?” and another beneath it in an off-white for “Forum Channels”. How can I find the borders of those blocks/rectangles in the PDF page, and then find the text within those borders?
Is there a way to find the distinct background shapes (or their borders) and then use something like the textmap to determine which text falls inside each “section” (shape)?
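To make the idea concrete, here is a rough Python sketch of the grouping step I have in mind. It is not tied to any PDF library: the `Rect` and `TextFragment` types, the `group_text_by_region` function, and all the coordinates are made up for illustration. It assumes I could already get each background shape and each text fragment as a bounding box, with a top-left origin and y increasing downward.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Rect:
    # Hypothetical bounding box; assumes a top-left origin (top <= bottom).
    left: float
    top: float
    right: float
    bottom: float

    @property
    def area(self) -> float:
        return (self.right - self.left) * (self.bottom - self.top)

    def contains_point(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class TextFragment:
    # Hypothetical stand-in for one entry of a text map: text plus its bounds.
    text: str
    bounds: Rect


def group_text_by_region(
    regions: Dict[str, Rect], fragments: List[TextFragment]
) -> Dict[Optional[str], List[str]]:
    """Assign each fragment to the smallest region containing its center.

    Fragments that fall inside no region are collected under the key None.
    """
    grouped: Dict[Optional[str], List[str]] = {name: [] for name in regions}
    grouped[None] = []
    for frag in fragments:
        cx = (frag.bounds.left + frag.bounds.right) / 2
        cy = (frag.bounds.top + frag.bounds.bottom) / 2
        hits = [name for name, r in regions.items() if r.contains_point(cx, cy)]
        # Prefer the smallest enclosing shape so nested blocks win over
        # a full-page background rectangle.
        key = min(hits, key=lambda n: regions[n].area) if hits else None
        grouped[key].append(frag.text)
    return grouped


# Made-up coordinates loosely mimicking the forum page layout described above.
regions = {
    "support": Rect(400, 50, 580, 150),
    "channels": Rect(400, 160, 580, 400),
}
fragments = [
    TextFragment("Need extra support?", Rect(410, 60, 560, 75)),
    TextFragment("Forum Channels", Rect(410, 170, 520, 185)),
    TextFragment("Main post body", Rect(40, 60, 380, 75)),
]
result = group_text_by_region(regions, fragments)
```

Testing the center of each fragment rather than its full bounds is just a guess at what would tolerate text that slightly overhangs a shape's edge. The part I don't know how to do is getting the list of filled shapes out of the page in the first place.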
Sorry, I don’t have an example I can share at this time because the documents are proprietary, but I’m happy to explain my issue further if I was not clear.