This article refers to the new version of the template editor for PDFs using OCR. This tutorial assumes that you already know how to create simple OCR templates.

Understanding field positioning: absolute vs relative to label

When you create a field, Parseur positions it in absolute terms on the page by default: it extracts the text found at that exact box location on that page, in every document.

This is called Zonal OCR and it works when the field is always at the same place in a document. But sometimes, a field can move up or down, left or right in a document.

This is when you can position the field relative to a label, also known as Dynamic OCR.

Let's take the example below: we want to extract the subtotal value under the table. However, the number of items in the table can vary from one document to the next, so the position of the subtotal value can move vertically across documents.

Subtotal value will move vertically on the page depending on the number of items in the table above

However, that value is always at the same place to the right of the Subtotal text. So we'll create a Label over the Subtotal text, then tell Parseur that the Subtotal field is positioned relative to this label.

How to create a dynamically positioned field with Labels?

Creating a field positioned relative to a label is quite straightforward:

  1. Draw a box over the text you want to use as the label, that is, the text the field will be positioned relative to (in our example, "Subtotal")

  2. Click New Label

  3. Wait for Parseur to identify the content of the label

  4. Draw a box over the text you want to extract

  5. Click New Field and enter the field name and other options like any normal field

  6. Under Field position > Start relative to, select the label you just created from the dropdown

If your field has a fixed height and width, you only need to use the "Start relative to" option.
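To make the idea concrete, here is a minimal Python sketch of the geometry behind a "Start relative to" field. It is not Parseur's implementation: the `Box` type, the `resolve_start_relative` function, and all coordinates are hypothetical. The assumption is simply that the template remembers the field's offset from the label and re-applies that offset wherever the label is found in each document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (origin at the top-left)."""
    x: float
    y: float
    width: float
    height: float


def resolve_start_relative(template_label: Box, template_field: Box, found_label: Box) -> Box:
    """Place the field in a document where the label was found at `found_label`.

    The field keeps the size it had in the template; only its top-left
    corner moves, by the same amount the label moved.
    """
    dx = template_field.x - template_label.x
    dy = template_field.y - template_label.y
    return Box(found_label.x + dx, found_label.y + dy,
               template_field.width, template_field.height)


# Template document: "Subtotal" label with the value box to its right (made-up numbers).
template_label = Box(x=350, y=400, width=60, height=12)
template_field = Box(x=450, y=400, width=80, height=12)

# Another document with more table rows: the label is found 90 units lower.
found_label = Box(x=350, y=490, width=60, height=12)

field = resolve_start_relative(template_label, template_field, found_label)
print(field)
```

In this sketch the field follows the label down the page while its width and height stay fixed, which is the fixed-size case described above.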

How to create fields with a dynamic height or width?

If your field has a variable height (typically tables with a varying number of rows or comments with a varying number of lines):

  1. Perform steps 1 to 6 above

  2. Create a second label below the field

  3. Edit the field

  4. Under Field position > End relative to, select that second label

This tells Parseur to end the field relative to that second label.
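The effect of combining a start label and an end label can be sketched as follows. This is an illustration only, under the assumption that the field begins a fixed gap below the first label and ends a fixed gap above the second one; `Box`, `resolve_variable_height`, the gap parameters, and the coordinates are all hypothetical and do not describe Parseur's internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (origin at the top-left)."""
    x: float
    y: float
    width: float
    height: float

    @property
    def bottom(self) -> float:
        return self.y + self.height


def resolve_variable_height(start_label: Box, end_label: Box,
                            top_gap: float, bottom_gap: float,
                            x: float, width: float) -> Box:
    """Stretch a field between two labels found in a document.

    The top edge is anchored below the start label and the bottom edge
    above the end label, so the height adapts to each document.
    """
    top = start_label.bottom + top_gap
    bottom = end_label.y - bottom_gap
    if bottom <= top:
        raise ValueError("end label must be located below the start of the field")
    return Box(x, top, width, bottom - top)


# Start label: a table header row. End label: the "Subtotal" text under the table.
header = Box(x=50, y=200, width=500, height=12)

# Document with few rows: "Subtotal" found at y=400.
short_table = resolve_variable_height(header, Box(350, 400, 60, 12), 4, 4, x=50, width=500)

# Document with more rows: "Subtotal" found at y=490.
long_table = resolve_variable_height(header, Box(350, 490, 60, 12), 4, 4, x=50, width=500)

print(short_table.height, long_table.height)
```

Both fields start at the same distance below the header, but the second one is taller because the end label sits lower on the page.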

I am getting a "No text found in the box" error. What can I do?

This can typically happen if you try to create a label over an image (for example a company logo or screen capture embedded in the document). There are two ways to fix this:

Option 1: Find another label

Try to find another piece of text that can accurately position the field. This is the recommended option.

Option 2: Force OCR on images

If option 1 is not possible, you can force Parseur to detect text in images.

To do so:

  • Open your Mailbox Settings

  • Click on the Processing tab

  • Under Advanced Settings, check the "Force use of OCR on PDFs" option

  • Click Save

  • Re-upload your documents

How are labels identified in a document?

Labels are identified using two data points:

  • The text content, for example "Subtotal" in our previous example

  • An occurrence number. If the text content is found several times in the document, Parseur uses the occurrence number to select the right label

In some cases, you may want Parseur to check that the total number of occurrences also matches. To do so:

  • Edit your label

  • Click on the Lock icon

  • Save the label and template

With this option, Parseur will not match a document to that template if the total number of occurrences of the label text in that document is different.
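The matching rule described above can be sketched in a few lines of Python. This is a simplified illustration, not Parseur's actual matching logic: the `find_label` function, its parameters, and the word lists are hypothetical, and the document is reduced to a flat list of words in reading order.

```python
from typing import Optional


def find_label(words: list[str], text: str, occurrence: int,
               locked_total: Optional[int] = None) -> Optional[int]:
    """Return the index in `words` of the chosen occurrence of `text`.

    `occurrence` is 1-based. When `locked_total` is set (the "lock" option),
    the document only matches if `text` appears exactly that many times.
    Returns None when the document does not match.
    """
    positions = [i for i, word in enumerate(words) if word == text]
    if locked_total is not None and len(positions) != locked_total:
        return None
    if occurrence < 1 or occurrence > len(positions):
        return None
    return positions[occurrence - 1]


# Hypothetical documents, flattened to words in reading order.
doc_a = ["Invoice", "Total", "items", "Subtotal", "Tax", "Total"]
doc_b = ["Invoice", "Total", "items", "Total", "Subtotal", "Tax", "Total"]

# Unlocked: pick the second "Total" in each document.
print(find_label(doc_a, "Total", occurrence=2))
print(find_label(doc_b, "Total", occurrence=2))

# Locked to two occurrences: doc_b has three, so it no longer matches.
print(find_label(doc_a, "Total", occurrence=2, locked_total=2))
print(find_label(doc_b, "Total", occurrence=2, locked_total=2))
```

Without the lock, the second document still resolves a label, but possibly not the one you intended; with the lock, it is rejected outright because its occurrence count differs.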
