Use Labels to dynamically position fields (Dynamic OCR)

How to use Labels to capture data from fields that can move horizontally or vertically in a document


This tutorial assumes that you already know how to create simple OCR templates.

Note: Labels are only available for OCR templates. In text templates, delimiters (our term for labels in that case) are handled automatically.

Understanding field positioning: absolute vs relative to label

When you create a field, Parseur will position it in absolute terms on the page by default, which means it will extract the text from all documents in that exact box location on that page.

This is called Zonal OCR, and it works when the field is always in the same place in a document. But sometimes, a field can move up or down, left or right, in a document.

This is when you can use relative label positioning, also known as Dynamic OCR.

Let's take the example below: we want to extract the subtotal value under the table. However, the number of items in the table can vary from one document to the next. So the position of the subtotal value can move vertically across documents.

The subtotal value will move vertically on the page depending on the number of items in the table above

However, that value always sits in the same place relative to the Subtotal text: immediately to its right. So we'll create a Label over the Subtotal text. Then we'll tell Parseur to position the Subtotal field relative to this label.
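To make the difference concrete, here is a minimal Python sketch of the idea. It is not Parseur's actual implementation: the `Box` type, the `resolve_field` helper, and all coordinates are hypothetical. It only illustrates that a relatively positioned field keeps its offset from the label instead of a fixed position on the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A rectangle on the page, in arbitrary page units (hypothetical)."""
    x: float
    y: float
    width: float
    height: float


def resolve_field(label_in_template: Box, field_in_template: Box,
                  label_in_document: Box) -> Box:
    """Place the field at the same offset from the label as in the template."""
    dx = field_in_template.x - label_in_template.x
    dy = field_in_template.y - label_in_template.y
    return Box(label_in_document.x + dx, label_in_document.y + dy,
               field_in_template.width, field_in_template.height)


# Template: the "Subtotal" label at y=400, with the value drawn to its right.
template_label = Box(x=350, y=400, width=60, height=12)
template_field = Box(x=450, y=400, width=80, height=12)

# A longer document: extra table rows pushed the label down by 60 units.
document_label = Box(x=350, y=460, width=60, height=12)

resolved = resolve_field(template_label, template_field, document_label)
print(resolved)  # the field follows the label down the page
```

With absolute positioning, the field would stay at y=400 and read whatever text happens to be there in the longer document.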

How to create a dynamically positioned field with Labels?

Creating a field positioned relative to a label is quite straightforward:

  1. Draw a box over the text you want to use as the anchor for the field (in our example, "Subtotal")

  2. Click New Label

  3. Wait for Parseur to identify the content of the label

  4. Draw a box over the text you want to extract

  5. Click New Field and enter the field name and other options like any normal field

  6. Under Field position > Start relative to, select the label you just created from the dropdown

If your field has a fixed height and width, you only need to use the "Start relative to" option.

How to create fields with a dynamic height or width?

If your field has a variable height (typically tables with varying numbers of rows or comments with varying numbers of lines), add a second label that marks where the field ends:

  1. Perform steps 1 to 6 above

  2. Create a second label below the field

  3. Edit the field

  4. Under Field position > End relative to, select that second label

This will tell Parseur to stop the field relative to that second label.
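As a rough illustration of a field bounded by two labels, the Python sketch below computes the top and bottom edges of such a field. The `stretch_field` function, its offset parameters, and the coordinates are assumptions made for this example, not Parseur's internals; the y axis is assumed to grow downward.

```python
def stretch_field(start_label_y: float, end_label_y: float,
                  start_offset: float = 0.0,
                  end_offset: float = 0.0) -> tuple[float, float]:
    """Return (top, bottom) of a field bounded by two labels.

    The top edge tracks the first label and the bottom edge tracks the
    second, so the height follows the amount of content in between.
    """
    top = start_label_y + start_offset
    bottom = end_label_y + end_offset
    if bottom <= top:
        raise ValueError("the end label must sit below the start of the field")
    return top, bottom


# Same template, two documents: the second label moves down as rows are added.
top_a, bottom_a = stretch_field(start_label_y=200, end_label_y=260, start_offset=15)
top_b, bottom_b = stretch_field(start_label_y=200, end_label_y=380, start_offset=15)

print(bottom_a - top_a)  # field height in the short document
print(bottom_b - top_b)  # field height in the long document
```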

The label content that Parseur recognized does not match what I see on the document. What can I do?

When you create a label over a piece of text and the text recognized by Parseur doesn't match what you see on the document, it usually means that your PDF was scanned or encoded with a low-quality OCR program.

When that happens, you can ask Parseur to redo the OCR by enabling the Force OCR option in your mailbox Settings > Processing. You will then need to delete and re-upload the document for the OCR to take place.

I am getting a "No text found in the box" error. What can I do?

This can typically happen if you try to create a label over an image (for example, a company logo or screen capture embedded in the document). There are two ways to fix this:

Option 1: Find another label

Try to find another piece of text that can accurately position the field. This is the recommended option.

Option 2: Force OCR on images

If option 1 is not possible, you can force Parseur to detect text in images.

To do so:

  • Open your Mailbox Settings

  • Click on the Processing tab

  • Under Advanced Settings, check the "Force use of OCR on PDFs" option

  • Click Save

  • Re-upload your Documents

How are labels identified in a document?

Labels are identified using two data points:

  • The text content, such as "Subtotal" in the example above

  • An occurrence number. If the text content is found several times in the document, Parseur will use the occurrence number to select the right label.
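The two data points above can be pictured with a short Python sketch. It assumes the document has already been reduced to a top-to-bottom list of text snippets and that occurrence numbers start at 1; both are assumptions of this example, and `find_label` is a hypothetical helper rather than a Parseur function.

```python
def find_label(snippets: list[str], text: str, occurrence: int) -> int:
    """Return the index of the Nth (1-based) snippet equal to `text`."""
    positions = [i for i, snippet in enumerate(snippets) if snippet == text]
    if not 1 <= occurrence <= len(positions):
        raise LookupError(f"{text!r} does not have an occurrence #{occurrence}")
    return positions[occurrence - 1]


# "Total" appears twice; the occurrence number tells the two apart.
snippets = ["Invoice", "Total", "Items", "Subtotal", "Tax", "Total"]

first_total = find_label(snippets, "Total", occurrence=1)
second_total = find_label(snippets, "Total", occurrence=2)
print(first_total, second_total)
```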

How do I constrain a template to a certain number of occurrences of a label?

In some cases, you may want Parseur to make sure that the total number of occurrences also matches. To do so:

  • Edit your label

  • Click on the Lock icon

  • Save the label and template

With this option, Parseur will not match a document to that template if the total number of occurrences of the label in the document differs from the number found when the label was created.
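A minimal way to picture the locked check, assuming the template remembers how many times the label text appeared when the label was created. The function name and the list-of-snippets representation are hypothetical.

```python
def passes_occurrence_lock(snippets: list[str], label_text: str,
                           expected_total: int) -> bool:
    """True only if the label text appears exactly `expected_total` times."""
    return snippets.count(label_text) == expected_total


# Hypothetical template: "Total" appeared twice when the label was created.
expected_total = 2

doc_same_count = ["Total", "Items", "Total"]
doc_extra_total = ["Total", "Items", "Total", "Total"]

print(passes_occurrence_lock(doc_same_count, "Total", expected_total))
print(passes_occurrence_lock(doc_extra_total, "Total", expected_total))
```

Under this assumption, the second document would be rejected by the template even though a second "Total" label can still be found in it.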

How do I pick the last label of a document?

By default, the document's top is the starting point for calculating label occurrence. Sometimes, however, you want to tell Parseur that the label should be located starting from the bottom of the document instead.

For example, you want to always take the last occurrence of "Total" in a document, even though the total number of occurrences varies from one document to the next.

You can change the direction in which occurrences are counted on the label edit screen.
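The effect of counting from the bottom can be sketched in Python as follows. This is an illustration under assumptions, not Parseur's code: `find_label_from_bottom` is a hypothetical helper and each document is modeled as a top-to-bottom list of text snippets.

```python
def find_label_from_bottom(snippets: list[str], text: str,
                           occurrence: int = 1) -> int:
    """Return the index of the Nth (1-based) match, counted from the bottom."""
    positions = [i for i, snippet in enumerate(snippets) if snippet == text]
    if not 1 <= occurrence <= len(positions):
        raise LookupError(f"{text!r} does not have an occurrence #{occurrence}")
    return positions[-occurrence]


# The number of "Total" lines differs, but occurrence 1 from the bottom
# always lands on the last one.
short_doc = ["Total", "Tax", "Total"]
long_doc = ["Total", "Total", "Tax", "Total", "Total"]

print(find_label_from_bottom(short_doc, "Total"))
print(find_label_from_bottom(long_doc, "Total"))
```

Counting from the top, the last "Total" would be occurrence 2 in the first document and occurrence 4 in the second, so no single occurrence number would work for both.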
