This article refers to the new version of the template editor for PDFs using OCR. Check out this page if you are looking for the older version for emails and text PDFs.

Note: Documents processed with an OCR template are charged 1 credit per page in the document.

This tutorial assumes that you already created a mailbox and sent your first email. If not, check out this article to get started.


Step 1: Open the template editor

A template needs to be created from one or several sample documents. You can create your first template in several ways: using the wizard, from the document view page or from the document list page.

Open the template editor from the Mailbox Wizard

If you have just created your mailbox, the Wizard will offer to create your first template.

Click on Create Template from the Mailbox Wizard

Open the template editor from the document view page

From the document view page, click on the + Create Template button to open the editor.

If the document has already been processed, you can also use the " + " button at the top to open the template editor.

Click on Create Template from the document view page

Open the template editor from the document list page

Head over to the Documents section on the left menu. Hover over the document you want to create a template from and click on + New Template.

Click on New Template from the document list page

Open the template editor from the document list page with multiple samples

The new version lets you create templates with multiple document samples so that you can work with optional fields and better test that your template works for all documents.

You can create a new template and include several samples:

  1. Select the documents you want to use as samples by clicking on the checkbox

  2. Then, click on the + New template button

Step 2: Get acquainted with the OCR template editor

When you create your first template, the Template Editor tutorial opens. You can revisit that tutorial at any time by clicking on the "How to use this editor" link at the top right corner of the screen.

The Template Editor is where you will show which data points you want to retrieve from documents.

Walk through the template editor screen

Let's go through each section of this screen:

  1. Template Name: give your template a name. The name must be unique within a mailbox. We recommend that you always replace the default name with a meaningful one.

  2. Contextual help: gives you some tips on what to do next or error messages, if any.

  3. Sample list: you can attach several document samples to the template editor. This allows you to manage optional fields and check a template works against several documents.

  4. View: leave this on Image view for now. Other modes can be useful but are intended for advanced usage.

  5. Content: shows the content of the currently selected PDF sample. You can draw a box over it to tell Parseur which data to extract (see Step 3 below).

  6. Fields tab: lists the fields used or available to use. As you haven't created any field yet, this list is empty.

  7. Metadata tab: lists additional metadata fields you may want to add to your parsed results. See below for more information.

  8. Static tab: allows you to create Static fields, which are fields you can set with custom values. See below for more information.

  9. Settings tab: lists several advanced options like the action to take on matching documents.

  10. Create buttons: you will use these buttons to create fields, labels and table fields. They become active once you draw a box over the content. Read on for more information.

Step 3: Create your first field

In Parseur, a field represents a piece of information you want to extract.

The animation below shows you how to create your first template.

To create a field:

  1. Draw a box over the text you want to extract. Make sure the box covers the full area the text can occupy in any document, because Parseur will only extract the text under the box.

  2. Move or resize the box using the handles as appropriate

  3. The "New Field" button becomes available

  4. Click this button to open the field options section

  5. Name your field and change options as appropriate

  6. Click Save or draw a new field

When you create a field, Parseur will position it in absolute terms on the page by default: that means it will extract the text in all documents in that exact box location on that page. If the field can move horizontally or vertically, you can use Labels to position the field dynamically. Check out our article on how to use labels and dynamic OCR for more information.
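To make the idea of absolute positioning more concrete, here is a minimal Python sketch. It is not Parseur's implementation: the Box and Word types and the extract_field function are hypothetical names, and the sketch assumes that OCR output is available as words with page coordinates. It keeps only the words whose center falls inside the field box, which illustrates why a box drawn too small can miss text that sits slightly elsewhere in another document.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Rectangle in page coordinates (hypothetical units, origin at top left)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class Word:
    """One OCR word and its bounding box (assumed input format)."""
    text: str
    box: Box

    def center(self) -> tuple[float, float]:
        return (
            (self.box.left + self.box.right) / 2,
            (self.box.top + self.box.bottom) / 2,
        )


def extract_field(words: list[Word], field_box: Box) -> str:
    """Join the words whose center lies inside field_box."""
    inside = [w for w in words if field_box.contains(*w.center())]
    # Order top to bottom, then left to right.
    inside.sort(key=lambda w: (w.box.top, w.box.left))
    return " ".join(w.text for w in inside)


# Hypothetical OCR words for one page.
words = [
    Word("Invoice", Box(50, 40, 110, 55)),
    Word("INV-001", Box(400, 40, 460, 55)),
    Word("Total", Box(50, 700, 90, 715)),
]

# A field box drawn generously around the invoice number.
field_box = Box(390, 30, 560, 60)
print(extract_field(words, field_box))
```

In this sketch, a narrower box such as Box(390, 30, 420, 60) would return an empty string for the same page, because the center of the word falls outside it.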

Step 4: Create all remaining fields and save

Repeat the steps described above for every field you want to capture.

Tips when creating a template:

  1. As mentioned above, make sure each field covers the full zone where its text can appear, not only the place where the text sits in the current document

  2. On the right-hand side, some fields and labels appear in bold and some in regular text: fields in bold are required, and the ones in regular text are optional. You can change that setting by toggling the "Field presence is required" switch when editing a field or label

  3. As you can see from the screen capture, we created labels on top of the invoice supplier name and the "Invoice" term. This helps Parseur select the right template when your mailbox contains templates from several suppliers. When searching for the best template, Parseur keeps only the ones whose mandatory labels are all present in the document.
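The filtering described in the last tip can be sketched in a few lines of Python. This is a simplified illustration, not Parseur's actual matching logic: the candidate_templates function and the data layout are hypothetical, and the sketch only checks that each mandatory label's text appears somewhere in the document, ignoring where the label sits on the page.

```python
from __future__ import annotations


def candidate_templates(
    required_labels_by_template: dict[str, set[str]],
    document_text: str,
) -> list[str]:
    """Return the names of templates whose required labels all appear in the text."""
    text = document_text.lower()
    return [
        name
        for name, labels in required_labels_by_template.items()
        if all(label.lower() in text for label in labels)
    ]


# Hypothetical mailbox with one template per supplier.
templates = {
    "Acme invoice": {"Acme Corp", "Invoice"},
    "Globex invoice": {"Globex", "Invoice"},
}
document_text = "ACME CORP\nInvoice INV-001\nTotal: 120.00"

print(candidate_templates(templates, document_text))
```

Note that in this sketch a template without any mandatory label is always a candidate, which is one reason to add a label on something specific to each supplier.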

Step 5: Add metadata fields (optional)

You may want to extract additional metadata that is not present in the document body, such as a link to the original PDF document.

Head over to the Metadata tab next to the Fields tab.

For more information check out our Using Metadata Fields article.

Step 6: Save the template

Once finished, click "Create".

You will now see that your document has been processed.

Step 7: Check results

Make sure that all the data was captured correctly.

In this screen you see:

  • At the top left, metadata info about the document, including the template that was used

  • At the top right, the action buttons (hover over them for more information)

  • On the left, the document content

  • On the right, the parsed data extracted. You can switch between the table view and JSON view according to your preference

If everything looks correct, congratulations! You have parsed your first document!

Now send more documents and verify that your data is correctly extracted. Create new templates as necessary.

FAQ - Frequently Asked Questions

How can I split a multipage PDF into several documents?

You can set up your mailbox to split PDFs every X pages in the Mailbox Settings:

  • Open your mailbox

  • Click on Settings on the left menu

  • Click on the Processing tab

  • Check the "Split PDFs into individual documents" box

  • Enter the number of pages you want Parseur to split your PDFs on

  • Click save
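To illustrate what splitting every X pages means, here is a small Python sketch that groups page numbers into documents. The split_pages function is a hypothetical helper, not part of Parseur, and the sketch assumes that any leftover pages at the end form a final, shorter document.

```python
from __future__ import annotations


def split_pages(total_pages: int, pages_per_document: int) -> list[list[int]]:
    """Group 1-based page numbers into consecutive chunks of pages_per_document."""
    if pages_per_document < 1:
        raise ValueError("pages_per_document must be at least 1")
    return [
        list(range(start, min(start + pages_per_document, total_pages + 1)))
        for start in range(1, total_pages + 1, pages_per_document)
    ]


# A 7-page PDF split every 3 pages.
print(split_pages(7, 3))
```

Under that assumption, a 7-page PDF split every 3 pages gives three documents: pages 1 to 3, pages 4 to 6, and page 7 on its own.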

How does Parseur prioritize templates?

OCR templates are prioritized following the usual rules. Check out the following article to understand how Parseur picks a template.

I have another question

Please contact us via the chat at the bottom right corner.
