Extract data using the AI parsing engine

How to use the Parseur AI engine to extract data from documents without templates.

Updated this week

Parseur's advanced AI technology now empowers you to extract data from documents by simply leveraging the field names within your mailbox. No more manual template setup – just seamless and accurate data extraction, regardless of language or document complexity.


AI data extraction features:

Parseur's AI extraction feature introduces a new era of efficiency and convenience:

  1. Template-less extraction: Bid farewell to template creation and updates. Our AI-driven solution eliminates the need for manual setup, allowing you to automatically extract data from documents.

  2. Any type of document layout: Since no templates are involved, the AI engine can extract data from documents with a great variety of layouts.

  3. Field names are the key: Guiding our AI to extract the precise data you require is as simple as naming the desired fields within your mailbox. These field names serve as intuitive cues for our AI to identify and extract the corresponding data.

  4. Multilingual proficiency: Parseur's AI understands and extracts data from documents in any language, ensuring global accessibility and applicability.

Limitations:

The AI engine has the following limitations to bear in mind:

  • Page count limitation: when using table fields, the AI can extract data from only a limited number of pages. The exact limit varies with the text density of your pages. You can use the Split PDF feature to have your document split into smaller ones at upload.

How to parse documents using AI

Getting started with Parseur's AI parsing feature is quick and intuitive.

Step 1: Create a new mailbox

Choose from our pre-defined mailboxes or create a customized mailbox tailored to your needs.

If you have an existing mailbox for which you want to use AI, enable the AI engine as described in step 2 below.

Step 2: Enable AI at mailbox level (or user level)

After selecting the mailbox type, click the AI checkbox to activate it.

You can enable/disable the AI engine when creating a mailbox

You can also activate AI on an existing mailbox in the mailbox settings.

Finally, you can also activate AI for all of your existing mailboxes in your user account. Click on your name in the left menu > Account > Manage account > AI engine.

Step 3: Upload a sample document

Upload a representative sample document that showcases the type of data you want to extract.

Step 4: Configure your fields (Optional)

Wait for Parseur to analyze the document.

Then, if your mailbox already has fields (for example, if you chose one of our pre-defined mailboxes), Parseur will immediately start the extraction process.

Create simple fields

For custom mailboxes where there are no default fields, you will need to create some fields:

  1. Click on the uploaded document to view it.

  2. Navigate to the Fields tab.

  3. Add the specific fields you wish to extract.

  4. Ensure these fields are named in a manner that the AI can easily understand, such as using terms like "InvoiceNumber" or "customer_address".

Create table fields to capture repeating data

To extract a list of repeating data, use the New Table button.

Then click on the "Add fields to <your field>" button to name the individual fields you want to extract from the table.

Repeat this for each field you want to add. For example: quantity, description, sku, price, etc.
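
As a rough illustration of how simple fields and a table field differ, the sketch below models one parsed invoice as a Python dictionary. The structure, the table name LineItems, and all values are assumptions made up for this example; they are not Parseur's actual export format.

```python
# Hypothetical shape of one parsed document (illustrative only,
# not Parseur's actual output format).
parsed_document = {
    # Simple fields hold a single value each.
    "InvoiceNumber": "INV-001",
    "customer_address": "1 Example Street",
    # A table field holds a list of rows; every row has the same named fields.
    "LineItems": [
        {"quantity": 2, "description": "Widget", "sku": "W-1", "price": 9.50},
        {"quantity": 1, "description": "Gadget", "sku": "G-7", "price": 20.00},
    ],
}

# Repeating data can then be processed row by row, for example to sum a total.
invoice_total = sum(
    row["quantity"] * row["price"] for row in parsed_document["LineItems"]
)
print(invoice_total)
```

The point of the sketch is that a table field captures every repeating row, whereas a simple field captures only one value per document.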

Step 5: Process your document and check the results

After adding all desired extraction fields, click the "Process" button to initiate the AI-driven data extraction process.

Frequently Asked Questions (FAQ)

Parseur AI didn't fetch the value I wanted for some of my fields. How can I train the Parseur AI model to do better?

Tip #1: use better field names

Parseur uses the names of your fields to find the relevant data in your documents. If the wrong value is fetched, try renaming the field to something more accurate that the AI will better understand. Think of the AI as a data entry trainee that needs guidance to understand what you want.

For example, to capture the invoice number in invoice documents:

  • ❌ don't name the field Invoice or Number or invno

  • ✅ name it InvoiceNumber, invoice_number or Invoice number
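
If you manage many fields, a quick local check can flag names that may be too vague. The sketch below is only a heuristic written for this article's examples: the functions split_words and looks_descriptive are hypothetical helpers, not part of Parseur, and the two-word rule is an assumption. Column names inside a table, such as quantity or sku, are single words in the examples above, so this check is meant for top-level fields only.

```python
import re
from typing import List


def split_words(field_name: str) -> List[str]:
    """Split a field name into lowercase words.

    Handles camelCase boundaries, underscores, spaces and other separators.
    """
    # Insert a space wherever a lowercase letter or digit is followed by an uppercase letter.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", field_name)
    # Split on any run of non-alphanumeric characters and drop empty pieces.
    return [word.lower() for word in re.split(r"[^A-Za-z0-9]+", spaced) if word]


def looks_descriptive(field_name: str, min_words: int = 2) -> bool:
    """Heuristic: treat a name as descriptive if it has at least min_words words."""
    return len(split_words(field_name)) >= min_words


for name in ["Invoice", "Number", "invno", "InvoiceNumber", "invoice_number", "Invoice number"]:
    print(name, looks_descriptive(name))
```

A name that passes this check is not guaranteed to work well; it only rules out the one-word and abbreviated names that the tip warns against.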

Tip #2: delete unused or duplicate fields

The more fields you have, the more the AI tends to get some of them wrong. If tip 1 didn't help, try to restrict the number of extracted fields to the core of what you need.

Tip #3: consider using the template engine for some layouts

Because the AI is a probabilistic model, it cannot guarantee 100% accuracy for all documents. If you need better results and cannot get them with the AI engine, consider creating templates for some of the layouts. Read more about the pros and cons of our AI parsing engine vs. template parsing engines.

Parseur only retrieved 1 data point from my documents. I have other similar data points in my document. How do I tell Parseur to extract all the data?

If the data repeats within a page, use Table fields instead of single fields:

  • Go to the Fields tab when viewing a document

  • Click New Table

  • Name it something the AI will understand (for example, if you are working on extracting contact details, name the table something like ContactList)

  • Click Create

  • Click Add Field and name each field similar to the single fields you had previously

  • Delete the single fields so as not to confuse the AI

  • Reprocess your documents and check to see if you get the right results

If your document contains several individual documents (like several invoices, for example), use the Split PDF feature described below.

I have long documents; will AI be able to extract data from them?

The AI can only extract data from the first few pages of your document. The exact number of pages depends on the text density of your document.

If you have long documents, you can consider the following options:

  • If you have a PDF consisting of several individual documents bundled together, you can use the Split PDF feature to have Parseur cut the document into individual ones.

  • You can also consider using one of our two template engines: the Text engine for emails and text documents, and the OCR engine for PDFs.
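
To illustrate the general idea of cutting a long document into smaller pieces, the sketch below computes page ranges for chunks of a fixed size. It is not how Parseur's Split PDF feature works internally, which this article does not describe. The function page_ranges and the max_pages value are hypothetical, and the article does not state an actual page limit.

```python
from typing import List, Tuple


def page_ranges(total_pages: int, max_pages: int) -> List[Tuple[int, int]]:
    """Return inclusive, 1-based (first_page, last_page) ranges.

    Each range covers at most max_pages pages; the last range may be shorter.
    """
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages must be >= 1")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


# Example: a 25-page document cut into chunks of at most 10 pages (assumed limit).
print(page_ranges(25, 10))  # [(1, 10), (11, 20), (21, 25)]
```

Each resulting range could then be uploaded as its own document, so that every piece stays within what the AI engine can read.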

I have some templates and the AI engine enabled in my mailbox. Which engine will be used to parse my documents?

Matching templates take priority over the AI engine. But if there are no matching templates, Parseur will use the AI Engine to extract your data.
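
The priority rule above can be sketched as a small decision function. This is only a model of the rule as stated here; the function choose_engine and its parameters are hypothetical, and the case where no template matches and AI is disabled is not covered by this article, so the sketch returns None for it.

```python
from typing import Optional, Sequence


def choose_engine(matching_templates: Sequence[str], ai_enabled: bool) -> Optional[str]:
    """Return which engine would handle a document under the stated priority rule."""
    # Matching templates always win over the AI engine.
    if matching_templates:
        return "template"
    # With no matching template, the AI engine is used if it is enabled.
    if ai_enabled:
        return "ai"
    # Behavior in this case is not described in the article (assumption: no engine).
    return None


print(choose_engine(["invoice_layout"], ai_enabled=True))  # template
print(choose_engine([], ai_enabled=True))                  # ai
```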

How secure is my data when using the AI engine? Do you share my data to improve the AI model?

Parseur uses state-of-the-art AI models from Azure, Google and AWS to parse your data. Your data is processed in the European Union. The data remains yours and we don't re-use or share it to improve the AI models.

What is the difference between AI engine v1 and v2 in the mailbox settings?

AI v1 is our legacy AI engine, introduced in late 2023. AI v2 is our newest model.

The v2 model improves extraction accuracy and can parse data from much larger documents. We recommend that you use v2 by default and only try v1 if v2 doesn't give you satisfactory results.
