With Parseur’s AI engine, you can extract data from documents effortlessly by leveraging field names and instructions within your mailbox. No more manual template setups—just simple and accurate data extraction, regardless of the document’s language or complexity.

Key AI Data Extraction Features:

Template-less extraction: Say goodbye to manual template creation and updates. Our AI-driven solution eliminates the need for setup, allowing you to automatically extract data from documents.
Flexible Document Layouts: Because there are no templates, the AI engine can extract data from any kind of document layout.
Multilingual proficiency: Parseur’s AI can understand and extract data from documents in most languages.

Limitations:

The AI engine has the following limitations to bear in mind:

Page count limitation: When extracting data from tables, the AI can handle documents up to 25 pages. Use the Split Page feature to divide your document into smaller parts during upload if needed.
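
If you want to plan a split before uploading, the arithmetic can be sketched in a few lines of Python. This is an illustration only: it does not call Parseur or the Split Page feature, and the function name page_ranges is an assumption made for this example. It simply lists page ranges that each stay within the 25-page limit mentioned above.

```python
from typing import List, Tuple

MAX_PAGES = 25  # page limit stated above for AI table extraction


def page_ranges(total_pages: int, max_pages: int = MAX_PAGES) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive (first, last) page ranges of at most max_pages pages."""
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages must be >= 1")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


if __name__ == "__main__":
    # A 60-page document becomes three parts.
    print(page_ranges(60))  # [(1, 25), (26, 50), (51, 60)]
```

A document that already fits within the limit yields a single range, so no split is needed.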

Tutorial Videos:

How to parse documents using AI

Getting started with Parseur’s AI parsing feature is quick and intuitive:

Step 1: Select AI-assisted mailbox creation

Choose the AI-assisted mailbox creation mode to have Parseur set up a mailbox for you.


Alternatively, if using the Manual Creation mode, ensure the “Use AI” toggle is enabled after selecting your mailbox type.

You can enable/disable the AI engine when creating a mailbox

Finally, for existing mailboxes, you can also activate AI in the mailbox settings.

Step 2: Upload a Sample Document

Upload one or more sample documents representative of the type of data you want to extract.

Step 3: Review Suggested Fields

Parseur will analyze your document, suggest fields to extract, and proceed with data extraction. Click on a document name to review the extracted data.

If you need to make changes, click on the Fields tab and continue to Step 4.

Note: You can also access this list in the “Fields” section of your mailbox on the left menu.

Step 4 (optional): Refine Fields for Extraction

Option 1: Edit Existing Fields

Click the edit button next to a field to modify it:

Here is what you'll get after you click edit:

You can update the following attributes:

Field Name: The label for your data when you download or export it.
Output Format: The type of data to extract. Use this to further normalize your data. Refer to the overview about field formats for more details.
Instructions: By default, AI uses field names to understand what to extract. If needed, provide more detailed instructions and context here. Think of instructions as a custom prompt describing what you want the AI to extract. Read more about using instructions.

Option 2: Create Fields

If needed, you can add fields. Simple fields let you extract a single value, whereas table fields let you extract repeated data, such as the line items of an invoice.

To create a Simple Field:

Click on "New field" to add the specific fields you wish to extract.
Enter the field name, format, and instructions as detailed above.

To create table fields:

Click on "New Table", and enter the name and instructions.
Click on the "Add fields to <your field>" button to name the individual fields you want to extract from the table.
Repeat this process for each field in the table (e.g., quantity, description, SKU, price).
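
It can help to write down the fields you plan to create before entering them in the app. The sketch below is one way to do that in Python. It is a planning aid only, not Parseur's API or internal data model: the class names, the output_format labels ("text", "date", "number"), and the example field names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class SimpleField:
    """A field that holds a single value."""
    name: str
    output_format: str = "text"  # hypothetical label, not an official identifier
    instructions: str = ""       # optional extra context for the AI


@dataclass
class TableField:
    """A field that holds repeated rows, each made of the same columns."""
    name: str
    instructions: str = ""
    columns: List[SimpleField] = field(default_factory=list)


def field_names(schema: List[Union[SimpleField, TableField]]) -> List[str]:
    """Flatten a schema into names, writing table columns as 'Table.column'."""
    names: List[str] = []
    for item in schema:
        if isinstance(item, TableField):
            names.extend(f"{item.name}.{col.name}" for col in item.columns)
        else:
            names.append(item.name)
    return names


# Hypothetical invoice layout: two simple fields and one table field.
invoice_schema = [
    SimpleField("InvoiceNumber"),
    SimpleField("InvoiceDate", output_format="date"),
    TableField(
        "LineItems",
        instructions="One row per product line on the invoice.",
        columns=[
            SimpleField("quantity", output_format="number"),
            SimpleField("description"),
            SimpleField("SKU"),
            SimpleField("price", output_format="number"),
        ],
    ),
]

if __name__ == "__main__":
    print(field_names(invoice_schema))
```

Listing the names this way also makes duplicate or overlapping fields easy to spot before you create them.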

Step 5: Reprocess Your Document and Review Results

After updating all desired extraction fields, click the “Process” button to initiate the AI-driven data extraction process.

Step 6: Download your parsed data

Once your data has been parsed, you can download it as a file or send it to another application with a webhook.
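
On the receiving side, a webhook is an HTTP endpoint that accepts the parsed data. The following minimal sketch uses only the Python standard library. It assumes the webhook delivers a JSON object in the body of an HTTP POST request, with your field names as keys; check the payload your mailbox actually sends before relying on this. The port, the handler name, and the parse_payload helper are choices made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_payload(body: bytes) -> dict:
    """Decode a JSON request body; raise ValueError if it is not a JSON object."""
    data = json.loads(body.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            data = parse_payload(self.rfile.read(length))
        except ValueError:
            # Malformed or unexpected body: reject the request.
            self.send_response(400)
            self.end_headers()
            return
        # Replace this with your own handling (database insert, queue, etc.).
        print("Received fields:", sorted(data))
        self.send_response(200)
        self.end_headers()


if __name__ == "__main__":
    # Port 8000 is an arbitrary choice for local testing.
    HTTPServer(("0.0.0.0", 8000), WebhookHandler).serve_forever()
```

For production use you would typically put an endpoint like this behind HTTPS and verify that requests come from the expected sender.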

Frequently Asked Questions (FAQ)

Q1: Parseur AI didn’t fetch the value I wanted for some fields. How can I improve the AI’s accuracy?

Tip #1: Use More Accurate Instructions

Edit your field and update the instructions to provide more context about the data you want the AI to extract. Note that the AI can only analyze data included in the document; it doesn’t have internet access to retrieve external data. Read more about using instructions.

Tip #2: Remove Unused or Duplicate Fields

Having too many fields can confuse the AI. If Tip #1 doesn’t help, try limiting the extracted fields to only the essential ones.

Tip #3: Consider Using a Template Engine for Some Layouts

AI is a probabilistic model and may not always achieve 100% accuracy. If you require better results, consider creating templates for specific layouts. Read more about the pros and cons of our AI parsing engine vs. template parsing engines.

For instance, if your documents contain straightforward tables or consistent layouts, you can use Parseur's template engine to create templates for each layout type. This ensures efficient data extraction and avoids parsing unnecessary data from irrelevant content.

Q2: Parseur only retrieved one data point from my documents. How do I extract all similar data points?

If you need to extract repeated data points, such as multiple line items in an invoice, you can define table fields in the schema. This lets the AI capture repeated details such as line-item SKUs, quantities, or unit prices within a single document.

To use “Table fields” instead of single fields:

Go to the “Fields” tab when viewing a document.
Click New Table
Name it in a way the AI understands (e.g., “ContactList” for contact details).
Click Create
Click Add Field and name each field similar to the single fields previously used.
Delete the single fields to avoid confusing the AI.
Reprocess your documents and review the results.

For documents containing multiple individual documents (e.g., several invoices), use the Split Page feature.

Q3: Can the AI extract data from long documents?

Yes, the AI engine can extract data from documents of up to 25 pages.

Please note that processing long documents can take some time. We appreciate your patience.

If you have longer documents, consider these options:

Use the “Split Page” feature to separate a bundled PDF into individual documents.
Alternatively, consider using one of our template engines: the Text engine for emails and text documents, and the OCR engine for PDFs.

Q4: If I have both templates and the AI engine enabled in my mailbox, which one will be used?

Matching templates take priority over the AI engine. If no matching templates are found, Parseur will use the AI engine for data extraction.

Please be advised that you cannot use template parsing in tandem with the AI engine for document parsing; our app must choose one or the other before processing.
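
The priority rule described above can be summarized as a small decision function. This is a sketch of the documented behavior only, not Parseur's actual implementation; the function name, its parameters, and the returned labels are assumptions made for illustration.

```python
from typing import Optional, Sequence


def choose_engine(matching_templates: Sequence[str], ai_enabled: bool) -> Optional[str]:
    """Pick one engine per document: a matching template wins over the AI engine."""
    if matching_templates:
        return "template"
    if ai_enabled:
        return "ai"
    return None  # no matching template and AI disabled: nothing to apply


if __name__ == "__main__":
    print(choose_engine(["Invoice layout A"], ai_enabled=True))  # template
    print(choose_engine([], ai_enabled=True))                    # ai
```

Because exactly one branch is taken, a document is never processed by both engines at once.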

Q5: How secure is my data when using the AI engine? Is my data shared to improve the AI model?

Parseur uses state-of-the-art AI models from Azure, Google, and AWS to process your data. Your data is processed in the European Union and remains your property; it is not reused or shared to improve AI models.
