This article explains how to prevent some documents from being parsed by Parseur. You'll learn how to set up document filtering so that only the documents you want are parsed, without getting "New Template Needed" notifications for the others.
There are several options to prevent documents from being parsed.
Solution 1: Prevent documents from being forwarded to Parseur
This is often the best solution. The idea is to forward only the documents that come from senders whose data you want to extract.
Most email providers, such as Gmail, Microsoft Exchange, and Outlook, allow setting custom filters when creating auto-forwarding rules. Using those filters, you can restrict which emails are forwarded by sender address, by subject, by content, or by a combination of these. Refer to this page to learn more about creating filters depending on your email provider.
Unfortunately, not all email providers allow creating custom forward filters (Yahoo doesn't, for instance). And when they do, sometimes forward rules are just too complex to manage with your email provider.
For those cases, you can prevent some documents from being parsed directly in Parseur, either manually or automatically.
Solution 2: Prevent some documents from being parsed in Parseur
Manually skip documents
If you only have a few documents that have slipped through the cracks, you can filter them out of the processing queue with a single click.
To manually prevent documents from being parsed in Parseur:
Open your Parseur mailbox and go to your Document queue
Locate the document(s) you want to filter out and then either:
Click the Eject icon to skip a document but keep it in the queue for auditing purposes. The document will get the Skipped (manual) status.
Click the Trash icon to permanently delete the document from the queue. Clicking the icon will open a confirmation window.
Automatically skip documents
If you keep getting documents in the queue that you need to filter out, you don't have to keep managing them manually.
Parseur allows you to create specific templates that will automatically skip or delete documents.
To automatically prevent documents from being parsed in Parseur:
Open your mailbox and go to your Document queue
Create a new template based on the document you want to filter out
Click on the Settings tab on the right menu
Under the "action on matching documents" setting, select:
Skip to keep the document but not process it. Matching documents will get the Skipped (auto) status.
Delete to delete the document. Matching documents won't appear in the document queue anymore. Parseur will delete them immediately after matching them with the template.
Then:
if you're working on a PDF / OCR template, create labels that will identify the document
if you're working on an Email / text template, create fields that will identify this document (see below)
Note for text templates: When creating Skip or Delete templates, the fields you create work the opposite way from those in normal templates:
For normal data extraction templates, you create fields on top of the pieces of text you want to extract. This data usually changes from one document to another.
For Skip (and Delete) templates, you create fields on top of pieces of text that stay the same across all the emails you want to skip. This allows Parseur to confirm that the document has to be skipped (or deleted) during the template matching process.
Use Post Processing for more advanced skipping options
You can also use the Post Processing module to write your own business rules that decide which emails to skip. During Post Processing, returning None will mark the document as skipped.
One common use case for this is to keep a blacklist (or whitelist) of sender email addresses you want to reject (or accept). See below.
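The blacklist rule described above could be sketched in Python roughly as follows. This is a minimal illustration, not Parseur's documented script: the post_process wrapper, the "sender" field name, and the addresses are all assumptions made for the example, and the way your mailbox exposes parsed data to the Post Processing code may differ. The only behavior taken from this article is that returning None marks the document as skipped.

```python
from typing import Optional

# Hypothetical senders to reject; replace with your own addresses.
# Entries must be lowercase because the sender is lowercased before lookup.
BLACKLIST = {"newsletter@example.com", "no-reply@example.org"}


def post_process(data: dict) -> Optional[dict]:
    """Return None to skip the document, or the parsed data to keep it."""
    # "sender" is an assumed field name; use the field that holds the
    # sender address in your own mailbox.
    sender = str(data.get("sender", "")).strip().lower()
    if sender in BLACKLIST:
        return None  # returning None marks the document as skipped
    return data  # any other document goes through unchanged
```

For a whitelist, the same sketch applies with the test inverted: keep a set of accepted addresses and return None when the sender is not in it.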