Metadata fields extract information about the document (e.g., sender, received time), while Custom fields extract specific data from within the document’s content.

Adding Metadata Fields in Parseur

Metadata fields apply globally to all documents in a mailbox.
Changes to metadata fields will only apply to newly parsed documents. To update existing documents, reprocess them from the document queue.
You can reprocess documents one by one or in bulk.

You can add metadata fields in two ways:

Option 1: from the Fields menu

Open your Parseur mailbox.
On the left-hand side menu, click on Fields.
You will see a list of available metadata fields and your custom fields.
Click the metadata fields you want to extract.

Option 2: in the Template Editor

Open your Parseur mailbox.
Edit the template you want to use.
On the right side of the document, click the Metadata tab.
Select the metadata fields you want to extract.

List of Available Metadata Fields in Parseur

Date and Time Metadata:

Received: Date and time when Parseur received the document.
ReceivedDate: Only the date part of the received timestamp.
ReceivedTime: Only the time of day when Parseur received the document.
ProcessedTime: Date and time when Parseur last processed the document.

Note: These fields follow your date and time formatting preferences. You can adjust these preferences in Account Settings.
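
As a rough illustration of how the three "received" fields relate to each other, the Python sketch below splits one timestamp into its date and time parts. It assumes an ISO 8601 input; the strings Parseur actually returns depend on your formatting preferences, and the function name split_received is hypothetical.

```python
from datetime import datetime


def split_received(received: str) -> dict:
    """Split a timestamp into date and time parts.

    Assumption: the input is ISO 8601 (e.g. "2024-03-15T09:30:00").
    Real values follow your Parseur date and time preferences.
    """
    moment = datetime.fromisoformat(received)
    return {
        "Received": moment.isoformat(),
        "ReceivedDate": moment.date().isoformat(),
        "ReceivedTime": moment.time().isoformat(),
    }


fields = split_received("2024-03-15T09:30:00")
print(fields["ReceivedDate"], fields["ReceivedTime"])
```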

Email Address Metadata:

Sender: Email address of the person who sent the document. Usually the same address as the OriginalRecipient address, unless your mailbox receives emails from different aliases or is a catch-all.
SenderName: Name associated with the sender (extracted from the email’s “From” field).
Recipient: Parseur’s receiving mailbox email address (e.g., [email protected]).
To: Recipients listed in the “To” field of the email.
CC: Recipients listed in the “CC” field of the email.
BCC: Visible only if your Parseur mailbox was BCCed.
ReplyTo: Address to reply to (if set in the email).
RecipientSuffix: The part of the recipient email after the + symbol (e.g., in [email protected], the suffix is alias). This is particularly useful if you forward emails from different sources and want to know which source sent what email.
OriginalRecipient: The original recipient before any forwarding to Parseur. Note: This only works after you set up automatic forwarding of your emails; otherwise, it is equal to Recipient.
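
To make the RecipientSuffix description concrete, here is a minimal Python sketch of conventional plus addressing: it returns whatever follows the + symbol in the part of the address before the @. It is an illustration only, not Parseur's implementation; the function name and the example.com addresses are hypothetical.

```python
def recipient_suffix(address: str) -> str:
    """Return the text after the first '+' in the local part, or '' if none."""
    # Keep only the part before the '@' (the local part).
    local_part = address.split("@", 1)[0]
    # partition() yields ('mailbox', '+', 'alias') when a '+' is present.
    _, plus, suffix = local_part.partition("+")
    return suffix if plus else ""


# Hypothetical addresses, used only for illustration.
print(recipient_suffix("mailbox+alias@example.com"))
print(recipient_suffix("mailbox@example.com"))
```

A suffix like this can act as a routing key when several sources forward to the same mailbox.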

Document Content Metadata:

Subject: Subject of the email or filename of the document.
Content: The best-available content of the document (HTML if present, otherwise text).
PageCount: The number of pages in a document.
HtmlDocument: HTML content of the email or document (might be empty if TextDocument is present).
TextDocument: Text-only content of the document (might be empty if HtmlDocument is present).
OriginalDocument: An object containing the filename, content type, size, and download URL.
SearchablePDF: An object similar to OriginalDocument pointing to a cleaned-up version of your scanned PDF, with properly rotated pages and updated text from OCR, making it easier to read and search. If the PDF didn’t go through our OCR pipeline, the link points to the original document instead. If the original document wasn't a PDF, the field is empty.
LastReply: Content of the last reply in the email chain (plain text only). Note: This field is limited to English text replies without forward headers. If you don't get the result you want, try using AI with instructions instead.
Attachments: List of attached files with URLs for downloading them.
Headers: An object containing the raw email headers with technical details (e.g., Message-ID).
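
The "HTML if present, otherwise text" rule described for Content can be sketched in Python as follows. This assumes the parsed result is available as a dictionary keyed by metadata field name, which is an assumption about how you export the data; the function name is hypothetical.

```python
def best_available_content(document: dict) -> str:
    """Mimic the Content rule: prefer HTML, fall back to plain text.

    Assumption: `document` is a dict keyed by metadata field name.
    """
    html = document.get("HtmlDocument") or ""
    text = document.get("TextDocument") or ""
    return html if html else text


# Either field may be empty, so the fallback covers both cases.
print(best_available_content({"HtmlDocument": "", "TextDocument": "Hello"}))
```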

Parseur-Specific Metadata:

DocumentID: A unique identifier for the document.
ParentID: The ID of the parent document (e.g., attachments’ parent would be the email’s DocumentID).
DocumentURL: A direct link to view the document in the Parseur app (requires authentication).
PublicDocumentURL: A public link to share the document (no authentication required, so make sure to keep the link private).
CreditCount: The number of credits consumed by the document.
SplitPageRange: For split documents, the range of pages taken from the original file.
Template: The name of the template used to parse the document.
ParsingEngine: The type of parsing engine used on the document.
- AI means the AI engine was used
- TEMPLATE_TEXT means the text template engine was used
- TEMPLATE_OCR means zonal OCR template parsing was used
- METADATA means only metadata was extracted from the document
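
If you handle parsed results in your own code, the ParsingEngine values listed above can be mapped to readable descriptions. The Python sketch below assumes the result is a dictionary keyed by metadata field name; the function name and the fallback message for unlisted values are hypothetical.

```python
# The four values documented above, with their meanings.
ENGINE_DESCRIPTIONS = {
    "AI": "the AI engine was used",
    "TEMPLATE_TEXT": "the text template engine was used",
    "TEMPLATE_OCR": "zonal OCR template parsing was used",
    "METADATA": "only metadata was extracted",
}


def describe_engine(document: dict) -> str:
    """Return a readable description of the document's ParsingEngine value."""
    engine = document.get("ParsingEngine", "")
    # Values outside the documented list get a generic fallback message.
    return ENGINE_DESCRIPTIONS.get(engine, f"unknown engine: {engine!r}")


print(describe_engine({"ParsingEngine": "TEMPLATE_OCR"}))
```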

Frequently Asked Questions (FAQ)

How can I extract the file name of a document?

Option 1: If you only need the file name, enable the Subject metadata field.
Option 2: If you want the file name along with the document’s URL, use the OriginalDocument metadata field.
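
The two options can be combined in your own code, as in the Python sketch below. It assumes the parsed result is a dictionary keyed by metadata field name, and the "name" and "url" keys inside OriginalDocument are hypothetical placeholders; check an actual parsed result for the real key names.

```python
def document_filename(document: dict) -> str:
    """Prefer the file name from OriginalDocument, else fall back to Subject."""
    original = document.get("OriginalDocument") or {}
    # "name" is a hypothetical key; the real key may differ.
    return original.get("name") or document.get("Subject", "")


example = {
    "Subject": "invoice.pdf",
    # Hypothetical object shape, for illustration only.
    "OriginalDocument": {"name": "invoice.pdf", "url": "https://example.com/invoice.pdf"},
}
print(document_filename(example))
```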

Can I reference the metadata fields while using the AI engine?

You can reference the following metadata fields in your AI field instructions, in addition to the document content:

Sender
SenderName
Recipient
RecipientSuffix
Subject
PageCount

This is useful if you want to tailor field extraction based on metadata, for example, varying behavior depending on the email sender. The best way to help the AI understand you're referring to a metadata field is to prefix the field name with "metadata:", for example: