This article describes how to control your Parseur mailboxes via the native REST API, served over HTTPS.

Does Parseur have an API?

Yes, we do! This article details the main API to manage your Parseur account.

You may also want to check:

Our Document Sending API to send documents to Parseur
Our Webhook API to export data from Parseur to your apps.

Parseur API Authentication

The Parseur API uses token-based authentication.

You will find your API Token Key in your Account Overview.

For clients to authenticate, the token key should be included in the Authorization HTTP header. The key should be prefixed by the string literal "Token", with white space separating the two strings. For example:

Authorization: Token 1234d45678c90bcf1234fe123ddae4aabbc6abcd

Unauthenticated requests are denied with an HTTP 403 response that includes an appropriate WWW-Authenticate header. For example:

WWW-Authenticate: Token

The curl command-line tool may be useful for testing token-authenticated APIs. For example:

curl -X GET https://api.parseur.com/ -H "Authorization: Token <enter-your-token-here>" --compressed

Don't forget to specify the --compressed flag to take advantage of the on-the-fly compression of responses. Otherwise, some requests may transfer gigabytes of uncompressed data, which will be slow.
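The same authenticated call can be prepared from Python. This is a minimal sketch using only the standard library; build_request is a hypothetical helper name, and the Accept-Encoding header is an assumption about how to request the compression that curl enables with --compressed.

```python
import urllib.request

API_BASE = "https://api.parseur.com"


def build_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) an authenticated request for the given API path."""
    request = urllib.request.Request(API_BASE + path, method=method)
    # The key is prefixed by the literal "Token" and a single space.
    request.add_header("Authorization", f"Token {token}")
    # Assumption: asking for gzip is how a non-curl client opts into compression.
    request.add_header("Accept-Encoding", "gzip")
    return request


request = build_request("/parser", "<enter-your-token-here>")
# To send it: urllib.request.urlopen(request). urllib does not decompress
# automatically, so gunzip the body when the response says Content-Encoding: gzip.
```

Building the request separately from sending it makes the headers easy to inspect before any network traffic happens.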

Manage Mailboxes

List mailboxes

Mailbox objects are called parsers in the API.

To list all your mailboxes, make a GET request on /parser. The response is paginated.

Supported sorting keys (see the section below for more information):

name
document_count
template_count
PARSEDOK_count (number of documents processed)
PARSEDKO_count (number of documents not processed)
QUOTAEXC_count (number of documents in quota exceeded status)
EXPORTKO_count (number of documents in export failed status)

The search query parameter searches the following properties:

mailbox name
mailbox email prefix

Create a mailbox

To create a mailbox, make a POST request on /parser passing the following keys:

name
email_prefix (optional; if not present, it is derived from the name key)
process_attachments: true / false (optional, defaults to true)
ai_engine: GCP_AI_1 / none (optional, defaults to none, which disables AI parsing completely for this mailbox)
master_parser_slug (optional): set it if you want your mailbox to use one of our sets of ready-made templates
parser_object_set (optional): the list of fields you want to create (if you set a master_parser_slug, Parseur will create a default field set for you)

Possible values for master_parser_slug are automotive, contact-list, delivery-notes, event-ticketing, financial-statement, food-delivery, invoices, job-application, job-search, leads, property-bookings, real-estate, resume-cv, search-alerts, statements, transactions, travel, utility, work-order

Example for parser_object_set:

[
  { "name": "MyField", "format": "TEXT", "query": "MyField is usually found at the top of the document and looks like 123-AA-456" },
  { "name": "MyAddress", "format": "ADDRESS", "query": "The address is on the top-left corner of the document" },
  { "name": "MyTableField", "format": "TABLE", "query": "This table is found just after the 'Financial Results' header",
    "parser_object_set": [
      { "name": "MyTableColumn", "format": "TEXT", "type": "FIELD" }
      // ... more columns here ...
    ]
  }
 // ... more fields here ...
]

Available field formats are: ADDRESS, DATE, DATETIME, LINK (linked document, will download the page behind the link and create a new document), NAME (person's name), NUMBER, ONELINE (single line text), TABLE, TEXT (multi line text), TIME

Instructions on how to extract each field can be read and updated through the field's query property.
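The keys above can be assembled into a request body as follows. This is only a sketch: make_field is a hypothetical helper, the mailbox name is made up, and sending the payload as a JSON body is an assumption, since the article does not state the expected content type.

```python
import json

# Field formats listed in this article.
FIELD_FORMATS = {
    "ADDRESS", "DATE", "DATETIME", "LINK", "NAME",
    "NUMBER", "ONELINE", "TABLE", "TEXT", "TIME",
}


def make_field(name, fmt, query=None, columns=None):
    """Return one parser_object_set entry (make_field is a hypothetical helper)."""
    if fmt not in FIELD_FORMATS:
        raise ValueError(f"unknown field format: {fmt}")
    entry = {"name": name, "format": fmt}
    if query is not None:
        entry["query"] = query
    if columns is not None:
        # Table columns are nested under their own parser_object_set key.
        entry["parser_object_set"] = columns
    return entry


payload = {
    "name": "My mailbox",
    "ai_engine": "GCP_AI_1",
    "parser_object_set": [
        make_field("MyField", "TEXT", "MyField is usually found at the top of the document"),
        make_field(
            "MyTableField", "TABLE",
            "This table is found just after the 'Financial Results' header",
            columns=[{"name": "MyTableColumn", "format": "TEXT", "type": "FIELD"}],
        ),
    ],
}

# Assumption: the API accepts this payload as a JSON request body on POST /parser.
body = json.dumps(payload).encode("utf-8")
```

Checking the format locally catches typos such as "STRING" before the request is made.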

Retrieve a mailbox

You can retrieve a mailbox with a GET request on /parser/:mailbox_id

Update a mailbox

You can update a mailbox with a PUT or POST request on /parser/:mailbox_id

To delete a field, set a _destroy property with true as value.
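As a rough illustration of the _destroy flag, the sketch below marks one field for deletion in a mailbox payload before it is sent back. It assumes, without confirmation from this article, that the payload you update carries its fields under parser_object_set (as in mailbox creation) and that _destroy is set on the field entry itself; mark_field_for_deletion is a hypothetical helper.

```python
import copy


def mark_field_for_deletion(mailbox: dict, field_name: str) -> dict:
    """Return a copy of the mailbox payload with one field flagged for deletion."""
    updated = copy.deepcopy(mailbox)  # leave the caller's payload untouched
    for entry in updated.get("parser_object_set", []):
        if entry.get("name") == field_name:
            # Assumption: the flag belongs on the field entry itself.
            entry["_destroy"] = True
    return updated


mailbox = {
    "name": "My mailbox",
    "parser_object_set": [
        {"name": "MyField", "format": "TEXT"},
        {"name": "MyAddress", "format": "ADDRESS"},
    ],
}
updated = mark_field_for_deletion(mailbox, "MyAddress")
```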

Copy a mailbox

You can copy (duplicate) an existing mailbox with a POST on /parser/:mailbox_id/copy

Delete a mailbox

You can delete a mailbox with a DELETE request on /parser/:mailbox_id

Get the field structure (schema) of a mailbox

You can get the mailbox schema with a GET request on /parser/:mailbox_id/schema.

Useful if you're planning to create a connector for Parseur.

Manage Documents in a mailbox

Send documents

To send a document via the API, check out our Document Sending API article.

List documents

You can list your documents in a given mailbox with a GET request on /parser/:mailbox_id/document_set. The response is paginated.

Supported sorting keys (see the section below for more information):

name
created (default - received date)
modified (processed date)
status

The search query parameter will search in the following properties:

document id (exact match)
document name
template name
from, to, cc and bcc email addresses
document metadata header

Filter documents by date with keys:

received_after=yyyy-mm-dd
received_before=yyyy-mm-dd
tz=timezone (optional, example: Asia%2FSingapore) to filter dates in the given timezone. If not present, the timezone is set to UTC.
You can use either one or both of the date filters in a query.

Get the parsed result for each document:

This endpoint no longer returns the results by default.
Add the query parameter with_result=true to get the result string with each document.
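The date filters and the with_result flag combine into a single query string. A minimal sketch, where document_list_path is a hypothetical helper and the mailbox ID 123 is made up:

```python
from datetime import date
from urllib.parse import urlencode


def document_list_path(mailbox_id, received_after=None, received_before=None,
                       tz=None, with_result=False):
    """Build the path and query string for listing a mailbox's documents."""
    params = {}
    if received_after is not None:
        params["received_after"] = received_after.isoformat()  # yyyy-mm-dd
    if received_before is not None:
        params["received_before"] = received_before.isoformat()
    if tz is not None:
        params["tz"] = tz  # urlencode escapes the slash, e.g. Asia%2FSingapore
    if with_result:
        params["with_result"] = "true"
    path = f"/parser/{mailbox_id}/document_set"
    return f"{path}?{urlencode(params)}" if params else path


path = document_list_path(123, received_after=date(2023, 9, 1),
                          tz="Asia/Singapore", with_result=True)
```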

Retrieve a document

You can retrieve a document and its parsed results with a GET request on /document/:document_id

Update a document

You cannot update a document.

Reprocess a document

You can reprocess (parse) a document with a POST on /document/:document_id/process

Skip a document

You can set the Skipped status on a document with a POST on /document/:document_id/skip

Copy a document

You can copy a document to another mailbox with a POST on /document/:document_id/copy/:target_mailbox_id

Retrieve the logs for a document

You can access the activity logs of a document with a GET on /document/:document_id/log_set. Logs are paginated.

Delete a document

You can delete a document with a DELETE request on /document/:document_id
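The document endpoints above differ only in HTTP method and path, so a client can keep them in one lookup table. This is a sketch; the action names and the document_action helper are hypothetical, while the methods and paths come from this section.

```python
# Action name -> (HTTP method, path template), as described in this section.
DOCUMENT_ACTIONS = {
    "retrieve": ("GET", "/document/{document_id}"),
    "reprocess": ("POST", "/document/{document_id}/process"),
    "skip": ("POST", "/document/{document_id}/skip"),
    "copy": ("POST", "/document/{document_id}/copy/{target_mailbox_id}"),
    "logs": ("GET", "/document/{document_id}/log_set"),
    "delete": ("DELETE", "/document/{document_id}"),
}


def document_action(action: str, **ids) -> tuple:
    """Return the (method, path) pair for a document action."""
    method, template = DOCUMENT_ACTIONS[action]
    # A missing ID raises KeyError here instead of producing a broken URL.
    return method, template.format(**ids)


method, path = document_action("copy", document_id=42, target_mailbox_id=7)
```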

Manage Templates in a mailbox

List templates

You can list your templates in a given mailbox with a GET request on /parser/:mailbox_id/template_set. The response is paginated.

Supported sorting keys (see the section below for more information):

name
created (creation date)
modified (default: last template update time or last time template was used)
last_activity (last time template was used)
status
document_count (number of documents matched by the template)

The search query parameter searches the following properties:

template name

Create a template

You need to use the template editor to create and update templates.

Retrieve a template

You can retrieve a template with a GET request on /template/:template_id

Copy a template

You can copy a template with a POST on /template/:template_id/copy/:target_mailbox_id

Delete a template

You can delete a template with a DELETE request on /template/:template_id

Manage Webhooks in a mailbox

List webhooks

You can list your webhooks in a given mailbox with a GET request on /parser/:mailbox_id

Enabled webhooks are under the webhook_set key
Paused webhooks are under the available_webhook_set key.

Create a webhook

You can create a new webhook with a POST request on /webhook passing the following keys:

event: must be one of document.processed, document.processed.flattened, document.template_needed or table.processed (see our webhook reference article for more information)
target: URL to send the data to, e.g. https://api.example.com/parseur
category: must be set to CUSTOM
parser: ID of the mailbox you want to add the webhook to, in numerical format
name: Custom name for the webhook. If omitted, it will use the target URL instead. Optional
headers: JSON object containing the HTTP headers you want to send along with the result data. Optional
parser_field: ID of a field or a table field you want the webhook to react to, in the "PF12345" format
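A payload for POST /webhook could be put together like this. It is a sketch under assumptions: webhook_payload is a hypothetical helper, the mailbox ID is made up, and parser_field is left out unless provided, because the article does not say whether it is required.

```python
# Events accepted by the event key, as listed above.
WEBHOOK_EVENTS = {
    "document.processed",
    "document.processed.flattened",
    "document.template_needed",
    "table.processed",
}


def webhook_payload(event, target, parser, name=None, headers=None, parser_field=None):
    """Build the body for creating a webhook on a mailbox."""
    if event not in WEBHOOK_EVENTS:
        raise ValueError(f"unsupported event: {event}")
    payload = {
        "event": event,
        "target": target,
        "category": "CUSTOM",  # must always be CUSTOM
        "parser": int(parser),  # mailbox ID in numerical format
    }
    if name is not None:
        payload["name"] = name
    if headers is not None:
        payload["headers"] = headers
    if parser_field is not None:
        payload["parser_field"] = parser_field  # e.g. "PF12345"
    return payload


payload = webhook_payload(
    "document.processed",
    "https://api.example.com/parseur",
    parser=1234,
    headers={"X-Example": "value"},
)
```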

Enable a webhook

You can enable an existing webhook for a given mailbox with a POST request on /parser/:mailbox_id/webhook_set/:webhook_id

Pause a webhook

You can pause an existing webhook for a given mailbox with a DELETE request on /parser/:mailbox_id/webhook_set/:webhook_id
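Enabling and pausing target the same URL and differ only in the HTTP method, which the following sketch makes explicit. The webhook_toggle helper and the IDs are hypothetical.

```python
def webhook_toggle(mailbox_id, webhook_id, enabled: bool) -> tuple:
    """Return the (method, path) pair that enables or pauses a webhook."""
    # POST enables the webhook for the mailbox, DELETE pauses it.
    method = "POST" if enabled else "DELETE"
    return method, f"/parser/{mailbox_id}/webhook_set/{webhook_id}"


enable = webhook_toggle(1234, 56, enabled=True)
pause = webhook_toggle(1234, 56, enabled=False)
```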

Getting parsed data

Using webhooks

Parseur can send parsed data in real time to your server via its Webhook feature.
Check out the webhook article to learn more.

Using download URLs

Using webhooks is the recommended way to get your data back to your servers. If that is not possible (for example, if you are not able to create a URL endpoint that listens for the data, or if your organization's security team doesn't allow you to open your firewall), you can use the download URLs provided when you retrieve a mailbox.

In a parser mailbox payload, you will find the following attributes:

csv_download. Download the data as a CSV. Example: /parser/<secret>/download/my.mailbox.csv
json_download. Download the data as JSON. Example: /parser/<secret>/download/my.mailbox.json
xls_download. Download the data as an XLSX. Example: /parser/<secret>/download/my.mailbox.xlsx

You can quickly capture these download URLs for later use on the Download / Export page by right-clicking the download and copying the link.

Filtering: You can filter the parsed data in the same way you do it in the app:

Add last_document_only=true HTTP query parameter to only retrieve the data of the last processed document
Add date=yyyy HTTP query parameter to retrieve data from year yyyy (e.g. date=2023)
Add date=yyyy-mm to retrieve data from year yyyy and month mm (e.g. date=2023-09)
Add date=yyyy-mm-dd to retrieve data from year yyyy, month mm and day dd (e.g. date=2023-09-05)

Notes:

You need to prefix those path names with https://api.parseur.com to get the full URL
- i.e. for CSVs, the path would be https://api.parseur.com/parser/<secret>/download/my.mailbox.csv
You don't need to add authentication headers to get the data. So make sure you keep those URLs private (for example, save the secret key as an environment variable and don't commit it to your code repository)
Date filtering is done based on UTC timezone
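The notes and filters above can be combined into a small URL builder. This is a sketch: download_url is a hypothetical helper, SECRET stands in for the real secret (which should come from private configuration), and the article does not say whether date and last_document_only may be combined.

```python
import re
from urllib.parse import urlencode

API_BASE = "https://api.parseur.com"
# Accepts yyyy, yyyy-mm or yyyy-mm-dd.
DATE_FILTER = re.compile(r"\d{4}(-\d{2}){0,2}")


def download_url(download_path: str, date: str = None, last_document_only: bool = False) -> str:
    """Turn a csv_download / json_download / xls_download path into a full URL."""
    params = {}
    if date is not None:
        if not DATE_FILTER.fullmatch(date):
            raise ValueError("date must look like yyyy, yyyy-mm or yyyy-mm-dd")
        params["date"] = date
    if last_document_only:
        params["last_document_only"] = "true"
    url = API_BASE + download_path
    return f"{url}?{urlencode(params)}" if params else url


url = download_url("/parser/SECRET/download/my.mailbox.csv", date="2023-09")
```

No Authorization header is needed for these URLs, so treat the resulting string itself as a credential.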

Optional HTTP Query parameters

The following query parameters can be mixed and matched.

Pagination

All GET requests that return a list of documents, templates, or mailboxes support pagination: append a page query parameter to the URL. The default page size is 25. You can change the page size using the page_size query parameter.

For example: /parser?page=2&page_size=50 will list the second page of your mailboxes, each page containing 50 records.
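Walking through every page can be sketched as a generator. Because this article does not describe the response envelope, the sketch takes a caller-supplied fetch_page(page, page_size) function (hypothetical) that returns one page of records as a list, and it assumes a page shorter than page_size marks the end. If the API rejects a page number past the last page, fetch_page should return an empty list in that case.

```python
from typing import Callable, Iterator, List


def iter_records(fetch_page: Callable[[int, int], List[dict]],
                 page_size: int = 25) -> Iterator[dict]:
    """Yield every record by requesting page=1, 2, 3... until a short page."""
    page = 1
    while True:
        records = fetch_page(page, page_size)
        yield from records
        if len(records) < page_size:
            break  # assumption: a short or empty page is the last one
        page += 1


# Stand-in for real HTTP calls: 60 fake records served 25 at a time.
fake_records = [{"id": n} for n in range(60)]
requested_pages = []


def fake_fetch(page: int, page_size: int) -> List[dict]:
    requested_pages.append(page)
    start = (page - 1) * page_size
    return fake_records[start:start + page_size]


all_records = list(iter_records(fake_fetch, page_size=25))
```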

Searching

Some endpoints support searching via the search query parameter. The search value needs to be URL encoded.

For example, /parser?search=test%20mailbox will search for mailbox names containing "test mailbox"

Unless stated otherwise, search is not case sensitive and will retrieve all entities that partially match the search string. For example, a mailbox search for foo will return mailboxes named test.foo and FOO Mailbox 123.

Sorting

Some endpoints support sorting via the ordering query parameter.

to sort a list ascending on the foo key, use ?ordering=foo
to sort a list descending on the foo key, use ?ordering=-foo

For example, /parser?ordering=-document_count will list your mailboxes starting with the one with the most documents.
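Since these query parameters can be mixed and matched, one helper can build the whole query string. A minimal sketch in which list_path is a hypothetical name; quote is used so that spaces are encoded as %20, as in the search example above.

```python
from urllib.parse import quote, urlencode


def list_path(path, search=None, ordering=None, descending=False,
              page=None, page_size=None):
    """Append search, ordering and pagination parameters to a list endpoint."""
    params = {}
    if search is not None:
        params["search"] = search
    if ordering is not None:
        # A leading "-" sorts descending on the given key.
        params["ordering"] = ("-" if descending else "") + ordering
    if page is not None:
        params["page"] = page
    if page_size is not None:
        params["page_size"] = page_size
    if not params:
        return path
    return f"{path}?{urlencode(params, quote_via=quote)}"


busiest_first = list_path("/parser", ordering="document_count", descending=True)
searched = list_path("/parser", search="test mailbox", page=2, page_size=50)
```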

Statuses

Below is the complete list of values the API uses to indicate the current status of the requested document:

INCOMING - the file has been received by our system
ANALYZING - the file is being analyzed against our system's import parameters and the user's mailbox settings.
PROGRESS - the file is currently being processed by the active AI engine for that mailbox
PARSEDOK - the file has been processed and data is available for export
PARSEDKO - the processing for this file failed
QUOTAEXC - processing for this file was stopped because the user does not have enough credits to process it
SKIPPED - processing for this file was skipped because of a template
EXPORTKO - exporting for this file failed
TRANSKO - post-processing for this file failed
INVALID - the imported file is not supported by our system
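Client code may find it convenient to model these statuses as an enumeration. The sketch below does that; the grouping into pending, done, stopped and failed is one reading of the descriptions above, not something the API defines, and classify is a hypothetical helper.

```python
from enum import Enum


class DocumentStatus(str, Enum):
    INCOMING = "INCOMING"
    ANALYZING = "ANALYZING"
    PROGRESS = "PROGRESS"
    PARSEDOK = "PARSEDOK"
    PARSEDKO = "PARSEDKO"
    QUOTAEXC = "QUOTAEXC"
    SKIPPED = "SKIPPED"
    EXPORTKO = "EXPORTKO"
    TRANSKO = "TRANSKO"
    INVALID = "INVALID"


# Assumption: these statuses are transient, so checking again later makes sense.
PENDING = {DocumentStatus.INCOMING, DocumentStatus.ANALYZING, DocumentStatus.PROGRESS}
# Statuses described as a failure or an unsupported file.
FAILED = {DocumentStatus.PARSEDKO, DocumentStatus.EXPORTKO,
          DocumentStatus.TRANSKO, DocumentStatus.INVALID}
# Statuses where processing was stopped or skipped rather than failed.
STOPPED = {DocumentStatus.QUOTAEXC, DocumentStatus.SKIPPED}


def classify(raw_status: str) -> str:
    """Map a status string to a coarse group; unknown values raise ValueError."""
    status = DocumentStatus(raw_status)
    if status in PENDING:
        return "pending"
    if status in FAILED:
        return "failed"
    if status in STOPPED:
        return "stopped"
    return "done"
```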

API rate limits

Requests to the /login and /signup endpoints are strictly rate-limited.

Sending requests to other endpoints is limited to 5 requests per second per IP, with an initial burst allowance of 50 requests.

Requests that go over the rate limit will return a 429 error code. We can accommodate higher rate limits as part of our Enterprise plan. Contact us to discuss.
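A client can stay under this limit by pacing itself. The sketch below models the stated numbers (5 requests per second, burst of 50) as a token bucket; whether the server enforces its limit this way is an assumption, and TokenBucket is a hypothetical class, not part of any Parseur library.

```python
import time


class TokenBucket:
    """Client-side pacing: `burst` immediate requests, then `rate` per second."""

    def __init__(self, rate: float = 5.0, burst: int = 50, clock=time.monotonic):
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.updated = clock()

    def wait_time(self) -> float:
        """Reserve one request slot and return how long to sleep before sending."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        self.tokens -= 1
        if self.tokens >= 0:
            return 0.0
        # A negative balance means the slot is in the future.
        return -self.tokens / self.rate


bucket = TokenBucket()
# Before each request: time.sleep(bucket.wait_time())
```

If a 429 still comes back (for example because several clients share one IP), waiting and retrying the request is a reasonable fallback.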

Do more with the API

This article just lists the most common use cases for our API. There is more you can do; feel free to ask us for more details!

Extract metadata from emails and documents with Metadata fields

How to Stop Parseur from Parsing Certain Documents

Send documents to Parseur using the API

Send parsed data using Webhooks

Get notifications for failed document processing
