This article describes how to control your mailboxes via Parseur's native REST API over HTTPS.
Does Parseur have an API?
Yes, we do! This article details the main API for managing your Parseur account.
You may also want to check:
Our Document Sending API to send documents to Parseur
Our Webhook API to export data from Parseur to your apps.
Parseur API Authentication
The Parseur API uses token-based authentication.
You will find your API Token Key in your Account Overview.
For clients to authenticate, the token key should be included in the Authorization HTTP header. The key should be prefixed by the string literal "Token", with white space separating the two strings. For example:
Authorization: Token 1234d45678c90bcf1234fe123ddae4aabbc6abcd
Unauthenticated requests that are denied permission will result in an HTTP 403 response with an appropriate WWW-Authenticate header. For example:
WWW-Authenticate: Token
The curl command-line tool may be useful for testing token-authenticated APIs. For example:
curl -X GET https://api.parseur.com/ -H "Authorization: Token <enter-your-token-here>" --compressed
Don't forget to specify the --compressed flag to take advantage of on-the-fly compression of responses. Otherwise, some requests may transfer gigabytes of uncompressed data, which will be slow.
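The same authenticated call can be sketched in Python with the standard library. This is a minimal illustration, not an official client: the helper names build_request and decode_body are made up for this example, and it assumes the server honors a standard Accept-Encoding: gzip header in the same way it honors curl's --compressed flag.

```python
import gzip
import urllib.request
from typing import Optional

API_BASE = "https://api.parseur.com"


def build_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a token-authenticated request."""
    return urllib.request.Request(
        API_BASE + path,
        method=method,
        headers={
            # The key is prefixed by the literal "Token" and a single space.
            "Authorization": f"Token {token}",
            # Assumption: requesting gzip mirrors what curl --compressed asks for.
            "Accept-Encoding": "gzip",
        },
    )


def decode_body(raw: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo gzip compression; urllib does not decompress responses by itself."""
    return gzip.decompress(raw) if content_encoding == "gzip" else raw


req = build_request("/parser", "1234d45678c90bcf1234fe123ddae4aabbc6abcd")
print(req.get_method(), req.full_url)

# To actually send it:
# with urllib.request.urlopen(req) as resp:
#     body = decode_body(resp.read(), resp.headers.get("Content-Encoding"))
```

Nothing is sent until urlopen is called, so the request can be inspected safely before going over the network.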
Manage Mailboxes
List mailboxes
Mailbox objects are called parsers in the API.
To list all your mailboxes, make a GET request on /parser. The response is paginated.
Supported sorting keys (see the section below for more information):
name
document_count
template_count
PARSEDOK_count (number of documents processed)
PARSEDKO_count (number of documents not processed)
QUOTAEXC_count (number of documents in quota exceeded status)
EXPORTKO_count (number of documents in export failed status)
The search query parameter will search in the following properties:
mailbox name
mailbox email prefix
Create a mailbox
To create a mailbox, make a POST request on /parser passing the following keys:
name
email_prefix (optional; if not present, it will be derived from the name key)
process_attachments: true/false (optional, defaults to true)
ai_engine: GCP_AI_1/none (optional, defaults to none, which disables AI parsing completely for this mailbox)
master_parser_slug: optional, set it if you want your mailbox to use one of our sets of ready-made templates.
parser_object_set: optional, the list of fields you want to create (if you set a master_parser_slug, Parseur will create a default field set for you)
Possible values for master_parser_slug are automotive, contact-list, delivery-notes, event-ticketing, financial-statement, food-delivery, invoices, job-application, job-search, leads, property-bookings, real-estate, resume-cv, search-alerts, statements, transactions, travel, utility, work-order.
Example for parser_object_set:
[
  { "name": "MyField", "format": "TEXT" },
  { "name": "MyAddress", "format": "ADDRESS" },
  { "name": "MyTableField", "format": "TABLE",
    "parser_object_set": [
      { "name": "MyTableColumn", "format": "TEXT", "type": "FIELD" }
      // ... more columns here ...
    ]
  }
  // ... more fields here ...
]
Available field formats are: ADDRESS, DATE, DATETIME, LINK (linked document, will download the page behind the link and create a new document), NAME (person's name), NUMBER, ONELINE (single line text), TABLE, TEXT (multi line text), TIME.
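As an illustration of the keys above, here is a minimal Python sketch that assembles a mailbox creation request. It assumes the endpoint accepts a JSON body sent with a Content-Type: application/json header, which is not stated above; the function name create_mailbox_request and the example field names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional

API_BASE = "https://api.parseur.com"


def create_mailbox_request(
    token: str,
    name: str,
    fields: Optional[List[Dict[str, Any]]] = None,
    **options: Any,
) -> urllib.request.Request:
    """Build (but do not send) a POST /parser request.

    `options` carries the optional keys listed above, e.g. email_prefix,
    process_attachments, ai_engine or master_parser_slug.
    """
    payload: Dict[str, Any] = {"name": name, **options}
    if fields is not None:
        payload["parser_object_set"] = fields
    return urllib.request.Request(
        API_BASE + "/parser",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            # Assumption: the API accepts JSON request bodies.
            "Content-Type": "application/json",
        },
    )


req = create_mailbox_request(
    "<enter-your-token-here>",
    "Invoices EU",
    fields=[
        {"name": "Total", "format": "NUMBER"},
        {"name": "Lines", "format": "TABLE", "parser_object_set": [
            {"name": "Description", "format": "TEXT", "type": "FIELD"},
        ]},
    ],
    process_attachments=False,
)
print(req.data.decode("utf-8"))
```

Passing the optional keys as keyword arguments keeps the payload limited to what the caller sets, so the server-side defaults described above still apply to everything else.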
Retrieve a mailbox
You can retrieve a mailbox with a GET request on /parser/:mailbox_id.
Update a mailbox
You can update a mailbox with a PUT or POST request on /parser/:mailbox_id.
Copy a mailbox
You can copy (duplicate) an existing mailbox with a POST on /parser/:mailbox_id/copy.
Delete a mailbox
You can delete a mailbox with a DELETE request on /parser/:mailbox_id.
Get the field structure (schema) of a mailbox
You can get the mailbox schema with a GET request on /parser/:mailbox_id/schema. This is useful if you're planning to create a connector for Parseur.
Manage Documents in a mailbox
Send documents
To send a document via the API, check out our Document Sending API article.
List documents
You can list your documents in a given mailbox with a GET request on /parser/:mailbox_id/document_set. The response is paginated.
Supported sorting keys (see the section below for more information):
name
created (default - received date)
modified (processed date)
status
The search query parameter will search in the following properties:
document id (exact match)
document name
template name
from, to, cc and bcc email addresses
document metadata header
Filter documents by date with keys:
received_after=yyyy-mm-dd
received_before=yyyy-mm-dd
tz=timezone (optional, example: Asia%2FSingapore) to filter dates in the given timezone. If not present, timezone will be set to UTC.
You can use either one or both of the date filters in a query.
Get the parsed result for each document:
This endpoint no longer returns the results by default.
Add the query parameter with_result=true to get the result string with each document.
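The date filters and the with_result flag described above can be combined in one query string. The following Python sketch only builds the URL; the function name document_list_url is hypothetical, and the mailbox ID 123 is a placeholder.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.parseur.com"


def document_list_url(
    mailbox_id: int,
    received_after: Optional[date] = None,
    received_before: Optional[date] = None,
    tz: Optional[str] = None,
    with_result: bool = False,
) -> str:
    """Build the URL that lists the documents of a mailbox."""
    params = {}
    if received_after is not None:
        params["received_after"] = received_after.isoformat()  # yyyy-mm-dd
    if received_before is not None:
        params["received_before"] = received_before.isoformat()
    if tz is not None:
        params["tz"] = tz  # urlencode turns "/" into %2F
    if with_result:
        params["with_result"] = "true"
    query = urlencode(params)
    url = f"{API_BASE}/parser/{mailbox_id}/document_set"
    return f"{url}?{query}" if query else url


print(document_list_url(123, received_after=date(2023, 9, 1),
                        tz="Asia/Singapore", with_result=True))
```

Using datetime.date values rather than raw strings guarantees the yyyy-mm-dd format expected by the filters.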
Retrieve a document
You can retrieve a document and its parsed results with a GET request on /document/:document_id.
Update a document
You cannot update a document.
Reprocess a document
You can reprocess (parse) a document with a POST on /document/:document_id/process.
Skip a document
You can set the Skipped status on a document with a POST on /document/:document_id/skip.
Copy a document
You can copy a document to another mailbox with a POST on /document/:document_id/copy/:target_mailbox_id.
Retrieve the logs for a document
You can access the activity logs of a document with a GET on /document/:document_id/log_set. Logs are paginated.
Delete a document
You can delete a document with a DELETE request on /document/:document_id.
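The document endpoints above follow a regular method-plus-path pattern, which can be captured in a small lookup table. This is a sketch under assumptions: the action names and the document_request helper are invented for this example, and the IDs are placeholders.

```python
import urllib.request
from typing import Optional

API_BASE = "https://api.parseur.com"

# Action name (hypothetical label) -> (HTTP method, path template).
DOCUMENT_ACTIONS = {
    "retrieve": ("GET", "/document/{document_id}"),
    "reprocess": ("POST", "/document/{document_id}/process"),
    "skip": ("POST", "/document/{document_id}/skip"),
    "copy": ("POST", "/document/{document_id}/copy/{target_mailbox_id}"),
    "logs": ("GET", "/document/{document_id}/log_set"),
    "delete": ("DELETE", "/document/{document_id}"),
}


def document_request(
    token: str,
    action: str,
    document_id: str,
    target_mailbox_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) the request for one document action."""
    if action not in DOCUMENT_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if action == "copy" and target_mailbox_id is None:
        raise ValueError("copy needs a target_mailbox_id")
    method, template = DOCUMENT_ACTIONS[action]
    path = template.format(document_id=document_id,
                           target_mailbox_id=target_mailbox_id)
    return urllib.request.Request(
        API_BASE + path,
        method=method,
        headers={"Authorization": f"Token {token}"},
    )


req = document_request("<enter-your-token-here>", "copy", "1234",
                       target_mailbox_id="5678")
print(req.get_method(), req.full_url)
```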
Manage Templates in a mailbox
List templates
You can list your templates in a given mailbox with a GET request on /parser/:mailbox_id/template_set. The response is paginated.
Supported sorting keys (see the section below for more information):
name
created (creation date)
modified (default: last template update time or last time template was used)
last_activity (last time template was used)
status
document_count (number of documents matched by the template)
The search query parameter will search in the following properties:
template name
Create a template
Retrieve a template
You can retrieve a template with a GET request on /template/:template_id.
Copy a template
You can copy a template with a POST on /template/:template_id/copy/:target_mailbox_id.
Delete a template
You can delete a template with a DELETE request on /template/:template_id.
Manage Webhooks in a mailbox
List webhooks
You can list your webhooks in a given mailbox with a GET request on /parser/:mailbox_id.
Enabled webhooks are under the webhook_set key.
Paused webhooks are under the available_webhook_set key.
Create a webhook
You can create a new webhook with a POST request on /webhook passing the following keys:
event: must be one of document.processed, document.processed.flattened, document.template_needed or table.processed (see our webhook reference article for more information)
target: URL to send the data to, e.g. https://api.example.com/parseur
category: must be set to CUSTOM
parser: ID of the mailbox you want to add the webhook to, in numerical format
name: Custom name for the webhook. If omitted, it will use the target URL instead. Optional
headers: JSON object containing the HTTP headers you want to send along with the result data. Optional
parser_field: ID of a field or a table field you want the webhook to react to, in the "PF12345" format
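A payload built from the keys above might look like the following Python sketch. The helper name webhook_payload, the mailbox ID 12345 and the header values are made up for illustration. The sketch treats parser_field as something you pass only when needed; whether it is required for a given event is not stated above.

```python
import json
from typing import Any, Dict, Optional

ALLOWED_EVENTS = {
    "document.processed",
    "document.processed.flattened",
    "document.template_needed",
    "table.processed",
}


def webhook_payload(
    event: str,
    target: str,
    parser_id: int,
    name: Optional[str] = None,
    headers: Optional[Dict[str, str]] = None,
    parser_field: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the body for a POST /webhook request."""
    if event not in ALLOWED_EVENTS:
        raise ValueError(f"unsupported event: {event}")
    payload: Dict[str, Any] = {
        "event": event,
        "target": target,
        "category": "CUSTOM",  # the only value allowed here
        "parser": parser_id,   # numerical mailbox ID
    }
    if name is not None:
        payload["name"] = name
    if headers is not None:
        payload["headers"] = headers
    if parser_field is not None:
        payload["parser_field"] = parser_field  # e.g. "PF12345"
    return payload


payload = webhook_payload(
    "document.processed",
    "https://api.example.com/parseur",
    12345,
    headers={"X-Api-Key": "hypothetical-secret"},
)
print(json.dumps(payload, indent=2))
```

Validating the event name locally catches typos before a request is made.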
Enable a webhook
You can enable an existing webhook for a given mailbox with a POST request on /parser/:mailbox_id/webhook_set/:webhook_id.
Pause a webhook
You can pause an existing webhook for a given mailbox with a DELETE request on /parser/:mailbox_id/webhook_set/:webhook_id.
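Enabling and pausing share the same URL and differ only in the HTTP method, so a single helper can cover both. A minimal Python sketch, with a hypothetical function name and placeholder IDs:

```python
import urllib.request

API_BASE = "https://api.parseur.com"


def webhook_toggle_request(
    token: str, mailbox_id: int, webhook_id: int, enable: bool
) -> urllib.request.Request:
    """Build (but do not send) the request that enables or pauses a webhook."""
    return urllib.request.Request(
        f"{API_BASE}/parser/{mailbox_id}/webhook_set/{webhook_id}",
        # POST enables the webhook, DELETE pauses it.
        method="POST" if enable else "DELETE",
        headers={"Authorization": f"Token {token}"},
    )


pause = webhook_toggle_request("<enter-your-token-here>", 12345, 678, enable=False)
print(pause.get_method(), pause.full_url)
```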
Getting parsed data
Using webhooks
Parseur can send parsed data in real-time to your server via its Webhook feature. Check out the webhook article to learn more.
Using download URLs
Using webhooks is the recommended way to get your data back to your servers. If that is not possible (for example, if you cannot create a URL endpoint that listens for the data, or if your organization's security team doesn't allow you to open your firewall), you can use the download URLs provided when you retrieve a mailbox.
In a parser mailbox payload, you will find the following attributes:
csv_download. Download the data as a CSV. Example: /parser/<secret>/download/my.mailbox.csv
json_download. Download the data as JSON. Example: /parser/<secret>/download/my.mailbox.json
xls_download. Download the data as an XLSX. Example: /parser/<secret>/download/my.mailbox.xlsx
Filtering: You can filter the parsed data in the same way you do it in the app:
Add the last_document_only=true HTTP query parameter to only retrieve the data of the last processed document
Add the date=yyyy HTTP query parameter to retrieve data from year yyyy (e.g. date=2023)
Add date=yyyy-mm to retrieve data from year yyyy and month mm (e.g. date=2023-09)
Add date=yyyy-mm-dd to retrieve data from year yyyy, month mm and day dd (e.g. date=2023-09-05)
Notes:
You need to prefix those pathnames with https://api.parseur.com to get the full URL.
You don't need to add authentication headers to get the data. So make sure you keep those URLs private (for example, save the secret key as an environment variable and don't commit it to your code repository)
Date filtering is done based on UTC timezone
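Putting the notes above together, a full download URL is the API host, the pathname from the mailbox payload, and an optional filter. The Python sketch below is illustrative only: download_url is a hypothetical helper, the pathname is the placeholder from the examples above, and the sketch does not assume anything about whether the two filters can be combined on the server side.

```python
import re
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.parseur.com"

# Accepts yyyy, yyyy-mm or yyyy-mm-dd.
DATE_FILTER = re.compile(r"\d{4}(-\d{2}){0,2}")


def download_url(
    pathname: str,
    date: Optional[str] = None,
    last_document_only: bool = False,
) -> str:
    """Turn a csv_download / json_download / xls_download pathname into a full URL."""
    params = {}
    if last_document_only:
        params["last_document_only"] = "true"
    if date is not None:
        if not DATE_FILTER.fullmatch(date):
            raise ValueError("date must be yyyy, yyyy-mm or yyyy-mm-dd")
        params["date"] = date
    query = urlencode(params)
    url = API_BASE + pathname
    return f"{url}?{query}" if query else url


# In real code, read the pathname from a private place such as an environment
# variable, since anyone holding the URL can download the data.
print(download_url("/parser/<secret>/download/my.mailbox.csv", date="2023-09"))
```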
Optional HTTP Query parameters
The following query parameters can be mixed and matched.
Pagination
All GET requests that return a list of documents, templates, or mailboxes support pagination by appending a page query parameter to the URL. The default page size is 25. You can change the page size using the page_size query parameter.
For example, /parser?page=2&page_size=50 will list the second page of your mailboxes, each page containing 50 records.
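To walk through every page, a client can keep incrementing page until a page comes back short. The Python sketch below shows that loop. Because the shape of the paginated response is not described here, the sketch leaves fetching to a caller-supplied fetch_page function (hypothetical) that must return the list of records for a URL. It also assumes a page with fewer than page_size records is the last one; a real client may prefer whatever total or next-page indicator the response carries.

```python
from typing import Any, Callable, Iterator, List
from urllib.parse import urlencode

API_BASE = "https://api.parseur.com"


def page_url(path: str, page: int, page_size: int = 25) -> str:
    """Build the URL of one page of a list endpoint."""
    return f"{API_BASE}{path}?{urlencode({'page': page, 'page_size': page_size})}"


def iter_records(
    path: str,
    fetch_page: Callable[[str], List[Any]],
    page_size: int = 25,
) -> Iterator[Any]:
    """Yield every record of a paginated list, one page at a time."""
    page = 1
    while True:
        records = fetch_page(page_url(path, page, page_size))
        yield from records
        # Assumption: a short (or empty) page means there is nothing left.
        if len(records) < page_size:
            return
        page += 1


print(page_url("/parser", 2, 50))
```

Keeping the HTTP call outside the generator also makes the loop easy to exercise with a fake fetch_page.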
Searching
Some endpoints support searching via the search query parameter. The search value needs to be URL encoded.
For example, /parser?search=test%20mailbox will search for mailbox names containing "test mailbox".
Unless stated otherwise, search is not case sensitive and will retrieve all entities that partially match the search string. For example, a mailbox search for foo will return mailboxes named test.foo and FOO Mailbox 123.
Sorting
Some endpoints support sorting via the ordering query parameter.
To sort a list ascending on the foo key, use ?ordering=foo
To sort a list descending on the foo key, use ?ordering=-foo
For example, /parser?ordering=-document_count will list your mailboxes starting with the one with the most documents.
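Since the search value must be URL encoded and the sort direction is carried by a leading minus sign, it can help to build these query strings programmatically. A minimal Python sketch, where list_url is a hypothetical helper name:

```python
from typing import Optional
from urllib.parse import quote, urlencode

API_BASE = "https://api.parseur.com"


def list_url(
    path: str,
    search: Optional[str] = None,
    ordering: Optional[str] = None,
    descending: bool = False,
) -> str:
    """Build a list URL with optional search and ordering parameters."""
    params = {}
    if search is not None:
        params["search"] = search
    if ordering is not None:
        # A leading "-" requests descending order.
        params["ordering"] = ("-" if descending else "") + ordering
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode(params, quote_via=quote)
    url = API_BASE + path
    return f"{url}?{query}" if query else url


print(list_url("/parser", search="test mailbox"))
print(list_url("/parser", ordering="document_count", descending=True))
```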
API rate limits
Requests to the /login and /signup endpoints are strictly rate-limited.
Sending requests to other endpoints is limited to 5 requests per second per IP, with an initial burst allowance of 50 requests.
Requests that go over the rate limit will return a 429 error code. We can accommodate higher rate limits as part of our Enterprise plan. Contact us to discuss.
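One common way for a client to cope with 429 responses is to retry with an increasing delay. The Python sketch below illustrates that general technique; it is not a documented Parseur recommendation. The send callable is hypothetical and must return a (status, body) pair, and the starting delay of 0.2 seconds is simply one request slot at 5 requests per second.

```python
import time
from typing import Callable, Tuple


def send_with_retry(
    send: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 5,
    base_delay: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send() again, with exponential backoff, while it returns 429."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Wait 0.2 s, 0.4 s, 0.8 s, ... between attempts.
            sleep(base_delay * 2 ** attempt)
    # Still rate-limited after the last attempt: hand the 429 to the caller.
    return status, body


# Demonstration with a fake sender that is rate-limited twice.
responses = iter([(429, b""), (429, b""), (200, b"ok")])
delays = []
result = send_with_retry(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

Injecting sleep as a parameter keeps the demonstration instant; in production the default time.sleep applies.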
Do more with the API
This article just lists the most common use cases for our API. There is more you can do; feel free to ask us for more details!