Document formats supported by Parseur

The types of documents Parseur can extract text from, along with best practices

What are the document formats supported by Parseur?

Parseur can extract data from most documents commonly used in the workplace.

Here is the list of supported document formats that you can extract text from:

File extensions    Document type
csv                Comma-Separated Values
doc, docx          Microsoft Word
eml                Emails (Multipart MIME encoded)
html, htm          HTML Document
pdf                Portable Document Format
rtf                Rich Text Format
txt                Text File
xls, xlsx, xlsm    Microsoft Excel Documents
xml, hl7           XML documents
zip                Zipped archives (the archive is unzipped after upload and the supported documents it contains appear in the document queue)
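
If you want to check files on your side before sending them, the list above can be turned into a small filter. This is only a sketch: the names `SUPPORTED_EXTENSIONS` and `is_supported` are made up for this example and are not part of Parseur, and the check looks only at the file name, not at the file's actual content.

```python
from pathlib import Path

# Extensions copied from the list above. The set name is illustrative only.
SUPPORTED_EXTENSIONS = {
    "csv", "doc", "docx", "eml", "html", "htm", "pdf", "rtf",
    "txt", "xls", "xlsx", "xlsm", "xml", "hl7", "zip",
}


def is_supported(filename: str) -> bool:
    """Return True if the file name ends with one of the listed extensions."""
    # Path.suffix returns the last extension including the dot, e.g. ".PDF".
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS
```

For example, `is_supported("Invoice.PDF")` returns `True`, while `is_supported("photo.jpg")` returns `False`.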

Do you need Parseur to support a specific file type not listed here? Let us know!

I want to extract text from pictures or photos. Is it possible?

Parseur does NOT currently support image formats such as jpg, png, or tiff, nor photos taken directly with a phone. You can try converting those images to PDF, but the accuracy of the results will depend heavily on the quality of the picture.

What is the maximum document size allowed?

The maximum file size depends on how you send documents to Parseur:

  • maximum size of emails sent or forwarded to Parseur is 35MB

  • maximum size of documents uploaded directly into the app or via the API is 256MB
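
To catch oversized files before sending them, you could compare the file size against the two limits above. The sketch below assumes that 1 MB means 1,048,576 bytes, which this article does not specify, and the names `LIMITS`, `fits_limit`, and `file_fits_limit` are hypothetical. Keep in mind that email encoding usually makes an attachment larger than the file on disk, so a file close to 35 MB may still exceed the email limit.

```python
import os

# Assumption: 1 MB = 1024 * 1024 bytes. The article does not say which
# definition of a megabyte applies.
MB = 1024 * 1024

# Limits taken from the list above, keyed by how the document is sent.
LIMITS = {
    "email": 35 * MB,    # emails sent or forwarded to Parseur
    "upload": 256 * MB,  # documents uploaded in the app or via the API
}


def fits_limit(size_bytes: int, channel: str) -> bool:
    """Return True if a document of this size is within the limit for the channel."""
    return size_bytes <= LIMITS[channel]


def file_fits_limit(path: str, channel: str) -> bool:
    """Same check, reading the size of a file on disk."""
    return fits_limit(os.path.getsize(path), channel)
```

A 100 MB file, for instance, passes the check for `"upload"` but not for `"email"`.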

How do I access the document in its original format?

Use the OriginalDocument Metadata Field to download the file in its original format.

How do I upload the original document to my cloud storage or app?

Once the OriginalDocument metadata field is enabled, you can use its URL with any Zapier connector that supports files (such as Google Drive or Dropbox). To do so, map the OriginalDocument URL to the file field in your Zap. Zapier will then download the document and upload it to the app of your choice.

What are the best practices to extract text from PDFs?

PDFs can be parsed using our OCR parsing engine.

What are the best practices to consolidate CSV and Excel attachments?

Parseur can automatically combine CSV and Excel files without requiring you to create a template. The files are combined based on their column headers.
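
To give an idea of what combining files by column headers can look like, here is a minimal sketch. It is an illustration only and not Parseur's actual implementation: the function name `combine_by_headers` is hypothetical, and the choice to leave a cell empty when a file lacks a column is an assumption made for this example.

```python
import csv
import io


def combine_by_headers(csv_texts):
    """Merge several CSV documents into one table, aligning cells by column name.

    Returns (header, rows). A column missing from a file is left empty
    for that file's rows.
    """
    header = []
    records = []
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        # Add any column name not seen yet, keeping first-seen order.
        for name in reader.fieldnames or []:
            if name not in header:
                header.append(name)
        records.extend(reader)
    # Lay every record out in the combined column order.
    rows = [[record.get(name) or "" for name in header] for record in records]
    return header, rows


first = "sku,qty\nA1,2\n"
second = "qty,sku,price\n5,B2,9.99\n"
header, rows = combine_by_headers([first, second])
```

Here the two inputs list their columns in a different order and the second one has an extra `price` column. The combined header is `sku, qty, price`, and the row coming from the first file has an empty `price` cell.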

Parseur will store the parsed result in the "Sheet" table field.

Because the result is stored in a table field, make sure to use the table field download option in the Export section or the "New Table Processed" trigger in Zapier. Check out the end of this article for more information about exporting table field data.

Note: If you don't want to use Parseur's default parsing method for CSVs, you can create your own template, and it will take priority over the default parsing.
