In most situations, you can create fields in a template just by highlighting the pieces of text you want to extract, using the default Rich view. The Rich view displays your document as you would see it in your email client or on a printed page.
Sometimes, however, you want to capture a piece of text that is not visible in the document. The most common case is extracting the URL behind a link in an email.
You can extract such hidden elements in Parseur using the Source (advanced) view.
What is the Source view?
Changing the view to Source (advanced) in the template editor shows the underlying HTML source code used to render the Rich text email or document.
If HTML doesn't ring a bell, check out this Getting started with HTML crash course for a quick introduction.
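To make the difference between the two views concrete, here is a minimal Python sketch. It is only an illustration of the idea, not how Parseur works internally: the email fragment, the URL, and the `rich_text` helper are all made up for this example. It shows that the text a reader sees does not contain the link URL, while the underlying HTML source does.

```python
from html.parser import HTMLParser

# Hypothetical email fragment; the address and URL are invented for illustration.
SOURCE = '<p>New listing: <a href="https://example.com/listing/123">12 Main Street</a></p>'


class VisibleText(HTMLParser):
    """Collects only the text a reader would see, ignoring tags and attributes."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def rich_text(source: str) -> str:
    """Return the visible text of an HTML fragment (roughly what a rich view shows)."""
    parser = VisibleText()
    parser.feed(source)
    parser.close()
    return "".join(parser.parts)


print(rich_text(SOURCE))                   # New listing: 12 Main Street
print("example.com" in rich_text(SOURCE))  # False: the URL is not visible text
print("example.com" in SOURCE)             # True: the URL lives in the source
```

The URL only appears inside the markup, which is why it can be highlighted in the Source view but not in the Rich view.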
Example of a template in the default (rich) view:
And the same template in the Source view:
Note: HTML comments like <!--psr-to TT123456--> are internal references used by Parseur when creating fields. You can disregard them.
How to use the Source view?
You use the Source view exactly like the Rich view:
- Highlight a piece of text that you want to extract
- Click the New Field button to capture it.
You can switch back and forth between the Rich and Source views to create fields.
How to locate part of a document in the Source view?
The Source view is dense, and it can be hard to spot the piece of text you want to extract. There are two ways to locate it easily.
Option #1: if you want to locate an existing field or a piece of text next to an existing field
When you are in the Source view, you will also see a new Scroll to field target icon next to used fields. Click on the target icon to center the document on that field.
Option #2: if you want to locate a piece of text that is not near any existing field
If the text is not near an existing field and you cannot easily spot it in the source code, use your browser's search function:
- Click anywhere in the document source code to focus on it
- Open the Browser search by typing
- Enter the piece of text you are looking for (example in the screenshot below: a link containing
- Browse through the results until you find the piece of text you were looking for
How to extract a link from a document?
Links in HTML are included in the href attribute of an <a></a> anchor element.
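If you want to see where the URL sits in the markup, the following Python sketch pulls the href value out of each anchor element. It is a standalone illustration under assumed sample data, not Parseur's own extraction logic: the `LinkCollector` class, the `extract_links` helper, and the example URL are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (url, link text) pairs from <a href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the anchor currently being read, if any
        self._text = []    # visible text collected inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(source: str):
    """Return every (url, visible text) pair found in an HTML fragment."""
    parser = LinkCollector()
    parser.feed(source)
    parser.close()
    return parser.links


# Hypothetical fragment: the reader sees the address, the URL is in the href.
html = '<a href="https://example.com/listing/123">12 Main Street</a>'
print(extract_links(html))
# [('https://example.com/listing/123', '12 Main Street')]
```

The value between the double quotes after href is the part you highlight in the Source view.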
Capturing a link is simple:
- Locate the URL included in the href attribute
- Highlight the full URL between the double quotes of the href attribute
- Click New Field (or assign to an existing field)
In the example below, we created the PropertyURL field to capture the URL of the Zillow listing that was hidden behind the property address link:
Beware: We don't recommend making a hidden field created in Source mode the first field you capture in a document. Parseur could misidentify that field when emails get forwarded.
Solution: If you have a template where a hidden field is the first one in the document, we recommend you create another "normal" field in the Rich view before that hidden field. That will improve Parseur's reliability.
How to extract a link as a column of a table field?
You can use the source view to extract links and other hidden attributes from a table, using Table Fields.
However, if the link you want to extract is in the first column of the table, you must also create the Table Field itself in Source mode so that it includes the first link. Otherwise, the default selection in the Rich view won't include that link.
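The sketch below illustrates why the first-column link matters, using plain Python rather than anything Parseur-specific. The table, its URLs, and the `RowCollector` and `extract_rows` names are assumptions made up for this example. Each row's URL exists only inside the markup of its first cell, so a selection that starts at the visible text of that cell would miss it.

```python
from html.parser import HTMLParser

# Hypothetical table: the first column is a link, so its URL exists only in the source.
TABLE = """
<table>
  <tr><td><a href="https://example.com/item/1">Item 1</a></td><td>10.00</td></tr>
  <tr><td><a href="https://example.com/item/2">Item 2</a></td><td>25.50</td></tr>
</table>
"""


class RowCollector(HTMLParser):
    """Builds one dict per table row: the cell texts plus the first link in the row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # row currently being read
        self._cell = None  # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {"url": None, "cells": []}
        elif tag == "td" and self._row is not None:
            self._cell = []
        elif tag == "a" and self._row is not None and self._row["url"] is None:
            self._row["url"] = dict(attrs).get("href")

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row["cells"].append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(source: str):
    """Return a list of {'url': ..., 'cells': [...]} dicts, one per table row."""
    parser = RowCollector()
    parser.feed(source)
    parser.close()
    return parser.rows


for row in extract_rows(TABLE):
    print(row["url"], row["cells"])
```

In this made-up table, the visible cells are only "Item 1" and "10.00"; the URL comes from the markup that precedes the first visible character of the row.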
Once you have captured a link, you can use the Linked Document format to download the document behind the URL and extract content from it!