Use the source view to extract links and hidden attributes

How to use the Source view in the Template Editor to capture hidden attributes like links


In most situations, you can create fields in a template just by highlighting the pieces of text you want to extract, using the default Rich view. The Rich view displays your document as you would see it in your email client or on a printed page.

Sometimes, however, you want to capture a piece of text that is not visible in the document. The most common situation is to extract a URL from a link in an email.

You can extract such hidden elements in Parseur using the Source (advanced) view.

What is the Source view?

Changing the view to Source (advanced) in the template editor shows the underlying HTML source code used to render the Rich text email or document.

If HTML doesn't ring a bell, check out this Getting started with HTML crash course for a quick introduction.
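
To make the difference between the two views concrete, here is a minimal Python sketch. It only illustrates the general idea and is not how Parseur works internally; the HTML fragment and the `VisibleText` class are made up for this example. It shows that a link's URL lives in the HTML source but not in the text a reader sees.

```python
from html.parser import HTMLParser

# Hypothetical email fragment, invented for this illustration.
SOURCE = '<p>New listing: <a href="https://www.example.com/listing/123">12 Main Street</a></p>'


class VisibleText(HTMLParser):
    """Collects only the text a reader would see, ignoring tags and attributes."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; attribute values never reach this method.
        self.parts.append(data)


parser = VisibleText()
parser.feed(SOURCE)
parser.close()
rich_text = "".join(parser.parts)

print(rich_text)                   # the visible text, roughly what a rich view shows
print("example.com" in rich_text)  # False: the URL is not part of the visible text
print("example.com" in SOURCE)     # True: the URL is only present in the source
```

This is why a URL cannot be highlighted in the Rich view: it is stored in an attribute of the HTML, not in the displayed text.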

Example of a template in the default (rich) view:

And the same template in the Source view:

Note: HTML comments like <!--psr-to TT123456--> are internal references used by Parseur when creating fields. You can disregard them.

How to use the Source view?

You use the Source view exactly like the Rich view:

  • Highlight a piece of text that you want to extract

  • Click the New Field button to capture it.

You can switch back and forth between the Rich and Source views to create fields.

How to locate some parts of a document in source view?

The source view is dense and it can be hard to spot the piece of text you want to extract. But you can easily locate a piece of text in two ways.

Option #1: if you want to locate an existing field or a piece of text next to an existing field

When you are in the Source view, you will also see a new Scroll to field target icon next to the used fields. Click on the target icon to center the document on that field.


Option #2: if you want to locate a piece of text that is not near any existing field

If you are not around an existing field and cannot easily locate the text by looking at the source code, use the Search function of your browser:

  • Click anywhere in the document source code to focus on it

  • Open your browser's search by pressing Control+F (Command+F on Mac)

  • Enter the piece of text you are looking for (for example in the screenshot below: a link containing zillow.com)

  • Browse through the results until you find the piece of text you were looking for

How to extract a link from a document?

In HTML, the URL of a link is stored in the href attribute of an <a></a> anchor element.

Capturing a link is simple:

  • Locate the URL included in the <a href="...url..."> element

  • Highlight the full URL included in between the double quotes of the href

  • Click New Field (or assign to an existing field)
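
If you want to check what a captured value should look like, the steps above can be mirrored in a short Python sketch. This is an illustration under stated assumptions, not Parseur's implementation: the HTML fragment, the `HrefCollector` class, and the `property_url` variable are hypothetical. It reads the value between the double quotes of each href.

```python
from html.parser import HTMLParser

# Hypothetical source fragment; the URL and address are invented for this example.
SOURCE = '<td><a href="https://www.example.com/listing/123?ref=email">12 Main Street</a></td>'


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> element it encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs for the opening tag.
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


collector = HrefCollector()
collector.feed(SOURCE)
collector.close()

property_url = collector.links[0]
print(property_url)
```

The printed value is the full URL without the surrounding quotes, which is the same span you would highlight in the Source view.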

In the example below, we created the PropertyURL field to capture the URL of the Zillow listing that was hidden behind the property address link:

Beware: We don't recommend that the first field you capture in a document be a hidden field created in source mode. This is because Parseur could wrongly identify that field if/when emails get forwarded.

Solution: If you have a template where a hidden field is the first one in the document, we recommend you create another "normal" field in Rich view before that hidden field. That will improve Parseur's reliability.

How to extract a link as a column of a table field?

You can use the source view to extract links and other hidden attributes from a table, using Table Fields.

However, if the link you want to extract is in the first column of the table, you must also create the Table Field itself in Source mode so that the selection includes that first link. Otherwise, the default selection made in the Rich view won't include it.
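
As a rough picture of what a table with a link column contains, the following Python sketch walks a small HTML table and keeps, for each cell, both the visible text and any link behind it. It is only an illustration, assuming a simple table with no nested tables; the markup and the `TableWithLinks` class are hypothetical and unrelated to how Parseur processes Table Fields.

```python
from html.parser import HTMLParser

# Hypothetical table markup, invented for this example.
SOURCE = """
<table>
  <tr><td><a href="https://www.example.com/a">Item A</a></td><td>10</td></tr>
  <tr><td><a href="https://www.example.com/b">Item B</a></td><td>20</td></tr>
</table>
"""


class TableWithLinks(HTMLParser):
    """Builds a list of rows; each cell keeps its visible text and its link, if any."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # cell being read, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = {"text": "", "href": None}
        elif tag == "a" and self._cell is not None:
            self._cell["href"] = dict(attrs).get("href")

    def handle_data(self, data):
        # Text outside of a cell (such as whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell["text"] += data

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append(self._cell)
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


parser = TableWithLinks()
parser.feed(SOURCE)
parser.close()

for row in parser.rows:
    # First column: hidden link and visible label; second column: plain text.
    print(row[0]["href"], row[0]["text"], row[1]["text"])
```

Note that the link of the first column sits before the first visible character of the row, which is consistent with why a selection made on visible text alone can miss it.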

What's next?

Once you have captured a link, you can use the Linked Document format to download the document behind the URL and extract content from it!
