In most cases, you can create fields in text templates simply by highlighting the text you wish to extract in the default Rich view, which shows your document as it would appear in a standard email client or when printed out.
However, if you need to capture text not displayed in the document, like extracting a URL from an email link, use Parseur's Source view to reveal and extract these hidden elements.
Note: This content is applicable solely to Text templates and does not pertain to OCR templates or the AI engine.
What is the Source view and how can I activate it?
Navigate to the top of the template editor to find the template view selector dropdown.
Available options are:
Image: only available for OCR templates
Rich view: the default setting
Source view: reveals source code
Original view: like Rich view, but without the template fields
Switch to Source view to work directly with the HTML. If you're unfamiliar with HTML, consider starting with a basic tutorial, such as the Getting started with HTML crash course.
Example of a template in the default (rich) view:
And the same template in the Source view:
Note: HTML comments like <!--psr-to TT123456--> are internal references used by Parseur when creating fields. You can disregard them.
How to use the Source view?
You use the Source view exactly like the Rich view:
Highlight the piece of text that you want to extract.
Click the New Field button to capture it.
You can switch back and forth between the Rich view and the Source view to create fields.
How to locate specific parts of a document in Source view?
Locating specific parts of a document in Source view can be challenging due to the density of the code. However, you can pinpoint text using two methods:
Method #1: Locating existing fields or adjacent text
In the template editor, hover over an existing field listed in the right-hand menu. Parseur will automatically scroll the document in Source view to bring that field into view.
Method #2: Finding text not near any existing field
If the text isn't near an existing field, use your browser's search function:
Click within the document's source code to ensure it's focused.
Open your browser's search feature by pressing Control+F (or Command+F on Mac).
Type the text you're searching for (e.g., a link containing zillow.com).
Scroll through the search results to locate the desired text.
How to extract a link from a document?
Extracting a link from a document, especially in HTML, involves identifying the URL within an <a></a> anchor element. Here's how to do it:
Find the link: Search for the <a href="...url..."> tag in the document. This tag contains the actual URL you want to extract.
Highlight the URL: Carefully highlight everything between the double quotes following href= without including the quotes themselves.
Create or assign a field: Click "New Field" to create a new data field for this link, or assign the highlighted URL to an existing field, depending on your needs.
For example, below, we created the PropertyURL field to capture the URL of the Zillow listing that was hidden behind the property address link:
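To make the highlighted span concrete, here is a minimal Python sketch, run outside Parseur, that pulls out the same value you would highlight in Source view: the text between the double quotes after href=. It does not show how Parseur works internally. The sample markup, the address, the URL, and the HrefCollector name are all hypothetical placeholders chosen for illustration.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag (the text between the quotes)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


# Hypothetical source markup: the address and URL are placeholders.
sample_source = (
    '<p>Listing: <a href="https://www.zillow.com/example-listing">'
    "123 Main St</a></p>"
)

collector = HrefCollector()
collector.feed(sample_source)
collector.close()

# The value a PropertyURL-style field would hold: no quotes, no tag, no link text.
property_url = collector.hrefs[0]
print(property_url)
```

Note that the visible link text (123 Main St) is not part of the captured value; only the URL inside the quotes is.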
How do I extract a link as a column of a table field?
To extract a link as a column of a table field, especially when the link is the first column of the table, follow these steps:
Switch to Source view: This is crucial as it allows you to see and select the actual HTML elements, including the <a href="..."> tags.
Create the Table Field: In Source view, create your Table Field by highlighting the entire table, ensuring you include the first column where the link is located. This step is essential since the Rich view might not automatically include the first link in the selection.
Assign fields within the table: Assign specific parts of the table to different fields as needed. Make sure to highlight the link text or the HTML element of the link in the first column to ensure it's captured correctly as part of the table.
By setting up the Table Field in Source view, you ensure that all links, especially those in the first column, are correctly included and extracted as part of your data.
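As a rough illustration of the result you are aiming for, the Python sketch below reads a small HTML table and returns one record per row, with the link from the first column stored alongside the visible cell text. It is only a sketch under stated assumptions: the table markup, the example.com URLs, the prices, and the LinkTableParser name are hypothetical, and it does not reflect how Parseur builds table fields internally.

```python
from html.parser import HTMLParser


class LinkTableParser(HTMLParser):
    """Turn each <tr> into a dict holding the row's first href and its cell texts."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # record for the row currently being read
        self._in_cell = False  # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {"url": None, "cells": []}
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row["cells"].append("")
        elif tag == "a" and self._in_cell and self._row["url"] is None:
            # Keep only the first link found in the row.
            self._row["url"] = dict(attrs).get("href")

    def handle_data(self, data):
        if self._in_cell and self._row is not None:
            self._row["cells"][-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self._in_cell = False
            self.rows.append(self._row)
            self._row = None


# Hypothetical table: the link sits in the first column of each row.
sample_table = (
    "<table>"
    '<tr><td><a href="https://example.com/listing/1">123 Main St</a></td>'
    "<td>$450,000</td></tr>"
    '<tr><td><a href="https://example.com/listing/2">45 Oak Ave</a></td>'
    "<td>$610,000</td></tr>"
    "</table>"
)

parser = LinkTableParser()
parser.feed(sample_table)
parser.close()

for row in parser.rows:
    print(row["url"], row["cells"])
```

The point of the sketch is that the URL lives in the markup of the first cell, not in its visible text, which is why the table selection has to be made where the <a href="..."> tag is visible.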
What's next?
Once you capture a link, you can use the Linked Document format to download the document behind the URL and extract content from it!