Parse.ly Page metadata tag setup and overview
Legacy Documentation
This documentation refers to an earlier format of proprietary Parse.ly metadata. While Parse.ly crawlers will continue to support this format, new projects should instead use the recommended and standard JSON-LD format.
Specifying metadata information on your webpages is the second step of integrating with Parse.ly. (Jump to documentation about installing the tracking code if you don’t have the tracker running yet!)
Note
Parse.ly’s crawlers don’t execute JavaScript, so regardless of which metadata format you choose, the information must be accessible in the actual source of the page. For more, check out our detailed crawler information.
Example
<meta name='parsely-page'
content='{"title": "Zipf\u0027s Law of the Internet: Explaining Online Behavior",
"link": "https://blog.parse.ly/post/57821746552",
"image_url": "https://blog.parse.ly/inline_mra670hTvL1qz4rgp.png",
"type": "post",
"post_id": "57821746552",
"pub_date": "2013-08-15T13:00:00Z",
"section": "Programming",
"authors": ["Alan Alexander Milne"],
"tags": ["statistics","zipf","internet","behavior"]
}'>
For readability, the value of the content attribute in the code above is indented, with each field on its own line. This is not valid HTML; in a production environment, the value of the content attribute should be on a single line.
Field description
Value | Description |
---|---|
title | Post or page title (article headline). |
link | Specifies the Parse.ly canonical URL for the post or page. For page groups, like galleries, it should always point to the main page. For accurate data, canonical URLs specified in other metadata tags (such as <link rel="canonical"> and <meta property="og:url"> tags) must match, resolve, or redirect to this URL. For more information, please refer to our documentation on shares integration. |
image_url | URL of the image associated with the post/page. |
type | Page type – “post”, “frontpage” or “sectionpage” |
post_id | String that uniquely identifies this post. Unless otherwise instructed by Parse.ly support, should be omitted in favor of link. |
pub_date | Publication date, as an ISO 8601 string in UTC (e.g. 2013-08-15T13:00:00Z). |
section | Section the page belongs to (e.g. “Politics”). |
authors | List of post authors. |
tags | List of tags associated with this post. |
metadata | Arbitrary data to attach to post. Must be a valid JSON string. See documentation about custom metadata. |
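The pub_date format in the table above can be produced from a timezone-aware datetime. The following is a minimal sketch; the helper name to_pub_date and the decision to reject naive datetimes are assumptions made for illustration, not part of Parse.ly's tooling.

```python
from datetime import datetime, timedelta, timezone


def to_pub_date(dt: datetime) -> str:
    """Format an aware datetime as an ISO 8601 UTC string for pub_date."""
    if dt.tzinfo is None:
        # Assumption for this sketch: refuse naive datetimes rather than guess a zone.
        raise ValueError("pub_date needs a timezone-aware datetime")
    # Convert to UTC first, then emit the trailing "Z" designator.
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# 09:00 in a UTC-4 zone is 13:00 UTC.
utc_minus_4 = timezone(timedelta(hours=-4))
print(to_pub_date(datetime(2013, 8, 15, 9, 0, tzinfo=utc_minus_4)))  # 2013-08-15T13:00:00Z
```

Converting to UTC before formatting keeps the value consistent regardless of the time zone the publishing system runs in.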
Technical Caveats
Escape single and double quotes in JSON item values. Single quotes should be replaced with the JSON unicode equivalent \u0027. Double quotes should be escaped with a backslash, like this: \".
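One way to apply these escaping rules when generating the tag server-side is sketched below. The function name build_parsely_page_tag is hypothetical. The sketch relies on Python's json.dumps to escape double quotes and then rewrites apostrophes, which can only occur inside JSON strings. It does not address HTML character references such as & in values.

```python
import json


def build_parsely_page_tag(metadata: dict) -> str:
    """Return a single-line <meta name='parsely-page'> tag for `metadata`."""
    # json.dumps escapes double quotes inside string values as \" and
    # emits no newlines when indent is left unset.
    payload = json.dumps(metadata, ensure_ascii=False)
    # The attribute is delimited by single quotes, so replace every literal
    # apostrophe with its JSON unicode escape.
    payload = payload.replace("'", "\\u0027")
    return f"<meta name='parsely-page' content='{payload}'>"


tag = build_parsely_page_tag({
    "title": "Zipf's Law of the Internet: \"Explaining\" Online Behavior",
    "link": "https://blog.parse.ly/post/57821746552",
    "authors": ["Alan Alexander Milne"],
})
print(tag)
```

Because \u0027 is a standard JSON escape, any JSON parser reading the attribute value recovers the original apostrophe.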
Values in parsely-page will appear literally inside Parse.ly Analytics. String values supplied here, specifically title, authors, and section, will appear in Parse.ly Analytics exactly as they are specified in the tag. As a result, make sure to use proper capitalization and specify the values as you expect them to appear.
The parsely-page metadata tag cannot be loaded asynchronously. The Parse.ly crawler will not execute JavaScript. It must be able to access the metadata tag from the results of a single GET request.
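A rough way to check this requirement is to parse the raw HTML source, without running any scripts, and see whether the tag is there. The sketch below uses Python's standard html.parser for that. It is not Parse.ly's crawler, and the names ParselyPageFinder and extract_parsely_page are hypothetical.

```python
import json
from html.parser import HTMLParser


class ParselyPageFinder(HTMLParser):
    """Collect the content of the first <meta name="parsely-page"> tag."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_parsely = tag == "meta" and attributes.get("name") == "parsely-page"
        if is_parsely and self.content is None:
            self.content = attributes.get("content")


def extract_parsely_page(html_source: str):
    """Return the parsed metadata dict, or None if the tag is absent from the raw source."""
    finder = ParselyPageFinder()
    finder.feed(html_source)
    if finder.content is None:
        return None
    return json.loads(finder.content)


# The tag is present in the static source, so it is found.
static_page = """<html><head>
<meta name='parsely-page' content='{"title": "Zipf\\u0027s Law", "type": "post"}'>
</head><body></body></html>"""

# The tag would only exist after a script runs, so nothing is found.
script_only_page = """<html><head>
<script>/* tag would be injected at runtime */</script>
</head><body></body></html>"""

print(extract_parsely_page(static_page))
print(extract_parsely_page(script_only_page))
```

In practice html_source would be the body of a single GET request for the page. The sample strings here stand in for that response so the sketch runs offline.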
Content behind paywalls
If you have content that is accessible only after logging in, you should coordinate with the Parse.ly support team to arrange for a special login account that is accessible only to our crawler. The credentials for this account will be used only by the crawler and will not be shared with anyone else.
Last updated: December 28, 2022