Metadata: JSON-LD
A json-ld script tag uses the JSON format to provide structured, standardized, and machine-readable information about a web page, such as its author, publication date, title, and the section to which it belongs. You may already have json-ld tags on your pages that you can modify to include the additional properties that Parse.ly requires.
If not, adding a tag such as the following example allows Parse.ly to properly track the page. The body of the tag should be properly formatted JSON. To understand how to customize the values for your site, continue to the detailed descriptions of each property below.
Example
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "NewsArticle",
  "headline": "Iron Man Revealed",
  "url": "https://www.example.com/post/iron-man-revealed",
  "thumbnailUrl": "https://www.example.com/tony-stark.png",
  "datePublished": "2024-01-22T18:01:00Z",
  "articleSection": "Defense News",
  "creator": ["Peter Parker", "April O'Neil"],
  "keywords": ["editor: jjjameson", "tony stark", "stark industries", "iron man"]
}
</script>
Explanation of required properties
@context | The collection where the schema is defined. Always http://schema.org. |
@type | The specific schema that is being used. For posts, we generally recommend NewsArticle. For non-post pages, use WebPage. For an explanation of the difference between the two, and additional alternatives, see the section on distinguishing between “posts” and “pages” below. |
headline | Post or page title (article headline). |
url | Specifies the Parse.ly canonical URL for the post or page. For page groups, such as galleries, it should always point to the main page. For accurate data, canonical URLs specified in other metadata tags (such as <link rel="canonical"> and |
thumbnailUrl | URL of the image associated with the post or page. |
datePublished | Publication date, formatted as an ISO 8601 UTC timezone string. |
articleSection | Section the page belongs to (e.g. “Programming”). Only one section value is supported per URL, so use the top-level section or category and add any sub-sections or child categories to keywords. |
creator | Author of the post, provided either as a string or, for multi-author posts, as a list. |
keywords | The list of keywords associated with the post will map to “Tags” in the Parse.ly dashboard. Note that up to 100 keyword values are supported per URL. |
If some of these fields don’t make sense for a particular page, consider whether it’s better tracked as a page instead of a post.
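As an illustration of the properties above, here is a minimal sketch of rendering the tag on the server with Python's standard library. The function name `build_jsonld_tag` and its parameters are hypothetical, and the `@context` value assumes the schema.org vocabulary named under "Standards compliance". It is not part of any Parse.ly library.

```python
import json
from datetime import datetime, timezone


def build_jsonld_tag(headline, url, thumbnail_url, published, section, authors, keywords):
    """Render a json-ld script tag holding the required properties.

    `published` must be a timezone-aware datetime; it is converted to UTC.
    """
    if published.tzinfo is None:
        raise ValueError("published must be timezone-aware")
    metadata = {
        "@context": "http://schema.org",  # assumption: schema.org vocabulary
        "@type": "NewsArticle",
        "headline": headline,
        "url": url,
        "thumbnailUrl": thumbnail_url,
        # ISO 8601 in UTC, for example 2024-01-22T18:01:00Z
        "datePublished": published.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "articleSection": section,  # a single section per URL
        "creator": list(authors),
        "keywords": list(keywords),
    }
    # json.dumps guarantees that the tag body is well-formed JSON.
    body = json.dumps(metadata, indent=2, ensure_ascii=False)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    print(build_jsonld_tag(
        headline="Iron Man Revealed",
        url="https://www.example.com/post/iron-man-revealed",
        thumbnail_url="https://www.example.com/tony-stark.png",
        published=datetime(2024, 1, 22, 18, 1, tzinfo=timezone.utc),
        section="Defense News",
        authors=["Peter Parker", "April O'Neil"],
        keywords=["tony stark", "iron man"],
    ))
```

Building the body from a dictionary rather than by string concatenation avoids most formatting mistakes, since the serializer handles quoting and commas.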
Technical Caveats
- Handling special characters. You may either use an HTML entity to represent your special characters or escape them with a backslash (\). For example, "headline": "&quot;Analyst&quot; found guilty of relying solely on \"anecdata\"" will return a headline of "Analyst" found guilty of relying solely on "anecdata".
- Values in json-ld will appear literally inside Parse.ly Analytics. Remember that all metadata is case-sensitive. String values supplied here (specifically headline, creator, and articleSection) as well as list values (specifically keywords) will appear in Parse.ly analytics exactly as they are specified in the tag. As a result, make sure to use proper capitalization and specify the values as you expect them to appear. Values with variations (example: "John Smith" and "john smith") will appear separately in the Dashboard, causing duplication and skewed data.
- The json-ld script tag cannot be loaded asynchronously. The Parse.ly crawler will not execute JavaScript. It must be able to access the metadata tag from the results of a single GET request.
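The first two caveats can be checked before a page is published. The sketch below is one possible approach, assuming Python on the publishing side: `json.dumps` produces the backslash escaping described above, and a small helper flags values that differ only by capitalization. Both function names are hypothetical.

```python
import json
from collections import defaultdict


def to_json_string(value):
    """Serialize one string value; double quotes come out backslash-escaped."""
    return json.dumps(value, ensure_ascii=False)


def find_case_variants(values):
    """Return groups of values that differ only by capitalization."""
    groups = defaultdict(set)
    for value in values:
        groups[value.casefold()].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    headline = '"Analyst" found guilty of relying solely on "anecdata"'
    encoded = to_json_string(headline)
    print(encoded)
    # Decoding the escaped form gives back the original headline.
    assert json.loads(encoded) == headline

    # Author names gathered from several pages (illustrative data).
    print(find_case_variants(["John Smith", "john smith", "Jane Doe"]))
```

Running a check like `find_case_variants` over the authors, sections, and keywords in a content management system is one way to catch duplicates before they show up as separate entries in the Dashboard.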
Standards compliance
All the properties above come from the schema.org NewsArticle schema, making the example JSON-LD tag fully standards-compliant. To keep integration as simple as possible, we’ve included only the properties that the Parse.ly crawler actually uses. But there are many other valid schema properties you may also choose to include, and that other services recommend or require. Scroll down to the additional examples to see a json-ld tag that also includes the additional properties Google recommends.
Distinguishing between “posts” and “non-post” pages
When collecting metadata, Parse.ly uses the @type property to distinguish between two kinds of webpages: “posts”, which contain editorial or marketing content (articles, reports, blog posts, etc.), and “non-posts”, which are more transactional or navigational in nature (homepages, index pages, section pages, checkout pages, newsletter subscription pages, etc.).
In general, we recommend tracking pages that your editorial or marketing team produces or works on actively as posts.
@type values that Parse.ly recognizes as posts
While NewsArticle is the preferred @type value for posts, Parse.ly can also accommodate other types:
- NewsArticle
- Article
- TechArticle
- BlogPosting
- LiveBlogPosting
- Report
- Review
- CreativeWork
- OpinionNewsArticle
- AnalysisNewsArticle
- BackgroundNewsArticle
- ReviewNewsArticle
- ReportageNewsArticle
- Recipe
- AdvertiserContentArticle
- MedicalWebPage
- PodcastEpisode
If a page contains multiple json-ld blocks with these @type values, the Parse.ly crawler will preferentially choose the type that's higher on the list. For example, if both Article and Review blocks are present on a page, we will collect the values from the Article block.
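To make the selection rule concrete, the following sketch reads json-ld blocks out of static HTML (no JavaScript is executed, mirroring the single GET request mentioned earlier) and picks the block whose @type appears highest in the list. It is an illustration only, not the actual Parse.ly crawler; `JsonLdCollector` and `pick_post_block` are hypothetical names.

```python
import json
from html.parser import HTMLParser

# Post types in the order listed above; earlier entries take precedence.
POST_TYPES = [
    "NewsArticle", "Article", "TechArticle", "BlogPosting", "LiveBlogPosting",
    "Report", "Review", "CreativeWork", "OpinionNewsArticle",
    "AnalysisNewsArticle", "BackgroundNewsArticle", "ReviewNewsArticle",
    "ReportageNewsArticle", "Recipe", "AdvertiserContentArticle",
    "MedicalWebPage", "PodcastEpisode",
]


class JsonLdCollector(HTMLParser):
    """Collect the parsed bodies of application/ld+json script tags."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # a malformed block is skipped in this sketch


def pick_post_block(html):
    """Return the post block with the highest-priority @type, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    candidates = [
        block for block in collector.blocks
        if isinstance(block, dict) and block.get("@type") in POST_TYPES
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda block: POST_TYPES.index(block["@type"]))
```

Given a page that carries both a Review block and an Article block, `pick_post_block` returns the Article block regardless of the order in which the two tags appear in the HTML.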
@type values that Parse.ly recognizes as non-post pages
While we expect posts to include all the properties in the main example above, not all properties may be relevant on non-post pages (see example below).
Non-post page example
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "WebPage",
  "headline": "Category: Analytics That Matter",
  "url": "https://blog.parse.ly/post/category/analytics-that-matter/"
}
</script>
Additional JSON-LD tag examples
- Additional properties Google recommends for enhanced display in search listings
- You can check your own implementation with the Google Structured Data Testing Tool.
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "NewsArticle",
  "headline": "Zipf's Law of the Internet: Explaining Online Behavior",
  "url": "https://blog.parse.ly/post/57821746552",
  "thumbnailUrl": "https://blog.parse.ly/inline_mra670hTvL1qz4rgp.png",
  "image": "https://blog.parse.ly/inline_mra670hTvL1qz4rgp.png",
  "dateCreated": "2013-08-10T01:25:08Z",
  "datePublished": "2013-08-10T01:25:08Z",
  "dateModified": "2013-08-10T01:25:08Z",
  "articleSection": "Programming",
  "creator": ["Alan Alexander Milne"],
  "author": ["Alan Alexander Milne"],
  "keywords": ["data", "intern", "parse.ly"],
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://blog.parse.ly/post/57821746552"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Parse.ly",
    "logo": {
      "@type": "ImageObject",
      "url": "http://s3.amazonaws.com/parsely_static/marketing/parsely-email-logo.png"
    }
  }
}
</script>
Note that some of these properties may have overlapping values. Here is how they’re resolved by our crawler:
- Parse.ly preferentially uses datePublished, rather than dateCreated, if both are present.
- Parse.ly uses thumbnailUrl, but not image.
- For author, creator, and contributor properties, Parse.ly will combine all the unique values into a single list.
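The three rules above can be sketched as a small function. This is an illustration rather than the crawler's actual code: the name `resolve_overlaps`, the output keys, the order in which author lists are merged, and the fallback to dateCreated when datePublished is absent are all assumptions. Author values are assumed to be plain strings, as in the examples on this page.

```python
def as_list(value):
    """Normalize a missing value, a single string, or a list into a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def resolve_overlaps(metadata):
    """Apply the overlap rules to one parsed json-ld block."""
    # Rule 1: datePublished wins; dateCreated is only a fallback (assumption).
    pub_date = metadata.get("datePublished") or metadata.get("dateCreated")

    # Rule 3: combine unique names, keeping first-seen order (assumption).
    authors = []
    for key in ("author", "creator", "contributor"):
        for name in as_list(metadata.get(key)):
            if name not in authors:
                authors.append(name)

    return {
        "pub_date": pub_date,
        # Rule 2: only thumbnailUrl is read; "image" is ignored.
        "image_url": metadata.get("thumbnailUrl"),
        "authors": authors,
    }
```

For the example tag above, where creator and author both contain "Alan Alexander Milne", the combined author list holds that name once.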
We would also like to echo Google’s advice on structured data:
…it is more important to supply fewer but complete and accurate recommended properties rather than trying to provide every possible recommended property with less complete, badly-formed, or inaccurate data.
Last updated: November 13, 2024