Aside from the dashboard, how do I access the data sent to Parse.ly?
The dashboard includes a number of data access mechanisms, including:
- Excel/CSV exports on any listing screen
- Excel/CSV/HTML exports in the reporting suite
In addition to these dashboard-based data exports, you can also export data from Parse.ly via our HTTP/JSON API. This API is good for building site or CMS integrations. You can look at our API reference for usage examples.
If you need raw access to Parse.ly content/audience engagement data, including every unsampled event, you can license our Raw Data Pipeline. This will let you access compressed JSON files representing every event sent to Parse.ly via a secure Amazon S3 Bucket (historical/bulk access) or Amazon Kinesis Streams (low-latency streaming access). Our team can also help you integrate this data with open source and cloud technologies like Python, Spark, Google BigQuery, and Amazon Redshift.
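As a rough illustration of working with such files once they are downloaded, the sketch below tallies events from one compressed file. It rests on assumptions not stated above: that the files are gzip-compressed, that each line holds one JSON object, and that events carry an "action" field. The field names and sample events are hypothetical, and the data is built in memory so the sketch runs without any S3 access.

```python
import gzip
import io
import json
from collections import Counter


def count_events_by_action(fileobj):
    """Count events per 'action' value in a gzip-compressed,
    newline-delimited JSON stream (format is an assumption)."""
    counts = Counter()
    with gzip.open(fileobj, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            event = json.loads(line)
            # "action" is a hypothetical field name used for illustration.
            counts[event.get("action", "unknown")] += 1
    return counts


# Hypothetical sample events, compressed in memory to stand in for a file.
sample_events = [
    {"action": "pageview", "url": "http://www.example.com/article"},
    {"action": "heartbeat", "url": "http://www.example.com/article"},
    {"action": "pageview", "url": "http://www.example.com/other"},
]
raw = "\n".join(json.dumps(e) for e in sample_events).encode("utf-8")
sample_file = io.BytesIO(gzip.compress(raw))

event_counts = count_events_by_action(sample_file)
print(event_counts)
```

Streaming line by line keeps memory use flat, which matters when a file holds every unsampled event.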
How does unique visitor counting work?
Our system stores a site-specific user identifier for the purpose of showing aggregated “unique visitor” counts in Parse.ly products. These visitor counts are stored in a format that divides them into “new” and “returning” visitors, as well as a format that combines new and returning into a generic “visitor” bucket.
In many cases, aggregate new and returning visitor counts will not add up exactly to the combined visitor count. There are two basic reasons for this.
Primarily, it is possible and common for the same user to be both new and returning within a given aggregation period. Imagine a user who visited your site for the first time ever yesterday, then came back again today. On their first visit, they were considered “new” by our system, and on the second visit they were considered “returning”. Thus, when looking at new and returning visitor totals for the last two days in the Parse.ly dashboard, this user will be counted as both new and returning. This causes the sum of new and returning visitors to be greater than the combined visitor count stored in the database, since the combined count only counts each visitor once.
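The double counting described above can be reproduced with plain sets. This is only an illustrative sketch: the visit log and visitor IDs are made up, and it does not reflect how the counts are actually stored.

```python
# Hypothetical two-day visit log: (day, visitor_id).
visits = [
    ("yesterday", "visitor-a"),  # first visit ever -> new
    ("today", "visitor-a"),      # came back -> returning
    ("today", "visitor-b"),      # first visit ever -> new
]

seen_before = set()
new_visitors = set()
returning_visitors = set()

for _day, visitor_id in visits:
    if visitor_id in seen_before:
        returning_visitors.add(visitor_id)
    else:
        new_visitors.add(visitor_id)
        seen_before.add(visitor_id)

# The combined bucket counts each visitor once.
combined_visitors = new_visitors | returning_visitors

print(len(new_visitors))        # 2
print(len(returning_visitors))  # 1
print(len(combined_visitors))   # 2
```

Here new plus returning is 3, while the combined count is 2, because visitor-a appears in both buckets for the two-day period.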
The other factor strongly affecting the summability of visitor counts is the way they’re queried internally in our system. For query and storage efficiency, the sets of visitor identifiers (UUIDs) are queried as approximate counts using an algorithm called HyperLogLog++. This algorithm trades a small amount of accuracy in counting unique visitors for query speed, meaning that the Parse.ly dashboard is able to show visitor counts alongside the rest of its realtime data. A side effect of this is a small error rate in the counts of new visitors, returning visitors, and total visitors. Thus, summing new and returning visitors is not expected to result in exactly the total visitor count, which itself contains some amount of inaccuracy. Rest assured, though, that the error rate incurred by this algorithm is small, usually hovering around 2%, and that approximate counting of unique visitors is in line with the industry standard.
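To give a feel for approximate counting, here is a minimal sketch of the basic HyperLogLog algorithm. It is not the HyperLogLog++ variant named above, which adds refinements such as bias correction, and it is not Parse.ly's implementation. The precision of 12, the SHA-1 hash, and the visitor ID format are all assumptions chosen for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Basic HyperLogLog sketch (illustrative only, not HyperLogLog++)."""

    def __init__(self, precision=12):
        self.p = precision
        self.m = 1 << precision                 # number of registers
        self.registers = [0] * self.m
        # Bias constant from the original algorithm, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Take 64 bits of a hash of the item.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - self.p)              # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)   # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
true_count = 20000
for i in range(true_count):
    hll.add(f"visitor-{i}")   # hypothetical visitor IDs

estimate = hll.count()
print(true_count, round(estimate))
```

The sketch stores only 4,096 small registers regardless of how many visitors are added, which is where the speed and storage savings come from. The estimate will generally be close to, but not exactly, the true count.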
We have also documented in detail how we calculate returning and new visitors.
How does Parse.ly measure engaged time?
To explain our engaged time measurement, we’ll first describe the current industry standard and then explain how we use a more accurate measurement technology.
Traditional Time-on-Site Measurement
Many analytics platforms, including Google Analytics and Adobe’s Site Catalyst (Omniture), measure engaged time based on a user’s entry event (when they come to the page) and exit event (when they leave the page to go to another page on your site), both of which come with a timestamp. From there, these platforms calculate the time delta between each of those events per user session.
However, this way of measuring cannot take into consideration sessions that do not include an exit event. For example, if I visit your site and then leave to go to another site, or leave the tab open, an exit event never gets recorded.
This is an issue, as single-page visitors can comprise anywhere from 30-70% of the publisher’s audience. That’s a pretty substantial chunk to leave off of any benchmark analysis one might provide regarding time-on-site.
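The limitation of delta-based measurement can be sketched in a few lines. This is a simplified model under stated assumptions: each session is a list of hypothetical (url, timestamp in seconds) pageviews, and time on a page is taken as the gap until the next pageview. It is not the actual logic of any named analytics platform.

```python
def time_on_page_from_deltas(pageviews):
    """Return {url: seconds} for one session, where time on a page is the
    gap between its pageview and the next one. The final page has no
    following event, so no time is recorded for it."""
    durations = {}
    if not pageviews:
        return durations
    for (url, start), (_next_url, next_start) in zip(pageviews, pageviews[1:]):
        durations[url] = durations.get(url, 0) + (next_start - start)
    last_url = pageviews[-1][0]
    durations.setdefault(last_url, 0)  # exit page: nothing to subtract from
    return durations


# Hypothetical sessions.
multi_page = [("/a", 0), ("/b", 60), ("/c", 90)]
single_page = [("/a", 0)]

multi_result = time_on_page_from_deltas(multi_page)
single_result = time_on_page_from_deltas(single_page)
print(multi_result)   # {'/a': 60, '/b': 30, '/c': 0}
print(single_result)  # {'/a': 0}
```

However long the reader actually spent on the last page, or on the only page of a single-page visit, this model records zero.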
Parse.ly’s Engaged Time Measurement
To avoid the issue of exit events, Parse.ly uses a “heartbeat” pixel to measure engagement. This pixel pings every few seconds to check whether a user is still active on an article, as defined by:
- The browser tab is open, and,
- The user is presently engaging with the page. We detect this by identifying cursor movement, scrolling, video playing, clicking, etc.
After 10 seconds of inactivity, the heartbeat no longer considers the user engaged, and the time stops tracking. It can pick up again later, if that user re-engages with the article. Note that you’ll need to set the “PARSELY.videoPlaying” value to track engaged time on pages with embedded videos. See our documentation on that here.
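The inactivity rule can be modeled as follows. This is a simplified sketch, not the tracker's code: it works from a hypothetical list of interaction timestamps (in seconds) rather than simulating individual heartbeat pings, and the function and parameter names are made up. Only the 10-second timeout comes from the text above.

```python
INACTIVITY_TIMEOUT = 10  # seconds of inactivity before time stops tracking


def engaged_seconds(interaction_times, tab_open_until, timeout=INACTIVITY_TIMEOUT):
    """Sum the time during which the most recent interaction (cursor move,
    scroll, click, ...) happened less than `timeout` seconds ago."""
    times = sorted(t for t in interaction_times if t < tab_open_until)
    total = 0
    for i, t in enumerate(times):
        # Engagement after an interaction lasts until the next interaction,
        # the tab closing, or the timeout, whichever comes first.
        next_t = times[i + 1] if i + 1 < len(times) else tab_open_until
        total += min(next_t - t, timeout)
    return total


# Hypothetical reader: active at the start, idle from 8s to 40s,
# re-engages at 40s, and the tab closes at 60s.
total_engaged = engaged_seconds([0, 4, 8, 40, 45], tab_open_until=60)
print(total_engaged)  # 33
```

The reader had the tab open for 60 seconds, but only 33 are counted: the idle stretch contributes just the first 10 seconds after the last interaction, and tracking resumes when the reader re-engages.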
Since we aren’t dependent on entry/exit events, we are able to capture a more precise time measurement of your audience. This includes the time spent on the final page in a user’s session, and single-page visitors. And while heartbeat pixels still technically estimate actual time spent, they are markedly more accurate in terms of actual engagement on the page. Note that customers who implemented Facebook Instant Article tracking prior to June 2018 must adjust their integration code to capture engaged time on this platform.
How do I implement an accurate shares integration?
It’s important to keep your share URLs consistent with your Parse.ly canonical URL. Parse.ly looks at the <link rel="canonical"> tag and the <meta property="og:url"> tag when determining the possible aliases for a post. If these URLs are different, it can cause incorrect reporting of not just shares, but pageviews, visitors, etc. in the dashboard as well.
I’m still not sure about this. Can I get more information?
More information on properly configuring URLs for accurate Shares data can be found here.
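One way to spot a mismatch between the two tags on your own pages is to extract both values and compare them. The sketch below uses only Python's standard library. The helper names and sample HTML are hypothetical, and this is a local sanity check, not how Parse.ly resolves aliases.

```python
from html.parser import HTMLParser


class UrlTagParser(HTMLParser):
    """Collect the canonical link and og:url values from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.og_url = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "meta" and attrs.get("property") == "og:url":
            self.og_url = attrs.get("content")


def check_share_urls(html):
    """Return (canonical, og_url, consistent). `consistent` is True only
    when both tags are present and carry the same URL."""
    parser = UrlTagParser()
    parser.feed(html)
    consistent = (
        parser.canonical is not None and parser.canonical == parser.og_url
    )
    return parser.canonical, parser.og_url, consistent


# Hypothetical page whose og:url carries an extra query string.
sample_html = """
<head>
  <link rel="canonical" href="http://www.example.com/article">
  <meta property="og:url" content="http://www.example.com/article?ref=home">
</head>
"""

canonical, og_url, consistent = check_share_urls(sample_html)
print(canonical, og_url, consistent)
```

In this sample the two URLs differ by a query string, so the check reports them as inconsistent.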
Where’s my LinkedIn share data?
In February 2018, LinkedIn turned off access to shares data from their network. Parse.ly continued to display past data for several months to allow for historical analysis but eventually removed it from the dashboard.
Why isn’t my post showing in the dashboard?
There are two main reasons a post might not appear in the dashboard.
1. Not Enough Traffic.
Data is collected only for posts that receive more than 3 page views on a given day.
2. Metadata extraction problem.
Before posts can be included in the dashboard, we must collect the title, link, and other metadata from the post. If there is a problem while extracting post metadata, that will prevent the post from appearing in the dashboard. Note that url values in the pageview pixel and in your metadata must be full URLs (e.g. “http://www.example.com/article”), not relative paths (e.g. “/article”). If you suspect that there may be a problem with one of your posts, you can use the validator tool to verify that the metadata can be extracted successfully.
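The full-URL requirement can be checked before publishing with a small helper. This is a sketch using Python's standard library. The function name is hypothetical, and accepting only http and https schemes is an assumption. It does not replace the validator tool.

```python
from urllib.parse import urlparse


def is_full_url(url):
    """Return True if `url` has an http(s) scheme and a host,
    i.e. it is a full URL rather than a relative path."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_full_url("http://www.example.com/article"))  # True
print(is_full_url("/article"))                        # False
print(is_full_url("//www.example.com/article"))       # False (no scheme)
```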
Last updated: January 02, 2023