In it for the long haul.

As part of our UX 101 education series, where we discuss the different types of studies and research methodologies you can use with our own user research platform, we’d like to introduce our readers to the exciting world of longitudinal benchmarking!

What is longitudinal benchmarking?

Longitudinal Benchmarking enables you to look at how a particular website or app performs over time. Organizations use this type of benchmarking to track the ‘health’ of their sites/apps over time and after iterations.

It is particularly useful if regular changes are being made to a site. A sudden drop in ratings can indicate that a recent change has had a negative impact, and a sudden rise can indicate the opposite!

Longitudinal Benchmarking is typically carried out on live sites that are already firmly established. The data collected tends to be quantitative and does not typically influence the design process. Rather, the metrics are useful for flagging when things are going particularly badly – this may in turn initiate further research into a specific feature of the site.

What are typical use cases for longitudinal benchmarking?

Often, researchers, designers and product managers need to present key metrics to their stakeholders at the end of each year. The data collected from this methodology tends to be very simple (just a series of average scores), meaning that it can be presented in snappy dashboards or heatmaps that are easily understood by non-researchers.

Aside from that, the typical use case is that organizations want to track whether the changes they are implementing are actually improving their site/app/product. It also allows designers, product managers and stakeholders to make decisions with context: knowing how a previous version performed makes it easier to tell whether your decisions are making the product better or worse, and where to focus your priorities.

Longitudinal Benchmarking can also be used in conjunction with Competitor Benchmarking. Where Longitudinal Benchmarking looks at how a site performs over time, Competitor Benchmarking looks at how well it performs against its competitors. Even if a site appears to be performing well in a Longitudinal Benchmark – it could still be in trouble if its competitors are performing significantly better!

How does a longitudinal benchmark work?

The first step is to design a study that will gather the key performance metrics you want to track. Keep in mind that these are going to be the same questions, metrics and tasks you track repeatedly over time, which is why it’s important to liaise with your stakeholders to find out what journey they are most interested in and what metrics are of key importance.

This can vary greatly from organization to organization and even from team to team. For example, a travel website would most likely want to know whether participants can complete the booking flow, and would then ask about their perceptions of the site: Ease of Use, Overall Satisfaction, SUS (System Usability Scale), NPS (Net Promoter Score), etc.

The next step is to determine how frequently your organization will need these metrics. Some may want a snapshot each month while others may want to see it 2 or 4 times per year. It might be after every major update for others. Again, this will be unique to each team and it is important to liaise with them to find out what will be of most value to them.

Once you have made these decisions you need to carefully schedule when you will collect data for each phase. It is important to get an estimate from your research team for how long data collection takes and factor this into your plans.

Practical advice for running a longitudinal benchmark study

As with any form of benchmarking, it is crucial to get a good sample size for each phase. A good rule of thumb is to get at least 50 participants for each segment you are testing. This will enable you to determine (using statistical analyses) if there have been meaningful changes to the KPIs over time.

For example, after the addition of a search function to a site you might see a statistically significant increase to Ease of Use and SUS scores. To get these numbers quickly, whilst keeping down costs, a remote unmoderated methodology is usually the optimal choice.
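To get a feel for why sample size matters here, the sketch below estimates the margin of error around a task success rate at different sample sizes. It is only an illustration: the function name `margin_of_error` is made up for this example, the 50% success rate is a worst-case assumption, and the calculation uses the simple normal-approximation interval rather than anything specific to a research platform.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p: float, n: int, confidence: float = 0.95) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion.

    p is the observed proportion (e.g. task success rate), n the sample size.
    """
    # z value for a two-sided interval, about 1.96 at 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


# Hypothetical sample sizes; p = 0.5 gives the widest possible interval.
for n in (20, 50, 200):
    print(f"n={n}: +/- {margin_of_error(0.5, n):.1%}")
```

With fewer participants the interval around each score gets wider, so a real change between phases is harder to tell apart from noise.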

Also, keep in mind that you don’t want to make too many changes to your study between rounds, because then the point of tracking over time gets lost. Sometimes a new KPI will need to be tracked, which is fine, but otherwise try to keep the questions, tasks and metrics as similar as possible from one round to the next.

When should you use longitudinal benchmarking?

Here are a few essential facts to consider while deciding on whether or not a longitudinal benchmark study is the right approach for your research goals:


Advantages

  • If you invest the time and energy up front in building your first study, re-running it should be a snap, since you are reusing the same questions and tasks, which saves a lot of time.
  • The mainly quantitative nature of the results cuts down on analysis time compared with qualitative results.
  • You can track KPIs over time to show how design changes are moving the needle on different business objectives.
  • It allows stakeholders and executives to make decisions based on context and previous ratings instead of gut feelings and hunches.
  • It can highlight areas of the product that need your attention and be used to prioritize research.


Disadvantages

  • Because the results are mainly quantitative, you might need follow-up research to dig into what is causing any issues.
  • It needs to be planned in advance, so it is not suited to ad-hoc issues.
  • Depending on the cadence, certain executives might not see the value and will need to be educated.

What results do you get?

This is largely dependent upon what you are tracking and what questions you’re asking. Typically, these studies include a key task or two, which means you will get task success rates, time on task, pageviews and clicks.

And then, depending on what questions you’re asking, you will get a Likert scale rating (if a rating scale question was asked), a score derived from percentages (if a Net Promoter Score question was asked, for example) or a series of qualitative answers (if an open-ended question was asked).
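As a rough illustration of how these numbers come together for one phase, here is a small sketch. The data is invented and the field names (`task_success`, `time_on_task_s`, `ease_of_use`, `recommend`) are assumptions for this example, not an export format from any particular tool. The Net Promoter Score follows the standard definition: the percentage of promoters (9 or 10 on a 0 to 10 scale) minus the percentage of detractors (0 to 6).

```python
from statistics import mean


def task_success_rate(outcomes):
    """Share of participants who completed the task (True counts as 1)."""
    return sum(outcomes) / len(outcomes)


def net_promoter_score(ratings):
    """NPS on a 0-10 scale: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Made-up responses from five participants in a single phase.
phase = {
    "task_success": [True, True, False, True, True],
    "time_on_task_s": [42.0, 55.5, 90.0, 38.5, 61.0],
    "ease_of_use": [4, 5, 2, 4, 5],   # 1-5 Likert ratings
    "recommend": [10, 9, 6, 8, 9],    # 0-10 likelihood to recommend
}

summary = {
    "task_success_rate": task_success_rate(phase["task_success"]),
    "mean_time_on_task_s": mean(phase["time_on_task_s"]),
    "mean_ease_of_use": mean(phase["ease_of_use"]),
    "nps": net_promoter_score(phase["recommend"]),
}
print(summary)
```

Running the same summary for every phase gives you the series of scores that ends up on the dashboard.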

Tips for analyzing your results

We highly recommend using statistical analysis to determine if there have been meaningful changes in the key metrics across the phases.

T-tests will tell you if there is a statistically significant difference between the means of two groups, for example whether the mean Ease of Use scores are significantly higher in Phase 2 than in Phase 1. In a similar manner, chi-squared tests will tell you if there is a statistically significant difference between percentage values, for example whether Task Success was significantly higher in Phase 2 than it was in Phase 1.
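To make the two tests a little more concrete, here is a minimal sketch using only the Python standard library. Several things in it are assumptions: the helper names `welch_t` and `chi_squared_2x2` and all of the data are invented for this example, the t-test is the Welch (unequal variances) variant, and its p-value uses a normal approximation that is only reasonable at sample sizes like the 50 or so per segment suggested earlier. A dedicated statistics package would give exact p-values.

```python
from math import erfc, sqrt
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's t statistic for mean(b) - mean(a), with an approximate
    two-sided p-value (normal approximation, fine only for large samples)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(b) - mean(a)) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


def chi_squared_2x2(success_a, n_a, success_b, n_b):
    """Pearson chi-squared test on a 2x2 success/failure table
    (no continuity correction; assumes no expected cell count is zero)."""
    observed = [[success_a, n_a - success_a], [success_b, n_b - success_b]]
    rows = [n_a, n_b]
    cols = [success_a + success_b, (n_a - success_a) + (n_b - success_b)]
    total = n_a + n_b
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, the chi-squared tail probability is erfc(sqrt(x / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


# Invented Ease of Use ratings (1-5), 50 participants per phase.
phase1_ease = [3] * 20 + [4] * 20 + [5] * 10
phase2_ease = [3] * 10 + [4] * 20 + [5] * 20
t_stat, t_p = welch_t(phase1_ease, phase2_ease)

# Invented Task Success counts: 30 of 50 in Phase 1, 42 of 50 in Phase 2.
chi2_stat, chi2_p = chi_squared_2x2(30, 50, 42, 50)

print(f"Ease of Use: t = {t_stat:.2f}, p = {t_p:.4f}")
print(f"Task Success: chi2 = {chi2_stat:.2f}, p = {chi2_p:.4f}")
```

A p-value below the threshold you agreed on beforehand (0.05 is a common choice) suggests the change between phases is unlikely to be down to chance alone.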

As far as presenting the results goes, some stakeholders will simply want a dashboard for each phase, where all the key metrics are summarized on one slide. Others will want a full report for each phase, with separate slides for each metric (including details of any statistical analyses performed). These stakeholders will usually also ask for a Dashboard slide in the form of an Executive Summary, either at the beginning or end of the report.

When all phases are complete the client will usually pull together the Dashboard slides from each phase and present these to their team/stakeholders.

This concludes our introduction to longitudinal benchmark studies. Thank you for reading and don’t forget to check out the rest of our UX 101 education series to help you on your way!