[Support Guide] Synthetic Performance Testing

SeanRoberts · April 13, 2022, 5:19pm

Last Reviewed By Netlify Support Staff: November 2024

Synthetic testing is a form of “lab” testing. As with lab testing in the scientific community, the idea is to hold as many variables constant as possible so that we can observe how a finite set of changes impacts a test case.

With synthetic performance tests, we simulate a user’s visit to a website and measure aspects of the site and its performance. The changes under test are the newly deployed changes to the site. Each test runs from a specific location, device, connection type, etc. so that any two tests are consistent with each other. If the data changes between two tests, we can investigate the differences knowing that we’ve reduced the number of variables that could contribute, so it’s likely that our changes introduced the difference.

Primary uses of Synthetic testing:

- Identifying conditions that impact load performance
- Diving deeper into the network waterfall and Central Processing Unit (CPU) utilization data to optimize
- Debugging what has changed between two points in time
- Preventing regressions at build-time by blocking merges based on synthetic test data
- Determining the impact of third-party scripts on a website

Depending on the tool we use for synthetic testing we’ll be able to see load metrics, custom metrics, accessibility scores, information about offline support, security warnings, etc.

Methods of Testing

Synthetic tests are usually set up in at least one of three ways:

- Manual (ad-hoc) testing
- Branch / Deploy testing
- Scheduled testing

Manual Testing

This method is great for initial data collection, testing one-off cases, or debugging scenarios. The process is to select our tool of choice, define the test conditions (or accept the defaults), and start a test run. Some tools let us see past results but, for the most part, these tests are one-offs and are usually kept only for the duration of the testing session.

Testing sites using our computer locally is also considered a manual test - though it’s likely much harder to control the variables such as networking than it is with purpose-built tools. When we do tests locally, using developer tools to audit the site and capturing the .har file for analysis is recommended.
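As a rough illustration of what analyzing a captured .har file can look like, the sketch below lists the requests that spent the longest waiting on the server. It relies on the `log.entries[].timings.wait` field from the HAR 1.2 format; the helper names (`load_har`, `slowest_waits`) and the inline sample data are made up for this example and are not part of any tool.

```python
import json


def load_har(path: str) -> dict:
    """Read a .har file exported from browser developer tools."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def slowest_waits(har: dict, top: int = 5) -> list[tuple[str, float]]:
    """Return the `top` requests with the largest server wait time in ms."""
    rows = []
    for entry in har["log"]["entries"]:
        wait = entry["timings"].get("wait", -1)
        if wait >= 0:  # skip missing or negative (not applicable) values
            rows.append((entry["request"]["url"], float(wait)))
    return sorted(rows, key=lambda row: row[1], reverse=True)[:top]


# Hypothetical, heavily trimmed stand-in for a real HAR export.
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://example.com/"}, "timings": {"wait": 412.0}},
            {"request": {"url": "https://example.com/app.js"}, "timings": {"wait": 95.5}},
            {"request": {"url": "https://example.com/logo.png"}, "timings": {"wait": 38.2}},
        ]
    }
}

for url, wait_ms in slowest_waits(sample_har, top=2):
    print(f"{wait_ms:8.1f} ms  {url}")
```

With a real capture, `load_har("session.har")` (the file name is an assumption) would replace the inline sample.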

Branch / Deploy Testing

When we ship code to a code repository or deploy it to a preview/production environment, we have a good opportunity to test whether these changes impact performance. Doing so creates a trail of information that ties performance results to the changes that introduced them. This is usually done via a build/deploy system triggering a preconfigured test to run in the testing tool of choice. For example, the Netlify Lighthouse build plugin allows us to configure which pages to test and will trigger those tests on builds.
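To make the idea of a build-time check concrete, here is a minimal sketch of a threshold gate. It assumes we have a Lighthouse JSON report, where each entry under `categories` carries a `score` between 0 and 1; the function name, the sample report, and the threshold values are hypothetical, and this is not how the Netlify plugin itself is implemented.

```python
def failed_categories(report: dict, thresholds: dict[str, float]) -> list[str]:
    """List the categories whose score falls below the configured minimum."""
    failures = []
    for category, minimum in thresholds.items():
        score = report["categories"][category]["score"]
        # Lighthouse can report a null score when a category could not be computed.
        if score is None or score < minimum:
            failures.append(f"{category}: {score} < {minimum}")
    return failures


# Hypothetical, trimmed report and thresholds for illustration only.
sample_report = {
    "categories": {
        "performance": {"score": 0.82},
        "accessibility": {"score": 0.97},
    }
}
sample_thresholds = {"performance": 0.9, "accessibility": 0.9}

problems = failed_categories(sample_report, sample_thresholds)
for problem in problems:
    print("Below threshold:", problem)
```

In a CI job, a non-empty `problems` list would typically be turned into a non-zero exit code so that the merge or deploy is blocked.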

Scheduled Testing

This method is great for creating a baseline of data to detect changes over time. For tools that support this method, we would configure the tests ahead of time along with the intervals that we want to test the site.

Tools for Synthetic performance testing:

There are lots of tools out there. This is an unordered list of the more common ones:

- Netlify’s Lighthouse build plugin
- SpeedCurve - Getting Started Guide - Try a manual test (trial)
- WebPageTest - Getting Started Guide - Try a manual test
- Calibre - Getting Started Guide - Try a manual test
- DebugBear - Getting Started Guide - Try the DebugBear Netlify Build Plugin
- UpTrends - Getting Started Guide - Try a manual test
- Google’s HAR Analyzer - Docs

Interpreting Results

Because there are so many tools and configurations, we couldn’t possibly break down the details for each tool, but there are some common threads that can be applied when reviewing data from any tool.

After running a test, we receive a lot of data. How might we go about interpreting this data? As we’ve established, synthetic performance testing targets a single combination of variables at a single point in time. It’s critical to keep that perspective as we interpret lab results so that we can identify which variables might be contributing to the results.

Variability in Results

Variability is the symptom we see when results are different between tests even though nothing changed in the code.

In web performance testing, we always expect variability. Why? Because we are dealing with the internet’s expanse of tubes, wires, and satellites, all doing their best to work reliably, which is impossible to do consistently at the scale of the web. Synthetic testing tools work very hard to eliminate variability, but some always remains due to the inherent reality of the web and networking.

For example, the Google Lighthouse team put together a great document outlining variability concerns and possible mitigations when doing testing.

Some variability is within our control and worth investigating. For example, if our site uses experimentation solutions (A/B testing, split testing, etc.), our tests may be bucketed differently from run to run. Another common situation is loading logic that is targeted at a device type or geolocation (like GDPR targeting). If we see variability in results that isn’t related to networking variability, we should ensure that our site is loading in a deterministic manner.
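One common way to keep variability from misleading us is to run the same test several times and compare medians rather than single runs. The sketch below assumes we have collected a metric (in milliseconds) from repeated runs under identical conditions; the sample numbers and the 10% tolerance are arbitrary choices for illustration, not recommendations from any tool.

```python
from statistics import median


def summarize_runs(values_ms: list[float]) -> dict[str, float]:
    """Summarize repeated runs of one metric under identical conditions."""
    mid = median(values_ms)
    spread = max(values_ms) - min(values_ms)
    return {"median": mid, "range": spread, "relative_range": spread / mid}


def is_real_change(before: list[float], after: list[float], tolerance: float = 0.10) -> bool:
    """Flag a change only if the medians differ by more than `tolerance`."""
    baseline = median(before)
    return abs(median(after) - baseline) / baseline > tolerance


# Hypothetical measurements: five runs before a deploy and five runs after.
before_deploy = [812, 790, 845, 801, 1190]  # one noisy outlier
after_small = [820, 805, 798, 860, 815]
after_large = [1010, 990, 1045, 1002, 1100]

print(summarize_runs(before_deploy))
print(is_real_change(before_deploy, after_small))
print(is_real_change(before_deploy, after_large))
```

The median keeps the single 1190 ms outlier from dominating the comparison, which a mean or a one-off run would not.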

Troubleshooting specific metric data

There are countless metrics that teams can observe and monitor. For the common metrics, here are some notes that might help us identify issues or work out how to improve them.

Time-to-first-byte (TTFB) for HTML or any asset seems high

- This metric is directly impacted by network variability so we especially need to account for that with regard to time consistency and volatility.
- Most synthetic testing tools default to slower connection speeds like “slow” or “fast” 3G. Check the tool’s configuration to understand what connection speed is being tested. Tests that load with slower connections will always be slower than tests with faster connections. Before tweaking things based on the number, verify this connection speed.
  - Given this, an important question emerges: “What connection speed should we test with?” The answer is usually, “The one that’s most representative of our users.” Because we are testing a single persona, we don’t expect data to be 1:1 with what all users actually experience, but getting close is useful to hypothesize impact. In that sense, the number by itself has no context, so we instead view the data as “TTFB is Xms for Y device at Z location and T connection speed.” We then have the perspective to compare it to results that use the same Y, Z, and T variables to understand change.
- The number itself is determined by the client networking conditions, the server responding, and networking variability in between.
  - If we’ve confirmed that the networking settings for the test tool aren’t the reason for slower timing, then we investigate whether the server can be improved.
  - Confirm that we’re leveraging the world-wide CDN effectively.
    - That could mean using Netlify DNS or making sure our custom subdomains (like www) are set up as the primary domain in our UI with a CNAME record from the DNS provider.
    - If we’re using basic-auth or site-wide password protection, all requests will need to be authenticated by Netlify’s US west-coast origin server. If we require password protection in production but would like requests to utilize our CDN, we could achieve this by using JWTs. Removing basic-auth/site-wide password protection will allow our site to utilize Netlify’s full CDN.
      - If our need for password protection is because this is pre-production or internal, then we could either wait to test in a non-protected manner OR accept the temporary additional latency while it’s pre-production.
    - If we are using file hashing of assets (such as files that look like this: style-f43f34jgn43gtnkj3n4534.css) we are taking on an inherent risk that these immutable files might be purged from the CDN on subsequent builds. This purging is what allows for instant CDN cache invalidation. We need to ensure that asset hashing is turned off or that hashing is based on file contents so that a hash only ever changes if the content changes. When in doubt, turn off hashing and let the Netlify CDN take care of caching and invalidation.
- If we have lots of location redirections to load a page/asset, the page can appear to take longer to load even though each redirection loads quickly. In this case, we should attempt to collapse redirects into the minimal possible number for any path to improve this timing.
- If we’re putting external CDN providers in front of Netlify then we will introduce additional networking hops that depend on that external CDN to have optimal networking and to be configured correctly, both of which Netlify has no control over. If there’s no reason to have another CDN provider proxying requests from Netlify then we should remove this additional hop to improve performance.
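The point about collapsing redirects can be sketched as follows. This assumes we can express our redirect rules as a simple source-to-target mapping; the URLs and function names are hypothetical, and real redirect rules (wildcards, status codes, conditions) are more involved than this.

```python
def redirect_chain(start: str, redirects: dict[str, str], max_hops: int = 10) -> list[str]:
    """Follow redirects from `start` and return every URL visited, in order."""
    chain = [start]
    while chain[-1] in redirects and len(chain) <= max_hops:
        target = redirects[chain[-1]]
        if target in chain:
            break  # redirect loop; stop following
        chain.append(target)
    return chain


def collapse(redirects: dict[str, str]) -> dict[str, str]:
    """Point every source directly at its final destination (one hop each)."""
    return {source: redirect_chain(source, redirects)[-1] for source in redirects}


# Hypothetical rules: protocol upgrade, then www, then trailing slash.
sample_redirects = {
    "http://example.com/blog": "https://example.com/blog",
    "https://example.com/blog": "https://www.example.com/blog",
    "https://www.example.com/blog": "https://www.example.com/blog/",
}

chain = redirect_chain("http://example.com/blog", sample_redirects)
print(f"{len(chain) - 1} hops:", " -> ".join(chain))
print(collapse(sample_redirects))
```

Each hop costs at least one extra network round trip before the final response starts, so replacing the three-hop chain with direct rules removes that overhead from the measured load time.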

Lighthouse performance score issues

- This score is a combination of multiple metrics. To improve it, we need to review the individual metrics that feed into it; optimizing those will improve the score.
- The Lighthouse team periodically changes which metrics (and their respective weights) go into this score. So even when nothing changes on our site, the score itself can change.
- Lighthouse provides a detailed breakdown of why Lighthouse data fluctuates and their perspective on variability.
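To illustrate how a weighting change alone can move the score, the sketch below applies two sets of weights to the same per-metric scores. The weights are the ones I recall from the Lighthouse scoring documentation for versions 8 and 10, so please verify them against the current docs; the per-metric scores are invented, and the step where Lighthouse converts raw timings into 0 to 1 metric scores is not shown.

```python
# Weights as recalled from Lighthouse scoring docs (assumption: verify before relying on them).
WEIGHTS_V8 = {"fcp": 0.10, "si": 0.10, "lcp": 0.25, "tti": 0.10, "tbt": 0.30, "cls": 0.15}
WEIGHTS_V10 = {"fcp": 0.10, "si": 0.10, "lcp": 0.25, "tbt": 0.30, "cls": 0.25}


def performance_score(metric_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-metric scores, each between 0 and 1."""
    total_weight = sum(weights.values())
    weighted = sum(metric_scores[name] * weight for name, weight in weights.items())
    return weighted / total_weight


# Hypothetical per-metric scores for one unchanged page.
sample_scores = {"fcp": 0.9, "si": 0.8, "lcp": 0.7, "tti": 0.5, "tbt": 0.6, "cls": 1.0}

print(round(performance_score(sample_scores, WEIGHTS_V8), 3))
print(round(performance_score(sample_scores, WEIGHTS_V10), 3))
```

The page is identical in both calls; only the weighting differs, which is why comparing scores across Lighthouse versions needs care.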

Users aren’t seeing the same performance

As mentioned, a single persona (that we model in our tests) cannot possibly represent the entirety of our user base. As we see improvements/regressions in synthetic data, it’s possible that only a specific cohort of users with a similar combination of variables will see them. Usually, changes will impact all users, but the degree to which they are impacted can vary substantially.

To understand how our full user base is experiencing our sites, we should use Real User Monitoring (RUM) tools to capture their performance data. We can also use RUM tools to better understand our users’ performance data and demographics. If we can identify a more representative sample persona of our primary users, we can model our synthetic tests after that persona.
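As a rough sketch of picking a representative persona, the snippet below finds the most common combination of device, connection, and country in a set of visit records. The record shape and field names are assumptions made for this example; actual RUM tools export data in their own formats.

```python
from collections import Counter


def most_common_persona(
    visits: list[dict],
    fields: tuple[str, ...] = ("device", "connection", "country"),
) -> tuple[dict, float]:
    """Return the most frequent combination of `fields` and its share of visits."""
    counts = Counter(tuple(visit[field] for field in fields) for visit in visits)
    combination, count = counts.most_common(1)[0]
    return dict(zip(fields, combination)), count / len(visits)


# Hypothetical RUM export, trimmed to the fields used here.
sample_visits = [
    {"device": "mobile", "connection": "4g", "country": "DE"},
    {"device": "mobile", "connection": "4g", "country": "DE"},
    {"device": "desktop", "connection": "wifi", "country": "US"},
    {"device": "mobile", "connection": "4g", "country": "DE"},
    {"device": "mobile", "connection": "3g", "country": "IN"},
]

persona, share = most_common_persona(sample_visits)
print(persona, f"{share:.0%} of visits")
```

The resulting combination is a candidate for the device, location, and connection settings of our synthetic tests.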

Resources

Synthetic performance testing is a key component for optimizing the performance of websites. We hope you’ll take advantage of these tools and techniques to make your site even faster! If you have questions, please create a new thread so our support staff can assist.

elden · November 13, 2024, 6:22pm

Reviewed by @elden on 11/13/2024. Verified all links still work.
