Etag won't change between builds for JSON files – causes caching issues

My website uses .json files to pre-fetch content across pages (similar to Gatsby). I noticed that there are caching problems with these files causing the content to be outdated across builds.

By digging further I discovered that the etag is not changing between builds for json files, even though it is changing as expected for other file formats like html and js. This can be seen by using these commands:

curl -I https://6075feb68aebef000702b864--idec.netlify.app
curl -I https://6075feb68aebef000702b864--idec.netlify.app/content.json
curl -I https://6075fddbc054a90007bc988e--idec.netlify.app
curl -I https://6075fddbc054a90007bc988e--idec.netlify.app/content.json

Notice that the etag changes as expected on the root but does not change between the two /content.json URLs.

Also, for some reason the problem is more noticeable on Safari; Chrome seems to disregard the etag and fetches the updated files. This is probably just a more aggressive strategy in Chrome, but the problem is still quite real for Safari users (and for any other browser that respects etags properly).

Hi, @georgesboris. The etag will only change between deploys when the file itself changes. You see that for the URL with the path of ‘/’.

There are two different files sent (using MD5 checksum to show this):

$ curl -s https://6075feb68aebef000702b864--idec.netlify.app/ | md5sum
f1a9faa96fc4bef359799bc46b6cfb8b  -
$ curl -s https://6075fddbc054a90007bc988e--idec.netlify.app/ | md5sum
c6bdfbfe2f9ee752d68c9ba19ce6edb8  -

So they do have two different etags:

$ curl -I https://6075feb68aebef000702b864--idec.netlify.app 2>&1 | grep etag
etag: "7f704ae5f68028a70861e2d660a83f18-ssl"
$ curl -I https://6075fddbc054a90007bc988e--idec.netlify.app 2>&1 | grep etag
etag: "94d6d1ba014220939d73929906976b57-ssl"

If the content for the URL changes, the etag will change as well.

However, for the two /content.json path URLs, the content didn’t change at all between the deploys:

$ curl -s https://6075feb68aebef000702b864--idec.netlify.app/content.json | md5sum
2cc5af16dced31af68f4e0169f865847  -
$ curl -s https://6075fddbc054a90007bc988e--idec.netlify.app/content.json | md5sum
2cc5af16dced31af68f4e0169f865847  -

There are two different deploys, but that file is identical in both. Both URLs have the exact same MD5 checksum. This means the file itself is unchanged and, therefore, the etag is also unchanged:

$ curl -I https://6075feb68aebef000702b864--idec.netlify.app/content.json 2>&1 | grep etag
etag: "0e71dbf50fbbafbc039c471462822f6e-ssl"
$ curl -I https://6075fddbc054a90007bc988e--idec.netlify.app/content.json 2>&1 | grep etag
etag: "0e71dbf50fbbafbc039c471462822f6e-ssl"

The etag changes whenever the file changes, and only when the file changes. This means that for all the files unchanged between deploys, the local browser cache can and will be used. For files which do change between deploys, the etag headers change and the new content is sent.
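
To make that behavior concrete, here is a minimal local sketch of etag-based revalidation. It is an illustration only: the `etag_for` and `revalidate` helpers and the `deploy1.json`/`deploy2.json`/`deploy3.json` file names are hypothetical, and a plain MD5 of the file content stands in for the real etag, since how our servers actually derive the etag value is not covered in this thread.

```shell
#!/bin/sh
# Hypothetical stand-in for a server-side etag: a quoted hash of the file content.
# (Assumption for illustration; the real etag derivation is not described here.)
etag_for() {
  printf '"%s"' "$(md5sum < "$1" | cut -d' ' -f1)"
}

# Simulated revalidation of a cached copy.
# $1 = file currently deployed, $2 = etag the browser has cached
# Prints 304 when the cached etag still matches, 200 otherwise.
revalidate() {
  if [ "$(etag_for "$1")" = "$2" ]; then
    echo 304
  else
    echo 200
  fi
}

dir=$(mktemp -d)
printf '{"title":"hello"}' > "$dir/deploy1.json"     # first deploy
printf '{"title":"hello"}' > "$dir/deploy2.json"     # new deploy, same content
printf '{"title":"changed"}' > "$dir/deploy3.json"   # new deploy, content changed

# The browser cached the file (and its etag) from the first deploy.
cached=$(etag_for "$dir/deploy1.json")

unchanged_status=$(revalidate "$dir/deploy2.json" "$cached")
changed_status=$(revalidate "$dir/deploy3.json" "$cached")

echo "unchanged file: $unchanged_status"
echo "changed file: $changed_status"

rm -rf "$dir"
```

In this sketch the unchanged file revalidates as 304 (the cached copy is reused) and the changed file as 200 (new content is sent). Against a live deploy, the equivalent check would be to send the cached etag in an If-None-Match request header, for example with curl's -H option, and look at the status line of the response.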

This is the fine-grained cache invalidation we talk about in this blog post:

If there are other questions about this, please let us know.