After using the zip file deploy

This is more of a "what happens" question than a "how to solve a problem" one. We are using the zip deploy method, which works like a charm and was easy to set up as part of the PHP build script we wrote. But I am wondering how using the zip method affects files on the CDN.

Does the Netlify build system replace just the files that are different from what is already uploaded? Or does it replace everything? Looking at our deploy details, it always mentions:
* *All files already uploaded*
* *All files already uploaded by a previous deploy with the same commits.*

It says this even though some of the newly uploaded files have changed.

And along the same lines, what happens to files cached at the edge? Is everything purged, so that it has to be pulled again by the first website visitor? Or does it flush only the files that changed?

If it does flush everything each time, does the Digest method avoid this? Or does it employ the same approach?

We love questions like these - thanks for asking!

Zip deploys are no different than any other deploy:

  1. we look at all files
  2. we upload the files that aren’t already in our backing store
  3. we then publish the new deploy, with just the files that were in it. Any prior files are irrelevant as far as what is available in the new deploy.

Step 2 there is often a bit confusing - if we’ve published another byte-identical copy of the file before, we just use a pointer to it from your site, rather than re-uploading a copy. Even if another customer uploaded it; if the checksum is identical, we’ll use that copy (which the other customer cannot change or remove, once uploaded).
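To make step 2 concrete, here is a minimal sketch of checksum-based deduplication. It is an illustration only, not Netlify's implementation: the `deploy` function, the in-memory `store` dict standing in for the backing store, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def digest(data: bytes) -> str:
    # SHA-256 is an arbitrary choice for this sketch; the thread only says "checksum".
    return hashlib.sha256(data).hexdigest()


def deploy(files: dict[str, bytes], store: dict[str, bytes]) -> tuple[dict[str, str], list[str]]:
    """Hypothetical deploy: upload only content the store has never seen.

    Returns the new deploy's manifest (path -> checksum) and the list of
    paths whose content actually had to be uploaded.
    """
    manifest: dict[str, str] = {}
    uploaded: list[str] = []
    for path, data in files.items():
        key = digest(data)
        if key not in store:
            store[key] = data       # step 2: upload content that is new to the store
            uploaded.append(path)
        manifest[path] = key        # otherwise just point at the existing copy
    return manifest, uploaded       # step 3: the deploy contains only these paths


store: dict[str, bytes] = {}        # stand-in for the shared backing store

manifest1, uploaded1 = deploy(
    {"/index.html": b"<h1>v1</h1>", "/old.css": b"body{}"}, store
)
manifest2, uploaded2 = deploy(
    {"/index.html": b"<h1>v1</h1>", "/app.js": b"console.log(1)"}, store
)
```

In this sketch the second deploy uploads only `/app.js`, because the bytes of `/index.html` are already in the store, and `/old.css` is absent from the second manifest even though its content stays in the store.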

If you’re seeing fewer files being uploaded than you expected, it is likely the rest were already uploaded. If you make a file you KNOW wouldn’t have been uploaded before, e.g. one full of 10k random bytes, it will always be uploaded. But otherwise - across 10 million sites, we’ve seen a lot of files over the years :wink:

Around caching, the TL;DR is that for every deploy, everything is “marked stale” and our cache must be re-primed by the first visitor to EACH CDN node. If the file is the same, we don’t re-send it: the CDN node checks with the backing store, sees that the file is unchanged, doesn’t re-fetch it, and tells the browser to use its cached copy. That takes less time than transmitting the file from our backing store in SF to the local CDN node and then to the browser.
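The stale-then-revalidate behaviour can be sketched as a toy model. Everything here is hypothetical (the `EdgeNode` class, the `origin` dict standing in for the backing store, the version strings used as checksums); it only illustrates the flow described in this answer, not how the CDN is actually built.

```python
from dataclasses import dataclass, field


@dataclass
class CachedEntry:
    checksum: str
    body: bytes
    stale: bool = False


@dataclass
class EdgeNode:
    # origin maps path -> (checksum, body); a stand-in for the backing store
    origin: dict[str, tuple[str, bytes]]
    cache: dict[str, CachedEntry] = field(default_factory=dict)

    def mark_all_stale(self) -> None:
        # What a new deploy does to this node: nothing is deleted, only flagged.
        for entry in self.cache.values():
            entry.stale = True

    def get(self, path: str) -> tuple[bytes, str]:
        entry = self.cache.get(path)
        if entry is not None and not entry.stale:
            return entry.body, "hit"
        # In a real system only the checksum would be compared at this point.
        checksum, body = self.origin[path]
        if entry is not None and entry.checksum == checksum:
            entry.stale = False              # unchanged: mark fresh, no transfer
            return entry.body, "revalidated"
        self.cache[path] = CachedEntry(checksum, body)
        return body, "fetched"               # new or changed: pull from origin


origin = {"/a.css": ("v1", b"A"), "/b.js": ("v1", b"B")}
node = EdgeNode(origin)

first = [node.get("/a.css")[1], node.get("/b.js")[1], node.get("/a.css")[1]]

origin["/b.js"] = ("v2", b"B2")              # a new deploy changes one file
node.mark_all_stale()

after = [node.get("/a.css")[1], node.get("/b.js")[1], node.get("/b.js")[1]]
```

Each CDN node would hold its own cache in this model, which is why the first visitor to every node pays the revalidation cost after a deploy.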

This article explains in more detail how we handle that: Better Living Through Caching

Let me know if you have any other questions.

So the takeaway from this is that in terms of files on the CDN, there is no difference between the File Digest deploy and the Zip deploy (which was the underlying curiosity for my post).

Just to clarify: deploying doesn’t delete whatever is at the edge, it simply marks it as “stale”. When the next visitor requests that file, the edge checks with the SF store and, if the file is the same, marks the cached copy as “fresh” (for lack of a better term), or fetches a new version if it has changed. Is that right?

You also highlighted “EACH CDN node” … how many CDN nodes are there? For a low-traffic website like ours, too many CDN nodes partially defeats the purpose of a CDN, as it takes too many visitors to re-prime all the nodes. :sunglasses:

Yup, that is a reasonable way to describe things.

Of course, you are competing with quite a few other sites for limited per-node cache space, so it is quite possible that your older asset is no longer in cache on a specific node by the time it is requested. You’d need rather steady traffic to stay in cache all the time, which in practice means paying for a fair amount of extra bandwidth each month.

There are a couple dozen CDN nodes. You’ll have to analyze & decide if the performance is problematic for you or not.

Thanks again. One further question for clarification, then I will leave you be to enjoy some eggnog.

You mention a couple of dozen nodes. Your website describes the Enterprise version as having 27 global edge locations versus the standard (I thought I read 6 somewhere). So when we refer to nodes, is there one node per “global edge location”, or multiple nodes per location?

We have more than one node per location, though how many nodes there are in a location depends on how busy that area is.