I’ve noticed that sometimes, some static files hosted on Netlify are slow to be served to the clients.
By slow, I mean, the first byte might take many seconds before being sent, and the request is pending for a while.
After the first byte is sent, accessing the file again with caching disabled is fast, as expected.
I can’t confirm yet but I suspect this happens after a new deployment, probably due to using a lazy strategy to propagate the files across the world/CDN?
The problem I encounter:
I have a Gatsby site I modify/deploy often, and some pages have quite low access rates.
When clicking on a Gatsby link, Gatsby code-splits the JS into chunks, so it has to download the next page’s JS chunk before navigating to it.
Because some JS chunks are served slowly by Netlify (probably after a deploy), users click on a link and get no feedback at all, and then many seconds later, navigation happens.
I could probably add some feedback, like a progress bar like github (nprogress etc…).
But I’d rather find a more general solution.
Is there a way to initialize the Netlify deployment eagerly, in such a way that no file is ever served slowly? What would you recommend to solve this problem?
Your intuition that there is lazy loading going on is correct: each CDN node (and there may be up to a dozen in one location, which is the case on both the US East Coast and West Coast) maintains its own cache, and that cache is invalidated at deploy time and refilled only when a request is received. On our non-enterprise CDN, cache contention is very high, so file contents will also fall out of cache quite quickly if your page has only sporadic access on a particular CDN node.
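To make that behavior concrete, here is a toy Python model of a single CDN node. It is only an illustration, not Netlify's implementation: the `NodeCache` class, its tiny capacity, and the least-recently-used eviction rule are all assumptions chosen to mimic "invalidated at deploy, refilled on request, pushed out under contention".

```python
from collections import OrderedDict


class NodeCache:
    """Toy model of ONE CDN node's cache (hypothetical, for illustration only).

    Assumptions: least-recently-used eviction and a fixed capacity shared
    with other sites' files. The real eviction policy is not documented here.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries = OrderedDict()  # path -> True, oldest entry first

    def deploy(self) -> None:
        # A deploy invalidates everything this node had cached.
        self.entries.clear()

    def request(self, path: str) -> str:
        if path in self.entries:
            self.entries.move_to_end(path)  # mark as recently used
            return "hit"
        # Miss: the node fetches from origin and caches the file lazily.
        self.entries[path] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return "miss"


node = NodeCache(capacity=2)
print(node.request("/rare-page.js"))  # miss: cold after a deploy
print(node.request("/rare-page.js"))  # hit: now cached on this node
node.request("/busy-a.js")
node.request("/busy-b.js")            # contention pushes the rare file out
print(node.request("/rare-page.js"))  # miss again
```

Because every node keeps its own cache, a rarely visited page can hit this cold path repeatedly, once per node, which matches the sporadic slowness described above.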
Now, the TTFB still shouldn’t be seconds - more like hundreds of milliseconds - unless you have some misconfiguration in DNS to not use our CDN optimally, or some other odd configuration like a proxy in front of us which is not a supported configuration. Two requests:
Can you tell me what the hostname for your site is so I can check DNS config?
If you have such a slow load and can get us the value of the HTTP response header called x-nf-request-id that would be useful for us to see if we can understand if the delay is inside our network or outside (we maintain timing records for every request in our internal logs and can reference them by that ID).
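For anyone trying to capture this, a minimal Python sketch along these lines could log the timing and the `x-nf-request-id` header for a given asset. The function names `measure_ttfb` and `format_report` are made up for this example, and the measured time covers DNS, TCP, and TLS setup plus the wait for response headers, so it only approximates TTFB.

```python
import http.client
import sys
import time
from urllib.parse import urlsplit


def measure_ttfb(url: str, timeout: float = 30.0) -> dict:
    """Fetch `url` once; return status, approximate TTFB, and x-nf-request-id."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        start = time.perf_counter()
        # The connection is opened lazily here, so setup time is included.
        conn.request("GET", path, headers={"Cache-Control": "no-cache"})
        response = conn.getresponse()  # returns once the headers have arrived
        elapsed = time.perf_counter() - start
        response.read()  # drain the body so the connection closes cleanly
        return {
            "url": url,
            "status": response.status,
            "ttfb_seconds": elapsed,
            "request_id": response.getheader("x-nf-request-id"),
        }
    finally:
        conn.close()


def format_report(result: dict) -> str:
    """One log line per request, easy to paste into a support thread."""
    request_id = result["request_id"] or "(header missing)"
    millis = result["ttfb_seconds"] * 1000
    return (
        f"{result['status']} {result['url']} "
        f"ttfb={millis:.0f}ms x-nf-request-id={request_id}"
    )


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python ttfb.py URL [URL ...]
    for target in sys.argv[1:]:
        print(format_report(measure_ttfb(target)))
```

Running it against the slow chunk URLs shortly after a deploy, and noting the date and time, should give support the ID needed to look the request up.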
There is no intention that you would push your site contents to the cache, as we don’t want to cache them if there are no requests, so there’s nothing you can do aside from following the advice here to optimize that setup:
We have a pretty normal setup, no proxy or anything like that, and we have seen file accesses take many seconds (like 20 seconds). But maybe it’s my local network; I will have to check.
I’ll try to see if I can get an x-nf-request-id, but as it’s not so easy to reproduce, it may take some time.
I understand that you need to evict inactive deployments from your caches. Is it possible to have an idea of this eviction policy?
Is there a way to ensure a website is kept in the cache? Other providers like Zeit offer the ability to do so for a limited number of deployments (a “scale” option, I think). It would be great if the files of a production deployment could be guaranteed to stay in the cache.
We are currently on the free plan and would be fine paying for something like that, but going from $0 to a $500 enterprise plan for this is not really an option for us at the moment.
Hi @slorber, even on the free tier you can have a high cache hit rate if your site is accessed often. But the only way to increase it further would be to move to one of our custom or enterprise plans and move on to the Enterprise CDN network. That said, you shouldn’t be having TTFB as high as 20 seconds even on our free tier. Once you provide that x-nf-request-id or at least the URL for an asset that took a long time to load, and the date that you saw it, then we can dig further to see what’s happening.
I’ve run some tests and I see some bad TTFB on a few requests, not necessarily after a redeploy.
Actually, I’ve run some tests with someone who tried resolving the website using different Netlify IPs, and one was significantly worse than the others in terms of TTFB: 188.8.131.52
According to the DNS doc, it’s your load balancer.
Can you tell me how using the load balancer IP in the DNS config leads to such results? Is this normal? If using the load balancer makes TTFB 2 to 10x worse, what about better documenting how this choice can significantly impact performance?
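For reference, a comparison like this can be reproduced with a small Python sketch that connects to each candidate IP directly while still presenting the real hostname for TLS and the `Host` header. The helper names (`build_request`, `ttfb_via_ip`, `rank_by_median`) and the choice of five samples per IP are assumptions for this example; the IPs are supplied on the command line rather than hard-coded.

```python
import socket
import ssl
import statistics
import sys
import time


def build_request(hostname: str, path: str) -> bytes:
    """Minimal HTTP/1.1 GET; Connection: close keeps each sample independent."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hostname}\r\n"
        "Cache-Control: no-cache\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")


def ttfb_via_ip(hostname: str, ip: str, path: str = "/", timeout: float = 30.0) -> float:
    """Seconds between sending the request and receiving the first byte from `ip`."""
    context = ssl.create_default_context()
    with socket.create_connection((ip, 443), timeout=timeout) as raw:
        # server_hostname sets SNI and validates the certificate for the site name.
        with context.wrap_socket(raw, server_hostname=hostname) as tls:
            start = time.perf_counter()  # handshake is done, so it is excluded
            tls.sendall(build_request(hostname, path))
            if not tls.recv(1):
                raise ConnectionError(f"{ip} closed the connection without responding")
            return time.perf_counter() - start


def rank_by_median(samples: dict) -> list:
    """Return (ip, median_seconds) pairs, fastest first."""
    medians = [(ip, statistics.median(times)) for ip, times in samples.items()]
    return sorted(medians, key=lambda pair: pair[1])


if __name__ == "__main__" and len(sys.argv) > 3:
    # Usage: python compare_ips.py HOSTNAME PATH IP [IP ...]
    hostname, path, ips = sys.argv[1], sys.argv[2], sys.argv[3:]
    samples = {ip: [ttfb_via_ip(hostname, ip, path) for _ in range(5)] for ip in ips}
    for ip, median in rank_by_median(samples):
        print(f"{ip}: median TTFB {median * 1000:.0f} ms")
```

Using the median of several samples rather than a single request helps separate a consistently slow endpoint from a one-off cold cache miss.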
Thanks. I can definitely see what you’re talking about in there, this request taking 12 seconds seems like a big bummer:
…but on our side in our internal logs, we show we sent it in 6ms (the “x-nf-request-id” HTTP response header is a unique value that we can correlate to a specific request in our logs, and so I looked it up).
Was your internet generally working well at the time?
On my side I didn’t get any news from my last twitter attempt to solve this problem.
On our side, it turns out that putting CloudFlare in front of Netlify significantly improves TTFB. It is the exact same deployment, with a single toggle switched on the CF side to enable the proxy.
Swyx should have given you both DNS domains (with/without cloudflare proxy) and told me you were studying the problem (fool+gerald), but no news for 1 month.
I feel like I gave you everything I can on my side, and can’t do anything more, except maybe paying 500€/month to solve this problem. CloudFlare is actually a way cheaper solution, and it’s what I end up recommending to my customers currently until this problem is solved. Despite Netlify saying CF on top of Netlify is useless, it turns out it’s not for me.
Hi there, thank you & other folks for your patience on this thread as we work to understand the underlying issues here. I know it’s frustrating to experience slow download times and we want to understand & do better.
We have been talking about this internally and are still looking into these issues, and hope to bring them up with relevant folks who work more closely with our infrastructure in the near future. We’ll update this thread here with any information - and if anything changes for any of you please do keep adding your thoughts here.
Not the quick-and-easy outcome we had all been hoping for - but we will keep looking into this.