I’m trying to find out what’s causing my Netlify function to sometimes respond slowly. It seems to be slower when the function hasn’t been called in a while, as if it goes to sleep. The function log shows a significantly lower duration than the Network tab in Chrome DevTools.
Calling the function after ±10 minutes of “downtime”
4:00:17 PM: Duration: 785.31 ms Memory Usage: 121 MB
I’m confused why the difference in duration is not reflected in the Function log itself — this also keeps me from debugging this further as it doesn’t seem to have anything to do with the performance of the function’s code.
The Netlify instance is makeover-feed.netlify.app and the function is called filter. The function runs on page load for example this page: makeover-feed.netlify.app/shop/lampen
Seems likely that you are seeing the difference between a cold and warm cache on our CDN node.
If the route to your function is not in cache on any specific CDN node (all caches are completely separate, even in the same data center, so you could have one machine with a populated cache and another without), then your request will need to get proxied through our origin in San Francisco, US, before being sent on to AWS to run. That could explain a second or two of the startup time.
That “route” cache is in general not highly contended so hopefully the norm is that most of the time the function runs more quickly. But if you only run the function very rarely, or you deploy frequently (which invalidates the cache), you may see mixed results.
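For what it’s worth, one quick way to see the cold-versus-warm difference is to time the same request a few times in a row and compare the first result against the rest. A minimal Node sketch (the URL in the usage comment is a placeholder, not the real site; `fetch` requires Node 18+):

```javascript
// Minimal helper: time a single async call, in milliseconds.
async function timeCall(fn) {
  const start = Date.now();
  await fn();
  return Date.now() - start;
}

// Usage against a deployed function (placeholder URL):
// for (let i = 0; i < 3; i++) {
//   const ms = await timeCall(() =>
//     fetch("https://YOUR-SITE.netlify.app/.netlify/functions/filter"));
//   console.log(`request ${i + 1}: ${ms} ms`);
// }
```

If the route cache is the culprit, the first request should stand out clearly from the subsequent ones.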
That makes sense. This site is still in development, so the functions aren’t being used much yet. I’m doing some database calls in the function too, which might have a similar cold-cache issue since the database was on a free cluster.
I’ve upgraded the database now and tested again. Subsequent calls are much quicker (around 900ms) but the initial call on a cold cache still takes about 20 seconds, much more than the 2s you describe.
@fool I’m still seeing a 20-30s load on cold starts, which wouldn’t be an issue per se, but these cold starts seem to occur every 15 minutes, which is too frequent for what I’m using the functions for:
We’d be looking for the value of an x-nf-request-id HTTP response header for such a slow request to try to understand what happens based on our internal logs. I don’t think it is really cold-start related, since most folks don’t use their functions very often and I’ve never seen anyone take so long (plus unless our staff adjusted your function timeout, usually we’ll shut down the connection in 10 seconds unless it has already started sending a response - which would mean it was already solidly “running” :))
Thanks for your patience, Sander - we finally have an answer!
Your site is massive: almost 200k files. Thus, your redirect rules are massive. So massive it takes 15 seconds to transmit them across our network, since they cannot be cached for very long. This will make any non-file access (and some file accesses, depending on your setup) take quite a bit of time to run.
Hi, @jega. The Function will run in AWS region us-east-1 with 1024 MB of memory:
This is likely a much less powerful (virtual) system than your local system, so I would expect a performance difference.
As far as finding the bottleneck for the performance, it could be concurrency, CPU compute power, or network latency related. It might help to add additional logging to your function to help determine what the source of the performance bottleneck is.
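The logging suggestion above could look something like this (a sketch only; `queryDatabase` and the event fields are illustrative placeholders for whatever the function actually does, and in a real Netlify function the handler would be exported as `exports.handler`):

```javascript
// Placeholder standing in for the real work, e.g. a database call.
async function queryDatabase(params) {
  return { items: [], params };
}

// In a real Netlify function file, export this as: exports.handler = handler
const handler = async (event) => {
  const t0 = Date.now();
  const data = await queryDatabase(event.queryStringParameters);
  console.log(`db query: ${Date.now() - t0} ms`);

  const t1 = Date.now();
  const body = JSON.stringify(data);
  console.log(`serialize: ${Date.now() - t1} ms`);

  return { statusCode: 200, body };
};
```

Per-stage timings like these show up in the function log and make it easier to tell whether time is going to compute, the database, or somewhere outside the function entirely.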
If there are other questions about this, please let us know.
I am new to functions and I just deployed a very basic function that takes the request body as JSON and gives it back as-is. The request takes between 4 and 6 seconds to complete, which is way too much given the simplicity of the function.
The logs show Duration: 1.10 ms Memory Usage: 66 MB.
Is there something I can do to improve this?
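For reference, the handler is essentially just an echo, roughly like this (a simplified sketch, not the exact deployed code):

```javascript
// Parse the request body as JSON and return it unchanged.
// In a real Netlify function file, export this as: exports.handler = handler
const handler = async (event) => {
  const data = JSON.parse(event.body || "{}");
  return {
    statusCode: 200,
    headers: { "content-type": "application/json" },
    body: JSON.stringify(data),
  };
};
```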
Hiya @stilllife00 - maybe you could send an x-nf-request-id so we can help figure out where the time is going? There are a lot of things that could contribute to that time, from your DNS hosting, to caching settings from your function.
This is how to get the number that will help us peer inside a specific request to determine where the time went:
Thanks for the reply. Actually, since the day after my post, the timing has gone down to ~500 ms, which seems much more reasonable (the code is also doing more than my original test was).
I tried a brand new function with the same code today and it is pretty fast, ~130 ms (x-nf-request-id: 11942687-d379-45b3-998e-0c6a76f2a2bc-31849917).
So I can’t reproduce it and provide a slow request ID; all good.