Netlify Functions speed

I’m trying to find out what’s causing my Netlify function to sometimes respond slowly. It seems slower when the function hasn’t been called in a while, as if it goes to sleep. The function log shows a significantly lower duration than the Network tab in Chrome DevTools.

Calling the function after ±10 minutes of “downtime”

The corresponding Function log

3:53:22 PM: Duration: 829.42 ms	Memory Usage: 121 MB

Later calls to the same function with less “downtime” result in a faster response

The corresponding Function log

4:00:17 PM: Duration: 785.31 ms	Memory Usage: 121 MB	

I’m confused why the difference in duration is not reflected in the Function log itself — this also keeps me from debugging this further as it doesn’t seem to have anything to do with the performance of the function’s code.

The Netlify instance is and the function is called `filter`. The function runs on page load, for example on this page:

Curious to hear your thoughts!

Seems likely that you are seeing the difference between a cold and warm cache on our CDN node.

If the route to your function is not in cache on any specific CDN node (all caches are completely separate, even in the same data center, so you could have one machine with a populated cache and another without), then your request will need to get proxied through our origin in San Francisco, US, before being sent on to AWS to run. That could explain a second or two of the startup time.

That “route” cache is in general not highly contended so hopefully the norm is that most of the time the function runs more quickly. But if you only run the function very rarely, or you deploy frequently (which invalidates the cache), you may see mixed results.

That makes sense. This site is still in development so the functions aren’t being used much yet. I’m also doing some database calls in the function, which might have a similar cold-start issue since the database was on a free cluster.

I’ve upgraded the database now and tested again. Subsequent calls are much quicker (around 900ms) but the initial call on a cold cache still takes about 20 seconds, much more than the 2s you describe.

@fool I’m still seeing a 20-30s load on cold starts, which wouldn’t be an issue per se, but these cold starts seem to occur every 15 minutes, which is too frequent for what I’m using the functions for:

Do you have any idea what could cause this and what I could do to fix it?

We’d be looking for the value of an x-nf-request-id HTTP response header for such a slow request to try to understand what happens based on our internal logs. I don’t think it is really cold-start related: most folks don’t use their functions very often, and I’ve never seen anyone take so long. Plus, unless our staff adjusted your function timeout, we’ll usually shut down the connection after 10 seconds unless it has already started sending a response — which would mean it was already solidly “running” :)

You can see what we need here: [Common Issue] Netlify Support asked for the 'x-nf-request-id' header? Why are they asking for it and how do I find it?

Thanks for the explanation. This request took 29s and this is the header: x-nf-request-id: 1428b0f9-a31a-44ae-96e3-5bb81e757ce3-5610707

Looking forward to hearing what you find!

Hey Sander, we are still looking at this, just as an update. We’ll get back to you as soon as we have more info to share!

Thanks for your patience, Sander - we finally have an answer!

Your site is massive: almost 200k files. Thus, your redirect rules are massive — so massive that it takes 15 seconds to transmit them across our network, since they cannot be cached for very long. This will make any non-file access (and some file accesses, depending on your setup) take quite a bit of time.

Your options for fixing this are to split the project into several smaller sites (perhaps using the workflow here: [Common Issue] Can I deploy multiple repositories in a single site? - you don’t have multiple repos, but you can stitch together sites linked to the same repo), or to move the functions to a separate site and call them by full path:
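The second option could look something like the sketch below. The site name `functions-site.netlify.app` is a placeholder, not an actual site; the point is that the functions-only site has trivially small redirect rules, so its routing stays fast:

```javascript
// Sketch: host functions on a dedicated, functions-only site and call them
// by absolute URL from the main site. "functions-site.netlify.app" is a
// hypothetical site name used for illustration.
const FUNCTIONS_BASE = "https://functions-site.netlify.app/.netlify/functions";

function functionUrl(name, params = {}) {
  const url = new URL(`${FUNCTIONS_BASE}/${name}`);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

// Usage from the main site, e.g.: fetch(functionUrl("filter", { q: "shoes" }))
```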

Also seeing pretty sizeable performance degradation with functions.

x-nf-request-id: 0455cf66-57e4-4516-85ea-045b4e80c597-19603167

Functions Log (for Page request)

11:53:38 AM: 2020-11-18T16:53:38.765Z	65aa5142-1acb-46b7-9da8-1cc8c20cab21	INFO	[request] /inventory
11:53:41 AM: Duration: 2884.14 ms	Memory Usage: 100 MB	Init Duration: 226.10 ms	

Functions Log (for API)

11:53:39 AM: 2020-11-18T16:53:39.676Z	c0d999f4-54c2-40ee-86f7-558707c72441	INFO	[request] /api/graph
11:53:39 AM: Duration: 113.70 ms	Memory Usage: 74 MB	Init Duration: 212.74 ms	
11:53:40 AM: 2020-11-18T16:53:40.981Z	0a615627-563b-4992-9fe8-c268b8d437a9	INFO	[request] /api/graph
11:53:41 AM: Duration: 86.28 ms	Memory Usage: 75 MB		

For context:

I’m using a Next.js app. All page requests hit pages/index. In turn, based on the request URL, I make two API calls (pages/api/graph/) to get the data required to build pages/index at runtime.

By comparison, running these functions on a local Node server yields a 3.7 second reduction in TTFB. And my API calls (pages/api/graph/) are twice as fast using Playground vs cloud functions.

Hi, @jega. The Function will run in AWS region us-east-1 with 1024 MB of memory:

This is likely a much less powerful (virtual) system than your local system, so I would expect a performance difference.

As far as finding the bottleneck for the performance, it could be concurrency, CPU compute power, or network latency related. It might help to add additional logging to your function to help determine what the source of the performance bottleneck is.
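The logging suggestion above can be sketched as a small timing wrapper. `timed()` and the stage labels are illustrative helpers, not a Netlify API; the stage timings land in the same function log as the Duration line:

```javascript
// Sketch of per-stage logging to find a bottleneck inside a function:
// wrap each stage (DB call, upstream API call, rendering) in a timer so
// the function log shows where the handler's own time goes.
async function timed(label, fn) {
  const started = Date.now();
  try {
    return await fn();
  } finally {
    console.log(`[timing] ${label}: ${Date.now() - started} ms`);
  }
}

// Example usage inside a handler:
//   const data = await timed("db-query", () => db.query(sql));
//   const page = await timed("render", () => renderPage(data));
```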

If there are other questions about this, please let us know.

Thanks Luke. For comparison, I ran the same code (Next.js 10.0.1 minus Netlify configs - netlify.toml, next-on-netlify) on Vercel’s free plan and my TTFB dropped from 3.1s avg to 1.3s avg.

Then I made my API calls directly to the endpoints instead of using the /pages/api proxy, and my TTFB dropped again to ~750ms avg.

Two simple updates for a 2.35s page speed improvement.

Unfortunately, it’s hard - if not impossible - for me to determine why the performance is so different between platforms in this instance.

I am new to functions and I just deployed a very basic function that takes the request body as JSON and returns it as-is. The request takes between 4 and 6 seconds to complete, which is way too much given the simplicity of the function.
The logs show `Duration: 1.10 ms Memory Usage: 66 MB`.
Is there something I can do to improve this?

Hiya @stilllife00 - maybe you could send an x-nf-request-id so we can help figure out where the time is going? There are a lot of things that could contribute to that time, from your DNS hosting to the caching settings on your function.

This is how to get the number that will help us peer inside a specific request to determine where the time went:

Thanks for the reply. Actually, since the day after my post the timing went down to ~500ms, which seems much more reasonable (the code is also doing more than my original test).
I tried the same code on a brand new function today and it’s pretty fast at ~130ms (x-nf-request-id: 11942687-d379-45b3-998e-0c6a76f2a2bc-31849917).
So I can’t reproduce it and provide a slow request-id. All good :+1: