Netlify Function with query strings ignores custom Cache-Control header

I don’t think this will give much more info than what was originally stated here, but this is indeed an issue that I’ve had to work around by using external proxies and avoiding Cache-Control headers when using Netlify functions. Here is a simple app that illustrates the problem: GitHub - spencewood/hello-netlify

And that is deployed here:

https://unruffled-chandrasekhar-feb45f.netlify.app/api/hello

The function simply has a Cache-Control header set to 5 minutes. Inside this window, changing query string parameters doesn’t affect the response, regardless of cache settings on the client.
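For reference, the handler behaves roughly like this (a minimal sketch; the exact code is in the linked repo):

// netlify/functions/hello.js -- minimal sketch of the repro handler
exports.handler = async (event) => {
  // Query parameters arrive on event.queryStringParameters in Netlify Functions.
  const { subject = "World", greeting = "Hello" } = event.queryStringParameters || {};

  return {
    statusCode: 200,
    headers: {
      // Ask the CDN (and browsers) to cache the response for 5 minutes.
      "Cache-Control": "public, max-age=300",
    },
    body: `${greeting} ${subject}!`,
  };
};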

For example:

> curl "https://unruffled-chandrasekhar-feb45f.netlify.app/api/hello"
Hello World!

> curl "https://unruffled-chandrasekhar-feb45f.netlify.app/api/hello?subject=test"
Hello World!

> curl "https://unruffled-chandrasekhar-feb45f.netlify.app/api/hello?subject=test&greeting=hi"
Hello World!

Whereas I would expect those responses, in order, to be: “Hello World!”, “Hello test!”, “hi test!”

Appreciate the repro, @spencewood. We’ll be eagerly keeping an eye on our internal tracker for y’all.

Any update on this? We are encountering the same issue. It looks like the Netlify cache treats calls to a function with different query strings as the same call.

Hi @michal, the issue is still being actively worked on. We’ll update here when we get the fix deployed. Thank you for your patience.

1 Like

I was testing out Netlify and ran into this problem. It’s sad to see that such a critical problem is still not solved after this long, and that there is no way to work around it or disable the CDN. I guess I have to try out other services.

Hi all,
Wanted to clarify a bit here: I have updated the title of this thread to specify that it’s about caching issues with Netlify functions that have query strings.

If you are having issues with static site/asset caching (so, not function caching) involving query strings, please create a new thread so we can investigate.

Function caching, though, is a wildly different beast than caching for static site assets! We don’t recommend relying on cached function responses in general, and for functions with query strings specifically, there is currently no way to cache those responses because we strip the query string before caching.
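Conceptually (this is only an illustration, not our actual implementation), the effect is as if the cache key were derived from the path alone, so both of these URLs map to the same entry:

// Illustration only -- not Netlify's actual code.
const cacheKeyFor = (url) => new URL(url).pathname;

cacheKeyFor("https://example.netlify.app/api/hello");              // "/api/hello"
cacheKeyFor("https://example.netlify.app/api/hello?subject=test"); // "/api/hello"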

I’m switching from Firebase functions to Netlify and this is a bit of an issue. I would have thought it very common to want a single function to return different output depending on query-string-based input (and to have that response cached, at least briefly). What is the intent behind stripping the query string? Could you provide more detail about what is currently part of the cache key, so maybe we could work around it somehow?

Hey @drzax,

The problem is that we can’t determine what should and shouldn’t be cached. If we did cache function responses with query string params, some customers may be using those params to hold sensitive data or customer PII which shouldn’t be relayed to other customers, for example.

Did you wanna delve in a bit on your use case? Maybe there’s something we can do.

Hi @Scott ,

I’m not following your use case.

Why would a customer using a function with a query string containing sensitive information be caching the response in the first place?

If a user did include sensitive information in the query string and enabled caching, wouldn’t subsequent users see the exact same response as the first user regardless of query string? And the sensitive data passed in the query string wouldn’t even hit the function, because the response is cached.

If I am missing a use case where sensitive information is passed in the query string and caching is enabled, please kindly reply below~

p.s. A good example cache key from CloudFlare is:
${header:origin}::${scheme}://${host_header}${uri_iqs}

Thanks!

1 Like

I don’t think anyone expects it to work the way it does with Netlify functions when caching is turned on. By setting Cache-Control, you effectively give up the ability to use query string parameters for functions, because the response can no longer be relied upon.

Would you expect, for example, a function that returns product details to return the same thing regardless of the item you’re requesting, while also relying on the cache?

/.netlify/functions/product-details?pid=1
/.netlify/functions/product-details?pid=2

I would expect to be able to cache these responses individually, not simply as “product-details”.

I might be wrong, since I have not tried function caching yet (having found this thread, I figured I would postpone it until the issue is resolved). But in the case @spencewood highlighted above, applied to @Scott’s example, a cached request to …/sensitive?id={UUID} could return sensitive information to other users if the response for a previous UUID is returned for a new UUID:

.../sensitive?id=ACTA4 returns { birthday: "1.1.1970", name: "Jones", id: "ACTA4" }
.../sensitive?id=TIRK8 returns { birthday: "1.1.1970", name: "Jones", id: "ACTA4" }

Or would caching in this case not work? Or is it up to the developer to “not do this”?

Parameter-aware caching would tie the cache entry to the UUID, and would thus prevent the very leak that not implementing parameter caching is supposed to prevent, no?

Sorry all, I think I may have been crossing my wires with this topic when I replied last! That related to redirects, functions and query string params.

Today, our cache key uses the raw path without query params (as this is optimised for static assets). The cache is only invalidated on a new deploy, not for subsequent requests with unique params. But it’s a talking point internally, so your insight, feedback and ideas are greatly appreciated.

You can still set the Cache-Control header on a function response; however, given that Lambdas are stateless, the cached response could be dropped earlier than the advertised time due to a cold start.
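A minimal sketch of what that looks like (for illustration only):

// Minimal sketch: a function response carrying a Cache-Control header.
exports.handler = async () => ({
  statusCode: 200,
  headers: {
    "Cache-Control": "public, max-age=300", // 5 minutes
  },
  body: "cached response",
});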

I guess if the cache is optimized for static assets without query strings, one possibility for allowing caching of Lambda query param responses would be to make it opt-in, using a custom header or something like that. That way static asset caching would not be affected, while developers would have the choice to opt in to this query param caching behaviour.

I think even with a shorter-than-advertised caching time, this could be very beneficial. If the request can be cached and a lot of traffic hits the function unexpectedly at once, the cache could respond instead of running the function. Fewer Lambda calls for Netlify, less execution time for developers. Win-win ^^

As it is right now, if one uses query params, one cannot use caching, as it might return an incorrect response - which in a lot of cases is more destructive than returning nothing at all.
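Purely as a hypothetical sketch of that opt-in idea (the header name below is made up; no such header exists):

// Hypothetical only -- "X-Netlify-Cache-Query" is an invented header name used
// to illustrate what opting in to query-aware caching could look like.
exports.handler = async (event) => {
  const { subject = "World" } = event.queryStringParameters || {};
  return {
    statusCode: 200,
    headers: {
      "Cache-Control": "public, max-age=300",
      "X-Netlify-Cache-Query": "subject", // opt in: vary the cache on this param
    },
    body: `Hello ${subject}!`,
  };
};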

I don’t think this is a good default since it’ll trip up more people than it would help. Caching based on pathname should be opt-in IMHO. If you’re worried about breaking backwards compatibility, one option might be to add a cacheKey option to the return value of the function and cache based on that (defaulting to the current behavior).
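A hypothetical sketch of what that could look like (cacheKey is not an existing Netlify Functions field; it only illustrates the proposal):

// Hypothetical only -- "cacheKey" is an invented response field illustrating the proposal.
exports.handler = async (event) => {
  const pid = event.queryStringParameters.pid;
  return {
    statusCode: 200,
    headers: { "Cache-Control": "public, max-age=300" },
    // Opt in to query-aware caching by declaring the cache key explicitly:
    cacheKey: `/product-details?pid=${pid}`,
    body: JSON.stringify({ pid }),
  };
};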

I would also like to see an option to not invalidate cache on deploy (for obvious reasons).

– my two cents

@snorkypie,

Appreciate the feedback and I’ve added it to our internal feature tracker!

For what it’s worth, Vercel does this flawlessly. I have a fairly expensive function that takes a screenshot, with some inputs specified as query params. The code below does exactly what you’d expect.

// Excerpt from a Vercel serverless function handler; `config` is built from the
// incoming query params, and `ttl` is a default cache lifetime in seconds.
const file = await getScreenshot(config);

res.statusCode = 200;
res.setHeader("Content-Type", `image/${config.fileType}`);
// CDN-cache the response for the requested TTL (Vercel keys its edge cache on
// the full URL, query string included).
res.setHeader(
  "Cache-Control",
  `public, s-maxage=${+config.ttl || ttl}, max-age=${+config.ttl || ttl}`
);
res.end(file);

I’m pleased it’s being discussed internally. I hope a solution can be found. Caching of function results is fundamentally different from caching static assets. It makes some sense to exclude query params from the cache key for static assets, because it probably shouldn’t be possible to bust the cache by requesting https://example.com/huge-image.png?bust=<new guid>. But functions are supposed to return something different depending on their inputs.

1 Like

Hey there @drzax - just letting you know we haven’t forgotten about you, and we are still thinking on this. More soon! Thanks for your patience.

1 Like

Wow, 10 months since this nasty bug (yes, it’s a bug!) was reported and Netlify still hasn’t fixed it.

Netlify’s UX is nicer than Vercel’s and other competitors’, the speed is great, and developing is a breeze. That said, if a simple, basic error like this isn’t fixed in 10 months (and it seems that Netlify support engineers don’t even treat it as the major bug it is), how can we trust Netlify at scale?

Hi @buzinas and thanks for that feedback!

It’s a complicated situation for us to fix, based on how we cache things and our desire not to break workflows already in place for our millions of customers - it turns out we are building entirely new CDN components for this. That is not a quick process, and not one we can accelerate while still getting good, reliable results.

I understand that from your point of view the error is “simple”. The fix is not, and we are focusing on a better fix than we could have done in any shorter timeframe. This problem affects relatively few of our customers as well, but we are still working on it with tremendous focus and engineering effort.

In the end, that’s how we believe you can trust us: we are building better solutions that will prevent related future trouble, and not putting half-assed “fixes” out there in cases where it wouldn’t serve you, or us, well.

You can make the decisions you’d like, for your business, based on that transparency, which we provide for exactly that reason: to let you make reasoned judgments about how to implement your service and build your business. If how we work is not up to your standards, we don’t want you to waste time or effort using our systems.

Thanks for participating in the transparency process 🙂

2 Likes

Can you provide any transparency on how quickly this is being addressed? It seems like a major issue to me.