Netlify Function with query strings ignores custom Cache-Control header

Hi,

I’m trying to make a really simple Netlify Function that accepts a query param, does some magic based on that and generates some output. The output is supposed to be the same across all the calls of the Function as long as the given param is the same.

To achieve this, I added a 'Cache-Control': 'max-age=365000000,immutable' header to my function result.
The problem is that Netlify caches the Function response based only on the path to the Function, without taking the param into account. (That said, this cache does seem to get invalidated after a fairly short time.)

Here is an example: a function that says Hello ${name} when given a name query param, or Hello World otherwise.
If you curl https://admiring-liskov-24938c.netlify.app/.netlify/functions/hello?name=Yaroslav first and then https://admiring-liskov-24938c.netlify.app/.netlify/functions/hello it will keep calling you Yaroslav for some time no matter what name you actually provide.

This is what it looks like inside:

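exports.handler = function (event, context, callback) {
  // Fall back to an empty object in case no query string parameters are passed.
  const { name } = event.queryStringParameters || {};

  callback(null, {
    statusCode: 200,
    headers: {
      // Ask caches to keep the response for a very long time.
      'Cache-Control': 'max-age=365000000,immutable',
    },
    // The timestamp shows when the response was actually generated.
    body: 'Hello, ' + (name || 'World') + '. ' + new Date().toISOString(),
  });
};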

Is this expected? If so, how can I implement such an eager caching in Netlify Functions otherwise?

Hi, I tested your function, and while I do see what you described, as soon as I used an incognito browser window, the function returned ‘hello world’ instead. I think what you are seeing is how caching behaves in the browser. I’m not sure what you can do to avoid that, other than not sending that Cache-Control header, or changing your request to a POST and sending your arguments in the request body.
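
For illustration, the POST suggestion could look roughly like the sketch below. It is not code from this thread: the buildResponse helper, the JSON body shape ({"name": "..."}) and the 405/400 responses are assumptions made for the example, and it does not handle base64-encoded bodies.

```javascript
// Hypothetical POST variant of the hello function discussed in this thread.
// Assumes the caller sends JSON such as {"name": "Yaroslav"} in the request body.
function buildResponse(event) {
  if (event.httpMethod !== 'POST') {
    return { statusCode: 405, headers: { Allow: 'POST' }, body: 'Method Not Allowed' };
  }

  let name;
  try {
    // event.body arrives as a string; an empty body falls back to "{}".
    ({ name } = JSON.parse(event.body || '{}'));
  } catch (err) {
    return { statusCode: 400, body: 'Invalid JSON body' };
  }

  return {
    statusCode: 200,
    // No Cache-Control header is set, so nothing asks a cache to store this.
    body: 'Hello, ' + (name || 'World') + '.',
  };
}

// Callback-style handler, matching the style of the original function.
function handler(event, context, callback) {
  callback(null, buildResponse(event));
}

// The guard only keeps the snippet runnable outside a CommonJS module.
if (typeof module !== 'undefined') {
  module.exports = { handler };
}
```

The trade-off is that the response is no longer cached anywhere, so every call invokes the function.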

That’s definitely not a stale browser cache :slight_smile:

~ $ curl -v 'https://admiring-liskov-24938c.netlify.app/.netlify/functions/hello'
*   Trying 157.230.120.63...
* TCP_NODELAY set
* Connected to admiring-liskov-24938c.netlify.app (157.230.120.63) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: OU=Domain Control Validated; CN=*.netlify.app
*  start date: Mar  4 11:17:06 2020 GMT
*  expire date: Mar  5 11:17:06 2021 GMT
*  subjectAltName: host "admiring-liskov-24938c.netlify.app" matched cert's "*.netlify.app"
*  issuer: C=BE; O=GlobalSign nv-sa; CN=AlphaSSL CA - SHA256 - G2
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fcbd900d600)
> GET /.netlify/functions/hello HTTP/2
> Host: admiring-liskov-24938c.netlify.app
> User-Agent: curl/7.64.1
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 150)!
< HTTP/2 200 
< cache-control: max-age=365000000,immutable
< date: Sat, 23 May 2020 22:18:45 GMT
< content-length: 41
< content-type: text/plain; charset=utf-8
< age: 155995
< server: Netlify
< x-nf-request-id: d5f08243-c2eb-47b5-ada0-825106912929-3070234
< 
* Connection #0 to host admiring-liskov-24938c.netlify.app left intact
Hello, Yaroslav. 2020-05-23T22:18:45.039Z* Closing connection 0

This is what I see right now, after 2 days (note the Date: header).

I think when you tested the function in Incognito, you might have hit another CDN node which didn’t have the response in its cache yet. Have you tried hard-refreshing the page several times? Otherwise, try using cURL on that endpoint and calling it several times.

I don’t think a CDN is supposed to ignore query parameters when caching responses, as per the HTTP spec. Do you think so?

Hi! Ah, I see! You are correct. We should probably be caching requests with different query string params separately, but it looks like we are not. I’ve dug around and do see an open issue about this, and I have added this thread to it, so if there are any changes, we’ll update here.

Just encountered the same thing. Responses should not be served from the same cache entry when the query string differs.

thanks, @c3657e13a387488e23eb - we’ll definitely follow up here once there is movement.

Hi @perry
I am facing a similar issue where the Cache-Control header is not respecting query string variables. Is there any suggested workaround for this, and how soon can we expect a fix?
thanks
Abhishek

Hey guys,

Checking if there is any further movement on this request.
CC: @fool
thanks
Abhishek

We have some engineering work planned that should improve the situation, but for now, nothing has changed. You could describe more about the pattern you’re seeing - share your redirect and the behavior - to see if we have some configuration advice to help you work around it; not all query string parameter caching issues are created equal :slight_smile:

I don’t think this will give much more info than what was originally stated here, but this is indeed an issue that I’ve had to work around by using external proxies and by not sending Cache-Control headers from Netlify functions. Here is a simple app that illustrates the problem: GitHub - spencewood/hello-netlify

And that is deployed here:

https://unruffled-chandrasekhar-feb45f.netlify.app/api/hello

This simply has Cache-Control set to 5 minutes. Inside this window, changing query string parameters doesn’t affect the response, regardless of cache settings on the client.

For example:

> curl "https://unruffled-chandrasekhar-feb45f.netlify.app/api/hello"
Hello World!

> curl "https://unruffled-chandrasekhar-feb45f.netlify.app/api/hello?subject=test"
Hello World!

> curl "https://unruffled-chandrasekhar-feb45f.netlify.app/api/hello?subject=test&greeting=hi"
Hello World!

I would expect the responses above to be, in order: “Hello World!”, “Hello test!”, “hi test!”

Appreciate the repro, @spencewood. We’ll be eagerly keeping an eye on our internal tracker for y’all.

Any update on this? We are encountering the same issue. It looks like the Netlify cache treats calls to a function with different query strings as the same call.

Hi @michal, the issue is still being actively worked on. We’ll update here when we get the fix deployed. Thank you for your patience.

I was testing out Netlify and ran into this problem. It’s sad to see that such a critical problem is still not solved after this long, and that there is no way to work around it or disable the CDN. I guess I have to try out other services.

Hi all,
Wanted to clarify a bit here: I have updated the title of this thread to specify that it’s about caching issues with Netlify functions that have query strings.

If you are having issues with static site/asset caching (so, not function caching) involving query strings, please create a new thread so we can investigate.

Function caching, though, is a wildly different beast than caching for static site assets! We don’t recommend relying on cached function responses in general, and for functions with query strings specifically, there is currently no way to cache those responses because we strip the query string before caching.
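
To make the effect of stripping the query string concrete, here is a toy model of a cache with two different key functions. It is only a sketch and not Netlify’s implementation: makeCache, pathOnlyKey, fullKey, the hello stand-in and the example.com host are all hypothetical names chosen for the example.

```javascript
// Toy model of a shared cache. NOT Netlify's actual implementation.
// keyFor decides which parts of the URL identify a cached response.
function makeCache(keyFor) {
  const store = new Map();
  return function fetchThroughCache(url, origin) {
    const parsed = new URL(url);
    const key = keyFor(parsed);
    if (!store.has(key)) {
      store.set(key, origin(parsed)); // cache miss: call the function
    }
    return store.get(key); // cache hit: the function is not called
  };
}

// Key that drops the query string, as described in the post above.
const pathOnlyKey = (u) => u.host + u.pathname;
// Key that keeps the query string, which is what posters in this thread expect.
const fullKey = (u) => u.host + u.pathname + u.search;

// Stand-in for the hello function from the first post.
const hello = (u) => 'Hello, ' + (u.searchParams.get('name') || 'World') + '.';

const base = 'https://example.com/.netlify/functions/hello';

const stripped = makeCache(pathOnlyKey);
const strippedFirst = stripped(base + '?name=Yaroslav', hello); // "Hello, Yaroslav."
const strippedSecond = stripped(base, hello); // "Hello, Yaroslav." (stale hit)

const keyed = makeCache(fullKey);
const keyedFirst = keyed(base + '?name=Yaroslav', hello); // "Hello, Yaroslav."
const keyedSecond = keyed(base, hello); // "Hello, World."

console.log({ strippedFirst, strippedSecond, keyedFirst, keyedSecond });
```

With the path-only key, the second request reuses the first response, which matches the behavior reported at the top of this thread.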

I’m switching from Firebase functions to Netlify and this is a bit of an issue. I would have thought it very common to want a single function to return different output depending on query string based input (and to have that response cached, at least a little). What is the intent behind stripping query string? Could you provide more detail about what currently is part of the cache key so maybe we could work around it somehow?

Hey @drzax,

The problem is that we can’t determine what should and shouldn’t be cached. If we do cache function responses with query string params, some customers may be using this to hold sensitive/customer PII which shouldn’t be relayed to other customers, for example.

Did you wanna delve in a bit on your use case? Maybe there’s something we can do.

Hi @Scott ,

I’m not following your use-case.

Why would a customer using a function with a query string containing sensitive information be caching the response in the first place?

If a user did include sensitive information in the query string and enabled caching, wouldn’t subsequent users see the exact same response as the first user, regardless of their query string? And wouldn’t the sensitive data passed in the query string never even reach the function, because the response is cached?

If I am missing a use case where sensitive information is passed in the query string and caching is enabled, please kindly reply below.

p.s. A good example cache key from CloudFlare is:
${header:origin}::${scheme}://${host_header}${uri_iqs}

Thanks!

I don’t think anyone expects it to work the way it does with Netlify functions when caching is turned on. With cache-control, you effectively turn off the ability to use query string parameters for functions because the response cannot be relied upon.

Would you expect, for example, a function that returns product details to return the same thing, regardless of the item you’re requesting while also relying on cache?

/.netlify/functions/product-details?pid=1
/.netlify/functions/product-details?pid=2

I would expect to be able to cache these responses individually, not simply as “product-details”.

I might be wrong, since I have not tried function caching yet (having found this thread, I figured I would postpone it until this is resolved). But applying the case @spencewood highlighted above to @Scott’s example, a request to …/sensitive?id={UUID}, if cached, could return sensitive information to other users, because the response for a previous UUID would be returned for a new UUID:

.../sensitive?id=ACTA4 returns { birthday: "1.1.1970", name: "Jones", id: "ACTA4" }
.../sensitive?id=TIRK8 returns { birthday: "1.1.1970", name: "Jones", id: "ACTA4" }

Or would caching in this case not work? Or is it up to the developer to “not do this”?

Caching per parameter would tie each cached response to its UUID, and thus prevent the very leak that not implementing it is supposed to prevent, no?