Starting from around Wed, 02 Nov 2022 13:00 GMT, we've been seeing users receive 502 and 504 errors on the following endpoints:
/rest-read/rpc/me
/rest/rpc/last_seen_at
These errors aren't showing up consistently, but when a user does see them, they seem to be pinned to their session. The errors have also been getting worse over the last few hours, rendering our platform more and more unusable.
On our backend (the actual endpoints these redirects/rewrites lead to), I'm only seeing a few 50x errors from earlier in the day, before 06:00 GMT on 02 Nov 2022. My suspicion is therefore that some caching issue is going on here. But since the redirect/rewrite part is fully opaque to me, I can't investigate this further myself.
I collected these request IDs from the response headers of the failed 50x requests to Netlify:
x-nf-request-id: 01GGWK9KW5W53E92Q75631N6T6
x-nf-request-id: 01GGWK9KW5AKDW52AYE8R86BGA
x-nf-request-id: 01GGWN6YCRMQ20JZJPW97ZZ76V
x-nf-request-id: 01GGWN6YCRHSPJE3C2ZMAR48F4
x-nf-request-id: 01GGWPK1Z2YMQFE2FHBP6NARZ0
x-nf-request-id: 01GGWPK1Z28ZPW9W9CFX8YYKCP
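For reference, this is roughly how the IDs above can be captured on the client side. It's only an illustrative sketch, not our actual code; the endpoint path and the logging are simplified:

```typescript
// Minimal sketch: wrap fetch so that any 50x response logs the
// x-nf-request-id header Netlify attaches to its responses.
async function fetchWithRequestIdLogging(
  input: RequestInfo | URL,
  init?: RequestInit,
): Promise<Response> {
  const response = await fetch(input, init);
  if (response.status >= 500) {
    console.error(
      `${response.status} on ${response.url}`,
      `x-nf-request-id: ${response.headers.get("x-nf-request-id")}`,
    );
  }
  return response;
}

// Example usage against one of the affected endpoints:
// await fetchWithRequestIdLogging("/rest/rpc/last_seen_at", { method: "POST" });
```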
Towards the end of the day the number of errors has kept increasing, and the issue is still ongoing!
Hi @Wats0n and sorry to hear about the trouble! From our side, those connections look similar. For 3 of them, we returned a 502 timeout status, since this happened:
visitor connects to your netlify site
our CDN node immediately looks up how to handle the route and finds that you have a proxy redirect configured
it sends the request to your server…
…which fails to answer within 40 seconds, so we stop waiting and return a 502 timeout.
To fix this, ensure that your server starts to send content - at least HTTP response headers - within the first 30 seconds of a request. Presumably you can examine the logs on your server to understand what is causing those requests to be so slow there, since this is unlikely to be the experience you intend your visitors to have.
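In case it helps with checking that, here is a minimal sketch (the origin URL is a placeholder for your backend, called directly and bypassing Netlify) that measures how long the origin takes to return response headers - the part that has to complete before the proxy timeout:

```typescript
// Minimal sketch: fetch resolves as soon as response headers arrive,
// so the elapsed time approximates time-to-first-byte of the origin.
// The URL below is a placeholder - point it at your backend directly.
async function timeToHeaders(url: string): Promise<void> {
  const start = Date.now();
  const response = await fetch(url, { method: "GET" });
  const elapsedMs = Date.now() - start;
  console.log(`${url} -> ${response.status} after ${elapsedMs} ms (headers received)`);
}

// await timeToHeaders("https://origin.example.com/rest-read/rpc/me");
```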
The other 3 requests were all POST requests, which returned 504s. I don't see these reaching your server at all, which is unexpected, and I am not sure why that might have happened. I can see that, out of all requests to https://app.talentspace.io/rest/rpc/last_seen_at in the past week, there were over 115k successful accesses and only a handful of failures like this before today. There were quite a few today (hundreds), but I am not sure why; they do seem to have wrapped up around 4 hours ago.
Can you let me know for which x-nf-request-id (and timestamp) you're seeing the request time out because our server is not responding?
I had a quick look and it looks like none of them are reaching our server (ALB or WAF).
And will you be investigating further why the 504s failed on your end without reaching our server?
Hi @amelia thank you for continuing the investigation.
The problem is unfortunately still not resolved. I've set up dedicated monitoring for it and still see it happening today.
My original thought was that Netlify has a bad cache entry (the 504 response) that it sporadically returns. So I changed the redirect from /rest/rpc/last_seen_at to /v1/rest/rpc/last_seen_at, but this did not help:
As you can see, the proxied redirects fail with a 504 error, but the direct request succeeds without error. And as mentioned before, there are no 50x errors visible on my endpoint.
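A minimal sketch of that comparison follows. The direct backend URL, headers, and request body are placeholders; the proxied URL is the one mentioned above:

```typescript
// Minimal sketch: send the same request once through the Netlify proxy
// and once directly against the backend, logging status and request ID.
async function compareProxiedAndDirect(): Promise<void> {
  const targets = {
    proxied: "https://app.talentspace.io/v1/rest/rpc/last_seen_at",
    direct: "https://backend.example.com/rest/rpc/last_seen_at", // placeholder
  };

  for (const [label, url] of Object.entries(targets)) {
    const response = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: "{}", // whatever payload the RPC endpoint expects
    });
    console.log(
      `${label}: ${response.status}`,
      `x-nf-request-id: ${response.headers.get("x-nf-request-id") ?? "n/a"}`,
    );
  }
}
```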