Netlify pages unreachable for 1 or 2 hours

Good morning everyone,

Our pages occasionally become unreachable for 1 or 2 hours at seemingly random times (example page https://sundaritalia.netlify.app/). Specifically, this page went down during the following time windows:

Time (Europe/Rome)

Down - 2023-12-05 04:16:28 | UP - 2023-12-05 05:11:35
Down - 2023-11-27 05:37:48 | UP - 2023-11-27 07:55:54

We were wondering if this was due to some update on Netlify’s side and if the problem would reoccur even with a Pro plan.

Thank you.

Hi, @bfine-agency. I researched this and I do see the site returning only 502 responses during the two time windows above.

The site is using a serverless function for server-side rendering (SSR). Function invocations have a default timeout of ten seconds. If the SSR function takes longer than 10 seconds before it starts sending a response, the function will be stopped and a 502 status response is returned to the client.

This is what happened for this site during both of the time windows shared. The reason the function was timing out might be listed in the function logs found here:

However, the function logs are only kept for 24 hours on the current plan. This means there are no longer any logs for this function for either downtime window. On the Pro plan, the function logs are kept for seven days. A full seven days of logs would allow for further research of the 2023-12-05 event.

Also, the logs will only show the reason for the timeout if there is some sort of verbose logging being done by the function code itself. This function does not appear to do any logging so it is possible that, even if the function logs were available, they may not reveal any useful information about why the SSR function was timing out.

If the logs do not show an explanation for the timeout, our recommendation is to add console.log() and console.time() function calls to the code in question to reveal more information about where the function is spending its time.
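As a rough illustration of that kind of logging, here is a minimal sketch. The `timed` helper, the `loadData` and `renderPage` steps, and the simplified handler shape are all hypothetical stand-ins, since I have not seen this site's actual function code. Only `console.log()`, `console.time()`, and `console.timeEnd()` are real calls.

```typescript
// Hypothetical helper: wraps one async step and logs how long it took.
async function timed<T>(label: string, step: () => Promise<T>): Promise<T> {
  console.time(label);
  try {
    return await step();
  } finally {
    // Runs even if the step throws, so failed steps are timed too.
    console.timeEnd(label);
  }
}

// Hypothetical stand-ins for the real work the SSR function does.
async function loadData(): Promise<{ title: string }> {
  return { title: "example" };
}

async function renderPage(data: { title: string }): Promise<string> {
  return `<h1>${data.title}</h1>`;
}

// Simplified handler shape for illustration only (an assumption, not the site's code).
async function handler(): Promise<{ statusCode: number; body: string }> {
  console.log("SSR start");
  const data = await timed("loadData", loadData);
  const body = await timed("renderPage", () => renderPage(data));
  console.log("SSR done");
  return { statusCode: 200, body };
}
```

With something like this in place, the function logs would show one timing line per step. If an invocation is stopped at the timeout, the last label logged before the cutoff points at the slow step.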

There is also a Sentry integration that can be used for third-party monitoring of function performance:

As for whether upgrading to Pro could have prevented this: it would not have changed the behavior of the site or prevented the 502s. However, on a Pro plan we would at least have the function logs for the 2023-12-05 event (but, again, even then those logs may not have revealed the root cause).

In most cases, the reason that an SSR function starts timing out temporarily is that some third-party resource the function calls (like an API endpoint or third-party website) becomes unavailable or slow. This in turn causes the function to run so slowly that it times out. If that turns out to be the cause, it means that the issue is with the third-party API or website. The next step would be to contact that hosting provider’s technical support to get the issue resolved. Again, that is all hypothetical as I don’t have enough information to know if that is what happened here.
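If a slow third-party call does turn out to be the cause, one common mitigation is to give each outbound request its own deadline, well under the ten-second function limit, and to log when that deadline is hit. The sketch below is only an illustration. `withTimeout`, `fetchUpstream`, and the 3000 ms budget are hypothetical choices, not something taken from this site's code. It also assumes a runtime with global `fetch` and `AbortController` (Node 18 or later).

```typescript
// Hypothetical helper: aborts `work` if it has not settled within `ms` milliseconds.
async function withTimeout<T>(
  work: (signal: AbortSignal) => Promise<T>,
  ms: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await work(controller.signal);
  } finally {
    // Always clear the timer so it cannot fire after the work has finished.
    clearTimeout(timer);
  }
}

// Hypothetical usage: give an upstream API 3 seconds (an assumed budget)
// for both the request and reading the response body.
async function fetchUpstream(url: string): Promise<string> {
  try {
    return await withTimeout(async (signal) => {
      const res = await fetch(url, { signal });
      return res.text();
    }, 3000);
  } catch (err) {
    // This line ends up in the function logs and names the slow dependency.
    console.log("upstream call failed or timed out:", url, err);
    throw err;
  }
}
```

The point of this design is that the function fails fast with a log entry naming the slow dependency, instead of being stopped at the ten-second limit with nothing in the logs.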

To summarize, I do see that the site’s function did time out and return 502s during both time windows. However, to understand why, detailed logging from the function is required, and no such logs exist because the function does not log verbosely and because the logs for both time windows have expired.

If there are any questions about all this, please reply here anytime.

Hi Luke,

I appreciate your time and response. We’ll wait for the issue to recur so we can analyze the logs as you suggested.

In the meantime, thank you and I wish you all the best.
