Occasional 500 Internal Server Errors

This weekend we didn’t have any 500 errors. The first occurrence was on Monday, after the customer started to edit and publish content and new deploys to Netlify were requested.

@fool Our customer would like more insight into what causes the 500 error. Have you already found the exact root cause? If yes, can you explain what goes wrong?

Any view on a timeline to release a potential fix?
The customer flagged a few cases where end-users experienced a 500 error, so this is really impacting their business.

I couldn’t explain the cause effectively or completely, but it is related to the way we store and look up content. I’m not sure why it affects your site more than most everyone else’s, but the error messages in our internal logs helped our engineers figure that out.

I also do not have a planned ETA for the rest of the work. I got the impression it was on the longer-term roadmap rather than immediate, but before working further on this, the developers are also waiting to hear how the incidence of errors has changed following what they’ve already done.

So: I understand you saw one 500 error since I updated you - was that the only one?


Hi Chris, no, that was not the only 500 error we saw.
I created an export from our monitoring system to list them for 3 important pages.
While there seem to be fewer issues since the 8th, the problem still persists.

Hope this data helps you to track down the problem.
(Since this software only allows me to list 6 URLs, I replaced them with a description.)

English Homepage: https://www.vlerick.com/en/
Dutch Homepage: https://www.vlerick.com/nl
Search page: https://www.vlerick.com/en/find-a-programme/?

Date Time Url
04/10/2021 11:32 English home page
04/10/2021 11:32 Dutch home page
05/10/2021 01:18 Search page
05/10/2021 08:46 Dutch home page
05/10/2021 13:21 English home page
05/10/2021 14:38 Search page
05/10/2021 14:39 English home page
05/10/2021 14:39 Dutch home page
05/10/2021 14:39 Dutch home page
05/10/2021 14:39 Search page
05/10/2021 14:40 English home page
05/10/2021 14:43 Search page
05/10/2021 14:44 English home page
05/10/2021 14:44 Dutch home page
05/10/2021 14:49 Search page
05/10/2021 15:02 Dutch home page
05/10/2021 15:02 Search page
05/10/2021 15:05 English home page
05/10/2021 15:05 Dutch home page
05/10/2021 15:07 Search page
05/10/2021 15:08 English home page
05/10/2021 15:08 Dutch home page
05/10/2021 15:09 Search page
05/10/2021 15:10 English home page
05/10/2021 15:10 Dutch home page
05/10/2021 15:43 Search page
05/10/2021 16:22 English home page
05/10/2021 16:22 Dutch home page
05/10/2021 16:26 Search page
06/10/2021 10:19 English home page
06/10/2021 10:19 Dutch home page
06/10/2021 10:27 Search page
06/10/2021 10:28 English home page
06/10/2021 10:28 Dutch home page
06/10/2021 17:22 Dutch home page
06/10/2021 17:23 Dutch home page
06/10/2021 17:24 Dutch home page
06/10/2021 18:14 Search page
06/10/2021 18:15 English home page
06/10/2021 18:15 Dutch home page
07/10/2021 01:18 Dutch home page
07/10/2021 08:54 English home page
07/10/2021 08:54 Dutch home page
07/10/2021 10:16 Search page
07/10/2021 10:17 Search page
07/10/2021 10:50 Search page
07/10/2021 10:51 English home page
07/10/2021 10:51 Dutch home page
07/10/2021 10:51 Search page
07/10/2021 11:52 English home page
07/10/2021 11:52 Dutch home page
07/10/2021 11:52 Search page
07/10/2021 11:55 English home page
07/10/2021 11:55 Dutch home page
07/10/2021 11:55 Search page
07/10/2021 11:56 Search page
07/10/2021 11:57 English home page
07/10/2021 12:16 English home page
07/10/2021 12:16 Dutch home page
07/10/2021 12:16 Search page
07/10/2021 12:19 Search page
07/10/2021 12:20 English home page
07/10/2021 12:20 Dutch home page
07/10/2021 12:22 Search page
07/10/2021 12:23 Search page
07/10/2021 12:25 English home page
07/10/2021 12:25 Dutch home page
07/10/2021 12:27 English home page
07/10/2021 12:27 Dutch home page
07/10/2021 12:29 Search page
07/10/2021 12:39 English home page
07/10/2021 12:39 Search page
07/10/2021 12:41 Dutch home page
07/10/2021 12:42 English home page
07/10/2021 12:42 Dutch home page
07/10/2021 12:47 Search page
07/10/2021 12:48 Search page
07/10/2021 12:49 Search page
07/10/2021 12:57 Search page
07/10/2021 13:11 Search page
07/10/2021 13:12 English home page
07/10/2021 13:13 English home page
07/10/2021 13:13 Dutch home page
07/10/2021 18:18 Search page
07/10/2021 18:19 English home page
07/10/2021 18:56 Search page
07/10/2021 18:57 Search page
07/10/2021 18:57 English home page
07/10/2021 18:58 English home page
07/10/2021 18:58 Search page
07/10/2021 18:58 Dutch home page
08/10/2021 01:19 Dutch home page
08/10/2021 09:14 English home page
08/10/2021 09:14 Dutch home page
08/10/2021 09:48 English home page
08/10/2021 09:48 Search page
08/10/2021 09:49 English home page
08/10/2021 09:49 Dutch home page
08/10/2021 11:28 English home page
08/10/2021 11:29 English home page
08/10/2021 11:30 English home page
08/10/2021 11:31 English home page
08/10/2021 11:32 English home page
08/10/2021 11:33 English home page
08/10/2021 11:34 English home page
08/10/2021 12:12 Search page
08/10/2021 12:13 English home page
08/10/2021 12:13 Dutch home page
08/10/2021 18:59 English home page
08/10/2021 18:59 Dutch home page
08/10/2021 18:59 Search page
08/10/2021 19:01 Search page
09/10/2021 01:18 Search page
11/10/2021 09:40 Search page
11/10/2021 11:11 English home page
11/10/2021 11:12 English home page
11/10/2021 11:12 Dutch home page
11/10/2021 13:10 English home page
11/10/2021 13:11 English home page
11/10/2021 13:11 Dutch home page
11/10/2021 13:11 Search page
11/10/2021 15:09 Dutch home page
11/10/2021 16:43 Search page
11/10/2021 17:03 English home page
11/10/2021 17:03 Dutch home page
12/10/2021 04:09 Dutch home page
12/10/2021 08:50 Dutch home page
13/10/2021 13:35 Search page
14/10/2021 01:18 English home page
14/10/2021 01:18 Dutch home page
14/10/2021 03:30 Dutch home page
14/10/2021 08:08 Search page
14/10/2021 09:38 Dutch home page
14/10/2021 10:20 English home page
14/10/2021 14:11 Dutch home page
14/10/2021 14:11 Search page
14/10/2021 14:12 Search page
14/10/2021 14:14 Dutch home page

By the way, times are CET
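
For anyone wanting to check the "fewer issues since the 8th" impression against an export like this, here is a minimal sketch that tallies rows per calendar day. It assumes each row follows the `DD/MM/YYYY HH:MM <page label>` layout shown above. The function names are hypothetical and not part of any monitoring product.

```python
from collections import Counter
from datetime import datetime


def parse_line(line):
    """Split 'DD/MM/YYYY HH:MM <page label>' into (datetime, label)."""
    date_part, time_part, label = line.strip().split(" ", 2)
    stamp = datetime.strptime(f"{date_part} {time_part}", "%d/%m/%Y %H:%M")
    return stamp, label


def errors_per_day(lines):
    """Count reported errors per calendar day, skipping blank lines."""
    counts = Counter()
    for line in lines:
        if line.strip():
            stamp, _ = parse_line(line)
            counts[stamp.date().isoformat()] += 1
    return dict(sorted(counts.items()))


# A few rows copied from the export above (not the full list).
sample = [
    "04/10/2021 11:32 English home page",
    "04/10/2021 11:32 Dutch home page",
    "05/10/2021 01:18 Search page",
    "09/10/2021 01:18 Search page",
]
print(errors_per_day(sample))
```

Feeding the full export through `errors_per_day` gives a per-day count that can be compared before and after a given date.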

Thanks for those details, Rik! Our team has spent quite a while working on this, and while they think things have improved, they have also figured out the proximate cause of your problems: you deploy manually, 5x in a row.

The prior deploys have not finished processing before the others start, leading to the 500’s at deploy time (as you helpfully pointed out :)).

Could you try…not doing that? Deploy only once? Or wait 10 minutes between deploys? It’s kind of a waste to deploy repeatedly that fast, but you seem to do it quite regularly, which as I mentioned is the proximate cause of the problem.
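
If the deploys are triggered by a script or a CMS hook rather than by hand, the "wait 10 minutes" suggestion could be enforced with a small gate in front of the trigger. This is only a sketch under that assumption: `DeployGate` is a hypothetical helper, not a Netlify feature, and the call that actually starts the deploy is left out.

```python
import time


class DeployGate:
    """Allow a deploy only if the previous one started long enough ago."""

    def __init__(self, min_interval_s=600, clock=time.monotonic):
        self.min_interval_s = min_interval_s  # 600 s = the 10 minutes suggested above
        self._clock = clock                   # injectable so the gate can be tested
        self._last_start = None
        self.pending = False                  # True if a request arrived too early

    def request(self):
        """Return True if a deploy may start now; otherwise mark one as pending."""
        now = self._clock()
        if self._last_start is None or now - self._last_start >= self.min_interval_s:
            self._last_start = now
            self.pending = False
            return True
        self.pending = True
        return False


# Demonstration with a fake clock instead of real waiting.
fake_now = [0.0]
gate = DeployGate(clock=lambda: fake_now[0])
first = gate.request()    # allowed: nothing has been deployed yet
fake_now[0] = 30.0
second = gate.request()   # refused: only 30 s after the previous deploy
fake_now[0] = 600.0
third = gate.request()    # allowed again: 10 minutes have passed
print(first, second, third)
```

A refused request sets `pending`, so the caller can run a single follow-up deploy once the interval has passed instead of five in a row.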

We do have the actual cause sorted out, but it will take substantial work to fix. Work we do plan to do, but not in the immediate future.

Let me know if you think you can modify your deployment habits or not and I’ll pass that feedback on to our dev team to consider next steps.

To follow up, we have also enabled a feature flag for your account that should further improve performance even if you can’t change deploy patterns. Will be interested to hear how things go!

Thank you very much for your effort and that of your team. :+1:
I’ll take this up with the team after the weekend and see how we can move forward.

Hello @fool,

We are happy to report that we have seen a complete absence of 500 errors ever since the implementation of that feature flag.

Some questions however:

  • Is that feature flag going to be merged into master, or is it a temporary measure?
  • Is there a tradeoff in deploy time and/or otherwise when this feature flag is enabled?
  • We still see 502 Bad Gateway errors (both before and after the feature flag was enabled), which were previously hidden among the multitude of 500 Internal Server Errors. Can you see what the reason is for those 502’s?

Thanks!

@fool
just FYI, we detected a couple more error 500’s yesterday
20/10/2021 14:24:50 https://www.vlerick.com/en/
20/10/2021 14:30:56 https://www.vlerick.com/nl/

Hey Rik and Tim!

We think those errors should be the last! Yesterday our dev team shipped the final part of their work that should fully resolve your issue. I think the fix was shipped a couple hours after your last report.

While you are among the first to benefit from the feature-flagged fix, it will become our normal production path in coming weeks - we rushed to get it ready for you to confirm the behavior since your site was particularly triggering some conditions that led to problems, but everyone will be on the new system soon, so the fix won’t be lost nor you stranded out on an abandoned codepath :slight_smile:

I do not expect that you’d encounter any performance changes, whether in site service, in deploying, or in the speed at which your deploys go live.

Re: 502 errors, those are usually something around how we connect to proxied resources or functions and would not be related to the 500’s. Please send us the x-nf-request-id (see the Support Guide “Netlify Support asked for the 'x-nf-request-id' header? What is it and how do I find it?”) for a 502 you observe and we’ll be happy to investigate further.
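
For monitors that cannot open a browser, the header can also be read from a script. Below is a minimal sketch using only Python's standard library. The header name comes from the post above; the function names are hypothetical.

```python
import urllib.error
import urllib.request

HEADER = "x-nf-request-id"  # header name taken from the post above


def request_id_from_headers(headers):
    """Return the request ID from a header mapping (case-insensitive), or None."""
    for name, value in headers.items():
        if name.lower() == HEADER:
            return value
    return None


def fetch_status_and_request_id(url, timeout=10):
    """Fetch url and return (status, request_id), also for error responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, request_id_from_headers(resp.headers)
    except urllib.error.HTTPError as err:
        # HTTPError still carries the response headers of a 4xx/5xx reply.
        return err.code, request_id_from_headers(err.headers)
```

Calling `fetch_status_and_request_id` on a monitored URL and logging the second value whenever the status is 502 would give support an ID to search for.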


Hi @fool

Thanks for the effort again.
We noticed an improvement in server response time and no 500 errors for now.
We will monitor this for some more days and get back to you on this.

Our monitoring system doesn’t have a browser so we can’t look up the request-id.
The last occurrences of a 502 error were on the Dutch homepage (Vlerick Business School):

  • 25/10/2021 19:08:52 CET
  • 22/10/2021 15:34:46 CET

It would be great if you could have a look at this, but we know it is harder to find the request with only the timestamps.
We’re already happy that the 500 errors are gone.

6 posts were merged into an existing topic: 502/503 Internal Server Error

@Tim_Refaerts_Vlaming :wave:

I want to assure you your response hasn’t gotten lost and that I have passed your message on to @fool again.


502’s are different - these are timeouts, usually from our proxying layer, rather than errors in finding your content as the 500’s were.

Could you please confirm the timezone on those reports? My timezone converter claims that CET is not the currently used TZ but instead CEDT, so I’d like to make sure I’m looking at the right things before I dig in :slight_smile:

Hi Chris (@fool )
Can I shortcut the discussion by providing them in UTC?

25/10/2021 17:08:52 UTC
22/10/2021 13:34:46 UTC
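
These UTC values are two hours behind the local timestamps reported earlier, which corresponds to summer time (UTC+2) rather than CET (UTC+1). Here is a minimal sketch of that conversion. It assumes the monitoring system reports wall-clock time for the `Europe/Brussels` zone, which the thread does not actually state. The function name is hypothetical, and `zoneinfo` needs a time zone database on the system or the `tzdata` package.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: the monitor reports Belgian local time.
LOCAL_ZONE = ZoneInfo("Europe/Brussels")


def local_to_utc(stamp):
    """Convert 'DD/MM/YYYY HH:MM:SS' local wall-clock time to an aware UTC datetime."""
    naive = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
    return naive.replace(tzinfo=LOCAL_ZONE).astimezone(timezone.utc)


for stamp in ("25/10/2021 19:08:52", "22/10/2021 15:34:46"):
    print(stamp, "->", local_to_utc(stamp).strftime("%d/%m/%Y %H:%M:%S UTC"))
```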

hi there, @Rik

I want to assure you we haven’t forgotten about this thread. Stay tuned for more information soon. The team has looped in the appropriate folks in the Engineering Org to work on this further.

Hi @Rik and @Tim_Refaerts_Vlaming

Unfortunately, we weren’t able to get to the bottom of the 500 server errors; however, we haven’t seen any for the last week. Can you let us know if you do see the issue again, or if your monitor doesn’t agree? Thanks!