Hi all, we switched hosting from GitHub Pages to Netlify a few weeks ago, and I have noticed that for the first minute or so after a fresh deployment, our website is pretty much non-responsive.
We are using Netlify in order to make the most of Edge Functions, and it seems that deploying a site with Edge Functions defined is what causes the problem?
Yesterday, I tried adding Algolia Search, which triggers a web crawler to visit the site once the build has finished. It’s running at about a 50% success rate, with the other 50% hitting 502/503 errors when trying to visit our site, and the crawler’s logs mention the Edge Functions not being available.
Any advice on what we could be doing to cause this would be most appreciated. The Edge Functions currently just define functions that we plan to use for A/B testing; none of that is actually being used on the front end at the moment.
Naturally we don’t want visitors hitting downtime on our website, and the crawler hitting 502s is causing problems because we then have to trigger another Netlify build manually until it succeeds.
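For anyone in the same position, here is a rough sketch of one possible workaround for the manual rebuilds: poll the site after a deploy and only kick off the crawl once it stops answering 502/503. The names `shouldRetry`, `backoffDelays` and `waitUntilReady`, and all of the timings, are hypothetical; none of this is a Netlify or Algolia API, and it assumes a runtime with a global `fetch` (Node 18+ or Deno).

```javascript
// Status codes treated as "deploy not ready yet". 502 and 503 are the
// ones reported in this thread; adjust the set as needed.
const RETRYABLE_STATUSES = new Set([502, 503]);

function shouldRetry(status) {
  return RETRYABLE_STATUSES.has(status);
}

// Exponential backoff schedule: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelays(attempts, baseMs, maxMs) {
  const delays = [];
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(baseMs * 2 ** i, maxMs));
  }
  return delays;
}

// Resolves with the first status that is not 502/503, or rejects once
// every attempt has been used up. Assumes a global fetch is available.
async function waitUntilReady(url, { attempts = 6, baseMs = 5000, maxMs = 60000 } = {}) {
  for (const delay of backoffDelays(attempts, baseMs, maxMs)) {
    const response = await fetch(url);
    if (!shouldRetry(response.status)) {
      return response.status;
    }
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
  throw new Error(`${url} was still returning 502/503 after ${attempts} attempts`);
}
```

Something like `waitUntilReady("https://flowforge.com")` could then run in a post-deploy script ahead of the crawl. It only hides the symptom, though; it does nothing about the failing Edge Function itself.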
https://flowforge.com
source available here too: GitHub - flowforge/website: The FlowForge Website
This is our only Edge Function: website/eleventy-edge.js at main · flowforge/website · GitHub
I also created a GitHub issue for it and attached a video so you can see the problem in action: Edge Function Failure on Deploy to Netlify · Issue #623 · flowforge/website · GitHub
In this case it even explicitly errored with regard to the Edge Function. Netlify then seems to restart, and after that, it works great.
Thanks,
Joe
Hi @joepavitt,
Thanks for sharing those details. I do see quite a few failures for that site. The errors seem mixed to me, so I have filed this for the devs to check and pinpoint the exact issue.
We’ll follow up as we have more info.
Thanks @hrishikesh - any insight into the failures I can see for myself, or are those issues displayed on Netlify-internal logs only?
If you’re looking at our general build/deploy logs, we did have some regular failures a few weeks ago due to a missing folder dependency “docs”, which we’ve since resolved.
Hi @hrishikesh, we had a downtime report of 1hr 37m for our website last night. It started at 21:54 UK time and was reporting a 502 Bad Gateway, with the Edge Function error I mentioned previously.
I can also see:
Apr 13, 11:24:08 PM: error An edge function failed to initialize due to an internal error. Please reach out to Netlify support.
repeated every 30 seconds or so for the duration of this incident in our Edge Functions logs.
There were no new deployments/builds that triggered any change, so this appears to be a random occurrence? The site then came up again at 23:31.
I will be removing the Edge Functions definition from our netlify.toml, as we aren’t using them due to the instability, but it seems even just defining them causes problems.
Any updates on this would be most appreciated.
Thanks
While the root cause of the issues is still to be determined by the devs, our team noticed that you are relying on file system access in Edge Functions. This will soon no longer be supported, at which point your Edge Functions would break. It’s advisable to start migrating your Edge Functions away from relying on the file system.
Thanks @hrishikesh - although not sure I understand where in our Edge Function we are accessing File System? Are you referring to the first import line? Unfortunately, that’s pre-determined by our use of Eleventy, so I will raise that with the Eleventy development team.
Oh sorry, it was about this line: website/eleventy-edge.js at main · flowforge/website · GitHub, which is already marked as something that will throw an error in prod. Using CLI v14.x, this should probably not be needed now.
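To illustrate the general direction of such a migration (this is not the actual flowforge/website code), here is a minimal sketch that assumes whatever the function needs can be embedded in the module at build time instead of being read from disk at request time. `PRECOMPILED`, `chooseVariant` and `handle` are hypothetical names, and the A/B variant cookie is just an example payload.

```javascript
// Hypothetical data that a build step would write into the module itself,
// so the function never has to touch the file system at request time.
const PRECOMPILED = { variants: ["a", "b"], defaultVariant: "a" };

// Pick the A/B variant named in a "variant" cookie, falling back to the
// default when the cookie is missing or names an unknown variant.
function chooseVariant(cookieHeader, data = PRECOMPILED) {
  const match = /(?:^|;\s*)variant=([^;]+)/.exec(cookieHeader || "");
  const requested = match ? match[1] : null;
  return data.variants.includes(requested) ? requested : data.defaultVariant;
}

// Handler in the usual request-in, response-out shape. In a real edge
// function this would be the module's default export.
function handle(request) {
  const variant = chooseVariant(request.headers.get("cookie"));
  return new Response(`variant ${variant}`, {
    headers: { "x-variant": variant },
  });
}
```

The point is only that everything the handler touches is already in memory once the module has loaded, so there is no file access left to fail.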
@hrishikesh any updates on the original issue here at all please? We’ve now been over 2 weeks without being able to use edge functions.
Hey @joepavitt,
The engineers have confirmed that it was indeed the filesystem access that was breaking this and have deployed a fix. The fix should be spread across the entire network by the end of the week.
Thanks @hrishikesh, even though it was in the try/catch too?
try…catch will catch user errors, not system errors. Using filesystem is not a user error, thus it wasn’t caught.
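A toy model of that distinction, in case it helps. This is not how Netlify's runtime is implemented: `toyRuntime` and its `initFails` flag are made up purely to simulate a platform-level failure that happens before the handler is ever called.

```javascript
// A handler whose whole body is wrapped in try...catch.
function handler(doWork) {
  try {
    return { status: 200, body: doWork() };
  } catch (err) {
    // Reached only for exceptions thrown while the try block is executing.
    return { status: 200, body: "fallback: " + err.message };
  }
}

// Hypothetical stand-in for a hosting runtime (NOT Netlify's real code).
function toyRuntime(fn, options = {}) {
  if (options.initFails) {
    // Simulated system-level failure: the handler is never invoked,
    // so its try...catch never gets a chance to run.
    return { status: 502, body: "edge function failed to initialize" };
  }
  // Normal path: run the handler, optionally with work that throws.
  return fn(() => {
    if (options.workThrows) {
      throw new Error("boom");
    }
    return "ok";
  });
}

console.log(toyRuntime(handler));                       // handler runs normally
console.log(toyRuntime(handler, { workThrows: true })); // error caught by the handler
console.log(toyRuntime(handler, { initFails: true }));  // failure the handler never sees
```

In the second call the error is thrown inside the `try` block, so the handler recovers. In the third, the failure happens outside the handler entirely, which matches the "failed to initialize" message in the logs above.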