Site going down every 10 minutes

I hear you. We ran a full stack on AWS for 10 years using RDS, ElastiCache, CloudFront and EC2 on custom .NET code, and we had ONE outage. We decided to simplify our stack and reduce tech debt by going JAMStack with a modern PaaS, and it’s costing us dearly. This is our busiest time of the year and I have to deal with daily support requests about our site being down. This is the first time we’ve had to deal with this in 20 years of running this business.


Netlify has been great for a few years. Just started to fail miserably 10 days ago. Literally every day it is down and they never update their status! Our entire site is down now! What is wrong with Netlify?

Apparently, Netlify has probably laid off a bunch of support staff, as this is completely outrageous. They are still saying: “All systems operational.” This is a flat-out lie. Edge has been down for 2 hours and counting now.

Hard to believe, I’m still the only site in the world on Netlify that is still malfunctioning? The site goes down every 10 minutes. Edge functions are not working. Nobody else?

We are down again also!

We are experiencing the same issue.

When we delete/disable our edge functions, the site operates correctly.

Apparently, we are all too small for Netlify to care. Edge functions are not working at all. They go up and down every 10 minutes. Same error every day for 10 days already. Today is the worst day, though, by far. Especially, since Netlify refuses to acknowledge their network is malfunctioning. I guess it will take a really big client having massive failure for them to acknowledge the problem.


Hello folks,

My apologies, I did not see this thread earlier. I have passed along all of your feedback and we have an engineering team investigating as we speak. I will follow up here when I have more information.

Hi folks,

I run our Support team and I’ll admit today wasn’t a great day for stability in edge functions. I don’t know if everyone who reported issues today in this thread is using Edge Functions, but the symptoms of an issue we were working on today with our partner Deno sure do line up with at least @daniel.hedrick's report.

While this is not the uptime we intend, it also affected only edge functions. Edge functions are an experimental feature, not yet released as GA, and thus not 100% stable. We say this in our documentation as clearly as we can

Your choice to use edge functions is one you should make with your business in mind: are you ok relying on a beta service that sometimes has issues, and an API implementation that could change? Great, Edge Functions are for you. Do you need 100% stability? Edge Functions are not (yet) for you. Use lambdas instead! Most of our frameworks that use edge functions have an option for that. Let us know if you can’t find it, and we’ll guide you to it.

Going forward, we’ll continue to work on stability. When we do release as GA, we’ll provide an SLA on edge functions for our enterprise customers, we’ll expect them to be stable, and we’ll post on our status page about issues affecting even a small percentage of edge functions, which is what happened today.

When we release as GA, we’ll make a lot of noise, publicizing the launch, removing the beta label from the docs, etc. So you won’t be able to miss it. But until then, please set expectations around the stability of this feature that match its experimental nature.

@fool We are not using edge functions. We are running NextJs on Netlify. NextImage just failed for almost the entire day. What does this have to do with edge functions? What does this have to do with the same errors that were reported for the last 10 days? Are you saying that running NextJs on Netlify is an experimental feature on Netlify? If so, then Netlify should say so, and we wouldn’t have bothered running NextJs on Netlify. Maybe Netlify runs NextJs on edge functions. If so, then you should say so.

So I just read the docs on NextJs on Netlify. I have no idea how it works, but running NextJs on Netlify uses Netlify’s Next.js Runtime, which is installed automatically when you run NextJs. The docs say that, among other things, Next/Image works on Netlify by using Edge functions. So apparently, yes, if you plan to run NextJs on Netlify then you must use Edge functions, and now when edge functions fail, the problem is that we are responsible b/c we shouldn’t have any expectation that Edge functions will run properly as it is a beta feature? Why not put a big red message on the NextJs Netlify page that says: “NextJs on Netlify is a beta. Do NOT run any production sites of NextJs on Netlify b/c they all may stop working at any time.” It’s doubtful Netlify will do this, b/c they make a big deal about being able to run NextJs on Netlify. But apparently when the setup fails, it’s not their problem. This is ludicrous. You can’t have it both ways. Either you support running NextJs on Netlify and then you take responsibility for the sites running NextJs on Netlify, or you just don’t bother claiming that you can run NextJs properly on Netlify.


So in sum, the message here is simple: Do you need your NextJs site to have 100% stability? Then don’t run it on Netlify. B/c while you can opt out of certain features of Netlify’s NextJs Runtime, doing so means that you cannot take advantage of some of NextJs’s important features, like NextImage, which require Netlify Edge. So lesson learned. NextJs on Netlify is beta and is not guaranteed to work at all. And when it does work, unless you want to pay enterprise fees, Netlify couldn’t give a damn about your site. I guess we will have to move to Vercel.

What’s laughable about Jamstack hosts like Netlify is that they have an amazing landing page on NextJs: Deploy Next.js Sites and Apps - Starter Templates & Resources | Netlify - which makes you think how easy and great it is to run NextJs on Netlify. But then you miss the fine print. This is beta. It won’t work, and when it does they won’t support you unless you pay enterprise fees. What does enterprise cost? They won’t say, b/c it’s probably a different price for everyone, depending on how they gauge your pain level. Which is to say that no small biz can ever afford enterprise fees. So this is just flat-out dishonest marketing. For 20 years, we ran PHP applications, like Magento, on cloud hosting, and had an SLA for 99.9% uptime and 24x7 online support, even when paying $20/month for a site. And yet Netlify won’t provide any SLA or support unless you pay enterprise? Absolutely ridiculous. You should make it clear from the get-go that Netlify is only available for large enterprises and the rest of your pricing plans are only for applications that don’t have to actually work in production.

PS Enterprise plans are nearly always complete bs with any tech company nowadays. The support is nowhere near as advertised and the uptime is no different than non-enterprise. The sad state of SaaS nowadays is that support is non-existent for all but very large companies who can shell out hundreds of thousands a month. Nobody wants anything to do with SMBs, though it’s the SMBs that support the insane valuations that companies like Netlify provide to investors. Where would Netlify be without the massive number of SMBs on their Business Plan?


You guys did make a lot of noise when you released the beta in April. It’s the #1 key feature listed on your home page.

No mention of beta / experimental / unreliable status on your marketing materials.

None of your Edge Function blog posts mention that they are not to be used in production.

The only mention I can find is on the main Netlify Docs page. It just says ‘BETA’, not ‘Unsupported’.

But when #### hits the fan you put the blame back on your customers by claiming you didn’t ACTUALLY mean for anyone to use this?

That’s super lame. Outages happen, but the least you can do is be proactive and notify your customers, explain the incident, the cause of it, what you did to remedy, and how it should affect future stability. How do you explain that there’s absolutely no report of any of this on your status page, even today? If your plan is to hide Edge Function failures, you should stop advertising the feature completely. Right now you are being deceitful, and it doesn’t look like an accidental omission. If you had been honest, the edge function feature would be in some limited beta, you would not be advertising it, and we would have picked a different vendor 8 months ago. Too late now.

Our cost is not just our monthly bill. We invested $50K in development costs in a Netlify-centric solution, and our business reputation is at risk. Your attitude regarding those outages, blaming your customers, is very frustrating.

Hello there,

Thank you for the continued feedback and for articulating your concerns. I hear that you have not had an optimal experience, and I have looped in all of the appropriate Product and Engineering leaders. A member of the Engineering team will follow up here when we have more information for you.

Thank you for your patience.

I understand your frustrations and concerns, however your information is slightly inaccurate. That’s probably on us as we did not do a good job documenting it, but it does exist. For example, @fool said:

The option mentioned by @fool is documented here:

You may also manually disable the Edge Function by setting the environment variable NEXT_DISABLE_EDGE_IMAGES to true .

So, Next.js on Netlify is not Beta, a specific subset of features uses Beta features by default, which you can disable at will.
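To make the setting above concrete, here is a minimal TypeScript sketch of a pre-deploy check that confirms the variable is set as described. The function name `edgeImagesDisabled` is hypothetical, and treating only the literal string `true` as "disabled" is an assumption based on the wording above, not a statement of how the Next.js Runtime actually parses the value.

```typescript
// Hypothetical helper, not part of any Netlify package.
// Assumption: only the literal string "true" (ignoring surrounding whitespace)
// counts as "disabled", mirroring the wording quoted in this thread.
function edgeImagesDisabled(env: Record<string, string | undefined>): boolean {
  const raw = env["NEXT_DISABLE_EDGE_IMAGES"];
  return raw !== undefined && raw.trim() === "true";
}
```

In a Node build script this could be called as `edgeImagesDisabled(process.env)` to fail the build early if the variable is missing. It only checks configuration; it says nothing about what the runtime does with images once the variable is set.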

Thanks for your response. However, I think you might be unaware that there are 2 issues here. To recap, running NextJs on Netlify is dependent on Netlify’s On Demand Builders, which are functions (what kind of functions, I don’t know) that are cached on Netlify’s Edge CDN. Secondly, NextImage, which is a core feature of NextJs, runs on Netlify’s Edge functions. This has led to problems.

  1. Edge CDN failure: Netlify’s Edge CDN (JFK region??) has failed repeatedly in the last 10 days for hours at a time. This means that any NextJs site which uses getStaticProps, which is basically every NextJs website in the world, has gone down for any user or server that connects in the affected regions.

  2. Edge functions failure: Netlify’s edge functions also have failed repeatedly over the last 10 days. Is this related to the Edge CDN? I have no idea. But, the impact has been that any website running NextJs on Netlify that uses NextImage (and who doesn’t?) has been down. Do the On Demand Builders also use Edge functions? I don’t know. Netlify doesn’t disclose how On Demand actually works.

Netlify’s response to the above problems has roughly been as follows:

  1. Netlify Edge CDN: We are sorry, but the traffic routed thru the affected Edge CDN is a small percentage of our traffic, so it’s not a major concern. It’s clear it’s not a concern for Netlify, as they have ceased to even report the downtime on the Status page since Dec 13, even though it has failed numerous times since then.

  2. Edge functions: We are sorry but Edge functions are beta, so tough luck. If you want to use NextJs on Netlify, don’t use NextImage or other features that might rely on our using Edge functions behind the scenes to run NextJs.

The above responses from Netlify are unsatisfactory because:

  1. The notion that only a tiny percentage of traffic is affected by downtime in the Northeast region seems highly suspect. Aside from the massive population in the US Northeast, a lot of servers on AWS run thru there, and so there is a ripple impact across many regions when the Edge CDN fails in that region, due to various dependencies. So it is overly simplistic to say that, since we only have <10% of traffic from NYC, the downtime only affects a tiny subset of users. This is just not how complex systems operate. So it would seem that Netlify should investigate the issues on the Edge CDN and offer a true fix.

  2. NextImage and other Edge Functions: How many people who build NextJs websites and then host them on Netlify are even aware that NextImage and possibly other features run on beta Edge functions? How many of these users actually know of and use NEXT_DISABLE_EDGE_IMAGES, which I’m still not even clear on how it works? I am guessing the answer is ZERO to each of the above. Because the way this works in reality is that a developer sees that Netlify offers NextJs because it is marketed heavily. So they deploy to Netlify and it works. Great, they move more NextJs sites to Netlify. Nobody suspects that under the hood this is only working b/c of beta Edge functions. Then when their site fails, they are told: we are sorry, but you forgot to read the fine print about how certain features rely on beta Edge functions, and if you want stability you will need to refactor your code to eliminate some features of NextJs. Really? Who is going to refactor their code to eliminate a core feature, just to be able to host on Netlify? If anyone actually knew this in advance they would never host NextJs on Netlify. Now they are stuck and need to refactor code to use it? Does anyone at Netlify actually think this is a proper response?

Honestly, this is just a classic bait and switch, which should be beneath a respected company like Netlify. You should have a big red warning sign on NextJs Netlify that says: Please be aware that several important features of NextJs only run on Netlify via Edge Functions. If you use these features, and do not refactor your code to eliminate these features, your site may not be stable. You should then prominently detail which NextJs features rely on Edge functions. Is it just NextImage or is it ISR also? It’s still not clear to me.

I am guessing Netlify will never put any warning like that on the site, because it would dramatically reduce the number of developers who will host NextJs on Netlify, because nobody is going to refactor code to satisfy limitations of Netlify’s hosting.

I’m really sorry to say, @osseonews, but I think a part of your confusion here is that you’re not really sure about Netlify and the terminology used here. Let me try to address some of these points, and hopefully it can also help someone else in the future. Note that I am only addressing the concerns I have sufficient knowledge about, so you can see I might have skipped some items.

Not completely true. On Demand Builders are used for getStaticProps(), ISR and next/image. Yes, that should be 99% of Next.js sites, but you can use Next.js without using On Demand Builders. That is documented as point 5 and 6 in key features here: Next.js on Netlify | Netlify Docs, with a link to On Demand Builders docs, in case someone wants to learn more about what they are (which you should, since you mention you don’t know what they are).

For the sake of simplicity, let’s refer to this as CDN and Edge Functions as something different. I think the constant use of the word Edge is confusing, I guess, for everyone.

So coming to your problem number 1, CDN failure:

The initially reported incident on December 13: Netlify Status - Increased errors and latency on the Standard Edge Network was the one with JFK issues. Once we mark an incident as resolved, it means that one is really resolved. So, if you’re seeing something off after some time, that could be a different issue. Thus, it would be incorrect to say this same incident has been going on for the past 10 days, because that’s not true.

As mentioned above, getStaticProps() is run in On Demand Builders, not on the CDN. So, this point is moot. It could be the case that, since the CDN suffered issues, the cached output of getStaticProps() could be having issues (temporarily), but that doesn’t mean getStaticProps() was down for 10 days.

We would suggest taking a look at the documentation then: Edge Functions overview | Netlify Docs

Sure, we do: On-demand Builders | Netlify Docs

I don’t think we said that. You simply need to use an Environment Variable to disable Edge Functions.

We have mentioned it in the docs: Next.js on Netlify | Netlify Docs. There’s no fine-print here. As someone who tries to use a platform, I think it is okay to expect them to have gone through the documentation before deciding whether or not a platform is suitable for them. And based on some of your above statements, I really feel, you’ve missed a big part of our documentation. Also, as mentioned above, I don’t think adding an environment variable counts as having to refactor the entire codebase.

Sorry to know that you feel this way, but from my perspective, putting it in the official docs seems enough. If you were expecting this to be in the marketing copy, I don’t think any product mentions anything of that sort. Our marketing copy does point you to the docs, which mention a lot of this fairly well. But if you disagree, please let us know how else we could have provided that info, so we can improve the docs.

Thank you for your answer, but you haven’t provided any actionable advice on how to fix the problems we have had with NextJs on Netlify. So let’s try a different route.

  1. Next/Image: As NextImage runs on Netlify via Edge functions, and Edge functions are in beta, it is obviously not advisable to use NextImage on Netlify b/c the site will not be 100% stable, as we discovered in the last few weeks and have been advised in the thread above to not use Edge, if we want 100% stability. The documentation on Netlify doesn’t warn about this at all, but simply says: “If you don’t want to use Edge functions to handle imaging processing…” This makes it sound like Edge function is just a matter of personal taste for Next/Image, not that using NextImage is bound to make your site unstable as Edge is still in beta. I think you might want to change this language to specifically mention that Edge functions are in beta and so using NextImage on Netlify is not guaranteed to work. In fact, since Edge is in beta, it’s not clear to me at all why NextImage on Netlify defaults to using Edge, in the first place. If Edge is not “100% stable”, why is this the default behavior for NextJs on Netlify?

  2. Removing Next/Image: The solution to remove Next/Image to make the site stable requires you to set NEXT_DISABLE_EDGE_IMAGES to true. However, there is no further explanation in the docs on what that actually means. After setting NEXT_DISABLE_EDGE_IMAGES to true, do we need to remove all NextImage tags from the code and replace them with standard HTML img tags? What happens if we don’t? Will those pages with NextImage simply fail to work? What other impacts does this have on a site? The thing about “scopes” in the docs on NextImage is, unfortunately, very vague to me at least, and I don’t understand what is being suggested there.

  3. What other features of NextJs on Netlify rely on Edge functions?

  4. What features of NextJs rely on the Edge CDN/Edge Network or whatever we are supposed to call it? It seems like NextJs on Netlify is very reliant on On Demand Builders. But in the last 2 weeks, our Edge CDN/Edge Network has gone down repeatedly, as documented in this post, so what can we do to avert this? Is there a way to opt out of Builders or is this just core to using ISR on NextJs on Netlify? We use ISR extensively.

  5. We have noticed that our API endpoints on NextJs on Netlify have failed repeatedly over the last 10 days. This seems to coincide with Edge CDN failures and/or Edge failures in general. None of these API functions are Edge functions, and none of them exceed any limits, i.e. they are not large functions. They just return 500 status codes out of nowhere. We know this b/c, after the difficulties in the last 10 days, we now ping our functions every 10 minutes to check stability. So how exactly is NextJs on Netlify working with NextJs API routes? Do they re-package these as Edge Functions? Do they just export these and place them on an Edge CDN? Is using API routes on NextJs somehow related to any Beta features of Netlify in the background, so that using API routes would not be advisable for a 100% stable NextJs site? Is it better to just recode our entire website and use Netlify functions instead?
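For anyone wanting to reproduce the kind of periodic check described in point 5, here is a rough TypeScript sketch, not the poster's actual monitoring setup. All names (`ProbeResult`, `FetchLike`, `probe`, `summarize`) are hypothetical, and treating any status of 500 or above (or a network error) as a failure is an assumption chosen to match the "500 status codes" symptom described above.

```typescript
// Hypothetical types and helpers for illustration only.
type ProbeResult = { url: string; status: number; ok: boolean; at: string };

// Minimal fetch-like signature so the probe can be exercised without a network.
type FetchLike = (url: string) => Promise<{ status: number }>;

// Probe one endpoint once. Network errors are recorded as status 0.
// Assumption: any status >= 500 counts as a failure.
async function probe(url: string, fetchFn: FetchLike): Promise<ProbeResult> {
  const at = new Date().toISOString();
  try {
    const res = await fetchFn(url);
    return { url, status: res.status, ok: res.status < 500, at };
  } catch (err) {
    return { url, status: 0, ok: false, at };
  }
}

// Summarize a batch of probes: total count, failed count, and the
// distinct URLs that failed at least once.
function summarize(results: ProbeResult[]): {
  total: number;
  failed: number;
  failingUrls: string[];
} {
  const failing = results.filter((r) => !r.ok);
  const failingUrls = failing
    .map((r) => r.url)
    .filter((u, i, all) => all.indexOf(u) === i);
  return { total: results.length, failed: failing.length, failingUrls };
}
```

On a runtime with a global `fetch` (for example Node 18+), `probe(someUrl, fetch)` could be called from a timer every 10 minutes and the collected results passed to `summarize`. Logging the timestamp of each failure makes it easier to line failures up against incidents on a status page.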

Hi, my name is Joey and I am a Product Manager here at Netlify, working on our edge network and serverless edge compute offerings like Netlify Functions and Edge Functions.

Up front, I first want to confirm that there have been some instances recently of Edge Functions instability and service degradation that were not highlighted on our status page. While the overall Netlify Edge network remained up for these periods, you may have unknowingly taken a dependency on Edge Functions via features like the Next.js Runtime, or were not given enough information to understand the reliability implications of knowingly taking a dependency on Edge Functions. Despite its current status as a beta feature, we can, should, and will do a better job of explicitly communicating the status of our Edge Functions service as we improve its reliability.

We also recognize that we definitely could have done a better job of highlighting the beta nature of Edge Functions, and will be adding some language to our homepage and docs to reflect that. We do already have some language today in our build logs (e.g. the Next.js Runtime emits a warning about Edge Functions usage), but based on the feedback here, that wasn’t discoverable enough.

We are driving hard towards ensuring that Edge Functions meets the production standard of reliability that’s offered by the rest of Netlify’s platform, and will share more as soon as we have updates. In the meantime, thanks for your patience and understanding, and for your honest feedback.
