We are down again also!
We are experiencing the same issue.
When we delete/disable our edge functions, the site operates correctly.
Apparently, we are all too small for Netlify to care. Edge functions are not working at all. They go up and down every 10 minutes. Same error every day for 10 days already. Today is the worst day, though, by far. Especially, since Netlify refuses to acknowledge their network is malfunctioning. I guess it will take a really big client having massive failure for them to acknowledge the problem.
My apologies I did not see this thread earlier-- I have passed along all of your feedback and we have an engineering team investigating as we speak. I will follow up here when I have more information.
I run our Support team and I’ll admit today wasn’t a great day for stability in edge functions. I don’t know if everyone who reported issues today in this thread is using Edge Functions, but the symptoms of an issue we were working on today with our partner Deno sure do line up with at least @daniel.hedrick 's report.
While this is not the uptime we intend, it also affected only edge functions. Edge functions are an experimental feature, not yet released as GA, and thus not 100% stable. We say this in our documentation as clearly as we can
Your choice to use edge functions is one you should make with your business in mind: are you ok relying on a beta service that sometimes has issues, and an API implementation that could change? Great, Edge Functions are for you. Do you need 100% stability? Edge Functions are not (yet) for you. Use lambdas instead! Most of our frameworks that use edge functions have an option for that. Let us know if you can’t find it, and we’ll guide you to it.
Going forward, we’ll continue to work on stability, and when we do release as GA, we’ll provide an SLA for our enterprise customers for them, and we’ll expect them to be stable, and we’ll statuspage about issues running a small percentage of edge functions - which is what happened today.
When we release as GA, we’ll make a lot of noise, publicizing the launch, removing the beta label from the docs, etc. So you won’t be able to miss it. But until then, please set expectations around the stability of this feature that match its experimental nature.
@fool We are not using edge functions. We are running NextJs on Netlify. NextImage just failed for almost the entire day. What does this have to do with edge functions? What does this have to do with the same errors that were reported for the last 10 days? Are you saying that running NextJs on Netlify is an experimental feature on Netlify? If so, then Netlify should say so, and we wouldn’t have bothered running NextJs on Netlify. Maybe Netlify runs NextJs on edge functions. If so, then you should say so.
So I just read the docs on NextJs on Netlify. I have no idea how it works, but running NextJs on Netlify uses Netlify’s Next.js Runtime which is installed automatically when you run NextJs. The docs say that among other things Next/Image works on Netlify by using Edge functions. So apparently, yes, if you plan to run NextJs on Netlify then you must use Edge functions, and now when edge functions fail, the problem is that we are responsible b/c we shouldn’t have any expectation that Edge functions will run properly as it is a beta feature? Why not put a big red message on the NextJs Netlify page that says: “NextJs on Netlify is a beta. Do NOT run any production sites of NextJs on Netlify b/c they all may stop working at any time.” It’s doubtful Netlify will do this, b/c they make a big deal about being able to run NextJs on Netlify. But apparently when the set up fails, it’s not their problem. This is ludicrous. You can’t have it both ways. Either you support running NextJs on Netlify and then you take responsibility for the sites running NextJs on Netlify, or you just don’t bother claiming that you can run NextJs properly on Netlify.
So in sum, the message here is simple: Do you need your NextJs site to have 100% stability? Then don’t run it on Netlify. B/c while you can opt out of certain features of Netlify’s NextJs Runtime, doing so means that you cannot take advantage of some of NextJs’s important features, like NextImage, which require Netlify Edge. So lesson learned. NextJs on Netlify is beta and is not guaranteed to work at all. And when it does work, unless you want to pay enterprise fees, Netlify couldn’t give a damn about your site. I guess we will have to move to Vercel.
What’s laughable about Jamstack hosts like Netlify, is that they have an amazing landing page on NextJs: Deploy Next.js Sites and Apps - Starter Templates & Resources | Netlify - which makes you think how easy and great it is to run NextJs on Netlify? But then you miss the fine print. This is beta. It won’t work and when it does they won’t support you unless you pay enterprise fees. What does enterprise cost? They won’t say, b/c it’s probably a different price for everyone, depending on how they gauge your pain level. Which is to say, that no small biz can ever afford enterprise fees. So this is just flat up dishonest marketing. For 20 years, we ran PHP applications, like Magento, on cloud hosting, and had SLA for 99.9% up time and 24x7 online support, even when paying $20/Month for a site. And yet Netlify won’t provide any SLA or support unless you pay enterprise? Absolutely ridiculous. You should make it clear from the get go, that Netlify is only available for large enterprises and the rest of your pricing plans are only for applications that don’t have to actually work in production.
PS Enterprise plans are nearly always complete bs with any tech company nowadays. The support is nowhere near as advertised and the up time is no different than non-enterprise. The sad state of Saas nowadays is that support is non-existent for all but very large companies who can shell out 100’s of thousands a month. Nobody wants anything to do with SMB, though it’s the SMB’s that support the insane valuations that companies like Netlify provide to investors. Where would Netlify be without the massive amount of SMB’s on their Business Plan?
You guys did make a lot of noise when you released the beta in April. It’s the #1 key feature listed on your home page.
No mention of beta / experimental / unreliable status on your marketing materials.
None of your Edge Function blog posts mention they are not to be used in production
The only mention I can find is on the main Netlify Docs page. It just says ‘BETA’, not ‘Unsupported’.
But when #### hits the fan you put the blame back on your customers by claiming you didn’t ACTUALLY mean for anyone to use this?
That’s super lame. Outages happen, but the least you can do is be proactive and notify your customers, explain the incident, the cause of it, what you did to remedy, and how it should affect future stability. How do you explain that there’s absolutely no report of any of this on your status page, even today? If your plan is to hide Edge Function failures, you should stop advertising the feature completely. Right now you are being deceitful, and it doesn’t look like an accidental omission. If you had been honest, the edge function feature would be in some limited beta, you would not be advertising it, and we would have picked a different vendor 8 months ago. Too late now.
Our cost is not just our monthly bill. We invested $ 50K of development costs behind a Netlify-centric solution, and our business reputation is at risk. Your attitude regarding those outages, blaming your customers, is very frustrating.
Thank you for the continued feedback and for articulating your concerns. I hear that you have not had an optimal experience, and I have looped in all of the appropriate Product and Engineering leaders. A member of the Engineering team will follow up here when we have more information for you.
Thank you for your patience.
I understand your frustrations and concerns, however your information is slightly inaccurate. That’s probably on us as we did not do a good job documenting it, but it does exist. For example, @fool said:
The option mentioned by @fool is documented here:
You may also manually disable the Edge Function by setting the environment variable
So, Next.js on Netlify is not Beta, a specific subset of features uses Beta features by default, which you can disable at will.
Thanks for your response. However, I think you might be unaware that there are 2 issues here. To recap, running NextJs on Netlify is dependent on Netlify’s On Demand Builders, which are functions (what kind of functions, I don’t know) that are cached on Netlify’s Edge CDN. Secondly, NextImage, which is a core feature of NextJs, runs with Netlify’s Edge functions. This has led to problems.
Edge CDN failure: Netlify’s Edge CDN (JFK region??) has failed repeatedly in the last 10 days for hours at a time. This means that any NextJs site which uses getStaticProps, which is basically every NextJs website in the world, has gone down for any user or server that connects in the affected regions.
Edge functions failure: Netlify’s edge functions also have failed repeatedly over the last 10 days. Is this related to the Edge CDN? I have no idea. But, the impact has been that any website running NextJs on Netlify that uses NextImage (and who doesn’t?) has been down. Do the On Demand Builders also use Edge functions? I don’t know. Netlify doesn’t disclose how On Demand actually works.
Netlify’s response to the above problems has roughly been as follows:
Netlify Edge CDN: We are sorry, but the traffic routed thru the affected Edge CDN is a small percentage of our traffic, so it’s not a major concern. It’s clear it’s not a concern for Netlify, as they have ceased to even report the downtime on the Status page, since Dec 13, even though it has failed since then numerous times.
Edge functions: We are sorry but Edge functions are beta, so tough luck. If you want to use NextJs on Netlify don’t use NextImage or other features that might rely on our using Edge functions behind the scenes to run NextJs.
The above responses from Netlify are unsatisfactory because:
The notion that only a tiny percentage of traffic is affected by downtime in the Northeast region seems highly suspect. Aside from the massive population in the US Northeast, a lot of servers on AWS run thru there, and so there is a ripple impact across many regions when the Edge CDN fails in that region, due to various dependencies. So it is overly simplistic to say, since we only have <10% of traffic from NYC, the downtime only affects a tiny subset of users. This is just not how complex systems operate. So it would seem that Netlify should investigate the issues on the Edge CDN and offer a true fix.
NextImage and other Edge Functions: How many people who build NextJs websites and then host them on Netlify are even aware that NextImage and possibly other features run on beta Edge functions? How many of these users actually know of and use Next_Disable, which I’m still not even clear on how it works? I am guessing the answer is ZERO to each of the above. Because the way this works in reality is that a developer sees that Netlify offers NextJs because it is marketed heavily. So they deploy to Netlify and it works. Great, they move more NextJs sites to Netlify. Nobody suspects that under the hood this is only working b/c of beta Edge functions. Then when their site fails, they are told that we are sorry, but you forgot to read the fine print about how certain features rely on beta Edge functions, and if you want stability you will need to refactor your code to eliminate some features of NextJs. Really? Who is going to refactor their code to eliminate a core feature, just to be able to host on Netlify? If anyone actually knew this in advance they would never host NextJs on Netlify. Now they are stuck and need to refactor code to use it? Does anyone at Netlify actually think this is a proper response?
Honestly, this is just a classic bait and switch, which should be beneath a respected company like Netlify. You should have a big red warning sign on NextJs Netlify that says: Please be aware that several important features of NextJs only run on Netlify via Edge Functions. If you use these features, and do not refactor your code to eliminate these features, your site may not be stable. You should then prominently detail which NextJs features rely on Edge functions. Is it just NextImage or is it ISR also? It’s still not clear to me.
I am guessing Netlify will never put any warning like that on the site, because it would dramatically reduce the number of developers who will host NextJs on Netlify, because nobody is going to refactor code to satisfy limitations of Netlify’s hosting.
I’m really sorry to say, @osseonews, but I think, a part of your confusion here is because it seems, you’re not really sure about Netlify and the terminologies here. Let me try to address some of these here, hopefully it can also help someone else in the future. Note that, I am only addressing the concerns I have sufficient knowledge about, so you can see I might have skipped some items.
Not completely true. On Demand Builders are used for getStaticProps(), ISR and next/image. Yes, that should be 99% of Next.js sites, but you can use Next.js without using On Demand Builders. That is documented as point 5 and 6 in key features here: Next.js on Netlify | Netlify Docs, with a link to On Demand Builders docs, in case someone wants to learn more about what they are (which you should, since you mention you don’t know what they are).
For the sake of simplicity, let’s refer to this as CDN and Edge Functions as something different. I think the constant use of the word Edge is confusing, I guess everyone?
So coming to your problem number 1, CDN failure:
The initially reported incident on December 13: Netlify Status - Increased errors and latency on the Standard Edge Network was the one with JFK issues. Once we mark an incident as resolved, it means, that one is really resolved. So, if you’re seeing something off after some time, that could be a different issue. Thus, it would be incorrect to say, this same incident has been going on for the past 10 days, because that’s not true.
As mentioned above, getStaticProps() is run in On Demand Builders, not on the CDN. So, this point is moot. It could be the case that, since the CDN suffered issues, the cached output of getStaticProps() was (temporarily) having issues, but that doesn’t mean getStaticProps() was down for 10 days.
We would suggest taking a look at the documentation then: Edge Functions overview | Netlify Docs
Sure, we do: On-demand Builders | Netlify Docs
I don’t think we said that. You simply need to use an Environment Variable to disable Edge Functions.
We have mentioned it in the docs: Next.js on Netlify | Netlify Docs. There’s no fine-print here. As someone who tries to use a platform, I think it is okay to expect them to have gone through the documentation before deciding whether or not a platform is suitable for them. And based on some of your above statements, I really feel, you’ve missed a big part of our documentation. Also, as mentioned above, I don’t think adding an environment variable counts as having to refactor the entire codebase.
Sorry to know that you feel this way, but from my perspective, putting it on the official docs seems enough. If you were expecting this to be on the marketing copy, I don’t think any product mentions anything of that sort. Our marketing copy does point you to the docs which mention a lot of this fairly well. But if you disagree, please let us know how else we could have provided that info, so we can improve the docs.
Thank you for your answer, but you haven’t provided any actionable advice on how to fix the problems we have had with NextJs on Netlify. So let’s try a different route.
Next/Image: As NextImage runs on Netlify via Edge functions, and Edge functions are in beta, it is obviously not advisable to use NextImage on Netlify b/c the site will not be 100% stable, as we discovered in the last few weeks and have been advised in the thread above to not use Edge, if we want 100% stability. The documentation on Netlify doesn’t warn about this at all, but simply says: “If you don’t want to use Edge functions to handle imaging processing…” This makes it sound like Edge function is just a matter of personal taste for Next/Image, not that using NextImage is bound to make your site unstable as Edge is still in beta. I think you might want to change this language to specifically mention that Edge functions are in beta and so using NextImage on Netlify is not guaranteed to work. In fact, since Edge is in beta, it’s not clear to me at all why NextImage on Netlify defaults to using Edge, in the first place. If Edge is not “100% stable”, why is this the default behavior for NextJs on Netlify?
Removing Next/Image: The solution to remove Next/Image to make the site stable, requires you to set
true. However, there is no further explanation in the docs on what that actually means. After setting Next_Disable to true, do we need to remove all NextImage tags from the code and replace them with standard HTML img tags? What happens if we don’t? Will those pages with NextImage simply fail to work? What other impacts does this have on a site? The thing about “scopes” in the docs on NextImage is, unfortunately, very vague to me at least, and I don’t understand what is being suggested there.
What other features of NextJs on Netlify rely on Edge functions?
What features of NextJs rely on the Edge CDN/Edge Network or whatever we are supposed to call it? It seems like NextJs on Netlify is very reliant on On Demand Builders. But, in the last 2 weeks, our Edge CDN/Edge Network has gone down repeatedly, as documented in this post, so what can we do to avert this? Is there a way to opt out of Builders or is this just core to using ISR on NextJs on Netlify? We use ISR extensively.
We have noticed that our API endpoints on NextJs on Netlify have failed repeatedly over the last 10 days. This seems to coincide with Edge CDN failures and/or Edge failures in general. None of these API functions are Edge functions, and none of them exceed any limits, i.e. they are not large functions. They just return 500 status codes out of nowhere. We know this, b/c after the difficulties in the last 10 days, we now ping our functions every 10 minutes to check stability. So how exactly is NextJs on Netlify working with NextJs API routes? Do they re-package these as Edge Functions? Do they just export these and place them on an Edge CDN? Is using API routes on NextJs somehow related to any Beta features of Netlify in the background and so using API routes would not be advisable to have a 100% stable NextJs Site? Is it better to just recode our entire website and use Netlify functions instead?
Hi, my name is Joey and I am a Product Manager here at Netlify, working on our edge network and serverless edge compute offerings like Netlify Functions and Edge Functions.
Up front, I first want to confirm that there have been some instances recently of Edge Functions instability and service degradation that were not highlighted on our status page. While the overall Netlify Edge network remained up for these periods, you may have unknowingly taken a dependency on Edge Functions via features like the Next.js Runtime, or were not given enough information to understand the reliability implications of knowingly taking a dependency on Edge Functions. Despite its current status as a beta feature, we can, should, and will do a better job of explicitly communicating the status of our Edge Functions service as we improve its reliability.
We also recognize that we definitely could have done a better job of highlighting the beta nature of Edge Functions, and will be adding some language to our homepage and docs to reflect that. We do already have some language today in our build logs (e.g. the Next.js Runtime emits a warning about Edge Functions usage), but based on the feedback here, that wasn’t discoverable enough.
We are driving hard towards ensuring that Edge Functions meets the production standard of reliability that’s offered by the rest of Netlify’s platform, and will share more as soon as we have updates. In the meantime, thanks for your patience and understanding, and for your honest feedback.
Thank you for this post. It is much appreciated. I look forward to better documentation on how NextJs actually works on Netlify and what features are still in beta and should not be used in production. I am still stuck with NextImages and have no idea what to do, as there is no chance I am going to refactor our code to get rid of NextImages, even though I now understand this is beta. Some information on how API routes actually work would be helpful too, as these seem to fail occasionally and it’s not clear why. Something to do with how builders work? I don’t know, as it’s not clear how Netlify is packaging and then serving serverless functions.
Just to be clear @osseonews, we never guarantee 100% stability.
Using Edge Functions, Next/Image benefits from being able to use content-negotiation, which was not possible without Edge Functions. Thanks to content negotiation, we can serve optimised images based on browser support: a browser that supports AVIF and WebP can receive those images instead of JPG or PNG, thus making images load faster (due to smaller size for almost similar quality). Since that is the expected result of using Next/Image, using Edge Functions as the default option seemed like a logical step.
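To illustrate what content negotiation means here, the following is a minimal sketch and not Netlify's actual implementation. The function name `pickImageFormat` and the `ImageFormat` type are made up for this example. It assumes the browser advertises the formats it supports in the `Accept` request header, and it ignores `q` weights for simplicity.

```typescript
// Hypothetical helper, for illustration only (not part of any Netlify or Next.js API).
type ImageFormat = "avif" | "webp" | "jpeg";

// Choose the most compact format the client says it accepts.
// Simplification: quality weights (";q=0.8") are stripped and ignored.
function pickImageFormat(acceptHeader: string | null): ImageFormat {
  const accepted = (acceptHeader ?? "")
    .split(",")
    .map((part) => (part.split(";")[0] ?? "").trim().toLowerCase());

  if (accepted.includes("image/avif")) return "avif";
  if (accepted.includes("image/webp")) return "webp";
  return "jpeg"; // fallback that every browser can display
}
```

Because the decision depends on a header of each individual request, it has to run per request, close to the user, which is the kind of work an edge function is meant for.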
No. It simply won’t use Edge Functions to load the images (thus, not making use of content-negotiation). The images would continue to load without any code change (which is exactly why I’m saying - no refactoring needed).
The Next.js middleware. That can be disabled too:
This is a general answer that applies to most frameworks. There are multiple moving parts here. Note that, the following is a highly simplified version of a lot of the internal intricacies:
Netlify Origin Server: The server that’s the primary source of all the data. All the CDN nodes depend on the origin for any uncached content. So, if the origin is down, you can expect a lot to be broken. In most cases, cached content continues to be served correctly, but uncached content returns errors.
Netlify CDN: The globally distributed nodes that cache your content. Whenever a client requests something from Netlify, using DNS we route them to the closest edge node. This node checks its cache for the requested file. If cached, it serves that directly, if not, it requests from the origin, serves that, and caches it.
Netlify Functions: These are executed by AWS Lambda. This is restricted to a single location (us-east-1 by default). The content is served through the CDN. On Demand Builders make use of AWS Lambda as well, except they have a special cache store that’s independent of the CDN cache. Next/Image also does the actual rendering of images in an On Demand Builder. Only content-negotiation runs inside an Edge Function.
Netlify Edge Functions: These are executed by Deno. These are globally distributed as well. The content is served through the CDN.
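The CDN behaviour described in this list (serve from cache if present, otherwise ask the origin and cache the result) can be sketched roughly as follows. This is a deliberately simplified model under stated assumptions: the `EdgeNode` class and `Origin` type are hypothetical names, the origin is modelled as a plain function that throws when it is down, and there is no cache expiry.

```typescript
// Hypothetical model of a single CDN node; not Netlify code.
type Origin = (path: string) => string; // assumed to throw when the origin is down

class EdgeNode {
  private cache = new Map<string, string>();
  private origin: Origin;

  constructor(origin: Origin) {
    this.origin = origin;
  }

  serve(path: string): string {
    const cached = this.cache.get(path);
    if (cached !== undefined) {
      return cached; // cache hit: the origin is never contacted
    }
    const body = this.origin(path); // cache miss: fails if the origin is down
    this.cache.set(path, body);
    return body;
  }
}
```

In this model, a path that is already cached keeps working while the origin is down, and an uncached path returns an error, which matches the description of the origin server above.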
Now, how these work with Next.js:
Your Next.js’ static assets are directly uploaded to origin and served as and when requested through CDN nodes. We do setup some redirects to make those accessible, but that’s not too important for this discussion.
SSR, API routes, Preview mode, etc. directly render on Netlify Functions - served through CDN.
ISR and Next/Image render on Netlify Functions powered by Netlify On Demand Builders - served through CDN.
Middleware and Next/Image content-negotiation use Edge Functions.
When origin is down, almost all of this can be down. Yes, cached content would still work, but for this example, let’s imagine nothing is cached. So, your site will not work. When CDN goes down, even if your SSR/API/ISR/images render fine on the other services mentioned, CDN will still be unable to serve it.
So no matter what you use, if the CDN is down, your site will effectively be down.
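The mapping above can be written down as a small dependency table. This is only a sketch: the names `Service`, `dependencies` and `affectedFeatures` are invented for the example, and the table encodes just what is stated in this post. In particular it does not model the origin dependency of the non-static features, even though an origin outage can affect almost all of them.

```typescript
// Hypothetical names, for illustration only.
type Service = "origin" | "cdn" | "functions" | "onDemandBuilders" | "edgeFunctions";

// Which services each Next.js feature relies on, as described in this post.
const dependencies: Record<string, Service[]> = {
  "static assets": ["origin", "cdn"],
  "SSR": ["functions", "cdn"],
  "API routes": ["functions", "cdn"],
  "ISR": ["onDemandBuilders", "cdn"],
  "next/image rendering": ["onDemandBuilders", "cdn"],
  "next/image content negotiation": ["edgeFunctions", "cdn"],
  "middleware": ["edgeFunctions", "cdn"],
};

// List the features that stop working when one service is down.
function affectedFeatures(down: Service): string[] {
  return Object.entries(dependencies)
    .filter(([, services]) => services.includes(down))
    .map(([feature]) => feature);
}

console.log(affectedFeatures("edgeFunctions"));
console.log(affectedFeatures("cdn"));
```

Every row contains the CDN, so a CDN outage affects every feature in the table, while an Edge Functions outage affects only middleware and image content negotiation.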
The reason is explained above.
Again, Next/Image is not beta. Edge Functions are.
I hope this post clarifies your remaining doubts.
Thank you so much! This is incredibly helpful information. After a quick review I have 2 points:
As I now understand that the NextJs api functions are restricted to a single cdn (us-east-1), I think that easily explains the issues we were having. It also implies that when that CDN is down it has nothing to do with where actual users are located. The functions just won’t work properly for anyone b/c they rely on a CDN that is having issues. This will affect every single NextJs website on Netlify that uses the api folder for functions. Isn’t this correct? If so, I do think you need to look at having a “backup” CDN, as there were many problems this month with the current CDN. Never had these issues before actually. This month was just terrible, though. It would seem that you can store functions in separate locations, so if one goes down another works. I don’t know how it actually works, so this is just an idea to make this more stable in case of some sort of issues at us-east.
Why exactly is Edge still in beta? I believe edge is based on V8 engine. We have used Cloudflare Workers for 2 years already with no problems and it’s not beta, because V8 is not beta, as far as I know. Why is Edge beta on Netlify? Honestly, we are moving nearly all our functions to Edge and now that I see Netlify Edge is not 100% stable, that gives me pause for concern. Edge is the future for serverless, in my opinion, and I think Netlify believes that also. If it’s built on something that is proven, it should be stable, no?
That’s slightly off the mark. The AWS Lambda functions that execute are located by default in us-east-1. This can be changed on Pro and above plans (by requesting the Support team to do that) to ap-southeast-1 (and a couple more regions in US). However, the CDN is still global. For example:
You make a request from Singapore → Our CDN in Asia/Pacific gets assigned to handle that request → It checks what content to serve → Since it’s a Function (which is most likely not cached), it sends a request to AWS to execute the function → Request is sent to AWS (us-east-1) → AWS returns a response to CDN in Asia/Pacific → CDN sends the response to the browser.
So, only the function execution is restricted to a single region, but the response of the execution can be still served by any of the global nodes.
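The same flow as a tiny model, purely for illustration. The function `traceFunctionRequest` is a made-up name and does not correspond to any Netlify API. It only shows that the CDN node is chosen by the client's location while the function always executes in the one configured region.

```typescript
// Illustrative model only; names are hypothetical.
function traceFunctionRequest(
  clientRegion: string,
  functionRegion: string = "us-east-1", // default execution region mentioned above
): string[] {
  const cdnNode = `CDN node near ${clientRegion}`;
  const lambda = `AWS Lambda in ${functionRegion}`;
  return [
    `browser in ${clientRegion} -> ${cdnNode}`,
    `${cdnNode} -> ${lambda}`, // uncached function: the CDN forwards the request
    `${lambda} -> ${cdnNode}`,
    `${cdnNode} -> browser in ${clientRegion}`,
  ];
}

console.log(traceFunctionRequest("Singapore").join("\n"));
```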
The idea of a backup CDN does not apply in this context. In each region where we have a presence, we have multiple CDN nodes. We are constantly rotating these nodes without any traffic disruption. The specific region that got affected on December 13 had not been affected for a long time and was immediately restored.
I feel you might be relating multiple incidents to the same degradation - which is not the case.
As @joey has mentioned:
This is the part that needs more work to get it out of the beta stage.
Let’s take an example. React (or Vue) is a stable library/framework for the most part. You build an application on top of it. Can you be sure it’s 100% stable, that there are no bugs? Wouldn’t you be interested in testing it out under the beta label before publicly releasing it? Both React and Vue are proven. Just because you build something on top of that, it doesn’t automatically make it stable. There’s a lot of additional work required to tune it to perfection.
Same applies for Edge Functions.
Hope this answers your concerns.