Sites Down. Is Netlify down?

I use Cloudflare nameservers to manage my domain and pointed it to a Netlify IP.

Hmm, I still cannot get the IPs. The sites load in the browser for me, but pinging doesn’t work.

Do you use the Let’s Encrypt SSL under Domain Management? In all honesty, I doubt it would fix it for you, but I have seen people need to refresh that a few times. This sounds like a deeper issue, though.

Sorry you are facing these issues, hopefully there is a staff member around soon to resolve these for you! Please post back any updates if things change.

EDIT: I did actually manage to get an IP back:
https://elegant-almeida-4eefd6.netlify.app/ of 52.73.153.209

https://condescending-sinoussi-2a30de.netlify.app/ of 67.207.80.24

senpa.io of 167.172.221.254
nbk.io of 206.189.73.52

Not sure if these help you in any way, I am not anywhere close to an expert in this area!

Tried that, still not working.

I think the issue is my site IPs changed without any reason

for senpa.io IP is 167.172.221.254
and for nbk.io IP is 206.189.73.52

I just changed the IPs to those and it seems to have fixed the issue

Hmm, interesting. They shouldn’t just change for no reason. Glad to hear it is fixed, but I still recommend that @luke (sorry to call you out, Luke, I have just seen you help with similar issues in the past!) or another staff member review this for you.

@yumo Are you still having issues if you update your IPs?

It solved the issue, but it makes me worried that this may happen again with the Netlify service. Disappointed.

It was quite confusing, I agree. Please check back once a staff member is able to reply and provide some insight into what happened here.
I will leave the topic as open so that staff will still review the issue.

True, and since it’s the same case with @mistik, there are surely some issues on Netlify’s side.

Hi there!

In general, configuring a single IP address at another CDN for a Netlify site can cause issues. As we maintain our set of servers, the IPs that are used sometimes change as we do things like perform maintenance and scale capacity up and down. These changes are reflected in our published DNS records.

There’s some additional information available at [Support Guide] Why not proxy to Netlify? which covers some of the challenges with putting another CDN in front of us.

Same issue this morning. DNS settings we’ve had for 2.5 years stopped working all of a sudden. I updated A records to Netlify’s load balancer IP, and site went back up. Now having an issue with Netlify flagging the domains as using Netlify DNS, but the DNS is managed at name.com. Also looks like next SSL prop might fail for this reason. Anyone else with this issue?

This is only happening with one of many sites I have on Netlify, and it happens to be the oldest.

Which site is this regarding, @jbdesign? If you don’t want to share the domain publicly, can you share the API ID please?

Yep, it’s greater-good-strategy.netlify.app

Hi, @yumo, @Mistik, and @jbdesign.

The issue likely is being caused by one or both of two distinct root causes:

  • either you (or whoever configures the DNS settings for your domain) are using A records instead of CNAME records
  • or Cloudflare is caching DNS records longer than the TTL allows - much, much longer than allowed

However, I’m pretty certain it is the first issue above.

Our DNS service stopped returning the IP addresses listed above hours ago. If you are using our recommended settings this would not be happening. There is nothing wrong at Netlify. The issue can only occur if you incorrectly configured your DNS (or Cloudflare is caching DNS records in error).

So, what are the correct DNS instructions? They can be found here: Configure external DNS for a custom domain

Are you following those directions? If not, that is the reason your sites stopped working.

By the way, one of the reasons that we don’t recommend proxying to Netlify’s ADN using Cloudflare’s CDN is that this makes it much harder to troubleshoot issues. Adding Cloudflare “in front” of our ADN introduces an additional layer between site visitors’ browsers and Netlify. It creates a layer of opacity where Netlify cannot troubleshoot because we cannot see anything happening at Cloudflare. We cannot see your configurations there. We cannot see the internal routing of IP packets at Cloudflare.

So, all I can tell you is “Cloudflare is broken not Netlify” but I cannot tell you exactly what is broken there.

The IP addresses you are listing above were taken out of rotation before you reported an issue. The source of truth for those IP addresses is our DNS service, meaning live queries to our DNS service. The records are only valid for as long as the TTL (the time to live value) allows.

The TTLs in the records give a maximum time that those records can be cached. Let’s look at one of those records for a correctly configured site:

mu-testing-domain.com.	        20	IN	A	138.68.234.180
mu-testing-domain.com.	        20	IN	A	54.241.246.27
www.mu-testing-domain.com.      20	IN	A	138.68.234.180
www.mu-testing-domain.com.      20	IN	A	138.68.61.186

If you create an A record for one of those IP addresses, you need to delete it after 20 seconds, make a new DNS lookup, and then create the new A record. Every 20 seconds, you must query, delete the old record, and then create the new one. This needs to be done 24/7/365. Are you manually updating these IP addresses every 20 seconds at Cloudflare (meaning using their web UI or API to do so)? I’m pretty sure the answer is “no”.
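To put that refresh cycle in numbers, here is a minimal Python sketch. The function names are made up for illustration, and nothing here talks to a real DNS service or to Cloudflare; it only does the arithmetic for a given TTL.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def is_expired(looked_up_at: float, ttl_seconds: int, now: float) -> bool:
    """Return True once a cached DNS answer is older than its TTL."""
    return now - looked_up_at > ttl_seconds


def refreshes_per_day(ttl_seconds: int) -> int:
    """How many times a hand-copied A record would need replacing each day."""
    return SECONDS_PER_DAY // ttl_seconds


if __name__ == "__main__":
    # A record copied at t=0 with a 20-second TTL, checked 21 seconds later.
    print(is_expired(looked_up_at=0.0, ttl_seconds=20, now=21.0))
    print(refreshes_per_day(20))
```

With a TTL of 20 seconds, that works out to 4,320 replacements per day, which is why copying these IP addresses by hand is not workable.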

Again, this is why you are supposed to use a CNAME record. Using an A record should never be done unless it is for the apex domain. When it is the apex domain only 104.198.14.52 should be used and then the www subdomain should be made the primary custom domain so that the full ADN is used.

If the site is using the wrong IP addresses, it can only be because you (or Cloudflare) manually configured the site to use old IP addresses in error. If you wrote down the IP addresses above (example: 104.248.78.23) and created A records at Cloudflare for these IP addresses, then that is the error. This is what should not be done and this is what took your sites offline.

I even see you talking about doing this again here:

Ack! NOOOO!!! This is wrong. Please stop doing this. This is exactly what took your sites offline last time!!!

Please use the external DNS instructions. Those are the only correct instructions if you are not using Netlify DNS. Failure to follow those directions (like you are doing above) can, will, and did take your sites offline. Not following these directions is the root cause of the site downtime.

There is another layer to this. It appears that Cloudflare is hanging on to IP addresses far longer than the TTL allows. Look at this example:

<REDACTED DOMAIN NAME>	300	IN	A	104.248.78.23
<REDACTED DOMAIN NAME>	300	IN	A	104.248.78.24
;; Received 76 bytes from 173.245.59.247#53(west.ns.cloudflare.com) in 16 ms
  • (Sorry, I don’t have permission to share the actual domain name publicly.)

When I queried the DNS for the domain, Cloudflare returned an IP address that we used previously and that we don’t use anymore. Also, look at the TTL. The TTL is five minutes (300 seconds). However, we’ve seen above that the real TTL is only 20 seconds.

This person also created a Netlify DNS zone (which is also an error as it is inactive and therefore it should be deleted).

However, error or not, we can test it to see what correct records would look like:

$ dig +noall +answer  <REDACTED DOMAIN NAME> @dns1.p04.nsone.net
<REDACTED DOMAIN NAME>	20	IN	A	138.68.50.15
<REDACTED DOMAIN NAME>	20	IN	A	138.197.207.178

So, not only is Cloudflare returning IP addresses with a TTL 15 times larger than what is allowed (300 seconds instead of 20), they are also still returning out-of-date records many hours after the real DNS records expired. Why is Cloudflare doing that? Again, I have no way of knowing because we at Netlify have no way of seeing what is happening internally at Cloudflare. My best guess is that someone manually created A records, but that is only a guess.
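The comparison above can be repeated with a short script. This is only a sketch in Python, assuming you have already saved the `dig +noall +answer` output from both nameservers as text. The `Answer`, `parse_answers`, and `compare` names are hypothetical, and the sample data is copied from the two queries shown above with the domain still redacted.

```python
from typing import List, NamedTuple


class Answer(NamedTuple):
    name: str
    ttl: int
    rtype: str
    value: str


def parse_answers(dig_output: str) -> List[Answer]:
    """Parse `dig +noall +answer` style lines, skipping comments and blanks."""
    answers = []
    for line in dig_output.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        # Split from the right so a name containing spaces (like the
        # redacted placeholder) stays in one piece.
        parts = line.rsplit(None, 4)
        if len(parts) != 5 or not parts[1].isdigit():
            continue
        name, ttl, _dns_class, rtype, value = parts
        answers.append(Answer(name, int(ttl), rtype, value))
    return answers


def compare(served: List[Answer], authoritative: List[Answer]) -> dict:
    """Report IPs that are served but no longer current, plus the TTL ratio.

    Assumes both lists are non-empty.
    """
    served_ips = {a.value for a in served if a.rtype == "A"}
    current_ips = {a.value for a in authoritative if a.rtype == "A"}
    return {
        "stale_ips": sorted(served_ips - current_ips),
        "ttl_ratio": max(a.ttl for a in served) / max(a.ttl for a in authoritative),
    }


FROM_CLOUDFLARE = """\
<REDACTED DOMAIN NAME> 300 IN A 104.248.78.23
<REDACTED DOMAIN NAME> 300 IN A 104.248.78.24
;; Received 76 bytes from 173.245.59.247#53(west.ns.cloudflare.com) in 16 ms
"""

FROM_NETLIFY_DNS = """\
<REDACTED DOMAIN NAME> 20 IN A 138.68.50.15
<REDACTED DOMAIN NAME> 20 IN A 138.197.207.178
"""

if __name__ == "__main__":
    print(compare(parse_answers(FROM_CLOUDFLARE), parse_answers(FROM_NETLIFY_DNS)))
```

For the sample data, both Cloudflare addresses come back as stale and the TTL ratio is 15.0, matching the figures in the paragraph above.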

Again, if you follow the instructions above (the “Configure external DNS for a custom domain” link above), your sites will be online again.

If there are questions about any of this, please let us know.

Thanks Luke. I’m trying to figure out why our DNS got configured with an A record for www and incorrect IP addresses in the first place. Did these instructions change at some point in the last 2+ years?

Can you elaborate on this?

hiya @jbdesign. I’ll take over for Luke since he finished his shift several hours ago.

We’ve never published any IP in any of our documentation, app UI, or other staff-provided advice, besides 104.198.14.52. If other IP addresses were configured by you, you must have chosen to apply them on your own, somehow. I can imagine how this could have happened: our CDN uses hundreds of IP addresses, which do show in DNS lookups like this:

% host cyclocrass.netlify.app
cyclocrass.netlify.app has address 184.72.37.151
cyclocrass.netlify.app has address 138.68.7.48

However, though you cannot tell, those are special DNS records which are both geo-aware using this feature: https://ns1.com/geographic-routing (so, my answers are only best for someone in my location), and are also automatically added and removed from rotation regularly by our processes and staff - so, in the end, they are never published for you to use. You should always use either 104.198.14.52 or a CNAME to sitename.netlify.app (for your sitename).

To elaborate on why they are added and removed, there are dozens of reasons. Our CDN is made up of hundreds of machines and they are added and removed regularly, so our advice centers on what doesn’t change and is always correct: the A record and CNAME that I mention.
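As a rough illustration of that advice, here is a small Python sketch that checks a record against it. The `Record` class and `check_record` function are hypothetical names, the `.netlify.app` suffix test is a simplification, and the script only inspects the values you pass in; it performs no DNS lookups. The apex-only rule for A records comes from the earlier reply in this thread.

```python
from dataclasses import dataclass

# The only IP address this thread says to use in an A record.
NETLIFY_APEX_IP = "104.198.14.52"


@dataclass
class Record:
    name: str   # fully qualified name, e.g. "www.example.com"
    rtype: str  # "A" or "CNAME"
    value: str  # IP address or CNAME target


def check_record(record: Record, apex: str) -> str:
    """Return "ok" or a short description of what looks wrong."""
    name = record.name.rstrip(".").lower()
    is_apex = name == apex.rstrip(".").lower()
    rtype = record.rtype.upper()

    if rtype == "CNAME":
        target = record.value.rstrip(".").lower()
        if target.endswith(".netlify.app"):
            return "ok"
        return "CNAME does not point at a netlify.app name"

    if rtype == "A":
        if not is_apex:
            return "use a CNAME instead of an A record for a subdomain"
        if record.value != NETLIFY_APEX_IP:
            return "apex A record uses an IP other than " + NETLIFY_APEX_IP
        return "ok"

    return "record type not covered by this sketch"


if __name__ == "__main__":
    print(check_record(Record("www.example.com", "A", "184.72.37.151"), "example.com"))
    print(check_record(Record("www.example.com", "CNAME", "sitename.netlify.app"), "example.com"))
```

An A record holding one of the rotating CDN addresses is flagged, while the CNAME to sitename.netlify.app passes.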

More details on correct DNS configuration at Netlify here: How to Set Up Netlify DNS - Custom Domains, CNAME, & Records

Hi,
Since today the site of my client is down.

The builds all succeeded without problems.
Via netlify site name it works: brave-goodall-4090cf.netlify.app
Any ideas?
We did not change anything in the last days.
Thanks!

Hi, @t87. I’m showing that site is working when I test:

$ curl -svo /dev/null https://ten87.studio/  2>&1 | egrep '^< '
< HTTP/2 200
< cache-control: public, max-age=0, must-revalidate
< content-type: text/html; charset=UTF-8
< date: Fri, 26 Feb 2021 05:45:06 GMT
< etag: "d1f4c3b7adce40243c1b4cf4777a401f-ssl"
< strict-transport-security: max-age=31536000
< age: 0
< server: Netlify
< x-nf-request-id: 45d7a935-456f-433f-a76f-8503df4a8060-688592
<

If it isn’t working for you, would you please send us the x-nf-request-id header, which we send with every HTTP response?

There is more information about this header here:

If that header isn’t available for any reason, please send the information it replaces (or as many of these details as possible). Those details are:

  • the complete URL requested
  • the IP address for the system making the request
  • the IP address for the CDN node that responded
  • the day and time of the request (and the timezone the time is in)

If there are questions about how to gather any of this information, please let us know.
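If you have saved the verbose curl output from a test like the one above, a few lines of Python can pull the header out. This is only a sketch: the `response_headers` function is a hypothetical helper, it assumes the `< name: value` line format that `curl -sv` prints for response headers, and the sample text is copied from the output above.

```python
def response_headers(curl_verbose_output: str) -> dict:
    """Collect response headers from `curl -sv` output into a dict."""
    headers = {}
    for line in curl_verbose_output.splitlines():
        # curl prefixes response header lines with "< ".
        if not line.startswith("< "):
            continue
        name, separator, value = line[2:].partition(":")
        if separator:  # the status line has no "name: value" shape and is skipped
            headers[name.strip().lower()] = value.strip()
    return headers


SAMPLE = """\
< HTTP/2 200
< cache-control: public, max-age=0, must-revalidate
< content-type: text/html; charset=UTF-8
< date: Fri, 26 Feb 2021 05:45:06 GMT
< server: Netlify
< x-nf-request-id: 45d7a935-456f-433f-a76f-8503df4a8060-688592
<
"""

if __name__ == "__main__":
    headers = response_headers(SAMPLE)
    print(headers.get("x-nf-request-id", "header not present"))
```

If the header is missing from the dictionary, fall back to the list of details above.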

Hi @luke ,
My site at travelaar.nl is also down as of yesterday.
If I understand correctly, only A records should be added if it concerns an apex domain.
So does this mean I should only add one A record to my Cloudflare DNS?
A TTL:20 104.198.14.52

Netlify domain: travelaar.netlify.app

Let me know if you can see things working as with t87, so I can try and provide the correct information.

//Edit
Site’s back up…
Mysteriously, two A records were added… Removing the A records and adding a flattened CNAME record pointing to travelaar.netlify.app worked!

Hi, Erick. I would recommend the CNAME flattening over the A record when possible. If only an A record is allowed, then only 104.198.14.52 should be used at this time.

Hi Luke,

As mentioned in my edit, I was able to solve the issue by adding a flattened CNAME record containing travelaar.netlify.app to the DNS records over at Cloudflare. :+1:

Thanks!