
Google Search redirect error started last week (no change to my site)

@Ultra - Sorry for late reply. Busy month. I don’t have either an A record or an ALIAS record on my delegated-to-Netlify apex domain. All I have is two “NETLIFY” records, both pointing at my Netlify site hostname.

Is this the wrong way to set it up? Other than the problem with Googlebot redirects, it seems to work OK for other use cases. FWIW, in site settings, my apex domain is set up as primary, with www redirecting to the primary domain.

Are you using Netlify DNS, @justingrant?

For domains managed by Netlify, we will automatically create “NETLIFY” records that point to our servers when you assign a domain or subdomain for your site. You can also add your own DNS records to point to other services, such as an email provider.

Source: DNS records | Netlify Docs

@coelmay - Yes. DNS is delegated to Netlify from my domain registrar. The screenshot above was from Netlify’s domain settings.

So your DNS is right.

@coelmay unfortunately, the redirect checker at redirect-checker.org still shows an infinite redirect when the user-agent is set to Google Bot.

Now, it’s not clear to me whether this is a mistake at Google’s end or Netlify’s, but for @justingrant’s purposes that’s largely irrelevant. If he’s getting redirect errors in his Google Search Console, it’s negatively impacting his SEO. And as 99% of the population are using Google to find websites, even if it’s Google’s mistake, it’s Netlify’s responsibility to make it work for their customers.

@justingrant I’m sorry, I don’t know the solution. It seems that “NETLIFY” records simply mean Netlify can change their config on a whim. If I were you, I’d be tempted to manage the DNS myself, as seeing the events of the last few days transpire has dented my confidence in Netlify’s DNS management.

If that’s not an option for you, I’d consider switching your canonical name from ‘justingrant.net’ to ‘www.justingrant.net’ and hoping Netlify’s DNS setup works when the redirect is managed that way round – but that will dent your SEO for a while, so I’d avoid it if you already have some domain authority there.

FWIW, just checking in on the other thread, someone with a similar issue solved it by removing the NETLIFY records in Netlify DNS and managing the records manually – without the need to switch DNS providers:

Again, they have ‘www’ set up as their canonical domain, so your config might need to be slightly different.

Google’s browser doesn’t have infinite redirects when viewing websites, but Google’s bot does. I don’t see how that is Netlify’s problem @Ultra.

@coelmay if Googlebot is unhappy – that’s a massive issue. It negatively impacts SEO and search rankings.

As mentioned before, the reason it’s Netlify’s issue is that 99% of the population use Google to find what they’re looking for. Even if Netlify’s DNS implementation is impeccable, the fact that Google doesn’t like it is a problem.

I’m having the exact same issue on my website at phonolyth.com, and redirect-checker.org shows the same when user-agent is set to Google Bot.
This effectively kills Google SEO as the bot is unable to begin crawling the site.

FWIW, it appears that enabling prerendering with Netlify kinda bypasses the issue, though it no longer redirects http to https (at least with redirect-checker.org)… But my site is already prerendered :thinking:

Hi, @justingrant, @Ultra, and @coelmay. I just tested the site with the tool https://www.redirect-checker.org/index.php and was surprised to find it providing false information.

EDIT: The page at https://www.redirect-checker.org/index.php is giving correct information and it was our logging, not theirs, that had false information. The page at https://www.redirect-checker.org/index.php is trustworthy and I’m sorry for saying otherwise. I was wrong.

For example, it showed this:

>>> https://www.justingrant.net/

> --------------------------------------------
> 301 Moved Permanently
> --------------------------------------------

Status:	301 Moved Permanently
Code:	301
cache-control:	public, max-age=0, must-revalidate
content-length:	44
content-type:	text/plain
date:	Tue, 18 May 2021 08:28:24 GMT
x-nf-request-id:	7830a9ef-f21c-4d4f-af96-c2f8efdd1a8d
Location:	https://www.justingrant.net/
server:	Netlify
age:	81192

This shows that the request was made to the URL https://www.justingrant.net/. However, we can see the requests logged at Netlify and the request is actually being made to http://www.justingrant.net/. The tool says it used HTTPS but I can see that it did not do so in reality.

So, this isn’t a redirect loop. That tool requested the HTTP version 19 times and was directed to HTTPS all 19 times. That isn’t a loop, but the tool incorrectly states that it is. So, I won’t be able to use that tool for future testing because I know it isn’t providing accurate data.

At Netlify, we cannot see what is happening inside of Google’s tools or apps. We can see HTTP requests at Netlify and we log details about the request and response. This enables us to troubleshoot Netlify. We have no tools to see what happens at Google and we cannot troubleshoot what they report. However, if you help us to identify the incorrect HTTP responses, we can troubleshoot Netlify.

If you want us to address the incorrect redirects, we need to identify them or reproduce them. We normally would do this by collecting the x-nf-request-id response headers (as shown in the output above). These headers are unique to a single HTTP response and never reused.

There is more information about this header here:

If that header isn’t available for any reason, please send the information it replaces (or as many of these details as possible). Those details are:

  • the complete URL requested
  • the IP address for the system making the request
  • the IP address for the CDN node that responded
  • the day and time of the request (with the timezone the time is in)
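
If it helps, here is a rough shell sketch for pulling that header out of a response. `extract_request_id` is just a throwaway name for this post, and the URL is only an example to substitute:

```shell
# Rough sketch (not official tooling): pull the x-nf-request-id value
# out of a header dump. extract_request_id is a made-up helper name.
extract_request_id() {
  tr -d '\r' | awk 'tolower($1) == "x-nf-request-id:" { print $2 }'
}

# Live usage would be (substitute the affected URL):
#   curl -s -o /dev/null -D - https://www.justingrant.net/ | extract_request_id
# Offline demonstration on a canned header block:
printf 'HTTP/2 301\r\nx-nf-request-id: 7830a9ef-f21c-4d4f-af96-c2f8efdd1a8d\r\n' \
  | extract_request_id   # prints 7830a9ef-f21c-4d4f-af96-c2f8efdd1a8d
```

Any x-nf-request-id value it prints can be sent to us directly.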

Alternatively, if you can use curl or some other CLI tool to reproduce the issue, that would be perfect. If you have such a curl example, please feel free to send that instead. For example, when I test with the Googlebot user agent, I see no issue.

First, without the Googlebot user-agent:

$ curl -svo /dev/null http://www.justingrant.net/  2>&1 | egrep '^< '
< HTTP/1.1 301 Moved Permanently
< cache-control: public, max-age=0, must-revalidate
< content-length: 44
< content-type: text/plain
< date: Wed, 19 May 2021 07:26:34 GMT
< x-nf-request-id: ad0b7b53-209c-49ff-bc93-db0815b0f947
< location: https://www.justingrant.net/
< server: Netlify
< age: 1
<
$ curl -svo /dev/null https://www.justingrant.net/  2>&1 | egrep '^< '
< HTTP/2 301
< cache-control: public, max-age=0, must-revalidate
< content-length: 40
< content-type: text/plain
< date: Wed, 19 May 2021 07:26:40 GMT
< strict-transport-security: max-age=31536000
< server: Netlify
< location: https://justingrant.net/
< age: 0
< x-nf-request-id: edd0269d-f7c2-453e-a64e-360343545ca1
<
$ curl -svo /dev/null https://justingrant.net/  2>&1 | egrep '^< '
< HTTP/2 200
< cache-control: public, max-age=0, must-revalidate
< content-type: text/html; charset=UTF-8
< date: Wed, 19 May 2021 07:26:49 GMT
< etag: "a78d9f0c5950661c7b6351285791199d-ssl"
< strict-transport-security: max-age=31536000
< age: 1
< server: Netlify
< x-nf-request-id: 4ff9f826-ee62-497e-af0d-9ef8a72dca81
<

There are two redirects:

  1. HTTP to HTTPS at www.justingrant.net
  2. from www.justingrant.net to justingrant.net over HTTPS

The third response is a 200.

The same is true with the Googlebot user-agent:

$ curl -svo /dev/null -A Googlebot http://www.justingrant.net/  2>&1 | egrep '^< '
< HTTP/1.1 301 Moved Permanently
< cache-control: public, max-age=0, must-revalidate
< content-length: 44
< content-type: text/plain
< date: Wed, 19 May 2021 07:29:01 GMT
< x-nf-request-id: 85a32f01-dd4e-4041-9d8f-c9861440dca8
< location: https://www.justingrant.net/
< server: Netlify
< age: 0
<
$ curl -svo /dev/null -A Googlebot https://www.justingrant.net/  2>&1 | egrep '^< '
< HTTP/2 301
< cache-control: public, max-age=0, must-revalidate
< content-length: 40
< content-type: text/plain
< date: Wed, 19 May 2021 07:29:06 GMT
< strict-transport-security: max-age=31536000
< server: Netlify
< location: https://justingrant.net/
< age: 0
< x-nf-request-id: 70ed2479-4406-40eb-991e-96b43612a443
<
$ curl -svo /dev/null -A Googlebot https://justingrant.net/  2>&1 | egrep '^< '
< HTTP/2 200
< cache-control: public, max-age=0, must-revalidate
< content-type: text/html; charset=UTF-8
< date: Wed, 19 May 2021 07:29:10 GMT
< etag: "a78d9f0c5950661c7b6351285791199d-ssl"
< strict-transport-security: max-age=31536000
< age: 0
< server: Netlify
< x-nf-request-id: 766727af-9708-487b-911a-369a45691349
<
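
If it is useful, those hop-by-hop checks can also be strung together into a small loop that prints each hop and stops if a Location header ever points back at the URL just requested. This is only a sketch, not official tooling, and the starting URL is the example from this topic:

```shell
# Rough sketch: follow a redirect chain hop by hop and flag a hop whose
# Location header equals the URL that was just requested (a true loop).
url='http://www.justingrant.net/'
for hop in 1 2 3 4 5; do
  loc=$(curl -s -o /dev/null -D - -A Googlebot "$url" | tr -d '\r' \
        | awk 'tolower($1) == "location:" { print $2 }')
  if [ -z "$loc" ]; then
    echo "final URL (no redirect): $url"
    break
  fi
  echo "$url -> $loc"
  if [ "$loc" = "$url" ]; then
    echo "self-referencing redirect detected at hop $hop"
    break
  fi
  url=$loc
done
```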

Last but not least, if you can reproduce the issue in a browser (for example by spoofing the user-agent header there to use Googlebot’s header), then making a HAR file recording would be ideal. That recording will contain all the information required, including the x-nf-request-id headers, timestamps, and more.

To summarize, asking us to explain messages inside of Google’s tools will rarely be successful, as we have zero insight into those tools. We can troubleshoot Netlify itself. In order to troubleshoot the issue, we need some way to identify the incorrect HTTP responses, and the x-nf-request-id header or the information it replaces are the two most common ways to do that.

If there are questions about any of this, please let us know.


To summarize, asking us to explain messages inside of Google’s tools will rarely be successful, as we have zero insight into those tools. We can troubleshoot Netlify itself. In order to troubleshoot the issue, we need some way to identify the incorrect HTTP responses, and the x-nf-request-id header or the information it replaces are the two most common ways to do that.

Copy pasting from the other thread:

As mentioned, the way to test this is by using Google Search Console, which is the only authoritative way to test this as far as I can see. Like I mentioned before, even if Google’s implementation is incorrect, their dominance in search makes their implementation the de facto standard.

I would encourage all affected individuals to check the results for the urls in question by visiting Google Search Console and entering the url into the ‘URL Inspection’ tool which is third down in the left-hand navigation bar. You can then test any updates to your configuration by pressing the ‘test live url’ button.

If it works – wonderful. No further action needed.

If it fails, the SEO of your website is being negatively impacted and you’ll need to make changes to mitigate this.

Again, I believe it is the responsibility of Netlify to ensure that their default DNS setup is compatible with Google’s interpretation of the standards – regardless of whether or not it is correct. It’s the reality of their dominant position in the market.

Hi, @Ultra. Here is the blocker preventing this from being resolved:

  • There is no proof in either topic that Netlify is doing anything wrong.

Just show me the issue happening and I can get it fixed. That is all I need to help you.

I explained several different possible sets of information that would allow our support team to troubleshoot, including any of the following:

  • the x-nf-request-id header for a bad response
  • the client IP address, service IP address, URL, date, time, and timezone for a bad response
  • a HAR recording of a bad response
  • a curl command which will return a bad response

Any of those would allow me to research the issue.

Would you please send us that information (any one of those sets)? Do you have questions about what is required or how to gather that information?

Hi @Luke,

I’m confused by your stance that ‘there is no proof in either topic that Netlify is doing anything wrong’.

That’s not my aim.

My concern is that people read these statements that show positive diagnostics results and believe that their site will perform well despite Google Search Console reporting otherwise.

It’s important you communicate to people that this isn’t the case.

Regardless of what Netlify’s or anybody else’s diagnostics say, if Google Search Console reports issues for their site, that’s a problem that will negatively impact their SEO.

Therefore, I’m recommending that people confirm that their urls are behaving in the way Google expects by confirming their results in Google Search Console.

This follows on from my own experience: after following Netlify’s recommendations on setting up my own DNS, Google Search Console showed configuration errors.

This seemed to be caused by Netlify’s recommendation to use flattened CNAME records (ALIAS records) which, as you can verify below, has since been limited to Cloudflare customers only.

Original Instructions: https://web.archive.org/web/20210129101918if_/https://docs.netlify.com/domains-https/custom-domains/configure-external-dns/

Some DNS providers, such as Netlify DNS or NS1, have devised special record types to simulate CNAME-style domain resolution for apex domains. Find out if your provider supports this type of behavior, which might be labeled as CNAME flattening, ANAME records, or ALIAS records.

If your DNS provider supports one of these special record types (recommended), find and follow their instructions to point the apex domain directly to your Netlify subdomain, such as brave-curie-671954.netlify.app .

Current instructions: https://web.archive.org/web/20210129101918if_/https://docs.netlify.com/domains-https/custom-domains/configure-external-dns/

If you use Cloudflare as your DNS provider , it supports a special record type that also works well on the bare domain - Flattened CNAME records. This record type is recommended for your bare domain. You’d set the same record value as for your subdomains, such as brave-curie-671954.netlify.app.
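
For anyone wanting to see which behaviour their own apex currently has, here is a rough sketch, assuming `dig` is installed; `classify_apex` is just a name I made up for illustration:

```shell
# Rough sketch: classify `dig +short <apex> A` output to see whether an
# apex answers with plain A records (flattened/ALIAS-style resolution)
# or with a hostname (a CNAME at the apex). classify_apex is a made-up
# helper, not a real tool.
classify_apex() {
  if grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    echo "flattened (A records)"
  else
    echo "CNAME/other or no answer"
  fi
}

# Live usage would be:
#   dig +short justingrant.net A | classify_apex
# Offline demonstration with a canned answer (192.0.2.1 is a
# documentation-only IP, not a real Netlify address):
printf '192.0.2.1\n' | classify_apex   # prints: flattened (A records)
```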

What’s your evidence that this is the case? My experience has been exactly the opposite … except in one edge case, no matter what Google claims is wrong with my sites, they still appear in search results as expected.

@luke I’m seeing this too, and the fact that it is user-agent-specific should be a strong hint that this is something Netlify does server-side.

Example x-nf-request-id is f7bdb6dc-c836-48ff-80e6-3c40f077779b or 3c71401f-6a96-43c4-b6ed-c51ba2697780.

I am unable to provide you with a reproducible test case because Netlify behaves differently on repeated requests. Here’s a log of what I did:

First I tried a HTTP → HTTPS redirect:

$ curl --user-agent "Googlebot/2.1 (+http://www.google.com/bot.html)" -v http://poedit.net/support/
*   Trying 167.99.242.112...
* TCP_NODELAY set
* Connected to poedit.net (167.99.242.112) port 80 (#0)
> GET /support/ HTTP/1.1
> Host: poedit.net
> User-Agent: Googlebot/2.1 (+http://www.google.com/bot.html)
> Accept: */*
> 
< HTTP/1.1 301 Moved Permanently
< cache-control: public, max-age=0, must-revalidate
< content-length: 43
< content-type: text/plain
< date: Wed, 19 May 2021 14:19:31 GMT
< x-nf-request-id: 60bb1149-00bf-435c-9974-30e9e8a9278d
< location: https://poedit.net/support/
< server: Netlify
< age: 0
< 
Redirecting to https://poedit.net/support/
* Connection #0 to host poedit.net left intact

Then the same with HTTPS, and got a redirect loop back to the requested URL:

$ curl --user-agent "Googlebot/2.1 (+http://www.google.com/bot.html)" -v https://poedit.net/support/
*   Trying 167.99.242.112...
* TCP_NODELAY set
* Connected to poedit.net (167.99.242.112) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-ECDSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=*.poedit.net
*  start date: Mar 24 20:37:42 2021 GMT
*  expire date: Jun 22 20:37:42 2021 GMT
*  subjectAltName: host "poedit.net" matched cert's "poedit.net"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7f8c20006600)
> GET /support/ HTTP/2
> Host: poedit.net
> User-Agent: Googlebot/2.1 (+http://www.google.com/bot.html)
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 301 
< cache-control: public, max-age=0, must-revalidate
< content-length: 43
< content-type: text/plain
< date: Wed, 19 May 2021 14:19:31 GMT
< x-nf-request-id: f7bdb6dc-c836-48ff-80e6-3c40f077779b
< location: https://poedit.net/support/
< server: Netlify
< age: 11
< 
Redirecting to https://poedit.net/support/
* Connection #0 to host poedit.net left intact

And again:

$ curl --user-agent "Googlebot/2.1 (+http://www.google.com/bot.html)" https://poedit.net/support/
Redirecting to https://poedit.net/support/
$

So I tried as non-Googlebot:

$ curl https://poedit.net/support/
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
…etc…etc...

As you can see, there was a wrong redirect for the Googlebot user agent, but not for the default curl one.

I tried again as Googlebot to confirm, but this time the problem did not occur:

$ curl --user-agent "Googlebot/2.1 (+http://www.google.com/bot.html)" https://poedit.net/support/ 
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
…etc…etc...

After that initial successful fetch, I was unable to reproduce even with different URLs from the same domain (ones I know as problematic from Google Search Console).

Eventually, I thought I had figured it out: I needed to make a plain HTTP request followed by an HTTPS one to re-reproduce the bug. But even that isn’t perfectly reliable.

In the process, I noticed that sometimes curl returned data, not a redirect, for http:// requests too! E.g. x-nf-request-id: ff83c7b5-5032-47f4-a74f-4609c64e72c3 - but this should never happen.

This leads me to think that something at Netlify changed recently-ish to use some short-lived shared cache for http:// and https://. That’s why it returns a self-referencing redirect - it’s the previous http:// response. And that’s why it responds with a body instead of a redirect to an http:// request - it’s a cached response to the https:// request made shortly before it.

The misbehavior does seem short-lived, with a TTL on the order of single-digit minutes. But that’s enough to break Googlebot.

Finally, this is Googlebot-specific: the server does not behave this way for other user agents. That’s really not something that can be explained by DNS or by issues on Googlebot’s end, especially given the reproduction with curl above.
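
To illustrate, here is a rough sketch of how I would try to re-trigger it based on that theory - to be clear, the caching mechanism is my assumption, not something confirmed by Netlify:

```shell
# Hypothetical reproduction of the cache theory: request the http://
# URL first, then the https:// one with the Googlebot user agent, and
# check whether the https:// response's Location points back at itself.
ua='Googlebot/2.1 (+http://www.google.com/bot.html)'
curl -s -o /dev/null -A "$ua" 'http://poedit.net/support/'
loc=$(curl -s -o /dev/null -D - -A "$ua" 'https://poedit.net/support/' \
      | tr -d '\r' | awk 'tolower($1) == "location:" { print $2 }')
if [ "$loc" = 'https://poedit.net/support/' ]; then
  echo "self-referencing redirect reproduced; note its x-nf-request-id"
else
  echo "no loop on this attempt (location: ${loc:-none})"
fi
```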


@gregraven as mentioned in the other thread, any back links that are made to the site using the non-functioning url aren’t going to credit the site with domain authority. If you’re trying to boost your search ranking you need every backlink you can get. If a site is linked to by the NYT but they use the broken url, that’s some serious SEO juice being lost.


@Ultra You’ve completely lost me. You seem to have gone from arguing that Google reports things are wrong to stating that your own URLs are wrong. If the URL is correct and links to a page on Netlify, I’m mystified as to what you think the problem is.

@gregraven I’ll try to make it clear: for many Netlify customers, myself included, search engine ranking is very important. Anything that damages that costs money and is something to be avoided.


@Ultra I’m one of them, but each of my many sites on Netlify is properly indexed by Google, to the best of my ability to verify as such.