Google Search redirect error started last week (no change to my site)

Site: musing-saha-fbd5d3.netlify.app
Domain: justingrant.net

I got a notification from Google Search Console about a “redirect error” that started on May 4. Crawling before that date showed no errors. This was suspicious because I haven’t published any changes to my site since April 22, and Google crawled my site almost daily for months previously without any errors. Digging into the Google Search Console error, it looks like the problem is the redirect from HTTP to HTTPS, e.g. http://www.justingrant.net/ => https://www.justingrant.net/. Crawling the HTTPS version of my site does not report an error; only the HTTP version does.

Has anything changed on the Netlify end which would explain this new error from Google? Or has something changed on Google’s end?

Regardless, what’s the right way to ensure that HTTP=>HTTPS redirects are handled correctly from the Google crawler’s perspective? I found an article (https://www.searchenginejournal.com/google-on-307-hsts-redirects-http-to-https/386232/) describing Google’s handling of 307 redirects but I’m not sure how to apply this info to my Netlify site.

BTW, this sounds somewhat similar to the Netlify forums thread “Eliminate multiple redirects for www. and https”, although that thread complains about extra redirects, not a Google Search Console error.

FWIW, my site is very simple: just one static page and a few dependency files like images.

Here are my current redirects in netlify.toml:

[[redirects]]
  from = "/*"
  to = "/index.html"
  status = 200

@justingrant Welcome to the Netlify community.

Yet another reason not to trust Google. Your site seems to be redirecting just fine.

| ---------------------------- http -----------------------------
| --------------------- www.justingrant.net ---------------------
HTTP/1.1 301 Moved Permanently
cache-control: public, max-age=0, must-revalidate
content-length: 0
content-type: text/plain
date: Sun, 09 May 2021 12:55:02 GMT
x-nf-language: 
location: https://www.justingrant.net/
x-nf-ats-version: 3438f24
age: 0
x-nf-request-id: 42d9eeb8-9ce5-4b47-8f05-5d0c93e8bebc
server: Netlify
x-nf-country: US

HTTP/2 301 
cache-control: public, max-age=0, must-revalidate
content-length: 40
content-type: text/plain
date: Sun, 09 May 2021 05:12:37 GMT
strict-transport-security: max-age=31536000
x-nf-language: 
location: https://justingrant.net/
x-nf-ats-version: 3438f24
x-nf-request-id: bd049c8d-3c24-415f-85c1-8b4878d8d0ca
server: Netlify
x-nf-country: US
age: 27745

HTTP/2 200 
cache-control: public, max-age=0, must-revalidate
content-length: 0
content-type: text/html; charset=UTF-8
date: Sun, 09 May 2021 12:55:02 GMT
etag: "a78d9f0c5950661c7b6351285791199d-ssl"
strict-transport-security: max-age=31536000
x-nf-language: 
x-nf-ats-version: 3438f24
server: Netlify
age: 0
x-nf-country: US
x-nf-request-id: 578f7bec-416e-4e5c-afac-e93d0fb3ace5

|================================================================

Out of interest, I just ran http://www.justingrant.net/ through Redirect Checker (redirect-checker.org) and it appears to get too many redirects when the user agent is set to Google Bot.

Is this not the same issue we discussed at https://answers.netlify.com/t/redirect-error-on-google-search-console-for-non-https-bare-url ?

Thanks @gregraven. Could you retry your test with the user agent set to Googlebot? According to https://www.redirect-checker.org/, there’s an infinite redirect loop with this user agent (and any other search engine), but not with the default user agent nor real browsers. I see this infinite loop only on HTTP-to-HTTPS redirects, both with and without “www”, e.g. http://justingrant.net => https://justingrant.net or http://www.justingrant.net/ => https://justingrant.net.
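For anyone who wants to check for a loop without relying on a third-party site, here is a minimal shell sketch. It assumes curl and a POSIX awk are installed; the function names hop_locations and first_repeated_url are made up for illustration, and it only compares absolute Location values exactly as the server sent them (relative redirects are not resolved).

```shell
#!/bin/sh
# Hypothetical helpers (not part of curl or Netlify tooling).

# Read raw response headers for one or more hops on stdin and print the
# value of every Location header, one per line.
hop_locations() {
  tr -d '\r' | awk 'tolower($1) == "location:" { print $2 }'
}

# Read URLs (one per line) on stdin. Print the first URL that appears a
# second time and return 0; return 1 if no URL repeats.
first_repeated_url() {
  awk 'NF && seen[$0]++ { print; found = 1; exit } END { exit !found }'
}

# Example usage (needs network access, so it is left commented out):
#   curl -sL --max-redirs 20 -o /dev/null -D - -A Googlebot \
#     http://www.justingrant.net/ | hop_locations | first_repeated_url
```

If a URL is printed, the same redirect target was returned more than once in the chain. Separately, curl itself exits with status 47 when the --max-redirs limit is exceeded.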

Thanks @Ultra for suggesting that tool to check redirects. Note that one difference from your case to mine is that in my site the bare (no “www”) URL is the canonical one, while I think in your case the www variant is canonical. Perhaps this explains the difference in behavior between our sites?


@justingrant I’m no DNS expert, but I would expect so.

Out of interest, did you set up an ALIAS record for your apex domain instead of an A record?

That ended up being my issue. I’m 99% sure Netlify’s previous recommendation was to use an ALIAS instead of an A record if available but looking at current docs that is no longer the case.

@Ultra - Sorry for late reply. Busy month. I don’t have either an A record or an ALIAS record on my delegated-to-Netlify apex domain. All I have is two “NETLIFY” records, both pointing at my Netlify site hostname.

Is this the wrong way to set it up? Other than the problem with Googlebot redirects, it seems to work OK for other use cases. FWIW, in site settings, my apex domain is set up as primary, with www redirecting to the primary domain.

Are you using Netlify DNS @justingrant ?

For domains managed by Netlify, we will automatically create “NETLIFY” records that point to our servers when you assign a domain or subdomain for your site. You can also add your own DNS records to point to other services, such as an email provider.

Source: DNS records | Netlify Docs

@coelmay - Yes. DNS is delegated to Netlify from my domain registrar. The screenshot above was from Netlify’s domain settings.

So your DNS is right.

@coelmay unfortunately, the redirect checker at redirect-checker.org still shows an infinite redirect when the user-agent is set to Google Bot.

Now, it’s not clear to me whether this is a mistake at Google’s end or Netlify’s but for @justingrant’s purposes that’s largely irrelevant. If he’s getting redirect errors in his Google Search Console it’s negatively impacting his SEO. And as 99% of the population are using Google to find websites, even if it’s Google’s mistake, it’s Netlify’s responsibility to make it work for their customers.

@justingrant I’m sorry, I don’t know the solution. It seems that NETLIFY simply means Netlify can change their config on a whim. If I were you, I’d be tempted to manage the DNS myself, as watching the events of the last few days has dented my confidence in Netlify’s DNS management.

If that’s not an option for you, I’d consider switching your canonical name from ‘justingrant.net’ to ‘www.justingrant.net’ and hope Netlify’s DNS setup works when the redirect is managed that way round – but that will dent your SEO for a while so I’d avoid that if you already have some domain authority there.

FWIW, just checking in on the other thread, someone with a similar issue solved it by removing the Netlify references in Netlify DNS and managing it manually – without the need to switch DNS provider:

Again, they have ‘www’ set up as their canonical domain, so your config might need to be slightly different.

Google’s browser doesn’t have infinite redirects when viewing websites, but Google’s bot does. I don’t see how that is Netlify’s problem @Ultra.

@coelmay if Googlebot is unhappy – that’s a massive issue. It negatively impacts SEO and search rankings.

As mentioned before, the reason it’s Netlify’s issue is that 99% of the population use Google to find what they’re looking for. Even if Netlify’s DNS implementation is impeccable, the fact that Google doesn’t like it is a problem.

I’m having the exact same issue on my website at phonolyth.com, and redirect-checker.org shows the same when user-agent is set to Google Bot.
This effectively kills Google SEO as the bot is unable to begin crawling the site.

FWIW, it appears that enabling prerendering with Netlify kinda bypasses the issue, though it no longer redirects http to https (at least with redirect-checker.org)… But my site is already prerendered :thinking:

Hi, @justingrant, @Ultra, and @coelmay. I just tested the site with the tool https://www.redirect-checker.org/index.php and was surprised to find it providing false information.

EDIT: The page at https://www.redirect-checker.org/index.php is giving correct information and it was our logging, not theirs, that had false information. The page at https://www.redirect-checker.org/index.php is trustworthy and I’m sorry for saying otherwise. I was wrong.

For example, it showed this:

>>> https://www.justingrant.net/

> --------------------------------------------
> 301 Moved Permanently
> --------------------------------------------

Status:	301 Moved Permanently
Code:	301
cache-control:	public, max-age=0, must-revalidate
content-length:	44
content-type:	text/plain
date:	Tue, 18 May 2021 08:28:24 GMT
x-nf-request-id:	7830a9ef-f21c-4d4f-af96-c2f8efdd1a8d
Location:	https://www.justingrant.net/
server:	Netlify
age:	81192

This shows that the request was made to the URL https://www.justingrant.net/. However, we can see the requests logged at Netlify and the request is actually being made to http://www.justingrant.net/. The tool says it used HTTPS but I can see that it did not do so in reality.

So, this isn’t a redirect loop. That tool is repeatedly requesting the HTTP version 19 times and being directed to HTTPS all 19 times. That isn’t a loop but the tool incorrectly states that it is. So, I won’t be able to use that tool for future testing because I know it isn’t providing accurate data.

At Netlify, we cannot see what is happening inside of Google’s tools or apps. We can see HTTP requests at Netlify and we log details about the request and response. This enables us to troubleshoot Netlify. We have no tools to see what happens at Google and we cannot troubleshoot what they report. However, if you help us to identify the incorrect HTTP responses, we can troubleshoot Netlify.

If you want us to address the incorrect redirects, we need to identify them or reproduce them. We normally would do this by collecting the x-nf-request-id response headers (as shown in the output above). These headers are unique to a single HTTP response and never reused.

There is more information about this header here:

If that header isn’t available for any reason, please send the information it replaces (or as many of these details as possible). Those details are:

  • the complete URL requested
  • the IP address for the system making the request
  • the IP address for the CDN node that responded
  • the day and time of the request (with the timezone the time is in)

Alternatively, if you can use curl or some other CLI tool to reproduce the issue, that would be perfect. If you have such a curl example, please feel free to send that instead. For example, when I test with the Googlebot user agent, I see no issue.

First, without the Googlebot user-agent:

$ curl -svo /dev/null http://www.justingrant.net/  2>&1 | egrep '^< '
< HTTP/1.1 301 Moved Permanently
< cache-control: public, max-age=0, must-revalidate
< content-length: 44
< content-type: text/plain
< date: Wed, 19 May 2021 07:26:34 GMT
< x-nf-request-id: ad0b7b53-209c-49ff-bc93-db0815b0f947
< location: https://www.justingrant.net/
< server: Netlify
< age: 1
<
$ curl -svo /dev/null https://www.justingrant.net/  2>&1 | egrep '^< '
< HTTP/2 301
< cache-control: public, max-age=0, must-revalidate
< content-length: 40
< content-type: text/plain
< date: Wed, 19 May 2021 07:26:40 GMT
< strict-transport-security: max-age=31536000
< server: Netlify
< location: https://justingrant.net/
< age: 0
< x-nf-request-id: edd0269d-f7c2-453e-a64e-360343545ca1
<
$ curl -svo /dev/null https://justingrant.net/  2>&1 | egrep '^< '
< HTTP/2 200
< cache-control: public, max-age=0, must-revalidate
< content-type: text/html; charset=UTF-8
< date: Wed, 19 May 2021 07:26:49 GMT
< etag: "a78d9f0c5950661c7b6351285791199d-ssl"
< strict-transport-security: max-age=31536000
< age: 1
< server: Netlify
< x-nf-request-id: 4ff9f826-ee62-497e-af0d-9ef8a72dca81
<

There are two redirects:

  1. HTTP to HTTPS at www.justingrant.net
  2. from www.justingrant.net to justingrant.net using HTTPS

The third response is a 200.

The same is true with the Googlebot user-agent:

$ curl -svo /dev/null -A Googlebot http://www.justingrant.net/  2>&1 | egrep '^< '
< HTTP/1.1 301 Moved Permanently
< cache-control: public, max-age=0, must-revalidate
< content-length: 44
< content-type: text/plain
< date: Wed, 19 May 2021 07:29:01 GMT
< x-nf-request-id: 85a32f01-dd4e-4041-9d8f-c9861440dca8
< location: https://www.justingrant.net/
< server: Netlify
< age: 0
<
$ curl -svo /dev/null -A Googlebot https://www.justingrant.net/  2>&1 | egrep '^< '
< HTTP/2 301
< cache-control: public, max-age=0, must-revalidate
< content-length: 40
< content-type: text/plain
< date: Wed, 19 May 2021 07:29:06 GMT
< strict-transport-security: max-age=31536000
< server: Netlify
< location: https://justingrant.net/
< age: 0
< x-nf-request-id: 70ed2479-4406-40eb-991e-96b43612a443
<
$ curl -svo /dev/null -A Googlebot https://justingrant.net/  2>&1 | egrep '^< '
< HTTP/2 200
< cache-control: public, max-age=0, must-revalidate
< content-type: text/html; charset=UTF-8
< date: Wed, 19 May 2021 07:29:10 GMT
< etag: "a78d9f0c5950661c7b6351285791199d-ssl"
< strict-transport-security: max-age=31536000
< age: 0
< server: Netlify
< x-nf-request-id: 766727af-9708-487b-911a-369a45691349
<
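The three per-hop checks above can also be condensed into one pass. The following is a sketch only, assuming curl and a POSIX awk; summarize_hops is a hypothetical helper name, not a Netlify or curl feature. It reads the raw response headers for every hop and prints the status code, the Location value, and the x-nf-request-id for each one.

```shell
#!/bin/sh
# Hypothetical helper: summarize each hop of a redirect chain.
# Input (stdin): raw response headers for one or more hops.
# Output: one line per hop: <status> <location or -> <x-nf-request-id or ->
summarize_hops() {
  tr -d '\r' | awk '
    # Print the hop collected so far, then reset for the next one.
    function emit() {
      if (status != "") print status, (loc != "" ? loc : "-"), (id != "" ? id : "-")
      status = ""; loc = ""; id = ""
    }
    /^HTTP\//                        { emit(); status = $2; next }
    tolower($1) == "location:"       { loc = $2; next }
    tolower($1) == "x-nf-request-id:" { id = $2; next }
    END                              { emit() }
  '
}

# Example usage (needs network access, so it is left commented out):
#   curl -sL --max-redirs 20 -o /dev/null -D - -A Googlebot \
#     http://www.justingrant.net/ | summarize_hops
```

Here -D - writes the headers of every hop to standard output while -o /dev/null discards the body, so the request stays a GET like the commands above.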

Last but not least, if you can reproduce the issue in a browser (for example by spoofing the user-agent header there to use Googlebot’s header) then making a HAR file recording would be ideal. That recording will contain all the information required, including the x-nf-request-id headers, timestamp, and more.

To summarize, asking us to explain messages inside of Google’s tools will rarely be successful as we have zero insight into those tools. We can troubleshoot Netlify itself. In order to troubleshoot the issue, we need some way to identify the incorrect HTTP responses, and the x-nf-request-id header or the information it replaces are the two most common ways this can be done.

If there are questions about any of this, please let us know.


To summarize, asking us to explain messages inside of Google’s tools will rarely be successful as we have zero insight into those tools. We can troubleshoot Netlify itself. In order to troubleshoot the issue, we need some way to identify the incorrect HTTP responses, and the x-nf-request-id header or the information it replaces are the two most common ways this can be done.

Copy pasting from the other thread:

As mentioned, the way to test this is by using Google Search Console which is the only authoritative way to test this as far as I can see. Like I mentioned before, even if Google’s implementation is incorrect their dominance in search makes their implementation the de-facto standard.

I would encourage all affected individuals to check the results for the urls in question by visiting Google Search Console and entering the url into the ‘URL Inspection’ tool which is third down in the left-hand navigation bar. You can then test any updates to your configuration by pressing the ‘test live url’ button.

If it works – wonderful. No further action needed.

If it fails, the SEO of your website is being negatively impacted and you’ll need to make changes to mitigate this.

Again, I believe it is the responsibility of Netlify to ensure that their default DNS setup is compatible with Google’s interpretation of the standards – regardless of whether or not it is correct. It’s the reality of their dominant position in the market.

Hi, @Ultra. Here is the blocker preventing this from being resolved:

  • There is no proof in either topic that Netlify is doing anything wrong.

Just show me the issue happening and I can get it fixed. That is all I need to help you.

I explained several different possible sets of information which would allow our support team to troubleshoot, which includes any of the following:

  • the x-nf-request-id header for a bad response
  • the client IP address, service IP address, URL, date, time, and timezone for a bad response
  • a HAR recording of a bad response
  • a curl command which will return a bad response

Any of those would allow me to research the issue.

Would you please send us that information (any one of those sets)? Do you have questions about what is required or how to gather that information?

Hi @Luke,

I’m confused by your stance that ‘there is no proof in either topic that Netlify is doing anything wrong’.

That’s not my aim.

My concern is that people read these statements that show positive diagnostics results and believe that their site will perform well despite Google Search Console reporting otherwise.

It’s important you communicate to people that this isn’t the case.

Regardless of what Netlify’s or anybody else’s diagnostics say, if Google Search Console reports issues for their site, that’s a problem that will negatively impact their SEO.

Therefore, I’m recommending that people confirm that their URLs are behaving in the way Google expects by checking their results in Google Search Console.

This follows on from my own experience: I followed Netlify’s recommendations when setting up my own DNS, and the result was that Google Search Console showed configuration errors.

This seemed to be caused by Netlify’s recommendation to use flattened CNAME records (ALIAS records) which, as you can verify below, has now been limited to Cloudflare customers only.

Original Instructions: https://web.archive.org/web/20210129101918if_/https://docs.netlify.com/domains-https/custom-domains/configure-external-dns/

Some DNS providers, such as Netlify DNS or NS1, have devised special record types to simulate CNAME-style domain resolution for apex domains. Find out if your provider supports this type of behavior, which might be labeled as CNAME flattening, ANAME records, or ALIAS records.

If your DNS provider supports one of these special record types (recommended), find and follow their instructions to point the apex domain directly to your Netlify subdomain, such as brave-curie-671954.netlify.app.

Current instructions: https://web.archive.org/web/20210129101918if_/https://docs.netlify.com/domains-https/custom-domains/configure-external-dns/

If you use Cloudflare as your DNS provider, it supports a special record type that also works well on the bare domain - Flattened CNAME records. This record type is recommended for your bare domain. You’d set the same record value as for your subdomains, such as brave-curie-671954.netlify.app.

What’s your evidence that this is the case? My experience has been exactly the opposite … except in one edge case, no matter what Google claims is wrong with my sites they still appear in search results as expected.