Redirect error in Google Search Console for non-HTTPS bare URL

That’s the thing: I use Netlify DNS, not an external provider…

But Netlify itself by default just configures two “NETLIFY” (CNAME?) records, not an A record, and that seems to have started triggering the Googlebot errors somehow.


OK, now I’m getting it. You are unnecessarily modifying your DNS settings within Netlify to mollify the googlebot.

How certain are you that what you are seeing is NOT due to the content security policy you have set up?

content-security-policy: base-uri https://pduchnovsky.com; 
connect-src 'self'; 
style-src 'self' 'unsafe-inline'; 
child-src 'self' disqus.com; 
object-src 'none'; 
worker-src 'self'; 
report-uri https://pduchnovsky.report-uri.com/a/d/g; 
report-to https://pduchnovsky.report-uri.com/a/d/g;
  • This CSP has been active for at least six months, and nothing in it could affect redirects.
  • After adding an A record instead of the “NETLIFY”/CNAME-flattening records, it works fine (at least according to redirect-checker.org).

With these “default” settings the checker reports a redirect loop. As soon as I add an A record, it’s fine.

Google Admin Toolbox also seems to check whether the naked domain has an A record.
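That check is essentially a dig query, so you can run it yourself (a sketch using my domain; substitute your own):

dig +short pduchnovsky.com A

If the apex answers with the A record pointing at Netlify’s load balancer, the record type Googlebot apparently expects is in place.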


@pduchnovsky I gave up trying to appease the Google gods many years ago. As long as CSE works, I’m happy.


Actually, the solution above only worked for the apex domain itself; I still encountered errors for the www subdomain.

Therefore I created a CNAME record for the www subdomain pointing to the apex domain, and this seems to solve the redirect errors for both the apex domain and the subdomain.
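For reference, the resulting records look roughly like this (a sketch: the load balancer IP is the one from Netlify’s external-DNS docs quoted further down, and your hostnames will differ):

pduchnovsky.com.      A      75.2.60.5
www.pduchnovsky.com.  CNAME  pduchnovsky.com.

With this shape, www resolves through the apex, so both names end up at the same place.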



Hi, I think I’m running into the same issue, except I’m getting a server timeout error with Googlebot instead of a redirect error. I tried the DNS configuration changes that @pduchnovsky suggested, but it doesn’t seem to help, either for the apex domain or for www (set as primary).

My site is https://www.pdfstitcher.org, and if I point Googlebot at the Netlify URL (brave-volhard-5d9df5.netlify.app) then everything seems fine, so it really seems like a DNS issue. I’ve tried configuring it externally as well as letting Netlify handle it, and I’m at a bit of a loss.

The other thing that doesn’t work is LinkedIn’s post inspector, so I suspect it’s the same issue.

@cfcurtis Welcome to the Netlify community.

I have no idea why this would be happening, but for some reason the Internet doesn’t think your apex domain is served by Netlify.

|======================== check for server ======================
| ---------------------- should be Netlify ----------------------
| ----------------------- pdfstitcher.org -----------------------

| --------------------- www.pdfstitcher.org ---------------------
< server: Netlify
|================================================================
|====================== get x-nf-request-id =====================
| -------------------- blank if not Netlify ---------------------
| ----------------------- pdfstitcher.org -----------------------
| ---------------------------- http -----------------------------
| ---------------------------- https ----------------------------
< x-nf-request-id: ac4b1606-3298-4ba0-93b0-cf25bddc028b-221337382
| --------------------- www.pdfstitcher.org ---------------------
| ---------------------------- http -----------------------------
< x-nf-request-id: 2f0ecc80-90a8-4ad8-8e9d-f296fd264df4
| ---------------------------- https ----------------------------
< x-nf-request-id: 9a98d8fc-ae7f-4ca6-8afc-b5fa202c36d9
|================================================================
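For anyone who wants to run a similar check themselves, the gist is just a header probe; this is a sketch, not the actual script behind the output above:

curl -s -I https://pdfstitcher.org/ | grep -i '^server:'

A domain served through Netlify should answer with “server: Netlify”.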

Thank you for looking into this for me, I’ve been wracking my brain trying to figure it out. Is there anything I can do to fix it? Right now it says pdfstitcher.org redirects to www.pdfstitcher.org, which is set as my primary.

Actually, this is an incorrect DNS configuration and as a member of the Netlify support team I ask that people please not do this.

The NETLIFY type DNS records are the correct way to link custom domains to sites at Netlify when using Netlify DNS. So, I do ask that anyone reading this topic not follow that suggestion. It is definitely not a recommended configuration in any way.

Also, after testing with https://www.redirect-checker.org/index.php, I believe it provides highly inaccurate information, and I will not accept any data from that tool as valid until I can see it is working correctly. I just tested an example tonight where the tool reports a redirect loop when in fact it is requesting the exact same URL over and over (the HTTP-only URL) and getting the exact same redirect again and again (the redirect from HTTP to HTTPS). That isn’t a loop. The tool also incorrectly reports that it is requesting HTTPS when in fact it does not. To summarize, I cannot trust the data returned by https://www.redirect-checker.org/index.php, as I know it is providing false data. (Note: I don’t think that is intentional, but I do think it has a bug of some kind.)

EDIT: Further testing indicates that https://www.redirect-checker.org/index.php is giving accurate and correct information and it was our logging, not theirs, which was giving inaccurate information.

Also, we cannot see what Google’s tools are doing. We need some way to see this without using third-party tools.

Can anyone reproduce this outside of these two tools? If so, no one has shared that information. I’m going to summarize my post in a parallel topic by saying:

  • We cannot troubleshoot Google. We can only troubleshoot Netlify. To troubleshoot Netlify, we need to identify the incorrect redirects.

We need information to identify the incorrect response. There is a longer reply here detailing what information is required to troubleshoot:

If you provide any of the three possible sets of details to troubleshoot, we will be happy to take another look at this. If there are any questions about how to get that information, please let us know.
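For example, an x-nf-request-id (one of those sets of details) can be captured with a single curl command; this is a sketch with a placeholder URL:

curl -svo /dev/null https://example.com/ 2>&1 | grep -i 'x-nf-request-id'

Include the ID it prints, along with the approximate time of the request, when reporting the issue.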

Hi @Luke,

I understand your position on this and have no particular investment in the tool at https://www.redirect-checker.org/index.php.

However, in my personal situation it was the only tool that reflected the non-publicly available results I received by using the URL Inspection tool on Google Search Console.

Also, we cannot see what Google’s tools are doing. We need some way to see this without using third-party tools.

As mentioned, the way to test this is by using Google Search Console, which as far as I can see is the only authoritative way to test it. Like I mentioned before, even if Google’s implementation is incorrect, their dominance in search makes their implementation the de facto standard.

I would encourage all affected individuals to check the results for the URLs in question by visiting Google Search Console and entering the URL into the ‘URL Inspection’ tool, which is third down in the left-hand navigation bar. You can then test any updates to your configuration by pressing the ‘Test live URL’ button.

If it works – wonderful. No further action needed.

If it fails, the SEO of your website is being negatively impacted and you’ll need to make changes to mitigate this.

Again, I believe it is the responsibility of Netlify to ensure that their default DNS setup is compatible with Google’s interpretation of the standards – regardless of whether or not it is correct. It’s the reality of their dominant position in the market.


Hi @Luke,

Thanks for this info; it seems I have a different issue than the OP. I’ve reverted to the NETLIFY DNS records by obliterating my manual configuration and re-enabling DNS management, so now I just have two NETLIFY records (one for www.pdfstitcher.org, the other for pdfstitcher.org).

The curl command gives me the following:

Apex domain without Googlebot:

curl -svo /dev/null https://pdfstitcher.org 2>&1 | egrep '^< '
< HTTP/2 301
< cache-control: public, max-age=0, must-revalidate
< content-length: 43
< content-type: text/plain
< date: Sun, 16 May 2021 06:22:35 GMT
< strict-transport-security: max-age=31536000
< age: 286621
< location: https://www.pdfstitcher.org/
< server: Netlify
< x-nf-request-id: 7680ef39-3f13-423c-a8f3-ddd9d603d7de

With Googlebot:

curl -svo /dev/null -A Googlebot https://pdfstitcher.org 2>&1 | egrep '^< '
< HTTP/2 307
< content-length: 39
< content-type: text/html; charset=utf-8
< date: Wed, 19 May 2021 13:48:51 GMT
< etag: W/"27-ghawzGh2y9RPAcFY59/zgzzszUE"
< x-nf-request-id: 7393bb07-9a48-47fe-ae74-9ef4c66346da
< location: https://www.pdfstitcher.org/
< server: Netlify
< age: 1

So far so good, though Google Search Console doesn’t agree. However, if I go to www:

No Googlebot:

curl -svo /dev/null https://www.pdfstitcher.org 2>&1 | egrep '^< '
< HTTP/2 200
< cache-control: public, max-age=0, must-revalidate
< content-type: text/html; charset=UTF-8
< date: Tue, 18 May 2021 16:53:35 GMT
< etag: "87a4424958dbb25da122e85520fc8542-ssl"
< strict-transport-security: max-age=31536000
< x-nf-request-id: b0ed57b9-8c68-4a66-a22f-3900b628a63d
< age: 76045
< server: Netlify
< content-length: 10106

Googlebot:

curl -svo /dev/null -A Googlebot https://www.pdfstitcher.org 2>&1 | egrep '^< '
< HTTP/2 504
< content-length: 39
< content-type: text/html; charset=utf-8
< date: Wed, 19 May 2021 13:49:29 GMT
< etag: W/"27-ghawzGh2y9RPAcFY59/zgzzszUE"
< x-nf-request-id: a218effd-260b-4e81-8e4c-31ece96ac8d9
< server: Netlify
< vary: Accept-Encoding
< age: 20

I appreciate any advice you might have, thank you!

Edit: I also tried a few other bots (Bingbot, LinkedInBot, Twitterbot) and got the same result, so it seems like a configuration problem on my end.
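A quick loop over the curl command above makes that comparison easy to repeat (a sketch; substitute your own site):

for ua in Googlebot Bingbot LinkedInBot Twitterbot; do
  echo "$ua: $(curl -so /dev/null -w '%{http_code}' -A "$ua" https://www.pdfstitcher.org/)"
done

Each line prints the user agent and the HTTP status code it received.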

@Ultra FWIW, some of my sites pass Google URL Inspection, but some do not. However, each of them works fine in Google search engine results, despite the fact that Google high-handedly says that URLs that end with index.html must be represented as /, ignoring and overriding explicit canonical tags.

As far as I can tell, Netlify is doing things correctly and it is Google that is changing the rules, or at least is trying to.

Hi @gregraven, at the risk of repeating myself: for most Netlify customers it doesn’t matter who is or isn’t doing things ‘correctly’ or whether Netlify or Google is to blame.

The point is that if Google Search Console is telling you there is an error, without a shadow of a doubt you are putting your SEO efforts at risk.

You may still place in the rankings but it’s possible you would have ranked higher.

A short story to illustrate:

Jo Bloggs runs a small website selling widgets and creates weekly articles about them, hoping to get exposure for her company. One day, she receives a notification from a journalist who writes for the New York Times that they’ve quoted her article and linked to her website. However, they used http://jobloggs.com for the link, and because that bare HTTP URL returns a redirect error to Googlebot, Google never connects the citation to her site. The site therefore receives zero domain authority for the citation and languishes in the depths of Google’s search results.

@Ultra I submit it DOES matter who is correct and who is not. If you concede that Netlify may be doing this the correct way, then perhaps you should contact Google?

Just as a reminder, a few years ago we were all warned that improperly-formed HTML would cause lower Google search results rankings. Then Google came out with AMP, which is utterly invalid HTML, and just like that we were all expected to abandon years of trying to do things correctly so that Google could corner the market on Internet searches.

@gregraven I agree it’s unfortunate that Google has so much influence. However, unless you have Netlify customers’ explicit consent to fight a holy war with Google, I would do them a favour and let them know that if GSC is showing errors, it may be negatively impacting their SEO.

Considering there must be in excess of 1.6 million Netlify domains, many of which have been indexed by Google, I would say that Netlify doing this would be a great disservice to its customers without solid evidence.

Hi, @Ultra. I do want to be clear that it is definitely possible that Netlify is doing the wrong thing somewhere. However, to help you, I must be able to see it happening.

I’m not the person that will fix a bug if it exists. However, I will be the person that files the bug report.

However, my bug report must contain either:

  • a) instructions of how to reproduce the issue

or

  • b) some logging from our system, a HAR file recording, an x-nf-request-id header, etc. - in other words, some proof that the issue is occurring

If I don’t have proof in the form of reproduction steps or other logging we can see, then the bug will immediately be closed as “cannot reproduce” and no work will be done. I must include proof in order to successfully file a bug report.

This is a hard requirement and it is only logical. Without a way to see the issue, there is nothing for our developer to research or fix.

I’m asking for proof because that is a bare minimum requirement for any bug report. It is that simple.

I’m not asking for proof to be a gatekeeper or to put the burden of proof on you. I am also trying hard to prove the issue exists. I’m trying to work with you on this, but I simply cannot trigger the behavior.

Again, I’ve tried to see what you say is happening. I tested with the Googlebot user agent header (one of many user-agent headers that will trigger our prerendering if it is enabled).

I still cannot trigger the issue you are reporting:

$ curl -svo /dev/null  -A Googlebot http://lightmeterultra.com/  2>&1 | egrep '^< '
< HTTP/1.1 301 Moved Permanently
< Cache-Control: public, max-age=0, must-revalidate
< Content-Length: 43
< Content-Type: text/plain
< Date: Sun, 09 May 2021 12:57:58 GMT
< Age: 933438
< Connection: keep-alive
< Server: Netlify
< Location: https://lightmeterultra.com/
< X-NF-Request-ID: 209cbd91-eddd-4e93-8c3d-f88de617a2e9-207912910
<
$ curl -svo /dev/null  -A Googlebot https://lightmeterultra.com/  2>&1 | egrep '^< '
< HTTP/2 301
< cache-control: public, max-age=0, must-revalidate
< content-length: 47
< content-type: text/plain
< date: Thu, 20 May 2021 08:15:22 GMT
< strict-transport-security: max-age=31536000
< age: 0
< server: Netlify
< location: https://www.lightmeterultra.com/
< x-nf-request-id: 209cbd91-eddd-4e93-8c3d-f88de617a2e9-207913799
<
$ curl -svo /dev/null  -A Googlebot https://www.lightmeterultra.com/  2>&1 | egrep '^< '
< HTTP/2 200
< cache-control: public, max-age=0, must-revalidate
< content-type: text/html; charset=UTF-8
< date: Thu, 20 May 2021 08:15:28 GMT
< etag: "505dfdd921daae81ff03a896352d413d-ssl"
< strict-transport-security: max-age=31536000
< age: 0
< server: Netlify
< x-nf-request-id: a5ec2d24-f3ac-4b9f-8f1a-ba02537e7c7f
<

However, the site in question isn’t using our prerendering, so we wouldn’t actually treat the “Googlebot” user agent any differently from any other user agent.

To summarize, again, I’m not refusing to help. I’m not saying Netlify is perfect and that we are definitely not to blame. I’m saying that for any action to be taken on this, we need to be able to show the issue to our developers.

Again, I’ll help you but I cannot see the issue and you said you could see it. If you show it to me as well, I can get it fixed.

Can anyone tell me what I need to do to see the issue happening? That is all I need to file a bug report. I’m trying to help you but I cannot do so without proof.

Hi @luke, maybe the HAR I’ve sent to you via support email can shed some light on this? I’ve also uploaded it here: phonolyth.com_Archive [21-05-19 20-10-04].har

@Luke – my issue is fixed.

I’m trying to leave a trail for others who may find themselves in the situation I was in, along with a potential fix.

Also, I’m trying to surface the fact that even if Netlify’s diagnostic tools show a pass, that does not mean customers are safe to ignore Google Search Console errors – the errors could be harming their SEO.

This is a hard requirement and it is only logical.

I’m sorry, but this constraint is not acceptable when people are debugging Google Search Console errors. We have no insight into Google’s processes or Googlebot, and must rely on the diagnostic tools that Google provides in these circumstances.

Customers rightfully expect their DNS setup to play nicely with Googlebot. It is not the fault of the customer that they cannot provide ‘proof’ – it’s clear from the number of people posting here that there’s a genuine issue.

In my situation, it turned out that I had followed Netlify’s recommendations for setting up my DNS with flattened CNAME (ALIAS) records, and those records were causing the issues. Netlify has since silently updated their recommendations – for this I do have proof – which suggests to me that Netlify knows there is an issue with this setup.

Original Instructions: Configure external DNS for a custom domain | Netlify Docs

Some DNS providers, such as Netlify DNS or NS1, have devised special record types to simulate CNAME-style domain resolution for apex domains. Find out if your provider supports this type of behavior, which might be labeled as CNAME flattening, ANAME records, or ALIAS records.

If your DNS provider supports one of these special record types (recommended), find and follow their instructions to point the apex domain directly to your Netlify subdomain, such as brave-curie-671954.netlify.app .

Current instructions: Configure external DNS for a custom domain | Netlify Docs

If you use Cloudflare as your DNS provider , it supports a special record type that also works well on the bare domain - Flattened CNAME records. This record type is recommended for your bare domain. You’d set the same record value as for your subdomains, such as brave-curie-671954.netlify.app.

  1. Find your DNS provider’s DNS record settings for your apex domain, such as petsofnetlify.com .
  2. Add an A record . Depending on your provider, leave the host field empty or enter @ .
  3. Point the record to Netlify’s load balancer IP address: 75.2.60.5
  4. Save your settings. It may take a full day for the settings to propagate across the global Domain Name System.

Notice how flattened CNAME records are now mentioned for Cloudflare customers only.
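If you follow the current instructions, a single query confirms the change once DNS has propagated (a sketch using the placeholder domain from the docs):

dig +short petsofnetlify.com A

It should print the load balancer address, 75.2.60.5.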

I hope this is useful for someone who runs into the same issues as I did.


@gregraven @luke

I think my reply might have gotten lost in the noise, and now I’m wondering if I should start a new topic, since it seems to be a different problem. However, if it is still related, I’ve created .har files to reproduce the problem: one for Googlebot and one for the default user agent.

Another thing I tried is setting the apex domain to primary, which actually returned a valid response to the googlebot, but still timed out when trying to load the www subdomain. I switched it back so that it’s configured per netlify’s recommendations and now if I force it to visit the apex domain it still retrieves content, but shows up as non-indexable in google search console (and correspondingly does not appear in search results).