Domain redirect not working, legacy domain still accessible for some

I have a strange issue with a site hosted on Netlify (site name stoic-shaw-9173d0) where a legacy domain, changed and redirected over a year ago, is still showing up in logs.

The old domain learnpyqt.com was redirected to pythonguis.com over a year ago, using the following rule. Both domains are set on this site, have working certificates and do appear to redirect.

https://www.learnpyqt.com/*         https://www.pythonguis.com/:splat 301!
https://learnpyqt.com/*         https://www.pythonguis.com/:splat 301!
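For anyone reading along, here is a rough Python sketch of what the two rules above are expected to do to a URL on the old domain. It is only an illustration of prefix-plus-splat substitution, not Netlify's actual matching logic, and the names `RULES`, `expand_splat` and `resolve` are made up for this example.

```python
from typing import Optional

# The two rules quoted above, as (source pattern, target) pairs.
RULES = [
    ("https://www.learnpyqt.com/*", "https://www.pythonguis.com/:splat"),
    ("https://learnpyqt.com/*", "https://www.pythonguis.com/:splat"),
]


def expand_splat(source: str, target: str, url: str) -> Optional[str]:
    """Return the redirect target for `url`, or None if the rule does not match."""
    if not source.endswith("*"):
        # No wildcard: only an exact match applies.
        return target if url == source else None
    prefix = source[:-1]
    if not url.startswith(prefix):
        return None
    # Whatever the wildcard matched is substituted for :splat in the target.
    return target.replace(":splat", url[len(prefix):])


def resolve(url: str) -> Optional[str]:
    """Apply the first matching rule, top to bottom."""
    for source, target in RULES:
        result = expand_splat(source, target, url)
        if result is not None:
            return result
    return None


print(resolve("https://learnpyqt.com/tutorials/"))
print(resolve("https://www.pythonguis.com/"))
```

Under these assumptions, any path on either old hostname maps to the same path on www.pythonguis.com, and URLs already on the new domain match no rule.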

However, the old URL continues to show up in sources (referers).

When I contacted support about this, I was told this is “good news!” because traffic is still coming via the old domain. But redirects don’t change the referer. The only way the old domain should show up as a referer is if someone actually lands on the page and clicks a link.

With the redirect in place, how are they reaching a page on that domain?

Using third-party analytics, I checked the pages that these visitors are “landing” on at pythonguis.com when coming from learnpyqt.com, and it suggests they are using an old version of the site (the URLs are outdated). I then went and checked some old API endpoints, last used in the site over 6 months ago, and lo and behold they are still getting hits.

To top it off, the old domain still exists in Google despite a 301 redirect being configured for over a year. This is not normal.

Something seems very wrong here. It appears that some users are being served a very out of date (cached?) version of the site.

Support told me to go buy an upgrade so I can access the log drains and figure it out myself. I would appreciate some actual help/advice in debugging this.

Support are now point-blank refusing to help debug this, saying there are no errors (there wouldn’t be) and telling me to upgrade to access Log Drains (a >$99/month plan) and figure it out myself.

That’s the attitude now to a potentially serious problem with deployment of a site? “LOL, your problem, good luck.”

I’ve been a very happy Netlify user for years, and support was always interested in solving issues (see my other posts on here). I recently upgraded to a paid account, and now this. Do I need to downgrade again to get help?

Beyond p*ssed off.

Hey @mfitzp,

Happy to check this for you. As far as logs are concerned, I can see that there have indeed been a lot of requests to your new domain with referrer as the old domain.

So, I tried to check for requests to your old domain in the past 30 days (the max we have the logs for) with the status code as 200, but all I can see are the LetsEncrypt verification requests that we served to renew your SSL certificate.

I can also confirm that we do not maintain referrers when using a 301 redirect.

Surprisingly, I’m not seeing any other 200s being served for that domain. There is one piece of information you could provide to help us locate these requests. You mention that some old API endpoints, last used in the site over 6 months ago, are still getting hits.

Could you share those URLs? Are you still able to access those URLs yourself?

I’d be surprised if a site somehow managed to stay cached for 6 months or a year, so this is definitely weird.

Hi, @mfitzp. I’ve found the source of the referrer. There isn’t an error or bug. That is the referrer being sent by an uptime monitor and Netlify is correctly reporting that fact.

The referrer is being sent and we just reported it. We didn’t make it happen. We just reported that it did happen. So, the question becomes:

  • Why is this referer header sent?

First, I want to clarify what a referrer is. Long ago a referer header was added to the HTTP specification. The header name, referer, is a misspelling of “referrer” and that misspelling is now part of the specification itself. It has been wrong for decades now and is unlikely to change anytime soon.

Browsers themselves control what is sent in this header and there is more information about the header here:

Now, I realize you likely know all of that and probably none of it is new information for you. I’m just clarifying the functionality of the header to point out two things:

  • the web client (not the web server) is in complete control of this header
  • a web client can spoof that header and send any string imaginable - it doesn’t even have to be a valid URL (it is intended to be a URL but that is not a rule enforced by web servers)
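To make those two points concrete, here is a small Python sketch using the standard library's `urllib.request`. It only constructs requests and never sends them; the helper name `build_request` and the referer values are just examples chosen for illustration.

```python
import urllib.request


def build_request(url: str, referer: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying whatever Referer the client picks."""
    return urllib.request.Request(url, headers={"Referer": referer})


# The client can claim to come from the old domain...
spoofed = build_request("https://www.pythonguis.com/", "https://www.learnpyqt.com/")
print(spoofed.get_header("Referer"))

# ...or send a string that is not a URL at all.
nonsense = build_request("https://www.pythonguis.com/", "not even a URL")
print(nonsense.get_header("Referer"))
```

The server receiving either request has no way to tell from the header alone whether a real page visit preceded it.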

So, why do I mention all this? Because the source of this referrer is not something Netlify can control. When I checked the source of that referer header, I found that all examples use a single user-agent string. The user-agent string always includes this URL:

http://www.uptimerobot.com/

In other words, it appears that someone has hard coded that referrer into the uptime check at that service. That service is the only client sending that referer header.
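If you have access to request logs of your own, the same kind of check can be sketched in Python as below. The records, field names, and user-agent strings here are made up for illustration (they are not actual Netlify log output); the idea is simply to count which user agents send the old domain as the referer.

```python
from collections import Counter

# Made-up records for illustration; real log fields and values will differ.
SAMPLE_LOG = [
    {"referer": "https://www.learnpyqt.com/",
     "user_agent": "ExampleMonitor/1.0 (http://www.uptimerobot.com/)"},
    {"referer": "https://www.learnpyqt.com/",
     "user_agent": "ExampleMonitor/1.0 (http://www.uptimerobot.com/)"},
    {"referer": "https://www.google.com/",
     "user_agent": "ExampleBrowser/1.0"},
]


def user_agents_for_referer(records, domain):
    """Count user-agent strings on requests whose referer mentions `domain`."""
    return Counter(
        r["user_agent"] for r in records if domain in r.get("referer", "")
    )


counts = user_agents_for_referer(SAMPLE_LOG, "learnpyqt.com")
for agent, hits in counts.most_common():
    print(hits, agent)
```

If only one user agent shows up for that referer, as in this sample data, the traffic is coming from a single client rather than from ordinary visitors.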

The uptime monitoring service itself is sending this header. If there are questions about why they are using that referer header, we recommend contacting that service’s technical support for more information, as Netlify cannot debug a third-party service. My best guess is that the monitor has this header hard coded or is designed to add it when a 301 redirect occurs. If so, this is controlled by the uptime monitor service and not by Netlify.

If that monitor is corrected to stop doing this, please let us know as our support team can double check our logs to confirm if we still see the issue or not. The exact search to find that monitor is saved in a private comment in this thread that our support team can see so there will be no doubt if the issue is fixed or not.

Please let us know if there are other questions. If the monitor itself has been corrected, please feel free to follow up here as, again, we can easily double check to be sure we also see it fixed on our side.

Thanks @luke, I appreciate you taking the time to look into this and understand what’s happening. (Sorry for the delay replying, @hrishikesh; family were unwell.) I’ve changed the configuration on the uptime monitor, so that should stop the referer traffic.

The old pages appearing in Google is also a red herring as this is reported elsewhere, just Google being lazy about removing things that work.

This still doesn’t explain how the old API endpoints are still receiving hits that appear “natural”, i.e. the result of someone navigating around the (old) site. It’s quite bizarre.

I understand the referer comes from the browser (or bot): it was the combination of the referers + the API traffic that concerned me, as the only way I can see that happening is with organic traffic. But with the referer part explained, that traffic is low enough that I’m not inclined to spend more time thinking about it.

Thanks again for looking into this.
