Pages with and without the trailing slash causing duplicate content issue

When I run a spider on our Contentful site I am seeing duplicate URLs for the same page:

This is causing duplicate URLs. I do not have the Pretty URLs feature turned on. I do not want a redirect controlling this, as it adds lag time in serving the page to the visitor.

What is the best solution here?

Thanks,
Jen

@Jennifer_G Welcome to the Netlify community.

Can you add a canonical tag to your pages?

@gregraven I wanted to avoid that since it is not the best way to handle the issue. I think this is a server issue, and in over 20 years this is the first time I have had to consider using canonicals (yes, using them) to do something the server should be doing.

Correct me if I am wrong and thanks for the idea. May end up there.

Hey @Jennifer_G!

Welcome to The Community! :netliheart:

I was having some trouble with this too. I felt inspired to dig in and answer your question, but the resulting content became so a) long, and b) relevant beyond the scope of this forum, that I wrote it into a (more refined) article. I firmly believe this is the best solution for trailing slashes / duplicated content, at least on Netlify. If you could give it a good read, I think it’ll help answer your questions.


Otherwise,

To @gregraven’s point (adding a canonical tag), the good news is that Netlify is already doing that for you. Sort of. As noted here, when a request comes in to any given site’s <xyz-xyz-123>.netlify.app domain, the response is automatically sent back with a Link header specifying the canonical URL for that site. You’ll note that it has a trailing slash.
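To check what that header says for a given response, a small helper like the one below can pull the canonical URL out of a Link header value. This is only a sketch: the function name is made up for illustration, and it assumes the header uses the common `<url>; rel="canonical"` shape with a single rel value.

```javascript
// Extract the rel="canonical" target from a Link header value.
// Assumes entries shaped like: <https://example.com/>; rel="canonical"
// (hypothetical helper, not part of any Netlify tooling).
function canonicalFromLinkHeader(headerValue) {
  if (!headerValue) return null;
  // A Link header may carry several comma-separated entries.
  for (const entry of headerValue.split(/,(?=\s*<)/)) {
    const match = entry.match(/^\s*<([^>]*)>\s*(.*)$/);
    if (!match) continue;
    const [, url, params] = match;
    // Accept rel=canonical with or without quotes.
    if (/;\s*rel\s*=\s*"?canonical"?(\s*;|\s*$)/i.test(params)) {
      return url;
    }
  }
  return null;
}

// Example with a made-up header value:
const header = '<https://example.com/>; rel="canonical"';
console.log(canonicalFromLinkHeader(header)); // https://example.com/
```

The header value itself can be copied from the response headers that `curl -I` prints for the site's .netlify.app URL.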

That’s not much help in figuring out your trailing-slash strategy on your custom / primary domain, but it’s a good side-note to be aware of.


Jon

Thank you for this detailed response. It does make sense and I do have a couple of questions:

  1. Why do I need to disable minification and bundling to have pretty urls?

  2. What if I don’t want a redirect at all? Why is the server allowing copies to be built? Shouldn’t this just be a server setting?

  3. You mentioned writing special conditions for url parameters ex: https://livelyme.com/author/aaron-benway?page=1
    Is that because the trailing slash would be added (https://livelyme.com/author/aaron-benway/) and break the parameter? I tried it, and it does not break the param: it redirects from https://livelyme.com/author/aaron-benway/?page=1 to https://livelyme.com/author/aaron-benway?page=1, which seems to me would be a server setting in and of itself.

Sorry if I still sound a bit confused; I probably am. I am still struggling with why in the world Netlify would create an environment that would behave like this and claim they are friendly for SEO.

Howdy! To your questions…

  1. You certainly don’t! The rest of the settings can be enabled or disabled as preferred; the point is just making sure that Pretty URLs is enabled and the overall “Disable Asset Optimization” box is un-checked.
  2. Well, there aren’t a ton of options, really. /test and /test/ are different paths. If the server didn’t redirect from /test to /test/, it would just send a 404 instead, since technically /test doesn’t exist but /test/ does. Redirecting to the correct path is way better than “Not Found”-ing. Either the server responds to both paths with the same content (duplicated content, bad), it redirects the user to the correct path when reaching the wrong one (best), or it only responds to one path and 404’s on the other (also bad).
  3. I’m not sure where I mentioned the bit about URL parameters carrying over to the trailing-slash path when requested on a non-trailing-slash page, but I did some extensive testing yesterday and confirmed that query string parameters are all passed through without issue :+1:t2:
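To make the options in point 2 concrete, here is a rough model of the "redirect to the correct path" choice. It is a sketch only, not how Netlify's servers are actually implemented; `resolveRequest` and the `pages` set are hypothetical names used for illustration.

```javascript
// Decide how a static host might answer a request, given the set of
// directory-style pages that exist in the publish folder.
// `pages` holds trailing-slash paths such as "/test/" (made-up data).
function resolveRequest(path, pages) {
  if (pages.has(path)) {
    // Exact match: serve the page.
    return { status: 200, path };
  }
  if (!path.endsWith("/") && pages.has(path + "/")) {
    // The slash-less path does not exist, but its trailing-slash twin does.
    return { status: 301, location: path + "/" };
  }
  // Nothing to serve under either form.
  return { status: 404 };
}

const pages = new Set(["/", "/test/"]);
console.log(resolveRequest("/test/", pages).status);   // 200
console.log(resolveRequest("/test", pages).status);    // 301
console.log(resolveRequest("/missing", pages).status); // 404
```

Only one URL ever answers with a 200, which is what avoids the duplicated content.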

As to the “claim they are friendly for SEO” part - SEO is a very tricky and vast thing. There are a lot of factors and Netlify does indeed support quite a few of them. Even if the trailing slash issue is tricky, Netlify does a lot of other great things to naturally improve site SEO :netliheart:


Jon

@jonsully I don’t suppose there is any way I can test the pretty URLs function on our UAT environment? I am afraid to go into production with this without testing it first as I believe there were some efforts in the past to remove the trailing slash from this particular page: https://livelyme.com/blog

I wanted to turn on pretty urls and unravel the previous fixes for a trailing slash issue with the blog that I uncovered in older tickets before I joined.

@jonsully thank you so much for all of your help so far. I am sorry I was away for a bit; I had to put out a few other fires before I could get back to this topic.

Hey @Jennifer_G! Sorry for the response delay there–

You could create a new Netlify Site and attach it to the same repository. It would build the same code from your master branch and allow you to change the Netlify settings without worrying about the current, live/production site. You could also make said new site use a branch (not master) where you could roll back / play with the code-level changes you’d made previously.

Thank you @jonsully - you really solved all of this for me

:smiley:

@jonsully hoping you can help me here…

I turned on Pretty URLs and added code to next.config.js to add the trailing slash (without this code, Pretty URLs did not function). The code is:
trailingSlash: true,
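For context, that line sits inside the exported config object. A minimal sketch of the whole file, assuming there are no other settings (the real file likely has more):

```javascript
// next.config.js (sketch; only trailingSlash comes from this thread)
module.exports = {
  // Route /features to /features/ and, on static export,
  // write /features/index.html instead of /features.html.
  trailingSlash: true,
};
```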

In the next sprint we noticed that our custom 404 page stopped rendering so we added a redirect to our _redirects file:

/* /404

I also added the trailing slash to our top and footer navigation ex:

changed /features to /features/ in both files on all links

In addition to that I went into our redirects file and edited all of the redirects to have a trailing slash:

/hsa-guide /guides/hsa-guide was modified to
/hsa-guide /guides/hsa-guide/ with the trailing slash

Our production release not only had major caching issues (a curl from the USA returned a 404, while a curl from Germany showed the site up), but it also resulted in a ton of 404 pages.

Do you have any ideas here?
Thanks!

Doesn’t this instruction redirect everything to your 404 page?

I understood it redirected everything that was not an existing page to the custom 404 page

Greetings again @Jennifer_G! :wave:t2:

I’m sorry to hear that you had issues! Can you elaborate on those? And just in case, does your site run any service workers? Those can be a real tricky thing with caching… Also what version of NextJS are you running?

Debugging the trailing slash stuff should be doable locally as well — just run a full netlify build from your local command line and inspect the resulting publish directory (maybe run a clean first just in case). You should expect to see any page on any path throughout the site to be a folder with the page name and an index.html contained therein with all the page content. That’s a good first verification step.

The 404 page is actually the one place that we don’t want the trailing slash premise of “take the page and make it an index.html inside a named folder” to take place. Netlify automatically takes note if your site has the specifically-named 404.html page in the publish directory and automatically serves it as the 404 page instead of Netlify’s own. This is why https://jonsully.net/foo looks the way it does. I use Gatsby, but Gatsby generates a 404.html in the publish directory for me and Netlify serves it on any 404-resulting request.
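As a rough way to run that verification, the check below takes a list of HTML file paths from the publish directory and reports any page that is not in the "named folder plus index.html" shape, while allowing a root-level 404.html. It is a sketch with a hypothetical function name; the file list is made-up sample data and would need to come from your own build output.

```javascript
// Given relative paths of HTML files found in the publish directory,
// return those that are NOT in the "folder + index.html" shape.
// A root-level 404.html is allowed, since it should stay a plain file.
function findNonFolderPages(htmlFiles) {
  return htmlFiles.filter((file) => {
    // Normalize Windows separators and a leading "./".
    const normalized = file.replace(/\\/g, "/").replace(/^\.\//, "");
    if (normalized === "404.html") return false;
    const name = normalized.split("/").pop();
    return name !== "index.html";
  });
}

// Made-up sample listing:
const files = ["index.html", "blog/index.html", "features.html", "404.html"];
console.log(findNonFolderPages(files)); // [ 'features.html' ]
```

An empty result suggests every page was built as a folder; anything listed is a page that would still be served without a trailing slash.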

Now that said, I’m really not sure if there’s any way to get NextJS to force trailing slashes everywhere except 404.html. If it can’t, I’d say either use the _redirects rule with shadowing like you tried, or just make a custom 404.html (outside of Next.js) and make sure it gets copied to the publish folder.

I am curious if there’s a _redirects precedence issue going on :thinking: If we set up a redirect like yours, /* /404/ 404, and presume you have a file in your publish directory like /public/blog/random-thing/index.html, then a request to your site at example.com/blog/random-thing/ (note the trailing slash) should work fine. I’m wondering if a request to example.com/blog/random-thing (no trailing slash) is getting picked up by the 404 redirect rule instead of just pushing the user to the trailing-slash version of the route. I would need to test this, but that could account for some serious headaches if so.


PS. just as a last fleeting thought — I wonder if you could just turn off the trailingSlash: true, run a local export, go into the public / publish directory and copy the 404.html Next generated, paste that into your static directory, then turn trailingSlash: back on :sweat_smile: effectively using Next to generate your 404.html page once (locally) then making sure all the trailing slash rules are enabled thereafter. If you did this you could remove the _redirects 404 catch-all too.


Jon

@jonsully

Absolutely amazing! You are so helpful and right on! I am so grateful for you picking up my initial question, writing a blog post and following this to the end :smiley: :partying_face:

We have successfully now moved to trailing slash only and now our next move is to move all of our blog pages that are currently at the root of our domain into the proper subdirectory. The work continues so I still have a job :grinning_face_with_smiling_eyes: but I was able to get this giant task done because of you :man_superhero:

@jonsully is a fantastic asset to these forums and netlify customers in general! we are so pleased you were able to get help @Jennifer_G :netliheart:

Good! So glad to hear that everything worked out as it should. :sunglasses: and cheers to the Lively crew :slightly_smiling_face:

:netliheart: @perry

Hi @jonsully I noticed now that the site is showing a ton of redirects due to this change (over 7000!) which will likely not make Google happy.

The internal links on the site are without the trailing slash (although I did update the links in the navigation) so it appears to be the reason that the spider on SEMRush is seeing all of these redirects (other ideas?).

How can I fix this without touching every page?
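One possible direction, sketched under assumptions: if the internal links are rendered through a shared component or can be rewritten by a one-off script, a helper like this could add the trailing slash in one place. The function name is hypothetical, and the rules (skip external links, skip file-like paths such as /logo.png) are guesses at what a site like this would want.

```javascript
// Add a trailing slash to an internal href, keeping query string and hash.
// External links, file-like paths (e.g. /logo.png) and paths that already
// end in "/" are returned unchanged.
function withTrailingSlash(href) {
  // Leave absolute URLs, protocol-relative URLs, mailto:, and bare anchors alone.
  if (!href.startsWith("/") || href.startsWith("//")) return href;
  // Split the path from any "?query" or "#hash" suffix.
  const match = href.match(/^([^?#]*)([\s\S]*)$/);
  if (!match) return href;
  const [, pathname, rest] = match;
  const lastSegment = pathname.split("/").pop();
  if (pathname.endsWith("/") || lastSegment.includes(".")) return href;
  return pathname + "/" + rest;
}

console.log(withTrailingSlash("/features"));                   // /features/
console.log(withTrailingSlash("/author/aaron-benway?page=1")); // /author/aaron-benway/?page=1
```

Links that already point at the trailing-slash URL should no longer show up as redirects in a crawl.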

My understanding is that the canonical links you have should take care of that.

@jonsully How do I handle my redirects going forward? Example:

livelyme.com/calculator to livelyme.com/hsa-savings-calculator

My entry is:
image

However, in some browsers (not all), if you navigate to https://livelyme.com/calculator you are sent to https://livelyme.com/calculator/ and not to the HSA Savings Future Value Calculator | Lively page.

When I ran a curl on /calculator vs /calculator/ I see the redirect is only picking up when someone inputs /calculator/

How do I rectify this issue?