I’m using the _headers file to provide canonical data for PDFs (more details on why in this post).
Even though the canonical tags are now set correctly, the header from the _headers file is not always sent when a page loads, and this is causing Google Search Console to give me a ‘Duplicate without user-selected canonical’ error.
This is one of the impacted pages:
To try to debug this I’ve inspected the page in Chrome and looked at Network > Headers > Response headers. Sometimes the Link header from the _headers file providing the canonical is there, but most of the time it isn’t, and I have to reload multiple times to see it.
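For anyone who wants to do the same check outside the browser, here is a rough Python sketch that pulls the canonical URL out of a Link response header value. The function name and the example.com URLs are made up for illustration, and the parsing is deliberately simple (it assumes no commas or angle brackets inside quoted parameters).

```python
import re

# Matches each "<target>" followed by its parameters, up to the next "<".
LINK_RE = re.compile(r'<([^>]*)>([^<]*)')

def canonical_from_link_header(value):
    """Return the URL marked rel="canonical" in a Link header value, or None."""
    if not value:
        return None
    for target, params in LINK_RE.findall(value):
        # rel may be quoted or unquoted and may list several relation types.
        match = re.search(r'\brel\s*=\s*"?([^";,]+)"?', params, re.IGNORECASE)
        if match and "canonical" in match.group(1).lower().split():
            return target
    return None

# Hypothetical header value, in the shape a _headers rule would produce:
sample = '<https://example.com/files/report.pdf>; rel="canonical"'
print(canonical_from_link_header(sample))
```

Pasting the Link value copied from the Response headers panel into `canonical_from_link_header` should return the canonical URL, or `None` when the header is missing.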
Is there anything I can do from my end to ensure this is served consistently?
Any help would be greatly appreciated!
My Netlify Site Name is lovely-semifreddo-4bea40
My Custom Domain is
Here is a picture of my Google Search Console info:
Here are the Response Headers with the link
Here are the Response Headers without the link
A little bit more digging on my side… the _headers information is included when the response has Status Code: 200, i.e. everything is loaded fresh, but not when it has Status Code: 304, i.e. the browser is reusing what it has in the cache.
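The 200 versus 304 difference can be reproduced without reloading in Chrome. Below is a minimal sketch using only Python's standard library: it requests the file once, then repeats the request with the ETag it got back in an If-None-Match header. The function names and the URL in the comment are placeholders I made up, and it assumes the server returns an ETag at all.

```python
import urllib.error
import urllib.request

def conditional_headers(etag=None):
    """Request headers for a plain request (no etag) or a conditional one."""
    return {"If-None-Match": etag} if etag else {}

def fetch(url, etag=None):
    """Return (status, headers); urllib raises HTTPError for a 304, so catch it."""
    req = urllib.request.Request(url, headers=conditional_headers(etag))
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.headers
    except urllib.error.HTTPError as err:
        return err.code, err.headers

def compare(url):
    """Print status and Link header for a fresh and a conditional request."""
    status, headers = fetch(url)
    print(status, "Link:", headers.get("Link"))
    etag = headers.get("ETag")
    if etag:
        status, headers = fetch(url, etag)
        print(status, "Link:", headers.get("Link"))

# Example call (placeholder URL, not run here):
# compare("https://example.com/files/report.pdf")
```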
Any thoughts on why this is causing Google to miss the _headers input?
This is expected. 304s would not serve custom headers. Why would Google ask for a 304 response though? Google should get a 200, unless it’s asking for cached data.
You can instead add the canonical tags to the HTML: Canonicalization - Moz
Thanks for the response!
Google isn’t asking for the 304 response (to my knowledge); they’re saying they can’t see the headers. Unfortunately I can’t add it to the HTML, as the headers are for PDFs. I just don’t know why Google isn’t seeing them.
Oh sorry, I missed the fact that they’re PDFs. Well, Netlify doesn’t add custom headers to 304 responses, and the only way to get a 304 is by sending back the previously received etag value (in the If-None-Match request header). If that value is missing or outdated, we send a 200, which includes the custom headers.
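To make that concrete, here is a toy Python model of the behaviour described above. It is only an illustration of the rule (matching etag gives a 304 without custom headers, anything else gives a 200 with them), not Netlify's actual implementation; the function name, etag values, and URL are invented for the example.

```python
def respond(request_headers, current_etag, custom_headers):
    """Toy model: return (status, response_headers) for one request."""
    if request_headers.get("If-None-Match") == current_etag:
        # Client already has the current version: no body, no custom headers.
        return 304, {"ETag": current_etag}
    # Missing or outdated etag: full response, custom headers included.
    return 200, {"ETag": current_etag, **custom_headers}

# Hypothetical custom header, in the shape a _headers rule would add:
custom = {"Link": '<https://example.com/files/report.pdf>; rel="canonical"'}

first = respond({}, '"v2"', custom)                         # no etag sent
repeat = respond({"If-None-Match": '"v2"'}, '"v2"', custom)  # etag matches
stale = respond({"If-None-Match": '"v1"'}, '"v2"', custom)   # etag outdated

for status, headers in (first, repeat, stale):
    print(status, "Link" in headers)
```

A client that never sends If-None-Match, which is what our logs suggest for Googlebot on these PDFs, always lands in the 200 branch.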
I also don’t know whether Google saves the etag and sends it with later requests; I assumed it always asks for fresh data. Based on our logs, in the past 7 days, we’ve served only 200s to Googlebot for all PDF files.
I think you can use Google’s URL inspection tool to see what data it is receiving. If I recall correctly, that also shows the canonical headers. If Google is seeing the right data there, you might have to try contacting them to understand why they’re sometimes not seeing this header. We don’t store the response headers in our logs, so we cannot check whether or not we served those headers with the 200 responses, but nothing is broken on our end in terms of serving custom headers right now, so I don’t think that’s the cause.
This is definitely a Google issue: when I query the URLs within Google Search Console it shows the correct header, even when it’s not stored in their database.
I’ll see if I can change the title of this post to account for that