mjgs
August 2, 2023, 7:28am
1
I have a sitemap.xml file at the root of my site. I have added a header rule in netlify.toml specifying the correct content type, but when I view the file in a browser it displays as HTML.
# Sitemap
[[headers]]
for = "/sitemap.xml"
[headers.values]
Content-Type = "application/xml"
Here is the file:
https://h2qz.netlify.app/sitemap.xml
Why isn’t it displaying as xml?
luke
August 3, 2023, 7:50am
2
Hi, @mjgs. Because there is no such file, a default 404 page is being sent instead. The 404 page is HTML, not XML. If you request an XML file that does exist, the content-type is text/xml:
$ curl --compressed -svo /dev/null --stderr - https://h2qz.netlify.app/feeds/links/rss/feed.xml | egrep '^< content-type'
< content-type: text/xml
So, the real problem to solve here is that there is no sitemap.xml file in the deploy. You will need to generate or include that file in the deploy, and then the [[headers]] setting will work.
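A quick way to tell a real file from the 404 fallback is to look at the status code and the content-type together. The following is only a minimal sketch: `summarize_headers` is a hypothetical helper name (not part of curl or Netlify), and it assumes the headers arrive on stdin in the plain form printed by `curl -sI`, not the `< `-prefixed form printed by `curl -sv`.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print
# "<status> <content-type>", e.g. "404 text/html".
summarize_headers() {
  awk '
    BEGIN { status = "unknown"; ctype = "unknown" }
    { sub(/\r$/, "") }                 # strip the CR from HTTP line endings
    /^HTTP\// { status = $2 }          # status line, e.g. "HTTP/2 404"
    tolower($1) == "content-type:" {
      ctype = $2
      sub(/;.*$/, "", ctype)           # drop parameters such as "; charset=UTF-8"
    }
    END { print status, ctype }
  '
}
```

Usage would look something like `curl -sI https://h2qz.netlify.app/sitemap.xml | summarize_headers`. A 404 status paired with text/html would suggest the default 404 page is being served rather than the sitemap.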
mjgs
August 4, 2023, 12:41am
3
Thanks for the reply.
Shoot, in the meantime I must have deleted the file to test something else. I’ll re-generate it and update the thread shortly.
mjgs
August 4, 2023, 2:19am
4
Hi @luke I’ve put the file back. I’ll try to keep it there for a day or so.
Here’s a screenshot of what I’m seeing when I load it in a browser:
The URLs in the sitemap have the live site hostname, this is deployed into a staging server. The point is that I was expecting to see XML not HTML.
Any ideas why it’s not displaying the XML that’s in the sitemap file?
mjgs
August 4, 2023, 11:51pm
5
I have to remove the file from the staging server to get on with other things. I can put it back again later if that’s helpful.
Could you list a few troubleshooting things I could do? Or some additional info I could provide to you to help me figure out the cause?
mjgs
August 5, 2023, 3:21am
6
Hi @luke , I’ve put the sitemap.xml file back on the staging server. (Should be there once the latest build completes in a few mins)
It would be great if you could take a look and let me know whether you were able to see it. Thanks.
What’s the link to your staging server?
That’s being served as XML, which is why I asked where your file is. If that’s the same link, I don’t see a problem there.
mjgs
August 6, 2023, 11:12pm
10
For comparison, here’s another xml file, different format, which displays as actual xml, which is what I would expect for the other file:
https://h2qz.netlify.app/feeds.opml
I noticed that the opml file, which is xml, didn’t have a header rule, so I tried both configurations: no header rule for either file, and the same header rule for both files.
In neither case did the sitemap display as xml.
On the other hand the feeds.opml did display as xml without the header rule. With the header rule, it displayed as html.
Here’s the opml file with the header rule defined:
What’s going on with these header rules?
Why doesn’t the sitemap.xml file display as xml in the browser no matter what I do?
Updated: correct path to staging server OPML file
Updated: screenshot of staging OPML file
luke
August 9, 2023, 11:56pm
11
The file is being served as XML:
$ curl --compressed -svo /dev/null --stderr - https://64d042e3be3d0c14820eb0e8--h2qz.netlify.app/sitemap.xml | egrep '^(<|>)'
> GET /sitemap.xml HTTP/2
> Host: 64d042e3be3d0c14820eb0e8--h2qz.netlify.app
> User-Agent: curl/8.1.2
> Accept: */*
> Accept-Encoding: deflate, gzip
>
< HTTP/2 200
< accept-ranges: bytes
< age: 0
< cache-control: public,max-age=0,must-revalidate
< content-encoding: gzip
< content-type: application/xml
< date: Wed, 09 Aug 2023 23:48:17 GMT
< etag: "2ce0332f1b207d555c965b546df4538b-ssl-df"
< server: Netlify
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< vary: Accept-Encoding
< x-nf-request-id: 01H7EB6YMYEM53EKFWJJV0H85M
< x-robots-tag: noindex
<
It shows content-type: application/xml above. Also, you can see the file itself is XML:
$ curl -s https://64d042e3be3d0c14820eb0e8--h2qz.netlify.app/sitemap.xml
<?xml version="1.0" encoding="UTF-8"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"><url><loc>https://markjgsmith.com/about/index.html</loc></url><url><loc>https://markjgsmith.com/archives/index.html</loc></url><url><loc>https://markjgsmith.com/blog/index.html</loc></url><url><loc>https://markjgsmith.com/contacts/index.html</loc></url><url><loc>https://markjgsmith.com/feeds/index.html</loc></url><url><loc>https://markjgsmith.com/feeds.opml</loc></url><url><loc>https://markjgsmith.com/index.html</loc></url><url><loc>https://markjgsmith.com/job-interview-policy/index.html</loc></url><url><loc>https://markjgsmith.com/latest/index.html</loc></url><url><loc>https://markjgsmith.com/links/index.html</loc></url><url><loc>https://markjgsmith.com/newsletter/index.html</loc></url><url><loc>https://markjgsmith.com/podcast/index.html</loc></url><url><loc>https://markjgsmith.com/portfolio/index.html</loc></url><url><loc>https://markjgsmith.com/pricing/index.html</loc></url><url><loc>https://markjgsmith.com/recommendations/index.html</loc></url><url><loc>https://markjgsmith.com/services/index.html</loc></url><url><loc>https://markjgsmith.com/sponsorships/index.html</loc></url><url><loc>https://markjgsmith.com/tags/index.html</loc></url><url><loc>https://markjgsmith.com/archives/blog/index.html</loc></url><url><loc>https://markjgsmith.com/blog/2021/index.html</loc></url><url><loc>https://markjgsmith.com/blog/2021/06/01/ipsum-dolor-sit-amet/index.html</loc></url><url><loc>https://markjgsmith.com/blog/2021/01/01/ipsum-dolor-sit-amet/index.html</loc></url><url><loc>https://markjgsmith.com/blog/2022/index.html</loc></url><url><loc>https://markjgsmith.com/blog/2022/06/01/ipsum-dolor-sit-amet/index.html</loc></url><url><loc>https://markjgsmith.com/blog/2022/02/08/up-with-templating-in-modern-javascript-frameworks/index.html</loc></url><url><loc>https://markjgsmith.com/blog/2022/01/01/ipsum-dolor-sit-amet/index.html</loc></url><url><loc>https://markjgsmith.com/archives/links/index.html</loc></url><url><loc>https://markjgsmith.com/links/2021/index.html</loc></url><url><loc>https://markjgsmith.com/links/2021/08/index.html</loc></url><url><loc>https://markjgsmith.com/links/2021/08/12/index.html</loc></url><url><loc>https://markjgsmith.com/links/2021/08/12/151441-markjgsmith.com/index.html</loc></url><url><loc>https://markjgsmith.com/links/2021/08/12/141441-share.transistor.fm/index.html</loc></url><url><loc>https://markjgsmith.com/links/2021/08/12/052614-ckarchive.com/index.html</loc></url><url><loc>https://markjgsmith.com/links/2022/index.html</loc></url><url><loc>https://markjgsmith.com/links/2022/01/index.html</loc></url><url><loc>https://markjgsmith.com/links/2022/01/01/index.html</loc></url><url><loc>https://markjgsmith.com/links/2022/01/01/163045-markjgsmith.substack.com/index.html</loc></url><url><loc>https://markjgsmith.com/links/2022/01/01/162741-blog.markjgsmith.com/index.html</loc></url><url><loc>https://markjgsmith.com/links/2022/01/01/162156-blog.markjgsmith.com/index.html</loc></url><url><loc>https://markjgsmith.com/archives/newsletter/index.html</loc></url><url><loc>https://markjgsmith.com/newsletter/2020/index.html</loc></url><url><loc>https://markjgsmith.com/newsletter/2020/10/21/third-issue/index.html</loc></url><url><loc>https://markjgsmith.com/newsletter/2020/10/21/second-issue/index.html</loc></url><url><loc>https://markjgsmith.com/newsletter/2020/10/19/first-issue/index.html</loc></url><url><loc>https://markjgsmith.com/newsletter/2021/index.html</loc></url><url><loc>https://markjgsmith.com/newsletter/2021/02/05/fifth-issue/index.html</loc></url><url><loc>https://markjgsmith.com/newsletter/2021/02/04/fourth-issue/index.html</loc></url><url><loc>https://markjgsmith.com/archives/podcast/index.html</loc></url><url><loc>https://markjgsmith.com/podcast/2020/index.html</loc></url><url><loc>https://markjgsmith.com/podcast/2020/10/21/0003-noisey-cafe-2/index.html</loc></url><url><loc>https://markjgsmith.com/podcast/2020/10/21/0002-noisey-cafe/index.html</loc></url><url><loc>https://markjgsmith.com/podcast/2020/10/19/0001-silly-chant-too-early/index.html</loc></url><url><loc>https://markjgsmith.com/podcast/2021/index.html</loc></url><url><loc>https://markjgsmith.com/podcast/2021/02/05/0017-foot-badminton-in-the-park-at-sunrise/index.html</loc></url><url><loc>https://markjgsmith.com/podcast/2021/02/04/0016-morning-sound-check-in-the-park/index.html</loc></url><url><loc>https://markjgsmith.com/feeds/blog/rss.xml</loc></url><url><loc>https://markjgsmith.com/feeds/links/rss.xml</loc></url><url><loc>https://markjgsmith.com/feeds/newsletter/rss.xml</loc></url><url><loc>https://markjgsmith.com/feeds/podcast/rss.xml</loc></url><url><loc>https://markjgsmith.com/tags/blog/index.html</loc></url><url><loc>https://markjgsmith.com/tags/links/index.html</loc></url><url><loc>https://markjgsmith.com/tags/newsletter/index.html</loc></url><url><loc>https://markjgsmith.com/tags/podcast/index.html</loc></url></urlset>
What I can confirm is that the file is sent with the correct content-type header and that the correct content is sent. If you are seeing errors in your browser, I cannot troubleshoot that, as I don’t have access to your browser.
To summarize, I cannot see any errors when I test. Can you send us a HAR recording of the incorrect response (or the x-nf-request-id HTTP response header for the incorrect response)?
I don’t think there is an incorrect response, though. However, if there is, please let us know.
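If it helps, the request ID can be pulled out of a saved header dump rather than copied by hand. This is only a sketch: `request_id` is a hypothetical helper name, and it assumes the headers are piped in on stdin in the plain form printed by `curl -sI` (no `< ` prefix).

```shell
# Hypothetical helper: print the value of the x-nf-request-id response
# header from HTTP headers read on stdin (prints nothing if it is absent).
request_id() {
  awk '
    { sub(/\r$/, "") }                              # strip the CR from HTTP line endings
    tolower($1) == "x-nf-request-id:" { print $2 }  # header names are case-insensitive
  '
}
```

For example, `curl -sI https://h2qz.netlify.app/sitemap.xml | request_id` should print the ID for that particular request, which can then be pasted into a reply here.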
mjgs
August 11, 2023, 12:26am
12
Thanks for taking a look @luke
I guess what’s viewable via the browser isn’t that important, as long as the commands you ran return the right results. Also, I presume Google will report an error when I submit the sitemap if there is an issue.
I’m glad it’s working, though I want to be sure I have it configured in the most stable way.
Currently there are no rules configured for the sitemap or the OPML file; they are commented out:
#[[headers]]
# for = "/sitemap.xml"
# [headers.values]
# Content-Type = "text/xml"
# RSS feeds
[[headers]]
for = "/feeds/blog/rss.xml"
[headers.values]
Content-Type = "text/xml"
[[headers]]
for = "/feeds/links/rss.xml"
[headers.values]
Content-Type = "text/xml"
[[headers]]
for = "/feeds/newsletter/rss.xml"
[headers.values]
Content-Type = "text/xml"
[[headers]]
for = "/feeds/podcast/rss.xml"
[headers.values]
Content-Type = "text/xml"
#[[headers]]
# for = "/feeds.opml"
# [headers.values]
# Content-Type = "text/xml"
Is the application/xml content type that was returned in your command output some sort of default?
Is it better to configure a rule to ensure things don’t change unexpectedly?
Which is better for xml files: text/xml or application/xml ?
Yes, we have default content-types for a lot of formats, and we don’t plan to change this (as we’re following the current standards). You can choose to specify it explicitly, though that’s not required.
Your final question is answered here: rest - What’s the difference between text/xml vs application/xml for webservice response - Stack Overflow
mjgs
August 13, 2023, 4:54am
14
Thanks for the help on this thread.
I managed to get Google Search Console to parse the sitemap, looks like no errors reported.
See the screenshot attached to this email. I’m replying via email because the thread content isn’t currently loading on the website.