Loading RSS feed via GitHub results in a 403 error

I have a GitHub Action that loads the latest posts from my weblog into my profile README.md.

Running the action results in a 403 error when using the above direct RSS link. I have to use an RSS proxy service that loads my feed and creates a copy.

The action workflow can be found here; the action it uses is here. It worked fine until some time ago.

From the looks of it Netlify is blocking GitHub. Is there a “real” reason for this?

The script is set to run once daily, so it’s definitely not an abuse issue. It stopped running on November 14, 2023 07:00 UTC+7.

I opened an issue on the workflows repo, but from the looks of it this is an issue on the side of Netlify.

Hi, @davidsneighbour. I have a saying which I’ll repeat again now:

“Assumptions are the bane of technical support.” - some IT person somewhere

I’m concerned you are starting with a false assumption and then working based on that assumption. If so, this will slow the troubleshooting process, so I want to clear up any assumptions and get hard facts (as hard facts are how troubleshooting issues get resolved).

The assumption that I’m focused on here is this:

“From the looks of it Netlify is blocking GitHub.”

I don’t see proof of this yet. If you have proof that this is happening, however, please do share it here. While it is possible that we might be blocking requests from GitHub, I consider it highly unlikely, as we should be getting many reports about it. The fact that there is only a single report leads me to believe that something else is the root cause here. Again, I just want to call this out as unproven. (Again, if you have proof, we’d love to see it.)

I do see that the URL https://kollitsch.dev/rss.xml is hosted at Netlify, but I don’t see any 403 responses for that URL at all. We have only returned 200s and 304s for that URL. Also, I checked the access logs for your site and we have not sent any 403 responses for it at all in the last 30 days (for any URLs).

Please also note, I can only debug Netlify. I need to know what is happening at Netlify that is incorrect, not at GitHub. Would you please clarify exactly what HTTP request is being made to Netlify and what the incorrect response is? Also, all of our responses will have a unique x-nf-request-id header value. If you send us an x-nf-request-id value for an incorrect response, that will greatly aid the troubleshooting process.

For example, here are three consecutive requests to the feed URL:

$ curl --compressed -svo /dev/null --stderr - https://kollitsch.dev/rss.xml | egrep '^< x-nf-request-id'
< x-nf-request-id: 01HK6GCMYJDXJ4ZECMRGHHMJYN

$ curl --compressed -svo /dev/null --stderr - https://kollitsch.dev/rss.xml | egrep '^< x-nf-request-id'
< x-nf-request-id: 01HK6GCNVVP3EY4X9HCSX9ADRA

$ curl --compressed -svo /dev/null --stderr - https://kollitsch.dev/rss.xml | egrep '^< x-nf-request-id'
< x-nf-request-id: 01HK6GCPS2SHE5CY6M4ZW1V9SG

Each returned a unique x-nf-request-id value. If you send us the value for any incorrect responses, that will be helpful. We’ll definitely need additional information to troubleshoot so please share anything you believe is relevant here.
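As a rough sketch of how the status code and the x-nf-request-id value could be captured together for a single request: the function names below are made up for illustration and are not part of any Netlify tooling. It only relies on curl's -D (dump headers) and -w (write-out) options.

```shell
# Hypothetical helper names; nothing here is an official Netlify tool.

# Read raw HTTP response headers on stdin and print the x-nf-request-id value.
# Header names are matched case-insensitively; the last match wins.
extract_request_id() {
  tr -d '\r' | awk -F': ' 'tolower($1) == "x-nf-request-id" { print $2 }' | tail -n 1
}

# Fetch a URL, discard the body, and print the status code next to the request ID.
report_request() {
  url="$1"
  headers="$(mktemp)"
  # -D saves the response headers to a file, -w prints the HTTP status code
  status="$(curl --compressed -s -o /dev/null -D "$headers" -w '%{http_code}' "$url")"
  printf 'status=%s x-nf-request-id=%s\n' "$status" "$(extract_request_id < "$headers")"
  rm -f "$headers"
}
```

Calling report_request https://kollitsch.dev/rss.xml would then print one line per request, so a 403 and its request ID end up side by side.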

“From the looks of it” clearly indicates an assumption, not detailed knowledge or even a working theory. I don’t know if you can see the logs of the actual workflow here - if not, here is a quick screenshot:

I guess this is all the information I have about this, without access to GitHub servers and Netlify logs.

The 403 also implies some form of active block.

Now, if you read the last link, where I once again start with the triggering “from the looks of it”, there was an issue with a different provider of comparable services and it indeed was an issue on their side.

In my uneducated imagination something like this is happening:

  • GitHub workflow actions run from a bunch of servers behind a range of IPs
  • Many workflows run from one single IP address
  • someone (not me) does something that tries to access content hosted on Netlify
  • this someone has some form of loop or even intentional ddos-like script-kiddie workflow running that gets the IP blocked
  • my little workflow comes with its single request and the request is blocked “at the door” coming from a blacklisted IP without it arriving in any logs in connection with my website.

I have no deeper insight in the form of proper logs of what is happening on the GitHub side other than the logs above. The process works with the proxy service, so they can access the feed; I myself can load the feed from any server I try it from, and you can load the feed, so let’s put this discussion into the archive, as I have no energy to help debug a possible larger issue.

Hi, @davidsneighbour. If you want this issue dropped, consider it dropped. If you do wish to continue troubleshooting, we’ll help you but only on two conditions:

  • You communicate politely and respectfully going forward.
  • You provide usable troubleshooting information.

I do agree the “ddos-like script-kiddie workflow” hypothesis is plausible and I even considered it myself before replying to you yesterday. However, a hypothesis is only useful if you test it.

So, I did test it on my side by checking both access logs and the rate limiter logs for kollitsch.dev. I did this yesterday before replying to you. I found nothing. Nada. Zilch. There are no 403s or rate limit events for this site.

So, while the hypothesis seems plausible, my data only disproves it. This is why I asked if you had data that supports it. My data or methodology could be wrong in some way I don’t know about. Your additional information could then reveal the root cause and confirm (or further disprove) this hypothesis.

I also agree that the logs you shared implied Netlify did send a 403. However, they don’t contain any useful details and, without more details, there is no way to troubleshoot. This is why I said:

This is what we need to debug this:

  • the x-nf-request-id response header for a 403 response from Netlify

If that header information is not available, we need the following details:

  • the full URL requested
  • the client IP address making the request (the public egress IP address)
  • the server IP address that responded
  • the date, time, and timezone of the request

The screenshots and logs (which I can see as well) do not contain enough information to debug this. Netlify cannot get the required information for you. You can add additional logging to your GitHub action to reveal the required information but only you can do so. If you will not or cannot provide that information then we cannot assist you further.
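One possible shape for that additional logging is sketched below, as a shell snippet that could run in a workflow step on the same runner before the feed is fetched. Everything named here is an assumption for illustration: the function names are made up, and the public egress IP is looked up through the third-party service api.ipify.org, which is not something Netlify or GitHub provides.

```shell
# Hypothetical helper names; the ipify lookup is an assumption, any
# "what is my IP" service would do.

# Print one log line from already-collected values (no network access).
# $1 url, $2 client IP, $3 server IP, $4 timestamp, $5 status, $6 request ID
format_report() {
  printf 'url=%s client_ip=%s server_ip=%s time=%s status=%s x-nf-request-id=%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$6"
}

# Make one request and collect the details listed above.
collect_report() {
  url="$1"
  headers="$(mktemp)"
  # Public egress IP of the runner, as seen by an outside service
  client_ip="$(curl -s https://api.ipify.org || echo unknown)"
  # Date and time in UTC, so the timezone is unambiguous
  timestamp="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  # -w prints the status code and the IP address of the server that answered
  result="$(curl --compressed -s -o /dev/null -D "$headers" -w '%{http_code} %{remote_ip}' "$url")"
  status="${result%% *}"
  server_ip="${result#* }"
  request_id="$(tr -d '\r' < "$headers" | awk -F': ' 'tolower($1) == "x-nf-request-id" { print $2 }' | tail -n 1)"
  rm -f "$headers"
  format_report "$url" "$client_ip" "$server_ip" "$timestamp" "$status" "$request_id"
}
```

Running collect_report https://kollitsch.dev/rss.xml in the job would write a single line to the workflow log. Note that this tests the request as curl makes it; the action itself may use a different HTTP client and headers, so a 200 here alongside a 403 in the action would itself be a useful data point.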