
Rate limiting bots? (20000 visits every few days)

Every few days I get 20000 visits from a bot in the Netherlands. The bot isn’t crawling the site, it’s just visiting one page 20000 times. It also requests the favicon.ico and apple-touch-icon.png.

The page is relatively small and it doesn’t cost me any money, but I’d like to stop it anyway, as each time it does this it uses enough electricity to power my laptop for 140 days, and frankly I just think it’s wasteful.

Is there any way of rate limiting bots like this? Or blocking it? Or just appeasing it?

…and does anyone have any idea what it could be doing?

hi @ShadowfaxRodeo - how strange! is it maybe a scraper - does the page it visits contain anything worth scraping?

generally speaking, our policy at Netlify has been that we don’t limit traffic to sites on our service, but I do agree this sounds wasteful.

can you tell me a bit more about the bot? do you have an IP address?

Hi @perry, sorry for the delayed reply. Thanks for your help. I’m using Netlify Analytics, so the information I’ve given you is pretty much all I have, unless there’s a way of getting the IP address. I know it’s in the Netherlands.

Here’s the page: Pattern Generator.

I guess you could write a scraper for generating patterns, but I’m not sure why anyone would do that, it would be quite difficult to build. There would also be a much easier way of doing it.

Well, I don’t see much other than what you can see on our side. If you wanted to block it, you could create a Netherlands-specific redirect to block your site from there, but it seems like a bit of a large hammer for this problem.

While I do see the two big spikes in our traffic logs from a few weeks ago, I don’t see any since 29 Mar, so maybe they’ve abandoned their fruitless scraping?

In case it’s of interest at all, this is their user agent:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.4 (KHTML, like Gecko) Version/9.0.1 Safari/601.2.4 facebookexternalhit/1.1 Facebot Twitterbot/1.0

I doubt that it is really Facebook or Twitter, since neither of them mentions the other in its usual user agent. But you could create custom JS to handle that UA if you wanted to return lighter content to it or something. This is not exactly tested, but it comes from an example another customer gave me and could inspire your workflow if you decide to do this:

// Skip the heavy work when the user agent matches the suspicious bot.
var isBadBot = /Facebot Twitterbot/.test(window.navigator.userAgent);
if (!isBadBot) { /* ... do person-browser only stuff */ }
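If it helps, here is a slightly more defensive sketch of the same idea. It is untested against the real bot, and the function name `isSuspectedBot` is made up for illustration. It assumes the bot keeps claiming to be both Facebook's and Twitter's crawler in one user agent, and it checks for each token separately so a change in their order would not break the match. Bear in mind this runs client-side, so it only trims what the page does after loading; it does not stop the requests themselves.

```javascript
// Hypothetical helper: flags user agents that claim to be both
// Facebook's and Twitter's crawler at once, as the bot in this thread does.
function isSuspectedBot(userAgent) {
  var ua = String(userAgent || '').toLowerCase();
  var claimsFacebook =
    ua.indexOf('facebookexternalhit') !== -1 || ua.indexOf('facebot') !== -1;
  var claimsTwitter = ua.indexOf('twitterbot') !== -1;
  return claimsFacebook && claimsTwitter;
}

// Browser usage: only run the heavy, human-only code for other visitors.
if (typeof navigator !== 'undefined' && !isSuspectedBot(navigator.userAgent)) {
  // ... do person-browser only stuff
}
```

A user agent that mentions only one of the two crawlers is left alone, so genuine link previews from either service should still get the full page.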