The website in question is heynode.com (stoic-kilby-3833c6). Its bandwidth usage has been steadily increasing for the last month and a half. I was recently charged for an overage, which is what brought this to my attention. However, neither Netlify Analytics (I paid for the analytics package hoping it would give me more useful information, but it did not) nor Google Analytics shows any real increase in visitors to the site. So I’m not sure what is causing the steady increase in bandwidth, and I would like some help tracking it down so I can get it under control.
I’m wondering if it’s something like a hotlinked image, a crawler, or some other scenario where the request doesn’t look like a request for a standard page, or is made without JavaScript, and is thus not picked up by tools like Google Analytics.
Any help in tracking down what is going on here would be greatly appreciated.
Looking into your traffic for the past 30 days, the URLs accessed all look fairly evenly spread out, but one user agent in your visitor list looks odd. Your top user agent is [null], or blank, which used about 115GB last month, mostly from IP addresses in Ireland.
This might be bot scraping if it seems unexpected to you. You can always block user agents you don’t want visiting your sites (in this case, any request with no user agent), or you can block by location, for example by creating an Edge Function. We have examples of those here:
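Independently of those linked examples, here is a rough sketch of what such an Edge Function might look like. It is only an illustration, not an official Netlify example: the `shouldBlock` helper, the `BLOCKED_COUNTRIES` set, and the local `EdgeContext` type are names made up for this post, and it assumes the country code is exposed as `context.geo.country.code`. In a real project the file would typically live somewhere like `netlify/edge-functions/block-empty-ua.ts` (a hypothetical file name), and the context type would normally be imported from `@netlify/edge-functions` rather than declared by hand.

```typescript
// Minimal local stand-in for the parts of the edge function context used here.
// Assumption: the real context exposes the visitor's country as geo.country.code.
interface EdgeContext {
  geo?: { country?: { code?: string } };
}

// Hypothetical list of ISO country codes to refuse. Empty by default, so only
// requests without a user agent are blocked unless codes are added here.
const BLOCKED_COUNTRIES: ReadonlySet<string> = new Set<string>();

// Decide whether a request should be rejected, based on its User-Agent header
// and (optionally) the country it comes from.
export function shouldBlock(
  userAgent: string | null,
  countryCode?: string,
  blockedCountries: ReadonlySet<string> = BLOCKED_COUNTRIES,
): boolean {
  // A missing or whitespace-only User-Agent is treated as "no user agent".
  if (userAgent === null || userAgent.trim() === "") {
    return true;
  }
  // Optionally reject requests from listed countries (case-insensitive match).
  if (countryCode !== undefined && blockedCountries.has(countryCode.toUpperCase())) {
    return true;
  }
  return false;
}

// Edge function entry point: answer blocked requests with a small 403 response.
export default async function handler(
  request: Request,
  context: EdgeContext,
): Promise<Response | undefined> {
  const userAgent = request.headers.get("user-agent");
  const countryCode = context.geo?.country?.code;

  if (shouldBlock(userAgent, countryCode)) {
    return new Response("Forbidden", { status: 403 });
  }
  // Returning undefined lets the request continue to the site as usual.
  return undefined;
}

// Run this function for every path on the site.
export const config = { path: "/*" };
```

Blocking an entire country is a blunt tool and would also turn away real visitors from there, so the sketch leaves the country list empty and relies on the empty user agent check alone. If you do add codes to `BLOCKED_COUNTRIES`, keep them uppercase (for example `"IE"`), since the lookup uppercases the incoming code before matching.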
I hope this helps, let me know if you have any follow-up questions. Thanks!
Is there any way that I could have seen this myself? I don’t see anything in the analytics that would tell me I’m getting a bunch of requests with no user agent like this. Is it possible to gain access to the logs? Given that these requests are not showing up in Google Analytics, I’m guessing it’s some kind of crawler/bot, and I would like to keep tracking down what is happening.
In the meantime I’ll look into implementing an Edge Function like you suggest. Thanks.
Our analytics feature provides a high-level overview of traffic, but doesn’t offer this fine-grained level of detail. That detail is available through our log drains feature, which is unfortunately an Enterprise-only feature.