Hi folks! I added Netlify Analytics to my site, juangarcia.design, a few weeks ago, and I’m seeing far more pageviews than I had with GA. Most of them are requests for nonexistent files, such as robots.txt, wp-login.php, ads.txt, etc., from URLs I assume are bots or web crawlers.
Is there a way I can filter Referral URLs or Requests to certain files, so they don’t show up on my Analytics dashboard, and ensure accuracy on my reports? I don’t see that option anywhere on my site settings.
There isn’t any way to filter those results. We do count bot traffic, which GA does not, so that probably explains the discrepancy in total numbers.
One idea for getting those scanners’ 404s out of your sight: you could deploy empty files at those well-known paths so the requests return 200s. They probably won’t appear in your top pages, and they will use even less bandwidth than our 404 page (or a custom one you made).
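A minimal sketch of that workaround, assuming a build step that can run Python and a publish directory named "public" (both are assumptions, not something stated in this thread). The helper name create_placeholders is hypothetical; the three paths are the ones mentioned in the original post. It only creates empty files and never overwrites a file that already exists.

```python
from pathlib import Path

# Paths named in this thread; extend the list with whatever scanners request.
SCANNER_PATHS = ["robots.txt", "wp-login.php", "ads.txt"]


def create_placeholders(publish_dir, paths=SCANNER_PATHS):
    """Create an empty file for each path under publish_dir.

    Existing files are left untouched. Returns the list of files created.
    """
    root = Path(publish_dir)
    created = []
    for rel in paths:
        target = root / rel
        if target.exists():
            continue  # never overwrite a real file such as a hand-written robots.txt
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()  # zero-byte file, so the request is served as a 200
        created.append(target)
    return created
```

Calling create_placeholders("public") after the site build, and before deploying, would add the empty files to the published output. This only changes the status code of those requests; they are still counted as traffic.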
Hi there! Thanks for your reply, and your idea. Will definitely consider that!
I’m not all that concerned about bandwidth, though. I would love to have more accurate results by filtering those requests, as well as spam referral URLs if they show up on my site. I currently have around 1,500 pageviews/month, according to Netlify Analytics, and I know the vast majority of those aren’t real users, unfortunately.
Thanks for your thoughts! I’ll do that, but my issue is the accuracy of the results, not the 404 pages. I’m trying to find a workaround to get more accurate analytics: I don’t think bots and crawlers read my blog posts, so I don’t want them to show up as visitors.
We do have an open feature request to show a filtered view that removes bots. I’ve just added your name to it, so we will follow up here should we create that feature. I think it’s already halfway written, so I give reasonable odds on it arriving later this year.
This one is a bit more likely than your last feature request, to which I just responded, @Intranel! We’ll post in this thread if we do ship it.
As a customer of Netlify Analytics for many months, I still have no idea how many real visitors come to my website. I don’t even have a rough idea! So I really wonder how I can distinguish scrapers from real visitors. Is this service only suited to websites that have a large number of visitors?
Is there a threshold below which the numbers are 100% noise from scrapers? Even 20 real people a month would be big news for me.
Would love some input, and please add me to the list as well.