Why is my site invoking hundreds of Next.js Server Handler functions even with low usage?

That is weird indeed. As for the analytics, you should be able to see a 24-hour view by changing the time range at the top. Other than the regex being insufficient, I’m not sure what may be going on.

Ah, I see that. When I set it to the last 24 hours and watched for a few minutes, I saw the 404 counts for specific pages increasing, so those requests are not being handled by the edge function, even though when I request such pages myself, I get the 404 response returned by the edge function.

In case I wasn’t being very clear, if I request /blog/page/2, I get the 404 returned by the edge function:

[screenshot of the 404 page returned by the edge function]

And the pattern matching is correct, because if I request /blog/page/abc, I get my site’s 404 from Next.js instead.

And yet I see the 404 count for /blog/page/2 in the “Top resources not found” counter.

Oh, now that I think about it, I don’t think the edge function will have any impact on the 404s you see in the analytics, because those include the 404s returned by the edge function.
So you should be seeing a reduction in the 404s of the serverless function (which, for some reason, you are not), but you shouldn’t see any reduction in the analytics…
The question now is how you’ll find which 404 addresses are being requested that bypass your edge function and hit the serverless function. Maybe you could try logging those addresses somehow inside Next (I haven’t tested this)?
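For example (just an idea, untested): a pass-through edge function matched on every path could log whatever comes back as a 404, at the cost of running on every request. Something like:

import type { Config, Context } from '@netlify/edge-functions'

export default async function (request: Request, context: Context) {
  // let the request continue to whatever would normally handle it
  const response = await context.next()

  // log only the paths that came back as 404, so they show up
  // in the edge function logs
  if (response.status === 404) {
    console.log('404 path:', new URL(request.url).pathname)
  }

  return response
}

export const config: Config = {
  path: '/*',
}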

@hrishikesh @SamO is there any way to get the full analytics, i.e. all the addresses being requested, so that @jfrank14 can see which other 404s he needs to block? I would expect that paying for analytics would give us more than just the top 15 requests…

@fvieira @jfrank14 Thanks so much for raising this concern regarding 404 requests causing function invocations. We’ve escalated it to our developers for greater visibility on this issue affecting you two as well as other customers on Netlify. While I don’t have an immediate workaround outside of what you two have been discussing here, we will be looking at potential solutions going forward.

Regarding full analytics for @jfrank14 to better understand what needs to be blocked, we’re looking into what information we can provide that would be helpful. I do see a ticket already exists in our helpdesk, #297858. We will be following up there when we have more information to share.

If y’all have additional questions, feel free to raise them here and we’ll do our best to address them!

@sid.m I’m no longer sure that having the full list of 404s would help, since it’s clear that even the 404s that are blocked by my edge function still show up in this list. Having the rest of that list would possibly let me block other patterns, but that won’t help me, because blocking them doesn’t seem to have any effect.

This function invocation situation is a serious problem, because it means that sites are charged for something that they can’t control and that isn’t organic usage. According to the site analytics (and more or less corroborated by Google Analytics), my site is serving around 300 pages a day. This is a low volume site that’s important to me but doesn’t bring in any revenue, so putting it on a free (for low usage) provider makes sense. But if spambots asking for pages that don’t even exist can force me to spend $25/month on this, it’s no longer cost competitive with GoDaddy or some other hosting provider that doesn’t charge in this way (and which also provides 24/7 phone tech support).

@jfrank14 I completely understand your concerns here and genuinely appreciate the feedback regarding your experience on the platform. I’ve communicated that feedback to our development team for additional context on this issue.

Is there anything we can do in the meantime to help you have a better experience? You stated the full list of 404s wouldn’t be helpful; is there anything we could provide that would be?

Can you comment on whether blocking a URL via an edge function should prevent a function invocation? @fvieira seems to have had success doing this, but when I did the same thing, it didn’t change anything. It’s not clear to me from the docs what does or does not invoke a function, and whether blocking via an edge function means that it does NOT invoke a server function. Certainly, I AM blocking 404s this way, and yet they are getting through as invocations anyway.

Hi @jfrank14 - we’ve followed up on your ticket in our helpdesk to get more details about how you’re implementing that edge function. Please take a look and follow up there! Thanks so much.

I have responded to that ticket. The advice isn’t really helpful because it doesn’t answer the basic question about why url patterns that are clearly being handled and rejected by the edge function are still showing up in “Top resources not found”. I also can’t get a clear answer about whether such pages count as function invocations.

It is now only 1/3rd of the way through the month and I have again used 75% of my function invocations, which is going to result in another upcharge, and I am getting very frustrated here.

@jfrank14 we’ve followed up in your ticket.

Hi @sid.m I wonder if you have any update concerning this issue?

I’m facing the same problem with my site at elaborate-cactus-b60dfa. A few days ago I received this email:

" [Netlify] You’ve reached 50% of your invocations allowance on elaborate-cactus-b60dfa"

Your site is popular!

Your invocations usage on site elaborate-cactus-b60dfa has reached 50% of the included allowance for Functions Level 0 (free) in the current billing cycle from October 9 to November 9.

If it passes 100% before the end of the billing cycle, we’ll upgrade to Functions Level 1. Check out your site dashboard for current usage details.


But my site is NOT popular! It doesn’t have more than a hundred visitors a day. Yet, Netlify’s “Function Next.js Server Handler” shows these stats as of today:

Usage from Oct 9 to Nov 9 (Functions Level 0; last daily update today at 10:10 AM)

  • Requests (counts every time a function was invoked in the current billing period): 69,705 out of 125,000, with 55,295 requests left
  • Run time (combined run time of all function requests in the current billing period): 3 hours out of 100 hours, with 97 hours left


69,705 requests? Where is that coming from? That’s certainly not visitors to my site.

(Btw, I don’t even have functions deployed, as I don’t need them. My site is a simple static site. I even have Next.js image optimizations disabled.)
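(If anyone wants to do the same, I believe the standard way to turn the optimizer off is the unoptimized flag in next.config.js; I’m assuming that’s the setting in play here:)

// next.config.js: disable Next.js image optimization entirely
module.exports = {
  images: {
    unoptimized: true,
  },
}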

That means that, at the current rate, I’ll soon be upgraded to Functions Level 1, sometime before the end of the current cycle.

That makes no sense to me. I certainly don’t want to have to upgrade because of this spike of invocations from I don’t even know who or what, which appears to have come out of nowhere.

Is there any solution so that I don’t get upgraded?

Thanks

It sounds like your Next.js site on Netlify is experiencing unexpectedly high function invocations, likely due to factors such as frequent API calls, server-side rendering (SSR) on every page load, or client-side behavior triggering functions excessively. To address this, review your code and dependencies to identify any patterns in the function logs, optimize your site by switching to Static Site Generation (SSG) where possible, and consider rate limiting functions that are invoked too often. I hope this works for you :)
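For example, a minimal sketch of the SSG idea (a hypothetical pages/hello.tsx, not code from the site in question): a page whose getStaticProps has no revalidate is rendered once at build time, and Next.js serves the prerendered HTML instead of rendering it per request:

// hypothetical pages/hello.tsx: fully static, rendered once at build time
export default function Hello({ builtAt }: { builtAt: string }) {
  return <p>Built at {builtAt}</p>
}

export async function getStaticProps() {
  // no `revalidate` here, so the page is never regenerated at runtime
  return { props: { builtAt: new Date().toISOString() } }
}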

Hi @sid.m, we’re also having this issue again. I had solved it with an edge function that blocked requests for PHP files, but now I’m having the problem again.

Probably we’re just getting spammed differently, in a way the edge function is not blocking. I would update the edge function if I knew what requests we were being hit with, but the only way I know of to see which requests we’re getting is to subscribe to Netlify Analytics (again), which doesn’t even give a decent look at the 404 requests, only the top 15 or so.

Is there any way to get decent request logs other than this?

Thanks in advance.

Hey @leoloso! I took a peek at your traffic, and it looks like you have some crawlers on your site, as I see your highest user agent is “null”. If you want to avoid high traffic from bots, you can block them using either a robots.txt or Edge Functions, which we outline here:
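For instance, a minimal user-agent filter in an edge function could look something like this (a rough sketch along those lines, not the exact code from that guide):

import type { Config, Context } from '@netlify/edge-functions'

// very broad patterns: these also block legitimate crawlers like Googlebot,
// so tune the list to the bots you actually want to keep out
const BLOCKED = [/bot/i, /crawler/i, /spider/i]

export default async function (request: Request, context: Context) {
  const ua = request.headers.get('user-agent')
  // block requests with no user agent ("null") or a bot-like one
  if (!ua || BLOCKED.some((re) => re.test(ua))) {
    return new Response('Forbidden', { status: 403 })
  }
  return context.next()
}

export const config: Config = {
  path: '/*',
}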

Let me know if you have any other questions. Thanks!

Found a way to get information about the 404s without having to use Netlify Analytics.

Basically, I am logging the current path on my 404 page. I also added revalidate: 30 to getStaticProps so that it logs once every 30 seconds (assuming there are 404 requests coming in all the time).

// pages/404.js
import { getClient } from 'lib/sanity.server';
import { useRouter } from 'next/router';

import Layout from 'components/Layout';
import page404groq from 'groq/page404.groq';
import Page404Template from 'templates/Page404';

export default function Page404({ data, preview }) {
  const router = useRouter();
  // asPath is the path the visitor actually requested; logging it during
  // server-side rendering makes it show up in the function logs
  const currentPath = router.asPath;
  console.log('404 url path', currentPath);

  return (
    <Layout data={data} preview={preview}>
      <Page404Template />
    </Layout>
  )
}

export async function getStaticProps({ preview = false }) {
  const data = await getClient(preview).fetch(page404groq)

  return {
    // regenerate (and therefore log) at most once every 30 seconds
    revalidate: 30,
    props: {
      data,
      preview,
    },
  }
}

With this I found the requests were for WordPress files that didn’t end with .php (which is why my filter wasn’t catching them), so I updated my edge function as below. I’m now waiting to see if it works, but I see no reason why it wouldn’t.

import type { Config } from '@netlify/edge-functions'

export default async function () {
  const html404 =
    '<!DOCTYPE html><html><head><title>404 Not Found</title></head><body><h1>404 Not Found</h1></body></html>'

  return new Response(html404, {
    status: 404,
    headers: {
      'Content-Type': 'text/html',
      // cache the 404 on Netlify's CDN so repeat hits don't re-run the function
      'netlify-cdn-cache-control':
        'durable, immutable, max-age=31536000, public',
    },
  })
}

export const config: Config = {
  cache: 'manual',
  // match any path containing wp-admin, wp-content, wp-includes, or
  // .htaccess, or ending in .php (any casing)
  pattern:
    '^.*((wp-admin|wp-content|wp-includes|\\.htaccess).*|\\.[Pp][Hh][Pp])$',
}
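For anyone adapting the pattern, here’s how I’d sanity-check it against sample paths (just a quick local test, not part of the deployed function):

// quick local test of the blocking pattern (run with node or deno)
const pattern =
  /^.*((wp-admin|wp-content|wp-includes|\.htaccess).*|\.[Pp][Hh][Pp])$/

for (const path of [
  '/wp-content/uploads/x.jpg', // blocked: contains wp-content
  '/xmlrpc.php', // blocked: ends in .php
  '/blog/page/2', // not blocked
]) {
  console.log(path, pattern.test(path))
}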

Thanks @charlottechhum! I’ve now added a robots.txt blocking all AI bots:

User-agent: AI2Bot
User-agent: Ai2Bot-Dolma
User-agent: Amazonbot
User-agent: anthropic-ai
User-agent: Applebot
User-agent: Applebot-Extended
User-agent: Bytespider
User-agent: CCBot
User-agent: ChatGPT-User
User-agent: Claude-Web
User-agent: ClaudeBot
User-agent: cohere-ai
User-agent: Diffbot
User-agent: FacebookBot
User-agent: facebookexternalhit
User-agent: FriendlyCrawler
User-agent: Google-Extended
User-agent: GoogleOther
User-agent: GoogleOther-Image
User-agent: GoogleOther-Video
User-agent: GPTBot
User-agent: iaskspider/2.0
User-agent: ICC-Crawler
User-agent: ImagesiftBot
User-agent: img2dataset
User-agent: ISSCyberRiskCrawler
User-agent: Kangaroo Bot
User-agent: Meta-ExternalAgent
User-agent: Meta-ExternalFetcher
User-agent: OAI-SearchBot
User-agent: omgili
User-agent: omgilibot
User-agent: PerplexityBot
User-agent: PetalBot
User-agent: Scrapy
User-agent: Sidetrade indexer bot
User-agent: Timpibot
User-agent: VelenPublicWebCrawler
User-agent: Webzio-Extended
User-agent: YouBot
Disallow: /

Hopefully the problem will go away. I’ll keep an eye on it.

Thanks!