• DigitalDilemma@lemmy.ml · 27 days ago

    Surprised at the level of negativity here. Having had my sites repeatedly DDoSed offline by Claudebot and others scraping the same damned thing over and over again, thousands of times a second, I welcome any measures to help.

    • dan@upvote.au · 26 days ago

      thousands of times a second

      Modify your Nginx (or whatever web server you use) config to rate limit requests to dynamic pages, and cache them. For Nginx, you’d use either fastcgi_cache or proxy_cache depending on how the site is configured. Even if the pages change a lot, a cache with a short TTL (say 1 minute) can still help reduce load quite a bit while not letting them get too outdated.
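
      A minimal sketch of what that could look like in Nginx, assuming a FastCGI backend such as PHP-FPM; the paths, zone names, socket and rates below are placeholder values rather than anything from a real setup:

          # Cache for dynamic (FastCGI) responses plus a per-IP request-rate zone.
          fastcgi_cache_path /var/cache/nginx levels=1:2 keys_zone=appcache:10m inactive=10m;
          limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;

          server {
              listen 80;
              server_name example.com;

              location ~ \.php$ {
                  limit_req zone=perip burst=20 nodelay;            # throttle dynamic requests
                  include fastcgi_params;
                  fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
                  fastcgi_pass unix:/run/php/php-fpm.sock;

                  fastcgi_cache appcache;
                  fastcgi_cache_key "$scheme$request_method$host$request_uri";
                  fastcgi_cache_valid 200 301 1m;                   # short TTL keeps pages fresh enough
                  fastcgi_cache_use_stale updating error timeout;   # serve stale copies during spikes
                  add_header X-Cache-Status $upstream_cache_status; # handy for checking hits/misses
              }
          }

      If the app sits behind proxy_pass instead of FastCGI, the same idea applies with proxy_cache_path / proxy_cache.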

      Static content (and cached content) shouldn’t cause issues even if requested thousands of times per second. Following best practices like pre-compressing content using gzip, Brotli, and zstd helps a lot, too :)
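
      For the pre-compression side, a rough sketch (gzip_static ships with Nginx but may need to be compiled in; Brotli and zstd support come from third-party modules, so treat those lines as assumptions about your build):

          location /assets/ {
              gzip_static   on;   # serves app.js.gz when the client accepts gzip
              brotli_static on;   # needs the ngx_brotli module; serves app.js.br
              # zstd would need a third-party zstd module compiled in
              expires 1h;
              add_header Cache-Control "public, max-age=3600";
          }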

      Of course, this advice is just for “unintentional” DDoS attacks, not intentionally malicious ones. Those are often much larger and need a different kind of protection - usually something at the network or load-balancer level before traffic even hits the server.

      • DigitalDilemma@lemmy.ml · 25 days ago

        Already done, along with a bunch of other stuff, including Cloudflare WAF and rate-limiting rules.

        I’m still annoyed that it took over a day of my life to finally (so far) restrict these things, and several more days to offload the problem to Cloudflare Pages for sites that I previously self-hosted but that my rural link couldn’t support.

        this advice is just for “unintentional” DDoS attacks, not intentionally malicious ones.

        And I don’t think these high-volume AI scrapes are unintentional DDoS attacks. I consider them entirely intentional: not deliberately malicious, but negligent to the point of criminality. (Especially in requesting the same pages so frequently, and in all of them ignoring robots.txt.)