This should have end up differently

44 Posts 19 Posters 0 Views
  • rekabis@lemmy.ca

    Lemmy.world will send my instance hundreds of thousands if not millions of requests a day, in a near steady stream. It's telling my instance about every post, comment, or vote.

    And yet, federation means that each instance should know all the other domain names, yes? So do daily DNS lookups for all the domains associated with federation and auto-whitelist the resulting IP addresses.

    Sure, if you then have to configure Cloudflare with these IPs, doing so automatically will require an API.

    But otherwise if you are running some sort of throttling protection on the actual box or VM the instance is sitting on, it should be rather trivial to update it directly, especially if said throttling software is doing Linux correctly and drawing its whitelist from a flat file.
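A minimal sketch of the daily lookup described above, assuming the list of federated domains is already available from the instance. The function names, the port, and the one-address-per-line file format are illustrative assumptions, not part of Lemmy or of any particular throttling tool.

```python
import socket
from pathlib import Path
from typing import Callable, Iterable


def resolve_ips(domain: str) -> set[str]:
    """Return every address the system resolver reports for a domain."""
    try:
        infos = socket.getaddrinfo(domain, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return set()  # unresolvable today; skip it rather than abort the run
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def build_allowlist(
    domains: Iterable[str],
    resolver: Callable[[str], set[str]] = resolve_ips,
) -> list[str]:
    """Resolve each federated domain and return a sorted, de-duplicated IP list."""
    ips: set[str] = set()
    for domain in domains:
        ips |= resolver(domain)
    return sorted(ips)


def write_allowlist(ips: Iterable[str], path: Path) -> None:
    """Write one address per line, replacing the previous file contents."""
    path.write_text("".join(f"{ip}\n" for ip in ips))
```

Run from a daily scheduled job, this would regenerate the flat file that the throttling software reads. The resolver is passed in as a parameter so the logic can be checked without touching the network.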

    dave@lemmy.nz
    wrote last edited by
    #41

    New instances (and not just Lemmy instances, but Mastodon and other fediverse instances) are coming online all the time, so you need a way to let them through to start the federation process. There are thousands, so it needs to be automatic: you can't require that a new instance send whitelisting requests to every server one of its users might want to interact with (instances aren't linked unless a local user subscribes to something on a remote instance).

    Given that the AI bots seem to be indiscriminately scraping web pages, I excluded API endpoints from blocking anyway. Another admin showed me a nice Cloudflare rule to do this. Media can still be a problem, though: it is loaded by individual users on other instances, so it's hard to block scrapers without blocking users. That is another way Cloudflare helps, since static media files are easily cached by their CDN.
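The actual Cloudflare rule isn't shown in the thread, so the following is only a sketch of the decision it describes, written in Python rather than in Cloudflare's rule syntax. The path prefixes and user-agent markers are hypothetical placeholders; a real instance would substitute its own.

```python
# Hypothetical values: the real prefixes and markers depend on the instance and its proxy.
EXEMPT_PREFIXES = ("/api/", "/inbox", "/.well-known/")
SUSPICIOUS_AGENT_MARKERS = ("bot", "crawler", "scrapy")


def is_exempt(path: str) -> bool:
    """API and federation paths are never blocked, whatever the client looks like."""
    return path.startswith(EXEMPT_PREFIXES)


def decide(path: str, user_agent: str) -> str:
    """Return "allow" or "challenge" for one request."""
    if is_exempt(path):
        return "allow"
    agent = user_agent.lower()
    if any(marker in agent for marker in SUSPICIOUS_AGENT_MARKERS):
        return "challenge"
    return "allow"
```

The exemption is checked first, so federation traffic gets through even when the sending software's user agent would otherwise look suspicious.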

    • dave@lemmy.nz

      Fundamentally, what I’m suggesting is a fork in the road. Either an instance admin can set up to eliminate scrapers by making the instance private to only registered users,

      Yeah, it would perhaps require more changes (since instances newly subscribed to a community need the ability to fetch content ad hoc), but even just not showing the website when someone isn't logged in would probably make a big difference. That might be pretty easy: redirect requests that load the web app (except the login page) to the login page, and exclude the API. Apps would still get logged-out access, but I doubt that's much of a problem compared to the website, since the bots seem to just be indiscriminately scraping web pages.
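The redirect idea above can be sketched in a few lines. The login path and API prefix here are assumptions for illustration, and the function stands in for whatever the reverse proxy or web app would actually do.

```python
LOGIN_PATH = "/login"  # assumed location of the login page
API_PREFIX = "/api/"   # assumed prefix for API routes


def gate(path: str, logged_in: bool) -> tuple[int, str]:
    """Return (status, location): 200 serves the path, 302 redirects to login."""
    if logged_in or path == LOGIN_PATH or path.startswith(API_PREFIX):
        return 200, path
    return 302, LOGIN_PATH
```

Under these assumptions a logged-out request for a community page is redirected, while the login page and API routes are served unchanged, which matches the logged-out app access mentioned above.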

      cooper8@feddit.online
      wrote last edited by
      #42

      Definitely true.


        rekabis@lemmy.ca
        wrote last edited by rekabis@lemmy.ca
        #43

        you need a way to let them through to start the federation process.

        Isn’t this done via an API endpoint meant explicitly for that purpose, one that bots would not normally use?

        And why not have a process by which admins from a new instance poke the admins of another instance - any other instance, so long as it’s already a part of the network - to do an initial manual whitelist that could cascade through the entire system?

        Then there should be ways that the software itself can authenticate with other instances of itself, via a common encryption protocol. While this would only work between instances of the same software, the key point is that only a toehold is needed to start propagating.

        The point being, there are options. Some of them quite simple.
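One way to picture the cascading whitelist is as reachability over a graph of vouches. This is a hypothetical model, not an existing fediverse mechanism: `vouches` maps an instance to the newcomers its admins have manually allowlisted, and `seeds` are the instances the local admin already trusts.

```python
from collections import deque


def cascade(vouches: dict[str, set[str]], seeds: set[str]) -> set[str]:
    """Return every instance reachable from the seeds through a chain of vouches."""
    trusted = set(seeds)
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        for newcomer in vouches.get(current, set()):
            if newcomer not in trusted:
                trusted.add(newcomer)
                queue.append(newcomer)
    return trusted
```

In this model a single manual vouch from any trusted instance is the toehold: the newcomer, and anything it later vouches for, becomes trusted everywhere the seed is trusted. Vouches from instances outside that chain have no effect.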


          dave@lemmy.nz
          wrote last edited by
          #44

          Realistically, federation is not the main concern. You can leave all your API endpoints open to bots and not have a problem, because the bots are loading the web app rather than the API. Just block the web app for suspicious traffic.

          ActivityPub already uses authentication to some extent with other instances; it's the first contact where you have to have trust.

          My main concern is still that media is loaded directly by users in most cases. The APIs are not a problem right now, as the bots aren't specifically targeting Lemmy. There are ways to address this, but Lemmy (and other threadiverse services) don't have full-time dev teams; they work on what they can or want to work on, given the very low hourly rate.
