I’m curious to get all of your thoughts on this. It’s no secret that AI has been growing exponentially over the last year. It feels like new models are being released almost every other day. That said, many of these models need a tremendous amount of data to train on. It’s also no secret that Reddit sells its users’ interactions to the highest bidder. This was partially the reason why they made the changes to the API limits that got many of us to move to the fediverse in the first place.

My question is: how does everyone feel knowing that multi-billion-dollar companies are scraping this instance and the others, creating extra load on the servers for nothing more than to be able to profit from it?

What can be done to continue providing a free, open network to users but prevent those who are only looking to profit from the data?

edit: fixed title typo

  • MachineFab812@discuss.tchncs.de · 7 upvotes · 3 months ago (edited)

    Scrape*, for your title.

    Meanwhile, preventing unpaid scraping was a big part of Reddit’s rationale for their enshittification, i.e., charging for API access.

    I would rather train an AI indirectly for free than ask random web hosts to run interference, which IRL works out to paywalling and selling user content.

    By asking Lemmy hosts to “prevent AI from seeing my content”, all you are really asking them to do is slap a price tag on it and hire lawyers to pursue companies/users that don’t pay. Not pay you or me, but them.

    • degen · 2 upvotes · 3 months ago

      Yeah, I’m more worried about the output of AI getting involved than anything regarding the input, at least as far as public forums go.