• Grimy@lemmy.world
    5 months ago

    Researchers are ringing the alarm bells, warning that companies like OpenAI and Google are rapidly running out of human-written training data for their AI models.

    There is so much more to this than the raw amount of data; it's not at all the bottleneck it seems to be. There's a lot of room for progress in how we clean the data, how we train, and the actual structure of the models.

    • 🔍🦘🛎@lemmy.world
      5 months ago

      Yeah if AI can’t pinpoint something when it has ALL OF HUMAN KNOWLEDGE to draw from, it’s not the fault of the data set

    • TheFriar@lemm.ee
      5 months ago

      Right? What happened to that whole "there are millions of pages of text being generated by all internet users every minute" thing that people used to say? Look at Lemmy alone. Look how much text we are putting into the ether every day. They're never going to run out of text unless people stop typing. Is this not a fake problem?