• L_Acacia@lemmy.one
    link
    fedilink
    English
    arrow-up
    6
    arrow-down
    1
    ·
    4 months ago

    The training doesn’t use csam, 0% chance big tech would use that in their dataset. The models are somewhat able to link concept like red and car, even if it had never seen a red car before.

    • AdrianTheFrog@lemmy.world
      link
      fedilink
      English
      arrow-up
      3
      ·
      4 months ago

      Well, with models like SD at least, the datasets are large enough and the employees are few enough that it is impossible to have a human filter every image. They scrape them from the web and try to filter with AI, but there is still a chance of bad images getting through. This is why most companies install filters after the model as well as in the training process.