• @anachronist
    38 months ago

    Models don’t get bigger as you add more stuff.

    They get less coherent and/or “forget” earlier data if you don’t increase the parameter count along with the training set.

    There are two-gigabyte networks that have been trained on hundreds of millions of images.

    You can take a huge TIFF of an image, put it through JPEG with the quality cranked all the way down, and get a tiny file out the other side that is still a recognizable derivative of the original. LLMs are an extremely lossy compression of their training set.

    • @mindbleach@sh.itjust.works
      48 months ago

      which is still a recognizable derivative of the original

      Not in twelve bytes.

      Deep models are a statistical distillation of a metric shitload of data. Smaller models with more training on more data don’t get worse, they get more abstract - and in adversarial uses they often kick big networks’ asses.
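
For a sense of scale on the "twelve bytes" figure, here is a rough back-of-the-envelope sketch in Python. It assumes "two-gigabyte" means 2 × 10^9 bytes, and the image counts are illustrative picks within "hundreds of millions", not numbers from any specific model. The helper name `bytes_per_image` is made up for this example.

```python
def bytes_per_image(model_bytes: int, n_images: int) -> float:
    """Average number of model bytes available per training image."""
    return model_bytes / n_images


# Assumption: "two-gigabyte" taken as decimal gigabytes.
MODEL_BYTES = 2 * 10**9

# Assumption: illustrative training-set sizes in the "hundreds of millions" range.
for n_images in (100_000_000, 170_000_000, 500_000_000):
    share = bytes_per_image(MODEL_BYTES, n_images)
    print(f"{n_images:>11,} images -> {share:.1f} bytes each")
```

Around 170 million images, the share works out to roughly twelve bytes per image, which is far below what even the most aggressive JPEG setting needs to keep a picture recognizable.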