• archomrade [he/him]
    14 days ago

    Look, I get that we all are very skeptical and cynical about the usefulness and ethics of AI, but can we stop with the reactive headlines?

    Saying we know how AI works because it’s ‘just predicting the next word’ is like saying I know how nuclear energy works because it’s ‘just a hot stick of metal in a boiler’.

    Researchers who work on transformer models understand how the algorithm works, but they don’t yet know how their simple programs can generalize as much as they do. That’s not marketing hype, that’s just an acknowledgement of how relatively uncomplicated their structure is compared to the complexity of its output.

    I hate that we can’t just be mildly curious about AI, rather than either extremely excited or extremely cynical.

    • ProfessorOwl_PhD [any]@hexbear.net
      14 days ago

      If you don’t understand how your algorithm is reaching its outputs, you obviously don’t understand the algorithm. Knowing what you’ve made is different to understanding what it does.

      • archomrade [he/him]
        14 days ago

        > Knowing what you’ve made is different to understanding what it does.

        Agree, but also - understanding what it does is different to understanding how it does it.

        It is not a misrepresentation to say ‘we have no way of observing how this particular arrangement of ML nodes responds to a specific input differently from another arrangement’. The best we can do is probe the network like we do with neuron clusters and see what each part does under different stimuli. That uncertainty is meaningful, because without a way to understand how small changes to the structure result in apparently very large differences in output, we’re basically just groping around in the dark. We can observe differences in the outputs of two different models, but we can’t meaningfully see the node activity in any way that makes sense or is helpful. The things we don’t know about LLMs are some of the same things we don’t know about neurobiology, and they are just as significant to remedying the dysfunctions and limits of both.
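        To make "probing" a bit more concrete, here is a rough toy sketch, not how any real interpretability tool works. The weights, the `forward` and `probe` helpers, and the three-unit layout are all made up for illustration. It feeds a few stimuli to a tiny network, records which hidden units fire, and knocks out one unit at a time to see how much the output moves.

```python
# Toy probing sketch. All weights below are hypothetical, set by hand
# purely for illustration: 2 inputs -> 3 hidden units (ReLU) -> 1 output.
W_HIDDEN = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0]]
B_HIDDEN = [-0.5, -0.5, -0.5]
W_OUT = [0.2, 1.0, 1.0]


def relu(x):
    return max(0.0, x)


def forward(x, ablate=None):
    """Return (hidden activations, output). If `ablate` is an index,
    that hidden unit is forced to zero before computing the output."""
    hidden = [
        relu(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W_HIDDEN, B_HIDDEN)
    ]
    if ablate is not None:
        hidden[ablate] = 0.0
    out = sum(w * h for w, h in zip(W_OUT, hidden))
    return hidden, out


def probe(stimuli):
    """For each stimulus, record hidden activations and how much the
    output drops when each hidden unit is knocked out in turn."""
    report = {}
    for x in stimuli:
        hidden, base = forward(x)
        effects = [base - forward(x, ablate=i)[1] for i in range(len(W_HIDDEN))]
        report[x] = {"activations": hidden, "ablation_effect": effects}
    return report


report = probe([(1, 0), (0, 1), (1, 1)])
for stimulus, result in report.items():
    print(stimulus, result)
```

        With three units you can read the whole story off the printout. The point in the comment above is that this stops being legible once there are billions of units and no hand-picked weights.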

        The fear is that even if we believe what we’ve made thus far is an inert but elaborate Rube Goldberg machine (that’s prone to abuse and outright fabrication) that looks like ‘intelligence’, we still don’t know if:

        • what we think intelligence looks like is what it would look like in an artificial recreation
        • changes we make to its makeup might accidentally stumble into something more significant than we intend

        It’s frustrating that this field is getting so much more attention and resources than I think it warrants, and the reason it’s getting so much attention in a capitalist system is honestly enraging. But it doesn’t make the field any less intriguing, and I wish all discussions of it didn’t immediately get dismissed as overhyped techbro garbage.

        • ProfessorOwl_PhD [any]@hexbear.net
          14 days ago

          OK, I suppose I see what you’re saying, but I think headlines like this are important to shaping people’s understanding of AI, rather than being dismissive - highlighting that, like with neuroscience, we are still thoroughly in the research phase rather than having end products to send to market.

          • archomrade [he/him]
            14 days ago

            Yeah, I’m with ya. Some people interpreted this as marketing hype, and while I agree with them that mysticism around AI is driven by this kind of reporting, I think there’s very much legitimacy to the uncertainty of the field at present.

            If everyone understood it as experimental I think it would be a lot more bearable.

    • sexy_peach@beehaw.org
      14 days ago

      > Researchers who work on transformer models understand how the algorithm works, but they don’t yet know how their simple programs can generalize as much as they do.

      They do!

      You can even train small networks by hand with pen and paper. You can also manually design small models without training them at all.
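      As a rough illustration of the "manually design" part, here is a minimal sketch of a two-layer network that computes XOR with weights picked by hand and no training at all. The function names and threshold values are my own choices for the example, not taken from any library.

```python
def step(x):
    """Hard threshold activation: fires (1) when the input is positive."""
    return 1 if x > 0 else 0


def xor_net(a, b):
    """Hand-designed network for XOR; every weight and bias is chosen manually."""
    h_or = step(a + b - 0.5)    # hidden unit 1: fires if at least one input is 1
    h_and = step(a + b - 1.5)   # hidden unit 2: fires only if both inputs are 1
    return step(h_or - h_and - 0.5)  # output: "OR but not AND"


table = {(a, b): xor_net(a, b) for a in (0, 1) for b in (0, 1)}
print(table)
```

      Every number in there can be checked with pen and paper, which is exactly why nobody is confused about how a network this small works.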

      The interesting part is that this dated tech is producing such good results now that we throw our modern hardware at it.

      • archomrade [he/him]
        14 days ago

        > an acknowledgement of how relatively uncomplicated their structure is compared to the complexity of its output.

        > The interesting part is that this dated tech is producing such good results now that we throw our modern hardware at it.

        That’s exactly what I mean.

          • archomrade [he/him]
            14 days ago

            Maybe a less challenging way of looking at it would be:

            > We are surprised at how much of subjective human intuition can be replicated using simple predictive algorithms.

            instead of

            > We don’t know how this model learned to code.

            Either way, the technique is yielding much better results than what could have been reasonably expected at the outset.