• space_comrade [he/him]@hexbear.net
    3 days ago

    Didn’t they manage to make it somewhat good at solving certain math competition problems? Regardless, it’s a pretty big jump from that to making a breakthrough in physics.

    • Llituro [he/him, they/them]@hexbear.net
      3 days ago

      maybe certain ones, but it’s generally bad with numbers and mathematical reasoning. he also gets paid to make it fail at math, and it’s arguably worse at basic math than physics.

    • QuillcrestFalconer [he/him]@hexbear.net
      3 days ago

      Yeah, DeepMind had good results with IMO problems, but only geometry problems. They scored almost at the level of a gold medalist. That’s only a fraction of IMO problems, though. They did it by combining a formal verification system with an LLM to propose solution paths, and then doing some tree search, I think.
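
      To make the "propose, verify, search" idea concrete, here is a minimal toy sketch. It is not DeepMind's system: the rule set, `propose`, `verify`, and `search` are all hypothetical names made up for illustration. The proposer stands in for the LLM (and deliberately suggests one bogus step), the verifier stands in for the formal checker, and a best-first search ties them together.

```python
import heapq

# Legal inference rules of a toy "proof system" (hypothetical, illustration only).
RULES = {
    "add1": lambda n: n + 1,
    "double": lambda n: n * 2,
    "triple": lambda n: n * 3,
}


def propose(state, target):
    """Stand-in for an LLM: suggests candidate steps, one of them invalid."""
    return ["double", "triple", "add1", "halve"]  # "halve" is not a legal rule


def verify(state, step):
    """Stand-in for a formal checker: new state if the step is legal, else None."""
    rule = RULES.get(step)
    return rule(state) if rule else None


def search(start, target, max_nodes=10_000):
    """Best-first search over verified steps; returns a list of steps or None."""
    frontier = [(abs(target - start), start, [])]
    seen = {start}
    while frontier and max_nodes > 0:
        max_nodes -= 1
        _, state, path = heapq.heappop(frontier)
        if state == target:
            return path
        for step in propose(state, target):
            nxt = verify(state, step)
            # Drop rejected steps, repeats, and overshoots (rules only grow n).
            if nxt is None or nxt in seen or nxt > target:
                continue
            seen.add(nxt)
            heapq.heappush(frontier, (abs(target - nxt), nxt, path + [step]))
    return None


path = search(1, 10)
print(path)
```

      The point of the split is that the proposer is allowed to be wrong: anything the checker rejects never enters the search, so the final path only contains verified steps.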

      This is one way to improve large AI systems and will probably be incorporated in some way in the future, for example by integrating with a language like Lean (for math proofs).
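
      For a sense of what "a language like Lean" means here, a tiny Lean 4 sketch (the theorem name is made up for illustration). The proof checker either accepts these or rejects them, so a model's proposed proof can be checked mechanically instead of trusted:

```lean
-- Both statements hold by definitional unfolding, so `rfl` is accepted.
example : 2 + 2 = 4 := rfl

-- Hypothetical name; `n + 0` reduces to `n` by the definition of Nat addition.
theorem add_zero_example (n : Nat) : n + 0 = n := rfl
```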

      They will also be improved by combining with tool use like calculators, code interpreters, web search, calendars, etc. This is already starting to happen to some extent.
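
      A rough sketch of what calculator tool use could look like, assuming a made-up convention where the model emits `CALC: <expression>` when it wants arithmetic done. The `calculator` and `answer` functions and the `CALC:` prefix are hypothetical and not any real product's API. The arithmetic is done by ordinary code, not by the model:

```python
import ast
import operator

# Only plain arithmetic is allowed; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def calculator(expression: str):
    """Evaluate +, -, *, / over numbers without using eval()."""

    def ev(node):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")

    return ev(ast.parse(expression.strip(), mode="eval").body)


def answer(model_output: str) -> str:
    """Route a (hypothetical) 'CALC: <expr>' request to the calculator tool."""
    prefix = "CALC:"
    if model_output.startswith(prefix):
        return str(calculator(model_output[len(prefix):]))
    return model_output


print(answer("CALC: 12345 * 6789"))
```

      Walking the parsed expression by hand, rather than calling `eval`, means a request like `__import__('os')` raises `ValueError` instead of running.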

      LLMs by themselves, at least with current architectures using transformers, are not great at reasoning (counting, arithmetic, symbolic reasoning).