• GenderNeutralBro@lemmy.sdf.org
    10 months ago

    Getting there, but I can say from experience that it’s mostly useless with the current offerings. I’ve tried using GPT-4 and Claude 2 to get answers about less-popular command-line tools and Python modules by pointing them to the complete docs, and I was not able to get meaningful answers. :(

    Perhaps you could automate a more exhaustive fine-tuning of an LLM based on such material. I have not tried that, and I am not well-versed in the process.

    • FaceDeer@kbin.social
      10 months ago

      I’m thinking a potentially useful middle ground might be to have the AI digest the documentation into an easier-to-understand form first, and then have it query that digest for context later when you’re asking it questions about stuff. GPT4All already does something a little similar in that it needs to build a search index for the data before it can make use of it.

      • GenderNeutralBro@lemmy.sdf.org
        10 months ago

        That’s a good idea. I have not specifically tried loading the documentation into GPT4All’s LocalDocs index. I will give this a try when I have some time.

        • FaceDeer@kbin.social
          10 months ago

          I’ve only been fiddling around with it for a few days, but it seems to me that the default settings weren’t very good: by default it loads four 256-character-long snippets from the search results into the AI’s context, which is pretty hit-and-miss for being informative in my experience. I think I may finally have found a good use for those models with really large contexts, since I can crank up the size and number of snippets it loads, and that seems to help. But it still doesn’t give “global” understanding. For example, if I put a novel into LocalDocs and then ask the AI about general themes or large-scale “what’s this character like” stuff, it still only has a few isolated bits of the novel to work from.
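
          To make the snippet limit concrete, here is a rough sketch of that kind of retrieval step. It is not GPT4All’s actual code: the function names are made up for illustration, and the word-overlap scoring is just a stand-in for whatever ranking the real index uses. The defaults of four snippets and 256 characters are the ones mentioned above.

```python
import re
from collections import Counter


def split_into_snippets(text: str, size: int = 256) -> list[str]:
    """Cut the text into consecutive snippets of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters and digits as words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, snippet: str) -> int:
    """Count occurrences of the query's distinct words in the snippet.

    Assumption: plain word overlap, standing in for a real search index.
    """
    counts = Counter(tokenize(snippet))
    return sum(counts[word] for word in set(tokenize(query)))


def top_snippets(query: str, text: str, size: int = 256, k: int = 4) -> list[str]:
    """Return the k best-scoring snippets; only these reach the model."""
    snippets = split_into_snippets(text, size)
    return sorted(snippets, key=lambda s: score(query, s), reverse=True)[:k]


def coverage(text: str, size: int = 256, k: int = 4) -> float:
    """Fraction of the document that can fit in the context at once."""
    return min(1.0, (size * k) / max(1, len(text)))
```

          With those defaults the model sees at most 1,024 characters per question, so `coverage` shrinks quickly as the document grows. Raising `size` and `k` helps with pointed questions, but a novel-length text still mostly stays out of view.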

          What I’m imagining is that the AI could sit on its own for a while loading up chunks of the source document and writing “notes” for its future self to read. That would let it accumulate information from across the whole corpus and cross-reference disparate stuff more easily.
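
          The note-taking loop could be sketched roughly like this. Everything here is hypothetical: `update_notes` stands in for a call to whatever model you use (given the notes so far and the next chunk, it returns revised notes), and `toy_update_notes` is only a placeholder so the sketch runs without a model.

```python
from typing import Callable, Iterator


def split_into_chunks(text: str, size: int = 2000) -> Iterator[str]:
    """Yield consecutive chunks of at most `size` characters."""
    for start in range(0, len(text), size):
        yield text[start:start + size]


def build_notes(
    text: str,
    update_notes: Callable[[str, str], str],
    chunk_size: int = 2000,
) -> str:
    """Walk the whole document once, carrying the notes forward.

    `update_notes(notes_so_far, chunk)` is an assumed interface; in practice
    it would prompt a model to revise its notes in light of the new chunk.
    """
    notes = ""
    for chunk in split_into_chunks(text, chunk_size):
        notes = update_notes(notes, chunk)
    return notes


def toy_update_notes(notes: str, chunk: str) -> str:
    """Placeholder for a model call: keep the first sentence of each chunk."""
    first_sentence = chunk.split(".")[0].strip()
    return (notes + "\n- " + first_sentence).strip()
```

          Calling `build_notes(novel_text, toy_update_notes)` returns one bullet per chunk. With a real model behind `update_notes`, the finished notes could be added to the context alongside the retrieved snippets, so the model has something covering the whole corpus to draw on.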

    • sanguine_artichoke
      10 months ago

      What about GitHub Copilot? It has tons of material available for training. Of course, it’s not necessarily all bug-free or well written.