Extending Context Window of Large Language Models via Positional Interpolation

arxiv.org

Glagnar@programming.dev to Machine Learning@programming.dev · English · 1 year ago

Meta released this paper a few days ago. It details a method for extending the context window, or “memory”, of an LLM up to 32k tokens. What is interesting is that the authors give a mention to: https://kaiokendev.github.io/

This is a blog post written by a guy in his spare time who came up with the same method at the same time. He calls it SuperHOT.

It’s really exciting how AI/machine learning can be advanced by relatively ordinary people putting in hard work, without the resources of Microsoft etc.
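
For anyone wondering what the method actually does: as I understand it, the core idea is to squeeze the position indices of a longer sequence back into the range the model was trained on, instead of feeding it positions it has never seen. Below is a rough sketch of that idea for rotary position embeddings, not the paper's or kaiokendev's actual code. The function names and the 2048-token trained length are my own assumptions for illustration, and the model would still need fine-tuning at the longer length.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0):
    """Rotation angles used by rotary position embeddings (RoPE)."""
    # One frequency per pair of hidden dimensions.
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    return np.outer(positions, inv_freq)

def interpolated_positions(seq_len, trained_len):
    """Linearly scale positions so they stay inside [0, trained_len)."""
    # Hypothetical helper: only shrink positions, never stretch them.
    scale = min(1.0, trained_len / seq_len)
    return np.arange(seq_len) * scale

trained_len = 2048    # assumed original context length, for illustration
extended_len = 32768  # the "32k" target mentioned in the post

positions = interpolated_positions(extended_len, trained_len)
angles = rope_angles(positions, dim=128)
```

With these numbers every position is multiplied by 1/16, so token 16 in the long sequence gets the same rotation angles that token 1 had during training, and the largest position stays below 2048.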
