So why can it often output correct information after it has been corrected? This should be impossible according to you.
It generally doesn’t. It apologizes, then outputs exactly or very nearly the same thing as before, or something else that’s wrong in a brand new way. Have you used GPT before? This is a common problem, and it’s part of why you cannot trust anything it outputs unless you already know enough about the topic to determine its accuracy.
No, LLMs understand a tree to be a complex relationship of many, many individual numbers. Can you clearly define how our understanding is based on something different?
And did you really just go “nuh huh it’s actually in binary”? I used the collection of symbols explanation because that’s how OpenAI describes it, so I thought it was safe to just skip all the detail. Since it’s apparently needed and you’re unlikely to listen to me, there’s a good explanation in video form created by Kyle Hill. I’m sure many other people have explained it much better than I can, so instead of trying to prove me wrong, which we can keep doing all day, go learn about them. LLMs are super interesting and yet ultimately extremely primitive.
It generally doesn’t. It apologizes, then outputs exactly or very nearly the same thing as before, or something else that’s wrong in a brand new way. Have you used GPT before? This is a common problem, and it’s part of why you cannot trust anything it outputs unless you already know enough about the topic to determine its accuracy.
Hallucinations are different from in-context learning. I’ve seen a number of impressive examples of this, enough that you should provide evidence that it generally doesn’t work. There are a bunch of papers on this topic, surely at least one would support your thesis?
And did you really just go “nuh huh it’s actually in binary”?
No, that is literally how knowledge is stored inside of neural networks. Plenty of papers have shown that the learning process is actually mostly about compression, since you distill the patterns of the training data into a much smaller representation. This means that LLMs actually have concepts of things (which again has been shown independently, e.g. with Othello). These concepts are themselves stored as relationships between large amounts of numbers - that’s how NNs work.
I also fully understand how the tokenization process works and what the mentioned “symbols” are. Please explain what this has to do with anything. The model sees text in specific chunks as an optimisation, what does this change?
I’m a big boy who has already implemented his own LLMs from the ground up, so feel free to skip any simplifications and tell me exactly, in detail, what you mean.
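For anyone following along, here is roughly what “the model sees text in specific chunks” means. This is only a toy sketch: the vocabulary and the `tokenize` helper are made up for illustration, and it uses simple greedy longest-match rather than the byte-pair encoding that production tokenizers typically use. The point is just that the text becomes a list of integers before the model ever sees it.

```python
# Hypothetical toy vocabulary, not taken from any real tokenizer.
VOCAB = {"un": 0, "believ": 1, "able": 2, "tree": 3, "s": 4, " ": 5}


def tokenize(text, vocab=VOCAB):
    """Split text into integer token IDs using greedy longest-match."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining chunk first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # No vocabulary entry starts at this position.
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


print(tokenize("unbelievable trees"))  # [0, 1, 2, 5, 3, 4]
```

Everything downstream of this step operates on those IDs (and the vectors looked up from them), not on letters or words as such.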