ChatGPT has meltdown and starts sending alarming messages to users::AI system has started speaking nonsense, talking Spanglish without prompting, and worrying users by suggesting it is in the room with them
To be honest this is the kind of outcome I expected.
Garbage in, garbage out. Making the system more complex doesn’t solve that problem.
Thank you for your service
Would you like to know more?
Bamalam
The development of LLMs is possibly becoming self defeating, because the training data is being filled not just with human garbage, but also AI garbage from previous, cruder LLMs.
We may well end up with a machine learning equivalent of Kessler syndrome, with our pool of available knowledge eventually becoming too full of junk to progress.
I mean, surely the solution to that would be to use curated/vetted training data? Or at the very least, data from before LLMs became commonplace?
The funny thing is, children are similar. They just learn whatever you put in front of them. We have whole systems for educating children for decades of their lives.
With AI we literally just plopped them in front of the Internet, with no guidelines on what to learn. AI researchers say “it’s a black box! We don’t know why it’s doing this!” You fed it everything you could and gave it few rules on what to do. You are the reason why it’s nuts.
Humans come hardwired to be a certain way, do certain things. Maybe they need to start AI off like that, some basic programs that guide learning. “Learn everything” isn’t working.
That’s a good point. For real brains, size and intelligence are not linked. An elephant brain has 3 times as many neurons as a human brain, but a human brain is more intelligent. There is more to intelligence than just the number of neurons, real or virtual, so making larger and larger AI models may not be the right direction.
True. Maybe they just need more error correction. Like spend more energy questioning whether what you say is true. Right now LLMs seem to just vomit out whatever they thought up, with no consideration of whether it makes sense.
They’re like an annoying friend who just can’t shut up.
They aren’t thinking though. They’re making connections within the training data that they’ve processed.
This is really clear when they are asked to write code with too vague a prompt.
Maybe feeding them through primary school curriculum (including essays and tests) would be helpful, but I don’t think the language models really sort knowledge yet.
Yes, but that only works if we can differentiate that data on a pretty big scale. The only way I can see it working at scale is by having metadata to declare whether something is AI generated or not. But then we’re relying on self-reporting, so a lot of people have to get on board with it, and bad actors can poison the data anyway. Another way could be to hire humans to chatter about specific things you want to train it on, which could guarantee better data but be quite expensive. Only training on data from before LLMs will turn it into an old person pretty quickly, and it will be noticeable when it doesn’t know pop culture or modern slang.
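To make the metadata idea concrete, here is a rough sketch of what such a filter could look like. Everything in it is an assumption for illustration: the `Document` record, the self-reported `ai_generated` flag, and the optional date cutoff are hypothetical, not an existing standard. It also shows the weakness described above, since a mislabeled document passes straight through.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Document:
    text: str
    published: date
    ai_generated: bool  # hypothetical, self-reported metadata flag


def filter_training_docs(docs: List[Document],
                         cutoff: Optional[date] = None) -> List[Document]:
    """Keep documents not flagged as AI generated.

    If `cutoff` is given, also drop anything published on or after it
    (the "only data from before LLMs" variant).
    """
    kept = []
    for doc in docs:
        if doc.ai_generated:
            continue  # trusts the self-reported flag; bad actors can lie
        if cutoff is not None and doc.published >= cutoff:
            continue
        kept.append(doc)
    return kept


if __name__ == "__main__":
    docs = [
        Document("old forum post", date(2015, 3, 1), False),
        Document("honestly labeled bot text", date(2023, 6, 1), True),
        Document("new human slang", date(2024, 1, 5), False),
    ]
    print([d.text for d in filter_training_docs(docs)])
    print([d.text for d in filter_training_docs(docs, cutoff=date(2020, 1, 1))])
```

With the cutoff applied, the recent human-written post is dropped too, which is the "turns into an old person" problem in miniature.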
Pretty sure this is why they keep training it on books, movies, etc. - it’s already intended to make sense, so it doesn’t need to be curated.
You mean like work? Can’t I just have some AI do all that stuff? What could go wrong?
This is called model collapse and imo has to be solved if LLMs are to be a long term thing. I could see it wrecking this current AI push until people step back and reevaluate how data gets sucked up
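For anyone who wants to see the effect in miniature, here is a toy sketch. It is not how any real LLM is trained: the "model" is just a Gaussian fitted to a handful of numbers, and each generation is trained only on samples produced by the previous generation. The function name, sample size, and generation count are assumptions picked for illustration.

```python
import random
import statistics


def simulate_collapse(generations=300, sample_size=10, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Returns the fitted standard deviation at each generation,
    starting with the "real data" distribution (mean 0, stdev 1).
    """
    rng = random.Random(seed)
    mean, stdev = 0.0, 1.0
    history = [stdev]
    for _ in range(generations):
        # "Generate content" from the current model...
        samples = [rng.gauss(mean, stdev) for _ in range(sample_size)]
        # ...then "train" the next model on nothing but that content.
        mean = statistics.fmean(samples)
        stdev = statistics.pstdev(samples)
        history.append(stdev)
    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for gen in (0, 10, 50, 100, 300):
        print(f"generation {gen:3d}: stdev = {history[gen]:.3g}")
```

Because each small sample tends to underestimate the spread, the fitted distribution narrows over generations and the variety in the original data is gradually lost. Mixing fresh real data back in at each step is the obvious counter, which is the "reevaluate how data gets sucked up" part.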
I really hope so. I have yet to see a meaningful use case for this kind of LLM that just gets fed with all kinds of data. LLMs “on premise” that are used for specific jobs are fine, but this…I really hope a Kessler-like syndrome blows it out of the water, for countless reasons…
And now I’m picturing it training on a bunch of chats with Eliza…
just how google search results feel these days…
Damn.
Thank you VERY much for that insight: AI’s version of Kessler-syndrome.
EXACTLY.
Damn, damn, damn, that gets the truth right in its marrow.
_ /\ _
I am happy to report I did my part on feeding it garbage. I only ever speak to chatGPT thru a pirate translator. And I only ever ask it for harry potter fan fic. Pay me if you want me to train it meaningfully.
The solution is paying intelligent people to interact with it and give honest feedback.
Like, I’m sure you can pay grad students $15/hr to talk to one about their subject matter.
But with as many as they’d need, it would get expensive.
So they train with low quality social media comments, or using copyrighted text without paying the owners.
It’s not that we can’t do it, it’s just expensive. So a capitalist society won’t.
If we had an FDR style president, this would be a great area for a new jobs program.
It appears that with the increase in popularity of machine learning, the percentage of people who properly source and sanitize their training data has steeply decreased.
As you stated, an MLAI can only be as good as the data it was trained on, and is usually way worse. The popularity and application of MLAIs built with questionable practices scare me, though at least their fuckups will keep me employed and likely busier than ever.
LLMs are not “machine learning”, they are neural-networks.
Different category.
ML is small potatoes, ttbomk.
Decision-tree stuff.
Neural-nets are black-boxes, with back-propagation training of the neural-net to get closer to ( layer by layer, training-instance by training-instance ) the intended result.
ML is what one does on one’s own machine with some python libraries,
ChatGPT ( 3, 3.5, or 4, don’t know which ) cost something like $100,000,000 to rent the machines required for mixing the training-data & the model ( I’m assuming about $20/hr per machine, so an OCEAN of machines, to do it )
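Spelling out that back-of-envelope estimate: the $100,000,000 and $20/hr figures are the guesses stated above, and the 90-day training window below is a purely hypothetical extra assumption added so the machine count can be computed.

```python
def machine_hours(total_cost_usd, rate_usd_per_hour):
    """Total rented machine-hours implied by a budget and an hourly rate."""
    return total_cost_usd / rate_usd_per_hour


def machines_needed(total_cost_usd, rate_usd_per_hour, training_days):
    """Machines running around the clock to use up those hours in `training_days`."""
    hours = machine_hours(total_cost_usd, rate_usd_per_hour)
    return hours / (training_days * 24)


if __name__ == "__main__":
    print(machine_hours(100_000_000, 20))        # machine-hours at the guessed rate
    print(machines_needed(100_000_000, 20, 90))  # 90 days is an assumption
```

At those guessed numbers that is 5,000,000 machine-hours, so a few thousand machines if the run lasted about three months.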
_ /\ _
Neural nets are a technology which is part of the umbrella term “machine learning”. Deep learning is also a term which is part of machine learning, just more specialized towards large NN models.
You can absolutely train NNs on your own machine. After all, that’s what I did for my master’s before ChatGPT and all that, defining the layers myself, and it’s also what I do right now with CNNs. That said, LLMs do tend to become so large that anyone without a supercomputer can at most fine-tune them.
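To show how small "on your own machine" can be, here is a minimal sketch of a two-layer network trained with back-propagation on XOR, using only the Python standard library. The layer sizes, learning rate, epoch count, and function names are all assumptions chosen for the example, and nothing here resembles the scale or architecture of an LLM.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_xor(epochs=2000, lr=0.5, hidden=3, seed=0):
    """Full-batch gradient descent on XOR; returns the loss per epoch."""
    rng = random.Random(seed)
    data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
    # w1[j] = [weight from x0, weight from x1, bias] for hidden unit j
    w1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    # w2 = one weight per hidden unit, with the output bias as the last entry
    w2 = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    losses = []
    for _ in range(epochs):
        g1 = [[0.0, 0.0, 0.0] for _ in range(hidden)]
        g2 = [0.0] * (hidden + 1)
        loss = 0.0
        for (x0, x1), y in data:
            # Forward pass: hidden layer, then output layer.
            h = [sigmoid(w[0] * x0 + w[1] * x1 + w[2]) for w in w1]
            out = sigmoid(sum(w2[j] * h[j] for j in range(hidden)) + w2[hidden])
            loss += (out - y) ** 2 / len(data)
            # Backward pass: error at the output, pushed back to the hidden layer.
            d_out = 2 * (out - y) * out * (1 - out) / len(data)
            for j in range(hidden):
                g2[j] += d_out * h[j]
                d_h = d_out * w2[j] * h[j] * (1 - h[j])
                g1[j][0] += d_h * x0
                g1[j][1] += d_h * x1
                g1[j][2] += d_h
            g2[hidden] += d_out
        losses.append(loss)
        # Gradient step on every weight once the whole batch has been seen.
        for j in range(hidden):
            for k in range(3):
                w1[j][k] -= lr * g1[j][k]
        for j in range(hidden + 1):
            w2[j] -= lr * g2[j]
    return losses


if __name__ == "__main__":
    losses = train_xor()
    print(f"first epoch loss: {losses[0]:.4f}, last epoch loss: {losses[-1]:.4f}")
```

Whether it fully solves XOR depends on the random starting weights, but the loss going down over the epochs is the back-propagation loop doing its job.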
“Decision tree stuff” would be regular AI, which can be turned into ML by adding a “learning method” like a KNN or neural net, genetic algorithm, etc., which isn’t much more than a more complex decision tree where decision thresholds (weights) were automatically estimated by analysis of a dataset. More complex learning methods are even capable of fine tuning themselves during operation (LLMs, KNN, etc.), as you stated.
One big difference between other learning methods and NN-based methods is that NNs tend to add non-weighted layers which, instead of making decisions, transform the data to allow for a more diverse decision process.
EDIT: Some corrections, now that I’m fully awake.
While very similar in structure and function, the NN is indeed no decision tree. It functions much the same as one, as is a basic requirement for most types of AI, but whereas every node in a decision tree has unique branches with their own unique nodes, all of a NN’s nodes are interconnected to all nodes of the following layer. This is also one of the strong points of a NN, as something that seemed outrageous to it a moment ago might have become much more plausible when looking at it from a different point of view, such as after a transformative layer.
Also, other learning methods usually don’t have layers, or, if one were to define “layer” as “one-shot decision process”, they pretty much only have one or two layers. In contrast, the NN can theoretically have an unlimited number of layers, allowing for pretty much unlimited complexity as long as the input data is not abstracted beyond reason.
Lastly, NNs don’t back-propagate by default, though they make it easy to enable such features given enough processing power and optionally enough bandwidth (in the case of ChatGPT). LLMs are a little different, as I’m decently sure they implement back-propagation as part of the technology’s definition, just like KNN.
This became a little longer than I had hoped, it’s just a fascinating topic. I hope you don’t mind that I went into more detail than necessary, it was mostly for the random passersby.
And it’s only going to get worse as more of the public becomes aware.