Salamander@mander.xyz to Non-Trivial AI@mander.xyz · English · 8 months ago

Detecting hallucinations in large language models using semantic entropy

www.nature.com

cross-posted to:
futurology@futurology.today

Hallucinations (confabulations) in large language model systems can be tackled by measuring uncertainty about the meanings of generated responses rather than the text itself to improve question-answering accuracy.

Abstract

Large language model (LLM) systems, such as ChatGPT [1] or Gemini [2], can show impressive reasoning and question-answering capabilities but often ‘hallucinate’ false outputs and unsubstantiated answers [3,4]. Answering unreliably or without the necessary information prevents adoption in diverse fields, with problems including fabrication of legal precedents [5] or untrue facts in news articles [6] and even posing a risk to human life in medical domains such as radiology [7]. Encouraging truthfulness through supervision or reinforcement has been only partially successful [8]. Researchers need a general method for detecting hallucinations in LLMs that works even with new and unseen questions to which humans might not know the answer. Here we develop new methods grounded in statistics, proposing entropy-based uncertainty estimators for LLMs to detect a subset of hallucinations (confabulations), which are arbitrary and incorrect generations. Our method addresses the fact that one idea can be expressed in many ways by computing uncertainty at the level of meaning rather than specific sequences of words. Our method works across datasets and tasks without a priori knowledge of the task, requires no task-specific data and robustly generalizes to new tasks not seen before. By detecting when a prompt is likely to produce a confabulation, our method helps users understand when they must take extra care with LLMs and opens up new possibilities for using LLMs that are otherwise prevented by their unreliability.
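
For anyone wondering what "uncertainty at the level of meaning" could look like in practice, here is a rough sketch, not the paper's implementation. It assumes you have already sampled several answers to the same prompt, groups them into meaning clusters, and takes the entropy of the cluster frequencies. The `toy_same_meaning` check (lowercase, strip punctuation, drop a few filler words) is a hypothetical stand-in for a real semantic-equivalence test, and weighting clusters by simple counts is also an assumption, since the abstract does not say how clusters are weighted.

```python
import math
import string
from typing import Callable, List

# Hypothetical filler words ignored by the toy equivalence check below.
FILLER = {"it", "is", "in", "the", "a"}


def toy_same_meaning(a: str, b: str) -> bool:
    """Toy stand-in for a semantic-equivalence test (an assumption, not the paper's)."""
    def content_words(text: str) -> frozenset:
        cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
        return frozenset(w for w in cleaned.split() if w not in FILLER)
    return content_words(a) == content_words(b)


def cluster_by_meaning(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> List[List[str]]:
    """Greedy clustering: an answer joins the first cluster whose first member it matches."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            if same_meaning(cluster[0], answer):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def semantic_entropy(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> float:
    """Entropy (in nats) over meaning clusters, weighting each cluster by its share of answers."""
    clusters = cluster_by_meaning(answers, same_meaning)
    total = len(answers)
    return sum(-(len(c) / total) * math.log(len(c) / total) for c in clusters)


# Different wordings of the same answer land in one cluster, so entropy is zero.
consistent = ["Paris", "It is in Paris.", "paris"]
# Answers that disagree in meaning spread over several clusters, so entropy is higher.
scattered = ["Paris", "Rome", "Berlin", "It is in Paris."]

low = semantic_entropy(consistent, toy_same_meaning)
high = semantic_entropy(scattered, toy_same_meaning)
```

The point of the design is that "Paris" and "It is in Paris." count as agreement even though the strings differ, so only real disagreement in meaning raises the score. A high score would then be a hint that the prompt is prone to confabulation.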

I am not entirely sure whether this research article falls within the community’s scope, so feel free to remove it if you think it does not.

Chat

TragicNotCute@lemmy.world · English · 8 months ago · 2 upvotes

What the hell? 'Scuse me. Who’s watchin these AIs?

Uh - the fat one’s watchin the little one?
