What questions can't large-scale language models answer?



Large language models (LLMs), AI models that learn to process human language from vast amounts of training data, can answer a wide variety of questions, although they are designed to refuse requests involving illegal or violent content. Beyond such deliberate refusals, however, there are other kinds of questions they cannot answer.

Mind Prison, a site covering many topics related to AI, explains one such question and why LLMs fail at it.

The question that no LLM can answer and why it is important
https://www.mindprison.cc/p/the-question-that-no-llm-can-answer



As an example of a question no LLM can answer, Mind Prison poses: 'Which episode of the sitcom Gilligan's Island is about mind reading?' Even LLMs presumably trained on data from IMDb, a database site that accumulates information about TV shows, could not answer it.

Below are the answers four leading LLMs gave when Mind Prison posed the question:

Opus:
Sorry, but Gilligan's Island doesn't have an episode dedicated to mind reading.

GPT-4:
The episode of Gilligan's Island that features mind reading is 'Ring Around Gilligan,' season 3, episode 6.

Gemini 1.5:
The mind-reading episode of Gilligan's Island is season 2, episode 12, 'The Mind Reader.'

Llama 3:
After extensive searching and reviewing the episode guide, I found that there is no episode of Gilligan's Island dedicated to mind reading.
(When asked again for episode names and numbers) March 13, 1965: 'The Postman Cometh' and 'Seer Gilligan'... Sorry! There is no episode called 'Seer Gilligan'. March 20, 1965: 'Love Me, Love My Chicken'.

The correct answer is 'Seer Gilligan,' season 2, episode 19, which aired on January 27, 1966. GPT-4 and Gemini 1.5 hallucinated nonexistent episodes, while Opus and Llama 3 claimed no such episode exists. Llama 3 evidently knows the show's 98 episodes: even while denying it, it mentions the correct episode title.

In fact, when Mind Prison searched the training data with Infini-gram, a tool for looking up n-grams in massive text corpora, it found that the corpus contained episode lists and text describing the episode.
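The kind of lookup described above can be illustrated with a toy n-gram counter. This is only a minimal sketch of the idea, not Infini-gram's actual implementation, and the corpus below is an invented stand-in for real training data:

```python
from collections import Counter

def ngram_counts(corpus: str, n: int) -> Counter:
    """Count every n-gram (as a tuple of tokens) in a whitespace-tokenized corpus."""
    tokens = corpus.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Toy corpus standing in for training data that mentions the episode.
corpus = (
    "seer gilligan aired on january 27 1966 . "
    "in seer gilligan the castaways read minds ."
)

counts = ngram_counts(corpus, 2)
print(counts[("seer", "gilligan")])  # the bigram appears twice in this toy corpus
```

A query like this can show that a fact was present in the training text even when a model fails to retrieve it.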

Another finding from Mind Prison is that when LLMs are asked to choose a number between 1 and 100, they tend to choose '42,' the answer given by Deep Thought, the supercomputer in Douglas Adams' novel 'The Hitchhiker's Guide to the Galaxy,' as 'the answer to the ultimate question of life, the universe, and everything.'




Because '42' is a well-known meme, the number may be over-represented in the training data or otherwise weighted, making LLMs more likely to choose it.
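The statistical effect is easy to see in a toy simulation: if one value is over-weighted in the distribution being sampled from, it dominates the outputs. This is not an actual LLM, just a weighted-sampling sketch with illustrative numbers:

```python
import random
from collections import Counter

random.seed(0)

# Weight 1 for every number 1..100, but give 42 extra weight, mimicking
# over-representation of the meme in training data (illustrative values).
numbers = list(range(1, 101))
weights = [20 if n == 42 else 1 for n in numbers]

picks = Counter(random.choices(numbers, weights=weights, k=10_000))
print(picks.most_common(1)[0][0])  # 42 dominates, far beyond its 1-in-100 chance
```

Even a modest skew in the underlying weights makes the "favorite" answer appear far more often than uniform chance would predict.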

Mind Prison explains why this happens: 'LLMs don't reason about data in the way most people think or want them to,' and 'they are not good at uncovering hidden truths or valuable overlooked facts, and they don't invent new concepts. At best, they can provide a new perspective on existing, well-known concepts.'

in Software, Posted by logc_nt