Our review of concerns about ChatGPT-like tools since the launch of ChatGPT 3.5 at the end of 2022. First episode: their HALLUCINATIONS.

In mid-May 2023 (six months after ChatGPT 3.5 was launched), Sam Altman, OpenAI’s CEO, called on US lawmakers to regulate artificial intelligence. According to a May 17th, 2023, BBC article, Altman argued that these tools can create incredibly human-like answers to questions, but can also be wildly inaccurate. “I think if this technology goes wrong, it can go quite wrong…we want to be vocal about that,” Mr. Altman said. “We want to work with the government to prevent that from happening.” Even though he was certainly arguing his own case (ChatGPT was ahead of the game at the time, so he may have wanted regulation in order to protect his leading position), we should be worried[i]. Many papers, articles, and posts discuss these tools’ inaccuracies.

Many use a stronger word; they speak of hallucinations when referring to ChatGPT-like tools. Quoting Cade Metz, a New York Times tech specialist (March 2023)[ii]: “these systems can generate untruthful, biased and otherwise toxic information. Systems like ChatGPT-4 get facts wrong and make up information, a phenomenon called ‘hallucination’.” In the same article, Melanie Mitchell, an AI researcher at the Santa Fe Institute, explains why: “they live in a world of language… that world gives them some clues about what is true and what is not true, but the language they learn from is not grounded in reality. They do not necessarily know if what they are generating is true or false.”

According to an April 2025 TechCrunch post, OpenAI’s new reasoning AI models hallucinate more than several of OpenAI’s older models[iii]. According to an October 2024 Fortune article[iv], OpenAI’s artificial-intelligence-powered transcription tool Whisper “has a major flaw: It is prone to making up chunks of text or even entire sentences […] experts said some of the invented text — known in the industry as hallucinations — can include racial commentary, violent rhetoric and even imagined medical treatments.” As the tool is used to transcribe doctors’ consultations with patients, we should be seriously concerned.

On May 6, 2025, the same Cade Metz of the New York Times reported in a detailed article[v] that hallucinations are getting worse: “More than two years after the arrival of ChatGPT, tech companies, office workers and everyday consumers are using A.I. bots for an increasingly wide array of tasks. But there is still no way of ensuring that these systems produce accurate information. The newest and most powerful technologies — so-called reasoning systems from companies like OpenAI, Google and the Chinese start-up DeepSeek — are generating more errors, not fewer. As their math skills have notably improved, their handle on facts has gotten shakier. It is not entirely clear why. […] ‘Despite our best efforts, they will always hallucinate,’ said Amr Awadallah, the chief executive of Vectara, a start-up that builds A.I. tools for businesses, and a former Google executive. ‘That will never go away.’” In other words, the most advanced technologies these tools rely on, the reasoning systems, make more errors. Recent tests of several of them (such as OpenAI’s o3 and o4-mini, but also some Google and DeepSeek models) mention error rates of up to 50% and even 79%. A very serious concern is that the companies do not understand why the error rate is even higher than with some of the earliest ChatGPT-like tools.

Some hypothesize that it has to do with the reasoning models spending time “thinking” step by step through complex problems before producing their answer: each step is an opportunity to introduce an error, so mistakes can compound as the chain of reasoning grows. Because some of these bots reveal their intermediate steps to users, one can see how the final answer sometimes drifts far away from the original question.
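To see why longer reasoning chains can mean more hallucinations, here is a back-of-the-envelope illustration of our own (not taken from the cited articles), assuming for simplicity that each step is correct independently with some fixed probability:

```python
# Toy illustration (our own simplification, not from the cited articles):
# if each reasoning step is correct with probability p, the chance that an
# n-step chain contains no error at all is roughly p ** n.

def chain_accuracy(p_step: float, n_steps: int) -> float:
    """Probability that every step in an n-step reasoning chain is correct."""
    return p_step ** n_steps

for n in (1, 5, 10, 20):
    print(f"{n:2d} steps at 95% per-step accuracy -> "
          f"{chain_accuracy(0.95, n):.0%} chance of a fully correct chain")
```

Real systems are more complicated (steps are not independent, and a model can sometimes recover from an early mistake), but the arithmetic gives an intuition for why adding steps can add errors.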

Metz also mentions the reinforcement learning techniques these systems use to determine the best reply to give. Having already consumed nearly all of the English text available on the internet, they now also learn through trial and error (reinforcement learning). This seems to work well in domains with hard, checkable facts, such as mathematics and computer programming, but much less so in fuzzier, less factual domains.
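As a rough intuition for why trial and error favors checkable domains, here is a minimal sketch of our own (a deliberate simplification, not OpenAI’s actual training pipeline): a reward can only be handed out when an automatic checker can verify the answer, which is easy for arithmetic but not for open-ended questions.

```python
import random

# Minimal sketch of learning by trial and error (our own simplification,
# not an actual training pipeline): candidate answers are tried at random,
# and a reward is granted only when an automatic checker can verify them.

def arithmetic_checker(answer: int) -> bool:
    """Verifiable domain: the answer to 17 + 25 can be checked exactly."""
    return answer == 17 + 25

def trial_and_error(candidates, checker, trials=100):
    """Count how often each candidate answer earns a reward."""
    rewards = {c: 0 for c in candidates}
    for _ in range(trials):
        guess = random.choice(candidates)   # trial: propose an answer
        if checker(guess):                  # reward only if verifiably correct
            rewards[guess] += 1
    return rewards

print(trial_and_error([41, 42, 43], arithmetic_checker))
```

For open-ended questions (a patient’s diagnosis, a historical interpretation) no such automatic checker exists, so the reward signal is weak or noisy, which may be one reason the technique helps less outside mathematics and programming.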

These hallucinations may not be a major issue for many people; nevertheless, they raise serious concerns when they come up in medical, legal, political, and defense situations, as well as with sensitive business data. The answer lies in adopting good practices, as these ChatGPT-like tools can be incredibly helpful.

To conclude, ChatGPT-like tools hallucinate because they make up their answers from all the bits and pieces they have in their knowledge base, and they make mistakes doing this. By contrast, plain search engine results (not the AI-generated answers some engines now add) are not said to be hallucinations, because the engines simply bring up webpages; they do not make up answers. The returned webpages can be inadequate or a poor selection, but each result is sourced, and users can verify it. How ChatGPT-like tools construct their answers is much less transparent (even if some tools try to trace the origins of their answers). They also lack the common sense human beings have, which plays a big part in shaping our answers. And they are still a long way from how human beings build knowledge, think, reason, and argue. Modeling these processes, which are so human, is a Herculean task.
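To make that contrast concrete, here is a deliberately simplified sketch of our own (toy numbers and example URLs, not real data): a chatbot-style generator samples its next word from learned probabilities with no source attached, while a search engine returns pointers to existing pages that the reader can check.

```python
import random

# Deliberately simplified contrast (our own illustration, toy data only).

# 1) Chatbot-style generation: the next word after "The capital of France is"
#    is sampled from learned probabilities; no source is attached, and a
#    less likely (wrong) continuation can still be picked now and then.
next_word_probs = {"Paris": 0.90, "Lyon": 0.07, "Marseille": 0.03}
words, weights = zip(*next_word_probs.items())
generated = random.choices(words, weights=weights, k=1)[0]
print("Generated answer:", generated, "(no source attached)")

# 2) Search-style retrieval: results are pointers to existing pages, each
#    one verifiable by following its URL.
toy_index = [
    {"url": "https://example.org/page-a", "snippet": "Text found on page A."},
    {"url": "https://example.org/page-b", "snippet": "Text found on page B."},
]
for hit in toy_index:
    print("Search result:", hit["url"], "-", hit["snippet"])
```

The first path produces fluent but unsourced text that is occasionally wrong; the second can return a poor selection of pages, but every page exists and can be verified.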

Check our next LinkedIn articles and posts for other concerns about ChatGPT-like tools and for good practices to adopt. Also check out our handbook, Master ADVANCED Digital Tools for Research, available on Amazon marketplaces.

Image from https://pixabay.com/users/gdj-1086657/


[i] TIME published an in-depth article on Sam Altman and the opportunities and threats of AI on June 21st, 2023: Felsenthal, E. and Perrigo, B. (2023). OpenAI CEO Sam Altman Is Pushing Past Doubts on Artificial Intelligence. time.com. 21 June 2023. Available at: Sam Altman Talks OpenAI: TIME100 Most Influential Companies | TIME [Viewed 22 July 2024].

[ii] Metz, C. (2023). What Makes A.I. Chatbots Go Wrong? The curious case of the hallucinating software. nytimes.com. 29 March 2023. Available at: What Makes Chatbots ‘Hallucinate’ or Say the Wrong Thing? – The New York Times (nytimes.com) [Viewed 4 April 2023].

[iii] In this post, Zeff refers to the o3 and o4-mini models. Zeff, M. (2025). OpenAI’s new reasoning AI models hallucinate more. techcrunch.com. 18 April 2025. Available at: OpenAI’s new reasoning AI models hallucinate more | TechCrunch [Viewed 22 April 2025].

[iv] Burke, G., Schellmann, H. and The Associated Press (2024). OpenAI’s transcription tool hallucinates more than any other, experts say – but hospitals keep using it. fortune.com. 26 October 2024. Available at: OpenAI’s transcription hallucinates more than any other, experts say | Fortune [Viewed 9 November 2024].

[v] Metz, C. and Weise, K. (2025). A.I. Is Getting More Powerful, but Its Hallucinations Are Getting Worse. nytimes.com. 6 May 2025. Available at: A.I. Hallucinations Are Getting Worse, Even as New Systems Become More Powerful – The New York Times [Viewed 21 May 2025].