AI’s hallucination problem is getting worse

Despite great progress in artificial intelligence, a surprising trend is emerging: the latest AI models are producing significantly more inaccurate and fabricated information, a phenomenon commonly referred to as "hallucination". This development is puzzling industry leaders and poses a major challenge to the broad, reliable deployment of AI technologies.

Recent tests of the newest models from key players such as OpenAI and DeepSeek reveal a surprising reality: systems that are supposed to be more intelligent are generating incorrect information at higher rates than their predecessors. OpenAI's own evaluations, detailed in a recent technical report, showed that its latest o3 and o4-mini models, released in April, hallucinate at markedly higher rates than the earlier o1 model from late 2024. For example, when answering questions about public figures, o3 hallucinated 33% of the time, while o4-mini did so 48% of the time. By stark contrast, the older o1 model hallucinated only 16% of the time.
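To make these percentages concrete, a hallucination rate on a question-answering benchmark is simply the fraction of model answers judged to contain fabricated claims. The sketch below is a minimal, hypothetical illustration of that arithmetic; the grading function and data format are assumptions, not OpenAI's actual evaluation harness.

```python
# Minimal sketch of how a hallucination rate on a QA benchmark could be computed.
# The data format and is_hallucinated() grader are hypothetical stand-ins, not
# OpenAI's actual evaluation code.

def is_hallucinated(answer: str, reference_facts: set[str]) -> bool:
    """Toy grader: flag the answer if it asserts a 'fact' not in the reference set."""
    claims = {c.strip().lower() for c in answer.split(".") if c.strip()}
    return any(claim not in reference_facts for claim in claims)

def hallucination_rate(results: list[tuple[str, set[str]]]) -> float:
    """Fraction of (answer, reference_facts) pairs flagged as hallucinated."""
    flagged = sum(is_hallucinated(ans, facts) for ans, facts in results)
    return flagged / len(results)

# Example: 3 answers, 1 of them fabricated -> a 33% hallucination rate.
results = [
    ("born in 1952. won the turing award", {"born in 1952", "won the turing award"}),
    ("born in 1952", {"born in 1952", "won the turing award"}),
    ("born in 1961. invented the telephone", {"born in 1952", "won the turing award"}),
]
print(f"{hallucination_rate(results):.0%}")  # -> 33%
```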

The issue is not isolated to OpenAI. Independent testing by Vectara, which ranks AI models, indicates that many "reasoning" models, including DeepSeek's R1, have shown large increases in hallucination compared with previous iterations from the same developers. These reasoning models are designed to mimic human-like thinking by breaking problems into multiple steps before arriving at an answer.

The implications of this rise in inaccuracy are significant. Given that AI chatbots are being integrated into an ever wider range of applications, from customer service and research assistance to legal and medical work, the reliability of their output matters a great deal. A customer-service bot that gives users incorrect policy information, a coding assistant that misleads developers, or a legal AI that cites case law that does not exist can cause serious frustration for users and even severe real-world consequences.

While AI companies initially expressed optimism that hallucinations would naturally decline as models were updated, recent data paints a different picture. Even OpenAI acknowledges the issue; a company spokesperson said: "Hallucinations are not inherently more prevalent in reasoning models, though we are actively working to reduce the higher rates of hallucination we saw in o3 and o4-mini." The company maintains that research into the causes and mitigation of hallucinations across all models remains a priority.

The root causes of this increase in errors in the most advanced models remain somewhat elusive. Because of the enormous volume of data these systems are trained on and the complex mathematical operations they use, pinning down the exact reasons for hallucinations is a major challenge for engineers. Some theories suggest that the step-by-step "thinking" process in reasoning models may create more opportunities for errors to compound. Others suggest that training methodologies such as reinforcement learning, while beneficial for tasks like mathematics and coding, may degrade factual accuracy in other areas.

Researchers are actively exploring possible solutions to mitigate this growing problem. Strategies under investigation include training models to recognize and express uncertainty, as well as employing retrieval-augmented generation (RAG) techniques, which allow an AI to consult external information sources before producing a response.
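As a rough illustration of the retrieval-augmented approach mentioned above, the sketch below is a minimal, dependency-free Python example. The keyword-overlap retriever, the document store, and the generate_answer stub are hypothetical stand-ins, not any particular vendor's API.

```python
# Minimal sketch of retrieval-augmented generation (RAG): retrieve relevant
# documents first, then instruct the model to answer only from that context.
# The retriever, document store, and generate_answer() stub are hypothetical.

DOCUMENTS = [
    "The refund policy allows returns within 30 days of purchase.",
    "Support is available Monday through Friday, 9am to 5pm.",
    "Shipping to most regions takes 3 to 5 business days.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by keyword overlap with the query."""
    query_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(query_words & set(d.lower().split())))
    return scored[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Ground the model in retrieved text and tell it to admit uncertainty."""
    context_block = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, say 'I don't know.'\n\n"
        f"Context:\n{context_block}\n\nQuestion: {query}\nAnswer:"
    )

def generate_answer(prompt: str) -> str:
    """Placeholder for a call to a language model."""
    return "(model response would go here)"

query = "How long do I have to return an item?"
prompt = build_prompt(query, retrieve(query, DOCUMENTS))
print(prompt)
print(generate_answer(prompt))
```

The point of the prompt construction is that it both narrows the model to retrieved sources and gives it explicit permission to say "I don't know", targeting the two failure modes the article describes.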

However, some experts caution against describing AI's mistakes with the term "hallucination" itself. They argue that it inappropriately implies a level of perception or awareness that AI models do not possess. Instead, they view these inaccuracies as an inherent feature of the probabilistic way current language models work.

Despite ongoing efforts to improve accuracy, the recent trend suggests that the road to trustworthy AI may be more complicated than initially expected. For now, users are advised to exercise caution and critical thinking when interacting with even the most advanced chat tools, especially when seeking factual information. The "growing pains" of AI development, it seems, are far from over.
