OpenAI’s o3, o4-mini reasoning AI models hallucinate more

OpenAI found that o3 hallucinated while responding to 33% of questions on PersonQA


OpenAI’s recently released o3 and o4-mini artificial intelligence (AI) models have been found to hallucinate more than the company’s older models.

Hallucinations remain one of the hardest problems to resolve in AI, affecting even today’s best-performing systems.

OpenAI’s internal tests indicated that o3 and o4-mini hallucinate more often than the company’s previous reasoning models, including o1, o1-mini, and o3-mini, as well as its “non-reasoning” models.

OpenAI stated, “Specifically, o3 tends to make more claims overall, leading to more accurate claims as well as more inaccurate/hallucinated claims,” as reported by TechCrunch.

The ChatGPT maker found that o3 hallucinated while responding to 33% of questions on PersonQA, while o4-mini hallucinated 48% of the time.

For o3, that is roughly double the hallucination rate of the company’s older reasoning models, such as o1 and o3-mini.

One promising strategy to boost the accuracy of models is to give them web search capabilities.

OpenAI’s GPT-4o with web search achieves 90% accuracy on SimpleQA. Search could potentially reduce reasoning models’ hallucination rates too.

Over the past year, the AI industry has shifted towards reasoning models, which can improve performance with less data and computing. That shift, however, may increase hallucinations, posing a significant challenge.

While hallucinations can help models be more creative, they also make the models less suitable for businesses that need high accuracy.
