Does AI still hallucinate or is it becoming more reliable?

Current State of AI Hallucination

  • Hallucination persists: Despite improvements, AI models like ChatGPT and Google’s AI Overviews still hallucinate—i.e., produce factually incorrect or absurd outputs.
  • Example incidents:
    • Google's AI told users to add glue to pizza sauce or eat rocks, both clearly fabricated answers.
    • DALL-E generated images with elephants when explicitly asked for a room with no elephants.

Relevance: GS 3 (Technology)

Why Do AI Models Hallucinate?

  • Statistical reasoning, not understanding: AI doesn't “understand” language. It uses statistical associations, so it struggles with abstract concepts like negation (e.g., “no elephants”).
  • Data limitation: Lack of training data on rare or negative queries contributes to these failures.
  • Conceptual connections: Hallucinations arise with queries needing complex reasoning or novel concept connections.
  • Training-testing gap: AI may perform well during testing but fail in real-world scenarios if it was never exposed to similar queries during training.
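A toy illustration of the first point, under the simplifying assumption that matching is driven purely by word overlap (no real model is involved): a negated query still associates most strongly with the very thing it negates.

```python
# Toy illustration (not a real model): a purely statistical word-overlap
# matcher has no notion of negation, so "no" is just another token.
def overlap(query: str, caption: str) -> float:
    q, c = set(query.lower().split()), set(caption.lower().split())
    return len(q & c) / len(q | c)  # Jaccard similarity over surface words

query = "a room with no elephants"
for caption in ["a room with elephants", "an empty living room with a sofa"]:
    print(f"{caption!r}: {overlap(query, caption):.2f}")
# The elephant caption scores higher (0.80 vs 0.33), even though the
# query explicitly excludes elephants.
```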

Defining AI Reliability

  • Two criteria:
    • Consistency: Ability to give similar outputs for similar inputs.
    • Factuality: Providing accurate information, including acknowledging ignorance when unsure.
  • Hallucination compromises factuality: AI may confidently state false information instead of admitting uncertainty.
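A rough, hypothetical sketch of how these two criteria could be checked in practice. The `ask_model` function below is a stand-in stub, not any real API: repeating a prompt and measuring agreement probes consistency, while comparing an answer against a known reference probes factuality.

```python
import collections

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; cycles through canned
    answers so the sketch runs without any API access."""
    canned = ["Paris", "Paris", "Lyon", "Paris"]
    ask_model.calls = getattr(ask_model, "calls", 0) + 1
    return canned[(ask_model.calls - 1) % len(canned)]

def consistency_score(prompt: str, n: int = 3) -> float:
    """Consistency: fraction of repeated answers agreeing with the majority answer."""
    answers = [ask_model(prompt).strip().lower() for _ in range(n)]
    _, count = collections.Counter(answers).most_common(1)[0]
    return count / n

def is_factual(prompt: str, reference: str) -> bool:
    """Factuality: does a single answer match a known-correct reference?"""
    return ask_model(prompt).strip().lower() == reference.strip().lower()

print("consistency:", consistency_score("Capital of France?"))  # 2 of 3 answers agree here
print("factual:", is_factual("Capital of France?", "Paris"))
```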

Empirical Evidence of Hallucination

  • A 2023 study showed:
    • 55% of the references generated by ChatGPT-3.5 were hallucinated.
    • ChatGPT-4 reduced this to 18%, but hallucinations remained.

Challenges to AI Reliability

  • Benchmark manipulation: Some AI models might perform better on benchmarks by being trained on test data (“data leakage”).
  • Real-world drop: Good performance in benchmarks doesn’t always translate to real-life accuracy.
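A deliberately simple sketch of how such contamination could be screened for (exact match after normalisation; real benchmark audits use far fuzzier checks, and the data here is made up):

```python
# Toy contamination check ("data leakage"): flag benchmark/test items that
# also appear, after simple normalisation, in the training data.
def normalise(text: str) -> str:
    return " ".join(text.lower().split())

train_data = ["What is the capital of France?", "Explain photosynthesis."]
test_data = ["what is the capital of  FRANCE?", "Define entropy."]

train_seen = {normalise(t) for t in train_data}
leaked = [t for t in test_data if normalise(t) in train_seen]
print("possibly leaked test items:", leaked)
```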

Strategies to Reduce Hallucinations

  1. Band-aid training:
    • Newer models are trained with more examples where earlier models failed.
    • Example: Spotting failure patterns and fine-tuning with better data.
    • Limitation: Reactive, not preventive.
  2. Specialised AI models:
    • Small Language Models (SLMs) are trained for specific tasks.
    • Example: Microsoft's Orca 2 excels at math, reasoning, and summarisation.
  3. Retrieval-Augmented Generation (RAG):
    • AI refers to specific external sources (e.g., Wikipedia) for factual queries.
    • Reduces hallucinations by anchoring outputs to trusted databases (see the first sketch after this list).
  4. Curriculum learning:
    • AI is trained from simple to complex problems, mimicking human learning.
    • Helps improve performance over random data exposure (see the second sketch after this list).
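First sketch: a minimal, illustrative RAG loop. Nothing here reflects any specific product; the corpus, the word-overlap retriever and the `call_llm` name are all assumptions. The point is only that the model is asked to answer from a retrieved passage rather than from free recall.

```python
import re

# Minimal RAG sketch (illustrative only): retrieve the most relevant passage
# from a small trusted corpus, then ask the model to answer from it alone.
corpus = [
    "Pizza sauce is typically made from tomatoes, olive oil, garlic and herbs.",
    "The African bush elephant is the largest living land animal.",
]

def tokens(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question: str) -> str:
    """Pick the passage sharing the most words with the question."""
    q = tokens(question)
    return max(corpus, key=lambda passage: len(q & tokens(passage)))

def grounded_prompt(question: str) -> str:
    return (
        "Answer using ONLY the context below. "
        "If the context is not enough, say 'I don't know.'\n"
        f"Context: {retrieve(question)}\n"
        f"Question: {question}"
    )

print(grounded_prompt("What goes into pizza sauce?"))
# The grounded prompt would then be sent to a model, e.g. call_llm(grounded_prompt(q)),
# so the answer is anchored to the retrieved passage instead of free recall.
```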

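Second sketch: the data-ordering idea behind curriculum learning, with a made-up difficulty score and a stub in place of real training. It shows only the scheduling, not an actual optimiser.

```python
# Curriculum-learning sketch: feed examples to the trainer in order of a
# difficulty heuristic (easy stages first), instead of in random order.
examples = [
    {"q": "2 + 2", "difficulty": 1},
    {"q": "12 * 3", "difficulty": 1},
    {"q": "Solve x^2 - 5x + 6 = 0", "difficulty": 3},
    {"q": "Integrate x * e^x dx", "difficulty": 5},
]

def train_step(batch):
    # Placeholder for one optimisation step on a real model.
    print("training on:", [ex["q"] for ex in batch])

# Run the curriculum: one stage per difficulty level, easiest first.
for level in sorted({ex["difficulty"] for ex in examples}):
    train_step([ex for ex in examples if ex["difficulty"] == level])
```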
Limitations of Current Solutions

  • No absolute fix: Even the best techniques can’t guarantee hallucination-free AI.
  • Need for verification: Ongoing necessity of human oversight and fact-checking systems to ensure output reliability.
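One concrete, deliberately narrow example of such a verification layer: when a model cites a reference that carries a DOI, the DOI can be looked up against the public Crossref registry, and an unresolvable DOI is a strong hint that the citation was fabricated. A minimal sketch using the requests library:

```python
import requests

def doi_exists(doi: str) -> bool:
    """Return True if the DOI resolves in the Crossref registry;
    a 404 suggests the cited reference is mistyped or fabricated."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200

# Flag model-cited references whose DOIs do not resolve.
for doi in ["10.1038/nature14539", "10.0000/clearly-made-up-doi"]:
    print(doi, "->", "found" if doi_exists(doi) else "NOT FOUND")
```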

Trend: Becoming More Reliable (But Not Fully)

  • Reduced hallucination in newer versions:
    • Especially for common queries due to expanded training data.
  • Persistent problem for uncommon/complex inputs.

Expert Opinion Summary

  • Sarawagi (IIT-B): Current fixes for AI hallucination are “band-aid” solutions; models still can't say “I don't know.”
  • Chatterjee (IIT-D): AI can’t be fully reliable without real-time global knowledge access.
  • Kar (IIT-D): General models like ChatGPT may never eliminate hallucination entirely.
  • Consensus: Targeted models and smarter training help, but verification systems are crucial.

Conclusion

  • AI is becoming more reliable, especially for routine and common queries.
  • But hallucinations still exist, especially in complex, negative, or niche queries.
  • The field is progressing, but perfect reliability requires structural changes, and human oversight remains essential.

