Does AI still hallucinate or is it becoming more reliable?

Current State of AI Hallucination

  • Hallucination persists: Despite improvements, AI models like ChatGPT and Google’s AI Overviews still hallucinate—i.e., produce factually incorrect or absurd outputs.
  • Example incidents:
    • Google's AI told users to add glue to pizza sauce or eat rocks, both clearly fabricated answers.
    • DALL-E generated images with elephants when explicitly asked for a room with no elephants.

Relevance: GS 3 (Technology)

Why Do AI Models Hallucinate?

  • Statistical reasoning, not understanding: AI doesn't “understand” language. It uses statistical associations, so it struggles with abstract concepts like negation (e.g., “no elephants”).
  • Data limitation: Lack of training data on rare or negative queries contributes to these failures.
  • Conceptual connections: Hallucinations arise with queries needing complex reasoning or novel concept connections.
  • Training-testing gap: AI may perform well during testing but fail in real-world scenarios if it was never exposed to similar queries during training.
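A toy illustration of the first point, under the simplifying assumption that matching is driven purely by word overlap (no real model is involved): a negated query still associates most strongly with the very thing it negates.

```python
# Toy illustration (not a real model): a purely statistical word-overlap
# matcher has no notion of negation, so "no" is just another token.
def overlap(query: str, caption: str) -> float:
    q, c = set(query.lower().split()), set(caption.lower().split())
    return len(q & c) / len(q | c)  # Jaccard similarity over surface words

query = "a room with no elephants"
for caption in ["a room with elephants", "an empty living room with a sofa"]:
    print(f"{caption!r}: {overlap(query, caption):.2f}")
# The elephant caption scores higher (0.80 vs 0.33), even though the
# query explicitly excludes elephants.
```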

Defining AI Reliability

  • Two criteria:
    • Consistency: Ability to give similar outputs for similar inputs.
    • Factuality: Providing accurate information, including acknowledging ignorance when unsure.
  • Hallucination compromises factuality: AI may confidently state false information instead of admitting uncertainty.
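A rough, hypothetical sketch of how these two criteria could be checked in practice. The `ask_model` function below is a stand-in stub, not any real API: repeating a prompt and measuring agreement probes consistency, while comparing an answer against a known reference probes factuality.

```python
import collections

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; cycles through canned
    answers so the sketch runs without any API access."""
    canned = ["Paris", "Paris", "Lyon", "Paris"]
    ask_model.calls = getattr(ask_model, "calls", 0) + 1
    return canned[(ask_model.calls - 1) % len(canned)]

def consistency_score(prompt: str, n: int = 3) -> float:
    """Consistency: fraction of repeated answers agreeing with the majority answer."""
    answers = [ask_model(prompt).strip().lower() for _ in range(n)]
    _, count = collections.Counter(answers).most_common(1)[0]
    return count / n

def is_factual(prompt: str, reference: str) -> bool:
    """Factuality: does a single answer match a known-correct reference?"""
    return ask_model(prompt).strip().lower() == reference.strip().lower()

print("consistency:", consistency_score("Capital of France?"))  # 2 of 3 answers agree here
print("factual:", is_factual("Capital of France?", "Paris"))
```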

Empirical Evidence of Hallucination

  • A 2023 study showed:
    • 55% of the references generated by ChatGPT-3.5 were hallucinated.
    • ChatGPT-4 reduced this to 18%, but hallucinations remained.

Challenges to AI Reliability

  • Benchmark manipulation: Some AI models might perform better on benchmarks by being trained on test data (“data leakage”).
  • Real-world drop: Good performance in benchmarks doesn’t always translate to real-life accuracy.
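A deliberately simple sketch of how such contamination could be screened for (exact match after normalisation; real benchmark audits use far fuzzier checks, and the data here is made up):

```python
# Toy contamination check ("data leakage"): flag benchmark/test items that
# also appear, after simple normalisation, in the training data.
def normalise(text: str) -> str:
    return " ".join(text.lower().split())

train_data = ["What is the capital of France?", "Explain photosynthesis."]
test_data = ["what is the capital of  FRANCE?", "Define entropy."]

train_seen = {normalise(t) for t in train_data}
leaked = [t for t in test_data if normalise(t) in train_seen]
print("possibly leaked test items:", leaked)
```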

Strategies to Reduce Hallucinations

  1. Band-aid training:
    • Newer models are trained with more examples where earlier models failed.
    • Example: Spotting failure patterns and fine-tuning with better data.
    • Limitation: Reactive, not preventive.
  2. Specialised AI models:
    • Small Language Models (SLMs) are trained for specific tasks.
    • Example: Microsoft's Orca 2 excels at math, reasoning, and summarisation.
  3. Retrieval-Augmented Generation (RAG):
    • AI refers to specific external sources (e.g., Wikipedia) for factual queries.
    • Reduces hallucinations by anchoring outputs to trusted databases (see the first sketch after this list).
  4. Curriculum learning:
    • AI is trained from simple to complex problems, mimicking human learning.
    • Helps improve performance over random data exposure (see the second sketch after this list).
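First sketch: a minimal, illustrative RAG loop. Nothing here reflects any specific product; the corpus, the word-overlap retriever and the `call_llm` name are all assumptions. The point is only that the model is asked to answer from a retrieved passage rather than from free recall.

```python
import re

# Minimal RAG sketch (illustrative only): retrieve the most relevant passage
# from a small trusted corpus, then ask the model to answer from it alone.
corpus = [
    "Pizza sauce is typically made from tomatoes, olive oil, garlic and herbs.",
    "The African bush elephant is the largest living land animal.",
]

def tokens(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question: str) -> str:
    """Pick the passage sharing the most words with the question."""
    q = tokens(question)
    return max(corpus, key=lambda passage: len(q & tokens(passage)))

def grounded_prompt(question: str) -> str:
    return (
        "Answer using ONLY the context below. "
        "If the context is not enough, say 'I don't know.'\n"
        f"Context: {retrieve(question)}\n"
        f"Question: {question}"
    )

print(grounded_prompt("What goes into pizza sauce?"))
# The grounded prompt would then be sent to a model, e.g. call_llm(grounded_prompt(q)),
# so the answer is anchored to the retrieved passage instead of free recall.
```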

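Second sketch: the data-ordering idea behind curriculum learning, with a made-up difficulty score and a stub in place of real training. It shows only the scheduling, not an actual optimiser.

```python
# Curriculum-learning sketch: feed examples to the trainer in order of a
# difficulty heuristic (easy stages first), instead of in random order.
examples = [
    {"q": "2 + 2", "difficulty": 1},
    {"q": "12 * 3", "difficulty": 1},
    {"q": "Solve x^2 - 5x + 6 = 0", "difficulty": 3},
    {"q": "Integrate x * e^x dx", "difficulty": 5},
]

def train_step(batch):
    # Placeholder for one optimisation step on a real model.
    print("training on:", [ex["q"] for ex in batch])

# Run the curriculum: one stage per difficulty level, easiest first.
for level in sorted({ex["difficulty"] for ex in examples}):
    train_step([ex for ex in examples if ex["difficulty"] == level])
```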
Limitations of Current Solutions

  • No absolute fix: Even the best techniques can’t guarantee hallucination-free AI.
  • Need for verification: Ongoing necessity of human oversight and fact-checking systems to ensure output reliability.
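One concrete, deliberately narrow example of such a verification layer: when a model cites a reference that carries a DOI, the DOI can be looked up against the public Crossref registry, and an unresolvable DOI is a strong hint that the citation was fabricated. A minimal sketch using the requests library:

```python
import requests

def doi_exists(doi: str) -> bool:
    """Return True if the DOI resolves in the Crossref registry;
    a 404 suggests the cited reference is mistyped or fabricated."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200

# Flag model-cited references whose DOIs do not resolve.
for doi in ["10.1038/nature14539", "10.0000/clearly-made-up-doi"]:
    print(doi, "->", "found" if doi_exists(doi) else "NOT FOUND")
```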

Trend: Becoming More Reliable (But Not Fully)

  • Reduced hallucination in newer versions:
    • Especially for common queries due to expanded training data.
  • Persistent problem for uncommon/complex inputs.

Expert Opinion Summary

  • Sarawagi (IIT-B): Current fixes for AI hallucination are “band-aid” solutions; models still can't say “I don't know.”
  • Chatterjee (IIT-D): AI can’t be fully reliable without real-time global knowledge access.
  • Kar (IIT-D): General models like ChatGPT may never eliminate hallucination entirely.
  • Consensus: Targeted models and smarter training help, but verification systems are crucial.

Conclusion

  • AI is becoming more reliable, especially for routine and common queries.
  • But hallucinations still exist, especially in complex, negative, or niche queries.
  • The field is progressing, but perfect reliability requires structural changes, and human oversight remains essential.

