Question
Which of the following statements with regard to Large Language Models (LLMs) used in machine learning is/are correct?
1. LLMs assign probabilities to the next possible words and then pick the one with the highest probability.
2. LLMs process data through mathematical optimization to minimise prediction errors.
3. LLMs produce unbiased outputs.
(A) 1 only
(B) 1 and 2 only
(C) 2 and 3 only
(D) 1, 2 and 3
✓
Correct Answer: (B) 1 and 2 only — Statement 3 is the trap
LLMs are trained on biased human data → they inherit and can amplify those biases · “Unbiased” is the opposite of reality · Bias in AI systems is a widely studied research problem
Each Statement — Verified Against How LLMs Actually Work
1
“LLMs assign probabilities to the next possible words and then pick the one with the highest probability” — TRUE (with one nuance)
This accurately describes the fundamental mechanism of LLMs during text generation:
Assigns probabilities to next words → picks highest probability
✓ Correct — core mechanism
• At each step, the LLM computes a probability distribution over every possible next token (word/subword) in its vocabulary
• The model then selects the next token — either by picking the highest probability (greedy decoding) or sampling from the distribution (temperature-based sampling)
The nuance: “pick the one with the highest probability” describes greedy decoding, the simplest selection strategy. In practice, many LLMs sample from the distribution using temperature, top-k, or top-p settings to add variation and creativity. For UPSC purposes, Statement 1 still describes the core probability-based mechanism and is treated as correct.
✓ Core LLM text generation mechanism
Next token prediction via probability distribution → greedy selection or sampling. Foundational principle of all LLMs.
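To make the difference between greedy decoding and sampling concrete, here is a minimal Python sketch. The four-word vocabulary and the scores (logits) are made up for illustration; a real LLM computes scores over tens of thousands of tokens with a neural network, which this sketch does not attempt to model.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores (logits) into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(vocab, probs):
    """Greedy decoding: always return the most probable token."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return vocab[best]

def sample_pick(vocab, probs, rng):
    """Sampling: draw one token at random, weighted by its probability."""
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical scores for the word after "The cat sat on the" (assumed values).
vocab = ["mat", "sofa", "roof", "moon"]
logits = [3.0, 2.0, 1.0, -1.0]

probs = softmax(logits)                       # distribution at temperature 1.0
flat_probs = softmax(logits, temperature=1.5) # higher temperature flattens it

greedy_token = greedy_pick(vocab, probs)
rng = random.Random(0)                        # fixed seed so runs are repeatable
sampled_tokens = [sample_pick(vocab, flat_probs, rng) for _ in range(5)]

print("greedy:", greedy_token)
print("sampled:", sampled_tokens)
```

Greedy decoding returns the same token every time for the same scores, while sampling can return different tokens across draws. Statement 1 describes the greedy case.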
2
“LLMs process data through mathematical optimization to minimise prediction errors” — TRUE
During training, LLMs use mathematical optimization (specifically gradient descent with backpropagation) to minimise a loss function:
Mathematical optimization to minimise prediction errors
✓ Correct — this is gradient descent / backpropagation
• The loss function measures the difference between the model’s predicted next token and the actual next token in the training data
• A common loss function is cross-entropy loss: L = -log(p_y) where p_y is the probability assigned to the correct next token
• If the model assigns probability 1 to the correct next token (p_y = 1), loss = 0. The lower the probability given to the correct token, the higher the loss.
• The optimizer (e.g., Adam or SGD) adjusts the model’s billions of parameters in the direction that reduces this loss
• Over billions of training examples, the model’s predictions become increasingly accurate
✓ Confirmed by LLM training theory
Training = gradient descent to minimise cross-entropy loss · Billions of parameters adjusted iteratively · Mathematical optimization is the ENGINE of LLM training
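The training loop described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how a real LLM is built: the "model" here is just three adjustable scores for a single context, the vocabulary and learning rate are invented, and plain gradient descent stands in for optimizers such as Adam. It does show the same idea, which is to compute the cross-entropy loss L = -log(p_y) and nudge parameters to reduce it.

```python
import math

def softmax(logits):
    """Convert scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target):
    """L = -log(p_y): small when the correct token gets high probability."""
    return -math.log(probs[target])

# Toy "model": one adjustable score (parameter) per token (assumed setup).
vocab = ["mat", "sofa", "roof"]
params = [0.0, 0.0, 0.0]
target = 0            # the training text says the next token is "mat"
learning_rate = 0.5   # assumed step size

losses = []
for step in range(20):
    probs = softmax(params)
    losses.append(cross_entropy(probs, target))
    # Gradient of the loss w.r.t. each score: p_i - 1 for the target, p_i otherwise.
    grads = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    # Gradient descent: move each parameter a small step against its gradient.
    params = [w - learning_rate * g for w, g in zip(params, grads)]

final_probs = softmax(params)
print("first loss:", round(losses[0], 3), "last loss:", round(losses[-1], 3))
print("final probability of correct token:", round(final_probs[target], 3))
```

The loss falls at every step and the probability of the correct token rises, which is what "mathematical optimization to minimise prediction errors" means in Statement 2.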
3
“LLMs produce unbiased outputs” — COMPLETELY FALSE
This is one of the most well-documented and extensively studied failures of LLMs. LLMs produce biased outputs because:
“LLMs produce unbiased outputs”
✗ False — LLMs inherit and amplify human biases
1. Training data bias: LLMs are trained on vast amounts of internet text, books, and other human-generated content — all of which contain human biases (racial, gender, cultural, political, religious).
2. Learned bias: The model learns statistical patterns from the data — including biased patterns. If a word appears disproportionately with certain associations in training data, the model will reproduce those associations.
Examples of documented LLM biases:
Gender bias
Associating “doctor” with male, “nurse” with female. Assuming the engineer in a scenario is male.
Racial bias
Higher rates of negative sentiment associated with certain racial groups. Differential treatment in decision-making prompts.
Cultural/Religious bias
Western-centric perspectives. Under-representation of non-English-speaking cultures and their viewpoints.
Political bias
Tendency to reflect the political biases prevalent in training data. Different outputs for similar prompts with different political framings.
✗ LLMs are inherently biased
Training data contains human biases → model learns and reproduces biases. Bias-mitigation techniques (RLHF, Constitutional AI) help but do not eliminate bias. “Unbiased” = false.
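A small Python sketch can show why skewed data leads to skewed outputs. It is a simple counting model over an invented, deliberately lopsided toy corpus, not an LLM, but the principle is the one described above: the model only reproduces the statistics it was given, and always picking the most likely option makes the skew more extreme.

```python
from collections import Counter, defaultdict

# Hypothetical, deliberately skewed toy corpus of (profession, pronoun) pairs.
corpus = (
    [("doctor", "he")] * 8 + [("doctor", "she")] * 2
    + [("nurse", "she")] * 9 + [("nurse", "he")] * 1
)

# "Training": count how often each pronoun follows each profession.
counts = defaultdict(Counter)
for profession, pronoun in corpus:
    counts[profession][pronoun] += 1

def pronoun_probs(profession):
    """Estimated probability of each pronoun given the profession."""
    total = sum(counts[profession].values())
    return {p: c / total for p, c in counts[profession].items()}

def greedy_pronoun(profession):
    """Always pick the most frequent pronoun (greedy selection)."""
    return counts[profession].most_common(1)[0][0]

doctor_probs = pronoun_probs("doctor")
nurse_probs = pronoun_probs("nurse")

print("doctor:", doctor_probs, "-> greedy:", greedy_pronoun("doctor"))
print("nurse:", nurse_probs, "-> greedy:", greedy_pronoun("nurse"))
```

The data is 80% "he" for doctor, but greedy selection outputs "he" 100% of the time. This is one simple way a model can amplify, not just inherit, a bias in its training data.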
LLMs — Key Facts for UPSC
| Parameter | Detail |
| --- | --- |
| Core mechanism | Next token prediction via probability distribution · Transformer architecture with self-attention · Billions of parameters |
| Text generation | Computes probability distribution over vocabulary → selects next token (greedy/sampling) → repeats until output complete |
| Training | Mathematical optimization (gradient descent) · Minimize cross-entropy loss function · Adjusts parameters on billions of training examples |
| Bias — why it exists | Trained on human-generated internet text → inherits gender, racial, cultural, political biases present in training data |
| Bias — attempts to reduce | RLHF (Reinforcement Learning from Human Feedback) · Constitutional AI · Instruction tuning · Red-teaming · Bias filters |
| Notable LLMs | GPT-4 (OpenAI) · Gemini (Google) · Claude (Anthropic) · Llama (Meta) · India: Sarvam AI (35B, 105B parameters, launched at India-AI Impact Summit 2026) |
| India’s LLM push | GenomeIndia AI · INDIAai mission · Sarvam AI (indigenous LLMs for Indian languages) · BharatGPT |
| Statement 3 error | “Unbiased” is directly contradicted by a large body of AI research. Bias in LLMs is a widely researched topic. Statement 3 is categorically false. |
Memory Trick
🧠 Remember It This Way
Statement 3 is instantly wrong because of the word “unbiased”: Anyone who has used AI chatbots has likely seen biased outputs. LLMs are trained on human-written internet text, which carries human biases. “Unbiased” = false. Garbage in, garbage out: biased data in, biased outputs out.
Statement 1 = LLM text generation in one line: At each step, LLM computes probability over all possible next words → picks the most likely (or samples). This is the entire engine of ChatGPT, Gemini, Claude etc. Statement 1 correctly describes this.
Statement 2 = how LLMs learn: During training, the model makes predictions, measures how wrong it is (loss function), and uses gradient descent to adjust parameters toward lower errors. This is mathematical optimization minimising prediction errors — exactly Statement 2.


