How do graphics processing units work?

  • In 1999, Nvidia Corporation launched the GeForce 256, branding it the “world’s first GPU”; it was initially aimed at improving video-game graphics performance.
  • Over 25 years, GPUs have evolved from gaming hardware into core infrastructure for AI, cloud computing and the digital economy, powering large-scale neural-network training and data centres.
  • Today, high-end GPUs such as Nvidia’s H100 Tensor Core deliver up to roughly 1.9 quadrillion tensor operations per second (FP16/BF16, with structured sparsity enabled), forming the backbone of generative AI systems.
  • Nvidia commands roughly 90% of the discrete GPU market, raising competition-law and strategic supply-chain concerns globally.

Relevance

  • GS 3 (Science & Tech / Economy / Security / Environment):
    Parallel computing architecture; AI hardware backbone; 90% discrete GPU market dominance; supply-chain concentration in East Asia; energy-intensive data centres; strategic tech controls.
What is a GPU?
  • A Graphics Processing Unit (GPU) is a specialised processor designed for parallel processing, executing thousands of simple calculations simultaneously, unlike CPUs, which are optimised for complex sequential tasks.
  • A 1920×1080 display contains 2.07 million pixels per frame; at 60 frames per second, over 120 million pixel updates per second are required, illustrating the GPU’s parallel advantage (a worked sketch follows this list).
  • GPUs contain hundreds or thousands of cores; while individual cores are weaker than CPU cores, aggregate throughput makes GPUs ideal for repetitive workloads.
  • Both CPUs and GPUs use advanced fabrication nodes (e.g., 3–5 nm class silicon transistors), differing primarily in microarchitecture and workload specialisation.
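A back-of-the-envelope sketch in Python, using only the figures quoted in the bullets above:

```python
# Back-of-the-envelope arithmetic for the Full HD pixel workload above.
width, height = 1920, 1080
fps = 60

pixels_per_frame = width * height            # 2,073,600 (~2.07 million)
pixels_per_second = pixels_per_frame * fps   # 124,416,000 (>120 million)

print(f"Pixels per frame: {pixels_per_frame:,}")
print(f"Pixel updates/s : {pixels_per_second:,}")
```

Each of those ~124 million per-pixel updates is a small, independent calculation, which is exactly the shape of workload a many-core GPU handles well.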
1. Rendering Pipeline
  • Vertex Processing applies matrix transformations to the vertices of the triangles that make up 3D models, calculating spatial position and camera perspective using linear-algebra operations (a toy sketch of this step follows this list).
  • Rasterisation converts geometric triangles into pixel fragments, identifying which pixels correspond to specific shapes on screen.
  • Fragment (Pixel) Shading calculates final pixel colour using lighting models, textures, reflections and shadow algorithms through small programs called shaders.
  • The final image is written to frame-buffer memory and then displayed; high-speed data movement is enabled by high-bandwidth VRAM (Video RAM).
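A minimal Python/NumPy sketch of the vertex-processing step, assuming homogeneous coordinates and an illustrative placeholder matrix (not a real camera or projection setup):

```python
import numpy as np

# Toy sketch of the vertex-processing stage: a 4x4 transform applied to
# vertices in homogeneous coordinates (x, y, z, w = 1). The matrix values
# are illustrative placeholders, not a real model-view-projection setup.

# One triangle: three vertices, one per row.
vertices = np.array([
    [ 0.0,  1.0, -5.0, 1.0],
    [-1.0, -1.0, -5.0, 1.0],
    [ 1.0, -1.0, -5.0, 1.0],
])

# Placeholder transform: uniform 0.5x scale, leaving w untouched.
mvp = np.diag([0.5, 0.5, 0.5, 1.0])

# The GPU applies the same matrix to every vertex in parallel; here one
# matrix multiply transforms all three vertices at once.
transformed = vertices @ mvp.T
print(transformed)
```

A real scene applies the same transform to millions of vertices per frame, which is why this stage parallelises so naturally.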
2. Parallelism & AI Computing
  • Neural networks rely heavily on matrix and tensor multiplications, repetitive mathematical operations perfectly suited to the GPU’s parallel core architecture (see the sketch after this list).
  • Contemporary AI models contain millions to billions of parameters, demanding both compute intensity and high memory bandwidth.
  • Nvidia GPUs include Tensor Cores, specialised hardware units accelerating matrix multiplications central to deep learning workloads.
  • Google developed Tensor Processing Units (TPUs) specifically to optimise neural network computations at hyperscale.
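A minimal NumPy sketch of why a neural-network layer maps well onto parallel hardware; the layer shapes are assumed for illustration, not taken from any real model:

```python
import numpy as np

# One fully connected neural-network layer is a single matrix multiply,
# repeated across a batch. Shapes are assumed for illustration only.
batch, d_in, d_out = 64, 1024, 1024
x = np.random.randn(batch, d_in).astype(np.float32)   # input activations
w = np.random.randn(d_in, d_out).astype(np.float32)   # layer weights

y = x @ w   # 64 * 1024 * 1024 ≈ 67 million multiply-adds in one call
print(y.shape, f"{batch * d_in * d_out:,} multiply-adds")

# A GPU (or a Tensor Core / TPU) spreads these multiply-adds across
# thousands of hardware lanes; a CPU works through far fewer at a time.
```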
3. Hardware Placement & System Integration
  • GPUs may exist as discrete graphics cards connected via high-speed PCIe interfaces, or integrated within System-on-Chip (SoC) designs alongside CPUs.
  • High-end GPU packages often integrate High-Bandwidth Memory (HBM) stacks positioned close to the GPU die, reducing latency and increasing data throughput.
  • GPUs allocate larger die area to compute blocks and data pathways, whereas CPUs prioritise control logic, branch prediction and cache optimisation.
4. Energy Footprint of AI Compute
  • Example: Four Nvidia A100 GPUs (250 W each) used for a 12-hour training run consume approximately 12 kWh (4 × 250 W × 12 h) during the training phase alone; the arithmetic is worked through after this list.
  • Continuous inference operations may consume around 6 kWh per day, roughly equivalent to running an air conditioner at full compressor load for 4–6 hours daily.
  • Additional server components (CPU, RAM, cooling) add 30–60% overhead power consumption, increasing carbon footprint of AI infrastructure.
  • Large-scale AI training clusters with thousands of GPUs contribute significantly to data centre energy demand, raising sustainability concerns.
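The figures above, worked through in Python; all inputs are the source’s numbers (board power, hours, the 30–60% overhead range), not measurements:

```python
# Worked version of the energy figures quoted above; all inputs are the
# source's numbers, not measurements.
gpus = 4
board_power_w = 250            # A100 board power per the note
train_hours = 12

train_kwh = gpus * board_power_w * train_hours / 1000
print(f"Training energy: {train_kwh:.0f} kWh")         # 12 kWh

inference_kwh_per_day = 6
for overhead in (0.30, 0.60):                          # server overhead range
    total = inference_kwh_per_day * (1 + overhead)
    print(f"Inference + {overhead:.0%} overhead: {total:.1f} kWh/day")
```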
5. Strategic Significance & Geopolitics
  • GPUs have become critical for AI-enabled defence systems, cybersecurity, financial modelling and weather simulations, elevating them to strategic-technology status.
  • U.S. export controls on advanced GPUs to certain countries reflect the geopoliticisation of semiconductor supply chains.
  • High concentration of fabrication capacity in East Asia exposes AI infrastructure to geopolitical supply-chain disruptions.
  • GPU dominance accelerates innovation but risks vendor lock-in, limiting open competition and raising entry barriers for startups and sovereign AI initiatives.
  • Energy-intensive AI workloads may conflict with global climate commitments unless powered by renewable energy grids.
  • Dependence on a few firms for AI hardware undermines digital sovereignty for developing nations.
  • However, GPU-driven AI advancements contribute significantly to healthcare diagnostics, climate modelling and productivity gains.
Way Forward
  • Promote diversified semiconductor ecosystems through industrial policy and chip incentives, reducing excessive concentration risk.
  • Encourage open standards and interoperability frameworks to mitigate software lock-in effects of proprietary platforms like CUDA.
  • Mandate Green Data Centre norms, integrating renewable energy and efficiency benchmarks for AI compute clusters.
  • Strengthen global antitrust scrutiny while balancing innovation incentives and competition policy objectives.
Prelims Pointers
  • GPU = parallel processor; CPU = sequential complex processor.
  • 1920×1080 display = 2.07 million pixels per frame.
  • Nvidia H100 ≈ 1.9 quadrillion tensor ops/sec (FP16/BF16, with sparsity).
  • Nvidia ≈ 90% discrete GPU market share.
  • A100 board power ≈ 250 W.
Practice Question (15 Marks)
  • “Semiconductor hardware, particularly GPUs, has become a strategic pillar of the digital economy.”
    Examine the technological, economic and geopolitical implications of GPU dominance in the AI era.
