A GPU (Graphics Processing Unit) is a specialized processor designed to handle thousands of operations simultaneously. Think of it as a factory with 16,000 workers who can each do simple math — compared to a CPU, which is more like 16 expert engineers who can each solve complex problems.
GPUs were originally built to render graphics: calculating the color of millions of pixels 60 times per second. That requires doing the same math on different data, over and over, in parallel. This pattern is called SIMD (Single Instruction, Multiple Data); NVIDIA describes the GPU flavor of it as SIMT (Single Instruction, Multiple Threads).
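To make "the same math on different data" concrete, here is a minimal sketch in plain Python. The `brighten` function and the four-pixel image are made up for illustration, and Python runs the loop one pixel at a time; the point is only that no pixel's result depends on any other, which is what lets a GPU hand each pixel to a separate worker.

```python
# Illustrative only: plain Python processes one pixel at a time.
# A GPU would apply the same function to many pixels at once.

def brighten(pixel, amount):
    """Add `amount` to each RGB channel, clamping at 255."""
    r, g, b = pixel
    return (min(r + amount, 255), min(g + amount, 255), min(b + amount, 255))

# A tiny hypothetical "image": four RGB pixels.
pixels = [(10, 20, 30), (200, 220, 250), (0, 0, 0), (255, 128, 64)]

# Same instruction (brighten), different data (each pixel).
# No result depends on another pixel, so the work could run in parallel.
brightened = [brighten(p, 40) for p in pixels]
print(brightened)
```

A real frame has millions of pixels instead of four, but the structure is the same: one small function, applied independently everywhere.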
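It turns out that training AI models has the same pattern: multiply millions of pairs of numbers, add up the products, and repeat billions of times. Each of those sums can be computed independently of the others, so the work spreads naturally across thousands of workers. That's why GPUs became the engine behind modern AI.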
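The multiply-then-add step can be sketched in a few lines of plain Python. The names `dot` and `layer`, and the small weight and input values, are assumptions chosen for illustration rather than anything from a real model; real training code would run this on a GPU through a library instead of Python loops.

```python
def dot(weights, inputs):
    """Multiply matching pairs of numbers, then add up the products."""
    total = 0.0
    for w, x in zip(weights, inputs):
        total += w * x
    return total

def layer(weight_rows, inputs):
    """Compute one output per row of weights.

    Each output is independent of the others, so a GPU could
    compute all of them at the same time.
    """
    return [dot(row, inputs) for row in weight_rows]

# Hypothetical toy values: 2 outputs, 3 inputs.
weight_rows = [[1.0, 2.0, 3.0],
               [0.5, 0.0, -1.0]]
inputs = [2.0, 1.0, 4.0]

outputs = layer(weight_rows, inputs)
print(outputs)  # [16.0, -3.0]
```

Scale the rows and inputs up to thousands of entries each and repeat for every training step, and you get the billions of multiply-add operations described above.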