Wednesday, August 20, 2025

C07 The CPU: How it Works, Step-by-Step


The Central Processing Unit (CPU) - How it Works

Dr Sudheendra S G summarizes the core functions and components of the Central Processing Unit (CPU), often referred to as the "brain" of the computer. The summary outlines the fundamental cycle of program execution and introduces key internal structures and operational concepts.

1. The CPU's Fundamental Role: The Fetch-Decode-Execute Cycle

The CPU's primary function is to execute programs, which are essentially "a long list of tiny steps called instructions." The CPU operates by continuously performing three core actions:

  • Fetch: Retrieving an instruction from memory.
  • Decode: Interpreting what the instruction means (e.g., "LOAD A," "ADD B into A").
  • Execute: Performing the action specified by the instruction.

This "Fetch → Decode → Execute" cycle repeats "millions or billions of times per second," enabling the computer to run all its software.

2. Key Internal Components of a CPU

Even a simple CPU comprises three essential parts that work in concert:

  • Registers: These are "tiny, super-fast storage boxes for numbers (A, B, C, D)." They provide immediate access to data currently being processed, significantly faster than accessing main memory (RAM).
  • ALU (Arithmetic & Logic Unit): This component is responsible for all computational operations, including "math (add, subtract) and logic (AND, OR, NOT)."
  • Control Unit: Acting as "the conductor," the Control Unit manages and coordinates all operations within the CPU: it "tells everyone when to read, write, and compute."
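
As a rough illustration of the ALU's role, the sketch below models it as a single Python function. The function name alu, the operation names, and the 8-bit width are assumptions made for this example; a real ALU is a circuit, not a function call.

```python
MASK = 0xFF  # assumed register width: 8 bits, matching the 8-bit examples in this post


def alu(op, a, b=0):
    """Return the 8-bit result of a (hypothetical) ALU operation on a and b."""
    if op == "ADD":
        result = a + b
    elif op == "SUB":
        result = a - b
    elif op == "AND":
        result = a & b
    elif op == "OR":
        result = a | b
    elif op == "NOT":
        result = ~a  # b is ignored for NOT
    else:
        raise ValueError(f"unknown ALU operation: {op}")
    return result & MASK  # keep only the low 8 bits


print(alu("ADD", 3, 14))       # 17
print(alu("NOT", 0b00001111))  # 240, i.e. 11110000
```

In this picture the registers supply a and b, and the Control Unit chooses op.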

3. Program Storage and Instruction Structure

Programs and their instructions are stored in RAM (Random Access Memory) as sequences of 1s and 0s. Each instruction typically has two main parts:

  • Opcode: This specifies "what to do" (e.g., load, store, add).
  • Operands: These indicate "what to use," which could be data in registers or specific memory addresses.

For example, an 8-bit instruction can be structured as [opcode 4 bits][data 4 bits], such as 0010 1110, where 0010 is the opcode and 1110 is the operand.
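
Splitting such an instruction into its two fields takes a shift and a mask. The following is a minimal sketch; the helper name split_instruction is made up for this example.

```python
def split_instruction(instruction):
    """Split an 8-bit instruction into (opcode, operand), 4 bits each."""
    opcode = instruction >> 4        # upper 4 bits: what to do
    operand = instruction & 0b1111   # lower 4 bits: what to use
    return opcode, operand


opcode, operand = split_instruction(0b00101110)  # the example 0010 1110
print(f"{opcode:04b}", operand)  # 0010 14
```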

4. Special Registers for Program Flow

Two crucial special registers manage the smooth execution of a program:

  • Instruction Address Register (a.k.a. Program Counter - PC): This register holds the memory address of "which instruction to fetch next." After an instruction is executed, the PC is typically incremented to point to the subsequent instruction.
  • Instruction Register: This register temporarily "holds the instruction we just fetched" from RAM, allowing the Control Unit to decode and execute it.

5. The Fetch-Decode-Execute Cycle Walkthrough (Example Program)

The following step-by-step example traces a mini-program executing within a CPU, showing how the CPU's components and RAM interact:

Sample Program in RAM:

  • Addr 0: 0010 1110 → LOAD A from address 14
  • Addr 1: 0001 1111 → LOAD B from address 15
  • Addr 2: 1000 01 00 → ADD B into A
  • Addr 3: 0100 1101 → STORE A into address 13
  • Addr 14 (data): 00000011 (3)
  • Addr 15 (data): 00001110 (14)

Key Steps Illustrated:

  1. FETCH #1: PC (0) points to RAM[0]. Instruction 0010 1110 is fetched into the Instruction Register.
  2. DECODE #1: 0010 is identified as "LOAD A"; 1110 refers to address 14.
  3. EXECUTE #1: Control Unit reads RAM[14] (which is 3) and writes it to Register A. A becomes 00000011. PC increments to 1.
  4. FETCH #2: PC (1) points to RAM[1]. Instruction 0001 1111 is fetched.
  5. DECODE #2: 0001 is "LOAD B"; 1111 refers to address 15.
  6. EXECUTE #2: Control Unit reads RAM[15] (which is 14) and writes it to Register B. B becomes 00001110. PC increments to 2.
  7. FETCH #3: PC (2) points to RAM[2]. Instruction 1000 01 00 is fetched.
  8. DECODE #3: 1000 is "ADD"; 01 selects Register B, 00 selects Register A. The operation is A = A + B.
  9. EXECUTE #3: Control Unit routes A (3) and B (14) to the ALU. The ALU calculates 3 + 14 = 17. The result 00010001 is written back to Register A. PC increments to 3.
  10. FETCH #4: PC (3) points to RAM[3]. Instruction 0100 1101 is fetched.
  11. DECODE #4: 0100 is "STORE A"; 1101 is address 13.
  12. EXECUTE #4: Control Unit reads A (17) and writes it to RAM[13]. RAM[13] becomes 00010001. PC increments to 4.

This completes the mini-program: "load, load, add, store."
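
The twelve steps above can be reproduced with a small simulator. This is only a sketch under stated assumptions: the opcode values and the RAM contents come from the sample program, while the function names, the 16-cell RAM size, and the fixed step count (the sample program has no halt instruction) are choices made for this example.

```python
# Opcode values taken from the sample program above.
LOAD_A, LOAD_B, STORE_A, ADD = 0b0010, 0b0001, 0b0100, 0b1000
REG_NAMES = ["A", "B"]  # assumed 2-bit register ids: 00 -> A, 01 -> B


def load_sample_program():
    """Return a 16-cell RAM holding the sample program and its data."""
    ram = [0] * 16
    ram[0] = 0b00101110  # LOAD A from address 14
    ram[1] = 0b00011111  # LOAD B from address 15
    ram[2] = 0b10000100  # ADD B into A
    ram[3] = 0b01001101  # STORE A into address 13
    ram[14] = 0b00000011  # 3
    ram[15] = 0b00001110  # 14
    return ram


def run(ram, steps):
    """Run `steps` fetch-decode-execute cycles; return (registers, pc)."""
    regs = {"A": 0, "B": 0}
    pc = 0
    for _ in range(steps):
        ir = ram[pc]                             # FETCH into the instruction register
        opcode, operand = ir >> 4, ir & 0b1111   # DECODE into opcode and operand
        if opcode == LOAD_A:                     # EXECUTE
            regs["A"] = ram[operand]
        elif opcode == LOAD_B:
            regs["B"] = ram[operand]
        elif opcode == ADD:
            src = REG_NAMES[operand >> 2]        # first 2 bits: source register
            dst = REG_NAMES[operand & 0b11]      # last 2 bits: destination register
            regs[dst] = (regs[dst] + regs[src]) & 0xFF
        elif opcode == STORE_A:
            ram[operand] = regs["A"]
        else:
            raise ValueError(f"unknown opcode: {opcode:04b}")
        pc += 1                                  # point at the next instruction
    return regs, pc


ram = load_sample_program()
regs, pc = run(ram, steps=4)
print(regs["A"], pc, f"{ram[13]:08b}")  # 17 4 00010001
```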

6. The CPU Clock

The entire Fetch-Decode-Execute process is synchronized by "a clock—an electronic metronome."

  • Clock speed is measured in Hertz (cycles/second).
  • Early CPUs (e.g., Intel 4004, 1971) ran at ~740 kHz.
  • Modern CPUs operate at gigahertz (billions of cycles per second).
  • Overclocking runs the clock faster than its rated speed, which raises performance but also generates more heat.
  • Underclocking lowers the clock speed to save power and extend battery life.
  • "Modern CPUs use dynamic frequency scaling to adjust speed on demand" based on workload.
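
Clock speed translates directly into the time available for one cycle: period = 1 / frequency. The short sketch below works this out for the 4004's 740 kHz and for a 3 GHz clock; the 3 GHz figure is just an illustrative value for a modern CPU, not one taken from the text.

```python
def cycle_time_ns(frequency_hz):
    """Return the duration of one clock cycle in nanoseconds."""
    return 1e9 / frequency_hz


print(round(cycle_time_ns(740_000)))  # 1351 ns per cycle at 740 kHz
print(round(cycle_time_ns(3e9), 3))   # 0.333 ns per cycle at an assumed 3 GHz
```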

7. Status Flags

After certain operations, particularly those performed by the ALU, status flags are set to provide information about the result. Programs can read these flags to make conditional decisions. Common flags include:

  • Zero: Set if the result was 0.
  • Negative: Set if the result was negative.
  • Overflow: Set if the result was too large to fit in the allocated bits.
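
The sketch below shows one way these three flags could be derived for an 8-bit addition. It makes two assumptions: "overflow" is treated as an unsigned carry out of the top bit (many real CPUs track carry and signed overflow as separate flags), and "negative" means the top bit of the result is set, as in two's complement. The function name add_with_flags is made up for this example.

```python
def add_with_flags(a, b, bits=8):
    """Add two unsigned values and return (result, flags) for a `bits`-wide ALU."""
    mask = (1 << bits) - 1       # 0xFF for 8 bits
    sign_bit = 1 << (bits - 1)   # 0x80 for 8 bits
    raw = a + b
    result = raw & mask          # what actually fits in the register
    flags = {
        "zero": result == 0,
        "negative": bool(result & sign_bit),  # top bit set
        "overflow": raw > mask,               # assumed: carry out of the top bit
    }
    return result, flags


print(add_with_flags(3, 14))
# (17, {'zero': False, 'negative': False, 'overflow': False})
print(add_with_flags(200, 100))
# (44, {'zero': False, 'negative': False, 'overflow': True})
```

A conditional jump instruction would then inspect one of these flags to decide whether to change the Program Counter.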

8. CPU Communication with RAM

The complete CPU, consisting of Registers, ALU, Control Unit, and Clock, communicates with external RAM via dedicated connections:

  • Address lines: Used by the CPU to specify the memory location it wants to access.
  • Data lines: Used to transfer data between the CPU and RAM.
  • Control lines: Used to send control signals (e.g., read, write).
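
To make the three groups of lines concrete, the sketch below models RAM as a class whose single method takes one argument per group. The class name Ram, the method name access, and the parameter names are all hypothetical; real buses are wires carrying signals in parallel, not function arguments.

```python
class Ram:
    """A toy RAM with 8-bit cells, driven by address, data, and control inputs."""

    def __init__(self, size=16):
        self.cells = [0] * size

    def access(self, address, write_enable=False, data_in=0):
        """address: address lines; write_enable: control line; data_in/return: data lines."""
        if write_enable:
            self.cells[address] = data_in & 0xFF  # write: data flows CPU -> RAM
            return None
        return self.cells[address]                # read: data flows RAM -> CPU


ram = Ram()
ram.access(13, write_enable=True, data_in=17)  # like STORE A into address 13
print(ram.access(13))  # 17
```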

9. Modern CPU Enhancements

While the core Fetch-Decode-Execute cycle remains fundamental, modern CPUs incorporate advanced techniques to enhance speed and efficiency:

  • Pipelines: Overlapping the fetch, decode, and execute stages of multiple instructions.
  • Caches (L1/L2/L3): Small, very fast memory areas closer to the CPU to store frequently accessed data and instructions, reducing the need to access slower RAM.
  • Branch Prediction: Guessing the likely path of execution for conditional branches to avoid stalling the pipeline.
  • Out-of-Order Execution: Executing instructions in an order different from their original sequence if dependencies allow, to keep the CPU busy.
  • Multiple Cores: Including several independent CPUs on a single chip, allowing for parallel processing of multiple tasks.
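
The benefit of pipelining can be estimated with simple arithmetic. The sketch below assumes an idealized three-stage pipeline in which every stage takes exactly one clock cycle and nothing ever stalls; real pipelines have more stages and lose cycles to branches and memory waits. Both function names are made up for this example.

```python
def cycles_without_pipeline(n_instructions, stages=3):
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * stages


def cycles_with_pipeline(n_instructions, stages=3):
    """The first instruction fills the pipeline; each later one finishes a cycle apart."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


# The four-instruction mini-program (load, load, add, store):
print(cycles_without_pipeline(4), cycles_with_pipeline(4))  # 12 6
```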

In essence, the CPU continuously executes instructions by fetching them, understanding what they mean, and then performing the specified actions, all synchronized by an internal clock. This fundamental process, augmented by specialized components and modern optimizations, is how every application on a computer operates.

 

