Control Hazards and Branch Prediction

Control hazards (also known as Branch hazards) arise when the pipeline cannot determine the next instruction to fetch because it encountered a branch instruction. Since the CPU doesn't know if the branch will be taken until later in the pipeline, it may fetch the wrong instructions.

The Branch Penalty

When a branch is taken, the instructions already fetched into the pipeline (which followed the branch sequentially) are incorrect. These instructions must be discarded, and the pipeline must be 'flushed,' resulting in wasted clock cycles known as the branch penalty.

Static Branch Prediction

This is a simple technique where the hardware always makes the same decision for every branch, regardless of history.

  • **Predict Not Taken:** The CPU always assumes the branch will fail and continues fetching sequentially. If the branch is actually taken, the pipeline is flushed.
  • **Predict Taken:** The CPU always assumes the branch will succeed. This is less common because the target address often isn't known early enough.

Dynamic Branch Prediction

Modern CPUs use hardware that 'learns' from past behavior. If a loop has run 100 times, the hardware predicts it will run again.

  • **1-Bit Predictor:** Remembers if the branch was taken last time and predicts the same outcome.
  • **2-Bit Predictor (Bimodal):** A state machine that requires two consecutive 'wrong' guesses before changing its prediction. This is much more stable for loops.
  • **Branch Target Buffer (BTB):** A small cache that stores the destination address of previous branches to speed up fetching.

Comparison of Solutions

TechniqueHow it WorksComplexity
StallingWait until branch outcome is knownLowest (High penalty)
Static PredictionAlways guess the same outcomeLow
Delayed BranchExecute the next instruction anywayMedium (Compiler dependent)
Dynamic PredictionPredict based on historyHighest (Most efficient)

Common Mistakes to Avoid

  • Thinking 'Branch Prediction' and 'Data Forwarding' solve the same problem (one is for control, the other for data).
  • Assuming a 100% accurate branch predictor is possible (mispredictions are inevitable in complex code).
  • Confusing 'Conditional Branches' (if/else) with 'Unconditional Jumps' (which still have a small fetch delay).
  • Underestimating the impact of deep pipelines on branch penalties.

Advanced Concepts

  • Tournament Predictors
  • Global vs. Local Correlation
  • Speculative Execution
  • Branch Folding
  • Return Address Stack (RAS)

Practice Exercises

  • Calculate the CPI of a pipeline where 20% of instructions are branches with a 2-cycle penalty.
  • Draw a 2-bit saturating counter state machine.
  • Why do deep pipelines (like Intel's NetBurst) suffer more from mispredictions?
  • Explain how 'Branch Target Buffers' reduce the fetch stage delay.

Conclusion

Control hazards are the single biggest obstacle to achieving an ideal CPI of 1 in high-performance processors. Advanced dynamic branch prediction is the 'secret sauce' that allows modern CPUs to guess correctly over 95% of the time, keeping the pipeline full and efficient.

Note: Note: In modern software optimization, avoiding unpredictable branches (like those inside tight loops) can significantly speed up execution by reducing pipeline flushes.