Control Hazards and Branch Prediction
Control hazards (also known as Branch hazards) arise when the pipeline cannot determine the next instruction to fetch because it encountered a branch instruction. Since the CPU doesn't know if the branch will be taken until later in the pipeline, it may fetch the wrong instructions.
The Branch Penalty
When a branch is taken, the instructions already fetched into the pipeline (which followed the branch sequentially) are incorrect. These instructions must be discarded, and the pipeline must be 'flushed,' resulting in wasted clock cycles known as the branch penalty.
Static Branch Prediction
This is a simple technique where the hardware always makes the same decision for every branch, regardless of history.
- **Predict Not Taken:** The CPU always assumes the branch will fail and continues fetching sequentially. If the branch is actually taken, the pipeline is flushed.
- **Predict Taken:** The CPU always assumes the branch will succeed. This is less common because the target address often isn't known early enough.
Dynamic Branch Prediction
Modern CPUs use hardware that 'learns' from past behavior. If a loop has run 100 times, the hardware predicts it will run again.
- **1-Bit Predictor:** Remembers if the branch was taken last time and predicts the same outcome.
- **2-Bit Predictor (Bimodal):** A state machine that requires two consecutive 'wrong' guesses before changing its prediction. This is much more stable for loops.
- **Branch Target Buffer (BTB):** A small cache that stores the destination address of previous branches to speed up fetching.
Comparison of Solutions
| Technique | How it Works | Complexity |
|---|---|---|
| Stalling | Wait until branch outcome is known | Lowest (High penalty) |
| Static Prediction | Always guess the same outcome | Low |
| Delayed Branch | Execute the next instruction anyway | Medium (Compiler dependent) |
| Dynamic Prediction | Predict based on history | Highest (Most efficient) |
Common Mistakes to Avoid
- Thinking 'Branch Prediction' and 'Data Forwarding' solve the same problem (one is for control, the other for data).
- Assuming a 100% accurate branch predictor is possible (mispredictions are inevitable in complex code).
- Confusing 'Conditional Branches' (if/else) with 'Unconditional Jumps' (which still have a small fetch delay).
- Underestimating the impact of deep pipelines on branch penalties.
Advanced Concepts
- Tournament Predictors
- Global vs. Local Correlation
- Speculative Execution
- Branch Folding
- Return Address Stack (RAS)
Practice Exercises
- Calculate the CPI of a pipeline where 20% of instructions are branches with a 2-cycle penalty.
- Draw a 2-bit saturating counter state machine.
- Why do deep pipelines (like Intel's NetBurst) suffer more from mispredictions?
- Explain how 'Branch Target Buffers' reduce the fetch stage delay.
Conclusion
Control hazards are the single biggest obstacle to achieving an ideal CPI of 1 in high-performance processors. Advanced dynamic branch prediction is the 'secret sauce' that allows modern CPUs to guess correctly over 95% of the time, keeping the pipeline full and efficient.
Codecrown