Data Hazards in Pipelining

Data hazards occur when instructions that share a data dependency overlap in the pipeline and read or modify the same data in different stages. If the hardware or the compiler does not handle them, the program can produce incorrect results, because an instruction may read a value before a previous instruction has updated it.

Types of Data Hazards

There are three types of data hazards, though simple in-order pipelines primarily deal with the first one (RAW).

  • **RAW (Read After Write):** The most common hazard. A later instruction tries to read a source before an earlier instruction writes to it. (True Dependency)
  • **WAR (Write After Read):** A later instruction tries to write to a destination before an earlier instruction reads it. (Anti-Dependency)
  • **WAW (Write After Write):** A later instruction tries to write to a destination before an earlier instruction writes to it, so the destination ends up holding the older value. (Output Dependency)
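
The three definitions above can be checked mechanically for any pair of instructions. The following is a minimal sketch, assuming each instruction has at most one destination register and a tuple of source registers; the `Instr` class and `dependencies` function are hypothetical names introduced only for illustration. Note that it reports dependencies: whether a dependency becomes an actual hazard depends on the pipeline's design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction model: one destination, several sources."""
    dest: str
    srcs: tuple


def dependencies(earlier, later):
    """Return the dependency types that `later` has on `earlier`."""
    found = set()
    if earlier.dest in later.srcs:
        found.add("RAW")   # later reads what earlier writes
    if later.dest in earlier.srcs:
        found.add("WAR")   # later writes what earlier reads
    if later.dest == earlier.dest:
        found.add("WAW")   # both write the same register
    return found


i1 = Instr(dest="R1", srcs=("R2", "R3"))   # ADD R1, R2, R3
i2 = Instr(dest="R4", srcs=("R1", "R5"))   # SUB R4, R1, R5
print(dependencies(i1, i2))
```

For the ADD/SUB pair used in this section, only the RAW dependency on R1 is reported.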

A Closer Look: RAW Hazard

Consider the following assembly sequence. Instruction 2 depends on the result of Instruction 1, which is stored in register R1.

I1: ADD R1, R2, R3  ; R1 is computed in the EX stage, written back in the WB stage
I2: SUB R4, R1, R5  ; Needs R1 in ID stage
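
To see why this sequence is a problem, it helps to count cycles. The sketch below assumes the classic 5-stage pipeline (IF, ID, EX, MEM, WB), no forwarding, and a consumer that reads its registers in ID. The function name and the `write_then_read_same_cycle` flag are hypothetical; the flag models a register file that writes in the first half of a cycle and reads in the second half, which is a common but not universal design.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def raw_stalls_without_forwarding(distance, write_then_read_same_cycle=True):
    """Bubbles needed when the consumer is `distance` instructions behind.

    The producer is fetched in cycle 1, so it reaches WB in cycle 5.
    """
    producer_wb = 1 + STAGES.index("WB")
    # Cycle in which the consumer would reach ID if nothing stalled it.
    consumer_id = 1 + distance + STAGES.index("ID")
    # Earliest cycle in which the register file returns the new value.
    earliest_id = producer_wb if write_then_read_same_cycle else producer_wb + 1
    return max(0, earliest_id - consumer_id)


# I2 directly follows I1, so distance = 1.
print(raw_stalls_without_forwarding(1))          # 2
print(raw_stalls_without_forwarding(1, False))   # 3
```

Under these assumptions, I2 must wait two cycles (three if the register file cannot write and read in the same cycle), and the penalty disappears once at least two independent instructions sit between I1 and I2.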

Solutions for Data Hazards

Architects use several hardware and software tricks to maintain the speed of the pipeline without sacrificing accuracy.

| Technique | How it Works | Impact |
| --- | --- | --- |
| Stalling (Bubbles) | The pipeline is paused for one or more cycles until the data is ready. | Decreases performance (increases CPI). |
| Forwarding (Bypassing) | The result is sent directly from the ALU to the next instruction's input. | High performance; eliminates most stalls. |
| Code Reordering | The compiler moves independent instructions between dependent ones. | No hardware cost; depends on code structure. |
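
The forwarding decision can be written as a small selection function. This is a sketch of the commonly taught condition for a MIPS-style 5-stage pipeline, not a description of any specific processor. It assumes register 0 is hardwired to zero and so is never forwarded; the function name, parameter names, and return labels are hypothetical.

```python
def forward_select(src_reg, ex_mem_rd, ex_mem_regwrite,
                   mem_wb_rd, mem_wb_regwrite):
    """Choose where an ALU operand for `src_reg` should come from.

    Register numbers are plain ints; register 0 is assumed hardwired to zero.
    """
    # The newest result lives in EX/MEM, so it takes priority.
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == src_reg:
        return "EX/MEM"
    # Otherwise use the older result waiting in MEM/WB.
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == src_reg:
        return "MEM/WB"
    # No in-flight producer: use the value read from the register file.
    return "REGFILE"


# SUB reads R1 while the preceding ADD (which writes R1) is in EX/MEM.
print(forward_select(src_reg=1, ex_mem_rd=1, ex_mem_regwrite=True,
                     mem_wb_rd=7, mem_wb_regwrite=True))  # EX/MEM
```

Checking EX/MEM before MEM/WB matters when two in-flight instructions write the same register: the consumer must see the most recent value.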

Load-Use Data Hazard

A special case is the load-use hazard. Data from a **LOAD** instruction isn't available until the end of the MEM stage, one cycle later than an ALU result. In the classic 5-stage pipeline, even with forwarding, a one-cycle stall is required if the very next instruction uses that data.
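
A hazard detection unit can spot this case while the dependent instruction is still in ID. The sketch below assumes a 5-stage pipeline in which the load sits in the ID/EX register and the following instruction in IF/ID; all names are hypothetical and the check is deliberately simplified (for example, it does not treat a hardwired zero register specially).

```python
def load_use_stall(id_ex_memread, id_ex_dest, if_id_src1, if_id_src2):
    """Return True if the instruction in ID must wait one cycle.

    id_ex_memread: the instruction currently in EX is a load.
    id_ex_dest:    the register that load will write.
    if_id_src1/2:  source registers of the instruction currently in ID.
    """
    return bool(id_ex_memread and id_ex_dest in (if_id_src1, if_id_src2))


# LW R1, 0(R2) followed directly by ADD R3, R1, R4: one bubble is needed.
print(load_use_stall(True, 1, 1, 4))   # True
# The same ADD after an ALU instruction writing R1: forwarding is enough.
print(load_use_stall(False, 1, 1, 4))  # False
```

When the function returns True, the usual response is to hold the PC and the IF/ID register for one cycle and inject a bubble into EX, after which the loaded value can be forwarded from MEM/WB.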

Common Mistakes to Avoid

  • Confusing 'Data Hazards' with 'Structural Hazards' (which are about hardware resources like memory ports).
  • Thinking forwarding solves all RAW hazards without stalls (Load-Use still needs a bubble).
  • Assuming WAR and WAW hazards happen in simple 5-stage in-order pipelines (they usually only occur in out-of-order execution).
  • Ignoring the role of the compiler in mitigating hazards.

Advanced Concepts

  • Scoreboarding
  • Tomasulo’s Algorithm (Dynamic Scheduling)
  • Register Renaming (solves WAR/WAW)
  • Speculative Execution
  • Bypass network complexity

Practice Exercises

  • Identify all RAW dependencies in a block of 5 assembly instructions.
  • Draw a pipeline diagram showing a 'bubble' being inserted for a Load-Use hazard.
  • Explain why forwarding from the EX/MEM register to the ALU input works.
  • Compare the performance of a program with and without hardware forwarding enabled.

Conclusion

Data hazards are a natural side effect of increasing CPU throughput via pipelining. By using sophisticated techniques like forwarding and dynamic scheduling, modern processors can minimize stalls and keep the execution units busy.

Note: Forwarding is one of the most important optimizations in classic RISC pipelines, bringing the CPI closer to the ideal value of 1.
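
The effect of stalls on CPI can be estimated by adding the average number of stall cycles per instruction to the ideal CPI of 1. The sketch below illustrates that arithmetic; the function name is hypothetical, and the frequencies and penalties are made-up numbers chosen for illustration, not measurements.

```python
def effective_cpi(base_cpi, stall_sources):
    """base_cpi plus the sum of (fraction of instructions * stall cycles)."""
    return base_cpi + sum(freq * penalty for freq, penalty in stall_sources)


# Hypothetical workload without forwarding:
# 40% of instructions wait 2 cycles on a RAW dependency.
no_forwarding = effective_cpi(1.0, [(0.40, 2)])

# Hypothetical workload with forwarding:
# only load-use pairs stall, 20% of instructions, 1 cycle each.
with_forwarding = effective_cpi(1.0, [(0.20, 1)])

print(f"without forwarding: {no_forwarding:.2f}")
print(f"with forwarding:    {with_forwarding:.2f}")
```

With these assumed numbers, the CPI drops from 1.80 to 1.20, which shows how removing most RAW stalls moves the pipeline toward its ideal throughput.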