Pipelining


Pipelining is an implementation technique in which the execution of multiple instructions is overlapped.


In a car assembly plant, the work done to produce a new car is broken down into several small jobs. As a car progresses along the assembly line each of these jobs is performed in sequence.


The total time taken to produce a car is still the same, but several cars can be assembled simultaneously.


A new car appears at a rate determined by how long each of the small jobs takes (the slowest job sets the pace).


Executing an instruction on the MIPS processor can be thought of as a sequence of several steps. For example:


lw $1, 100($0)    # load the word at address 100 into register 1


is made up of five separate steps:



1. IF: Instruction fetch from Instr Memory


2. ID: Instruction Decode and Register fetch (Reg 0 = 0)


3. EX: Execute - add the immediate value (100) to the register output (0)


4. MEM: Memory access - read location 100


5. WB: Write Back - write data into register 1


Total time is 40ns (IF, EX and MEM take 10ns each; ID and WB take 5ns each).


If we perform several loads in succession:


lw $1, 100($0)

lw $2, 200($0)

lw $3, 300($0)


The total time is 120ns (3 x 40ns), because each load must finish before the next one starts.
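
The arithmetic above can be checked with a short sketch. The per-step times are an assumption inferred from the delays described later in these notes (IF, EX and MEM at 10ns; ID and WB at 5ns), and the names STEP_TIME_NS and sequential_time_ns are purely illustrative.

```python
# Assumed per-step times in ns: ID and WB are the fast steps (5ns),
# the other three take 10ns. These values are inferred, not measured.
STEP_TIME_NS = {"IF": 10, "ID": 5, "EX": 10, "MEM": 10, "WB": 5}


def sequential_time_ns(num_instructions):
    """Total time when each instruction finishes before the next one starts."""
    per_instruction = sum(STEP_TIME_NS.values())
    return num_instructions * per_instruction


print(sequential_time_ns(1))  # one lw: 40
print(sequential_time_ns(3))  # three lw in succession: 120
```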




We can reduce the total time if we overlap these steps, starting each instruction before the previous one has finished.


Each step is independent of the others, but because some steps are faster than others we have to insert delays so that every step takes the same amount of time.




After the ID and WB steps we must wait 5ns, so each step now takes the same amount of time: 10ns.

Each instruction now takes 50ns, which is 10ns longer than before.


All three instructions take 70ns (50ns for the first, plus 10ns for each of the other two), which is better than 120ns.


More importantly, it only takes 10ns between instructions: once the pipeline is full, one instruction completes every 10ns.
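
The overlap can be pictured with a small sketch that prints which step each load occupies in every 10ns cycle. It is only an illustration: the function name pipeline_chart and the text layout are made up here, and it assumes an ideal pipeline with no stalls.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]
CYCLE_NS = 10  # every step is padded to 10ns


def pipeline_chart(instructions):
    """Return one text row per instruction, plus the total time in ns."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i, name in enumerate(instructions):
        cells = []
        for cycle in range(total_cycles):
            stage_index = cycle - i  # instruction i enters IF in cycle i
            if 0 <= stage_index < len(STAGES):
                cells.append(f"{STAGES[stage_index]:<4}")
            else:
                cells.append("    ")  # not in the pipeline during this cycle
        rows.append(f"{name:<16}" + "".join(cells).rstrip())
    return rows, total_cycles * CYCLE_NS


rows, total_ns = pipeline_chart(
    ["lw $1, 100($0)", "lw $2, 200($0)", "lw $3, 300($0)"]
)
for row in rows:
    print(row)
print(f"total: {total_ns}ns")
```

Each row is shifted one cycle to the right of the row above it, so in the third cycle the first lw is in EX while the second is in ID.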


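We have effectively increased our throughput by a factor of 4: one instruction completes every 10ns instead of every 40ns.


1000 instructions now take 10,040ns (50ns for the first, plus 999 x 10ns) instead of 40,000ns.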
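
These figures follow from a simple formula: the first instruction needs all five 10ns steps, and every later instruction finishes one step after the one before it. A minimal sketch, assuming an ideal pipeline with no stalls (the function names are illustrative only):

```python
NUM_STAGES = 5
CYCLE_NS = 10        # padded time of one pipeline step
UNPIPELINED_NS = 40  # time for one instruction without pipelining


def pipelined_time_ns(n):
    """The first instruction takes NUM_STAGES cycles; each later one adds one cycle."""
    return (NUM_STAGES + n - 1) * CYCLE_NS


def unpipelined_time_ns(n):
    """Each instruction runs to completion before the next one starts."""
    return n * UNPIPELINED_NS


for n in (1, 3, 1000):
    p = pipelined_time_ns(n)
    u = unpipelined_time_ns(n)
    print(n, p, u, round(u / p, 2))
```

For a single instruction the pipeline is slower (50ns against 40ns), but the speedup approaches 4 as the number of instructions grows: for 1000 instructions it is 40,000 / 10,040, or about 3.98.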



We must be careful!



In cycle 3, the ALU is adding the contents of register 0 to 100 for the first instruction.


At the same time, the immediate value is changing from 100 to 200 as the second instruction is decoded.



Each pipeline stage needs to know the data for the instruction that it is currently working on.


Before each of the stages, we need to include a register to remember the details of the current instruction for that stage.


We pass the contents of this register to the next stage at the beginning of the next cycle.
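
The idea can be sketched in a few lines. This is a simplified model, not the real MIPS datapath: it covers only the IF, ID and EX stages, and the names if_id, id_ex and run are assumptions made for illustration. The point is that EX reads only its own register, so it still sees 100 while ID is already decoding 200.

```python
# Each instruction is (destination register, immediate offset, base register).
PROGRAM = [(1, 100, 0), (2, 200, 0), (3, 300, 0)]
REGS = [0, 0, 0, 0]  # register 0 holds 0


def run(program, regs):
    """Return the addresses computed by EX, one per instruction."""
    if_id = None  # register in front of ID: the fetched instruction
    id_ex = None  # register in front of EX: the decoded fields
    addresses = []
    pc = 0
    while pc < len(program) or if_id is not None or id_ex is not None:
        # EX works only from the values held in its own pipeline register.
        if id_ex is not None:
            addresses.append(id_ex["base_value"] + id_ex["imm"])

        # ID decodes the instruction held in its register.
        next_id_ex = None
        if if_id is not None:
            dest, imm, base = if_id
            next_id_ex = {"dest": dest, "imm": imm, "base_value": regs[base]}

        # IF fetches the next instruction, if there is one.
        next_if_id = None
        if pc < len(program):
            next_if_id = program[pc]
            pc += 1

        # Start of the next cycle: every register passes its contents on.
        if_id, id_ex = next_if_id, next_id_ex
    return addresses


print(run(PROGRAM, REGS))  # [100, 200, 300]
```

Without the id_ex register, EX would read the immediate directly from the decoder and would see the second instruction's 200 while still working on the first load.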






159.233 Hardware 18 - 1