Advanced Computer Architecture - Lecture 13: Instruction level parallelism

CS 704
Advanced Computer Architecture

Lecture 13
Instruction Level Parallelism
(Dynamic Scheduling - Scoreboard Approach)

Prof. Dr. M. Ashraf Chughtai


Today's Topics
Recap - Lecture 11-12
Out-of-Order Execution
Dynamic Scheduling
Scoreboard Technique
Summary

MAC/VU-Advanced
Computer Architecture

Lecture 13 – Instruction Level
Parallelism -Dynamic (2)



Recap: Lecture 12
- FP and Integer Multiplier
- FP and Integer Divider
Here, we observed that:
- Only one instruction is issued per clock cycle
- Integer ADD instructions pass through the FP pipeline just as they do through the standard pipeline, because integer ALU operations have zero latency
- FP add and FP/integer multiply and divide instructions loop in the EX stage because of the longer latencies of these operations, which increases the number of stalls before a following instruction can be issued to the EX stage


Recap: Lecture 12
RAW and WAW hazards may occur because the instructions have varying latencies and may reach WB out of order
There are different ways to handle a RAW hazard:




Recap: Lecture 12
WAW hazard
(Instruction i precedes instruction j in program order. The jth instruction writes before the ith instruction does, so the ith instruction then overwrites the result of the jth instruction.)



Recap: Lecture 12
Two ways to resolve a WAW hazard
- Delay the issue of the jth instruction until the ith instruction enters the MEM stage
- Stamp out the ith instruction by detecting the hazard and changing the control (WB) so that the ith instruction does not write. Hence, the jth instruction can be issued right away.



Today's Topics
Out-of-Order Execution
Problems of Out-of-order Execution
Dynamic Scheduling
Scoreboard Technique
Summary


In-Order Execution
A simple pipelined datapath supports only in-order instruction execution, i.e.,
- instructions are fetched, decoded and issued in program order, and
- no later instruction can proceed if an instruction is stalled due to a hazard (structural or data dependence)


In-order Execution … Cont’d
For example, in the code:
    DIV.D   F0, F2, F4
    ADD.D   F10, F0, F8
    SUB.D   F12, F8, F14
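
The effect of in-order issue on the three instructions above can be sketched in a few lines of Python. This is only an illustrative model, not the lecture's own material: the tuple layout and the helper names `raw_producers` and `blocked_in_order` are hypothetical, and the sketch assumes that every earlier producer (here the long-latency DIV.D) has not yet written its result.

```python
# Hypothetical model: each instruction is (opcode, dest, src1, src2).
program = [
    ("DIV.D", "F0", "F2", "F4"),
    ("ADD.D", "F10", "F0", "F8"),
    ("SUB.D", "F12", "F8", "F14"),
]

def raw_producers(program, index):
    """Indices of earlier instructions whose destination this one reads."""
    _, _, *sources = program[index]
    return [j for j in range(index) if program[j][1] in sources]

def blocked_in_order(program):
    """Assumption: producers have not finished yet. Under in-order issue an
    instruction waits if it has a RAW dependence, or if any earlier
    instruction is already stalled."""
    result, stalled_ahead = [], False
    for i, instr in enumerate(program):
        depends = bool(raw_producers(program, i))
        if depends:
            reason = "RAW"
        elif stalled_ahead:
            reason = "behind stalled instruction"
        else:
            reason = None
        result.append((instr[0], reason))
        stalled_ahead = stalled_ahead or depends
    return result

for opcode, reason in blocked_in_order(program):
    print(opcode, "->", reason or "free to proceed")
```

In this model SUB.D has no data dependence on DIV.D or ADD.D, yet it is held back only because ADD.D ahead of it is stalled.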



In-order Execution … Cont’d
Conclusion



In-order Execution: MIPS 5-stage Pipeline
In the MIPS 5-stage pipeline, both structural and data hazards are checked during the Instruction Decode (ID) stage, and the instruction is issued from the ID stage if it can execute properly.
Here, the issue process at the ID stage is separated into two parts:
- Checking for structural hazards
- Waiting for the absence of data hazards
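
The two parts of the issue process listed above can be sketched as two independent checks. This is a minimal sketch under stated assumptions: the function names `can_issue` and `can_read_operands`, the `unit_busy` map, and the `pending_writes` set are all hypothetical names chosen for illustration.

```python
def can_issue(unit_busy, unit):
    """Part 1 (structural check): the required functional unit must be free."""
    return not unit_busy.get(unit, False)

def can_read_operands(pending_writes, sources):
    """Part 2 (data check): no source register may await a pending write."""
    return not any(reg in pending_writes for reg in sources)

# Assumed machine state while DIV.D F0, F2, F4 is still executing.
unit_busy = {"FP Divide": True, "FP Adder": False}
pending_writes = {"F0"}  # DIV.D has not written F0 yet

# ADD.D F10, F0, F8 passes the structural check but must wait for F0.
print(can_issue(unit_busy, "FP Adder"))
print(can_read_operands(pending_writes, ["F0", "F8"]))

# Once DIV.D writes F0, the data check succeeds as well.
pending_writes.discard("F0")
print(can_read_operands(pending_writes, ["F0", "F8"]))
```

Keeping the two checks separate is what lets an instruction occupy a functional unit while it still waits for its operands, instead of blocking every instruction behind it in ID.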




Out-of-order Execution: MIPS 5-stage pipeline
    DIV.D   F0, F2, F4
    ADD.D   F10, F0, F8
    SUB.D   F12, F8, F14






Basic Problems of Out-of-order Execution
Consider the example FP code:
    DIV.D   F0, F2, F4
    ADD.D   F6, F0, F8
    SUB.D   F8, F10, F14
    MUL.D   F6, F10, F8
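
The register dependences in the four-instruction example above can be enumerated mechanically. The following is a naive sketch, not the lecture's method: `find_hazards` is a hypothetical helper that compares every pair of instructions and reports potential hazards by register name, ignoring whether an intervening write would mask one of them.

```python
# Hypothetical encoding: (opcode, dest, src1, src2), in program order.
program = [
    ("DIV.D", "F0", "F2", "F4"),
    ("ADD.D", "F6", "F0", "F8"),
    ("SUB.D", "F8", "F10", "F14"),
    ("MUL.D", "F6", "F10", "F8"),
]

def find_hazards(program):
    """Pairwise check: instruction i comes before instruction j."""
    hazards = []
    for i, (op_i, dest_i, *src_i) in enumerate(program):
        for op_j, dest_j, *src_j in program[i + 1:]:
            if dest_i in src_j:      # j reads what i writes
                hazards.append(("RAW", op_i, op_j, dest_i))
            if dest_j in src_i:      # j writes what i reads
                hazards.append(("WAR", op_i, op_j, dest_j))
            if dest_i == dest_j:     # both write the same register
                hazards.append(("WAW", op_i, op_j, dest_i))
    return hazards

for kind, first, second, reg in find_hazards(program):
    print(f"{kind}: {first} -> {second} on {reg}")
```

For this code the sketch reports a RAW dependence on F0 (DIV.D to ADD.D), a WAR dependence on F8 (ADD.D to SUB.D), a WAW dependence on F6 (ADD.D to MUL.D), and a RAW dependence on F8 (SUB.D to MUL.D).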




Example Explained: RAW Hazard



Example Explained: WAW hazard




Example Explained: WAW hazard





Scheduling for out-of-order execution


Static Scheduling:
Rearrangement of instruction execution by the compiler

Dynamic Scheduling:
Rearrangement of instruction execution by the hardware



Dynamic Scheduling
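- Issue: decode the instruction and check for structural hazards
- Read Operand: wait until no data hazards remain, then read the operands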



Dynamic Scheduling



Dynamic Scheduling: Scoreboarding Technique
CDC 6600 contains:
- 4 FP units
- 5 Memory Reference Units
- 7 integer operation units



MIPS Processor with Scoreboard
[Figure: block diagram showing the registers, the data buses, the functional units (FP Mul, FP Mul, FP Divide, FP Adder, Integer Unit), and the scoreboard with its control/status lines]


Features of Scoreboard
The Scoreboard:



Components of Scoreboard
Instruction Status
Functional Unit Status
Register Result Status
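
The three components listed above could be modelled roughly as follows. This is a sketch, not the lecture's definition: the slide names only the three tables, so the field names (`busy`, `op`, `fi`, `fj`, `fk`, `qj`, `qk`, `rj`, `rk`) are an assumption taken from the usual textbook description of a scoreboard, and the `issue` method covers the issue step only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class InstructionStatus:
    text: str
    stage: Optional[str] = None  # e.g. "issue", "read_operands", ...

@dataclass
class FunctionalUnitStatus:
    busy: bool = False
    op: Optional[str] = None
    fi: Optional[str] = None   # destination register
    fj: Optional[str] = None   # first source register
    fk: Optional[str] = None   # second source register
    qj: Optional[str] = None   # unit that will produce fj, if any
    qk: Optional[str] = None   # unit that will produce fk, if any
    rj: bool = False           # True when fj is ready to be read
    rk: bool = False           # True when fk is ready to be read

@dataclass
class Scoreboard:
    instructions: List[InstructionStatus] = field(default_factory=list)
    units: Dict[str, FunctionalUnitStatus] = field(default_factory=dict)
    register_result: Dict[str, str] = field(default_factory=dict)

    def issue(self, text, unit, op, dest, src1, src2):
        """Issue only if the unit is free (no structural hazard) and no
        active instruction already targets dest (no WAW hazard)."""
        if self.units[unit].busy or dest in self.register_result:
            return False
        qj = self.register_result.get(src1)
        qk = self.register_result.get(src2)
        self.units[unit] = FunctionalUnitStatus(
            busy=True, op=op, fi=dest, fj=src1, fk=src2,
            qj=qj, qk=qk, rj=qj is None, rk=qk is None,
        )
        self.register_result[dest] = unit
        self.instructions.append(InstructionStatus(text, "issue"))
        return True

sb = Scoreboard(units={"FP Divide": FunctionalUnitStatus(),
                       "FP Adder": FunctionalUnitStatus()})
print(sb.issue("DIV.D F0, F2, F4", "FP Divide", "DIV.D", "F0", "F2", "F4"))
print(sb.issue("ADD.D F10, F0, F8", "FP Adder", "ADD.D", "F10", "F0", "F8"))
print(sb.units["FP Adder"].qj, sb.units["FP Adder"].rj)
```

After the two calls, the register result table records that F0 will be written by the FP Divide unit, and the FP Adder entry shows its first operand as not yet ready because it is waiting on that unit.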


