dce
2013

COMPUTER ARCHITECTURE
CE2013

BK
TP.HCM

Faculty of Computer Science and Engineering
Department of Computer Engineering

Vo Tan Phuong


Chapter 4
Single-cycle & Pipeline Processor

Computer Architecture – Chapter 4.2
©2013, CE


Single-Cycle Processor Overview
[Figure: single-cycle datapath. Blocks: PC, Next PC logic (inputs Imm26 and Imm16, control PCSrc, output Jump or Branch Target Address), Instruction Memory, Registers (RA, RB, RW, BusA, BusB, BusW), extender, ALU (ALU result, zero), Data Memory (Address, Data_in, Data_out), and multiplexers. Main Control takes Op and generates RegDst, RegWrite, ExtOp, ALUSrc, J, Beq, Bne, MemRead, MemWrite, MemtoReg, and ALUop. ALU Ctrl takes ALUop and func.]
Exercise 1
Fill in the values of the control signals for the following instructions:

a. slt $t0,$s0,$zero

RegDst  RegWrite  ExtOp  ALUSrc  Beq  Bne  J  MemRead  MemWrite  MemtoReg
1       1         x      0       0    0    0  0        0         0

b. bne $t0,$zero,exit_label

RegDst  RegWrite  ExtOp  ALUSrc  Beq  Bne  J  MemRead  MemWrite  MemtoReg


Exercise 2



We wish to add the instruction jalr (jump and link
register) to the single-cycle datapath. Add any necessary
datapath components and control signals, and draw the
resulting datapath. Show the values of the control signals
that control the execution of the jalr instruction.

• The jump and link register instruction is described
below:


Exercise 2
• One solution:
(Note: JReg means Jump Register; RA means Return Address)


Exercise 2
• The main control signals for the JALR instruction are the
same as for other R-type instructions, such as ADD and SUB.
These control signals are shown in the table below:

• The ALU Control signals for the JALR instruction are shown
below. JReg = 1 and RA = 1. ALUCtrl is a don't care.


Exercise 3
We want to compare the performance of a single-cycle CPU design
with a multi-cycle CPU. Suppose we add the multiply and divide
instructions. The operation times are as follows:
o Instruction memory access time = 190 ps, Data memory access time = 190 ps
o Register file read access time = 150 ps, Register file write access time = 150 ps
o ALU delay for basic instructions = 190 ps, ALU delay for multiply or divide = 550 ps
Ignore the other delays in the multiplexers, control unit, sign-extension, etc.
Assume the following instruction mix: 30% ALU, 15% multiply & divide, 15%
load, 15% store, 15% branch, and 10% jump.
a. What is the total delay for each instruction class, and what is the clock
cycle for the single-cycle CPU design?
b. Assume we fix the clock cycle to 200 ps for a multi-cycle CPU. What is the
CPI for each instruction class, and what is the speedup over the single-cycle
design with its fixed-length clock cycle?

Exercise 3
a. Total delay for each instruction:

Clock cycle = max delay = 1040 ps
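The per-class delays can be checked with a short Python sketch. The component latencies come from the exercise statement; which components each instruction class uses is an assumption based on the usual single-cycle MIPS datapath (in particular, a jump is assumed to need only the instruction fetch), since the slide's table is not reproduced here.

```python
# Component latencies in picoseconds, taken from the exercise statement.
IM = 190          # instruction memory access
REG_READ = 150    # register file read
ALU_BASIC = 190   # ALU, basic instructions
ALU_MULDIV = 550  # ALU, multiply or divide
DM = 190          # data memory access
REG_WRITE = 150   # register file write

# ASSUMPTION: components on the critical path of each instruction class.
delays = {
    "ALU": IM + REG_READ + ALU_BASIC + REG_WRITE,
    "Multiply & divide": IM + REG_READ + ALU_MULDIV + REG_WRITE,
    "Load": IM + REG_READ + ALU_BASIC + DM + REG_WRITE,
    "Store": IM + REG_READ + ALU_BASIC + DM,
    "Branch": IM + REG_READ + ALU_BASIC,
    "Jump": IM,
}

# The single-cycle clock must be long enough for the slowest class.
clock_cycle = max(delays.values())

for name, delay in delays.items():
    print(f"{name}: {delay} ps")
print(f"Clock cycle: {clock_cycle} ps")
```

Under these assumptions the slowest class is multiply & divide, which matches the 1040 ps clock cycle stated above.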


Exercise 3
b. CPI for each instruction:
CPI for Basic ALU = 4 cycles
CPI for Multiply & Divide = 6 cycles (ALU takes 3 cycles)
CPI for Load = 5 cycles
CPI for Store = 4 cycles
CPI for Branch = 3 cycles
CPI for Jump = 2 cycles
Average CPI = 0.3 * 4 + 0.15 * 6 + 0.15 * 5 + 0.15 * 4 + 0.15 * 3 + 0.1 * 2 = 4.1

Speedup of multi-cycle over single-cycle = (1040 * 1) / (200 * 4.1) = 1.27
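The weighted-average CPI and the speedup can be reproduced with a few lines of Python. This is only a sketch of the arithmetic above; the dictionary names are illustrative and not part of the original slides.

```python
# Instruction mix (fraction of executed instructions) from the exercise.
mix = {
    "ALU": 0.30,
    "Multiply & divide": 0.15,
    "Load": 0.15,
    "Store": 0.15,
    "Branch": 0.15,
    "Jump": 0.10,
}

# Cycles per instruction class on the multi-cycle CPU (200 ps clock).
cycles = {
    "ALU": 4,
    "Multiply & divide": 6,
    "Load": 5,
    "Store": 4,
    "Branch": 3,
    "Jump": 2,
}

average_cpi = sum(mix[c] * cycles[c] for c in mix)

# Time per instruction = clock cycle * CPI; the single-cycle CPU has CPI = 1.
single_cycle_time = 1040 * 1
multi_cycle_time = 200 * average_cpi
speedup = single_cycle_time / multi_cycle_time

print(f"Average CPI = {average_cpi:.2f}")
print(f"Speedup = {speedup:.2f}")
```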


Exercise 4
• Identify all the RAW data dependencies in the following
code. Which dependencies are data hazards that will be
resolved by forwarding? Which dependencies are data
hazards that will cause a stall? Using a graphical
representation of the pipeline, show the forwarding paths
and stalled cycles if any.
add $3, $4, $2
sub $5, $3, $1
lw $6, 200($3)
add $7, $3, $6


Exercise 4
• RAW dependencies:
add $3, $4, $2 and sub $5, $3, $1 on $3 (resolved by forwarding)
add $3, $4, $2 and lw $6, 200($3) on $3 (resolved by forwarding)
lw $6, 200($3) and add $7, $3, $6 on $6 (1 stall cycle, then forwarding)
add $3, $4, $2 and add $7, $3, $6 on $3 (value read from the register file)
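The dependency list above can be derived mechanically. The sketch below is a simplified Python model, not the method used in the slides: it assumes a 5-stage pipeline with full forwarding and a register file that writes in the first half of a cycle and reads in the second half, and it classifies each dependency only by the distance between producer and consumer, ignoring the extra distance that an earlier stall would add. The tuple layout and function names are hypothetical.

```python
# Each instruction: (text, destination register, source registers, is_load).
program = [
    ("add $3, $4, $2", "$3", ["$4", "$2"], False),
    ("sub $5, $3, $1", "$5", ["$3", "$1"], False),
    ("lw $6, 200($3)", "$6", ["$3"], True),
    ("add $7, $3, $6", "$7", ["$3", "$6"], False),
]


def classify(distance, producer_is_load):
    """Classify a RAW dependency by producer-to-consumer distance."""
    if producer_is_load and distance == 1:
        # Load-use hazard: the loaded value is not ready in time.
        return "stall 1, forward"
    if distance <= 2:
        # The result is still in a pipeline register and can be forwarded.
        return "forwarding"
    # The producer has already written back; read the register file.
    return "from register"


def raw_dependencies(prog):
    """Return (producer, consumer, register, resolution) for each RAW pair."""
    deps = []
    last_writer = {}  # register name -> index of its most recent writer
    for j, (text, dest, sources, _) in enumerate(prog):
        for src in sources:
            if src in last_writer:
                i = last_writer[src]
                kind = classify(j - i, prog[i][3])
                deps.append((prog[i][0], text, src, kind))
        last_writer[dest] = j
    return deps


for producer, consumer, reg, kind in raw_dependencies(program):
    print(f"{producer} -> {consumer} on {reg}: {kind}")
```

For the four instructions of this exercise, the model reports the same four dependencies and resolutions as the list above, though in program order of the consumer.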


Exercise 5
• We have a program of 10^6 instructions in the format "lw, add,
lw, add, …". Each add instruction depends only on the lw instruction
right before it, and each lw instruction depends only on the add
instruction right before it. If this program is executed on the 5-stage
MIPS pipeline:
a. Without forwarding, what would be the actual CPI?
It takes 6 cycles on average to complete one LW and one ADD:
1 cycle (to complete LW) + 2 cycles (bubbles) + 1 cycle (to complete ADD) + 2 cycles (bubbles) = 6 cycles
So it takes 6 cycles to complete 2 instructions.
Average CPI = 6/2 = 3

b. With forwarding, what would be the actual CPI?
It takes only 3 cycles on average to complete one LW and one ADD:
1 cycle (to complete LW) + 1 cycle (bubble) + 1 cycle (to complete ADD) = 3 cycles
The ADD result is forwarded to the following LW, so no bubble is needed after the ADD.
So it takes 3 cycles to complete 2 instructions.
Average CPI = 3/2 = 1.5
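The two CPI values follow from counting cycles per lw/add pair. The Python sketch below restates that arithmetic as a steady-state estimate, ignoring the few cycles needed to fill the pipeline, which are negligible for 10^6 instructions. The function name and parameters are illustrative only.

```python
def average_cpi(bubbles_after_lw, bubbles_after_add):
    """Steady-state CPI for a repeating (lw, add) pair.

    Each instruction contributes 1 cycle, plus the bubbles inserted
    before the instruction that depends on it.
    """
    cycles_per_pair = 1 + bubbles_after_lw + 1 + bubbles_after_add
    return cycles_per_pair / 2


# Without forwarding: 2 bubbles after the lw and 2 after the add.
cpi_without_forwarding = average_cpi(2, 2)

# With forwarding: 1 bubble after the lw (load-use), none after the add.
cpi_with_forwarding = average_cpi(1, 0)

print(cpi_without_forwarding, cpi_with_forwarding)
```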