
Advanced Computer Architecture - Lecture 6: Instruction Set Principles (Cont'd)


CS 704
Advanced Computer Architecture

Lecture 6
Instruction Set Principles
(ISA Performance Analysis, Fallacies and Pitfalls)

Prof. Dr. M. Ashraf Chughtai
MAC/VU-Advanced
Computer Architecture

Lecture 6- Instruction Set Principles (3)

1


Today’s Topics
Recap Lecture 5
DSP Media Operations
ISA Performance
Putting it all Together
Summary




Recap: Lecture 5
Instruction encoding
- Essential elements of a computer instruction word:
  - Type of operands
  - Places of sources and destinations
  - Place of next instruction
Instruction word length
- Variable length
- Fixed length
- Hybrid: variable and fixed
Categories of hybrid length
- 4-, 3-, 2-, 1- and 0-address formats



Recap: Lecture 5 … Cont’d
- Comparison of hybrid instruction word formats
  - The 1-address (accumulator) format requires the minimum number of memory bytes
  - The 4-address format requires the maximum
- MIPS instruction word format
  - MIPS is a RISC, fixed-length, 64-bit load/store architecture
  - It supports:
    - 8-, 16-, 32- and 64-bit operands
    - R-type, I-type and J-type instruction formats
    - Arithmetic and logic operations
    - Data transfer operations
    - Control flow operations


Media and Signal Processing Operands
- Graphics applications deal with 2D and 3D images
- The 3D data type is called a vertex
- A vertex structure has 4 components:
  - x-coordinate
  - y-coordinate
  - z-coordinate
  - w-coordinate
- Three vertices specify a graphics primitive, such as a triangle; the fourth component (w) helps with color and hidden surfaces
- Vertex values are usually 32-bit floating-point values
- DSPs add fixed point to the data types, with the binary point just to the right of the sign bit


3D Data Type
- A triangle is visible when it is depicted as filled with pixels
- Pixels are typically 32 bits, usually consisting of four 8-bit channels:
  - R: red
  - G: green
  - B: blue
  - A: transparency of the pixel when it is depicted
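As a rough illustration of the four 8-bit channels, the sketch below packs and unpacks a 32-bit pixel in Python. The channel order (R in the top byte, A in the bottom byte) and the helper names `pack_rgba` and `unpack_rgba` are assumptions made for this example; real graphics formats differ in byte order.

```python
def pack_rgba(r, g, b, a):
    """Pack four 8-bit channels into one 32-bit pixel (assumed order: R, G, B, A)."""
    for channel in (r, g, b, a):
        if not 0 <= channel <= 0xFF:
            raise ValueError("each channel must fit in 8 bits")
    return (r << 24) | (g << 16) | (b << 8) | a


def unpack_rgba(pixel):
    """Split a 32-bit pixel back into its (R, G, B, A) channels."""
    return ((pixel >> 24) & 0xFF, (pixel >> 16) & 0xFF,
            (pixel >> 8) & 0xFF, pixel & 0xFF)


pixel = pack_rgba(0xFF, 0x80, 0x00, 0x40)
print(hex(pixel))          # 0xff800040
print(unpack_rgba(pixel))  # (255, 128, 0, 64)
```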



Media and Signal Processing Operations
- Data for multimedia operations is usually much narrower than the 64-bit data word of modern processors
- Thus, a 64-bit word may be partitioned into four 16-bit data values so that the 64-bit ALU performs four 16-bit operations (say, add operations) in a single clock cycle


Media and Signal Processing Operations
- Here, extra hardware is added to prevent the carry between the four 16-bit partitions of the 64-bit ALU
- These operations are called Single-Instruction Multiple-Data (SIMD) or vector operations
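The effect of blocking the carry between partitions can be imitated in software. The sketch below is a minimal Python model, not a description of any particular processor: it adds four 16-bit lanes held in one 64-bit word and uses masks to stop carries at the lane boundaries. The names `pack4x16`, `unpack4x16` and `add4x16` and the lane ordering are assumptions made for this example.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = 0xFFFF
LOW15 = 0x7FFF_7FFF_7FFF_7FFF   # every bit except the top bit of each lane
TOP1 = 0x8000_8000_8000_8000    # the top bit of each lane


def pack4x16(values):
    """Place four 16-bit values side by side in one 64-bit word (lane 0 lowest)."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack4x16(word):
    """Return the four 16-bit lanes of a 64-bit word, lane 0 first."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def add4x16(a, b):
    """Four independent 16-bit wrap-around adds inside one 64-bit word."""
    # Adding only the low 15 bits of each lane can never carry into the next lane.
    low = (a & LOW15) + (b & LOW15)
    # The top bit of each lane is a XOR b XOR carry-in; its carry-out is discarded.
    return low ^ ((a ^ b) & TOP1)


a = pack4x16([0xFFFF, 1, 2, 3])
b = pack4x16([1, 1, 1, 1])
print(unpack4x16(add4x16(a, b)))               # [0, 2, 3, 4]
print(unpack4x16((a + b) & (2**64 - 1)))       # [0, 3, 3, 4]: plain add leaks a carry
```

In the plain 64-bit add, the overflow of lane 0 spills into lane 1, which is exactly what the extra hardware is there to prevent.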



Multimedia Operations
- Most graphics multimedia applications use 32-bit floating-point operations, so a single instruction is allowed to launch two 32-bit operations on operands found side by side in a double-precision register
- The table (Fig. 2.17) summarizes the SIMD instructions found in recent computers


Summary of SIMD instructions in recent computers
[Table: see Fig. 2.17, page 110]



Multimedia Operations
- Note that there is very little in common across the five architectures
- All are fixed-width operations, performing multiple narrow operations on either a 64-bit or a 128-bit ALU
- The narrow operations are shown as:
  - B: byte
  - H: half word
  - W: word
  - 8B: eight bytes (a double word)


Digital Signal Processing Issues
- Saturating add/subtract
  - Results that are too large (overflow)
- Result rounding
  - Choose from the IEEE 754 rounding modes
- Multiply-accumulate
  - Vector and matrix dot product operations


DSP Operations
- Saturating add/subtract
  - A DSP cannot ignore the result of an overflow, otherwise it may miss an event; therefore, it uses saturating arithmetic
  - Here, if the result is too large to be represented, it is set to the largest representable number, based on the sign of the result
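To make the difference concrete, the following Python sketch compares an ordinary wrap-around 16-bit add with a saturating one. The 16-bit width and the function names `wrap_add16` and `sat_add16` are assumptions chosen for illustration.

```python
INT16_MAX = 32767
INT16_MIN = -32768


def wrap_add16(a, b):
    """Ordinary two's-complement 16-bit add: overflow wraps around."""
    s = (a + b) & 0xFFFF
    return s - 0x10000 if s & 0x8000 else s


def sat_add16(a, b):
    """Saturating 16-bit add: overflow clamps to the nearest representable limit."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


print(wrap_add16(30000, 10000))    # -25536
print(sat_add16(30000, 10000))     # 32767
print(sat_add16(-30000, -10000))   # -32768
```

With wrap-around, a large positive signal suddenly becomes negative; with saturation it stays pinned at the maximum, which is usually the less harmful error in signal processing.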




DSP Operations
- Result rounding
  - IEEE 754 defines several rounding modes; a DSP selects the appropriate mode to round the wider accumulator into a narrower result
- Multiply-accumulate (MAC)
  - MAC operations are the key to the dot product operations of vector and matrix multiplies, which need to accumulate a series of products
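A dot product built from multiply-accumulate steps might look like the sketch below. It assumes 16-bit fixed-point operands in Q15 format (binary point just right of the sign bit), a wide accumulator, and a simple round-half-up step before narrowing; the rounding choice and the name `q15_dot` are assumptions for this example, not one of the IEEE 754 modes nor any specific DSP's behavior.

```python
Q15_ONE_HALF = 1 << 14        # 0.5 in Q15
Q15_MAX = 32767
Q15_MIN = -32768


def q15_dot(xs, ys):
    """Dot product of two Q15 vectors using multiply-accumulate steps."""
    acc = 0                   # wide accumulator; Python ints never overflow
    for x, y in zip(xs, ys):
        acc += x * y          # one MAC step: each product is in Q30
    # Narrow Q30 back to Q15: add half an LSB, then shift (round half up).
    rounded = (acc + (1 << 14)) >> 15
    # Saturate in case the narrowed result does not fit in 16 bits.
    return max(Q15_MIN, min(Q15_MAX, rounded))


half = Q15_ONE_HALF
print(q15_dot([half, half], [half, half]))    # 16384, i.e. 0.5*0.5 + 0.5*0.5 = 0.5
print(q15_dot([Q15_MAX] * 4, [Q15_MAX] * 4))  # 32767 (saturated)
```

Keeping the running sum in the wide accumulator and rounding only once at the end avoids accumulating a rounding error on every product.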


ISA Performance
- Role of the compiler
  - The interaction of compilers and high-level languages significantly affects how programs use an ISA
  - Optimizations performed by compilers can be classified as follows:




Classification of Performance Optimization
- High-level optimization: often done on the source, with the output fed to later optimization passes
- Local optimization: done within a straight-line code fragment (basic block)
- Global optimization: extends the optimization across branches
- Register allocation: associates registers with operands
- Processor-dependent optimization: takes advantage of the specific architecture



Impact of Compiler Technology
- The interaction of compilers and high-level languages affects how a program uses an ISA
- Here, two important questions are:
  1: How are variables allocated?
  2: How many registers are needed to allocate variables appropriately?
- These questions are addressed by looking at the three areas in which a high-level language allocates data




Three areas of data allocation
1: Local variable area: the stack
- It is used to allocate local variables
- It grows or shrinks on procedure call or return
- Objects on the stack are primarily scalars (single variables rather than arrays) and are addressed relative to the stack pointer
- Register allocation is much more effective for stack-allocated objects


Three areas of data allocation … Cont’d
2: Global data area
- It is used to allocate statically declared objects, such as global variables and constants
- These objects are mostly arrays and other aggregate data structures
- Register allocation is relatively less effective for global variables
- Global variables may be aliased: there are multiple ways to refer to their addresses, which makes it illegal to put them in registers
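Why aliasing makes register allocation illegal can be seen in a small model. The sketch below treats a Python list as memory and a local variable as a register; the function names and the memory layout are hypothetical and only meant to illustrate the hazard.

```python
def read_via_register(mem, addr_a, addr_p):
    """Keep the variable at addr_a in a 'register' across a store through a pointer."""
    reg = mem[addr_a]       # value cached in a register
    mem[addr_p] = 99        # store through pointer p, which may alias the variable
    return reg + 1          # stale if addr_p == addr_a


def read_via_memory(mem, addr_a, addr_p):
    """Reload the variable from memory after the store: always correct."""
    mem[addr_p] = 99
    return mem[addr_a] + 1


# No alias: both versions agree.
print(read_via_register([10, 20], 0, 1), read_via_memory([10, 20], 0, 1))  # 11 11
# Alias (the pointer refers to the same address): the register copy is stale.
print(read_via_register([10, 20], 0, 0), read_via_memory([10, 20], 0, 0))  # 11 100
```

Unless the compiler can prove that no pointer refers to the variable, it has to take the slower memory path.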


Three areas of data allocation … Cont’d
3: Dynamic object allocation: the heap
- It is used to allocate objects that do not adhere to a stack discipline
- Objects in the heap are accessed with pointers and are typically not scalars
- Most heap variables are aliased, so register allocation is almost impossible for the heap


ISA Performance … Cont’d
- MIPS floating-point operations
  - The instructions manipulate the floating-point registers
  - They indicate whether the operation is to be performed in single precision or double precision
  - MOV.S copies a single-precision register to another of the same type
  - MOV.D copies a double-precision register to another of the same type


MIPS Floating-point Operations … Cont’d
- To get greater performance for graphics routines, MIPS64 offers paired-single instructions
- These instructions perform two 32-bit floating-point operations, one on each half of the 64-bit floating-point register
- Examples:
  - ADD.PS
  - SUB.PS
  - MUL.PS
  - DIV.PS
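The effect of a paired-single add can be modeled in Python with the standard `struct` module, as sketched below. The register image is an ordinary integer, and the choice of which half is called "upper" as well as the helper names `pack_ps`, `unpack_ps` and `add_ps` are assumptions for this illustration, not the MIPS64 definition.

```python
import struct


def pack_ps(upper, lower):
    """Pack two single-precision values into one 64-bit register image."""
    return struct.unpack("<Q", struct.pack("<ff", lower, upper))[0]


def unpack_ps(reg):
    """Return the (upper, lower) single-precision halves of a 64-bit register image."""
    lower, upper = struct.unpack("<ff", struct.pack("<Q", reg))
    return upper, lower


def add_ps(fs, ft):
    """Model of a paired-single add: the two halves are added independently."""
    fs_u, fs_l = unpack_ps(fs)
    ft_u, ft_l = unpack_ps(ft)
    return pack_ps(fs_u + ft_u, fs_l + ft_l)


fs = pack_ps(1.5, 2.25)
ft = pack_ps(0.5, 0.75)
print(unpack_ps(add_ps(fs, ft)))   # (2.0, 3.0)
```

One instruction therefore yields two single-precision results, which is where the performance gain for graphics routines comes from.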


Putting it All Together
- The earliest architectures were limited in their instruction sets by the hardware technology of the time
- In the 1960s, stack architectures became popular, viewed as being a good match for high-level languages
- In the 1970s, the main concern of architects was to reduce software cost, which produced high-level architectures such as the VAX


Putting it All Together … Cont’d
- In the 1980s, a return to simpler architectures took place due to sophisticated compiler technology
- In the 1990s, new architectural features were introduced; these include:




Putting it All Together … Cont’d
1990s Architectures
1: Address size doubled, from 32-bit to 64-bit
2: Optimization of conditional branches via conditional execution, e.g., conditional move
3: Optimization of cache performance via prefetch, which increased the role of the memory hierarchy in the performance of computers
4: Multimedia support
5: Faster floating-point instructions
6: Long instruction word
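The idea behind item 2, replacing a conditional branch with a conditional move, can be sketched as follows. This Python model only mirrors the semantics (the interpreter itself still branches internally), and the names `cmov`, `max_with_branch` and `max_with_cmov` are hypothetical.

```python
def max_with_branch(a, b):
    """Select the larger value using a conditional branch."""
    if a < b:
        return b
    return a


def cmov(dest, src, cond):
    """Model of a conditional move: dest takes src only when cond is nonzero."""
    mask = -int(cond != 0)              # all ones (-1) if cond holds, else 0
    return dest ^ ((dest ^ src) & mask)


def max_with_cmov(a, b):
    """Select the larger value with a compare followed by a conditional move."""
    return cmov(a, b, a < b)


print(max_with_branch(3, 7), max_with_cmov(3, 7))     # 7 7
print(max_with_branch(-2, -9), max_with_cmov(-2, -9)) # -2 -2
```

On hardware with a conditional move, the second form has no branch to mispredict, which is the optimization the item refers to.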