8/29/2022
Ho Chi Minh City University of Technology
Department of Electrical and Electronics
1. History of CPUs
2. Intel x86 Processors
3. ARM processors
4. Memory
5. Computer Software
1. History of CPUs
1950s:
Ferranti Mark 1 (1951), from the University of Manchester:
a single 80-bit accumulator and a 40-bit "multiplicand/quotient register"
UNIVAC I (UNIVersal Automatic Computer I): designed principally by J. Presper Eckert and John Mauchly, the inventors of the ENIAC
1,905 operations per second running on a 2.25 MHz clock
IBM 704 (introduced in 1954)
[Figures: Ferranti Mark 1, c. 1951; an IBM 704 computer at NACA in 1957]
1. History of CPUs
1960s:
IBM System/360 (S/360), 1964: up to 34,500 instructions per second, with memory from 8 to 64 KB
PDP-11 (1970): developed by Digital Equipment Corporation (DEC)
16-bit processor; later models allow 4 MB of physical memory
Motorola 68000 (1979):
Initial speed grades were 4, 6, and 8 MHz.
68k instruction set
[Figures: IBM System/360; PDP-11/40; Motorola MC68000]
1. History of CPUs
1970s:
Intel 4004 (1971): 4-bit CPU
a single instruction cycle takes 10.8 microseconds
maximum clock rate is 740 kHz
Intel 8008 (1972): 8-bit CPU with an external 14-bit address bus
clock frequency: 0.2-0.8 MHz
Intel 8080 (1974): 8-bit CPU with a 16-bit address bus
clock frequency: 2 MHz
Intel 8086 (1978): 16-bit CPU with a 20-bit address bus
clock frequency: 5-10 MHz
32-bit VAX (1977): based on DEC's earlier PDP-11, supports virtual memory
[Figures: Intel 4004; Intel 8088; Intel 8086]
A Brief History of Computers (YouTube video)
2. Intel x86 Processors
Dominate the laptop/desktop/server market
Evolutionary design
Backwards compatible all the way back to the 8086, introduced in 1978
Added more features as time goes on
Complex instruction set computer (CISC)
Many different instructions with many different formats
But only a small subset is encountered in Linux programs
Hard to match the performance of Reduced Instruction Set Computers (RISC)
But Intel has done just that!
In terms of speed; less so for low power.
Intel x86 Evolution: Milestones
Name             Date  Transistors  MHz
8086             1978  29K          5-10
    First 16-bit Intel processor. Basis for IBM PC & DOS
    1 MB address space
386              1985  275K         16-33
    First 32-bit Intel processor, referred to as IA32
    16-bit external data path on the 386SX variant
    Added "flat addressing", capable of running Unix
486
    32-bit registers, 32-bit data path
    486DX includes an FPU (Floating Point Unit)
Pentium 4E       2004  125M         2800-3800
    First 64-bit Intel x86 processor, referred to as x86-64
Core 2           2006  291M         1060-3500
    First multi-core Intel processor
Core i3, i5, i7  2008  731M         1700-3900
    Two cores / four cores
Intel x86 Processors, cont.
Machine Evolution
Name         Date  Transistors
386          1985  0.3M
Pentium      1993  3.1M
Pentium/MMX  1997  4.5M
Pentium Pro  1995  6.5M
Pentium III  1999  8.2M
Pentium 4    2001  42M
Core 2 Duo   2006  291M
Core i7      2008  731M
Added Features
Instructions to support multimedia operations
Instructions to enable more efficient conditional operations
Transition from 32 bits to 64 bits
More cores
2015 State of the Art
Core i7 Broadwell 2015
Desktop Model
4 cores
Integrated graphics
3.3-3.8 GHz
65W
Server Model
8 cores
Integrated I/O
2-2.6 GHz
45W
2. Intel x86 Processors
8086 processor
40-pin dual in-line package
16-bit wide data bus
16-bit registers
20-bit external address bus, which provides a 1 MB physical address space
The maximum linear address space is limited to 64 KB
Max CPU clock: 5-10 MHz
2. CPU - x86 Processor
CPU, memory, input/output devices
Instruction set, interfacing C to assembly, macros, stack
frame and calling convention
Interrupt, exception
The architecture of 8086 microprocessor
2 major units:
BIU - Bus Interface Unit: bus interface, segment registers, fetch queue
EU - Execution Unit: control unit, ALU, registers
2. x86 Processors - 8086
Instructions:
One-address or two-address operations
Supports assembly and high-level programming languages (C, Pascal)
Main registers: also called data registers or general registers
16-bit data
Each can also be accessed as two 8-bit registers (high and low byte)
High  Low  16-bit register
AH    AL   AX (primary accumulator)
BH    BL   BX (base, accumulator)
CH    CL   CX (counter, accumulator)
DH    DL   DX (accumulator, other functions)
2. x86 Processors - 8086
Index registers: for addressing
SI  Source Index
DI  Destination Index
BP  Base Pointer
SP  Stack Pointer
Program counter:
IP  Instruction Pointer
Segment registers:
CS  Code Segment
DS  Data Segment
ES  Extra Segment
SS  Stack Segment
2. x86 Processors - 8086
Segment registers:
a way to allow programs to address more than 64 KB
the registers CS, DS, SS, and ES point to the currently used program code
segment (CS), the current data segment (DS), the current stack segment
(SS), and one extra segment determined by the programmer (ES).
CS  Code Segment
DS  Data Segment
ES  Extra Segment
SS  Stack Segment
Address calculation:
  0110 1000 1000 0111 0000    Segment, 16 bits, shifted 4 bits left
+      0011 0100 1010 1001    Offset, 16 bits
= 0110 1011 1101 0001 1001    Address, 20 bits
2. x86 Processors - 8086
Examples for x86 memory segmentation
2. x86 Processors - x86-32 and x86-64
x86-32: 80386, 80486
Registers extended to 32 bits
EAX, EBX, ECX, EDX
ESI, EDI, EBP, ESP, EIP, EFLAGS
Two new segment registers (FS and GS) were added
FS and GS are additional data segment registers
x86-64: AMD64, Core i5, Core i7
An R prefix identifies the 64-bit registers (RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, RFLAGS, RIP)
Adds eight additional 64-bit general registers (R8-R15)
Some History: IA32 Registers
General purpose registers
32-bit  16-bit  8-bit high  8-bit low  Origin (mostly obsolete)
%eax    %ax     %ah         %al        accumulate
%ecx    %cx     %ch         %cl        counter
%edx    %dx     %dh         %dl        data
%ebx    %bx     %bh         %bl        base
%esi    %si                            source index
%edi    %di                            destination index
%esp    %sp                            stack pointer
%ebp    %bp                            base pointer
The 16-bit names are virtual registers (backwards compatibility)
x86-64 Integer Registers
64-bit  Low 32 bits    64-bit  Low 32 bits
%rax    %eax           %r8     %r8d
%rbx    %ebx           %r9     %r9d
%rcx    %ecx           %r10    %r10d
%rdx    %edx           %r11    %r11d
%rsi    %esi           %r12    %r12d
%rdi    %edi           %r13    %r13d
%rsp    %esp           %r14    %r14d
%rbp    %ebp           %r15    %r15d
Can reference low-order 4 bytes (also low-order 1 & 2 bytes)
3. ARM Processors
• ARM (Acorn RISC Machine) started as a new, powerful CPU design to replace the 8-bit 6502 in Acorn Computers (Cambridge, UK, 1985)
• First models had only a 26-bit program counter, limiting the memory space to 64 MB (not much by today's standards, but a lot at that time)
• 1990 spin-off: ARM renamed Advanced RISC Machines
3. ARM Processors
• ARM now focuses on embedded CPU cores
  • IP licensing: almost every silicon manufacturer sells some microcontroller with an ARM core. Some even compete with their own designs.
  • Processing power with low current consumption
    • Good MIPS/Watt figure
    • Ideal for portable devices
  • Compact memories: 16-bit opcodes (Thumb)
• New cores with added features
  • Harvard architecture (ARM9, ARM11, Cortex)
  • Floating point arithmetic
  • Vector computing
  • Java language
3. ARM Processors
• 32-bit CPU, Harvard architecture
• 3-operand instructions (typical): ADD Rd,Rn,Operand2
• RISC design:
• Few, simple instructions
• Load/store architecture (instructions operate on registers, not memory)
• Large register set
• Pipelined execution
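[Figure: Von Neumann (ARM7 and older): instructions and data share a single AHB bus to memory & I/O. Harvard (ARM9 and newer): separate instruction and data paths, each with its own cache (I cache, D cache), joined by a bus interface to the AHB bus and memory & I/O.]
Memory-mapped I/O:
• No specific instructions for I/O (use Load/Store instructions instead)
• Peripherals' registers appear at some memory addresses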
ARM7TDMI Pipeline (3 stages, 1 clock cycle each):
FETCH -> DECODE -> EXECUTE (Reg. Read, Shift, ALU, Reg. Write)
ARM9TDMI Pipeline (5 stages, 1 clock cycle each):
FETCH -> DECODE (Reg. Read) -> EXECUTE (Shift, ALU) -> MEMORY access -> WRITE (Reg. Write)
• Fetch: Read Op-code from memory to internal Instruction Register
• Decode: Activate the appropriate control lines depending on Opcode
• Execute: Do the actual processing
Cycle (time)   1       2       3        4        5
Instruction 1  FETCH   DECODE  EXECUTE
Instruction 2          FETCH   DECODE   EXECUTE
Instruction 3                  FETCH    DECODE   EXECUTE
• Simple instructions (like ADD) complete at a rate of one per cycle
• More complex instructions:
Cycle (time)  1          2          3          4          5          6          7          8
1 ADD         FETCH      DECODE     EXECUTE
2 STR                    FETCH      DECODE     Cal. ADDR  Data Xfer.
3 ADD                               FETCH      stall      DECODE     EXECUTE
4 ADD                                          stall      FETCH      DECODE     EXECUTE
5 ADD                                                                FETCH      DECODE     EXECUTE
STR: 2 effective clock cycles (+1 cycle)
Data Sizes and Instruction Sets
The ARM is a 32-bit architecture.
When used in relation to the ARM:
Byte means 8 bits
Halfword means 16 bits (two bytes)
Word means 32 bits (four bytes)
Most ARMs implement two instruction sets:
32-bit ARM Instruction Set
16-bit Thumb Instruction Set
Processor Modes
The ARM has seven operating modes:
User : unprivileged mode under which most tasks run
FIQ : entered when a high priority (fast) interrupt is raised
IRQ : entered when a low priority (normal) interrupt is raised
SVC : (Supervisor) entered on reset and when a Software Interrupt
instruction is executed
Abort : used to handle memory access violations
Undef : used to handle undefined instructions
System : privileged mode using the same registers as user mode
The Registers
ARM has 37 registers, all of which are 32 bits long.
1 dedicated program counter
1 dedicated current program status register
5 dedicated saved program status registers
30 general purpose registers
The current processor mode governs which of several banks is
accessible. Each mode can access
a particular set of r0-r12 registers
a particular r13 (the stack pointer, sp) and r14 (the link register, lr)
the program counter, r15 (pc)
the current program status register, cpsr
Privileged modes (except System) can also access
a particular spsr (saved program status register)
The ARM Register Set
Current visible registers in any mode: r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr, and (except in User and System modes) spsr
r0-r7, r15 (pc) and cpsr are shared by all modes
Banked registers, private to each mode:
User, SYS  r8-r12, r13 (sp), r14 (lr)
FIQ        r8-r12, r13 (sp), r14 (lr), spsr
IRQ        r13 (sp), r14 (lr), spsr
SVC        r13 (sp), r14 (lr), spsr
Undef      r13 (sp), r14 (lr), spsr
Abort      r13 (sp), r14 (lr), spsr
Special Registers
Special function registers:
PC (R15): Program Counter. Any instruction with PC as its destination
register is a program branch
LR (R14): Link Register. Saves a copy of PC when executing the BL
instruction (subroutine call) or when jumping to an exception or interrupt
routine
- It is copied back to PC on the return from those routines
SP (R13): Stack Pointer. There is no stack in the ARM architecture. Even
so, R13 is usually reserved as a pointer for the program-managed stack
CPSR : Current Program Status Register. Holds the visible status register
SPSR : Saved Program Status Register. Holds a copy of the previous status
register while executing exception or interrupt routines
- It is copied back to CPSR on the return from the exception or interrupt
- No SPSR available in User or System modes
4. Memory
Memory - the purpose of memory is data storage. Two major types of memory:
Primary memory - holds data and instructions during processing
e.g. RAM. Relatively limited capacity and volatile
Secondary memory - provides permanent long-term storage
e.g. hard disk. High capacity and non-volatile
[Figures: RAM banks; hard disk; NAND flash chip]
4. Memory
Primary memory consists of a set of locations defined
by sequentially numbered addresses. Each location
contains a binary number that can be interpreted as data
or an instruction.
8086 uses a 20-bit physical address
Manages 1 MB of memory
80386 uses a 32-bit physical address
Manages 4 GB of memory
x86-64 uses 64-bit addresses
Manages ??? of memory
Memory locations are called words. Words are 8 bits (one byte) in size, or a multiple of 8. Common word sizes are 16, 32 and 64 bits.
Address  Content
0        1 0 0 1 0 0 0 1
1        1 1 0 1 0 0 1 1
2        0 1 0 0 0 0 0 0
3        1 0 1 0 0 1 1 1
4        1 1 1 0 1 0 1 0
5        1 1 0 0 1 0 1 0
Memory locations, using an 8-bit word
4. Memory
Memory is commonly measured in multiples of bits and bytes.
1 bit = 1 binary digit (0 or 1).
1. 1 byte = 8 bits
2. 1 KB = 1024 bytes = 2^10 bytes
3. 1 MB = 1024 KB = 2^20 bytes
4. 1 GB = 1024 MB = 2^30 bytes
5. 1 TB = 1024 GB = 2^40 bytes
Big Endian vs. Little Endian
• x86 processors are little-endian
• IBM z/Architecture mainframes are big-endian processors
[Figure: the register value FE ED FA CE stored in memory at addresses 0x0-0x5, drawn from low to high memory addresses. Big Endian (others): the most significant byte, FE, goes to the lowest address. Little Endian (Intel): the least significant byte, CE, goes to the lowest address.]
5. Computer Software
Assembly/Machine Code View
[Figure: the CPU (PC, registers, condition codes) exchanges addresses, data and instructions with memory (code, data, stack)]
Programmer-Visible State
PC: Program counter
    Address of next instruction
    Called "RIP" (x86-64)
Register file
    Heavily used program data
Condition codes
    Store status information about most recent arithmetic or logical operation
    Used for conditional branching
Memory
    Byte addressable array
    Code and user data
    Stack to support procedures
5. Computer Software
Compiling Into Assembly
C Code (sum.c)
    long plus(long x, long y);

    void sumstore(long x, long y,
                  long *dest)
    {
        long t = plus(x, y);
        *dest = t;
    }
Generated x86-64 Assembly
    sumstore:
        pushq   %rbx
        movq    %rdx, %rbx
        call    plus
        movq    %rax, (%rbx)
        popq    %rbx
        ret
Obtain (on shark machine) with command
    gcc -Og -S sum.c
Produces file sum.s
Warning: Will get very different results on non-Shark machines (Andrew Linux, Mac OS X, ...) due to different versions of gcc and different compiler settings.
Quiz
1) Pick the correct choice for the 8086 CPU.
A 16 bit word size, 8 bit data path
B 8 bit word size, 8 bit data path
C 16 bit word size, 16 bit data path
D 4 bit word size, 8 bit data path
E 8 bit word size, 16 bit data path
2) Pick the correct choice for the 80386SX CPU.
A 16 bit word size, 16 bit data path
B 32 bit word size, 16 bit data path
C 8 bit word size, 32 bit data path
D 32 bit word size, 8 bit data path
E 32 bit word size, 32 bit data path
3) Pick the correct choice for the 80486DX CPU.
A 32 bit word size, 16 bit data path
B 64 bit word size, 32 bit data path
C 32 bit word size, 32 bit data path
D 32 bit word size, 16 bit data path
E 32 bit word size, 64 bit data path
Quiz
4) What is the first CPU to include an internal math
coprocessor?
A 386DX
B 486SX
C 486DX
D Pentium
5) What are the two main components of the CPU?
A The Control Unit and ALU
B The Registers and Output/Input management
C The ALU and FPU
6) What are the two main desktop CPU manufacturers?
A Intel and AMD
B Via and Power PC
C Marek and Sun UltraSparc
7) What 32-bit value is read as a double word at address 0x4000 in big-endian mode?
Address  Content
0x4000   2F
0x4001   65
0x4002   7E
0x4003   AC
A 0xAC7E652F
B 0x2F657EAC
C 0xCAE756F2
Quiz
8) Pick the correct choice for the ARM processor.
A 16 bit word size, 16 bit data path
B 32 bit word size, 16 bit data path
C 8 bit word size, 32 bit data path
D 32 bit word size, 8 bit data path
E 32 bit word size, 32 bit data path
9) Pick the wrong choice for ARM architecture.
A Von Neumann architecture
B Harvard architecture
C 3 stage pipeline architecture
D 32-bit ARM Instruction Set
10) Pick the wrong choice for ARM registers.
A ARM has 37 32-bit registers
B There are 13 general purpose registers
C R13 is Stack Pointer
D R14 is the program counter
Exercises
1. Suppose that you discover that RAM addresses 000C0000 to 000C7FFF are reserved for a PC's video adapter. How many bytes of memory is this?
2. Suppose that you have an Intel 8086. Find the five-hex-digit address that corresponds to each of these segment:offset pairs:
(a) 2B8C:8D21 (b) 059A:7A04 (c) 1234:5678
3. In an 8086 program, suppose that the data segment register DS contains the segment number 23D1 and that an instruction fetches a word at offset 7B86 in the data segment. What is the five-hex-digit address of the word that is fetched?
4. In an 8086 program, suppose that the code segment register CS contains the segment number 014C and that the instruction pointer IP contains 15FE. What is the five-hex-digit address of the next instruction to be fetched?
5. What are the advantages and disadvantages of secondary memory?