Student Guide - IES-443 Advanced Sun Fire™ Mid-Range Troubleshooting

Advanced Sun Fire™ Mid-Range
Troubleshooting
IES-443
Student Guide With Instructor Notes

Sun Microsystems, Inc.
UBRM05-104
500 Eldorado Blvd.
Broomfield, CO 80021
U.S.A.
Revision A


Copyright 2002 Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, California 94303, U.S.A. All rights reserved.
This product or document is protected by copyright and distributed under licenses restricting its use, copying, distribution, and
decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of
Sun and its licensors, if any.
Third-party software, including font technology, is copyrighted and licensed from Sun suppliers.
Sun, Sun Microsystems, the Sun Logo, Sun Enterprise, Sun StorEdge, Sun Fire, Netra, Sun Enterprise and Solaris are trademarks or
registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries.
UNIX is a registered trademark in the U.S. and other countries, exclusively licensed through X/Open Company, Ltd.
U.S. Government approval might be required when exporting the product.
RESTRICTED RIGHTS: Use, duplication, or disclosure by the U.S. Government is subject to restrictions of FAR 52.227-14(g)(2)(6/87) and
FAR 52.227-19(6/87), or DFAR 252.227-7015 (b)(6/95) and DFAR 227.7202-3(a).
DOCUMENTATION IS PROVIDED “AS IS” AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS, AND
WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR
NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY
INVALID.

THIS MANUAL IS DESIGNED TO SUPPORT AN INSTRUCTOR-LED TRAINING
(ILT) COURSE AND IS INTENDED TO BE USED FOR REFERENCE PURPOSES IN
CONJUNCTION WITH THE ILT COURSE. THE MANUAL IS NOT A STANDALONE
TRAINING TOOL. USE OF THE MANUAL FOR SELF-STUDY WITHOUT CLASS
ATTENDANCE IS NOT RECOMMENDED.
ECCN Date – August 2002


Copyright 2002 Sun Microsystems Inc., 901 San Antonio Road, Palo Alto, California 94303, Etats-Unis. Tous droits réservés.
Ce produit ou document est protégé par un copyright et distribué avec des licences qui en restreignent l’utilisation, la copie, la distribution,
et la décompilation. Aucune partie de ce produit ou document ne peut être reproduite sous aucune forme, par quelque moyen que ce soit,
sans l’autorisation préalable et écrite de Sun et de ses bailleurs de licence, s’il y en a.
Le logiciel détenu par des tiers, et qui comprend la technologie relative aux polices de caractères, est protégé par un copyright et licencié
par des fournisseurs de Sun.
Sun, Sun Microsystems, the Sun Logo, Sun Enterprise, Sun StorEdge, Sun Fire, Netra, Sun Enterprise et Solaris sont des marques de fabrique
ou des marques déposées de Sun Microsystems, Inc. aux Etats-Unis et dans d’autres pays.
UNIX est une marque déposée aux Etats-Unis et dans d’autres pays et licenciée exclusivement par X/Open Company, Ltd.
L’interface d’utilisation graphique OPEN LOOK et Sun™ a été développée par Sun Microsystems, Inc. pour ses utilisateurs et licenciés.
Sun reconnaît les efforts de pionniers de Xerox pour la recherche et le développement du concept des interfaces d’utilisation visuelle ou
graphique pour l’industrie de l’informatique. Sun détient une licence non exclusive de Xerox sur l’interface d’utilisation graphique Xerox,
cette licence couvrant également les licenciés de Sun qui mettent en place l’interface d’utilisation graphique OPEN LOOK et qui en outre
se conforment aux licences écrites de Sun.
L’accord du gouvernement américain est requis avant l’exportation du produit.
LA DOCUMENTATION EST FOURNIE “EN L’ETAT” ET TOUTES AUTRES CONDITIONS, DECLARATIONS ET GARANTIES
EXPRESSES OU TACITES SONT FORMELLEMENT EXCLUES, DANS LA MESURE AUTORISEE PAR LA LOI APPLICABLE, Y
COMPRIS NOTAMMENT TOUTE GARANTIE IMPLICITE RELATIVE A LA QUALITE MARCHANDE, A L’APTITUDE A UNE
UTILISATION PARTICULIERE OU A L’ABSENCE DE CONTREFAÇON.
CE MANUEL DE RÉFÉRENCE DOIT ÊTRE UTILISÉ DANS LE CADRE D’UN COURS DE FORMATION DIRIGÉ PAR UN
INSTRUCTEUR (ILT). IL NE S’AGIT PAS D’UN OUTIL DE FORMATION INDÉPENDANT. NOUS VOUS DÉCONSEILLONS DE
L’UTILISER DANS LE CADRE D’UNE AUTO-FORMATION.

Please Recycle




Table of Contents
About This Course............................................................... Preface-xxiii
Course Goals...................................................................... Preface-xxiii
Course Map.........................................................................Preface-xxiv
Topics Not Covered............................................................Preface-xxv
How Prepared Are You?................................................... Preface-xxv
Introduction ........................................................................Preface-xxvi
How to Use Course Materials ......................................... Preface-xxvi
Conventions ....................................................................... Preface-xxvi
Icons ............................................................................Preface-xxvi
Typographical Conventions ................................. Preface-xxviii
Reviewing the Sun Fire Servers .........................................................1-1
Objectives ........................................................................................... 1-1
Relevance............................................................................................. 1-2
Additional Resources ....................................................................... 1-2
Sun Fire Server Models ..................................................................... 1-3
Server Naming........................................................................... 1-3
Sun Fire Features................................................................................ 1-5
Interconnect Capabilities .................................................................. 1-7
Peak System Bandwidth .......................................................... 1-8
System Board Physical Locations .................................................... 1-9
Sun Fire 6800 Server ................................................................. 1-9
Sun Fire 4810 and 4800 Servers ............................................. 1-10
Sun Fire 3800 Server ............................................................... 1-10
Sun Fire I/O Assemblies................................................................. 1-11
I/O Assembly Locations ................................................................. 1-12
Sun Fire 6800 Server ............................................................... 1-12
Sun Fire 48x0 Server ............................................................... 1-13
Sun Fire 3800............................................................................ 1-14
Compact PCI I/O ............................................................................. 1-15
Sun Fire 4800/4810/6800 Four-Slot cPCI Board ................ 1-15
Sun Fire 3800 Six-Slot cPCI Board ........................................ 1-17
Sun Fireplane Switch Boards.......................................................... 1-18
Sun Fire 6800 Server ............................................................... 1-18

Copyright 2002 Sun Microsystems, Inc. All Rights Reserved. Enterprise Services, Revision A


Sun Fire 4800, 4810, and 3800 Servers .................................. 1-18
Sun Fireplane Switch Board Physical Locations ................ 1-19
Agent IDs .......................................................................................... 1-21
CPU Locations and Agent IDs .............................................. 1-21
Memory Controller Mapping................................................ 1-22
IOC AID.................................................................................... 1-23
Locating I/O Devices in Sun Fire 6800/4810/4800 Systems ..................................................................................... 1-24
Four Slot cPCI I/O Assembly................................................ 1-27
Sun Fire 3800 I/O Device Location Mapping .............................. 1-29
Six Slot cPCI I/O Assembly................................................... 1-30
Sun Fireplane Interconnect............................................................. 1-31
How the Sun Fireplane Interconnect Works....................... 1-32
Troubleshooting Tools .................................................................... 1-33
Service Mode......................................................................... 1-33
System Logging ....................................................................... 1-35
Explorer .................................................................................... 1-37
Parity.................................................................................................. 1-38
Parity Checking ....................................................................... 1-39
Problems with Parity .............................................................. 1-39
ECC .................................................................................................... 1-40
Check Your Progress ....................................................................... 1-41
Power Management and the Frame Manager ................................... 2-1
Objectives ........................................................................................... 2-1
Relevance............................................................................................. 2-2
Additional Resources ....................................................................... 2-2
Power ................................................................................................... 2-3
The RTU and RTS............................................................................... 2-4
Redundant Power ..................................................................... 2-6
The AC/DC Power Supplies............................................................ 2-8
Housekeeping Voltage ............................................................. 2-9
Power Distribution .......................................................................... 2-10
Sun Fire 3800............................................................................ 2-11
Sun Fire 48x0............................................................................ 2-12
Sun Fire 6800............................................................................ 2-14
DC/DC Component Power Supplies................................... 2-16
Board Voltage Requirements ................................................ 2-17
The Frame Manager......................................................................... 2-18
Exercise: Managing Power ............................................................. 2-20
Objective................................................................................... 2-20
Description............................................................................... 2-20
Preparation............................................................................... 2-20
Problem Presentation ............................................................. 2-20
Task ........................................................................................... 2-22
Exercise Summary............................................................................ 2-35

Check Your Progress ....................................................................... 2-36
Domains and Segments .....................................................................3-1
Objectives ........................................................................................... 3-1
Relevance............................................................................................. 3-2
Additional Resources ....................................................................... 3-2
Virtual Servers .................................................................................... 3-3
Domains...................................................................................... 3-4
Segments .................................................................................... 3-6
Domains and Segments in the Sun Fire 6800 ................................. 3-8
Impact of Multiple Segments and Domains........................ 3-11
Power Grid Segmentation............................................................... 3-12
Power Grids on Sun Fire 48x0/3800 .................................... 3-12
Power Grids on Sun Fire 6800............................................... 3-13
Domain Recovery............................................................................. 3-15
Sun Fire 3800, 48x0, and 6800 Board Configurations...................................................................... 3-15
Segment and Domain Summary.................................................... 3-19
Exercise: Identifying Domains and Segments ............................. 3-20
Objective................................................................................... 3-20
Preparation............................................................................... 3-20
Task 1 ........................................................................................ 3-20
Task 2 ........................................................................................ 3-22
Exercise Summary............................................................................ 3-31
Check Your Progress ....................................................................... 3-32
The System and I/O Boards ...............................................................4-1
Objectives ........................................................................................... 4-1
Relevance............................................................................................. 4-2
Additional Resources ....................................................................... 4-2
System Board ...................................................................................... 4-3
System Board Interconnects .................................................... 4-4
CPU Placement Rules ........................................................................ 4-6
Sun Fire I/O........................................................................................ 4-7
I/O Assembly Configurations ................................................ 4-7
The PCI I/O Board............................................................................. 4-9
PCI Card Support.................................................................... 4-10
PCI Board Slot Configuration ............................................... 4-10
The cPCI Board Assembly .............................................................. 4-12
cPCI Card Slot Configuration ............................................... 4-14
cPCI Operation ........................................................................ 4-14
The 3800 cPCI Board............................................................... 4-15
3800 cPCI Card Slot Configuration ...................................... 4-17
I/O Board Power.............................................................................. 4-18
PCI Card Power....................................................................... 4-18
Fireplane Bus Interface.................................................................... 4-19
Failure on a PCI Board .................................................................... 4-20

Exercise: Explaining System and I/O Board Operation............. 4-21
Objective................................................................................... 4-21
Preparation............................................................................... 4-21
Task 1 ........................................................................................ 4-21
Task 2 ........................................................................................ 4-23
Task 3 ........................................................................................ 4-25
Exercise Summary............................................................................ 4-28
Check Your Progress ....................................................................... 4-29
The Sun Fireplane Interconnect Bus ................................................. 5-1
Objectives ........................................................................................... 5-1
Relevance............................................................................................. 5-2
Additional Resources ....................................................................... 5-2
Fireplane Bus Introduction............................................................... 5-3
Fireplane Bus Design......................................................................... 5-4
The Fireplane Switch Overview.............................................. 5-5
Address Interconnect Overview ............................................. 5-8
Data Interconnect Overview ................................................... 5-9
L1 Data Flow..................................................................................... 5-11
System Board Port Configuration......................................... 5-11
I/O Board Port Configuration .............................................. 5-12
The ASIC Tree.......................................................................... 5-13
Circuitry on L1 System Boards ............................................. 5-16
Safari Ports ............................................................................... 5-17
Console Bus.............................................................................. 5-17
Fireplane Bus Flow Control Circuitry.................................. 5-18
The Fireplane Buses in the Sun Fire 3800, 4800, and 4810..................................................................................................... 5-19
Address Interconnect.............................................................. 5-19
SDC Management ................................................................... 5-23
Data Interconnect .................................................................... 5-27
The Fireplane Buses in the Sun Fire 6800 ..................................... 5-32
Address Interconnect.............................................................. 5-32
SDC Management Interconnect ............................................ 5-34
Data Interconnect .................................................................... 5-36
Fireplane Bus Data Paths ................................................................ 5-38
Quadword Data Line Structure ............................................ 5-39
Data Pathing ..................................................................................... 5-40
System Board to L2 Data Path............................................... 5-40
I/O Board to L2 Data Path..................................................... 5-41
Sun Fire 3800, 4800, and 4810 Data Path.............................. 5-42
Sun Fire 6800 Data Path ......................................................... 5-43
Sun Fire 6800 Double Pump Mode....................................... 5-44
L1 System Board Data Path Bit Assignment ....................... 5-46
L1 I/O Board Data Path Bit Assignment............................. 5-48
Exercise: Identifying the Safari Bus............................................... 5-51

Objective................................................................................... 5-51
Preparation............................................................................... 5-51
Task 1 ........................................................................................ 5-51
Task 2 ........................................................................................ 5-53
Exercise Summary............................................................................ 5-58
Check Your Progress ....................................................................... 5-59
Parity and ECC Detection and Recovery ...........................................6-1
Objectives ........................................................................................... 6-1
Relevance............................................................................................. 6-2
Additional Resources ....................................................................... 6-2
Error Detection ................................................................................... 6-3
Parity.................................................................................................... 6-4
Parity Detection in the Data Path ........................................... 6-5
Parity Detection in the Address Path..................................... 6-8
Parity Detection in the Control Path ...................................... 6-9
Error Correction Code (ECC) ......................................................... 6-11
ECC Error Types ..................................................................... 6-11
ECC Creation .................................................................................... 6-13
End-to-End ECC Protection................................................... 6-14
ECC Syndromes ............................................................................... 6-16
The ECC Syndrome Table...................................................... 6-16
Signaling ECC Syndrome Codes .......................................... 6-18
ECC Error Identification ................................................................. 6-20
Locating a DX ECC Error....................................................... 6-20
The DX ECC Status Register.................................................. 6-22
CPU-Caused Interconnect ECC Error Example ................. 6-24
ECC Errors from Memory ..................................................... 6-26
Data Bit Identification ..................................................................... 6-28
ECC Error Reporting ....................................................................... 6-32
Asynchronous Fault Status Register ............................................. 6-33
The AFSR Table ....................................................................... 6-33
AFT Labels ............................................................................... 6-36
Asynchronous Fault Address Register ......................................... 6-37
Physical Address Space.......................................................... 6-38
AFAR Addressing................................................................... 6-39
AFSR Overwrite Policy ................................................................... 6-43
AFSR/AFAR Overwrite Policy............................................. 6-43
ECC and MTAG Syndrome Fields Overwrite Policy...................................................................................... 6-44
Exercise: Identifying and Diagnosing Parity and ECC Errors................................................................................. 6-45
Preparation............................................................................... 6-45
Task 1 ........................................................................................ 6-45
Task 2 ........................................................................................ 6-47
Task 3 ........................................................................................ 6-49

Exercise Summary............................................................................ 6-55
Check Your Progress ....................................................................... 6-56
Caching and Interconnect Operations ............................................... 7-1
Objectives ........................................................................................... 7-1
Relevance............................................................................................. 7-2
Additional Resources ....................................................................... 7-2
Introduction ........................................................................................ 7-3
The UltraSPARC III Caches .............................................................. 7-4
The Instruction Cache............................................................... 7-4
The Data Cache.......................................................................... 7-4
The External Cache ................................................................... 7-4
The Write Cache ........................................................................ 7-5
The Prefetch Cache ................................................................... 7-6
Cache Snooping.................................................................................. 7-7
Snoopy Coherency .................................................................... 7-8
Fireplane Address Bus Snooping Operation ............................... 7-10
Snoop Response Signals......................................................... 7-10
Cache Data State Tags ..................................................................... 7-13
CTags ........................................................................................ 7-13
MOESI State Transitions ........................................................ 7-14
The DTags ................................................................................ 7-15
Fireplane Bus Transactions............................................................. 7-16
Interconnect Signal Groups ................................................... 7-16
Address Interconnect Operation.................................................... 7-18
Address Transaction (ATrans) .............................................. 7-18
Sun Fire Transaction Request Codes.................................... 7-19
Requests for Data .................................................................... 7-20
The Data Interconnect ..................................................................... 7-22
Data Transaction (DTrans) .................................................... 7-22
Request Flow .................................................................................... 7-23
Address Read/Write Transaction ....................................... 7-23
Data Transaction ..................................................................... 7-25
Read-to-Share Cache Example .............................................. 7-29
Arbitration on AR and SDC ASICs ............................................... 7-31
Address Interconnect Arbitration......................................... 7-31
Data Bus Arbitration and Operation.................................... 7-33
Direction of ECC Error Reporting ................................................. 7-35
The SDC ECC Error Register................................................. 7-39
General Notes on L1 DX ECC Errors & SC Messages................................................................................ 7-41
Exercise: Interconnecting Operation ............................................. 7-43
Objective................................................................................... 7-43
Preparation............................................................................... 7-43
Task 1 ........................................................................................ 7-43
Task 2 ........................................................................................ 7-45

Exercise Summary............................................................................ 7-51
Check Your Progress ....................................................................... 7-52
Memory Interleaving ...........................................................................8-1
Objectives ........................................................................................... 8-1
Relevance............................................................................................. 8-2
Additional Resources ....................................................................... 8-2
Memory ............................................................................................... 8-3
SIMMs......................................................................................... 8-3
DIMMs........................................................................................ 8-3
Logical and Physical Memory Banks ..................................... 8-4
Interleaving ....................................................................................... 8-11
Memory Interleave Configuration........................................ 8-12
Interleave Scope ...................................................................... 8-14
Interleave Rules ....................................................................... 8-14
Interleave Mode ...................................................................... 8-15
Configuring Interleave .................................................................... 8-16
Checking the Interleave Configuration ............................... 8-16
Exercise: Explaining and Using Memory ..................................... 8-20
Objective................................................................................... 8-20
Preparation............................................................................... 8-20
Task ........................................................................................... 8-21
Exercise Summary............................................................................ 8-34
Check Your Progress ....................................................................... 8-35
Hardware Control Buses ....................................................................9-1
Objectives ........................................................................................... 9-1
Relevance............................................................................................. 9-2
Additional Resources ....................................................................... 9-2
Sun Fire Hardware Control Buses................................................... 9-3
The Console Bus ................................................................................. 9-4
The Console Bus Hub ............................................................... 9-5
The BootBus Controller..................................................................... 9-7
System/Board Reset/Error Status and Control ................... 9-8
SBBC Control Paths ........................................................................... 9-9
System Controller SBBC Control Paths ................................. 9-9
System Board SBBC Control Paths ....................................... 9-11
I/O Board SBBC Control Paths............................................. 9-12
The SBBC Error Register ................................................................. 9-14
The PROM Bus ................................................................................. 9-17
JTAG................................................................................................... 9-19
The I2C Bus ....................................................................................... 9-20
The Global I2C Bus ................................................................. 9-21
Local I2C Buses........................................................................ 9-28
The ID Board and the I2C Bus............................................... 9-31
Error Signaling ................................................................................. 9-34
ASIC Error Reporting Policy ................................................. 9-34

Copyright 2002 Sun Microsystems, Inc. All Rights Reserved. Enterprise Services, Revision A


The EChip................................................................................. 9-35
System Board Error Path........................................................ 9-36
I/O Board Error Path.............................................................. 9-37
Fireplane Switch Board Error Path....................................... 9-38
SC Board Error Path ............................................................... 9-39
Reading an Error Report ................................................................. 9-41
Error Pause............................................................................... 9-42
Clocking in the Sun Fire Servers.................................................... 9-44

Sun Fire System Clocking Requirements ............................ 9-45
Sun Fire System Clocking Users ........................................... 9-47
Local Clock Distribution ........................................................ 9-48
Initial Clock Source Selection ................................................ 9-49
Clock Failover................................................................................... 9-50
SC Hot Swap Clock Management ........................................ 9-50
Exercise: Interfacing Hardware Control....................................... 9-51
Objective................................................................................... 9-51
Preparation............................................................................... 9-51
Task 1 ........................................................................................ 9-51
Task 2 ........................................................................................ 9-53
Exercise Summary............................................................................ 9-58
Check Your Progress ....................................................................... 9-59
Workshop......................................................................................... 10-1
Objectives ......................................................................................... 10-1
Relevance........................................................................................... 10-2
Additional Resources ..................................................................... 10-2
Classes of Errors ............................................................................... 10-3
Analyzing an Error Report ............................................................. 10-7
Analysis .................................................................................... 10-8
Helpful Hints .................................................................................. 10-11
Exercise: Troubleshooting the Sun Fire Mid-Range
Servers ............................................................................................. 10-13
Preparation............................................................................. 10-13
Task ......................................................................................... 10-13
Exercise Summary.......................................................................... 10-26
Memory Architecture..........................................................................A-1
SRAM and DRAM ............................................................................ A-2
Memory Chip Architecture ............................................................. A-4
Refresh ....................................................................................... A-5

Static RAM ......................................................................................... A-6
The Memory Access Cycle............................................................... A-8
The Steps of the Memory Read Cycle ................................... A-9
The Steps of the Memory Write Cycle .................................. A-9
Ecache Address Mapping .............................................................. A-10
Data Addressing .................................................................... A-10
Data Location.......................................................................... A-11

Tag Addressing ...................................................................... A-11
Building the ECC Syndrome Table .............................................. A-12
ECC Calculation Example..................................................... A-12
The UltraSPARC III CPU................................................................... B-1
Introduction ........................................................................................B-2
The UltraSPARC Processor ..............................................................B-3
Processor Architecture ......................................................................B-4
Superscalar Execution .......................................................................B-5
Pipelining ............................................................................................B-6
Pipeline Stages....................................................................................B-8
The Instruction-Fetch Stages ...................................................B-8
Instruction Issue ........................................................................B-9
Execution ....................................................................................B-9
Trap and Done.........................................................................B-10
Summary ..................................................................................B-10
Processor Subunits...........................................................................B-11

UltraSPARC III Functional Overview...........................................B-12
Instruction Issue Unit (IIU).............................................................B-13
Integer Execution Unit (IEU)..........................................................B-15
Floating Point and Graphics Unit (FGU)......................................B-17
Instruction Latency .................................................................B-17
Data Cache Unit (DCU)...................................................................B-18
The Prefetch Cache .................................................................B-19
L1 Data Cache SAM Addressing ..........................................B-20
External Cache Unit (ECU).............................................................B-21
Memory Control Unit (MCU) ........................................................B-22
DIMM Sizes..............................................................................B-23
System Interface Unit (SIU) ............................................................B-24
CPU Error Detection and Correction ............................................B-25
Caching ............................................................................................. C-1
Cache Characteristics........................................................................ C-2
Cache Terminology........................................................................... C-3
Virtual Address Cache ............................................................ C-3
Physical Address Cache .......................................................... C-4
Harvard Caches........................................................................ C-5
Cache Hit Rate ................................................................................... C-6
Example ..................................................................................... C-6
Effects of CPU Cache Misses .................................................. C-7
Cache Thrashing................................................................................ C-8
Measuring CPU and Caching Statistics ....................................... C-10
The cpustat Utility............................................................. C-10
The cputrack Utility ...................................................... C-11
Glossary/Acronyms ............................................................... Glossary-1
Index............................................................................................Index-1


List of Figures
Figure 1-1 Sun Fire Server Models ................................................. 1-3
Figure 1-2 Sun Fire 6800 System Board Slot Assignments .......... 1-9
Figure 1-3 Sun Fire 48x0 System Board Slot Assignments ........ 1-10
Figure 1-4 Sun Fire 3800 System Board Slot Assignments ........ 1-10
Figure 1-5 Sun Fire Mid-Range I/O Assembly ........................... 1-11
Figure 1-6 Sun Fire 6800 Server PCI I/O
Assembly Locations .................................................................... 1-12
Figure 1-7 Sun Fire 4800 Server I/O Assembly Locations ........ 1-13
Figure 1-8 Sun Fire 3800 cPCI I/O Assembly Locations ........... 1-14
Figure 1-9 Four-Slot cPCI I/O Board Logical Block Diagram .. 1-15
Figure 1-10 Six-Slot cPCI I/O Board Logical Block Diagram ... 1-17
Figure 1-11 Sun Fireplane Switch Board Slot Assignments
for the Sun Fire 6800 Server ....................................................... 1-19
Figure 1-12 Sun Fireplane Switch Board Slot Assignments
for the Sun Fire 48x0 Server ....................................................... 1-20
Figure 1-13 CPU Mapping Example ............................................ 1-22
Figure 1-14 Memory Controller Mapping Example .................. 1-22
Figure 1-15 Example I/O Device Path for Sun Fire
6800/4810/4800 Systems ........................................................... 1-24
Figure 1-16 I/O Assembly Physical Slot Designations .............. 1-26
Figure 1-17 Sun Fire 6800/48x0 Servers Four-Slot
cPCI Physical Slot Designations ................................................ 1-28
Figure 1-18 Example I/O Device Path for
Sun Fire 3800 Systems ................................................................ 1-29
Figure 1-19 Sun Fire 3800 System Six-Slot
cPCI Physical Slot Designations ................................................ 1-30
Figure 1-20 Sun Fireplane Interconnect Operational View ....... 1-31
Figure 2-1 RTU and RTS Power Connections ............................... 2-5
Figure 2-2 RTS and RTU Units ........................................................ 2-6
Figure 2-3 Sun Fire 3800 Logical Power Distribution ................ 2-11
Figure 2-4 Sun Fire 3800 Power Distribution .............................. 2-12
Figure 2-5 Sun Fire 48x0 Logical Power Distribution ................ 2-12
Figure 2-6 Sun Fire 48x0 Power Distribution .............................. 2-13

Figure 2-7 Sun Fire 6800 Logical Power Distribution ................ 2-14
Figure 2-8 Sun Fire 6800 Power Distribution .............................. 2-15
Figure 2-9 Frame Manager Cable Diagram ................................. 2-19
Figure 3-1 Bus Clocking with Two Domains ................................ 3-5
Figure 3-2 Segments in a Sun Fire 3800, 4800, or
4810 Server ..................................................................................... 3-7
Figure 3-3 Two Domains in a Sun Fire 6800 ................................. 3-8
Figure 3-4 Two Segments in a Sun Fire 6800 ................................ 3-9
Figure 3-5 Domains and Segments in a Sun Fire 6800 ............... 3-10
Figure 4-1 System Board Major Components ............................... 4-4
Figure 4-2 System Board Logical Components ............................. 4-5
Figure 4-3 PCI I/O Board Assembly .............................................. 4-9
Figure 4-4 cPCI I/O Board ............................................................. 4-13
Figure 4-5 cPCI Board and TI HPC-3130 HPC-PCI Chip .......... 4-15
Figure 4-6 3800 cPCI I/O Board .................................................... 4-16
Figure 4-7 PCI Bus-Fireplane Bus Relationship .......................... 4-19
Figure 4-8 I/O Board SBBC Interfaces ......................................... 4-20

Figure 5-1 Bus Hierarchy Levels ..................................................... 5-4
Figure 5-2 Fireplane Switches in the Sun Fire Mid-Range
Platforms ........................................................................................ 5-6
Figure 5-3 Fireplane Switch Board Layout .................................... 5-7
Figure 5-4 Address Interconnect Levels ........................................ 5-8
Figure 5-5 Data Interconnect Levels ............................................. 5-10
Figure 5-6 System Board Data Flow ............................................. 5-11
Figure 5-7 I/O Board Data Flow ................................................... 5-12
Figure 5-8 ASIC Tree ...................................................................... 5-13
Figure 5-9 System Board ASIC Tree ............................................. 5-14
Figure 5-10 System Board ASIC Tree (Continued) .................... 5-15
Figure 5-11 Major System Board Interconnect Pathways ......... 5-16
Figure 5-12 Address Repeater ....................................................... 5-20
Figure 5-13 Sun Fire 3800, 4800, and 4810 AR Level 1 and 2
Configurations ............................................................................. 5-21
Figure 5-14 SDC Interconnect ........................................................ 5-24
Figure 5-15 Dual CPU Data Switch ASIC .................................... 5-27
Figure 5-16 DX Level 2 Configuration for the Sun Fire 3800, 4800,
and 4810 ........................................................................................ 5-29
Figure 5-17 AR Level 2 Configuration for the Sun Fire 6800 .... 5-32
Figure 5-18 SDC Level 2 Configuration for the Sun Fire 6800 .. 5-34
Figure 5-19 Sun Fire 6800 Data Interconnect .............................. 5-36
Figure 5-20 Fireplane Data Path Bandwidth ............................... 5-38
Figure 5-21 Data Line Quadword Structure ............................... 5-39
Figure 5-22 Sun Fire 3800 and 48x0 Data Paths .......................... 5-42
Figure 5-23 Sun Fire 6800 Single Segment Data Path ................ 5-43
Figure 5-24 System Data Order and Bit Slicing in Double Pump
Mode Fireplane Switch (Part A) ............................................... 5-44

Figure 5-25 System Data Order and Bit Slicing in
Double Pump Mode Fireplane Switch (Part B) ...................... 5-45
Figure 6-1 Parity Checking in Data Path ....................................... 6-5
Figure 6-2 Safari Internal Ports on the I/O Board DX Chip ....... 6-6
Figure 6-3 Level 1 Data Repeater Internal Parity Detectors
and Regenerators ........................................................................... 6-7
Figure 6-4 Parity Protection for Address Interconnects .............. 6-8
Figure 6-5 ECC Error Detection and Reporting Path ................. 6-13
Figure 6-6 End-to-End ECC Protection
(Including Intermediate ECC Detection) ................................. 6-14
Figure 6-7 Data Path ECC Locations ............................................ 6-20
Figure 6-8 DX Incoming/Outgoing Data Paths ......................... 6-23
Figure 6-9 Uncorrectable ECC Error From a Bad CPU .............. 6-24
Figure 6-10 Uncorrectable ECC Error From a Memory
Module .......................................................................................... 6-26
Figure 6-11 Physical Address Spaces ........................................... 6-38
Figure 7-1 UltraSPARC III Processor Layout ................................... 7-5
Figure 7-2 Caches and Data Currency ........................................... 7-7
Figure 7-3 Address Bus Transaction Broadcast .......................... 7-10
Figure 7-4 Cache Coherency State Transitions ........................... 7-14
Figure 7-5 ATransID Bit Layout .................................................... 7-19
Figure 7-6 The Address Request Transaction ............................. 7-23
Figure 7-7 Data Transaction Read Response ............................... 7-25
Figure 7-8 Target Transaction ....................................................... 7-26
Figure 7-9 Write Transaction ......................................................... 7-28

Figure 7-10 Read-to-Share from Memory on
Another Board ............................................................................. 7-29
Figure 7-11 SDC Arbiter ................................................................. 7-33
Figure 7-12 Inbound and Outbound Transfers on the
DX Switch ..................................................................................... 7-35
Figure 8-1 Memory Subsystem ....................................................... 8-9
Figure 9-1 Console Bus Structure ................................................... 9-4
Figure 9-2 Console Bus Hub ............................................................ 9-5
Figure 9-3 SC SBBC Block Diagram ................................................ 9-9
Figure 9-4 System Board SBBC Connection Block Diagram ..... 9-11
Figure 9-5 I/O Board SBBC Block Diagram ................................ 9-13
Figure 9-6 SBBC, I/O PROM and I2C Buses ............................... 9-18
Figure 9-7 SC I2C Bus Multiplexing ............................................. 9-21
Figure 9-8 CPU Board Global I2C Bus ......................................... 9-27
Figure 9-9 ID Board ......................................................................... 9-32
Figure 9-10 System Board Error Signal Paths ............................. 9-36
Figure 9-11 I/O Board Error Signal Paths ................................... 9-37
Figure 9-12 Fireplane Switch Board Error Signal Paths ............ 9-38
Figure 9-13 SC Board Error Signal Paths ..................................... 9-39
Figure 9-14 Error Reporting Hierarchy ........................................ 9-41
Figure 9-15 Error Pause Signal Distribution ............................... 9-42

Figure 9-16 Error Reporting Paths ................................................ 9-43
Figure 9-17 Basic Clock Distribution ............................................ 9-44
Figure 9-18 Local Clock Distribution (System Board) ............... 9-48
Figure A-1 SRAM and DRAM ....................................................... A-2

Figure A-2 Single DRAM Bit Cell Structure................................. A-4
Figure A-3 DRAM Data and Control Flow .................................. A-5
Figure A-4 SRAM Single Bit Cell Design (Six Transistor).......... A-7
Figure A-5 DRAM Data Access Mechanism ................................ A-8
Figure B-1 UltraSPARC III Physical Layout.................................. B-4
Figure B-2 The UltraSPARC III Pipeline........................................ B-7
Figure B-3 UltraSPARC III Functional Units .............................. B-12
Figure B-4 Instruction Fetch Logic Flow...................................... B-13
Figure B-5 Integer Execute Unit.................................................... B-15
Figure B-6 Data Cache Unit ........................................................... B-19
Figure B-7 Prefetch Cache Data Flow .......................................... B-19
Figure B-8 Data Cache SAM Addressing .................................... B-20
Figure B-9 Memory Subsystem Interconnect.............................. B-22
Figure C-1 Performance Loss to Cache Misses ............................ C-7
Figure C-2 Example of Cache Thrashing...................................... C-9


List of Tables
Table 1-1 Sun Fire Family Names ................................................... 1-3
Table 1-2 Sun Fire Family Component Names ............................. 1-4
Table 1-3 Sun Fire Family Maximum Configurations ................. 1-5
Table 1-4 Sun Fire Family System Interconnect
Specifications ................................................................................. 1-7
Table 1-5 Sun Fire ASIC List .......................................................... 1-16
Table 1-6 CPU Numbering ............................................................ 1-21

Table 1-7 IOC AID Numbering ..................................................... 1-23
Table 1-8 Device Path to I/O Card Slot Location Mapping ...... 1-25
Table 1-9 Device Path to I/O Card Slot Location Mapping ...... 1-27
Table 1-10 3800 I/O Assembly AIDs ............................................ 1-29
Table 1-11 Physical Slot Numbers for Sun Fire
3800 Systems ................................................................................ 1-30
Table 1-12 Service Mode Command Summary .......................... 1-33
Table 1-13 OpenBoot PROM error-reset-recovery
Variable ......................................................................................... 1-34
Table 1-14 Even and Odd Parity ................................................... 1-37
Table 2-1 AC/DC Power Supply Ratings ...................................... 2-8
Table 2-2 System Centerplane Configurations ........................... 2-10
Table 2-3 Dual Power Grid Configuration for
Sun Fire 6800 ................................................................................ 2-14
Table 2-4 Component Power Supply Ratings ............................. 2-16
Table 2-5 Board Voltage Requirements ....................................... 2-17
Table 3-1 Relative Bus Performance of Segments
and Domains ................................................................................ 3-11
Table 3-2 Single Segment Configuration for
Sun Fire 48x0/3800 ..................................................................... 3-12
Table 3-3 Dual Segment Configuration for
Sun Fire 48x0/3800 ..................................................................... 3-12
Table 3-4 Sun Fire 48x0/3800 MAC/HostIDs ............................ 3-13
Table 3-5 Single Segment Configuration for 6800 ...................... 3-13
Table 3-6 Dual Segment Configuration for Sun Fire 6800 ......... 3-13
Table 3-7 Sun Fire 6800 MAC/HostIDs ....................................... 3-14

Table 3-8 Sun Fireplane Switch Board Names ............................ 3-15
Table 3-9 Configuration Status for Sun Fire 3800
and 48x0 Platforms ...................................................................... 3-15
Table 3-10 Configuration Status for
the Sun Fire 6800 Platform ......................................................... 3-17
Table 4-1 I/O Assembly Types by Model .................................... 4-7
Table 4-2 PCI Board Slot Characteristics ..................................... 4-10
Table 4-3 cPCI Board Slot Characteristics .................................... 4-14
Table 4-4 3800 cPCI Board Slot Characteristics ............................ 4-17
Table 4-5 I/O Board Power Requirements .................................. 4-18
Table 5-1 Address Repeater Port Interconnects .......................... 5-22
Table 5-2 SDC Port Interconnects ................................................. 5-25
Table 5-3 System and Repeater Board DX Internal
and External Connections .......................................................... 5-30
Table 5-4 Sun Fire 6800 System Board DX Port
Connections .................................................................................. 5-39
Table 5-5 Sun Fire 6800 Connections ............................................ 5-40
Table 5-6 DCDS Memory Interface (M) and CPU
Interface (C) Bit Assignment ..................................................... 5-45
Table 5-7 I/O Board IOC and DX Bit Assignments
with Fireplane Bus Meaning ...................................................... 5-47
Table 6-1 Sun Fire ECC Syndrome Table ..................................... 6-17
Table 6-2 DX ECC Register ............................................................ 6-22
Table 6-3 DX Register Decoded ................................................... 6-23
Table 6-4 DRAM IDs ....................................................................... 6-28
Table 6-5 Reference Designator Position of the
DRAM Chip ................................................................................. 6-30
Table 6-6 Asynchronous Fault Status Register ........................... 6-33
Table 6-7 Asynchronous Fault Address Register ....................... 6-37

Table 6-8 Physical Address Mappings ......................................... 6-39
Table 6-9 AFAR Address Decoded ............................................... 6-41
Table 7-1 Data Origins of Snoop Response Signals .................... 7-12
Table 7-2 Transaction Command Field Values ........................... 7-19
Table 7-3 Address Arbitration Round-Robin Priorities ............. 7-32
Table 7-4 L2 Data Arbitration Determination ............................. 7-34
Table 7-5 SDC ECC Register .......................................................... 7-38
Table 7-6 SDC Register Decoded .................................................. 7-40
Table 8-1 Sun Fire DIMM Capacities (discounting ECC) ............ 8-3
Table 8-2 Memory Bank and DIMM Locations ............................ 8-5
Table 9-1 The SBBC CPU Board Error Register .......................... 9-15
Table 9-2 Global I2C Bus Assignments ........................................ 9-21
Table 9-3 Global I2C Bus Device Address ................................... 9-23
Table 9-4 Local I2C DIMM Locations ........................................... 9-27
Table 9-5 First Error Register on CPU board .............................. 9-33
Table 9-6 First Error Register on System Controller .................. 9-38
Table 9-7 Sun Fire Clock Distribution .......................................... 9-45

Table 10-1 Errors and Probable FRUs .......................................... 10-3
Table 13-1 Cache Characteristics ................................................... C-2


Preface

About This Course
Course Goals
The Advanced Sun Fire™ Mid-Range Troubleshooting course goes beyond
field-replaceable unit (FRU) maintenance by focusing on the
interrelationships between application-specific integrated circuits (ASICs)
and the resulting error outputs.
The preface is designed to introduce students to the course before they introduce themselves
to you and to one another. Once students are familiar with the course content, their introductions
have more meaning in relation to the course prerequisites and objectives.

Upon completion of this course, you should be able to:


List and identify the models, interconnect, and key features of the Sun Fire server product line

Explain the differences in power management among the Sun Fire server models

Identify misconfigured domains and segments and build an action plan to rectify the situation

Identify circuitry on an L1 board using error reports

Describe how the Sun Fire family uses the Fireplane switch boards differently

Given error checking and correction (ECC) reports, identify and diagnose faulty components

Discuss and interpret different types of failure information

Course Map
The following course map shows the general topic areas, and the modules in
each topic area, in relation to the course goal.

The Sun Fire Platform
    Reviewing the Sun Fire Servers
    Power Management and the Frame Manager
    Domains and Segments

The Interconnect
    The System and I/O Boards
    Sun™ Fireplane Interconnect Bus
    Parity and ECC Detection and Recovery

Memory Management
    Caching and Interconnect Operations
    Memory Interleaving

Hardware Control
    Hardware Control Buses

Workshop

Topics Not Covered
This course does not cover the following topics. Many of these topics are
covered in other courses offered by Sun Educational Services:


FRU Maintenance – Covered in IES-SM30: Sun Fire Field Maintenance and Support

ES10K Administration – Covered in ES-400: Sun Enterprise™ 10000 Server Administration

Sun Fire 15K Administration – Covered in IES-421: Sun Fire 15K Server Administration

Storage Area Networks – Covered in ES-475: Design and Administration of Storage Area Networks

Solaris Administration – Covered in SA-289: Solaris™ 8 System Administration II

Refer to the Sun Educational Services catalog for specific information and
registration.

How Prepared Are You?
To be sure you are prepared to take this course, can you answer yes to the
following questions?


Have you attended the IES-SM30 course?

Can you identify and replace FRU components of the Sun Fire product line?

Can you implement domains on the Sun Fire product line?

Can you administer the Solaris Operating Environment?

Do you have six months’ minimum field/on-the-job experience with the Sun Fire product line?
