Adaptive Techniques for Dynamic Processor Optimization: Theory and Practice, by Alice Wang and Samuel Naffziger

Chapter 12 The Challenges of Testing Adaptive Designs 297
The Itanium 2 has a thermal management system very similar to its power
measurement system. Using the same VCO (Figure 12.17) as the power
measurement system, the thermal solution has the resolution to measure
temperature with a precision << 1ºC.


Figure 12.17 Block diagram of thermal measurement. (© IEEE 2006)
However, in order to calibrate the system a known temperature with <<
1ºC of error needs to be supplied by the test environment. The test
environment has to test parts with varying power draw, in a short amount
of time, and with limited thermal probes. To achieve the desired thermal
control in a test environment, the part would need to be submerged in an
oil bath. This is not possible while achieving the required test throughput.
As a result, the accuracy of the thermal monitoring system is not limited
by the processor capabilities, but instead is limited by the capabilities of
the test environment.
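The calibration limit described above can be illustrated with a minimal sketch. It assumes a sensor whose count is linear in temperature and a one-point calibration; the sensitivity, offset, and temperatures are hypothetical values chosen for illustration, not Itanium 2 data.

```python
# Hypothetical sensor: the count rises linearly with temperature.
COUNTS_PER_DEG_C = 100.0   # assumed sensitivity, i.e. 0.01 degC per count
TRUE_OFFSET = 5000.0       # assumed count at 0 degC, unknown to the tester


def sensor_count(true_temp_c):
    """Count the on-die sensor reports at a given true temperature."""
    return round(TRUE_OFFSET + COUNTS_PER_DEG_C * true_temp_c)


def calibrate_offset(raw_count, reference_temp_c):
    """One-point calibration: infer the offset from a believed temperature."""
    return raw_count - COUNTS_PER_DEG_C * reference_temp_c


def read_temperature(raw_count, offset):
    """Convert a raw count to degrees C using the calibrated offset."""
    return (raw_count - offset) / COUNTS_PER_DEG_C


# The tester believes the part sits at 85 degC, but it is really at 87 degC.
believed_temp_c = 85.0
actual_temp_c = 87.0
offset = calibrate_offset(sensor_count(actual_temp_c), believed_temp_c)

# Later, in the field, the part is truly at 60 degC.
reading_c = read_temperature(sensor_count(60.0), offset)
error_c = reading_c - 60.0
print(f"reading = {reading_c:.2f} degC, error = {error_c:+.2f} degC")
```

In this toy model the 2 ºC error in the reference temperature carries over unchanged into every later reading, even though the sensor itself resolves 0.01 ºC.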
As more and more adaptive techniques are used to stretch the capabilities
of silicon, investments will need to be made in validation and test systems
to fully utilize the new capabilities. Adaptive circuit techniques have the
ability to reduce processor guard-bands provided the test infrastructure can
emulate the use conditions adequately.
12.4 Guard-Band Concerns of Adaptive Power Management
After one considers the correctness of adaptable systems, one must deliver
the value that they offer in the product environment. One of the primary

298 Eric Fetzer, Jason Stinson, Brian Cherkauer, Steve Poehlman
manufacturing considerations in designing an adaptive frequency/power
control system is performance variability tolerance. A system based on
any type of analog measurement will inherently be susceptible to part-to-part
variation as well as environmental variation.
For example, the Montecito system that makes an on-die analog
measurement of the power being consumed will be subject to part-to-part
variation: no two parts will have exactly the same mix of leakage and
dynamic power. This means as voltage is raised or lowered, the power
consumed by parts will vary compared to one another. The same is true
with temperature variation, which affects the leakage power but not the
dynamic power. Also, the ideal voltage versus frequency curve is subject
to part-to-part variation, and attempting to optimize this on a per-part basis
will introduce additional variability.
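A toy model can make the part-to-part effect concrete. The sketch below assumes the textbook C·V²·f approximation for dynamic power and an exponential temperature dependence for leakage; both parts and all coefficients are hypothetical, chosen so that the two parts draw the same power at the nominal point.

```python
import math


def dynamic_power(c_eff_farads, voltage, freq_hz):
    """Switching power, using the textbook C * V^2 * f approximation."""
    return c_eff_farads * voltage ** 2 * freq_hz


def leakage_power(i_leak_ref_amps, voltage, temp_c,
                  temp_ref_c=25.0, temp_scale_c=60.0):
    """Toy leakage model (assumption): current grows exponentially with temperature."""
    i_leak = i_leak_ref_amps * math.exp((temp_c - temp_ref_c) / temp_scale_c)
    return i_leak * voltage


def total_power(part, voltage, freq_hz, temp_c):
    """Total power of a part described by its two model coefficients."""
    return (dynamic_power(part["c_eff"], voltage, freq_hz)
            + leakage_power(part["i_leak_ref"], voltage, temp_c))


# Two hypothetical parts with a different leakage/dynamic mix.
PART_A = {"c_eff": 3.0e-8, "i_leak_ref": 20.0}   # more dynamic power
PART_B = {"c_eff": 2.5e-8, "i_leak_ref": 28.8}   # more leakage power

FREQ_HZ = 1.6e9
p_a_nom = total_power(PART_A, 1.1, FREQ_HZ, 25.0)
p_b_nom = total_power(PART_B, 1.1, FREQ_HZ, 25.0)
p_a_hot = total_power(PART_A, 1.1, FREQ_HZ, 85.0)
p_b_hot = total_power(PART_B, 1.1, FREQ_HZ, 85.0)
p_a_highv = total_power(PART_A, 1.2, FREQ_HZ, 25.0)
p_b_highv = total_power(PART_B, 1.2, FREQ_HZ, 25.0)

print(f"nominal:      A = {p_a_nom:.1f} W, B = {p_b_nom:.1f} W")
print(f"hot (85 C):   A = {p_a_hot:.1f} W, B = {p_b_hot:.1f} W")
print(f"raised V:     A = {p_a_highv:.1f} W, B = {p_b_highv:.1f} W")
```

Under these assumptions the two parts match at the nominal point but diverge as soon as temperature or voltage moves, because only the leakage term responds to temperature and the two terms scale differently with voltage.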
This variability can also be a function of more subtle effects such as the
aging of components. Voltage regulator outputs may drift as they age,
cooling systems may provide less airflow, and even the leakage of the
processor itself changes with aging. Thus, it is exceedingly difficult to
make a processor that behaves identically from run-to-run and part-to-part
throughout its lifetime if it depends on an analog power measurement for
the basis of its performance adaptability. Systems that depend on a
temperature measurement to adapt performance are subject to variability
similar to that of systems that measure power directly.
Reducing the number of possible operating conditions from a continuous
curve to a series of a few discrete conditions greatly reduces the exposure
to variability, as most variation will not be enough to move from one
operating condition to the next. However, if absolutely deterministic
behavior is required of a design, another approach is to replace analog
sensing with architectural event counters.
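The effect of discretization can be sketched as follows. The operating points, the headroom thresholds, and the continuous mapping are all hypothetical; the point is only that a small measurement spread maps to a single discrete state while it shifts a continuous setting.

```python
import bisect

# Hypothetical discrete operating points: (frequency in GHz, voltage in V).
OPERATING_POINTS = [(1.2, 0.90), (1.4, 1.00), (1.6, 1.10), (1.8, 1.20)]

# Hypothetical power-headroom thresholds (W) separating the four points.
HEADROOM_THRESHOLDS_W = [5.0, 15.0, 30.0]


def select_discrete_point(headroom_w):
    """Pick the operating point whose headroom band contains the measurement."""
    index = bisect.bisect_right(HEADROOM_THRESHOLDS_W, headroom_w)
    return OPERATING_POINTS[index]


def select_continuous_frequency(headroom_w):
    """Continuous alternative (assumed linear): every watt shifts the frequency."""
    return 1.2 + 0.02 * headroom_w


# The same workload measured on three parts with a +/-3 W analog spread.
measurements_w = [17.0, 20.0, 23.0]
discrete_choices = [select_discrete_point(m) for m in measurements_w]
continuous_choices = [select_continuous_frequency(m) for m in measurements_w]

print("discrete:  ", discrete_choices)
print("continuous:", continuous_choices)
```

A measurement that sits close to a threshold (14 W versus 16 W in this sketch) still lands on different points, which is why discretization reduces the exposure to variability rather than removing it.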
Using architectural counters [19], specific architectural events can serve
as a proxy for power dissipation, by weighting each one according to its
expected contribution to the power. Assuming the weighting is not done
on a part-by-part basis, all processors will behave identically on identical
code streams. This potentially gives up some benefits of the analog
schemes, which squeeze out more from the design by using actual power
or temperature measurements instead of a proxy. However, this event-based
approach guarantees part-to-part and workload-to-workload
repeatability, which also makes benchmarking and design debug much more
straightforward.
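The weighting scheme can be sketched as a weighted sum of event counts over a sampling interval. The event names, per-event energy weights, and idle power below are hypothetical and are not taken from [19] or from any real processor.

```python
# Hypothetical energy weight per architectural event, in nanojoules.
# The same table is used for every part, so the estimate is deterministic.
EVENT_WEIGHTS_NJ = {
    "int_op": 1,
    "fp_op": 3,
    "l1_miss": 5,
    "mem_access": 20,
}

IDLE_POWER_W = 10.0  # assumed fixed baseline power


def estimate_power_watts(event_counts, interval_s):
    """Estimate power from event counts gathered over one sampling interval."""
    energy_nj = sum(EVENT_WEIGHTS_NJ[event] * count
                    for event, count in event_counts.items())
    return IDLE_POWER_W + energy_nj * 1e-9 / interval_s


# Event counts from one hypothetical 1 ms interval of a code stream.
counts = {"int_op": 2_000_000, "fp_op": 500_000,
          "l1_miss": 100_000, "mem_access": 10_000}

estimate_w = estimate_power_watts(counts, interval_s=1e-3)
print(f"estimated power: {estimate_w:.1f} W")
```

Because the estimate depends only on the counts and the fixed table, two parts running an identical code stream produce the same value, whatever their actual leakage or temperature.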

From a manufacturability standpoint, both analog and architectural designs
require similarly sized guard-bands (Adaptive Op. Point, Figure 12.18) to
guarantee power stays within limits. Because of issues in testing and
operation, this guard-band is larger than the guard-band required at a non-
adaptive operating point. From an analog perspective, the design is
dependent on the ability to make an accurate current measurement, often in
the noisy environment of a running system.

[Figure 12.18 plot. Axes: Frequency (GHz) and Voltage (V). Not measured data;
for illustrative purposes only. Labels: No Adapt Op. Point; Adaptive Op. Point;
Worst Case Activity Code @ Pmax; Real App Activity Code @ Pmax; large
guard-band for power measurement variability; small guard-band for test
environment issues.]
Figure 12.18 Comparison of operating point with and without adaptation.
Architectural counters are not subject to analog noise or accuracy, but
they must be placed and weighted carefully in order to provide the best
mapping to power. One drawback of the architectural approach is that the
worst-case power event needs to be well understood to be detected and the
system needs tuning based on silicon-collected data to be accurate.
Another drawback is that it is very difficult to cover data-dependent
power. That is to say, you can map a certain architectural operation to a
given power level, but you cannot easily modify that power level based on
the operands or the specific data being manipulated, as this requires too
deep a penetration of the architectural monitors.
Determinism and repeatability give architectural power estimates a
significant advantage over the analog measurements. Unlike the situation
where the analog measurement-based power management must be disabled
for almost all production testing, an architectural power-based system will
determine steps to maintain a constant power level. While voltage and
frequency responses may not be properly emulated on the tester, the
measurement system itself will behave in a predictable and testable manner.
12.5 Conclusion
From wafer test to final testing of parts in systems, determinism and
repeatability are the cornerstones of bringing a processor design to market.
Adaptive techniques used in modern processors like those demonstrated in
this chapter make determinism and repeatability difficult to achieve. In
some cases, the test infrastructure is not able to keep up with the
processor’s ability to adapt, and as a result the guard-bands that adaptation
is trying to eliminate will remain. Careful planning, along with novel test
techniques like the ones described in this chapter, needs to be employed to
realize the full potential of adaptive techniques. Additional significant
breakthroughs will be required for higher levels of adaptation involving
applications, OS, firmware, system components, and the processor to be
fully production testable.
References
[1] Naffziger, S., et al., “The Implementation of a 2-core Multi-Threaded
Itanium-Family Processor,” IEEE Journal of Solid-State Circuits, Vol. 41,
No. 1, pp. 197–209, Jan. 2006
[2] Thompson, S., et al., “A 90 nm logic technology featuring 50 nm strained
silicon channel transistor, 7 layers of Cu interconnects, low k ILD, and 1 μm²
SRAM cell,” Electron Devices Meeting, 2002. IEDM '02. Digest.
International, pp. 61–64, Dec. 2002
[3] Mahoney, P., Fetzer, E., et al., “Clock distribution on a dual-core, multi-
threaded Itanium®-family processor,” Solid-State Circuits Conference, 2005.
Digest of Technical Papers. ISSCC. 2005 IEEE International, Vol. 1, pp.
292–599, 6–10 Feb. 2005
[4] Anderson, F.E., Wells, J.S., Berta, E.Z., “The core clock system on the next
generation Itanium microprocessor,” Solid-State Circuits Conference, 2002.
Digest of Technical Papers. ISSCC. 2002 IEEE International, Vol. 1, pp.
146–453, 3–7 Feb. 2002
[5] Geannopoulos, G., Dai, X., “An adaptive digital deskewing circuit for clock
distribution networks”, Solid-State Circuits Conference, 1998. Digest of
Technical Papers. 45th ISSCC 1998 IEEE International, pp. 400–401, 5–7
Feb. 1998

[6] Peterson, W.W., Weldon, E.J., Jr., Error-Correcting Codes, 2nd edition,
MIT Press: Cambridge Mass., 1972
[7] Ziegler, J. F., Srinivasan, G. R., et al., “Terrestrial cosmic rays and soft
errors,” IBM Journal of Research and Development, Vol. 40, No. 1, 1996
[8] Ershov, M., Saxena, S., et al., “Dynamic recovery of negative bias
temperature instability in p-type metal-oxide-semiconductor field-effect
transistors,” Applied Physics Letters, Vol. 83, No. 8, pp. 1647–1649,
August 25 2003
[9] Agostinelli, M., et al., “Erratic fluctuations of SRAM cache Vmin at the
90nm process technology node,” Electron Devices Meeting, 2005. IEDM
Technical Digest. IEEE International, pp. 655–658, Dec. 5 2005
[10] McGowen, R., Poirier, C., et al., “Power and Temperature Control on a 90-
nm Itanium Microprocessor,” Solid-State Circuits, IEEE Journal of Vol. 41,
No. 1, pp. 229–237, Jan. 2006
[11] Wayne Needham, Cheryl Prunty, Eng Hong Yeoh, “High Volume
Microprocessor Test Escapes, An Analysis Of Defects Our Tests Are
Missing”, IEEE International Test Conference, pp. 25–34, 1998.
[12] Mike Mayberry, John Johnson, Navid Shahriari, Mike Trip, “Realizing the
Benefits of Structural Test For Intel Microprocessors”, IEEE International
Test Conference, pp. 456–463, 2002.
[13] Ismet Bayraktaroglu, Jim Hunt, Daniel Watkins, “Cache Resident Functional
Microprocessor Testing: Avoiding High Speed IO Issues”, IEEE
International Test Conference, 2006.
[14] Huston, R., “Microprocessor Functional Test Generation on the Sentry 600”,
IEEE International Test Conference, 1974.
[15] Praveen Parvathala, Kailas Maneparambil, William Lindsay, “FRITS – A
Microprocessor Functional BIST Method”, IEEE International Test
Conference, pp. 590–598, 2002.
[16] Krantis, N., Xenoulis, G., Paschalis, A., Gizopoulos, D., Zorian, Y.,
“Application and Analysis of RT-Level Software-Based Self-testing for
Embedded Processor Cores”, IEEE International Test Conference, pp. […]–440.
[17] Wei-Cheng Lai, Kwang-Ting Cheng, “Instruction-Level DFT for Testing
Processor and IP Cores in System-on-a-Chip”, Design Automation
Conference, pp. 59–64, 2001.
[18] Tsang, J., et al., “Picosecond imaging circuit analysis”, IBM Journal of
Research and Development, Vol. 44, No. 4, pp. 583–603, 2000.
[19] Leon, A. S., et al., “A Power-Efficient High-Throughput 32-Thread SPARC
Processor,” IEEE J. Solid-State Circuits, Vol. 42, No. 1, pp. 7–16, Jan. 2007.
[20] Harry Hsiung, “Manufacturing and test Solutions with EFI”, Intel
Developers Forum, 2003.
[21] Peter Maxwell, Ismed Hartanto, Lee Bentz, “Comparing Functional and
Structural Tests”, IEEE International Test Conference, pp. 400–407, 2000.
[22] Satish M. Thatte, Jacob A. Abraham, “Test Generation For Microprocessors”,
IEEE Transactions On Computers, Vol. 29, No. 6, pp. 429–441.
[23] Advanced Configuration and Power Interface Specification, rev 3.0b,
o/spec.htm, October 2006
Index
Adaptive body-bias, 25, 45, 77
Adaptive voltage scaling, 25
Aging, 87, 151
negative bias temperature
instability (NBTI), 11
Asynchronous design, 230
bundled data, 230
dual-rail, 231
Asynchronous latch controller, 240

Body-bias, 2, 12, 20
adaptive, 4, 25, 45, 77

controller, 88
forward, 27, 60
reverse, 27, 55

Canary circuits, 179
Clock generation, 138
Clocking
jitter, 150
skew, 150, 274
Control loop, 199
Critical path, 145, 210

DC-DC, 108
inductor-based, 109
switched-cap, 110
Device sizing, 98
Drain induced barrier lowering
(DIBL), 17, 50
Dynamic voltage scaling (DVS), 26,
50, 95, 123, 126, 176

Error correction coding, 106, 277
Error detection, 182

Frequency island, 207–208
Frequency optimization, 33
Globally asynchronous, locally
synchronous (GALS), 208
Guardbands, 299


Hardware and software control, 68

In-situ monitor, 181

Leakage current
gate, 2, 17, 50
gate edge diode leakage (GEDL), 18
gate induced diode leakage
(GIDL), 20, 39
subthreshold, 2, 17, 50
Leakage current monitor, 56
Low-dropout (LDO), 109

Manufacturing test, 272, 279
ATPG, 280
clock de-skew, 288
power management, 289
wafer sort, 280
Microprocessor, 121
Minimum energy tracking, 112

Negative bias temperature instability
(NBTI), 11
Noise, 145

Operating system control (OS), 70

Performance monitor, 128
PLL, 87, 138
Power monitor, 279

Power optimization, 33
Process variation, 41, 79, 145, 149,
175, 207, 210, 267
die-to-die, 79
Random dopant fluctuations, 11
Ring oscillator, 33

Shadow latch, 187
Short-channel effect, 59
SRAM, 101, 134, 249
active sleep, 260
bias generator, 262
passive sleep, 261
read assist, 257
reliability, 267
replica path, 258
soft errors, 267
subthreshold, 107
timing, 257
write assist, 253
Static noise margin (SNM), 134
flip-flops, 97
read, 104, 250
SRAM, 104
write, 250
Sub-threshold CMOS, 97
Supply voltage variation, 150, 177

Technology scaling, 1, 26, 75, 175
Temperature variation, 7, 57, 150,
177, 207, 217
Threshold-voltage variation, 13

Ultra dynamic voltage scaling, 95


Variable channel-length, 5
Variable frequency scaling, 207
Variable threshold CMOS
(VTCMOS), 55
Voltage/frequency hopping, 51
Voltage controlled oscillator
(VCO), 280
Voltage regulator, 278
Voltage scaling, 2
adaptive, 25


