SELF-SIMILAR
NETWORK TRAFFIC
AND PERFORMANCE
EVALUATION
Self-Similar Network Traffic and Performance Evaluation
Edited by Kihong Park, Walter Willinger
Copyright © 2000 John Wiley & Sons, Inc.
ISBNs: 0-471-31974-0 (Hardback); 0-471-20644-X (Electronic)
Edited by
KIHONG PARK
Purdue University
WALTER WILLINGER
AT&T Labs-Research
A Wiley-Interscience Publication
JOHN WILEY & SONS, INC.
New York · Chichester · Weinheim · Brisbane · Singapore · Toronto
Designations used by companies to distinguish their products are often claimed as trademarks. In all
instances where John Wiley & Sons, Inc., is aware of a claim, the product names appear in initial capital or
ALL CAPITAL LETTERS. Readers, however, should contact the appropriate companies for more complete
information regarding trademarks and registration.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form
or by any means, electronic or mechanical, including uploading, downloading, printing, decompiling,
recording or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States
Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher
for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc.,
605 Third Avenue, New York, NY 10158-0012, (212) 850-6011, fax (212) 850-6008,
E-Mail: PERMREQ@WILEY.COM.
This publication is designed to provide accurate and authoritative information in regard to the subject
matter covered. It is sold with the understanding that the publisher is not engaged in rendering professional
services. If professional advice or other expert assistance is required, the services of a competent
professional person should be sought.
ISBN 0-471-20644-X
This title is also available in print as ISBN 0-471-31974-0.
For more information about Wiley products, visit our web site at www.Wiley.com.
Copyright © 2000 by John Wiley & Sons, Inc. All rights reserved.
CONTRIBUTORS
Abdelnaser Adas, Conexant, Inc., Newport Beach, California, USA
P. Abry, CNRS UMR 5672, École Normale Supérieure de Lyon, Laboratoire de
Physique, Lyon, France
O. J. Boxma, Eindhoven University of Technology, Eindhoven, The Netherlands
and CWI, Amsterdam, The Netherlands
F. Brichet, France Télécom, CNET, Issy-Moulineaux, France
J. W. Cohen, CWI, Amsterdam, The Netherlands
Mark E. Crovella, Boston University, Boston, Massachusetts, USA
N. G. Duffield, AT&T Labs-Research, Florham Park, New Jersey, USA
Anja Feldmann, University of Saarbrücken, Saarbrücken, Germany
P. Flandrin, CNRS UMR 5672, École Normale Supérieure de Lyon, Laboratoire de
Physique, Lyon, France
Daniel P. Heyman, AT&T Labs, Middleton, New Jersey, USA
Philippe Jacquet, INRIA, Le Chesnay, France
P. R. Jelenković, Columbia University, New York, New York, USA
Gitae Kim, Boston University, Boston, Massachusetts, USA
T. V. Lakshman, Bell Laboratories, Lucent Technologies, Holmdel, New Jersey,
USA
Guang-Liang Li, The University of Hong Kong, Hong Kong, China
Victor O.K. Li, The University of Hong Kong, Hong Kong, China
N. Likhanov, Institute for Problems of Information Transmission, Russian Academy
of Science, Moscow, Russia
Lester Lipsky, University of Connecticut, Storrs, Connecticut, USA
Armand M. Makowski, University of Maryland, College Park, Maryland, USA
L. Massoulié, Microsoft Research Ltd., Cambridge, United Kingdom
Amarnath Mukherjee, Knoltex Corporation, San Jose, California, USA
Ilkka Norros, VTT Information Technology, Espoo, Finland
Kihong Park, Purdue University, West Lafayette, Indiana, USA
Minothi Parulekar, University of Maryland, College Park, Maryland, USA
R. H. Riedi, Rice University, Houston, Texas, USA
Sidney Resnick, Cornell University, Ithaca, New York, USA
J. W. Roberts, France Télécom, CNET, Issy-Moulineaux, France
Gennady Samorodnitsky, Cornell University, Ithaca, New York, USA
A. Simonian, France Télécom, CNET, Issy-Moulineaux, France
M. S. Taqqu, Boston University, Boston, Massachusetts, USA
Tsunyi Tuan, Purdue University, West Lafayette, Indiana, USA
D. Veitch, Software Engineering Research Centre, Carlton, Victoria, Australia
W. Whitt, AT&T Labs-Research, Florham Park, New Jersey, USA
Walter Willinger, AT&T Labs-Research, Florham Park, New Jersey, USA
CONTENTS
Preface
1 Self-Similar Network Traffic: An Overview
Kihong Park and Walter Willinger
2 Wavelets for the Analysis, Estimation, and Synthesis of Scaling Data
P. Abry, P. Flandrin, M. S. Taqqu, and D. Veitch
3 Simulations with Heavy-Tailed Workloads
Mark E. Crovella and Lester Lipsky
4 Queueing Behavior Under Fractional Brownian Traffic
Ilkka Norros
5 Heavy Load Queueing Analysis with LRD On/Off Sources
F. Brichet, A. Simonian, L. Massoulié, and D. Veitch
6 The Single Server Queue: Heavy Tails and Heavy Traffic
O. J. Boxma and J. W. Cohen
7 Fluid Queues, On/Off Processes, and Teletraffic Modeling with Highly Variable and Correlated Inputs
Sidney Resnick and Gennady Samorodnitsky
8 Bounds on the Buffer Occupancy Probability with Self-Similar Input Traffic
N. Likhanov
9 Buffer Asymptotics for M/G/∞ Input Processes
Armand M. Makowski and Minothi Parulekar
10 Asymptotic Analysis of Queues with Subexponential Arrival Processes
P. R. Jelenković
11 Traffic and Queueing from an Unbounded Set of Independent Memoryless On/Off Sources
Philippe Jacquet
12 Long-Range Dependence and Queueing Effects for VBR Video
Daniel P. Heyman and T. V. Lakshman
13 Analysis of Transient Loss Performance Impact of Long-Range Dependence in Network Traffic
Guang-Liang Li and Victor O.K. Li
14 The Protocol Stack and Its Modulating Effect on Self-Similar Traffic
Kihong Park, Gitae Kim, and Mark E. Crovella
15 Characteristics of TCP Connection Arrivals
Anja Feldmann
16 Engineering for Quality of Service
J. W. Roberts
17 Network Design and Control Using On/Off and Multilevel Source Traffic Models with Heavy-Tailed Distributions
N. G. Duffield and W. Whitt
18 Congestion Control for Self-Similar Network Traffic
Tsunyi Tuan and Kihong Park
19 Quality of Service Provisioning for Long-Range-Dependent Real-Time Traffic
Abdelnaser Adas and Amarnath Mukherjee
20 Toward an Improved Understanding of Network Traffic Dynamics
R. H. Riedi and Walter Willinger
21 Future Directions and Open Problems in Performance Evaluation and Control of Self-Similar Network Traffic
Kihong Park
Index

PREFACE
The recent discovery of scaling phenomena in modern communication networks
involving self-similarity or fractals and power-law or heavy-tailed distributions is yet
another realization of Benoit Mandelbrot's vision of order in physical, social, and
engineered systems characterized by scaling laws. Since the seminal paper by
Leland, Taqqu, Willinger, and Wilson in 1993, which set the groundwork for
considering self-similarity a ubiquitous feature of empirically observed network
traffic and an important notion in the understanding of the traffic's dynamic nature
for modeling, analysis, and control of network performance, an explosion of work has
ensued investigating the multifaceted nature of this phenomenon.
Despite the fact that data networks such as the Internet are drastically different
from legacy public switched telephone networks, the long-held paradigm in the
communication and networking research community has been that data traffic,
analogous to voice traffic, is adequately described by certain Markovian models
which are amenable to accurate analysis and efficient control. This supposition has
been instrumental in shaping the optimism permeating the late 1980s and early
1990s regarding the ability to achieve efficient traffic control for quality of service
provisioning in modern high-speed communication networks. The discovery and,
more importantly, succinct formulation and recognition that actual data traffic may,
in fact, be fundamentally different in nature from the hitherto accustomed telephony
traffic has significantly influenced the networking research landscape, necessitating a
reexamination and revamping of some of its basic premises.
This book is a collection of chapter contributions which brings together relevant
past works spanning a cross-section of topics covering traffic measurement, modeling,
performance analysis, and traffic control for self-similar network traffic. The
primary objective of the book is to present a comprehensive yet cohesive account of
some of the principal developments and results concerning self-similar network
traffic across its various facets, with the aim of serving as a reflective milestone that
captures the state of the art in the field. The book is organized around three main
subtopics: traffic modeling, queueing-based performance analysis, and traffic
control. By and large, the chapters reflect how research in these areas has reacted
when faced with the new scientific discoveries involving self-similarity and the
ubiquitous presence of heavy-tailed phenomena in networked systems.
The spectrum of reactions ranges from evolutionary (holding on to traditional
frameworks and tested concepts, and trying to extend and generalize them in the
presence of unfamiliar characteristics that, in many ways, contradict conventional
wisdom) all the way to revolutionary, which embraces the novel and, at times,
surprising features giving rise to new questions, research problems, and challenges
on both theoretical and practical fronts of relevance to the future Internet. Overall,
the reader may find the majority of book chapters to be of an evolutionary rather than
revolutionary nature: Many of the problems that have been considered in the past
and have been assumed to fit into the powerful, but also mathematically convenient,
framework of Markovian analysis are being reformulated and analyzed to incorporate
the slowly improving understanding of data traffic. More fundamental issues,
such as whether or not these problems are still relevant in light of the stark contrast
between hitherto assumed properties of network traffic and observed reality, have
attracted less attention to date. In this sense, the book chapters give a sense of how
science, in many instances, works when faced with new discoveries and realities, and
they also illustrate how a "give and take" between traditional approaches, on the one
side, and unconventional thinking, on the other side, can lead to progress, thus
advancing our overall understanding in the various subtopics covered in this book. It
will be interesting to observe if, and when, future developments in these areas will
require more concentrated focus on revolutionary ideas and approaches to networking
research and practice, especially as far as network performance analysis and
traffic control are concerned.
The chapter contributions have been organized into three parts: (i) estimation and
simulation, (ii) queueing with self-similar input, and (iii) traffic control and resource
provisioning. The threefold categorization is not strict in the sense that some
chapters encompass subject matters that cross the set boundaries. Chapter 1, in
addition to serving as an introductory chapter which provides the necessary background
and technical know-how for understanding self-similar traffic that is common
to many of the chapters, also gives a bird's-eye view of each chapter and how it fits into
the overall picture, and comments on its role and potential relevance for future
advances. The remaining two chapters in Part I deal with traffic characterization,
estimation, and modeling issues. Wavelet analysis is introduced as a powerful
technique for both modeling and estimation in self-similar traffic. Augmenting the
theme of traffic modeling are issues surrounding simulations, such as those arising in
the generation of self-similar traffic and workloads, which entails, in many instances,
sampling from heavy-tailed distributions requiring special considerations.
The second part of the book consists of ten chapters and focuses on traditional
performance evaluation issues, in particular, queueing behavior of finite and infinite
buffer systems when fed with long-range dependent input. Due to the breakdown of
Markovian assumptions which are key to achieving tractable analysis in traditional
queueing analysis, the technical challenges encountered with self-similar input are
great, and this part of the book exposes what is known about queueing with self-similar
input, above and beyond the phenomenon that queue length distribution
decays polynomially and not exponentially. The traffic models employed, to a large
extent, can be viewed as variants of on/off renewal reward processes where session
arrivals are allowed to be Poisson; however, on- or off-periods, which correspond to
busy and idle transmission times, respectively, are heavy-tailed. Starting with
Chapter 4, many of the chapters employ asymptotic techniques to investigate tail
behavior in queueing systems which, in turn, is related to buffer overflow or packet
drop probabilities. Chapters 8, 9, and 10 provide asymptotic bounds on the tail
probability. Chapter 12 discusses a traditional, Markovian view of modeling and
analyzing variable bit rate video traces which represents a form of extreme adherence
to conventional techniques and a world view which has its roots in telephony traffic.
Chapter 13 provides a form of transient analysis which, in spite of its elementary
nature, is a useful exercise and points toward the need for nonequilibrium analysis.
A total of six chapters make up the third part of the book, which is mainly
concerned with traffic control and dynamic resource provisioning issues that arise
under self-similar traffic conditions. There are two aspects to the question, one
centered on the problem of resource provisioning/dimensioning and ensuing trade-off
relations, and the other based on the traditional traffic control framework of
feedback control and its implementation in network protocols. With respect to
resource provisioning, due to the amplified queueing delay incurred when employing
buffer dimensioning, an alternative resource provisioning strategy based on bandwidth
dimensioning as the central control variable has been advanced. A high-level
discussion is provided in Chapter 16. Chapter 17 provides analysis of bufferless
systems and long-range dependent processes whose future behavior is conditioned
on past behavior, which are relevant to on-line resource provisioning and traffic
control. Chapter 19 describes a concrete resource provisioning architecture based on
framing. Feedback traffic control presents a more subtle challenge to traffic management
where the central idea revolves around exploiting correlation structure at
multiple time scales, as afforded by long-range dependence and self-similarity, to
affect traffic control decisions executed at smaller time scales. Chapter 14 discusses
the influence of the protocol stack on network traffic, and Chapter 15 gives a
detailed characterization of TCP-based connection arrivals and network traffic, which
constitutes the bulk of current Internet traffic. Chapter 18 introduces the multiple
time scale congestion control framework and its use in self-similar traffic for
throughput maximization.
We conclude the book with two overview chapters which seek to take stock of
known results, and point toward research avenues and open problems that may
benefit from concerted efforts by the research community. Chapter 20 gives a broad
overview of traffic characterization and modeling issues, with focus on achieving a
comprehensive and refined understanding of network traffic spanning both long and
short time scales. Chapter 21 describes a set of research problems and themes
categorized into workload characterization, performance analysis, and traffic control.
Some problems are more aptly described as research programs whereas others are
more focused in their scope and nature.
As co-editors, we greatly appreciate the generous efforts of all the contributors to
this volume. Because of their cooperation, flexibility, and willingness in helping us
achieve a measure of coherence and balanced representation, this project has been a
productive and timely occasion, and a delightful experience for us. We are confident
that despite the rapidly changing conditions that have become a trademark of
modern communication networks, this book contains insights and lessons that are
less transient and will withstand the test of time. We hope the book will be of service
as a comprehensive, in-depth, and up-to-date reference on self-similar network
traffic for the larger networking and communication research communities. Our
work would have been much more difficult and time consuming without the help
of Wiley and its professional staff, especially Andrew Smith, who participated
in the initial idea of the book, and Rosalyn Farkas, who provided critical editing
support. We would like to extend our appreciation and thanks.
KIHONG PARK, Purdue University
WALTER WILLINGER, AT&T Labs
May 2000
1
SELF-SIMILAR NETWORK TRAFFIC:
AN OVERVIEW
KIHONG PARK
Network Systems Lab, Department of Computer Sciences,
Purdue University, West Lafayette, IN 47907
WALTER WILLINGER
Information Sciences Research Center, AT&T Labs-Research, Florham Park, NJ 07932
1.1 INTRODUCTION
1.1.1 Background
Since the seminal study of Leland, Taqqu, Willinger, and Wilson [41], which set the
groundwork for considering self-similarity an important notion in the understanding
of network traffic including the modeling and analysis of network performance, an
explosion of work has ensued investigating the multifaceted nature of this phenomenon.^1
The long-held paradigm in the communication and performance communities
has been that voice traffic and, by extension, data traffic are adequately described by
certain Markovian models (e.g., Poisson), which are amenable to accurate analysis
and efficient control. The first property stems from the well-developed field of
Markovian analysis, which allows tight equilibrium bounds on performance variables
such as the waiting time in various queueing systems to be found. This also
forms a pillar of performance analysis from the queueing theory side [38]. The
second feature is, in part, due to the simple correlation structure generated by
Markovian sources whose performance impact (for example, as affected by the
likelihood of prolonged occurrence of "bad events" such as concentrated packet
arrivals) is fundamentally well-behaved. Specifically, if such processes are appropriately
rescaled in time, the resulting coarsified processes rapidly lose dependence,
taking on the properties of an independent and identically distributed (i.i.d.)
sequence of random variables with its associated niceties. Principal among them
is the exponential smallness of rare events, a key observation at the center of large
deviations theory [70].
^1 For a nontechnical account of the discovery of the self-similar nature of network traffic, including parallel
efforts and important follow-up work, we refer the reader to Willinger [71]. An extended list of references
that includes works related to self-similar network traffic and performance modeling up to about 1995 can
be found in the bibliographical guide [75].
The behavior of a process under rescaling is an important consideration in
performance analysis and control since buffering and, to some extent, bandwidth
provisioning can be viewed as operating on the rescaled process. The fact that
Markovian systems admit to this avenue of taming variability has helped shape the
optimism permeating the late 1980s and early 1990s regarding the feasibility of
achieving efficient traffic control for quality of service (QoS) provisioning. The
discovery and, more importantly, succinct formulation and recognition that data
traffic may not exhibit the hitherto accustomed scaling properties [41] has significantly
influenced the networking landscape, necessitating a reexamination of some
of its fundamental premises.
1.1.2 What Is Self-Similarity?
Self-similarity and fractals are notions pioneered by Benoit B. Mandelbrot [47].
They describe the phenomenon where a certain property of an object (for example,
a natural image, the convergent subdomain of certain dynamical systems, or a time
series, the mathematical object of our interest) is preserved with respect to scaling
in space and/or time. If an object is self-similar or fractal, its parts, when magnified,
resemble, in a suitable sense, the shape of the whole. For example, the two-dimensional
(2D) Cantor set living on $A = [0,1] \times [0,1]$ is obtained by starting with
a solid or black unit square, scaling its size by $1/3$, then placing four copies of the
scaled solid square at the four corners of $A$. If the same process of scaling followed
by translation is applied recursively to the resulting objects ad infinitum, the limit set
thus reached defines the 2D Cantor set. This constructive process is illustrated in Fig.
1.1. The limiting object, defined as the infinite intersection of the iterates, has the
property that if any of its corners are "blown up" suitably, then the shape of the
zoomed-in part is similar to the shape of the whole, that is, it is self-similar. Of
course, this is not too surprising since the constructive process, by its recursive
action, endows the limiting object with the scale-invariance property.
Fig. 1.1 Two-dimensional Cantor set.
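The recursive scale-and-translate construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cantor_2d` and the representation of each square as an `(x, y, side)` triple of exact fractions are choices made here, not part of the original text.

```python
from fractions import Fraction


def cantor_2d(n):
    """Return the solid squares of iteration n as (x, y, side) triples.

    (x, y) is the lower-left corner of a square inside the unit square A.
    Iteration 0 is the unit square itself.
    """
    squares = [(Fraction(0), Fraction(0), Fraction(1))]
    for _ in range(n):
        next_squares = []
        for x, y, side in squares:
            s = side / 3  # scale the parent square by 1/3
            # place four scaled copies at the four corners of the parent
            for dx in (0, 2 * s):
                for dy in (0, 2 * s):
                    next_squares.append((x + dx, y + dy, s))
        squares = next_squares
    return squares


if __name__ == "__main__":
    for n in range(4):
        squares = cantor_2d(n)
        area = sum(side * side for _, _, side in squares)
        print(n, len(squares), area)
```

Each iteration multiplies the number of squares by four and the side length by 1/3, so the total covered area shrinks by a factor of 4/9 per step; the limit set is what remains in every iterate.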
The one-dimensional (1D) Cantor set, for example, as obtained by projecting the
2D Cantor set onto the line, can be given an interpretation as a traffic series
$X(t) \in \{0, 1\}$ (call it "Cantor traffic"), where $X(t) = 1$ means that there is a packet
transmission at time $t$. This is depicted in Fig. 1.2 (left). If the constructive process is
terminated at iteration $n \ge 0$, then the contiguous line segments of length $1/3^n$ may
be interpreted as on periods or packet trains of duration $1/3^n$, and the segments
between successive on periods as off periods or absence of traffic activity. Nonuniform
traffic intensities may be imparted by generalizing the constructive framework
via the use of probability measures. For example, for the 1D Cantor set, instead of
letting the left and right components after scaling have identical "mass," they may be
assigned different masses, subject to the constraint that the total mass be preserved at
each stage of the iterative construction. This modification corresponds to defining a
probability measure $\mu$ on the Borel subsets of $[0, 1]$ and distributing the measure at
each iteration nonuniformly left and right. Note that the classical Cantor set
construction, viewed as a map, is not measure-preserving. Figure 1.2 (middle)
shows such a construction with weights $a_L = 2/3$, $a_R = 1/3$ for the left and right
components, respectively. The probability measure is represented by "height"; we
observe that scale invariance is exactly preserved. In general, the traffic patterns
producible with fixed weights $a_L$, $a_R$ are limited, but one can extend the framework
by allowing possibly different weights associated with every edge in the weighted
binary tree induced by the 1D Cantor set construction. Such constructions arise in a
more refined characterization of network traffic, called multiplicative processes or
cascades, and are discussed in Chapter 20. Further generalizations can be obtained
by defining different affine transformations with variable scale factors and translations
at every level in the "traffic tree." The corresponding traffic pattern is self-similar
if, and only if, the infinite tree can be compactly represented as a finite
directed cyclic graph [8].
Fig. 1.2 Left: One-dimensional Cantor set interpreted as on/off traffic. Middle: One-dimensional
nonuniform Cantor set with weights $a_L = 2/3$, $a_R = 1/3$. Right: Cumulative process
corresponding to 1D on/off Cantor traffic.
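The mass-preserving, nonuniform construction with fixed weights can be sketched as follows. The function name `cantor_measure`, its parameter names, and the `(start, length, mass)` representation are assumptions made for illustration; only the weights $a_L = 2/3$ and $a_R = 1/3$ come from the text.

```python
from fractions import Fraction


def cantor_measure(n, a_left=Fraction(2, 3), a_right=Fraction(1, 3)):
    """Return [(start, length, mass), ...] for the 2**n intervals of iteration n.

    At each step an interval keeps its left and right thirds, and its mass
    is split between them with weights a_left and a_right.
    """
    if a_left + a_right != 1:
        raise ValueError("weights must sum to 1 so that total mass is preserved")
    pieces = [(Fraction(0), Fraction(1), Fraction(1))]
    for _ in range(n):
        refined = []
        for start, length, mass in pieces:
            third = length / 3
            refined.append((start, third, mass * a_left))               # left third
            refined.append((start + 2 * third, third, mass * a_right))  # right third
        pieces = refined
    return pieces


if __name__ == "__main__":
    for start, length, mass in cantor_measure(2):
        print(start, length, mass)
```

Because the same pair of weights is applied at every level, the mass pattern inside any surviving interval is a rescaled copy of the whole. Allowing a different weight pair on every edge of the binary tree, as the text suggests, would only require passing the weights per level or per node instead of as two constants.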
Whereas the previous constructions are given interpretations as traffic activity
per unit time, we will find it useful to consider their corresponding cumulative
processes, which are nondecreasing processes whose differences (also called the
increment process) constitute the original process. For example, for the on/off
Cantor traffic construction (cf. Fig. 1.2 (left)), let us assign the interpretation that
time is discrete such that at step $n \ge 0$, it ranges over the values
$t = 0, 1/3^n, 2/3^n, \ldots, (3^n - 1)/3^n, 1$. Thus we can equivalently index the discrete time
steps by $i = 0, 1, 2, \ldots, 3^n$. With a slight abuse of notation, let us redefine $X(\cdot)$
as $X(i) = 1$ if, and only if, in the original process $X(i/3^n) = 1$ and $X(i/3^n - \epsilon) = 1$
for all $0 < \epsilon < 1/3^n$. That is, for $i$ values for which an on period in the original
process $X(t)$ begins at $t = i/3^n$, $X(i)$ is defined to be zero. Thus, in the case of $n = 2$,
we have
$X(0) = 0$, $X(1) = 1$, $X(2) = 0$, $X(3) = 1$, $X(4) = 0$,
$X(5) = 0$, $X(6) = 0$, $X(7) = 1$, $X(8) = 0$, $X(9) = 1$.
Now consider the continuous time process $Y(t)$ shown in Fig. 1.2 (right) defined
over $[0, 3^n]$ for iteration $n$. $Y(t)$ is nondecreasing and continuous, and it can be
checked by visual inspection that
$X(i) = Y(i) - Y(i - 1)$, $i = 1, 2, \ldots, 3^n$,
and $X(0) = Y(0) = 0$. Thus $Y(t)$ represents the total traffic volume up to time $t$,
whereas $X(i)$ represents the traffic intensity during the $i$th interval. Most importantly,
we observe that exact self-similarity is preserved even in the cumulative process.
This points toward the fact that self-similarity may be defined with respect to a
cumulative process with its increment process (which is of more relevance for
traffic modeling) "inheriting" some of its properties including self-similarity.
An important drawback of our constructions thus far is that they admit only a
strong form of recursive regularity, that of deterministic self-similarity, and need
to be further generalized for traffic modeling purposes where stochastic variability is
an essential component.
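The discrete on/off series $X(i)$ and the values of its cumulative process at integer times can be reproduced with a short Python sketch. The function names are hypothetical, and the base-3 digit test is a shortcut chosen here (a slot survives the middle-third removal exactly when none of the $n$ base-3 digits of its index equals 1) rather than anything stated in the text.

```python
def cantor_on_slots(n):
    """Indices i in 1..3**n whose slot [(i-1)/3**n, i/3**n] is an on period."""
    slots = []
    for i in range(1, 3**n + 1):
        k, keep = i - 1, True
        for _ in range(n):  # inspect the n base-3 digits of the slot index i-1
            k, digit = divmod(k, 3)
            if digit == 1:  # the slot lies in some removed middle third
                keep = False
                break
        if keep:
            slots.append(i)
    return slots


def cantor_traffic(n):
    """Return [X(0), X(1), ..., X(3**n)] with X(0) = 0."""
    on = set(cantor_on_slots(n))
    return [1 if i in on else 0 for i in range(3**n + 1)]


def cumulative(x):
    """Return [Y(0), Y(1), ...] where Y(i) = X(0) + ... + X(i)."""
    y, total = [], 0
    for value in x:
        total += value
        y.append(total)
    return y


if __name__ == "__main__":
    x = cantor_traffic(2)
    y = cumulative(x)
    print(x)
    print(y)
```

For $n = 2$ the series agrees with the ten values listed above, and by construction the increments of `cumulative(x)` give back `x`, which is the relation $X(i) = Y(i) - Y(i-1)$ sampled at integer times.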
1.1.3 Stochastic Self-Similarity and Network Traffic
Stochastic self-similarity admits the infusion of nondeterminism as necessitated by
measured traffic traces but, nonetheless, is a property that can be illustrated visually.
Figure 1.3 (top left) shows a traffic trace, where we plot throughput, in bytes, against
time where time granularity is 100 s. That is, a single data point is the aggregated
traffic volume over a 100 second interval. Figure 1.3 (top right) is the same traffic
series whose first 1000 second interval is "blown up" by a factor of ten. Thus the
truncated time series has a time granularity of 10 s. The remaining two plots zoom in
further on the initial segment by rescaling successively by factors of 10.
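The relationship between the four views can be made concrete with a small aggregation helper. This is a sketch under assumptions: it presumes the trace is available as a list of byte counts at the finest granularity, and the name `aggregate` is chosen here for illustration.

```python
def aggregate(series, m):
    """Sum consecutive non-overlapping blocks of m samples.

    Any incomplete block at the end of the series is dropped.
    """
    usable = len(series) - len(series) % m
    return [sum(series[i:i + m]) for i in range(0, usable, m)]


if __name__ == "__main__":
    # Hypothetical stand-in for a trace of byte counts per 100 ms slot.
    fine = [(7 * i) % 13 for i in range(10_000)]
    per_second = aggregate(fine, 10)        # 1 s granularity
    per_10_s = aggregate(per_second, 10)    # 10 s granularity
    per_100_s = aggregate(per_10_s, 10)     # 100 s granularity
    print(len(fine), len(per_second), len(per_10_s), len(per_100_s))
```

Read in the other direction, each zoom step in the figure corresponds to taking the initial tenth of the time axis and viewing it at a granularity ten times finer.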
Unlike deterministic fractals, the objects corresponding to Fig. 1.3 do not possess
exact resemblance of their parts with the whole at finer details. Here, we assume that
the measure of "resemblance" is the shape of a graph with the magnitude suitably
normalized. Indeed, for measured traffic traces, it would be too much to expect to
observe exact, deterministic self-similarity given the stochastic nature of many
network events (e.g., source arrival behavior) that collectively influence actual
network traffic. If we adopt the view that traffic series are sample paths of stochastic
processes and relax the measure of resemblance, say, by focusing on certain statistics
of the rescaled time series, then it may be possible to expect exact similarity of the
mathematical objects and approximate similarity of their specific realizations with
respect to these relaxed measures. Second-order statistics are statistical properties
that capture burstiness or variability, and the autocorrelation function is a yardstick
with respect to which scale invariance can be fruitfully defined. The shape of the
autocorrelation function, above and beyond its preservation across rescaled time
series, will play an important role. In particular, correlation, as a function of time
lag, is assumed to decrease polynomially as opposed to exponentially. The existence
of nontrivial correlation "at a distance" is referred to as long-range dependence. A
formal definition is given in Section 1.4.1.
Fig. 1.3 Stochastic self-similarity, in the "burstiness preservation sense," across time
scales 100 s, 10 s, 1 s, 100 ms (top left, top right, bottom left, bottom right).
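To see why the shape of the autocorrelation function matters under rescaling, the following sketch compares the variance of the mean of $m$ consecutive samples for two illustrative autocorrelation functions. Everything specific here is an assumption made for illustration and not taken from the text: the function names, the exponential form $\rho^k$ with $\rho = 0.5$, and the polynomial form $(1 + k)^{-\beta}$ with $\beta = 0.5$. The only ingredient used is the standard identity for a unit-variance stationary series, $\mathrm{Var}(\bar{X}_m) = \frac{1}{m}\left[1 + 2\sum_{k=1}^{m-1}(1 - k/m)\, r(k)\right]$.

```python
def variance_of_block_mean(r, m):
    """Variance of the mean of m consecutive samples of a unit-variance
    stationary series whose autocorrelation at lag k is r(k)."""
    return (1 + 2 * sum((1 - k / m) * r(k) for k in range(1, m))) / m


def exponential(k, rho=0.5):
    """Exponentially decaying autocorrelation (illustrative choice)."""
    return rho ** k


def polynomial(k, beta=0.5):
    """Polynomially decaying autocorrelation (illustrative choice)."""
    return (1 + k) ** (-beta)


if __name__ == "__main__":
    for m in (1, 10, 100, 1000):
        v_exp = variance_of_block_mean(exponential, m)
        v_pol = variance_of_block_mean(polynomial, m)
        # m * variance stays bounded in the exponential case and keeps
        # growing in the polynomial case.
        print(m, m * v_exp, m * v_pol)
```

With exponential decay the product $m \cdot \mathrm{Var}(\bar{X}_m)$ levels off (for $\rho = 0.5$ it approaches $(1+\rho)/(1-\rho) = 3$), so averaging over longer blocks smooths the series about as fast as it would for independent samples. With the polynomial choice the same product keeps growing with $m$, which is one way of seeing burstiness persist across time scales.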
1.2 PREVIOUS RESEARCH
1.2.1 Measurement-Based Traffic Modeling
The research avenues relating to traffic self-similarity may broadly be classified into
four categories. In the first category are works pertaining to measurement-based
traffic modeling [13, 26, 34, 42, 56, 74], where traffic traces from physical networks
are collected and analyzed to detect, identify, and quantify pertinent characteristics.
They have shown that scale-invariant burstiness or self-similarity is a ubiquitous
phenomenon found in diverse contexts, from local-area and wide-area networks to IP
and ATM protocol stacks to copper and fiber optic transmission media. In particular,
Leland et al. [41] demonstrated self-similarity in a LAN environment (Ethernet),
Paxson and Floyd [56] showed self-similar burstiness manifesting itself in pre-World
Wide Web WAN IP traffic, and Crovella and Bestavros [13] showed self-similarity
for WWW traffic. Collectively, these measurement works constituted strong
evidence that scale-invariant burstiness was not an isolated, spurious phenomenon
but rather a persistent trait existing across a range of network environments.
Accompanying the traf®c characterization efforts has been work in the area of
statistical and scienti®c inference that has been essential to the detection and

quanti®cation of self-similarity or long-range dependence.
2
This work has speci®-
cally been geared toward network traf®c self-similarity [28, 64] and has focused on
exploiting the immense volume, high quality, and diversity of available traf®c
measurements; for a detailed discussion of these and related issues, see Willinger
and Paxson [72, 73]. At a formal level, the validity of an inference or estimation
technique is tied to an underlying process that presumably generated the data in the
®rst place. Put differently, correctness of system identi®cation only holds when the
data or sample paths are known to originate from speci®c models. Thus, in general, a
sample path of unknown origin cannot be uniquely attributed to a speci®c model,
and the main (and only) purpose of statistical or scienti®c inference is to deal with
this intrinsically ill-posed problem by concluding whether or not the given data or
sample paths are consistent with an assumed model structure. Clearly, being
consistent with an assumed model does not rule out the existence of other models
that may conform to the data equally well. In this sense, the aforementioned works
on measurement-based traf®c modeling have demonstrated that self-similarity is
2
The relationship between self-similarity and long-range dependenceÐthey need not be one and the
sameÐis explained in Section 1.4.1.
6 SELF-SIMILAR NETWORK TRAFFIC: AN OVERVIEW
consistent with measured network traf®c and have resulted in adding yet another
class of modelsÐthat is, self-similar processesÐto an already long list of models for
network traf®c. At a practical level, many of the commonly used inference
techniques for quantifying the degree of self-similarity or long-range dependence
(e.g., Hurst parameter estimation) have been known to exhibit different idiosyncra-
sies and robustness properties. Due to their predominantly heuristic nature, these
techniques have been generally easy to use and apply, but the ensuing results have
often been dif®cult to interpret [64]. The recent introduction of wavelet-based
techniques to the analysis of traf®c traces [1, 23] represented a signi®cant step

toward the development of more accurate inference techniques that have been shown
to possess increased sensitivity to different types of scaling phenomena with the
ability to discriminate against certain alternative modeling assumptions, in particu-
lar, nonstationary effects [1]. Due to their ability to localize a given signal in scale
and time, wavelets have made it possible to detect, identify, and describe multifractal
scaling behavior in measured network traf®c over ®ne time scales [23]: a nonuniform
(in time) scaling behavior that emerges when studying measured TCP traf®c over
®ne time scales, one that allows for more general scaling phenomena than the
ubiquitous self-similar scaling property, which holds for a range of suf®ciently large
time scales.
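To make the notion of a heuristic Hurst parameter estimate concrete, the following minimal sketch implements the aggregated variance (variance-time) method, one commonly used heuristic. It is not claimed to be the procedure used in the cited studies. It assumes that, for a self-similar series, the variance of the block-averaged series X^(m) decays like m^(2H - 2), so that H = 1 + slope/2, where slope is the least-squares slope of log Var(X^(m)) against log m. The function names, the block sizes, and the synthetic white-noise input are illustrative assumptions.

```python
import math
import random


def aggregate(series, m):
    """Average the series over non-overlapping blocks of m consecutive samples."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def hurst_variance_time(series, block_sizes):
    """Estimate H from the slope of log Var(X^(m)) versus log m.

    Assumes Var(X^(m)) is proportional to m^(2H - 2), hence H = 1 + slope / 2.
    """
    xs = [math.log(m) for m in block_sizes]
    ys = [math.log(variance(aggregate(series, m))) for m in block_sizes]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return 1.0 + slope / 2.0


# Synthetic input (an assumption for illustration): independent Gaussian noise,
# which has no long-range dependence, so the estimate should be close to 0.5.
rng = random.Random(1)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
h_est = hurst_variance_time(white_noise, [1, 2, 4, 8, 16, 32, 64])
print(f"estimated H for white noise: {h_est:.2f}")
```

An estimate well above 0.5 on a measured trace would be consistent with long-range dependence but, in line with the caveats above, would not by itself rule out alternative explanations such as nonstationarity.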
1.2.2 Physical Modeling
In the second category are works on physical modeling that try to explicate the
physical causes of self-similarity in network traffic based on network mechanisms
and empirically established properties of distributed systems that, collectively,
collude to induce self-similar burstiness at multiplexing points in the network
layer. In view of traditional time series analysis, physical modeling affects model
selection by picking, among competing and (in a statistical sense) equally
well-fitting models, those that are most congruent with the physical networking
environment where the data arose in the first place. Put differently, physical
modeling aims for models of network traffic that relate to the physics of how
traffic is generated in an actual network, are capable of explaining empirically
observed phenomena such as self-similarity in more elementary terms, and provide new
insights into the dynamic nature of the traffic. The first type of causality, also
the most mundane, is attributable to the arrival pattern of a single data source as
exemplified by variable bit rate (VBR) video [10, 26]. MPEG video, for example,
exhibits variability at multiple time scales, which, in turn, is hypothesized to be
related to the variability found in the time duration between successive scene
changes [25]. This "single-source causality," however, is peripheral to our
discussions for two reasons: one, self-similarity observed in the original Bellcore
data stems from traffic measurements collected during 1989-1991, a period during
which VBR video payload was too small, if not nonexistent, to be considered an
influencing factor³; and two, it is well-known that VBR video can be approximated by
short-range dependent traffic models, which, in turn, makes it possible to
investigate certain aspects of the impact on performance of long-range correlation
structure within the confines of traditional Markovian analysis [32, 37].

³ The same holds true for the LBL WAN data considered by Paxson and Floyd [56] and
the BU WWW data analyzed by Crovella and Bestavros [13].

The second type of causality, also called structural causality [50], is more
subtle in nature, and its roots can be attributed to an empirical property of
distributed systems: the heavy-tailed distribution of file or object sizes. For the
moment, a random variable obeying a heavy-tailed distribution can be viewed as
giving rise to a very wide range of different values, including (as its trademark)
"very large" values with nonnegligible probability. This intuition is made more
precise in Section 1.4.1. Returning to the causality description, in a nutshell, if
end hosts exchange files whose sizes are heavy tailed, then the resulting network
traffic at multiplexing points in the network layer is self-similar [50]. This
causal phenomenon was shown to be robust in the sense of holding for a variety of
transport layer protocols such as TCP (for example, Tahoe, Reno, and Vegas) and
flow-controlled UDP, which make up the bulk of deployed transport protocols, and a
range of network configurations. Park et al. [50] also showed that research in UNIX
file systems carried out during the 1980s gives strong empirical evidence, based on
file system measurements, that file sizes in UNIX file systems are heavy-tailed.
This is, perhaps, the simplest, most distilled, yet high-level physical explanation
of network traffic self-similarity. Corresponding evidence for Web objects, which
are of more recent relevance due to the explosion of WWW and its impact on Internet
traffic, can be found in Crovella and Bestavros [13].
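The intuition that heavy-tailed sizes produce "very large" values with nonnegligible probability can be illustrated numerically. The sketch below draws synthetic file sizes from a Pareto distribution and from an exponential distribution with the same mean, then compares the share of all bytes carried by the largest 1% of files. The Pareto shape parameter 1.2, the sample count, and the helper name top_share are assumptions chosen for illustration; they are not taken from the measurements cited above.

```python
import random


def top_share(sizes, fraction=0.01):
    """Fraction of the total volume contributed by the largest `fraction` of items."""
    ordered = sorted(sizes, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


rng = random.Random(7)
n = 100_000
alpha = 1.2                       # assumed tail index (finite mean, infinite variance)
mean_size = alpha / (alpha - 1)   # mean of a Pareto variable with minimum value 1

# Heavy-tailed sizes: P(X > x) = x^(-alpha) for x >= 1.
pareto_sizes = [rng.paretovariate(alpha) for _ in range(n)]
# Light-tailed comparison with the same mean.
exp_sizes = [rng.expovariate(1.0 / mean_size) for _ in range(n)]

pareto_share = top_share(pareto_sizes)
exp_share = top_share(exp_sizes)
print(f"largest 1% of files, Pareto sizes:      {pareto_share:.1%} of bytes")
print(f"largest 1% of files, exponential sizes: {exp_share:.1%} of bytes")
```

Under these assumptions a small number of very large transfers accounts for a substantial part of the total volume in the heavy-tailed case, whereas in the exponential case the largest files contribute only a few percent.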
Of course, structural causality would be meaningless unless there were explanations
that showed why heavy-tailed objects transported via TCP- and UDP-based protocols
would induce self-similar burstiness at multiplexing points. As hinted at in the
original Leland et al. paper [41] and formally introduced in Willinger et al. [74],
the on/off model establishes that the superposition of a large number of independent
on/off sources with heavy-tailed on and/or off periods leads to self-similarity in
the aggregated process (a fractional Gaussian noise process) whose long-range
dependence is determined by the heavy tailedness of on or off periods. Space
aggregation is inessential to inducing long-range dependence (it is responsible for
the Gaussian property of aggregated traffic by an application of the central limit
theorem); however, it is relevant to describing multiplexed network traffic. The
on/off model has its roots in a certain renewal reward process introduced by
Mandelbrot [46] (and further studied by Taqqu and Levy [63]) and provides the
theoretical underpinning for much of the recent work on physical modeling of network
traffic. This theoretical foundation, together with the empirical evidence of
heavy-tailed on/off durations (as, e.g., given for IP flow measurements [74]),
represents a more low-level, direct explanation of physical causality of
self-similarity, and the two together form the principal factors that distinguish
the on/off model from other mathematical models of self-similar traffic. The linkage
between high-level and low-level descriptions of causality is further facilitated by
Park et al. [50], where it is shown that the application layer property of
heavy-tailed file sizes is preserved by the protocol stack and mapped to
approximately heavy-tailed busy periods at the network layer. The interpacket
spacing within a single session (or equivalently transfer/connection/flow), however,
has been observed to exhibit its own distinguishing variability. This refined short
time scale structure and its possible causal attribution to the feedback control
mechanisms of TCP are investigated in Feldmann et al. [22, 23] and are the topics of
ongoing work.
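The on/off construction described above can be sketched in a few lines. The code below is a simplified, discrete-time illustration rather than the model of the cited papers: each source alternates between on (emitting one unit per slot) and off, with both period lengths drawn from the same Pareto distribution and rounded up to whole slots, and the aggregate is the number of sources that are on in each slot. The helper hurst_from_alpha encodes the commonly quoted relation H = (3 - alpha)/2 for a tail index 1 < alpha < 2; the function names, parameter values, and slotted time axis are all assumptions made for the example.

```python
import math
import random


def hurst_from_alpha(alpha):
    """Commonly quoted limit relation H = (3 - alpha) / 2, for 1 < alpha < 2."""
    if not 1.0 < alpha < 2.0:
        raise ValueError("expected a tail index with 1 < alpha < 2")
    return (3.0 - alpha) / 2.0


def on_off_source(rng, alpha, slots):
    """Return a 0/1 activity list of length `slots` for a single on/off source."""
    activity = []
    is_on = rng.random() < 0.5  # random initial state
    while len(activity) < slots:
        # Heavy-tailed period length, in whole slots, truncated at the horizon.
        duration = math.ceil(rng.paretovariate(alpha))
        duration = min(duration, slots - len(activity))
        activity.extend([1 if is_on else 0] * duration)
        is_on = not is_on
    return activity


def aggregate_traffic(num_sources, alpha, slots, seed=0):
    """Superpose independent on/off sources; entry t counts the sources on in slot t."""
    rng = random.Random(seed)
    total = [0] * slots
    for _ in range(num_sources):
        for t, active in enumerate(on_off_source(rng, alpha, slots)):
            total[t] += active
    return total


traffic = aggregate_traffic(num_sources=50, alpha=1.4, slots=5000, seed=3)
print(f"mean load per slot: {sum(traffic) / len(traffic):.1f} of 50 sources")
print(f"H suggested by alpha = 1.4: {hurst_from_alpha(1.4):.2f}")
```

Increasing num_sources makes the marginal distribution of the aggregate look more Gaussian, while the persistence of bursts across time scales is governed by alpha, which mirrors the division of roles between space aggregation and heavy tails noted in the text.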
1.2.3 Queueing Analysis

In the third category are works that provide mathematical models of long-range
dependent traffic with a view toward facilitating performance analysis in the
queueing theory sense [2, 3, 17, 43, 49, 53, 66]. These works are important in that
they establish basic performance boundaries by investigating queueing behavior with
long-range dependent input, which exhibits performance characteristics
fundamentally different from those of corresponding systems with Markovian input. In
particular, the queue length distribution in infinite buffer systems has a
slower-than-exponentially (or subexponentially) decreasing tail, in stark contrast
with short-range dependent input for which the decay is exponential. In fact,
depending on the queueing model under consideration, long-range dependent input can
give rise to Weibullian [49] or polynomial [66] tail behavior of the underlying
queue length distributions. The analysis of such non-Markovian queueing systems is
highly nontrivial and provides fundamental insight into the performance impact
question. Of course, these works, in addition to providing valuable information on
network performance issues, advance the state of the art in performance analysis and
are of independent interest. The queue length distribution result implies that
buffering, as a resource provisioning strategy, is rendered ineffective when input
traffic is self-similar in the sense of incurring a disproportionate penalty in
queueing delay vis-à-vis the gain in reduced packet loss rate. This has led to
proposals advocating a small buffer capacity/large bandwidth resource provisioning
strategy due to its simple, yet curtailing influence on queueing: if buffer capacity
is small, then the ability to queue or remember is accordingly diminished. Moreover,
the smaller the buffer capacity, the more relevant short-range correlations become
in determining buffer occupancy. Indeed, with respect to first-order performance
measures such as packet loss rate, they may become the dominant factor. The effect
of small buffer sizes and finite time horizons in terms of their potential role in
delimiting the scope of influence of long-range dependence on network performance
has been studied [29, 58].
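A short calculation shows why a slower-than-exponential tail makes buffering a poor provisioning tool. The sketch below compares two assumed tail shapes for the overflow probability P(Q > x): an exponential tail exp(-delta x), and a Weibullian tail exp(-gamma x^(2 - 2H)), the form commonly associated with fractional Brownian motion input. The constants delta and gamma, the value H = 0.9, and the function names are hypothetical and serve only to illustrate the scaling; they are not results quoted from the cited works.

```python
import math


def buffer_for_exponential_tail(epsilon, delta):
    """Solve exp(-delta * x) = epsilon for the buffer size x."""
    return math.log(1.0 / epsilon) / delta


def buffer_for_weibull_tail(epsilon, gamma, hurst):
    """Solve exp(-gamma * x ** (2 - 2 * hurst)) = epsilon for the buffer size x."""
    if not 0.5 < hurst < 1.0:
        raise ValueError("expected 0.5 < hurst < 1")
    return (math.log(1.0 / epsilon) / gamma) ** (1.0 / (2.0 - 2.0 * hurst))


# Tighten the target overflow probability from 1e-3 to 1e-6 and compare how much
# larger the buffer must become under each assumed tail shape.
loose, tight = 1e-3, 1e-6
exp_growth = (buffer_for_exponential_tail(tight, 1.0)
              / buffer_for_exponential_tail(loose, 1.0))
weibull_growth = (buffer_for_weibull_tail(tight, 1.0, 0.9)
                  / buffer_for_weibull_tail(loose, 1.0, 0.9))
print(f"exponential tail: buffer grows by a factor of {exp_growth:.1f}")
print(f"Weibullian tail (H = 0.9): buffer grows by a factor of {weibull_growth:.1f}")
```

Under these assumptions, squaring the loss target doubles the required buffer in the exponential case but multiplies it by roughly 32 in the Weibullian case with H = 0.9, and the queueing delay grows with it, which is the disproportionate penalty referred to above.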

A major weakness of many of the queueing-based results [2, 3, 17, 43, 49, 53, 66]
is that they are asymptotic, in one form or another. For example, in infinite buffer
systems, upper and lower bounds are derived for the tail of the queue length
distribution as the queue length variable approaches infinity. The same holds true for
"finite buffer" results where bounds on buffer overflow probability are proved as
buffer capacity becomes unbounded. There exist interesting results for zero buffer
capacity systems [18, 19], which are discussed in Chapter 17. Empirically oriented
studies [20, 33, 51] seek to bridge the gap between asymptotic results and observed
behavior in finite buffer systems. A further drawback of current performance results