THE FRACTAL STRUCTURE OF DATA REFERENCE

Transient and Persistent Data Access
window lengths much longer than the time needed to complete a typical burst,
one might usually expect to see an entire burst, isolated somewhere in the
window. But for window lengths shorter than this characteristic time, and
assuming that such a characteristic time actually exists, one might usually
expect to see some portion of a burst, spread throughout the window. The
characteristic burst time, and the corresponding number of requests, could then
be identified by the transition between these two patterns of behavior.
The result (7.2) is in sharp contrast, however, with the outcome of the
thought experiment just presented. Patterns of reference that conform to the
hierarchical reuse model, although bursty, do not preferentially exhibit bursts
of any specific, quantifiable length. Instead, they are bursty at all time scales.
It is reasonable to hope, therefore, that individual transient or persistent
data items can be distinguished from each other by applying the metric (7.1).
Provided that the time window is long enough for the persistence of a given
item of data to become apparent, this persistence should be reflected in a high
outcome for P. On the other hand, regardless of the time scale, we may hope
to recognize transient data from a small outcome for P.
Two I/O traces, each covering 24 hours at a moderate-to-large OS/390 installation,
were used to investigate this idea. The two traced installations were:
A. A large data base installation running a mix of CICS, IMS, DB2, and batch.
B. A moderate-sized DB2 installation running primarily on-line and batch DB2
work, with on-line access occurring from a number of time zones in different
parts of the world.
Figure 7.1. Distribution of probability density for the metric P: track image granularity.
Figure 7.2. Distribution of probability density for the metric P: cylinder image granularity.
Figure 7.3. Distribution of probability density for the metric P: file granularity, weighted by storage.
Figures 7.1 through 7.3 present the observed distribution of the metric P for
each installation. The three figures present distributions measured at three levels
of granularity: track image, cylinder image, and the total storage containing a
given file. Here files should be taken to represent the highest level of granularity,
since the average file size tends to correspond to many cylinder images (9 in
one recent survey).
The three figures show a pronounced bimodal behavior in the metric P.
Individual data items tend strongly toward the two extremes of either P ≈ 0 or
P ≈ 1. This appears to confirm the existence of the two contrasting modes of
behavior, persistent and transient, as just proposed in the previous paragraphs.
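
The bimodal shape described above can be checked numerically by binning the observed values of P and weighting each item by the storage it occupies, as Figure 7.3 does. The following is a minimal sketch under stated assumptions: the values of P are taken as already computed from (7.1) and as lying between 0 and 1, and the function name `weighted_density`, the bin count, and the sample values are all hypothetical rather than taken from the traces.

```python
def weighted_density(items, bins=10):
    """Return a weighted probability density of P over [0, 1].

    items: iterable of (p, weight) pairs, where p is the metric P for one
    data item and weight is the storage it occupies (use 1 for unweighted).
    """
    width = 1.0 / bins
    totals = [0.0] * bins
    for p, weight in items:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"P must lie in [0, 1], got {p}")
        # P == 1.0 belongs to the last bin rather than a bin past the end.
        index = min(int(p * bins), bins - 1)
        totals[index] += weight
    grand_total = sum(totals)
    if grand_total == 0:
        raise ValueError("no weighted observations")
    # Divide by the bin width so that the densities integrate to 1.
    return [t / grand_total / width for t in totals]


# Hypothetical sample of (P, storage) pairs; not data from the traces.
sample = [(0.02, 5), (0.05, 3), (0.10, 2), (0.50, 1), (0.95, 20), (1.00, 9)]
density = weighted_density(sample, bins=5)
```

With this sample, nearly all of the weight falls in the first and last bins, which is the kind of two-peaked outcome the figures are said to show.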
Based upon the region of P where the persistent mode of behavior becomes
clearly apparent in the figures, we shall define the observations for a given data
item as reflecting persistent behavior if
(7.3)
Otherwise, we shall take the observations to reflect transient behavior.

The figures also show that the role of persistent data is increasingly important
at higher levels of granularity. Only a relatively few observed track images
behave in a persistent manner; however, a substantial percentage of observed
file storage was persistent, when measured at the file level of granularity.
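
The comparison across granularities amounts to classifying each item by criterion (7.3) and then totalling the storage held by the persistent items. The sketch below assumes that P has already been computed for each item and that the criterion takes the form of P exceeding a cutoff. Because the cutoff in (7.3) is not reproduced here, it is left as a required parameter, and the value 0.5 used in the example is purely a placeholder. The function name and sample figures are hypothetical.

```python
def persistent_share(items, threshold):
    """Fraction of observed storage whose metric P exceeds `threshold`.

    items: iterable of (p, size) pairs at one level of granularity.
    threshold: stand-in for the cutoff in (7.3); supplied by the caller.
    """
    total = 0.0
    persistent = 0.0
    for p, size in items:
        total += size
        if p > threshold:
            persistent += size
    if total == 0:
        raise ValueError("no observed storage")
    return persistent / total


# Hypothetical observations; the threshold 0.5 is a placeholder only.
track_items = [(0.05, 1), (0.10, 1), (0.90, 1), (0.02, 1)]
file_items = [(0.05, 10), (0.95, 90)]
track_share = persistent_share(track_items, threshold=0.5)
file_share = persistent_share(file_items, threshold=0.5)
```

In this made-up example one track image in four is persistent, while the single persistent file accounts for most of the file storage, mirroring the qualitative contrast described above.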
2. PERIODS UP TO 24 HOURS
The analysis just presented was limited to studying the behavior of the metric
P relative to a specific, selected time window of 24 hours. If we now use (7.3)
to focus our attention specifically on the issue of whether observed behavior
appears to be persistent or transient, however, it becomes possible to investigate
a broad range of time periods. Such an investigation is important, since clearly
any metric purporting to distinguish persistent from transient behavior should
tend to show results that are robust with respect to the exact choice of time
interval.
Figures 7.4 through 7.6 present the average percentage of storage capacity
associated with track images, cylinder images, or files seen to be active at the
two study installations, during windows of various durations, ranging from 15
minutes up to 24 hours. As we should expect, this percentage depends strongly
on the granularity of the object being examined. Considered at a track level
of granularity, it appears that only 10 to 20 percent of storage capacity tends to
be active over a 24-hour period (based upon the two study installations); at a
cylinder level of granularity, roughly 20 to 40 percent of storage is active; and
at a file level, 25 to 50 percent of the capacity is active.
Figures 7.7 through 7.9 explore persistent data at the same two installations.
The three figures present the percentage of active track images, cylinder images,
Figure 7.4. Active track images as a function of window size.
Figure 7.5. Active cylinder images as a function of window size.

Figure 7.6. Active file storage as a function of window size.
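
The measurement summarized in Figures 7.4 through 7.6 can be sketched as follows: split the trace into windows of a given duration, count the distinct objects referenced in each window, and average the resulting fraction of capacity. This is an illustration under stated assumptions, not the procedure used in the study. The trace is taken to be a list of (time in seconds, track number) pairs, `tracks_per_unit` is a caller-supplied parameter, and the function name, the value 15, and the sample references are all hypothetical.

```python
def active_fraction(trace, window, duration, total_tracks, tracks_per_unit=1):
    """Average fraction of storage units referenced per window.

    trace: iterable of (time_seconds, track_number) references.
    window: window length in seconds; duration: total trace length.
    tracks_per_unit: 1 for track granularity, or the number of tracks per
    cylinder to measure at cylinder granularity.
    """
    n_windows = duration // window
    if n_windows == 0:
        raise ValueError("window longer than trace")
    total_units = -(-total_tracks // tracks_per_unit)  # ceiling division
    active = [set() for _ in range(n_windows)]
    for t, track in trace:
        w = int(t // window)
        if w < n_windows:  # ignore any partial window at the end
            active[w].add(track // tracks_per_unit)
    return sum(len(s) for s in active) / (n_windows * total_units)


# Hypothetical 30-minute trace over 60 tracks (4 units of 15 tracks each).
trace = [(10, 0), (20, 1), (950, 16), (1000, 0), (1700, 31), (1750, 32)]
by_track_15min = active_fraction(trace, 900, 1800, 60)        # 0.05
by_cyl_15min = active_fraction(trace, 900, 1800, 60, 15)      # 0.5
by_cyl_30min = active_fraction(trace, 1800, 1800, 60, 15)     # 0.75
```

Because a larger unit counts as active when any one of its tracks is referenced, the active percentage in this sketch can only stay the same or rise as the granularity coarsens, which is consistent with the ordering of track, cylinder, and file results reported above.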
Figure 7.7. Persistent track images as a function of window size.
