case of the big pointer array itself. The problem is that a big pointer array can have
a variable number of elements in it, so it would be all too easy for us to
accidentally step off the end of the big pointer array, with potentially disastrous
results. To prevent such a possibility, I have created the data type called
AccessVector. The purpose of this data type is to combine the safety features
of a normal SVector with the ability to specify the address where the data for the
SVector should start, rather than relying on the run-time memory allocation
library to assign the address. Because this data type is designed to refer to an
existing area of memory, the copy constructor, the assignment operator, and the
destructor are all defaulted, and there is no SetSize function such as exists for
"regular" SVectors. This data type allows us to "map" a predefined structure
onto an existing set of data, which is exactly what we need to access a big pointer
array safely.
As this suggests, the "big array header" is an AccessVector variable, which we
can use just as though it were a normal SVector. [47]
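To make the idea concrete, here is a minimal sketch of a bounds-checked view over memory that it does not own. It is not the AccessVector code itself; the class name AccessVectorSketch, its members, and the choice to throw std::out_of_range on a bad index are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for AccessVector: a checked view over memory that
// some other part of the program owns (for example, a block buffer).
template <class T>
class AccessVectorSketch {
public:
    AccessVectorSketch() : m_Data(nullptr), m_Count(0) {}

    // The caller supplies the starting address; nothing is allocated here.
    AccessVectorSketch(T* data, std::size_t count) : m_Data(data), m_Count(count) {
        assert(data != nullptr || count == 0);
    }

    // No SetSize, and the copy constructor, assignment operator, and
    // destructor are compiler-generated, because the view never frees anything.
    std::size_t GetSize() const { return m_Count; }

    // Checked element access: stepping off the end is reported, not ignored.
    T& operator[](std::size_t index) {
        if (index >= m_Count)
            throw std::out_of_range("AccessVectorSketch: index past end");
        return m_Data[index];
    }

private:
    T* m_Data;
    std::size_t m_Count;
};
```

Writing through the view changes the underlying memory, which is what lets a predefined structure be "mapped" onto data that is already there.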

The LittlePointerBlock class
The interface for LittlePointerBlock is shown in Figure blocki.10.
The interface for the LittlePointerBlock class (from quantum\blocki.h) (Figure
blocki.10)
codelist/blocki.10
There's nothing in this class that isn't exactly analogous to the corresponding
functions in the big pointer array class. Therefore, I won't waste either your time or
mine by repeating the analysis of the big pointer array class here. Instead, let's
move along to another class that is somewhat more interesting, if only because it
seems to have no purpose for existing.
The LeafBlock class
The interface for LeafBlock is shown in Figure blocki.11.
The interface for the LeafBlock class (from quantum\blocki.h) (Figure
blocki.11)
codelist/blocki.11
This is a real oddity: a class that defines no new member functions or member
variables. Of what value could such a class possibly be?
The answer is that it provides a "hook" for attaching a handle class, namely
LeafBlockPtr. This allows us to use a leaf block just as we would any of the
other quantum classes. If we did not have this class, we could always create
another class called QuantumBlockPtr, which would have much the same
effect as creating this class. So why did I create this class in the first place?
The answer is that originally it did have some member functions, but they
eventually turned out to be superfluous. At this point in the development of this
project, it would probably be unwise for me to go through the code to root out all
of the references to this class. And after all, a class that defines no new
member functions or member variables certainly can't take up too much extra
space; in fact, using this class should have absolutely no effect on the size of the
program or its execution time, so I think I'll leave it just as it is, at least for now.
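The following sketch illustrates the point with hypothetical stand-ins (QuantumBlockSketch, LeafBlockSketch, and LeafBlockPtrSketch are invented names, not the classes from quantum\blocki.h). The derived class adds nothing of its own, yet it gives the handle class a distinct type to attach to.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical base class standing in for the real quantum block class.
class QuantumBlockSketch {
public:
    explicit QuantumBlockSketch(std::size_t itemCount) : m_ItemCount(itemCount) {}
    std::size_t GetItemCount() const { return m_ItemCount; }
private:
    std::size_t m_ItemCount;
};

// Adds no member functions or variables; it only reuses the base constructor.
class LeafBlockSketch : public QuantumBlockSketch {
public:
    using QuantumBlockSketch::QuantumBlockSketch;
};

// A minimal handle that accepts leaf blocks and nothing else.
class LeafBlockPtrSketch {
public:
    explicit LeafBlockPtrSketch(LeafBlockSketch* block) : m_Block(block) {
        assert(block != nullptr);
    }
    LeafBlockSketch* operator->() const { return m_Block; }
private:
    LeafBlockSketch* m_Block;
};
```

On mainstream compilers, sizeof(LeafBlockSketch) equals sizeof(QuantumBlockSketch), which matches the expectation that the extra class costs nothing in space.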
The FreeSpaceArray class
Finally, we're finished with the block classes. Our next target of opportunity will
be the classes that maintain and provide access to the free space list. We'll start
with FreeSpaceArray, whose interface is shown in Figure newquant.22.
The interface for the FreeSpaceArray class (from quantum\newquant.h) (Figure
newquant.22)
codelist/newquant.22
The Normal Constructor for FreeSpaceArray
The first function we'll look at is the normal constructor for FreeSpaceArray,
whose code is shown in Figure newquant.23.
The normal constructor for FreeSpaceArray (from quantum\newquant.cpp)
(Figure newquant.23)
codelist/newquant.23
As you can see, all this function does (as is common in the case of constructors) is
to initialize a number of member variables. Most of these initializations are fairly
straightforward, but we should go over them briefly. First, we set the current
lowest free block number to 0, because we have no idea what blocks might be free
in the free space list, as we have not looked through it yet. Then we get the free
space list count, the number of blocks in the free space list, and the quantum
number adjustment from the quantum file object; this last value is used when we
need to convert between block numbers and quantum numbers. Next, we resize the
block pointer SVector so it can hold block pointers for all of the free space list
blocks in the quantum file. Finally, we assign free space block pointers to all the
elements of that SVector. Now we are ready to access the free space list.
The FreeSpaceArray::Get Function
The next function we'll look at is FreeSpaceArray::Get, whose code is
shown in Figure newquant.24.
The FreeSpaceArray::Get function (from quantum\newquant.cpp) (Figure
newquant.24)
codelist/newquant.24
The operation of this function is fairly straightforward. First, we check whether
we're trying to access something that is off the end of the free space list. If so, we
return a value that indicates that there is no free space in the quantum for which
information was requested. However, if the input argument is valid, we calculate
which block and which element in that block contains the information we need.
We then call the Get function of the block pointer to retrieve that element. Finally,
we return the result to the caller.
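The steps just described can be sketched as follows. This is only an illustration under stated assumptions: the entry layout, the tiny kEntriesPerBlock value, the kNoFreeSpace sentinel, and the use of nested vectors in place of real block pointers are all hypothetical, not taken from quantum\newquant.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical entry layout for one quantum in the free space list.
struct FreeSpaceEntrySketch {
    unsigned freeSpaceCode;  // approximate space available in the quantum
    unsigned objectNumber;   // owning main object (0 assumed to mean "none")
};

const std::size_t kEntriesPerBlock = 4;            // tiny, for illustration only
const FreeSpaceEntrySketch kNoFreeSpace = {0, 0};  // assumed "no space" value

struct FreeSpaceArraySketch {
    // Each inner vector stands in for one free space block.
    std::vector<std::vector<FreeSpaceEntrySketch> > blocks;

    explicit FreeSpaceArraySketch(std::size_t blockCount)
        : blocks(blockCount,
                 std::vector<FreeSpaceEntrySketch>(kEntriesPerBlock, kNoFreeSpace)) {}

    std::size_t GetSize() const { return blocks.size() * kEntriesPerBlock; }

    FreeSpaceEntrySketch Get(std::size_t index) const {
        if (index >= GetSize())
            return kNoFreeSpace;                         // off the end of the list
        std::size_t block = index / kEntriesPerBlock;    // which block
        std::size_t element = index % kEntriesPerBlock;  // which entry in that block
        assert(element < blocks[block].size());
        return blocks[block][element];
    }
};
```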
The FreeSpaceArray::Set Function
The next function we'll look at is FreeSpaceArray::Set, whose code is
shown in Figure newquant.25.
The FreeSpaceArray::Set function (from quantum\newquant.cpp) (Figure
newquant.25)
codelist/newquant.25
This function is very similar to its counterpart, the Get function. However, there is
one difference that we should look at: if the entry that we have just found in the
array indicates that its quantum is completely empty (i.e., has the maximum
available space) and this entry has a lower index than the current value of the
"lowest free block" variable, then we reset the "lowest free block" variable to
indicate that this is the lowest free block.
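A simplified sketch of that bookkeeping follows. It flattens the list into a single vector of codes and assumes that 255 is the code for a completely empty quantum and that an out-of-range index is silently ignored; both are assumptions for illustration, as are all of the names.

```cpp
#include <cstddef>
#include <vector>

const unsigned kMaxFreeSpaceCode = 255;  // assumed code for "completely empty"

struct FreeSpaceTrackerSketch {
    std::vector<unsigned> codes;   // one free space code per quantum (flattened)
    std::size_t lowestFreeBlock;   // search hint: where to start looking

    explicit FreeSpaceTrackerSketch(std::size_t count)
        : codes(count, 0), lowestFreeBlock(0) {}

    void Set(std::size_t index, unsigned code) {
        if (index >= codes.size())
            return;  // off the end of the list: nothing to update (assumption)
        codes[index] = code;
        // If this quantum is now completely empty and lies below the current
        // hint, pull the hint back so later searches start here.
        if (code == kMaxFreeSpaceCode && index < lowestFreeBlock)
            lowestFreeBlock = index;
    }
};
```

Note that the hint only ever moves down inside Set; in this sketch, moving it up is left to whatever search routine consumes it.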
Free the Quantum 16K!
The "lowest free block" variable is an optimization whose purpose is to avoid searching the entire free list
every time we want to find a block that isn't committed to any particular main
object. In both the previous C implementation and the current C++ one, we first
check the last quantum to which we added an item; if that has enough space to add
the new item, we use it. [48] In the old implementation, the free space list contained
only a "free space code", indicating how much space was available in the quantum
but not which object it belonged to. Therefore, when we wanted to find a quantum
belonging to the current object that had enough space to store a new item, we
couldn't use the free space list directly. As a substitute, the C code went through
the current little pointer array, looking up each quantum referenced in that array in
the free space list; if one of them had enough space, we used it. However, this was
quite inefficient; since each quantum can hold dozens or hundreds of items, this
algorithm might require us to look at the same quantum that many times! [49]

Although this wasn't too important in the old implementation, where the free space
list was held in memory, it could cause serious delays in the current one if we used
the standard virtual memory services to access the free space list. The free space
list in the old program took up 16K, one byte for each quantum in the maximum
quantum file size allowed. In the new implementation, using 16K blocks of virtual
memory, that same free space list would occupy only one block, so searching such
a list would not require any extra disk accesses. However, the current
implementation can handle much larger quantum files that might contain tens or
hundreds of thousands of blocks, with correspondingly larger free space lists.
Using the old method, searching the free space list from beginning to end could
take quite a while, because the search routine would not access the list in a linear
manner and therefore might require extra disk accesses to access the same free
space list entries several times. At the very least, the free space blocks would be
artificially promoted to higher levels of activity and would therefore tend to crowd
other quanta out of the buffers.
Even if the free space blocks were already resident, virtual memory accesses are
considerably slower than "regular" accesses; it would be much faster to scan the
free space list sequentially by quantum number than randomly according to the
entries in the little pointer array. Of course, we could make a list of which quanta
we had already examined and skip the check in those cases, but I decided to
simplify matters by another method.
The FreeSpaceArray::FindSpaceForItem Function
In the current implementation, the free space list contains not just the free space for
each quantum but also which object it belongs to (if any). [50] This lets us write a
FreeSpaceArray::FindSpaceForItem routine that finds a place to store a
new item by scanning each block of the free list sequentially in memory, rather
than using a virtual memory access to retrieve each free space entry; we stop when
we find a quantum that belongs to the current object and has enough free space left
to store the item (Figure newquant.26). [51]

The FreeSpaceArray::FindSpaceForItem function (from
quantum\newquant.cpp) (Figure newquant.26)
codelist/newquant.26
However, if there isn't a quantum in the free space list that belongs to our desired
main object and also has enough space left to add the new item, then we have to
start a new quantum; how do we decide which one to use?
One way is to keep track of the first free space block we find in our search and use
it if we can't find a suitable block already belonging to our object. However, I want
to bias the storage mechanism to use blocks as close as possible to the beginning of
the file, which should reduce head motion, as well as making it possible to shrink
the file's allocated size if the amount of data stored in it decreases. My solution is
to take a free block whenever it appears; if that happens to be before a suitable
block belonging to the current object, so be it. This appears to be a self-limiting
problem, since the next time we want to add to the same object, the newly assigned
block will be employed if it has enough free space and is the first suitable block in
the list.
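The search policy described above can be sketched as a single linear scan. The function below is a simplified illustration rather than the code in Figure newquant.26: it works on one flat vector instead of a series of free space blocks, and the names, the use of 0 to mean "no owner", and the kNotFound result are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical free list entry: space code plus owning object.
struct FreeListEntrySketch {
    unsigned freeSpaceCode;
    unsigned objectNumber;  // 0 assumed to mean "not assigned to any object"
};

const unsigned kNoObject = 0;
const std::size_t kNotFound = static_cast<std::size_t>(-1);

// Returns the index of the quantum to use for the new item, or kNotFound.
std::size_t FindSpaceForItemSketch(const std::vector<FreeListEntrySketch>& list,
                                   unsigned objectNumber, unsigned neededCode) {
    for (std::size_t i = 0; i < list.size(); ++i) {
        const FreeListEntrySketch& entry = list[i];
        // A free quantum is taken as soon as it appears, even if a suitable
        // quantum belonging to this object lies further along in the list.
        if (entry.objectNumber == kNoObject)
            return i;
        // Otherwise accept a quantum owned by this object with enough room.
        if (entry.objectNumber == objectNumber && entry.freeSpaceCode >= neededCode)
            return i;
    }
    return kNotFound;
}
```

Because the scan returns at the first free quantum, it never needs to know where the occupied part of the file ends.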
This approach solves another problem as well, which is how we determine when to
stop scanning the free space list in the first place. Of course, we could also
maintain the block number of the last occupied block in the file and stop there.
However, I felt this was unnecessary, since stopping at the first free block provides
a natural shortcut, without contributing any obvious problems of its own. As with
many design decisions, though, my analysis could be flawed: there's a possibility
that using this algorithm with many additions and deletions could reduce the space
efficiency of the file, although I haven't seen such an effect in my testing.
This mechanism did not mature without some growing pains. For example, the
one-byte FreeSpaceCode, used to indicate the approximate space available
in a quantum, is calculated by dividing the size by a constant (32 in the case of 16K
blocks) and discarding the remainder. As a result, the size code calculated for
items
