
HandBooks Professional Java-C-Scrip-SQL part 155


Footnotes
1. Obviously, this is a simplification: normally, we would want to be able to
find a customer's record by his name or other salient characteristic.
However, that part of the problem can be handled by, for example, a
hash-coded lookup from the name into a record number, as we will see in the next
chapter. Here we are concerned with what happens after we know the record
number.
2. This figure could just as easily be considered a layout for a single record
with variable-length fields; however, the explanation is valid either way.
3. The problem of changing the length of an existing record can be handled by
deleting the old version of the record and adding a new version having a
different length.
4. I am indebted for this extremely valuable algorithm to its inventor, Henry
Beitz, who generously shared it with me in the mid-1970s.
5. In general, I use the terms "quantum" and "block" interchangeably; in the
few cases where a distinction is needed, I will note it explicitly.
6. In the current implementation, the default block size is 16K. However, it is
easy to change that size in order to be able to handle larger individual items
or to increase storage efficiency.
7. Some blocks used to store internal tables such as the free space list are not
divided into items, but consist of an array of fixed-size elements, sometimes
preceded by a header structure describing the array.
8. For simplicity, in our sample application each user record is stored in one
item; however, since any one item must fit within a quantum, applications
dealing with records that might exceed the size of a quantum should store
each potentially lengthy field as a separate item.
9. Four of these bytes are used to hold the index in the IRA to which this item
corresponds; this information would be very helpful in reconstructing as
much as possible of the file if it should become corrupted. Another two
bytes are used to keep track of the type of the item. These entries are for
error trapping and file reconstruction if the file should somehow become
corrupted.
10. Of course, this assumes that we have set the parameters of the quantum file
to values that allow us to expand the file to a size large enough to hold that
much data. The header file "blocki.h" contains the constants BlockSize and
MaxFileQuantumCount, which together determine the maximum size of a
quantum file. The beginning of that file also contains a number of other
constants and structures related to this issue; you should be able to modify
the capacity and space efficiency of a quantum file fairly easily after
examining that header file.
11. The limit is 256 so that an object number can fit into one byte; this reduces
the size of the free space list, as we will see later.
12. By the way, there is a possible optimization that could be employed here:
sorting the buffers to be rewritten to the disk in order of their quantum
numbers (i.e., their positions in the file). This could improve performance in
systems where the hard disk controller doesn't already provide this service;
however, most (if not all) modern disk systems take care of this for us, so
sorting by quantum number would not provide any benefit.
13. The second edition also included this C implementation along with an
earlier, less capable, version of the C++ implementation we're examining
here.
14. Another restriction in C++ operator overloading is that it's impossible to
make up your own operators. According to Bjarne Stroustrup, this facility
has been carefully considered by the standards committee and has failed of
adoption due to difficulties with operator precedence and binding strength.
Apparently, it was the exponentiation operator that was the deciding factor;
it's the first operator that users from the numerical community usually want
to define, but its mathematical properties don't match the precedence and
binding rules of any of the "normal" C++ operators.
15. I'm not claiming this is a good use for overloading; it's only for tutorial
purposes.
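
The optimization mentioned in footnote 12 can be sketched as follows. This is only an illustration, not the book's code: the DirtyBuffer structure and the FlushOrder function are hypothetical names invented here, and the real buffer bookkeeping is not shown in these footnotes. The sketch collects the quantum numbers of the modified buffers and sorts them so that they would be rewritten in order of their positions in the file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one in-memory buffer (not the book's structure).
struct DirtyBuffer {
    std::uint32_t quantumNumber;  // position of the quantum in the file
    bool modified;                // true if the buffer must be rewritten
};

// Return the quantum numbers of all modified buffers in ascending order,
// that is, in the order of their positions in the file.
std::vector<std::uint32_t> FlushOrder(const std::vector<DirtyBuffer>& buffers) {
    std::vector<std::uint32_t> order;
    for (const DirtyBuffer& b : buffers) {
        if (b.modified) {
            order.push_back(b.quantumNumber);  // unmodified buffers are skipped
        }
    }
    std::sort(order.begin(), order.end());
    return order;
}
```

As the footnote says, a disk system that already reorders writes makes this step unnecessary, so the sort is worth adding only where measurement shows a benefit.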

16. Warning: do not compile and execute the program in Figure overload2.
Although it will compile, it reads from random locations, which may cause a
core dump on some systems.
17. By the way, this isn't just a theoretical problem: it happened to me during the
development of this program.
18. This is an example of the "handle/body" class paradigm, described in
Advanced C++: Programming Styles and Idioms, by James O. Coplien
(Addison-Wesley Publishing Company, Reading, Massachusetts, 1992).
Warning: as its title indicates, this is not an easy book; however, it does
reward careful study by those who already have a solid grasp of C++
fundamentals. For a kinder, gentler introduction to several advanced C++
idioms of wide applicability, see my Who's Afraid of More C++? (AP
Professional, San Diego, California, 1998).
19. In order to solve this problem in a more general way, the ANSI standards
committee for C++ has approved the addition of "namespaces", which allow
the programmer to specify the library from which one or more functions are
to be taken in order to prevent name conflicts.
20. Of course, there are other ways to accomplish the goal of protecting the class
user from concern about internals of a given class, as we've discussed briefly
in the sections titled "Data Hiding" and "Function Hiding".
21. If we had any functions that could change the contents of a shared object,
they would also have to be modified to prevent undesirable interactions
between "separate" handle objects that share data. However, we don't have
any such functions in this case.
22. Actually, the reference count should never be less than 0, but I'm engaging
in some defensive programming here.
23. To reduce the length of the function names in this class, I'm going to omit
the MainObjectArrayPtr qualifier at the beginning of those names.
24. We'll see exactly how this block access works when we cover the
MainObjectBlock class, but for now it's sufficient to note that the main
object array is potentially divided into blocks which are accessed via the
standard virtual memory system.
25. Actually, the name of the "lowest free object" variable should be something
like m_StartLookingHere, but I doubt it will cause you too much confusion
after you see how it is used.
26. If a preprocessor variable called DEBUG is defined, then the action of this
macro will be to terminate the program if the condition is not met;
otherwise, it will do nothing. You can find the implementation of this macro
and its underlying function in qfassert.h and qfassert.cpp.
27. We often will step through an array assigning values to each element in turn,
for example when importing data from an ASCII file; since we are not
modifying previously stored values, the quantum we used last is the most
likely to have room to add another item. In such a case, the most recently
written-to quantum is half full on the average; this makes it a good place to
look for some free space. In addition, it is very likely to be already in
memory, so we won't have to do any disk accesses to get at it.
28. By the way, this function was originally named GetFreeSpace, and its return
type was called FreeSpaceEntry, but I had to change the name of the return
type to avoid a conflict with a name that Microsoft had used once upon a
time in their MFC classes and still had some claim on; I changed the name
of the function to match. This is a good illustration of the need to avoid
polluting the global name space; using the namespace construct in the new
C++ standard would be a good solution to such a problem.
29. The alert reader will notice that the type of the NewBigPointerBlock
variable is BigPointerBlockPtr, not BigPointerBlock. However, the
functions that we call through that variable via the operator-> are from the
BigPointerBlock class, because that is the type of the pointer that operator->
returns, as explained in the section on overloading operator->.
30. Another possible use is to implement variant arrays, in which the structure of
the array is variable. In that case, we might use the type to determine
whether we have the item we want or some kind of intermediate structure
requiring further processing to extract the actual data.
31. The reason we mark this quantum as being full is twofold: first, there won't
be very much (if any) space left in this quantum after we have added the
little pointer array; and second, we don't want to store any actual data for our
new main object in the same quantum as we are using for a section of the
little pointer array, to make the reconstruction of a partially corrupted file
easier.
32. There's one exception to this rule, for reasons described above: if the user
specifies 0 elements, I change it to 1.
33. It would probably have been better to create a class to contain this function
as well as a few others that are global in this implementation.
34. These are FreeSpaceBlockPtr, MainObjectBlockPtr, BigPointerBlockPtr,
LittlePointerBlockPtr, and LeafBlockPtr.
35. Of course, while stepping through the program at human speeds in Turbo
Debugger, the timestamps were nicely distributed; this is a demonstration of
Heisenberg's Uncertainty Principle as it applies to debugging.
36. In a 32-bit implementation, it's entirely possible that the counter will never
turn over, as its maximum value is more than four billion. However, if you
let the program run long enough, eventually that will occur; at full tilt, such
an event might take a few months.
37. At least, it's the smallest interface for a class that actually does anything. As
we'll see, this program contains one class that doesn't contribute either data
or functions. I'll explain why that is when we get to it.
38. For a detailed example of how this works, see the section entitled "Polite
Pointing".
39. I'm not going to prefix the name of each embedded function or class with the
name of its enclosing class, to make the explanations shorter; I'll just include
it in the title of the main section discussing the class.
40. I'll cover this and the other global functions in the next chapter.
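
Footnotes 18, 21, 22, and 29 all concern the "handle/body" paradigm, which can be sketched in a few lines. This is a minimal illustration under my own assumptions, not the classes from the program: Body, Handle, UseCount, and Release are hypothetical names. The handle shares one body among its copies, forwards member access through operator->, and deletes the body when the reference count reaches zero. The test is written as <= 0 rather than == 0, in the defensive spirit of footnote 22.

```cpp
#include <cassert>
#include <string>

// Hypothetical body class: holds the shared data and its reference count.
class Body {
public:
    explicit Body(const std::string& text) : m_Text(text), m_Count(1) {}
    std::string m_Text;
    int m_Count;  // number of handles currently sharing this body
};

// Hypothetical handle class: copies share one body instead of duplicating it.
class Handle {
public:
    explicit Handle(const std::string& text) : m_Body(new Body(text)) {}
    Handle(const Handle& other) : m_Body(other.m_Body) { ++m_Body->m_Count; }
    Handle& operator=(const Handle& other) {
        if (m_Body != other.m_Body) {  // also covers self-assignment
            Release();
            m_Body = other.m_Body;
            ++m_Body->m_Count;
        }
        return *this;
    }
    ~Handle() { Release(); }

    // Calls made through -> reach the Body members, not the Handle's.
    Body* operator->() const { return m_Body; }
    int UseCount() const { return m_Body->m_Count; }

private:
    void Release() {
        --m_Body->m_Count;
        // The count should never drop below 0; <= is defensive programming.
        if (m_Body->m_Count <= 0) {
            delete m_Body;
        }
    }
    Body* m_Body;
};
```

Because this sketch has no function that changes the text of a shared body, it avoids the interaction problem described in footnote 21; adding such a function would require giving the handle its own copy of the body before modifying it.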

41. It is important to remember that the last item in a quantum is actually at the
lowest address of any item in that quantum, because items are stored in the
quantum starting from the end and working back toward the beginning of the
quantum.
42. We can't delete unused item index entries that aren't at the end of the index
because that would change the item numbers of the following items,
rendering them inaccessible.
43. Of course, in the real program, we will find the IRA by looking it up in the
main object index, but that detail is irrelevant to the current discussion of
deleting an element.
44. The statement that calculates the value we should return for an empty
quantum may be a bit puzzling. The reason it works as it does is that the
routine that looks for an empty block compares the available free space code
to a specific value, namely AvailableQuantum. If we did not return a value
that corresponded to that free space code, a quantum that was ever used for
anything would never be considered empty again.
45. It is important to note that the number of elements that I'm referring to here
is not the number of elements in the big pointer array (i.e., the number of
quanta that the little pointer arrays occupy) but the actual number of
elements in the array itself (i.e., the number of data items that the user has
stored in the array).
46. Note that we have to remember to set the modified flag for the big pointer
array quantum whenever we change a value in the big pointer array header
or the big pointer array itself, so that these changes will be reflected on the
disk rather than being lost.
47. Although we will be unable to go into the details of the implementation of
the AccessVector type, it is defined in the header file vector.h; if you are
familiar with C++ templates, I recommend that you read that header file to
understand how this type actually works.
48. The "last quantum added to" variable is stored in the big pointer quantum.
When first writing the code to update that variable, I had forgotten to update
the "modified" flag for the big pointer quantum when the variable actually
changed. As a result, every time the buffer used for the big pointer quantum
was reused for a new block, that buffer was overwritten without being
written out to the disk. When the big pointer quantum was reloaded, the "last
quantum added to" variable was reset to an earlier, obsolete value, with the
result that a new quantum was started unnecessarily. This error caused the
file to grow very rapidly.
49. For a similar reason, if we were adding an item to a large object with many
little pointer arrays, each of which contained only a few distinct quantum
number references, we wouldn't be gathering information about very much
of the total storage taken up by the object; we might very well start a new
quantum when there was plenty of space in another quantum owned by this
object.
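
The "modified" flag discipline described in footnotes 46 and 48 can be illustrated with a small sketch. Everything here is assumed for the sake of the example: FakeDisk, QuantumBuffer, Load, Get, and Set are hypothetical names, the "disk" is just a map, and the buffer holds a single value rather than a whole quantum. The point is that Set marks the buffer as modified whenever the value actually changes, so that Load writes the old contents back before reusing the buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the disk: quantum number -> stored value.
typedef std::map<std::uint32_t, std::uint32_t> FakeDisk;

// Hypothetical buffer holding one value (for example, "last quantum added to").
class QuantumBuffer {
public:
    explicit QuantumBuffer(FakeDisk& disk)
        : m_Disk(disk), m_Quantum(0), m_Value(0),
          m_Loaded(false), m_Modified(false) {}

    // Reuse the buffer for another quantum. The old contents are written
    // out first, but only if they were marked as modified.
    void Load(std::uint32_t quantum) {
        if (m_Loaded && m_Modified) {
            m_Disk[m_Quantum] = m_Value;
        }
        m_Quantum = quantum;
        m_Value = m_Disk[quantum];  // a quantum never written reads as 0
        m_Loaded = true;
        m_Modified = false;
    }

    std::uint32_t Get() const { return m_Value; }

    // Omitting the m_Modified update here reproduces the bug in footnote 48:
    // the change would be silently lost when the buffer is reused.
    void Set(std::uint32_t value) {
        if (value != m_Value) {
            m_Value = value;
            m_Modified = true;
        }
    }

private:
    FakeDisk& m_Disk;
    std::uint32_t m_Quantum;
    std::uint32_t m_Value;
    bool m_Loaded;
    bool m_Modified;
};
```

In this sketch, setting a value, loading a different quantum, and then reloading the first one returns the new value; with the flag update removed from Set, the reload would return the earlier, obsolete value instead.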
