Document: Managing Time in Relational Databases - P7 (PDF)


result is that asserted version bi-temporal tables have two pairs
of dates, representing two different time periods. They are the
effective time period and the assertion time period of the rows
in those tables.
To assert a version is to claim that it is true. With any tables
other than bi-temporal tables, if something happens that makes
a row no longer true, that row is either updated or deleted. But
with asserted version tables, that isn’t how we handle a row that
we discover is not true. We handle it by assigning it an assertion
end date (or a transaction end date, in the case of the standard
temporal model) representing the date on which we acknowl-
edge that it is not true. Then we add another row to the table.
This new row is a new assertion about the same version, a new
assertion about what the object is like during the same period
of time. The multiple rows representing the same version are
multiple assertions of that version.
In a series of assertions about the same object during the
same effective time period, the later one (in assertion time) of
every consecutive pair is a correction to the earlier one, and
the latest one of all is our current assertion about what is true.
We may also note here that corrections and the versions they
correct do not necessarily line up one to one. An error may span
only part of a version, or may span multiple versions, or may
span both. The ability to keep in one table both the original
versions and the corrections to them is precisely what is lost
when we remain with uni-temporal versions, and precisely what we
gain when we manage data bi-temporally.
For example, suppose that a row in a table states that policy
P861 had a copay amount of $25 during the first six
months of 2010. But on July 1st of that year, we realized that
the copay amount was actually $20 for that policy over that
time period. If we were to overwrite that copay amount, we
would lose the information that prior to July 1st, all reports
and queries would have shown a copay of $25. To avoid losing
that information, we instead end the assertion time period for
the row showing a $25 copay, and insert another row. This new
row is also for policy P861, during the same six months of
2010, but it shows the correct copay of $20. Its assertion begin
date is the same as the assertion end date of the row showing
the $25 copay: July 1st. We no longer assert that P861 had a
$25 copay during that period of time. Instead, we now assert
that it had a $20 copay during that period of time. We have
corrected a mistake without losing track of what the mistake
was, when it occurred, and for how long it was mistakenly
treated as the truth.
Chapter 5 THE CORE CONCEPTS OF ASSERTED VERSIONING
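The P861 correction can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AVF's implementation: the Row class, the correct and as_asserted_on functions, the closed-open reading of both time periods, and the January 1st, 2010 assertion begin date of the original row are all ours and do not come from the text.

```python
from dataclasses import dataclass, replace
from datetime import date

FOREVER = date(9999, 12, 31)  # stands for "until further notice"


@dataclass(frozen=True)
class Row:
    policy: str
    copay: int
    eff_begin: date  # effective time period: [eff_begin, eff_end)
    eff_end: date
    asr_begin: date  # assertion time period: [asr_begin, asr_end)
    asr_end: date


def correct(rows, old, new_copay, today):
    """Withdraw `old` as of `today` and assert a corrected row in its place."""
    withdrawn = replace(old, asr_end=today)
    corrected = replace(old, copay=new_copay, asr_begin=today, asr_end=FOREVER)
    return [withdrawn if r == old else r for r in rows] + [corrected]


def as_asserted_on(rows, when):
    """Rows that we claimed to be true on the assertion date `when`."""
    return [r for r in rows if r.asr_begin <= when < r.asr_end]


# Assumed starting point: the $25 row, asserted since January 1st, 2010.
original = Row("P861", 25, date(2010, 1, 1), date(2010, 7, 1),
               date(2010, 1, 1), FOREVER)
table = correct([original], original, 20, date(2010, 7, 1))
```

Both rows remain in the table. A query as of June 30th still sees the $25 row, and a query as of July 1st or later sees only the $20 row, so the mistake and its correction are both on record.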
Temporal Integrity Constraints
The three integrity constraints in relational theory are entity
integrity, referential integrity and domain integrity. Entity integrity
ensures that the object represented by a row in a non-temporal
table is represented by that row only, and no other.
But as we have seen, with temporal tables, an object may be
represented by any number of rows. For these tables, then, the
entity integrity constraint must be modified. That modification
results in what we call temporal entity integrity.
As for domain integrity, it is obviously possible for a domain
to change over time. But such changes are not changes to data.
They are changes to types of data, indeed to datatypes. Domain
changes are one form of what computer scientists call schema
evolution. Adding or removing tables, altering primary or foreign
keys, and adding or removing non-key columns, are other ways
in which database schemas evolve. Both the standard temporal
model and Asserted Versioning are formalizations of temporal
data management within a snapshot of the evolution of a data-
base schema. Both temporal models, along with the IT industry’s
best versioning practices, assume unchanging, stable database
schemas.
What, then, of referential integrity? Well, suppose we have
two conventional tables, X and Y, with Y having a referential
integrity dependency on X. Typically we would say that X is the
parent table, and Y the child table. Now suppose that both tables
are bi-temporal tables. In this case, any number of rows in X may
represent the same object. So for any row in Y, which of the mul-
tiple rows in X, all representing the same object, is that row in Y
dependent on? Which row in X does the foreign key in Y point to?
The answer is that the foreign key in Y does not point to any
specific row in X. For these tables, then, the referential integrity
constraint must be modified. That modification results in what
we call temporal referential integrity (TRI). The foreign key in Y
becomes what we call a temporal foreign key (TFK).
Temporal Entity Integrity
Breaking the one-to-one correspondence between objects
and rows is no small thing. Its implications are significant. One
of them is that the relational rule of entity integrity must be
understood in a new way.
With a non-temporal table, entity integrity is usually enforced
by a primary key uniqueness constraint on the table. It may also
be enforced by defining a unique index on an alternate key.
When a surrogate key is used as the primary key of a table, pri-
mary key uniqueness will guarantee only that rows are physically
distinguishable from one another. To guarantee entity integrity,
which is a semantic constraint, a unique index must also be
defined on an alternate key, none of whose component columns
are surrogate-valued.
For temporal tables, the corresponding constraint is temporal
entity integrity. Basically, it works like this. In a non-temporal
table, DBMS-enforced entity integrity blocks any insertion that
would result in a pair of rows both of which represent the same
object. But when the target table is an asserted version table,
temporal entity integrity (TEI) blocks any insertion that would
result in a pair of rows that represent the same object and that
have one or more effective time clock ticks in common within
one or more assertion time clock ticks that they also have in
common.
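The overlap rule just described can be sketched as follows. The sketch assumes integer clock ticks and closed-open periods, and the names Row, shares_tick and violates_tei are hypothetical. It covers only the conflict test, not the other work that TEI must do.

```python
from typing import NamedTuple, Tuple


class Row(NamedTuple):
    oid: str              # object identifier
    eff: Tuple[int, int]  # effective time, closed-open clock ticks
    asr: Tuple[int, int]  # assertion time, closed-open clock ticks


def shares_tick(p, q):
    """True if closed-open periods p and q have at least one clock tick in common."""
    return p[0] < q[1] and q[0] < p[1]


def violates_tei(table, new):
    """True if inserting `new` would put two rows for the same object into
    shared effective time within shared assertion time."""
    return any(
        row.oid == new.oid
        and shares_tick(row.eff, new.eff)
        and shares_tick(row.asr, new.asr)
        for row in table
    )


table = [Row("P861", eff=(10, 20), asr=(0, 100))]
```

With this table, a new P861 row for effective ticks 15 to 25 is blocked if its assertion period also overlaps ticks 0 to 100, but it is accepted if it is asserted only from tick 100 onwards, because the two rows are then never asserted at the same time.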
In enforcing the rule that no two versions of the same object
may conflict, TEI is obviously analogous to conventional entity
integrity. For having one or more effective time clock ticks in
common means that there are two rows which both purport to
describe the same object during the same period of time. So
both entity integrity and temporal entity integrity play the same
semantic role; they both prevent conflicting truth claims.

However, TEI has additional work to do, work that is not
required of entity integrity. First of all, TEI must ensure that adjacent
versions within the same episode [meet], i.e. that there are
no temporal gaps between them. In addition, TEI must also
ensure that there are temporal gaps between adjacent episodes,
i.e. that one episode is always [before] the other. And it’s important
to understand why there must be at least one clock tick
between any pair of adjacent episodes. It has to do with understanding
and enforcing the user’s intentions when she submits
a temporal transaction.
First, let’s consider insert transactions. With an insert to a
conventional table, the user tacitly agrees that if a row with the
same unique identifier already exists in the target table, the
DBMS will reject the transaction. The user is telling the DBMS
“I believe that this table does not contain a row for this object.
So if I’m wrong, I don’t want the transaction to proceed. I want
you to kick out the transaction, and notify me.”
By the same token, then, with an insert to an asserted version
table, the user is telling the DBMS to create a new episode of the
indicated object, or to extend an existing episode into effective
time clock ticks that it does not yet occupy. And analogously to
entity integrity in non-temporal tables, the user is tacitly agree-
ing that if an episode of the same object already exists in the tar-
get table and if its effective and assertion time periods have any
clock ticks in common with the time periods targeted by the
transaction, the transaction should be rejected. The user is
telling the Asserted Versioning Framework “I believe that this
table does not contain any row or rows for this object that
already represent the object in even a single clock tick that is
specified on the transaction. So if I’m wrong, I don’t want the
transaction to proceed. I want you to kick out the transaction,
and notify me.”
Next, let’s consider update and delete transactions. By a simi-
lar process of reasoning, we can see that in submitting either
type of transaction, the user is telling the Asserted Versioning
Framework “I believe that the table does contain one or more
rows for this object with one or more clock ticks that [intersect]
the clock ticks specified on this transaction. So if I’m wrong, I
don’t want the transaction to proceed. I want you to kick out
the transaction, and notify me.”
One Clock Tick: Convention or Constraint?
Given two rows representing the same object, there are three
temporal relationships between them that are distinguishable by
means of a single clock tick. First, if there is even a single clock
tick between the end of one and the start of the next, then they
are non-contiguous. In Allen relationship terms, one is [before]
the other. Next, if there is even a single clock tick that is
contained in both their time periods, then they [intersect].
Finally, if neither is the case, then they are contiguous. In Allen
relationship terms, they [meet].
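The three-way distinction can be made concrete with a small function. This is a sketch that assumes non-empty, closed-open periods written as (begin, end) pairs of integer clock ticks. The function name relationship is ours.

```python
def relationship(p, q):
    """Classify two non-empty closed-open periods of integer clock ticks."""
    first, second = sorted([p, q])  # order the periods by begin tick
    if first[1] < second[0]:
        return "before"     # at least one clock tick lies between them
    if first[1] > second[0]:
        return "intersect"  # at least one clock tick lies in both
    return "meet"           # contiguous: no gap and no shared tick
```

For instance, (1, 5) and (6, 9) are separated by tick 5, (1, 5) and (4, 9) share tick 4, and (1, 5) and (5, 9) neither leave a gap nor share a tick.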
Two versions of the same object that are non-contiguous may
exist in the same target table at the same time. But if they do,
they necessarily belong to different episodes. And if one of those
versions is the only version of that object already in the target
table and the other is a transaction, that transaction cannot be
an update. If it were an update transaction, it would be equivalent
to attempting a conventional update when there was no
row for that object already in the target table. By the same token,
if one of those versions is in the target table and the other is a
transaction, that transaction cannot be a deletion.
Two versions of the same object that [intersect], in both effective
and assertion time, cannot exist in the same target table at
the same time. If they did, they would violate temporal entity
integrity. They would be two concurrently asserted statements
about what the same object is like at the same time. By the same
token, if one of those versions is in the target table and the other
is a transaction, that transaction cannot be an insert transaction.
If it were an insert transaction, it would be equivalent to
attempting a conventional insert when a row for that object is
already in the target table.
But what about the third case, when the two versions are contiguous?
As we have already seen, two contiguous versions of the
same object can exist in the same table at the same time, and
in doing so, they belong to the same episode. But when
one is a transaction and the other a row already in the target table,
which is the correct transaction to use—an insert or an update?
As far as the results of the transaction are concerned, it doesn’t
matter. Whether an insert or an update is used, the effect will be
to expand an existing episode either forwards or backwards in
time. Of course, if the episode is being expanded forwards in time,
it must be an episode which, prior to the transaction, ended in a
non-12/31/9999 date. Otherwise, the transaction’s begin date
could not be contiguous with the end date of the episode, but
instead would be included within the time period of the episode.
But the principle is still the same. We have chosen to use an insert
transaction in cases of contiguous time periods, but we could
have chosen to use an update transaction instead. It is entirely a
matter of convention, of deciding on a convention that will make
a user’s background assumptions clear.
We chose to use temporal insert transactions, in fact, to pre-
serve a pleasing symmetry. For we now have a set of trans-
actions in which all increases in clock tick representation are
done with inserts, all reductions in clock tick representation
are done with deletes, and updates do neither. Of course, pleasing
symmetries often turn out to have practical as well as aesthetic
benefits. In this case, for example, the AVF will never have to be
concerned with temporal entity integrity or temporal referential
integrity on update transactions, except for an update which
changes the parent object referenced by a version. Making that
determination on nothing more than transaction type is certainly
more efficient than making it on the basis of some more complex
set of criteria.
With our convention in place, we have a clear and intuitive
model of how time and transaction types pair up with one
another. For a given object, if the effective and assertion time
periods specified on a temporal transaction do not [intersect]
the time periods on any rows already in the target table, then
the transaction is valid if it is an insert, and invalid otherwise.
Conversely, for a given object, if the effective and assertion time
periods specified on a temporal transaction do [intersect] the
time periods on one or more rows already in the target table,
then the transaction is valid if it is an update or delete, and
invalid otherwise.
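This pairing of transaction types with time periods can be sketched as a single validity test. The dictionary layout, the integer clock ticks, the closed-open periods and the names shares_tick and is_valid are assumptions made for the illustration.

```python
def shares_tick(p, q):
    """True if closed-open periods p and q have at least one clock tick in common."""
    return p[0] < q[1] and q[0] < p[1]


def is_valid(kind, txn, rows):
    """Apply the convention: inserts need no [intersect], updates and deletes need one."""
    hit = any(
        r["oid"] == txn["oid"]
        and shares_tick(r["eff"], txn["eff"])
        and shares_tick(r["asr"], txn["asr"])
        for r in rows
    )
    if kind == "insert":
        return not hit
    if kind in ("update", "delete"):
        return hit
    raise ValueError(f"unknown transaction type: {kind}")


rows = [{"oid": "P861", "eff": (10, 20), "asr": (0, 100)}]
contiguous = {"oid": "P861", "eff": (20, 30), "asr": (0, 100)}
overlapping = {"oid": "P861", "eff": (15, 18), "asr": (0, 100)}
```

Note that the contiguous transaction, which [meets] the existing row, is valid only as an insert. That is the convention chosen above, not a logical necessity.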
What we could not have chosen to do is to permit either
inserts or updates to be used in the case of contiguous time per-
iods. The reason is that we must preserve a core element in the
semantics of conventional insert and update transactions, which
is that the user knows whether or not the target table already
contains a row which matches the transaction. By using a con-
ventional insert, the user is telling us to reject the transaction if
a matching representation of that object already exists in the
table. The same must be true of temporal inserts against tempo-
ral tables. By using a conventional update, the user is telling us
to reject the transaction unless a matching representation of that
object already exists in the table. Again, the same must be true of
temporal updates. The difference is that a “matching representa-
tion”, in a conventional table, is simply a row representing the
same object. A “matching representation”, in an asserted version
table, is a row representing the same object at the same time, i.e.
in a set of one or more identical clock ticks.
Temporal Referential Integrity
Another consequence of breaking the one-to-one correspon-
dence between objects and rows is that the relational rule of ref-
erential integrity breaks down when the parent table in a
referential integrity relationship is an asserted version table. In
that case, the parent in any instance of that relationship is not
a row; rather, it is an episode, and it may consist of any number
of rows.
Referential integrity (RI) reflects an existence dependency. If a
child row is RI-dependent on a parent row, this is based on the
fact that the object represented by the child row is existence-
dependent on the object represented by the parent row. There-
fore, a child row cannot be inserted into the database unless its
referenced parent row is already present, and that parent row
cannot be deleted from the database as long as any child row
referencing it is present.
The same logic is at work in the case of temporal referential
integrity. If there is an existence dependency between a parent
object and a child object—between a client and a policy, for
example—then we cannot assert that the policy is ever in effect
when the client is not. It follows that no change which reduces
the total number of clock ticks in which a child policy is in effect,
and no change which increases the total number of clock ticks in
which a parent client is in effect, can create a TRI violation. And
so the AVF never has to enforce TRI on a child-side temporal
delete or a parent-side temporal insert. As for temporal updates,
they never alter the number of clock ticks in which an object is
represented. And so, unless they change the parent object for a
version, there is no TRI enforcement needed.
By the same token, any change which increases the total
number of clock ticks in which a child policy is in effect, and
any change which decreases the total number of clock ticks in
which a parent client is in effect, can create a TRI violation.
And so the AVF must enforce temporal referential integrity on
parent-side temporal delete transactions, and also on child-side
temporal inserts.
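The reasoning of the last two paragraphs amounts to a small decision table keyed on transaction type. A minimal sketch, in which the function name needs_tri_check and its parameters are hypothetical and do not describe the AVF's actual interface:

```python
def needs_tri_check(side, kind, changes_parent=False):
    """Decide from the transaction type alone whether TRI must be enforced.

    side: "child" or "parent" table of the TRI relationship
    kind: "insert", "update" or "delete"
    changes_parent: True if an update changes the parent object of a version
    """
    if side not in ("child", "parent") or kind not in ("insert", "update", "delete"):
        raise ValueError(f"unknown combination: {side}, {kind}")
    if kind == "update":
        return changes_parent   # updates never add or remove clock ticks
    if side == "child":
        return kind == "insert"  # more child ticks may lack a parent
    return kind == "delete"      # fewer parent ticks may strand a child
```

Only two of the four insert and delete combinations ever require the more expensive check against the other table.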
Child-Side Temporal Referential Integrity

The foreign key in a row in a child asserted version table is a
temporal foreign key (TFK). It contains the object identifier (the
oid) of the object that its object is existence-dependent on. But
this object identifier isn’t sufficient to identify a specific row in
the parent table. There may be many rows in the parent table
with that object identifier. And that one row in the child table
may be TRI-dependent on any number of those rows in the par-
ent table. That one row in the child table is TRI-dependent on an
episode in the parent table, an episode of the object designated
by the object identifier in its TFK. The episode it is dependent
on is the one episode of the object designated by that oid that,
within shared assertion time, includes the effective time period
of that version.
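A child-side check of this kind might look like the sketch below. It assumes integer clock ticks, closed-open periods, parent rows that already satisfy TEI, and a single shared assertion time, so that only effective time needs to be compared. The names episodes and tri_satisfied are ours.

```python
def episodes(versions):
    """Merge contiguous (begin, end) versions of one object into episodes."""
    merged = []
    for begin, end in sorted(versions):
        if merged and merged[-1][1] == begin:
            merged[-1] = (merged[-1][0], end)  # versions [meet]: same episode
        else:
            merged.append((begin, end))        # a gap: a new episode starts
    return merged


def tri_satisfied(child, parent_table):
    """True if one parent episode fully includes the child's effective time."""
    versions = [p["eff"] for p in parent_table if p["oid"] == child["tfk"]]
    begin, end = child["eff"]
    return any(b <= begin and end <= e for b, e in episodes(versions))


parent_table = [
    {"oid": "C1", "eff": (1, 5)},
    {"oid": "C1", "eff": (5, 9)},    # contiguous with the row above
    {"oid": "C1", "eff": (12, 20)},  # a second episode of C1
    {"oid": "C2", "eff": (1, 30)},
]
```

A child version for ticks 3 to 8 with a TFK of C1 passes even though no single parent row covers it, because the first two parent rows form one episode. A child version for ticks 8 to 13 fails, because it reaches into the gap between the two C1 episodes.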
Although the parent managed object in a TRI relationship is
an episode, the child managed object is a version. Just as the for-
eign key value in a row in a conventional table may change over
time, so too the temporal foreign key value in a version in a tem-
poral table may change over time from one version of an object
to the next version of that same object. It does not have to be the
same as the TFK value in any other version of the same object.
Even within the same episode, a TFK value may change from
one version in that episode to the next version in that same epi-
sode. TRI child objects are versions, not episodes, because
among versions of the same object, what is referenced by a
temporal foreign key may change over time and, consequently,
over versions.
Parent-Side Temporal Referential Integrity
Of course, TRI, like RI, can be violated from the parent side as
well. In the case of TRI, a violation cannot occur unless the temporal
extent of the parent episode is reduced. This can happen in
one of three ways. First of all, the effective-time start of the episode
can be moved forwards. Second, the effective-time end of
the episode can be moved back. This will happen when either
the effective-time end of an episode is changed from 12/31/9999
to any other date, or is changed from a non-12/31/9999 date to
an earlier date. Finally, the episode can be split into two episodes,
leaving a gap where previously there had been none.
But shortening the effective time extent of a parent episode
will not always result in a TRI violation. It will do so only if the
reduction removes the representation of the parent object from
one or more clock ticks that are occupied by a child version
whose TFK matches the oid of the versions in that parent epi-
sode. For example, suppose a parent episode’s effective time is
all of 2009, and a delete transaction splits that episode and
creates in its place one January to April episode and another
October to December episode. If none of the child versions has
an effective time period that includes any of the six months from
April to October, then TRI has not been violated.
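The 2009 example can be checked mechanically. The sketch below uses the months of 2009 as clock ticks numbered 1 to 12, with closed-open periods, and reads the removed range as the six months April through September. That reading, the child rows, and the names shares_tick and tri_violations are assumptions made for the illustration.

```python
def shares_tick(p, q):
    """True if closed-open periods p and q have at least one clock tick in common."""
    return p[0] < q[1] and q[0] < p[1]


def tri_violations(parent_oid, removed, children):
    """Names of child versions left without a parent in some removed clock tick."""
    return [
        c["name"]
        for c in children
        if c["tfk"] == parent_oid and shares_tick(c["eff"], removed)
    ]


removed = (4, 10)  # April through September 2009, as month numbers
children = [
    {"name": "policy A", "tfk": "C1", "eff": (1, 4)},    # January to March
    {"name": "policy B", "tfk": "C1", "eff": (10, 13)},  # October to December
]
late_spring = {"name": "policy C", "tfk": "C1", "eff": (3, 6)}  # March to May
```

With only policies A and B present, the delete creates no violation. Adding policy C, which is in effect during April and May, would make the same delete a TRI violation.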
Conceptually, reducing the time period of an episode with
dependent versions (versions which may be in the same table
but, more commonly, are in other tables) so that the parent epi-
sode no longer fully includes the time period of one or more of
those child versions, is like a deletion in a conventional parent
table in that there are three options for handling it. First, we
may want to simply restrict the transaction and prevent the
reduction from taking place. Second, we may want to permit
the transaction to proceed, but find all the dependent child rows
and set their temporal foreign keys to the parent object to NULL.
Or, finally, we may want to reduce the temporal extent of all
affected child rows so that the TRI constraint is re-established.
For example, if a client is deleted effective September 2010, then
any policy owned by that client must be deleted as of that same
date. In asserted version tables, this means that if the most
recent episode of that client is given an effective end date of Sep-
tember 2010, then the most recent episode of the policy she
owns must have an effective end date no later than September
2010.⁴
⁴ To be completely accurate, this description would have to be modified a little to
include situations in which there are future versions or episodes of that client or that
policy. But we will leave those details for later.
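The three options for handling a shortened parent episode can be sketched as one function with a rule parameter. This is a simplified illustration, not the AVF's behavior: it considers effective time only, it handles only the case in which the parent's effective end date is moved back, and under the set-to-NULL rule it nulls the TFK of the whole child version rather than splitting it. The name shorten_parent, the rule names, and the reading of September 2010 as September 1st are assumptions.

```python
from datetime import date

FOREVER = date(9999, 12, 31)


def shorten_parent(children, parent_oid, new_end, rule):
    """Return the child rows after the parent's effective end moves back to new_end."""
    if rule not in ("restrict", "set_null", "cascade"):
        raise ValueError(f"unknown rule: {rule}")

    def exposed(c):
        # The child is still in effect after the parent's new effective end.
        return c["tfk"] == parent_oid and c["eff"][1] > new_end

    if rule == "restrict" and any(exposed(c) for c in children):
        raise ValueError("parent episode still has dependent child versions")

    result = []
    for c in children:
        if not exposed(c):
            result.append(c)
        elif rule == "set_null":
            result.append({**c, "tfk": None})
        elif c["eff"][0] < new_end:
            # cascade: keep only the part of the child that the parent still covers
            result.append({**c, "eff": (c["eff"][0], new_end)})
        # cascade: a child lying wholly after new_end is dropped
    return result


policies = [{"oid": "P861", "tfk": "C1", "eff": (date(2010, 1, 1), FOREVER)}]
end = date(2010, 9, 1)
```

Under the cascade rule, the policy's effective end date becomes the same September 2010 date as its client's, which is the outcome described in the client and policy example.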