Database Modeling & Design, Fourth Edition (P29)

6.5 Fourth and Fifth Normal Forms 127
Step 5. Definition of the Minimum Set of Normalized Tables
The minimum set of normalized tables has now been computed. We
define them below in terms of the table name, the attributes in the table,
the FDs in the table, and the candidate keys for that table:

R1: ABC (AB->C with key AB)
R2: AEF (A->EF with key A)
R3: EG (E->G with key E)
R4: DGI (G->DI with key G)
R5: DFJ (F->DJ with key F)
R6: DKLMNP (D->KLMNP, L->D, with keys D, L)
R7: PQRT (PQR->T with key PQR)
R8: PRS (PR->S with key PR)

Note that this result is not only 3NF, but also BCNF, which is very
frequently the case. This fact suggests a practical algorithm for a (near)
minimum set of BCNF tables: Use Bernstein’s algorithm to attain a minimum
set of 3NF tables, then inspect each table for further decomposition
(or partial replication, as shown in Section 6.1.5) to BCNF.
6.5 Fourth and Fifth Normal Forms
Normal forms up to BCNF were defined solely on FDs, and, for most
database practitioners, either 3NF or BCNF is a sufficient level of
normalization. However, there are in fact two more normal forms that are
needed to eliminate the rest of the currently known anomalies. In this
section, we will look at different types of constraints on tables:
multivalued dependencies and join dependencies. If these constraints do not
exist in a table, which is the most common situation, then any table in
BCNF is automatically in fourth normal form (4NF), and fifth normal
form (5NF) as well. However, when these constraints do exist, there may
be further update (especially delete) anomalies that need to be corrected.
First, we must define the concept of multivalued dependency.
6.5.1 Multivalued Dependencies
Definition. In a multivalued dependency (MVD), X->>Y holds on table
R with table scheme RS if, whenever a valid instance of table
R(X,Y,Z) contains a pair of rows that contain duplicate values of X,
then the instance also contains the pair of rows obtained by
interchanging the Y values in the original pair. This includes situations
where only pairs of rows exist. Note that X and Y may contain either
single or composite attributes.
Teorey.book Page 127 Saturday, July 16, 2005 12:57 PM
128 CHAPTER 6 Normalization
An MVD X ->> Y is trivial if Y is a subset of X, or if X union Y = RS.
Finally, an FD implies an MVD, which implies that a single row with a
given value of X is also an MVD, albeit a trivial form.
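The row-interchange condition in the definition can also be written symbolically. The block below is only a sketch of that restatement: the row names t_1 through t_4 and the notation t[X] (the values of row t on the attributes X) are assumptions introduced here, and \twoheadrightarrow (from the amssymb package) stands for ->>.

```latex
% Z denotes the remaining attributes: Z = RS - X - Y.
\[
X \twoheadrightarrow Y \text{ holds on } R
\iff
\forall\, t_1, t_2 \in R:\;
t_1[X] = t_2[X] \implies
\exists\, t_3, t_4 \in R \text{ such that }
\begin{cases}
t_3[X] = t_4[X] = t_1[X], \\
t_3[Y] = t_2[Y], \quad t_3[Z] = t_1[Z], \\
t_4[Y] = t_1[Y], \quad t_4[Z] = t_2[Z].
\end{cases}
\]
```

Rows t_3 and t_4 are the original pair with their Y values interchanged. When Y is a subset of X, or when Z is empty, they coincide with t_1 and t_2, which is why those two cases are trivial.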
The following examples show where an MVD does and does not
exist in a table. In R1, the first four rows satisfy all conditions for the
MVDs X->>Y and X->>Z. Note that MVDs appear in pairs because of the
cross-product type of relationship between Y and Z=RS-Y as the two
right sides of the two MVDs. The fifth and sixth rows of R1 (when the X
value is 2) satisfy the row interchange conditions in the above defini-
tion. In both rows, the Y value is 2, so the interchanging of Y values is
trivial. The seventh row (3,3,3) satisfies the definition trivially.
In table R2, however, the Y values in the fifth and sixth rows are dif-
ferent (1 and 2), and interchanging the 1 and 2 values for Y results in a
row (2,2,2) that does not appear in the table. Thus, in R2 there is no
MVD between X and Y or between X and Z, even though the first four
rows satisfy the MVD definition. Note that for the MVD to exist, all rows
must satisfy the criterion for an MVD.
In table R3, the first three rows do not satisfy the criterion
for an MVD, since changing Y from 1 to 2 in the second row results
in a row that does not appear in the table. Similarly, changing Z from 1
to 2 in the third row results in a nonappearing row. Thus, R3 does not
have any MVDs between X and Y or between X and Z.
R1:  X Y Z      R2:  X Y Z      R3:  X Y Z
     1 1 1           1 1 1           1 1 1
     1 1 2           1 1 2           1 1 2
     1 2 1           1 2 1           1 2 1
     1 2 2           1 2 2           2 2 1
     2 2 1           2 2 1           2 2 2
     2 2 2           2 1 2
     3 3 3
By the same argument, in table R1 we have the MVDs Y->>X and
Y->>Z, but none with Z on the left side. Tables R2 and R3 have no
nontrivial MVDs at all.
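The row-interchange test can be run mechanically against the three example tables. The following Python sketch is one way to do so; the function names (mvd_holds, single_attribute_mvds) and the encoding of each row as an (X, Y, Z) tuple are assumptions made for this illustration, not part of the text.

```python
from itertools import combinations

def mvd_holds(rows, x_idx, y_idx):
    """Check X ->> Y on an instance using the row-interchange definition.

    rows  : iterable of equal-length tuples
    x_idx : set of column positions forming X
    y_idx : set of column positions forming Y
    """
    table = set(rows)
    for r1, r2 in combinations(table, 2):
        if all(r1[i] == r2[i] for i in x_idx):
            # Interchange the Y values of the two rows; both results must exist.
            swapped1 = tuple(r2[i] if i in y_idx else r1[i] for i in range(len(r1)))
            swapped2 = tuple(r1[i] if i in y_idx else r2[i] for i in range(len(r2)))
            if swapped1 not in table or swapped2 not in table:
                return False
    return True

ATTRS = "XYZ"

def single_attribute_mvds(rows):
    """List every MVD with one attribute on each side that holds on rows."""
    return [
        f"{ATTRS[left]}->>{ATTRS[right]}"
        for left in range(3)
        for right in range(3)
        if left != right and mvd_holds(rows, {left}, {right})
    ]

# The three example tables, one (X, Y, Z) tuple per row.
R1 = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2), (2, 2, 1), (2, 2, 2), (3, 3, 3)]
R2 = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2), (2, 2, 1), (2, 1, 2)]
R3 = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (2, 2, 1), (2, 2, 2)]

for name, table in (("R1", R1), ("R2", R2), ("R3", R3)):
    print(name, single_attribute_mvds(table))
```

For R1 the sketch reports X->>Y, X->>Z, Y->>X, and Y->>Z; for R2 and R3 it reports nothing, in line with the discussion above. Such a check only describes the rows currently stored and says nothing about rows that may be added later.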
The following inference rules for MVDs are somewhat analogous to
the inference rules for functional dependencies given in Section 6.4
[Beeri, Fagin, and Howard, 1977]. They are quite useful in the analysis
and decomposition of tables into 4NF.
Multivalued Dependency Inference Rules

Reflexivity          X ->> X
Augmentation         If X ->> Y, then XZ ->> Y.
Transitivity         If X ->> Y and Y ->> Z, then X ->> (Z-Y).
Pseudotransitivity   If X ->> Y and YW ->> Z, then XW ->> (Z-YW).
                     (Transitivity is a special case of pseudotransitivity
                     when W is empty.)
Union                If X ->> Y and X ->> Z, then X ->> YZ.
Decomposition        If X ->> Y and X ->> Z, then X ->> Y intersect Z
                     and X ->> (Z-Y).
Complement           If X ->> Y and Z=R-X-Y, then X ->> Z.
FD Implies MVD       If X -> Y, then X ->> Y.
FD, MVD Mix          If X ->> Z and Y -> Z’ (where Z’ is contained in
                     Z, and Y and Z are disjoint), then X -> Z’.

6.5.2 Fourth Normal Form
The goal of 4NF is to eliminate nontrivial MVDs from a table by
projecting them onto separate smaller tables, and thus to eliminate the
update anomalies associated with the MVDs. This type of normal form is
reasonably easy to attain if you know where the MVDs are. In general,
MVDs must be defined from the semantics of the database; they cannot
be determined from just looking at the data. The current set of data can
only verify whether your assumption about an MVD is currently true or
not, but this may change each time the data is updated.
Definition. A table R is in fourth normal form (4NF) if and only if it is
in BCNF and, whenever there exists an MVD in R (say X ->> Y), at
least one of the following holds: the MVD is trivial, or X is a super-
key for R.
Applying this definition to the three tables in the example in the
previous section, we see that R1 is not in 4NF because at least one
nontrivial MVD exists and no single column is a superkey. In tables R2 and
R3, however, there are no nontrivial MVDs. Thus, these two tables are at
least 4NF.
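The 4NF condition can be approximated on a stored instance. The sketch below rests on two stated assumptions: a set of columns is treated as a superkey when its values are unique within the current rows (real keys come from the semantics of the database, not from the data), and the BCNF part of the definition is not checked. All function names are hypothetical.

```python
from itertools import combinations, product

def project(row, idx):
    """Values of one row on the column positions in idx."""
    return tuple(row[i] for i in idx)

def mvd_holds(rows, x, y, z):
    """X ->> Y holds iff, for each X value, the (Y, Z) pairs form a full cross product."""
    groups = {}
    for row in set(rows):
        groups.setdefault(project(row, x), set()).add((project(row, y), project(row, z)))
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        if pairs != set(product(ys, zs)):
            return False
    return True

def is_superkey_in_instance(rows, x):
    """Assumption: X counts as a superkey if no two stored rows share an X value."""
    distinct = set(rows)
    return len({project(r, x) for r in distinct}) == len(distinct)

def fourth_nf_violations(rows, n_attrs):
    """Return (X, Y) column-position pairs where a nontrivial X ->> Y holds
    and X is not a superkey of the instance."""
    attrs = range(n_attrs)
    violations = []
    for k in range(1, n_attrs - 1):
        for x in combinations(attrs, k):
            rest = [a for a in attrs if a not in x]
            for m in range(1, len(rest)):      # keeps Y and Z both nonempty
                for y in combinations(rest, m):
                    z = tuple(a for a in rest if a not in y)
                    if mvd_holds(rows, x, y, z) and not is_superkey_in_instance(rows, x):
                        violations.append((x, y))
    return violations

R1 = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2), (2, 2, 1), (2, 2, 2), (3, 3, 3)]
R2 = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2), (2, 2, 1), (2, 1, 2)]
R3 = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (2, 2, 1), (2, 2, 2)]

for name, table in (("R1", R1), ("R2", R2), ("R3", R3)):
    print(name, fourth_nf_violations(table, 3))
```

With column positions 0, 1, 2 standing for X, Y, Z, the sketch reports four violating MVDs for R1 (X->>Y, X->>Z, Y->>X, Y->>Z) and none for R2 or R3.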
As an example of the transformation of a table that is not in 4NF to
two tables that are in 4NF, we observe the ternary relationship skill-
required, shown in Figure 6.6. The relationship skill-required is defined
as follows: “An employee must have all the required skills needed for a
project to work on that project.” For example, in Table 6.5 the project
with proj_no = 3 requires skill types A and B of all its employees (see
employees 101 and 102). The table skill_required has no FDs, but it
does have several nontrivial MVDs, and is therefore only in BCNF. In
such a case it can have a lossless decomposition into two many-to-many
binary relationships between the entities Employee and Project, and
Project and Skill. Each of these two new relationships represents a table
in 4NF. It can also have a lossless decomposition resulting in a binary
many-to-many relationship between the entities Employee and Skill,
and Project and Skill.
Figure 6.6 Ternary relationship with multiple interpretations
[Diagram: the entities Employee, Skill, and Project joined by a ternary
N:N:N relationship that can be read as (1) skill-required,
(2) skill-in-common, or (3) skill-used.]

A two-way lossless decomposition occurs when skill_required is
projected over (emp_id, proj_no) to form skill_req1 and projected over
(proj_no, skill_type) to form skill_req3. Projection over (emp_id,
proj_no) to form skill_req1 and over (emp_id, skill_type) to form
skill_req2, however, is not lossless. A three-way lossless decomposition
occurs when skill_required is projected over (emp_id, proj_no),
(emp_id, skill_type), and (proj_no, skill_type).
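Whether a decomposition is lossless can be checked by projecting the table, joining the projections back together, and comparing the result with the original rows. The following Python sketch does this for the decompositions just described, using the skill_required rows of Table 6.5; the helper names (project, natural_join, rejoin) are assumptions made for this illustration.

```python
COLS = ("emp_id", "proj_no", "skill_type")

# Rows of skill_required as listed in Table 6.5.
skill_required = {
    (101, 3, "A"), (101, 3, "B"), (101, 4, "A"), (101, 4, "C"),
    (102, 3, "A"), (102, 3, "B"), (103, 5, "D"),
}

def project(rows, cols, keep):
    """Project rows (tuples laid out as cols) onto the columns in keep."""
    idx = [cols.index(c) for c in keep]
    return {tuple(r[i] for i in idx) for r in rows}

def natural_join(left, left_cols, right, right_cols):
    """Natural join of two row sets on the column names they share."""
    shared = [c for c in left_cols if c in right_cols]
    out_cols = tuple(left_cols) + tuple(c for c in right_cols if c not in shared)
    joined = set()
    for l_row in left:
        l_map = dict(zip(left_cols, l_row))
        for r_row in right:
            r_map = dict(zip(right_cols, r_row))
            if all(l_map[c] == r_map[c] for c in shared):
                merged = {**l_map, **r_map}
                joined.add(tuple(merged[c] for c in out_cols))
    return joined, out_cols

def rejoin(rows, cols, *schemes):
    """Project rows onto each scheme, join the projections, return rows in cols order."""
    joined, joined_cols = project(rows, cols, schemes[0]), tuple(schemes[0])
    for scheme in schemes[1:]:
        joined, joined_cols = natural_join(
            joined, joined_cols, project(rows, cols, scheme), tuple(scheme)
        )
    return project(joined, joined_cols, cols)

req1 = ("emp_id", "proj_no")
req2 = ("emp_id", "skill_type")
req3 = ("proj_no", "skill_type")

two_way = rejoin(skill_required, COLS, req1, req3)
lossy = rejoin(skill_required, COLS, req1, req2)
three_way = rejoin(skill_required, COLS, req1, req2, req3)

print(two_way == skill_required)       # lossless: the join returns the original rows
print(sorted(lossy - skill_required))  # spurious rows created by the lossy join
print(three_way == skill_required)     # lossless three-way decomposition
```

Joining skill_req1 with skill_req2 produces the two spurious rows (101, 3, C) and (101, 4, B), which is what makes that decomposition lossy; the other two joins return exactly the original seven rows.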
Tables in 4NF avoid certain update anomalies (or inefficiencies). For
instance, a delete anomaly exists when two independent facts get tied
together unnaturally so that there may be bad side effects of certain
deletes. For example, in skill_required, the last row of a skill_type may
be lost if an employee is temporarily not working on any projects. An
update inefficiency may occur when adding a new project in
skill_required, which requires insertions for many rows to include all
the required skills for that new project. Likewise, loss of a project
requires many deletions. These inefficiencies are avoided when
Table 6.5 The Table skill_required and Its Three Projections

skill_required
emp_id  proj_no  skill_type    MVDs (nontrivial)
101     3        A             proj_no ->> skill_type
101     3        B             proj_no ->> emp_id
101     4        A
101     4        C
102     3        A
102     3        B
103     5        D

skill_req1          skill_req2            skill_req3
emp_id  proj_no     emp_id  skill_type    proj_no  skill_type
101     3           101     A             3        A
101     4           101     B             3        B
102     3           101     C             4        A
103     5           102     A             4        C
                    102     B             5        D
                    103     D
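The delete anomaly and the insertion cost described for skill_required can be seen directly on the rows of Table 6.5. The sketch below is illustrative only: the function names are assumptions, and project 6 with its employees and skills is hypothetical data invented for the example.

```python
# Rows of skill_required as listed in Table 6.5: (emp_id, proj_no, skill_type).
skill_required = [
    (101, 3, "A"), (101, 3, "B"), (101, 4, "A"), (101, 4, "C"),
    (102, 3, "A"), (102, 3, "B"), (103, 5, "D"),
]

def skills_recorded(rows):
    """Skill types that appear in at least one row."""
    return {skill for _, _, skill in rows}

# Delete anomaly: employee 103 temporarily stops working on any project,
# so the only row for that employee is deleted.
after_delete = [row for row in skill_required if row[0] != 103]
lost_skills = skills_recorded(skill_required) - skills_recorded(after_delete)
print(lost_skills)  # skill types no longer recorded anywhere in the table

def rows_for_new_project(proj_no, employees, skills):
    """Rows that must be inserted so that every employee on the project
    is paired with every skill the project requires."""
    return [(emp, proj_no, skill) for emp in employees for skill in skills]

# Hypothetical project 6 with three employees and three required skills.
new_rows = rows_for_new_project(6, [101, 102, 103], ["A", "B", "C"])
print(len(new_rows))  # one row per (employee, skill) pair
```

Deleting the single row for employee 103 removes skill type D from the table altogether, and the hypothetical project needs nine insertions, one for every employee and skill combination.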