Database Modeling & Design Fourth Edition- P30 docx

132 CHAPTER 6 Normalization
skill_required is decomposed into skill_req1 and skill_req3. In
general (but not always), decomposition of a table into 4NF tables results
in less data redundancy.
6.5.3 Decomposing Tables to 4NF
Algorithms to decompose tables into 4NF are difficult to develop. Let’s
look at some straightforward approaches to 4NF from BCNF and lower
normal forms. First, if a table is BCNF, it either has no FDs, or each FD is
characterized by its left side being a superkey. Thus, if the only MVDs in
this table are derived from its FDs, they have only superkeys as their left
sides, and the table is 4NF by definition. If, however, there are other
nontrivial MVDs whose left sides are not superkeys, the table is only in
BCNF and must be decomposed to achieve higher normalization.
The basic decomposition process from a BCNF table consists of
selecting the most important MVD (or, if that is not possible,
selecting one arbitrarily), defining its complement MVD, and decomposing
the table into two tables containing the attributes on the left and
right sides of that MVD and its complement. This type of decomposition
is lossless because each new table is based on the same attribute, which
is the left side of both MVDs. The same MVDs in these new tables are
now trivial because they contain every attribute in the table. However,
other MVDs may be still present, and more decompositions by MVDs
and their complements may be necessary. This process of arbitrary selec-
tion of MVDs for decomposition is continued until only trivial MVDs
exist, leaving the final tables in 4NF.
As an example, let R(A,B,C,D,E,F) be a table with no FDs and with MVDs
A ->> B and CD ->> EF. The first decomposition of R is into two tables,
R1(A,B) and R2(A,C,D,E,F), by applying the MVD A ->> B and its complement
A ->> CDEF. Table R1 is now 4NF, because A ->> B is trivial and is the
only MVD in the table. Table R2, however, is still only BCNF, because of
the nontrivial MVD CD ->> EF. We then decompose R2 into R21(C,D,E,F)
and R22(C,D,A) by applying the MVD CD ->> EF and its complement
CD ->> A. Both R21 and R22 are now 4NF. If we had applied the MVD
complement rule in the opposite order, using CD ->> EF and its
complement CD ->> AB first, the same three 4NF tables would result from
this method. However, this does not occur in all cases; it only occurs
in those tables where the MVDs have no intersecting attributes.
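The example can be traced with a short Python sketch. It is an illustration
only, not an algorithm given in the text: the function names split_on_mvd and
decompose are hypothetical, the table is assumed to have no FDs (so no
superkey test is made), and only the MVDs listed explicitly are considered,
not MVDs that could be derived from them.

```python
def split_on_mvd(attrs, lhs, rhs):
    """Split an attribute set on the MVD lhs ->> rhs and its complement.

    Returns (lhs + rhs, lhs + complement), where the complement is every
    attribute of the table that is in neither lhs nor rhs.
    """
    attrs, lhs, rhs = frozenset(attrs), frozenset(lhs), frozenset(rhs)
    complement = attrs - lhs - rhs
    return lhs | rhs, lhs | complement


def decompose(attrs, mvds):
    """Split repeatedly until every listed MVD is trivial or inapplicable.

    mvds is a list of (lhs, rhs) pairs. An MVD is applied to a table only
    if all of its attributes lie inside the table and lhs + rhs does not
    already cover the whole table (that is, it is nontrivial there).
    """
    tables = [frozenset(attrs)]
    changed = True
    while changed:
        changed = False
        for t in tables:
            for lhs, rhs in mvds:
                lhs, rhs = frozenset(lhs), frozenset(rhs)
                if lhs | rhs <= t and lhs | rhs != t and not rhs <= lhs:
                    tables.remove(t)
                    tables.extend(split_on_mvd(t, lhs, rhs))
                    changed = True
                    break
            if changed:
                break  # restart the scan, since the table list was modified
    return set(tables)


# R(A,B,C,D,E,F) with MVDs A ->> B and CD ->> EF, applied in both orders.
first = decompose("ABCDEF", [("A", "B"), ("CD", "EF")])
second = decompose("ABCDEF", [("CD", "EF"), ("A", "B")])
print(sorted("".join(sorted(t)) for t in first))
print(first == second)
```

Both orders end with the attribute sets of R1(A,B), R21(C,D,E,F), and
R22(C,D,A), which agrees with the text because the two MVDs share no
attributes.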
This method, in general, has the unfortunate side effect of potentially
losing some or all of the FDs and MVDs. Therefore, any decision to
transform tables from BCNF to 4NF must weigh the benefits of further
normalization, namely the elimination of delete anomalies, against the
preservation of FDs and possibly MVDs. It should also be noted that this
approach derives a feasible, but not necessarily a minimum, set of 4NF
tables.
Teorey.book Page 132 Saturday, July 16, 2005 12:57 PM
6.5 Fourth and Fifth Normal Forms 133
A second approach to decomposing BCNF tables is to ignore the
MVDs completely and split each BCNF table into a set of smaller tables,
with the candidate key of each BCNF table being the candidate key of a
new table and the nonkey attributes distributed among the new tables in
some semantically meaningful way. This form of decomposing by candi-
date key (that is, superkey) is lossless because the candidate keys
uniquely join; it usually results in the simplest form of 5NF tables, those
with a candidate key and one nonkey attribute, and no MVDs. However,
if one or more MVDs still exist, further decomposition must be done
with the MVD/MVD-complement approach given above. The decompo-
sition by candidate keys preserves FDs, but the MVD/MVD-complement
approach does not preserve either FDs or MVDs.
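The claim that candidate keys "uniquely join" can be sketched in Python. The
rows below are invented for illustration (they are not from the text), and
the table is assumed to be R(A,B,C,D) with candidate key A. Because A is a
key, each projection maps a value of A to exactly one nonkey value, so a
dict can stand in for each projected table.

```python
# Hypothetical rows of R(A, B, C, D); A is assumed to be the candidate key.
R = {
    (1, "x", 10, True),
    (2, "x", 20, False),
    (3, "y", 10, True),
}

# Projections onto (A,B), (A,C), and (A,D). Each dict is keyed by A.
a_to_b = {a: b for a, b, c, d in R}
a_to_c = {a: c for a, b, c, d in R}
a_to_d = {a: d for a, b, c, d in R}

# Natural join on A: each key value matches exactly one row per projection,
# so no spurious combinations can be produced.
rejoined = {(a, a_to_b[a], a_to_c[a], a_to_d[a]) for a in a_to_b}
print(rejoined == R)

# Contrast: splitting on the nonkey attribute B does not join uniquely.
ab = {(a, b) for a, b, c, d in R}
bc = {(b, c) for a, b, c, d in R}
abc = {(a, b, c) for a, b, c, d in R}
abc_join = {(a, b, c) for a, b in ab for b2, c in bc if b == b2}
print(sorted(abc_join - abc))  # spurious rows introduced by the join on B
```

The second half is included only as a contrast: joining on B pairs every A
that has B = "x" with every C that has B = "x", which is how spurious rows
arise when the shared attribute is not a key.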
Tables that are not yet in BCNF can also be directly decomposed into
4NF using the MVD/MVD-complement approach. Such tables can often
be decomposed into smaller minimum sets than those derived from
transforming into BCNF first and then 4NF, but with a greater cost of
lost FDs. In most database design situations, it is preferable to develop
BCNF tables first, then evaluate the need to normalize further while
preserving the FDs.
6.5.4 Fifth Normal Form
Definition. A table R is in fifth normal form (5NF) or project-join nor-
mal form (PJ/NF) if and only if every join dependency in R is
implied by the keys of R.
As we recall, a lossless decomposition of a table implies that it can be
decomposed by two or more projections, followed by a natural join of
those projections (in any order) that results in the original table, without
any spurious or missing rows. The general lossless decomposition con-
straint, involving any number of projections, is also known as a join
dependency (JD). A join dependency is illustrated by the following exam-
ple: in a table R with n arbitrary subsets of the set of attributes of R, R
satisfies a join dependency over these n subsets if and only if R is equal
to the natural join of its projections on them. A JD is trivial if one of the
subsets is R itself.
5NF or PJ/NF requires satisfaction of the membership algorithm
[Fagin, 1979], which determines whether a JD is a member of the set of
logical consequences of (can be derived from) the set of key dependen-
cies known for this table. In effect, for any 5NF table, every dependency
(FD, MVD, JD) is determined by the keys. As a practical matter we note
that because JDs are very difficult to determine in large databases with
many attributes, 5NF tables are not easily derivable, and logical database
design typically produces BCNF tables.
We should also note that by the preceding definitions, just because a
table is decomposable does not necessarily mean it is not 5NF. For
example, consider a simple table with four attributes (A,B,C,D), one FD
(A->BCD), and no MVDs or JDs not implied by this FD. It could be
decomposed into three tables, A->B, A->C, and A->D, all based on the same
superkey A; however, it is already in 5NF without the decomposition.
Thus, the decomposition is not required for normalization. On the other
hand, decomposition can be a useful tool in some instances for
performance improvement.

Table 6.6 The Table skill_in_common and Its Three Projections

skill_in_common
emp_id  proj_no  skill_type
101     3        A
101     3        B
101     4        A
101     4        B
102     3        A
102     3        B
103     3        A
103     4        A
103     5        A
103     5        C

skill_in_com1      skill_in_com2          skill_in_com3
emp_id  proj_no    emp_id  skill_type     proj_no  skill_type
101     3          101     A              3        A
101     4          101     B              3        B
102     3          102     A              4        A
103     3          102     B              4        B
103     4          103     A              5        A
103     5          103     C              5        C

The following example demonstrates that a table representing a ter-
nary relationship may not have any two-way lossless decompositions;
however, it may have a three-way lossless decomposition, which is
equivalent to three binary relationships, based on the three possible pro-
jections of this table. This situation occurs in the relationship skill-in-
common (Figure 6.6), which is defined as “The employee must apply the
intersection of his or her available skills with the skills needed to work
on certain projects.” In this example, skill-in-common is less restrictive
than skill-required because it allows an employee to work on a project
even if he or she does not have all the skills required for that project.
As Table 6.6 shows, the three projections of skill_in_common
result in a three-way lossless decomposition. There are no two-way loss-
less decompositions and no MVDs; thus, the table skill_in_common is
in 4NF.
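The claims about Table 6.6 can be checked mechanically. The following Python
sketch uses the rows of skill_in_common exactly as listed in Table 6.6; the
variable names are chosen here for illustration. It joins the projections
pairwise and then all three together, comparing each result with the
original table.

```python
# Rows of skill_in_common as (emp_id, proj_no, skill_type), from Table 6.6.
skill_in_common = {
    (101, 3, "A"), (101, 3, "B"), (101, 4, "A"), (101, 4, "B"),
    (102, 3, "A"), (102, 3, "B"),
    (103, 3, "A"), (103, 4, "A"), (103, 5, "A"), (103, 5, "C"),
}

# The three binary projections.
emp_proj = {(e, p) for e, p, s in skill_in_common}    # skill_in_com1
emp_skill = {(e, s) for e, p, s in skill_in_common}   # skill_in_com2
proj_skill = {(p, s) for e, p, s in skill_in_common}  # skill_in_com3

# Two-way natural joins, each on the single shared attribute.
join_12 = {(e, p, s) for e, p in emp_proj for e2, s in emp_skill if e == e2}
join_13 = {(e, p, s) for e, p in emp_proj for p2, s in proj_skill if p == p2}
join_23 = {(e, p, s) for e, s in emp_skill for p, s2 in proj_skill if s == s2}

# Three-way natural join: keep rows of join_12 whose (proj_no, skill_type)
# pair also appears in the third projection.
join_123 = {row for row in join_12 if (row[1], row[2]) in proj_skill}

for name, joined in [("1-2", join_12), ("1-3", join_13), ("2-3", join_23)]:
    print(name, "lossless" if joined == skill_in_common else "lossy")
print("1-2-3", "lossless" if join_123 == skill_in_common else "lossy")
```

Each two-way join contains rows that are not in skill_in_common, while the
three-way join reproduces the table exactly, as the text states.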
The ternary relationship in Figure 6.6 can be interpreted yet another
way. The meaning of the relationship skill-used is “We can selectively
record different skills that each employee applies to working on individ-
ual projects.” It is equivalent to a table in 5NF that cannot be decom-
posed into either two or three binary tables. Note by studying Table 6.7
that the associated table, skill_used, has no MVDs or JDs.
Table 6.7 The Table skill_used, Its Three Projections, and Natural Joins of
Its Projections

skill_used
emp_id  proj_no  skill_type
101     3        A
101     3        B
101     4        A
101     4        C
102     3        A
102     3        B
102     4        A
102     4        B
A table may have constraints that are FDs, MVDs, and JDs. An MVD
is a special case of a JD. To determine the level of normalization of the
table, analyze the FDs first to determine normalization through BCNF;
then analyze the MVDs to determine which BCNF tables are also 4NF;
then, finally, analyze the JDs to determine which 4NF tables are also
5NF.
Table 6.7 The Table skill_used, Its Three Projections, and Natural Joins of
Its Projections (continued)

Three projections on skill_used result in:

skill_used1        skill_used2            skill_used3
emp_id  proj_no    proj_no  skill_type    emp_id  skill_type
101     3          3        A             101     A
101     4          3        B             101     B
102     3          4        A             101     C
102     4          4        B             102     A
                   4        C             102     B

Join skill_used1 with skill_used2 to form skill_used_12; then join
skill_used_12 with skill_used3 to form skill_used_123:

skill_used_12                   skill_used_123
emp_id  proj_no  skill_type     emp_id  proj_no  skill_type
101     3        A              101     3        A
101     3        B              101     3        B
101     4        A              101     4        A
101     4        B              101     4        B  (spurious)
101     4        C              101     4        C
102     3        A              102     3        A
102     3        B              102     3        B
102     4        A              102     4        A
102     4        B              102     4        B
102     4        C
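
The joins in Table 6.7 can be reproduced with a small, general test for a
join dependency. This is a sketch under stated assumptions: the helper names
project, natural_join, and spurious_rows are hypothetical, and rows are
represented as frozensets of (attribute, value) pairs so that projections of
different widths can be joined uniformly. Joining projections of a table can
never lose a row of that table, so it is enough to look for spurious rows.

```python
from functools import reduce

ATTRS = ("emp_id", "proj_no", "skill_type")

# Rows of skill_used, from Table 6.7.
skill_used = {
    (101, 3, "A"), (101, 3, "B"), (101, 4, "A"), (101, 4, "C"),
    (102, 3, "A"), (102, 3, "B"), (102, 4, "A"), (102, 4, "B"),
}
table = {frozenset(zip(ATTRS, t)) for t in skill_used}


def project(rows, attrs):
    """Project rows onto attrs; duplicates collapse because rows form a set."""
    return {frozenset(pair for pair in r if pair[0] in attrs) for r in rows}


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attributes."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                joined.add(frozenset({**ld, **rd}.items()))
    return joined


def spurious_rows(rows, components):
    """Rows produced by joining the projections that are not in rows.

    An empty result means the join dependency over components holds.
    """
    joined = reduce(natural_join, (project(rows, c) for c in components))
    return joined - rows


ep, ps, es = ({"emp_id", "proj_no"}, {"proj_no", "skill_type"},
              {"emp_id", "skill_type"})
for components in ([ep, ps], [ep, es], [ps, es], [ep, ps, es]):
    extra = spurious_rows(table, components)
    print(len(components), "projections:", len(extra), "spurious row(s)")
```

Every two-way and three-way combination of the binary projections yields at
least one spurious row, which is consistent with the statement that
skill_used has no MVDs or JDs. The three-way join yields exactly the row
marked (spurious) in Table 6.7.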