COP 4710: Database Systems (Day 10) Page 1 Mark Llewellyn
COP 4710: Database Systems
Spring 2004
Introduction to Normalization
Lesson 10, ½ day
School of Electrical Engineering and Computer Science
University of Central Florida
Instructor : Mark Llewellyn

CC1 211, 823-2790


If R is a relational schema with attributes $A_1, A_2, \ldots, A_n$
and a set of functional dependencies F, where
$X \subseteq \{A_1, A_2, \ldots, A_n\}$, then X is a key of R if:
1. $X \to A_1 A_2 \cdots A_n \in F^+$, and
2. no proper subset $Y \subset X$ gives $Y \to A_1 A_2 \cdots A_n \in F^+$.

Basically, this definition means that you must attempt to
generate the closure of all possible subsets of the schema
of R and determine which sets produce all of the
attributes in the schema.
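
The closure test behind this definition can be sketched in Python. This is a minimal illustration rather than part of the original slides: the function names `closure` and `is_key`, the representation of a dependency as a pair of sets, and the three-attribute schema in the usage lines are all assumptions made for the example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of attrs under fds (a list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire a dependency once its left side is covered by the result.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_key(x, schema, fds):
    """Condition 1: x determines the whole schema.
    Condition 2: no proper subset of x does."""
    x, schema = set(x), set(schema)
    if closure(x, fds) != schema:
        return False
    return all(closure(set(y), fds) != schema
               for size in range(len(x))
               for y in combinations(sorted(x), size))

# Hypothetical schema R(A, B, C) with F = {A -> B, B -> C}.
schema = {"A", "B", "C"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(closure({"A"}, fds) == schema)     # True
print(is_key({"A"}, schema, fds))        # True: A alone determines everything
print(is_key({"A", "B"}, schema, fds))   # False: a superkey, but not minimal
```

The loop in `closure` repeats until no dependency adds a new attribute, so it terminates after at most as many passes as there are attributes.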



Let r = (C, T, H, R, S, G) with
F = {C → T, HR → C, HT → R, CS → G, HS → R}

Step 1: Generate $(A_i)^+$ for $1 \le i \le n$
$C^+ = \{C, T\}$, $T^+ = \{T\}$, $H^+ = \{H\}$
$R^+ = \{R\}$, $S^+ = \{S\}$, $G^+ = \{G\}$
No single attribute is a key for R.
Number of cases: $\binom{6}{1} = \frac{6!}{1! \times (6-1)!} = \frac{720}{120} = 6$

Step 2: Generate $(A_i A_j)^+$ for $1 \le i < j \le n$
$(CT)^+ = \{C, T\}$, $(CH)^+ = \{C, H, T, R\}$, $(CR)^+ = \{C, R, T\}$
$(CS)^+ = \{C, S, G, T\}$, $(CG)^+ = \{C, G, T\}$, $(TH)^+ = \{T, H, R, C\}$
$(TR)^+ = \{T, R\}$, $(TS)^+ = \{T, S\}$, $(TG)^+ = \{T, G\}$
$(HR)^+ = \{H, R, C, T\}$, $(HS)^+ = \{H, S, R, C, T, G\}$, $(HG)^+ = \{H, G\}$
$(RS)^+ = \{R, S\}$, $(RG)^+ = \{R, G\}$, $(SG)^+ = \{S, G\}$
The attribute set (HS) is a key for R.
Number of cases: $\binom{6}{2} = \frac{6!}{2! \times (6-2)!} = \frac{720}{48} = 15$

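Step 3: Generate $(A_i A_j A_k)^+$ for $1 \le i < j < k \le n$
$(CTH)^+ = \{C, T, H, R\}$, $(CTR)^+ = \{C, T, R\}$
$(CTS)^+ = \{C, T, S, G\}$, $(CTG)^+ = \{C, T, G\}$
$(CHR)^+ = \{C, H, R, T\}$, $(CHS)^+ = \{C, H, S, T, R, G\}$
$(CHG)^+ = \{C, H, G, T, R\}$, $(CRS)^+ = \{C, R, S, T, G\}$
$(CRG)^+ = \{C, R, G, T\}$, $(CSG)^+ = \{C, S, G, T\}$
$(THR)^+ = \{T, H, R, C\}$, $(THS)^+ = \{T, H, S, R, C, G\}$
$(THG)^+ = \{T, H, G, R, C\}$, $(TRS)^+ = \{T, R, S\}$
$(TRG)^+ = \{T, R, G\}$, $(TSG)^+ = \{T, S, G\}$
$(HRS)^+ = \{H, R, S, C, T, G\}$, $(HRG)^+ = \{H, R, G, C, T\}$
$(HSG)^+ = \{H, S, G, R, C, T\}$, $(RSG)^+ = \{R, S, G\}$
Superkeys (closure contains all six attributes): (CHS), (THS), (HRS), (HSG).
Number of cases: $\binom{6}{3} = \frac{6!}{3! \times (6-3)!} = \frac{720}{36} = 20$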

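Step 4: Generate $(A_i A_j A_k A_r)^+$ for $1 \le i < j < k < r \le n$
$(CTHR)^+ = \{C, T, H, R\}$, $(CTHS)^+ = \{C, T, H, S, R, G\}$
$(CTHG)^+ = \{C, T, H, G, R\}$, $(CHRS)^+ = \{C, H, R, S, T, G\}$
$(CHRG)^+ = \{C, H, R, G, T\}$, $(CRSG)^+ = \{C, R, S, G, T\}$
$(THRS)^+ = \{T, H, R, S, C, G\}$, $(THRG)^+ = \{T, H, R, G, C\}$
$(TRSG)^+ = \{T, R, S, G\}$, $(HRSG)^+ = \{H, R, S, G, C, T\}$
$(CTRS)^+ = \{C, T, R, S, G\}$, $(CTSG)^+ = \{C, T, S, G\}$
$(CSHG)^+ = \{C, S, H, G, T, R\}$, $(THSG)^+ = \{T, H, S, G, R, C\}$
$(CTRG)^+ = \{C, T, R, G\}$
Superkeys (closure contains all six attributes): (CTHS), (CHRS), (THRS), (HRSG), (CSHG), (THSG).
Number of cases: $\binom{6}{4} = \frac{6!}{4! \times (6-4)!} = \frac{720}{48} = 15$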

Step 5: Generate $(A_i A_j A_k A_r A_s)^+$ for $1 \le i < j < k < r < s \le n$
$(CTHRS)^+ = \{C, T, H, S, R, G\}$
$(CTHRG)^+ = \{C, T, H, G, R\}$
$(CTHSG)^+ = \{C, T, H, S, G, R\}$
$(CHRSG)^+ = \{C, H, R, S, G, T\}$
$(CTRSG)^+ = \{C, T, R, S, G\}$
$(THRSG)^+ = \{T, H, R, S, G, C\}$
Superkeys (closure contains all six attributes): (CTHRS), (CTHSG), (CHRSG), (THRSG).
Number of cases: $\binom{6}{5} = \frac{6!}{5! \times (6-5)!} = \frac{720}{120} = 6$

Step 6: Generate $(A_i A_j A_k A_r A_s A_t)^+$ for $1 \le i < j < k < r < s < t \le n$
$(CTHRSG)^+ = \{C, T, H, S, R, G\}$
Superkeys (closure contains all six attributes): (CTHRSG).
Number of cases: $\binom{6}{6} = \frac{6!}{6! \times (6-6)!} = \frac{720}{720} = 1$

In general, for 6 attributes we have:
$\binom{6}{1} + \binom{6}{2} + \binom{6}{3} + \binom{6}{4} + \binom{6}{5} + \binom{6}{6} = 6 + 15 + 20 + 15 + 6 + 1 = 63$ cases
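
The six steps amount to a brute-force search over every non-empty subset of the schema. The sketch below repeats that search for the example schema (C, T, H, R, S, G) and its set F. The variable names and the encoding of each dependency as a pair of strings are assumptions made for illustration, not something taken from the slides.

```python
from itertools import combinations
from math import comb

SCHEMA = "CTHRSG"
# F = {C -> T, HR -> C, HT -> R, CS -> G, HS -> R}, as (lhs, rhs) strings.
FDS = [("C", "T"), ("HR", "C"), ("HT", "R"), ("CS", "G"), ("HS", "R")]

def closure(attrs):
    """Attribute closure of attrs under FDS."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in FDS:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

superkeys = []
cases = 0
for size in range(1, len(SCHEMA) + 1):
    subsets = list(combinations(SCHEMA, size))
    # Each step examines "6 choose size" subsets.
    assert len(subsets) == comb(len(SCHEMA), size)
    cases += len(subsets)
    superkeys += [set(s) for s in subsets if closure(s) == set(SCHEMA)]

# A key is a superkey that has no other superkey as a proper subset.
keys = [k for k in superkeys if not any(other < k for other in superkeys)]

print(cases)                                 # 63
print(len(superkeys))                        # 16
print(["".join(sorted(k)) for k in keys])    # ['HS']
```

Every superkey found contains both H and S, which is expected: neither attribute appears on the right-hand side of any dependency in F, so neither can be derived from other attributes.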








 !"#"$"%&'!→#"#→$(
COP 4710: Database Systems (Day 10) Page 8 Mark Llewellyn



Normalization is a formal technique for analyzing
relations based on the primary key (or candidate key)
attributes and functional dependencies.

The technique involves a series of rules that can be used
to test individual relations so that a database can be
normalized to any degree.

When a requirement is not met, the relation violating the
requirement is decomposed into a set of relations that
individually meet the requirements of normalization.

Normalization is often executed as a series of steps. Each
step corresponds to a specific normal form that has
known properties.
)*#

COP 4710: Database Systems (Day 10) Page 9 Mark Llewellyn

#+)
),)
,)
-)
.)
#$)
/)
0)
1)



For the relational model it is important to recognize that
it is only first normal form (1NF) that is critical in
creating relations. All the subsequent normal forms are
optional.

However, to avoid the update anomalies that we
discussed earlier, it is normally recommended that the
database designer proceed to at least 3NF.

As the figure on the previous page illustrates, some 1NF
relations are also in 2NF and some 2NF relations are also
in 3NF, and so on.

As we proceed, we’ll look at the requirements for each
normal form and a decomposition technique to achieve
relation schemas in that normal form.
)*23
COP 4710: Database Systems (Day 10) Page 11 Mark Llewellyn


Non-first normal form (N1NF) relations are those relations in
which one or more of the attributes are non-atomic. In
other words, within a relation and within a single tuple
there is a multi-valued attribute.

There are several important extensions to the relational
model in which N1NF relations are utilized. For the
most part these go beyond the scope of this course and
we will not discuss them in any significant detail.
Temporal relational databases and certain categories of
spatial databases fall into the N1NF category.
)) ),)%
COP 4710: Database Systems (Day 10) Page 12 Mark Llewellyn


A relation in which every attribute value is atomic is in
1NF.

We have only considered 1NF relations for the most part
in this course.

When dealing with multi-valued attributes at the
conceptual level, recall that the conversion into the
relational model created a separate table for the
multi-valued attribute. (See Day 6, Pages 8-10)
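
As a small illustration of that conversion, the sketch below moves a multi-valued attribute into its own table. The employee data, the attribute names, and the choice of plain Python lists to stand in for tables are all hypothetical and do not come from the course material.

```python
# Hypothetical N1NF data: each employee row carries a list of phone numbers.
employees = [
    {"emp_id": 1, "name": "Ann", "phones": ["555-0100", "555-0101"]},
    {"emp_id": 2, "name": "Bob", "phones": ["555-0102"]},
]

# 1NF: keep only the atomic attributes in the base table ...
employee = [{"emp_id": e["emp_id"], "name": e["name"]} for e in employees]

# ... and give the multi-valued attribute its own table, keyed by (emp_id, phone).
employee_phone = [(e["emp_id"], p) for e in employees for p in e["phones"]]

print(employee)
print(employee_phone)
# [(1, '555-0100'), (1, '555-0101'), (2, '555-0102')]
```

Every value in both resulting tables is atomic, so both satisfy the 1NF definition above.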
) ,)%
COP 4710: Database Systems (Day 10) Page 13 Mark Llewellyn


A key is a superkey with the additional property that the
removal of any attribute from the key will cause it to no longer
be a superkey. In other words, the key is minimal in the
number of attributes.

The candidate keys for a relation are the minimal keys of the
relation schema.

The primary key for a relation is a selected candidate key. All
of the remaining candidate keys (if any) become secondary
keys.

A prime attribute is any attribute of the schema of a relation R
that is a member of any candidate key of R.

A non-prime attribute is any attribute of R which is not a
member of any candidate key.
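
The prime and non-prime attributes follow directly from the candidate keys. The sketch below uses the schema (C, T, H, R, S, G), whose only key was found to be HS. The function names are hypothetical, chosen for this illustration.

```python
def prime_attributes(candidate_keys):
    """Every attribute that appears in at least one candidate key."""
    return set().union(*candidate_keys)

def non_prime_attributes(schema, candidate_keys):
    """Every attribute of the schema that appears in no candidate key."""
    return set(schema) - prime_attributes(candidate_keys)

schema = set("CTHRSG")
candidate_keys = [{"H", "S"}]   # the only key of the example schema

print(sorted(prime_attributes(candidate_keys)))               # ['H', 'S']
print(sorted(non_prime_attributes(schema, candidate_keys)))   # ['C', 'G', 'R', 'T']
```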
!4
COP 4710: Database Systems (Day 10) Page 14 Mark Llewellyn


Second normal form (2NF) is based on the concept
of a full functional dependency.

A functional dependency X → Y is a full functional
dependency if the removal of any attribute A from X
causes the fd to no longer hold:
for every attribute $A \in X$, $X - \{A\} \not\to Y$

A functional dependency X → Y is a partial
functional dependency if some attribute A can be
removed from X and the fd still holds:
for some attribute $A \in X$, $X - \{A\} \to Y$
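
Both tests reduce to attribute closures, as the following sketch shows. The helper names (`closure`, `holds`, `is_full`, `is_partial`) and the two-dependency set F = {AD → P, A → G} are assumptions chosen for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def holds(lhs, rhs, fds):
    """X -> Y holds when Y is contained in the closure of X."""
    return set(rhs) <= closure(lhs, fds)

def is_full(lhs, rhs, fds):
    """X -> Y holds and X - {A} -> Y fails for every A in X."""
    lhs = set(lhs)
    return holds(lhs, rhs, fds) and all(
        not holds(lhs - {a}, rhs, fds) for a in lhs)

def is_partial(lhs, rhs, fds):
    """X -> Y holds and X - {A} -> Y still holds for some A in X."""
    lhs = set(lhs)
    return holds(lhs, rhs, fds) and any(
        holds(lhs - {a}, rhs, fds) for a in lhs)

fds = [("AD", "P"), ("A", "G")]      # illustrative dependency set
print(is_full("AD", "P", fds))       # True
print(is_partial("AD", "G", fds))    # True: A alone already determines G
print(is_full("AD", "G", fds))       # False
```

Removing one attribute at a time is enough: if any proper subset of X determines Y, then so does some subset that omits exactly one attribute of X.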
) -)%
COP 4710: Database Systems (Day 10) Page 15 Mark Llewellyn


A relation scheme R is in 2NF with respect to a set
of functional dependencies F if every non-prime
attribute is fully dependent on every key of R.

Another way of stating this is: there does not exist a
non-prime attribute which is partially dependent on
any key of R. In other words, no non-prime attribute
is dependent on only a portion of the key of R.
5)
-)%
COP 4710: Database Systems (Day 10) Page 16 Mark Llewellyn

Given R = (A, D, P, G), F = {AD → PG, A → G} and
K = {AD}
Then R is not in 2NF because G is partially dependent on
the key AD since AD → G yet A → G.
Decompose R into:
R1 = (A, D, P) R2 = (A, G)
K1 = {AD} K2 = {A}
F1 = {AD → P} F2 = {A → G}
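
The 2NF test on this example can be checked mechanically. The sketch below looks for non-prime attributes that are determined by a proper subset of a key, first in R and then in R1 and R2. The function name `violations_2nf` and the tuple encoding of each relation are assumptions made for illustration. The dependency sets F1 and F2 are taken as given above rather than computed by projection.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def violations_2nf(schema, fds, keys):
    """List (key part, attribute) pairs where a non-prime attribute
    depends on a proper subset of some key."""
    prime = set().union(*keys)
    found = []
    for key in keys:
        for size in range(len(key)):            # proper subsets only
            for part in combinations(sorted(key), size):
                determined = (closure(part, fds) - prime) & set(schema)
                found += [("".join(part), a) for a in sorted(determined)]
    return found

R  = ("ADPG", [("AD", "PG"), ("A", "G")], [set("AD")])
R1 = ("ADP",  [("AD", "P")],              [set("AD")])
R2 = ("AG",   [("A", "G")],               [set("A")])

print(violations_2nf(*R))    # [('A', 'G')]
print(violations_2nf(*R1))   # []
print(violations_2nf(*R2))   # []
```

The single violation reported for R is the partial dependency of G on the key AD through A, and neither decomposed relation has one.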
) -)%

×