Relational Database Design
LESSON: 2C
COLLABORATE
KNOWLEDGE BYTE
Collaborate Lesson 2C / Slide 1 of 25©NIIT
In this section, you will learn about the following:
• Domains
• Fourth Normal Form
• Other Normal Forms
Domains
A domain is a set of data values that are of the same type. In other words, a domain is a
pool of values from which the actual values for an attribute are drawn.
For example, consider a domain CITIES that contains the names of all cities. CUSTOMER
and PRODUCT are two tables, both of which have an attribute that holds the names of
cities. Attribute CITY in the CUSTOMER table holds the names of cities where the
customers are based. Attribute CITY in the PRODUCT table holds the names of the cities
where the products are manufactured. Both these attributes draw their values from the
CITIES domain.

Domains and Attributes
Every value in an attribute must be drawn from the underlying domain. However, not every
value in the domain needs to appear in the table. In the figure, some cities that are in the
domain do not appear in either attribute. Any of these values may appear in the attributes
later because they are valid values. Domains are primarily conceptual, and no current
RDBMS supports this concept in the complete sense.

A relation is a collection of domains and consists of two parts, a heading and a body. The
heading consists of a fixed set of attributes. The body consists of rows that may change
with time. The attributes of a relation are fixed. If an attribute is added to or removed
from a relation, the relation is no longer the same. Such a change implies that either a
new domain is added to the collection of existing domains, or a domain that originally
defined the relation no longer exists.
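The relationship between a domain and the attributes that draw from it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CITIES` set, the sample rows, and the helper `check_attribute` are hypothetical, and a real RDBMS would enforce such rules through data types and constraints rather than application code.

```python
# Hypothetical sketch: a domain modeled as a fixed pool of valid values.
CITIES = frozenset({
    "New York", "San Francisco", "Miami", "Dallas",
    "Las Vegas", "Washington", "Philadelphia",
})


def check_attribute(values, domain):
    """Return the values that are not drawn from the domain (empty if all are valid)."""
    return [v for v in values if v not in domain]


# CITY attribute of CUSTOMER and CITY attribute of PRODUCT (illustrative rows).
customer_city = ["New York", "Dallas"]
product_city = ["New York", "San Francisco", "Miami"]

# Every attribute value must come from the underlying domain.
assert check_attribute(customer_city, CITIES) == []
assert check_attribute(product_city, CITIES) == []

# Not every domain value has to appear in a table.
unused = CITIES - set(customer_city) - set(product_city)
print(sorted(unused))

# A value outside the domain is reported as invalid.
print(check_attribute(["Atlantis"], CITIES))
```

Both attributes share one domain, so a city that is valid for CUSTOMER is also valid for PRODUCT, even if it has not been used there yet.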
[Figure content: the CITY attribute of CUSTOMER (New York, Dallas, Texas) and the CITY attribute of PRODUCT (New York, San Francisco, Miami) both draw their values from DOMAIN: CITIES (Dallas, Miami, San Francisco, Las Vegas, Texas, New York, Washington, Disneyland, Hollywood, Philadelphia).]
Fourth Normal Form
If a relation has many-to-many relationships with two or more other relations, the
attributes of all these relations cannot be depicted in a single relation.
According to 1 NF, a relation cannot have multivalued attributes. This means that for one
value of an attribute there cannot be multiple values of another attribute. However, in the
real world, there might be situations where multivalued attributes are required. For
example, in a university, a faculty member is assigned multiple courses and committees.
Similarly, in a health insurance system, an employee has many dependents.
When you model such situations in a relational database, you will have either redundant
data or null values. For example, for the university database, the following FACULTY table
stores every combination of course and committee for a faculty member.
FACULTY
FACULTY-NAME  COURSE  COMMITTEE
Chris         125     Admissions
Chris         125     Examinations
Chris         126     Admissions
Chris         126     Examinations
Chris         127     Admissions
Chris         127     Examinations
Table with Multivalued Dependency
Such a condition that requires duplication of values and thus enforces mutual
independence of multivalued attributes is called multivalued dependency. Multivalued
dependency is a constraint on tables like functional dependency.
It is clear that multivalued dependency requires duplication of data values. Therefore, in
normalization, you need to remove multivalued dependencies. To do this, you can use the
fourth normal form (4 NF). A table is in 4 NF if it is in 3 NF and has no multivalued
dependencies. To apply 4 NF, you need to put all multivalued attributes in individual tables
containing the key to which the attribute values apply. The above table can be normalized
as follows:
FAC-COURSE
FACULTY-NAME  COURSE
Chris         125
Chris         126
Chris         127

FAC-COMMITTEE
FACULTY-NAME  COMMITTEE
Chris         Admissions
Chris         Examinations
4 NF Tables
Both the above tables are in 4 NF because the multivalued attributes, COURSE and
COMMITTEE, are now in tables by themselves. The primary key of each of the above
tables consists of both the attributes present in the table. This means that the primary key
of the FAC-COURSE table consists of FACULTY-NAME and COURSE. The primary key of the
FAC-COMMITTEE table consists of FACULTY-NAME and COMMITTEE.
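The decomposition can be checked with a small Python sketch that models each table as a set of tuples. The variable names are hypothetical and the join is written by hand for illustration. The point is only that joining the two projections on FACULTY-NAME gives back exactly the original FACULTY rows, so no information is lost.

```python
# FACULTY rows from the table with the multivalued dependency.
faculty = {
    ("Chris", 125, "Admissions"), ("Chris", 125, "Examinations"),
    ("Chris", 126, "Admissions"), ("Chris", 126, "Examinations"),
    ("Chris", 127, "Admissions"), ("Chris", 127, "Examinations"),
}

# Project onto (FACULTY-NAME, COURSE) and (FACULTY-NAME, COMMITTEE).
fac_course = {(name, course) for name, course, _ in faculty}
fac_committee = {(name, committee) for name, _, committee in faculty}

# Natural join of the two projections on FACULTY-NAME.
rejoined = {
    (name, course, committee)
    for name, course in fac_course
    for other_name, committee in fac_committee
    if name == other_name
}

# The join reproduces the original table, so the decomposition is lossless.
assert rejoined == faculty

# Row counts: original table, FAC-COURSE, FAC-COMMITTEE.
print(len(faculty), len(fac_course), len(fac_committee))  # prints: 6 3 2
```

The six redundant rows shrink to three course rows and two committee rows, and adding a new committee for Chris now takes one row instead of one row per course.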
Other Normal Forms
The normal forms 2 NF, 3 NF, and 4 NF are the results of functional dependency and
multivalued dependency. There are some other constraints like business rules that result in
the need for fifth normal form (5 NF). For a table to be in 5 NF, it has to be in 4 NF and
should abide by some business rules. The purpose of 5 NF is to have tables that cannot be
further decomposed. Let us now understand 5 NF with an example.
Consider the following table:
Department        Subject  Student
Mathematics       MA520    Ron Floyd
Computer Science  CS150    Nancy Drew
Computer Science  CS103    Charlie Burton
Computer Science  CS104    Ron Floyd
Chemistry         CH894    Chris Laurel
Physics           PH654    Mary Peterson

This table indicates that the Computer Science department offers the subjects CS150,
CS103, and CS104. These subjects are taken by different students. Here, the business rule
is that a subject cannot be taken by all the students and a student cannot take all the
subjects. The above table does not illustrate multivalued dependency because the columns
SUBJECT and STUDENT are not independent. Both these columns are related to each other
and have important information in them. Therefore, this table cannot be decomposed into
the following two tables without losing information about the subjects taken by a student.
DEPT-SUB (DEPARTMENT, SUBJECT)
DEPT-STUD (DEPARTMENT, STUDENT)
However, we can decompose the above table into the following three tables without losing
any information.

DEPT-SUB
Department        Subject
Mathematics       MA520
Computer Science  CS150
Computer Science  CS103
Computer Science  CS104
Chemistry         CH894
Physics           PH654

DEPT-STUD
Department        Student
Mathematics       Ron Floyd
Computer Science  Nancy Drew
Computer Science  Charlie Burton
Computer Science  Ron Floyd
Chemistry         Chris Laurel
Physics           Mary Peterson

SUB-STUD
Subject  Student
MA520    Ron Floyd
CS150    Nancy Drew
CS103    Charlie Burton
CS104    Ron Floyd
CH894    Chris Laurel
PH654    Mary Peterson
If the business rule does not exist, then there is no need for 5 NF.
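The difference between the two-table and the three-table decomposition can be verified with a short Python sketch over the sample rows. The set-based representation and the variable names are assumptions made for illustration. The sketch joins the projections back together and compares the result with the original table.

```python
# Rows of the (Department, Subject, Student) table.
rows = {
    ("Mathematics", "MA520", "Ron Floyd"),
    ("Computer Science", "CS150", "Nancy Drew"),
    ("Computer Science", "CS103", "Charlie Burton"),
    ("Computer Science", "CS104", "Ron Floyd"),
    ("Chemistry", "CH894", "Chris Laurel"),
    ("Physics", "PH654", "Mary Peterson"),
}

# The three projections: DEPT-SUB, DEPT-STUD, and SUB-STUD.
dept_sub = {(d, s) for d, s, _ in rows}
dept_stud = {(d, t) for d, _, t in rows}
sub_stud = {(s, t) for _, s, t in rows}

# Joining only DEPT-SUB and DEPT-STUD on Department pairs every subject of a
# department with every student of that department.
two_way = {
    (d, s, t)
    for d, s in dept_sub
    for d2, t in dept_stud
    if d == d2
}

# Adding SUB-STUD keeps only the subject and student pairs that really exist.
three_way = {(d, s, t) for d, s, t in two_way if (s, t) in sub_stud}

spurious = two_way - rows

assert three_way == rows
print(len(two_way), len(spurious))  # prints: 12 6
```

The two-table join invents six rows for Computer Science, such as Nancy Drew taking CS103, which is the information loss described above. The three-table join removes them again.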
In addition to 5 NF, there is another normal form called the domain-key normal form
(DKNF) that was proposed by Fagin. This normal form is based on the definition of keys
and attribute domains. A table is in DKNF if every constraint on the table is a result of the
definitions of domains and keys.
FROM THE EXPERT'S DESK
In this section, you will learn about best practices for the primary key, a tip on
generalization, and some FAQs on data models, E/R diagrams, and normalization.
Best Practices
Primary Key
Some best practices that you can follow for the primary key are as follows:
• The primary key should be numeric: You can easily generate a number for each record
  automatically by using the IDENTITY clause. This removes the risk of duplicate primary
  key values for two records. If you use an alphanumeric primary key, you have to assign
  the primary key to each record manually. Therefore, you run the risk of assigning the
  same primary key value to two records.
• The primary key should have only one column: Although multiple columns can constitute
  the primary key, it is better to have a single column as the primary key. If multiple
  columns make up the primary key, more space is required to store the key. In addition,
  to identify a record, you need to refer to a group of columns.
• The primary key should not change over time: A changing primary key makes it difficult
  to use historical data because the links to the old key values are destroyed.
• The primary key should be meaningless: The primary key should not have any meaning
  associated with it. This ensures that the primary key does not change over time. For
  example, consider that you assign employee codes in accordance with the locations of
  the employees, so that C1256 is the code of an employee based in Chicago. If the
  employee with code C1256 moves to Dallas, you will have to change the primary key
  value.
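The four practices can be seen together in a minimal Python sketch that uses the standard `sqlite3` module. Note the assumptions: SQLite has no IDENTITY clause, so its `INTEGER PRIMARY KEY` column, which the database fills in automatically, stands in for it here, and the `employee` table, its columns, and the sample names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,  -- numeric, single column, no business meaning
        name   TEXT NOT NULL,
        city   TEXT NOT NULL
    )
    """
)

# The database generates emp_id; no key value is assigned by hand.
cur = conn.execute("INSERT INTO employee (name, city) VALUES (?, ?)", ("Pat", "Chicago"))
pat_id = cur.lastrowid
cur = conn.execute("INSERT INTO employee (name, city) VALUES (?, ?)", ("Lee", "Dallas"))
lee_id = cur.lastrowid

# The employee moves from Chicago to Dallas: only the city changes, not the key.
conn.execute("UPDATE employee SET city = ? WHERE emp_id = ?", ("Dallas", pat_id))

row = conn.execute(
    "SELECT emp_id, city FROM employee WHERE emp_id = ?", (pat_id,)
).fetchone()
print(row)
```

Because the key carries no location code, any table that refers to `emp_id` keeps pointing at the same employee after the move.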
Tips
Converting an E/R Diagram that Includes Generalization to Tables
You can convert an E/R diagram that includes generalization to tables. The method for this
is to create a table for the higher-level entity. Also, create a table for each lower-level
entity. Each of these tables should include columns for all attributes of the lower-level
entity. These tables should also include columns for each attribute of the primary key of
the higher-level entity. To understand this, consider the following E/R diagram that uses
generalization.
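[Figure: E/R Diagram Using Generalization. ACCOUNT (ACC-NUMBER, BALANCE) is the higher-level entity of SAVING-ACCOUNT (INTEREST-RATE) and CURRENT-ACCOUNT (OVERDRAFT-LIMIT).]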
Here, ACCOUNT is the higher-level entity with primary key ACC-NUMBER, and SAVING-ACCOUNT
and CURRENT-ACCOUNT are lower-level entities. For this E/R diagram, there will
be three tables: ACCOUNT with columns ACC-NUMBER and BALANCE, SAVING-ACCOUNT
with columns ACC-NUMBER and INTEREST-RATE, and CURRENT-ACCOUNT with columns
ACC-NUMBER and OVERDRAFT-LIMIT.
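The three tables can be sketched with Python's standard `sqlite3` module. This is one possible rendering, not the only one: the underscore names replace the hyphenated names of the diagram, the column types are guesses, and the foreign key from each lower-level table to ACCOUNT is an added assumption that the text above does not require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE account (
        acc_number INTEGER PRIMARY KEY,
        balance    REAL NOT NULL
    );
    -- Each lower-level table carries the primary key of the higher-level entity.
    CREATE TABLE saving_account (
        acc_number    INTEGER PRIMARY KEY REFERENCES account (acc_number),
        interest_rate REAL NOT NULL
    );
    CREATE TABLE current_account (
        acc_number      INTEGER PRIMARY KEY REFERENCES account (acc_number),
        overdraft_limit REAL NOT NULL
    );
    """
)

conn.execute("INSERT INTO account (acc_number, balance) VALUES (?, ?)", (1, 500.0))
conn.execute(
    "INSERT INTO saving_account (acc_number, interest_rate) VALUES (?, ?)", (1, 3.5)
)

# A complete saving account is the join of the two tables on ACC-NUMBER.
saving = conn.execute(
    """
    SELECT a.acc_number, a.balance, s.interest_rate
    FROM account AS a
    JOIN saving_account AS s ON a.acc_number = s.acc_number
    """
).fetchall()

# A lower-level row without a matching ACCOUNT row is rejected.
try:
    conn.execute(
        "INSERT INTO current_account (acc_number, overdraft_limit) VALUES (?, ?)",
        (99, 1000.0),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(saving, orphan_rejected)
```

Attributes common to every account stay in ACCOUNT, so BALANCE is stored once, and each lower-level table holds only its distinguishing attribute.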
FAQs
Why do we need to create conceptual models when we can directly create relational
models?
As the number of entities, relationships, aggregations, and specializations increases,
so does the complexity of a data model. As the complexity of a data model increases,
it becomes difficult to develop the data model correctly. A database, however, is used
to provide important information to employees in an organization. Therefore, the
database structure should be free of errors. Although no data model can ensure a
completely accurate structure, the graphical approach of the conceptual model leads
to fewer errors than the textual approach of the relational model. This is the reason
why you need to create conceptual models before creating relational models.
Given a choice between 3NF and BCNF, which one should we choose?
An advantage of 3NF is that you can always create a 3NF design without losing
functional dependencies. The disadvantage of 3NF is that some information may be
repeated. However, if a choice is available between 3NF and BCNF, it is usually
preferable to use 3NF. This is because a BCNF design may not preserve every
functional dependency, and if a functional dependency cannot be tested properly,
system performance might be affected or data integrity might be lost. Therefore, to
avoid these risks, it is better to use 3NF. However, if these risks can be avoided, you
can use BCNF.
What is the difference between a weak entity and a subentity?
A weak entity depends upon a regular entity for its existence, whereas a subentity is
part of a regular entity. For example, an entity called Students is used to store all
student details. Most students are enrolled for a course, but some students have taken
a break and are currently not enrolled for a course. In this kind of scenario, we can
have a subentity called Break-Students, which stores details of all students who have
currently taken a break. Note that a subentity will contain all the columns of the
super entity from which it is derived. A weak entity, on the other hand, has attributes
that are different from those of the regular entity on which it is dependent.
What is the difference between a subtype and a supertype?
The difference between a subtype and a supertype entity is best understood with an
example. An entity called Student has two subentities Boarder and Day-scholar.
Here, the entity Student stores details about students like name, age, course, and
class and has student-id as the primary key. The Boarder subtype entity has a
distinguishing attribute Room-no, while the Day-scholar entity has a distinguishing
attribute Locker-no. Apart from the distinguishing attributes, the subtype entities
also contain the primary key of the super entity.
What does a fully normalized record consist of?
A fully normalized record consists of:
• A primary key that identifies an entity
• A set of attributes that describe the entity
CHALLENGE
1. The following statement has been extracted from a case
presented by a manufacturer regarding the maintenance of their
data: “A supplier ships certain parts”. Identify the entities
mentioned in this statement and their relationship. Draw an E/R
diagram depicting the relationship.
2. You have received a proposed table structure for the table
Position. After testing the table structure with some data, you
find that there is a problem in inserting, deleting, and modifying
data. You see that the table structure could lead to inconsistency
in data and is also occupying a lot of disk space. Modify the
given table structure to optimize data storage.
The table structure is as follows:

Position
  cPositionCode
  vDescription
  iBudgetedStrength
  siYear
  iCurrentStrength
  vSkill
Sample data for the table Position is as shown below:
cPositionCode  vDescription              iBudgetedStrength  iCurrentStrength  vSkill
0001           Sales Manager             100                82                Communication
0002           Marketing Manager         100                83                Presentation
0003           Financial Analyst         50                 30                Team Leading
0004           Training Co-ord           20                 15                Communication
0005           Database Analyst          10                 8                 Presentation
0006           Staff Accountant          20                 18                Convincing
0007           Audit Manager             20                 18                Planning
0008           Telephone Operator        20                 18                Communication
0009           Office Clerk              20                 18                Team Leading
0010           Legal Secretary           20                 18                MS-Office
0011           Administrative Assistant  20                 18                Interpersonal
0012           Senior Receptionist       20                 18                Communication
0013           Consultant                20                 18                Team Leading
0014           Maintenance Technician    20                 18                Presentation
0015           Receptionist              21                 18                Convincing
3. State whether true or false:
Attributes may acquire further attributes and become entities.
4. Each time the salary slip for an employee is generated, the
referral bonus (if present) has to be calculated and printed on
the salary slip. The following three tables are used for this
query: MonthlySalary, Employee, and EmployeeReferrals. The
table structures are as follows: