Relational Database Design
Lesson 3C: Collaborate
©NIIT

Knowledge Byte
In this section, you will learn about the following:
• Codd’s Twelve Rules
• Indexes
• Recovery from Deadlock
• Database Recovery
Codd’s Twelve Rules
Dr. E.F. Codd presented twelve rules that a database must obey if it is to be considered
truly relational. These rules come out of Codd’s theoretical work on the relational model.
No currently available RDBMS fully satisfies all twelve of Codd’s rules.
The rules stem from a single foundation rule, the zero rule:
For a system to qualify as a RELATIONAL DATABASE MANAGEMENT system, it must use
its RELATIONAL facilities to MANAGE the DATABASE.
The twelve rules are as follows:
1. The information rule: This rule simply requires all information to be represented
as data values in the rows and columns of tables. This is the basis of the relational
model.
2. The guaranteed access rule: Every data value in a relational database should be
logically accessible by specifying a combination of the table name, the primary key
value, and the column name.
3. Systematic treatment of NULL values: The DBMS must support NULL values to
represent missing or inapplicable information. They must be distinct from zero and
from spaces. The NULL values must be independent of data type, that is, NULL
values are the same for all types of data.
4. Active online catalog based on the relational model: An earlier session
introduced the system tables, also called the system catalog. The system catalog is
a collection of tables that the DBMS maintains for its own use. These tables hold the
description of the structure of the database. They are created, owned, and
maintained by the DBMS. Users can access them in the same manner as ordinary
tables, depending on the user’s privileges. For users, system tables are read-only.
5. The comprehensive data sublanguage rule: This rule states that the system
must support at least all the following functions:
a. Data definition
b. View definition
c. Data manipulation operations
d. Security and integrity constraints
e. Transaction management operations
6. The view updating rule: All views that are theoretically updatable must be
updatable by the system.
7. High-level insert, update, and delete: This rule states that rows should be
treated as sets in insert, delete, and update operations. It stresses the
set-oriented nature of the database. Just as the SELECT operation can deal with a set
of rows, the other operations that modify the database should also deal with sets,
and not only with single rows.
This rule excludes DBMSs that support only row-at-a-time modification of the
database. Therefore, the DBMS must allow multiple rows to be updated in one operation.
8. Physical data independence: Application programs must remain unimpaired
when any changes are made in storage representation or access methods.
9. Logical data independence: Changes to the logical structure of the database,
such as adding a column to a table, should not affect the user’s ability to work
with the data.
10. Integrity independence: Integrity constraints must be storable in the system
catalog. The concept of data integrity requires no further explanation.
11. Distribution independence: The database must allow manipulation of distributed
data located on other computer systems.
12. Nonsubversion rule: The nonsubversion rule states that different levels of the
language cannot subvert or bypass the integrity rules and constraints. For
example, assume there is an application program that takes care of the integrity of
data in a database. Another user can write another program, say in the C language,
which bypasses the integrity constraints imposed by the application program.
Such an integrity violation is unacceptable in an RDBMS. Therefore, integrity
constraints should be specified at the DBMS level, and not through application
programs only. The DBMS must ensure that no other level can bypass the
constraints specified to maintain the integrity of the database.
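Rules 3, 4, and 12 can be tried out in a few lines. The sketch below is a minimal illustration that uses Python's built-in sqlite3 module. The account table, its columns, and the try_bypass helper are assumptions made for this example and do not come from the lesson. SQLite is used only because it runs in memory; it is not claimed to satisfy all twelve rules.

```python
import sqlite3

# Hypothetical schema, used for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " acc_no TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0),"  # integrity rule kept by the DBMS
    " remarks TEXT)"
)
conn.execute("INSERT INTO account VALUES ('A1', 0, '')")
conn.execute("INSERT INTO account VALUES ('A2', NULL, NULL)")

# Rule 3: NULL is distinct from zero and from an empty string.
zero_rows = conn.execute(
    "SELECT acc_no FROM account WHERE balance = 0").fetchall()
null_rows = conn.execute(
    "SELECT acc_no FROM account WHERE balance IS NULL").fetchall()

# Rule 4: the catalog is itself a table that can be queried with SELECT.
catalog_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()


# Rule 12: any program that reaches the table is still bound by the constraint.
def try_bypass(connection):
    try:
        connection.execute(
            "UPDATE account SET balance = -50 WHERE acc_no = 'A1'")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


bypass_result = try_bypass(conn)
```

Because the CHECK constraint is declared in the table definition, the update is refused no matter which program issues it, which is the point of the nonsubversion rule.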
Indexes
A database query may reference only a portion of the records in a file. For example, the
query “Find all customers located in Boston” references only a small portion of the file that
stores customer records. If the system has to check this condition for each record, the
query will be inefficient. Therefore, there needs to be a method to access the
required records directly. For direct access to records, you can design additional structures
that can be linked with files. One such structure is an index. An index for a file is like a
catalog in a library.
There are two types of indexes:
• Primary: A primary index contains pointers that point directly to the file.
• Secondary: A secondary index contains pointers that do not point directly to the file.
Instead, each pointer points to a bucket, which, in turn, contains pointers to the
actual file. Secondary indexes improve the performance of queries that use keys
other than the primary key. However, they have a disadvantage: every change to the
database must also be applied to each secondary index.
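The difference between the two index types can be sketched with ordinary dictionaries. This is a simplified in-memory model, not how a DBMS stores indexes on disk. The customer records, the list positions standing in for file addresses, and the helper names are all assumptions made for the example.

```python
# Hypothetical customer file: list positions stand in for disk addresses.
customer_file = [
    {"cust_no": "C1001", "name": "Adams", "city": "Boston"},
    {"cust_no": "C1002", "name": "Baker", "city": "Denver"},
    {"cust_no": "C1003", "name": "Clark", "city": "Boston"},
]

# Primary index: each key points directly at one record in the file.
primary_index = {rec["cust_no"]: pos for pos, rec in enumerate(customer_file)}

# Secondary index on city: each key points at a bucket,
# and the bucket holds the pointers into the file.
secondary_index = {}
for pos, rec in enumerate(customer_file):
    secondary_index.setdefault(rec["city"], []).append(pos)


def customers_in(city):
    """Follow key -> bucket -> file without scanning every record."""
    return [customer_file[pos]["name"] for pos in secondary_index.get(city, [])]


def add_customer(rec):
    """Every insert has to maintain both indexes as well as the file."""
    customer_file.append(rec)
    pos = len(customer_file) - 1
    primary_index[rec["cust_no"]] = pos
    secondary_index.setdefault(rec["city"], []).append(pos)


boston_before = customers_in("Boston")
add_customer({"cust_no": "C1004", "name": "Davis", "city": "Boston"})
boston_after = customers_in("Boston")
```

The add_customer function shows the maintenance cost mentioned above: one logical change touches the file and every index built on it.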
Like base tables, indexes are created and dropped using SQL DDL statements. In a
relational system, the user does not decide whether to use an existing index. The
query processor makes this decision.
The SQL statement for creating an index on the customer table is:
CREATE INDEX x ON customer (cust_no ASC)
The above statement creates an index named x, based on the ascending value of column
cust_no. To define a descending index, specify DESC instead of ASC.
You can also specify the UNIQUE option with the CREATE INDEX statement. For example,
CREATE UNIQUE INDEX x ON customer (cust_no ASC)
By specifying the UNIQUE option, you ensure that no two records in the base table can
take on the same value for the indexed column or column combination. This is a good way
of preventing duplicate rows in a table. If an existing table already violates the
uniqueness constraint, an attempt to create a unique index will fail.
The statement to drop an index is simply:
DROP INDEX x
The descriptions of indexes are also stored in the system catalog. When an index is
dropped, its description is removed from the system catalog. When a table is dropped, all
indexes created on that table are also dropped.
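The behaviour described above can be checked against a real engine. The following is a minimal sketch using Python's built-in sqlite3 module. The sample rows, the second index name ux, and the index_names helper are assumptions made for the example, and column names are written with underscores so that the statements parse. Other database systems may differ in detail, for instance in DROP INDEX syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("C1001", "Adams"), ("C1001", "Baker")],  # deliberate duplicate cust_no
)


def index_names(connection):
    """List index descriptions held in SQLite's catalog table."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name")
    return [name for (name,) in rows]


# An ordinary index accepts duplicate values.
conn.execute("CREATE INDEX x ON customer (cust_no ASC)")
after_create = index_names(conn)

# A unique index cannot be built while the table holds duplicates.
try:
    conn.execute("CREATE UNIQUE INDEX ux ON customer (cust_no ASC)")
    unique_result = "created"
except sqlite3.DatabaseError:
    unique_result = "failed"

# Dropping the table also removes the descriptions of its indexes.
conn.execute("DROP TABLE customer")
after_drop = index_names(conn)
```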
Recovery from Deadlock
Once it is determined that a deadlock exists, the system needs to recover from the
deadlock. For this, one or more transactions are rolled back to break the deadlock. While
performing the rollback operation, the following issues need to be addressed:
• Selection of a victim: In the situation of a deadlock, you first need to determine the
transaction (or transactions) that should be rolled back to break the deadlock. Such
a transaction is called the victim transaction. The transaction whose rollback incurs
the minimum cost should be chosen. The following factors determine the cost of a
rollback:
  • How long has the transaction executed, and how much longer will it need to
  complete its task?
  • How many data items were used by the transaction?
  • How many more data items does the transaction need to complete?
  • How many transactions will be included in the rollback?
• Rollback: After determining the transaction to be rolled back, you need to determine
how far the transaction is to be rolled back. The easiest answer to this problem is to
do a total rollback, which means that the transaction will be aborted and restarted.
However, it is better to roll back the transaction only as far as is needed to break
the deadlock. This method requires the DBMS to maintain information
about all current transactions.
• Starvation: When the selection of a victim is based on cost factors, it might happen
that the same transaction is selected as a victim every time a deadlock occurs. Due
to this, the transaction might never complete its task. Such a situation is
called starvation. To avoid starvation, you need to ensure that a transaction is picked
as a victim only a fixed number of times. To ensure this, you can select a victim
based on the number of rollbacks along with the cost factor.
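Victim selection that combines a cost factor with a rollback count might be sketched as follows. The cost formula, the weight of 10 per earlier rollback, the limit of three rollbacks, and all names are assumptions chosen for illustration; the lesson does not prescribe specific values.

```python
from dataclasses import dataclass

MAX_ROLLBACKS = 3  # assumed limit on how often one transaction may be the victim


@dataclass
class Txn:
    name: str
    work_done: int      # units of execution completed so far
    items_held: int     # data items used so far
    rollbacks: int = 0  # times this transaction was already chosen as victim


def rollback_cost(txn):
    # Assumed weighting: each earlier rollback makes the transaction
    # more expensive to pick again.
    return txn.work_done + txn.items_held + 10 * txn.rollbacks


def choose_victim(deadlocked):
    """Pick the cheapest transaction that has not reached the rollback limit."""
    eligible = [t for t in deadlocked if t.rollbacks < MAX_ROLLBACKS]
    candidates = eligible or deadlocked  # fall back if every one hit the limit
    victim = min(candidates, key=rollback_cost)
    victim.rollbacks += 1
    return victim


t1 = Txn("T1", work_done=5, items_held=2)
t2 = Txn("T2", work_done=50, items_held=8)
victims = [choose_victim([t1, t2]).name for _ in range(4)]
```

In this run the cheap transaction T1 is chosen three times. On the fourth deadlock it has reached the limit, so T2 is rolled back instead and T1 gets a chance to finish.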
Database Recovery
You can lose information stored on a computer because of a wide range of events.
Some events that cause system failure are:
• System errors: The system might enter an unfavorable state, like a deadlock. Such a
state prevents a program from completing its execution in a normal manner.
However, system errors may or may not lead to the corruption of data files.
• Logical errors: Logical problems include bad or missing data that can prevent a
program from completing its normal execution.
• Hardware failures: The most common types of hardware failures include disk failure
and loss of transmission capacity over a transmission link.
You need to have some procedures for recovering the database after a system failure. It is
difficult to restore the database directly to the exact state it was in when the system
failed. However, database recovery procedures enable recovery of data to the state that
existed some time before the system failure. These procedures also help identify the status
of transaction processing when the system failed. Using this information, the DBMS can
complete the processing of unprocessed transactions with the recovered database. In this
way, the database can be brought to the state that existed when the system failed.
A transaction needs to be in one of the following two states to maintain data integrity:
• Aborted state: A transaction does not always complete its execution. When a
transaction is incomplete, it is aborted so that it does not affect the consistent
state of the database. This restores the database to the state it was in
before the aborted transaction started executing. This restoration is attained by
rollback.
• Committed state: When a transaction successfully completes its execution, it is said
to be in the committed state. A committed transaction leads to the database
reaching a new consistent state.
Database recovery involves the transaction log, which stores a history of all the changes
made to the database and the status of each transaction. You can follow either of the
following two approaches while deciding the recovery strategy:
• Log with deferred updates: This approach of database recovery records all database
modifications in the log. However, it defers the execution of all write operations of a
transaction until the transaction is partially committed. When the transaction
partially commits, the information in the log associated with the transaction is used
to execute the deferred write operations. Then, the database is updated with the
results of these write operations. If the system fails before the transaction completes
execution or if the transaction aborts, then the information in the log is ignored.
• Log with immediate updates: In this approach, all updates are made to the database
immediately and a record of all the changes is maintained in the log. When the
system fails, the information in the log is used to restore the database to the
previous consistent state.
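The two strategies can be contrasted with a small in-memory sketch. The log record layout, the transaction and item names, and both function names are assumptions made for the example. The immediate-update function shows only the undo step and assumes that the writes of committed transactions had already reached the database before the failure.

```python
# Hypothetical log format:
#   ("start", txn), ("write", txn, item, old_value, new_value), ("commit", txn)
LOG = [
    ("start", "T1"),
    ("write", "T1", "A", 100, 50),
    ("write", "T1", "B", 200, 250),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "C", 300, 400),
    # the system fails here: T2 never committed
]


def committed_txns(log):
    return {rec[1] for rec in log if rec[0] == "commit"}


def recover_deferred(db, log):
    """Deferred updates: nothing was written early, so apply (redo) the
    logged writes of committed transactions and ignore all others."""
    done = committed_txns(log)
    for rec in log:
        if rec[0] == "write" and rec[1] in done:
            _, _, item, _, new = rec
            db[item] = new
    return db


def recover_immediate(db, log):
    """Immediate updates: the database may hold uncommitted writes, so
    restore old values (undo), scanning the log backward."""
    done = committed_txns(log)
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in done:
            _, _, item, old, _ = rec
            db[item] = old
    return db


# Database contents found on disk after the failure under each strategy.
deferred_db = recover_deferred({"A": 100, "B": 200, "C": 300}, LOG)
immediate_db = recover_immediate({"A": 50, "B": 250, "C": 400}, LOG)
```

Both strategies end in the same consistent state: the effects of committed T1 are present and the effects of uncommitted T2 are absent.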
From the Expert’s Desk
In this section, you will learn about best practices for preventing deadlocks, a tip on
database recovery, and some FAQs on deadlocks and SQL statements.
Best Practices
Preventing Deadlocks
The different methods to prevent deadlocks are:
• Each transaction should lock all its data items before it starts execution. In addition,
either all data items should be locked in one step or none of them should be locked.
However, this method has some drawbacks. First, many data items may be locked
by a transaction but the transaction may not use them for a long period of time. This
leads to low utilization of data items. Second, this method can lead to starvation.
This means that if a transaction needs many data items, it may have to wait
indefinitely because at least one data item that it needs is locked by some other
transaction.
• Introduce a partial ordering of data items and require that a transaction lock
data items only in the specified order.
• Use pre-emption and transaction rollbacks. In pre-emption, you need to assign a
unique timestamp to each transaction. This timestamp is used to determine whether
a transaction should wait for a data item or roll back. The transaction with the
smallest timestamp is not rolled back. If a transaction is rolled back, it restarts
with its old timestamp.
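One well-known timestamp scheme that matches this description is wait-die. The lesson does not name the scheme it has in mind, so treating it as wait-die is an assumption, as are the function names below. In wait-die, an older transaction (smaller timestamp) waits for a younger one, while a younger transaction that requests an item held by an older one is rolled back.

```python
def wait_die(requester_ts, holder_ts):
    """Decide what a transaction does when the item it wants is locked.

    Timestamps are unique; a smaller timestamp means an older transaction.
    """
    if requester_ts < holder_ts:
        return "wait"      # older requester may wait for the younger holder
    return "rollback"      # younger requester is rolled back


def restart(txn):
    """A rolled-back transaction restarts with its old timestamp, so it
    grows relatively older over time and cannot be rolled back forever."""
    return {"name": txn["name"], "ts": txn["ts"], "restarts": txn["restarts"] + 1}


old_txn = {"name": "T1", "ts": 1, "restarts": 0}
young_txn = {"name": "T2", "ts": 2, "restarts": 0}

old_requests = wait_die(old_txn["ts"], young_txn["ts"])
young_requests = wait_die(young_txn["ts"], old_txn["ts"])
restarted = restart(young_txn)
```

Because the transaction with the smallest timestamp always gets "wait", it is never rolled back, and no cycle of waiting transactions can form.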
Tips
Recovering Database using Transaction Log
Usually, database recovery involves scanning the transaction log for the most recent
transactions. However, the scan may have to go all the way back to the start of the log,
because an error might have occurred as early as the first transaction. This process is
very time-consuming. A better method is to locate a point, called a checkpoint, that is
suitably far back in the log. The DBMS ensures that any item written before this point is
free of errors and safely stored. Then, database recovery begins from the checkpoint.
This method is known as checkpointing. While the DBMS runs, it maintains the transaction
log and also performs checkpointing, which consists of the following actions:
1. The DBMS temporarily halts the start of any new transactions until all the current
transactions are either committed or aborted.
2. The DBMS makes a backup copy of the database.
3. The DBMS writes all log records that are in primary memory to disk storage.
4. The DBMS appends a record at the end of the log to indicate that a checkpoint has
occurred. Then, the DBMS writes the record to the disk storage.
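The four actions, and the shortened scan they make possible, can be sketched as follows. The log record layout and the function names are assumptions made for the example. Step 1 is not simulated: the sketch assumes take_checkpoint is called only when no transaction is active.

```python
def take_checkpoint(db, log_buffer, disk_log, backups):
    """Carry out steps 2 to 4 of the checkpoint procedure."""
    backups.append(dict(db))          # step 2: backup copy of the database
    disk_log.extend(log_buffer)       # step 3: flush in-memory log records
    log_buffer.clear()
    disk_log.append(("checkpoint",))  # step 4: mark the checkpoint in the log


def records_to_scan(disk_log):
    """Return only the log records written after the most recent checkpoint."""
    for pos in range(len(disk_log) - 1, -1, -1):
        if disk_log[pos][0] == "checkpoint":
            return disk_log[pos + 1:]
    return list(disk_log)  # no checkpoint yet: the whole log must be scanned


db = {"A": 1}
log_buffer = [("write", "T1", "A", 0, 1), ("commit", "T1")]
disk_log, backups = [], []

take_checkpoint(db, log_buffer, disk_log, backups)
disk_log.append(("write", "T2", "A", 1, 2))  # activity after the checkpoint

to_scan = records_to_scan(disk_log)
```

Recovery then needs to examine one record instead of three, and the saving grows with the length of the log.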
FAQs

Q. How often should we check for the occurrence of deadlocks?
Ans: The frequency with which you need to check for the occurrence of deadlocks depends
on the following two factors:
• Frequency of deadlock occurrence
• Number of transactions that are affected by the deadlock
You need to check for deadlocks regularly if they occur
frequently. This is because data items that are held by deadlocked transactions
cannot be used by other transactions until the deadlock is broken. In the worst case,
you might need to check for a deadlock every time an
allocation request cannot be granted immediately.
Q. Is there any method to test whether a subquery returns any rows as result?
Ans: SQL provides the EXISTS clause to test whether a subquery returns any rows as
result. For example, the following SQL statement finds the names of customers who
have an account at the Ridge branch.
SELECT name
FROM customer
WHERE EXISTS (SELECT *
FROM deposit
WHERE deposit.name = customer.name
AND br_name = 'Ridge')
Here, the EXISTS clause tests whether the customer has an account at the Ridge
branch.
You can also test the non-existence of rows in a subquery by using the NOT
EXISTS clause. For example, the following SQL statement finds the names of
customers who do not have an account at the Ridge branch.
SELECT name
FROM customer
WHERE NOT EXISTS (SELECT *
FROM deposit
WHERE deposit.name = customer.name
AND br_name = 'Ridge')
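To see both queries return results, they can be run against a tiny sample database. This is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented for the example, and the branch column is spelled br_name with an underscore, which is an assumption made so that the statement parses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (name TEXT);
CREATE TABLE deposit (name TEXT, br_name TEXT);
INSERT INTO customer VALUES ('Adams'), ('Baker');
INSERT INTO deposit VALUES ('Adams', 'Ridge'), ('Baker', 'Downtown');
""")

# The same correlated subquery is used with EXISTS and with NOT EXISTS.
QUERY = """
SELECT name
FROM customer
WHERE {op} (SELECT *
            FROM deposit
            WHERE deposit.name = customer.name
              AND br_name = 'Ridge')
ORDER BY name
"""

with_ridge = [n for (n,) in conn.execute(QUERY.format(op="EXISTS"))]
without_ridge = [n for (n,) in conn.execute(QUERY.format(op="NOT EXISTS"))]
```

Every customer lands in exactly one of the two result lists, because the subquery either returns at least one row for that customer or returns none.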
Q. Apart from UNION, can we perform other set operations by using SQL statements?
Ans: In addition to UNION, you can perform the INTERSECT and MINUS set operations by
using the INTERSECT and MINUS clauses available in SQL. (MINUS is the spelling used
by Oracle; standard SQL and many other systems call the same operation EXCEPT.) For
example, the following statement finds the codes of all products that are refrigerators
and have been bought by customer 'C4171'.
SELECT prod_no
FROM product
WHERE description = 'refrigerator'
INTERSECT
SELECT prod_no
FROM sale
WHERE cust_no = 'C4171'
The following statement finds the codes of all products that are refrigerators but
have not been bought by customer 'C4171'.
SELECT prod_no
FROM product
WHERE description = 'refrigerator'
MINUS
SELECT prod_no
FROM sale
WHERE cust_no = 'C4171'
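Both set operations can be tried with Python's built-in sqlite3 module, as in the sketch below. SQLite does not accept the keyword MINUS, so the example uses EXCEPT, which performs the same set difference. The sample rows are invented, and the column names prod_no, cust_no, and description are assumptions chosen so that the statements parse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (prod_no TEXT, description TEXT);
CREATE TABLE sale (prod_no TEXT, cust_no TEXT);
INSERT INTO product VALUES
    ('P1', 'refrigerator'), ('P2', 'refrigerator'), ('P3', 'oven');
INSERT INTO sale VALUES
    ('P1', 'C4171'), ('P3', 'C4171'), ('P2', 'C9999');
""")

refrigerators = "SELECT prod_no FROM product WHERE description = 'refrigerator'"
bought = "SELECT prod_no FROM sale WHERE cust_no = 'C4171'"

# Refrigerators that customer C4171 has bought.
bought_fridges = sorted(
    row[0] for row in conn.execute(f"{refrigerators} INTERSECT {bought}"))

# Refrigerators that customer C4171 has not bought (EXCEPT plays the role of MINUS).
unbought_fridges = sorted(
    row[0] for row in conn.execute(f"{refrigerators} EXCEPT {bought}"))
```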

Q. Does SQL include any operator for comparison of values?
Ans: SQL includes the BETWEEN comparison operator. This operator simplifies a WHERE
clause that requires a value to be greater than or equal to one value and less than
or equal to another. For example, suppose
that you want to find the account numbers of accounts with balances between
$54,000 and $66,000. Two SQL statements can answer this query. One
SQL statement would be:
SELECT acc_no
FROM deposit
WHERE balance <= 66000 AND balance >= 54000
Another SQL statement would be:
SELECT acc_no
FROM deposit
WHERE balance BETWEEN 54000 AND 66000
Notice that the second SQL statement is easier to understand. SQL also provides the
NOT BETWEEN comparison operator.
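That the two forms select the same rows, with both end values included, can be confirmed with a short sketch using Python's built-in sqlite3 module. The sample balances and the acc_nos helper are invented for the example, and acc_no is spelled with an underscore as an assumption so that the statements parse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deposit (acc_no TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO deposit VALUES (?, ?)",
    [("A1", 53999), ("A2", 54000), ("A3", 60000), ("A4", 66000), ("A5", 66001)],
)


def acc_nos(condition):
    """Run the lesson's query with the given WHERE condition."""
    sql = f"SELECT acc_no FROM deposit WHERE {condition} ORDER BY acc_no"
    return [acc for (acc,) in conn.execute(sql)]


with_and = acc_nos("balance <= 66000 AND balance >= 54000")
with_between = acc_nos("balance BETWEEN 54000 AND 66000")
with_not_between = acc_nos("balance NOT BETWEEN 54000 AND 66000")
```

Accounts A2 and A4 sit exactly on the limits and are selected by both forms, which shows that BETWEEN is inclusive at both ends.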
Challenge
1. Match the types of views with their definitions.

   Type of view            Definition
   1. Column subset        A. It includes only some rows and some columns of the source table.
   2. Row subset           B. It is formed by specifying a two- or three-table query in the view definition.
   3. Joined               C. It includes all columns but only some rows of the source table.
   4. Row-column subset    D. It includes all rows but only some columns of the source table.
2. Consider the following steps:
1. Transaction A retrieves record R at time T.
2. Transaction B retrieves the same record at time T+1.
3. Transaction A updates the record at time T+2
(based on values seen at T).
4. Transaction B updates the same record at time T+3
(based on values seen at time T+1).
What is the result of the above sequence of steps?
3. What is the problem with the following SQL statement?
CREATE TABLE product
(product_no CHAR(5),
product_type CHAR(14),
price DECIMAL (7, 2),
PRIMARY KEY (product_no))
1. The datatype DECIMAL is not correctly specified.
2. The NOT NULL clause is not specified with the
attribute product_no.
3. The attribute names should not be in parentheses.
4. The table name should be in parentheses.
4. Consider the following situation:
A student has come to a university for registration. Students in
this university have to register for at least two courses
(subjects). The student has chosen one in American history and
another in Sociology. The person in the registration office did
the following:
1. Checked for availability of a seat for the course in American
history.
2. Found a seat available.
3. Granted the student admission.
4. Checked for availability of a seat for the course in Sociology.
5. Found no seats available.
6. Refused the student admission to this course and offered registration
for a course in Psychology.
7. Registered the student for both the courses.
What do you suppose the DBMS will do if there is a hardware failure
in the middle of step 4?
5. State whether true or false:
Rows in a relation do not have any order.
6. Which of the following are the advantages of using views?
1. Easy updating
2. Valid information
3. Logical data independence
4. Restricted access