DATABASE SYSTEMS (part 19)

22.3 The Informix Universal Server
Data Inheritance. To create subtypes under existing row types, we use the UNDER keyword as discussed earlier. Consider the following example:

    CREATE ROW TYPE employee_type (
        ename VARCHAR(25),
        ssn CHAR(9),
        salary INT);
    CREATE ROW TYPE engineer_type (
        degree VARCHAR(10),
        license VARCHAR(20))
        UNDER employee_type;
    CREATE ROW TYPE engr_mgr_type (
        manager_start_date VARCHAR(10),
        dept_managed VARCHAR(20))
        UNDER engineer_type;
The above statements create an employee_type and a subtype called engineer_type, which represents employees who are engineers and hence inherits all attributes of employees and has additional properties of degree and license. Another type called engr_mgr_type is a subtype under engineer_type, and hence inherits from engineer_type and implicitly from employee_type as well. Informix Universal Server does not support multiple inheritance. We can now create tables called employee, engineer, and engr_mgr based on these row types.

Note that storage options for storing type hierarchies in tables vary. Informix Universal Server provides the option to store instances in different combinations, for example, one instance (record) at each level or one instance that consolidates all levels; these correspond to the mapping options in Section 7.2. The inherited attributes are either represented repeatedly in the tables at lower levels or are represented with a reference to the object of the supertype.
The processing of SQL commands is appropriately modified based on the type hierarchy. For example, the query

    SELECT *
    FROM employee
    WHERE salary > 100000;

returns the employee information from all tables where each selected employee is represented. Thus the scope of the employee table extends to all tuples under employee. As a default, queries on the supertable return columns from the supertable as well as those from the subtables that inherit from that supertable. In contrast, the query

    SELECT *
    FROM ONLY (employee)
    WHERE salary > 100000;

returns instances from only the employee table because of the keyword ONLY.
It is possible to query a supertable using a correlation variable so that the result contains not only supertable_type columns of the subtables but also subtype-specific columns of the subtables. Such a query returns rows of different sizes; the result is called a jagged row result. Retrieving all information about an employee from all levels in a "jagged form" is accomplished by

    SELECT e
    FROM employee e;

For each employee, depending on whether he or she is an engineer or some other subtype(s), it will return additional sets of attributes from the appropriate subtype tables.

Views defined over supertables cannot be updated because placement of inserted rows is ambiguous.
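
The scoping rules described above for supertable queries can be mimicked outside the database. The following Python sketch is only an illustration under stated assumptions: the in-memory dictionaries, the sample rows, and the helper names (columns_of, tables_under, select) are hypothetical and are not part of Informix. It shows a default query that ranges over the whole hierarchy, a query restricted as with ONLY, and a "jagged" result that keeps subtype-specific columns.

```python
# Hypothetical stand-ins for the employee, engineer, and engr_mgr tables.
SUPERTABLE = {"employee": None, "engineer": "employee", "engr_mgr": "engineer"}
OWN_COLUMNS = {
    "employee": ["ename", "ssn", "salary"],
    "engineer": ["degree", "license"],
    "engr_mgr": ["manager_start_date", "dept_managed"],
}
ROWS = {  # made-up sample data
    "employee": [{"ename": "Ann", "ssn": "111111111", "salary": 120000}],
    "engineer": [{"ename": "Raj", "ssn": "222222222", "salary": 130000,
                  "degree": "MS", "license": "PE-1"}],
    "engr_mgr": [{"ename": "Lee", "ssn": "333333333", "salary": 90000,
                  "degree": "PhD", "license": "PE-2",
                  "manager_start_date": "1999-01-01", "dept_managed": "R&D"}],
}


def columns_of(table):
    """Inherited columns first, then the columns the table adds itself."""
    parent = SUPERTABLE[table]
    inherited = columns_of(parent) if parent else []
    return inherited + OWN_COLUMNS[table]


def tables_under(table):
    """The table itself plus every table below it in the hierarchy."""
    result = [table]
    for child, parent in SUPERTABLE.items():
        if parent == table:
            result.extend(tables_under(child))
    return result


def select(table, predicate, only=False, jagged=False):
    """Mimic FROM table, FROM ONLY (table), and a jagged row result."""
    scope = [table] if only else tables_under(table)
    visible = columns_of(table)
    result = []
    for name in scope:
        for row in ROWS[name]:
            if predicate(row):
                # Jagged rows keep every column; otherwise project the row
                # onto the columns of the queried table.
                result.append(dict(row) if jagged
                              else {c: row[c] for c in visible})
    return result
```

Calling select("employee", lambda r: r["salary"] > 100000) ranges over all three tables, while passing only=True restricts the scan to the employee table itself.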
Function Inheritance. In the same way that data is inherited among tables along a type hierarchy, functions can also be inherited in an ORDBMS. For example, a function overpaid may be defined on employee_type to select those employees making a higher salary than Bill Brown as follows:

    CREATE FUNCTION overpaid (employee_type)
    RETURNS BOOLEAN AS
    RETURN $1.salary > (SELECT salary
                        FROM employee
                        WHERE ename = 'Bill Brown');

The tables under the employee table automatically inherit this function. However, the same function may be redefined for the engr_mgr_type as those employees making a higher salary than Jack Jones as follows:

    CREATE FUNCTION overpaid (engr_mgr_type)
    RETURNS BOOLEAN AS
    RETURN $1.salary > (SELECT salary
                        FROM employee
                        WHERE ename = 'Jack Jones');
For example, consider the query

    SELECT e.ename
    FROM ONLY (employee) e
    WHERE overpaid (e);

which is evaluated with the first definition of overpaid. The query

    SELECT g.ename
    FROM engineer g
    WHERE overpaid (g);

also uses the first definition of overpaid (because it was not redefined for engineer), whereas

    SELECT gm.ename
    FROM engr_mgr gm
    WHERE overpaid (gm);

uses the second definition of overpaid, which overrides the first. This is called operation (or function) overloading, as was discussed in Section 20.6 under polymorphism. Note that overpaid, like other functions, can also be treated as a virtual attribute; hence overpaid may be referenced as employee.overpaid or engr_mgr.overpaid in a query.
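
The resolution rule just described (use the definition attached to the nearest type at or above the row's own type) can be sketched in Python. This is an illustration only: the SUPERTYPE map, the FUNCTIONS registry, the resolve helper, and the salary figures for Bill Brown and Jack Jones are assumptions made for the example, not part of the Informix catalog.

```python
# Type hierarchy from the CREATE ROW TYPE statements: subtype -> supertype.
SUPERTYPE = {
    "employee_type": None,
    "engineer_type": "employee_type",
    "engr_mgr_type": "engineer_type",
}

# Hypothetical salaries standing in for the subqueries on the employee table.
SALARIES = {"Bill Brown": 80000, "Jack Jones": 120000}

# One entry per (function name, argument type) definition.
FUNCTIONS = {
    ("overpaid", "employee_type"):
        lambda row: row["salary"] > SALARIES["Bill Brown"],
    ("overpaid", "engr_mgr_type"):
        lambda row: row["salary"] > SALARIES["Jack Jones"],
}


def resolve(name, row_type):
    """Walk up the hierarchy and return (defining type, function)."""
    current = row_type
    while current is not None:
        if (name, current) in FUNCTIONS:
            return current, FUNCTIONS[(name, current)]
        current = SUPERTYPE[current]
    raise LookupError(f"no definition of {name} for {row_type}")
```

With these definitions, resolve("overpaid", "engineer_type") falls back to the employee_type version, while resolve("overpaid", "engr_mgr_type") picks the overriding one.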
22.3.4 Support for Indexing Extensions
Informix Universal Server supports indexing on user-defined routines on either a single table or a table hierarchy. For example,

    CREATE INDEX empl_city ON employee (city (address));

creates an index on the table employee using the value of the city function.

In order to support user-defined indexes, Informix Universal Server supports operator classes, which are used to support user-defined data types in the generic B-tree as well as other secondary access methods such as R-trees.
22.3.5 Support for External Data Source
Informix Universal Server supports external data sources (such as data stored in a file system) that are mapped to a table in the database through what is called the virtual table interface. This interface enables the user to define operations that can be used as proxies for the other operations, which are needed to access and manipulate the row or rows associated with the underlying data source. These operations include open, close, fetch, insert, and delete. Informix Universal Server also supports a set of functions that enables calling SQL statements within a user-defined routine without the overhead of going through a client interface.
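
To make the idea of proxy operations concrete, here is a minimal Python sketch of a table-like wrapper around a CSV file. It is not the Informix virtual table interface; the class name CsvVirtualTable, its constructor arguments, and the choice of a headerless CSV file as the external source are all assumptions made for illustration. Only the operation names (open, close, fetch, insert, delete) come from the text.

```python
import csv
import os


class CsvVirtualTable:
    """Hypothetical proxy that exposes a headerless CSV file as a table."""

    def __init__(self, path, columns):
        self.path = path
        self.columns = columns
        self.rows = None  # filled in by open()

    def open(self):
        """Load the external file (if it exists) into memory."""
        self.rows = []
        if os.path.exists(self.path):
            with open(self.path, newline="") as f:
                reader = csv.DictReader(f, fieldnames=self.columns)
                self.rows = [dict(row) for row in reader]

    def fetch(self, predicate=lambda row: True):
        """Return the rows that satisfy the predicate."""
        return [row for row in self.rows if predicate(row)]

    def insert(self, row):
        """Append a row; values are stored as text, as in the file."""
        self.rows.append({c: str(row[c]) for c in self.columns})

    def delete(self, predicate):
        """Remove matching rows and report how many were removed."""
        before = len(self.rows)
        self.rows = [r for r in self.rows if not predicate(r)]
        return before - len(self.rows)

    def close(self):
        """Write the rows back to the external file."""
        with open(self.path, "w", newline="") as f:
            csv.DictWriter(f, fieldnames=self.columns).writerows(self.rows)
        self.rows = None
```

A caller would open the table, issue fetch, insert, or delete calls, and close it again; the file on disk plays the role of the external data source.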
22.3.6 Support for Data Blades Application Programming Interface
The Data Blades Application Programming Interface (API) of Informix Universal Server provides new data types and functions for specific types of applications. We will review the extensible data types for two-dimensional operations (required in GIS or CAD applications; see footnote 11), the data types related to image storage and management, the time series data type, and a few features of the text data type. The strength of ORDBMSs to deal with the new unconventional applications is largely attributed to these special data types and the tailored functionality that they provide.
Two-Dimensional (Spatial) Data Types. For a two-dimensional application, the relevant data types would include the following:
• A point defined by (X, Y) coordinates.
• A line defined by its two end points.
• A polygon defined by an ordered list of n points that form its vertices.
• A path defined by a sequence (ordered list) of points.
• A circle defined by its center point and radius.

11. Recall that GIS stands for Geographic Information Systems and CAD for Computer Aided Design.
Given the above as data types, a function such as distance may be defined between two points, a point and a line, a line and a circle, and so on, by implementing the appropriate mathematical expressions for distance in a programming language. Similarly, a Boolean cross function, which returns true or false depending on whether two geometric objects cross (or intersect), can be defined between a line and a polygon, a path and a polygon, a line and a circle, and so on. Other relevant Boolean functions for GIS applications would be overlap (polygon, polygon), contains (polygon, polygon), contains (point, polygon), and so on. Note that the concept of overloading (operation polymorphism) applies when the same function name is used with different argument types.
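
As a rough sketch of how one function name can cover several argument-type combinations, the Python code below implements distance for two of the cases mentioned above (point to point, and point to line). The class names, the treatment of a line as a segment between its two end points, and the dispatch table are assumptions of this example rather than anything prescribed by a DataBlade.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


@dataclass(frozen=True)
class Line:
    """A line segment defined by its two end points (an assumption here)."""
    start: Point
    end: Point


def _point_point(p, q):
    return math.hypot(p.x - q.x, p.y - q.y)


def _point_line(p, line):
    """Distance from p to the closest point on the segment."""
    dx = line.end.x - line.start.x
    dy = line.end.y - line.start.y
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: both end points coincide
        return _point_point(p, line.start)
    # Project p onto the segment and clamp to its end points.
    t = ((p.x - line.start.x) * dx + (p.y - line.start.y) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    closest = Point(line.start.x + t * dx, line.start.y + t * dy)
    return _point_point(p, closest)


# One implementation per combination of argument types.
_IMPLEMENTATIONS = {
    (Point, Point): _point_point,
    (Point, Line): _point_line,
}


def distance(a, b):
    """Overloaded distance: picks an implementation by argument types."""
    impl = _IMPLEMENTATIONS.get((type(a), type(b)))
    if impl is not None:
        return impl(a, b)
    impl = _IMPLEMENTATIONS.get((type(b), type(a)))  # distance is symmetric
    if impl is not None:
        return impl(b, a)
    raise TypeError(f"distance is not defined for "
                    f"{type(a).__name__} and {type(b).__name__}")
```

Further cases, such as a line and a circle, would be added as new entries in the dispatch table without changing the callers of distance.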
Image Data Types. Images are stored in a variety of standard formats, such as TIFF, GIF, JPEG, photoCD, GROUP 4, and FAX, so one may define a data type for each of these formats and use appropriate library functions to input images from other media or to render images for display. Alternately, IMAGE can be regarded as a single data type with a large number of options for storage of data. The latter option would allow a column in a table to be of type IMAGE and yet accept images in a variety of different formats. The following are some possible functions (operations) on images:

    rotate (image, angle) returns image.
    crop (image, polygon) returns image.
    enhance (image) returns image.

The crop function extracts the portion of an image that intersects with a polygon. The enhance function improves the quality of an image by performing contrast enhancement. Multiple images may be supplied as parameters to the following functions:

    common (image1, image2) returns image.
    union (image1, image2) returns image.
    similarity (image1, image2) returns number.

The similarity function typically takes into account the distance between two vectors with components <color, shape, texture, edge> that describe the content of the two images. The VIR DataBlade in Informix Universal Server can be used to accomplish a search on images by content based on the above similarity measure.
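
The text does not say how the distance between the two feature vectors is turned into a similarity number, so the following Python sketch makes its own assumptions: features are plain numbers, the distance is Euclidean, and similarity is defined as 1 / (1 + distance) so that identical vectors score 1. The function names and the sample catalog are hypothetical and do not describe the VIR DataBlade.

```python
import math

FEATURES = ("color", "shape", "texture", "edge")


def feature_distance(a, b):
    """Euclidean distance between two <color, shape, texture, edge> vectors."""
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in FEATURES))


def similarity(a, b):
    """Assumed mapping from distance to a score in (0, 1]; 1 means identical."""
    return 1.0 / (1.0 + feature_distance(a, b))


def search_by_content(query, catalog, k):
    """Return the names of the k catalog images most similar to the query."""
    ranked = sorted(catalog,
                    key=lambda name: similarity(query, catalog[name]),
                    reverse=True)
    return ranked[:k]
```

A content-based search then amounts to ranking the stored feature vectors by their similarity to the query image's vector and returning the best matches first.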
Time Series Data Type. Informix Universal Server supports a time series data type that makes the handling of time series data much simpler than storing it in multiple tables. For example, consider storing the closing stock price on the New York Stock Exchange for more than 3,000 stocks for each workday when the market is open. Such a table can be defined as follows:

    CREATE TABLE stockprices (
        company_name VARCHAR(30),
        symbol VARCHAR(5),
        prices TIME_SERIES OF FLOAT);

Regarding the stock price data for all 3,000 companies over an entire period of, say, several years, only one relation is adequate thanks to the time series data type for the prices attribute. Without this data type, each company would need one table. For example, a table for the coca_cola company (symbol KO) may be declared as follows:

    CREATE TABLE coca_cola (
        recording_date DATE,
        price FLOAT);
In this table, there would be approximately 260 tuples per year, one for each business day. The time series data type takes into account the calendar, starting time, recording interval (for example, daily, weekly, monthly), and so on. Functions such as extracting a subset of the time series (for example, closing prices during January 1999), summarizing at a coarser granularity (for example, average weekly closing price from the daily closing prices), and constructing moving averages are appropriate.

A query on the stockprices table that gives the moving average for 30 days starting at June 1, 1999 for the coca_cola stock can use the MOVING-AVG function as follows:

    SELECT MOVING-AVG(prices, 30, '1999-06-01')
    FROM stockprices
    WHERE symbol = "KO";

The same query in SQL on the table coca_cola would be much more complicated to write and would access numerous tuples, whereas the above query on the stockprices table deals with a single row in the table corresponding to this company. It is claimed that using the time series data type provides an order of magnitude performance gain in processing such queries.
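
The exact semantics of MOVING-AVG are not spelled out above, so the sketch below fixes one plausible reading as an assumption: for every recorded day on or after the start date, report the mean of the last `window` recorded values ending on that day. The function name moving_avg and the representation of a series as a date-sorted list of (date, price) pairs are choices made for this Python illustration.

```python
from datetime import date, timedelta


def moving_avg(series, window, start):
    """Trailing moving average over a date-sorted list of (date, price).

    Returns (date, average) pairs for each date on or after `start` that
    has at least `window` recorded values ending on that date.
    """
    result = []
    for i, (day, _) in enumerate(series):
        if day >= start and i + 1 >= window:
            values = [price for _, price in series[i + 1 - window:i + 1]]
            result.append((day, sum(values) / window))
    return result


# One row per company, with the whole price history in a single attribute,
# mirrors the stockprices table; the prices themselves are made up.
stockprices = {
    "KO": [(date(1999, 6, 1) + timedelta(days=i), float(p))
           for i, p in enumerate([10, 20, 30, 40])],
}
```

Because the whole history lives in one attribute value, the computation touches a single row for the company instead of scanning one tuple per business day.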
Text Data Type. The text DataBlade supports storage, search, and retrieval for text objects. It defines a single data type called doc, whose instances are stored as large objects that belong to the built-in data type large-text. We will briefly discuss a few important features of this data type.

The underlying storage for large-text is the same as that for the large-object data type. References to a single large object are recorded in the 'refcount' system table, which stores information such as number of rows referring to the large object, its OID, its storage manager, its last modification time, and its archive storage manager. Automatic conversion between large-text and text data types enables any functions with text arguments to be applied to large-text objects. Thus concatenation of large-text objects as strings as well as extraction of substrings from a large-text object are possible.

The Text DataBlade parameters include format for which the default is ASCII, with other possibilities such as postscript, dvipostscript, nroff, troff, and text. A Text Conversion DataBlade, which is separate from the Text DataBlade, is needed to convert documents among the various formats. An External File parameter instructs the internal representation of doc to store a pointer to an external file rather than copying it to a large object.
For manipulation of doc objects, functions such as the following are used:

    Import_doc (doc, text) returns doc.
    Export_doc (doc, text) returns text.
    Assign (doc) returns doc.
    Destroy (doc) returns void.

The Assign and Destroy functions already exist for the built-in large-object and large-text data types, but they must be redefined by the user for objects of type doc. The following statement creates a table called legaldocuments, where each row has a title of the document in one column and the document itself as the other column:

    CREATE TABLE legaldocuments (
        title TEXT,
        document DOC);
To insert a new row into this table of a document called 'lease.contract', the following statement can be used:

    INSERT INTO legaldocuments (title, document)
    VALUES ('lease.contract', 'format {troff}:/user/local/documents/lease');

The second value in the values clause is the path name specifying the file location of this document; the format specification signifies that it is a troff document. To search the text, an index must be created, as in the following statement:

    CREATE INDEX legalindex
    ON legaldocuments
    USING dtree(document text_ops);

In the above, text_ops is an op-class (operator class) applicable to an access structure called a dtree index, which is a special index structure for documents.
When a document of the doc data type is inserted into a table, the text is parsed into individual words. The Text DataBlade is case insensitive; hence, Housenumber, HouseNumber, or housenumber are all considered the same word. Words are stemmed according to the WORDNET thesaurus. For example, houses or housing would be stemmed to house, quickly to quick, and talked to talk. A stopword file is kept, which contains insignificant words such as articles or prepositions that are ignored in the searches. Examples of stopwords include is, not, a, the, but, for, and, if, and so on.
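
The parsing steps described above (case folding, stemming, stopword removal) can be sketched in a few lines of Python. This is a toy illustration, not the Text DataBlade's parser: the STEMS dictionary is a hypothetical lookup table that stands in for the thesaurus and covers only the examples from the text, and the function name index_terms is an assumption.

```python
import re

# Stopwords listed in the text.
STOPWORDS = {"is", "not", "a", "the", "but", "for", "and", "if"}

# Hypothetical stand-in for thesaurus-based stemming; only the examples
# mentioned in the text are covered.
STEMS = {
    "houses": "house",
    "housing": "house",
    "quickly": "quick",
    "talked": "talk",
}


def index_terms(text):
    """Lowercase the text, split it into words, drop stopwords, and stem."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [STEMS.get(word, word) for word in words if word not in STOPWORDS]
```

Because the text is lowercased before anything else happens, Housenumber, HouseNumber, and housenumber all produce the same term.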
Informix Universal Server provides two sets of routines, the contains routines and text-string functions, to enable applications to determine which documents contain a certain word or words and which documents are similar. When these functions are used in a search condition, the data is returned in descending order of how well the condition matches the documents, with the best match showing first. There is a WeightContains (index to use, tuple-id of the document, input string) function and a similar WeightContainsWords function that returns a precision number between 0 and 1 indicating the closeness of the match between the input string or input words and the specific document for that tuple-id. To illustrate the use of these functions, consider the following query: Find the titles of legal documents that contain the top ten terms in the document titled 'lease.contract', which can be specified as follows:

    SELECT d.title
    FROM legaldocuments d, legaldocuments l
    WHERE contains (d.document, AndTerms (TopNTerms(l.document, 10)))
    AND l.title = 'lease.contract'
    AND d.title <> 'lease.contract';

This query illustrates how SQL can be enhanced with these data type specific functions to yield a very powerful capability of handling text-related functions. In this query, variable d refers to the entire legal corpus whereas l refers to the specific document whose title is 'lease.contract'. TopNTerms extracts the top ten terms from the 'lease.contract' document (l); AndTerms combines these terms into a list; and contains compares the terms in that list with the stemwords in every other document (d) in the table legaldocuments.
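
The three-step pipeline in this query can be imitated in Python. The sketch below is an assumption-laden stand-in, not the DataBlade routines: "top" terms are taken to mean the most frequent ones (ties broken alphabetically), documents are represented as lists of already stemmed terms, and the sample corpus and the helper similar_titles are invented for the example.

```python
from collections import Counter


def top_n_terms(terms, n):
    """Assumed meaning of TopNTerms: the n most frequent terms."""
    counts = Counter(terms)
    ranked = sorted(counts, key=lambda term: (-counts[term], term))
    return ranked[:n]


def and_terms(terms):
    """Combine terms into a set that must all be present."""
    return frozenset(terms)


def contains(document_terms, required):
    """True when every required term occurs in the document."""
    return required <= set(document_terms)


# Hypothetical corpus: title -> stemmed terms of the document.
legaldocuments = {
    "lease.contract": "tenant rent tenant landlord rent tenant".split(),
    "sublease.addendum": "tenant landlord rent notice".split(),
    "will.testament": "heir estate executor".split(),
}


def similar_titles(title, n):
    """Titles of the other documents containing the top n terms of `title`."""
    required = and_terms(top_n_terms(legaldocuments[title], n))
    return [other for other, terms in legaldocuments.items()
            if other != title and contains(terms, required)]
```

The call similar_titles("lease.contract", 2) plays the role of the SQL query above, with the self-join replaced by a loop over the other documents.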
Summary of Data Blades. As we can see, Data Blades enhance an RDBMS by providing various constructors for abstract data types (ADTs) that allow a user to operate on the data as if it were stored in an ODBMS using the ADTs as classes. This makes the relational system behave as an ODBMS, and drastically cuts down the programming effort needed when compared with achieving the same functionality with just SQL embedded in a programming language.
22.4 OBJECT-RELATIONAL FEATURES OF ORACLE 8
In this section we will review a number of features related to the version of the Oracle DBMS product called Release 8.X, which has been enhanced to incorporate object-relational features. Additional features may have been incorporated into subsequent versions of Oracle. A number of additional data types with related manipulation facilities called cartridges have been added (see footnote 12). For example, the spatial cartridge allows map-based and geographic information to be handled. Management of multimedia data has been facilitated with new data types. Here we highlight the differences between Release 8.X of Oracle (as available at the time of this writing) and the preceding version in terms of the new object-oriented features and data types as well as some storage options. Portions of the language SQL-99, which we discussed in Section 22.1, will be applicable to Oracle. We do not discuss these features here.
22.4.1 Some Examples of Object-Relational Features of Oracle
As an ORDBMS, Oracle 8 continues to provide the capabilities of an RDBMS and additionally supports object-oriented concepts. This provides higher levels of abstraction so that application developers can manipulate application objects as opposed to constructing the objects from relational data. The complex information about an object can be hidden, but the properties (attributes, relationships) and methods (operations) of the object can be identified in the data model. Moreover, object type declarations can be reused via inheritance, thereby reducing application development time and effort. To facilitate object modeling, Oracle introduced the following features (as well as some of the SQL-99 features in Section 22.1).

12. Cartridges in Oracle are somewhat similar to Data Blades in Informix.
Representing Multivalued Attributes Using VARRAY. Some attributes of an object/entity could be multivalued. In the relational model, the multivalued attributes would have to be handled by forming a new table (see Section 7.1 and Section 10.3.2 on first normal form). If ten attributes of a large table were multivalued, we would have eleven tables generated from a single table after normalization. To get the data back, the developer would have to do ten joins across these tables. This does not happen in an object model since all the attributes of an object, including multivalued ones, are encapsulated within the object. Oracle 8 achieves this by using a varying length array (VARRAY) data type, which has the following properties:
1. COUNT: Current number of elements.
2. LIMIT: Maximum number of elements the VARRAY can contain. This is user defined.
Consider the example of a customer entity with attributes name and phone_numbers, where phone_numbers is multivalued and is represented as a VARRAY. First, we need to define an object type representing a phone_number as follows:

    CREATE TYPE phone_num_type AS OBJECT (phone_number CHAR(10));

Then we define a VARRAY whose elements would be objects of type phone_num_type:

    CREATE TYPE phone_list_type AS VARRAY (5) OF phone_num_type;

Now we can create the customer_type data type as an object with attributes customer_name and phone_numbers:

    CREATE TYPE customer_type AS OBJECT (
        customer_name VARCHAR(20),
        phone_numbers phone_list_type);

It is now possible to create the customer table as

    CREATE TABLE customer OF customer_type;

To retrieve a list of all customers and their phone numbers, we can issue a simple query without any joins:

    SELECT customer_name, phone_numbers
    FROM customer;
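
The two VARRAY properties named above, COUNT and LIMIT, map naturally onto a small bounded-list class. The Python sketch below is a hypothetical model written for illustration (the class name VArray, the choice of OverflowError, and the sample customer row are assumptions); it is not Oracle's implementation.

```python
class VArray:
    """A bounded, ordered collection modeled on the VARRAY description."""

    def __init__(self, limit, items=()):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self._limit = limit
        self._items = []
        for item in items:
            self.append(item)

    @property
    def count(self):
        """COUNT: current number of elements."""
        return len(self._items)

    @property
    def limit(self):
        """LIMIT: user-defined maximum number of elements."""
        return self._limit

    def append(self, item):
        if self.count >= self._limit:
            raise OverflowError(f"VARRAY limit of {self._limit} exceeded")
        self._items.append(item)

    def __iter__(self):
        return iter(self._items)


# A customer row carries its phone numbers inside the row itself, so
# listing customers with their phones needs no join. Sample data is made up.
customer = [
    {"customer_name": "Ann Lee",
     "phone_numbers": VArray(5, ["5551230000", "5559870000"])},
]
```

Iterating over customer and reading each row's phone_numbers corresponds to the join-free SELECT shown above.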
Using Nested Tables to Represent Complex Objects. In object modeling, some attributes of an object could be objects themselves. Oracle 8 accomplishes this by having nested tables (see Section 20.6). Here, columns (equivalent to object attributes) can be declared as tables. In the above example let us assume that we have a description attached to every phone number (for example, home, office, cellular). This could be modeled using a nested table by first redefining phone_num_type as follows:

    CREATE TYPE phone_num_type AS OBJECT (
        phone_number CHAR(10),
        description CHAR(30));

We next redefine phone_list_type as a table of phone_num_type as follows:

    CREATE TYPE phone_list_type AS TABLE OF phone_num_type;

We can then create the type customer_type and the customer table as before. The only difference is that phone_list_type is now a nested table instead of a VARRAY. Both structures have similar functions with a few differences. Nested tables do not have an upper bound on the number of items whereas VARRAYs do have a limit. Individual items can be retrieved from the nested tables, but this is not possible with VARRAYs. Additional indexes can also be built on nested tables for faster data access.
Object Views. Object views can be used to build virtual objects from relational data, thereby enabling programmers to evolve existing schemas to support objects. This allows relational and object applications to coexist on the same database. In our example, let us say that we had modeled our customer database using a relational model, but management decided to do all future applications in the object model. Moving over to the object view of the same existing relational data would thus facilitate the transition.

22.4.2 Managing Large Objects and Other Storage Features
Oracle can now store extremely large objects like video, audio, and text documents. New data types have been introduced for this purpose. These include the following:
• BLOB (binary large object).
• CLOB (character large object).
• BFILE (binary file stored outside the database).
• NCLOB (fixed-width multibyte CLOB).

All of the above except for BFILE, which is stored outside the database, are stored inside the database along with other data. Only the directory name for a BFILE is stored in the database.
Index Only Tables. Standard Oracle 7.X involves keeping indexes as a B+-tree that contains pointers to data blocks (see Chapter 14). This gives good performance in most situations. However, both the index and the data block must be accessed to read the data. Moreover, key values are stored twice (in the table and in the index), increasing the storage costs. Oracle 8 supports both the standard indexing scheme and also index only tables, where the data records and index are kept together in a B-tree structure (see Chapter 14). This allows faster data retrieval and requires less storage space for small- to medium-sized files where the record size is not too large.
Partitioned Tables and Indexes. Large tables and indexes can be broken down into smaller partitions. The table now becomes a logical structure and the partitions become the actual physical structures that hold the data. This gives the following advantages:
• Continued data availability in the event of partial failures of some partitions.
• Scalable performance allowing substantial growth in data volumes.
• Overall performance improvement in query and transaction processing.
22.5 IMPLEMENTATION AND RELATED ISSUES FOR EXTENDED TYPE SYSTEMS
There are various implementation issues regarding the support of an extended type system with associated functions (operations). We briefly summarize them here (see footnote 13).

• The ORDBMS must dynamically link a user-defined function in its address space only when it is required. As we saw in the case of the two ORDBMSs, numerous functions are required to operate on two- or three-dimensional spatial data, images, text, and so on. With a static linking of all function libraries, the DBMS address space may increase by an order of magnitude. Dynamic linking is available in the two ORDBMSs that we studied.
• Client-server issues deal with the placement and activation of functions. If the server needs to perform a function, it is best to do so in the DBMS address space rather than remotely, due to the large amount of overhead. If the function demands computation that is too intensive or if the server is attending to a very large number of clients, the server may ship the function to a separate client machine. For security reasons, it is better to run functions at the client using the user ID of the client. In the future functions are likely to be written in interpreted languages like JAVA.
• It should be possible to run queries inside functions. A function must operate the same way whether it is used from an application using the application program interface (API), or whether it is invoked by the DBMS as a part of executing SQL with the function embedded in an SQL statement. Systems should support a nesting of these "callbacks."
• Because of the variety in the data types in an ORDBMS and associated operators, efficient storage and access of the data is important. For spatial data or multidimensional data, new storage structures such as R-trees, quad trees, or Grid files may be used. The ORDBMS must allow new types to be defined with new access structures. Dealing with large text strings or binary files also opens up a number of storage and search options. It should be possible to explore such new options by defining new data types within the ORDBMS.
Other Issues Concerning Object-Relational Systems. In the above discussion of Informix Universal Server and Oracle 8, we have concentrated on how an ORDBMS extends the relational model. We discussed the features and facilities it provides to operate on relational data stored as tables as if it were an object database. There are other obvious problems to consider in the context of an ORDBMS:
• Object-relational database design: We described a procedure for designing object schemas in Section 21.5. Object-relational design is more complicated because we have to consider not only the underlying design considerations of application semantics and dependencies in the relational data model (which we discussed in Chapters 10 and 11) but also the object-oriented nature of the extended features that we have just discussed.
• Query processing and optimization: By extending SQL with functions and rules, this problem is further compounded beyond the query optimization overview that we discuss for the relational model in Chapter 15.
• Interaction of rules with transactions: Rule processing as implied in SQL covers more than just the update-update rules (see Section 24.1), which are implemented in RDBMSs as triggers. Moreover, RDBMSs currently implement only immediate execution of triggers. A deferred execution of triggers involves additional processing.

13. This discussion is derived largely from Stonebraker and Moore (1996).
22.6 THE NESTED RELATIONAL MODEL
To complete this discussion, we summarize in this section an approach that proposes the use of nested tables, also known as nonnormal form relations. No commercial DBMS has chosen to implement this concept in its original form. The nested relational model removes the restriction of first normal form (1NF, see Chapter 11) from the basic relational model, and thus is also known as the Non-1NF or Non-First Normal Form (NFNF) or NF² relational model. In the basic relational model, also called the flat relational model, attributes are required to be single-valued and to have atomic domains. The nested relational model allows composite and multivalued attributes, thus leading to complex tuples with a hierarchical structure. This is useful for representing objects that are naturally hierarchically structured. In Figure 22.1, part (a) shows a nested relation schema DEPT based on part of the COMPANY database, and part (b) gives an example of a Non-1NF tuple in DEPT.

To define the DEPT schema as a nested structure, we can write the following:

    dept = (dno, dname, manager, employees, projects, locations)
    employees = (ename, dependents)
    projects = (pname, ploc)
    locations = (dloc)
    dependents = (dname, age)
First, all attributes of the DEPT relation are defined. Next, any nested attributes of DEPT (namely, EMPLOYEES, PROJECTS, and LOCATIONS) are themselves defined. Next, any second-level nested attributes, such as DEPENDENTS of EMPLOYEES, are defined, and so on. All attribute names must be distinct in the nested relation definition. Notice that a nested attribute is typically a multivalued composite attribute, thus leading to a "nested relation" within each tuple. For example, the value of the PROJECTS attribute within each DEPT tuple is a relation with two attributes (PNAME, PLOC). In the DEPT tuple of Figure 22.1b, the PROJECTS attribute contains three tuples as its value. Other nested attributes may be multivalued simple attributes, such as LOCATIONS of DEPT. It is also possible to have a nested attribute that is single-valued and composite, although most nested relational models treat such an attribute as though it were multivalued.
FIGURE 22.1 Illustrating a nested relation. (a) DEPT schema. (b) Example of a Non-1NF tuple of DEPT. (c) Tree representation of DEPT schema.
When a nested relational database schema is defined, it consists of a number of external relation schemas; these define the top level of the individual nested relations. In addition, nested attributes are called internal relation schemas, since they define relational structures that are nested inside another relation. In our example, DEPT is the only external relation. All the others (EMPLOYEES, PROJECTS, LOCATIONS, and DEPENDENTS) are internal relations. Finally, simple attributes appear at the leaf level and are not nested.
We can represent each relation schema by means of a tree structure, as shown in Figure 22.1c, where the root is an external relation schema, the leaves are simple attributes, and the internal nodes are internal relation schemas. Notice the similarity between this representation and a hierarchical schema (see Appendix E) and XML (see Chapter 26).

It is important to be aware that the three first-level nested relations in DEPT represent independent information. Hence, EMPLOYEES represents the employees working for the department, PROJECTS represents the projects controlled by the department, and LOCATIONS represents the various department locations. The relationship between EMPLOYEES and PROJECTS is not represented in the schema; this is an M:N relationship, which is difficult to represent in a hierarchical structure.

Extensions to the relational algebra and to the relational calculus, as well as to SQL, have been proposed for nested relations. The interested reader is referred to the selected bibliography at the end of this chapter for details.
Here,
we illustrate two operations, NEST
and UNNEST,
that
can
be used to
augment
standard

relational
algebra
operations
for
converting
between
nested
and
flat relations.
Consider the flat EMP_PROJ relation of Figure 11.4, and suppose that we project it over the attributes SSN, PNUMBER, HOURS, ENAME as follows:

EMP_PROJ_FLAT ← π_{SSN, ENAME, PNUMBER, HOURS}(EMP_PROJ)

To create a nested version of this relation, where one tuple exists for each employee and the (PNUMBER, HOURS) are nested, we use the NEST operation as follows:

EMP_PROJ_NESTED ← NEST_{PROJS = (PNUMBER, HOURS)}(EMP_PROJ_FLAT)

The effect of this operation is to create an internal nested relation PROJS = (PNUMBER, HOURS) within the external relation EMP_PROJ_NESTED. Hence, NEST groups together the tuples with the same value for the attributes that are not specified in the NEST operation; these are the SSN and ENAME attributes in our example. For each such group, which represents one employee in our example, a single nested tuple is created with an internal nested relation PROJS = (PNUMBER, HOURS). Hence, the EMP_PROJ_NESTED relation looks like the EMP_PROJ relation shown in Figure 11.9a and b.
Notice the similarity between nesting and grouping for aggregate functions. In the former, each group of tuples becomes a single nested tuple; in the latter, each group becomes a single summary tuple after an aggregate function is applied to the group.
The UNNEST operation is the inverse of NEST. We can reconvert EMP_PROJ_NESTED to EMP_PROJ_FLAT as follows:

EMP_PROJ_FLAT ← UNNEST_{PROJS = (PNUMBER, HOURS)}(EMP_PROJ_NESTED)

Here, the PROJS nested attribute is flattened into its components PNUMBER, HOURS.
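The two operations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: relations are modeled as lists of dictionaries, the function names nest and unnest are our own, and the sample tuples are hypothetical rather than taken from Figure 11.4.

```python
from collections import defaultdict

def nest(rows, nested_name, nested_attrs):
    """Group rows on every attribute NOT in nested_attrs; collect the rest per group."""
    groups = defaultdict(list)
    for row in rows:
        # The grouping key is made of the attributes left outside the NEST.
        key = tuple((k, v) for k, v in row.items() if k not in nested_attrs)
        groups[key].append({a: row[a] for a in nested_attrs})
    return [{**dict(key), nested_name: inner} for key, inner in groups.items()]

def unnest(rows, nested_name):
    """Flatten the nested attribute back into one flat tuple per inner tuple."""
    flat = []
    for row in rows:
        outer = {k: v for k, v in row.items() if k != nested_name}
        for inner in row[nested_name]:
            flat.append({**outer, **inner})
    return flat

# Hypothetical contents of EMP_PROJ_FLAT.
emp_proj_flat = [
    {"SSN": "111", "ENAME": "Smith", "PNUMBER": 1, "HOURS": 32.5},
    {"SSN": "111", "ENAME": "Smith", "PNUMBER": 2, "HOURS": 7.5},
    {"SSN": "222", "ENAME": "Wong", "PNUMBER": 3, "HOURS": 40.0},
]

emp_proj_nested = nest(emp_proj_flat, "PROJS", ("PNUMBER", "HOURS"))
round_trip = unnest(emp_proj_nested, "PROJS")
```

In this sketch emp_proj_nested holds one tuple per employee, and unnesting it gives back the flat tuples we started with.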
22.7 SUMMARY
In this chapter, we first gave an overview of the object-oriented features in SQL-99, which are applicable to object-relational systems. Then we discussed the history and current trends in database management systems that led to the development of object-relational DBMSs (ORDBMSs). We then focused on some of the features of Informix Universal Server and of Oracle 8 in order to illustrate how commercial RDBMSs are being extended with object features. Other commercial RDBMSs are providing similar extensions. We saw that these systems also provide Data Blades (Informix) or Cartridges (Oracle) that provide specific type extensions for newer application domains, such as spatial, time series, or text/document databases. Because of the extendibility of ORDBMSs, these packages can be included as abstract data type (ADT) libraries whenever the users need to implement the types of applications they support. Users can also implement their own extensions as needed by using the ADT facilities of these systems. We briefly discussed some implementation issues for ADTs. Finally, we gave an overview of the nested relational model, which extends the flat relational model with hierarchically structured complex objects.
Selected Bibliography
The references provided for the object-oriented database approach in Chapters 11 and 12 are also relevant for object-relational systems. Stonebraker and Moore (1996) provides a comprehensive reference for object-relational DBMSs. The discussion about concepts related to Illustra in that book is mostly applicable to the current Informix Universal Server. Kim (1995) discusses many issues related to modern database systems that include object orientation. For the most current information on Informix and Oracle, consult their Web sites: www.informix.com and www.oracle.com, respectively.
The SQL3 standard is described in various publications of the ISO WG3 (Working Group 3) reports; for example, see Kulkarni et al. (1995) and Melton et al. (1991). An excellent tutorial on SQL3 was given at the Very Large Data Bases Conference by Melton and Mattos (1996). Ullman and Widom (1997) have a good discussion of SQL3 with examples.
For issues related to rules and triggers, Widom and Ceri (1995) have a collection of chapters on active databases. Some comparative studies, for example Ketabchi et al. (1990), compare relational DBMSs with object DBMSs; their conclusion shows the superiority of the object-oriented approach for nonconventional applications. The nested relational model is discussed in Schek and Scholl (1985), Jaeshke and Schek (1982), Chen and Kambayashi (1991), and Makinouchi (1977), among others. Algebras and query languages for nested relations are presented in Paredaens and VanGucht (1992), Pistor and Andersen (1986), Roth et al. (1988), and Ozsoyoglu et al. (1987), among others. Implementation of prototype nested relational systems is described in Dadam et al. (1986), Deshpande and VanGucht (1988), and Schek and Scholl (1989).
7 FURTHER TOPICS
Database Security and Authorization
This chapter discusses the techniques used for protecting the database against persons who are not authorized to access either certain parts of a database or the whole database. Section 23.1 provides an introduction to security issues and the threats to databases and an overview of the countermeasures that are covered in the rest of this chapter. Section 23.2 discusses the mechanisms used to grant and revoke privileges in relational database systems and in SQL, mechanisms that are often referred to as discretionary access control. Section 23.3 offers an overview of the mechanisms for enforcing multiple levels of security, a more recent concern in database system security that is known as mandatory access control. It also introduces the more recently developed strategy of role-based access control. Section 23.4 briefly discusses the security problem in statistical databases. Section 23.5 introduces flow control and mentions problems associated with covert channels. Section 23.6 is a brief summary of encryption and public key infrastructure schemes. Section 23.7 summarizes the chapter. Readers who are interested only in basic database security mechanisms will find it sufficient to cover the material in Sections 23.1 and 23.2.
Chapter 23 Database Security and Authorization
23.1 INTRODUCTION TO DATABASE SECURITY ISSUES
23.1.1 Types of Security
Database security is a very broad area that addresses many issues, including the following:
• Legal and ethical issues regarding the right to access certain information. Some information may be deemed to be private and cannot be accessed legally by unauthorized persons. In the United States, there are numerous laws governing privacy of information.
• Policy issues at the governmental, institutional, or corporate level as to what kinds of information should not be made publicly available, for example, credit ratings and personal medical records.
• System-related issues such as the system levels at which various security functions should be enforced, for example, whether a security function should be handled at the physical hardware level, the operating system level, or the DBMS level.
• The need in some organizations to identify multiple security levels and to categorize the data and users based on these classifications, for example, top secret, secret, confidential, and unclassified. The security policy of the organization with respect to permitting access to various classifications of data must be enforced.
Threats to Databases. Threats to databases result in the loss or degradation of some or all of the following security goals: integrity, availability, and confidentiality.
• Loss of integrity: Database integrity refers to the requirement that information be protected from improper modification. Modification of data includes creation, insertion, modification, changing the status of data, and deletion. Integrity is lost if unauthorized changes are made to the data by either intentional or accidental acts. If the loss of system or data integrity is not corrected, continued use of the contaminated system or corrupted data could result in inaccuracy, fraud, or erroneous decisions.
• Loss of availability: Database availability refers to making objects available to a human user or a program to which they have a legitimate right.
• Loss of confidentiality: Database confidentiality refers to the protection of data from unauthorized disclosure. The impact of unauthorized disclosure of confidential information can range from violation of the Data Privacy Act to the jeopardization of national security. Unauthorized, unanticipated, or unintentional disclosure could result in loss of public confidence, embarrassment, or legal action against the organization.
To protect databases against these types of threats, four kinds of countermeasures can be implemented: access control, inference control, flow control, and encryption. We discuss each of these in this chapter.
In a multiuser database system, the DBMS must provide techniques to enable certain users or user groups to access selected portions of a database without gaining access to the rest of the database. This is particularly important when a large integrated database is to be used by many different users within the same organization. For example, sensitive information such as employee salaries or performance reviews should be kept confidential from most of the database system's users. A DBMS typically includes a database security and authorization subsystem that is responsible for ensuring the security of portions of a database against unauthorized access. It is now customary to refer to two types of database security mechanisms:
• Discretionary security mechanisms: These are used to grant privileges to users, including the capability to access specific data files, records, or fields in a specified mode (such as read, insert, delete, or update).
• Mandatory security mechanisms: These are used to enforce multilevel security by classifying the data and users into various security classes (or levels) and then implementing the appropriate security policy of the organization. For example, a typical security policy is to permit users at a certain classification level to see only the data items classified at the user's own (or lower) classification level. An extension of this is role-based security, which enforces policies and privileges based on the concept of roles.
We discuss discretionary security in Section 23.2 and mandatory and role-based security in Section 23.3.
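The typical mandatory policy just described (a user sees only data classified at the user's own level or lower) can be sketched as a simple comparison. This is a minimal illustration, assuming the four levels named earlier in this section are totally ordered; the names Level, can_read, and visible_items, and the sample data items, are our own.

```python
from enum import IntEnum

class Level(IntEnum):
    # Ordering assumed from lowest to highest classification.
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3

def can_read(user_level, data_level):
    """A user may see data classified at the user's own level or lower."""
    return data_level <= user_level

def visible_items(user_level, items):
    """Filter (name, level) pairs down to what the user is cleared to see."""
    return [name for name, level in items if can_read(user_level, level)]

# Hypothetical data items with their classifications.
ITEMS = [
    ("cafeteria_menu", Level.UNCLASSIFIED),
    ("salary_table", Level.CONFIDENTIAL),
    ("merger_plan", Level.SECRET),
    ("launch_codes", Level.TOP_SECRET),
]

seen_by_secret_user = visible_items(Level.SECRET, ITEMS)
```

A user cleared at the secret level sees the first three items and not the fourth.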
A second security problem common to all computer systems is that of preventing unauthorized persons from accessing the system itself, either to obtain information or to make malicious changes in a portion of the database. The security mechanism of a DBMS must include provisions for restricting access to the database system as a whole. This function is called access control and is handled by creating user accounts and passwords to control the login process by the DBMS. We discuss access control techniques in Section 23.1.3.
A third security problem associated with databases is that of controlling the access to a statistical database, which is used to provide statistical information or summaries of values based on various criteria. For example, a database for population statistics may provide statistics based on age groups, income levels, size of household, education levels, and other criteria. Statistical database users such as government statisticians or market research firms are allowed to access the database to retrieve statistical information about a population but not to access the detailed confidential information on specific individuals. Security for statistical databases must ensure that information on individuals cannot be accessed. It is sometimes possible to deduce or infer certain facts concerning individuals from queries that involve only summary statistics on groups; consequently, this must not be permitted either. This problem, called statistical database security, is discussed briefly in Section 23.4. The corresponding countermeasures are called inference control measures.
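To see how summary statistics alone can reveal an individual value, consider the following sketch. It shows one simple way such an inference can arise (subtracting two overlapping sums) and is not meant as a catalog of techniques. The data, the function total_salary, and the assumption that the questioner knows exactly one person in department 5 is aged 60 or over are all hypothetical.

```python
# Hypothetical confidential records; users never see these rows directly.
PEOPLE = [
    {"age": 34, "dept": 5, "salary": 30000},
    {"age": 41, "dept": 5, "salary": 40000},
    {"age": 62, "dept": 5, "salary": 55000},
    {"age": 29, "dept": 4, "salary": 25000},
]

def total_salary(predicate):
    """Statistical query interface: returns only an aggregate, never a row."""
    return sum(p["salary"] for p in PEOPLE if predicate(p))

# Two legitimate-looking summary queries over groups ...
everyone_in_dept_5 = total_salary(lambda p: p["dept"] == 5)
under_60_in_dept_5 = total_salary(lambda p: p["dept"] == 5 and p["age"] < 60)

# ... whose difference is the salary of the single person aged 60 or over.
inferred_salary = everyone_in_dept_5 - under_60_in_dept_5
```

Neither query returns an individual record, yet their difference is exactly one person's salary.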
Another security issue is that of flow control, which prevents information from flowing in such a way that it reaches unauthorized users. It is discussed in Section 23.5. Channels that are pathways for information to flow implicitly in ways that violate the security policy of an organization are called covert channels. We briefly discuss some issues related to covert channels in Section 23.5.1.
A final security issue is data encryption, which is used to protect sensitive data (such as credit card numbers) that is being transmitted via some type of communications network. Encryption can be used to provide additional protection for sensitive portions of a database as well. The data is encoded using some coding algorithm. An unauthorized user who accesses encoded data will have difficulty deciphering it, but authorized users are given decoding or decrypting algorithms (or keys) to decipher the data. Encrypting techniques that are very difficult to decode without a key have been developed for military applications. Section 23.6 briefly discusses encryption techniques, including popular techniques such as public key encryption, which is heavily used to support Web-based transactions against databases, and digital signatures, which are used in personal communications.
A complete discussion of security in computer systems and databases is outside the scope of this textbook. We give only a brief overview of database security techniques here. The interested reader can refer to several of the references discussed in the selected bibliography at the end of this chapter for a more comprehensive discussion.
23.1.2 Database Security and the DBA
As we discussed in Chapter 1, the database administrator (DBA) is the central authority for managing a database system. The DBA's responsibilities include granting privileges to users who need to use the system and classifying users and data in accordance with the policy of the organization. The DBA has a DBA account in the DBMS, sometimes called a system or superuser account, which provides powerful capabilities that are not made available to regular database accounts and users.¹ DBA-privileged commands include commands for granting and revoking privileges to individual accounts, users, or user groups and for performing the following types of actions:
1. Account creation: This action creates a new account and password for a user or a group of users to enable access to the DBMS.
2. Privilege granting: This action permits the DBA to grant certain privileges to certain accounts.
3. Privilege revocation: This action permits the DBA to revoke (cancel) certain privileges that were previously given to certain accounts.
4. Security level assignment: This action consists of assigning user accounts to the appropriate security classification level.
The DBA is responsible for the overall security of the database system. Action 1 in the preceding list is used to control access to the DBMS as a whole, whereas actions 2 and 3 are used to control discretionary database authorization, and action 4 is used to control mandatory authorization.
23.1.3 Access Protection, User Accounts, and Database Audits
Whenever a person or a group of persons needs to access a database system, the individual or group must first apply for a user account. The DBA will then create a new account number and password for the user if there is a legitimate need to access the database. The user must log in to the DBMS by entering the account number and password whenever database access is needed. The DBMS checks that the account number and password are valid; if they are, the user is permitted to use the DBMS and to access the database. Application programs can also be considered as users and can be required to supply passwords.
1. This account is similar to the root or superuser accounts that are given to computer system administrators, allowing access to restricted operating system commands.
It is straightforward to keep track of database users and their accounts and passwords by creating an encrypted table or file with the two fields AccountNumber and Password. This table can easily be maintained by the DBMS. Whenever a new account is created, a new record is inserted into the table. When an account is canceled, the corresponding record must be deleted from the table.
The database system must also keep track of all operations on the database that are applied by a certain user throughout each login session, which consists of the sequence of database interactions that a user performs from the time of logging in to the time of logging off. When a user logs in, the DBMS can record the user's account number and associate it with the terminal from which the user logged in. All operations applied from that terminal are attributed to the user's account until the user logs off. It is particularly important to keep track of update operations that are applied to the database so that, if the database is tampered with, the DBA can find out which user did the tampering.
To keep a record of all updates applied to the database and of the particular user who applied each update, we can modify the system log. Recall from Chapters 17 and 19 that the system log includes an entry for each operation applied to the database that may be required for recovery from a transaction failure or system crash. We can expand the log entries so that they also include the account number of the user and the online terminal ID that applied each operation recorded in the log. If any tampering with the database is suspected, a database audit is performed, which consists of reviewing the log to examine all accesses and operations applied to the database during a certain time period. When an illegal or unauthorized operation is found, the DBA can determine the account number used to perform this operation. Database audits are particularly important for sensitive databases that are updated by many transactions and users, such as a banking database that is updated by many bank tellers. A database log that is used mainly for security purposes is sometimes called an audit trail.
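A database audit over such an expanded log can be sketched as a filter on the log entries. The following is only an illustration: the record layout LogEntry, the functions audit and accounts_responsible, the use of integer timestamps, and the sample entries are all assumptions made for this example, not the format of any particular DBMS log.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LogEntry:
    timestamp: int   # assumed: a simple integer clock
    account: str     # account number of the user who applied the operation
    terminal: str    # terminal ID from which the operation was applied
    operation: str   # e.g. "SELECT" or "UPDATE"
    relation: str    # relation the operation touched

def audit(log, start, end, relation=None):
    """Return every logged operation in [start, end], optionally for one relation."""
    return [
        e for e in log
        if start <= e.timestamp <= end
        and (relation is None or e.relation == relation)
    ]

def accounts_responsible(entries, operation="UPDATE"):
    """Account numbers that applied the given kind of operation."""
    return sorted({e.account for e in entries if e.operation == operation})

# Hypothetical audit trail.
LOG = [
    LogEntry(100, "A7", "T1", "SELECT", "LOANS"),
    LogEntry(105, "A9", "T2", "UPDATE", "LOANS"),
    LogEntry(130, "A7", "T1", "UPDATE", "LOANS"),
    LogEntry(220, "A9", "T2", "UPDATE", "LOANS"),
]

suspects = accounts_responsible(audit(LOG, 101, 110, relation="LOANS"))
```

If tampering is suspected between times 101 and 110, the audit narrows the updates in that window down to a single account.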
23.2 DISCRETIONARY ACCESS CONTROL BASED ON GRANTING AND REVOKING PRIVILEGES
The typical method of enforcing discretionary access control in a database system is based on the granting and revoking of privileges. Let us consider privileges in the context of a relational DBMS. In particular, we will discuss a system of privileges somewhat similar to the one originally developed for the SQL language (see Chapter 8). Many current relational DBMSs use some variation of this technique. The main idea is to include statements in the query language that allow the DBA and selected users to grant and revoke privileges.
23.2.1 Types of Discretionary Privileges
In SQL2, the concept of an authorization identifier is used to refer, roughly speaking, to a user account (or group of user accounts). For simplicity, we will use the words user or account interchangeably in place of authorization identifier. The DBMS must provide selective access to each relation in the database based on specific accounts. Operations may also be controlled; thus, having an account does not necessarily entitle the account holder to all the functionality provided by the DBMS. Informally, there are two levels for assigning privileges to use the database system:
• The account level: At this level, the DBA specifies the particular privileges that each account holds independently of the relations in the database.
• The relation (or table) level: At this level, the DBA can control the privilege to access each individual relation or view in the database.
The privileges at the account level apply to the capabilities provided to the account itself and can include the CREATE SCHEMA or CREATE TABLE privilege, to create a schema or base relation; the CREATE VIEW privilege; the ALTER privilege, to apply schema changes such as adding or removing attributes from relations; the DROP privilege, to delete relations or views; the MODIFY privilege, to insert, delete, or update tuples; and the SELECT privilege, to retrieve information from the database by using a SELECT query. Notice that these account privileges apply to the account in general. If a certain account does not have the CREATE TABLE privilege, no relations can be created from that account. Account-level privileges are not defined as part of SQL2; they are left to the DBMS implementers to define. In earlier versions of SQL, a CREATETAB privilege existed to give an account the privilege to create tables (relations).
The second level of privileges applies to the relation level, whether they are base relations or virtual (view) relations. These privileges are defined for SQL2. In the following discussion, the term relation may refer either to a base relation or to a view, unless we explicitly specify one or the other. Privileges at the relation level specify for each user the individual relations on which each type of command can be applied. Some privileges also refer to individual columns (attributes) of relations. SQL2 commands provide privileges at the relation and attribute level only. Although this is quite general, it makes it difficult to create accounts with limited privileges.
The granting and revoking of privileges generally follow an authorization model for discretionary privileges known as the access matrix model, where the rows of a matrix M represent subjects (users, accounts, programs) and the columns represent objects (relations, records, columns, views, operations). Each position M(i, j) in the matrix represents the types of privileges (read, write, update) that subject i holds on object j.
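The access matrix can be sketched as a sparse mapping from (subject, object) pairs to sets of privileges. This is a minimal illustration of the model only; the class AccessMatrix, its method names, and the sample subjects and objects are our own assumptions.

```python
class AccessMatrix:
    """Sparse matrix M: entry (subject, object) holds a set of privilege names."""

    def __init__(self):
        self._entries = {}

    def grant(self, subject, obj, privilege):
        # Add the privilege to M(subject, obj), creating the entry if needed.
        self._entries.setdefault((subject, obj), set()).add(privilege)

    def revoke(self, subject, obj, privilege):
        # Remove the privilege if present; absent entries are left alone.
        self._entries.get((subject, obj), set()).discard(privilege)

    def allowed(self, subject, obj, privilege):
        return privilege in self._entries.get((subject, obj), set())

# Hypothetical subjects and objects.
M = AccessMatrix()
M.grant("A2", "EMPLOYEE", "read")
M.grant("A2", "EMPLOYEE", "update")
M.grant("A3", "DEPARTMENT", "read")
M.revoke("A2", "EMPLOYEE", "update")
```

Only the nonempty positions of M are stored; every other position is treated as the empty set of privileges.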
To control the granting and revoking of relation privileges, each relation R in a database is assigned an owner account, which is typically the account that was used when the relation was created in the first place. The owner of a relation is given all privileges on that relation. In SQL2, the DBA can assign an owner to a whole schema by creating the schema and associating the appropriate authorization identifier with that schema, using the CREATE SCHEMA command (see Section 8.1.1). The owner account holder can pass privileges on any of the owned relations to other users by granting privileges to their accounts. In SQL the following types of privileges can be granted on each individual relation R:
• SELECT (retrieval or read) privilege on R: Gives the account retrieval privilege. In SQL this gives the account the privilege to use the SELECT statement to retrieve tuples from R.
• MODIFY privileges on R: This gives the account the capability to modify tuples of R. In SQL this privilege is further divided into UPDATE, DELETE, and INSERT privileges to apply the corresponding SQL command to R. In addition, both the INSERT and UPDATE privileges can specify that only certain attributes of R can be updated by the account.
• REFERENCES privilege on R: This gives the account the capability to reference relation R when specifying integrity constraints. This privilege can also be restricted to specific attributes of R.
Notice that to create a view, the account must have SELECT privilege on all relations involved in the view definition.
23.2.2 Specifying Privileges Using Views
The mechanism of views is an important discretionary authorization mechanism in its own right. For example, if the owner A of a relation R wants another account B to be able to retrieve only some fields of R, then A can create a view V of R that includes only those attributes and then grant SELECT on V to B. The same applies to limiting B to retrieving only certain tuples of R; a view V' can be created by defining the view by means of a query that selects only those tuples from R that A wants to allow B to access. We shall illustrate this discussion with the example given in Section 23.2.5.
23.2.3 Revoking Privileges
In some cases it is desirable to grant a privilege to a user temporarily. For example, the owner of a relation may want to grant the SELECT privilege to a user for a specific task and then revoke that privilege once the task is completed. Hence, a mechanism for revoking privileges is needed. In SQL a REVOKE command is included for the purpose of canceling privileges. We will see how the REVOKE command is used in the example in Section 23.2.5.
23.2.4 Propagation of Privileges Using the GRANT OPTION
Whenever the owner A of a relation R grants a privilege on R to another account B, the privilege can be given to B with or without the GRANT OPTION. If the GRANT OPTION is given, this means that B can also grant that privilege on R to other accounts. Suppose that B is given the GRANT OPTION by A and that B then grants the privilege on R to a third account C, also with GRANT OPTION. In this way, privileges on R can propagate to other accounts without the knowledge of the owner of R. If the owner account A now revokes the privilege granted to B, all the privileges that B propagated based on that privilege should automatically be revoked by the system.
It is possible for a user to receive a certain privilege from two or more sources. For example, A4 may receive a certain UPDATE R privilege from both A2 and A3. In such a case, if A2 revokes this privilege from A4, A4 will still continue to have the privilege by virtue of having been granted it from A3. If A3 later revokes the privilege from A4, A4 totally loses the privilege. Hence, a DBMS that allows propagation of privileges must keep track of how all the privileges were granted so that revoking of privileges can be done correctly and completely.
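One way to keep track of how privileges were granted is to record every grant as an edge from grantor to grantee and, on each revoke, to discard all grants whose grantor can no longer be traced back to the owner. The sketch below illustrates this bookkeeping for a single privilege on a single relation. It is an assumption-laden illustration, not the mechanism of any particular DBMS; the class PrivilegeGraph and its methods are hypothetical names.

```python
class PrivilegeGraph:
    """Tracks grants of ONE privilege on ONE relation as (grantor, grantee, option) edges."""

    def __init__(self, owner):
        self.owner = owner
        self.grants = set()

    def _can_grant(self):
        """Accounts reachable from the owner through grants made WITH GRANT OPTION."""
        reach = {self.owner}
        changed = True
        while changed:
            changed = False
            for grantor, grantee, option in self.grants:
                if option and grantor in reach and grantee not in reach:
                    reach.add(grantee)
                    changed = True
        return reach

    def grant(self, grantor, grantee, with_grant_option=False):
        if grantor not in self._can_grant():
            raise PermissionError(f"{grantor} may not grant this privilege")
        self.grants.add((grantor, grantee, with_grant_option))

    def revoke(self, grantor, grantee):
        # Drop the direct grant(s) from grantor to grantee ...
        self.grants = {g for g in self.grants if (g[0], g[1]) != (grantor, grantee)}
        # ... then cascade: drop every grant whose grantor lost the ability to grant.
        reach = self._can_grant()
        self.grants = {g for g in self.grants if g[0] in reach}

    def holds(self, account):
        return account == self.owner or any(g[1] == account for g in self.grants)

# Cascading revocation: A1 -> A3 (with option) -> A4.
chain = PrivilegeGraph(owner="A1")
chain.grant("A1", "A3", with_grant_option=True)
chain.grant("A3", "A4")
chain.revoke("A1", "A3")
chain_result = (chain.holds("A3"), chain.holds("A4"))

# Two sources: A4 receives the privilege from both A2 and A3.
multi = PrivilegeGraph(owner="A1")
multi.grant("A1", "A2", with_grant_option=True)
multi.grant("A1", "A3", with_grant_option=True)
multi.grant("A2", "A4")
multi.grant("A3", "A4")
multi.revoke("A2", "A4")
still_held = multi.holds("A4")
multi.revoke("A3", "A4")
finally_held = multi.holds("A4")
```

In the first scenario both A3 and A4 lose the privilege; in the second, A4 keeps it after the first revoke and loses it only after the second.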
23.2.5 An Example
Suppose that the DBA creates four accounts (A1, A2, A3, and A4) and wants only A1 to be able to create base relations; then the DBA must issue the following GRANT command in SQL:
GRANT CREATETAB TO A1;
The CREATETAB (create table) privilege gives account A1 the capability to create new database tables (base relations) and is hence an account privilege. This privilege was part of earlier versions of SQL but is now left to each individual system implementation to define. In SQL2, the same effect can be accomplished by having the DBA issue a CREATE SCHEMA command, as follows:
CREATE SCHEMA EXAMPLE AUTHORIZATION A1;
Now user account A1 can create tables under the schema called EXAMPLE. To continue our example, suppose that A1 creates the two base relations EMPLOYEE and DEPARTMENT shown in Figure 23.1; A1 is then the owner of these two relations and hence has all the relation privileges on each of them.
Next, suppose that account A1 wants to grant to account A2 the privilege to insert and delete tuples in both of these relations. However, A1 does not want A2 to be able to propagate these privileges to additional accounts. A1 can issue the following command:
GRANT INSERT, DELETE ON EMPLOYEE, DEPARTMENT TO A2;
EMPLOYEE
NAME | SSN | BDATE | ADDRESS | SEX | SALARY | DNO
DEPARTMENT
DNUMBER | DNAME | MGRSSN
FIGURE 23.1 Schemas for the two relations EMPLOYEE and DEPARTMENT.
Notice that the owner account A1 of a relation automatically has the GRANT OPTION, allowing it to grant privileges on the relation to other accounts. However, account A2 cannot grant INSERT and DELETE privileges on the EMPLOYEE and DEPARTMENT tables, because A2 was not given the GRANT OPTION in the preceding command.
Next, suppose that A1 wants to allow account A3 to retrieve information from either of the two tables and also to be able to propagate the SELECT privilege to other accounts. A1 can issue the following command:
GRANT SELECT ON EMPLOYEE, DEPARTMENT TO A3 WITH GRANT OPTION;
The clause WITH GRANT OPTION means that A3 can now propagate the privilege to other accounts by using GRANT. For example, A3 can grant the SELECT privilege on the EMPLOYEE relation to A4 by issuing the following command:
GRANT SELECT ON EMPLOYEE TO A4;
Notice that A4 cannot propagate the SELECT privilege to other accounts because the GRANT OPTION was not given to A4.
Now suppose that A1 decides to revoke the SELECT privilege on the EMPLOYEE relation from A3; A1 then can issue this command:
REVOKE SELECT ON EMPLOYEE FROM A3;
The DBMS must now automatically revoke the SELECT privilege on EMPLOYEE from A4, too, because A3 granted that privilege to A4 and A3 does not have the privilege any more.
Next, suppose that A1 wants to give back to A3 a limited capability to SELECT from the EMPLOYEE relation and wants to allow A3 to be able to propagate the privilege. The limitation is to retrieve only the NAME, BDATE, and ADDRESS attributes and only for the tuples with DNO = 5. A1 then can create the following view:
CREATE VIEW A3EMPLOYEE AS
SELECT NAME, BDATE, ADDRESS
FROM EMPLOYEE
WHERE DNO = 5;
After the view is created, A1 can grant SELECT on the view A3EMPLOYEE to A3 as follows:
GRANT SELECT ON A3EMPLOYEE TO A3 WITH GRANT OPTION;
Finally, suppose that A1 wants to allow A4 to update only the SALARY attribute of EMPLOYEE; A1 can then issue the following command:
GRANT UPDATE ON EMPLOYEE (SALARY) TO A4;
The UPDATE or INSERT privilege can specify particular attributes that may be updated or inserted in a relation. Other privileges (SELECT, DELETE) are not attribute specific, because this specificity can easily be controlled by creating the appropriate views that include only the desired attributes and granting the corresponding privileges on the views. However, because updating views is not always possible (see Chapter 9), the UPDATE and INSERT privileges are given the option to specify particular attributes of a base relation that may be updated.
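The restricting effect of the A3EMPLOYEE view can be tried out with a small script. The sketch below uses Python's built-in sqlite3 module purely for convenience. SQLite has no GRANT or REVOKE statements, so only the filtering performed by the view is shown, not the authorization step. The EMPLOYEE table here is cut down to the attributes named in the example, and the three tuples are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced EMPLOYEE schema: only the attributes used in the example.
    CREATE TABLE EMPLOYEE (
        NAME TEXT, BDATE TEXT, ADDRESS TEXT, SALARY INTEGER, DNO INTEGER
    );
    -- Hypothetical tuples.
    INSERT INTO EMPLOYEE VALUES
        ('Smith',  '1965-01-09', 'Houston', 30000, 5),
        ('Wong',   '1955-12-08', 'Houston', 40000, 5),
        ('Zelaya', '1968-07-19', 'Spring',  25000, 4);
    -- The view from the example: three attributes, department 5 only.
    CREATE VIEW A3EMPLOYEE AS
        SELECT NAME, BDATE, ADDRESS
        FROM EMPLOYEE
        WHERE DNO = 5;
""")

cursor = conn.execute("SELECT * FROM A3EMPLOYEE ORDER BY NAME")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
conn.close()
```

An account that can query only A3EMPLOYEE sees neither the SALARY and DNO attributes nor any tuple outside department 5.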
23.2.6 Specifying Limits on Propagation of Privileges
Techniques to limit the propagation of privileges have been developed, although they have not yet been implemented in most DBMSs and are not a part of SQL. Limiting horizontal propagation to an integer number i means that an account B given the GRANT OPTION can grant the privilege to at most i other accounts. Vertical propagation is more complicated; it limits the depth of the granting of privileges. Granting a privilege with a vertical propagation of zero is equivalent to granting the privilege with no GRANT OPTION. If account A grants a privilege to account B with the vertical propagation set to an integer number j > 0, this means that the account B has the GRANT OPTION on that privilege, but B can grant the privilege to other accounts only with a vertical propagation less than j. In effect, vertical propagation limits the sequence of GRANT OPTIONs that can be given from one account to the next based on a single original grant of the privilege.
We now briefly illustrate horizontal and vertical propagation limits (which are not available currently in SQL or other relational systems) with an example. Suppose that A1 grants SELECT to A2 on the EMPLOYEE relation with horizontal propagation equal to 1 and vertical propagation equal to 2. A2 can then grant SELECT to at most one account because the horizontal propagation limitation is set to 1. In addition, A2 cannot grant the privilege to another account except with vertical propagation set to 0 (no GRANT OPTION) or 1; this is because A2 must reduce the vertical propagation by at least 1 when passing the privilege to others. As this example shows, horizontal and vertical propagation techniques are designed to limit the propagation of privileges.
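The two limits can be sketched as checks performed at grant time. The following is a hypothetical illustration only, since no such facility exists in SQL. It assumes that the owner is unlimited and that a grantor may choose any horizontal limit for the grantee; the text does not specify either point. The names LimitedPrivilege, Holding, and attempt are our own.

```python
import math
from dataclasses import dataclass

@dataclass
class Holding:
    horizontal: float      # how many accounts this holder may grant to
    vertical: float        # remaining depth; 0 means no GRANT OPTION
    grants_made: int = 0

class LimitedPrivilege:
    """One privilege on one relation, with propagation limits per holder."""

    def __init__(self, owner):
        # Assumption: the owner is not subject to any limit.
        self.holdings = {owner: Holding(math.inf, math.inf)}

    def grant(self, grantor, grantee, horizontal, vertical):
        held = self.holdings.get(grantor)
        if held is None:
            raise PermissionError(f"{grantor} does not hold the privilege")
        if held.vertical == 0:
            raise PermissionError(f"{grantor} has no GRANT OPTION")
        if held.grants_made >= held.horizontal:
            raise PermissionError(f"{grantor} reached its horizontal limit")
        if vertical >= held.vertical:
            raise PermissionError("vertical limit must be less than the grantor's")
        held.grants_made += 1
        self.holdings[grantee] = Holding(horizontal, vertical)

def attempt(privilege, *args):
    """Return True if the grant succeeds, False if it is refused."""
    try:
        privilege.grant(*args)
        return True
    except PermissionError:
        return False

# The example from the text: A1 grants to A2 with horizontal 1 and vertical 2.
select_on_employee = LimitedPrivilege(owner="A1")
select_on_employee.grant("A1", "A2", horizontal=1, vertical=2)

outcomes = [
    attempt(select_on_employee, "A2", "A3", 1, 2),  # refused: vertical not reduced
    attempt(select_on_employee, "A2", "A3", 1, 1),  # allowed: vertical reduced to 1
    attempt(select_on_employee, "A2", "A4", 0, 0),  # refused: horizontal limit of 1 used up
    attempt(select_on_employee, "A3", "A4", 0, 0),  # allowed: no GRANT OPTION passed on
    attempt(select_on_employee, "A4", "A5", 0, 0),  # refused: A4 has vertical 0
]
```

The sequence of accepted and refused grants follows the rules stated above: A2 must lower the vertical limit, may grant to only one account, and the chain ends at an account holding vertical propagation 0.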
23.3 MANDATORY ACCESS CONTROL AND ROLE-BASED ACCESS CONTROL FOR MULTILEVEL SECURITY²
The discretionary access control technique of granting and revoking privileges on relations has traditionally been the main security mechanism for relational database systems. This is an all-or-nothing method: A user either has or does not have a certain privilege. In many applications, an additional security policy is needed that classifies data and users based on security classes. This approach, known as mandatory access control, would typically be combined with the discretionary access control mechanisms described in Section 23.2. It is important to note that most commercial DBMSs currently provide mechanisms only for discretionary access control. However, the need for multilevel security exists in
2. The contribution of Fariborz Farahmand to this and subsequent sections is appreciated.
