The Language of SQL- P44 ppt

Students table:
StudentID  Student
1          Amy
2          Jon
3          Beth
4          Karen
5          Alex
Teachers table:
TeacherID  Teacher  Assistant
1          Smith    Collins
2          Jones    Brown
3          Kaplan   NULL
4          Harris   Taylor
Tests table:
TestID  TeacherID  Test          Date        TotalPoints
1       1          Pronoun Quiz  2009-03-02  10
2       2          Pronoun Quiz  2009-03-02  10
3       3          Solids Quiz   2009-03-03  20
4       4          China Test    2009-03-04  50
5       1          Grammar Test  2009-03-05  100
Formats table:
TestID  TestFormat
1       Multiple Choice
2       Multiple Choice
3       Multiple Choice
4       Essay
5       Multiple Choice
5       Essay
Grades table:
StudentID  TestID  Grade
1          1       8
2          2       6
3          3       17
4          4       45
5          4       38
4          5       88
Your first impression might be that we have unnecessarily complicated the
situation, rather than improved it. For example, the Grades table is now a mass of
numbers, the meaning of which is not completely obvious on quick inspection.
This is true. However, remembering the ability of SQL to join tables together
easily, you can also see that there is now much greater flexibility in this new
design. Not only are we free to join together only those tables needed for any
particular analysis, but we can now add new columns to these tables much more
readily, without affecting anything else.
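As a minimal sketch of how a join turns the Grades numbers back into readable rows, the example below loads the sample tables into an in-memory SQLite database through Python's sqlite3 module and joins them. The choice of SQLite and Python, the column types, and the reading of each Grades row as a StudentID, TestID, Grade triple are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database holding the normalized sample tables.
# Column types are assumptions; the text gives only names and values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, Student TEXT);
CREATE TABLE Teachers (TeacherID INTEGER PRIMARY KEY, Teacher TEXT, Assistant TEXT);
CREATE TABLE Tests (TestID INTEGER PRIMARY KEY, TeacherID INTEGER,
                    Test TEXT, Date TEXT, TotalPoints INTEGER);
CREATE TABLE Grades (StudentID INTEGER, TestID INTEGER, Grade INTEGER);

INSERT INTO Students VALUES (1,'Amy'), (2,'Jon'), (3,'Beth'), (4,'Karen'), (5,'Alex');
INSERT INTO Teachers VALUES (1,'Smith','Collins'), (2,'Jones','Brown'),
                            (3,'Kaplan',NULL), (4,'Harris','Taylor');
INSERT INTO Tests VALUES (1,1,'Pronoun Quiz','2009-03-02',10),
                         (2,2,'Pronoun Quiz','2009-03-02',10),
                         (3,3,'Solids Quiz','2009-03-03',20),
                         (4,4,'China Test','2009-03-04',50),
                         (5,1,'Grammar Test','2009-03-05',100);
INSERT INTO Grades VALUES (1,1,8), (2,2,6), (3,3,17), (4,4,45), (5,4,38), (4,5,88);
""")

# Join only the tables needed to read each grade in plain terms.
rows = conn.execute("""
SELECT s.Student, t.Test, te.Teacher, g.Grade, t.TotalPoints
FROM Grades AS g
INNER JOIN Students AS s ON s.StudentID = g.StudentID
INNER JOIN Tests AS t ON t.TestID = g.TestID
INNER JOIN Teachers AS te ON te.TeacherID = t.TeacherID
ORDER BY g.StudentID, g.TestID
""").fetchall()

for student, test, teacher, grade, total in rows:
    print(f"{student} scored {grade}/{total} on {test} ({teacher})")
```

An analysis that does not need teacher names can simply leave the Teachers join out, which is the flexibility the paragraph above describes.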
Our information has become more modularized. For example, if we should
decide that we want to capture additional information about each student, such
as address and phone, we can simply add new columns to the Students table.
Additionally, when we want to modify a student’s address or phone later, it only
affects one row in the table.
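The following sketch shows what adding such columns might look like, again using SQLite through Python's sqlite3 module as an assumed environment. The column names Address and Phone and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, Student TEXT);
INSERT INTO Students VALUES (1,'Amy'), (2,'Jon'), (3,'Beth'), (4,'Karen'), (5,'Alex');
""")

# Add the new attributes to Students only; no other table is touched.
# Address and Phone are hypothetical column names.
conn.execute("ALTER TABLE Students ADD COLUMN Address TEXT")
conn.execute("ALTER TABLE Students ADD COLUMN Phone TEXT")

# Changing one student's details updates exactly one row.
# The address and phone values are made up for illustration.
cursor = conn.execute(
    "UPDATE Students SET Address = ?, Phone = ? WHERE StudentID = ?",
    ("12 Example Street", "555-0100", 1),
)
changed = cursor.rowcount
print("rows changed:", changed)
```

Because the student's details live in a single row, no grade or test row needs to change when an address does.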
The Art of Database Design
Ultimately, designing a database is much more than simply going through the
normalization procedures. Database design is really more of an art than a science,
and it requires asking and thinking about relevant business issues.
In our grades example, we presented one possible database design as an illustration
of how to normalize data. In truth, there are many possibilities for designing
this database. Much depends on the realities of how the data will be accessed and
modified. Numerous questions can be asked to ascertain whether your design is
as flexible and meaningful as it needs to be. For example:
■ Are there other tables that need to be added to our database? One obvious
choice would be a Subjects table, so you could easily select tests by subject,
such as English or Math. If you did this, would you relate the subject to the
test or to the teacher who gave the test?
■ Is it possible for a grade to count in more than one subject? Maybe the
English and Social Studies teachers are doing a combined lesson and want
certain tests to count for both subjects. How do you account for that?
■ What do you do if a child flunks a grade and is now taking the same tests
for a second year? How do you differentiate his grade now from last year’s
grades?
■ How do you allow for special rules that teachers might implement, such
as dropping the lowest quiz score in a particular time period?
■ Are there special analysis requirements for the data? If there is more than
one teacher for the same subject, do you want to be able to compare the
average grades for the students of each teacher, to make sure that one
teacher isn’t unfairly inflating grades?
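To make the last question concrete, here is a rough sketch of a per-teacher comparison against the normalized tables, run in SQLite through Python's sqlite3 module. Measuring the "average grade" as points earned divided by points possible is an assumption, as is the reading of each Grades row as a StudentID, TestID, Grade triple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Teachers (TeacherID INTEGER PRIMARY KEY, Teacher TEXT, Assistant TEXT);
CREATE TABLE Tests (TestID INTEGER PRIMARY KEY, TeacherID INTEGER,
                    Test TEXT, Date TEXT, TotalPoints INTEGER);
CREATE TABLE Grades (StudentID INTEGER, TestID INTEGER, Grade INTEGER);

INSERT INTO Teachers VALUES (1,'Smith','Collins'), (2,'Jones','Brown'),
                            (3,'Kaplan',NULL), (4,'Harris','Taylor');
INSERT INTO Tests VALUES (1,1,'Pronoun Quiz','2009-03-02',10),
                         (2,2,'Pronoun Quiz','2009-03-02',10),
                         (3,3,'Solids Quiz','2009-03-03',20),
                         (4,4,'China Test','2009-03-04',50),
                         (5,1,'Grammar Test','2009-03-05',100);
INSERT INTO Grades VALUES (1,1,8), (2,2,6), (3,3,17), (4,4,45), (5,4,38), (4,5,88);
""")

# Percentage of available points earned by each teacher's students.
# Multiplying by 100.0 forces floating-point division.
rows = conn.execute("""
SELECT te.Teacher,
       SUM(g.Grade) * 100.0 / SUM(t.TotalPoints) AS PercentEarned
FROM Grades AS g
INNER JOIN Tests AS t ON t.TestID = g.TestID
INNER JOIN Teachers AS te ON te.TeacherID = t.TeacherID
GROUP BY te.Teacher
ORDER BY te.Teacher
""").fetchall()

for teacher, percent in rows:
    print(f"{teacher}: {percent:.1f}%")
```

With a Subjects table in place, the same query could group by subject as well, so that teachers are compared only against others teaching the same material.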
The list of possible questions is endless. But the point is that data doesn’t exist in a
vacuum. There is a necessary interaction between data design and requirements in
the real world. Databases need to be designed in such a way as to allow for needed
flexibility. However, there is also a danger that databases can be overly designed to a
point where the data becomes unintelligible. A zealous database administrator may
decide to create 20 tables to allow for every possible situation. That, too, is inadvisable.
Database design is something of a balancing act in search of a design that is
sufficiently flexible but also intuitive and understandable by users of the system.
Alternatives to Normalization
We have emphasized that normalization is the overriding principle that should
be followed in designing a database. In certain situations, however, there are
viable alternatives that might make more sense.
For example, in the realm of data warehouse systems and software, many practitioners
advocate utilizing a star schema design for databases rather than normalization.
In a star schema, a certain amount of redundancy is allowed and
encouraged. The emphasis is on creating a data structure that more intuitively
reflects business realities, and also one that allows for quick processing of data by
special analytical software.
To give a brief overview of star schema designs, the main idea is to create a
central fact table, which is related to any number of dimension tables. The fact
table contains all the quantitative numbers that are additive in nature. In our
prior example, the Grade column is such a number, since we can add up grades
to obtain a meaningful total grade. The dimension tables contain information on
all the entities that are related to the central facts, such as subject, time, teacher,
student, and so on.
Furthermore, special analytical software exists that allows database developers to
create cubes from their star schema databases. These cubes extend analysis capabilities,
allowing users to drill down predefined hierarchies, which are defined
in the various dimensions. A user of such a system would be able to drill down
from viewing a semester’s worth of grades for a student, to his grades in any
individual week.
Figure 19.2 shows what a database with a star schema design might look like for
our grades example.
In this design, the Grades table is the central fact table. The other tables are all
dimension tables.
Figure 19.2. Star schema design.
The first four columns in the Grades table (Date, TestID, StudentID, and
TeacherID) are there only to relate the table to each of the dimensions. The other
two columns have the additive numeric quantities we talked about. Notice that
TotalPoints is now in the Grades table. In our normalized design, it was an
attribute of the Tests table. By putting both the Grade and TotalPoints in the
Grades table, we can use our analytical software to easily sum up grades and
compute average grades (Grade divided by the TotalPoints) for any set of data.
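A minimal sketch of such a fact table follows, using SQLite through Python's sqlite3 module as an assumed environment. The six fact-table columns come from the description above. The rows are derived from the normalized sample data, and the Students dimension is assumed to keep its normalized columns, since the figure itself is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension table; its columns are an assumption carried over from the
-- normalized design.
CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, Student TEXT);

-- Central fact table: four columns relating to dimensions, two additive measures.
CREATE TABLE Grades (Date TEXT, TestID INTEGER, StudentID INTEGER,
                     TeacherID INTEGER, Grade INTEGER, TotalPoints INTEGER);

INSERT INTO Students VALUES (1,'Amy'), (2,'Jon'), (3,'Beth'), (4,'Karen'), (5,'Alex');
INSERT INTO Grades VALUES ('2009-03-02',1,1,1,8,10),
                          ('2009-03-02',2,2,2,6,10),
                          ('2009-03-03',3,3,3,17,20),
                          ('2009-03-04',4,4,4,45,50),
                          ('2009-03-04',4,5,4,38,50),
                          ('2009-03-05',5,4,1,88,100);
""")

# Both measures live in the fact table, so totals and averages need only
# the fact table plus whichever dimension supplies the labels.
rows = conn.execute("""
SELECT s.Student,
       SUM(f.Grade) AS Earned,
       SUM(f.TotalPoints) AS Possible,
       SUM(f.Grade) * 1.0 / SUM(f.TotalPoints) AS Average
FROM Grades AS f
INNER JOIN Students AS s ON s.StudentID = f.StudentID
GROUP BY s.StudentID, s.Student
ORDER BY s.StudentID
""").fetchall()

for student, earned, possible, average in rows:
    print(f"{student}: {earned}/{possible} = {average:.2f}")
```

The trade-off is the redundancy mentioned earlier: TotalPoints is repeated on every grade row for the same test, which a normalized design would avoid.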
Certainly, this is only a brief introduction to the subject of designing databases
for data warehouses. It illustrates the point that there are many different ways to
design a database, and the best way often relates to the type of software that will
be used with the data.
Looking Ahead
This chapter covered the principles of database design. We went over the basics
of the normalization process, showing how a database with a single table can be
converted into a more flexible structure with multiple tables, related by additional
key columns. We also emphasized that database design is not merely a
technical exercise. Attention must be paid to organizational realities and to
considerations as to how the data will be utilized. Finally, we briefly described
one alternative to the conventional normalized design, in an effort to emphasize
that there is often more than one approach to this endeavor.
In our final chapter, “Strategies for Displaying Data,” we’re going to discuss
some interesting possibilities for using reporting software tools to complement
our knowledge of SQL. In our quest to sharpen our SQL skills, we must not
forget that there is a world beyond SQL. We need to make sure that we don’t
expend our efforts in SQL when the underlying objective can be accomplished
more effectively through other means.