SQL PROGRAMMING STYLE- P25 pot


I was amazed to go to a major hospital in Los Angeles in mid-1993
and see the clerk still looking up codes in a dog-eared looseleaf
notebook instead of bringing them up on her terminal screen. The
hospital was still using an old IBM mainframe system, which had dumb
3270 terminals, rather than a client/server system with workstations.
There was not even a help screen available to the clerk.
The translation tables can be downloaded to the workstations in a
client/server system to reduce network traffic. They can also be used to
build picklists on interactive screens and thereby reduce typographical
errors. Changes to the codes are thereby propagated in the system
without anyone having to rewrite application code. If the codes change
over time, the table for a code should include a pair of “date
effective” fields (a start date and an end date). This will allow a data
warehouse to correctly read and translate old data.
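The “date effective” pair can be sketched as follows. This is only an illustration under stated assumptions: the table DiagnosisCodes, its column names, and the sample codes are hypothetical and do not come from the text, and Python's built-in sqlite3 module is used purely as a convenient way to run the SQL.

```python
import sqlite3

# Hypothetical translation table; all names and codes are illustrative only.
# SQLite stores the DATE values as ISO-8601 text, which compares correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DiagnosisCodes
(diag_code CHAR(3) NOT NULL,
 diag_description VARCHAR(40) NOT NULL,
 start_date DATE NOT NULL,
 end_date DATE,  -- NULL means the row is still in effect
 PRIMARY KEY (diag_code, start_date),
 CHECK (end_date IS NULL OR start_date <= end_date));

INSERT INTO DiagnosisCodes
VALUES ('A01', 'Old meaning', '1990-01-01', '1999-12-31');
INSERT INTO DiagnosisCodes
VALUES ('A01', 'New meaning', '2000-01-01', NULL);
""")


def translate(code, as_of_date):
    """Return the description that was in effect for code on as_of_date."""
    row = conn.execute(
        """SELECT diag_description
             FROM DiagnosisCodes
            WHERE diag_code = ?
              AND start_date <= ?
              AND (end_date IS NULL OR ? <= end_date)""",
        (code, as_of_date, as_of_date),
    ).fetchone()
    return row[0] if row else None
```

A warehouse load would call translate with the date on the old record rather than today's date, so a code that has since been reassigned still resolves to the meaning it had when the record was written.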

5.4 Multiple Character Sets

Some DBMS products can support ASCII, EBCDIC, and Unicode. You
need to be aware of this, so you can set proper collations and normalize
your text.
The predicate “<string> IS [NOT] NORMALIZED” in SQL-99
determines whether a Unicode string is in one of four normal forms
(i.e., D, C, KD, and KC). The use of the words “normal form” here is not
the same as in a relational context. In the Unicode model, a single
character can be built from several other characters. Accent marks can
be put on basic Latin letters. Certain combinations of letters can be
displayed as ligatures (ae becomes æ). Some writing systems, such as
Hangul (Korean) and Vietnamese, build glyphs by concatenating symbols
in two dimensions. Some languages have special forms of one letter that
are determined by context, such as the terminal sigma in Greek or
accented u in Czech. In short, writing is more complex than putting one
letter after another.
The Unicode standard defines the order of such constructions in their
normal forms. You can still produce the same results with different
orderings and sometimes with different combinations of symbols, but it
is handy when you are searching such text to know that it is normalized
rather than trying to parse each glyph on the fly. You can find details
about normalization and links to free software at www.unicode.org.
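To see why normalized text is easier to search, here is a small sketch using Python's standard unicodedata module (is_normalized needs Python 3.8 or later). The helper name same_text is hypothetical, and the sample strings are chosen only for illustration; a DBMS that supports the IS NORMALIZED predicate would make the equivalent check inside SQL.

```python
import unicodedata

# Two encodings of the same visible character, e with an acute accent:
composed = "\u00e9"      # one precomposed code point
decomposed = "e\u0301"   # base letter followed by a combining accent


def same_text(s1, s2, form="NFC"):
    """Compare two strings after bringing both to the same normal form."""
    return unicodedata.normalize(form, s1) == unicodedata.normalize(form, s2)


# A raw comparison treats the two spellings as different strings,
# while the normalized comparison treats them as equal.
raw_equal = composed == decomposed
normalized_equal = same_text(composed, decomposed)

# The compatibility forms (KD and KC) also fold ligature code points:
# U+FB01 LATIN SMALL LIGATURE FI becomes the two letters "fi".
folded = unicodedata.normalize("NFKC", "\ufb01")

# Rough analogue of the SQL predicate <string> IS NORMALIZED.
already_nfc = unicodedata.is_normalized("NFC", composed)
```

Normalizing once on the way into the database lets later searches use plain equality instead of repeating this work for every comparison.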

CHAPTER 6: Coding Choices

“Caesar: Pardon him, Theodotus. He is a barbarian and thinks the customs
of his tribe and island are the laws of nature.”
-- Caesar and Cleopatra, by George Bernard Shaw, 1898

This chapter deals with writing good DML statements in Standard SQL.
That means they are portable and can be optimized well by most SQL
dialects. I define portable to mean one of several things. The code is
standard and can be run as-is on other SQL dialects; standard implies
portable. Or the code can be converted to another SQL dialect in a
simple mechanical fashion, or the feature used is so universal that
all or most products have it in some form; portable does not imply
standard. You can get some help with this concept from the X/Open
SQL Portability Guides.

A major problem in becoming a SQL programmer is that people do
not unlearn the procedural or OO programming they had to learn for
their first languages. They do not learn how to think in terms of sets
and predicates, and so they mimic the solutions they know in their
first programming languages. Jerry Weinberg (1978) observed this fact
more than 25 years ago in his classic book, Psychology of Computer
Programming. He was teaching PL/I. For younger readers, PL/I was a
language from IBM that was a hybrid of FORTRAN, COBOL, and ALGOL and
was briefly a popular craze.

Weinberg found that he could tell the first programming languages of
the students by how they wrote PL/I. My personal experience (1989) was
that I could guess the nationality of the students in my C and Pascal
programming classes by how their native spoken language showed through
in their programs.
Another problem in becoming a SQL programmer is that people tend
to become SQL dialect programmers and think that their particular
product’s SQL is some kind of standard. In 2004, I had a job interview
for a position where I was being asked to evaluate different platforms for
a major size increase in the company’s databases. The interviewer kept
asking me “general SQL” questions based on the storage architecture of
the only product he knew.
His product is not intended for Very Large Database (VLDB)
applications, and he had no knowledge of Nucleus, Teradata, Model
204, or other products that compete in the VLDB arena. He had spent
his career tuning one version of one product and could not make the
jump to anything different, even conceptually. His career is about to
become endangered.
There is a place for the specialist dialect programmer, but dialect
programming should be a last resort in special circumstances and never
the first attempt. Think of it as cancer surgery: You do massive surgery
when there is a bad tumor that is not treatable by other means; you do
not start with it when the patient came in with acne.

6.1 Pick Standard Constructions over Proprietary Constructions

There is a fact of life in the IT industry called the Code Museum Effect,
which works like this: First, each vendor adds a feature to its product.
The feature is deemed useful, so it gets into the next version of the
standard with slightly different syntax or semantics, but the vendor is
stuck with its proprietary syntax. Its users have written code based on it,
and they do not want to redo it. The solutions are the following:
1. Never implement the standard and just retain the old syntax. The
   problem is that you cannot pass a conformance test, which can
   be required for government and industry contracts. SQL
   programmers who know the standard from other products cannot
   read, write, or maintain your code easily. In short, you have the
   database equivalent of last year’s cell phone.
2. Implement the standard, but retain the old syntax, too. This is the
   usual solution for a few releases. It gives the users a chance to
   move to the standard syntax but does not break the existing
   applications. Everyone is happy for a while.
3. Implement the standard and deprecate the old syntax. The vendor
   is ready for a major release, which lets it redo major parts of the
   database engine. Changing to the standard syntax and not
   supporting the old syntax at this point is a good way to force
   users to upgrade their software and help pay for that major
   release.
A professional programmer would be converting his or her old code
at step two to avoid being trapped in the Code Museum when step three
rolls around. Let’s be honest: massive code conversions do not happen
until after step three occurs in most shops, and they are a mess, but you
can start to avoid the problems by always writing standard code in a step
two situation.

6.1.1 Use Standard OUTER JOIN Syntax

Rationale:

Here is how the standard OUTER JOINs work in SQL-92. Assume you
are given:

Table1        Table2
a    b        a    c
======        ======
1    w        1    r
2    x        2    s
3    y        3    t
4    z

and the OUTER JOIN expression:

Table1
LEFT OUTER JOIN
Table2
ON Table1.a = Table2.a        <== join condition
   AND Table2.c = 't';        <== single table condition

We call Table1 the “preserved table” and Table2 the “unpreserved
table” in the query. What I am going to give you is a little different but
equivalent to the ANSI/ISO standards.
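Before the walk-through, it may help to see what this expression returns on the sample tables. The following is a minimal sketch, not the author's procedure: it loads the two tables into an in-memory SQLite database through Python's built-in sqlite3 module (the column data types are my assumption) and runs the join as written. For contrast, it also runs a variant with the single table condition moved into a WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (a INTEGER NOT NULL PRIMARY KEY, b CHAR(1) NOT NULL);
CREATE TABLE Table2 (a INTEGER NOT NULL PRIMARY KEY, c CHAR(1) NOT NULL);
INSERT INTO Table1 VALUES (1, 'w'), (2, 'x'), (3, 'y'), (4, 'z');
INSERT INTO Table2 VALUES (1, 'r'), (2, 's'), (3, 't');
""")

# Both conditions in the ON clause: every row of the preserved table
# survives, padded with NULLs where no Table2 row satisfies both.
on_rows = conn.execute("""
SELECT Table1.a, Table1.b, Table2.a, Table2.c
  FROM Table1
       LEFT OUTER JOIN
       Table2
       ON Table1.a = Table2.a
          AND Table2.c = 't'
 ORDER BY Table1.a
""").fetchall()

# Same condition moved to WHERE: it filters after the join is built,
# so the NULL-padded rows are discarded.
where_rows = conn.execute("""
SELECT Table1.a, Table1.b, Table2.a, Table2.c
  FROM Table1
       LEFT OUTER JOIN
       Table2
       ON Table1.a = Table2.a
 WHERE Table2.c = 't'
 ORDER BY Table1.a
""").fetchall()

for row in on_rows:
    print(row)
# (1, 'w', None, None)
# (2, 'x', None, None)
# (3, 'y', 3, 't')
# (4, 'z', None, None)
```

With the condition in the ON clause, all four rows of Table1 appear. With it in the WHERE clause, only the row (3, 'y', 3, 't') remains, which is why the placement of a single table condition matters in an OUTER JOIN.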
