Microsoft SQL Server 2008 Study Guide, Part 28

Nielsen c10.tex V4 - 07/21/2009 12:42pm Page 232
Part II Manipulating Data With Select
FIGURE 10-4
Building an inner join within Management Studio’s Query Designer
Because joins pull together data from two data sets, it makes sense that SQL needs to know how to
match up rows from those sets. SQL Server merges the rows by matching a value common to both
tables. Typically, a primary key value from one table is being matched with a foreign key value from the
secondary table. Whenever a row from the first table matches a row from the second table, the two rows
are merged into a new row containing data from both tables.
The following code sample joins the Tour (secondary) and BaseCamp (primary) tables from the Cape
Hatteras Adventures sample database. The ON clause specifies the common data:
USE CHA2;
SELECT Tour.Name, Tour.BaseCampID,
BaseCamp.BaseCampID, BaseCamp.Name
FROM dbo.Tour
INNER JOIN dbo.BaseCamp
ON Tour.BaseCampID = BaseCamp.BaseCampID;
Merging Data with Joins and Unions 10
The query begins with the Tour table. For every Tour row, SQL Server will attempt to identify matching
BaseCamp rows by comparing the BaseCampID columns in both tables. The Tour table rows and
BaseCamp table rows that match will be merged into a new result:
Tour.                    Tour.       Basecamp.   Basecamp.
TourName                 BaseCampID  BaseCampID  BaseCampName
-----------------------  ----------  ----------  -------------
Appalachian Trail        1           1           Ashville NC
Outer Banks Lighthouses  2           2           Cape Hatteras
Bahamas Dive             3           3           Freeport
Amazon Trek              4           4           Ft Lauderdale
Gauley River Rafting     5           5           West Virginia
Number of rows returned
In the preceding query, every row in both the Tour and BaseCamp tables had a match. No rows were
excluded from the join. However, in real life this is seldom the case. Depending upon the number of
matching rows from each data source and the type of join, it’s possible to decrease or increase the final
number of rows in the result set.
To see how joins can alter the number of rows returned, look at the Contact and [Order] tables of
the OBXKites database. The initial row count of contacts is 21, yet when the customers are matched
with their orders, the row count changes to 10. The following code sample compares the two queries
and their respective results side by side:
USE OBXKites;
SELECT ContactCode, LastName     SELECT ContactCode, OrderNumber
FROM dbo.Contact                 FROM dbo.Contact
ORDER BY ContactCode;            INNER JOIN dbo.[Order]
                                 ON [Order].ContactID
                                 = Contact.ContactID
                                 ORDER BY ContactCode;
Results from both queries:
ContactCode  LastName         ContactCode  OrderNumber
-----------  --------         -----------  -----------
101          Smith            101          1
                              101          2
                              101          5
102          Adams            102          6
                              102          3
103          Reagan           103          4
                              103          7
104          Franklin         104          8
105          Dowdry           105          9
106          Grant            106          10
107          Smith
108          Hanks
109          James
110          Kennedy
111          Williams
112          Quincy
113          Laudry
114          Nelson
115          Miller
116          Jamison
117          Andrews
118          Boston
119          Harrison
120          Earl
121          Zing
Joins can appear to multiply rows. If a row on one side of the join matches with several rows on the
other side of the join, the result will include a row for every match. In the preceding query, some
contacts (Smith, Adams, and Reagan) are listed multiple times because they have multiple orders.
Joins also eliminate rows. Only contacts 101 through 106 have matching orders. The rest of the contacts
are excluded from the join because they have no matching orders.
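Both effects can be reproduced in isolation. The following is a minimal sketch, assuming two small table variables that stand in for Contact and [Order]; the table variables, their columns, and the three sample contacts are hypothetical and are not the OBXKites schema:

```sql
-- Hypothetical miniature stand-ins for the Contact and [Order] tables.
DECLARE @Contact TABLE (ContactID INT PRIMARY KEY, LastName VARCHAR(20));
DECLARE @Order TABLE (OrderNumber INT PRIMARY KEY, ContactID INT);

INSERT INTO @Contact (ContactID, LastName)
VALUES (1, 'Smith'), (2, 'Adams'), (3, 'Hanks');

INSERT INTO @Order (OrderNumber, ContactID)
VALUES (1, 1), (2, 1), (3, 2);

-- Three contact rows exist before the join.
SELECT COUNT(*) AS ContactRows
FROM @Contact;

-- Smith matches two orders, so Smith appears twice (row multiplied).
-- Hanks matches no order, so Hanks does not appear (row eliminated).
SELECT C.LastName, O.OrderNumber
FROM @Contact AS C
INNER JOIN @Order AS O
ON O.ContactID = C.ContactID
ORDER BY C.LastName, O.OrderNumber;
```

The join still returns three rows, but they are not the same three rows as the contact list: one contact was duplicated and another was dropped. Comparing row counts alone would hide both changes.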
ANSI SQL 89 joins

A join is really nothing more than the act of selecting data from two tables for which a condition of
equality exists between common columns. Join conditions in the ON clause are similar to WHERE clauses.
In fact, before ANSI SQL 92 standardized the JOIN ON syntax, ANSI SQL 89 joins (also called legacy
style joins, old style joins, or even grandpa joins) accomplished the same task by listing the tables within
the FROM clause and specifying the join condition in the WHERE clause.
The previous sample join between Contact and [Order] could be written as an ANSI 89 join as follows:
SELECT Contact.ContactCode, [Order].OrderNumber
FROM dbo.Contact, dbo.[Order]
WHERE [Order].ContactID = Contact.ContactID
ORDER BY ContactCode;
Best Practice
Always code joins using the ANSI 92 style. ANSI 92 joins are cleaner, easier to read, and easier to debug
than ANSI 89 style joins, which leads to improved data integrity and lower maintenance costs. With
ANSI 89 style joins it’s possible to get the wrong result unless the join is coded very carefully. ANSI 89 style outer
joins are deprecated in SQL Server 2008, so any ANSI 89 outer joins will generate an error.
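One way an ANSI 89 style join can silently return the wrong result is a forgotten join condition. The sketch below is illustrative only; the table variables and sample rows are hypothetical stand-ins, not the OBXKites tables:

```sql
-- Hypothetical miniature stand-ins for the Contact and [Order] tables.
DECLARE @Contact TABLE (ContactID INT PRIMARY KEY, LastName VARCHAR(20));
DECLARE @Order TABLE (OrderNumber INT PRIMARY KEY, ContactID INT);

INSERT INTO @Contact (ContactID, LastName)
VALUES (1, 'Smith'), (2, 'Adams'), (3, 'Hanks');

INSERT INTO @Order (OrderNumber, ContactID)
VALUES (1, 1), (2, 1), (3, 2);

-- ANSI 89 style with the WHERE join condition forgotten.
-- The statement runs without error and pairs every contact with
-- every order: 3 contacts x 3 orders = 9 rows.
SELECT C.LastName, O.OrderNumber
FROM @Contact AS C, @Order AS O;

-- ANSI 92 style. INNER JOIN requires an ON clause, so omitting the
-- condition is a syntax error instead of a silently wrong result.
SELECT C.LastName, O.OrderNumber
FROM @Contact AS C
INNER JOIN @Order AS O
ON O.ContactID = C.ContactID;
```

Because the ANSI 92 form keeps the join condition next to the table it applies to, a missing condition is caught when the query is parsed rather than when someone notices the inflated row count.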
Multiple data source joins
As some of the examples have already demonstrated, a SELECT statement isn’t limited to one or two
data sources (tables, views, CTEs, subqueries, etc.); a SQL Server SELECT statement may refer to up to
256 data sources. That’s a lot of joins.

Because SQL is a declarative language, the order of the data sources is not important for inner joins.
(The query optimizer will decide the best order to actually process the query based on the indexes
available and the data in the tables.) Multiple joins may be combined in multiple paths, or even circular
patterns (A joins B joins C joins A). Here’s where a large whiteboard and a consistent development style
really pay off.
The following query (first shown in Figure 10-5 and then worked out in code) answers the question
“Who purchased kites?” The answer must involve five tables:
1. The Contact table for the “who”
2. The [Order] table for the “purchased”
3. The OrderDetail table for the “purchased”
4. The Product table for the “kites”
5. The ProductCategory table for the “kites”
FIGURE 10-5
Answering the question “Who purchased kites?” using Management Studio’s Query Designer
The following SQL SELECT statement begins with the “who” portion of the question and specifies the
join tables and conditions as it works through the required tables. The query that is shown graphically
in Management Studio (refer to Figure 10-5) is listed as raw SQL in the following code sample. Notice
how the WHERE clause restricts the ProductCategory table rows and yet affects the contacts selected:
USE OBXKites;
SELECT LastName, FirstName, ProductName
FROM dbo.Contact C
INNER JOIN dbo.[Order] O
ON C.ContactID = O.ContactID
INNER JOIN dbo.OrderDetail OD
ON O.OrderID = OD.OrderID
INNER JOIN dbo.Product P
ON OD.ProductID = P.ProductID
INNER JOIN dbo.ProductCategory PC
ON P.ProductCategoryID = PC.ProductCategoryID
WHERE ProductCategoryName = 'Kite'
ORDER BY LastName, FirstName;
Result:
LastName   FirstName   ProductName
--------   ---------   -------------
Adams      Terri       Dragon Flight
Dowdry     Quin        Dragon Flight
Smith      Ulisius     Rocket Kite
To summarize the main points about inner joins:
■ They only match rows with a common value.
■ The order of the data sources is unimportant.
■ They can appear to multiply rows.
■ Newer ANSI 92 style is the best way to write them.
Outer Joins
Whereas an inner join contains only the intersection of the two data sets, an outer join extends the inner
join by adding the nonmatching data from the left or right data set, as illustrated in Figure 10-6.
Outer joins solve a significant problem for many queries by including all the data regardless of a match.
The common customer-order query demonstrates this problem well. If the requirement is to build a
query that lists all customers plus their recent orders, only an outer join can retrieve every customer
whether the customer has placed an order or not. An inner join between customers and orders would
miss every customer who did not place a recent order.
Depending on the nullability of the keys and the presence of rows on both sides of the
join, it’s easy to write a query that misses rows from one side or the other of the join. I’ve
even seen this error in third-party ISV application code. To avoid this data integrity error, know your
schema well and always unit test your queries against a small data set with known answers.
FIGURE 10-6
An outer join includes not only rows from the two data sources with a match, but also unmatched
rows from outside the intersection.
[Figure: overlapping Data Set A and Data Set B, with the Common Intersection, Left Outer Join, and Right Outer Join regions labeled]
Some of the data in the result set produced by an outer join will look just like the data from an inner
join. There will be data in columns that come from each of the data sources, but any rows from the
outer-join table that do not have a match in the other side of the join will return data only from the
outer-join table. In this case, columns from the other data source will have null values.
A Join Analogy
When I teach how to build queries, I sometimes use the following story to explain the different types of
joins. Imagine a pilgrim church in the seventeenth century, segmented by gender. The men all sit on
one side of the church and the women on the other. Some of the men and women are married, and some
are single. Now imagine that each side of the church is a database table and the various combinations of
people that leave the church represent the different types of joins.
If all the married couples stood up, joined hands, and left the church, that would be an inner join between
the men and women. The result set leaving the church would include only matched pairs.
If all the men stood, and those who were married held hands with their wives and they left as a group, that
would be a left outer join. The line leaving the church would include some couples and some bachelors.
Likewise, if all women and their husbands left the church, that would be a right outer join. All the bachelors
would be left alone in the church.
A full outer join (covered later in this chapter) would be everyone leaving the church, but only the married
couples could hold hands.
Using the Query Designer to create outer joins
When building queries using the Query Designer, the join type can be changed from the default, inner
join, to an outer join via either the context menu or the properties of the join, as shown in Figure 10-7.
The Query Designer does an excellent job of illustrating the types of joins with the join symbol (as
previously detailed in Table 10-1).
FIGURE 10-7
The join Properties window displays the join columns, and is used to set the join condition (=, >, <,
etc.) and add the left or right side of an outer join (all rows from Product, all rows from OrderDetail).
T-SQL code and outer joins
In SQL code, an outer join is declared by the keywords LEFT OUTER or RIGHT OUTER before the JOIN
(technically, the keyword OUTER is optional):
SELECT *
FROM Table1
LEFT|RIGHT [OUTER] JOIN Table2
ON Table1.column = Table2.column;
Several keywords (such as INNER, OUTER, or AS) in SQL are optional or may be abbreviated
(such as PROC for PROCEDURE). Although most developers (including me) omit the optional
syntax, explicitly stating the intent by spelling out the full syntax improves the readability of the code.
There’s no trick to telling the difference between left and right outer joins. In code, left or right refers to
the table that will be included regardless of the match. The outer-join table (sometimes called the driving
table) is typically listed first, so left outer joins are more common than right outer joins. I suspect
any confusion between left and right outer joins is caused by the use of graphical-query tools to build
joins, because left and right refer to the tables’ order in the SQL text, and the tables’ positions in the
graphical-query tool are moot.
Best Practice
When coding outer joins, always order your data sources so you can write left outer joins. Don’t use
right outer joins, and never mix left outer joins and right outer joins.
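Any right outer join can be rewritten as a left outer join by swapping the order of the two data sources. The sketch below illustrates this under stated assumptions: the table variables and sample rows are hypothetical stand-ins, not the OBXKites tables:

```sql
-- Hypothetical miniature stand-ins for the Contact and [Order] tables.
DECLARE @Contact TABLE (ContactID INT PRIMARY KEY, LastName VARCHAR(20));
DECLARE @Order TABLE (OrderNumber INT PRIMARY KEY, ContactID INT);

INSERT INTO @Contact (ContactID, LastName)
VALUES (1, 'Smith'), (2, 'Adams'), (3, 'Hanks');

INSERT INTO @Order (OrderNumber, ContactID)
VALUES (1, 1), (2, 1), (3, 2);

-- Right outer join: @Contact is preserved because it is listed
-- to the right of the JOIN keyword.
SELECT C.LastName, O.OrderNumber
FROM @Order AS O
RIGHT OUTER JOIN @Contact AS C
ON O.ContactID = C.ContactID
ORDER BY C.LastName, O.OrderNumber;

-- Equivalent left outer join: the preserved table is listed first,
-- so the query reads in the same order as the question being asked.
SELECT C.LastName, O.OrderNumber
FROM @Contact AS C
LEFT OUTER JOIN @Order AS O
ON O.ContactID = C.ContactID
ORDER BY C.LastName, O.OrderNumber;
```

Both queries return the same four rows, including Hanks with a null OrderNumber. Only the second follows the convention of listing the driving table first.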
To modify the previous contact-order query so that it returns all contacts regardless of any orders,
changing the join type from inner to left outer is all that’s required, as follows:
SELECT ContactCode, OrderNumber
FROM dbo.Contact
LEFT OUTER JOIN dbo.[Order]
ON [Order].ContactID = Contact.ContactID
ORDER BY ContactCode;
The left outer join will include all rows from the Contact table and matching rows from the [Order]
table. The abbreviated result of the query is as follows:
Contact.       [Order].
ContactCode    OrderNumber
-----------    -----------
101            1
101            2
...
106            10
107            NULL
108            NULL
...
Because contacts 107 and 108 do not have corresponding rows in the [Order] table, the columns from
the [Order] table return a null for those rows.
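Those nulls can be put to work: filtering a left outer join on a null in the other table's key column returns only the unmatched rows. The following is a minimal sketch of that pattern, using hypothetical table variables and sample rows rather than the OBXKites tables:

```sql
-- Hypothetical miniature stand-ins for the Contact and [Order] tables.
DECLARE @Contact TABLE (ContactID INT PRIMARY KEY, LastName VARCHAR(20));
DECLARE @Order TABLE (OrderNumber INT PRIMARY KEY, ContactID INT);

INSERT INTO @Contact (ContactID, LastName)
VALUES (1, 'Smith'), (2, 'Adams'), (3, 'Hanks');

INSERT INTO @Order (OrderNumber, ContactID)
VALUES (1, 1), (2, 1), (3, 2);

-- OrderNumber is a primary key, so it can never be null in a stored row.
-- A null OrderNumber in the result therefore means "no matching order".
SELECT C.LastName
FROM @Contact AS C
LEFT OUTER JOIN @Order AS O
ON O.ContactID = C.ContactID
WHERE O.OrderNumber IS NULL;
```

With this sample data the query returns only Hanks. The test is placed on a column that cannot be null in the order table itself; testing a nullable column would confuse stored nulls with unmatched rows.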
Earlier versions of SQL Server extended the ANSI SQL 89 legacy join syntax with outer
joins by adding an asterisk to the left or right of the equals sign in the WHERE clause
condition. While this syntax worked through SQL Server 2000, it has been deprecated since SQL Server
2005. ANSI SQL 89 inner joins will still work, but outer joins require ANSI SQL 92 syntax.
Having said that, SQL Server supports backward compatibility, so if the database compatibility level is
set to 80 (SQL Server 2000), then the ANSI 89 style outer joins still work.
Outer joins and optional foreign keys
Outer joins are often employed when a secondary table has a foreign-key constraint to the primary table
and permits nulls in the foreign key column. The presence of this optional foreign key means that if the
secondary row refers to a primary row, then the primary row must exist. However, it’s perfectly valid for
the secondary row to refrain from referring to the primary table at all.
Another example of an optional foreign key is an order alert or priority column. Many order rows will
not have an alert or special-priority status. However, those that do must point to a valid row in the
order-priority table.
The OBX Kite store uses a similar order-priority scheme, so reporting all the orders with their optional
priorities requires an outer join:
SELECT OrderNumber, OrderPriorityName
FROM dbo.[Order]
LEFT OUTER JOIN dbo.OrderPriority
ON [Order].OrderPriorityID =
OrderPriority.OrderPriorityID;
The left outer join retrieves all the orders and any matching priorities. The OBXKites_Populate.sql
script sets two orders to rush priority:
OrderNumber  OrderPriorityName
-----------  -----------------
1            Rush
2            NULL
3            Rush
4            NULL
5            NULL
6            NULL
7            NULL
8            NULL
9            NULL
10           NULL
The adjacency pairs pattern (also called reflexive, recursive, or self-join relationships, covered in
Chapter 17, “Traversing Hierarchies”) also uses optional foreign keys. In the Family sample database,
the MotherID and FatherID are both foreign keys that refer to the PersonID of the mother or
father. The optional foreign key allows persons to be entered without their father and mother already
in the database; but if a value is entered in the MotherID or FatherID columns, then the data must
point to valid persons in the database.
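A self-join over optional foreign keys might look like the following sketch. It assumes a hypothetical table variable with PersonID, MotherID, and FatherID columns as described above; the FirstName column and the sample rows are invented for illustration, and a table variable cannot carry the foreign-key constraints that the real Family database would:

```sql
-- Hypothetical stand-in for a Person table. MotherID and FatherID are
-- nullable, which is what makes the relationship optional.
DECLARE @Person TABLE (
PersonID INT PRIMARY KEY,
FirstName VARCHAR(20),
MotherID INT NULL,
FatherID INT NULL
);

INSERT INTO @Person (PersonID, FirstName, MotherID, FatherID)
VALUES (1, 'Ann', NULL, NULL),
(2, 'Bob', NULL, NULL),
(3, 'Cal', 1, 2);

-- The table is joined to itself twice, once per parent. Left outer joins
-- keep the persons whose parents are not recorded.
SELECT P.FirstName, M.FirstName AS Mother, F.FirstName AS Father
FROM @Person AS P
LEFT OUTER JOIN @Person AS M
ON P.MotherID = M.PersonID
LEFT OUTER JOIN @Person AS F
ON P.FatherID = F.PersonID
ORDER BY P.PersonID;
```

Ann and Bob are returned with null parents, while Cal is returned with Ann and Bob. Inner joins would have returned only Cal.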
Another cool twist on left outer joins is OUTER APPLY, used with table-valued user-defined
functions. You’ll find that covered in Chapter 25, “Building User-Defined Functions.”
Full outer joins
A full outer join returns all the data from both data sets regardless of the intersection, as shown in
Figure 10-8. It is functionally the same as taking the results from a left outer join and the results from a
right outer join, and unioning them together (unions are explained later in this chapter).
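That equivalence can be checked with a small sketch. The table variables below loosely mirror a primary table and a secondary table with an unenforced foreign key; the names and sample rows are hypothetical:

```sql
-- Hypothetical primary and secondary tables with no enforced constraint.
DECLARE @One TABLE (OnePK INT, Thing1 VARCHAR(15));
DECLARE @Two TABLE (TwoPK INT, OnePK INT, Thing2 VARCHAR(15));

INSERT INTO @One (OnePK, Thing1)
VALUES (1, 'Thing A'), (2, 'Thing B');

-- The second row refers to OnePK 3, which does not exist in @One.
INSERT INTO @Two (TwoPK, OnePK, Thing2)
VALUES (1, 1, 'Thing X'), (2, 3, 'Thing Y');

-- Full outer join: matched rows plus the unmatched rows from both sides.
SELECT O.Thing1, T.Thing2
FROM @One AS O
FULL OUTER JOIN @Two AS T
ON O.OnePK = T.OnePK;

-- The same three rows, built from a left and a right outer join.
SELECT O.Thing1, T.Thing2
FROM @One AS O
LEFT OUTER JOIN @Two AS T
ON O.OnePK = T.OnePK
UNION
SELECT O.Thing1, T.Thing2
FROM @One AS O
RIGHT OUTER JOIN @Two AS T
ON O.OnePK = T.OnePK;
```

Both statements return one matched row, one row from @One with a null Thing2, and one row from @Two with a null Thing1. UNION removes duplicate rows, so the two forms agree only when the full outer join result contains no duplicate rows of its own, which holds for this sample data.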
FIGURE 10-8
The full outer join returns all the data from both data sets, matching the rows where it can and filling
in the holes with nulls.
[Figure: overlapping Data Set A and Data Set B, with the Common Intersection and the Full Outer Join region labeled]
In real life, referential integrity reduces the need for a full outer join because every row from the
secondary table should have a match in the primary table (depending on the optionality of the foreign key),
so left outer joins are typically sufficient. Full outer joins are most useful for cleaning up data that has
not had the benefit of clean constraints to filter out bad data.
Red thing blue thing
The following example is a mock-up of such a situation and compares the full outer join with an inner
and a left outer join. Table One is the primary table. Table Two is a secondary table with a foreign key
that refers to table One. There’s no foreign-key constraint, so there may be some nonmatches for the
outer join to find:
CREATE TABLE dbo.One (
OnePK INT,
Thing1 VARCHAR(15)
);
CREATE TABLE dbo.Two (
TwoPK INT,
OnePK INT,
Thing2 VARCHAR(15)
);
The sample data includes rows that would normally break referential integrity. As illustrated in
Figure 10-9, the foreign-key values (OnePK) for the plane and the cycle in table Two do not have a match
in table One; and two of the rows in table One do not have related secondary rows in table Two. The
following batch inserts the eight sample data rows:
INSERT dbo.One(OnePK, Thing1)
VALUES (1, 'Old Thing');