Microsoft SQL Server 2008 Tutorial, Part 41

Nielsen c15.tex V4 - 07/21/2009 12:51pm Page 362
Part II Manipulating Data With Select
DatePosition DATE NOT NULL
)
INSERT dbo.Dept (DeptName, RaiseFactor)
  VALUES ('Engineering', 1.2),
         ('Sales', .8),
         ('IT', 2.5),
         ('Manufacturing', 1.0);
INSERT dbo.Employee (DeptID, LastName, FirstName,
    Salary, PerformanceRating, DateHire, DatePosition)
  VALUES (1, 'Smith', 'Sam', 54000, 2.0, '19970101', '19970101'),
         (1, 'Nelson', 'Slim', 78000, 1.5, '19970101', '19970101'),
         (2, 'Ball', 'Sally', 45000, 3.5, '19990202', '19990202'),
         (2, 'Kelly', 'Jeff', 85000, 2.4, '20020625', '20020625'),
         (3, 'Guelzow', 'Jo', 120000, 4.0, '19991205', '19991205'),
         (3, 'Anderson', 'Missy', 95000, 1.8, '19980201', '19980201'),
         (4, 'Reagan', 'Sam', 75000, 2.9, '20051215', '20051215'),
         (4, 'Adams', 'Hank', 34000, 3.2, '20080501', '20080501');
When developing complex queries, I work from the inside out. The first step performs the date math;
it selects the data required for the raise calculation, assuming June 25, 2009, is the effective date of the
raise, and ensures the performance rating won’t count if it’s below 2:
SELECT EmployeeID, Salary,
    CAST(CAST(DATEDIFF(d, DateHire, '20090625')
      AS DECIMAL(7, 2)) / 365.25 AS INT)
      AS YrsCo,
    CAST(CAST(DATEDIFF(d, DatePosition, '20090625')
      AS DECIMAL(7, 2)) / 365.25
      * 12 AS INT)
      AS MoPos,
    CASE WHEN Employee.PerformanceRating >= 2
      THEN Employee.PerformanceRating
      ELSE 0
    END AS Perf,
    Dept.RaiseFactor
  FROM dbo.Employee
    JOIN dbo.Dept
      ON Employee.DeptID = Dept.DeptID;
Result:
EmployeeID Salary YrsCo MoPos Perf RaiseFactor

1 54000.00 12 149 2.00 1.20
2 78000.00 12 149 0.00 1.20
3 45000.00 10 124 3.50 0.80
Modifying Data 15
4 85000.00 7 84 2.40 0.80
5 120000.00 9 114 4.00 2.50
6 95000.00 11 136 0.00 2.50
7 75000.00 4 42 2.90 1.00
8 34000.00 1 13 3.20 1.00
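The date math can be hand-checked for a single employee. The following is a minimal sketch rather than part of the original sample script: the variable names @DateHire and @Effective are introduced here for illustration, and the dates are those of EmployeeID 1 in the sample data.

```sql
-- Hand-check of the date math for EmployeeID 1 (hired 1997-01-01).
-- @DateHire and @Effective are illustrative names, not from the sample script.
DECLARE @DateHire  DATE = '19970101';
DECLARE @Effective DATE = '20090625';

SELECT
    -- raw number of days between the two dates
    DATEDIFF(d, @DateHire, @Effective) AS DaysElapsed,
    -- whole years: days / 365.25, with the fraction dropped by the CAST to INT
    CAST(CAST(DATEDIFF(d, @DateHire, @Effective)
      AS DECIMAL(7, 2)) / 365.25 AS INT) AS YrsCo,
    -- whole months: fractional years * 12, again with the fraction dropped
    CAST(CAST(DATEDIFF(d, @DateHire, @Effective)
      AS DECIMAL(7, 2)) / 365.25 * 12 AS INT) AS MoPos;
```

Because the final CAST to INT truncates rather than rounds, partial years and partial months are dropped. The YrsCo and MoPos values should agree with the 12 and 149 listed for EmployeeID 1.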
The next step in developing this query is to add the raise calculation. The simplest way to see the
calculation is to pull the values already generated from a subquery:
SELECT EmployeeID, Salary,
    (2 + ((YearsCompany * .1) + (MonthPosition * .02)
      + (Performance * .5)) * RaiseFactor) / 100 AS EmpRaise
  FROM (SELECT EmployeeID, FirstName, LastName, Salary,
            CAST(CAST(DATEDIFF(d, DateHire, '20090625') AS
              DECIMAL(7, 2)) / 365.25 AS INT) AS YearsCompany,
            CAST(CAST(DATEDIFF(d, DatePosition, '20090625') AS
              DECIMAL(7, 2)) / 365.25 * 12 AS INT) AS MonthPosition,
            CASE WHEN Employee.PerformanceRating >= 2
              THEN Employee.PerformanceRating
              ELSE 0
            END AS Performance, Dept.RaiseFactor
          FROM dbo.Employee
            JOIN dbo.Dept
              ON Employee.DeptID = Dept.DeptID) AS SubQuery;
Result:
EmployeeID Salary EmpRaise

1 54000.00 0.082160000
2 78000.00 0.070160000
3 45000.00 0.061840000
4 85000.00 0.048640000
5 120000.00 0.149500000
6 95000.00 0.115500000
7 75000.00 0.046900000
8 34000.00 0.039600000
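The formula itself can be hand-tested with literals before it is trusted against the whole table. A minimal sketch, assuming the values listed for EmployeeID 1 in the first result set (12 years, 149 months, a rating of 2.0, and a raise factor of 1.2):

```sql
-- Hand-test of the raise formula for EmployeeID 1, using literal values
-- copied from the first result set.
SELECT (2 + ((12 * .1)        -- years with company
           + (149 * .02)      -- months in position
           + (2.0 * .5))      -- performance rating
          * 1.2)              -- department raise factor
       / 100 AS EmpRaise;
```

This evaluates to 0.08216 (the number of trailing zeros may differ), which matches the EmpRaise value in the first row of the result.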
The last query was relatively easy to read, but there’s no logical reason for the subquery. The query
could be rewritten combining the date calculations and the case expression into the raise formula:
SELECT EmployeeID, Salary,
    (2
      -- years with company
      + ((CAST(CAST(DATEDIFF(d, DateHire, '20090625')
          AS DECIMAL(7, 2)) / 365.25 AS INT) * .1)
      -- months in position
      + (CAST(CAST(DATEDIFF(d, DatePosition, '20090625')
          AS DECIMAL(7, 2)) / 365.25 * 12 AS INT) * .02)
      -- Performance Rating minimum
      + (CASE WHEN Employee.PerformanceRating >= 2
           THEN Employee.PerformanceRating
           ELSE 0
         END * .5))
      -- Raise Factor
      * RaiseFactor) / 100 AS EmpRaise
  FROM dbo.Employee
    JOIN dbo.Dept
      ON Employee.DeptID = Dept.DeptID;
It’s easy to verify that this query gets the same result, but which is the better query? From a
performance perspective, both queries generate the exact same query execution plan. When considering
maintenance and readability, I’d probably go with the second query carefully formatted and commented.
The final step is to convert the query into an UPDATE command. The hard part is already done; it
just needs the UPDATE verb at the front of the query:
UPDATE Employee
  SET Salary = Salary *
    (1 + ((2
      -- years with company
      + ((CAST(CAST(DATEDIFF(d, DateHire, '20090625')
          AS DECIMAL(7, 2)) / 365.25 AS INT) * .1)
      -- months in position
      + (CAST(CAST(DATEDIFF(d, DatePosition, '20090625')
          AS DECIMAL(7, 2)) / 365.25 * 12 AS INT) * .02)
      -- Performance Rating minimum
      + (CASE WHEN Employee.PerformanceRating >= 2
           THEN Employee.PerformanceRating
           ELSE 0
         END * .5))
      -- Raise Factor
      * RaiseFactor) / 100))
  FROM dbo.Employee
    JOIN dbo.Dept
      ON Employee.DeptID = Dept.DeptID;
A quick check of the data confirms that the update was successful:
SELECT FirstName, LastName, Salary
FROM dbo.Employee
Result:
FirstName LastName Salary

Sam Smith 58436.64
Slim Nelson 83472.48
Sally Ball 47782.80
Jeff Kelly 89134.40
Jo Guelzow 137940.00
Missy Anderson 105972.50
Sam Reagan 78517.50
Hank Adams 35346.40
The final step of the exercise is to clean up the sample tables:
DROP TABLE dbo.Employee, dbo.Dept;

This sample code pulls together techniques from many of the previous chapters: creating and dropping
tables, CASE expressions, joins, and date scalar functions, not to mention the inserts and updates from
this chapter. The example is long because it demonstrates more than just the UPDATE statement. It also
shows the typical process of developing a complex UPDATE, which includes the following:
1. Checking the available data: The first SELECT joins employee and dept, and lists all the
columns required for the formula.
2. Testing the formula: The second SELECT is based on the initial SELECT and assembles the
formula from the required columns. From this data, a couple of rows can be hand-tested against
the specs, and the formula verified.
3. Performing the update: Once the formula is constructed and verified, the formula is edited
into an UPDATE statement and executed.
The SQL UPDATE command is powerful. I have replaced terribly complex record sets and nested loops
that were painfully slow and error-prone with UPDATE statements and creative joins that worked
well, and I have seen execution times reduced from hours to a few seconds. I cannot overemphasize
the importance of approaching the selection and updating of data in terms of data sets, rather than
data rows.
Deleting Data
The DELETE command is dangerously simple. In its basic form, it deletes all the rows from a table.
Because the DELETE command is a row-based operation, it doesn’t require specifying any column
names. The first FROM is optional, as are the second FROM and the WHERE conditions. However,
although the WHERE clause is optional, it is the primary subject of concern when you’re using the
DELETE command. Here’s an abbreviated syntax for the DELETE command:
DELETE [FROM] schema.Table
[FROM data sources]
[WHERE condition(s)];
Notice that everything is optional except the actual DELETE command and the table name. The
following command would delete all data from the Product table, with no questions asked and no
second chances:
DELETE
FROM OBXKites.dbo.Product;
SQL Server has no inherent “undo” command. Once a transaction is committed, that’s it. That’s why the
WHERE clause is so important when you’re deleting.
By far, the most common use of the DELETE command is to delete a single row. The primary key is
usually the means of selecting the row:
USE OBXKites;
DELETE FROM dbo.Product
  WHERE ProductID = 'DB8D8D60-76F4-46C3-90E6-A8648F63C0F0';
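Because a committed delete cannot be undone, one defensive habit is to run the delete inside an explicit transaction and commit only if the expected number of rows was affected. The following is a sketch of that idea applied to the single-row delete; the one-row expectation is an assumption made for this example, not a rule from the text.

```sql
USE OBXKites;

BEGIN TRANSACTION;

DELETE FROM dbo.Product
  WHERE ProductID = 'DB8D8D60-76F4-46C3-90E6-A8648F63C0F0';

-- @@ROWCOUNT reports the rows affected by the statement just executed,
-- so it must be read immediately after the DELETE.
IF @@ROWCOUNT = 1
    COMMIT TRANSACTION;     -- exactly the one expected row was removed
ELSE
    ROLLBACK TRANSACTION;   -- zero or several rows: undo and investigate
```

Until the COMMIT runs, the delete can still be rolled back; after it, the only way to recover the row is from a backup or an archive.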
Referencing multiple data sources while deleting
There are two techniques for referencing multiple data sources while deleting rows: the double FROM
clause and subqueries.
The UPDATE command uses the FROM clause to join the updated table with other tables for more
flexible row selection. The DELETE command can use the exact same technique. When using this method,
the first optional FROM can make it look confusing. To improve readability and consistency, I
recommend that you omit the first FROM in your code.
For example, the following DELETE statement omits the first FROM keyword and uses the second FROM
clause to join Product with ProductCategory so that the WHERE clause can filter the DELETE
based on the ProductCategoryName. This query removes all videos from the Product table:
DELETE dbo.Product
  FROM dbo.Product
    JOIN dbo.ProductCategory
      ON Product.ProductCategoryID
        = ProductCategory.ProductCategoryID
  WHERE ProductCategory.ProductCategoryName = 'Video';
The second method looks more complicated at first glance, but it’s ANSI standard and the preferred
method. A correlated subquery actually selects the rows to be deleted, and the DELETE command just
picks up those rows for the delete operation. It’s a very clean query:
DELETE FROM dbo.Product
  WHERE EXISTS
    (SELECT *
       FROM dbo.ProductCategory AS pc
       WHERE pc.ProductCategoryID = Product.ProductCategoryID
         AND pc.ProductCategoryName = 'Video');
In terms of performance, both methods generate the exact same query execution plan.

As with the UPDATE command’s FROM clause, the DELETE command’s second FROM clause is
not an ANSI SQL standard. If portability is important to your project, then use a subquery
to reference additional tables.
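Either form of the delete can be previewed by running the same row selection as a SELECT first, in keeping with the develop-it-as-a-SELECT approach used for the UPDATE earlier in the chapter. A sketch, assuming the same OBXKites tables and columns used in the two DELETE statements:

```sql
-- Preview the rows that the 'Video' delete would remove.
-- Same join and WHERE clause as the double-FROM DELETE; only the verb differs.
SELECT Product.ProductID, ProductCategory.ProductCategoryName
  FROM dbo.Product
    JOIN dbo.ProductCategory
      ON Product.ProductCategoryID
        = ProductCategory.ProductCategoryID
  WHERE ProductCategory.ProductCategoryName = 'Video';
```

If the rows returned are the ones expected, the SELECT list can be swapped for the DELETE verb without touching the FROM or WHERE clauses.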
Cascading deletes
Referential integrity (RI) refers to the idea that no secondary row foreign key should point to a primary
row primary key unless that primary row does in fact exist. This means that an attempt to delete a
primary row will fail if a foreign-key value somewhere points to that primary row.
For more information about referential integrity and when to use it, turn to Chapter 3,
“Relational Database Design,” and Chapter 20, “Creating the Physical Database Schema.”
When implemented correctly, referential integrity will block any delete operation that would result in a
foreign key value without a corresponding primary key value. The way around this is to first delete the
secondary rows that point to the primary row, and then delete the primary row. This technique is called
a cascading delete. In a complex database schema, the cascade might bounce down several levels before
working its way back up to the original row being deleted.
There are two ways to implement a cascading delete: manually with triggers or automatically with
declared referential integrity (DRI) via foreign keys.
Implementing cascading deletes manually is a lot of work. Triggers are significantly slower than foreign
keys (which are checked as part of the query execution plan), and trigger-based cascading deletes
usually also handle the foreign key checks. While this was commonplace a decade ago, today trigger-based
cascading deletes are very rare and might only be needed with a very complex nonstandard foreign key
design that includes business rules in the foreign key. If you’re doing that, then you’re either very new
at this or very, very good.
Fortunately, SQL Server offers cascading deletes as a function of the foreign key. Cascading deletes may
be enabled via Management Studio, in the Foreign Key Relationship dialog, or in SQL code.
The sample script that creates the Cape Hatteras Adventures version 2 database
(CHA2_Create.sql) provides a good example of setting the cascade-delete option for referential
integrity. In this case, if either the event or the guide is deleted, then the rows in the event-guide
many-to-many table are also deleted. The ON DELETE CASCADE foreign-key option is what actually
specifies the cascade action:
CREATE TABLE dbo.Event_mm_Guide (
  EventGuideID
    INT IDENTITY NOT NULL PRIMARY KEY,
  EventID
    INT NOT NULL
    FOREIGN KEY REFERENCES dbo.Event ON DELETE CASCADE,
  GuideID
    INT NOT NULL
    FOREIGN KEY REFERENCES dbo.Guide ON DELETE CASCADE,
  LastName
    VARCHAR(50) NOT NULL
  )
  ON [PRIMARY];
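To watch the cascade happen without the full Cape Hatteras Adventures schema, the behavior can be reproduced with two throwaway tables. This is a minimal sketch: dbo.DemoGuide and dbo.DemoEventGuide are hypothetical names invented for this example and are not part of CHA2_Create.sql.

```sql
USE tempdb;

-- Primary table (hypothetical stand-in for dbo.Guide)
CREATE TABLE dbo.DemoGuide (
  GuideID  INT NOT NULL PRIMARY KEY,
  LastName VARCHAR(50) NOT NULL
);

-- Secondary table (hypothetical stand-in for dbo.Event_mm_Guide)
CREATE TABLE dbo.DemoEventGuide (
  EventGuideID INT IDENTITY NOT NULL PRIMARY KEY,
  GuideID      INT NOT NULL
    FOREIGN KEY REFERENCES dbo.DemoGuide (GuideID) ON DELETE CASCADE
);

INSERT dbo.DemoGuide (GuideID, LastName)
  VALUES (1, 'Smith'), (2, 'Nelson');
INSERT dbo.DemoEventGuide (GuideID)
  VALUES (1), (1), (2);

-- Deleting the primary row also removes its two secondary rows.
DELETE FROM dbo.DemoGuide
  WHERE GuideID = 1;

-- Only the row that points to GuideID 2 should remain.
SELECT EventGuideID, GuideID
  FROM dbo.DemoEventGuide;

-- The referencing table is listed first so the drop succeeds.
DROP TABLE dbo.DemoEventGuide, dbo.DemoGuide;
```

Without the ON DELETE CASCADE option, the same DELETE would be rejected with a foreign-key violation instead.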
As a caution, cascading deletes, or even referential integrity, are not suitable for every relationship. It
depends on the permanence of the secondary row. If deleting the primary row makes the secondary row
moot or meaningless, then cascading the delete makes good sense; but if the secondary row is still a
valid row after the primary row is deleted, then referential integrity and cascading deletes would cause
the database to break its representation of reality.
As an example of determining the usefulness of cascading deletes from the Cape Hatteras Adventures
database, consider that if a tour is deleted, then all scheduled events for that tour become
meaningless, as do the many-to-many schedule tables between event and customer, and between
event and guide. Conversely, a tour must have a base camp, so referential integrity is required on the
Tour.BaseCampID foreign key. However, if a base camp is deleted, then the tours originating from
that base camp might still be valid (if they can be rescheduled to another base camp), so cascading a
base-camp delete down to the tour is not a reasonable action. If RI is on and cascading deletes are off,
then a base camp with tours cannot be deleted until all tours for that base camp are either manually
deleted or reassigned to other base camps.
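The reassign-then-delete sequence described for base camps can be sketched with two small tables. The names dbo.DemoBaseCamp and dbo.DemoTour, and the sample rows, are hypothetical stand-ins chosen for this example rather than the actual Cape Hatteras Adventures tables.

```sql
USE tempdb;

CREATE TABLE dbo.DemoBaseCamp (
  BaseCampID INT NOT NULL PRIMARY KEY,
  Name       VARCHAR(50) NOT NULL
);

CREATE TABLE dbo.DemoTour (
  TourID     INT NOT NULL PRIMARY KEY,
  Name       VARCHAR(50) NOT NULL,
  -- RI is on, cascade is off (the default action is NO ACTION)
  BaseCampID INT NOT NULL
    FOREIGN KEY REFERENCES dbo.DemoBaseCamp (BaseCampID)
);

INSERT dbo.DemoBaseCamp (BaseCampID, Name)
  VALUES (1, 'Camp One'), (2, 'Camp Two');
INSERT dbo.DemoTour (TourID, Name, BaseCampID)
  VALUES (10, 'Sample Tour', 1);

-- At this point the following statement would fail with a foreign-key
-- violation, because tour 10 still points at base camp 1:
--   DELETE FROM dbo.DemoBaseCamp WHERE BaseCampID = 1;

-- Reassign the tour first, then the base camp can be deleted.
UPDATE dbo.DemoTour
  SET BaseCampID = 2
  WHERE BaseCampID = 1;

DELETE FROM dbo.DemoBaseCamp
  WHERE BaseCampID = 1;

DROP TABLE dbo.DemoTour, dbo.DemoBaseCamp;
```

The tour survives the removal of its original base camp, which is the outcome a cascading delete would have prevented.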
Alternatives to physically deleting data
Some database developers choose to completely avoid deleting data. Instead, they build systems to
remove the data from the user’s view while retaining the data for safekeeping (like dBase II did). This
can be done in several different ways:
■ A logical-delete bit flag, or nullable MomentDeleted column, in the row can indicate that
the row is deleted. This makes deleting or restoring a single row a straightforward matter
of setting or clearing a bit. However, because a relational database involves multiple related
tables, there’s more work to it than that. All queries must check the logical-delete flag and
filter out logically deleted rows. This means that a bit column (with extremely poor selectivity)
is probably an important index for every query. While SQL Server 2008’s new filtered indexes
are a perfect fit, it’s still a performance killer.
■ To make matters worse, because the rows still physically exist in SQL Server, and SQL Server’s
declarative referential integrity does not know about the logical-delete flag, custom referential
integrity and cascading of logical delete flags are also required. Restoring, or undeleting,
cascaded logical deletes can become a nightmare.
■ The cascading logical deletes method is complex to code and difficult to maintain. This is a
case of complexity breeding complexity, and I no longer recommend this method.
■ Another alternative to physically deleting rows is to archive the deleted rows in an archive or
audit table. This method is best implemented by an INSTEAD OF trigger that copies the data
to the alternative location and then physically deletes the rows from the production database.
■ This method offers several advantages. Data is physically removed from the database, so
there’s no need to artificially modify SELECT queries or index on a bit column. Physically
removing the data enables SQL Server referential integrity to remain in effect. In addition, the
database is not burdened with unnecessary data. Retrieving archived data remains relatively
straightforward and can be easily accomplished with a view that selects data from the archive
location.
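For reference, the logical-delete approach from the first bullets might look like the following on a single table. This is only a sketch to make the mechanics concrete: dbo.DemoProduct and the index name are hypothetical, and the sketch deliberately ignores the cross-table cascading that makes the method hard to maintain.

```sql
USE tempdb;

CREATE TABLE dbo.DemoProduct (
  ProductID     INT NOT NULL PRIMARY KEY,
  ProductName   VARCHAR(50) NOT NULL,
  MomentDeleted DATETIME NULL      -- NULL means the row is live
);

-- Filtered index (new in SQL Server 2008) covering only the live rows
CREATE NONCLUSTERED INDEX IX_DemoProduct_Live
  ON dbo.DemoProduct (ProductName)
  WHERE MomentDeleted IS NULL;

INSERT dbo.DemoProduct (ProductID, ProductName)
  VALUES (1, 'Kite'), (2, 'Video');

-- "Delete" a row by stamping it instead of removing it
UPDATE dbo.DemoProduct
  SET MomentDeleted = GETDATE()
  WHERE ProductID = 2;

-- Every query has to remember this filter
SELECT ProductID, ProductName
  FROM dbo.DemoProduct
  WHERE MomentDeleted IS NULL;

-- "Undelete" by clearing the stamp
UPDATE dbo.DemoProduct
  SET MomentDeleted = NULL
  WHERE ProductID = 2;

DROP TABLE dbo.DemoProduct;
```

A single table is the easy case; the trouble described above begins as soon as other tables reference the logically deleted row.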
Chapter 53, “Data Audit Triggers,” details how to automatically generate the audit system
discussed here that stores, views, and recovers deleted rows.
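As a small-scale illustration of the archive approach (not the generated audit system from Chapter 53), an INSTEAD OF DELETE trigger can copy the rows and then remove them. The table and trigger names below are hypothetical, and GO is the batch separator used by Management Studio and sqlcmd rather than a T-SQL statement.

```sql
USE tempdb;
GO
CREATE TABLE dbo.DemoOrder (
  OrderID  INT NOT NULL PRIMARY KEY,
  Customer VARCHAR(50) NOT NULL
);
CREATE TABLE dbo.DemoOrderArchive (
  OrderID       INT NOT NULL,
  Customer      VARCHAR(50) NOT NULL,
  MomentDeleted DATETIME NOT NULL
);
GO
-- CREATE TRIGGER must be the first statement in its batch.
CREATE TRIGGER dbo.DemoOrder_ArchiveDelete
  ON dbo.DemoOrder
  INSTEAD OF DELETE
AS
BEGIN
  SET NOCOUNT ON;

  -- Copy the rows being deleted to the archive table
  INSERT dbo.DemoOrderArchive (OrderID, Customer, MomentDeleted)
    SELECT OrderID, Customer, GETDATE()
      FROM deleted;

  -- Then physically remove them; an INSTEAD OF trigger is not
  -- fired recursively by a delete against its own table
  DELETE dbo.DemoOrder
    FROM dbo.DemoOrder
      JOIN deleted
        ON DemoOrder.OrderID = deleted.OrderID;
END;
GO
INSERT dbo.DemoOrder (OrderID, Customer)
  VALUES (1, 'Smith'), (2, 'Nelson');

DELETE FROM dbo.DemoOrder
  WHERE OrderID = 1;

SELECT OrderID, Customer
  FROM dbo.DemoOrder;                 -- remaining live rows
SELECT OrderID, Customer, MomentDeleted
  FROM dbo.DemoOrderArchive;          -- archived copy of the deleted row

DROP TABLE dbo.DemoOrder, dbo.DemoOrderArchive;
```

The application still issues an ordinary DELETE; the archiving happens entirely inside the trigger.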
Merging Data
An upsert operation is a logical combination of an insert and an update. If the data isn’t already in the
table, the upsert inserts the data; if the data is already in the table, then the upsert updates with the
differences. Ignoring for a moment the new MERGE command in SQL Server 2008, there are a few ways to
code an upsert operation with T-SQL:
■ The most common method is to attempt to locate the data with an IF EXISTS; and if the row
was found, UPDATE, otherwise INSERT.
■ If the most common use case is that the row exists and the UPDATE was needed, then the best
method is to do the update, and if @@RowCount = 0, then the row was new and the insert
should be performed.
■ If the overwhelming use case is that the row would be new to the database, then TRY to
INSERT the new row; if a unique index blocked the INSERT and fired an error, then CATCH
the error and UPDATE instead.
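The three approaches listed above can be sketched side by side. This is an illustration under stated assumptions: dbo.DemoStock and the @ItemID and @Qty variables are hypothetical, each method is an alternative rather than a step in a sequence, and none of them addresses concurrent sessions.

```sql
USE tempdb;

CREATE TABLE dbo.DemoStock (
  ItemID INT NOT NULL PRIMARY KEY,
  Qty    INT NOT NULL
);
INSERT dbo.DemoStock (ItemID, Qty) VALUES (1, 10);

DECLARE @ItemID INT = 2, @Qty INT = 5;

-- Method 1: look first, then UPDATE or INSERT
IF EXISTS (SELECT * FROM dbo.DemoStock WHERE ItemID = @ItemID)
  UPDATE dbo.DemoStock SET Qty = @Qty WHERE ItemID = @ItemID;
ELSE
  INSERT dbo.DemoStock (ItemID, Qty) VALUES (@ItemID, @Qty);

-- Method 2: UPDATE first; if no row was touched, INSERT
UPDATE dbo.DemoStock SET Qty = @Qty WHERE ItemID = @ItemID;
IF @@ROWCOUNT = 0
  INSERT dbo.DemoStock (ItemID, Qty) VALUES (@ItemID, @Qty);

-- Method 3: TRY the INSERT; on a duplicate-key error, UPDATE instead
BEGIN TRY
  INSERT dbo.DemoStock (ItemID, Qty) VALUES (@ItemID, @Qty);
END TRY
BEGIN CATCH
  -- 2627: primary key or unique constraint violation
  -- 2601: duplicate key in a unique index
  IF ERROR_NUMBER() IN (2627, 2601)
    UPDATE dbo.DemoStock SET Qty = @Qty WHERE ItemID = @ItemID;
  ELSE
  BEGIN
    -- Any other error is passed back to the caller
    DECLARE @Msg NVARCHAR(2048) = ERROR_MESSAGE();
    RAISERROR(@Msg, 16, 1);
  END
END CATCH;

SELECT ItemID, Qty FROM dbo.DemoStock;

DROP TABLE dbo.DemoStock;
```

Which method is cheapest depends on whether the row usually exists, as the bullets describe: method 2 saves a lookup when updates dominate, and method 3 saves one when inserts dominate.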

All three methods are potentially obsolete with the new MERGE command. The MERGE command is very
well done by Microsoft: it solves a complex problem well with a clean syntax and good performance.
First, it’s called “merge” because it does more than an upsert. Upsert only inserts or updates; merge can
be directed to insert, update, and delete all in one command.
In a nutshell, MERGE sets up a join between the source table and the target table, and can then perform
operations based on matches between the two tables.
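Before the larger scenario, the shape of a MERGE statement can be seen on two tiny tables. This is a generic sketch, not the statement used for the flight example; dbo.DemoTarget, dbo.DemoSource, and their rows are hypothetical.

```sql
USE tempdb;

CREATE TABLE dbo.DemoTarget (
  ItemID INT NOT NULL PRIMARY KEY,
  Qty    INT NOT NULL
);
CREATE TABLE dbo.DemoSource (
  ItemID INT NOT NULL PRIMARY KEY,
  Qty    INT NOT NULL
);

INSERT dbo.DemoTarget (ItemID, Qty) VALUES (1, 10), (2, 20), (3, 30);
INSERT dbo.DemoSource (ItemID, Qty) VALUES (2, 25), (3, 30), (4, 40);

MERGE dbo.DemoTarget AS T            -- target: the table being modified
  USING dbo.DemoSource AS S          -- source: the incoming data
    ON T.ItemID = S.ItemID           -- the join that defines a "match"
  WHEN MATCHED AND T.Qty <> S.Qty    -- in both tables, but different
    THEN UPDATE SET T.Qty = S.Qty
  WHEN NOT MATCHED BY TARGET         -- in the source only
    THEN INSERT (ItemID, Qty) VALUES (S.ItemID, S.Qty)
  WHEN NOT MATCHED BY SOURCE         -- in the target only
    THEN DELETE
  OUTPUT $action AS MergeAction,
         inserted.ItemID AS NewItemID,
         deleted.ItemID AS OldItemID;

DROP TABLE dbo.DemoTarget, dbo.DemoSource;
```

With these rows, the OUTPUT clause should report a DELETE for item 1, an UPDATE for item 2, and an INSERT for item 4, while item 3 is left untouched because its quantity already matches.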
To walk through a merge scenario, the following example sets up an airline flight check-in scenario. The
main work table is FlightPassengers, which holds data about reservations. It’s updated as travelers
check in, and by the time the flight takes off, it has the actual final passenger list and seat assignments.
In the sample scenario, four passengers are scheduled to fly SQL Server Airlines flight 2008 (Denver to
Seattle) on March 1, 2009. Poor Jerry: he has a middle seat on the last row of the plane, the row that
doesn’t recline:
USE tempdb;
-- Merge Target Table
CREATE TABLE FlightPassengers (
  FlightID INT NOT NULL
    IDENTITY
    PRIMARY KEY,
  LastName VARCHAR(50) NOT NULL,
  FirstName VARCHAR(50) NOT NULL,
  FlightCode CHAR(6) NOT NULL,
  FlightDate DATE NOT NULL,
  Seat CHAR(3) NOT NULL
  );
INSERT FlightPassengers
    (LastName, FirstName, FlightCode, FlightDate, Seat)
  VALUES ('Nielsen', 'Paul', 'SS2008', '20090301', '9F'),
         ('Jenkins', 'Sue', 'SS2008', '20090301', '7A'),
         ('Smith', 'Sam', 'SS2008', '20090301', '19A'),
         ('Nixon', 'Jerry', 'SS2008', '20090301', '29B');
The day of the flight, the check-in counter records all the passengers as they arrive, and their seat
assignments, in the CheckIn table. One passenger doesn’t show, a new passenger buys a ticket, and
Jerry decides today is a good day to burn an upgrade coupon:
-- Merge Source table
CREATE TABLE CheckIn (
  LastName VARCHAR(50),
  FirstName VARCHAR(50),
  FlightCode CHAR(6),
  FlightDate DATE,
  Seat CHAR(3)
  );
INSERT CheckIn (LastName, FirstName, FlightCode, FlightDate, Seat)
  VALUES ('Nielsen', 'Paul', 'SS2008', '20090301', '9F'),
         ('Jenkins', 'Sue', 'SS2008', '20090301', '7A'),
         ('Nixon', 'Jerry', 'SS2008', '20090301', '2A'),
         ('Anderson', 'Missy', 'SS2008', '20090301', '4B');
Before the MERGE command is executed, the next three queries look for differences in the data.
The first set-difference query returns any no-show passengers. A LEFT OUTER JOIN between the
FlightPassengers and CheckIn tables finds every passenger with a reservation joined with their
CheckIn row if the row is available. If no CheckIn row is found, then the LEFT OUTER JOIN fills
in the CheckIn columns with nulls. Filtering for the null returns only those passengers who made a
reservation but didn’t make the flight:
-- NoShows
SELECT F.FirstName + ' ' + F.LastName AS Passenger, F.Seat
  FROM FlightPassengers AS F
    LEFT OUTER JOIN CheckIn AS C
      ON C.LastName = F.LastName
        AND C.FirstName = F.FirstName
        AND C.FlightCode = F.FlightCode
        AND C.FlightDate = F.FlightDate
  WHERE C.LastName IS NULL;
Result:
Passenger Seat

Sam Smith 19A
The walk-up check-in query uses a LEFT OUTER JOIN and an IS NULL in the WHERE clause to locate
any passengers who are in the CheckIn table but not in the FlightPassengers table:
-- Walk Up CheckIn
SELECT C.FirstName + ' ' + C.LastName AS Passenger, C.Seat
  FROM CheckIn AS C
    LEFT OUTER JOIN FlightPassengers AS F
      ON C.LastName = F.LastName
        AND C.FirstName = F.FirstName
        AND C.FlightCode = F.FlightCode
        AND C.FlightDate = F.FlightDate
  WHERE F.LastName IS NULL;
Result:
Passenger Seat

Missy Anderson 4B
The last difference query lists any seat changes, including Jerry’s upgrade to first class. This query uses
an inner join because it’s searching for passengers who both had previous seat assignments and now are
boarding with a seat assignment. The query compares the Seat columns from the FlightPassengers
and CheckIn tables using a not-equal comparison, which finds any passengers with a different seat
than previously assigned. Go Jerry!
-- Seat Changes
SELECT C.FirstName + ' ' + C.LastName AS Passenger, F.Seat AS
    'previous seat', C.Seat AS 'final seat'
  FROM CheckIn AS C
    INNER JOIN FlightPassengers AS F
      ON C.LastName = F.LastName
        AND C.FirstName = F.FirstName
        AND C.FlightCode = F.FlightCode
        AND C.FlightDate = F.FlightDate
        AND C.Seat <> F.Seat
  WHERE F.Seat IS NOT NULL;
Result:
Passenger previous seat final seat

Jerry Nixon 29B 2A
For another explanation of set difference queries, flip over to Chapter 10, “Merging Data
with Joins and Unions.”
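Set differences of this kind can also be written with the EXCEPT operator. As a sketch against the FlightPassengers and CheckIn tables created for this scenario, the no-show and walk-up checks might look like the following; the Seat column is left out of the comparison so that a seat change does not count as a difference.

```sql
-- No-shows: reserved but never checked in
SELECT LastName, FirstName, FlightCode, FlightDate
  FROM FlightPassengers
EXCEPT
SELECT LastName, FirstName, FlightCode, FlightDate
  FROM CheckIn;

-- Walk-ups: checked in without a reservation
SELECT LastName, FirstName, FlightCode, FlightDate
  FROM CheckIn
EXCEPT
SELECT LastName, FirstName, FlightCode, FlightDate
  FROM FlightPassengers;
```

With the sample data, these should return Sam Smith and Missy Anderson respectively, the same passengers found by the outer-join versions. The outer-join form has the advantage of being able to return columns, such as Seat, that are not part of the comparison.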
With the scenario’s data in place and verified with set-difference queries, it’s time to merge the check-in
data into the FlightPassengers table.