MANNING
DEEP DIVES
Volume 2
EDITED BY
Kalen Delaney ● Louis Davidson ● Greg Low ● Brad McGehee ● Paul Nielsen ● Paul Randal ● Kimberly Tripp
MVP AUTHORS
Johan Åhlén ● Gogula Aryalingam ● Glenn Berry ● Aaron Bertrand ● Kevin G. Boles ● Robert Cain ● Tim Chapman ● Denny Cherry ● Michael Coles ● Rod Colledge
John Paul Cook ● Louis Davidson ● Rob Farley ● Grant Fritchey ● Darren Gosbell ● Sergio Govoni ● Allan Hirt ● Satya Jayanty ● Tibor Karaszi ● Jungsun Kim
Tobiasz Koprowski ● Hugo Kornelis ● Ted Krueger ● Matija Lah ● Greg Low ● Rodney Landrum ● Greg Larsen ● Peter Larsson ● Andy Leonard ● Ami Levin ● John Magnabosco
Jennifer McCown ● Brad McGehee ● Siddharth Mehta ● Ben Miller ● Allan Mitchell ● Tim Mitchell ● Luciano Moreira ● Jessica Moss ● Shahriar Nikkhah ● Paul Nielsen
Robert Pearl ● Boyan Penev ● Pedro Perfeito ● Pawel Potasinski ● Mladen Prajdić ● Abolfazl Radgoudarzi ● Denis Reznik ● Rafael Salas ● Edwin Sarmiento
Chris Shaw ● Gail Shaw ● Linchi Shea ● Jason Strate ● Paul Turley ● William Vaughn ● Peter Ward ● Joe Webb ● John Welch ● Allen White ● Thiago Zavaschi
The authors of this book support the children of Operation Smile
SQL Server MVP Deep Dives
Volume 2
Edited by Kalen Delaney ■ Louis Davidson ■ Greg Low ■ Brad McGehee ■ Paul Nielsen ■ Paul Randal ■ Kimberly Tripp
MANNING
SHELTER ISLAND
For online information and ordering of this and other Manning books, please visit
www.manning.com. The publisher offers discounts on this book when ordered in quantity.
For more information, please contact
Special Sales Department
Manning Publications Co.
20 Baldwin Road
PO Box 261
Shelter Island, NY 11964
Email:
©2012 by Manning Publications Co. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in
any form or by means electronic, mechanical, photocopying, or otherwise, without prior written
permission of the publisher.
Many of the designations used by manufacturers and sellers to distinguish their products are
claimed as trademarks. Where those designations appear in the book, and Manning
Publications was aware of a trademark claim, the designations have been printed in initial caps
or all caps.
Recognizing the importance of preserving what has been written, it is Manning’s policy to have
the books we publish printed on acid-free paper, and we exert our best efforts to that end.
Recognizing also our responsibility to conserve the resources of our planet, Manning books
are printed on paper that is at least 15 percent recycled and processed without the use of
elemental chlorine.
Manning Publications Co.
20 Baldwin Road
PO Box 261
Shelter Island, NY 11964
Development editor: Cynthia Kane
Copyeditor: Liz Welch, Linda Recktenwald
Project editor: Barbara Mirecki
Typesetter: Marija Tudor
Cover designer: Marija Tudor
ISBN 9781617290473
Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 – MAL – 16 15 14 13 12 11
To all the children of Operation Smile
MVP authors and their chapters
Johan Åhlén 53
Gogula Aryalingam 57
Glenn Berry 31
Aaron Bertrand 19
Kevin G. Boles 20
Robert Cain 45
Tim Chapman 38
Denny Cherry 5
Michael Coles 46
Rod Colledge 14
John Paul Cook 21
Louis Davidson 4
Rob Farley 2
Grant Fritchey 32
Darren Gosbell 54
Sergio Govoni 22
Allan Hirt 7
Satya Jayanty 16
Tibor Karaszi 17
Jungsun Kim 39
Tobiasz Koprowski 18
Hugo Kornelis 23
Ted Krueger 50
Matija Lah 24
Rodney Landrum 15
Greg Larsen 9
Peter Larsson 25
Andy Leonard 47
Ami Levin 1
Greg Low 41
John Magnabosco 11
Jennifer McCown 37
Brad McGehee 36
Siddharth Mehta 60
Ben Miller 26
Allan Mitchell 59
Tim Mitchell 51
Luciano Moreira 27
Jessica M. Moss 40
Paul Nielsen 6
Shahriar Nikkhah 48
Robert Pearl 34
Boyan Penev 55
Pedro Perfeito 58
Pawel Potasinski 12
Mladen Prajdić 28
Abolfazl Radgoudarzi 48
Denis Reznik 29
Rafael Salas 52
Edwin Sarmiento 44
Chris Shaw 3
Gail Shaw 8
Linchi Shea 35
Jason Strate 33
Paul Turley 43
William Vaughn 42
Peter Ward 13
Joe Webb 10
John Welch 49
Allen White 30
Thiago Zavaschi 56
brief contents
PART 1 ARCHITECTURE ............................................... 1
1 ■ Where are my keys? 3
2 ■ “Yes, we are all individuals” A look at uniqueness in the world of SQL 16
3 ■ Architectural growth pains 26
4 ■ Characteristics of a great relational database 37
5 ■ Storage design considerations 49
6 ■ Generalization: the key to a well-designed schema 60
PART 2 DATABASE ADMINISTRATION .......................... 65
7 ■ Increasing availability through testing 67
8 ■ Page restores 79
9 ■ Capacity planning 87
10 ■ Discovering your servers with PowerShell and SMO 95
11 ■ Will the real Mr. Smith please stand up? 105
12 ■ Build your own SQL Server 2008 performance dashboard 111
13 ■ SQL Server cost recovery 121
14 ■ Best practice compliance with Policy-Based Management 128
15 ■ Using SQL Server Management Studio to the fullest 138
16 ■ Multiserver management and Utility Explorer—best tools for the DBA 146
17 ■ Top 10 SQL Server admin student misconceptions 157
18 ■ High availability of SQL Server in the context of Service Level Agreements 167
PART 3 DATABASE DEVELOPMENT ............................ 175
19 ■ T-SQL: bad habits to kick 177
20 ■ Death by UDF 185
21 ■ Using regular expressions in SSMS 195
22 ■ SQL Server Denali: what’s coming next in T-SQL 200
23 ■ Creating your own data type 211
24 ■ Extracting data with regular expressions 223
25 ■ Relational division 234
26 ■ SQL FILESTREAM: to BLOB or not to BLOB 245
27 ■ Writing unit tests for Transact-SQL 255
28 ■ Getting asynchronous with Service Broker 267
29 ■ Effective use of HierarchyId 278
30 ■ Let Service Broker help you scale your application 287
PART 4 PERFORMANCE TUNING AND OPTIMIZATION .... 297
31 ■ Hardware 201: selecting and sizing database server hardware 299
32 ■ Parameter sniffing: your best friend…except when it isn’t 309
33 ■ Investigating the plan cache 320
34 ■ What are you waiting for? An introduction to waits and queues 331
35 ■ You see sets, and I see loops 343
36 ■ Performance-tuning the transaction log for OLTP workloads 353
37 ■ Strategies for unraveling tangled code 362
38 ■ Using PAL to analyze SQL Server performance 374
39 ■ Tuning JDBC for SQL Server 384
PART 5 BUSINESS INTELLIGENCE .............................. 395
40 ■ Creating a formal Reporting Services report part library 397
41 ■ Improving report layout and visualization 405
42 ■ Developing sharable managed code expressions in SSRS 411
43 ■ Designing reports with custom MDX queries 424
44 ■ Building a scale-out Reporting Services farm 436
45 ■ Creating SSRS reports from SSAS 448
46 ■ Optimizing SSIS for dimensional data loads 457
47 ■ SSIS configurations management 469
48 ■ Exploring different types of enumerators in the SSIS Foreach Loop container 480
49 ■ Late-arriving dimensions in SSIS 494
50 ■ Why automate tasks with SSIS? 503
51 ■ Extending SSIS using the Script component 515
52 ■ ETL design checklist 526
53 ■ Autogenerating SSAS cubes 538
54 ■ Scripting SSAS databases – AMO and PowerShell, Better Together 548
55 ■ Managing context in MDX 557
56 ■ Using time intelligence functions in PowerPivot 569
57 ■ Easy BI with Silverlight PivotViewer 577
58 ■ Excel as a BI frontend tool 585
59 ■ Real-time BI with StreamInsight 597
60 ■ BI solution development design considerations 608
contents
preface xxxv
acknowledgments xxxix
about Operation Smile xli
about this book xliii
about the editors xlv
about SQL Server MVPs xlvi
PART 1 ARCHITECTURE ................................................ 1
EDITED BY LOUIS DAVIDSON
1
Where are my keys? 3
AMI LEVIN
Keys in the relational model 4
The debate 5
The arguments 6
    Pro artificial keys 6 ■ Pro natural keys 8
Additional considerations 10
    Natural keys assist the optimizer 11 ■ Artificial keys are the de facto standard 12 ■ Modularity, portability, and foreseeing the future 12 ■ IDENTITY columns may result in value gaps and “run out” of values 13
Recommendations 13
    Simplicity and aesthetics 14
Summary 14
2
“Yes, we are all individuals”
A look at uniqueness in the world of SQL 16
ROB FARLEY
Introducing uniqueness 16
Constrained to uniqueness 16
    Primary keys 17 ■ Unique constraints 18 ■ Unique indexes 19
Unique constraint or unique index? 19
    Advantages of the unique index 19 ■ Advantages of the unique constraint 20
Uniqueness in results 20
    The good and the bad of DISTINCT 20 ■ DISTINCT or GROUP BY 21 ■ Are they needed at all? 22 ■ Unnecessary grouping 23 ■ Being guided by “that” error 24
Summary 25
3
Architectural growth pains 26
CHRIS SHAW
Manage your data types 27
    IDENTITY case in point 27 ■ Database design and scalability 28
Naming conventions 28
Inconsistent design 29
Normalization 30
    Overnormalized 30 ■ Undernormalized 30
Primary keys and foreign keys 31
    GUIDs as primary keys 31 ■ System-generated integers as primary keys 32 ■ Generating your own ID values 32
Indexes 33
    Underindexing 33 ■ Overindexing 35 ■ Maintenance 35 ■ Fill factor 36
Summary 36
4
Characteristics of a great relational database 37
LOUIS DAVIDSON
Coherent 38
    Standards based 38 ■ Reasonable names and data types 39 ■ Cohesive 40 ■ Needs little documentation 40
Normal 40
Fundamentally sound 41
Documented 42
Secure 43
Encapsulated 44
Well performing 46
Summary 47
5
Storage design considerations
49
DENNY CHERRY
Selecting the correct RAID type 49
    RAID 0 49 ■ RAID 1 50 ■ RAID 5 51 ■ RAID 6 51 ■ RAID 10 52 ■ RAID 50 52 ■ When to use RAID 5 52 ■ When to use RAID 6 53 ■ When to use RAID 10 53
File placement 53
    Index files 54 ■ Transaction log files 54 ■ tempdb database 54
Disk alignment 54
    Correcting disk alignment on Windows 2003 and earlier 55 ■ Correcting disk alignment in Windows 2008 and later 56 ■ Correcting after the partition has been created 56 ■ Aligning on the array 56
Snapshots 56
    Snapshots with a VSS-enabled storage array 57 ■ Snapshots with a non-VSS-enabled storage array 57 ■ Snapshots as a backup process 58 ■ Using snapshots to present storage to downstream environments 58
Clones 59
Summary 59
6
Generalization: the key to a well-designed schema 60
PAUL NIELSEN
A place for normalization 60
Lessons from the UIX discipline 61
Generalization defined 62
Benefits of generalization 63
Summary 64
PART 2 DATABASE ADMINISTRATION ............................ 65
EDITED BY PAUL RANDAL AND KIMBERLY TRIPP
7
Increasing availability through testing 67
ALLAN HIRT
Testing—it’s not just for application functionality 68
The missing link 69
Knowledge is power 71
Test early, test often 72
Automated versus manual testing 73
What needs to be tested? 73
First things first 75
Remember the big picture, too 77
Summary 78
8
Page restores 79
GAIL SHAW
Restore granularities 79
Requirements and limitations 80
    Recovery model and availability of log backups 80 ■ SQL Server Edition 80 ■ Page type of the damaged page 80
Performing a page restore 81
What’s coming? 84
Summary 85
9
Capacity planning 87
GREG LARSEN
What is capacity planning? 87
Gathering current database disk space usage 88
Performance metrics 90
Summary 94
10
Discovering your servers with PowerShell and SMO 95
JOE WEBB
Using PowerShell and Excel 95
Using SMO with PowerShell 96
Collecting instance and machine information 97
Collecting SQL Agent job information 98
Collecting database information 100
Summary 103
11
Will the real Mr. Smith please stand up? 105
JOHN MAGNABOSCO
Personally identifiable data 106
Today’s superhero: the DBA 106
    Our superpowers 106 ■ Tools of the trade 107
Summary 109
12
Build your own SQL Server 2008
performance dashboard 111
PAWEL POTASINSKI
DMVs as the source of performance-related information 111
Using SQLCLR to get the performance counter values 112
Sample solution for performance monitoring 115
Use Reporting Services for performance monitoring 118
Some ideas to improve the solution 119
Summary 119
13
SQL Server cost recovery 121
PETER WARD
The context for SQL Server as a Service 121
What’s SQL Server as a Service? 122
An introduction to chargebacks 123
Implementing a chargeback model 125
Summary 127
14
Best practice compliance with
Policy-Based Management 128
ROD COLLEDGE
The context for contemporary database administration 128
The importance of best practice compliance 129
Central Management Servers 130
Policy-Based Management 131
    Surface area configuration 132 ■ Sysadmin membership 134
Policy-Based Management with Central Management Servers 135
Summary 137
15
Using SQL Server Management Studio to the fullest 138
RODNEY LANDRUM
Querying many servers at once 138
Creating and using a scripting solution with templates 141
Scripting multiple objects and now data, too 143
Summary 145
16
Multiserver management and
Utility Explorer—best tools for the DBA 146
SATYA SHYAM K JAYANTY
SQL Server 2008 R2 tools for the DBA 146
Tools of the trade 147
Managing multiple instances using Utility Control Point 148
Multiserver management and administration 152
Best practices 155
Summary 156
17
Top 10 SQL Server admin student misconceptions 157
TIBOR KARASZI
Simple recovery model 157
Default collation 158
Table-level backups 159
Using replication for high availability 160
Timing query performance 162
Shrinking databases 163
Auditing login access 163
Tail-log backups 165
Database defaults 165
Difficulty 166
Summary 166
18
High availability of SQL Server in the context of Service
Level Agreements 167
TOBIASZ JANUSZ KOPROWSKI
High availability—a definition 167
Types of unavailability 169
Unavailability indicators 169
High availability options in SQL Server 170
Service Level Agreement 171
Measurement indicators 171
The structure of a Service Level Agreement 172
Service Level Agreements: the context for high
availability 173
Summary 174
Useful links 174
PART 3 DATABASE DEVELOPMENT ............................. 175
EDITED BY PAUL NIELSEN
19
T-SQL: bad habits to kick 177
AARON BERTRAND
SELECT * 177
Declaring VARCHAR without length 179
Not choosing the right data type 181
Mishandling date range queries 182
Making assumptions about ORDER BY 183
Summary 184
20
Death by UDF 185
KEVIN BOLES
Poor estimates 185
Row-by-row processing 188
What can you do about it? 190
    Inline table valued function solution 190 ■ Set-based solution 191
What about code reuse? 192
One last example of how bad scalar UDFs can be 193
Summary 194
21
Using regular expressions in SSMS 195
JOHN PAUL COOK
Eliminating blank lines 195
Removing extra newline characters 196
Collapsing multiple lines into a single line 197
Using the beginning-of-line metacharacter 197
Using the end-of-line metacharacter 198
Summary 198
22
SQL Server Denali: what’s coming next in T-SQL 200
SERGIO GOVONI
OFFSET and FETCH 200
    SQL Server 2005 and 2008 solution 201 ■ SQL Server Denali solution 201 ■ Comparing execution plan results 202
SEQUENCE 202
    Tips for using SEQUENCE 203 ■ Restrictions 203
EXECUTE…WITH RESULT SETS 205
THROW 208
Summary 209
23
Creating your own data type 211
HUGO KORNELIS
Anatomy of a CLR user-defined type 212
    …But do you need it at all? 212 ■ Representations and conversions 213 ■ How about NULL? 214
Building the data type: the bare basics 214
    Starting the project 214 ■ Adding the fields for the native representation 215 ■ Editing the signature 215 ■ Converting between .NET and text 216 ■ Converting between .NET and serialized 219 ■ Handling NULLs 220 ■ Using the data type 222
Summary 222
24
Extracting data with regular expressions 223
MATIJA LAH
Understanding before coding 223
    Background 224 ■ An incredibly brief introduction to regular expressions, matches, and groups 225 ■ Regular expressions and SQL Server 226 ■ Regular expressions and the .NET Framework 226
The solution 227
    The core 227 ■ The SQL CLR user-defined function 229 ■ The SSIS script component 230
Homework 232
Summary 232
25
Relational division 234
PETER LARSSON
Why use relational division? 234
Defining relational division 234
Background 235
Sample data for two simple cases 236
Comparison charts 238
Let’s go on with the real stuff 240
Set-based solution to common relational division 240
Does one query exist for all types of relational division? 242
Summary 243
26
SQL FILESTREAM: to BLOB or not to BLOB 245
BEN MILLER
To FILESTREAM or not to FILESTREAM 245
Configuring FILESTREAM in SQL Server 247
    Operating system configuration 247 ■ SQL Server configuration 248 ■ Database configuration 249
Creating a table that uses FILESTREAM 250
Things to consider 251
How do I use FILESTREAM? 252
Summary 254
27
Writing unit tests for Transact-SQL 255
LUCIANO MOREIRA
Unit test basics 255
    Unit test for databases 256 ■ T-SQL unit test walkthrough 257
Automating unit test execution 265
Summary 265
28
Getting asynchronous with Service Broker 267
MLADEN PRAJDIĆ
The Service Broker usage template 267
Creating Service Broker objects 270
Summary 277
29
Effective use of HierarchyId 278
DENIS REZNIK
Hierarchies in a database 278
Introduction to the HierarchyId data type 279
    Using the HierarchyId data type 279 ■ Physical HierarchyId data organization 280
Effective indexing 282
    Depth-first indexes 283 ■ Breadth-first indexes 284
More information about HierarchyId 285
Summary 286
30
Let Service Broker help you scale your application 287
ALLEN WHITE
Scalable solutions 287
Service Broker objects 288
    Security 288 ■ Message types 289 ■ Contracts 289 ■ Queues 289 ■ Services 289 ■ Conversations 290 ■ Endpoints 290 ■ Routes 290 ■ Remote service binding 291
ETL trigger demonstration 292
Summary 295
PART 4 PERFORMANCE TUNING AND OPTIMIZATION...... 297
EDITED BY BRAD M. MCGEHEE
31
Hardware 201: selecting and sizing database
server hardware 299
GLENN BERRY
Why database server hardware is important 300
Scaling up or scaling out 300
SQL Server and hardware selection 301
Database server–specific hardware factors 302
Intel vs. AMD processors 304
Memory recommendations 304
Traditional storage subsystems 305
New developments in storage subsystems 306
Benchmarking and sizing tools 307
Summary 308
32
Parameter sniffing: your best friend…except when it isn’t 309
GRANT FRITCHEY
Understanding parameter sniffing 309
Parameter sniffing gone wrong 312
Dealing with bad parameter sniffing 313
    OPTIMIZE FOR 313 ■ WITH RECOMPILE 315 ■ Local variables 316 ■ Plan guides 316 ■ Turn off parameter sniffing 318
Summary 318
33
Investigating the plan cache 320
JASON STRATE
Plan cache dynamic management objects 320
    sys.dm_exec_cached_plans 321 ■ sys.dm_exec_query_plan 321
Investigating missing indexes 322
Investigating index usage 324
Investigating operations 325
Investigating index scans 327
Investigating parameters 328
Plan cache considerations 329
Summary 330
34
What are you waiting for?
An introduction to waits and queues 331
ROBERT PEARL
Introduction to total response time 331
What are wait stats? 332
    Why use wait stats? 332 ■ Wait type categories 332
The execution model 333
Viewing and reporting on wait statistics 335
Calculating wait time: signal waits vs. resource waits 337
Correlating performance data: putting it together 339
    General I/O issues 339 ■ Buffer I/O latch issues 339 ■ Blocking and locking 340 ■ CPU pressure 340 ■ Parallelism 340 ■ Memory pressure 341
Summary 341
35
You see sets, and I see loops 343
LINCHI SHEA
What loops? 343
The loop perspective 344
Loops in a query execution plan 346
Loops in complex queries 348
User-defined scalar functions in implicit loops 349
Merging multiple loops into one 350
Parallelizing loops 350
Linked server calls in a loop 351
Squeezing the fat out of loops with a slim table 352
Summary 352