Microsoft SQL Server 2008 Study Guide, Part 7


Nielsen c01.tex V4 - 07/21/2009 11:57am Page 22
Part I Laying the Foundation
■ SQL Data Services (SDS): The database side of Microsoft Azure is a full-featured relational
SQL Server in the cloud that provides an incredible level of high availability and scalable
performance without any capital expenses or software licenses at a very reasonable cost. I’m a
huge fan of SDS and I host my ISV software business on SDS.
Version 1 does have a few limitations: 10GB per database, no heaps (I can live with that), no
access to the file system or other SQL Servers (distributed queries, etc.), and you’re limited to
SQL Server logins.
Besides the general descriptions here, Appendix A includes a chart detailing the differences
between the multiple editions.
Exploring the Metadata
When SQL Server is initially installed, it already contains several system objects. In addition, every new
user database contains several system objects, including tables, views, stored procedures, and functions.
Within Management Studio's Object Explorer, the system databases appear under the Databases ➪ System Databases node.
System databases
SQL Server uses five system databases to store system information, track operations, and provide a temporary work area. In addition, the model database is a template for new user databases. These five system databases are as follows:
■ master: Contains information about the server's databases. In addition, objects in master are available to other databases. For example, stored procedures in master may be called from a user database.
■ msdb: Maintains lists of activities, such as backups and jobs, and tracks which database
backup goes with which user database
■ model: The template database from which new databases are created. Any object placed in the
model database will be copied into any new database.
■ tempdb: Used for ad hoc tables by all users, batches, stored procedures (including Microsoft stored procedures), and the SQL Server engine itself. If SQL Server needs to create temporary heaps or lists during query execution, it creates them in tempdb. tempdb is dropped and recreated when SQL Server is restarted.
■ resource: This hidden database, added in SQL Server 2005, contains information that was
previously in the master database and was split out from the master database to make service
pack upgrades easier to install.
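As a quick sanity check, the visible system databases can be listed from the sys.databases catalog view; a minimal sketch (the hidden resource database never appears in this view):

```sql
-- List the four visible system databases by their fixed database IDs.
SELECT name, database_id, state_desc
FROM sys.databases
WHERE database_id <= 4   -- 1=master, 2=tempdb, 3=model, 4=msdb
ORDER BY database_id;
```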
Metadata views
Metadata is data about data. One of Codd’s original rules for relational databases is that information
about the database schema must be stored in the database using tables, rows, and columns, just like
user data. It is this data about the data that makes it easy to write code to navigate and explore the
database schema and configuration. SQL Server has several types of metadata:
■ Catalog views: Provide information about static metadata — such as tables, security, and
server configuration
The World of SQL Server 1
■ Dynamic management views (DMVs) and functions: Yield powerful insight into the
current state of the server and provide data about things such as memory, threads, stored
procedures in cache, and connections
■ System functions and global variables: Provide data about the current state of the server,
the database, and connections for use in scalar expressions
■ Compatibility views: Serve as backward compatibility views that simulate the system tables from SQL Server 2000 and earlier versions. Note that compatibility views are deprecated, meaning they'll disappear in SQL Server 11, the next version of SQL Server.
■ Information schema views: The ANSI SQL-92 standard nonproprietary views used to examine the schema of any database product. Portability as a database design goal is a lost cause, and these views are of little practical use for any DBA or database developer who exploits the features of SQL Server. Note that they have been updated for SQL Server 2008, so if you used these in the past they may need to be tweaked.
These metadata views are all listed in Management Studio’s Object Explorer under the Database ➪
Views ➪ System Views node, or under the Database ➪ Programmability ➪ Functions ➪ Metadata
Function node.
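For example, catalog views, DMVs, and information schema views are all queried like ordinary tables; a brief sketch using standard system view names:

```sql
-- Static metadata from a catalog view: every user table in this database
SELECT name, create_date
FROM sys.tables
ORDER BY name;

-- Dynamic metadata from a DMV: currently connected user sessions
SELECT session_id, login_name, status
FROM sys.dm_exec_sessions
WHERE is_user_process = 1;

-- The ANSI-standard information schema view over similar table metadata
SELECT TABLE_SCHEMA, TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE';
```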
What’s New?
There are about 50 new features in SQL Server 2008, as you’ll discover in the What’s New in 2008
sidebars in many chapters. Everyone loves lists (and so do I), so here’s my list highlighting the best of
what’s new in SQL Server 2008.
Paul’s top-ten new features in SQL Server 2008:
10. PowerShell — The new Windows scripting language has been integrated into SQL Server.
If you are a DBA willing to learn PowerShell, this technology has the potential to radically
change how you do your daily jobs.
9. New data types — Specifically, I’m more excited about Date, Time, and DateTime2 than
Spatial and HierarchyID.
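A minimal sketch of the new date and time types (datetime2 offers a wider range and up to 100-nanosecond precision compared to the old datetime; inline DECLARE initialization is itself new in SQL Server 2008):

```sql
DECLARE @d   date         = '2009-07-21';                  -- date only, no time portion
DECLARE @t   time(7)      = '11:57:00.1234567';            -- time only, 100ns precision
DECLARE @dt2 datetime2(7) = '2009-07-21 11:57:00.1234567'; -- wide range, high precision

SELECT @d AS JustDate, @t AS JustTime, @dt2 AS DateAndTime;
```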
8. Tablix — Reporting Services gains the Tablix and Dundas controls, and loses that IIS
requirement.
7. Query processing optimizations — The new star joins provide incredible out-of-the-box
performance gains for some types of queries. Also, although partitioned tables were introduced
in SQL Server 2005, the query execution plan performance improvements and new UI for
partitioned tables in SQL Server 2008 will increase their adoption rate.
6. Filtered indexes — The ability to create a small targeted nonclustered index over a very
large table is the perfect logical extension of indexing, and I predict it will be one of the most
popular new features.
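A filtered index is simply a nonclustered index with a WHERE clause; this sketch assumes a hypothetical dbo.Orders table where only a small slice of rows is still open:

```sql
-- Index only the open orders rather than the entire very large table,
-- keeping the index small and cheap to maintain.
CREATE NONCLUSTERED INDEX IX_Orders_Open
    ON dbo.Orders (OrderDate, CustomerID)
    WHERE Status = 'Open';
```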
5. Management Data Warehouse — A new consistent method of gathering performance
data for further analysis by Performance Studio or custom reports and third parties lays the
foundation for more good stuff in the future.
4. Data compression — The ability to trade CPU cycles for reduced IO can significantly
improve the scalability of some enterprise databases. I believe this is the sleeper feature that
will be the compelling reason for many shops to upgrade to SQL Server 2008 Enterprise
Edition.
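Data compression is enabled per table or index (Enterprise Edition only); a hedged sketch, assuming a hypothetical dbo.Sales table:

```sql
-- Estimate the savings first, then rebuild with page compression.
EXEC sp_estimate_data_compression_savings
    @schema_name = 'dbo', @object_name = 'Sales',
    @index_id = NULL, @partition_number = NULL,
    @data_compression = 'PAGE';

ALTER TABLE dbo.Sales
    REBUILD WITH (DATA_COMPRESSION = PAGE);
```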

The third top new feature is Management Studio’s many enhancements. Even though it’s just a tool and
doesn’t affect the performance of the engine, it will help database developers and DBAs be more produc-
tive and it lends a more enjoyable experience to every job role working with SQL Server:
3. Management Studio — The primary UI is supercharged with multi-server queries and
configuration servers, IntelliSense, T-SQL debugger, customizable Query Editor tabs, Error
list in Query Editor, easily exported data from Query Results, launch profiler from Query
Editor, Object Search, a vastly improved Object Explorer Details page, a new Activity Monitor,
improved ways to work with query plans, and it’s faster.
And for the final top two new features, one for developers and one for DBAs:
2. Merge and Table-valued parameters — Wow! It's great to see new T-SQL features on the list. Table-valued parameters alone are the compelling reason I upgraded my Nordic software to SQL Server 2008. Table-valued parameters revolutionize the way application transactions communicate with the database, which earns them the top SQL Server 2008 database developer feature and number two in this list.
The new merge command combines insert, update, and delete into a single transaction and is a slick way to code an upsert operation. I've recoded many of my upsert stored procedures to use merge with excellent results.
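A minimal merge upsert sketch, assuming hypothetical dbo.Customer target and dbo.CustomerStaging source tables:

```sql
MERGE dbo.Customer AS target
USING dbo.CustomerStaging AS source
    ON target.CustomerID = source.CustomerID
WHEN MATCHED THEN                      -- row exists: update it
    UPDATE SET target.Name  = source.Name,
               target.Email = source.Email
WHEN NOT MATCHED BY TARGET THEN        -- row missing: insert it
    INSERT (CustomerID, Name, Email)
    VALUES (source.CustomerID, source.Name, source.Email);
```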
1. Policy-based management (PBM) — PBM means that servers and databases can be declar-
atively managed by applying and enforcing consistent policies, instead of running ad hoc
scripts. This feature has the potential to radically change how enterprise DBAs do their daily
jobs, which is why it earns the number one spot on my list of top ten SQL Server 2008
features.
Going, Going, Gone?
With every new version of SQL Server, some features change or are removed because they no longer make sense with the newer feature set. Discontinued means a feature used to work in a previous SQL Server version but no longer appears in SQL Server 2008.
Deprecated means the feature still works in SQL Server 2008, but it's going to be removed in a future version. There are two levels of deprecation: Microsoft releases both a list of the features that will be gone in the next version, and a list of the features that will be gone in some future version but will still work in the next version.
Books Online has details about all three lists (just search for deprecated), but here are the highlights:
Going Eventually (Deprecated)
These features are deprecated and will be removed in some future version of SQL Server. You should try to remove them from your code:
■ SQLOLEDB
■ Timestamp (although the synonym rowversion continues to be supported)
■ Text, ntext, and image data types
■ Older full-text catalog commands
■ Sp_configure ‘user instances enabled’
■ Sp_lock
■ SQL-DMO
■ Security-related sp_ stored procedures, e.g., sp_adduser
■ Setuser (use Execute as instead)
■ System tables
■ Group by all
Going Soon (Deprecated)
The following features are deprecated and will be removed in the next version of SQL Server. You should definitely remove these commands from your code:
■ Older backup and restore options
■ SQL Server 2000 compatibility level
■ DATABASEPROPERTY command
■ sp_dboption
■ FastFirstRow query hint (use Option(Fast n))
■ ANSI-89 (legacy) outer join syntax (*=, =*); use ANSI-92 syntax instead
■ Raiserror integer string format
■ Client connectivity using DB-Lib and Embedded SQL for C
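For example, the legacy ANSI-89 outer join syntax converts directly to an ANSI-92 join (hypothetical Customer and Order tables):

```sql
-- Deprecated ANSI-89 legacy outer join (will not run in later versions):
-- SELECT c.Name, o.OrderID
-- FROM Customer c, [Order] o
-- WHERE c.CustomerID *= o.CustomerID;

-- ANSI-92 replacement:
SELECT c.Name, o.OrderID
FROM dbo.Customer AS c
LEFT OUTER JOIN dbo.[Order] AS o
    ON c.CustomerID = o.CustomerID;
```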
Gone (Discontinued)
The following features are discontinued in SQL Server 2008:
■ SQL Server 6, 6.5, and 7 compatibility levels
■ Surface Area Configuration Tool (unfortunately)
■ Notification Services
■ Dump and Load commands (use Backup and Restore)
■ Backup log with No-Log
■ Backup log with truncate_only
■ Backup transaction
■ DBCC Concurrencyviolation
■ sp_addgroup, sp_changegroup, sp_dropgroup, and sp_helpgroup (use security roles instead)
The very useful Profiler trace feature can report the use of any deprecated features.
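Deprecated feature usage can also be spot-checked from T-SQL via the performance counters DMV (the counter object name varies slightly by instance, hence the LIKE):

```sql
-- Each row counts uses of one deprecated feature since the server started.
SELECT instance_name AS deprecated_feature, cntr_value AS usage_count
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Deprecated Features%'
  AND cntr_value > 0
ORDER BY cntr_value DESC;
```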
Summary
If SQL Server 2005 was the "kitchen sink" version of SQL Server, then SQL Server 2008 is the version
that focuses squarely on managing the enterprise database.
Some have written that SQL Server 2008 is the second step of a two-step release. In the same way that
SQL Server 2000 was part two to SQL Server 7, the theory is that SQL Server 2008 is part two to SQL
Server 2005.
At first glance this makes sense, because SQL Server 2008 is an evolution of the SQL Server 2005
engine, in the same way that SQL Server 2000 was built on the SQL Server 7 engine. However, as I
became intimate with SQL Server 2008, I changed my mind.
Consider the significant new technologies in SQL Server 2008: policy-based management, Performance
Data Warehouse, PowerShell, data compression, and Resource Governor. None of these technologies
existed in SQL Server 2005.
In addition, think of the killer technologies introduced in SQL Server 2005 that are being extended in
SQL Server 2008. The most talked about new technology in SQL Server 2005 was CLR. Hear much
about CLR in SQL Server 2008? Nope. Service Broker has some minor enhancements. Two SQL Server
2005 new technologies, HTTP endpoints and Notification Services, are actually discontinued in SQL
Server 2008. Hmmm, I guess they should have been on the SQL Server 2005 deprecation list.
No, SQL Server 2008 is more than a SQL Server 2005 sequel. SQL Server 2008 is a fresh new vision
for SQL Server. SQL Server 2008 is the first punch of a two-punch setup focused squarely at manag-
ing the enterprise-level database. SQL Server 2008 is a down payment on the big gains coming in SQL
Server 11.
I’m convinced that the SQL Server Product Managers nailed it and that SQL Server 2008 is the best
direction possible for SQL Server. There’s no Kool-Aid here — it’s all way cool.
Nielsen c02.tex V4 - 07/21/2009 12:02pm Page 27
Data Architecture
IN THIS CHAPTER
Pragmatic data architecture
Evaluating database designs
Designing performance into the database
Avoiding normalization over-complexity
Relational design patterns
You can tell by looking at a building whether there's an elegance to the architecture, but architecture is more than just good looks. Architecture brings together materials, foundations, and standards. In the same way, data architecture is the study of defining what a good database is and how one builds a good database. That's why data architecture is more than just data modeling, more than just server configuration, and more than just a collection of tips and tricks.
Data architecture is the overarching design of the database, how the database
should be developed and implemented, and how it interacts with other software.
In this sense, data architecture can be related to the architecture of a home, a fac-
tory, or a skyscraper. Data architecture is defined by the Information Architecture
Principle and the six attributes by which every database can be measured.
Enterprise data architecture extends the basic ideas of designing a single database
to include designing which types of databases serve which needs within the
organization, how those databases share resources, and how they communicate
with one another and other software. In this sense, enterprise data architecture
is community planning or zoning, and is concerned with applying the best
database meta-patterns (e.g., relational OLTP database, object-oriented database,
multidimensional) to an organization’s various needs.
Author's Note
Data architecture is a passion of mine, and without question the subject belongs in any comprehensive database book. Because it's the foundation for the rest of the book — the "why" behind the "how" of designing, developing, and operating a database — it makes sense to position it toward the beginning of the book. Even if you're not in the role of database architect yet, I hope you enjoy the chapter and that it presents a useful viewpoint for your database career. Keep in mind that you can return to read this chapter later, at any time when the information might be more useful to you.
Information Architecture Principle
For any complex endeavor, there is value in beginning with a common principle to drive designs, pro-
cedures, and decisions. A credible principle is understandable, robust, complete, consistent, and stable.
When an overarching principle is agreed upon, conflicting opinions can be objectively measured, and
standards can be decided upon that support the principle.
The Information Architecture Principle encompasses the three main areas of information management:
database design and development, enterprise data center management, and business intelligence analysis.
Information Architecture Principle: Information is an organizational asset, and, according to its
value and scope, must be organized, inventoried, secured, and made readily available in a usable format
for daily operations and analysis by individuals, groups, and processes, both today and in the future.
Unpacking this principle reveals several practical implications. There should be a known inventory of
information, including its location, source, sensitivity, present and future value, and current owner.
While most organizational information is stored in IT databases, un-inventoried critical data is often
found scattered throughout the organization in desktop databases, spreadsheets, scraps of papers, Post-it
notes, and (the most dangerous of all) inside the head of key employees.
Just as the value of physical assets varies from asset to asset and over time, the value of information is
also variable and so must be assessed. Information value may be high for an individual or department,
but less valuable to the organization as a whole; information that is critical today might be meaningless
in a month; or information that may seem insignificant individually might become critical for organiza-
tional planning once aggregated.
If the data is to be made easily available in the future, then current designs must be loosely connected, or coupled, to avoid locking the data in a rigid, but brittle, database.
Database Objectives
Based on the Information Architecture Principle, every database can be architected or evaluated by six interdependent database objectives. Four of these objectives are primarily a function of design, development, and implementation: usability, extensibility, data integrity, and performance. Availability and security are more a function of implementation than design.
Data Architecture 2
With sufficient design effort and a clear goal of meeting all six objectives, it is fully possible to design
and develop an elegant database that does just that. The idea that one attribute is gained only at the
expense of the other attributes is a myth.
Each objective can be measured on a continuum. The data architect is responsible for informing the
organization about these six objectives, including the cost associated with meeting each objective,
the risk of failing to meet the objective, and the recommended level for each objective.
It’s the organization’s privilege to then prioritize the objectives compared with the relative cost.
Usability
The usability of a data store (the architectural term for a database) involves the completeness of meeting
the organization’s requirements, the suitability of the design for its intended purpose, the effectiveness
of the format of data available to applications, the robustness of the database, and the ease of extracting
information (by programmers and power users). The most common reason why a database is less than
usable is an overly complex or inappropriate design.
Usability is enabled in the design by ensuring the following:
■ A thorough and well-documented understanding of the organizational requirements
■ Life-cycle planning of software features
■ Selecting the correct meta-pattern (e.g., relational OLTP database, object-oriented database, multidimensional) for the data store
■ Normalization and correct handling of optional data
■ Simplicity of design
■ A well-defined abstraction layer with stored procedures and views
Extensibility
The Information Architecture Principle states that the information must be readily available today
and in the future, which requires the database to be extensible, able to be easily adapted to meet new
requirements. Data integrity, performance, and availability are all mature and well understood by the
computer science and IT professions. While there may be many badly designed, poorly performing, and
often down databases, plenty of professionals in the field know exactly how to solve those problems. I
believe the least understood database objective is extensibility.
Extensibility is incorporated into the design as follows:
■ Normalization and correct handling of optional data
■ Generalization of entities when designing the schema
■ Data-driven designs that not only model the obvious data (e.g., orders, customers), but also
enable the organization to store the behavioral patterns, or process flow.
■ A well-defined abstraction layer with stored procedures and views that decouple the database
from all client access, including client apps, middle tiers, ETL, and reports.
■ Extensibility is also closely related to simplicity. Complexity breeds complexity, and inhibits
adaptation.
Data integrity
The ability to ensure that persisted data can be retrieved without error is central to the Information
Architecture Principle, and it was the first major problem tackled by the database world. Without data
integrity, a query’s answer cannot be guaranteed to be correct; consequently, there’s not much point in
availability or performance. Data integrity can be defined in multiple ways:
■ Entity integrity involves the structure (primary key and its attributes) of the entity. If the pri-
mary key is unique and all attributes are scalar and fully dependent on the primary key, then
the integrity of the entity is good. In the physical schema, the table’s primary key enforces
entity integrity.
■ Domain integrity ensures that only valid data is permitted in the attribute. A domain is a
set of possible values for an attribute, such as integers, bit values, or characters. Nullabil-
ity (whether a null value is valid for an attribute) is also a part of domain integrity. In the
physical schema, the data type and nullability of the row enforce domain integrity.
■ Referential integrity refers to the domain integrity of foreign keys. Domain integrity means
that if an attribute has a value, then that value must be in the domain. In the case of the
foreign key, the domain is the list of values in the related primary key. Referential integrity,
therefore, is not an issue of the integrity of the primary key but of the foreign key.
■ Transactional integrity ensures that every logical unit of work, such as inserting 100 rows or
updating 1,000 rows, is executed as a single transaction. The quality of a database product is
measured by its transactions’ adherence to the ACID properties: atomic — all or nothing, consis-
tent — the database begins and ends the transaction in a consistent state, isolated — one transaction
does not affect another transaction, and durable — once committed always committed.
In addition to these four generally accepted definitions of data integrity, I add user-defined data
integrity:
■ User-defined integrity means that the data meets the organization’s requirements. Simple
business rules, such as a restriction to a domain, limit the list of valid data entries. Check
constraints are commonly used to enforce these rules in the physical schema.
■ Complex business rules limit the list of valid data based on some condition. For example,
certain tours may require a medical waiver. Implementing these rules in the physical schema
generally requires stored procedures or triggers.
■ Some data-integrity concerns can’t be checked by constraints or triggers. Invalid, incomplete,
or questionable data may pass all the standard data-integrity checks. For example, an order
without any order detail rows is not a valid order, but no SQL constraint or trigger traps
such an order. The abstraction layer can assist with this problem, and SQL queries can locate
incomplete orders and help in identifying other less measurable data-integrity issues, including
wrong data, incomplete data, questionable data, and inconsistent data.
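These integrity layers map directly onto declarative constraints; a hedged sketch with hypothetical Tour and TourOrder tables showing entity (primary key), domain (data type, nullability), referential (foreign key), and user-defined (check) integrity:

```sql
CREATE TABLE dbo.Tour (
    TourID         int         NOT NULL PRIMARY KEY, -- entity integrity
    TourName       varchar(50) NOT NULL,             -- domain integrity
    RequiresWaiver bit         NOT NULL DEFAULT 0
);

CREATE TABLE dbo.TourOrder (
    OrderID  int NOT NULL PRIMARY KEY,
    TourID   int NOT NULL
        REFERENCES dbo.Tour (TourID),                -- referential integrity
    Quantity int NOT NULL
        CONSTRAINT ckQuantity CHECK (Quantity > 0)   -- user-defined integrity
);
```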
Integrity is established in the design by ensuring the following:
■ A thorough and well-documented understanding of the organizational requirements
■ Normalization and correct handling of optional data
■ A well-defined abstraction layer with stored procedures and views
■ Data quality unit testing using a well-defined and understood set of test data
■ Metadata and data audit trails documenting the source and veracity of the data, including
updates
Performance/scalability
Presenting readily usable information is a key aspect of the Information Architecture Principle. Although
the database industry has achieved a high degree of performance, the ability to scale that performance
to very large databases with more connections is still an area of competition between database engine
vendors.
Performance is enabled in the database design and development by ensuring the following:
■ A well-designed schema with normalization and generalization, and correct handling of
optional data
■ Set-based queries implemented within a well-defined abstraction layer with stored procedures
and views
■ A sound indexing strategy that determines which queries should use bookmark lookups
and which queries would benefit most from clustered and non-clustered covering indexes to
eliminate bookmark lookups
■ Tight, fast transactions that reduce locking and blocking
■ Partitioning, which is useful for advanced scalability
Availability
The availability of information refers to the information’s accessibility when required regarding uptime,
locations, and the availability of the data for future analysis. Disaster recovery, redundancy, archiving,
and network delivery all affect availability.
Availability is strengthened by the following:
■ Quality, redundant hardware
■ SQL Server’s high-availability features
■ Proper DBA procedures regarding data backup and backup storage
■ Disaster recovery planning
Security
The sixth database objective based on the Information Architecture Principle is security. Any organizational asset must be secured to a degree appropriate to its value and sensitivity.
Security is enforced by the following:
■ Physical security and restricted access of the data center
■ Defensively coding against SQL injection
■ Appropriate operating system security
■ Reducing the surface area of SQL Server to only those services and features required
■ Identifying and documenting ownership of the data
■ Granting access according to the principle of least privilege
■ Cryptography — data encryption of live databases, backups, and data warehouses
■ Meta-data and data audit trails documenting the source and veracity of the data, including
updates