Troubleshooting
SQL 2005:
Opening the Database
Administrator’s
Toolbox
1-800-COURSES
www.globalknowledge.com
Expert Reference Series of White Papers
Introduction
There are many methods for troubleshooting databases. Many people jump straight to a Google search, or run
to find the senior DBA who, most likely, has seen this problem before. While these may provide some results,
there’s no better solution than tracking down empirical evidence as to the cause of a bottleneck or to the
identity of the troublesome transaction. In addition to getting real answers (as opposed to trying other peo-
ple’s solutions), you gain a much greater understanding of SQL’s inner workings and of how to optimize your
database
.
Upon first hearing the words “empirical evidence,” many of you may be immediately dissuaded. Fear not! The
tools are at hand, and they are relatively simple to use
. A familiarity with logs, monitors, and other built-in pro-
grams will ensure that you have a number of research choices when trouble arises.
Two Types of Troubleshooting
Depending upon your needs, you can choose to monitor your server in two ways.
Real Time: When you need information about what is happening at this very moment
Trend: Used when you need to see or record the performance and information over a period of time, or if you
can recreate the problem for recording purposes
Most people will jump to real time monitoring as their first option. It gives instant information, and, if you
know what you are looking for, it can provide fast verification to your analysis. Trend monitoring is often used
when trying to optimize a server or to deal with a problem of unknown origins. It can be used to cast a wide
net and then to review the situation in detail.
Real Time Monitoring
Raw SQL Counters
At any given time, amazing amounts of statistical and diagnostic information are being produced by the physi-
cal server and by SQL itself
.
While this is great if you know what to monitor
,
it is pretty overwhelming if you
don’t. To glimpse a small portion of this data, run the following statement in Management Studio:
SELECT * FROM sys.dm_os_performance_counters
Jeff Peters, Global Knowledge Instructor
Troubleshooting SQL 2005: Opening
the Database Administrator’s Toolbox
Copyright ©2007 Global Knowledge T
raining LLC. All rights reserved.
Page 2
Figure 1: 740 Rows of performance data returned from a simple query
This statement returns the current values of the performance counters for SQL. 740 rows of dynamic informa-
tion gathered from the exact time the query is run.
Everything from the number of locks currently in place to
the level of latency for replication can be viewed. The same information can be viewed in a graphical format
via System Monitor by adding the needed counters. It allows you to see the last few minutes of data as
opposed to a single instant in time and averages are calculated along with minimums and maximums
.
If you know where to look, these values can give hints as to the current health of your database server. Look
at the Buffer Cache Hit Ratio
.
T
his shows how often SQL finds needed information in the buffer cache as
opposed to making the comparatively long trip to the hard drive. With an on-line transactional processing
(OLTP) database, the values should be in the nineties on-line analytical processing (OLAP) databases normally
have lower percentages.) Values lower than this could verify a need for additional RAM. While you shouldn’t
use a single value as a reason to go shopping for hardware, it can be used to corroborate other factors.
Copyright ©2007 Global Knowledge T
raining LLC. All rights reserved.
Page 3
Mini Cheat Sheet for SQL 2005 Counter
s
**Note that most of these v
alues assume that the physical server’s sole responsibility is running SQL Server
SQL’s Built-In Monitoring Systems
In addition to gathering generic data for you to analyze, SQL also has a number of built-in tools for monitoring
specific situations
. Activity Monitor, Replication Monitor, and Job Activity Monitor are some of these utilities
and are all found within Management Studio. While very narrow in focus, they can allow you to verify or dis-
credit specific concerns quickly.
Activity Monitor will give you detailed information on what is happening right now in terms of locks and
macroscopic activity by user or process. Replication Monitor will give you the rundown on latency for each
individual subscription being replicated. The Job Activity Monitor will give not only the ability to start, stop,
and monitor all jobs from a particular server, but it will also provide historical information on a job-step by job-
step basis for each time that a job has run. It will be a rare occurrence when a DBA will not be able to deci-
sively narrow their focus when troubleshooting jobs using these tools. (Figure 2)
Trend Monitoring
While troubleshooting in real time is often the first response to an immediate need, the monitoring of trends is
more likely to provide results when faced with improving efficiency in the long run.
Baselines
Trend monitoring recognizes that a value gleaned from system monitor does not normally give a clear picture
by itself
.
T
he power of these v
alues increase dramatically when we have something with which to compare
them,
and that’
s where baselines come in.
Baselines are simply readings of values taken before a problem
exists; our “normal” values. Good troubleshooting starts when everything is working fine, and we start taking
regular readings of good conditions so that when bad conditions present themselves we are not looking at
performance data in a vacuum.
Copyright ©2007 Global Knowledge T
raining LLC. All rights reserved.
Page 4
Counter Explanation
Batch Request/Sec 1000 or more is a standard measure of a “busy” server. Consistently higher levels
c
an be sustained with good hardware, but may point to the need for an upgrade
of your processor. Use this instead of Transactions/Sec to get a clearer picture of
y
our server’s activity
Avg. Disk Write/Read Queue Length The value for both of these counters should stay lower than 2 per physical disk.
Higher values show a need for faster disks or moving to RAID (Redundant Array
of Independent Drives) or a SAN (Storage Area Network) if they are not already in
place
Locks: Average Wait This should normally be below 500 Time (ms) (milliseconds). Higher numbers may
reveal high levels of locking which demand research into offending transactions.
% Processor Time If this value’s AVERAGE is above 80 percent, a processor update will probably
remove this bottleneck. Be wary of the significance when watching this in real
time. Spikes going all the way up to 100% are normal and are not by themselves
a reason for alarm.
T
he tools we mentioned previously will not work well for us in preparing these baselines or, for that matter, in
taking readings over a period of time. They were all real-time monitors and cannot save their data in an easily
used format.
Figure 2: Launch Activity Monitor, Job Activity Monitor, and Replication Monitor from within Management Studio.
(right-click on Replication to access Replication Monitor)
Counter Logs
System monitor is normally a place to go for a quick glance at the immediate health of your server. After open-
ing the utility, move down a section in the left pane and you will find counter logs (Figure 3).
This portion of the utility will allow you to monitor the same counters we mentioned earlier, but it also gives
you the opportunity to record these counters for later use and comparison. By keeping this log, you are creat-
ing your baseline. Any time a major change occurs on your network (new hardware or software, additional
users), the baseline should be renewed. You can also view a timeline of the events you recorded to investigate
the timings of slow downs and other events.
Copyright ©2007 Global Knowledge T
raining LLC. All rights reserved.
Page 5