Contents
Overview
Analysis Server Cube Storage
The Storage Design Wizard
Analysis Server Aggregations
Lab A: Designing Storage for Sales
Usage-Based Optimization
Lab B: Implementing Usage-Based Optimization
Optimization Tuning
Review

Module 8: Managing Storage and Optimization

BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY

Information in this document is subject to change without notice. The names of companies,
products, people, characters, and/or data mentioned herein are fictitious and are in no way intended
to represent any real individual, company, product, or event, unless otherwise noted. Complying
with all applicable copyright laws is the responsibility of the user. No part of this document may
be reproduced or transmitted in any form or by any means, electronic or mechanical, for any
purpose, without the express written permission of Microsoft Corporation. If, however, your only
means of access is electronic, permission to print one copy is hereby granted.


Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Microsoft, the furnishing of this document does not give you any
license to these patents, trademarks, copyrights, or other intellectual property.

© 2000 Microsoft Corporation. All rights reserved.

Microsoft, BackOffice, MS-DOS, Windows, and Windows NT are either registered
trademarks or trademarks of Microsoft Corporation in the U.S.A. and/or other
countries.


Other product and company names mentioned herein may be the trademarks of their respective
owners.



Instructor Notes

This module provides students with a comprehensive understanding of
Microsoft® SQL Server Analysis Services storage options and optimization
techniques for online analytical processing (OLAP) cubes. The characteristics
of the three storage modes—multidimensional OLAP (MOLAP), relational
OLAP (ROLAP), and hybrid OLAP (HOLAP)—are reviewed in detail
followed by an overview of aggregations. The module then takes students
through the Storage Design Wizard with discussion of specific aggregation
options and further discussion of the contents of aggregations and design
guidelines. The module concludes with a review of usage-based optimization
and general optimization tuning techniques.
There are two labs in the module. In lab A, students create a storage design and
process a cube by using the Storage Design Wizard. In lab B, students learn the
interfaces and mechanics of usage-based optimization.
After completing this module, students will be able to:
• Explain the advantages and disadvantages of the three data storage modes.
• Use the Storage Design Wizard to set storage design.
• Describe how aggregations work and design aggregations for cubes.
• Describe the concepts and mechanics of usage-based optimization.
• Override aggregation settings per dimension.
Materials and Preparation
This section lists the required materials and preparation tasks that you need to
teach this module.
Required Materials
To teach this module, you need the following materials:
• Microsoft PowerPoint® file 2074A_08.ppt

Preparation Tasks
To prepare for this module, you should:
• Read all the student materials.
• Read all the instructor notes and margin notes.
• Practice the lecture presentation and demonstration.
• Complete the labs.
• Review the Trainer Preparation presentation for this module on the Trainer
  Materials compact disc.
• Review any relevant white papers that are located on the Trainer Materials
  compact disc.


Presentation: 70 Minutes
Labs: 20 Minutes

Demonstration: Designing Storage for the Sales Cube
In this demonstration, you will learn how to create a storage design by using the
Storage Design Wizard.
The following demonstration procedures contain information that does not fit in
the margin notes or is not appropriate for student notes.
To restore a new database and define a data source
1. In Analysis Manager, right-click the server, click Restore Database, click
the Look in list, find and click the file
C:\Moc\2074A\Labfiles\L08\Module 08.CAB, click Open, and then click
Restore.
2. Click Close, and then double-click Module 08 to expand the database.
3. Below Module 08, double-click Data Sources, right-click the Module 08
data source, and then click Edit.
4. Click the Connection tab of the Data Link Properties dialog box, and then
verify that localhost is selected in step 1.
5. In step 2, verify that Use Windows NT Integrated security is selected.
6. In step 3, verify that Module 08 is selected.
7. Click Test Connection and verify that the test succeeded. Click OK twice.

To specify storage type
1. In the Module 08 database, right-click the Sales cube, and then click Design
Storage.
2. Click Next to skip the welcome page.
3. In the Select the type of data storage step, click MOLAP, and then
click Next.

To design aggregations
1. In the Set aggregation options step, click Performance gain reaches from
the Aggregation options pane.
2. Type 20 in the percent box for Performance gain reaches to reflect a 20-
percent aggregation target.
For the Sales cube, the default value of 50 is unnecessarily high.
3. Click Start to initiate the graphical simulation of Performance vs. Size on
the Set aggregation options step, and then click Next.

Demonstration: 10 Minutes

To process the cube
1. In the Finish the Storage Design Wizard step, with the Process now
option selected, click Finish.
Regardless of the processing option you choose, Analysis Manager stores
the definition of the aggregations in the OLAP repository. Storing the
definition of the aggregations is different from physically creating them,
however. The Storage Design Wizard designs aggregations but does not
create them. The Analysis Server does not create aggregations until you
process the cube. Processing the cube automatically creates any
aggregations that have been designed.
2. Close the Process dialog box when processing is complete.

To examine the metadata
1. In Analysis Manager, click the Sales cube in the Analysis Manager tree
pane and then click Metadata in the right details pane.
2. Scroll down and notice the process and storage mode statistics.


Other Activities
Difficult Questions
Below are difficult questions that students may ask you during the delivery of
this module and answers to the questions. These materials delve into subjects
that are within the scope of the module but are not specifically addressed in the
content of the student notes.
1. If ROLAP is slow to process and query, why would an organization use this
option?
ROLAP would be adopted if the organization needs a real-time OLAP
solution—that is, data is always updated with the current fact table
values.
In this scenario, an organization defines its cube as ROLAP with zero
aggregations. All detail and aggregate data are calculated as users
query the cube. While queries are slow, in some situations perfectly
updated data is more important than fast query times.

2. Do Analysis Services MOLAP cubes have the “data explosion” problem
common to OLAP solutions?
MOLAP database engines in competing products often create cubes
that grow exponentially from source files to fully calculated cubes. For
example, a five-megabyte (MB) source file has been known to grow into
a five-gigabyte (GB) cube after processing.
The data explosion problem when using MOLAP in Analysis Services
does not exist to the extent experienced with other OLAP products. In
many cases, the MOLAP cube may be smaller than the data source.
The following are the principal reasons for the MOLAP storage
efficiency:
• Analysis Services MOLAP cubes are completely dense in their data
storage—that is, no null values are stored.
• The Analysis Services query engine is highly optimized, calculating
commonly accessed aggregations as the cube is queried so that fewer
aggregations need to be precalculated and stored.
• The Analysis Services data compression algorithms are highly
efficient.
Some multidimensional products presumably solve the data explosion
problem by not using an OLAP engine, instead accessing data directly
from a relational database.
Such products classify themselves as ROLAP solutions because they
access relational databases directly and give users a multidimensional
view of the data, but do not create cubes that consume large amounts of
storage. Such ROLAP solutions typically suffer in query performance
compared to MOLAP solutions.
3. MOLAP cubes duplicate the detail data already stored in relational tables.
How can MOLAP storage be more efficient than ROLAP cube storage?
How can MOLAP be faster to process than ROLAP if Analysis Services
brings all the detail data to the Analysis Server?
Multidimensional structures and data are extremely compressed and
optimized compared to the two-dimensional tables in relational
databases. In addition, when defining ROLAP cubes, indexes are
automatically created in the relational database management system
(RDBMS).
Even though MOLAP cubes carry over the detail data, they can still be
smaller than their ROLAP cube counterparts. The exception is a
ROLAP cube that has few or no aggregations.
From a processing standpoint, it may be faster to create a
multidimensional structure than to create, insert, and update data in
relational tables. It also may take a long time to build the indexes that
are automatically created in ROLAP cubes. Again, the exception is
when the ROLAP cube has few or no aggregations defined.
4. If processing time and disk space are not constraints, should aggregations be
set to 100 percent for Performance gain reaches?
If Performance gain reaches is set to 100 percent, not all of a cube’s
possible aggregations will necessarily be computed. The setting simply
targets a potential 100 percent improvement in query performance.
As more aggregations are defined for a cube, query performance improvements
reach a point of diminishing returns. Some cubes may even show slower
query performance if the aggregation percentage is set too high.
5. How does one estimate the size of a cube based on fact table size?
You can estimate the data storage for MOLAP data on disk in bytes,
assuming zero aggregations, by using the following formula:
(((2 * total number of levels) + (4 * number of measures)) * number of records) / 3
6. Analysis Services has intelligent algorithms for determining the most
optimized aggregation design. Why would you choose to override the
dimension level aggregations?
In most cases, you use the Storage Design Wizard and the Usage-Based
Optimization Wizard to define aggregations. However, there may be
exceptional situations in which you might want to exercise control,
overriding wizard algorithms.
For example, you might not want your cube to contain aggregations for
the lowest level of the Product dimension, because users will not be
accessing data at that level. Therefore, you have the ability to turn off
aggregations for this level.
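
The size estimate given in the answer to question 5 can be worked through in a few lines of code. The following is a minimal sketch only: the function name and the sample cube figures (12 levels, 5 measures, 1,000,000 fact records) are hypothetical and do not come from the course materials.

```python
def estimate_molap_bytes(total_levels: int, measures: int, records: int) -> float:
    """Estimate MOLAP detail storage in bytes, assuming zero aggregations.

    Implements (((2 * levels) + (4 * measures)) * records) / 3.
    """
    if min(total_levels, measures, records) < 0:
        raise ValueError("inputs must be non-negative")
    return ((2 * total_levels) + (4 * measures)) * records / 3


# Hypothetical cube: 12 levels across all dimensions, 5 measures,
# and 1,000,000 fact table records.
size_bytes = estimate_molap_bytes(total_levels=12, measures=5, records=1_000_000)
print(f"{size_bytes / 1024 ** 2:.1f} MB")
```

For these sample figures the estimate comes to roughly 14 MB. The result is only a rough planning number for detail data, because any aggregations that are designed add to it.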


Module Strategy
Use the following strategy to present this module:
• Analysis Server Cube Storage
Deliver an overview discussion of general server storage issues and then
talk through the characteristics of each of the three storage modes—
MOLAP, ROLAP, and HOLAP. This discussion leads into a basic
introduction to the concept of aggregations.
• The Storage Design Wizard
The materials in this section can be delivered as lecture by using the slides
or integrated with the demonstration Designing Storage for the Sales Cube.
Because the demonstration essentially duplicates the materials in the student
notes, it is recommended that you integrate lecture and demonstration.
The following table maps lecture topics to demonstration procedures.

Lecture Topic                Demonstration Procedure
Choose Storage Option        To specify storage type
Set Aggregation Options      To design aggregations
How Much Aggregation?        To design aggregations
Estimated Storage Reaches    To design aggregations
Performance Gain Reaches     To design aggregations
Until I Click Stop           To design aggregations
Finishing Up                 To examine the metadata and browse the cube

There will be substantial interest and questions from students about the
storage size and performance implications of choosing from each of the
three storage methods. Do not rush through these materials or limit
discussion and questions. Review the questions and answers in the previous
Difficult Questions section.
Be prepared to discuss elements of aggregation again, including the specific
functioning of the three aggregation options—Estimated storage reaches,
Performance gain reaches, and Until I click stop. These three choices
represent different conceptual approaches and specific underlying
algorithms for implementing aggregations.
• Analysis Server Aggregations
The subject of aggregation is explored in more detail, including review of
aggregation tables, general characteristics of aggregations, and details about
ROLAP aggregations. Because students must thoroughly understand the
concepts and wizard implementation of aggregations, the subject is
approached repeatedly in this module at increasing levels of detail and
sophistication.
Lab A follows the aggregation details section. Students now have an
opportunity to create storage designs of their own based on MOLAP and
ROLAP storage modes. The lab essentially replicates the steps performed in
the demonstration.

• Usage-Based Optimization
This section presents an important feature set of Analysis Services—usage-
based optimization. Your lecture, following the materials in the student
notes, takes students through the simple Usage-Based Optimization Wizard.
Be prepared to answer detailed questions about how each of the five query
options works by itself and in conjunction with the others.
The section is followed by lab B, Implementing Usage-Based Optimization,
which can be conducted as a hands-on exercise with students following your
demonstration. The lab allows students to perform their own usage-based
optimizations.
• Optimization Tuning
You complete the module with a discussion of specific optimization tuning
methods, including how to override dimension and level settings. No
exercises or labs are included in this section. However, you should switch to
Analysis Manager to show the settings and to briefly explain their functions.



Overview
• Analysis Server Cube Storage
• The Storage Design Wizard
• Analysis Server Aggregations
• Usage-Based Optimization
• Optimization Tuning


Query performance—how long it takes a user to access requested
information—is a primary determining factor for online analytical processing
(OLAP) cube storage design. An optimal design produces fast queries for users
while maintaining reasonable cube processing times. Designing the storage
mode and aggregations for a cube is one of the most crucial steps in cube
development.
After completing this module, you will be able to:
• Explain the advantages and disadvantages of the three data storage modes.
• Use the Storage Design Wizard to set storage design.
• Describe how aggregations work and design aggregations for cubes.
• Describe the concepts and mechanics of usage-based optimization.
• Override aggregation settings per dimension.

Topic Objective: To provide an overview of the module topics and objectives.
Lead-in: In this module, you will learn about aggregation design and storage
modes, which are the key factors in enabling fast query response times.

Analysis Server Cube Storage


Microsoft® SQL Server 2000 Analysis Services supports three storage options:
• Multidimensional OLAP (MOLAP)
• Relational OLAP (ROLAP)
• Hybrid OLAP (HOLAP)
Cube developers design the storage for cubes. The storage design of a cube is
transparent to clients—users do not realize that different cubes have different
storage designs.
The following list contains descriptions of some key characteristics of cube
storage modes:
• The storage mode is transparent to clients. Users and client applications see
  only cubes. For users, the only indication of the storage mode is the
  query performance that they observe.
• The storage mode can be changed after the initial storage decision is made.
  Even after you specify storage and put a cube into production, you can
  change to a different storage type. After you change the mode, you must
  reprocess the cube so that Analysis Server reloads the data and creates
  new aggregations.
• Each partition of a cube can have a different storage mode. A cube can
  consist of multiple partitions. One cube might have both a MOLAP partition
  and a ROLAP partition.

Note: For more information about partitions, see module 10, “Managing
Partitions,” in course 2074A, Designing and Implementing OLAP Solutions
with Microsoft SQL Server 2000.

Topic Objective: To introduce Analysis Services storage options.
Lead-in: Analysis Services supports three storage options: MOLAP, ROLAP, and
HOLAP.
Delivery Tips: Avoid going into too much detail on the three storage modes
here. Wait until the following slides.
Point out that in the preceding illustration, Aggs stands for aggregations.

• Analysis Server does not allocate storage for missing values. For example, if
  no bikinis are sold in Antarctica, no space is allocated to that missing value.
  Because missing values take up no storage, cubes are 100 percent dense;
  that is, all storage is efficiently used. This characteristic of Analysis Server
  helps avoid the data explosion problems of other OLAP products.
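
The effect of not allocating storage for missing values can be illustrated with a small sparse structure. This is a sketch of the general idea only, not the actual Analysis Server storage format; the `SparseCube` class, its methods, and the sample products and regions are hypothetical.

```python
from typing import Dict, Optional, Tuple

Cell = Tuple[str, str]  # (product, region) coordinates; illustrative only


class SparseCube:
    """Hypothetical cell store in which only existing values take up space."""

    def __init__(self) -> None:
        self._cells: Dict[Cell, float] = {}

    def add(self, product: str, region: str, amount: float) -> None:
        key = (product, region)
        self._cells[key] = self._cells.get(key, 0.0) + amount

    def get(self, product: str, region: str) -> Optional[float]:
        # A combination that never occurred returns None; no slot was reserved.
        return self._cells.get((product, region))

    def stored_cells(self) -> int:
        return len(self._cells)


cube = SparseCube()
cube.add("Bikini", "Florida", 120.0)
cube.add("Parka", "Antarctica", 45.0)

possible_cells = 2 * 2  # 2 products x 2 regions
print(cube.stored_cells(), "of", possible_cells, "possible cells stored")
print(cube.get("Bikini", "Antarctica"))
```

Only two of the four possible product and region combinations are stored, and the combination with no sales occupies no space at all.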

MOLAP Storage Mode
• Details and Aggregations Stored in Multidimensional Format
• Fastest Storage Option for Queries
• Often the Most Efficient in Terms of Disk Storage, Due to Compression


The following list contains characteristics of MOLAP storage:
• Detailed data and aggregations are stored in a multidimensional format on
  the Analysis Server.
  • Because the detail data from the fact table is brought into Analysis
    Server for storage, data is duplicated.
  • The level of detail imported into a cube is based on the grain of the
    cube’s dimensions. For example, if the fact table contains daily data
    records, but the grain of the cube time dimension is month, the cube will
    contain data at a month level. The fact table daily records are combined
    at cube processing time.
  • After a MOLAP cube is processed, all data necessary for querying is
    located on the Analysis Server. The source relational database
    management system (RDBMS) is not accessed other than at processing
    time.
• MOLAP cubes have the fastest query performance for users.
• MOLAP is a very economical mode in terms of disk storage, due to efficient
  data compression algorithms.
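
The grain example in the preceding list, in which daily fact table records are combined into month-level cube data at processing time, can be sketched as follows. The row layout, the function name, and the sample rows are assumptions made for illustration; they do not describe how Analysis Server processes a cube internally.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple

# Hypothetical fact row layout: (sale date, product, sales amount).
FactRow = Tuple[date, str, float]


def load_at_month_grain(rows: Iterable[FactRow]) -> Dict[Tuple[int, int, str], float]:
    """Combine daily fact rows into (year, month, product) cells."""
    cells: Dict[Tuple[int, int, str], float] = defaultdict(float)
    for day, product, amount in rows:
        cells[(day.year, day.month, product)] += amount
    return dict(cells)


daily_rows = [
    (date(2000, 1, 3), "Bike", 100.0),
    (date(2000, 1, 17), "Bike", 50.0),
    (date(2000, 2, 1), "Bike", 25.0),
]
monthly_cells = load_at_month_grain(daily_rows)
print(monthly_cells)
```

Three daily rows become two month-level cells, so the cube holds less detail than the fact table whenever the cube grain is coarser than the fact table grain.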
Topic Objective: To describe the characteristics of MOLAP storage.
Lead-in: In MOLAP cubes, detailed data and aggregations are stored in a
multidimensional format on the Analysis Server.
Delivery Tips: Tell students that you design cubes as MOLAP the vast majority
of the time because of the fast query times, processing times, and efficient
storage of MOLAP cubes.
Point out that Aggs stands for aggregations in the preceding illustration.



ROLAP Storage Mode
• Details and Aggregations Stored in RDBMS
• Slowest Query Performance
• Most Often the Slowest to Process
• Analysis Server Can Create Indexed Views
• Useful for Large Data Sources
• Provides Real-Time OLAP Solution


The following are characteristics of ROLAP storage:
• Detailed data and aggregations are stored in relational tables in the source
  database.
  • RDBMS indexes are automatically created in the data source to improve
    cube performance.
  • All queries, other than those satisfied by the client or server caches, must
    access the source RDBMS tables.
• Under most circumstances, ROLAP cubes are much slower in query
  performance than MOLAP cube equivalents.
• ROLAP cubes are usually the slowest to process, unless the ROLAP cubes
  contain few aggregations.
• When assigning the ROLAP storage mode to cubes that have data sources
  defined in SQL Server 2000 databases, the Analysis Server attempts to
  create indexed views instead of tables, assuming certain criteria are met in
  the data source.
• You use the ROLAP storage mode when the data source is too large to be
  stored and processed effectively in MOLAP or HOLAP.
• You use the ROLAP storage mode when you require a real-time OLAP
  solution.
Topic Objective: To describe the characteristics of ROLAP storage.
Lead-in: The following are characteristics of ROLAP storage.
Key Point: Aggregation tables must be stored in the same RDBMS as the data
source of a cube.
Delivery Tip: Point out that Aggs stands for aggregations in the preceding
illustration.

Creating Real-Time Cubes
Some cubes require immediate refreshing of data when changes occur in the
data source. With standard cubes, however, you must reprocess the cube
whenever data changes in the underlying database. To overcome this delay in
data updates, you can create real-time OLAP cubes in Analysis Services.
You create a real-time cube by performing the following steps:
• Define the cube by using the ROLAP storage mode.
• Select the Enable real-time updates check box on the Select the type of
  data storage page of the Storage Design Wizard.
Real-Time Cube Behavior
The following behavior occurs in real-time cubes:
• The Analysis Server polls the database to determine if changes have been
  made to the data source.
• The Analysis Server flushes the server cache after it detects any database
  changes to ensure that clients do not query outdated data.
• Cube data automatically refreshes when fact table data changes.
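
The poll, flush, and refresh behavior listed above can be sketched with a small cache. This is an illustration under stated assumptions: the course materials do not describe how Analysis Server detects changes, so the version counter used as a change marker, the `RealTimeCache` class, and the in-memory stand-in for the fact table are all hypothetical.

```python
from typing import Callable, Dict, Hashable


class RealTimeCache:
    """Hypothetical server-side cache that is flushed when the source changes."""

    def __init__(self, read_source_version: Callable[[], int],
                 run_query: Callable[[Hashable], float]) -> None:
        self._read_source_version = read_source_version  # assumed change marker
        self._run_query = run_query                       # assumed fact table query
        self._seen_version = read_source_version()
        self._cache: Dict[Hashable, float] = {}

    def poll(self) -> bool:
        """Check the source for changes and flush the cache if any are found."""
        current = self._read_source_version()
        if current != self._seen_version:
            self._cache.clear()
            self._seen_version = current
            return True
        return False

    def query(self, key: Hashable) -> float:
        if key not in self._cache:
            self._cache[key] = self._run_query(key)
        return self._cache[key]


# Tiny in-memory stand-in for a fact table and its change marker.
fact_table = {"version": 1, "total_sales": 100.0}
cache = RealTimeCache(lambda: fact_table["version"],
                      lambda key: fact_table[key])

first = cache.query("total_sales")    # read from the source, then cached
fact_table.update(version=2, total_sales=130.0)
stale = cache.query("total_sales")    # still the cached value until the next poll
flushed = cache.poll()                # change detected, cache flushed
fresh = cache.query("total_sales")    # read again from the source
print(first, stale, flushed, fresh)
```

The sketch shows why the cache must be flushed: between the data change and the next poll, a cached answer can be out of date.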
Real-Time Criteria
ROLAP cubes must meet certain criteria before they can behave as real-time
cubes:
• Cubes must contain zero aggregations or must store aggregations in SQL
  Server 2000 indexed views.
• Cube partitions cannot be defined as real-time partitions if they are remote
  partitions.


HOLAP Storage Mode
• Details Maintained in RDBMS
• Aggregations Created in Multidimensional Format
• Good Option where Disk Consumption Is a Concern
• Good Compromise if Details Are Accessed Infrequently


The following are characteristics of HOLAP storage:
• Detailed data is maintained in the RDBMS.
• Aggregations are created in the multidimensional cube format and are stored
  on the Analysis Server.
• Because detailed data is not duplicated, HOLAP is a reasonable storage
  compromise where disk consumption is a concern.
• In a situation when users do not frequently access the details stored in the
  RDBMS and the cube contains a high degree of aggregation, HOLAP is a
  good option for cube storage.
Most cubes use MOLAP as the cube storage mode. However, you can define a
cube with a HOLAP design to use less cube storage than if the cube used the
MOLAP storage design. The following are effects of using the HOLAP design
in cubes:
• Queries are not as slow as in a ROLAP cube, nor as fast as in a MOLAP
  cube.
• Processing time for a HOLAP cube is similar to processing time for a
  MOLAP cube.
  • The same amount of data is read from disk into memory for both
    HOLAP and MOLAP cube types.
  • The only processing difference between MOLAP and HOLAP cubes is
    the writing of detail data to the Analysis Server for MOLAP cubes. This
    process does not add significant processing time because the data has
    already been read into memory.
Topic Objective: To describe the characteristics of HOLAP storage.
Lead-in: The following are characteristics of HOLAP storage.
Delivery Tip: Point out that Aggs stands for aggregations in the preceding
illustration.


Cube Aggregations
• Full Aggregation Not Necessary
• Effects on Cube Size and Processing Time
  • Cube size and processing times increase as aggregations are added to a cube
• Tools for Implementing Aggregations
  • Storage Design Wizard
  • Usage-Based Optimization Wizard
  • Dimension and level aggregation properties


Aggregations are precalculated summaries of detailed data that enable Analysis
Server to answer queries quickly. Cubes contain aggregations designed with the
Storage Design Wizard or with the Usage-Based Optimization Wizard.

Precalculated aggregations are fundamental to OLAP cubes, making user
queries significantly faster than calculating aggregations at query time.
Accessing aggregations is transparent to users and client applications. The
Analysis Server accesses aggregations automatically.

Note: A cube can contain multiple partitions. Like a cube’s storage mode,
aggregations can be designed on a partition-by-partition basis. For more
information about partitions, see module 10, “Managing Partitions,” in course
2074A, Designing and Implementing OLAP Solutions with Microsoft SQL
Server 2000.

Full Aggregation Not Necessary
It is not necessary to fully aggregate a cube in Analysis Services. Analysis
Server utilizes a variety of algorithms to optimize data access, thereby
eliminating the need for total cube aggregation.
If an aggregation does not exist to satisfy a query containing summarized data,
Analysis Server does not need to query the lowest level of data. Instead, the
server uses an intermediate aggregation, if one exists, to satisfy the query.
Topic Objective: To explain aggregation topics.
Lead-in: Aggregations are precalculated summaries of detailed data that enable
Analysis Server to answer queries quickly.
Delivery Tip: Use the preceding illustration to introduce aggregations,
stepping through the bullets on the slide. Point out that there are more
sections covering aggregations later in the module.

Here is an example of how Analysis Server uses an intermediate aggregation to
satisfy a query:
• Assume a hierarchy for the Time dimension consisting of the levels Year,
  Quarter, and Month. Aggregations exist for Quarter but not for Year.
• When a user queries the Year level, the server does not access Month level
  data to calculate the yearly totals. Instead, the server derives the totals from
  the quarter totals that are already aggregated.
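
The Year-from-Quarter example above can be sketched in code. This is a minimal illustration that assumes an additive measure (a sum); the quarter totals and the function name are hypothetical sample values, not course data.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Hypothetical stored Quarter-level aggregation: (year, quarter) -> sales total.
quarter_totals: Dict[Tuple[int, int], float] = {
    (1999, 1): 10.0, (1999, 2): 20.0, (1999, 3): 30.0, (1999, 4): 40.0,
    (2000, 1): 15.0, (2000, 2): 25.0, (2000, 3): 35.0, (2000, 4): 45.0,
}


def year_totals_from_quarters(quarters: Dict[Tuple[int, int], float]) -> Dict[int, float]:
    """Answer a Year-level query from the Quarter aggregation, without Month data."""
    totals: Dict[int, float] = defaultdict(float)
    for (year, _quarter), amount in quarters.items():
        totals[year] += amount
    return dict(totals)


print(year_totals_from_quarters(quarter_totals))
```

Each yearly total is derived from four quarter values instead of twelve month values, which is the saving that an intermediate aggregation provides.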
Effects on Cube Size and Processing Time
Cube size increases as aggregations are added to a cube. In addition, processing
times increase because pre-aggregations are calculated at process time.

Note: Long processing times tend to be more detrimental to an OLAP
application’s success than large cube sizes. Disk space is inexpensive compared
to time lost waiting for a cube to process.

The size of a cube depends on several factors:
• The number of aggregations
• The number of dimensions
• The number of levels
• The number of measures
• The number of members
• The data distribution of the cube
When designing aggregations, the goal is to maximize query performance while
maintaining reasonable cube sizes and processing times.
Tools for Implementing Aggregations
In the following sections, you will learn about three tools available to you for
implementing aggregations:
• Storage Design Wizard
• Usage-Based Optimization Wizard
• Dimension and Level aggregation properties



The Storage Design Wizard
• Choosing a Storage Mode
• Setting Aggregation Options
• Determining the Level of Aggregation
• Finishing Up


The Storage Design Wizard is the interface that lets you specify the storage
mode and aggregation design. The next section introduces the steps involved in
using the Storage Design Wizard to design storage modes and aggregations:
• Choosing a storage mode
• Setting aggregation options
• Determining the level of aggregation
• Finishing up

There are two entry points into the Storage Design Wizard:
• After building or modifying a cube, you are prompted to set storage options.
  You start the wizard by clicking Yes.
• Right-click a cube or a partition in a cube, and then click Design Storage.

The user interface of the Storage Design Wizard differs depending on:
• Whether storage has been designed previously for the cube.
• Whether the cube contains partitions.


Topic Objective: To introduce the steps involved in using the Storage Design
Wizard.
Lead-in: The Storage Design Wizard is the interface that lets you specify the
storage mode and aggregation design.
Delivery Tip: Integrate this content into the demonstration Designing Storage
for the Sales Cube as an effective method of presenting the material.


Choosing a Storage Mode



At the first step of the wizard, you must specify the storage type: MOLAP,
ROLAP, or HOLAP. The selected storage mode determines query performance,
processing performance, and cube storage. Therefore, you must determine the
appropriate storage mode before starting the Storage Design Wizard.
Topic Objective: To review storage options.
Lead-in: At the first real step of the wizard, you must specify the storage
type.


Setting Aggregation Options


The next step of the Storage Design Wizard allows you to design aggregations
for the cube or partition. Three design options are available as radio button
choices. The design options are as follows:
• Estimated storage reaches
• Performance gain reaches
• Until I click Stop


The following characteristics apply to all three options:
• Regardless of aggregation option, the dimensional hierarchies are navigated
  and aggregations are built where the greatest performance benefits will be
  realized based on the number of members at a given level.
• Aggregations are calculated when the cube is processed or refreshed and
  when incremental data is added to the cube, not during the aggregation
  design.
• Aggregation design is per partition. A single cube can consist of multiple
  partitions, and each partition can have a different aggregation design.

Note: When you design aggregations in the Storage Design Wizard, the
aggregations are only designed, not calculated. The aggregation design is
stored in the metadata repository. The actual aggregations are built only when
the cube is processed.


Topic Objective
To show the aggregation design step of the Storage Design Wizard.
Lead-in
The next step of the Storage Design Wizard allows you to design aggregations
for the cube or partition.
Key Points
The Performance gain reaches option is the most helpful method for designing
aggregations.

No one specific percentage works best for all cubes. Every cube is different
and requires testing to find the optimal aggregation percentage. The best
method of determining the optimal percentage is to test various settings for
their effects on the query and cube processing times.

Estimated Storage Reaches
The Estimated storage reaches option limits the aggregation design by disk
space usage, specified in megabytes (MB). With this option, aggregations are
designed until they consume the specified amount of disk space or until 100
percent aggregation is attained.
Performance Gain Reaches
The entry in the Performance gain reaches box represents the percentage
improvement between the maximum and minimum query times as represented
by the formula:
PercentGain = 100 * (QTimeMAX - QTimeTARGET) / (QTimeMAX - QTimeMIN)

For example, if a nonoptimized query takes 22 seconds to execute
(QTimeMAX), and the best possible query performance with maximum
aggregations is 2 seconds (QTimeMIN), you would specify a 75 percent desired
performance gain to achieve a query time of 7 seconds (QTimeTARGET).
Use this setting as a relative rather than an absolute gauge. A recommended
practice is to start with a low value, such as 10 percent. If queries to the
cube are too slow, increase the setting to a higher level, such as 20 percent.
If queries are still too slow, repeat the process.
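
The formula and the 22-second example can be checked with a short Python
sketch. The function names below are illustrative only and are not part of
Analysis Server or the Storage Design Wizard; the second function is simply
the same formula solved for QTimeTARGET.

```python
def percent_gain(q_time_max, q_time_target, q_time_min):
    """Performance gain (percent) implied by a target query time."""
    return 100 * (q_time_max - q_time_target) / (q_time_max - q_time_min)


def target_query_time(q_time_max, q_time_min, gain_percent):
    """Query time implied by a given performance gain (formula solved for QTimeTARGET)."""
    return q_time_max - (gain_percent / 100) * (q_time_max - q_time_min)


# Values from the example: QTimeMAX = 22 s, QTimeMIN = 2 s.
print(percent_gain(22, 7, 2))        # 75.0
print(target_query_time(22, 2, 75))  # 7.0
```

Because QTimeMAX and QTimeMIN are rarely known precisely, the result is best
read as a relative gauge, in line with the guidance above.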

Note: The more aggregations a cube contains, the greater the amount of time
required to process the cube. Excessive aggregation of a large cube can lead to
unacceptable processing times.

As you increase the performance percentage, you quickly reach a level of
diminishing returns, consuming large amounts of disk storage in exchange for
nominal query performance benefit. For example, the number of aggregations
required to reach 10 percent optimization is smaller, sometimes by an order of
magnitude, than the number of aggregations required to improve from 10
percent to 20 percent.
In general, you gain little by setting the Performance gain reaches option
higher than 50 percent. Ten percent to 30 percent is adequate for most
applications.
The following is recommended practice for implementing Performance gain
reaches:
1. Enter a low percentage, record the processing time, and then gauge retrieval
performance through testing.
2. Enter a higher percentage, again recording processing time and testing
performance.
3. Repeat this procedure until an optimum balance between processing time
and retrieval performance is achieved for your business need.
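
The procedure above can be sketched as a simple loop in Python. This is only
an illustration under stated assumptions: `measure` is a hypothetical callback
that stands in for redesigning the aggregations at a given percentage,
processing the cube, and timing a set of test queries by hand. The timing
values in the usage lines are placeholders, not measurements from any real
cube.

```python
def pick_gain_setting(candidates, measure, max_processing_s, max_query_s):
    """Try settings from low to high and return the first acceptable one.

    measure(percent) must return (processing_seconds, query_seconds).
    Returns None if no candidate meets the query goal within the
    processing-time budget.
    """
    for percent in sorted(candidates):
        processing_s, query_s = measure(percent)
        if processing_s > max_processing_s:
            # More aggregations mean longer processing, so stop here.
            return None
        if query_s <= max_query_s:
            return percent
    return None


# Placeholder timings (assumed for illustration): percent -> (processing s, query s)
placeholder_timings = {10: (60, 12.0), 20: (150, 8.0), 30: (400, 6.5)}
measure = lambda percent: placeholder_timings[percent]

print(pick_gain_setting([10, 20, 30], measure, 300, 9.0))  # 20
print(pick_gain_setting([10, 20, 30], measure, 300, 6.0))  # None
```

Starting low and stopping at the first acceptable setting keeps processing
time as short as the business need allows.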


Until I Click Stop
With the Until I click Stop option, the system designs aggregations until one of
the following two situations occurs:
• You click Stop as you watch a simulation of your aggregation design take
  place in the Performance vs. Size box.
• A 100-percent performance gain is reached before you click Stop.
You should click Stop when the curve in the Performance vs. Size box starts to
level off at an acceptable performance gain.
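
The idea of a curve that "starts to level off" can be made concrete with a
small Python sketch. It assumes you have noted a few (size in MB, performance
gain in percent) points from the Performance vs. Size box yourself; the wizard
is not known to export them, and both the sample points and the threshold
below are hypothetical values chosen for illustration.

```python
def leveling_point(points, min_gain_per_mb):
    """Return the first point after which the curve flattens.

    points: non-empty list of (size_mb, gain_percent) with strictly
    increasing sizes. The curve is treated as flat once the gain per
    additional MB drops below min_gain_per_mb.
    """
    for (size_a, gain_a), (size_b, gain_b) in zip(points, points[1:]):
        slope = (gain_b - gain_a) / (size_b - size_a)
        if slope < min_gain_per_mb:
            return (size_a, gain_a)
    return points[-1]  # the curve never flattened within the observed range


# Hypothetical observations: (estimated size in MB, performance gain in percent)
observed = [(0, 0), (5, 20), (10, 32), (20, 38), (40, 40)]
print(leveling_point(observed, 1.0))  # (10, 32)
```

Whether the gain at that point is acceptable still has to be judged against
actual query and processing times.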

Note: The Until I click Stop and Estimated storage reaches options are not
very helpful and can be misleading when designing aggregations. Use the
Performance gain reaches option for the best results.





Determining the Level of Aggregation


Because Analysis Server uses aggregations intelligently, it is unnecessary to
create aggregations for all possible combinations of data. A common mistake is
over-aggregation.
The following are two important guidelines to follow when designing
aggregations:
• Due to the speed of MOLAP data retrieval, MOLAP generally requires less
  aggregation.
• Due to the relative slowness of ROLAP retrieval, ROLAP generally requires
  more aggregation.
The key to defining aggregations is identifying when you have reached a point
of diminishing returns. Testing is necessary to find that point for each cube.
Topic Objective
To explain how much aggregation is necessary.
Lead-in
Here are some guidelines to help you determine how much aggregation is
generally necessary.
