SAS Data Integration Studio 3.3

Glossary
administrator
the person who is responsible for maintaining the technical attributes of an object
such as a table or a library. For example, an administrator might specify where a
table is stored and who can access the table. See also owner.
alternate key
another term for unique key. See unique key.
analysis data set
in SAS data quality, a SAS output data set that provides information on the degree of
divergence in specified character values.
business key
one or more columns in a dimension table that comprise the primary key in a source
table in an operational system.
change analysis
the process of comparing one set of metadata to another set of metadata and
identifying the differences between the two sets of metadata. For example, in SAS
Data Integration Studio, you have the option of performing change analysis on
imported metadata. Imported metadata is compared to existing metadata. You can
view any changes in the Differences window and choose which changes to apply. To
help you understand the impact of a given change, you can run impact analysis or
reverse impact analysis on tables and columns in the Differences window.
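As a rough illustration of the comparison step, the sketch below diffs two sets of column metadata. The dictionary layout and the function name compare_metadata are hypothetical and are not part of SAS Data Integration Studio.

```python
def compare_metadata(existing, imported):
    """Return the columns that were added, removed, or changed."""
    added = {n: imported[n] for n in imported.keys() - existing.keys()}
    removed = {n: existing[n] for n in existing.keys() - imported.keys()}
    changed = {
        n: (existing[n], imported[n])
        for n in existing.keys() & imported.keys()
        if existing[n] != imported[n]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical column metadata: column name -> (type, length)
existing = {"CustomerID": ("num", 8), "Name": ("char", 40), "Region": ("char", 10)}
imported = {"CustomerID": ("num", 8), "Name": ("char", 60), "Email": ("char", 80)}

differences = compare_metadata(existing, imported)
```

Each of the three groups in differences corresponds to a kind of change that a user could review and then choose to apply or ignore.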
change management
in the SAS Open Metadata Architecture, a facility for metadata source control,
metadata promotion, and metadata replication.
change-managed repository
in the SAS Open Metadata Architecture, a metadata repository that is under
metadata source control.
cluster
in SAS data quality, a set of character values that have the same match code.
comparison result
the output of change analysis. For example, in SAS Data Integration Studio, the
metadata for a comparison result can be selected, and the results of that comparison
can be viewed in a Differences window and applied to a metadata repository. See also
change analysis.
cross-reference table
a table that contains only the current rows of a larger dimension table. Columns
generally include all business key columns and a digest column. The business key
column is used to determine if source rows are new dimensions or updates to existing
dimensions. The digest column is used to detect changes in source rows that might
update an existing dimension. During updates of the fact table that is associated
with the dimension table, the cross-reference table can provide generated keys that
replace the business key in new fact table rows.
custom repository
in the SAS Open Metadata Architecture, a metadata repository that must be
dependent on a foundation repository or custom repository, thus allowing access to
metadata definitions in the repository or repositories on which it depends. A custom
repository is used to specify resources that are unique to a particular data collection.
For example, a custom repository could define sources and targets that are unique to
a particular data warehouse. The custom repository would access user definitions,
group definitions, and most server metadata from the foundation repository. See also
foundation repository, project repository.
data analysis
in SAS data quality, the process of evaluating input data sets in order to determine
whether data cleansing is needed.
data cleansing
the process of eliminating inaccuracies, irregularities, and discrepancies from
character data.
data lineage
a search that seeks to identify the tables, columns, and transformations that have an
impact on a selected table or column. See also impact analysis, reverse impact
analysis, transformation.
data transformation
in SAS data quality, a cleansing process that applies a scheme to a specified
character variable. The scheme creates match codes internally to create clusters. All
values in each cluster are then transformed to the standardization value that is
specified in the scheme for each cluster.
database library
a collection of one or more database management system files that are recognized by
SAS and that are referenced and stored as a unit. Each file is a member of the library.
database server
a server that provides relational database services to a client. Oracle, DB2, and
Teradata are examples of relational databases.
delimiter
a character that separates words or phrases in a text string.
derived mapping
a mapping between a source column and a target column in which the value of the
target column is a function of the value of the source column. For example, if two
tables contain a Price column, the value of the target table’s Price column might be
equal to the value of the source table’s Price column multiplied by 0.8.
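The Price example can be sketched as follows. The row layout and the function name derive_price are hypothetical; only the 0.8 multiplier comes from the definition.

```python
def derive_price(source_price):
    # Derived mapping: the target value is a function of the source value.
    return round(source_price * 0.8, 2)


source_rows = [
    {"Product": "A", "Price": 10.0},
    {"Product": "B", "Price": 25.5},
]

# Product is copied unchanged; Price is derived.
target_rows = [
    {"Product": row["Product"], "Price": derive_price(row["Price"])}
    for row in source_rows
]
```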
digest column
a column in a cross-reference table that contains a concatenation of encrypted values
for specified columns in a target table. If a source row has a digest value that differs
from the digest value for that dimension, then changes are detected and the source
row becomes the new current row in the target. The old target row is closed out and
receives a new value in the end date/time column.
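A minimal sketch of digest-based change detection, under stated assumptions: the digest is a SHA-256 hash of the concatenated column values (the actual encoding used by the product is not specified here), and the column names, the begin and end fields, and the function apply_source_row are all hypothetical.

```python
import hashlib
from datetime import datetime

TRACKED = ["name", "city"]  # hypothetical columns covered by the digest


def digest(row):
    # Assumption: hash the concatenated values of the tracked columns.
    joined = "|".join(str(row[c]) for c in TRACKED)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def apply_source_row(target, xref, source_row, now):
    """Load one source row; return 'new', 'changed', or 'unchanged'."""
    key = source_row["customer_id"]
    new_digest = digest(source_row)
    if key in xref and xref[key] == new_digest:
        return "unchanged"
    status = "changed" if key in xref else "new"
    for row in target:
        if row["customer_id"] == key and row["end"] is None:
            row["end"] = now  # close out the old current row
    target.append({**source_row, "begin": now, "end": None})
    xref[key] = new_digest  # cross-reference table: business key -> digest
    return status


target, xref = [], {}
t1, t2 = datetime(2024, 1, 1), datetime(2024, 6, 1)
first = apply_source_row(target, xref, {"customer_id": 101, "name": "Ann", "city": "Oslo"}, t1)
same = apply_source_row(target, xref, {"customer_id": 101, "name": "Ann", "city": "Oslo"}, t2)
moved = apply_source_row(target, xref, {"customer_id": 101, "name": "Ann", "city": "Bergen"}, t2)
```

Comparing one digest per row avoids comparing every tracked column individually, which is the reason the cross-reference table stores it.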
dimension
a category of contextual data or detail data that is implemented in a data model such
as a star schema. For example, in a star schema, a dimension named Customers
might associate customer data with transaction identifiers and transaction amounts
in a fact table.
dimension table
in a star schema or snowflake schema, a table that contains data about a particular
dimension. A primary key connects a dimension table to a related fact table. For
example, if a dimension table named Customers has a primary key column named
Customer ID, then a fact table named Customer Sales might specify the Customer ID
column as a foreign key.
fact table
the central table in a star schema or snowflake schema. A fact table typically
contains numerical measurements or amounts and is supplemented by contextual
information in dimension tables. For example, a fact table might include transaction
identifiers and transaction amounts. Dimension tables could add contextual
information about customers, products, and salespersons. Fact tables are associated
with dimension tables via key columns. Foreign key columns in the fact table contain
the same values as the primary key columns in the dimension tables.
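The relationship between a fact table and a dimension table can be sketched with an in-memory SQLite database. The table and column names loosely follow the Customers and Customer Sales example; the data values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (              -- dimension table
        customer_id INTEGER PRIMARY KEY,
        name        TEXT
    );
    CREATE TABLE customer_sales (         -- fact table
        transaction_id INTEGER PRIMARY KEY,
        customer_id    INTEGER REFERENCES customers(customer_id),
        amount         REAL
    );
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bo');
    INSERT INTO customer_sales VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0);
""")

# Join the fact table to the dimension table through the key columns.
totals = conn.execute("""
    SELECT c.name, SUM(s.amount)
    FROM customer_sales AS s
    JOIN customers AS c ON c.customer_id = s.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
conn.close()
```

The fact table holds only keys and amounts; the customer name comes from the dimension table at query time.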
foreign key
one or more columns that are associated with a primary key or unique key in another
table. A table can have one or more foreign keys. A foreign key is dependent upon its
associated primary or unique key. In other words, a foreign key cannot exist without
that primary or unique key.
foundation repository
in the SAS Open Metadata Architecture, a metadata repository that is used to
specify metadata for global resources that can be shared by other repositories. For
example, a foundation repository is used to store metadata that defines users and
groups on the metadata server. Only one foundation repository should be defined on
a metadata server. See also custom repository, project repository.
generated key
a column in a dimension table that contains values that are sequentially generated
using a specified expression. Generated keys are used to implement surrogate keys
and retained keys.
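A minimal sketch of sequential key generation, assuming the expression is "current maximum key plus one". The row layout and the function name add_rows_with_generated_keys are hypothetical.

```python
def add_rows_with_generated_keys(dimension, new_rows):
    """Append rows, giving each one the next sequential key (max + 1)."""
    next_key = max((row["key"] for row in dimension), default=0) + 1
    for row in new_rows:
        dimension.append({"key": next_key, **row})
        next_key += 1
    return dimension


dimension = [
    {"key": 1, "customer_id": "C-03"},
    {"key": 2, "customer_id": "C-17"},
]
add_rows_with_generated_keys(
    dimension, [{"customer_id": "C-21"}, {"customer_id": "C-40"}]
)
```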
global resource
an object, such as a server or a library, that is shared on a network.
impact analysis
a search that seeks to identify the tables, columns, and transformations that would
be affected by a change in a selected table or column. See also transformation, data
lineage.
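Both impact analysis and data lineage can be viewed as searches over a dependency graph, in opposite directions. The sketch below assumes a hypothetical mapping FEEDS from each object to the objects it feeds; the object names are invented.

```python
from collections import deque

# Hypothetical dependency edges: each object feeds the objects listed for it.
FEEDS = {
    "orders_src": ["clean_orders"],
    "clean_orders": ["orders_dim", "sales_fact"],
    "orders_dim": ["sales_report"],
    "sales_fact": ["sales_report"],
    "sales_report": [],
}


def reachable(start, edges):
    """Breadth-first search for every object reachable from start."""
    seen = set()
    queue = deque(edges.get(start, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(edges.get(node, []))
    return seen


def reverse_edges(edges):
    """Flip every edge so the search runs upstream instead of downstream."""
    flipped = {}
    for source, targets in edges.items():
        for target in targets:
            flipped.setdefault(target, []).append(source)
    return flipped


impact = reachable("clean_orders", FEEDS)                 # affected by a change
lineage = reachable("sales_fact", reverse_edges(FEEDS))   # has an impact on
```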
intersection table
a table that describes the relationships between two or more tables. For example, an
intersection table could describe the many-to-many relationships between a table of
users and a table of groups.
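The users and groups example can be sketched with plain Python structures. All identifiers and values below are invented for illustration.

```python
users = {1: "ann", 2: "bo", 3: "cy"}
groups = {10: "admins", 20: "analysts"}

# Intersection table: one row per (user_id, group_id) pair.
user_groups = [(1, 10), (1, 20), (2, 20), (3, 20)]


def groups_of(user_id):
    return sorted(groups[g] for u, g in user_groups if u == user_id)


def members_of(group_id):
    return sorted(users[u] for u, g in user_groups if g == group_id)
```

Because each pair is its own row, a user can belong to many groups and a group can contain many users without either table storing a list.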
iterative job
a job with a control loop in which one or more processes are executed multiple times.
Iterative jobs can be executed in parallel. See also job.
iterative processing
a method of processing in which a control loop executes one or more processes
multiple times.
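As a loose analogy outside SAS, a control loop that runs the same process once per input, either sequentially or in parallel, might look like the sketch below. The function load_table and the table names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def load_table(name):
    # Stand-in for the process that the control loop repeats.
    return f"{name}: loaded"


tables = ["sales_2021", "sales_2022", "sales_2023"]

# Sequential control loop: one iteration per table.
sequential = [load_table(name) for name in tables]

# The same iterations executed in parallel; map() keeps the input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel = list(pool.map(load_table, tables))
```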
job
a metadata object that specifies processes that create output.
locale
a value that reflects the language, local conventions, and culture for a geographic
region. Local conventions can include specific formatting rules for dates, times, and
numbers, and a currency symbol for the country or region. Collating sequences,
paper sizes, and conventions for postal addresses and telephone numbers are also
typically specified for each locale. Some examples of locale values are
French_Canada, Portuguese_Brazil, and Chinese_Singapore.
lookup standardization
a process that applies a scheme to a data set for the purpose of data analysis or data
cleansing.
match code
an encoded version of a character value that is created as a basis for data analysis
and data cleansing. Match codes are used to cluster and compare character values.
See also sensitivity.
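To illustrate how match codes group similar values into clusters, the sketch below uses a deliberately simple toy encoding. It is not the algorithm that SAS data quality uses; the function names and the encoding rule are assumptions made only for this example.

```python
import re
from collections import defaultdict


def toy_match_code(value):
    """Toy encoding: uppercase, keep letters only, then drop vowels
    (including Y) after the first letter."""
    letters = re.sub(r"[^A-Z]", "", value.upper())
    if not letters:
        return ""
    return letters[0] + re.sub(r"[AEIOUY]", "", letters[1:])


def cluster(values):
    """Group values that share the same match code."""
    clusters = defaultdict(list)
    for value in values:
        clusters[toy_match_code(value)].append(value)
    return dict(clusters)


names = ["Smith", "Smyth", "SMITH ", "Jones", "Jonas"]
clusters = cluster(names)
```

A stricter or looser encoding rule would produce smaller or larger clusters from the same input values.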
metadata administrator
a person who defines the metadata for servers, metadata repositories, users, and
other global resources.
metadata model
a definition of the metadata for a set of objects. The model describes the attributes
for each object, as well as the relationships between objects within the model.
metadata object
a set of attributes that describe a table, a server, a user, or another resource on a
network. The specific attributes that a metadata object includes vary depending on
which metadata model is being used.
metadata repository
a collection of related metadata objects, such as the metadata for a set of tables and
columns that are maintained by an application. A SAS Metadata Repository is an
example.
metadata server
a server that provides metadata management services to one or more client
applications. A SAS Metadata Server is an example.
metadata source control
in the SAS Open Metadata Architecture, a feature that enables multiple users to
work with the same metadata repository at the same time without overwriting each
other’s changes. See also change management.
operational data
data as it exists in the operational system, which is used as source data for a data
warehouse.
operational system
one or more programs (frequently relational databases) that provide source data for a
data warehouse.
owner
the person who is responsible for the contents of an object such as a table or a
library. See also administrator.
