Tải bản đầy đủ (.pdf) (563 trang)

IGI global selected readings on database technologies and applications aug 2008 ISBN 1605660981 pdf

Bạn đang xem bản rút gọn của tài liệu. Xem và tải ngay bản đầy đủ của tài liệu tại đây (13.62 MB, 563 trang )


Selected Readings on
Database Technologies
and Applications
Terry Halpin
Neumont University, USA

InformatIon scIence reference
Hershey • New York


Director of Editorial Content:
Managing Development Editor:
Senior Managing Editor:
Managing Editor:
Assistant Managing Editor:
Typesetter:
Cover Design:
Printed at:

Kristin Klinger
Kristin M. Roth
Jennifer Neidig
Jamie Snavely
Carole Coulson
Carole Coulson
Lisa Tosheff
Yurchak Printing Inc.

Published in the United States of America by
Information Science Reference (an imprint of IGI Global)


701 E. Chocolate Avenue, Suite 200
Hershey PA 17033
Tel: 717-533-8845
Fax: 717-533-8661
E-mail:
Web site:
and in the United Kingdom by
Information Science Reference (an imprint of IGI Global)
3 Henrietta Street
Covent Garden
London WC2E 8LU
Tel: 44 20 7240 0856
Fax: 44 20 7379 0609
Web site:
Copyright © 2009 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in any form or by
any means, electronic or mechanical, including photocopying, without written permission from the publisher.
Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or companies does
not indicate a claim of ownership by IGI Global of the trademark or registered trademark.

Library of Congress Cataloging-in-Publication Data
Selected readings on database technologies and applications / Terry Halpin, editor.
p. cm.
Summary: "This book offers research articles focused on key issues concerning the development, design, and analysis of databases"-Provided by publisher.
Includes bibliographical references and index.
ISBN 978-1-60566-098-1 (hbk.) -- ISBN 978-1-60566-099-8 (ebook)
1. Databases. 2. Database design. I. Halpin, T. A.
QA76.9.D32S45 2009
005.74--dc22
2008020494
British Cataloguing in Publication Data

A Cataloguing in Publication record for this book is available from the British Library.
All work contributed to this book set is original material. The views expressed in this book are those of the authors, but not necessarily of
the publisher.
If a library purchased a print copy of this publication, please go to for information on activating the
library's complimentary electronic access to this publication.


Table of Contents

Prologue............................................................................................................................................ xviii
About the Editor............................................................................................................................. xxvii
Section I
Fundamental Concepts and Theories
Chapter I
Conceptual Modeling Solutions for the Data Warehouse....................................................................... 1
Stefano Rizzi, DEIS - University of Bologna, Italy

Chapter II
Databases Modeling of Engineering Information................................................................................. 21
Z. M. Ma, Northeastern University, China
Chapter III
An Overview of Learning Object Repositories.................................................................................... 44
Argiris Tzikopoulos, Agricultural University of Athens, Greece
Nikos Manouselis, Agricultural University of Athens, Greece
Riina Vuorikari, European Schoolnet, Belgium
Chapter IV
Discovering Quality Knowledge from Relational Databases............................................................... 65
M. Mehdi Owrang O., American University, USA



Section II
Development and Design Methodologies

Chapter V
Business Data Warehouse: The Case of Wal-Mart .............................................................................. 85
Indranil Bose, The University of Hong Kong, Hong Kong
Lam Albert Kar Chun, The University of Hong Kong, Hong Kong
Leung Vivien Wai Yue, The University of Hong Kong, Hong Kong
Li Hoi Wan Ines, The University of Hong Kong, Hong Kong
Wong Oi Ling Helen, The University of Hong Kong, Hong Kong
Chapter VI
A Database Project in a Small Company (or How the Real World Doesn’t Always
Follow the Book) ................................................................................................................................. 95
Efrem Mallach, University of Massachusetts Dartmouth, USA
Chapter VII
Conceptual Modeling for XML: A Myth or a Reality ........................................................................112
Sriram Mohan, Indiana University, USA
Arijit Sengupta, Wright State University, USA
Chapter VIII
Designing Secure Data Warehouses................................................................................................... 134
Rodolfo Villarroel, Universidad Católica del Maule, Chile
Eduardo Fernández-Medina, Universidad de Castilla-La Mancha, Spain
Juan Trujillo, Universidad de Alicante, Spain
Mario Piattini, Universidad de Castilla-La Mancha, Spain
Chapter IX
Web Data Warehousing Convergence: From Schematic to Systematic ............................................. 148
D. Xuan Le, La Trobe University, Australia
J. Wenny Rahayu, La Trobe University, Australia
David Taniar, Monash University, Australia


Section III
Tools and Technologies
Chapter X
Visual Query Languages, Representation Techniques, and Data Models.......................................... 174
Maria Chiara Caschera, IRPPS-CNR, Italy
Arianna D’Ulizia, IRPPS-CNR, Italy
Leonardo Tininini, IASI-CNR, Italy


Chapter XI
Application of Decision Tree as a Data Mining Tool in a Manufacturing System ............................ 190
S. A. Oke, University of Lagos, Nigeria
Chapter XII
A Scalable Middleware for Web Databases ....................................................................................... 206
Athman Bouguettaya, Virginia Tech, USA
Zaki Malik, Virginia Tech, USA
Abdelmounaam Rezgui, Virginia Tech, USA
Lori Korff, Virginia Tech, USA
Chapter XIII
A Formal Verification and Validation Approach for Real-Time Databases ....................................... 234
Pedro Fernandes Ribeiro Neto, Universidade do Estado–do Rio Grande do Norte, Brazil
Maria Lígia Barbosa Perkusich, Universidade Católica de Pernambuco, Brazil
Hyggo Oliveira de Almeida, Federal University of Campina Grande, Brazil
Angelo Perkusich, Federal University of Campina Grande, Brazil
Chapter XIV
A Generalized Comparison of Open Source and Commercial Database Management Systems ...... 252
Theodoros Evdoridis, University of the Aegean, Greece
Theodoros Tzouramanis, University of the Aegean, Greece

Section IV

Application and Utilization
Chapter XV
An Approach to Mining Crime Patterns ............................................................................................ 268
Sikha Bagui, The University of West Florida, USA
Chapter XVI
Bioinformatics Web Portals ............................................................................................................... 296
Mario Cannataro, Università “Magna Græcia” di Catanzaro, Italy
Pierangelo Veltri, Università “Magna Græcia” di Catanzaro, Italy
Chapter XVII
An XML-Based Database for Knowledge Discovery: Definition and Implementation .................... 305
Rosa Meo, Università di Torino, Italy
Giuseppe Psaila, Università di Bergamo, Italy
Chapter XVIII
Enhancing UML Models: A Domain Analysis Approach .................................................................. 330
Iris Reinhartz-Berger, University of Haifa, Israel
Arnon Sturm, Ben-Gurion University of the Negev, Israel


Chapter XIX
Seismological Data Warehousing and Mining: A Survey .................................................................. 352
Gerasimos Marketos,University of Piraeus, Greece
Yannis Theodoridis, University of Piraeus, Greece
Ioannis S. Kalogeras, National Observatory of Athens, Greece

Section V
Critical Issues
Chapter XX
Business Information Integration from XML and Relational Databases Sources ............................. 369
Ana María Fermoso Garcia, Pontifical University of Salamanca, Spain
Roberto Berjón Gallinas, Pontifical University of Salamanca, Spain

Chapter XXI
Security Threats in Web-Powered Databases and Web Portals ......................................................... 395
Theodoros Evdoridis, University of the Aegean, Greece
Theodoros Tzouramanis, University of the Aegean, Greece
Chapter XXII
Empowering the OLAP Technology to Support Complex Dimension Hierarchies........................... 403
Svetlana Mansmann, University of Konstanz, Germany
Marc H. Scholl, University of Konstanz, Germany
Chapter XXIII
NetCube: Fast, Approximate Database Queries Using Bayesian Networks ...................................... 424
Dimitris Margaritis, Iowa State University, USA
Christos Faloutsos, Carnegie Mellon University, USA
Sebastian Thrun, Stanford University, USA
Chapter XXIV
Node Partitioned Data Warehouses: Experimental Evidence and Improvements ............................. 450
Pedro Furtado, University of Coimbra, Portugal

Section VI
Emerging Trends
Chapter XXV
Rule Discovery from Textual Data .................................................................................................... 471
Shigeaki Sakurai, Toshiba Corporation, Japan


Chapter XXVI
Action Research with Internet Database Tools .................................................................................. 490
Bruce L. Mann, Memorial University, Canada
Chapter XXVII
Database High Availability: An Extended Survey ............................................................................. 499
Moh’d A. Radaideh, Abu Dhab Police – Ministry of Interior, United Arab Emirates

Hayder Al-Ameed, United Arab Emirates University, United Arab Emirates

Index .................................................................................................................................................. 528


Detailed Table of Contents

Prologue............................................................................................................................................ xviii
About the Editor............................................................................................................................. xxvii

Section I
Fundamental Concepts and Theories
Chapter I
Conceptual Modeling Solutions for the Data Warehouse....................................................................... 1
Stefano Rizzi, DEIS - University of Bologna, Italy
This opening chapter provides an overview of the fundamental role that conceptual modeling plays in data
warehouse design. Specifically, research focuses on a conceptual model called the DFM (Dimensional
Fact Model), which suits the variety of modeling situations that may be encountered in real projects
of small to large complexity. The aim of the chapter is to propose a comprehensive set of solutions for
conceptual modeling according to the DFM and to give the designer a practical guide for applying them
in the context of a design methodology. Other issues discussed include descriptive and cross-dimension
attributes; convergences; shared, incomplete, recursive, and dynamic hierarchies; multiple and optional
arcs; and additivity.
Chapter II
Databases Modeling of Engineering Information................................................................................. 21
Z. M. Ma, Northeastern University, China
As information systems have become the nerve center of current computer-based engineering, the need
for engineering information modeling has become imminent. Databases are designed to support data
storage, processing, and retrieval activities related to data management, and database systems are the
key to implementing engineering information modeling. It should be noted that, however, the current

mainstream databases are mainly used for business applications. Some new engineering requirements
challenge today’s database technologies and promote their evolution. Database modeling can be classified into two levels: conceptual data modeling and logical database modeling. In this chapter, the
author tries to identify the requirements for engineering information modeling and then investigates the
satisfactions of current database models to these requirements at two levels: conceptual data models
and logical database models.


Chapter III
An Overview of Learning Object Repositories ................................................................................... 44
Argiris Tzikopoulos, Agricultural University of Athens, Greece
Nikos Manouselis, Agricultural University of Athens, Greece
Riina Vuorikari, European Schoolnet, Belgium
Learning objects are systematically organized and classified in online databases, which are termed learning object repositories (LORs). Currently, a rich variety of LORs is operating online, offering access
to wide collections of learning objects. These LORs cover various educational levels and topics, store
learning objects and/or their associated metadata descriptions, and offer a range of services that may
vary from advanced search and retrieval of learning objects to intellectual property rights (IPR) management. Until now, there has not been a comprehensive study of existing LORs that will give an outline of
their overall characteristics. For this purpose, this chapter presents the initial results from a survey of 59
well-known repositories with learning resources. The most important characteristics of surveyed LORs
are examined and useful conclusions about their current status of development are made.
Chapter IV
Discovering Quality Knowledge from Relational Databases .............................................................. 65
M. Mehdi Owrang O., American University, USA
Current database technology involves processing a large volume of data in order to discover new knowledge. However, knowledge discovery on just the most detailed and recent data does not reveal the longterm trends. Relational databases create new types of problems for knowledge discovery since they are
normalized to avoid redundancies and update anomalies, which make them unsuitable for knowledge
discovery. A key issue in any discovery system is to ensure the consistency, accuracy, and completeness
of the discovered knowledge. This selection describes the aforementioned problems associated with the
quality of the discovered knowledge and provides solutions to avoid them.

Section II
Development and Design Methodologies

Chapter V
Business Data Warehouse: The Case of Wal-Mart .............................................................................. 85
Indranil Bose, The University of Hong Kong, Hong Kong
Lam Albert Kar Chun, The University of Hong Kong, Hong Kong
Leung Vivien Wai Yue, The University of Hong Kong, Hong Kong
Li Hoi Wan Ines, The University of Hong Kong, Hong Kong
Wong Oi Ling Helen, The University of Hong Kong, Hong Kong
The retailing giant Wal-Mart owes its success to the efficient use of information technology in its operations. One of the noteworthy advances made by Wal-Mart is the development of a data warehouse,
which gives the company a strategic advantage over its competitors. In this chapter, the planning and
implementation of the Wal-Mart data warehouse is described and its integration with the operational
systems is discussed. The chapter also highlights some of the problems encountered in the developmental


process of the data warehouse. The implications of the recent advances in technologies such as RFID,
which is likely to play an important role in the Wal-Mart data warehouse in future, are also detailed in
this chapter.
Chapter VI
A Database Project in a Small Company (or How the Real World Doesn’t Always
Follow the Book) ................................................................................................................................. 95
Efrem Mallach, University of Massachusetts Dartmouth, USA
The selection presents a small consulting company’s experience in the design and implementation of a
database and associated information retrieval system. The company’s choices are explained within the
context of the firm’s needs and constraints. Issues associated with development methods are discussed,
along with problems that arose from not following proper development disciplines. Ultimately, the author
asserts that while the system provided real value to its users, the use of proper development disciplines
could have reduced some problems while not reducing that value.
Chapter VII
Conceptual Modeling for XML: A Myth or a Reality ........................................................................112
Sriram Mohan, Indiana University, USA
Arijit Sengupta, Wright State University, USA

Conceptual design is independent of the final platform and the medium of implementation, and is usually in a form that is understandable to managers and other personnel who may not be familiar with the
low-level implementation details, but have a major influence in the development process. Although a
strong design phase is involved in most current application development processes, conceptual design
for XML has not been explored significantly in literature or in practice. In this chapter, the reader is
introduced to existing methodologies for modeling XML. A discussion is then presented comparing and
contrasting their capabilities and deficiencies, and delineating the future trend in conceptual design for
XML applications.
Chapter VIII
Designing Secure Data Warehouses................................................................................................... 134
Rodolfo Villarroel, Universidad Católica del Maule, Chile
Eduardo Fernández-Medina, Universidad de Castilla-La Mancha, Spain
Juan Trujillo, Universidad de Alicante, Spain
Mario Piattini, Universidad de Castilla-La Mancha, Spain
As an organization’s reliance on information systems governed by databases and data warehouses (DWs)
increases, so does the need for quality and security within these systems. Since organizations generally
deal with sensitive information such as patient diagnoses or even personal beliefs, a final DW solution
should restrict the users that can have access to certain specific information. This chapter presents a
comparison of six design methodologies for secure systems. Also presented are a proposal for the design
of secure DWs and an explanation of how the conceptual model can be implemented with Oracle Label
Security (OLS10g).


Chapter IX
Web Data Warehousing Convergence: From Schematic to Systematic ............................................. 148
D. Xuan Le, La Trobe University, Australia
J. Wenny Rahayu, La Trobe University, Australia
David Taniar, Monash University, Australia
This chapter proposes a data warehouse integration technique that combines data and documents from
different underlying documents and database design approaches. Well-defined and structured data,
semi-structured data, and unstructured data are integrated into a Web data warehouse system and user

specified requirements and data sources are combined to assist with the definitions of the hierarchical
structures. A conceptual integrated data warehouse model is specified based on a combination of user
requirements and data source structure, which necessitates the creation of a logical integrated data warehouse model. A case study is then developed into a prototype in a Web-based environment that enables
the evaluation. The evaluation of the proposed integration Web data warehouse methodology includes
the verification of correctness of the integrated data, and the overall benefits of utilizing this proposed
integration technique.

Section III
Tools and Technologies
Chapter X
Visual Query Languages, Representation Techniques, and Data Models.......................................... 174
Maria Chiara Caschera, IRPPS-CNR, Italy
Arianna D’Ulizia, IRPPS-CNR, Italy
Leonardo Tininini, IASI-CNR, Italy
An easy, efficient, and effective way to retrieve stored data is obviously one of the key issues of any
information system. In the last few years, considerable effort has been devoted to the definition of more
intuitive, visual-based querying paradigms, attempting to offer a good trade-off between expressiveness and intuitiveness. In this chapter, the authors analyze the main characteristics of visual languages
specifically designed for querying information systems, concentrating on conventional relational databases, but also considering information systems with a less rigid structure such as Web resources storing
XML documents. Two fundamental aspects of visual query languages are considered: the adopted visual
representation technique and the underlying data model, possibly specialized to specific application
contexts.
Chapter XI
Application of Decision Tree as a Data Mining Tool in a Manufacturing System ............................ 190
S. A. Oke, University of Lagos, Nigeria
This selection demonstrates the application of decision tree, a data mining tool, in the manufacturing
system. Data mining has the capability for classification, prediction, estimation, and pattern recognition
by using manufacturing databases. Databases of manufacturing systems contain significant information


for decision making, which could be properly revealed with the application of appropriate data mining

techniques. Decision trees are employed for identifying valuable information in manufacturing databases.
Practically, industrial managers would be able to make better use of manufacturing data at little or no
extra investment in data manipulation cost. The work shows that it is valuable for managers to mine
data for better and more effective decision making.
Chapter XII
A Scalable Middleware for Web Databases ....................................................................................... 206
Athman Bouguettaya, Virginia Tech, USA
Zaki Malik, Virginia Tech, USA
Abdelmounaam Rezgui, Virginia Tech, USA
Lori Korff, Virginia Tech, USA
The emergence of Web databases has introduced new challenges related to their organization, access,
integration, and interoperability. New approaches and techniques are needed to provide across-the-board
transparency for accessing and manipulating Web databases irrespective of their data models, platforms,
locations, or systems. In meeting these needs, it is necessary to build a middleware infrastructure to support flexible tools for information space organization communication facilities, information discovery,
content description, and assembly of data from heterogeneous sources. This chapter describes a scalable
middleware for efficient data and application access built using available technologies. The resulting
system, WebFINDIT, is a scalable and uniform infrastructure for locating and accessing heterogeneous
and autonomous databases and applications.
Chapter XIII
A Formal Verification and Validation Approach for Real-Time Databases ....................................... 234
Pedro Fernandes Ribeiro Neto, Universidade do Estado–do Rio Grande do Norte, Brazil
Maria Lígia Barbosa Perkusich, Universidade Católica de Pernambuco, Brazil
Hyggo Oliveira de Almeida, Federal University of Campina Grande, Brazil
Angelo Perkusich, Federal University of Campina Grande, Brazil
Real-time database-management systems provide efficient support for applications with data and transactions that have temporal constraints, such as industrial automation, aviation, and sensor networks,
among others. Many issues in real-time databases have brought interest to research in this area, such
as: concurrence control mechanisms, scheduling policy, and quality of services management. However,
considering the complexity of these applications, it is of fundamental importance to conceive formal
verification and validation techniques for real-time database systems. This chapter presents a formal
verification and validation method for real-time databases. Such a method can be applied to database

systems developed for computer integrated manufacturing, stock exchange, network-management, and
command-and-control applications and multimedia systems.
Chapter XIV
A Generalized Comparison of Open Source and Commercial Database Management Systems ...... 252
Theodoros Evdoridis, University of the Aegean, Greece
Theodoros Tzouramanis, University of the Aegean, Greece


This chapter attempts to bring to light the field of one of the less popular branches of the open source
software family, which is the open source database management systems branch. In view of the objective, the background of these systems is first briefly described followed by presentation of a fair generic
database model. Subsequently and in order to present these systems under all their possible features, the
main system representatives of both open source and commercial origins will be compared in relation
to this model, and evaluated appropriately. By adopting such an approach, the chapter’s initial concern
is to ensure that the nature of database management systems in general can be apprehended. The overall
orientation leads to an understanding that the gap between open and closed source database management
systems has been significantly narrowed, thus demystifying the respective commercial products.
Section IV
Application and Utilization
Chapter XV
An Approach to Mining Crime Patterns ............................................................................................ 268
Sikha Bagui, The University of West Florida, USA
This selection presents a knowledge discovery effort to retrieve meaningful information about crime from
a U.S. state database. The raw data were preprocessed, and data cubes were created using Structured
Query Language (SQL). The data cubes then were used in deriving quantitative generalizations and for
further analysis of the data. An entropy-based attribute relevance study was undertaken to determine
the relevant attributes. A machine learning software called WEKA was used for mining association
rules, developing a decision tree, and clustering. SOM was used to view multidimensional clusters on
a regular two-dimensional grid.
Chapter XVI
Bioinformatics Web Portals ............................................................................................................... 296

Mario Cannataro, Università “Magna Græcia” di Catanzaro, Italy
Pierangelo Veltri, Università “Magna Græcia” di Catanzaro, Italy
Bioinformatics involves the design and development of advanced algorithms and computational platforms
to solve problems in biomedicine (Jones & Pevzner, 2004). It also deals with methods for acquiring,
storing, retrieving and analysing biological data obtained by querying biological databases or provided
by experiments. Bioinformatics applications involve different datasets as well as different software
tools and algorithms. Such applications need semantic models for basic software components and need
advanced scientific portal services able to aggregate such different components and to hide their details
and complexity from the final user. For instance, proteomics applications involve datasets, either produced by experiments or available as public databases, as well as a huge number of different software
tools and algorithms. To use such applications, it is required to know both biological issues related to
data generation and results interpretation and informatics requirements related to data analysis.
Chapter XVII
An XML-Based Database for Knowledge Discovery: Definition and Implementation .................... 305
Rosa Meo, Università di Torino, Italy
Giuseppe Psaila, Università di Bergamo, Italy


Inductive databases have been proposed as general purpose databases to support the KDD process.
Unfortunately, the heterogeneity of the discovered patterns and of the different conceptual tools used
to extract them from source data make integration in a unique framework difficult. In this chapter, using XML as the unifying framework for inductive databases is explored, and a new model, XML for
data mining (XDM), is proposed. The basic features of the model are presented, based on the concepts
of data item (source data and patterns) and statement (used to manage data and derive patterns). This
model uses XML namespaces (to allow the effective coexistence and extensibility of data mining operators) and XML schema, by means of which the schema, state and integrity constraints of an inductive
database are defined.
Chapter XVIII
Enhancing UML Models: A Domain Analysis Approach .................................................................. 330
Iris Reinhartz-Berger, University of Haifa, Israel
Arnon Sturm, Ben-Gurion University of the Negev, Israel
UML has been largely adopted as a standard modeling language. The emergence of UML from different
modeling languages has caused a wide variety of completeness and correctness problems in UML models. Several methods have been proposed for dealing with correctness issues, mainly providing internal

consistency rules, but ignoring correctness and completeness with respect to the system requirements
and the domain constraints. This chapter proposes the adoption of a domain analysis approach called
application-based domain modeling (ADOM) to address the completeness and correction problems of
UML models. Experimental results from a study which checks the quality of application models when
utilizing ADOM on UML suggest that the proposed domain helps in creating more complete models
without compromising comprehension.
Chapter XIX
Seismological Data Warehousing and Mining: A Survey .................................................................. 352
Gerasimos Marketos,University of Piraeus, Greece
Yannis Theodoridis, University of Piraeus, Greece
Ioannis S. Kalogeras, National Observatory of Athens, Greece
Earthquake data is comprised of an ever increasing collection of earth science information for postprocessing analysis. Earth scientists, as well as local and national administration officers, use these data
collections for scientific and planning purposes. In this chapter, the authors discuss the architecture of a
seismic data management and mining system (SDMMS) for quick and easy data collection, processing,
and visualization. The SDMMS architecture includes a seismological database for efficient and effective
querying and a seismological data warehouse for OLAP analysis and data mining. Template schemes are
provided for these two components and examples of how these components support decision making are
given. A comparative survey of existing operational or prototype SDMMS is also offered.


Section V
Critical Issues
Chapter XX
Business Information Integration from XML and Relational Databases Sources ............................. 369
Ana María Fermoso Garcia, Pontifical University of Salamanca, Spain
Roberto Berjón Gallinas, Pontifical University of Salamanca, Spain
Roberto Berjón Gallinas, Pontifical University of Salamanca, Spain
This chapter introduces different alternatives to store and manage jointly relational and eXtensible
Markup Language (XML) data sources. Nowadays, businesses are transformed into e-business and
have to manage large data volumes and from heterogeneous sources. To manage large amounts of information, Database Management Systems (DBMS) continue to be one of the most used tools, and the

most extended model is the relational one. On the other side, XML has reached the de facto standard to
present and exchange information between businesses on the Web. Therefore, it could be necessary to
use tools as mediators to integrate these two different data to a common format like XML, since it is the
main data format on the Web. First, a classification of the main tools and systems where this problem
is handled is made, with their advantages and disadvantages. The objective will be to propose a new
system to solve the integration business information problem.
Chapter XXI
Security Threats in Web-Powered Databases and Web Portals ......................................................... 395
Theodoros Evdoridis, University of the Aegean, Greece
Theodoros Tzouramanis, University of the Aegean, Greece
It is a strongly held view that the scientific branch of computer security that deals with Web-powered
databases (Rahayu & Taniar, 2002) that can be accessed through Web portals (Tatnall, 2005) is both
complex and challenging. This is mainly due to the fact that there are numerous avenues available for
a potential intruder to follow in order to break into the Web portal and compromise its assets and
functionality. This is of vital importance when the assets that might be jeopardized belong to a legally
sensitive Web database such as that of an enterprise or government portal, containing sensitive and
confidential information. It is obvious that the aim of not only protecting against, but mostly preventing
from potential malicious or accidental activity that could set a Web portal’s asset in danger, requires an
attentive examination of all possible threats that may endanger the Web-based system.
Chapter XXII
Empowering the OLAP Technology to Support Complex Dimension Hierarchies........................... 403
Svetlana Mansmann, University of Konstanz, Germany
Marc H. Scholl, University of Konstanz, Germany
Comprehensive data analysis has become indispensable in a variety of domains. OLAP (On-Line Analytical Processing) systems tend to perform poorly or even fail when applied to complex data scenarios.
The restriction of the underlying multidimensional data model to admit only homogeneous and balanced
dimension hierarchies is too rigid for many real-world applications and, therefore, has to be overcome in
order to provide adequate OLAP support. The authors of this chapter present a framework for classifying


and modeling complex multidimensional data, with the major effort at the conceptual level of transforming

irregular hierarchies to make them navigable in a uniform manner. The properties of various hierarchy
types are formalized and a two-phase normalization approach is proposed: heterogeneous dimensions
are reshaped into a set of well-behaved homogeneous subdimensions, followed by the enforcement of
summarizability in each dimension’s data hierarchy. The power of the current approach is exemplified
using a real-world study from the domain of academic administration.
Chapter XXIII
NetCube: Fast, Approximate Database Queries Using Bayesian Networks ...................................... 424
Dimitris Margaritis, Iowa State University, USA
Christos Faloutsos, Carnegie Mellon University, USA
Sebastian Thrun, Stanford University, USA
This chapter presents a novel method for answering count queries from a large database approximately
and quickly. This method implements an approximate DataCube of the application domain, which can be
used to answer any conjunctive count query that can be formed by the user. The DataCube is a conceptual
device that in principle stores the number of matching records for all possible such queries. However,
because its size and generation time are inherently exponential, the current approach uses one or more
Bayesian networks to implement it approximately. By means of such a network, the proposed method,
called NetCube, exploits correlations and independencies among attributes to answer a count query
quickly without accessing the database. Experimental results show that NetCubes have fast generation
and use, achieve excellent compression and have low reconstruction error while also naturally allowing
for visualization and data mining.
Chapter XXIV
Node Partitioned Data Warehouses: Experimental Evidence and Improvements ............................. 450
Pedro Furtado, University of Coimbra, Portugal
Data Warehouses (DWs) with large quantities of data present major performance and scalability challenges, and parallelism can be used for major performance improvement in such context. However,
instead of costly specialized parallel hardware and interconnections, the authors of this selection focus
on low-cost standard computing nodes, possibly in a non-dedicated local network. In this environment,
special care must be taken with partitioning and processing. Experimental evidence is used to analyze
the shortcomings of a basic horizontal partitioning strategy designed for that environment, and then improvements to allow efficient placement for the low-cost Node Partitioned Data Warehouse are proposed
and tested. A simple, easy-to-apply partitioning and placement decision that achieves good performance
improvement results is analyzed. This chapter’s experiments and discussion provides important insight

into partitioning and processing issues for data warehouses in shared-nothing environments.


Section VI
Emerging Trends
Chapter XXV
Rule Discovery from Textual Data .................................................................................................... 471
Shigeaki Sakurai, Toshiba Corporation, Japan
This chapter introduces knowledge discovery methods based on a fuzzy decision tree from textual data.
The author argues that the methods extract features of the textual data based on a key concept dictionary,
which is a hierarchical thesaurus, and a key phrase pattern dictionary, which stores characteristic rows
of both words and parts of speech, and generate knowledge in the format of a fuzzy decision tree. The
author also discusses two application tasks. One is an analysis system for daily business reports and
the other is an e-mail analysis system. The author hopes that the methods will provide new knowledge
for researchers engaged in text mining studies, facilitating their understanding of the importance of the
fuzzy decision tree in processing textual data.
Chapter XXVI
Action Research with Internet Database Tools .................................................................................. 490
Bruce L. Mann, Memorial University, Canada
This chapter discusses and presents examples of Internet database tools, typical instructional methods
used with these tools, and implications for Internet-supported action research as a progressively deeper
examination of teaching and learning. First, the author defines and critically explains the use of artifacts in an educational setting and then differentiates between the different types of artifacts created by
both students and teachers. Learning objects and learning resources are also defined and, as the chapter
concludes, three different types of instructional devices – equipment, physical conditions, and social
mechanisms or arrangements – are analyzed and an exercise is offered for both differentiating between
and understanding differences in instruction and learning.
Chapter XXVII
Database High Availability: An Extended Survey ............................................................................. 499
Moh’d A. Radaideh, Abu Dhab Police – Ministry of Interior, United Arab Emirates
Hayder Al-Ameed, United Arab Emirates University, United Arab Emirates

With the advancement of computer technologies and the World Wide Web, there has been an explosion in the amount of available e-services, most of which represent database processing. Efficient and
effective database performance tuning and high availability techniques should be employed to ensure
that all e-services remain reliable and available all times. To avoid the impacts of database downtime,
many corporations have taken interest in database availability. The goal for some is to have continuous
availability such that a database server never fails. Other companies require their content to be highly
available. In such cases, short and planned downtimes would be allowed for maintenance purposes. This
chapter is meant to present the definition, the background, and the typical measurement factors of high
availability. It also demonstrates some approaches to minimize a database server’s shutdown time.

Index .................................................................................................................................................. 528


xviii

Prologue

historical oVErViEw of databasE tEchnology
This prologue provides a brief historical perspective of developments in database technology, and then
reviews and contrasts three current approaches to elevate the initial design of database systems to a
conceptual level.
Beginning in the late 1970s, the old network and hierarchic database management systems (DBMSs)
began to be replaced by relational DBMSs, and by the late 1980s relational systems performed sufficiently
well that the recognized benefits of their simple bag-oriented data structure and query language (SQL)
made relational DBMSs the obvious choice for new database applications. In particular, the simplicity of
Codd’s relational model of data where all facts are stored in relations (sets of ordered n-tupes) facilitated
data access and optimization for a wide range of application domains (Codd, 1970). Although Codd’s
data model was purely set-oriented, industrial relational DBMSs and SQL itself are bag-oriented, since
SQL allows keyless tables, and SQL queries queries may return multisets (Melton & Simon, 2002).
Unlike relational databases, network and hierarchic databases store facts in not only record types but
also navigation paths between record types. For example, in a hierarchic database the fact that employee

101 works for the Sales department would be stored as a parent-child link from a department record (an
instance of the Department record type where the deptName attribute has the value ‘Sales’) to an employee record (an instance of the Employee record type where the empNr attribute has the value 101).
Although relational systems do support foreign key “relationships” between relations, these relationships are not navigation paths; instead they simply encode constraints (e.g. each deptName in an Employee
table must also occur in the primary key of the Department table) rather than ground facts. For example,
the ground fact that employee 101 works for the Sales department is stored by entering the values 101,
‘Sales’ in the empNr and deptName columns on the same row of the Employee table.
In 1989, a group of researchers published “The Object-Oriented Database System Manifesto” in which
they argued that object-oriented databases should replace relational databases (Atkinson et al. 1989).
Influenced by object-oriented programming languages, they felt that databases should support not only
core databases features such as persistence, concurrency, recovery, and an ad hoc query facility, but also
object-oriented features such as complex objects, object identity, encapsulation of behavior with data,
types or classes, inheritance (subtyping), overriding and late binding, computational completeness, and
extensibility. Databases conforming to this approach are called object-oriented databases (OODBs) or
simply object databases (ODBs).
Partly in response to the OODB manifesto, one year later a group of academic and industrial researchers proposed an alternative “3rd generation DBMS manifesto” (Stonebraker et al., 1990). They
considered network and hierarchic databases to be first generation, and relational databases to be second
generation, and argued that third generation databases should retain the capabilities of relational systems
while extending them with object-oriented features. Databases conforming to this approach are called
object-relational databases (ORDBs).


xix

While other kinds of databases (e.g. deductive, temporal, and spatial) were also developed to address
specific needs, none of these has gained a wide following in industry. Deductive databases typically
provide a declarative query language such as a logic programming language (e.g. Prolog), giving them
powerful rule enforcement mechanisms with built-in backtracking and strong support for recursive rules
(e.g. computing the transitive closure of an ancestor relation).
Spatial databases provide efficient management of spatial data, such as maps (e.g. for geographical applications), 2-D visualizations (e.g. for circuit designs), and 3-D visualizations (e.g. for medical
imaging). Built-in support for spatial data types (e.g. points, lines, polygons) and spatial operators (e.g.

intersect, overlap, contains) facilitates queries of a spatial nature (e.g. how many residences lie within
3 miles of the proposed shopping center?).
Temporal databases provide built-in support for temporal data types (e.g. instant, duration, period)
and temporal operators (e.g. before, after, during, contains, overlaps, precedes, starts, minus), facilitating
queries of a temporal nature (e.g. which conferences overlap in time?).
A more recent proposal for database technology employs XML (eXtensible Markup Language). XML
databases store data in XML (eXtensible Markup Language), with their structure conforming either to
the old DTD (Document Type Definition) or the newer XSD (XML Schema Definition) format. Like
the old hierarchic databases, XML is hierarchic in nature. However XML is presented as readable text,
using tags to provide the structure. For example, the facts that employees 101 and 102 work for the Sales
department could be stored (along with their names and birth dates) in XML as follows.
<department name = “Sales”>
<employee empNr = “101”>
<name>Fred Smith</name>
<birthdate>1946-02-15</birthdate>
</employee>
<employee empNr = “102”>
<name>Sue Jones</name>
<birthdate>1980-06-30</birthdate>
</employee>
</department>

Just as SQL is used for querying and manipulating relational data, the XQuery language is now the
standard language for querying and manipulating XML data, (Melton & Buxton, 2006).
One very recent proposal for a new kind of database technology is the so-called “ontology database”,
which is proposed to help achieve the vision of the semantic web (Berners-Lee et al., 2001). The basic
idea is that documents spread over the Internet may include tags to embed enough semantic detail to
enable understanding of their content by automated agents. Built on Unicode text, URIrefs (Uniform
Resource Identifiers) to identify resources, XML and XSD datatypes, facts are encoded in RDF (Resource
Description Framework) triples (subject, predicate, object) representing binary relationships from a node

(resource or literal) to another node. RDF Schema (RDFS) builds on RDF by providing inbuilt support
for classes and subclassing. The Web Ontology Language (OWL) builds on these underlying layers to
provide what is now the most popular language for developing ontologies (schemas and their database
instances) for the semantic web.
OWL includes three versions. OWL Lite provides a decidable, efficient mechanism for simple ontologies composed mainly of classification hierarchies and relationships with simple constraints. OWL
DL (the “DL” refers to Description Logic) is based on a stronger SHOIN(D) description logic that is
still decidable. OWL Full is more expressive but is undecidable, and even goes beyond even first order
logic.


xx

All of the above database technologies are still in use, to varying degrees. While some legacy systems
still use the old network and hierarchic DBMSs, new database applications are not built on these obsolete technologies. Object databases, deductive databases, and temporal databases provide advantages
for niche markets. However the industrial database world is still dominated by relational and objectrelational DBMSs. In practice, ORDBs have become the dominant DBMS, since virtually all the major
industrial relational DBMSs (e.g. Oracle, IBM DB2, and Microsoft SQL Server) extended their systems
with object-oriented features, and also expanded their support for data types including XML. The SQL
standard now includes support for collection types (e.g. arrays, row types and multisets, recursive queries
and XML). Some ORDBMSs (e.g. Oracle) include support for RDF. While SQL is still often used for
data exchange, XML is being increasingly used for exchanging data between applications.
In practice, most applications use an object model for transient (in-memory) storage, while using
an RDB or ORDB for persistent storage. This has led to extensive efforts to facilitate transformation
between these differently structured data stores (known as Object-Relational mapping). One interesting
initiative in this regard is Microsoft’s Language Integrated Query (LINQ) technology, which allows users
to interact with relational data by using an SQL-like syntax in their object-oriented program code.
Recently there has been a growing recognition that the best way to develop database systems is by
transformation from a high level, conceptual schema that specifies the structure of the data in a way
that can be easily understood and hence validated by the (often nontechnical) subject matter experts,
who are the only ones who can reliably determine whether the proposed models accurately reflect their
business domains.

While this notion of model driven development was forcefully and clearly proposed over a quarter
century ago in an ISO standard (van Griethuysen, 1982), only in the last decade has it begun to be
widely accepted by major commercial interests. Though called differently by different bodies (e.g. the
Object management Group calls it “Model Driven Architecture” and Microsoft promotes model driven
development based on Domain Specific Languages) the basic idea is to clearly specify the business
domain model at a conceptual level, and then transform it as automatically as possible to application
code, thereby minimizing the need for human programming. In the next section we review and contrast
three of the most popular approaches to specifying high level data models for subsequent transformation
into database schemas.

concEptual databasE modEling approachEs
In industry, most database designers either use a variant of Entity Relationship (ER) modeling or simply
design directly at the relational level. The basic ER approach was first proposed by Chen (1976), and
structures facts in terms of entities (e.g. Person, Car) that have attributes (e.g. gender, birthdate) and
participate in relationships (e.g. Person drives Car). The most popular industrial versions of ER are the
Barker ER notation (Barker, 1990), Information Engineering (IE) (Finkelstein, 1998), and IDEF1X
(IEEE, 1999). IDEF1X is actually a hybrid of ER and relational, explicitly using relational concepts
such as foreign keys. Barker ER is currently the best and most expressive of the industrial ER notations,
so we focus our ER discussion on it.
The Unified Modeling Language (UML) was adopted by the Object Management Group (OMG) in
1997 as a language for object-oriented (OO) analysis and design. After several minor revisions, a major
overhaul resulted in UML version 2.0 (OMG, 2003), and the language is still being refined. Although
suitable for object-oriented code design, UML is less suitable for information analysis (e.g. even UML
2 does not include a graphic way to declare that an attribute is unique), and its textual Object Constraint


xxi

Language (OCL) is too technical for most business people to understand (Warmer & Kleppe, 2003). For
such reasons, although UML is widely used for documenting object-oriented programming applications,

it is far less popular than ER for database design.
Despite their strengths, both ER and UML are fairly weak at capturing the kinds of business rules
found in data-intensive applications, and their graphical language does not lend itself readily to verbalization and multiple instantiation for validating data models with domain experts.
These problems can be remedied by using a fact-oriented approach for information analysis, where
communication takes place in simple sentences, each sentence type can easily be populated with multiple
instances, attributes are avoided in the base model, and far more business rules can be captured graphically. At design time, a fact-oriented model can be used to derive an ER model, a UML class model, or
a logical database model.
Object Role Modeling (ORM), the main exemplar of the fact-oriented approach, originated in Europe in the mid-1970s (Falkenberg, 1976), and been extensively revised and extended since, along with
commercial tool support (e.g. Halpin, Evans, Hallock & MacLean, 2003). Recently, a major upgrade to
the methodology resulted in ORM 2, a second generation ORM (Halpin 2005; Halpin & Morgan 2008).
Neumont ORM Architect (NORMA), an open source tool accessible online at www.ORMFoundation.
org, is under development to provide deep support for ORM 2 (Curland & Halpin, 2007).
ORM pictures the world simply in terms of objects (entities or values) that play roles (parts in relationships). For example, you are now playing the role of reading, and this prologue is playing the role
of being read. Wherever ER or UML uses an attribute, ORM uses a relationship. For example, the Person.
birthdate attribute is modeled in ORM as the fact type Person was born on Date, where the role played by date
in this relationship may be given the rolename “birthdate”.
ORM is less popular than either ER or UML, and its diagrams typically consume more space because
of their attribute-free nature. However, ORM arguably offers many advantages for conceptual analysis,
as illustrated by the following example, which presents the same data model using the three different
notations.
In terms of expressibility for data modeling, ORM supports relationships of any arity (unary, binary,
ternary or longer), identification schemes of arbitrary complexity, asserted, derived, and semiderived
facts and types, objectified associations, mandatory and uniqueness constraints that go well beyond ER
and UML in dealing with n-ary relationships, inclusive-or constraints, set comparison (subset, equality,
exclusion) constraints of arbitrary complexity, join path constraints, frequency constraints, object and
role cardinality constraints, value and value comparison constraints, subtyping (asserted, derived and
semiderived), ring constraints (e.g. asymmetry, acyclicity), and two rule modalities (alethic and deontic
(Halpin, 2007a)). For some comparisons between ORM 1 and ER and UML see Halpin (2002, 2004).
As well as its rich notation, ORM includes detailed procedures for constructing ORM models and
transforming them to other kinds of models (ER, UML, Relational, XSD etc.) on the way to implementation. For a general discussion of such procedures, see Halpin & Morgan (2008). For a detailed discussion

of using ORM to develop the data model example discussed below, see Halpin (2007b).
Figure 1 shows an ORM schema for a fragment of a book publisher application. Entity types appear
as named, soft rectangles, with simple identification schemes parenthesized (e.g. Books are identified by
their ISBN). Value types (e.g. character strings) appear as named, dashed, soft rectangles (e.g. BookTitle).
Predicates are depicted as a sequence of one or more role boxes, with at least one predicate reading. By
default, predicates are ready left-right or top-down. Arrow tips indicate other predicate reading directions. An asterisk after a predicate reading indicates the fact type is derived (e.g. best sellers are derived
using the derivation rule shown). Role names may be displayed in square brackets next to the role (e.g.
totalCopiesSold).


xxii

Figure 1. Book publisher schema in ORM
has/is of
PersonName
is translated from
has

is of

Gender
(.code)

is authored by

BookTitle
Year
(CE)

Book

(ISBN)
was published in

… in … sold ...
[copiesSoldInYear]
NrCopies

Published
Book*

sold total- *
[totalCopiesSold]

Person
(.nr)

≥2
is assigned for review by
“ReviewAssignment !”

has

resulted in

{‘M’, ‘F’}

is restricted to
PersonTitle
Grade
(.nr)


{1..5}

is a best seller*

Each PublishedBook is a Book that was published in some Year.
* For each PublishedBook, totalCopiesSold= sum(copiesSoldInYear).
* PublishedBook is a best seller iff PublishedBook sold total NrCopies >= 10000.

A bar over a sequence of one or more roles depicts a uniqueness constraint (e.g. each book has at
most one booktitle, but a book may be authored by many persons and vice versa). The external uniqueness constraint (circled bar) reflects the publisher’s policy of publishing at most one book of any given
title in any given year. A dot on a role connector indicates that role is mandatory (e.g. each book has a
booktitle).
Subtyping is depicted by an arrow from subtype to supertype. In this case, the PublishedBook subtype
is derived (indicated by an asterisk), so a derivation rule for it is supplied. Value constraints are placed
in braces (e.g. the possible codes for Gender are ‘M’ and ‘F’).
The ring constraint on the book translation fact type indicates that relationship is acyclic. The exclusion constraint (circled X) ensures that no person may review a book that he or she authors. The
frequency constraint (≥ 2) ensures that any book assigned for review has at least two reviewers. The
subset constraint (circled ⊆) ensures that if a person has a title that is restricted to a specific gender (e.g.
‘Mrs’ is restricted to females), then that person must be of that gender—an example of a constraint on a
conceptual join path. The textual declarations provide a subtype definition and two derivation rules, one
in attribute style (using role names) and one in relational style. ORM schemas can also be automatically
verbalized in natural languages sentences, enabling validation by domain experts without requiring them
to understand the notation (Curland & Halpin, 2007).
Figure 2 depicts the same model in Barker ER notation, supplemented by textual rules (6 numbered
constraints, plus 3 derivations) that cannot be captured in this notation.
Barker ER depicts entity types as named, soft rectangles. Mandatory attributes are preceded by an
asterisk and optional attributes by “o”. An attribute that is part of the primary identifier is preceded by
“#”, and a role that is part of an identifier has a stroke “|” through it.
All relationships must be binary, with each half of a relationship line depicting a role. A crowsfoot

indicates a maximum cardinality of many. A line end with no crowsfoot indicates a maximum cardinality of one. A solid line end indicates the role is mandatory, and a dashed line end indicates the role is
optional. Subtyping is depicted by Euler diagrams with the subtype inside the supertype. Unlike ORM
and UML, Barker ER supports only single inheritance, and requires that the subtyping always forms a
partition.


xxiii

Figure 2. Book publisher schema in Barker ER, supplemented by extra rules
2

a translation of

the translation
source of

BOOK

1

authored by

# * ISBN
* book title
o year published

PERSON

3


an author of

# * person nr
* person name
* gender 4

with

PERSON TITLE

5

# * title name
o restricted gender

of

UNPUBLISHED BOOK
assigned

PUBLISHED BOOK
with

1

allocated

2
3


for

SALES FIGURE

≥2

for

REVIEW ASSIGNMENT

# * year sold
* copies sold in year

o

grade

4

by

5
3
6

(book title, year published) is unique.
The translation relationship is acyclic.
Review Assignment is disjoint with authorship.
Possible values of gender are ‘M’, ‘F’.
Each person with a person title restricted to a gender

has that gender.
Possible values of grade are 1..5.

6

Subtype Definition:
Each Published_Book is a Book where year_published is not null.
Derivation Rules:
Published_Book.totalCopiesSold = sum(Book_Sales_Figure.copies_sold_in_year).
Published_Book.is_a_best seller = totalCopiesSold >= 10000.

Figure 3. Book publisher schema in UML, supplemented by extra rules
title.restrictedGender = self.gender
or
title.restrictedGender -> isEmpty()

acyclic
translationSource
translation

*

0..1
Book

isbn {P}
bookTitle {U1}
yearPublished [0..1] {U1}
Each (yrSold, publishedBook)
combination applies to at most

one SalesFigure

bookAuthored

author
1..*

*

ReviewAssignment
grade [0..1] { value in 1..7 }

SalesFigure
yrSold
copiesSoldInYear

PublishedBook
*

1

Person

nr {P}
bookReviewed reviewer name
*
0, 2..* gender: GenderCode

/isaBestSeller
/totalCopiesSold


yearPublished -> notEmpty().
totalCopiesSold = sum(salesFigure.copiesSoldInYear).
isaBestSeller = (totalCopiesSold >= 10000).

bookAuthored disjoint
with bookReviewed

*
1
Title

«enumeration»
gendercode

name {P}
restrictedGender [0..1]: GenderCode

m
f

Figure 3 shows the same model as a class diagram in UML, supplemented by several textual rules
captured either as informal notes (e.g. acyclic) or as formal constraints in OCL (e.g. yearPublished ->
notEmpty()) or as nonstandard notations in braces (e.g., the {P} for preferred identifier and {Un} for
uniqueness are not standard UML). Derived attributes are preceded by a slash. Attribute multiplicities
are assumed to be 1 (i.e. exactly one) unless otherwise specified (e.g. restrictedGender has a multiplicity
of [0..1], i.e. at most one). A “*” for maximum multiplicity indicates “many”.


xxiv


Figure 4. Book publisher relational schema
1

Book ( isbn, title, [yearPublished], [translationSource] )
2

SalesFigure ( isbn, yearSold, copiesSold )
Authorship ( personNr, isbn )
{1..5}
ReviewAssignment ( personNr, isbn, [grade] )
≥2
{M, F}
6
Person ( personNr, personName, gender, personTitle )
{M, F} 6
TitleRestriction ( personTitle, gender )
3

4

5

View: SoldBook (isbn, totalCopiesSold, isaBestSeller )
acyclic
only where yearPublished exists
3 SalesFigure.isbn
4 sum(copiesSold) from SalesFigure group by isbn
5
totalCopiesSold > 10000

6 not exists(Person join TitleRestriction on personTitle
where Person.gender <> TitleRestriction.gender).

1

2

Part of the problem with the UML and ER models is that in these approaches personTitle and gender
would normally be treated as attributes, but for this application we need to talk about them to capture a
relevant business rule. The ORM model arguably provides a more natural representation of the business
domain, while also formally capturing much more semantics with its built-in constructs, facilitating
transformation to executable code. This result is typical for industrial business domains.
Figure 4 shows the relational database schema obtained by mapping these data schemas via ORM’s
Rmap algorithm (Halpin & Morgan, 2008), using absorption as the default mapping for subtyping. Here
square brackets indicate optional, dotted arrows indicate subset constraints, and a circled “X” depicts
an exclusion constraint. Additional constraints are depicted as numbered textual rules in a high level
relational notation. For implementation, these rules are transformed further into SQL code (e.g. check
clauses, triggers, stored procedures, views).

conclusion
While many kinds of database technology exist, RDBs and ORDBs currently dominate the market, with
XML being increasingly used for data exchange. While ER is still the main conceptual modeling approach
for designing databases, UML is gaining a following for this task, and is already widely used for object
oriented code design. Though less popular than ER or UML, the fact-oriented approach exemplified by
ORM has many advantages for conceptual data analysis, providing richer coverage of business rules,
easier validation by business domain experts, and semantic stability (ORM models and queries are unimpacted by changes that require one to talk about an attribute). Because ORM models may be used to
generate ER and UML models, it may also be used in conjunction with these if desired.



×