
The J-measure compares the a priori distribution of X (a binary variable; either the antecedent holds (X = x) or not (X = x̄)) with the a posteriori distribution of X given that Y = y. The relative information

$$j(X \mid Y = y) \;=\; \sum_{z \in \{x,\bar{x}\}} P(X = z \mid Y = y)\,\log_2 \frac{P(X = z \mid Y = y)}{P(X = z)}$$
yields the instantaneous information that Y = y provides about X (j is also known as the Kullback–Leibler divergence). When applying the rule multiple times, on average we obtain the information J(X|Y=y) = P(Y=y) · j(X|Y=y), which is the J-value of the rule and is bounded by 0.53 bit. The drawback is, however, that highly infrequent rules do not carry much information on average (due to the factor P(Y=y)), such that highly interesting but rarely occurring associations may not appear among the top-ranked rules.
Other measures are conviction (a “directed”, asymmetric lift) (Brin et al., 1997B), certainty factors from MYCIN (Berzal et al., 2001), correlation coefficients from statistics (Tan and Kumar, 2002), and Laplace or Gini measures from rule induction (Clark and Boswell, 1991) or decision tree induction (Breiman et al., 1984). For a comparison of various measures of interestingness the reader is referred to (Hilderman and Hamilton, 2001), which also discusses general properties that rule measures should have. In (Bayardo and Agrawal, 1999) it is shown that, given a fixed consequent, the ordering of rules obtained from confidence is identical to that obtained by lift or conviction (which is further generalized in (Bayardo et al., 1999)).
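For concreteness, the following sketch computes support, confidence, lift, and conviction of a rule over a toy transaction database (names and data are illustrative; the formulas for lift and conviction are the standard ones, lift = conf/supp(C) and conviction = (1 − supp(C))/(1 − conf)):

```python
def rule_measures(transactions, antecedent, consequent):
    """Support, confidence, lift and conviction for a rule A -> C,
    over a list of transactions given as sets of items (a sketch)."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    supp_c  = sum(c <= t for t in transactions) / n
    supp_a  = sum(a <= t for t in transactions) / n
    supp_ac = sum(a | c <= t for t in transactions) / n
    conf = supp_ac / supp_a
    lift = conf / supp_c
    # conviction: a "directed" lift; infinite for 100%-confidence rules
    conv = float('inf') if conf == 1 else (1 - supp_c) / (1 - conf)
    return supp_ac, conf, lift, conv

db = [{'a','b','c'}, {'a','b'}, {'b','c'}, {'a','c'}, {'a','b','c'}]
print(rule_measures(db, {'a'}, {'b'}))  # (0.6, 0.75, 0.9375, 0.8)
```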
15.4.2 Interactive or Knowledge-Based Filtering
Whatever the rule evaluation measure may propose, the final judgment about the interestingness and usefulness of a rule is made by the human expert or user. For instance, many measures consistently return those rules as most interesting that consist of a single item in the consequent, because in this case confidence is maximized (see Section 15.2.1). But the user may be interested in different items or item combinations in the consequent, so the subjective interestingness of these rules may be low in some applications. Indeed, all measures of interestingness rely on statistical properties of the items and do not take background information into account. Background knowledge about the presence or absence of correlation between items may alert a human expert to rules that are indistinguishable from others if one considers only the interestingness rating provided by statistical measures.
Given the large number of rules, as a first step a visualization and navigation tool
may help to quickly find interesting rules. In (Klemettinen et al., 1994) rule templates are proposed, allowing the user to restrict the set of rules syntactically. Some visualization techniques are also presented, such as displaying rules in a graph where items are represented by nodes and rules by edges leading from the antecedent to the consequent attributes. The thickness of an edge may illustrate the rating of the rule.
The idea of (Dong and Li, 1998) is to compare the performance of a rule against the performance of similar rules and to flag it as interesting only if it deviates clearly from them. A distance measure for rules is used to define the neighborhood of a rule (containing all rules within a certain distance). Several possibilities to flag a rule as interesting are discussed: it may qualify by an unexpectedly high confidence value, if its confidence deviates clearly from the average confidence in its neighborhood, or by an unexpectedly sparse neighborhood, if the number of mined rules is small compared to the number of possible rules in the neighborhood. Another rule filtering approach is proposed in (Liu et al., 1999), where statistical correlation is used to define the direction of a rule: basically, a rule has direction 1 / 0 / −1 if antecedent and consequent are positively correlated / uncorrelated / negatively correlated. From the set of predecessors of a rule, an expected direction can be derived (e.g., if all subrules have direction 0, we may expect an extended rule to also have direction 0), and a rule is flagged as interesting if its direction deviates from this expected direction (closely related to (Brin et al., 1997A), see Section 15.4.5).
A completely different approach is to let the expert formalize her or his domain knowledge in advance, which can then be used to test the newly discovered rules against the expert’s beliefs. In (Padmanabhan and Tuzhilin, 1998) the Apriori algorithm is extended to find rules that contradict predefined rules. Only contradictory rules are then reported, because they represent potentially new information to the expert.
The first interactive approaches were designed as post-processing systems, which generate all rules first and then allow for fast navigation through the rules. The idea of rule template matching can be extended to a rule query language that is supported by the mining algorithm (Srikant et al., 1997). In contrast to the post-processing approach, such integrated querying will be faster if only a few queries are posed, and it may succeed in situations where a complete enumeration of all rules fails, e.g. if the minimum support threshold is very low but many other constraints can be exploited to limit the search space. The possibility of posing additional constraints is therefore
crucial for successful interactive querying (see Section 15.4.4). In (Goethals and Van
den Bussche, 2000) the user specifies items that must or must not appear in the an-
tecedent or consequent of a rule. From this set of constraints, a reduced database
(reduced in the size and number of transactions) is constructed, which is used for
faster itemset enumeration. The generation of a working copy of the database is ex-
pensive, but may pay off for repeated queries, especially since intermediate results of
earlier queries are reused. The technique is extended to process full-fledged Boolean
expressions.
15.4.3 Compressed Representations
Possible motivations for using compressed or condensed representations of rules include:
• Reduction of storage needs and, if possible, computational costs
• Derivation of new rules by condensing items to meta-items
• Reduction of the number of rules to be evaluated by an expert
Using a compressed rule set is motivated by problems occurring with dense databases (see Section 15.4.4): Suppose we have found a frequent itemset r containing an item i. If we include n additional items in our database that are perfectly correlated to item i, 2^n − 1 variations of itemset r will be found. As another example, suppose the database consists of a single transaction {a_1, ..., a_n} and min_supp = 1; then 2^n − 1 frequent subsets will be generated to discover {a_1, ..., a_n} as a frequent itemset. In
such cases, a condensed representation (Mannila and Toivonen, 1996) can help to reduce the size of the rule set. An itemset X is called closed if there is no superset X′ that is contained in every transaction containing X. In the latter example, only a single closed frequent itemset would be discovered. The key to finding closed itemsets is the fact that the support value remains constant, regardless of which subset of the n items is contained in the itemset. Therefore, from the frequent closed itemsets, all frequent itemsets can be reconstructed, including their support values, which makes closed itemsets a lossless representation. Algorithms that compute closed frequent itemsets are, to name just a few, Close (Pasquier et al., 1999), Charm (Zaki, 2000), and CLOSET+ (Wang et al., 2003). An overview of lossless representations can be found in (Kryszkiewicz, 2001).
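To make the notion of closedness concrete, the following brute-force sketch enumerates closed frequent itemsets by checking the superset condition directly (illustrative only; the cited algorithms avoid this exponential enumeration):

```python
from itertools import combinations

def closed_frequent_itemsets(transactions, min_supp):
    """Brute-force enumeration of closed frequent itemsets: an itemset is
    closed iff no proper superset has the same support (a sketch)."""
    items = sorted(set().union(*transactions))
    support = {}
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            s = sum(set(cand) <= t for t in transactions)
            if s >= min_supp:
                support[frozenset(cand)] = s
    return {x: s for x, s in support.items()
            if not any(x < y and s == sy for y, sy in support.items())}

db = [{'a','b','c'}] * 3  # a single distinct transaction, min_supp = 3
print(closed_frequent_itemsets(db, 3))  # only {a,b,c} is closed
```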
Secondly, for a rule to become more meaningful, it may be helpful to consider several items in combination. An example is the use of product taxonomies in market basket analysis, where many rules that differ only in one item, say apple-juice, orange-juice, and cherry-juice, may be outperformed (and thus replaced) by a single rule using the generalized meta-item fruit-juice. Such meta-items can be obtained from an a priori known product taxonomy (Han and Fu, 1995; Srikant and Agrawal, 1995). If such a taxonomy is not given, one may want to automatically discover disjunctive combinations of items that optimize the rule evaluation (see (Zelenko, 1999) for disjunctive combinations of items, or (Höppner, 2002) for generalization in the context of sequences).
Finally, the third motivation addresses the combinatorial explosion of rules that can be generated from a single (closed) frequent itemset. Since the expert herself knows best which items are interesting, why not present associations – as a compressed representation of many rules – rather than individual rules? In particular when sequences are mined, the order of items is fixed in the sequence, such that we have only two degrees of freedom: (1) remove individual items before building a rule, and (2) select the location which separates antecedent from consequent items. In (Höppner, 2002) every location in a frequent sequence is annotated with the J-value that can at least be achieved in predicting an arbitrary subset of the consequent. This series of J-values characterizes the predictive power of the whole sequence and thus represents a condensed representation of many association rules. The tuple of J-values can also be used to define an order on sequences, allowing for a ranking of sequences rather than just rules.
15.4.4 Additional Constraints for Dense Databases
Suppose we have extracted a set of rules R from database D. Now we add a very frequent item f randomly to our transactions in D and derive rule set R′. Due to the randomness of f, the support and confidence values of the rules in R do not change substantially. Besides the rules in R, we will also obtain rules that contain f either in the antecedent or the consequent (with very similar support/confidence values). Even worse, since f is very frequent, almost every subset of I will be a good predictor of f, adding another 2^|I| rules to R. This shows that the inclusion of f increases the size of R′ dramatically, while none of these rules is actually worth investigating. In the market basket application, such items f usually do not exist, but they come easily into play if customer data is included in the analysis. For dense databases the bottom-up approach of enumerating all frequent itemsets quickly turns out to be infeasible. In this section some approaches that limit the search space further to reduce time and space requirements are briefly reviewed. (In contrast to the methods in the next section, the minimum support threshold is not necessarily dropped.)
In (Webb, 2000) it is argued that a user is not able to investigate many thousands of rules and will therefore be satisfied if, say, only the top 1000 rules are returned. Then optimistic pruning can be used to prune those parts of the search space that cannot contribute to the result, since even under optimistic conditions there is no chance of getting better than the best 1000 rules found so far. This significantly limits the size of the space that actually has to be explored. The method requires, however, that the search run completely in main memory and may therefore not be applicable to very large databases. A modification for the case of numerical attributes (called impact rules) is proposed in (Webb, 2001).
In (Bayardo et al., 1999) the set of rules is reduced by listing specializations of a rule only if they improve the confidence of a common base rule by more than a threshold min_imp. The improvement of a rule is defined as

$$\mathrm{imp}(A \to C) \;=\; \min\{\, \mathrm{conf}(A \to C) - \mathrm{conf}(A' \to C) \mid A' \subset A \,\}$$
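For illustration, a brute-force computation of this improvement over all proper sub-antecedents might look as follows (a sketch; conf is the usual confidence over a list of transaction sets, and all names are ours):

```python
from itertools import combinations

def conf(transactions, antecedent, consequent):
    """Confidence of A -> C over a list of transactions (sets of items)."""
    n_a  = sum(antecedent <= t for t in transactions)
    n_ac = sum(antecedent | consequent <= t for t in transactions)
    return n_ac / n_a if n_a else 0.0

def improvement(transactions, antecedent, consequent):
    """imp(A -> C): smallest confidence gain over every proper subset A' of A.
    Note that A' may be empty, in which case conf(A' -> C) = supp(C)."""
    base = conf(transactions, antecedent, consequent)
    subs = (set(s) for k in range(len(antecedent))
            for s in combinations(antecedent, k))
    return min(base - conf(transactions, s, consequent) for s in subs)
```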
These additional constraints must be exploited during rule mining, since enumerating all frequent itemsets first and applying the constraints thereafter is not feasible. The key to the efficiency of DenseMiner is the derivation of upper bounds for confidence, improvement, and support, which are then used for optimistic pruning of itemset blocks. This saves the algorithm from determining the support of many frequent itemsets whose rules would miss the confidence or improvement constraints anyway. This work is enhanced in (Bayardo and Agrawal, 1999) to mine only those rules that are optimal according to a partial order of rules based on support and confidence. It is then shown that the optimal rules under various interestingness measures are included in this set. In practice, while the set of obtained rules is quite manageable, it characterizes only a specific subset of the database records (Bayardo and Agrawal, 1999). An alternative partial order is proposed (based on the subsumption of the transactions supported by rules) that better characterizes the database, but also leads to much larger rule sets.
One of the appealing properties of the Apriori algorithm is that it is complete in
the sense that all association rules will be discovered. We have seen that, on the other
hand, this is at the same time one of its drawbacks, since it leads to an unmanageably
large rule set. If not completeness but quality of the rules has priority, many of the rule induction techniques from statistics and machine learning may also be applied to discover association rules. As an example, in (Friedman and Fisher, 1999) local regions in the data (corresponding to associations) are sought where the distribution deviates from the distribution expected under the independence assumption. Starting with a box containing the whole data set, the box is iteratively shrunk by limiting the set of admissible values for each variable (removing items from categorical attributes or shifting the borders of the interval of admissible values for numerical attributes). Among all possible ways to shrink the box, the one that exhibits the largest deviation from the expected distribution is selected. The refinement is stopped as soon as the support falls below a minimum support threshold. Having arrived at a box with a large deviation, a second phase tries to maximize the support of the box by enlarging the range of admissible values again, without compromising the high deviation found so far. From a box found in this way, association rules may be derived by exploring the correlation among the variables. The whole approach is not restricted to discovering associations, but finds local maxima of some objective function, such that it could also be applied to rule induction for numerical variables. Compared to the boxes found by this approach, the rules obtained from partitioning methods (such as decision or regression trees) are much more complex and usually have much smaller support.
Many of the approaches in the next section can also be used for dense databases, because they do not enumerate all frequent itemsets either. Similarly, for some techniques in this section the minimum support threshold may be dropped for some databases, but in (Webb, 2000), (Bayardo and Agrawal, 1999) and (Friedman and Fisher, 1999) minimum support thresholds have been exploited.
15.4.5 Rules without Minimum Support
For some applications, the concept of a minimum support threshold is a rather artificial constraint, primarily required for the efficient computation of large frequent itemsets. Even in market basket analysis one may miss interesting associations, e.g. between caviar and vodka (Cohen et al., 2001). Therefore, alternatives to Apriori have been developed that do not require a minimum support threshold for association rule enumeration. We discuss some of these approaches briefly in this section.
The observation in Section 15.2.1 characterizes support as being downward closed: if an itemset has minimum support, all its subsets also have minimum support. We have seen that the properties of a closure led to an efficient algorithm, since they provided powerful pruning capabilities. In (Brin et al., 1997A) the upward closure of correlation (measured by the chi-square test) is proven and exploited for mining. Since correlation is upward closed, to utilize it in a level-wise algorithm the inverse property is used: for a (k+1)-itemset to be uncorrelated, all of its k-subsets must be uncorrelated. The uncorrelated itemsets are then generated and verified in a similar fashion (“has minimum support” is substituted by “is uncorrelated”) as in the Apriori algorithm (Figure 15.2). The interesting output, however, is not the set of uncorrelated items, but the set of correlated items. The border between both sets is identified during the database pass: whenever a candidate k-itemset has turned out to be correlated, it is stored in a set of minimally correlated itemsets (otherwise it is stored in a set of uncorrelated k-itemsets and used for pruning in the next stage). From the set of minimally correlated itemsets we know that all supersets are also correlated (upward closure), so the partitioning into correlated and uncorrelated itemsets is complete. Note that in contrast to Apriori and its derivatives, no rules but only sets of correlated attributes are provided, and no ranking of the correlation can guide the user in a manual investigation (see Section 15.2.2).
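As an illustration of the correlation test, the following sketch computes the chi-square statistic for a pair of items from their 2×2 contingency table (a simplified version of the test used in (Brin et al., 1997A); names and example data are ours):

```python
def chi_square_2x2(transactions, x, y):
    """Chi-square statistic for the 2x2 contingency table of items x and y;
    a large value indicates correlation (a sketch)."""
    n = len(transactions)
    n_x = sum(x in t for t in transactions)
    n_y = sum(y in t for t in transactions)
    n_xy = sum(x in t and y in t for t in transactions)
    # observed counts for (x present/absent) x (y present/absent)
    obs = [n_xy, n_x - n_xy, n_y - n_xy, n - n_x - n_y + n_xy]
    # expected counts under independence
    exp = [n_x * n_y / n, n_x * (n - n_y) / n,
           (n - n_x) * n_y / n, (n - n_x) * (n - n_y) / n]
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp) if e > 0)

db = [{'caviar', 'vodka'}, {'vodka'}, {'bread'}, {'bread', 'milk'}] * 5
print(chi_square_2x2(db, 'caviar', 'vodka'))  # ~6.67, above the 95% cutoff 3.84
```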
An interesting subproblem is that of finding all rules with a very high confidence (but no constraints on support), as proposed in (Li et al., 1999). The key idea is, given a consequent, to subdivide the database D into two parts: D_1, containing all transactions that contain the consequent, and D_2, containing those that do not. Itemsets that occur in D_1 but not at all in D_2 can be used as antecedents of 100%-confidence rules. A variation for rules with high (but not perfect) confidence is also proposed.
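The partitioning idea can be sketched as follows (a brute-force illustration with a hypothetical size limit on the antecedent; the cited algorithm is considerably more refined):

```python
from itertools import combinations

def perfect_confidence_antecedents(transactions, consequent, max_size=2):
    """Antecedents A of 100%-confidence rules A -> {consequent}: itemsets that
    occur in D1 (transactions containing the consequent) but never in D2.
    max_size is an illustrative limit, not taken from the paper."""
    d1 = [t for t in transactions if consequent in t]
    d2 = [t for t in transactions if consequent not in t]
    items = sorted(set().union(*d1) - {consequent}) if d1 else []
    result = []
    for k in range(1, max_size + 1):
        for cand in combinations(items, k):
            a = set(cand)
            if any(a <= t for t in d1) and not any(a <= t for t in d2):
                result.append(a)
    return result

db = [{'caviar','vodka'}, {'vodka','bread'}, {'bread'}, {'caviar','vodka','bread'}]
print(perfect_confidence_antecedents(db, 'vodka'))  # [{'caviar'}, {'bread','caviar'}]
```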
Other proposals also address the enumeration of associations (or association rules), but have almost nothing in common with the Apriori algorithm. Such an approach is presented in (Cohen et al., 2001), which relies heavily on the power of randomization. Associations are identified via correlation or similarity between attributes X and Y: S(X,Y) = supp(X ∩ Y) / supp(X ∪ Y). The approach is restricted to the identification of pairs of similar attributes, from which groups of similar variables may be identified (via transitivity). Since only 2-itemsets are considered and due to the absence of a minimum support threshold, considerable effort is undertaken to prune the set of candidate 2-patterns of size |I|². (In (Brin et al., 1997A) this problem is attacked by additionally introducing a support threshold guided by the chi-square test requirements.) The approach uses so-called signatures for each column, which are calculated in a first database scan. Several proposals for signatures are made; the simplest defines a random order on the rows, and the signature itself is simply the first row index (under this ordering) in which the column has a 1. The probability that two columns X and Y have the same signature is proportional to their similarity S(X,Y). To estimate this probability, k independently drawn orders are used to calculate k individual signatures. The estimated probabilities are used for pruning, and exact similarities are calculated during a second pass.
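This signature idea corresponds to what is now known as min-hashing; a minimal sketch of the simplest scheme, with explicit random row orders and illustrative names, might look like this:

```python
import random

def minhash_signatures(columns, n_rows, k=50, seed=0):
    """For each of k random row orders, the signature of a column is the first
    row (under that order) in which the column has a 1.  The fraction of
    matching signatures estimates S(X,Y) = |X ∩ Y| / |X ∪ Y| (a sketch)."""
    rng = random.Random(seed)
    orders = [rng.sample(range(n_rows), n_rows) for _ in range(k)]
    return {name: [min(rows, key=order.__getitem__) for order in orders]
            for name, rows in columns.items()}

def estimated_similarity(sig_x, sig_y):
    return sum(a == b for a, b in zip(sig_x, sig_y)) / len(sig_x)

# columns given as sets of row indices in which the item occurs
cols = {'X': {0, 1, 2, 3}, 'Y': {1, 2, 3, 4}}      # true similarity 3/5
sigs = minhash_signatures(cols, n_rows=5, k=200)
print(estimated_similarity(sigs['X'], sigs['Y']))   # close to 0.6
```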
The idea of Mambo (Castelo et al., 2001) is that only those attributes should be considered in the antecedent of a rule that directly influence the consequent attribute, which is reflected by conditional independencies: if X and Y are conditionally independent given Z (written I(X,Y|Z)), then once we know Z, Y (or X) does not tell us anything of importance about X (or Y):

$$I(X,Y \mid Z) \iff P(X,Y \mid Z) = P(X \mid Z)\,P(Y \mid Z)$$
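A naive frequency-based check of such an independence statement over a small categorical table might look as follows (a sketch with illustrative names; Mambo itself estimates the relevant independencies statistically, as discussed below):

```python
from collections import Counter

def conditionally_independent(rows, x, y, z, tol=1e-9):
    """Empirical check of I(X,Y|Z) over a list of records (dicts mapping
    variable name -> value): P(x,y|z) must equal P(x|z) * P(y|z) for every
    combination of observed values (a sketch; real data needs a statistical
    test rather than exact equality)."""
    n_z   = Counter(r[z] for r in rows)
    n_xz  = Counter((r[x], r[z]) for r in rows)
    n_yz  = Counter((r[y], r[z]) for r in rows)
    n_xyz = Counter((r[x], r[y], r[z]) for r in rows)
    xs, ys = {r[x] for r in rows}, {r[y] for r in rows}
    for vz, nz in n_z.items():
        for vx in xs:
            for vy in ys:
                p_xy_z = n_xyz[(vx, vy, vz)] / nz
                p_x_z  = n_xz[(vx, vz)] / nz
                p_y_z  = n_yz[(vy, vz)] / nz
                if abs(p_xy_z - p_x_z * p_y_z) > tol:
                    return False
    return True
```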
As an example from automobile manufacturing (Rokach and Maimon, 2006), consider the variables “car model”, “top”, and “color”. If, for a certain car type, a removable top is available, the top’s color may be restricted to only a few possibilities, which consequently restricts the available colors for the car as a whole. The car model influences the color only indirectly; therefore we do not want to see rules leading from car model to color, but only from top to color. If conditional independencies are known, they can be used to prevent a rule miner from listing such rules. The idea of Mambo is to find these independencies and use them for pruning. A minimal set of variables MB for a variable X such that for all variables Y ∉ MB ∪ {X} we have I(X,Y|MB) is called a Markov blanket of X. The blanket shields X from the remaining variables. The difficulty of this approach lies in the estimation of the Markov blankets, which are obtained by a Markov chain Monte Carlo method.
In (Aumann and Lindell, 2003) an approach for mining rules from one numerical attribute to another numerical attribute is proposed that also does not require a minimum support threshold. The numerical attribute in the antecedent is restricted by an interval, and the consequent characterizes the population mean in this case. A rule is generated only if the mean of the selected subset of the population differs significantly from the overall mean (according to a Z-test), and a “minimum difference” threshold is used to avoid enumerating rules which are significant but only marginally different. The idea is to sort the database according to the attribute in the antecedent (which is, however, a costly operation). Then any set of consecutive transactions in the sorted database whose consequent values are above or below average gives a new rule. They also make use of a closure property: if two neighboring intervals [a,b] and [b,c] lead to a significant change in the consequent variable, then so does the union [a,c]. This property is exploited to list only rules with maximal intervals in the antecedent.
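A minimal sketch of the significance filter for such a quantitative rule might look as follows (assuming a Z-test of the subpopulation mean against the overall mean; parameter names and thresholds are illustrative, not taken from the paper):

```python
import math

def quantitative_rule_significant(values_in_interval, all_values,
                                  z_cutoff=1.96, min_diff=0.0):
    """Test whether the rule 'x in [a,b] -> mean(y) deviates' is reportable:
    the subpopulation mean must differ significantly (Z-test) and by more
    than min_diff from the overall mean (a sketch)."""
    n = len(values_in_interval)
    mean_sub = sum(values_in_interval) / n
    mean_all = sum(all_values) / len(all_values)
    var_all = sum((v - mean_all) ** 2 for v in all_values) / len(all_values)
    z = (mean_sub - mean_all) / math.sqrt(var_all / n)
    return abs(z) > z_cutoff and abs(mean_sub - mean_all) > min_diff
```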
15.5 Conclusions
We have discussed the common basis for many approaches to association rule mining, the Apriori algorithm, which gained its attractiveness and popularity from its simplicity. Simplicity, on the other hand, always implies some insufficiencies and leaves room for various (no longer simple) improvements. Since the first papers about association rule mining were published, the number of papers in this area has exploded and it is almost impossible to keep track of all the different proposals. Therefore, the overview provided here is necessarily incomplete. Over time, the focus has shifted from sparse (market basket) data to (general-purpose) dense data, and from reasonably large itemsets (promoting less than 0.1% of the customers in a supermarket is probably not worth the effort) to small patterns, which represent deviations from the mean population (e.g. a small set of most profitable customers who are to be mailed directly). Then it may be desirable to get rid of constraints such as minimum support, which possibly hide some interesting patterns in the data. On the other hand, the smaller the support, the higher the probability of observing an incidental rather than a meaningful deviation from the average population, especially when taking the size of the database into account (Bolton et al., 2002). A major issue for upcoming research is therefore to limit the findings to substantive “real” patterns.
References
Agrawal R. and Srikant R. Fast Algorithms for mining association rules. Proc. Int. Conf. on
Very Large Databases, 487-499, 1994
Agrawal R. and Srikant R. Mining Sequential Patterns. Proc. Int. Conf. on Data Engineering,
3-14, 1995.
Agrawal R., Mannila H., Srikant R., Toivonen H., Verkamo A.I. Fast Discovery of Association Rules. In: Advances in Knowledge Discovery and Data Mining, Fayyad U.M., Piatetsky-Shapiro G., Smyth P., Uthurusamy R. (eds.), AAAI Press / The MIT Press, 307-328, 1996
Aumann Y., Lindell, Y. A Statistical Theory for Quantitative Association Rules, Journal of
Intelligent Information Systems, 20(3):255-283, 2003
Bayardo R.J., Agrawal R. Mining the Most Interesting Rules. Proc. ACM SIGKDD Int. Conf.
on Knowledge Discovery and Data Mining, 145-154, 1999
Bayardo R.J., Agrawal R., Gunopulos D. Constraint-Based Rule Mining in Large, Dense Databases. Proc. 15th Int. Conf. on Data Engineering, 188-197, 1999
Berzal F., Blanco I., Sanchez D., Vila M.A. A New Framework to Assess Association Rules.
Proc. Symp. on Intelligent Data Analysis, LNCS 2189, 95-104, Springer, 2001
Bolton R., Hand D.J., Adams, N.M. Determining Hit Rate in Pattern Search, Proc. Pattern
Detection and Discovery, LNAI 2447, 36-48, Springer, 2002
Borgelt C., Berthold M. Mining Molecular Fragments: Finding Relevant Substructures of
Molecules, Proc. Int. Conf. on Data Mining, 51-58, 2002
Breiman L., Friedman J., Olshen R., Stone C. Classification and Regression Trees. Chapman
& Hall, New York, 1984.
Brijs T., Swinnen G., Vanhoof K. and Wets G. Using Association Rules for Product Assort-
ment Decisions: A Case Study. Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery
and Data Mining, 254-260, 1999
Brin S., Motwani R., and Silverstein C. Beyond market baskets: Generalizing association
rules to correlations. Proc. ACM SIGMOD Int. Conf. Management of Data, 265-276,
1997
Brin S., Motwani R., Ullman, J.D., Tsur, S. Dynamic itemset counting and implication rules
for market basket data. SIGMOD Record 26(2):255-264, 1997
Castelo R., Feelders A., Siebes A. Mambo: Discovering Association Rules. Proc. Symp. on
Intelligent Data Analysis. LNCS 2189, 289-298, Springer, 2001
Clark P., Boswell R. Rule Induction with CN2: Some recent improvements. In Proc. Euro-
pean Working Session on Learning EWSL-91, 151-163, 1991

Cohen E., Datar M., Fujiwara S., Gionis A., Indyk P., Motwani R., Ullman J., Yang C. Finding Interesting Associations without Support Pruning. IEEE Transactions on Knowledge and Data Engineering 13(1):64-78, 2001
Dehaspe L., Toivonen H. Discovery of Frequent Datalog Patterns. Data Mining and Knowl-
edge Discovery, 3(1):7-38,1999
Delgado M., Martin-Bautista M.J., Sanchez D., Vila M.A. Mining Text Data: Features and
Patterns. Proc. Pattern Detection and Discovery, LNAI 2447, 140-153, Springer, 2002
Dong G., Li J. Interestingness of Discovered Association Rules in Terms of Neighbourhood-
Based Unexpectedness. Proc. Pacific Asia Conf. on Knowledge Discovery in Databases,
LNAI 1394, 72-86, 1998
Friedman J.H., Fisher N.I. Bump Hunting in High-Dimensional Data. Statistics and Com-
puting 9(2), 123-143, 1999
Gamberger D., Lavrac N., Jovanoski, V. High Confidence Association Rules for Medical
Diagnosis, Proc. of Intelligent Data Analysis in Medical Applications (IDAMAP), 42-
51, 1999
Goethals B., Van den Bussche, J. On Supporting Interactive Association Rule Mining, Proc.
Int. Conf. Data Warehousing and Knowledge Discovery,
LNCS 1874, 307-316, Springer, 2000
Han J., Fu Y. Discovery of Multiple-Level Association Rules from Large Databases. Proc. Int. Conf. on Very Large Databases, 420-431, 1995
Hilderman R.J. and Hamilton H.J. Knowledge Discovery and Measures of Interest. Kluwer
Academic Publishers, 2001
Höppner F., Klawonn F. Finding Informative Rules in Interval Sequences. Intelligent Data Analysis, 6(3):237-256, 2002
Höppner F. Discovery of Core Episodes from Sequences. Proc. Pattern Detection and Discovery, LNAI 2447, 199-213, Springer, 2002
Klemettinen M., Mannila H., Ronkainen P., Toivonen H., Verkamo A.I. Finding Interesting Rules from Large Sets of Discovered Association Rules. Proc. Int. Conf. on Information and Knowledge Management, 401-407, 1994
Koperski K., Han J. Discovery of Spatial Association Rules in Geographic Information
Databases. Proc. Int. Symp Advances in Spatial Databases, LNCS 951, 47-66, 1995
Kryszkiewicz M. Concise Representation of Frequent Patterns based on Disjunction-free Generators. Proc. Int. Conf. on Data Mining, 305-312, 2001
Lee C.H., Lin C.R., Chen M.S. On Mining General Temporal Association Rules in a Publi-
cation Database. Proc. Int. Conf. on Data Mining, 337-344, 2001
Li J., Zhang X., Dong G., Ramamohanarao K., Sun Q. Efficient Mining of High Confidence
Association Rules without Support Thresholds. Proc. Principles of Data Mining and
Knowledge Discovery, 406-411, 1999
Liu B., Hsu W., Ma Y. Pruning and Summarizing the Discovered Associations. Proc. ACM
SIGKDD Conf. Knowledge Discovery and Data Mining, 125-134, 1999
Mannila H., Toivonen H. Multiple uses of frequent sets and condensed representations. Proc.
ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 189-194, 1996
Mannila H., Toivonen H., Verkamo A.I. Discovery of Frequent Episodes in Event Sequences.
Data Mining and Knowledge Discovery, 1(3):259-289, 1997
Miller R.J., Yang Y. Association Rules over Interval Data. Proc. Int. Conf. on Management
of Data, 452-461, 1997
Mitchell T., Machine Learning, McGraw-Hill, 1997
Mobasher B., Dai H., Luo T., Nakagawa, M. Discovery and Evaluation of Aggregate Usage
Profile for Web Personalization, Data Mining and Knowledge Discovery 6:61-82, 2002
Padmanabhan B., Tuzhilin A. A Belief-Driven Method for Discovering Unexpected Patterns.
Int. Conf. on Knowledge Discovery and Data Mining, 94-100, 1998

Pasquier N., Bastide Y., Taouil R., Lakhal L. Efficient Mining of association rules using
closed itemset lattices. Information Systems, 24(1):25-46, 1999
Piatetsky-Shapiro, G. Discovery, Analysis, and Presentation of Strong Rules. Proc. Knowl-
edge Discovery in Databases, 229-248, 1991
Rokach, L., Averbuch, M. and Maimon, O., Information retrieval system for medical narra-
tive reports. Lecture notes in artificial intelligence, 3055. pp. 217-228, Springer-Verlag
(2004).
Rokach L. and Maimon O. Data mining for improving the quality of manufacturing: A feature set decomposition approach. Journal of Intelligent Manufacturing 17(3):285-299, 2006
Smyth P. and Goodman, R.M. An Information Theoretic Approach to Rule Induction from
Databases. IEEE Trans. Knowledge Discovery and Data Engineering, 4(4):301-316,
1992
Srikant R., Agrawal R. Mining Generalized Association Rules. Proc. Int. Conf. on Very
Large Databases, 407-419, 1995
Srikant R., Agrawal R. Mining Quantitative Association Rules in Large Relational Tables.
Proc. ACM SIGMOD Conf. on Management of Data, 1-12, 1996
Srikant R., Vu Q., Agrawal R. Mining Association Rules with Constraints. Proc. Int. Conf.
Knowledge Discovery and Data Mining, 66-73, 1997
Tan P.N., Kumar V. Selecting the Right Interestingness Measure for Association Patterns,
Proc. ACM SIGKDD Conf. Knowledge Discovery and Data Mining, 32-41, 2002
Tsoukatos, I. and Gunopulos, D. Efficient Mining of Spatiotemporal Patterns. Proc. Int.
Symp. Spatial and Temporal Databases, LNCS 2121, pages 425-442, 2001
Wang J., Han J., Pei J. CLOSET+: Searching for the Best Strategies for mining Frequent
Closed Itemsets, Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery and Data
Mining, 236-245, 2003
Webb G.I. Efficient Search for Association Rules. Proc. ACM SIGKDD Int. Conf. on Knowl-
edge Discovery and Data Mining, 99-107, 2000
Webb G.I. Discovering Associations with Numeric Variables. Proc. ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 383-388, 2001
Zaki M.J. Generating non-redundant Association Rules. In Proc. ACM
SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 34-43, 2000
Zaki M.J. SPADE: An Efficient Algorithm for Mining Frequent Sequences. Machine Learn-
ing 42(1):31-60, 2001
Zelenko D. Optimizing Disjunctive Association Rules. Proc. of Int. Conf. on Principles of
Data Mining and Knowledge Discovery, LNAI 1704, 204-213, 1999
