
Expert Systems with Applications 41 (2014) 4716–4729


Efficient strategies for parallel mining class association rules
Dang Nguyen a, Bay Vo b,*, Bac Le c

a University of Information Technology, Vietnam National University, Ho Chi Minh, Viet Nam
b Information Technology Department, Ton Duc Thang University, Ho Chi Minh, Viet Nam
c Department of Computer Science, University of Science, Vietnam National University, Ho Chi Minh, Viet Nam

article info

Keywords:
Associative classification
Class association rule mining
Parallel computing
Data mining
Multi-core processor

abstract
Mining class association rules (CARs) is an essential but time-intensive task in Associative Classification (AC). A number of algorithms have been proposed to speed up the mining process. However, sequential algorithms are not efficient for mining CARs in large datasets, while existing parallel algorithms require communication and collaboration among computing nodes, which introduces the high cost of synchronization. This paper addresses these drawbacks by proposing three efficient approaches for mining CARs in large datasets relying on parallel computing. To date, this is the first study to implement an algorithm for parallel mining CARs on a computer with a multi-core processor architecture. The proposed parallel algorithm is theoretically proven to be faster than existing parallel algorithms. The experimental results also show that our proposed parallel algorithm outperforms a recent sequential algorithm in mining time.
© 2014 Elsevier Ltd. All rights reserved.

1. Introduction
Classification is a common topic in machine learning, pattern
recognition, statistics, and data mining. Therefore, numerous approaches based on different strategies have been proposed for
building classification models. Among these strategies, Associative
Classification (AC), which uses the associations between itemsets
and class labels (called class association rules), has proven to be more accurate than traditional methods such as C4.5
(Quinlan, 1993) and ILA (Tolun & Abu-Soud, 1998; Tolun, Sever,
Uludag, & Abu-Soud, 1999). The problem of classification based
on class association rules is to find the complete set of CARs which
satisfy the user-defined minimum support and minimum confidence thresholds from the training dataset. A subset of CARs is
then selected to form the classifier. Since its first introduction in
(Liu, Hsu, & Ma, 1998), numerous approaches have been proposed to solve this problem. Examples include the classification
based on multiple association rules (Li, Han, & Pei, 2001), the classification model based on predictive association rules (Yin & Han,
2003), the classification based on the maximum entropy (Thabtah,
Cowling, & Peng, 2005), the classification based on the information
gain measure (Chen, Liu, Yu, Wei, & Zhang, 2006), the lazy-based
approach for classification (Baralis, Chiusano, & Garza, 2008), the

* Corresponding author. Tel.: +84 083974186.
E-mail addresses: (D. Nguyen), vdbay@it.tdt.edu.vn (B. Vo), lhbac@fit.hcmus.edu.vn (B. Le).
0957-4174/© 2014 Elsevier Ltd. All rights reserved.


use of an equivalence class rule tree (Vo & Le, 2009), the classifier
based on Galois connections between objects and rules (Liu, Liu, &
Zhang, 2011), the lattice-based approach for classification (Nguyen,
Vo, Hong, & Thanh, 2012), and the integration of taxonomy information into classifier construction (Cagliero & Garza, 2013).
However, most existing algorithms for associative classification
have primarily concentrated on building an efficient and accurate
classifier but have not considered carefully the runtime performance of discovering CARs in the first phase. In fact, finding all
CARs is a challenging and time-consuming problem due to two reasons. First, it may be hard to find all CARs in dense datasets since
there are a huge number of generated rules. For example, in our
experiments, some datasets can induce more than 4,000,000 rules.
Second, the number of candidate rules to check is very large.
Assuming there are d items and k class labels in the dataset, there
can be up to k × (2^d − 1) rules to consider. Very few studies, for instance (Nguyen, Vo, Hong, & Thanh, 2013; Nguyen et al., 2012; Vo
& Le, 2009; Zhao, Cheng, & He, 2009), have discussed the execution
time efficiency of the CAR mining process. Nevertheless, all of these algorithms are sequential. Consequently, their runtime performance has not been satisfactory on
large datasets, especially the dense datasets that have recently emerged.
Researchers have begun switching to parallel and distributed computing techniques to accelerate the computation. Two parallel
algorithms for mining CARs were recently proposed on distributed
memory systems (Mokeddem & Belbachir, 2010; Thakur &
Ramesh, 2008).
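To see how quickly the candidate-rule bound mentioned above grows, consider a minimal sketch (the function name is ours, introduced only for illustration):

```python
# Upper bound on the number of candidate class association rules:
# with d items and k class labels, every non-empty itemset can form a
# rule with every class, giving k * (2**d - 1) candidates.
def car_candidate_bound(d: int, k: int) -> int:
    return k * (2 ** d - 1)

# Even a modest dataset explodes: 20 items and 2 classes already
# yield over two million candidate rules.
print(car_candidate_bound(20, 2))  # 2097150
```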



Along with the advent of computers with multi-core processors, more memory and computing power have become available, so that larger datasets can be handled in main memory at lower cost compared with distributed or mainframe systems. This study therefore aims

to propose three efficient strategies for parallel mining CARs on multi-core processor computers. The proposed approaches overcome two disadvantages of existing methods for parallel mining CARs. They eliminate communication and collaboration among computing nodes, which introduce the overhead of synchronization. They also avoid data replication and do not require data transfer among processing units. As a result, the proposals significantly
improve the response time compared to the sequential counterpart
and existing parallel methods. The proposed parallel algorithm is
theoretically proven to be more efficient than existing parallel
algorithms. The experimental results also show that the proposed
parallel algorithm can achieve up to a 2.1× speedup compared to
a recent sequential CAR mining algorithm.
The rest of this paper is organized as follows. In Section 2, some
preliminary concepts of the class association rule problem and the
multi-core processor architecture are briefly given. The benefits of
parallel mining on multi-core processor computers are also discussed in this section. Work related to sequential and parallel mining of class association rules is reviewed in Section 3. Our previous sequential CAR mining algorithm is summarized in Section 4 because it forms the basic framework of our proposed parallel algorithm. The primary contributions are presented in Section 5, in
which three proposed strategies for efficiently mining classification rules under the high performance parallel computing context
are described. The time complexity of the proposed algorithm is
analyzed in Section 6. Section 7 presents the experimental results
while conclusions and future work are discussed in Section 8.

2. Preliminary concepts
This section provides some preliminary concepts of the class
association rule problem and the multi-core processor architecture. It also discusses the benefits of parallel mining on the multi-core
processor architecture.

2.1. Class association rule
One of the main goals of data mining is to discover important relationships among items such that the presence of some items in a
transaction is associated with the presence of some other items.
To achieve this purpose, Agrawal and his colleagues proposed the
Apriori algorithm to find association rules in a transactional dataset (Agrawal & Srikant, 1994). An association rule has the form
X → Y, where X and Y are frequent itemsets and X ∩ Y = ∅. The problem
of mining association rules is to find all association rules in a dataset whose support and confidence are no less than the user-defined minimum support and minimum confidence thresholds.
A class association rule is a special case of an association rule in
which only the class attribute is considered in the rule's right-hand
side (consequent). Mining class association rules means finding the set
of rules which satisfy the minimum support and minimum confidence thresholds specified by end-users. Let us define the CAR
problem as follows.
Let D be a dataset with n attributes {A1, A2, . . . , An} and |D| records (objects), where each record has an object identifier (OID).
Let C = {c1, c2, . . . , ck} be a list of class labels. A specific value of attribute Ai and a specific class label are denoted by the lower-case letters aim and
cj, respectively.

Definition 1. An item is described as an attribute and a specific
value for that attribute, denoted by ⟨(Ai, aim)⟩. An itemset is a set
of items.

Definition 2. Let I = {⟨(A1, a11)⟩, . . . , ⟨(A1, a1m1)⟩, ⟨(A2, a21)⟩, . . . ,
⟨(A2, a2m2)⟩, . . . , ⟨(An, an1)⟩, . . . , ⟨(An, anmn)⟩} be a finite set of items.
Dataset D is a finite set of objects, D = {OID1, OID2, . . . , OID|D|}, in
which each object OIDx has the form OIDx = attr(OIDx) ∧ class(OIDx)
(1 ≤ x ≤ |D|) with attr(OIDx) ⊆ I and class(OIDx) ∈ C. For example,
OID1 for the dataset shown in Table 1 is {⟨(A, a1)⟩, ⟨(B, b1)⟩,
⟨(C, c1)⟩} ∧ {1}.
Definition 3. A class association rule R has the form itemset → cj,
where cj ∈ C is a class label.
Definition 4. The actual occurrence ActOcc(R) of rule R in D is the
number of objects of D that match R's antecedent, i.e.,
ActOcc(R) = |{OID | OID ∈ D ∧ itemset ⊆ attr(OID)}|.
Definition 5. The support of rule R, denoted by Supp(R), is the
number of objects of D that match R's antecedent and are labeled
with R's class. Supp(R) is defined as:

Supp(R) = |{OID | OID ∈ D ∧ itemset ⊆ attr(OID) ∧ cj = class(OID)}|

Definition 6. The confidence of rule R, denoted by Conf(R), is
defined as:

Conf(R) = Supp(R) / ActOcc(R)

A sample dataset is shown in Table 1. It contains three objects, three
attributes (A, B, and C), and two classes (1 and 2). Consider rule
R: ⟨(A, a1)⟩ → 1. We have ActOcc(R) = 2 and Supp(R) = 1 since there
are two objects with A = a1, of which one object (object 1) also contains class 1. We also have Conf(R) = Supp(R)/ActOcc(R) = 1/2.
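The three measures in Definitions 4–6 can be sketched directly on the Table 1 dataset; the helper names below are ours, introduced only for illustration:

```python
# Sketch of Definitions 4-6 on the Table 1 dataset (helper names are ours).
# Each object is (attributes, class_label); items are (attribute, value) pairs.
dataset = [
    ({("A", "a1"), ("B", "b1"), ("C", "c1")}, 1),  # OID 1
    ({("A", "a1"), ("B", "b1"), ("C", "c1")}, 2),  # OID 2
    ({("A", "a2"), ("B", "b1"), ("C", "c1")}, 2),  # OID 3
]

def act_occ(itemset, dataset):
    # Objects whose attributes contain the rule's antecedent.
    return sum(1 for attrs, _ in dataset if itemset <= attrs)

def supp(itemset, cls, dataset):
    # Objects that match the antecedent AND carry the rule's class.
    return sum(1 for attrs, c in dataset if itemset <= attrs and c == cls)

def conf(itemset, cls, dataset):
    return supp(itemset, cls, dataset) / act_occ(itemset, dataset)

# Rule R: <(A, a1)> -> 1 from the running example.
R = {("A", "a1")}
print(act_occ(R, dataset), supp(R, 1, dataset), conf(R, 1, dataset))  # 2 1 0.5
```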
2.2. Multi-core processor architecture
A multi-core processor (shown in Fig. 1) is a single computing
component with two or more independent central processing units
(cores) in the same physical package (Andrew, 2008). The processors were originally designed with only one core. However, multi-core processors became mainstream when Intel and AMD
introduced their commercial multi-core chip in 2008 (Casali &
Ernst, 2013). A multi-core processor computer has different specifications from either a computer cluster (Fig. 2) or a SMP (Symmetric Multi-processor) system (Fig. 3): the memory is not distributed
like in a cluster but rather is shared. It is similar to the SMP architecture. Many SMP systems, however, have the NUMA (Non Uniform Memory Access) architecture. There are several memory
blocks which are accessed with different speeds from each processor depending on the distance between the memory block and the
processor. On the contrary, the multi-core processors are usually
on the UMA (Uniform Memory Access) architecture. There is one

Table 1
Example of a dataset.

OID   A    B    C    Class
1     a1   b1   c1   1
2     a1   b1   c1   2
3     a2   b1   c1   2



Fig. 1. Multi-core processor: one chip, two cores, two threads (Source: http://software.intel.com/en-us/articles/multi-core-processor-architecture-explained).

memory block only, so all cores have an equal access time to the
memory (Laurent, Négrevergne, Sicard, & Termier, 2012).

candidates. Their main contribution was to enhance the task of

candidate generation in the Apriori algorithm on the multi-core
processor computers. Schlegel, Karnagel, Kiefer, and Lehner
(2013) recently adapted the well-known Eclat algorithm to a
highly parallel version which runs on the multi-core processor system. They proposed three parallel approaches for Eclat: independent class, shared class, and shared itemset. Parallel mining has
also been widely adopted in many other research fields, such as
closed frequent itemset mining (Negrevergne, Termier, Méhaut, &
Uno, 2010), gradual pattern mining (Laurent et al., 2012), correlated pattern mining (Casali & Ernst, 2013), generic pattern mining
(Negrevergne, Termier, Rousset, & Méhaut, 2013), and tree-structured data mining (Tatikonda & Parthasarathy, 2009).
While much research has been devoted to developing parallel
pattern mining and association rule mining algorithms based on
the multi-core processor architecture, no studies have been published
on the parallel class association rule mining problem. Thus,
this paper proposes the first algorithm for parallel mining CARs
which can be executed efficiently on the multi-core processor
architecture.
3. Related work
This section begins with an overview of sequential CAR mining algorithms and then provides details about
two parallel versions.

2.3. Parallel mining on the multi-core processor architecture
Obviously, the multi-core processor architecture has many
desirable properties: each core has direct and equal access to all of the system's memory, and the multi-core chip allows
higher performance at lower energy and cost. Therefore, numerous
researchers have developed parallel algorithms on the multi-core
processor architecture in the data mining literature. One of the first
algorithms targeting multi-core processor computers was FP-array
proposed by Liu and his colleagues in 2007 (Liu, Li, Zhang, & Tang,
2007). The authors proposed two techniques, namely a cache-conscious FP-array and a lock-free dataset tiling parallelism mechanism, for parallel discovery of frequent itemsets on multi-core
processor machines. Yu and Wu (2011) proposed an efficient load
balancing strategy in order to reduce massive duplicated generated


3.1. Sequential CAR mining algorithms
The first algorithm for mining CARs was proposed by Liu et al.
(1998) based on the Apriori algorithm (Agrawal & Srikant, 1994).
After its introduction, several other algorithms adopted its approach, including CAAR (Xu, Han, & Min, 2004) and PCAR (Chen,
Hsu, & Hsu, 2012). However, these methods are time-consuming
because they generate a lot of candidates and scan the dataset several times. Another approach for mining CARs is to build the frequent pattern tree (FP-tree) (Han, Pei, & Yin, 2000) to discover
rules, which was presented in some algorithms such as CMAR (Li
et al., 2001) and L3 (Baralis, Chiusano, & Garza, 2004). The mining

Fig. 2. Computer cluster.


Fig. 3. Symmetric multi-processor system.
process used by the FP-tree does not generate candidate rules.
However, its significant weakness lies in the fact that the FP-tree
does not always fit in the main memory. Several algorithms, MMAC
(Thabtah, Cowling, & Peng, 2004), MCAR (Thabtah et al., 2005), and
MCAR (Zhao et al., 2009), utilized the vertical layout of the dataset
to improve the efficiency of the rule discovery phase by employing
a method that extends the tidsets intersection method mentioned
in (Zaki, Parthasarathy, Ogihara, & Li, 1997). Vo and Le proposed
another method for mining CARs by using an equivalence class rule

tree (ECR-tree) (Vo & Le, 2009). An efficient algorithm, called ECRCARM, was also proposed in their paper. The two strong features
demonstrated by ECR-CARM are that it scans the dataset only once
and uses the intersection of object identifiers to determine the support of itemsets quickly. However, it needs to generate and test a
huge number of candidates because each node in the tree contains
all values of a set of attributes. Nguyen et al. (2013) modified the
ECR-tree structure to speed up the mining process. In their enhanced tree, named MECR-tree, each node contains only one value
instead of the whole group. They also provided theorems to identify the support of child nodes and prune unnecessary nodes
quickly. Based on MECR-tree and these theorems, they presented
the CAR-Miner algorithm for effectively mining CARs.
It can be seen that many sequential algorithms of CAR mining
have been developed but very few parallel versions of it have been
proposed. The next section reviews two parallel algorithms for CAR
mining which have been mentioned in the associative classification literature.
3.2. Parallel CAR mining algorithms
One of the primary weaknesses of sequential versions of CAR
mining is that they are unable to provide scalability in terms
of data dimension, size, or runtime performance for large
datasets. Consequently, some researchers recently have tried to
apply parallelism to current sequential CAR mining algorithms to
remove the sequential bottleneck and improve the response time.
Thakur and Ramesh (2008) proposed a parallel version for the
CBA algorithm (Liu et al., 1998). Their proposed algorithm was
implemented on a distributed memory system and based on data
parallelism. The parallel CAR mining phase is an adaptation of the
CD approach which was originally proposed for parallel mining frequent itemsets (Agrawal & Shafer, 1996). The training dataset was
partitioned into P parts which were computed on P processors.
Each processor worked on its local data to mine CARs with the
same global minimum support and minimum confidence. However, this algorithm has three major weaknesses. First, it
uses static load balancing which partitions work among processors
by using a heuristic cost function; this causes high load imbalance. Second, heavy synchronization happens at the end of each
step. Finally, each site must keep a duplicate of the entire set
of candidates. Additionally, the authors did not provide any experiments to illustrate the performance of the proposed algorithm.
Mokeddem and Belbachir (2010) proposed a distributed version
for FP-Growth (Han et al., 2000) to discover CARs. Their proposed
algorithm was also employed on a distributed memory system
and based on data parallelism. Data were partitioned into P
parts which were computed on P processors to discover subsets of classification rules in parallel. Inter-communication
was established to make global decisions. Consequently, their approach faces the significant problem of high synchronization among
nodes. In addition, the authors did not conduct any experiments
to compare their proposed algorithm with others.
The two existing parallel algorithms for mining CARs, which were
employed on distributed memory systems, have two significant
problems: high synchronization among nodes and data replication.
In this paper, a parallel CAR mining algorithm based on the multicore processor architecture is thus proposed to solve those
problems.
4. A sequential class association rule mining algorithm
In this section, we briefly summarize our previous sequential
CAR mining algorithm as it forms the basic framework of our proposed parallel algorithm.
In (Nguyen & Vo, 2014), we proposed a tree structure to mine
CARs quickly and directly. Each node in the tree contains one itemset along with:
(1) (Obidset1, Obidset2, . . . , Obidsetk) – A list of Obidsets in which
each Obidseti is a set of object identifiers that contain both
the itemset and class ci. Note that k is the number of classes
in the dataset.
(2) pos – A positive integer storing the position of the class with
the maximum cardinality of Obidseti, i.e.,
pos = argmax_{i∈[1,k]} {|Obidseti|}.
(3) total – A positive integer which stores the sum of the cardinalities
of all Obidseti, i.e., total = Σ_{i=1}^{k} |Obidseti|.

However, the itemset is converted to the form att × values for
ease of programming, where

(1) att – A positive integer representing a list of attributes.
(2) values – A list of values, each of which is contained in one
attribute in att.



For example, itemset X = {⟨(B, b1)⟩, ⟨(C, c1)⟩} is denoted as
X = 6 × b1c1. A bit representation is used to store itemset
attributes and save memory. Attributes BC can be represented
as 110 in bit representation, so the value of these attributes is 6.
Bitwise operations are then used to quickly join itemsets.
In Table 1, itemset X = {⟨(B, b1)⟩, ⟨(C, c1)⟩} is contained in objects
1, 2, and 3. Thus, the node which contains itemset X has the form
6 × b1c1(1, 23), in which Obidset1 = {1} (or Obidset1 = 1 for short),
i.e., object 1 contains both itemset X and class 1; Obidset2 = {2, 3}
(or Obidset2 = 23 for short), i.e., objects 2 and 3 contain both itemset X and class 2; pos = 2 (denoted by a line under Obidset2, i.e., 23);
and total = 3. pos is 2 because the cardinality of Obidset2 for class 2
is maximum (2 versus 1).
Obtaining the support and confidence of a rule reduces to computing
|Obidset_pos| and |Obidset_pos| / total, respectively. For example, node
6 × b1c1(1, 23) generates rule {⟨(B, b1)⟩, ⟨(C, c1)⟩} → 2 (i.e., if B = b1
and C = c1, then Class = 2) with Supp = |Obidset2| = |23| = 2 and
Conf = 2/3.
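The node encoding described above can be sketched in a few lines; the variable names are ours and the bit values for A, B, and C follow the 110 = 6 example in the text:

```python
# Sketch of the node encoding described above (helper names are ours).
# Attributes are bit-encoded: A=1 (001), B=2 (010), C=4 (100), so BC = 6.
ATT = {"A": 1, "B": 2, "C": 4}

# Node 6 x b1c1(1, 23) for itemset {<(B,b1)>, <(C,c1)>} in Table 1:
att = ATT["B"] | ATT["C"]          # bitwise join of attributes -> 6
obidsets = [{1}, {2, 3}]           # Obidset1 = {1}, Obidset2 = {2, 3}

# pos: index of the class whose Obidset has maximum cardinality.
pos = max(range(len(obidsets)), key=lambda i: len(obidsets[i]))
total = sum(len(o) for o in obidsets)

supp = len(obidsets[pos])          # support of the rule b1c1 -> class (pos+1)
conf = supp / total                # confidence
print(att, pos + 1, supp, round(conf, 3))  # 6 2 2 0.667
```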
Based on the tree structure, we also proposed a sequential algorithm for mining CARs, called Sequential-CAR-Mining, as shown in
Fig. 4. Firstly, we find all frequent 1-itemsets and add them to the
root node of the tree (Line 1). Secondly, we recursively discover
other frequent k-itemsets based on the Depth-First Search strategy
(procedure Sequential-CAR-Mining). Thirdly, while traversing

Fig. 5. Tree generated by Sequential-CAR-Mining for the dataset in Table 1:
Level 1 (children of the root {}): 1 × a1(1, 2), 1 × a2(∅, 3), 2 × b1(1, 23), 4 × c1(1, 23)
Level 2: 3 × a1b1(1, 2), 5 × a1c1(1, 2) (under a1); 3 × a2b1(∅, 3), 5 × a2c1(∅, 3) (under a2); 6 × b1c1(1, 23) (under b1)
Level 3: 7 × a1b1c1(1, 2) (under a1b1); 7 × a2b1c1(∅, 3) (under a2b1)

nodes in the tree, we also generate rules which satisfy the minimum confidence threshold (procedure Generate-Rule). The pseudo
code of the algorithm is shown in Fig. 4.
Fig. 5 shows the tree structure generated by the sequential CAR
mining algorithm for the dataset shown in Table 1. For details on

the tree generation, please refer to the study by Nguyen and Vo (2014).
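The recursive procedure of Fig. 4 can be sketched compactly in Python; this is an illustrative approximation under our own naming (nodes as (att, values, obidsets) triples), not the authors' implementation:

```python
# Compact sketch of the Sequential-CAR-Mining idea (Fig. 4): children are
# built by intersecting Obidsets, and a rule is emitted for a node whenever
# its confidence passes minConf. Helper names are ours.
def mine(nodes, min_sup, min_conf, cars):
    for i, (att_x, vals_x, obs_x) in enumerate(nodes):
        # Generate-Rule: pick the class with the largest Obidset (pos).
        pos = max(range(len(obs_x)), key=lambda c: len(obs_x[c]))
        total = sum(len(o) for o in obs_x)
        supp = len(obs_x[pos])
        if total and supp / total >= min_conf:
            cars.append((vals_x, pos + 1, supp, supp / total))
        children = []
        for att_y, vals_y, obs_y in nodes[i + 1:]:
            if att_x == att_y:          # same attributes -> skip the join
                continue
            obs = [a & b for a, b in zip(obs_x, obs_y)]   # Obidset intersection
            if max(len(o) for o in obs) >= min_sup:        # minSup check
                children.append((att_x | att_y, vals_x + vals_y, obs))
        mine(children, min_sup, min_conf, cars)            # depth-first

# Frequent 1-itemsets of Table 1 (att bits: A=1, B=2, C=4).
level1 = [
    (1, ["a1"], [{1}, {2}]),
    (1, ["a2"], [set(), {3}]),
    (2, ["b1"], [{1}, {2, 3}]),
    (4, ["c1"], [{1}, {2, 3}]),
]
cars = []
mine(level1, 1, 0.6, cars)
print(cars)
```

With minSup = 1 and minConf = 0.6, every rule predicting class 2 along the a2, b1, and c1 branches is emitted, while the a1 branch produces no rule (its confidence is only 0.5).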

5. The proposed parallel class association rule mining algorithm
Although Sequential-CAR-Mining is an efficient algorithm for
mining all CARs, its runtime performance degrades significantly on
large datasets due to the computational complexity. As a result,

Input: Dataset D, minSup and minConf
Output: All CARs satisfying minSup and minConf
Procedure:
1. Let Lr be the root node of the tree. Lr includes a set of nodes in which each node contains a frequent 1-itemset.
Sequential-CAR-Mining(Lr, minSup, minConf)
2.  CARs = ∅;
3.  for all lx ∈ Lr.children do
4.      Generate-Rule(lx, minConf);
5.      Pi = ∅;
6.      for all ly ∈ Lr.children, with y > x do
7.          if ly.att ≠ lx.att then // two nodes are combined only if their attributes are different
8.              O.att = lx.att | ly.att; // using bitwise operation
9.              O.values = lx.values ∪ ly.values;
10.             O.Obidseti = lx.Obidseti ∩ ly.Obidseti; // ∀i ∈ [1, k]
11.             O.pos = argmax_{i∈[1,k]} {|O.Obidseti|};
12.             O.total = Σ_{i=1}^{k} |O.Obidseti|;
13.             if |O.Obidset_{O.pos}| ≥ minSup then // node O satisfies minSup
14.                 Pi = Pi ∪ O;
15.     Sequential-CAR-Mining(Pi, minSup, minConf);
Generate-Rule(l, minConf)
16. conf = |l.Obidset_{l.pos}| / l.total;
17. if conf ≥ minConf then
18.     CARs = CARs ∪ {l.itemset → c_{pos} (|l.Obidset_{l.pos}|, conf)};

Fig. 4. Sequential algorithm for mining CARs.



Input: Dataset D, minSup and minConf
Output: All CARs satisfying minSup and minConf
Procedure:
1. Let Lr be the root node of the tree. Lr includes a set of nodes in which each node contains a frequent 1-itemset.
PMCAR(Lr, minSup, minConf)
2.  totalCARs = CARs = ∅;
3.  for all lx ∈ Lr.children do
4.      Generate-Rule(CARs, lx, minConf);
5.      Pi = ∅;
6.      for all ly ∈ Lr.children, with y > x do
7.          if ly.att ≠ lx.att then // two nodes are combined only if their attributes are different
8.              O.att = lx.att | ly.att; // using bitwise operation
9.              O.values = lx.values ∪ ly.values;
10.             O.Obidseti = lx.Obidseti ∩ ly.Obidseti; // ∀i ∈ [1, k]
11.             O.pos = argmax_{i∈[1,k]} {|O.Obidseti|};
12.             O.total = Σ_{i=1}^{k} |O.Obidseti|;
13.             if |O.Obidset_{O.pos}| ≥ minSup then // node O satisfies minSup
14.                 Pi = Pi ∪ O;
15.     Task ti = new Task(() => { Sub-PMCAR(tCARs, Pi, minSup, minConf); });
16. for each task in the list of created tasks do
17.     collect the set of rules (tCARs) returned by each task;
18.     totalCARs = totalCARs ∪ tCARs;
19. totalCARs = totalCARs ∪ CARs;
Sub-PMCAR(tCARs, Lr, minSup, minConf)
20. for all lx ∈ Lr.children do
21.     Generate-Rule(tCARs, lx, minConf);
22.     Pi = ∅;
23.     for all ly ∈ Lr.children, with y > x do
24.         if ly.att ≠ lx.att then // two nodes are combined only if their attributes are different
25.             O.att = lx.att | ly.att; // using bitwise operation
26.             O.values = lx.values ∪ ly.values;
27.             O.Obidseti = lx.Obidseti ∩ ly.Obidseti; // ∀i ∈ [1, k]
28.             O.pos = argmax_{i∈[1,k]} {|O.Obidseti|};
29.             O.total = Σ_{i=1}^{k} |O.Obidseti|;
30.             if |O.Obidset_{O.pos}| ≥ minSup then // node O satisfies minSup
31.                 Pi = Pi ∪ O;
32.     Sub-PMCAR(tCARs, Pi, minSup, minConf);

Fig. 6. PMCAR with independent branch strategy.


Fig. 7. Illustration of the independent branch strategy: three tasks t1, t2, and t3 independently mine the branches rooted at 1 × a1(1, 2), 1 × a2(∅, 3), and 2 × b1(1, 23) of the tree from Fig. 5.

we have tried to apply parallel computing techniques to the
sequential algorithm to speed up the mining process.
Schlegel et al. (2013) recently adapted the well-known Eclat
algorithm to a highly parallel version which runs on the multi-core
processor system. They proposed three parallel approaches for
Eclat: independent class, shared class, and shared itemset. In the
‘‘independent class’’ strategy, each equivalence class is distributed
to a single thread which mines its assigned class independently
from other threads. This approach has an important advantage in
that the synchronization cost is low. It, however, consumes much
more memory than the sequential counterpart because all threads
hold their entire tidsets at the same time. Additionally, this strategy
often causes high load imbalance when a large number of threads
are used: threads mining light classes often finish sooner than
threads mining heavier classes. In the ‘‘shared class’’ strategy, a single
class is assigned to multiple threads. This can reduce the memory
consumption but increases the cost of synchronization since one
thread has to communicate with others to obtain their tidsets. In the
final strategy, ‘‘shared itemset’’, multiple threads concurrently perform the intersection of two tidsets for a new itemset. In this strategy, threads have to synchronize with each other with a high cost.

Basically, the proposed algorithm, Parallel Mining Class Association Rules (PMCAR), is a combination of Sequential-CAR-Mining
and parallel ideas mentioned in (Schlegel et al., 2013). It has the
same core steps as Sequential-CAR-Mining where it scans the dataset once to obtain all frequent 1-itemsets along with their Obidsets, and it then starts recursively mining. It also adopts two
parallel strategies ‘‘independent class’’ and ‘‘shared class’’. However, PMCAR has some differences as follows. PMCAR is a parallel
algorithm for mining class association rules while the work done
by Schlegel et al. focuses on mining frequent itemsets only. Additionally, we propose a third parallel strategy, shared Obidset,
for PMCAR. PMCAR runs on a single system with a multi-core processor where the main memory is shared and can be
equally accessed by all cores. Hence, PMCAR does not require synchronization among computing nodes like other parallel CAR mining algorithms employed on distributed memory systems.
The main differences between PMCAR and Sequential-CAR-Mining in terms of parallel
CAR mining strategies are discussed in the following sections.
5.1. Independent branch strategy
The first strategy, independent branch, distributes each branch of
the tree to a single task, which mines its assigned branch independently
from all other tasks to generate CARs. Generally speaking, this strategy
is similar to the ‘‘independent class’’ strategy mentioned in (Schlegel
et al., 2013), except that PMCAR uses a different tree structure for
the purpose of CAR mining and is implemented using tasks instead of threads. As mentioned above, this strategy has some limitations such as high load imbalance and high memory consumption.
However, its primary advantage is that each task is
executed independently from other tasks without any synchronization. Our implementation is based on the
parallelism model in .NET Framework 4.0. Instead of threads,
our algorithm uses tasks, which have several advantages over threads.
First, a task consumes less memory than a thread. Second, while a
single thread runs on a single core, tasks are designed to be aware of
the multi-core processor and multiple tasks can be executed on a
single core. Finally, using threads is costly because operating
systems must allocate thread data structures, initialize and destroy
them, and perform context switches between threads. Consequently, our implementation mitigates two problems: high memory consumption and high load imbalance.
The pseudo code of PMCAR with independent branch strategy is
shown in Fig. 6.

We apply the algorithm to the sample dataset shown in Table 1 to
illustrate its basic ideas. First, PMCAR finds all frequent 1-itemsets as
done in Sequential-CAR-Mining (Line 1). After this step, we have
Lr = {1 × a1(1, 2), 1 × a2(∅, 3), 2 × b1(1, 23), 4 × c1(1, 23)}. Second,
PMCAR calls procedure PMCAR to generate frequent 2-itemsets (Lines
3–14). For example, consider node 1 × a1(1, 2). This node combines
with two nodes 2 × b1(1, 23) and 4 × c1(1, 23) to generate two new
nodes 3 × a1b1(1, 2) and 5 × a1c1(1, 2). Note that node 1 × a1(1, 2)
does not combine with node 1 × a2(∅, 3) since they have the same
attribute (attribute A), which would cause the support of the new node to be
zero according to Theorem 1 in (Nguyen & Vo, 2014). After
these steps, we have Pi = {3 × a1b1(1, 2), 5 × a1c1(1, 2)}. Then,
PMCAR creates a new task ti and calls procedure Sub-PMCAR inside
that task with four parameters tCARs, Pi, minSup, and minConf. The first
parameter tCARs is used to store the set of rules returned by Sub-PMCAR in a task (Line 15). For instance, task t1 is created and procedure Sub-PMCAR is executed inside t1. Procedure Sub-PMCAR is
recursively called inside a task to mine all CARs (Lines 20–32). For
example, task t1 also generates node 7 × a1b1c1(1, 2) and its rule. Finally, after all created tasks have completely mined their assigned branches,
their results are collected to form the complete set of rules (Lines
16–19). In Fig. 7, three tasks t1, t2, and t3, represented by solid blocks,
mine the three branches a1, a2, and b1 in parallel and independently.
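The independent branch idea can be sketched with a thread pool; this is our own approximation (the paper's implementation uses .NET tasks, and the mine_branch body here is a labeled placeholder rather than the real Sub-PMCAR):

```python
# Sketch of the independent branch strategy: each level-1 branch is handed
# to its own worker, which mines its branch without talking to the others.
# (The paper uses .NET tasks; here we approximate with a thread pool.)
from concurrent.futures import ThreadPoolExecutor

def mine_branch(branch):
    # Stand-in for Sub-PMCAR: recursively mine one branch and return the
    # rules found there (here just labeled placeholders).
    root, descendants = branch
    return [f"rules from {root}"] + [f"rules from {d}" for d in descendants]

# Branches of the tree in Fig. 7 (descendants listed flatly for brevity).
branches = [
    ("a1", ["a1b1", "a1c1", "a1b1c1"]),
    ("a2", ["a2b1", "a2c1", "a2b1c1"]),
    ("b1", ["b1c1"]),
]

with ThreadPoolExecutor() as pool:
    # map() preserves branch order; no locks are needed because each worker
    # touches only its own branch and its own result list.
    per_branch = list(pool.map(mine_branch, branches))

# Collect the per-task results into the complete rule set (Lines 16-19).
total_rules = [r for rules in per_branch for r in rules]
print(len(total_rules))  # 10
```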
5.2. Shared branch strategy
The second strategy, shared branch, adopts the same ideas as the
‘‘shared class’’ strategy mentioned in Schlegel et al. (2013). In this
strategy, each branch is mined in parallel by multiple tasks. The pseudo code of PMCAR with the shared branch strategy is shown in Fig. 8.
First, the algorithm initializes the root node Lr (Line 1). Then, the
procedure PMCAR is recursively called to generate CARs. When
node lx combines with node ly, the algorithm creates a new task
ti and performs the combination code inside that task (Lines 7–
17). Note that because multiple tasks concurrently mine the same
branch, synchronization is needed to collect the necessary information
for the new node (Line 18). Additionally, to avoid a data race
(i.e., two or more tasks performing operations that update a shared
piece of data) (Netzer & Miller, 1989), we use a lock object to coordinate the tasks’ access to the shared data Pi (Lines 15 and 16).
We also apply the algorithm to the dataset in Table 1 to demonstrate its operation. As an example, consider node 1 × a1(1, 2). The algorithm creates task t1 to combine node 1 × a1(1, 2) with node 2 × b1(1, 23) to generate node 3 × a1b1(1, 2); in parallel, it creates task t2 to combine node 1 × a1(1, 2) with node 4 × c1(1, 23) to generate node 5 × a1c1(1, 2). However, before the algorithm can create task t3 to generate node 7 × a1b1c1(1, 2), it has to wait until tasks t1 and t2 finish their work. Therefore, this strategy has a longer execution time than the first one. In Fig. 9, the three tasks t1, t2, and t3 mine the same branch a1 in parallel.
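The two synchronization points of this strategy, a lock guarding the shared child list Pi and a barrier before recursing, can be sketched as follows. This is a hedged Python analogue of the C# Task/lock pattern, with simplified node tuples:

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait

lock = threading.Lock()

def combine(lx, ly, p_i, min_sup=1):
    """Combine two sibling nodes; append the result to the shared list p_i."""
    (att1, val1, obids1), (att2, val2, obids2) = lx, ly
    if att1 == att2:                  # same attribute => support 0, skip
        return
    common = obids1 & obids2
    if len(common) >= min_sup:
        with lock:                    # p_i is shared, so guard against a data race
            p_i.append((att1 + att2, val1 + val2, common))

l_x = ("A", "a1", {1, 2})
siblings = [("B", "b1", {1, 2, 3}), ("C", "c1", {1, 2, 3})]
p_i = []

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(combine, l_x, ly, p_i) for ly in siblings]
    wait(futures)                     # analogue of Task.WaitAll
# Only now is p_i complete, so the recursion on it may start.
```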
5.3. Shared Obidset strategy
The third strategy, shared Obidset, differs from the ‘‘shared itemset’’ strategy discussed in Schlegel et al. (2013). Each task has a


D. Nguyen et al. / Expert Systems with Applications 41 (2014) 4716–4729

4723

Input: Dataset D, minSup and minConf
Output: All CARs satisfying minSup and minConf
Procedure:
1. Let Lr be the root node of the tree. Lr includes a set of nodes in which each node contains a frequent 1-itemset.

PMCAR(Lr, minSup, minConf)
2.  CARs = ∅;
3.  for all lx ∈ Lr.children do
4.      Generate-Rule(lx, minConf);
5.      Pi = ∅;
6.      for all ly ∈ Lr.children, with y > x do
7.          Task ti = new Task(() => {
8.              if ly.att ≠ lx.att then
9.                  O.att = lx.att | ly.att; // using bitwise operation
10.                 O.values = lx.values ∪ ly.values;
11.                 O.Obidset_i = lx.Obidset_i ∩ ly.Obidset_i; // ∀i ∈ [1, k]
12.                 O.pos = argmax_{i ∈ [1, k]} {|O.Obidset_i|};
13.                 O.total = Σ_{i=1..k} |O.Obidset_i|;
14.                 if |O.Obidset_{O.pos}| ≥ minSup then // node O satisfies minSup
15.                     lock(lockObject)
16.                         Pi = Pi ∪ O;
17.         });
18.     Task.WaitAll(ti);
19.     PMCAR(Pi, minSup, minConf);

Fig. 8. PMCAR with shared branch strategy.

[Figure: enumeration tree for the sample dataset; tasks t1, t2, and t3 mine the same branch a1 in parallel, producing nodes 3 × a1b1(1, 2), 5 × a1c1(1, 2), and 7 × a1b1c1(1, 2).]
Fig. 9. Illustration of the shared branch strategy.

different branch assigned, and its child tasks process a node in the branch together. The pseudo code of PMCAR with the shared Obidset strategy is shown in Fig. 10. The algorithm first finds all frequent 1-itemsets and adds them to the root node (Line 1). It then calls procedure PMCAR to generate frequent 2-itemsets (Lines 2–14). For each branch of the tree, it creates a task and calls procedure Sub-PMCAR inside that task (Line 15). Sub-PMCAR is recursively called to generate frequent k-itemsets (k > 2) and their rules (Lines 20–34). The functions of procedures PMCAR and Sub-PMCAR resemble those of PMCAR with the independent branch strategy. However, this algorithm provides a more complicated parallel strategy. In Sub-PMCAR, the algorithm creates a list of child tasks to intersect the Obidset_i of two nodes in parallel (Lines 27–28). This makes the work distribution the most fine-grained. Nevertheless, all child tasks have to finish their work before the two properties pos and total can be calculated for the new node (Lines 29–31). Consequently, there is a high cost of synchronization among child tasks and between child tasks and their parent task.
Let us illustrate the basic ideas of the shared Obidset strategy with Fig. 11. Branch a1 is assigned to task t1. In procedure Sub-PMCAR, tasks t2 and t3, which are child tasks of t1, process node 3 × a1b1(1, 2) together, i.e., tasks t2 and t3 intersect Obidset1 and Obidset2 of the two nodes 3 × a1b1(1, 2) and 5 × a1c1(1, 2) in parallel, respectively. However, task t2 must wait until task t3 finishes intersecting the two Obidset2 to obtain both Obidset1 and Obidset2 of the new node 7 × a1b1c1(1, 2). Additionally, parent task t1, represented by the solid block, must wait until tasks t2, t3, and all other child tasks finish their work.
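The per-class intersection step can be sketched like this: one child task intersects Obidset_i for each class i, and the parent waits for all of them before computing pos and total. An illustrative Python analogue with a two-class toy node (not the authors' C# implementation):

```python
from concurrent.futures import ThreadPoolExecutor, wait

# Obidsets of two sibling nodes, one set per class label (k = 2 classes).
obidsets_x = [{1, 2}, set()]        # e.g. node 3 x a1b1(1, 2)
obidsets_y = [{1, 2}, set()]        # e.g. node 5 x a1c1(1, 2)
k = len(obidsets_x)
new_obidsets = [None] * k

def intersect(i):
    # Child task i handles only class i's Obidset.
    new_obidsets[i] = obidsets_x[i] & obidsets_y[i]

with ThreadPoolExecutor(max_workers=k) as pool:
    futures = [pool.submit(intersect, i) for i in range(k)]
    wait(futures)                    # parent blocks until every class is done

# Only after the barrier can pos (index of the largest Obidset) and
# total (overall support count) be computed for the new node.
pos = max(range(k), key=lambda i: len(new_obidsets[i]))
total = sum(len(s) for s in new_obidsets)
```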

6. Time complexity analysis
In this section, we analyze the time complexities of both
sequential and proposed parallel CAR mining algorithms. We then
derive the speedup of the parallel algorithm. We also compare the
time complexity of our parallel algorithm with those of existing
parallel algorithms.



Input: Dataset D, minSup and minConf
Output: All CARs satisfying minSup and minConf
Procedure:
1. Let Lr be the root node of the tree. Lr includes a set of nodes in which each node contains a frequent 1-itemset.

PMCAR(Lr, minSup, minConf)
2.  totalCARs = CARs = ∅;
3.  for all lx ∈ Lr.children do
4.      Generate-Rule(CARs, lx, minConf);
5.      Pi = ∅;
6.      for all ly ∈ Lr.children, with y > x do
7.          if ly.att ≠ lx.att then // two nodes are combined only if their attributes are different
8.              O.att = lx.att | ly.att; // using bitwise operation
9.              O.values = lx.values ∪ ly.values;
10.             O.Obidset_i = lx.Obidset_i ∩ ly.Obidset_i; // ∀i ∈ [1, k]
11.             O.pos = argmax_{i ∈ [1, k]} {|O.Obidset_i|};
12.             O.total = Σ_{i=1..k} |O.Obidset_i|;
13.             if |O.Obidset_{O.pos}| ≥ minSup then // node O satisfies minSup
14.                 Pi = Pi ∪ O;
15.     Task ti = new Task(() => { Sub-PMCAR(tCARs, Pi, minSup, minConf); });
16. for each task in the list of created tasks do
17.     collect the set of rules (tCARs) returned by each task;
18.     totalCARs = totalCARs ∪ tCARs;
19. totalCARs = totalCARs ∪ CARs;

Sub-PMCAR(tCARs, Lr, minSup, minConf)
20. for all lx ∈ Lr.children do
21.     Generate-Rule(tCARs, lx, minConf);
22.     Pi = ∅;
23.     for all ly ∈ Lr.children, with y > x do
24.         if ly.att ≠ lx.att then // two nodes are combined only if their attributes are different
25.             O.att = lx.att | ly.att; // using bitwise operation
26.             O.values = lx.values ∪ ly.values;
27.             for i = 1 to k do // k is the number of classes
28.                 Task child_i = new Task(() => { O.Obidset_i = lx.Obidset_i ∩ ly.Obidset_i; });
29.             Task.WaitAll(child_i);
30.             O.pos = argmax_{i ∈ [1, k]} {|O.Obidset_i|};
31.             O.total = Σ_{i=1..k} |O.Obidset_i|;
32.             if |O.Obidset_{O.pos}| ≥ minSup then // node O satisfies minSup
33.                 Pi = Pi ∪ O;
34.         Sub-PMCAR(tCARs, Pi, minSup, minConf);

Fig. 10. PMCAR with shared Obidset strategy.

We can see that the sequential CAR mining algorithm described in Section 4 scans the dataset once and uses a main loop to mine all CARs. Based on the cost model in Skillicorn (1999), the time complexity of this algorithm is:



[Figure: enumeration tree for the sample dataset; task t1 mines branch a1 while its child tasks t2 and t3 process node 3 × a1b1(1, 2) together.]
Fig. 11. Illustration of the shared Obidset strategy.

T_S = k_S × m + a

where T_S is the execution time of the sequential CAR mining algorithm, k_S is the number of iterations in the main loop, m is the execution time of generating nodes and rules in each iteration, and a is the execution time of accessing the dataset.

The proposed parallel algorithm distributes node and rule generation to multiple tasks executed on multiple cores. Thus, the execution time of generating nodes and rules in each iteration is m / (t × c), where t is the number of tasks and c is the number of cores. The time complexity of the parallel algorithm is:

T_P = k_P × m / (t × c) + a

where T_P is the execution time of the proposed parallel CAR mining algorithm and k_P is the number of iterations in the main loop.
The speedup is thus:

S_p = T_S / T_P = (k_S × m + a) / (k_P × m / (t × c) + a)

In our experiments, the execution time of the sequential code (for example, the code to scan the dataset) is very small. In addition, the number of iterations in the main loop is similar in the sequential and parallel algorithms. Therefore, the speedup equation can be simplified as follows:

S_p = (k_S × m + a) / (k_P × m / (t × c) + a) ≈ (k_S × m) / (k_P × m / (t × c)) ≈ m / (m / (t × c)) = t × c

Thus, we can achieve up to a t × c speedup over the sequential algorithm.
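The simplification above can be checked numerically: with a large per-iteration cost m relative to the dataset-access time a, the ratio approaches t × c. A small sketch with assumed constants:

```python
def speedup(k_s, k_p, m, a, t, c):
    """S_p = T_S / T_P with T_S = k_s*m + a and T_P = k_p*m/(t*c) + a."""
    t_s = k_s * m + a
    t_p = k_p * m / (t * c) + a
    return t_s / t_p

# Assumed values: identical iteration counts, near-negligible access time a.
s = speedup(k_s=100, k_p=100, m=50.0, a=0.001, t=2, c=4)
# s is close to (but below) the theoretical bound t * c = 8.
```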
Now we analyze the time complexity of the parallel CBA algorithm proposed in Thakur and Ramesh (2008). Since this algorithm is based on the Apriori algorithm, it must scan the dataset many times. Additionally, this algorithm was employed on a distributed memory system, which means that it needs additional computation time for communication and information exchange among nodes. Consequently, the time complexity of this algorithm is:

T_C = k_C × (m / p + a + d)

where T_C is the execution time of the parallel CBA algorithm, k_C is the number of iterations required by the parallel CBA algorithm, p is the number of processors, and d is the execution time for communication and data exchange among computing nodes.

Assume that k_P ≈ k_C and t × c ≈ p. We have:

T_C = (k_C × m / p + a) + (k_C − 1) × a + k_C × d ≈ T_P + (k_C − 1) × a + k_C × d

Obviously, T_P < T_C, which implies that our proposed algorithm is faster than the parallel version of CBA in theory.

Similarly, the time complexity of the parallel FP-Growth algorithm proposed in Mokeddem and Belbachir (2010) is as follows:

T_F = k_F × (m / p + d) + a

where T_F is the execution time of the parallel FP-Growth algorithm and k_F is the number of iterations required by the parallel FP-Growth algorithm.

The parallel FP-Growth algorithm scans the dataset once and then partitions it into p parts according to the number of processors. Each processor scans its local data partition to count the local support of each item. Therefore, the execution time of accessing the dataset in this algorithm is only a. However, computing nodes need to broadcast the local support of each item across the group so that each processor can calculate the global count. Thus, this algorithm also needs additional computation time d for data transfer. Assume that k_P ≈ k_F and t × c ≈ p. We have:

T_F = (k_F × m / p + a) + k_F × d ≈ T_P + k_F × d

We can conclude that our proposed parallel algorithm is also faster than the parallel FP-Growth algorithm in theory, and T_P < T_F < T_C.

7. Experimental results

This section provides the results of our experiments, including the testing environment, the results of the scalability experiments for the three proposed parallel strategies, and the performance of the proposed parallel algorithm with varying numbers of objects and attributes. It finally compares the execution time of PMCAR with that of the recent sequential CAR mining algorithm, CAR-Miner (Nguyen et al., 2013).

7.1. Testing environment

All experiments were conducted on a multi-core computer with one Intel i7-2600 processor. The processor has 4 cores and an 8 MB L3 cache, runs at a core frequency of 3.4 GHz, and supports Hyper-Threading. The computer has 4 GB of memory and runs Windows 7 Enterprise (64-bit) SP1. The algorithms were coded in C# using MS Visual Studio .NET 2010 Express. The parallel algorithm was implemented with the parallelism model supported in Microsoft .NET Framework 4.0 (version 4.0.30319).

The experimental datasets were obtained from the University of California Irvine (UCI) Machine Learning Repository and the Frequent Itemset Mining (FIM) Dataset Repository (http://fimi.ua.ac.be/data/). The four datasets used in the experiments are Poker-hand, Chess, Connect-4, and Pumsb, with the characteristics shown in Table 2. The table shows the number of attributes (including the class attribute), the number of class labels, the number of distinctive values (i.e., the total number of distinct values over all attributes), and the number of objects (or records) in each dataset. The Chess, Connect-4, and Pumsb datasets are dense and have many attributes, whereas the Poker-hand dataset is sparse and has few attributes.

7.2. Scalability experiments

We evaluated the scalability of PMCAR by running it on the computer that had been configured to utilize a different number

Table 2
Characteristics of the experimental datasets.

Dataset     | # Attributes | # Classes | # Distinctive values | # Objects
------------|--------------|-----------|----------------------|----------
Poker-hand  | 11           | 10        | 95                   | 1,000,000
Chess       | 37           | 2         | 76                   | 3196
Connect-4   | 43           | 3         | 130                  | 67,557
Pumsb       | 74           | 5         | 2113                 | 49,046



[Figure: four speedup charts comparing CAR-Miner, PMCAR-Shared Branch, and PMCAR-Independent Branch on 1, 2, and 4 cores.]
(a) Scalability of PMCAR for the Poker-hand dataset (minSup = 0.01%). (b) Scalability of PMCAR for the Chess dataset (minSup = 30%). (c) Scalability of PMCAR for the Connect-4 dataset (minSup = 80%). (d) Scalability of PMCAR for the Pumsb dataset (minSup = 70%).
Fig. 12. Speedup performance of PMCAR with two parallel strategies.

Table 3
Characteristics of the synthetic datasets.

Dataset     | # Attributes | # Classes | Density (%) | # Objects | File size (KB)
------------|--------------|-----------|-------------|-----------|---------------
C50R100KD55 | 50           | 2         | 55          | 100,000   | 9961
C50R200KD55 | 50           | 2         | 55          | 200,000   | 19,992
C50R300KD55 | 50           | 2         | 55          | 300,000   | 29,883
C50R400KD55 | 50           | 2         | 55          | 400,000   | 39,844
C50R500KD55 | 50           | 2         | 55          | 500,000   | 49,805
C10R500KD55 | 10           | 2         | 55          | 500,000   | 10,743
C20R500KD55 | 20           | 2         | 55          | 500,000   | 20,508
C30R500KD55 | 30           | 2         | 55          | 500,000   | 30,247
C40R500KD55 | 40           | 2         | 55          | 500,000   | 40,040

[Figure: runtime and # CARs versus dataset size for CAR-Miner, PMCAR-Shared Branch, PMCAR-Independent Branch, and PMCAR-Shared Obidset.]
Fig. 13. Performance comparison between PMCAR and CAR-Miner with variation in the number of objects. Other parameters are set to: # Attributes = 50, Density = 55%, and minSup = 70%.

[Figure: runtime and # CARs versus number of attributes for the same four algorithms.]
Fig. 14. Performance comparison between PMCAR and CAR-Miner with variation in the number of attributes. Other parameters are set to: # Objects = 500 K, Density = 55%, and minSup = 50%.

of cores. The configuration was adjusted in the BIOS Setup: the number of enabled cores was set to 1, 2, and 4 in turn. The performance of PMCAR and CAR-Miner was compared. We observed that the performance of CAR-Miner was nearly identical regardless of how many cores were enabled. It can be said that sequential algorithms cannot take advantage of the multi-core processor architecture. On the contrary, PMCAR scaled much better than CAR-Miner when the number of running cores was increased. In the experiments, we used the runtime performance of CAR-Miner as the baseline for obtaining the speedups. Fig. 12(a)–(d) illustrate the speedup performance of PMCAR with two parallel strategies for the Poker-hand, Chess, Connect-4, and Pumsb datasets, respectively. Note that minConf = 50% was used for all experiments.



[Figure: two charts for the Poker-hand dataset over minSup values from 0.11% down to 0.01%.]
(a) # CARs produced. (b) Runtime for PMCAR and CAR-Miner.
Fig. 15. Comparative results between PMCAR and CAR-Miner for the Poker-hand dataset with various minSup values.

Obviously, PMCAR is slower than the sequential algorithm CAR-Miner when they are executed on a single core because its tasks are processor-intensive; this situation is known as processor oversubscription. However, when the number of cores used is increased, PMCAR is much faster than CAR-Miner. As shown in Fig. 12(c) for the Connect-4 dataset, PMCAR with the independent branch and shared branch strategies achieves speedups of up to 2.1× and 1.4×, respectively. Interestingly, the shared branch strategy is not beneficial for the Chess dataset. Fig. 12(b) shows that PMCAR with shared branch is always slower than the sequential CAR-Miner. As discussed before, the shared branch strategy incurs a high synchronization cost between tasks. As a result, the huge number of tasks (4,253,728 tasks) generated for the Chess dataset with minSup = 30% significantly reduces the runtime performance. We also conducted the scalability experiments for the shared Obidset strategy. It, however, did not obtain good scalability results because of the high costs of synchronization among child tasks and between child tasks and their parent task. Therefore, we did not show its performance on the charts.
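The oversubscription effect and the Chess blow-up (millions of fine-grained tasks) both argue for capping the number of worker threads at the core count, so that excess tasks queue instead of contending for the processor. A hedged sketch of this mitigation (the work function is a hypothetical stand-in):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def mine(task_id):
    # Stand-in for mining one unit of work; returns a token per task.
    return task_id * 2

n_tasks = 1000                      # many more tasks than cores
workers = os.cpu_count() or 4       # cap parallelism at the core count

# Excess tasks wait in the pool's queue rather than forcing the OS to
# context-switch between thousands of live threads.
with ThreadPoolExecutor(max_workers=workers) as pool:
    results = list(pool.map(mine, range(n_tasks)))
```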
7.3. Influence of the number of dimensions and the size of dataset
To obtain a clear understanding of how PMCAR is affected by the dataset dimension and size, we conducted experiments on synthetic datasets with varying numbers of attributes and objects. Based on the ideas from Coenen (2007), we developed a tool for generating synthetic datasets. First, we fixed the parameters of the dataset generator as follows: (1) the number of attributes is 50; (2) the density is 55%. We then generated test datasets with different numbers of objects ranging between 100,000 and 500,000. Second, we fixed the number of objects and the density at 500,000 and 55%, respectively. We then generated datasets with numbers of attributes ranging between 10 and 40. The details of the synthetic datasets are shown in Table 3.
Fig. 13 illustrates the performance results with respect to the number of objects in the dataset. As shown, PMCAR scales well with the dataset size compared to CAR-Miner. For example, when the dataset size reaches 500 K, PMCAR with the shared branch strategy is up to 1.6× faster than CAR-Miner. However, the two other strategies, independent branch and shared Obidset, failed to execute at the 500 K dataset size because they ran out of memory. This problem happens because each task in these strategies holds an entire branch of the tree, which consumes a large amount of memory on dense datasets.
Fig. 14 demonstrates the performance results with respect to the number of attributes in the dataset. Again, PMCAR scales well with the dataset dimension compared to the sequential algorithm. For instance, the execution time of PMCAR with shared branch was only 1,003.694 s while that of CAR-Miner was 1,572.754 s when the number of attributes was 40. However, the independent branch and shared Obidset strategies failed to execute at 40 attributes because they ran out of memory.
7.4. Comparison with sequential algorithms¹
In this section, we compare the execution time of PMCAR with that of the sequential algorithm CAR-Miner. These experiments aim to show that PMCAR is competitive with the existing algorithm. Figs. 15–18 show the number of generated CARs and the execution times of PMCAR and CAR-Miner for the Poker-hand, Chess, Connect-4, and Pumsb datasets with various minSup values on the computer configured to utilize 4 cores with Hyper-Threading enabled. It can be observed that CAR-Miner performs badly except on the Chess dataset. It is slower than PMCAR because it cannot utilize the computing power of the multi-core processor. On the contrary, PMCAR is optimized for mining the dataset in parallel; thus its performance is superior to CAR-Miner. PMCAR with the independent branch strategy is always the fastest of all tested algorithms. For example, consider the Connect-4 dataset with minSup = 65%. Independent branch consumed only 1,776.892 s to finish its work while shared Obidset, shared branch, and CAR-Miner consumed 1,924.081 s, 2,477.279 s, and 2,772.470 s, respectively. The runtime performance of shared Obidset was worst on the Poker-hand, Chess, and Pumsb datasets; thus, we did not show it on the charts.
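The reported Connect-4 runtimes translate directly into relative speedups over the sequential baseline; a quick check of the figures quoted above:

```python
# Runtimes in seconds for Connect-4 at minSup = 65%, as reported above.
runtimes = {
    "CAR-Miner": 2772.470,
    "PMCAR-Shared Branch": 2477.279,
    "PMCAR-Shared Obidset": 1924.081,
    "PMCAR-Independent Branch": 1776.892,
}

baseline = runtimes["CAR-Miner"]
# Speedup of each variant relative to the sequential baseline.
speedups = {name: round(baseline / t, 2) for name, t in runtimes.items()}
```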
8. Conclusions and future work

In this paper, we have proposed three strategies for parallel mining of class association rules on the multi-core processor architecture. Unlike sequential CAR mining algorithms, our parallel algorithm distributes the process of generating frequent itemsets and rules to multiple tasks executed on multiple cores. The framework of the proposed method is based on our previous sequential CAR mining method and three parallel strategies: independent branch, shared branch, and shared Obidset. The time complexities of both the sequential and parallel CAR mining algorithms have been analyzed, with results showing the benefit of the proposed algorithm: a speedup of up to t × c can be achieved in theory. We have also theoretically proven that the execution time of our parallel CAR mining algorithm is shorter than those of existing parallel CAR mining algorithms. Additionally, a series of experiments has been conducted on both real and synthetic datasets. The experimental results have also shown that the three proposed parallel methods are competitive with the sequential CAR mining method. However, the first and third strategies currently consume higher
1
Executable files of the CAR-Miner and PMCAR algorithms and experimental
datasets can be downloaded from />


[Figure: two charts for the Chess dataset over minSup values from 55% down to 30%.]
(a) # CARs produced. (b) Runtime for PMCAR and CAR-Miner.
Fig. 16. Comparative results between PMCAR and CAR-Miner for the Chess dataset with various minSup values.

[Figure: two charts for the Connect-4 dataset over minSup values from 90% down to 65%.]
(a) # CARs produced. (b) Runtime for PMCAR and CAR-Miner.
Fig. 17. Comparative results between PMCAR and CAR-Miner for the Connect-4 dataset with various minSup values.

[Figure: two charts for the Pumsb dataset over minSup values from 90% down to 65%.]
(a) # CARs produced. (b) Runtime for PMCAR and CAR-Miner.
Fig. 18. Comparative results between PMCAR and CAR-Miner for the Pumsb dataset with various minSup values.

memory than the sequential counterpart, which makes them unable to cope with very dense datasets. Thus, we will study how to reduce the memory consumption of these strategies in future work. We will also investigate the applicability of the proposed methods to other platforms such as multiple graphics processors or the cloud.
Acknowledgements
This work was funded by Vietnam’s National Foundation for

Science and Technology Development (NAFOSTED) under Grant
No. 102.01-2012.17.

References
Agrawal, R., & Shafer, J. (1996). Parallel mining of association rules. IEEE Transactions
on Knowledge and Data Engineering, 8, 962–969.
Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules in
large databases. In The 20th International Conference on Very Large Data Bases
(pp. 487–499). Morgan Kaufmann Publishers Inc.
Andrew, B. (2008). Multi-core processor architecture explained. In http://software.intel.com/en-us/articles/multi-core-processor-architecture-explained: Intel.
Baralis, E., Chiusano, S., & Garza, P. (2004). On support thresholds in associative
classification. In The 2004 ACM Symposium on Applied Computing (pp. 553–558).
ACM.


Baralis, E., Chiusano, S., & Garza, P. (2008). A lazy approach to associative
classification. IEEE Transactions on Knowledge and Data Engineering, 20, 156–171.
Cagliero, L., & Garza, P. (2013). Improving classification models with taxonomy
information. Data & Knowledge Engineering, 86, 85–101.
Casali, A., & Ernst, C. (2013). Extracting correlated patterns on multicore
architectures. Availability, Reliability, and Security in Information Systems and
HCI (Vol. 8127, pp. 118–133). Springer.
Chen, W.-C., Hsu, C.-C., & Hsu, J.-N. (2012). Adjusting and generalizing CBA
algorithm to handling class imbalance. Expert Systems with Applications, 39,
5907–5919.
Chen, G., Liu, H., Yu, L., Wei, Q., & Zhang, X. (2006). A new approach to classification
based on association rule mining. Decision Support Systems, 42, 674–689.
Coenen, F. (2007). Test set generator (version 3.2). In />KDD/Software/LUCS-KDD-DataGen/generator.html.

Han, J., Pei, J., & Yin, Y. (2000). Mining frequent patterns without candidate
generation. ACM SIGMOD Record (Vol. 29, pp. 1–12). ACM.
Laurent, A., Négrevergne, B., Sicard, N., & Termier, A. (2012). Efficient parallel
mining of gradual patterns on multicore processors. Advances in Knowledge
Discovery and Management (Vol. 398, pp. 137–151). Springer.
Li, W., Han, J., & Pei, J. (2001). CMAR: Accurate and efficient classification based on
multiple class-association rules. In IEEE International Conference on Data Mining
(ICDM 2001) (pp. 369–376). IEEE.
Liu, B., Hsu, W., & Ma, Y. (1998). Integrating classification and association rule
mining. In The 4th International Conference on Knowledge Discovery and Data
Mining (KDD 1998) (pp. 80–86).
Liu, L., Li, E., Zhang, Y., & Tang, Z. (2007). Optimization of frequent itemset mining on
multiple-core processor. In The 33rd International Conference on Very Large Data
Bases (pp. 1275–1285). VLDB Endowment.
Liu, H., Liu, L., & Zhang, H. (2011). A fast pruning redundant rule method using
Galois connection. Applied Soft Computing, 11, 130–137.
Mokeddem, D., & Belbachir, H. (2010). A distributed associative classification
algorithm. Intelligent Distributed Computing IV (Vol. 315, pp. 109–118). Springer.
Negrevergne, B., Termier, A., Méhaut, J.-F., & Uno, T. (2010). Discovering closed
frequent itemsets on multicore: Parallelizing computations and optimizing
memory accesses. In International Conference on High Performance Computing
and Simulation (HPCS 2010) (pp. 521–528). IEEE.
Negrevergne, B., Termier, A., Rousset, M.-C., & Méhaut, J.-F. (2013). ParaMiner: a generic pattern mining algorithm for multi-core architectures. Data Mining and Knowledge Discovery, 1–41.
Netzer, R., & Miller, B. (1989). Detecting data races in parallel program executions.
University of Wisconsin-Madison.
Nguyen, D., & Vo, B. (2014). Mining class-association rules with constraints.
Knowledge and Systems Engineering (Vol. 245, pp. 307–318). Springer.



Nguyen, L. T., Vo, B., Hong, T.-P., & Thanh, H. C. (2012). Classification based on
association rules: A lattice-based approach. Expert Systems with Applications, 39,
11357–11366.
Nguyen, L. T., Vo, B., Hong, T.-P., & Thanh, H. C. (2013). CAR-Miner: An efficient
algorithm for mining class-association rules. Expert Systems with Applications,
40, 2305–2311.
Quinlan, J. R. (1993). C4.5: Programs for machine learning. Morgan Kaufmann Publishers Inc.
Schlegel, B., Karnagel, T., Kiefer, T., & Lehner, W. (2013). Scalable frequent itemset
mining on many-core processors. In The 9th International Workshop on Data
Management on New Hardware. ACM. Article No. 3.
Skillicorn, D. (1999). Strategies for parallel data mining. IEEE Concurrency, 7, 26–35.
Tatikonda, S., & Parthasarathy, S. (2009). Mining tree-structured data on multicore
systems. Proceedings of the VLDB Endowment, 2, 694–705.
Thabtah, F., Cowling, P., & Peng, Y. (2004). MMAC: A new multi-class, multi-label
associative classification approach. In The 4th IEEE International Conference on
Data Mining (ICDM 2004) (pp. 217–224). IEEE.
Thabtah, F., Cowling, P., & Peng, Y. (2005). MCAR: multi-class classification based on
association rule. In The 3rd ACS/IEEE international conference on computer systems
and applications (pp. 33–39). IEEE.
Thakur, G., & Ramesh, C. J. (2008). A framework for fast classification algorithms.
International Journal Information Theories and Applications, 15, 363–369.
Tolun, M., & Abu-Soud, S. (1998). ILA: An inductive learning algorithm for rule
extraction. Expert Systems with Applications, 14, 361–370.
Tolun, M., Sever, H., Uludag, M., & Abu-Soud, S. (1999). ILA-2: An inductive learning
algorithm for knowledge discovery. Cybernetics & Systems, 30, 609–628.
Vo, B., & Le, B. (2009). A novel classification algorithm based on association rules mining. Knowledge Acquisition: Approaches, Algorithms and Applications (Vol. 5465, pp. 61–75). Springer.
Xu, X., Han, G., & Min, H. (2004). A novel algorithm for associative classification of image blocks. In The 4th International Conference on Computer and Information Technology (CIT 2004) (pp. 46–51). IEEE.
Yin, X., & Han, J. (2003). CPAR: Classification based on predictive association rules.
The 3rd SIAM International Conference on Data Mining (SDM 2003) (Vol. 3,
pp. 331–335). SIAM.
Yu, K.-M., & Wu, S.-H. (2011). An efficient load balancing multi-core frequent
patterns mining algorithm. In The IEEE 10th International Conference on Trust,
Security and Privacy in Computing and Communications (TrustCom 2011)
(pp. 1408–1412). IEEE.
Zaki, M., Parthasarathy, S., Ogihara, M., & Li, W. (1997). New algorithms for fast
discovery of association rules. In The 3rd international conference on knowledge
discovery and data mining (Vol. 20, pp. 283–286).
Zhao, M., Cheng, X., & He, Q. (2009). An algorithm of mining class association rules.
Advances in Computation and Intelligence (Vol. 5821, pp. 269–275). Springer.


