
Engineering Applications of Artificial Intelligence 56 (2016) 121–130

Contents lists available at ScienceDirect

Engineering Applications of Artificial Intelligence
journal homepage: www.elsevier.com/locate/engappai

Picture fuzzy clustering for complex data
Pham Huy Thong, Le Hoang Son
VNU University of Science, Vietnam National University, 334 Nguyen Trai, Thanh Xuan, Hanoi, Viet Nam

Article info

Article history:
Received 24 April 2016
Received in revised form 9 August 2016
Accepted 9 August 2016

Abstract

Fuzzy clustering is a useful segmentation tool that has been widely used in many real-life applications such as pattern recognition, recommender systems and forecasting. The fuzzy clustering algorithm on picture fuzzy sets (FC-PFS) is an advanced fuzzy clustering algorithm constructed on the basis of picture fuzzy sets, in which three membership degrees, namely the positive, the neutral and the refusal degrees, are combined with an entropy component in the objective function to handle the problem of incomplete modeling in fuzzy clustering. A disadvantage of FC-PFS is its limited capability to handle complex data, i.e., data of mixed types (categorical and numerical) and distinctly structured data. In this paper, we propose a novel picture fuzzy clustering algorithm for complex data, called PFCA-CD, that deals with both mixed data types and distinct data structures. The idea of this method is to modify FC-PFS with a new measurement for categorical attributes, multiple centers per cluster and an evolutionary strategy, particle swarm optimization. Experiments indicate that the proposed algorithm achieves better clustering quality than other methods on clustering validity indices.
© 2016 Elsevier Ltd. All rights reserved.

Keywords:
Complex data
Distinct structured data
Fuzzy clustering
Mixed data types
Picture fuzzy clustering

1. Introduction
Fuzzy clustering partitions a dataset into clusters in which each element can belong to all clusters with different membership values (Bezdek et al., 1984). Fuzzy clustering was first introduced by Bezdek et al. (1984) under the name "Fuzzy C-Means (FCM)". This algorithm is based on the idea of K-Means clustering, with membership values attached to the objective function in order to partition all data elements into appropriate groups (Chen et al., 2016). FCM is more flexible than the K-Means algorithm, especially on overlapping and uncertain datasets (Bezdek et al., 1984). Moreover, FCM has many applications in real-life problems such as pattern recognition, recommender systems and forecasting (Son et al., 2012a, 2012b; Son et al., 2013, 2014; Thong and Son, 2014; Son, 2014a, 2014b; Son and Thong, 2015; Thong and Son, 2015; Son, 2015b, 2015c, 2016; Son and Tuan, 2016; Son and Hai, 2016; Wijayanto et al., 2016; Tuan et al., 2016; Thong et al., 2016; Tuan et al., 2016).
However, FCM still has some limitations regarding clustering quality, hesitation, noise and outliers (Ferreira and de Carvalho, 2012; De Carvalho et al., 2013; Thong and Son, 2016b). Much research has been proposed to overcome these limitations; one direction is to rebuild FCM on advanced fuzzy sets such as the

type-2 fuzzy sets (Mendel and John, 2002), intuitionistic fuzzy sets (Atanassov, 1986) and picture fuzzy sets (Cuong, 2014). The fuzzy clustering algorithm on PFS (FC-PFS) (Son, 2015a; Thong and Son, 2016b) is an extension of FCM in which the three membership degrees of picture fuzzy sets, namely the positive, the neutral and the refusal degrees, are combined with an entropy component in the objective function to handle the problem of incomplete modeling in FCM (Yang et al., 2004). FC-PFS was shown to have better accuracy than other fuzzy clustering schemes in the corresponding articles (Son, 2015a; Thong and Son, 2016b).

Nonetheless, a remark on the working flow of the FC-PFS algorithm, drawn from our experiments on various types of datasets, is its inefficiency in processing complex data, which include mixed data types and distinctly structured data. Mixed data combine categorical and numerical attributes and can be processed effectively only by methods equipped with suitable kernel functions (Ferreira and de Carvalho, 2012). Distinctly structured data contain non-spherical structures, such as data scattered along a line or in a ring, which prevent clustering algorithms from partitioning data elements into the exact clusters. Most fuzzy clustering methods, including FC-PFS, find it hard to deal with complex data. There has been much research on developing new fuzzy clustering algorithms that employ dissimilarity distances and kernel functions to cope with complex data (Cominetti et al., 2010; Hwang, 1998; Ji et al., 2013, 2012). However, these approaches solved either mixed data types or distinct data structures, but not both, and this

leaves the motivation for this paper.
In this paper, we propose a novel picture fuzzy clustering algorithm for complex data, called PFCA-CD, that deals with both mixed data types and distinct data structures. The idea of this method is to modify FC-PFS with a new measurement for categorical attributes, multiple centers per cluster and an evolutionary strategy, particle swarm optimization. Experiments indicate that the proposed algorithm achieves better clustering quality than other methods on clustering validity indices.

The rest of the paper is organized as follows. Section 2 describes the background with a literature review and some particular fuzzy clustering methods for complex data. Section 3 presents our proposed method. Section 4 validates the method on benchmark UCI datasets. Finally, conclusions and further work are covered in the last section.

2. Background

In this section, we first give an overview of the relevant methods for clustering complex data in Section 2.1. Sections 2.2 and 2.3 review two typical methods of this approach.
2.1. Literature review

The related work on clustering complex data is divided into two groups: mixed types of data, including categorical and numerical data, and distinct structures of data (Fig. 1).
In the first group, there has been much research on clustering both categorical and numerical data. Hwang (1998) extended the k-means algorithm for clustering large datasets including categorical values. Yang et al. (2004) used fuzzy clustering algorithms to partition mixed feature variables by giving a modified dissimilarity measure for symbolic and fuzzy data. Ji et al. (2012, 2013) proposed fuzzy k-prototype clustering algorithms that combine the mean and a fuzzy centroid to represent the prototype of a cluster and employ a new measure based on the co-occurrence of values to evaluate the dissimilarity between data objects and cluster prototypes. Chen et al. (2016) presented soft subspace clustering of categorical data using a novel soft feature-selection scheme in which each categorical attribute is automatically assigned a weight that correlates with the smoothed dispersion of the categories in a cluster. A series of methods based on multiple dissimilarity matrices to handle mixed data was introduced by De Carvalho et al. (2013). The main idea of these methods was to obtain a collaborative role of the different dissimilarity matrices in order to reach a final consensus partition. Although these methods can partition mixed data efficiently, they find it difficult to handle complex, distinctly structured data.
In the second group, many researchers have tried to partition data with complex structure, i.e., with the intrinsic geometry of non-spherical and non-convex clusters. Cominetti et al. (2010) proposed a method called DifFuzzy, combining ideas from FCM and diffusion on graphs to handle the problem of clusters with a complex nonlinear geometric structure. This method is applicable to a larger class of clustering problems and does not require any prior information on the number of clusters. Ferreira and de Carvalho (2012) presented kernel fuzzy clustering methods based on local adaptive distances to partition complex data. The main idea of these methods is a local adaptive distance in which dissimilarity measures are obtained as sums of the Euclidean distances between patterns and centroids, computed individually for each variable by means of kernel functions. The dissimilarity measure is utilized to learn the weights of the variables during the clustering process, which improves the performance of the algorithms. However, this method can deal with numerical data only.

It has been shown that the DifFuzzy algorithm (Cominetti et al., 2010) and the fuzzy clustering algorithm based on multiple dissimilarity matrices (Dissimilarity) (De Carvalho et al., 2013) are two typical clustering methods, one in each group. Therefore, we analyze these methods in more detail in the next sections.

[Fig. 1. Classification of methods dealing with complex data: mixed data types (categorical and numerical data) and distinct structures of data (different distributions of data).]
2.2. DifFuzzy

The DifFuzzy clustering algorithm (Cominetti et al., 2010) is based on FCM and diffusion on graphs, and partitions the dataset into clusters with a complex nonlinear geometric structure. Firstly, the auxiliary function

F(\sigma)\colon (0, \infty) \to \mathbb{N}    (1)

is defined, where \sigma \in (0, \infty) is a positive number. The i-th and j-th nodes are connected by an edge if \|X_i - X_j\| < \sigma. F(\sigma) is equal to the number of components of the \sigma-neighborhood graph which contain at least M vertices, where M is the mandatory parameter of DifFuzzy. F(\sigma) begins from zero, increases to its maximum value, and then settles back down to a value of 1. The number of clusters is

C = \max_{\sigma \in (0, \infty)} F(\sigma).    (2)

The weights are defined as

\omega_{i,j}(\beta) = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are hard points in the same core cluster}, \\ \exp\!\left( -\dfrac{\|X_i - X_j\|^2}{\beta} \right) & \text{otherwise}, \end{cases}    (3)

where \beta is a positive real number. The function L(\beta)\colon (0, \infty) \to (0, \infty) is:

L(\beta) = \sum_{i=1}^{N} \sum_{j=1}^{N} \omega_{i,j}(\beta).    (4)

It has two well-defined limits:

\lim_{\beta \to 0} L(\beta) = N + \sum_{i=1}^{C} n_i (n_i - 1) \quad \text{and} \quad \lim_{\beta \to \infty} L(\beta) = N^2,    (5)

where n_i corresponds to the number of points in the i-th core cluster. DifFuzzy chooses \beta between these limits by finding the \beta^* which satisfies the relation:

L(\beta^*) = (1 - \gamma_1)\left( N + \sum_{i=1}^{C} n_i (n_i - 1) \right) + \gamma_1 N^2,    (6)

where \gamma_1 \in (0, 1) is an internal parameter of the method; its default value is 0.3. Then the auxiliary matrices are defined as follows:

W = W(\beta^*).    (7)


The matrix D is defined as the diagonal matrix with diagonal elements

D_{i,i} = \sum_{j=1}^{N} \omega_{i,j}, \quad i = 1, 2, \ldots, N,    (8)

where \omega_{i,j} are the entries of the matrix W. Finally, the matrix P is defined as

P = I + \frac{\gamma_2}{\max_i D_{i,i}} \left[ W - D \right],    (9)

where I \in \mathbb{R}^{N \times N} is the identity matrix and \gamma_2 is an internal parameter of DifFuzzy; its default value is 0.1. DifFuzzy also computes an auxiliary integer parameter \alpha by

\alpha = \left\lfloor \frac{\gamma_3}{\log \lambda_2} \right\rfloor,    (10)

where \lambda_2 denotes the second largest eigenvalue of P and \lfloor \cdot \rfloor denotes the integer part. In order to compute the diffusion distance between a soft point X_s and the c-th cluster, the following formula is used:

\mathrm{dist}(X_s, c) = \| P^{\alpha} e - P^{\alpha} e^{(c)} \|,    (11)

where e(j) = 1 if j = s and e(j) = 0 otherwise, and e^{(c)} is the analogous vector of the c-th core cluster. Finally, the membership value of the soft point X_s in the c-th cluster, u_c(X_s), is determined as

u_c(X_s) = \frac{\mathrm{dist}(X_s, c)^{-1}}{\sum_{l=1}^{C} \mathrm{dist}(X_s, l)^{-1}}.    (12)

This procedure is applied to every soft data point X_s and every cluster c \in \{1, 2, \ldots, C\}. The output of DifFuzzy is a number of clusters (C) and, for each data point, a set of C numbers that represent its degree of membership in each cluster. The membership value of X_i, i = 1, 2, \ldots, N, in the c-th cluster, c = 1, 2, \ldots, C, is denoted by u_c(X_i). The degree of membership is a number between 0 and 1, where values close to 1 correspond to points that are very likely to belong to that cluster. The sum of the membership values of a data point over all clusters is always one.
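To make Eqs. (3) and (8)-(12) concrete, here is a minimal Python sketch of the diffusion step. It is an illustration under our own simplifications, not the authors' implementation: the core clusters are taken as given (DifFuzzy builds them from the sigma-neighborhood graph of Eq. (1)), alpha is passed in directly instead of being derived via Eq. (10), and the normalized indicator vector used for each cluster in Eq. (11) is our assumption.

import numpy as np

def diffuzzy_membership(X, core, beta, gamma2=0.1, alpha=10):
    """Illustrative sketch of DifFuzzy's diffusion step (Eqs. (3), (8)-(12)).

    X    : (N, d) data matrix.
    core : list of index arrays, one per core cluster (assumed given here).
    """
    N = len(X)
    # Eq. (3): Gaussian weights, overridden for pairs in the same core cluster.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / beta)
    for idx in core:
        W[np.ix_(idx, idx)] = 1.0
    D = np.diag(W.sum(axis=1))                      # Eq. (8)
    P = np.eye(N) + gamma2 / D.max() * (W - D)      # Eq. (9)
    Pa = np.linalg.matrix_power(P, alpha)           # diffusion over alpha steps
    u = np.zeros((N, len(core)))
    for s in range(N):
        e = np.zeros(N); e[s] = 1.0
        dist = []
        for idx in core:
            # assumption: the cluster vector in Eq. (11) is a normalized indicator
            ec = np.zeros(N); ec[idx] = 1.0 / len(idx)
            dist.append(np.linalg.norm(Pa @ e - Pa @ ec))   # Eq. (11)
        inv = 1.0 / np.array(dist)
        u[s] = inv / inv.sum()                      # Eq. (12)
    return u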

2.3. Dissimilarity

The Dissimilarity algorithm (De Carvalho et al., 2013) is a fuzzy K-medoids with a relevance weight for each dissimilarity matrix, consisting of the five steps below.

2.3.1. Initialization
Fix C (the number of clusters), 2 \le C \ll n; fix m, 1 < m < +\infty; fix s, 1 \le s < +\infty; fix T (an iteration limit); fix \varepsilon > 0, \varepsilon \ll 1. Fix the cardinality 1 \le q \ll n of the prototypes G_k (k = 1, \ldots, C). Set t = 0. Set \lambda_k^{(0)} = (\gamma_{k1}^{(0)}, \ldots, \gamma_{kr}^{(0)}) = (1, \ldots, 1) or \lambda_k^{(0)} = (1/r, \ldots, 1/r), k = 1, \ldots, C. Randomly select C distinct prototypes G_k^{(0)} \in E^{(q)} (k = 1, \ldots, C). For each object e_i (i = 1, \ldots, n), compute its membership degree u_{ik}^{(0)} (k = 1, \ldots, C) in the fuzzy cluster C_k:

u_{ik}^{(0)} = \left[ \sum_{h=1}^{C} \left( \frac{ D_{(\lambda_k^{(0)}, s)}(e_i, G_k^{(0)}) }{ D_{(\lambda_h^{(0)}, s)}(e_i, G_h^{(0)}) } \right)^{\frac{1}{m-1}} \right]^{-1} = \left[ \sum_{h=1}^{C} \left( \frac{ \sum_{j=1}^{r} (\gamma_{kj}^{(0)})^s \sum_{e \in G_k^{(0)}} d_j(e_i, e) }{ \sum_{j=1}^{r} (\gamma_{hj}^{(0)})^s \sum_{e \in G_h^{(0)}} d_j(e_i, e) } \right)^{\frac{1}{m-1}} \right]^{-1},    (13)

J^{(0)} = \sum_{k=1}^{C} \sum_{i=1}^{n} (u_{ik}^{(0)})^m D_{(\lambda_k^{(0)}, s)}(e_i, G_k^{(0)}) = \sum_{k=1}^{C} \sum_{i=1}^{n} (u_{ik}^{(0)})^m \sum_{j=1}^{r} (\gamma_{kj}^{(0)})^s \sum_{e \in G_k^{(0)}} d_j(e_i, e).    (14)

2.3.2. Compute the best prototypes
Set t = t + 1. The vector of relevance weights \Lambda^{(t-1)} = (\lambda_1^{(t-1)}, \ldots, \lambda_C^{(t-1)}) and the fuzzy partition represented by U^{(t-1)} = (u_1^{(t-1)}, \ldots, u_n^{(t-1)}) are fixed. The prototype G_k^{(t)} = G^* \in E^{(q)} of fuzzy cluster C_k (k = 1, \ldots, C) is calculated according to the following proposition: the prototype G_k = G^* \in E^{(q)} of fuzzy cluster C_k is chosen to minimize the clustering criterion J: \sum_{i=1}^{n} (u_{ik})^m \sum_{j=1}^{r} (\gamma_{kj})^s D_j(e_i, G^*) \to \min.

2.3.3. Compute the best relevance weight vector
When the vector of prototypes G^{(t)} = (G_1^{(t)}, \ldots, G_C^{(t)}) and the fuzzy partition represented by U^{(t-1)} = (u_1^{(t-1)}, \ldots, u_n^{(t-1)}) are fixed, the components \gamma_{kj}^{(t)} (j = 1, \ldots, r) of the relevance weight vector \lambda_k^{(t)} (k = 1, \ldots, C) are computed as in Eq. (15) or (17) if the matching function is given by Eq. (16) or (18), respectively:

\gamma_{kj} = \frac{ \left\{ \prod_{h=1}^{r} \left[ \sum_{i=1}^{n} (u_{ik})^m D_h(e_i, G_k) \right] \right\}^{\frac{1}{r}} }{ \sum_{i=1}^{n} (u_{ik})^m D_j(e_i, G_k) } = \frac{ \left\{ \prod_{h=1}^{r} \left[ \sum_{i=1}^{n} (u_{ik})^m \sum_{e \in G_k} d_h(e_i, e) \right] \right\}^{\frac{1}{r}} }{ \sum_{i=1}^{n} (u_{ik})^m \sum_{e \in G_k} d_j(e_i, e) },    (15)

D_{(\lambda_k)}(e_i, G_k) = \sum_{j=1}^{r} \gamma_{kj} D_j(e_i, G_k) = \sum_{j=1}^{r} \gamma_{kj} \sum_{e \in G_k} d_j(e_i, e),    (16)

\gamma_{kj} = \left[ \sum_{h=1}^{r} \left( \frac{ \sum_{i=1}^{n} (u_{ik})^m \sum_{e \in G_k} d_j(e_i, e) }{ \sum_{i=1}^{n} (u_{ik})^m \sum_{e \in G_k} d_h(e_i, e) } \right)^{\frac{1}{s-1}} \right]^{-1},    (17)

D_{(\lambda_k, s)}(e_i, G_k) = \sum_{j=1}^{r} (\gamma_{kj})^s D_j(e_i, G_k) = \sum_{j=1}^{r} (\gamma_{kj})^s \sum_{e \in G_k} d_j(e_i, e).    (18)



2.3.4. Define the best fuzzy partition
The vector of prototypes G^{(t)} = (G_1^{(t)}, \ldots, G_C^{(t)}) and the vector of relevance weights \Lambda^{(t)} = (\lambda_1^{(t)}, \ldots, \lambda_C^{(t)}) are fixed. The membership degree u_{ik}^{(t)} of object e_i (i = 1, \ldots, n) in fuzzy cluster C_k (k = 1, \ldots, C) is calculated as in Eq. (19):

u_{ik}^{(t)} = \left[ \sum_{h=1}^{C} \left( \frac{ D_{(\lambda_k^{(t)}, s)}(e_i, G_k^{(t)}) }{ D_{(\lambda_h^{(t)}, s)}(e_i, G_h^{(t)}) } \right)^{\frac{1}{m-1}} \right]^{-1} = \left[ \sum_{h=1}^{C} \left( \frac{ \sum_{j=1}^{r} (\gamma_{kj}^{(t)})^s \sum_{e \in G_k^{(t)}} d_j(e_i, e) }{ \sum_{j=1}^{r} (\gamma_{hj}^{(t)})^s \sum_{e \in G_h^{(t)}} d_j(e_i, e) } \right)^{\frac{1}{m-1}} \right]^{-1}.    (19)
2.3.5. Stopping criterion
Compute:

J^{(t)} = \sum_{k=1}^{C} \sum_{i=1}^{n} (u_{ik}^{(t)})^m D_{(\lambda_k^{(t)}, s)}(e_i, G_k^{(t)}) = \sum_{k=1}^{C} \sum_{i=1}^{n} (u_{ik}^{(t)})^m \sum_{j=1}^{r} (\gamma_{kj}^{(t)})^s \sum_{e \in G_k^{(t)}} d_j(e_i, e).    (20)

If |J^{(t)} - J^{(t-1)}| \le \varepsilon or t \ge T, then STOP; otherwise go to step 2.3.2.
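As an illustration of the alternating updates in Sections 2.3.3 and 2.3.4, the following Python sketch computes the relevance weights of Eq. (17) and the memberships of Eq. (19) from a stack of precomputed dissimilarity matrices. The function name and the dense-matrix representation are our own assumptions, and no guard is included for objects that coincide with a medoid (zero dissimilarity).

import numpy as np

def update_weights_memberships(Dmats, U, prototypes, m=2.0, s=2.0):
    """One pass of Eqs. (17) and (19) for r dissimilarity matrices.

    Dmats      : (r, n, n) stack of dissimilarity matrices d_j(e_i, e_l).
    U          : (n, C) current membership degrees u_ik.
    prototypes : list of C index arrays G_k (medoid sets).
    """
    r, n, _ = Dmats.shape
    C = U.shape[1]
    # D_j(e_i, G_k) = sum of d_j(e_i, e) over the medoids e in G_k.
    Dj = np.stack([[Dmats[j][:, G].sum(axis=1) for G in prototypes]
                   for j in range(r)])             # shape (r, C, n)
    # Eq. (17): relevance weight of matrix j for cluster k.
    num = np.einsum('ik,jki->jk', U ** m, Dj)      # sum_i u_ik^m D_j(e_i, G_k)
    gamma = 1.0 / np.array([[((num[j, k] / num[:, k]) ** (1.0 / (s - 1))).sum()
                             for k in range(C)] for j in range(r)])
    # Eq. (19): membership of object i in cluster k.
    Dk = np.einsum('jk,jki->ki', gamma ** s, Dj)   # D_(lambda_k, s)(e_i, G_k)
    Unew = 1.0 / np.array([[((Dk[k, i] / Dk[:, i]) ** (1.0 / (m - 1))).sum()
                            for k in range(C)] for i in range(n)])
    return gamma, Unew

In the full algorithm, this pair of updates alternates with the medoid search of step 2.3.2 until the criterion J^{(t)} stabilizes.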

2.4. Particle swarm optimization

Particle Swarm Optimization (PSO), first introduced by Eberhart and Kennedy (1995), is an evolutionary strategy that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. It simulates the movement of organisms in a bird flock or fish school searching for food. Suppose there are popsize particles in the swarm; each of them represents one solution of the problem and is encoded by its location (loc) and velocity (vec). The PSO procedure consists of these steps: initializing the swarm, calculating the fitness values and updating the particles. Firstly, the location and velocity of each particle are initialized randomly. Secondly, the quality of each particle is measured by a fitness value; depending on the demands of the problem, the fitness value is designed to assess the quality of a solution. Finally, the update process is given in Eqs. (21) and (22):

vec_i = vec_i + C_1 \cdot rand \cdot (loc_{Pbest} - loc_i) + C_2 \cdot rand \cdot (loc_{Gbest} - loc_i),    (21)

loc_i = vec_i + loc_i,    (22)

where C_1, C_2 \ge 0 are PSO parameters, often set to 1; loc_{Pbest} is the location of the best solution that particle i has found so far; and loc_{Gbest} is the location of the best solution the whole swarm has found so far. The whole process is repeated until the number of iterations is reached or the best solution has not changed over two consecutive steps. Details of this method are described in Fig. 2.

[Fig. 2. Schema of the PSO algorithm.]
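To make the update rules in Eqs. (21) and (22) concrete, the following minimal Python sketch applies them to a generic minimization problem. The function and parameter names (pso, popsize, c1, c2) and the [0, 1]^dim search box are our own illustration, not code from the paper.

import numpy as np

def pso(fitness, dim, popsize=30, iters=100, c1=1.0, c2=1.0, seed=0):
    """Minimize `fitness` over [0, 1]^dim with basic PSO (Eqs. (21)-(22))."""
    rng = np.random.default_rng(seed)
    loc = rng.random((popsize, dim))                  # particle locations
    vec = rng.uniform(-0.1, 0.1, (popsize, dim))      # particle velocities
    pbest = loc.copy()                                # best location per particle
    pbest_val = np.array([fitness(p) for p in loc])
    gbest = pbest[np.argmin(pbest_val)].copy()        # best location of the swarm
    for _ in range(iters):
        r1, r2 = rng.random((popsize, 1)), rng.random((popsize, 1))
        vec = vec + c1 * r1 * (pbest - loc) + c2 * r2 * (gbest - loc)  # Eq. (21)
        loc = loc + vec                                                # Eq. (22)
        vals = np.array([fitness(p) for p in loc])
        better = vals < pbest_val                     # update personal bests
        pbest[better], pbest_val[better] = loc[better], vals[better]
        gbest = pbest[np.argmin(pbest_val)].copy()    # update global best
    return gbest, pbest_val.min()

# Example: minimize a simple quadratic bowl
best, val = pso(lambda x: float(np.sum((x - 0.5) ** 2)), dim=5)

Eqs. (38) and (39) in Section 3.3 apply exactly this kind of update to the membership matrices instead of generic locations.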


3. The proposed method

Based on the FC-PFS algorithm (Thong and Son, 2016a), a new picture fuzzy clustering method for complex data (PFCA-CD) is presented. The idea of this method is to overcome complex structures and mixed data by enhancing FC-PFS with multiple centers, a measurement for categorical attributes and an evolutionary strategy, Particle Swarm Optimization (PSO) (Eberhart and Kennedy, 1995). Therein, multiple centers are used to deal with complex data structures, because data with complex structures take many different shapes that cannot be represented by a single center. The centers are selected from among the data elements themselves, because categorical data admit no averaged center. Moreover, categorical attributes cannot be measured in the same way as numerical attributes; therefore new measurements are used to cope with mixed data. The subsections are as follows: Section 3.1 describes the details of FC-PFS, Section 3.2 proposes a new measurement for categorical attributes and Section 3.3 presents the PFCA-CD algorithm, accompanied by remarks in Section 3.4.

3.1. Fuzzy clustering on picture fuzzy sets

Definition 1. A Picture Fuzzy Set (PFS) (Cuong, 2014) in a non-empty set X is

\dot{A} = \{ \langle x, \mu_{\dot{A}}(x), \eta_{\dot{A}}(x), \gamma_{\dot{A}}(x) \rangle \mid x \in X \},    (23)

where \mu_{\dot{A}}(x) is the positive degree of each element x \in X, \eta_{\dot{A}}(x) is the neutral degree and \gamma_{\dot{A}}(x) is the negative degree, satisfying the constraints

\mu_{\dot{A}}(x), \eta_{\dot{A}}(x), \gamma_{\dot{A}}(x) \in [0, 1], \quad \forall x \in X,    (24)

0 \le \mu_{\dot{A}}(x) + \eta_{\dot{A}}(x) + \gamma_{\dot{A}}(x) \le 1, \quad \forall x \in X.    (25)

The refusal degree of an element is:

\xi_{\dot{A}}(x) = 1 - (\mu_{\dot{A}}(x) + \eta_{\dot{A}}(x) + \gamma_{\dot{A}}(x)), \quad \forall x \in X.    (26)

Based on the theory of picture fuzzy sets, Thong and Son (2016a) proposed a picture fuzzy model for the clustering problem, called FC-PFS, which was proven to achieve better clustering quality than other relevant methods. Suppose there is a dataset X consisting of N data points in r dimensions. The objective function for dividing the dataset into C groups is:

J = \sum_{k=1}^{N} \sum_{j=1}^{C} (\mu_{kj}(2 - \xi_{kj}))^m \|x_k - V_j\|^2 + \sum_{k=1}^{N} \sum_{j=1}^{C} \eta_{kj} (\log \eta_{kj} + \xi_{kj}) \to \min.    (27)

Some constraints are defined as follows:

\mu_{kj}, \eta_{kj}, \xi_{kj} \in [0, 1],    (28)

\mu_{kj} + \eta_{kj} + \xi_{kj} \le 1,    (29)

\sum_{j=1}^{C} \mu_{kj}(2 - \xi_{kj}) = 1,    (30)

\sum_{j=1}^{C} \left( \eta_{kj} + \frac{\xi_{kj}}{C} \right) = 1, \quad k = \overline{1, N}.    (31)

By using the Lagrangian method, the authors determined the optimal solutions of the model in Eqs. (32)-(35):

\xi_{kj} = 1 - (\mu_{kj} + \eta_{kj}) - (1 - (\mu_{kj} + \eta_{kj}))^{\alpha}, \quad (k = \overline{1, N},\ j = \overline{1, C}),    (32)

where \alpha \in (0, 1] is an exponent coefficient used to control the refusal degree in PFS sets;

\mu_{kj} = \frac{1}{ \sum_{i=1}^{C} (2 - \xi_{ki}) \left( \frac{\|x_k - V_j\|}{\|x_k - V_i\|} \right)^{\frac{2}{m-1}} }, \quad (k = \overline{1, N},\ j = \overline{1, C}),    (33)

\eta_{kj} = \frac{ e^{-\xi_{kj}} }{ \sum_{i=1}^{C} e^{-\xi_{ki}} } \left( 1 - \frac{1}{C} \sum_{i=1}^{C} \xi_{ki} \right), \quad (k = \overline{1, N},\ j = \overline{1, C}),    (34)

V_j = \frac{ \sum_{k=1}^{N} (\mu_{kj}(2 - \xi_{kj}))^m x_k }{ \sum_{k=1}^{N} (\mu_{kj}(2 - \xi_{kj}))^m }, \quad (j = \overline{1, C}).    (35)

Details of FC-PFS are given in Table 1.

Table 1. FC-PFS.
I: Data X with N elements in r dimensions; number of clusters C; fuzzifier m; threshold \varepsilon; maximum number of iterations maxSteps > 0.
O: Matrices \mu, \eta, \xi and centers V.
FC-PFS:
1: t = 0
2: \mu_{kj}(t) \leftarrow random; \eta_{kj}(t) \leftarrow random; \xi_{kj}(t) \leftarrow random (k = \overline{1, N}, j = \overline{1, C}) satisfying Eqs. (28) and (29)
3: Repeat
4:   t = t + 1
5:   Calculate V_j(t) (j = \overline{1, C}) by Eq. (35)
6:   Calculate \mu_{kj}(t) (k = \overline{1, N}; j = \overline{1, C}) by Eq. (33)
7:   Calculate \eta_{kj}(t) (k = \overline{1, N}; j = \overline{1, C}) by Eq. (34)
8:   Calculate \xi_{kj}(t) (k = \overline{1, N}; j = \overline{1, C}) by Eq. (32)
9: Until \|\mu(t) - \mu(t-1)\| + \|\eta(t) - \eta(t-1)\| + \|\xi(t) - \xi(t-1)\| \le \varepsilon or maxSteps has been reached
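A minimal Python sketch of one FC-PFS sweep (steps 5-8 of Table 1) may make the update order concrete; the vectorized layout and the small constant guarding the distance ratios are our own assumptions.

import numpy as np

def fcpfs_step(X, mu, eta, xi, m=2.0, alpha=0.6, eps=1e-12):
    """One iteration of FC-PFS: Eqs. (35), (33), (34), (32) in Table 1's order.
    Assumes mu + eta <= 1 elementwise (Eq. (29))."""
    w = (mu * (2.0 - xi)) ** m                      # weights (N, C)
    V = (w.T @ X) / w.sum(axis=0)[:, None]          # Eq. (35): centers (C, d)
    dist = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + eps  # (N, C)
    # Eq. (33): ratio[k, j, i] = (||x_k - V_j|| / ||x_k - V_i||)^{2/(m-1)}
    ratio = (dist[:, :, None] / dist[:, None, :]) ** (2.0 / (m - 1.0))
    mu = 1.0 / ((2.0 - xi)[:, None, :] * ratio).sum(axis=2)
    # Eq. (34): share of exp(-xi), scaled by the average refusal mass.
    ex = np.exp(-xi)
    eta = ex / ex.sum(axis=1, keepdims=True) * (1.0 - xi.mean(axis=1, keepdims=True))
    # Eq. (32): xi = 1 - (mu + eta) - (1 - (mu + eta))^alpha.
    t = mu + eta
    xi = 1.0 - t - (1.0 - t) ** alpha
    return mu, eta, xi, V

Iterating this step until the change in (\mu, \eta, \xi) drops below \varepsilon reproduces the loop of Table 1.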
3.2. A new measurement for categorical attributes

Suppose that d_h(x_i, x_j) is the distance between elements x_i and x_j on attribute h (i = \overline{1, N}, j = \overline{1, N}, h = \overline{1, R}). If the h-th attribute is numerical, d_h(x_i, x_j) is calculated from the Euclidean distance. Otherwise, if the h-th attribute is categorical, d(x_{ih}, v_{jh}) is calculated by Eq. (36):

d(x_{ih}, v_{jh}) = \begin{cases} 0 & \text{if } x_{ih} = v_{jh}, \\ 1 & \text{otherwise}. \end{cases}    (36)

This means that the data input has to be normalized to the range [0, 1], with 0 corresponding to the minimum distance between two objects and 1 to the maximum one. In Eq. (36), if two categorical values are not equal, the distance between them is the maximum one.
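A hedged Python sketch of the per-attribute distance of Section 3.2: the flag vector marking categorical attributes and the min-max normalization of the numerical ones are our reading of the text, not code from the paper.

def mixed_distance(x, v, is_categorical):
    """Sum of per-attribute distances: squared difference for numerical
    attributes (assumed pre-normalized to [0, 1]) and the 0/1 matching
    distance of Eq. (36) for categorical attributes."""
    total = 0.0
    for xh, vh, cat in zip(x, v, is_categorical):
        if cat:
            total += 0.0 if xh == vh else 1.0       # Eq. (36)
        else:
            total += (float(xh) - float(vh)) ** 2
    return total

# Example: two numerical attributes and one categorical attribute
print(mixed_distance([0.2, 0.5, 'red'], [0.4, 0.5, 'blue'],
                     [False, False, True]))         # 0.04 + 0.0 + 1.0 = 1.04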


3.3. The PFCA-CD algorithm

In order to partition datasets with mixed data types and distinct data structures, we combine FC-PFS with PSO as follows. Suppose that the dataset X contains mixed numerical and categorical data with complex structure and that the number of clusters C is given. Instead of the plain iteration of FC-PFS, a PSO iteration is employed. The initial population of PSO is encoded as P = \{p_1, p_2, \ldots, p_{popsize}\}, where each particle consists of the following components:

– (\mu_{kj}, \eta_{kj}, \xi_{kj}): the positive, neutral and refusal degrees of the elements in X, respectively.
– (\mu_{Pbest_{kj}}, \eta_{Pbest_{kj}}, \xi_{Pbest_{kj}}): the positive, neutral and refusal degrees of the elements in X having the best clustering quality so far, respectively.
– V_j and V_{Pbest_j}: the sets of cluster centers corresponding to (\mu_{kj}, \eta_{kj}, \xi_{kj}) and (\mu_{Pbest_{kj}}, \eta_{Pbest_{kj}}, \xi_{Pbest_{kj}}), respectively.
– Pbest_i: the best quality value that the particle achieves.

A particle starts from given values of (\mu_{kj}, \eta_{kj}, \xi_{kj}) and tries to change them to achieve the best fitness value. The fitness value is chosen in the same way as the optimization function (27), as in Eq. (37):

Fitness = \sum_{k=1}^{N} \sum_{j=1}^{C} (\mu_{kj}(2 - \xi_{kj}))^m \|x_k - V_j\|^2 + \sum_{k=1}^{N} \sum_{j=1}^{C} \eta_{kj} (\log \eta_{kj} + \xi_{kj}).    (37)

This process can be regarded as matching the fitness value with the current status of the particle. If the achieved solutions are better than the previous ones, the local optimal solutions Pbest = (\mu_{Pbest_{kj}}, \eta_{Pbest_{kj}}, \xi_{Pbest_{kj}}, V_{Pbest_{kj}}) of the particle are recorded. Then, the evolution of the particle is made by changing the values of (\mu_{kj}, \eta_{kj}, \xi_{kj}, V_j). In the evolution, (\mu_{kj}, \eta_{kj}) are calculated by Eqs. (33)-(34) and (38)-(39) below:

\mu_{kj} = \mu_{kj} + C_1 \cdot rand \cdot (\mu_{Pbest} - \mu_{kj}) + C_2 \cdot rand \cdot (\mu_{Gbest} - \mu_{kj}), \quad (k = \overline{1, N},\ j = \overline{1, C}),    (38)

\eta_{kj} = \eta_{kj} + C_1 \cdot rand \cdot (\eta_{Pbest} - \eta_{kj}) + C_2 \cdot rand \cdot (\eta_{Gbest} - \eta_{kj}), \quad (k = \overline{1, N},\ j = \overline{1, C}),    (39)

where (\mu_{Gbest}, \eta_{Gbest}, \xi_{Gbest}, V_{Gbest}) are the best values of the whole swarm (Gbest). The centers for cluster j are chosen by the procedure in Table 2.

Table 2. Choosing centers for clusters.
Determining the centers of cluster j (V_j):
1: V_j = \emptyset
2: Repeat
3:   Find x_i \notin V_j, x_i \in X, such that i = \arg\min_{h = \overline{1, N}} \sum_{k=1}^{N} (\mu_{kj}(2 - \xi_{kj}))^m \|x_k - x_h\|^2
4:   V_j = V_j \cup \{x_i\}
5: Until \min_{h:\ x_h \notin V_j} \sum_{k=1}^{N} (\mu_{kj}(2 - \xi_{kj}))^m \|x_k - x_h\|^2 > EPS

The evolution of all particles continues until the given number of iterations is reached. The final solution, comprising the most suitable values of the clustering centers and membership matrices, is the one with the minimum fitness value. Details of the proposed algorithm are presented in Fig. 3 and Table 3.

[Fig. 3. Schema of PFCA-CD.]

3.4. Remarks

The proposed method has some advantages:

– The proposed method uses multiple centers for each cluster, so that a cluster whose data elements scatter in a non-spherical, distinct structure can easily be represented by these centers (a short illustrative sketch of this center selection is given at the end of this section).
– By employing FC-PFS with the PSO strategy, the proposed method can enhance the convergence process.
– The proposed method employs a new measurement for categorical attribute values that is appropriate for calculating the distance between two objects.

However, this method still has some limitations:

– The use of the PSO algorithm may result in good solutions, but they are not guaranteed to be the best ones.
– The computational time of the PSO strategy is quite high. The complexity of the proposed algorithm is O(NC^2 + N^2) for one particle and one loop, where N is the number of elements in the dataset and C is the number of clusters. The complexity of the whole algorithm is then O(popsize \times (NC^2 + N^2) \times numSteps), where numSteps and popsize are the number of iterations and the number of particles, respectively. Because popsize and C are always small, the complexity of the proposed algorithm is about O(N^2 \times numSteps). If numSteps = 1, the complexity is only O(N^2). In the worst case, numSteps = maxSteps, and the computational time of the proposed algorithm may be the highest.
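To make the center-selection procedure of Table 2 concrete, here is a small Python sketch; the value of EPS, the precomputed score vector and the guard that always admits at least one center are our own reading of the table.

import numpy as np

def choose_centers(X, mu_j, xi_j, m=2.0, EPS=1e3):
    """Greedy multi-center selection for one cluster j (Table 2).

    X          : (N, d) numerical view of the data.
    mu_j, xi_j : (N,) positive and refusal degrees for cluster j.
    EPS        : stopping threshold of Table 2, step 5 (problem-dependent).
    """
    w = (mu_j * (2.0 - xi_j)) ** m                        # (mu_kj(2 - xi_kj))^m
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # ||x_k - x_h||^2
    score = w @ sq              # score[h] = sum_k w_k ||x_k - x_h||^2
    centers = []
    while True:
        cand = [h for h in range(len(X)) if h not in centers]
        if not cand:
            break
        h_best = min(cand, key=lambda h: score[h])        # Table 2, step 3
        if centers and score[h_best] > EPS:               # Table 2, step 5
            break
        centers.append(h_best)                            # V_j = V_j U {x_h}
    return centers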

4. Experiments

4.1. Materials and system configuration

The following benchmark datasets from the UCI Machine Learning Repository (University of California, 2007) are used to validate the performance of the algorithms (Table 4). They are well-known, standard data for clustering and classification, consisting of six datasets with different sizes, numbers of attributes and numbers of classes. The largest dataset is ABALONE, including 4177 elements and 8 numerical attributes. The dataset containing the most attributes is AUTOMOBILE, with 15 numerical and 10 categorical attributes. In the experiments, we do not normalize the datasets. The aim is to verify the quality of the clustering algorithms from small to large sizes and on mixed data types (numerical and categorical attributes). In order to assess the quality, the number of classes in each dataset is used as the 'correct' number of clusters.

The proposed algorithm, PFCA-CD, has been implemented, in addition to the DifFuzzy algorithm (Cominetti et al., 2010) and the Dissimilarity algorithm (De Carvalho et al., 2013), in the C programming language, and executed on a Linux Cluster 1350 with eight computing nodes of 51.2 GFlops. Each node contains two dual-core 3.2 GHz Intel Xeon processors and 2 GB RAM. The experimental results are taken as the average values over 50 runs.

Cluster validity measurement: Mean Accuracy (MA), the Davies-Bouldin (DB) index (Davies and Bouldin, 1979), the Rand index (RI) and the Alternative Silhouette (ASWC) (Vendramin et al., 2010) are used to evaluate the quality of the solutions of the clustering algorithms. The DB index is given by:

DB = \frac{1}{C} \sum_{i=1}^{C} \max_{j \ne i} \left\{ \frac{S_i + S_j}{M_{ij}} \right\},    (40)

S_i = \frac{1}{T_i} \sum_{j=1}^{T_i} \| X_j - V_i \|^2, \quad (i = 1, \ldots, C),    (41)

M_{ij} = \| V_i - V_j \|, \quad (i, j = 1, \ldots, C,\ i \ne j),    (42)

where T_i is the size of the i-th cluster, S_i is a measure of scatter within the cluster, and M_{ij} is a measure of separation between the i-th and j-th clusters. A smaller value indicates better performance for the DB index. The Rand index is defined as:

RI = \frac{a + d}{a + b + c + d},    (43)

where a (respectively b) is the number of pairs of data points belonging to the same class in R and to the same (respectively a different) cluster in Q, with R and Q being the two partitions under comparison, and c (respectively d) is the number of pairs of data points belonging to different classes in R and to the same (respectively a different) cluster. The larger the Rand index, the better. The Alternative Silhouette (ASWC) is also invoked to measure the clustering quality:

ASWC = \frac{1}{N} \sum_{i=1}^{N} s_{x_i},    (44)

s_{x_i} = \frac{ b_{p,i} }{ a_{p,i} + \varepsilon }, \quad (i = \overline{1, N}),    (45)

where a_{p,i} is the average distance of element i to all other elements in its cluster p, and b_{p,i} is the minimum, over the other clusters, of the average distance of element i to the elements of those clusters. \varepsilon is a small constant (e.g. 10^{-6} for normalized data) used to avoid division by zero when a_{p,i} = 0. A larger value indicates better performance for the ASWC index.


Table 5. The average validity index values of the algorithms. (Bold values in the original mark the best one in each dataset and validity index; … marks values unreadable in this copy.)

Dataset      Algorithm      MA      ASWC    DB      RI
IRIS         DifFuzzy       66.667  1.569   2.707   81.960
             Dissimilarity  92.639  1.946   9.915   79.092
             PFCA-CD        92.667  1.971   11.217  76.599
GLASS        DifFuzzy       66.667  0.621   4.654   67.557
             Dissimilarity  86.888  1.082   4.239   57.303
             PFCA-CD        88.785  1.142   11.808  66.994
ABALONE      DifFuzzy       …       …       …       …
             Dissimilarity  96.404  1.715   3.812   62.56
             PFCA-CD        96.464  1.147   4.936   61.346
AUTOMOBILE   DifFuzzy       33.333  0.745   6.437   63.912
             Dissimilarity  82.146  1.411   8.721   64.371
             PFCA-CD        93.157  0.937   5.319   69.458
SERVO        DifFuzzy       20      0.699   5.356   65.356
             Dissimilarity  77.35   1.21    8.279   63.862
             PFCA-CD        94.538  1.035   4.667   66.607
STATLOG      DifFuzzy       …       …       …       …
             Dissimilarity  100     1.076   2.868   54.006
             PFCA-CD        89.3    1.02    2.7     50.512

Table 3. Picture fuzzy clustering algorithm for complex data.
I: Data X with N elements in r dimensions; number of clusters C; threshold \varepsilon; fuzzifier m; maximal number of iterations maxSteps > 0.
O: Matrices \mu, \eta, \xi and centers V.
PFCA-CD:
1: t = 0
2: \mu_{kj}(t) \leftarrow random; \eta_{kj}(t) \leftarrow random; \xi_{kj}(t) \leftarrow random (k = \overline{1, N}, j = \overline{1, C}) satisfying Eqs. (28)-(29)
3: Repeat
4:   t = t + 1
5:   For each particle i
6:     Choose centers V_j(t) (j = \overline{1, C}) as in Table 2
7:     Calculate \mu_{kj}(t) (k = \overline{1, N}; j = \overline{1, C}) by Eq. (38)
8:     Calculate \eta_{kj}(t) (k = \overline{1, N}; j = \overline{1, C}) by Eq. (39)
9:     Calculate \xi_{kj}(t) (k = \overline{1, N}; j = \overline{1, C}) by Eq. (32)
10:    Calculate the fitness value by Eq. (37)
11:    Update the Pbest value
12:    Update the Gbest value
13:  End
14: Until Gbest is unchanged or maxSteps has been reached
15: Output (\mu, \eta, \xi, V) = (\mu_{Gbest}, \eta_{Gbest}, \xi_{Gbest}, V_{Gbest})

Table 4. Descriptions of the experimental datasets.

Dataset      No. elements  No. numerical attributes  No. categorical attributes  No. classes
IRIS         150           4                         0                           3
GLASS        214           9                         0                           6
ABALONE      4177          8                         0                           3
AUTOMOBILE   159           15                        10                          6
SERVO        167           1                         3                           5
STATLOG      1000          7                         13                          2

Parameters setting: Parameter values such as the fuzzifier m = 2, \varepsilon = 10^{-3} and maxSteps = 1000 are set for all algorithms. Particularly for PFCA-CD, we set C_1 = C_2 = 1 and \alpha = 0.6 (Thong and Son, 2016a).

Objectives: We aim to evaluate the clustering quality of the algorithms through validity indices. Some experiments with various settings of the parameters are also considered.
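As a check on Eq. (43), a short Python sketch that counts the four pair types for the Rand index; the label-vector interface is an illustrative choice.

from itertools import combinations

def rand_index(labels_true, labels_pred):
    """Eq. (43): fraction of point pairs on which the reference classes R
    and the obtained clusters Q agree (together or apart in both)."""
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_r = labels_true[i] == labels_true[j]
        same_q = labels_pred[i] == labels_pred[j]
        if same_r and same_q:   a += 1   # same class, same cluster
        elif same_r:            b += 1   # same class, different cluster
        elif same_q:            c += 1   # different class, same cluster
        else:                   d += 1   # different in both
    return (a + d) / (a + b + c + d)

print(rand_index([0, 0, 1, 1], [0, 0, 1, 2]))  # 0.8333...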


4.2. Results and discussions

Table 5 shows the average validity index values of the algorithms. It can be seen that the proposed PFCA-CD algorithm has better clustering quality, based on the validity indices, than the others. In most cases, the proposed algorithm has at least one best validity index value. For instance, on the AUTOMOBILE and SERVO datasets, the values for PFCA-CD are better in the MA, DB and RI indices than those of DifFuzzy and Dissimilarity. On the AUTOMOBILE dataset, PFCA-CD achieves 93.157 (MA), 5.319 (DB) and 69.458 (RI), compared to 33.333 (MA), 6.437 (DB) and 63.912 (RI) for DifFuzzy and 82.146 (MA), 8.721 (DB) and 64.371 (RI) for Dissimilarity. Fig. 4 gives more detail on the MA and RI values of the algorithms over the different datasets. For most datasets, the proposed method has better values than DifFuzzy and Dissimilarity.

[Fig. 4. The chart of MA and RI values of all algorithms with different datasets.]

Fig. 5 shows the ASWC and DB values of all algorithms on the different datasets. It can be seen that the proposed algorithm results in smaller DB values than the others on the STATLOG, SERVO and AUTOMOBILE datasets. In ASWC, the proposed algorithm is better on the IRIS and GLASS datasets, which are purely numerical. This suggests that ASWC may not be a suitable index for complex data.

Table 6 shows the number of times each algorithm reached the best values in Table 5. PFCA-CD ranks first with 12 best values, Dissimilarity ranks second with 9, and the remaining algorithm has 3 best values. The fluctuation of the validity index values is presented in Table 7. In Table 7, the standard deviation values of the DifFuzzy algorithm do not change from run to run because this algorithm does not employ a heuristic strategy. The standard deviation values of PFCA-CD change less than those of Dissimilarity in general. This means that the proposed method produces more stable solutions than the Dissimilarity method.


[Fig. 5. The chart of ASWC and DB values of all algorithms with different datasets.]

Table 6. Times to achieve the best values of the algorithms. (Bold values mean the best one.)

Algorithms      Times to achieve best value
DifFuzzy        3
Dissimilarity   9
PFCA-CD         12

Table 7. The STD values for the validity indices of the algorithms. (… marks values unreadable in this copy.)

Dataset      Algorithm      MA      ASWC      DB      RI
IRIS         DifFuzzy       0       0         0       0
             Dissimilarity  7.144   0.447     6.512   11.375
             PFCA-CD        8.55    0.267     3.663   6.886
GLASS        DifFuzzy       0       0         0       0
             Dissimilarity  38.586  1.014     5.352   49.76
             PFCA-CD        3.184   0.094     3.354   0.936
ABALONE      DifFuzzy       …       …         …       …
             Dissimilarity  2.63    0.039     4.113   0.247
             PFCA-CD        0.7     0.002     0.086   0.072
AUTOMOBILE   DifFuzzy       0       0         0       0
             Dissimilarity  9.694   1.433     12.587  4.23
             PFCA-CD        1.938   0.132     5.938   0.806
SERVO        DifFuzzy       0       0         0       0
             Dissimilarity  5.23    0.088     13.358  1.121
             PFCA-CD        3.435   0.053     5.089   0.864
STATLOG      DifFuzzy       0       2.96E-4   0.433   0.075
             Dissimilarity  4.612   0.011     9.891   0.688
             PFCA-CD        …       …         …       …

Table 8. The computational time (with STD values) of the algorithms, in seconds. (… marks values unreadable in this copy.)

Dataset      DifFuzzy          Dissimilarity        PFCA-CD
IRIS         31.048 (1.369)    4.165 (3.626)        4.743 (0.919)
GLASS        522.184 (35.528)  122.44 (121.87)      17.577 (1.39)
ABALONE      …                 5214.669 (4457.619)  7643.86 (844.934)
AUTOMOBILE   149.553 (0.058)   318.622 (71.871)     22.9 (9.017)
SERVO        16.975 (2.9E-3)   19.124 (3.439)       19.064 (6.279)
STATLOG      …                 3082.443 (253.439)   108.688 (5.991)

Table 8 shows the computational times of all algorithms, together with their standard deviations. The computational time of the proposed method is less than that of the other algorithms on the GLASS, AUTOMOBILE, SERVO and STATLOG datasets. Only on the ABALONE and IRIS datasets is the proposed algorithm slower. On the IRIS dataset, the proposed algorithm needs 4.743 s compared to 4.165 s for Dissimilarity; the discrepancy is less than one second and, in particular, the standard deviation of the proposed algorithm is much smaller than that of the others, which means that the proposed algorithm is as good as the others in runtime for IRIS. Only on the ABALONE dataset does the proposed algorithm take considerably more time to run (7643.86 s compared to 5214.669 s for Dissimilarity). This indicates that the proposed algorithm is not effective on large, purely numerical datasets.

5. Conclusions

In this paper, we presented a novel picture fuzzy clustering algorithm for complex data (PFCA-CD) that is able to cluster mixed numerical and categorical data with distinct structures. PFCA-CD hybridizes the Particle Swarm Optimization strategy with picture fuzzy clustering: combined solutions consisting of clustering centers and the corresponding membership matrices are encoded as PSO particles. The key idea is that more than one center per cluster can deal with complex data structures where the shape of the data is not spherical, and the use of a novel measurement for categorical attributes copes with mixed data as well. Together, this process creates the most suitable solutions for the problem. The experimental results on the benchmark datasets of the UCI Machine Learning Repository indicated that in most cases the PFCA-CD algorithm not only produced solutions with better clustering quality but was also faster than the other algorithms. Further research could proceed in the following directions: i) investigate a distributed version of PFCA-CD; ii) consider semi-supervised situations for PFCA-CD; iii) apply the algorithm to recommender systems and other problems.

Appendix

Source codes and the experimental datasets of this paper can be retrieved at this link: …/code/ci/master/tree/.

References
Atanassov, K.T., 1986. Intuitionistic fuzzy sets. Fuzzy Sets Syst. 20 (1), 87–96.
Bezdek, J.C., Ehrlich, R., Full, W., 1984. FCM: the fuzzy c-means clustering algorithm.
Comput. Geosci. 10 (2), 191–203.
Chen, L., Wang, S., Wang, K., Zhu, J., 2016. Soft subspace clustering of categorical
data with probabilistic distance. Pattern Recognit. 51, 322–332.
Cominetti, O., Matzavinos, A., Samarasinghe, S., Kulasiri, D., Liu, S., Maini, P., Erban,
R., 2010. DifFUZZY: a fuzzy clustering algorithm for complex datasets. Int. J.
Comput. Intell. Bioinform. Syst. Biol. 1 (4), 402–417.

Cuong, B.C., 2014. Picture fuzzy sets. J. Comput. Sci. Cybern. 30 (4), 409–416.
Davies, D.L., Bouldin, D.W., 1979. A cluster separation measure. IEEE Trans. Pattern
Anal. Mach. Intell. 2, 224–227.
De Carvalho, F.D.A., Lechevallier, Y., De Melo, F.M., 2013. Relational partitioning
fuzzy clustering algorithms based on multiple dissimilarity matrices. Fuzzy Sets
Syst. 215, 1–28.
Eberhart, R.C., Kennedy, J., 1995. A new optimizer using particle swarm theory, In:
Proceedings of the Sixth International Symposium on Micro Machine and
Human Science, 1, pp. 39–43.
Ferreira, M.R., de Carvalho, F.D., 2012. Kernel fuzzy clustering methods based on
local adaptive distances, In: Proceedings of 2012 IEEE International Conference
on In Fuzzy Systems (FUZZ-IEEE), pp. 1–8.
Hwang, Z., 1998. Extensions to the k-means algorithm for clustering large data sets
with categorical values. Data Min. Knowl. Discov. 2 (3), 283–304.
Ji, J., Pang, W., Zhou, C., Han, X., Wang, Z., 2012. A fuzzy k-prototype clustering
algorithm for mixed numeric and categorical data. Knowl.-Based Syst. 30,
129–135.
Ji, J., Bai, T., Zhou, C., Ma, C., Wang, Z., 2013. An improved k-prototypes clustering
algorithm for mixed numeric and categorical data. Neurocomputing 120,
590–596.
Mendel, J.M., John, R.I.B., 2002. Type-2 fuzzy sets made simple. IEEE Trans. Fuzzy
Syst. 10 (2), 117–127.
Son, L.H., 2014a. Enhancing clustering quality of geo-demographic analysis using context fuzzy clustering type-2 and particle swarm optimization. Appl. Soft Comput. 22, 566–584.
Son, L.H., 2014b. HU-FCF: a hybrid user-based fuzzy collaborative filtering method
in recommender systems. Expert Syst. Appl. 41 (15), 6861–6870.
Son, L.H., 2015a. DPFCM: a novel distributed picture fuzzy clustering method on
picture fuzzy sets. Expert Syst. Appl. 42 (1), 51–66.
Son, L.H., 2015b. A novel kernel fuzzy clustering algorithm for geo-demographic
analysis. Inf. Sci. 317, 202–223.
Son, L.H., 2015c. HU-FCF++: a novel hybrid method for the new user cold-start
problem in recommender systems. Eng. Appl. Artif. Intell. 41, 207–222.
Son, L.H., 2016. Dealing with the new user cold-start problem in recommender
systems: a comparative review. Inf. Syst. 58, 87–104.
Son, L.H., Thong, N.T., 2015. Intuitionistic fuzzy recommender systems: an effective
tool for medical diagnosis. Knowl.-Based Syst. 74, 133–150.
Son, L.H., Tuan, T.M., 2016. A cooperative semi-supervised fuzzy clustering framework for dental X-ray image segmentation. Expert Syst. Appl. 46, 380–393.
Son, L.H., Hai, P.V., 2016. A novel multiple fuzzy clustering method based on internal clustering validation measures with gradient descent. Int. J. Fuzzy Syst.
Son, L.H., Cuong, B.C., Long, H.V., 2013. Spatial interaction – modification model and applications to geo-demographic analysis. Knowl.-Based Syst. 49, 152–170.
Son, L.H., Linh, N.D., Long, H.V., 2014. A lossless DEM compression for fast retrieval
method using fuzzy clustering and MANFIS neural network. Eng. Appl. Artif.
Intell. 29, 33–42.
Son, L.H., Cuong, B.C., Lanzi, P.L., Thong, N.T., 2012a. A novel intuitionistic fuzzy
clustering method for geo-demographic analysis. Expert Syst. Appl. 39 (10),
9848–9859.
Son, L.H., Lanzi, P.L., Cuong, B.C., Hung, H.A., 2012b. Data mining in GIS: a novel
context-based fuzzy geographically weighted clustering algorithm. Int. J. Mach.
Learn. Comput. 2 (3), 235–238.

Thong, P.H., Son, L.H., 2014. A new approach to multi-variables fuzzy forecasting
using picture fuzzy clustering and picture fuzzy rules interpolation method, In:
Proceeding of 6th International Conference on Knowledge and Systems Engineering, pp. 679–690.

Thong, N.T., Son, L.H., 2015. HIFCF: an effective hybrid model between picture fuzzy
clustering and intuitionistic fuzzy recommender systems for medical diagnosis.
Expert Syst. Appl. 42 (7), 3682–3701.
Thong, P.H., Son, L.H., Fujita, H., 2016. Interpolative Picture Fuzzy Rules: A Novel
Forecast Method for Weather Nowcasting, In: Proceeding of the 2016 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2016), pp. 86–93.
Thong, P.H., Son, L.H., 2016a. Picture fuzzy clustering: a new computational intelligence method. Soft Comput. 20 (9), 3549–3562.
Thong, P.H., Son, L.H., 2016b. An overview of semi-supervised fuzzy clustering algorithms. Int. J. Eng. Technol. 8 (4), 301–306.
Tuan, T.M., Ngan, T.T., Son, L.H., 2016. A novel semi-supervised fuzzy clustering
method based on interactive fuzzy satisficing for dental X-ray image segmentation. Appl. Intell. 45 (2), 402–428.
Tuan, T.M., Duc, N.T., Hai, P.V., Son, L.H., 2016. Dental diagnosis from X-Ray images
using fuzzy rule-based systems. Int. J. Fuzzy Syst. Appl. (in press).
University of California, 2007. UCI Repository of Machine Learning Databases.
Vendramin, L., Campello, R.J., Hruschka, E.R., 2010. Relative clustering validity criteria: a comparative overview. Stat. Anal. Data Min. 3 (4), 209–235.
Wijayanto, A.W., Purwarianti, A., Son, L.H., 2016. Fuzzy geographically weighted
clustering using artificial bee colony: an efficient geo-demographic analysis
algorithm and applications to the analysis of crime behavior in population.
Appl. Intell. 44 (2), 377–398.
Yang, M.S., Hwang, P.Y., Chen, D.H., 2004. Fuzzy clustering algorithms for mixed
feature variables. Fuzzy Sets Syst. 141 (2), 301–317.


