
Information Sciences 317 (2015) 202–223

Contents lists available at ScienceDirect

Information Sciences
journal homepage: www.elsevier.com/locate/ins

A novel kernel fuzzy clustering algorithm for Geo-Demographic
Analysis
Le Hoang Son
VNU University of Science, Vietnam National University, 334 Nguyen Trai, Thanh Xuan, Hanoi, Viet Nam

Article info

Article history:
Received 3 January 2014
Received in revised form 19 April 2015
Accepted 25 April 2015
Available online 6 May 2015
Keywords:
Fuzzy clustering
Geo-Demographic Analysis
Intuitionistic possibilistic fuzzy clustering
Kernel-based clustering
Spatial Interaction – Modification Model

Abstract
Geo-Demographic Analysis (GDA) is a major focus of various interdisciplinary research efforts and has been used in many decision-making processes regarding the provision and distribution of products and services in society. Machine learning methods, namely Principal Component Analysis, Self-Organizing Map, K-Means, fuzzy clustering and fuzzy geographically weighted clustering, have been proposed to enhance the quality of GDA. Among them, the state-of-the-art method, Modified Intuitionistic Possibilistic Fuzzy Geographically Weighted Clustering (MIPFGWC), has some drawbacks: (i) using the Euclidean similarity measure often results in a high error rate and sensitivity to noise and outliers; (ii) updating the membership matrix by the Spatial Interaction – Modification Model (SIM2) leads to new centers that are not "geographically aware". In this paper, we present a novel fuzzy clustering algorithm named Kernel Fuzzy Geographically Clustering (KFGC) that utilizes both the kernel similarity function and a new update mechanism for the SIM2 model to remedy the disadvantages of MIPFGWC. Some supporting properties and theorems of KFGC are also examined in the paper. Specifically, the differences between the solutions of KFGC and those of MIPFGWC and of some variants of KFGC are theoretically validated. Lastly, experimental analysis is performed to compare the performance of KFGC with those of the relevant algorithms in terms of clustering quality.
© 2015 Elsevier Inc. All rights reserved.

1. Introduction
Prior to defining the Geo-Demographic Analysis (GDA) problem, let us consider an example demonstrating the role of GDA in practical applications.
Example 1. A hot-spot analysis of the number of viral hemorrhagic fever cases in Vietnam in 2011 is examined in Fig. 1. The results are expressed in a map showing various groups determined by intervals of cases such as [0, 47] and [48, 233]. From this, decision makers can observe the most dangerous places and issue appropriate medical measures to prevent such situations in the future. The distribution can be expressed by linguistic labels such as "High cases of viral hemorrhagic fever" and "Low cases of viral hemorrhagic fever" to eliminate the limitations of boundary points in the intervals. The hot-spot analysis in this example is a kind of Geo-Demographic Analysis regarding the classification of a geographical area according to a given subject, e.g. viral hemorrhagic fever cases. The classification can be done for spatial data both in point


List of abbreviations
GIS: Geographical Information Systems
GDA: Geo-Demographic Analysis
SIM: Spatial Interaction Model
SIM-PF: Spatial Interaction Model with Population Factor
SIM2: Spatial Interaction – Modification Model
SOM: Self-Organizing Maps
PCA: Principal Component Analysis
K-Means: a hard partition-based clustering algorithm
FCM: Fuzzy C-Means
NE: Neighborhood Effects
FGWC: Fuzzy Geographically Weighted Clustering
IPFGWC: Intuitionistic Possibilistic Fuzzy Geographically Weighted Clustering
MIPFGWC: Modified Intuitionistic Possibilistic Fuzzy Geographically Weighted Clustering
KFGC: Kernel Fuzzy Geographically Clustering
IFV: a spatial clustering quality validity index
UNO: United Nations Organization

and region standards of Geographical Information Systems (GIS), and a number of points/regions that share common characteristics in both the spatial and attribute data form a group marked by a unique symbol and color in the map. Intuitively, GDA can be regarded as spatial clustering in GIS.

Definition 1. Given a geo-demographic dataset X consisting of N data points, where each data point is equivalent to a point/region of spatial data in GIS. Each data point is characterized by many geo-demographic attributes, each of which can be considered a subject for clustering. The objective of GDA is to classify X into C clusters so that



\[
\sum_{k=1}^{N}\sum_{j=1}^{C}u_{kj}\left\|X_k-V_j\right\|\rightarrow\min,\tag{1}
\]

\[
\begin{cases}
u_{kj}\in[0,1],\\
\sum_{j=1}^{C}u_{kj}=1,\\
u_{kj}=u_{kj}\left(w_j\right),\\
V_j=V_j\left(w_j\right),\\
k=\overline{1,N},\ j=\overline{1,C},
\end{cases}\tag{2}
\]

where \(u_{kj}\) is the membership value of data point \(X_k\) \((k=\overline{1,N})\) to the \(j\)th cluster \((j=\overline{1,C})\), \(V_j\) is the center of the \(j\)th cluster, and \(w_j\) is the weight of the \(j\)th cluster, showing the influence of spatial relationships in a map. It is often calculated through a spatial model such as the Spatial Interaction Model (SIM) [6], the Spatial Interaction Model with Population Factor (SIM-PF) [14,25,27] or the Spatial Interaction – Modification Model (SIM2) [26].
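To make the objective (1)–(2) concrete, the following sketch evaluates it for a toy partition. The helper name `gda_objective`, the toy data, and the interpretation of the weights \(w_j\) as multiplicative factors on the memberships are illustrative assumptions, not the paper's exact formulation.

```python
import math

def gda_objective(X, V, U, w):
    """Weighted GDA objective (1): sum_k sum_j u_kj * ||X_k - V_j||,
    with each membership scaled by the spatial cluster weight w_j as a
    simple stand-in for the dependence u_kj = u_kj(w_j) in (2)."""
    total = 0.0
    for k, x in enumerate(X):
        for j, v in enumerate(V):
            dist = math.dist(x, v)          # Euclidean distance ||X_k - V_j||
            total += U[k][j] * w[j] * dist  # membership weighted by w_j
    return total

# Toy example: two well-separated 2-D "regions" with a crisp partition.
X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
V = [(0.05, 0.0), (5.05, 5.0)]
U = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
w = [1.0, 1.0]
print(round(gda_objective(X, V, U, w), 6))  # -> 0.2
```

A good partition keeps this value small, since every point lies close to the center it belongs to.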
GDA is widely used in the public and private sectors for planning and provision of products and services. In GDA, geo-demographic attributes are used to characterize essential information about the population of a certain geographical area at a specific point in time. Common geo-demographic attributes include gender, age, ethnicity, knowledge of languages, disabilities, mobility, home ownership, and employment status. One of the most useful functions of GDA is the capability to visualize geo-demographic trends by location and time stamp, such as the study of the average age of a population over time and the investigation of the migration trends of local people in a town. As illustrated in Example 1, the results of GDA are depicted on a map that demonstrates the distribution of several distinct groups. Various distribution maps can be combined or overlapped into a single one so that users can observe the tendency of a certain group over time for the analysis of geo-demographic trends. Both distributions and trends of values within a geo-demographic variable are of interest in GDA. By providing essential information about geo-demographic distributions and trends, GDA effectively assists many decision-making processes involving the provision and distribution of products and services in society, the determination of common population characteristics, and the study of population variation. This clearly demonstrates the important role of GDA in practical applications nowadays.
¹ For interpretation of color in Figs. 1 and 3, the reader is referred to the web version of this article.



Fig. 1. The distribution of viral hemorrhagic fever cases in Vietnam in 2011.

In GDA, improving the clustering quality, especially in terms of theoretical results over the relevant existing methods, has been a major focus of researchers for many years [19]. Finding a more accurate distribution of distinct groups of geo-demographic data is essential and significant for the expression of, and reasoning about, events and phenomena concerning the characteristics of a population, and for better support of decision-making. Some of the first methods, Self-Organizing Maps (SOM) [13] and Principal Component Analysis (PCA) [34], rely on basic principles of statistics and neural networks to determine the underlying demographic and socio-economic phenomena. However, their disadvantages are the requirement of large memory space and high computational complexity. For these reasons, researchers tended to use clustering algorithms such as Agglomerative Hierarchical Clustering [5] and K-Means [16] to classify geo-demographic datasets into clusters represented in the forms of hierarchical trees and isolated groups. Nonetheless, using hard clustering for GDA often leads to the issue of ecological fallacy, which can be briefly understood as follows: statistics accurately describing group characteristics do not necessarily apply to individuals within that group. Thus, fuzzy clustering algorithms like Fuzzy C-Means (FCM) and its variants were considered appropriate methods to determine the distribution of a demographic feature on a map [10,27,28,39]. Since the results of FCM are independent of geographical factors, some improvements were made by attaching FCM to a spatial model such as SIM [6], SIM-PF [14,25,27] or SIM2 [26]. The comparative experiments in [26] showed that the MIPFGWC algorithm combining the SIM2 model with an intuitionistic possibilistic



fuzzy geographically clustering algorithm has better clustering quality than other relevant algorithms such as NE [6], FGWC [14] and IPFGWC [25]. In addition to MIPFGWC, there are some relevant works concerning applications and algorithms for GDA, such as [4,7,12,15,17,18,20–24,29,31,36,37,40]. Among all existing works, MIPFGWC is considered the state-of-the-art algorithm for the GDA problem.
MIPFGWC calculates new centers through the Euclidean similarity measure and uses the SIM2 model to update the membership matrix. However, there exist some problems in this process that may decrease the clustering quality. Let us analyze this in more depth.
(a) The similarity measure for clustering is the Euclidean function. According to Keogh and Ratanamahatana [11], using the Euclidean similarity measure in clustering algorithms often yields a high error rate and sensitivity to noise and outliers, since this measure is not suitable for ordinal data, where preferences are listed according to rank instead of actual real values. Furthermore, the Euclidean measure cannot determine the correlation between user profiles that have similar trends in tastes but different ratings for some of the same items. Gu et al. [8] stated that the Euclidean measure makes the contributions of all features to the clustering results the same, which leads to a strong dependence of the clustering on the sample space. For those reasons, a better similarity measure should be used instead of the Euclidean function to obtain high clustering quality.
(b) Updating the membership matrix by the SIM2 model is another problem of concern. MIPFGWC calculates the values of the membership matrix, the hesitation level and the typicality by solving the optimization problem. The membership matrix is then updated by the SIM2 model. The problem is that the hesitation level and the typicality values are not updated by any geographical model, so the new centers, which are calculated from those values and the updated membership matrix, are not "geographically aware". Thus, the final clustering results could be far from the accurate ones.
We clearly recognize that these two drawbacks prevent MIPFGWC from achieving high clustering quality, so they need to be thoroughly addressed. Our motivation in this article is to investigate a new method that can handle those limitations; more specifically,
(a) For the first problem, we consider kernel functions, which can be used in many applications as they provide a simple bridge from linearity to non-linearity for algorithms that can be expressed in terms of dot products. This type of similarity measure makes very good sense as a measure of difference between samples in the context of certain data, e.g. geo-demographic datasets.
(b) For the second problem, we convert the activities of the SIM2 model into the objective function of the problem, thus tying the model more closely to spatial relationships, since all elements such as the cluster memberships, the hesitation level, the typicality values and the centers become "geographically aware".
Our contributions in this paper are:
(a) A novel fuzzy clustering algorithm named Kernel Fuzzy Geographically Clustering (KFGC) that utilizes both the kernel function and a new update mechanism for the SIM2 model to remedy the disadvantages of MIPFGWC.
(b) Some supporting properties and theorems of KFGC. Specifically, the differences between the solutions of KFGC and those of MIPFGWC and of some variants of KFGC are theoretically validated.
(c) An experimental analysis comparing the performance of KFGC with those of the relevant algorithms in terms of clustering quality.
These findings both ameliorate the quality of results for the GDA problem and enrich the knowledge of developing clustering algorithms based on kernel distances and the spatial model SIM2 for practical applications. In other words, the findings are significant from both theoretical and practical perspectives.
Before we close this section and move to the detailed model and solutions, let us raise an important question: "How can we apply kernel functions to the clustering algorithm in order to handle the problem of similarity measures?" To answer this question, a survey of kernel-based fuzzy clustering algorithms [1,2,8,9,30,35,38] was conducted in order to find the appropriate algorithm and kernel function for our problem. Through this survey, it is clear that the kernel function is often applied directly to the objective function, with the most frequently used function being the Gaussian kernel-induced distance [38]. Moreover, most spatial kernel-based fuzzy clustering methods employ a spatial bias correction in the objective function, which gives us a hint of how to apply the activities of the SIM2 model to handle the second limitation of MIPFGWC.
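As a concrete illustration of the Gaussian kernel-induced distance mentioned above, the sketch below contrasts its bounded behavior with the unbounded Euclidean distance; the kernel width `sigma` is an assumed parameter.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel K(x, y) = exp(-||x - y||^2 / sigma^2)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / sigma ** 2)

def kernel_distance(x, y, sigma=1.0):
    """Kernel-induced distance 1 - K(x, y): bounded in [0, 1), so far-away
    outliers cannot dominate the objective the way an unbounded Euclidean
    distance can."""
    return 1.0 - gaussian_kernel(x, y, sigma)

print(kernel_distance((0, 0), (0, 0)))      # identical points give 0.0
print(kernel_distance((0, 0), (100, 100)))  # an outlier saturates near 1.0
```

This saturation is precisely why replacing the Euclidean measure with a kernel-induced one reduces sensitivity to noise and outliers.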
The rest of the paper is structured as follows. Section 2 presents our main contribution, consisting of a new objective function and its solutions, some supporting properties and theorems, and details of the new algorithm, KFGC. Specifically, in Section 2.1, we introduce a new objective function that integrates some results of MIPFGWC and of Yang and Tsai [38], aiming to handle the two limitations of MIPFGWC. By using the Lagrangian method, the optimal solutions, including the new centers, the membership matrix, the hesitation level and the typicality values, are found accordingly. In Section 2.2, we examine some interesting properties and theorems of the solutions, such as:



- Some characteristics of KFGC's solutions, e.g. the limit of the membership values when the fuzzifier is large;
- The estimation of the difference between the solutions of KFGC and those of MIPFGWC and of the variants of KFGC.
In Section 2.3, the proposed algorithm, KFGC, is presented in detail. Section 3 validates the proposed approach through a set of experiments involving real-world data. Finally, Section 4 draws the conclusions and delineates future research directions.
2. Kernel Fuzzy Geographically Clustering
2.1. The proposed model and solutions
Suppose there is a geo-demographic dataset X consisting of N data points. Let us divide the dataset into C groups satisfying the objective function (3).



\[
J=\sum_{k=1}^{N}\sum_{j=1}^{C}\left(a_1\left(u'_{kj}\right)^{m}+a_2\left(t'_{kj}\right)^{g}+a_3\left(h'_{kj}\right)^{s}\right)\left(1-K\left(X_k,V_j\right)\right)+\sum_{j=1}^{C}\gamma_j\sum_{k=1}^{N}\left(1-t_{kj}\right)^{g}\rightarrow\min,\tag{3}
\]

where \(K(x,y)\) is the Gaussian kernel-induced function, and \(u'_{kj}\), \(t'_{kj}\) and \(h'_{kj}\) are the membership values, typicality values and hesitation levels updated by the SIM2 model in Eqs. (4)–(9), respectively.

\[
u'_k=a_u\,u_k+b_u\sum_{j=1}^{k-1}w_{kj}\,u'_j+c_u\,\frac{1}{A_u}\sum_{j=k}^{C}w_{kj}\,u_j,\tag{4}
\]
\[
t'_k=a_t\,t_k+b_t\sum_{j=1}^{k-1}w_{kj}\,t'_j+c_t\,\frac{1}{A_t}\sum_{j=k}^{C}w_{kj}\,t_j,\tag{5}
\]
\[
h'_k=a_h\,h_k+b_h\sum_{j=1}^{k-1}w_{kj}\,h'_j+c_h\,\frac{1}{A_h}\sum_{j=k}^{C}w_{kj}\,h_j,\quad\forall k=\overline{1,C},\tag{6}
\]
\[
a_u+b_u+c_u=1,\tag{7}
\]
\[
a_t+b_t+c_t=1,\tag{8}
\]
\[
a_h+b_h+c_h=1,\tag{9}
\]

where \(w_{kj}\) is the weighting function showing the influence of the \(k\)th area on the \(j\)th area, defined through Eqs. (10) and (11).

\[
w_{kj}=\begin{cases}\dfrac{\left(pop_k\times pop_j\right)^{b}\,p_{kj}^{c}\,IM_{kj}^{d}}{d_{kj}^{a}}&k\neq j,\\[6pt]0&\text{else},\end{cases}\tag{10}
\]
\[
\begin{cases}\sum_{k=1}^{C}pop_k=N_0,\\p_{kj}\le d_{kj},\\IM_{kj}\le pop_k+pop_j.\end{cases}\tag{11}
\]
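A minimal sketch of the weighting function (10) and the cascade update (4) may help; all numeric values, the exponent settings and the helper names are illustrative assumptions, not calibrated model parameters.

```python
def sim2_weight(pop_k, pop_j, p_kj, im_kj, d_kj, a=2.0, b=1.0, c=1.0, d_exp=1.0):
    """Weighting function (10): w_kj = (pop_k*pop_j)^b * p_kj^c * IM_kj^d / d_kj^a.
    The exponents a, b, c, d are model parameters; the k == j case is 0 and
    is omitted here for brevity."""
    return (pop_k * pop_j) ** b * p_kj ** c * im_kj ** d_exp / d_kj ** a

def sim2_update(u, w, a_u, b_u, c_u, A_u):
    """Cascade update (4): each u'_k mixes the original value, the
    already-updated values of earlier areas, and the original values of
    later areas, scaled by the weights w_kj."""
    C = len(u)
    u_new = [0.0] * C
    for k in range(C):
        earlier = sum(w[k][j] * u_new[j] for j in range(k))  # already updated
        later = sum(w[k][j] * u[j] for j in range(k, C))     # not yet updated
        u_new[k] = a_u * u[k] + b_u * earlier + c_u * later / A_u
    return u_new

# Toy run with 3 areas and small symmetric weights (illustrative values only).
w = [[0.0, 0.2, 0.1], [0.2, 0.0, 0.3], [0.1, 0.3, 0.0]]
print(sim2_weight(10, 20, 1, 5, 2))                     # -> 250.0
print(sim2_update([0.5, 0.3, 0.2], w, 0.7, 0.2, 0.1, 1.0))
```

Note how the sequential dependence on `u_new[j]` for j < k makes the update order-sensitive, which is exactly why the closed-form solutions later carry nested sums over earlier areas.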

The parameters \(A_u\), \(A_t\) and \(A_h\) are scaling variables that force the membership values, typicality values and hesitation levels to satisfy constraints (12)–(18).

\[
u_{kj},t_{kj},h_{kj}\in[0,1],\tag{12}
\]
\[
u_{kj}+h_{kj}+t_{kj}=1,\tag{13}
\]
\[
\sum_{j=1}^{C}\left(\frac{1}{C}\sum_{i=1}^{C}w_{ij}\right)u_{kj}=1,\tag{14}
\]
\[
\sum_{j=1}^{C}\left(\frac{1}{C}\sum_{i=1}^{C}w_{ij}\right)h_{kj}=1,\tag{15}
\]
\[
m,g,s>1,\tag{16}
\]
\[
a_i>0,\quad i=\overline{1,3},\tag{17}
\]
\[
\gamma_j>0\quad\left(j=\overline{1,C}\right).\tag{18}
\]

The proposed KFGC model in Eqs. (3)–(18) relies on the principles of intuitionistic fuzzy sets, possibilistic fuzzy clustering, weighted clustering, the SIM2 model and kernel-based clustering. In order to analyze the difference and the improvement of this model in comparison with MIPFGWC [26], let us review the points below.
- The objective function of KFGC in (3) employs the Gaussian kernel function \(K(x,y)\) instead of the traditional Euclidean function of MIPFGWC. This handles the first limitation of MIPFGWC shown above.
- In Eq. (3), \(u'_{kj}\), \(t'_{kj}\) and \(h'_{kj}\) are used instead of the original membership values, typicality values and hesitation level of MIPFGWC. This tackles the second problem of MIPFGWC, where the typicality values and the hesitation level are not updated by any geographical model so that the next center is not correctly calculated. With this amendment, the membership values, typicality values and hesitation level are all updated by the SIM2 model as shown in Eqs. (4)–(9). The weighting function is kept intact for the update of those values.
- Inspired by the spatial bias correction of the family of kernel-based fuzzy clustering methods, especially the work of Yang and Tsai [38], an improvement in the constraints of the proposed model was made, as described in Eqs. (14) and (15). The role of the average weight of the \(j\)th cluster, \(\frac{1}{C}\sum_{i=1}^{C}w_{ij}\), is equivalent to that of \(\sigma_j\) in the work of Yang and Tsai. Nonetheless, since the weights derived from the SIM2 model are already available, they are utilized for this task. By providing the weights in the constraints, the proposed model becomes more closely related to spatial relationships.
Now, let us continue to find the optimal solutions of the model (3)–(18).
Theorem 1. The optimal solutions of the system (3)–(18) are:

\[
u_{kj}=\frac{1}{\left(\sum_{j'=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij'}\right)\sum_{l=1}^{C}\left(\frac{1-K(X_k,V_j)}{1-K(X_k,V_l)}\right)^{\frac{1}{m-1}}}\left(1-\frac{b_u\sum_{i=1}^{j-1}w_{ji}\left(a_u u_{ki}+b_u u_{k(i-1)}+\cdots+b_u^{i-1}u_{k1}\right)}{a_u+b_u\sum_{i=1}^{j-1}w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1}+\cdots+b_u+1\right)}\right),\quad k=\overline{1,N},\ j=\overline{1,C},\tag{19}
\]

\[
h_{kj}=\frac{1}{\left(\sum_{j'=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij'}\right)\sum_{l=1}^{C}\left(\frac{1-K(X_k,V_j)}{1-K(X_k,V_l)}\right)^{\frac{1}{s-1}}}\left(1-\frac{b_h\sum_{i=1}^{j-1}w_{ji}\left(a_h h_{ki}+b_h h_{k(i-1)}+\cdots+b_h^{i-1}h_{k1}\right)}{a_h+b_h\sum_{i=1}^{j-1}w_{ji}\frac{c_h}{A_h}\left(b_h^{i-1}+\cdots+b_h+1\right)}\right),\quad k=\overline{1,N},\ j=\overline{1,C},\tag{20}
\]

\[
t_{kj}=\frac{\gamma_j^{\frac{1}{g-1}}}{\gamma_j^{\frac{1}{g-1}}+\left[a_2\left(a_t+b_t\sum_{i=1}^{j-1}w_{ji}\frac{c_t}{A_t}\left(b_t^{i-1}+\cdots+b_t+1\right)\right)\left(1-K(X_k,V_j)\right)\right]^{\frac{1}{g-1}}}-\frac{b_t\sum_{i=1}^{j-1}w_{ji}\left(a_t t_{ki}+b_t t_{k(i-1)}+\cdots+b_t^{i-1}t_{k1}\right)}{\gamma_j^{\frac{1}{g-1}}+a_t+\frac{c_t}{A_t}b_t\sum_{i=1}^{j-1}w_{ji}\left(b_t^{i-1}+\cdots+b_t+1\right)},\tag{21}
\]

\[
V_j=\frac{\sum_{k=1}^{N}\left(a_1\,\tilde u_{kj}^{\,m}+a_2\,\tilde t_{kj}^{\,g}+a_3\,\tilde h_{kj}^{\,s}\right)X_k}{\sum_{k=1}^{N}\left(a_1\,\tilde u_{kj}^{\,m}+a_2\,\tilde t_{kj}^{\,g}+a_3\,\tilde h_{kj}^{\,s}\right)},\quad j=\overline{1,C},\tag{22}
\]

where

\[
\tilde u_{kj}=a_u u_{kj}+b_u\sum_{i=1}^{j-1}w_{ji}\left(a_u u_{ki}+b_u u_{k(i-1)}+\cdots+b_u^{i-1}u_{k1}\right)+\frac{c_u}{A_u}\left(b_u^{i-1}+\cdots+b_u+1\right)u_{kj}+\frac{c_u}{A_u}\sum_{i=j}^{C}w_{ji}\,u_{ki},
\]

and \(\tilde t_{kj}\), \(\tilde h_{kj}\) are defined analogously with the subscripts \(t\) and \(h\).
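In practice, the coupled solutions of Theorem 1 are applied by alternating updates until the centers stabilize. The sketch below shows only the kernel-FCM backbone of that loop (Gaussian kernel distances, membership update, kernel-weighted centers) for two clusters; the typicality, hesitation and SIM2 terms of (19)–(22) are omitted, and all names and parameter values are illustrative.

```python
import math

def kernel_fcm_2(X, m=2.0, sigma=2.0, iters=30):
    """Alternating kernel-FCM skeleton for C = 2 clusters, with the two
    centers initialized at the first and last data points."""
    V = [X[0], X[-1]]
    C, N, dim = 2, len(X), len(X[0])
    for _ in range(iters):
        # Gaussian kernel values K(X_k, V_j).
        K = [[math.exp(-math.dist(x, v) ** 2 / sigma ** 2) for v in V] for x in X]
        U = []
        for k in range(N):
            row = []
            for j in range(C):
                d_j = max(1.0 - K[k][j], 1e-12)  # kernel-induced distance
                s = sum((d_j / max(1.0 - K[k][l], 1e-12)) ** (1 / (m - 1))
                        for l in range(C))
                row.append(1.0 / s)
            U.append(row)
        # Kernel-weighted center update.
        for j in range(C):
            num, den = [0.0] * dim, 0.0
            for k in range(N):
                wgt = U[k][j] ** m * K[k][j]
                den += wgt
                for r in range(dim):
                    num[r] += wgt * X[k][r]
            V[j] = tuple(n / den for n in num)
    return U, V

X = [(0.0, 0.0), (0.2, 0.1), (6.0, 6.0), (6.2, 6.1)]
U, V = kernel_fcm_2(X)
print(sorted(round(v[0]) for v in V))  # -> [0, 6]
```

Each membership row sums to one, and the two centers settle near the two well-separated groups.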



2.2. Some supporting properties and theorems
2.2.1. Properties of the solutions

Property 1. The limits of the membership values are

\[
\lim_{m\to1^{+}}u_{kj}=1,\tag{23}
\]

\[
\lim_{m\to1^{-}}u_{kj}=-\frac{b_u\sum_{i=1}^{j-1}w_{ji}\left(a_u u_{ki}+b_u u_{k(i-1)}+\cdots+b_u^{i-1}u_{k1}\right)}{a_u+b_u\sum_{i=1}^{j-1}w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1}+\cdots+b_u+1\right)},\tag{24}
\]

\[
\lim_{m\to\infty}u_{kj}=\frac{1}{\sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij}}\left(1-\frac{b_u\sum_{i=1}^{j-1}w_{ji}\left(a_u u_{ki}+b_u u_{k(i-1)}+\cdots+b_u^{i-1}u_{k1}\right)}{a_u+b_u\sum_{i=1}^{j-1}w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1}+\cdots+b_u+1\right)}\right).\tag{25}
\]

The results in (24) and (25) differ from those in [26] by constant quantities. This means that applying the SIM2 model in the objective function makes the limits (24) and (25) dependent on the parameters of the model.
Property 2. Similarly, some limits of the hesitation level are

\[
\lim_{s\to1^{+}}h_{kj}=1,\tag{26}
\]

\[
\lim_{s\to1^{-}}h_{kj}=0,\tag{27}
\]

\[
\lim_{s\to\infty}h_{kj}=\frac{1}{\sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij}}\left(1-\frac{b_h\sum_{i=1}^{j-1}w_{ji}\left(a_h h_{ki}+b_h h_{k(i-1)}+\cdots+b_h^{i-1}h_{k1}\right)}{a_h+b_h\sum_{i=1}^{j-1}w_{ji}\frac{c_h}{A_h}\left(b_h^{i-1}+\cdots+b_h+1\right)}\right).\tag{28}
\]

Property 3. The limits of the typicality values are

\[
\lim_{g\to1^{+}}t_{kj}=\frac{1-b_t\sum_{i=1}^{j-1}w_{ji}\left(a_t t_{ki}+b_t t_{k(i-1)}+\cdots+b_t^{i-1}t_{k1}\right)}{a_t+\frac{c_t}{A_t}b_t\sum_{i=1}^{j-1}w_{ji}\left(b_t^{i-1}+\cdots+b_t+1\right)},\tag{29}
\]

\[
\lim_{g\to1^{-}}t_{kj}=1,\tag{30}
\]

\[
\lim_{g\to\infty}t_{kj}=\frac{1-b_t\sum_{i=1}^{j-1}w_{ji}\left(a_t t_{ki}+b_t t_{k(i-1)}+\cdots+b_t^{i-1}t_{k1}\right)}{a_t+\frac{c_t}{A_t}b_t\sum_{i=1}^{j-1}w_{ji}\left(b_t^{i-1}+\cdots+b_t+1\right)}.\tag{31}
\]

The remarks for Properties 2 and 3 are similar to that for Property 1; that is, the limits depend more on the parameters of the model than those of the MIPFGWC algorithm [26].

Property 4. If \(a_2=0\) then \(t_{kj}=1\), \(\forall k=\overline{1,N},\ j=\overline{1,C}\).

Contrary to the result in [26], the typicality values become degenerate when \(a_2=0\). This suggests that the setting \(a_2=0\) should be avoided in order to obtain the best clustering quality of the algorithm.

Property 5.

\[
\lim_{(m,g,s)\to(\infty,\infty,\infty)}V_j=\lim_{(m,g,s)\to(\infty,\infty,\infty)}\sum_{k=1}^{N}\frac{X_k}{\sum_{l=1}^{N}\dfrac{a_1\left(u'_{lj}\right)^{m}+a_2\left(t'_{lj}\right)^{g}+a_3\left(h'_{lj}\right)^{s}}{a_1\left(u'_{kj}\right)^{m}+a_2\left(t'_{kj}\right)^{g}+a_3\left(h'_{kj}\right)^{s}}}\cong\frac{\sum_{k=1}^{N}X_k}{1+(N-1)\lim_{(m,g,s)\to(\infty,\infty,\infty)}\dfrac{a_1\left(u'_{lj}\right)^{m}+a_2\left(t'_{lj}\right)^{g}+a_3\left(h'_{lj}\right)^{s}}{a_1\left(u'_{kj}\right)^{m}+a_2\left(t'_{kj}\right)^{g}+a_3\left(h'_{kj}\right)^{s}}}=\frac{\sum_{k=1}^{N}X_k}{N}.\tag{32}
\]

When the parameters are quite large, all the cluster centers tend to move to the central point of the dataset.
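Property 5 can be illustrated numerically in a simplified setting that drops the typicality and hesitation terms (plain FCM updates): when all centers sit at the grand mean, memberships become uniform and one update step maps the centers back onto the grand mean, i.e., it is a fixed point; raising the fuzzifier also pulls distinct centers toward each other. The function below is an illustrative sketch, not the paper's algorithm.

```python
def fcm_step(X, V, m):
    """One FCM iteration (memberships, then u^m-weighted centers) on 1-D data."""
    C = len(V)
    U = []
    for x in X:
        d = [max(abs(x - v), 1e-9) for v in V]  # guard zero distances
        U.append([1.0 / sum((d[j] / d[l]) ** (2.0 / (m - 1)) for l in range(C))
                  for j in range(C)])
    return [sum(U[k][j] ** m * X[k] for k in range(len(X))) /
            sum(U[k][j] ** m for k in range(len(X))) for j in range(C)]

X = [0.0, 1.0, 9.0, 10.0]
mean = sum(X) / len(X)                      # grand mean = 5.0
# With both centers at the grand mean, memberships are uniform (1/C) and one
# step returns the grand mean for every cluster: it is a fixed point.
print(fcm_step(X, [mean, mean], m=2.0))     # -> [5.0, 5.0]
# With distinct centers, a larger fuzzifier shrinks the one-step separation.
v_low = fcm_step(X, [1.0, 9.0], m=1.5)
v_high = fcm_step(X, [1.0, 9.0], m=10.0)
print(v_high[1] - v_high[0] < v_low[1] - v_low[0])  # -> True
```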




Property 6. The limits of the ratio between \(u_{kj}\) and \(h_{kj}\) are

\[
\lim_{\substack{m\to1^{+}\\ s\to1^{+}}}\frac{u_{kj}}{h_{kj}}=1,\tag{33}
\]

\[
\lim_{\substack{m\to1^{+}\\ s\to1^{-}}}\frac{u_{kj}}{h_{kj}}=\tilde{\infty},\tag{34}
\]

\[
\lim_{\substack{m\to\infty\\ s\to\infty}}\frac{u_{kj}}{h_{kj}}=\frac{\frac{1}{\sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij}}\left(1-\frac{b_u\sum_{i=1}^{j-1}w_{ji}\left(a_u u_{ki}+b_u u_{k(i-1)}+\cdots+b_u^{i-1}u_{k1}\right)}{a_u+b_u\sum_{i=1}^{j-1}w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1}+\cdots+b_u+1\right)}\right)}{\frac{1}{\sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij}}\left(1-\frac{b_h\sum_{i=1}^{j-1}w_{ji}\left(a_h h_{ki}+b_h h_{k(i-1)}+\cdots+b_h^{i-1}h_{k1}\right)}{a_h+b_h\sum_{i=1}^{j-1}w_{ji}\frac{c_h}{A_h}\left(b_h^{i-1}+\cdots+b_h+1\right)}\right)},\tag{35}
\]

where \(\tilde{\infty}\) denotes complex infinity.
2.2.2. The difference of solutions between algorithms
Firstly, the difference between the solutions of KFGC and MIPFGWC [26] is measured.

Theorem 2. Suppose that the same inputs and initialization are given to both KFGC and MIPFGWC. The upper bounds of the differences between the solutions of the two algorithms are:


\[
\left\|U^{(KMIPFGWC)}-U^{(MIPFGWC)}\right\|\le\frac{N\times C}{\sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij}}\times\sum_{j=1}^{C}b_u\sum_{i=1}^{j-1}w_{ji}\left(a_u u_{ki}+b_u u_{k(i-1)}+\cdots+b_u^{i-1}u_{k1}\right)\left(1-\frac{1}{a_u+b_u\sum_{i=1}^{j-1}w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1}+\cdots+b_u+1\right)}\right),\tag{36}
\]

\[
\left\|H^{(KMIPFGWC)}-H^{(MIPFGWC)}\right\|\le\frac{N\times C}{\sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij}}\times\sum_{j=1}^{C}b_h\sum_{i=1}^{j-1}w_{ji}\left(a_h h_{ki}+b_h h_{k(i-1)}+\cdots+b_h^{i-1}h_{k1}\right)\left(1-\frac{1}{a_h+b_h\sum_{i=1}^{j-1}w_{ji}\frac{c_h}{A_h}\left(b_h^{i-1}+\cdots+b_h+1\right)}\right),\tag{37}
\]

\[
\left\|T^{(KMIPFGWC)}-T^{(MIPFGWC)}\right\|\le N\times C\times\sum_{j=1}^{C}\left(\frac{1}{\gamma_j^{\frac{1}{g-1}}}-\frac{1}{\gamma_j^{\frac{1}{g-1}}+a_t+\frac{c_t}{A_t}b_t\sum_{i=1}^{j-1}w_{ji}\left(b_t^{i-1}+\cdots+b_t+1\right)}\right).\tag{38}
\]

From these results, the difference between the clustering qualities of the two algorithms can be estimated through the IFV index [3]. IFV was used to evaluate the clustering qualities of MIPFGWC and other algorithms in [26], and was shown to be robust and stable when clustering spatial data. The definition of IFV is stated below.
\[
IFV=\frac{1}{C}\sum_{j=1}^{C}\left\{\frac{1}{N}\sum_{k=1}^{N}u_{kj}^{2}\left[\log_2 C-\frac{1}{N}\sum_{k=1}^{N}\log_2 u_{kj}\right]^{2}\right\}\times\frac{SD_{\max}}{\sigma_D},\tag{39}
\]

\[
SD_{\max}=\max_{k\neq j}\left\|V_k-V_j\right\|^{2},\tag{40}
\]

\[
\sigma_D=\frac{1}{C}\sum_{j=1}^{C}\left(\frac{1}{N}\sum_{k=1}^{N}\left\|X_k-V_j\right\|^{2}\right).\tag{41}
\]
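The IFV index (39)–(41) can be computed directly from a membership matrix, the centers and the data. The sketch below is a straightforward transcription; the small floor `1e-12` guarding the logarithm is an implementation assumption.

```python
import math

def ifv(U, V, X):
    """IFV validity index (39)-(41): membership compactness (u_kj^2 and a
    log2-entropy gap) scaled by the ratio of the maximum squared center
    separation SD_max to the mean scatter sigma_D."""
    N, C = len(U), len(U[0])
    sd_max = max(math.dist(V[k], V[j]) ** 2
                 for k in range(C) for j in range(C) if k != j)
    sigma_d = sum(sum(math.dist(x, v) ** 2 for x in X) / N for v in V) / C
    total = 0.0
    for j in range(C):
        gap = math.log2(C) - sum(math.log2(max(U[k][j], 1e-12))
                                 for k in range(N)) / N
        total += sum(U[k][j] ** 2 for k in range(N)) / N * gap ** 2
    return total / C * (sd_max / sigma_d)

# A near-crisp partition of two tight groups scores higher than a vague one.
X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
V = [(0.05, 0.0), (5.05, 5.0)]
U_good = [[0.99, 0.01], [0.99, 0.01], [0.01, 0.99], [0.01, 0.99]]
U_vague = [[0.5, 0.5]] * 4
print(ifv(U_good, V, X) > ifv(U_vague, V, X))  # -> True
```

Larger IFV means better-separated centers and more confident memberships, which is why the theorems compare algorithms through it.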

The partition that maximizes IFV is regarded as the optimal one for the dataset. The difference of IFV values between KFGC and MIPFGWC is estimated as



\[
IFV^{KMIPFGWC}-IFV^{MIPFGWC}=\frac{1}{C}\sum_{j=1}^{C}\left\{\frac{1}{N}\sum_{k=1}^{N}\left(u_{kj}^{KMIPFGWC}\right)^{2}\left[\log_2 C-\frac{1}{N}\sum_{k=1}^{N}\log_2 u_{kj}^{KMIPFGWC}\right]^{2}\right\}\times\frac{SD_{\max}}{\sigma_D}-\frac{1}{C}\sum_{j=1}^{C}\left\{\frac{1}{N}\sum_{k=1}^{N}\left(u_{kj}^{MIPFGWC}\right)^{2}\left[\log_2 C-\frac{1}{N}\sum_{k=1}^{N}\log_2 u_{kj}^{MIPFGWC}\right]^{2}\right\}\times\frac{SD_{\max}}{\sigma_D}\tag{42}
\]

\[
=\frac{1}{C}\sum_{j=1}^{C}\frac{1}{N}\left(\sum_{k=1}^{N}\left(u_{kj}^{KMIPFGWC}\right)^{2}\left[\log_2 C-\frac{1}{N}\sum_{k=1}^{N}\log_2 u_{kj}^{KMIPFGWC}\right]^{2}-\sum_{k=1}^{N}\left(u_{kj}^{MIPFGWC}\right)^{2}\left[\log_2 C-\frac{1}{N}\sum_{k=1}^{N}\log_2 u_{kj}^{MIPFGWC}\right]^{2}\right)\times\frac{SD_{\max}}{\sigma_D}.\tag{43}
\]
Based upon the results in Theorem 2, we can recognize that \(IFV^{KMIPFGWC}-IFV^{MIPFGWC}\ge0\). In other words, the clustering quality of KFGC is generally better than that of MIPFGWC.
Secondly, the effectiveness of KFGC with and without the Gaussian kernel function is verified.

Theorem 3. If, in the objective function (3) of KFGC, the kernel function is replaced with the Euclidean function \(\left\|X_k-V_j\right\|^{2}\) and a proof similar to that of Theorem 1 is carried out, the new optimal solutions of KFGC without the Gaussian kernel function are obtained as in Eqs. (44)–(47).

\[
u_{kj}=\frac{1}{\left(\sum_{j'=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij'}\right)\sum_{l=1}^{C}\left(\frac{\left\|X_k-V_j\right\|^{2}}{\left\|X_k-V_l\right\|^{2}}\right)^{\frac{1}{m-1}}}\left(1-\frac{b_u\sum_{i=1}^{j-1}w_{ji}\left(a_u u_{ki}+b_u u_{k(i-1)}+\cdots+b_u^{i-1}u_{k1}\right)}{a_u+b_u\sum_{i=1}^{j-1}w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1}+\cdots+b_u+1\right)}\right),\quad k=\overline{1,N},\ j=\overline{1,C},\tag{44}
\]

\[
h_{kj}=\frac{1}{\left(\sum_{j'=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij'}\right)\sum_{l=1}^{C}\left(\frac{\left\|X_k-V_j\right\|^{2}}{\left\|X_k-V_l\right\|^{2}}\right)^{\frac{1}{s-1}}}\left(1-\frac{b_h\sum_{i=1}^{j-1}w_{ji}\left(a_h h_{ki}+b_h h_{k(i-1)}+\cdots+b_h^{i-1}h_{k1}\right)}{a_h+b_h\sum_{i=1}^{j-1}w_{ji}\frac{c_h}{A_h}\left(b_h^{i-1}+\cdots+b_h+1\right)}\right),\quad k=\overline{1,N},\ j=\overline{1,C},\tag{45}
\]

\[
t_{kj}=\frac{\gamma_j^{\frac{1}{g-1}}}{\gamma_j^{\frac{1}{g-1}}+\left[a_2\left(a_t+b_t\sum_{i=1}^{j-1}w_{ji}\frac{c_t}{A_t}\left(b_t^{i-1}+\cdots+b_t+1\right)\right)\left\|X_k-V_j\right\|^{2}\right]^{\frac{1}{g-1}}}-\frac{b_t\sum_{i=1}^{j-1}w_{ji}\left(a_t t_{ki}+b_t t_{k(i-1)}+\cdots+b_t^{i-1}t_{k1}\right)}{\gamma_j^{\frac{1}{g-1}}+a_t+\frac{c_t}{A_t}b_t\sum_{i=1}^{j-1}w_{ji}\left(b_t^{i-1}+\cdots+b_t+1\right)},\quad k=\overline{1,N},\ j=\overline{1,C},\tag{46}
\]

\[
V_j=\frac{\sum_{k=1}^{N}\left(a_1\,\hat u_{kj}^{\,m}+a_2\,\hat t_{kj}^{\,g}+a_3\,\hat h_{kj}^{\,s}\right)X_k}{\sum_{k=1}^{N}\left(a_1\,\hat u_{kj}^{\,m}+a_2\,\hat t_{kj}^{\,g}+a_3\,\hat h_{kj}^{\,s}\right)},\quad j=\overline{1,C},\tag{47}
\]

where \(\hat u_{kj}=a_u u_{kj}+b_u\sum_{i=1}^{j-1}w_{ji}\left(a_u u_{ki}+b_u u_{k(i-1)}+\cdots+b_u^{i-1}u_{k1}\right)+\frac{c_u}{A_u}\left(b_u^{i-1}+\cdots+b_u+1\right)u_{kj}\), and \(\hat t_{kj}\), \(\hat h_{kj}\) are defined analogously with the subscripts \(t\) and \(h\).
Theorem 4. The upper bounds of the differences between the solutions of KFGC with (K1) and without (K2) the Gaussian kernel function are:

\[
\left\|U^{(K1)}-U^{(K2)}\right\|\le N\times C\times\sum_{j=1}^{C}b_u\sum_{i=1}^{j-1}w_{ji}\left(a_u u_{ki}+b_u u_{k(i-1)}+\cdots+b_u^{i-1}u_{k1}\right)\left(1-\frac{1}{a_u+b_u\sum_{i=1}^{j-1}w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1}+\cdots+b_u+1\right)}\right),\tag{48}
\]

\[
\left\|H^{(K1)}-H^{(K2)}\right\|\le N\times C\times\sum_{j=1}^{C}b_h\sum_{i=1}^{j-1}w_{ji}\left(a_h h_{ki}+b_h h_{k(i-1)}+\cdots+b_h^{i-1}h_{k1}\right)\left(1-\frac{1}{a_h+b_h\sum_{i=1}^{j-1}w_{ji}\frac{c_h}{A_h}\left(b_h^{i-1}+\cdots+b_h+1\right)}\right),\tag{49}
\]

\[
\left\|T^{(K1)}-T^{(K2)}\right\|\le\sum_{j=1}^{C}\frac{N\times C\times\gamma_j^{\frac{1}{g-1}}}{\left(\gamma_j^{\frac{1}{g-1}}+a_t+\frac{c_t}{A_t}b_t\sum_{i=1}^{j-1}w_{ji}\left(b_t^{i-1}+\cdots+b_t+1\right)\right)\left[a_2\left(a_t+b_t\sum_{i=1}^{j-1}w_{ji}\frac{c_t}{A_t}\left(b_t^{i-1}+\cdots+b_t+1\right)\right)\right]^{\frac{1}{g-1}}}.\tag{50}
\]

Thus,

\[
IFV^{(K1)}\ge IFV^{(K2)}.\tag{51}
\]

Thirdly, the effectiveness of plugging the SIM2 model into the objective function (3) is verified.

Theorem 5. If, in the objective function (3) of KFGC, the updated membership values, typicality values and hesitation levels are replaced with their original ones and a proof similar to that of Theorem 1 is carried out, the new optimal solutions of KFGC without the SIM2 model in the objective function are obtained as in Eqs. (52)–(55).

\[
u_{kj}=\frac{1}{\left(\sum_{j'=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij'}\right)\sum_{l=1}^{C}\left(\frac{1-K(X_k,V_j)}{1-K(X_k,V_l)}\right)^{\frac{1}{m-1}}},\quad k=\overline{1,N},\ j=\overline{1,C},\tag{52}
\]

\[
h_{kj}=\frac{1}{\left(\sum_{j'=1}^{C}\frac{1}{C}\sum_{i=1}^{C}w_{ij'}\right)\sum_{l=1}^{C}\left(\frac{1-K(X_k,V_j)}{1-K(X_k,V_l)}\right)^{\frac{1}{s-1}}},\quad k=\overline{1,N},\ j=\overline{1,C},\tag{53}
\]

\[
t_{kj}=\frac{1}{1+\left(\frac{a_2\left(1-K(X_k,V_j)\right)}{\gamma_j}\right)^{\frac{1}{g-1}}},\quad k=\overline{1,N},\ j=\overline{1,C},\tag{54}
\]

\[
V_j=\frac{\sum_{k=1}^{N}\left(a_1 u_{kj}^{m}+a_2 t_{kj}^{g}+a_3 h_{kj}^{s}\right)X_k}{\sum_{k=1}^{N}\left(a_1 u_{kj}^{m}+a_2 t_{kj}^{g}+a_3 h_{kj}^{s}\right)},\quad j=\overline{1,C}.\tag{55}
\]
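The closed form (54) is the classic possibilistic typicality update; a short sketch shows how typicality decays as the kernel-induced distance grows. The parameter values for `a2`, `gamma` and `g` below are arbitrary illustrative choices.

```python
def typicality(kernel_dist, a2, gamma, g):
    """Typicality update (54):
    t_kj = 1 / (1 + (a2 * (1 - K(X_k, V_j)) / gamma_j)^(1 / (g - 1)))."""
    return 1.0 / (1.0 + (a2 * kernel_dist / gamma) ** (1.0 / (g - 1.0)))

# Typicality is 1 at zero distance and decays monotonically with distance.
for dist in (0.0, 0.5, 0.99):
    print(round(typicality(dist, a2=1.0, gamma=0.5, g=2.0), 3))
```

Unlike the membership values, this typicality does not depend on the distances to the other centers, which is the hallmark of the possibilistic component.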

Theorem 6. The upper bounds of the differences between the solutions of KFGC with (K1) and without (K3) the SIM2 model in the objective function are:

\[
\left\|U^{(K1)}-U^{(K3)}\right\|\le N\times C\times\sum_{j=1}^{C}\frac{b_u\sum_{i=1}^{j-1}w_{ji}\left(a_u u_{ki}+b_u u_{k(i-1)}+\cdots+b_u^{i-1}u_{k1}\right)}{a_u+b_u\sum_{i=1}^{j-1}w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1}+\cdots+b_u+1\right)},\tag{56}
\]

\[
\left\|H^{(K1)}-H^{(K3)}\right\|\le N\times C\times\sum_{j=1}^{C}\frac{b_h\sum_{i=1}^{j-1}w_{ji}\left(a_h h_{ki}+b_h h_{k(i-1)}+\cdots+b_h^{i-1}h_{k1}\right)}{a_h+b_h\sum_{i=1}^{j-1}w_{ji}\frac{c_h}{A_h}\left(b_h^{i-1}+\cdots+b_h+1\right)},\tag{57}
\]

\[
\left\|T^{(K1)}-T^{(K3)}\right\|\le N\times C\times\sum_{j=1}^{C}\frac{\gamma_j^{\frac{1}{g-1}}-b_t\sum_{i=1}^{j-1}w_{ji}\left(a_t t_{ki}+b_t t_{k(i-1)}+\cdots+b_t^{i-1}t_{k1}\right)}{\gamma_j^{\frac{1}{g-1}}+a_t+\frac{c_t}{A_t}b_t\sum_{i=1}^{j-1}w_{ji}\left(b_t^{i-1}+\cdots+b_t+1\right)}.\tag{58}
\]

Thus,

\[
IFV^{(K1)}\ge IFV^{(K3)}.\tag{59}
\]

Fourthly, it needs to be checked how using spatial bias correction in the constraints (14) and (15) changes the optimal
solutions of the system.
Theorem 7. The objective function (3) is kept intact and the standardized weights out of constraints (14) and (15) are removed
and the new optimal solutions of KFGC without using standardized weights of constraints as in Eqs. (60)–(63) are found.

 
\[
u_{kj} = \frac{\left(1 - K(X_k, V_j)\right)^{-\frac{1}{m-1}}}{\sum_{l=1}^{C}\left(1 - K(X_k, V_l)\right)^{-\frac{1}{m-1}}} \times \frac{1 - \dfrac{b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + b_u u_{k(i-1)} + \cdots + b_u^{i-1} u_{k1}\right)}{1 - K(X_k, V_j)}}{a_u + b_u \sum_{i=1}^{j-1} w_{ji} \frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right)}, \quad k = \overline{1,N},\; j = \overline{1,C}, \tag{60}
\]

\[
h_{kj} = \frac{\left(1 - K(X_k, V_j)\right)^{-\frac{1}{s-1}}}{\sum_{l=1}^{C}\left(1 - K(X_k, V_l)\right)^{-\frac{1}{s-1}}} \times \frac{1 - \dfrac{b_h \sum_{i=1}^{j-1} w_{ji}\left(a_h h_{ki} + b_h h_{k(i-1)} + \cdots + b_h^{i-1} h_{k1}\right)}{1 - K(X_k, V_j)}}{a_h + b_h \sum_{i=1}^{j-1} w_{ji} \frac{c_h}{A_h}\left(b_h^{i-1} + \cdots + b + 1\right)}, \quad k = \overline{1,N},\; j = \overline{1,C}, \tag{61}
\]

\[
t_{kj} = \frac{\left(\dfrac{c_j^{g-1}}{a_2\left(a_t + b_t \sum_{i=1}^{j-1} w_{ji} \frac{c_t}{A_t}\left(b_t^{i-1} + \cdots + b + 1\right)\right)\left(1 - K(X_k, V_j)\right)}\right)^{\frac{1}{g-1}} - b_t \sum_{i=1}^{j-1} w_{ji}\left(a_t t_{ki} + b_t t_{k(i-1)} + \cdots + b_t^{i-1} t_{k1}\right)}{c_j^{g-1} + a_t + \frac{c_t}{A_t} b_t \sum_{i=1}^{j-1} w_{ji}\left(b_t^{i-1} + \cdots + b + 1\right)}, \quad k = \overline{1,N},\; j = \overline{1,C}, \tag{62}
\]

\[
V_j = \frac{\sum_{k=1}^{N}\left[a_1\left(a_u u_{kj} + b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + \cdots + b_u^{i-1} u_{k1}\right) + \frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right) u_{kj}\right)^m + a_2\left(a_h h_{kj} + b_h \sum_{i=1}^{j-1} w_{ji}\left(a_h h_{ki} + \cdots + b_h^{i-1} h_{k1}\right) + \frac{c_h}{A_h}\left(b_h^{i-1} + \cdots + b + 1\right) h_{kj}\right)^s + a_3\left(a_t t_{kj} + b_t \sum_{i=1}^{j-1} w_{ji}\left(a_t t_{ki} + \cdots + b_t^{i-1} t_{k1}\right) + \frac{c_t}{A_t}\left(b_t^{i-1} + \cdots + b + 1\right) t_{kj}\right)^g\right] X_k}{\sum_{k=1}^{N}\left[\text{the same bracket}\right]}, \quad j = \overline{1,C}. \tag{63}
\]
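To make the structure of the membership update concrete, the following is a minimal sketch of its degenerate case w_ji = 0 and a_u = 1 (no neighborhood interaction), where the update collapses to the classical kernelized fuzzy c-means rule. The function names are illustrative, not the paper's API.

```python
# Hedged sketch: the membership update in the degenerate case w_ji = 0 and
# a_u = 1, where it reduces to the classical kernel-FCM rule
# u_kj = (1 - K(X_k, V_j))^(-1/(m-1)) / sum_l (1 - K(X_k, V_l))^(-1/(m-1)).
import math

def gaussian_kernel(x, v, sigma=2.0):
    d2 = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def memberships(x, centers, m=2.0, sigma=2.0):
    # kernel-induced distances 1 - K(x, v_j), clamped away from zero
    dist = [max(1.0 - gaussian_kernel(x, v, sigma), 1e-12) for v in centers]
    inv = [d ** (-1.0 / (m - 1.0)) for d in dist]
    s = sum(inv)
    return [z / s for z in inv]

u = memberships([0.0, 0.0], [[0.0, 0.1], [5.0, 5.0]])
```

By construction the memberships are non-negative and sum to one for each data point, which is the role the (removed) standardized-weight constraint plays in the full update.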
Theorem 8. The upper bounds of the difference between the solutions of KFGC with (K1) and without (K4) the standardized weights of constraints (14) and (15) are:

\[
\left\| U^{(K1)} - U^{(K4)} \right\| \le N \times C \times \sum_{j=1}^{C}\left(1 - \frac{b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + b_u u_{k(i-1)} + \cdots + b_u^{i-1} u_{k1}\right)}{\frac{1}{C}\sum_{i=1}^{C} w_{ij}\left(a_u + b_u \sum_{i=1}^{j-1} w_{ji} \frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right)\right)}\right), \tag{64}
\]

\[
\left\| H^{(K1)} - H^{(K4)} \right\| \le N \times C \times \sum_{j=1}^{C}\left(1 - \frac{b_h \sum_{i=1}^{j-1} w_{ji}\left(a_h h_{ki} + b_h h_{k(i-1)} + \cdots + b_h^{i-1} h_{k1}\right)}{\frac{1}{C}\sum_{i=1}^{C} w_{ij}\left(a_h + b_h \sum_{i=1}^{j-1} w_{ji} \frac{c_h}{A_h}\left(b_h^{i-1} + \cdots + b + 1\right)\right)}\right), \tag{65}
\]

\[
\left\| T^{(K1)} - T^{(K4)} \right\| \le N \times C \times \sum_{j=1}^{C} \frac{\frac{1}{c_j^{g-1}} - b_t \sum_{i=1}^{j-1} w_{ji}\left(a_t t_{ki} + b_t t_{k(i-1)} + \cdots + b_t^{i-1} t_{k1}\right)}{\frac{1}{C}\sum_{i=1}^{C} w_{ij}\left(c_j^{g-1} + a_t + \frac{c_t}{A_t} b_t \sum_{i=1}^{j-1} w_{ji}\left(b_t^{i-1} + \cdots + b + 1\right)\right)}. \tag{66}
\]

Thus,

\[
IFV^{(K1)} \ge IFV^{(K4)}. \tag{67}
\]

From Theorems 2–8, it is clear that the clustering quality of KFGC is better than that of MIPFGWC and of the other variants of KFGC.
2.3. The proposed algorithm
In this section, the KFGC algorithm is presented in detail.
Kernel Fuzzy Geographically Clustering (KFGC)

I: Dataset X of N elements in r dimensions; number of clusters C; threshold ε; other parameters: a_u, b_u, c_u, a_t, b_t, c_t, a_h, b_h, c_h; m, g, s > 1; a_i > 0 (i = 1,3); c_j > 0 (j = 1,C); a, b, c, d, σ.
O: Matrices U, T, H and centers V.

KFGC:
1: V_j^(0) <- random (j = 1,C); t = 0;
2: (u_k1^(0), ..., u_kC^(0)) <- random (k = 1,N);
3: (h_k1^(0), ..., h_kC^(0)) <- random (k = 1,N);
4: (t_k1^(0), ..., t_kC^(0)) <- random (k = 1,N), satisfying (12) and (13);
5: Repeat: t = t + 1;
6:   Calculate u_kj^(t) (k = 1,N; j = 1,C) by Eq. (19);
7:   Calculate h_kj^(t) (k = 1,N; j = 1,C) by Eq. (20);
8:   Calculate t_kj^(t) (k = 1,N; j = 1,C) by Eq. (21);
9:   Update V_j^(t) (j = 1,C) by Eq. (22);
10: Until ||V^(t) - V^(t-1)|| <= ε.
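As an illustration of how the ten steps fit together, the following is a hedged Python sketch of the iteration skeleton, reduced to a kernelized fuzzy c-means core: it assumes no neighborhood interaction (w_ji = 0) and drops the T and H matrices, so the updates below are simplified stand-ins for Eqs. (19)–(22), not the paper's exact formulas. All names are illustrative.

```python
# Hedged sketch of the KFGC loop (steps 1-10), reduced to a kernelized
# fuzzy-c-means core: w_ji = 0 and no T/H matrices, so the membership and
# center updates are simplified stand-ins for Eqs. (19)-(22).
import math

def gaussian_kernel(x, v, sigma=2.0):
    d2 = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def kfgc_core(X, C, m=2.0, sigma=2.0, eps=1e-4, max_iter=100):
    # step 1: initialize centers (spread over the data for reproducibility;
    # the algorithm above draws them at random)
    V = [list(X[round(j * (len(X) - 1) / (C - 1))]) for j in range(C)]
    for _ in range(max_iter):
        V_old = [list(v) for v in V]
        # step 6 (simplified): memberships from kernel-induced distances
        U = []
        for x in X:
            d = [max(1.0 - gaussian_kernel(x, v, sigma), 1e-12) for v in V]
            inv = [dj ** (-1.0 / (m - 1.0)) for dj in d]
            s = sum(inv)
            U.append([z / s for z in inv])
        # step 9 (simplified): kernel-weighted center update
        for j in range(C):
            num = [0.0] * len(X[0])
            den = 0.0
            for k, x in enumerate(X):
                w = (U[k][j] ** m) * gaussian_kernel(x, V[j], sigma)
                den += w
                for t in range(len(x)):
                    num[t] += w * x[t]
            V[j] = [nt / den for nt in num]
        # step 10: stop when the centers move less than eps
        if max(math.dist(v, vo) for v, vo in zip(V, V_old)) <= eps:
            break
    return U, V

# two well-separated 1-D groups: centers should settle near 0.2 and 10.0
X = [[0.0], [0.2], [0.4], [9.8], [10.0], [10.2]]
U, V = kfgc_core(X, C=2)
```

The Gaussian weighting in the center update is what makes the centers robust to distant points; the full KFGC additionally mixes in the SIM2-updated T and H matrices and the standardized geographic weights.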

3. Results
3.1. Experimental environments
In this part, the experimental environments are described as follows.
 Experimental tools: the proposed algorithm KFGC, together with FGWC [14] and MIPFGWC [26], was implemented in the C programming language and executed on a PC with an Intel(R) Core(TM)2 Duo CPU T6570 @ 2.10 GHz (2 CPUs), 2048 MB RAM, running Windows 7 Professional 32-bit. The experimental results are taken as the average values over 10 runs.
 Experimental dataset:
– A real dataset of socio-economic demographic variables from the United Nations Organization (UNO) [33], which was used for experiments in the articles [25,26]. It contains annual statistics of 230 nations on population size and composition, births, deaths, marriage and divorce, economic activity, educational attainment, household characteristics, etc. UNO will be used in Sections 3.2 and 3.3 to compare the clustering qualities of the algorithms and to examine the characteristics of KFGC, respectively.
– A benchmark UCI Machine Learning dataset called Coil 2000 [32], consisting of 9000 socio-demographic instances with 86 variables describing the customers of an insurance company. It will be used in Section 3.4 to validate the capability of KFGC to produce results that are more closely related to spatial relationships.
– All attributes/variables of these datasets are used concurrently for the best evaluation of the algorithms.
 Cluster validity measurement: the IFV validity function in Eqs. (57)–(59).
 Parameters setting: some parameters of KFGC, such as the threshold ε, are set up similarly to those of MIPFGWC [26].


L.H. Son / Information Sciences 317 (2015) 202–223


 Objective:
– To evaluate the clustering qualities and the computational time of all algorithms.
– To examine the characteristics of KFGC by various cases and parameters of the Gaussian kernel function.
– To validate the capabilities of KFGC to produce results which are more closely related to spatial relationships.
3.2. The comparisons of clustering quality and computational time
In Table 1, the IFV values of all algorithms are measured for various numbers of clusters and parameters (a, b, c) on the UNO dataset. The first remark from this table is that the IFV values of KFGC are larger and better than those of MIPFGWC and FGWC even if the number of clusters or the parameters (a, b, c) change. Specifically,
 In the first case of (a, b, c), when the number of clusters is 2, the IFV values of KFGC, MIPFGWC and FGWC are 5.1, 4.3 and 0.8, respectively. When the number of clusters increases to 3, the IFV values of all algorithms also become larger, but again the IFV value of KFGC is better than those of the other algorithms, i.e. 29.8, 22.2 and 3.8, respectively.
 The same remark holds in the other cases of (a, b, c); for instance, in the second case, (a, b, c) = (0.35, 0.4, 0.25), when the number of clusters is 5, the IFV values of KFGC, MIPFGWC and FGWC are 44.8, 37.4 and 8.9, respectively.
 Those experimental results have shown that the clustering quality of KFGC is better than those of other algorithms.
Secondly, the impact of the number of clusters on the IFV values of all algorithms is investigated. Clearly, Table 1 indicates that the IFV values of all algorithms increase slightly when the number of clusters increases:
 In the first case of (a, b, c), when the number of clusters changes from 3 to 4, the IFV value of KFGC increases from 29.8 to 32.9. Meanwhile, the IFV value of MIPFGWC (resp. FGWC) varies from 22.2 (resp. 3.8) to 30.6 (resp. 6.0). When we test with 5 clusters, the IFV value of KFGC increases from 32.9 to 40.7, and the IFV value of MIPFGWC (resp. FGWC) also changes from 30.6 (resp. 6.0) to 37.6 (resp. 7.9). The average increment ratios of KFGC, MIPFGWC and FGWC in the first case of (a, b, c) are 19%, 24.1% and 31.5%, respectively.
 In the second case, (a, b, c) = (0.35, 0.4, 0.25), the average increment ratios of KFGC, MIPFGWC and FGWC are 20.6%, 29.2% and 28.9%, respectively.
 The average increment ratios of KFGC in the third, fourth, fifth and sixth cases are 39.7%, 34.5%, 21.7% and 20%, respectively. Similarly, the values of MIPFGWC (resp. FGWC) are 36.3% (resp. 40.8%), 41.1% (resp. 54.6%), 26.4% (resp. 24.9%) and 26.1% (resp. 29.6%), respectively.
 The average increment ratios of KFGC, MIPFGWC and FGWC over all cases are 25.9%, 30.5% and 35%, respectively.
 Those ratios help us to predict the IFV values of the algorithms for a given number of clusters.
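The "average increment ratio" can be reproduced numerically. One plausible reading (our interpretation, not a formula from the paper) is the mean relative increase of IFV between consecutive cluster counts, averaged over the C = 3 → 7 transitions; with the FGWC column of the first case in Table 1 this yields the 31.5% quoted above.

```python
# Hedged reading of the "average increment ratio": mean relative IFV increase
# over the transitions C=3->4, ..., C=6->7 (the first jump C=2->3 excluded).
ifv_fgwc_case1 = [0.7674, 3.7982, 6.0314, 7.8566, 9.3534, 11.0136]  # C = 2..7

def avg_increment_ratio(values):
    ratios = [(b - a) / a for a, b in zip(values[1:-1], values[2:])]
    return sum(ratios) / len(ratios)

r = avg_increment_ratio(ifv_fgwc_case1)  # about 0.315, i.e. 31.5%
```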

Table 1
IFV values by geographic parameters and C.
(each row lists values for C = 2, 3, 4, 5, 6, 7)

(a, b, c) = (0.3, 0.25, 0.45):
KFGC^a:   5.0585   29.8394  32.8705  40.6905  52.7738  59.3786
MIPFGWC:  4.2773   22.1791  30.5811  37.6089  42.9346  52.1952
FGWC^b:   0.7674   3.7982   6.0314   7.8566   9.3534   11.0136

(a, b, c) = (0.35, 0.4, 0.25):
KFGC^a:   6.4445   26.2245  33.8837  44.7920  55.4991  53.8379
MIPFGWC:  2.0574   19.5822  29.6247  37.4115  45.8490  53.4887
FGWC^b:   0.8783   4.3664   7.2173   8.8986   9.7352   11.4477

(a, b, c) = (0.7, 0.2, 0.1):
KFGC^a:   11.0862  17.2164  31.4931  48.8323  49.0544  59.1806
MIPFGWC:  6.7451   16.4691  28.3652  39.1279  43.2684  53.7999
FGWC^b:   2.0891   7.0151   11.3263  18.7527  22.6014  26.0993

(a, b, c) = (0.55, 0.15, 0.3):
KFGC^a:   8.3532   20.0305  39.2364  43.5082  54.0188  57.7399
MIPFGWC:  4.0304   15.2013  31.0666  37.5862  47.4099  53.6075
FGWC^b:   0.6440   4.1440   10.3852  13.4101  16.6708  19.0490

(a, b, c) = (0.34, 0.33, 0.33):
KFGC^a:   7.8089   24.5602  30.5453  37.4965  51.8698  52.5315
MIPFGWC:  2.0337   20.6662  28.3496  36.5597  46.4185  52.1431
FGWC^b:   0.4040   4.6698   6.7717   8.7119   9.4352   11.1229

(a, b, c) = (0.5, 0.3, 0.2):
KFGC^a:   8.07516  26.2767  35.7169  43.7499  44.5557  53.4076
MIPFGWC:  3.4353   20.7994  31.2177  38.3869  42.0896  51.2836
FGWC^b:   1.0529   6.3679   10.1296  12.0988  14.4043  17.4037

^a a_u = a_t = a_h = a; b_u = b_t = b_h = b; c_u = c_t = c_h = c.
^b The b value in FGWC is equal to the sum of b and c.



Thirdly, the impact of the parameters (a, b, c) on the IFV values of all algorithms is examined. The results show that the IFV values of KFGC and MIPFGWC are stable across the various cases of parameters. Some evidence is given as follows.

 The average IFV values of KFGC from the first to the sixth case are 36.8, 36.8, 36.1, 37.1, 34.1 and 35.3, respectively. There is only a one-IFV-value gap between the maximal and minimal average IFV values of KFGC in those cases.
 Analogously, the average IFV values of MIPFGWC from the first to the sixth case are 31.6, 31.3, 31.3, 31.5, 31.0 and 31.2, respectively.
 The FGWC algorithm is not stable, since the difference between its maximal and minimal average IFV values is about 8.
 From these numbers, we can recognize that the average IFV values of KFGC are still better than those of MIPFGWC and FGWC. Besides, the effectiveness of KFGC is independent of the changes of parameters.
Fourthly, we determine which case of parameters yields the best IFV values of KFGC.
 From the average IFV values of KFGC above, it follows that the fourth case, (a, b, c) = (0.55, 0.15, 0.3), is the best case of parameters, which means that we should set a medium value of parameter a, a low value of b and a high value of c in order to obtain large IFV values in KFGC.
 However, if each IFV value of the fourth case is observed, the difference between the results of two consecutive numbers of clusters is irregular.
 On the contrary, the third case, (a, b, c) = (0.7, 0.2, 0.1), makes the difference between the results of two consecutive numbers of clusters almost equal.
 Thus, our recommendation is to choose a large value of a, a medium value of b and a low value of c in order to achieve the best IFV values of KFGC.
Lastly, the computational time of all algorithms is recorded in Table 2. The results show that KFGC runs longer than MIPFGWC and FGWC. Some evidence is as follows.
 In the first case, when the number of clusters is 2, the computational times of KFGC, MIPFGWC and FGWC are 0.67, 0.18 and 0.11 s, respectively.
 The average computational times of KFGC from the first to the sixth case are 2.7, 3.4, 3.1, 3.6, 2.7 and 2.8 s, respectively. The corresponding values of MIPFGWC (resp. FGWC) are 1.29 (resp. 0.62), 1.27 (resp. 0.6), 1.31 (resp. 0.59), 1.14 (resp. 0.59), 1.23 (resp. 0.61) and 1.28 (resp. 0.58) seconds.
 The average computational time of KFGC over the various cases of parameters and numbers of clusters is 2.43 and 5 times larger than those of MIPFGWC and FGWC, respectively. Nonetheless, it takes only about 3 s for each run that processes a given number of clusters and a case of parameters.
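The 2.43× and 5× figures can be cross-checked directly from the per-case average times quoted above (a sketch using the rounded values from the text):

```python
# Cross-check of the "2.43 and 5 times" claim from the per-case average
# running times (seconds) quoted in the text.
kfgc    = [2.7, 3.4, 3.1, 3.6, 2.7, 2.8]
mipfgwc = [1.29, 1.27, 1.31, 1.14, 1.23, 1.28]
fgwc    = [0.62, 0.60, 0.59, 0.59, 0.61, 0.58]

def mean(xs):
    return sum(xs) / len(xs)

ratio_mip  = mean(kfgc) / mean(mipfgwc)  # about 2.43
ratio_fgwc = mean(kfgc) / mean(fgwc)     # about 5.1
```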

Table 2
The computational time of algorithms by geographic parameters and C (s).
(each row lists values for C = 2, 3, 4, 5, 6, 7)

(a, b, c) = (0.3, 0.25, 0.45):
KFGC^a:   0.6744  2.1971  2.7883  2.8166  3.1938  4.7668
MIPFGWC:  0.1775  0.5597  1.2513  1.3622  1.7202  2.6462
FGWC^b:   0.1092  0.3247  0.7456  0.7084  0.8612  0.9814

(a, b, c) = (0.35, 0.4, 0.25):
KFGC^a:   1.6702  2.7612  3.0743  3.5081  3.9346  5.5056
MIPFGWC:  0.2180  0.6485  1.0134  1.3873  1.8120  2.5304
FGWC^b:   0.1048  0.3039  0.6639  0.6798  0.9009  1.0003

(a, b, c) = (0.7, 0.2, 0.1):
KFGC^a:   2.5559  2.8355  2.9155  3.1376  3.4095  3.4632
MIPFGWC:  0.2523  0.6563  1.2005  1.2903  1.9696  2.4851
FGWC^b:   0.1201  0.3713  0.5420  0.6177  0.8820  1.0611

(a, b, c) = (0.55, 0.15, 0.3):
KFGC^a:   1.6098  3.4167  3.6922  3.9845  4.2604  4.3596
MIPFGWC:  0.2231  0.7059  0.9596  1.3778  1.6378  1.9538
FGWC^b:   0.0942  0.3178  0.5180  0.6467  0.7749  1.2478

(a, b, c) = (0.34, 0.33, 0.33):
KFGC^a:   1.2190  2.2414  2.9427  3.1366  3.3171  3.5604
MIPFGWC:  0.1786  0.6310  0.9881  1.3555  1.7106  2.5288
FGWC^b:   0.0870  0.3028  0.6041  0.6989  0.8970  1.0985

(a, b, c) = (0.5, 0.3, 0.2):
KFGC^a:   1.3499  2.1719  2.39459 2.9959  3.7792  4.1296
MIPFGWC:  0.2124  0.6338  0.9905  1.2886  2.4830  2.1251
FGWC^b:   0.1314  0.3411  0.5347  0.6416  0.7734  1.0946

^a a_u = a_t = a_h = a; b_u = b_t = b_h = b; c_u = c_t = c_h = c.
^b The b value in FGWC is equal to the sum of b and c.



Table 3
The IFV values of KFGC by various cases.

C   Case 1    Case 2    Case 3    Case 4    Case 5    Case 6    Case 7
2   11.7049   20.0233   10.1541   8.2307    9.2380    10.4698   11.0862
3   17.9523   25.4080   15.7234   16.3287   15.9463   16.9254   17.2164
4   35.0811   39.7139   30.3731   29.1589   30.7203   31.0940   31.4931
5   51.5421   56.8321   47.8757   48.7489   48.8075   46.9292   48.8323
6   50.4248   59.0267   47.9637   46.4649   47.6161   46.1987   49.0544
7   60.0569   59.6235   57.9032   57.1424   58.7163   56.4657   59.1806

Table 4
The IFV values of KFGC by σ of the Gaussian kernel function.

C   σ=1.0     σ=1.5     σ=2.0     σ=2.5     σ=3.0     σ=3.5     σ=4.0
2   10.5376   9.4920    11.0862   11.6872   13.7415   14.4181   14.1753
3   15.6419   15.3285   17.2164   17.6214   18.4386   19.8629   18.1979
4   30.6946   29.9652   31.4931   32.9895   33.1232   33.8038   35.4039
5   45.5547   48.0115   48.8323   49.0008   49.6653   50.1130   50.8741
6   46.6749   47.5884   49.0544   50.8182   51.3159   50.0033   52.6601
7   56.5632   58.2222   59.1806   61.0639   60.6796   60.5061   64.1375

 Thus, the computational time of KFGC is not too large and can be acceptable.
The final conclusion of Section 3.2 is:
– The clustering quality of KFGC is better than those of MIPFGWC and FGWC.
– KFGC is stable through various cases of parameters.
– We should choose a large value of parameter a, a medium value of b and a low value of c in order to achieve the best IFV
values of KFGC.
– The computational time of KFGC is acceptable.
3.3. The characteristics of KFGC
In this part, the characteristics of KFGC are investigated by the various cases defined below and by the values of the parameter σ of the Gaussian kernel function on the UNO dataset. The aim is to verify the impact of these parameters on the IFV values of KFGC. The results are given in Tables 3 and 4, respectively.








Case 1 (a_u > a_t > a_h): (a_u, b_u, c_u) = (0.7, 0.2, 0.1); (a_t, b_t, c_t) = (0.6, 0.15, 0.25); (a_h, b_h, c_h) = (0.5, 0.2, 0.3).
Case 2 (a_u > a_h > a_t): (a_u, b_u, c_u) = (0.7, 0.2, 0.1); (a_t, b_t, c_t) = (0.5, 0.2, 0.3); (a_h, b_h, c_h) = (0.6, 0.15, 0.25).
Case 3 (a_t > a_u > a_h): (a_u, b_u, c_u) = (0.6, 0.15, 0.25); (a_t, b_t, c_t) = (0.7, 0.2, 0.1); (a_h, b_h, c_h) = (0.5, 0.2, 0.3).
Case 4 (a_t > a_h > a_u): (a_u, b_u, c_u) = (0.5, 0.2, 0.3); (a_t, b_t, c_t) = (0.7, 0.2, 0.1); (a_h, b_h, c_h) = (0.6, 0.15, 0.25).
Case 5 (a_h > a_u > a_t): (a_u, b_u, c_u) = (0.6, 0.15, 0.25); (a_t, b_t, c_t) = (0.5, 0.2, 0.3); (a_h, b_h, c_h) = (0.7, 0.2, 0.1).
Case 6 (a_h > a_t > a_u): (a_u, b_u, c_u) = (0.5, 0.2, 0.3); (a_t, b_t, c_t) = (0.6, 0.15, 0.25); (a_h, b_h, c_h) = (0.7, 0.2, 0.1).
Case 7 (a_u = a_t = a_h): (a_u, b_u, c_u) = (a_t, b_t, c_t) = (a_h, b_h, c_h) = (0.7, 0.2, 0.1).

From Table 3, we recognize that Case 2 produces the best results among all.
 When C ¼ 2, the IFV value of Case 2 is 20, which is 1.71, 1.97, 2.43, 2.17, 1.91 and 1.81 times larger than those of Case 1,
Case 3, Case 4, Case 5, Case 6 and Case 7, respectively.
 The average IFV value of Case 2 by the number of clusters is 43.4, whilst those of Case 1 and from Case 3 to Case 7 are
37.79, 34.99, 34.34, 35.17, 34.68 and 36.14, respectively.
 Those average IFV values indicate the ranking of cases by IFV, which is Case 2, Case 1, Case 7, Case 5, Case 3, Case 6 and Case 4.
 This order gives us two remarks:
– In order to achieve the best IFV values of KFGC, the parameters should be set as in Case 2 (a_u > a_h > a_t).
– The changes of IFV in KFGC can be observed through this order.
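The Case 2 figures quoted above can be checked directly against the values transcribed from Table 3:

```python
# Sanity check of the Case 2 remarks using the Table 3 values (C = 2..7).
case2 = [20.0233, 25.4080, 39.7139, 56.8321, 59.0267, 59.6235]
others_at_c2 = [11.7049, 10.1541, 8.2307, 9.2380, 10.4698, 11.0862]  # cases 1, 3..7

avg_case2 = sum(case2) / len(case2)                    # about 43.4
ratios = [round(case2[0] / v, 2) for v in others_at_c2]
```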



Now, the IFV values of KFGC by σ of the Gaussian kernel function are examined in Table 4.

 The results reveal that the IFV values of KFGC are directly proportional to the value of σ. In other words, the higher the value of σ, the larger the IFV value that KFGC can achieve.
 For example, when C = 4, the IFV values of KFGC from σ = 1.0 to σ = 4.0 are 30.7, 29.9, 31.5, 32.9, 33.1, 33.8 and 35.4, respectively.
 Thus, a high value of σ should be set in order to achieve the best IFV values of KFGC.
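This trend is consistent with how the Gaussian kernel behaves: the kernel-induced distance 1 − K(x, v) = 1 − exp(−‖x − v‖²/(2σ²)) shrinks monotonically as σ grows, which dampens the influence of distant points. A small illustration (the squared distance of 4 is our choice, for demonstration only):

```python
# Kernel-induced distance 1 - exp(-d2 / (2 sigma^2)) for a fixed squared
# distance d2 = 4 and growing sigma: the distance shrinks monotonically.
import math

def kernel_distance(d2, sigma):
    return 1.0 - math.exp(-d2 / (2.0 * sigma ** 2))

row = [kernel_distance(4.0, s) for s in (1.0, 2.0, 3.0, 4.0)]
```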
The final conclusion of Section 3.3 is:
– In order to obtain the best clustering quality in KFGC, either the parameters should be set as a_u > a_h > a_t or the value of the parameter σ should be high;
– The changes of the IFV values of KFGC by various cases and parameters can be referenced in Tables 3 and 4.
3.4. The validation of spatial relationships and outliers elimination in KFGC
This section validates the capabilities of KFGC to produce results that are (i) more closely related to the spatial relationships and (ii) contain fewer outliers than other relevant algorithms, as stated in Section 1.
The experiments were conducted on the Coil 2000 dataset, where the first 43 attributes relate to the socio-demographic data and the last 43 attributes describe product ownership. The target variable has two classes, ‘‘0’’ and ‘‘1’’, showing the number of mobile home policies. The distribution of the dataset according to the attribute ‘‘Customer Subtype’’ is depicted in Fig. 2. The accurate classification results of the Coil 2000 dataset, including 9236 instances of class ‘‘0’’ (Group 1) and 586 instances of class ‘‘1’’ (Group 2) according to the description of the dataset, are described in Fig. 3.
Using the FGWC method [14], the classification results on the Coil 2000 dataset are depicted in Fig. 4. Some analyses are
shown as follows.
 Obviously, the number of wrong prediction results increases remarkably in comparison with the accurate distribution in
Fig. 3.
 The number of data points in ‘‘Group 2’’ (Blue color) is more than that in Fig. 3 and tends to expand to the entire space
instead of a small right-corner sub-space.

 Some data points in the left corner were changed from ‘‘Group 1’’ to ‘‘Group 2’’, so that the number of data points belonging to ‘‘Group 2’’ in this area of Fig. 4 rises dramatically. The number of boundary points belonging to ‘‘Group 2’’ also increases.
 The reason for this fact is that FGWC was constructed on the basis of traditional fuzzy sets and the SIM model, so that the classification results contain large numbers of outliers. As mentioned in Section 1, the SIM model considers the update of the (old) neighboring groups only and does not take into account the newly updated neighboring groups, so that the classification results of FGWC are less closely related to the spatial relationships and contain a large number of outliers.
Analogously, the classification results of the MIPFGWC algorithm [26] are illustrated in Fig. 5. We clearly recognize some
crucial remarks from this figure.

Fig. 2. The distribution of Coil 2000 dataset by ‘‘Customer Subtype’’.


218

L.H. Son / Information Sciences 317 (2015) 202–223

Fig. 3. The accurate distribution of Coil 2000 dataset.

Fig. 4. The classification results of FGWC.

 The number of wrong prediction results in MIPFGWC reduces remarkably in comparison with that in FGWC. Even though
the data points of ‘‘Group 2’’ are still located in the entire space, yet it can be seen that the problem of data concentration
in the left-corner and in the boundary sides does not exist in MIPFGWC as illustrated in Fig. 5.
 The interactions between cluster memberships by the SIM2 model re-calculated the values of memberships and made the
correct labels for data points. The number of correct classification results in this algorithm is 8068 over 9822 labeled data
points.
 MIPFGWC improves FGWC by considering two crucial points: (i) the algorithm is deployed on intuitionistic fuzzy sets instead of traditional fuzzy sets in order to handle the vagueness in the membership function and to process the hesitation levels, thus giving more accurate clustering results; (ii) the SIM2 model, which takes into account the newly updated neighboring groups, is used instead of the SIM model. Thus, MIPFGWC not only has better clustering quality than FGWC but also contains fewer outliers than that algorithm.
 Nevertheless, if the results in Fig. 5 are observed carefully, it can be seen that some data points, such as those in the upper-right corner (the blanks and the boundaries), are not labeled. These points are the outliers of the MIPFGWC algorithm. Since the SIM2 model does not update the typicality values and the hesitation levels, the classification results are incorrect, because the calculation of the membership matrix is performed through those values. In the intuitionistic


Fig. 5. The classification results of MIPFGWC.

Fig. 6. The classification results of KFGC.

possibilistic fuzzy model, the typicality values and the hesitation levels have great impacts on the decision of which cluster a data point belongs to. Thus, the local update with the previous cluster memberships would make the next memberships diversified, and outliers could occur as a result.
In Fig. 6, there are the classification results of the KFGC algorithm and some remarks as follows.
 By comparing the data points of ‘‘Group 2’’ in Figs. 3–6, it is clear that the distribution of those points for KFGC in Fig. 6 is more similar to the accurate distribution in Fig. 3 than those of MIPFGWC (Fig. 5) and FGWC (Fig. 4), which are either irregular, misclassified or overly concentrated.
 This proves the capability of KFGC to produce results that are more closely related to spatial relationships. In addition, the number of outliers is reduced, and the KFGC algorithm produces better classification results than the other algorithms, as shown in Fig. 6. In this case, the number of correct classification results of this algorithm is 9244 over 9822 labeled data points.
 By providing the new SIM2 update scheme in the objective function, the spatial bias correction and the Gaussian kernel function, the classification results of KFGC are proven to be closely related to spatial relationships.
From these figures, it is obvious that KFGC achieves better classification results than other algorithms as proven by the
following facts.



 The quantities of ‘‘Group 1’’ in both KFGC and MIPFGWC are nearly equal and larger than that of FGWC, with the numbers being 8680, 8660 and 7602, respectively. Those numbers are smaller than the actual value, which is 9236 data points belonging to ‘‘Group 1’’. Similar facts are found for ‘‘Group 2’’.
 The classification results of KFGC are better than those of MIPFGWC and FGWC with the numbers being 94.1%, 87.6% and
83.1%, respectively.
The final conclusion of Section 3.4 is:
– KFGC is more efficient than other relevant algorithms in terms of robustness to outliers and spatially-guaranteed results;
– The accuracy of KFGC is approximately 94%.
4. Conclusions
In this paper, a novel kernel-based intuitionistic possibilistic fuzzy clustering algorithm called KFGC was introduced for the Geo-Demographic Analysis problem. The objective function of KFGC employs the Gaussian kernel function instead of the traditional Euclidean distance, uses the membership values, typicality values and hesitation levels updated by the SIM2 model, and makes the spatial bias correction through the standardized weights. By doing so, KFGC can produce results that are more closely related to spatial relationships and eliminate outliers, since the membership values, the hesitation levels, the typicality values and the centers are ‘‘geographically aware’’. Some properties of KFGC's solutions and the comparison of clustering quality between KFGC and other algorithms were theoretically validated. The experimental results on a benchmark dataset also re-affirmed that the clustering quality of KFGC is better than those of other relevant algorithms.
Further research will investigate the use of context variables in KFGC and variants of this algorithm in distributed environments.
Acknowledgements
The authors are greatly indebted to the editor-in-chief, Prof. W. Pedrycz; anonymous reviewers for their comments and
their valuable suggestions that improved the quality and clarity of paper. Other thanks are sent to Mr. Nguyen Van Canh and
Ms. Hoang Thi Thu Huong, FPT for some experimental works and language editing, respectively. This research is funded by
Vietnam National Foundation for Science and Technology Development (NAFOSTED) under Grant No. 102.05-2014.01.
Appendix A. Proof of Theorem 1

Fixing T, H and V, for the kth column U_k of U we obtain the reduced problem:

\[
\sum_{j=1}^{C} a_1\left(a_u u_{kj} + b_u \sum_{i=1}^{j-1} w_{ji} u'_{ki} + \frac{c_u}{A_u}\sum_{i=j}^{C} w_{ji} u_{ki}\right)^{m}\left(1 - K(X_k, V_j)\right) \to \min, \tag{A.1}
\]

\[
= \sum_{j=1}^{C} a_1\left(a_u u_{kj} + b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + b_u u_{k(i-1)} + \cdots + b_u^{i-1} u_{k1}\right) + \frac{c_u}{A_u}\sum_{i=j}^{C} w_{ji} u_{ki}\right)^{m}\left(1 - K(X_k, V_j)\right) \to \min. \tag{A.2}
\]

Using the Lagrange multiplier for (A.2) we obtain

\[
L(U_k, \lambda_k) = \sum_{j=1}^{C} a_1\left(a_u u_{kj} + b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + \cdots + b_u^{i-1} u_{k1}\right) + \frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right) u_{kj} + \frac{c_u}{A_u}\sum_{i=j}^{C} w_{ji} u_{ki}\right)^{m}\left(1 - K(X_k, V_j)\right) - \lambda_k\left(\sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C} w_{ij} u_{kj} - 1\right), \tag{A.3}
\]

\[
\Rightarrow \frac{\partial L(U_k,\lambda_k)}{\partial u_{kj}} = a_1 m\left(a_u + b_u \sum_{i=1}^{j-1} w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right)\right)\left(1 - K(X_k, V_j)\right)\left(a_u u_{kj} + b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + \cdots + b_u^{i-1} u_{k1}\right) + \frac{c_u}{A_u} u_{kj}\left(b_u^{i-1} + \cdots + b + 1\right) + \frac{c_u}{A_u}\sum_{i=j}^{C} w_{ji} u_{ki}\right)^{m-1} - \lambda_k \frac{1}{C}\sum_{i=1}^{C} w_{ij}. \tag{A.4}
\]



221

L.H. Son / Information Sciences 317 (2015) 202–223

Since \(\frac{\partial L(U_k,\lambda_k)}{\partial u_{kj}} = 0\), we get

\[
u_{kj} = \frac{\left(\dfrac{\lambda_k \frac{1}{C}\sum_{i=1}^{C} w_{ij}}{a_1 m\left(a_u + b_u \sum_{i=1}^{j-1} w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right)\right)\left(1 - K(X_k, V_j)\right)}\right)^{\frac{1}{m-1}} - b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + b_u u_{k(i-1)} + \cdots + b_u^{i-1} u_{k1}\right)}{a_u + b_u \sum_{i=1}^{j-1} w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right)}, \quad j = \overline{1,C},\; k = \overline{1,N}. \tag{A.5}
\]

Due to the constraint (14), we have

\[
\lambda_k^{\frac{1}{m-1}} = \frac{1 + \sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C} w_{ij}\,\dfrac{b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + b_u u_{k(i-1)} + \cdots + b_u^{i-1} u_{k1}\right)}{a_u + b_u \sum_{i=1}^{j-1} w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right)}}{\sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C} w_{ij}\,\dfrac{\left(a_1 m\left(a_u + b_u \sum_{i=1}^{j-1} w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right)\right)\left(1 - K(X_k, V_j)\right)\right)^{-\frac{1}{m-1}}}{a_u + b_u \sum_{i=1}^{j-1} w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right)}}, \quad k = \overline{1,N}. \tag{A.6}
\]

From (A.5) and (A.6), we have

\[
u_{kj} = \frac{\left(1 - K(X_k, V_j)\right)^{-\frac{1}{m-1}}}{\sum_{l=1}^{C}\frac{1}{C}\sum_{i=1}^{C} w_{il}\left(1 - K(X_k, V_l)\right)^{-\frac{1}{m-1}}} \times \frac{1 - \dfrac{b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + b_u u_{k(i-1)} + \cdots + b_u^{i-1} u_{k1}\right)}{1 - K(X_k, V_j)}}{a_u + b_u \sum_{i=1}^{j-1} w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right)}, \quad k = \overline{1,N},\; j = \overline{1,C}. \tag{A.7}
\]
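A quick numeric sanity check of this Lagrange argument is possible in the degenerate case w_ji = 0, a_u = a_1 = 1 (our simplification): there, minimizing J(u) = Σ_j u_j^m d_j subject to Σ_j u_j = 1 yields u_j ∝ d_j^{-1/(m-1)}, and the closed form should not be improved by any feasible perturbation.

```python
# Numeric check (degenerate case): the closed-form memberships are a
# constrained minimum of J(u) = sum_j u_j^m d_j with sum_j u_j = 1.
m = 2.0
d = [0.9, 0.4, 0.7]  # stand-in kernel-induced distances 1 - K(x, v_j)
inv = [dj ** (-1.0 / (m - 1.0)) for dj in d]
u = [z / sum(inv) for z in inv]

def J(uu):
    return sum(ujj ** m * djj for ujj, djj in zip(uu, d))

base = J(u)
eps = 1e-3
# shifting mass between coordinates keeps the constraint but must not lower J
ok = all(J([u[0] + s * eps, u[1] - s * eps, u[2]]) >= base for s in (1, -1))
```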

Similarly, fixing U, T and V for the kth column H_k of H and using the Lagrange multiplier, we get the solution in (A.8):

\[
h_{kj} = \frac{\left(1 - K(X_k, V_j)\right)^{-\frac{1}{s-1}}}{\sum_{l=1}^{C}\frac{1}{C}\sum_{i=1}^{C} w_{il}\left(1 - K(X_k, V_l)\right)^{-\frac{1}{s-1}}} \times \frac{1 - \dfrac{b_h \sum_{i=1}^{j-1} w_{ji}\left(a_h h_{ki} + b_h h_{k(i-1)} + \cdots + b_h^{i-1} h_{k1}\right)}{1 - K(X_k, V_j)}}{a_h + b_h \sum_{i=1}^{j-1} w_{ji}\frac{c_h}{A_h}\left(b_h^{i-1} + \cdots + b + 1\right)}, \quad k = \overline{1,N},\; j = \overline{1,C}. \tag{A.8}
\]

Next, we fix U, H and V for the typicality value t_kj and obtain the reduced problem in (A.9)–(A.11):

\[
J(t_{kj}) = \sum_{j=1}^{C} a_2\left(t'_{kj}\right)^{g}\left(1 - K(X_k, V_j)\right) + c_j\left(1 - t_{kj}\right)^{g} \to \min, \tag{A.9}
\]

\[
= \sum_{j=1}^{C} a_2\left(a_t t_{kj} + b_t \sum_{i=1}^{j-1} w_{ji} t'_{ki} + \frac{c_t}{A_t}\sum_{i=j}^{C} w_{ji} t_{ki}\right)^{g}\left(1 - K(X_k, V_j)\right) + c_j\left(1 - t_{kj}\right)^{g} \to \min, \tag{A.10}
\]

\[
= \sum_{j=1}^{C} a_2\left(a_t t_{kj} + b_t \sum_{i=1}^{j-1} w_{ji}\left(a_t t_{ki} + b_t t_{k(i-1)} + \cdots + b_t^{i-1} t_{k1}\right) + \frac{c_t}{A_t}\sum_{l=j}^{C} w_{jl} t_{kl}\right)^{g}\left(1 - K(X_k, V_j)\right) + c_j\left(1 - t_{kj}\right)^{g} \to \min. \tag{A.11}
\]

Since \(\frac{\partial J(t_{kj})}{\partial t_{kj}} = 0\), we get

\[
\left(a_t t_{kj} + b_t \sum_{i=1}^{j-1} w_{ji}\left(a_t t_{ki} + b_t t_{k(i-1)} + \cdots + b_t^{i-1} t_{k1}\right) + \frac{c_t}{A_t} t_{kj}\left(b_t^{i-1} + \cdots + b + 1\right) + \frac{c_t}{A_t}\sum_{i=j}^{C} w_{ji} t_{ki}\right)^{g-1} a_2\left(a_t + b_t \sum_{i=1}^{j-1} w_{ji}\frac{c_t}{A_t}\left(b_t^{i-1} + \cdots + b + 1\right)\right)\left(1 - K(X_k, V_j)\right) = c_j\left(1 - t_{kj}\right)^{g-1}, \tag{A.12}
\]


\[
\Rightarrow t_{kj} = \frac{\left(\dfrac{c_j^{g-1}}{a_2\left(a_t + b_t \sum_{i=1}^{j-1} w_{ji}\frac{c_t}{A_t}\left(b_t^{i-1} + \cdots + b + 1\right)\right)\left(1 - K(X_k, V_j)\right)}\right)^{\frac{1}{g-1}} - b_t \sum_{i=1}^{j-1} w_{ji}\left(a_t t_{ki} + b_t t_{k(i-1)} + \cdots + b_t^{i-1} t_{k1}\right)}{c_j^{g-1} + a_t + \frac{c_t}{A_t} b_t \sum_{i=1}^{j-1} w_{ji}\left(b_t^{i-1} + \cdots + b + 1\right)}, \quad k = \overline{1,N},\; j = \overline{1,C}. \tag{A.13}
\]

Finally, we take the derivative of \(J_{m,g,s}(V)\) with respect to each \(V_j\):

\[
\frac{\partial J_{m,g,s}(V)}{\partial V_j} = \sum_{k=1}^{N}\left[a_1\left(a_u u_{kj} + b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + \cdots + b_u^{i-1} u_{k1}\right) + \frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right) u_{kj} + \frac{c_u}{A_u}\sum_{i=j}^{C} w_{ji} u_{ki}\right)^{m} + a_2\left(a_h h_{kj} + b_h \sum_{i=1}^{j-1} w_{ji}\left(a_h h_{ki} + \cdots + b_h^{i-1} h_{k1}\right) + \frac{c_h}{A_h}\left(b_h^{i-1} + \cdots + b + 1\right) h_{kj} + \frac{c_h}{A_h}\sum_{i=j}^{C} w_{ji} h_{ki}\right)^{s} + a_3\left(a_t t_{kj} + b_t \sum_{i=1}^{j-1} w_{ji}\left(a_t t_{ki} + \cdots + b_t^{i-1} t_{k1}\right) + \frac{c_t}{A_t}\left(b_t^{i-1} + \cdots + b + 1\right) t_{kj} + \frac{c_t}{A_t}\sum_{i=j}^{C} w_{ji} t_{ki}\right)^{g}\right]\left(2X_k - 2V_j\right) K(X_k, V_j). \tag{A.14}
\]
Since \(\frac{\partial J_{m,g,s}(V)}{\partial V_j} = 0\), we get

\[
V_j = \frac{\sum_{k=1}^{N}\left[a_1\left(\,\cdot\,\right)^{m} + a_2\left(\,\cdot\,\right)^{s} + a_3\left(\,\cdot\,\right)^{g}\right] X_k}{\sum_{k=1}^{N}\left[a_1\left(\,\cdot\,\right)^{m} + a_2\left(\,\cdot\,\right)^{s} + a_3\left(\,\cdot\,\right)^{g}\right]}, \quad j = \overline{1,C}, \tag{A.15}
\]

where each \(\left(\,\cdot\,\right)\) denotes the corresponding bracket from (A.14).
Appendix B. Proof of Theorem 2

\[
\left\| U^{(KMIPFGWC)} - U^{(MIPFGWC)} \right\| \le N \times C \times \sqrt{\left(\frac{1}{\sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C} w_{ij}}\sum_{j=1}^{C}\left(1 - \frac{b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + b_u u_{k(i-1)} + \cdots + b_u^{i-1} u_{k1}\right)}{a_u + b_u \sum_{i=1}^{j-1} w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right)}\right)\frac{\sum_{j=1}^{C}\left(1 - K(X_1, V_j)\right)^{\frac{1}{m-1}}}{\left(1 - K(X_1, V_1)\right)^{\frac{1}{m-1}}}\right)^{2}}, \tag{B.1}
\]

\[
\le N \times C \times \frac{1}{\sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C} w_{ij}}\sum_{j=1}^{C}\left(1 - \frac{b_u \sum_{i=1}^{j-1} w_{ji}\left(a_u u_{ki} + b_u u_{k(i-1)} + \cdots + b_u^{i-1} u_{k1}\right)}{a_u + b_u \sum_{i=1}^{j-1} w_{ji}\frac{c_u}{A_u}\left(b_u^{i-1} + \cdots + b + 1\right)}\right). \tag{B.2}
\]

Similarly, we have the estimation of the hesitation levels in (B.3):

\[
\left\| H^{(KMIPFGWC)} - H^{(MIPFGWC)} \right\| \le N \times C \times \frac{1}{\sum_{j=1}^{C}\frac{1}{C}\sum_{i=1}^{C} w_{ij}}\sum_{j=1}^{C}\left(1 - \frac{b_h \sum_{i=1}^{j-1} w_{ji}\left(a_h h_{ki} + b_h h_{k(i-1)} + \cdots + b_h^{i-1} h_{k1}\right)}{a_h + b_h \sum_{i=1}^{j-1} w_{ji}\frac{c_h}{A_h}\left(b_h^{i-1} + \cdots + b + 1\right)}\right). \tag{B.3}
\]
Now, we calculate the estimation of the typicality values:

\[
\left\| T^{(KMIPFGWC)} - T^{(MIPFGWC)} \right\| \le N \times C \times \sqrt{\left(t_{kj}^{KMIPFGWC} - t_{kj}^{MIPFGWC}\right)^{2}}, \tag{B.4}
\]

\[
\le N \times C \times \sqrt{\sum_{j=1}^{C}\left(\frac{1}{c_j^{g-1} + \left(\dfrac{c_j}{a_2\left(1 - K(X_k, V_j)\right)}\right)^{\frac{1}{g-1}}} - \frac{1}{c_j^{g-1} + \left(a_t + \frac{c_t}{A_t} b_t \sum_{i=1}^{j-1} w_{ji}\left(b_t^{i-1} + \cdots + b + 1\right)\right)\left(\dfrac{c_j}{a_2\left(a_t + b_t \sum_{i=1}^{j-1} w_{ji}\frac{c_t}{A_t}\left(b_t^{i-1} + \cdots + b + 1\right)\right)\left(1 - K(X_k, V_j)\right)}\right)^{\frac{1}{g-1}}}\right)^{2}}, \tag{B.5}
\]


L.H. Son / Information Sciences 317 (2015) 202–223

6NÂCÂ


C
X
j¼1

223

1



1



cgj À1 þ at þ Actt bt

cgj À1


 :
iÀ1
w
b
þ
Á
Á
Á
þ
b
þ

1
t
i¼1 ji

PjÀ1

ðB:6Þ
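The kernel-induced distance \(1-K(X_k,V_j)\) that appears throughout these bounds is most commonly instantiated with a Gaussian kernel, as in kernel fuzzy c-means [35,38]. The following minimal sketch illustrates why \(1-K\) behaves as a distance (zero for identical points, approaching one for distant points); the data points and the bandwidth sigma are illustrative only and not taken from the paper's experiments:

```python
import math

def gaussian_kernel(x, v, sigma=1.0):
    # Gaussian kernel similarity K(x, v) = exp(-||x - v||^2 / sigma^2).
    # K(x, x) = 1, so 1 - K(x, v) acts as a kernel-induced distance.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / sigma ** 2)

x = [0.2, 0.4]
print(1 - gaussian_kernel(x, [0.2, 0.4]))  # identical center -> 0.0
print(1 - gaussian_kernel(x, [1.0, 1.0]))  # distant center -> closer to 1
```

Because the kernel is bounded in (0, 1], the induced distance is bounded as well, which is what allows factors such as \((1-K(X_k,V_j))^{g-1}\) in (B.5) to be dropped when passing to the coarser bound (B.6).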

References
[1] M.N. Ahmed, S.M. Yamany, N. Mohamed, A.A. Farag, T. Moriarty, A modified fuzzy c-means algorithm for bias field estimation and segmentation of MRI
data, IEEE Trans. Med. Imaging 21 (2002) 193–199.
[2] S.C. Chen, D.Q. Zhang, Robust image segmentation using FCM with spatial constrains based on new kernel-induced distance measure, IEEE Trans. Syst.
Man Cybernet. Part B 34 (2004) 1907–1916.
[3] H. Chunchun, M. Lingkui, S. Wenzhong, Fuzzy clustering validity for spatial data, Geo-spatial Inform. Sci. 11 (3) (2008) 191–196.
[4] B.C. Cuong, L.H. Son, H.T.M. Chau, Some context fuzzy clustering methods for classification problems, in: Proceedings of the 2010 ACM Symposium on
Information and Communication Technology, 2010, pp. 34–40.
[5] P. Day, J. Pearce, D. Dorling, Twelve worlds: a geo-demographic comparison of global inequalities in mortality, J. Epidemiol. Community Health 62
(2008) 1002–1010.
[6] Z. Feng, R. Flowerdew, Fuzzy geodemographics: a contribution from fuzzy clustering methods, in: S. Carver (Ed.), Innovations in GIS 5, Taylor & Francis,
London, 1998, pp. 119–127.
[7] H. Fritz, L.A. GarcíA-Escudero, A. Mayo-Iscar, Robust constrained fuzzy clustering, Inform. Sci. 245 (2013) 38–52.
[8] C. Gu, S. Zhang, K. Liu, H. Huang, Fuzzy kernel k-means clustering method based on immune genetic algorithm, J. Comput. Inform. Syst. 7 (1) (2011)
221–231.
[9] H.C. Huang, Y.Y. Chuang, C.S. Chen, Multiple kernel fuzzy clustering, IEEE Trans. Fuzzy Syst. 20 (1) (2012) 120–134.
[10] J. Ji, W. Pang, C. Zhou, X. Han, Z. Wang, A fuzzy k-prototype clustering algorithm for mixed numeric and categorical data, Knowl.-Based Syst. 30 (2012)
129–135.
[11] E. Keogh, C.A. Ratanamahatana, Exact indexing of dynamic time warping, Knowl. Inform. Syst. 7 (3) (2005) 358–386.
[12] V. Loia, W. Pedrycz, S. Senatore, P-FCM: a proximity-based fuzzy clustering for user-centered web applications, Int. J. Approx. Reason. 34 (2) (2003)
121–144.
[13] M. Loureiro, F. Bação, V. Lobo, Fuzzy classification of geodemographic data using self-organizing maps, in: Proceedings of 4th International Conference of GIScience 2006, 2006, pp. 123–127.
[14] G.A. Mason, R.D. Jacobson, Fuzzy geographically weighted clustering, in: Proceedings of the 9th International Conference on GeoComputation
(Electronic Proceedings on CD-ROM), 2007.
[15] W. Pedrycz, B.J. Park, S.K. Oh, The design of granular classifiers: a study in the synergy of interval calculus and fuzzy sets in pattern recognition, Pattern
Recognit. 41 (12) (2008) 3720–3735.
[16] J. Petersen, M. Gibin, P. Longley, P. Mateos, P. Atkinson, D. Ashby, Geodemographics as a tool for targeting neighbourhoods in public health campaigns,
J. Geogr. Syst. 13 (2011) 173–192.
[17] W. Pedrycz, Granular Computing: Analysis and Design of Intelligent Systems, CRC Press, 2013.
[18] G. Peters, Rough clustering utilizing the principle of indifference, Inform. Sci. 277 (2014) 358–374.
[19] D.K. Rossmo, Recent developments in geographic profiling, Policing 6 (2) (2012) 144–150.
[20] M. Rostam Niakan Kalhori, M.H. Fazel Zarandi, I.B. Turksen, A new credibilistic clustering algorithm, Inform. Sci. 279 (2014) 105–122.
[21] L.H. Son, Enhancing clustering quality of geo-demographic analysis using context fuzzy clustering type-2 and particle swarm optimization, Appl. Soft
Comput. 22 (2014) 566–584.
[22] L.H. Son, HU-FCF: a hybrid user-based fuzzy collaborative filtering method in recommender systems, Expert Syst. Appl. 41 (15) (2014) 6861–6870.
[23] L.H. Son, Optimizing municipal solid waste collection using chaotic particle swarm optimization in GIS based environments: a case study at Danang
City, Vietnam, Expert Syst. Appl. 41 (18) (2014) 8062–8074.
[24] L.H. Son, DPFCM: a novel distributed picture fuzzy clustering method on picture fuzzy sets, Expert Syst. Appl. 42 (1) (2015) 51–66.
[25] L.H. Son, B.C. Cuong, P.L. Lanzi, N.T. Thong, A novel intuitionistic fuzzy clustering method for geo-demographic analysis, Expert Syst. Appl. 39 (10)
(2012) 9848–9859.
[26] L.H. Son, B.C. Cuong, H.V. Long, Spatial interaction – modification model and applications to geo-demographic analysis, Knowl.-Based Syst. 49 (2013)
152–170.
[27] L.H. Son, P.L. Lanzi, B.C. Cuong, H.A. Hung, Data mining in GIS: a novel context-based fuzzy geographically weighted clustering algorithm, Int. J. Mach.
Learn. Comput. 2 (3) (2012) 235–238.
[28] L.H. Son, N.D. Linh, H.V. Long, A lossless DEM compression for fast retrieval method using fuzzy clustering and MANFIS neural network, Eng. Appl. Artif.
Intell. 29 (2014) 33–42.
[29] L.H. Son, N.T. Thong, Intuitionistic fuzzy recommender systems: an effective tool for medical diagnosis, Knowl.-Based Syst. 74 (2015) 133–150.
[30] P. Thakur, C. Lingam, Generalized spatial kernel based fuzzy c-means clustering algorithm for image segmentation, Int. J. Sci. Res. 2 (5) (2013) 165–169.
[31] P.H. Thong, L.H. Son, A new approach to multi-variables fuzzy forecasting using picture fuzzy clustering and picture fuzzy rules interpolation method,
in: Proceeding of 6th International Conference on Knowledge and Systems Engineering, 2014, pp. 679–690.
[32] UCI Machine Learning Repository, COIL 2000, 2000 (accessed 06.01.14).
[33] UNSD Statistical Databases, Demographic Yearbook, 2011 (accessed 14.07.12).
[34] N. Walford, An Introduction to Geodemographic Classification (Census Learning), 2011.
[35] Z. Wu, W.X. Xie, J.P. Yu, Fuzzy c-means clustering algorithm based on kernel method, in: Proceedings of Fifth International Conference on Computational Intelligence and Multimedia Applications, 2003, pp. 49–56.
[36] W. Wang, X. Liu, Fuzzy forecasting based on automatic clustering and axiomatic fuzzy set classification, Inform. Sci. 294 (2015) 78–94.
[37] H.J. Xing, M.H. Ha, Further improvements in feature-weighted fuzzy c-means, Inform. Sci. 267 (2014) 1–15.
[38] M.S. Yang, H.S. Tsai, A Gaussian kernel-based fuzzy c-means algorithm with a spatial bias correction, Pattern Recognit. Lett. 29 (12) (2008) 1713–1725.
[39] S.M.R. Zadegan, M. Mirzaie, F. Sadoughi, Ranked k-medoids: a fast and accurate rank-based partitioning algorithm for clustering large datasets, Knowl.-Based Syst. 39 (2013) 133–143.
[40] M. Zarinbal, M.H. Fazel Zarandi, I.B. Turksen, Relative entropy fuzzy c-means clustering, Inform. Sci. 260 (2014) 74–97.