
DSpace at VNU: HU-FCF: A hybrid user-based fuzzy collaborative filtering method in Recommender Systems


ESWA 9318

No. of Pages 10, Model 5G

15 May 2014
Expert Systems with Applications xxx (2014) xxx–xxx

Contents lists available at ScienceDirect

Expert Systems with Applications
journal homepage: www.elsevier.com/locate/eswa

HU-FCF: A hybrid user-based fuzzy collaborative filtering method in
Recommender Systems


Le Hoang Son ⇑


VNU University of Science, Vietnam National University, Viet Nam




Article info

Keywords:
Football results prediction
Fuzzy Recommender Systems
Fuzzy similarity degrees
Hard user-based degrees
Hybrid fuzzy collaborative filtering

Abstract
Recommender Systems (RS) have attracted considerable attention from researchers because of their applications in various interdisciplinary fields. Fuzzy Recommender Systems (FRS) extend RS by calculating a fuzzy similarity from the users' demographic data instead of the hard user-based degree. Previous FRS research did not offer a mathematical definition of FRS together with its algebraic operations and properties, and the fuzzy similarity degree alone is not enough to express the similarity between users accurately. In this paper we therefore present a systematic mathematical definition of FRS, including theoretical analyses of its algebraic operations and properties, and propose a novel hybrid user-based fuzzy collaborative filtering method. The method integrates the fuzzy similarity degrees between users, based on the demographic data, with the hard user-based degrees, calculated from the rating histories, into final similarity degrees in order to obtain high prediction accuracy. Experimental results on some benchmark datasets show that the proposed method obtains better accuracy than other relevant methods. Lastly, an application to football results prediction illustrates the use of the proposed method.
© 2014 Published by Elsevier Ltd.


1. Introduction
Recommender Systems (RS) have attracted considerable attention from researchers because of their applications in various interdisciplinary fields. RS, which are a subclass of decision support systems, give users information about the predicted "rating" or "preference" that they would assign to an item, thus helping them to choose an appropriate item among numerous possibilities. This kind of expert system is now widely used in numerous application fields such as personalized systems for books, documents, images, movies, music, shopping and TV programs, as stated by Park, Kim, Choi, and Kim (2012) in a survey of 210 articles on Recommender Systems from 46 journals published between 2001 and 2010. Many studies applying RS to practical problems have appeared in those journals, especially those focusing on expert and knowledge-based systems. For example, Ghazanfar and Prügel-Bennett (2014) offered a hybrid recommendation algorithm that makes reliable recommendations for gray-sheep users, reducing the recommendation error rate while maintaining reasonable computational performance. Christidis and Mentzas (2013) handled the difficulty of processing the large number of items bought and sold every day in auction marketplaces across the web by means of an RS that exploits the hidden topics of unstructured information. Kaklauskas (2013) studied the analysis of students' academic performance with an RS that determines the level of learning productivity integrally in physiological, psychological and behavioral terms. Fang et al. (2012) developed a mobile RS that captures users' preferences in an indoor shopping context through users' positions and contextual information. Costa-Montenegro, Barragáns-Martínez, and Rey-López (2012) addressed information overload when downloading applications from markets with an integrated RS solution. Carrer-Neto, Hernández-Alcaraz, Valencia-García, and García-Sánchez (2012) applied knowledge and social networks to a hybrid RS for the cinematographic domain. Shih, Yen, Lin, and Shih (2011) implemented the three most common kinds of RS techniques in order to recommend the best travel destinations to customers. Borges and Lorena (2010) applied RS to the news domain and listed some typical examples such as GroupLens, NewsWeeder, the online newspaper P-Tango and Google News personalization. Drachsler (2010) gave an application of RS for technology-enhanced learning. Duan, Street, and Xu (2011) studied nursing care plans in a healthcare RS. Tag recommendation for social RS was investigated by Derntl (2011), Song, Zhang, and Giles (2011) and Zheng and Li (2011). Industrial RS applications include, to name but a few, the personalized online store of Amazon.com, the music recommendation of YouTube, the hottest news of Yahoo, Netflix, etc. (Ricci, Rokach, & Shapira, 2011). Other applications of RS can be found in Shapira (2011), Son, Cuong, Lanzi, and Thong (2012), Son, Cuong, and Long (2013a), Son, Minh, Cuong, and Canh (2013b), Son, Linh, and Long (2014), Son (in press) and Son and Thong (submitted for publication). These studies of RS applications clearly point to two remarks: (i) RS are becoming more and more important to practical applications; (ii) the study of RS and of computational intelligence techniques that enhance prediction accuracy, such as association rules, clustering algorithms, decision trees, k-nearest neighbor, link analysis, neural networks, regression and other heuristic methods, is significant not only to the community of expert and knowledge-based systems research but also to the applied sciences.

⇑ Address: 334 Nguyen Trai, Thanh Xuan, Hanoi, Viet Nam. Tel.: +84 904171284; fax: +84 0438623938.
0957-4174/© 2014 Published by Elsevier Ltd.
Please cite this article in press as: Son, L. H. HU-FCF: A hybrid user-based fuzzy collaborative filtering method in Recommender Systems. Expert Systems with Applications (2014).

Our objective in this paper is to investigate an advanced computational intelligence technique for RS that enhances prediction accuracy. Collaborative Filtering (CF) is one of the most popular traditional hard user-based filtering methods; it predicts the ratings of items based on the similarities between users, measured through the Pearson coefficient. The rating of a user is approximated by the most frequent rating among other similar users. Nevertheless, CF has a known problem: the calculation of similarities between users from the rating histories is not accurate in many practical applications, and other information such as the users' demographic data should be used instead, since those data reflect the correlation between users, expressed through various user attributes, more strictly than the rating history. An important issue here is that user attributes are not only continuous values but also discrete ones such as "Gender", "Occupation", etc. Thus, in order to calculate the similarity between users based on the demographic data, it is necessary to integrate fuzzy logic with RS, and this research direction belongs to the class of Fuzzy Recommender Systems (FRS). Yager (2003) stated that the usefulness of information depends on its representation, visualized by fuzzy sets, so that the final rating, calculated through the ordered weighted averaging operator, is based solely on the preferences of the single individual and makes no use of the preferences of other collaborators. Zenebe and Norcio (2009) presented a fuzzy set theoretic method for RS, including a representation method, similarity measures and aggregation methods, that handles the non-stochastic uncertainty induced by subjectivity, vagueness and imprecision in the data, the domain knowledge and the task under consideration. Cao and Li (2007) presented a fuzzy-based system for consumer electronics that ranks customer needs by their importance and sets up fuzzy rules between customer needs and product features. Porcel, López-Herrera, and Herrera-Viedma (2009) used some filtering tools and a particular fuzzy linguistic modeling, called multi-granular fuzzy linguistic modeling, which is useful when different qualitative concepts have to be assessed for research resources. Porcel and Herrera-Viedma (2010) investigated the problem of incomplete information in a fuzzy linguistic RS and presented a new system that facilitates the acquisition of user preferences to characterize the user profiles. Palanivel and Siavkumar (2010) adopted fuzzy linguistic and fuzzy multi-criteria decision making approaches to represent the user ratings and accurately rank the relevant items. Romero, Ferreira-Satler, Olivas, Prieto-Mendez, and Menéndez-Domínguez (2011) proposed a fuzzy linguistic model based on three dimensions (structural, contextual and personal) and applied it to a learning object repository. Serrano-Guerrero, Herrera-Viedma, Olivas, Cerezo, and Romero (2011) introduced a novel fuzzy linguistic RS based on the Google Wave capabilities for connecting researchers interested in common research lines. Boulkrinat, Hadjali, and Mokhtari (2013) used linguistic terms for rating users' preferences and calculated the similarity between users on the basis of the similarity of their preference relations, which can better capture similar users' rating patterns. Some authors used soft computing methods to calculate the similarities between users. For instance, Park, Yoo, and Cho (2006) applied Fuzzy Bayesian Networks to fuzzify information obtained from sensors and the Internet and to obtain suitable contexts with their probabilities, thus determining the similarities between users. Al-Shamri and Bharadwaj (2008) presented a hybrid fuzzy-genetic approach to fuzzify the user model and to reflect more appropriately the fuzziness of each fuzzy feature. Terán and Meier (2010) designed a fuzzy interface for voters and candidates to write their profiles, and used fuzzy clustering to calculate the top-N recommendation in eElections. Nadi, Saraee, Bagheri, and Davarpanh Jazi (2011) focused on the problem of web users' behaviours and proposed a fuzzy-ant based RS built on the collaborative behaviour of ants. Sevarac, Devedzic, and Jovanovic (2012) proposed the neuro-fuzzy pedagogical recommender, an adaptive RS based on neuro-fuzzy inference, to create pedagogical rules in technology-enhanced learning. Lucas, Laurent, Moreno, and Teisseire (2012) and Zhang et al. (2013) proposed hybrid methodologies for RS that use collaborative filtering and content-based approaches in a joint method, taking advantage of the strengths of both approaches. From this summary of the research relevant to FRS, some important issues can be drawn as follows.

• The relevant FRS studies used fuzzy sets solely to model the uncertain information that exists in the users' demographic data, but did not offer a mathematical definition of FRS together with its algebraic operations and properties.
• According to these studies, FRS is merely a small extension of RS in which the users' demographic data are provided in addition to other datasets, and the similarity degrees between users are calculated on the demographic data only. Even though demographic data contain comprehensive, multi-dimensional information about users, relying solely on this type of data to calculate the similarity degrees, without knowledge of the rating histories, may lead to erroneous and inaccurate results. Since previous ratings could affect the rating under consideration, it is better to evaluate the similarities between users by both the demographic data and the rating histories.


Motivated by the problems of the relevant FRS research and by the work of Shih et al. (2011), which concluded that hybrid RS could overcome the shortcomings of the available filtering approaches, our contributions in this article are as follows.

• A systematic mathematical definition of FRS together with its algebraic operations and properties.
• A novel hybrid user-based fuzzy collaborative filtering method, called HU-FCF, that integrates the fuzzy similarity degrees between users, based on the demographic data, with the hard user-based degrees, calculated from the rating histories, into the final similarity degrees in order to obtain high prediction accuracy.
• An application of HU-FCF to the football results prediction problem.


The difference and novelty of the proposed approach in comparison with the relevant FRS approaches are expressed in the contributions above. Even though the idea of HU-FCF is quite simple, it could help improve prediction accuracy, since the final similarity is evaluated more accurately through the integration of both the fuzzy and the hard user-based similarity degrees. The relevant approaches were based solely on either the hard similarities between users (the CF method and its variants) or the fuzzy similarities from the demographic data (the group of FRS methods), so the group of users most similar to the considered one may admit irrelevant users as a result of incorrect selection and computation. Intuitively, the proposed HU-FCF may achieve better accuracy and flexibility than other relevant approaches by using the hybrid similarity degrees. This may be verified and validated by experiments on some benchmark RS datasets such as MovieLens (GroupLens research, 2014) and Book-Crossing (Ziegler, McNee, Konstan, & Lausen, 2005). The advantage of HU-FCF may be not only better accuracy than other relevant approaches but also the capability to handle more cases than other algorithms: when the weight of the fuzzy (hard) user-based similarity degree is set to zero, the HU-FCF algorithm works similarly to the available relevant algorithms. Additionally, HU-FCF is constructed on the basis of a systematic mathematical definition of FRS together with its algebraic operations and properties, so the algorithm rests on a theoretical foundation. The disadvantage of HU-FCF is its larger computational time, since it has to compute more similarity degrees than other algorithms. However, this limitation may be acceptable if the priority of the RS is prediction accuracy.
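To make the special-case remark concrete, the combination rule that appears later in the paper as Eqs. (73) and (75) can be written out for the two extreme weights. This is only a restatement in the source's own notation, where α and β are the weights of the fuzzy and hard similarity degrees, and FSD, HSD and SIM are the fuzzy, hard and final similarity degrees between users a and b; the parenthetical labels are our reading of the text.

```latex
\begin{align*}
\mathrm{SIM}(a,b) &= \alpha\,\mathrm{FSD}(a,b) + \beta\,\mathrm{HSD}(a,b), \qquad \alpha + \beta = 1,\\
\alpha = 0 &\;\Rightarrow\; \mathrm{SIM}(a,b) = \mathrm{HSD}(a,b) \quad \text{(rating histories only, as in CF)},\\
\beta = 0 &\;\Rightarrow\; \mathrm{SIM}(a,b) = \mathrm{FSD}(a,b) \quad \text{(demographic data only, as in the FRS methods)}.
\end{align*}
```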
The rest of the paper is structured as follows. Section 2 presents the main contributions of the paper: a mathematical definition of FRS together with its algebraic operations and properties, and a novel hybrid user-based fuzzy collaborative filtering method named HU-FCF. Section 3 validates HU-FCF by experiments on some benchmark datasets. Section 4 introduces an application of HU-FCF to football results prediction. Section 5 draws the conclusions and outlines future research directions.


2. The proposed methodology


2.1. Definitions



Definition 1 (Recommender Systems – RS; Adomavicius and Tuzhilin, 2005; Borges and Lorena, 2010; Ricci et al., 2011). Suppose U is the set of all users and I is the set of items in the system. The utility function R is a mapping specified on $U_1 \subseteq U$ and $I_1 \subseteq I$ as follows.

$$R : U_1 \times I_1 \to P, \qquad (u_1, i_1) \mapsto R(u_1, i_1), \tag{1}$$

where $R(u_1, i_1)$ is a non-negative integer or a real number within a certain range, and P is the set of available ratings in the system. Thus, RS is a system that provides the two basic functions below.

(a) Prediction: determine $R(u^*, i^*)$ for any $(u^*, i^*) \in (U, I) \setminus (U_1, I_1)$.
(b) Recommendation: choose $i^* \in I$ satisfying $i^* = \arg\max_{i \in I} R(u, i)$ for all $u \in U$.

Definition 2 (Multi-criteria Recommender Systems – MCRS; Shapira, 2011). MCRS are systems that provide basic functions similar to those of RS but according to multiple criteria. In other words, the utility function is defined as

$$R : U_1 \times I_1 \to P_1 \times P_2 \times \cdots \times P_k, \qquad (u_1, i_1) \mapsto (R_1, R_2, \ldots, R_k), \tag{2}$$

where $R_i$ ($i = 1, \ldots, k$) is the rating of user $u_1 \in U_1$ for item $i_1 \in I_1$ according to criterion i. In this case, the recommendation is performed according to a given criterion.

Example 1. Suppose that U = {John, David, Jenny, Marry} and I = {Titanic, Hulk, Scallet}. The set of criteria of a movie is P = {Story, Visual effects}. The ratings are assigned numerically from 1 (bad) to 5 (excellent). Table 1 describes the utility function. From this table, it is clear that MCRS can help us predict the ratings of a user (Marry) for a movie that she has not rated before (Titanic). This kind of system also recommends her favourite movie through the available ratings. When there is only one criterion in P, MCRS reduces to the traditional RS. Now, we extend MCRS by the definition below.

Table 1
Movie ratings.

User    Movie     Story   Visual effects
John    Hulk      4       3
John    Scallet   2       2
David   Titanic   4       2
David   Hulk      3       1
David   Scallet   1       4
Jenny   Hulk      2       3
Jenny   Titanic   1       2
Marry   Hulk      3       5
Marry   Titanic   ?       ?

Definition 3 (Fuzzy Recommender Systems – FRS). Suppose U is the set of all users, I is the set of items and C is the set of fuzzy contexts in the system. The utility function R is a mapping specified on $U_1 \subseteq U$, $I_1 \subseteq I$ and $C_1 \subseteq C$ as follows.

$$R : U_1 \times I_1 \times C_1 \to P_1 \times P_2 \times \cdots \times P_k, \qquad (u_1, i_1, \{c_1, \mu_c\}) \mapsto (\{R_1, \mu_1\}, \{R_2, \mu_2\}, \ldots, \{R_k, \mu_k\}), \tag{3}$$

where $\mu_c \in [0, 1]$ is the membership value of context $c_1$, and $R_i$ ($i = 1, \ldots, k$) is the rating of user $u_1$ for item $i_1$ in context $c_1$ by criterion $P_i$ with fuzzy membership $\mu_i \in [0, 1]$. FRS are systems that provide the two basic functions below.

(a) Prediction: the capability to determine $R(u^*, i^*, \{c_1^*, \mu_c^*\})$ for any $(u^*, i^*, \{c_1^*, \mu_c^*\}) \in (U, I, C) \setminus (U_1, I_1, C_1)$.
(b) Recommendation: the capability to choose $i^* \in I$ satisfying $i^* = \arg\max_{i \in I} R(u, i, \{c, \mu_c\})$ for all $u \in U$, $\{c, \mu_c\} \in C$ and a certain criterion.

As Definition 3 shows, FRS generalizes MCRS (Definition 2) and RS (Definition 1): when $C = \{\emptyset\}$ and $\mu_i = 1$ ($\forall i = 1, \ldots, k$), FRS reduces to MCRS. If the condition $k = 1$ is added, FRS reduces to the traditional RS. Now, let us consider the example below to illustrate the difference between FRS and other types of RS.

Example 2. Suppose that U = {John, David, Jenny, Marry}, I = {Titanic, Hulk, Scallet} and C = {Weather, Mood}. The fuzzy linguistic labels of "Weather" are {"Fine", "Normal", "Bad"}, and those of "Mood" are {"Happy", "Normal", "Angry"}. The set of criteria of a movie is P = {Story, Visual effects}, whose fuzzy linguistic labels are {"Very good", "Fair", "Boring"} and {"Amazing", "Thrill", "Melody"}, respectively. Table 2 describes the utility function of FRS.

Table 2
The utility function of FRS.

User    Movie     Weather (context)   Mood (context)   Story (criterion)   Visual effects (criterion)
John    Hulk      Normal              Happy            Fair                Amazing
John    Scallet   Bad                 Normal           Very good           Melody
David   Titanic   Fine                Happy            Very good           Amazing
David   Hulk      Bad                 Angry            Fair                Thrill
David   Scallet   Normal              Normal           Boring              Amazing
Jenny   Hulk      Bad                 Angry            Boring              Thrill
Jenny   Titanic   Fine                Normal           Very good           Melody
Marry   Hulk      Normal              Happy            Fair                Amazing
Marry   Titanic   Normal              Normal           ?                   ?

The ratings of a user for a movie are clearly expressed by fuzzy linguistic labels in terms of criteria. For example, when the weather is "Normal" and the mood of user John is "Happy", he assesses the story and visual effects of the movie "Hulk" as "Fair" and "Amazing", respectively. In contrast to the utility function of MCRS in Table 1, where the rating is a numeric value, the rating for a criterion here is expressed by a set of fuzzy linguistic labels. In the example above, the set of fuzzy
linguistic labels for the criterion "Story", accompanied by fuzzy memberships as stated in Eq. (3), is {("Very good", 0.2); ("Fair", 0.7); ("Boring", 0.1)}. Since the fuzzy membership of the label "Fair" is the maximum among all, the rating value for the criterion "Story" is "Fair", as shown in Table 2. By using fuzzy linguistic labels, FRS tackles the problems of vagueness, incompleteness and uncertainty that exist in RS and MCRS.

The aims of FRS are the prediction and the recommendation of a user's rating for a movie in a specific context. For example, we would like to know the ratings of user Marry in terms of {"Story", "Visual effects"} for the movie "Titanic" when the weather and mood contexts are "Normal" and "Normal", respectively. Furthermore, the best movie in terms of a specific criterion that Marry has ever seen should be recommended.

2.2. Some algebraic operations of FRS

Suppose that we have three subsets of $FRS = \{U, I, C, \{P_i\} \mid i = 1, \ldots, n\}$ below.

$$FRS_1 = \{U_1, I_1, C_1, \{P_i^1\} \mid i = 1, \ldots, n\}, \tag{4}$$
$$FRS_2 = \{U_2, I_2, C_2, \{P_i^2\} \mid i = 1, \ldots, n\}, \tag{5}$$
$$FRS_3 = \{U_3, I_3, C_3, \{P_i^3\} \mid i = 1, \ldots, n\}, \tag{6}$$

where

$$C_1 = \{(c_{1,j}, \mu_{c_{1,j}}) \mid j = 1, \ldots, l\}, \tag{7}$$
$$\mu_{c_{1,j}} = \{(c_{1,j}^g, \mu_{c_{1,j}}^g) \mid g = 1, \ldots, h\}, \tag{8}$$
$$P_i^1 = \{R_i^1, \mu_i^1\} = \{(R_{i,q}^1, \mu_{i,q}^1) \mid q = 1, \ldots, r\}, \tag{9}$$
$$C_2 = \{(c_{2,j}, \mu_{c_{2,j}}) \mid j = 1, \ldots, l\}, \tag{10}$$
$$\mu_{c_{2,j}} = \{(c_{2,j}^g, \mu_{c_{2,j}}^g) \mid g = 1, \ldots, h\}, \tag{11}$$
$$P_i^2 = \{R_i^2, \mu_i^2\} = \{(R_{i,q}^2, \mu_{i,q}^2) \mid q = 1, \ldots, r\}, \tag{12}$$
$$C_3 = \{(c_{3,j}, \mu_{c_{3,j}}) \mid j = 1, \ldots, l\}, \tag{13}$$
$$\mu_{c_{3,j}} = \{(c_{3,j}^g, \mu_{c_{3,j}}^g) \mid g = 1, \ldots, h\}, \tag{14}$$
$$P_i^3 = \{R_i^3, \mu_i^3\} = \{(R_{i,q}^3, \mu_{i,q}^3) \mid q = 1, \ldots, r\}, \tag{15}$$
$$C = \{(c_j, \mu_{c_j}) \mid j = 1, \ldots, l\}, \tag{16}$$
$$\mu_{c_j} = \{(c_j^g, \mu_{c_j}^g) \mid g = 1, \ldots, h\}, \tag{17}$$
$$P_i = \{R_i, \mu_i\} = \{(R_{i,q}, \mu_{i,q}) \mid q = 1, \ldots, r\}. \tag{18}$$

Some algebraic operations of FRS are defined below.

(a) Union:

$$FRS_1 \cup FRS_2 = FRS_{12}, \tag{19}$$

where

$$FRS_{12} = \{U_{12}, I_{12}, C_{12}, \{P_l^{12}\} \mid l = 1, \ldots, k\}, \tag{20}$$
$$U_{12} = U_1 \cup U_2, \tag{21}$$
$$I_{12} = I_1 \cup I_2, \tag{22}$$
$$C_{12} = \{(c_{12,j}, \mu_{c_{12,j}}) \mid j = 1, \ldots, l\}, \tag{23}$$
$$\mu_{c_{12,j}} = \{(c_{12,j}^g, \mu_{c_{12,j}}^g) \mid g = 1, \ldots, h\}, \tag{24}$$
$$\mu_{c_{12,j}}^g = \max\{\mu_{c_{1,j}}^g, \mu_{c_{2,j}}^g\}, \tag{25}$$
$$P_i^{12} = \{R_i^{12}, \mu_i^{12}\} = \{(R_{i,q}^{12}, \mu_{i,q}^{12}) \mid q = 1, \ldots, r;\ l \in N;\ k \in N\}, \tag{26}$$
$$\mu_{i,q}^{12} = \max\{\mu_{i,q}^1, \mu_{i,q}^2\}. \tag{27}$$

(b) Intersection:

$$FRS_1 \cap FRS_2 = FRS_{12}, \tag{28}$$

where

$$FRS_{12} = \{U_{12}, I_{12}, C_{12}, \{P_l^{12}\} \mid l = 1, \ldots, k\}, \tag{29}$$
$$U_{12} = U_1 \cap U_2, \tag{30}$$
$$I_{12} = I_1 \cap I_2, \tag{31}$$
$$C_{12} = \{(c_{12,j}, \mu_{c_{12,j}}) \mid j = 1, \ldots, l\}, \tag{32}$$
$$\mu_{c_{12,j}} = \{(c_{12,j}^g, \mu_{c_{12,j}}^g) \mid g = 1, \ldots, h\}, \tag{33}$$
$$\mu_{c_{12,j}}^g = \min\{\mu_{c_{1,j}}^g, \mu_{c_{2,j}}^g\}, \tag{34}$$
$$P_i^{12} = \{R_i^{12}, \mu_i^{12}\} = \{(R_{i,q}^{12}, \mu_{i,q}^{12}) \mid q = 1, \ldots, r;\ l \in N;\ k \in N\}, \tag{35}$$
$$\mu_{i,q}^{12} = \min\{\mu_{i,q}^1, \mu_{i,q}^2\}. \tag{36}$$

(c) Complement:

$$FRS_1^C = \{U_1^C, I_1^C, C_1^C, \{P_i^{1^C}\} \mid i = 1, \ldots, n\}, \tag{37}$$
$$U_1^C = U \setminus U_1, \tag{38}$$
$$I_1^C = I \setminus I_1, \tag{39}$$
$$C_1^C = \{(c_{1^C,j}, \mu_{c_{1^C,j}}) \mid j = 1, \ldots, l\}, \tag{40}$$
$$\mu_{c_{1^C,j}} = \{(c_{1^C,j}^g, \mu_{c_{1^C,j}}^g) \mid g = 1, \ldots, h\}, \tag{41}$$
$$\mu_{c_{1^C,j}}^g = 1 - \mu_{c_{1,j}}^g, \tag{42}$$
$$P_i^{1^C} = \{R_i^{1^C}, \mu_i^{1^C}\} = \{(R_{i,q}^{1^C}, \mu_{i,q}^{1^C}) \mid q = 1, \ldots, r;\ l \in N;\ k \in N\}, \tag{43}$$
$$\mu_{i,q}^{1^C} = 1 - \mu_{i,q}^1. \tag{44}$$

2.3. Properties

(a) Commutative:

$$FRS_1 \cup FRS_2 = FRS_2 \cup FRS_1, \tag{45}$$
$$FRS_1 \cap FRS_2 = FRS_2 \cap FRS_1. \tag{46}$$

(b) Associative:

$$(FRS_1 \cup FRS_2) \cup FRS_3 = FRS_1 \cup (FRS_2 \cup FRS_3), \tag{47}$$
$$(FRS_1 \cap FRS_2) \cap FRS_3 = FRS_1 \cap (FRS_2 \cap FRS_3). \tag{48}$$

We prove the first commutative property in Eq. (45). Other properties are proven analogously.

$$FRS_1 \cup FRS_2 = FRS_{12}, \tag{49}$$
$$FRS_{12} = \{U_{12}, I_{12}, C_{12}, \{P_l^{12}\} \mid l = 1, \ldots, k\}, \tag{50}$$

where

$$U_{12} = U_1 \cup U_2, \tag{51}$$
$$I_{12} = I_1 \cup I_2, \tag{52}$$
$$C_{12} = \{(c_{12,j}, \mu_{c_{12,j}}) \mid j = 1, \ldots, l\}, \tag{53}$$
$$\mu_{c_{12,j}} = \{(c_{12,j}^g, \mu_{c_{12,j}}^g) \mid g = 1, \ldots, h\}, \tag{54}$$
$$\mu_{c_{12,j}}^g = \max\{\mu_{c_{1,j}}^g, \mu_{c_{2,j}}^g\}, \tag{55}$$
$$P_i^{12} = \{R_i^{12}, \mu_i^{12}\} = \{(R_{i,q}^{12}, \mu_{i,q}^{12}) \mid q = 1, \ldots, r;\ l \in N;\ k \in N\}, \tag{56}$$
$$\mu_{i,q}^{12} = \max\{\mu_{i,q}^1, \mu_{i,q}^2\}. \tag{57}$$

Similarly, we have

$$FRS_{21} = \{U_{21}, I_{21}, C_{21}, \{P_l^{21}\} \mid l = 1, \ldots, k\}, \tag{58}$$

where

$$U_{21} = U_2 \cup U_1, \tag{59}$$
$$I_{21} = I_2 \cup I_1, \tag{60}$$
$$C_{21} = \{(c_{21,j}, \mu_{c_{21,j}}) \mid j = 1, \ldots, l\}, \tag{61}$$
$$\mu_{c_{21,j}} = \{(c_{21,j}^g, \mu_{c_{21,j}}^g) \mid g = 1, \ldots, h\}, \tag{62}$$
$$\mu_{c_{21,j}}^g = \max\{\mu_{c_{2,j}}^g, \mu_{c_{1,j}}^g\}, \tag{63}$$
$$P_i^{21} = \{R_i^{21}, \mu_i^{21}\} = \{(R_{i,q}^{21}, \mu_{i,q}^{21}) \mid q = 1, \ldots, r;\ l \in N;\ k \in N\}, \tag{64}$$
$$\mu_{i,q}^{21} = \max\{\mu_{i,q}^2, \mu_{i,q}^1\}. \tag{65}$$

Because set union and the max operator are commutative, every component of $FRS_{21}$ coincides with the corresponding component of $FRS_{12}$. Thus,

$$FRS_1 \cup FRS_2 = FRS_2 \cup FRS_1. \tag{66}$$
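The membership-level part of the operations in Section 2.2 can be illustrated with a minimal sketch. It represents one criterion's fuzzy linguistic labels as a dictionary from label to membership value; that representation, the function names, the convention that a missing label has membership 0, and the second and third membership sets are assumptions made for illustration, not part of the paper. Only the "Story" memberships of the first set come from Example 2.

```python
from typing import Dict

Memberships = Dict[str, float]  # linguistic label -> membership value in [0, 1]


def fuzzy_union(m1: Memberships, m2: Memberships) -> Memberships:
    """Label-wise max of two membership sets, mirroring Eqs. (25) and (27)."""
    # Assumption: a label missing from one set has membership 0 there.
    return {g: max(m1.get(g, 0.0), m2.get(g, 0.0)) for g in m1.keys() | m2.keys()}


def fuzzy_intersection(m1: Memberships, m2: Memberships) -> Memberships:
    """Label-wise min of two membership sets, mirroring Eqs. (34) and (36)."""
    return {g: min(m1.get(g, 0.0), m2.get(g, 0.0)) for g in m1.keys() | m2.keys()}


def fuzzy_complement(m: Memberships) -> Memberships:
    """One minus each membership, mirroring Eqs. (42) and (44)."""
    return {g: 1.0 - mu for g, mu in m.items()}


def rating_label(m: Memberships) -> str:
    """Label with the largest membership, as used to read a rating in Table 2."""
    return max(m, key=m.get)


# "Story" memberships from Example 2, plus two invented sets for illustration.
story_1 = {"Very good": 0.2, "Fair": 0.7, "Boring": 0.1}
story_2 = {"Very good": 0.5, "Fair": 0.3, "Boring": 0.2}
story_3 = {"Very good": 0.1, "Fair": 0.4, "Boring": 0.9}

union_12 = fuzzy_union(story_1, story_2)
# Numerical spot checks of the commutative and associative properties.
commutative = union_12 == fuzzy_union(story_2, story_1)
associative = (fuzzy_union(fuzzy_union(story_1, story_2), story_3)
               == fuzzy_union(story_1, fuzzy_union(story_2, story_3)))
```

These checks are spot tests on sample values rather than a proof; they only show the max-based union behaving as Eqs. (45) and (47) state.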

In this section, we present a novel hybrid user-based fuzzy collaborative filtering method so-called the Hybrid User-based Fuzzy
Collaborative Filtering (HU-FCF). Since most of the available RS
datasets are designed in the forms of {User, Item, Criterion} and a
fuzzy filtering method could be developed either on the set of
{User, Item, Context} or the set of {Criteria} or both of them, we
consider the reduction of the definition of FRS (Definition 3) to that
of RS (Definition 1) and perform the fuzzy filtering method on the
user dataset. Most of the relevant FRS approaches were also
designed by this way, and the proposed HU-FCF would be an
extension of them in order to achieve the objective of better accuracy of prediction. To be frank, the HU-FCF method is designed for
the original RS but not truly for the FRS as stated in Definition 3.
Nonetheless, by providing an appendage of fuzzy similarity
degrees, this method could be considered as one of the fuzzy filtering methods for FRS. The basic idea of HU-FCF method is to integrate the fuzzy similarity degrees between users based on the
demographic data with the hard user-based degrees calculated
from the rating histories into the final similarity degrees. As such,

those degrees would reflect more exactly the correlation between
users in terms of the internal (attributes of users) and external
information (interactions between users). Each similarity degree
(fuzzy/hard) is accompanied by weights automatically calculated
according to the numbers of analogous users. Once the final similarity degrees are calculated, the final rating will be constructed
based on the rating values of neighbors of the considered user.
Depending on the domain of a specific problem, the final rating will
be approximated to its nearest value in that domain accompanied
by an error threshold, which is normally smaller than 5%. A list of
nearest values with equivalent error thresholds is also given as the
prediction ratings of a user for an item. The following pseudo-code
will describe the ideas more details.

408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424

425
426
427
428
429
430
431
432
433
434
435

ði ¼ 1; lÞ;
Output: – Rða; iÃ Þ or ra;ià for any ða; iÃ Þ 2 ðU; IÞ
HU-FCF:
1: Set the number of similarity degrees in the users’
demographic data: c ¼ 0;
2: Determine the membership functions for demographic
attributes and calculate the membership values of all users
according to the demographic attributes. The most frequent
used function for various type of data is the standard
normal Gaussian as recommended by Subasi (2006);
3: For each demographic attribute i:
4: Based upon the membership function, calculate the fuzzy
distances between the considered user and other ones by
the formula below:
FDðai ; bi Þ ¼j ai À bi j;

ð65Þ


404

407

number of users and l is the number of demographic
attributes;
– The items set: I ¼ fI1 ; . . . ; IM g where M is the
number of items;
– The rating histories: R ¼ fRðU i ; Ij Þ j U i 2 U; Ij 2 Ig;
– The similarity threshold: h 2 ½0; 1Š;
– The weights of the demographic attributes: wi

ð54Þ

2.4. The HU-FCF algorithm

406

Input: – The users’ demographic data: U ¼ fU 1 ; . . . ; U N g
n
o
where each U i ¼ U 1i ; . . . ; U li ði ¼ 1; NÞ; N is the

ð53Þ

403

405

5


L.H. Son / Expert Systems with Applications xxx (2014) xxx–xxx

ð67Þ

where ai ; bi are the membership values according to the
demographic attribute i of the considered user a and another
user b
ð68Þ
5: If FDðai ; bi Þ 6 h then c ¼ c þ 1
6: End for
7: Calculate the global fuzzy distances between the
considered user and other ones by the formula:
GFDða; bÞ ¼

l
X
wi FDðai ; bi Þ;

ð69Þ

i¼1

where wi 2 ½0; 1Š is the weight of the demographic attribute i
showing the influence to the global results and satisfying the
condition,
l
X
wi ¼ 1:


ð70Þ

i¼1

8: Calculate the fuzzy similarity degrees between the
considered user and other ones by the formula:

FSDða; bÞ ¼ 1 À GFDða; bÞ:

ð71Þ

9: Determine the hard (user-based) similarity degrees
between the considered user and other ones from the rating
histories by the Pearson coefficient below:

PM
i¼1 ðr a;i À r a Þ Ã ðr b;i À r b Þ
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
;
HSDða; bÞ ¼ qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
PM
PM
2
2
i¼1 ðr a;i À r a Þ Ã
i¼1 ðr b;i À r b Þ

ð72Þ

where ra;i is the rating value of the considered user a for item

i 2 I;
rb;i is the rating value of user b for item i 2 I;
ra is the average rating value of the considered user
a by all items;
rb is the average rating value of user b by all items;
10: Calculate the final similarity degrees between the
considered user and other ones from Eqs. (71), (72) as
follows.
(continued on next page)

Q1 Please cite this article in press as: Son, L. H. HU-FCF: A hybrid user-based fuzzy collaborative filtering method in Recommender Systems. Expert Systems
with Applications (2014), />

ESWA 9318

No. of Pages 10, Model 5G

15 May 2014
Q1

6

L.H. Son / Expert Systems with Applications xxx (2014) xxx–xxx

\[ SIM(a, b) = \alpha \times FSD(a, b) + \beta \times HSD(a, b), \tag{73} \]

where $\alpha$ ($\beta$) is the weight of the fuzzy (hard) similarity degree, and is calculated through the equations below.

\[ \alpha = \frac{c}{N + c - 1}, \tag{74} \]
\[ \alpha + \beta = 1. \tag{75} \]

11: Calculate the final rating by the equation below.

\[ R(a, i^*) = \bar{r}_a + \frac{\sum_{b \in U \setminus \{a\}} SIM(a, b) \times (r_{b,i^*} - \bar{r}_b)}{\sum_{b \in U \setminus \{a\}} |SIM(a, b)|}. \tag{76} \]

12: Determine the nearest value of $R(a, i^*)$ in the domain $D$ of the problem as the final result, and calculate the error threshold as follows.

\[ \Delta = 100 \times \frac{|R(a, i^*) - d|}{\max\{R(a, i^*), d\}}, \tag{77} \]

where $d \in D$ is the nearest value of $R(a, i^*)$.
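Steps 7 to 11 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's C implementation: the function names, the argument layout and the zero fallbacks for empty denominators are hypothetical, and the means are taken over the ratings passed in (both users are assumed to have rated the same M items).

```python
from math import sqrt

def fuzzy_similarity(fd, weights):
    """Steps 7-8: FSD = 1 - sum_i w_i * FD(a_i, b_i), Eqs. (69)-(71)."""
    assert abs(sum(weights) - 1.0) < 1e-9          # constraint of Eq. (70)
    gfd = sum(w * d for w, d in zip(weights, fd))  # Eq. (69)
    return 1.0 - gfd                               # Eq. (71)

def hard_similarity(ra, rb):
    """Step 9: Pearson coefficient of Eq. (72) over the same M items."""
    mean_a = sum(ra) / len(ra)
    mean_b = sum(rb) / len(rb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    den = sqrt(sum((x - mean_a) ** 2 for x in ra)) * \
          sqrt(sum((y - mean_b) ** 2 for y in rb))
    return num / den if den else 0.0  # assumption: 0 for constant ratings

def final_similarity(fsd, hsd, c, n_users):
    """Step 10: SIM = alpha * FSD + beta * HSD, Eqs. (73)-(75)."""
    alpha = c / (n_users + c - 1)     # Eq. (74)
    beta = 1.0 - alpha                # Eq. (75)
    return alpha * fsd + beta * hsd   # Eq. (73)

def predict_rating(mean_a, neighbours):
    """Step 11, Eq. (76). neighbours: (SIM(a,b), r_{b,i*}, mean rating of b)."""
    num = sum(s * (r - m) for s, r, m in neighbours)
    den = sum(abs(s) for s, _, _ in neighbours)
    return mean_a + num / den if den else mean_a  # assumption: fall back to mean

# Illustrative values: c = 65 and N = 20 give alpha = 65/84, about 0.77.
sim = final_similarity(fsd=0.76, hsd=-0.42, c=65, n_users=20)
```

Keeping each step in its own function mirrors the numbering of the algorithm, so a different distance or correlation measure could be swapped in without touching the rating formula.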

In Step 7 of the HU-FCF algorithm, the weights of the demographic attributes are normally taken from the experience of experts according to a given context. Nevertheless, the formula below could be used to estimate those weights in a general case.

\[ w_i = \frac{U_i}{\sum_{k=1}^{l} U_k}, \quad i = 1, \ldots, l, \tag{78} \]

\[ U_i = \operatorname{median}\{ \tilde{U}_{ij} \mid j = 1, \ldots, N \}, \tag{79} \]

where $\tilde{U}_{ij}$ is the normalized value of $U_{ij}$ ($i = 1, \ldots, l$; $j = 1, \ldots, N$). Eqs. (78) and (79) determine the values of weights according to their contributions in all demographic attributes. Let us consider the example below to illustrate the calculation of weights.

Example 3. Suppose that we have a demographic dataset of 15 users in Table 3. Normalizing the demographic dataset, we obtain the results in Table 4; each value is divided by the sum of its attribute column (e.g., 24/544 ≈ 0.0441 for the age of user 1). Taking the median values by demographic attributes in Eq. (79), we have Table 5. Using Eq. (78), we get the values of weights in Table 6.

Table 3
The demographic dataset.
User   Age   Education   No. children   Living standard
1      24    2           3              3
2      45    1           10             4
3      43    2           7              4
4      42    3           9              3
5      36    3           8              2
6      19    4           0              3
7      38    2           6              2
8      21    3           1              2
9      27    2           3              4
10     45    1           8              2
11     38    1           2              3
12     42    1           4              3
13     44    4           1              4
14     42    2           1              3
15     38    3           2              3

Table 4
The normalized demographic dataset.
User   Age          Education     No. children   Living standard
1      0.04411765   0.058823529   0.046153846    0.066666667
2      0.08272059   0.029411765   0.153846154    0.088888889
3      0.07904412   0.058823529   0.107692308    0.088888889
4      0.07720588   0.088235294   0.138461538    0.066666667
5      0.06617647   0.088235294   0.123076923    0.044444444
6      0.03492647   0.117647059   0              0.066666667
7      0.06985294   0.058823529   0.092307692    0.044444444
8      0.03860294   0.088235294   0.015384615    0.044444444
9      0.04963235   0.058823529   0.046153846    0.088888889
10     0.08272059   0.029411765   0.123076923    0.044444444
11     0.06985294   0.029411765   0.030769231    0.066666667
12     0.07720588   0.029411765   0.061538462    0.066666667
13     0.08088235   0.117647059   0.015384615    0.088888889
14     0.07720588   0.058823529   0.015384615    0.066666667
15     0.06985294   0.088235294   0.030769231    0.066666667

Table 5
The median values of demographic dataset.
Age           Education     No. children   Living standard
0.069852941   0.058823529   0.046153846    0.066666667

Table 6
The weights.
w1        w2        w3        w4
0.28925   0.24358   0.19112   0.27606

3. Evaluation

3.1. Experimental design

In this part, we describe the experimental environments such as,


• Experimental tools: We have implemented the proposed algorithm HU-FCF, in addition to the fuzzy collaborative filtering algorithms of Lucas et al. (2012) and Zenebe and Norcio (2009), in the C programming language and executed them on a PC with an Intel Pentium Dual Core 1.80 GHz and 1 GB RAM.
• Experimental dataset: the benchmark RS datasets MovieLens (GroupLens research, 2014) and Book-Crossing (Ziegler et al., 2005). The MovieLens datasets consist of 2 types, 100k and 1M, and show the rating values from 1 (Bad) to 5 (Excellent) of users for a collection of movies in the system. The data were collected through the MovieLens web site (movielens.umn.edu) during the seven-month period from September 19th, 1997 through April 22nd, 1998. The Book-Crossing dataset was collected by Cai-Nicolas Ziegler in a 4-week crawl (August/September 2004) from the Book-Crossing community and shows the rating values from 1 to 10 of users for a set of books. Besides these data, there are other benchmark RS datasets such as Jester ( Sushi ( HetRec2011 ( WikiLens (http://grouplens.org/datasets/wikilens), etc. However, they do not contain demographic information, so for the sake of a fair comparison we adopt MovieLens and Book-Crossing for the experiments. The cross-validation method used to get the training and testing datasets is 3-fold. Table 7 gives an overview of those datasets.
• The validity indices: we use the Mean Accuracy (MA) and the computational time.
• Parameters setting: the similarity threshold is set as $\theta = 0.2$.
• Objective:
  ◦ To compare the accuracy of HU-FCF with those of relevant algorithms;
  ◦ To evaluate the computational time of the algorithms.

Table 7
The descriptions of datasets.
Dataset          No. users   No. attributes   No. items   No. ratings
MovieLens 100k   943         3                1682        100,000
MovieLens 1M     6040        3                3900        1,000,209
Book-Crossing    278,858     2                271,379     1,149,780
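The 3-fold cross-validation named in the experimental design can be sketched as below. The paper does not say how the folds were drawn, so the random shuffling, the seed and the function name are assumptions made for illustration, and the rating triples are made-up placeholders rather than MovieLens or Book-Crossing records.

```python
import random

def k_fold_splits(ratings, k=3, seed=0):
    """Yield (train, test) pairs for k-fold cross-validation."""
    indices = list(range(len(ratings)))
    random.Random(seed).shuffle(indices)       # assumption: random fold assignment
    folds = [indices[i::k] for i in range(k)]  # k folds of nearly equal size
    for held_out in range(k):
        test = [ratings[j] for j in folds[held_out]]
        train = [ratings[j]
                 for f, fold in enumerate(folds) if f != held_out
                 for j in fold]
        yield train, test

# Hypothetical (user, item, rating) triples used only to exercise the split.
ratings = [("u1", "i1", 4), ("u1", "i2", 5), ("u2", "i1", 3), ("u2", "i3", 2),
           ("u3", "i2", 1), ("u3", "i3", 5), ("u4", "i1", 4)]
splits = list(k_fold_splits(ratings, k=3))
```

Each rating appears in exactly one test fold, so averaging the accuracy over the three runs uses every rating once for testing.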

3.2. Experimental results

In this section, we present the experimental results given in Table 8. In this table, we compare the proposed algorithm HU-FCF with the algorithms of Lucas et al. (a.k.a. Lucas) and Zenebe and Norcio (a.k.a. ZN) in terms of accuracy and computational time. The experimental datasets are denoted as 100k (MovieLens 100k), 1M (MovieLens 1M) and BC (Book-Crossing). In order to validate the efficiency of the method to determine the weights of demographic attributes in Eqs. (78) and (79), we ran the experiments under three cases of weights:


• Weight 1: the values of weights $w_i$ ($\forall i = 1, \ldots, l$) are calculated by Eqs. (78) and (79);
• Weight 2: the values of weights $w_i$ ($\forall i = 1, \ldots, l$) are set up equally: $w_i = 1/l$;
• Weight 3: the values of weights $w_i$ ($\forall i = 1, \ldots, l$) are randomly set up in (0,1) satisfying constraint (70).
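As a hedged sketch, the three weighting schemes can be written in Python as below. The data are the columns of Table 3; normalizing each value by its column sum is inferred from Tables 3 and 4, and the function names and the seeded random generator are illustrative assumptions rather than part of the original implementation.

```python
import random
from statistics import median

# Columns of Table 3 (15 users).
age       = [24, 45, 43, 42, 36, 19, 38, 21, 27, 45, 38, 42, 44, 42, 38]
education = [2, 1, 2, 3, 3, 4, 2, 3, 2, 1, 1, 1, 4, 2, 3]
children  = [3, 10, 7, 9, 8, 0, 6, 1, 3, 8, 2, 4, 1, 1, 2]
living    = [3, 4, 4, 3, 2, 3, 2, 2, 4, 2, 3, 3, 4, 3, 3]

def median_weights(columns):
    """Weight 1: Eqs. (78) and (79) on column-sum-normalized attributes."""
    medians = []
    for col in columns:
        total = sum(col)
        medians.append(median([v / total for v in col]))  # Eq. (79)
    norm = sum(medians)
    return [m / norm for m in medians]                    # Eq. (78)

def equal_weights(l):
    """Weight 2: w_i = 1 / l."""
    return [1.0 / l] * l

def random_weights(l, seed=None):
    """Weight 3: random draws rescaled so that they sum to 1, Eq. (70)."""
    rng = random.Random(seed)
    raw = [rng.random() for _ in range(l)]
    total = sum(raw)
    return [r / total for r in raw]

w1 = median_weights([age, education, children, living])
print([round(w, 5) for w in w1])
```

Run on the Table 3 data, `median_weights` gives values that agree with Table 6 to within rounding.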


According to the results in Table 8, the accuracy of HU-FCF is in most cases better than those of Lucas and ZN. For example, in the case of Weight 1 and the 1M dataset, the MA value of HU-FCF is 82.2%, which is larger than those of Lucas and ZN at 78.4% and 62.1%, respectively. In the case of Weight 2 and the BC dataset, the MA value of HU-FCF (79.4%) equals that of Lucas (79.4%) and exceeds that of ZN (63.4%). Lastly, the MA values of all algorithms in the case of Weight 3 and the 100k dataset also show that HU-FCF is better than the algorithms of Lucas and ZN, with the numbers being 82.1%, 81.4% and 55%, respectively.

Nevertheless, there are some cases in which the accuracy of HU-FCF is worse than that of Lucas. For example, in the case of Weight 1 and the 100k dataset, the MA value of HU-FCF is 76.7%, which is smaller than that of Lucas at 81.4%. Similarly, in the case of Weight 3 and the BC dataset, the MA values of the HU-FCF, Lucas and ZN algorithms are 77.4%, 79.4% and 63.4%, respectively. However, the number of such cases is small in comparison with the rest, and most of the time the accuracy of HU-FCF is still better than those of the other algorithms.

Comparing the algorithms across the cases of weights, we recognize that the case of Weight 1 often results in better accuracy of the HU-FCF algorithm than the cases of Weight 2 and Weight 3. Even though there exists a bad case of the accuracy of HU-FCF in comparison with other algorithms in each case of weight, the average MA value of HU-FCF over the datasets in the case of Weight 1 is 80.7%, whilst those in the cases of Weight 2 and Weight 3 are 80.1% and 79.3%, respectively. This supports the claim that using the generation method in Eqs. (78) and (79) to create the values of weights $w_i$ ($\forall i = 1, \ldots, l$) is more efficient than the other methods of weights.

As we predicted earlier regarding the drawback of the computational time of HU-FCF, the results in Table 8 re-confirm that the computational time of HU-FCF is longer than those of the other algorithms. For example, in the case of the 100k dataset, the HU-FCF algorithm takes approximately 26.1 s on average over the cases of weights. This number is larger than those of Lucas and ZN, which are 15.3 and 18.4 s, respectively. In the case of the 1M dataset, the values of computational time of HU-FCF, Lucas and ZN are approximately 123.7, 78 and 93 s, respectively. Lastly, the values of computational time of HU-FCF, Lucas and ZN in the case of the BC dataset are 145, 96 and 115 s, respectively. Even though the computational time of HU-FCF is larger than those of the other algorithms, the difference is regarded as acceptable.

Table 8
The comparison of algorithms.
Data        MA (%)                     Time (s)
            HU-FCF   Lucas   ZN        HU-FCF   Lucas   ZN
Weight 1
  100k      76.7     81.4    55.0      22.7     15.3    18.4
  1M        82.2     78.4    62.1      113      78      93
  BC        83.1     79.4    63.4      132      96      115
Weight 2
  100k      81.3     81.4    55.0      27.4     15.3    18.4
  1M        79.7     78.4    62.1      122      78      93
  BC        79.4     79.4    63.4      148      96      115
Weight 3
  100k      82.1     81.4    55.0      28.3     15.3    18.4
  1M        78.4     78.4    62.1      136      78      93
  BC        77.4     79.4    63.4      155      96      115
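The averages quoted in this discussion can be recomputed from the HU-FCF columns of Table 8 with a few lines of Python. The dictionary layout and variable names below are illustrative choices; the numbers are copied from the table.

```python
# HU-FCF results from Table 8, keyed by weight case and dataset.
ma = {
    "Weight 1": {"100k": 76.7, "1M": 82.2, "BC": 83.1},
    "Weight 2": {"100k": 81.3, "1M": 79.7, "BC": 79.4},
    "Weight 3": {"100k": 82.1, "1M": 78.4, "BC": 77.4},
}
time_s = {
    "Weight 1": {"100k": 22.7, "1M": 113, "BC": 132},
    "Weight 2": {"100k": 27.4, "1M": 122, "BC": 148},
    "Weight 3": {"100k": 28.3, "1M": 136, "BC": 155},
}

# Average MA of HU-FCF over the three datasets, per weight case.
avg_ma = {w: sum(row.values()) / len(row) for w, row in ma.items()}

# Average running time of HU-FCF over the three weight cases, per dataset.
datasets = ["100k", "1M", "BC"]
avg_time = {d: sum(time_s[w][d] for w in time_s) / len(time_s) for d in datasets}

for w, value in avg_ma.items():
    print(f"{w}: average MA = {value:.1f}%")
for d, value in avg_time.items():
    print(f"{d}: average time = {value:.1f} s")
```

The per-weight MA averages come out at about 80.7%, 80.1% and 79.3%, and the per-dataset times at about 26.1 s, 123.7 s and 145 s.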

3.3. Concluding remarks

From the experimental results, we have extracted the following concluding remarks.

• The accuracy of HU-FCF is better than those of the other relevant algorithms in most cases;
• The generation method of weights of demographic attributes in Eqs. (78) and (79) of Section 2.4 is the most effective among the methods of weights considered;
• The drawback of the computational time of HU-FCF is acceptable.


4. An application of HU-FCF for football results prediction

In this section, we illustrate an application of HU-FCF to the football results prediction problem. The experimental datasets were taken from the Barclays English Premier League (BEPL), including 20 teams and 38 rounds (Statto organisation, 2014). From the datasets, we have summarized some characteristics of a team in Table 9.

Table 9
Statistical information of a football team in BEPL.
Psychological/non-psychological information
The number of games that failed to score       The average age of players
The number of goals scored (in home team)      Injury per game
The number of goals against (home team)        The number of red (yellow) cards
The number of clean sheets (home team)         The number of penalties (against)
The average number of shots per game


Fig. 1. Results of BEPL season 2012–2013.


Now, we describe how the HU-FCF algorithm can be applied to predict football results. Let us take a look at the statistics of BEPL season 2012–2013 visualized in Fig. 1. From this figure, we split the scoring results into 2 subsets, training and testing, by the hold-out method, and use the training subset as the rating histories. The user and item sets in this case are identical and consist of 20 football teams whose demographic data are shown in Table 9. Next, we use the HU-FCF algorithm to predict the result of the match between Manchester United (Home team) and Arsenal (Away team). The similarity threshold is set as $\theta = 0.2$, and the weights of the demographic attributes are $w_i = 1/9$ ($i = 1, \ldots, 9$). From the rating histories, we encode a result "x–y" to the form "x × 10 + y" for easy calculation, so that the domain of the problem is now transformed to $D = [0, \ldots, 99]$.
According to Eq. (67), the fuzzy distances matrix is calculated as,

[Eq. (80): the matrix $FD^T$ of fuzzy distances between the considered team and the 19 other teams over the 9 attributes (9 × 19 entries, all between 0 and 0.5); the row and column positions of the entries are not recoverable here.]
(80)
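The score encoding described above, together with Step 12 of the algorithm, can be sketched as follows. The helper names are hypothetical, an ASCII hyphen stands in for the dash in "x–y", the encoding assumes single-digit scores, and 21.248 is used only as an example of a raw predicted rating.

```python
def encode(score):
    """Encode a result 'x-y' as x * 10 + y (single-digit scores assumed)."""
    home, away = score.split("-")
    return int(home) * 10 + int(away)

def decode(code):
    """Invert the encoding: 21 -> '2-1'."""
    return f"{code // 10}-{code % 10}"

def nearest_in_domain(r, domain=range(100)):
    """Step 12: the value d in D = [0, ..., 99] closest to the raw rating r."""
    return min(domain, key=lambda d: abs(r - d))

def error_threshold(r, d):
    """Eq. (77): 100 * |r - d| / max(r, d), in percent."""
    top = max(r, d)
    return 100.0 * abs(r - d) / top if top else 0.0  # assumption: 0 when r = d = 0

raw = 21.248                      # example raw prediction
d = nearest_in_domain(raw)        # nearest encoded score
print(decode(d), error_threshold(raw, d))
```

Because the encoding is positional, a raw rating near 21 decodes to a home win of 2–1, and the error threshold measures how far the raw rating is from that encoded score.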

Table 10
The comparison of accuracy (%).
BEPL season   HU-FCF   Lucas   ZN
2012–2013     33.3     30.6    32.7
2011–2012     38.3     35.4    37.3
2010–2011     31.1     29.8    30.2

Table 11
The comparison of accuracy (%) with the new domain.
BEPL season   HU-FCF   Lucas et al.   Zenebe and Norcio
2012–2013     94.1     91.7           92.8
2011–2012     90.4     90.1           90.4
2010–2011     92.3     91.7           90.6


From Eq. (80), we calculate the number of similarity degrees as $c = 65$. Thus, with $N = 20$ teams, $\alpha = 65/(20 + 65 - 1) \approx 0.77$ and $\beta = 0.23$. The fuzzy similarity degrees matrix is then expressed in Eq. (81). From the rating histories we calculate the hard similarity degrees and the final similarity degrees matrices in Eqs. (82) and (83), respectively.

\[ FSD^T = ( 0.76 \;\; 0.74 \;\; 0.79 \;\; 0.65 \;\; 0.74 \;\; 0.7 \;\; 0.73 \;\; 0.75 \;\; 0.74 \;\; 0.78 \;\; 0.81 \;\; 0.72 \;\; 0.79 \;\; 0.78 \;\; 0.77 \;\; 0.81 \;\; 0.8 \;\; 0.76 \;\; 0.7 ). \tag{81} \]

\[ HSD^T = ( -0.42 \;\; -0.42 \;\; 0 \;\; -0.32 \;\; -0.43 \;\; 0.14 \;\; 0.33 \;\; 0.28 \;\; 0.06 \;\; -0.09 \;\; 0.02 \;\; -0.12 \;\; -0.14 \;\; -0.64 \;\; 0.21 \;\; -0.71 \;\; 0.31 \;\; -0.28 \;\; -0.44 ), \tag{82} \]

\[ SIM^T = ( 0.49 \;\; 0.48 \;\; 0.61 \;\; 0.43 \;\; 0.48 \;\; 0.57 \;\; 0.64 \;\; 0.64 \;\; 0.59 \;\; 0.58 \;\; 0.63 \;\; 0.53 \;\; 0.58 \;\; 0.46 \;\; 0.64 \;\; 0.47 \;\; 0.69 \;\; 0.52 \;\; 0.44 ). \tag{83} \]

Thus, the final rating is $R(a, i^*) = 21.248$. From the domain, we determine that the nearest value of $R(a, i^*)$ is "21", which means "2–1", with the error threshold being $\Delta = 1.16\%$. This result is identical to that in Fig. 1. If we use the existing fuzzy collaborative filtering methods such as the work of Lucas et al. (2012) and Zenebe and Norcio (2009), the predictive results are "0–1" and "1–1", respectively. Finally, we have made the comparative experiments for the remaining matches in the testing data of the season 2012–2013 and the matches of other seasons, and received the results in Table 10.
If the domain D returns to {"Win", "Draw", "Lose"}, then we receive the results in Table 11.

5. Conclusions

In this paper, we aimed to enhance the prediction accuracy of the available filtering methods in Fuzzy Recommender Systems. From a review of the literature, we pointed out two limitations of the relevant research: the lack of a well-defined mathematical definition of Fuzzy Recommender Systems accompanied by its algebraic operations and properties, and the fact that the fuzzy similarity degree is not enough to express accurately the analogousness between users. Thus, we have made the following contributions: (i) a systematic mathematical definition of Fuzzy Recommender Systems that is a generalization of the existing definitions of Recommender Systems and Multi-Criteria Recommender Systems, with an illustrated example from the MovieLens dataset, was proposed; (ii) some basic algebraic operations in Fuzzy Recommender Systems such as the union, the intersection and the complement, accompanied by their properties, were presented; (iii) a novel hybrid user-based fuzzy collaborative filtering method for Fuzzy Recommender Systems, called HU-FCF, that utilizes both fuzzy and hard user-based similarity degrees and automatically calculates the weights of attributes and degrees was described. Experimental results conducted on some benchmark RS datasets such as MovieLens and Book-Crossing showed that HU-FCF obtains better accuracy than other relevant fuzzy filtering methods.
The proposed methodology has practical implications for research on Recommender Systems and on expert and knowledge-based systems. Firstly, it enriches the knowledge of modeling and formulation of Fuzzy Recommender Systems. Secondly, some basic algebraic operations of Fuzzy Recommender Systems could be used for further studies involving the mathematical foundations of Recommender Systems. Thirdly, the application of the fuzzy filtering method HU-FCF to football results prediction in Section 4 has shown the capability of the proposed method to be applied to various practical problems. Last but not least, general readers could benefit from the know-how of system modeling and algorithmic formulation, and utilize it in interdisciplinary research.
As mentioned in Section 2.4, the HU-FCF algorithm was designed on the basis of Recommender Systems, so one further research direction is to extend this algorithm to work with truly Fuzzy Recommender Systems expressed in Definition 3 by considering the fuzzification on both the left and right sides of Eq. (3). Furthermore, some other algebraic operations of Fuzzy Recommender Systems should be developed for the completeness of the system. Finally, a combination of the HU-FCF algorithm with a neuro-fuzzy network for some forecast problems to improve the accuracy is also our target.

Acknowledgments

The authors are greatly indebted to the editor-in-chief Prof. B. Lin and the anonymous reviewers for their comments and suggestions that improved the clarity and quality of the paper. Thanks are also due to Mr. Khuat Manh Cuong, VNU, for the calculation works. This work is sponsored by NAFOSTED under contract No. 102.05-2014.01.


References


Adomavicius, G., & Tuzhilin, A. (2005). Toward the next generation of recommender
systems: A survey of the state-of-the-art and possible extensions. IEEE
Transactions on Knowledge and Data Engineering, 17(6), 734–749.
Al-Shamri, M. Y. H., & Bharadwaj, K. K. (2008). Fuzzy-genetic approach to
recommender systems based on a novel hybrid user model. Expert Systems
with Applications, 35(3), 1386–1399.
Borges, H. L., & Lorena, A. C. (2010). A survey on recommender systems for news
data. In Smart information and knowledge management (pp. 129–151). Berlin,
Heidelberg, Germany: Springer.
Boulkrinat, S., Hadjali, A., & Mokhtari, A. (2013). Towards recommender systems
based on a fuzzy preference aggregation. Proceeding of the eighth conference of
the European society for fuzzy logic and technology (EUSFLAT-13), 146–153.
Cao, Y., & Li, Y. (2007). An intelligent fuzzy-based recommendation system for
consumer electronic products. Expert Systems with Applications, 33(1), 230–240.
Carrer-Neto, W., Hernández-Alcaraz, M. L., Valencia-García, R., & García-Sánchez, F.
(2012). Social knowledge-based recommender system. Application to the
movies domain. Expert Systems with Applications, 39(12), 10990–11000.
Christidis, K., & Mentzas, G. (2013). A topic-based recommender system for
electronic marketplace platforms. Expert Systems with Applications, 40(11),
4370–4379.
Costa-Montenegro, E., Barragáns-Martínez, A. B., & Rey-López, M. (2012). Which
App? A recommender system of applications in markets: Implementation of the
service for monitoring users’ interaction. Expert Systems with Applications,
39(10), 9367–9375.
Derntl, M. et al. (2011). Inclusive social tagging and its support in Web 2.0. services.
Computers in Human Behavior, 27(4), 1460–1466.
Drachsler, H. et al. (2010). Issues and considerations regarding sharable data sets for
recommender systems in technology enhanced learning. Procedia Computer
Science, 1(2), 2849–2858.
Duan, L., Street, W. N., & Xu, E. (2011). Healthcare information systems: Data mining
methods in the creation of a clinical recommender system. Enterprise
Information Systems, 5(2), 169–181.
Fang, B., Liao, S., Xu, K., Cheng, H., Zhu, C., & Chen, H. (2012). A novel mobile
recommender system for indoor shopping. Expert Systems with Applications,
39(15), 11992–12000.
Ghazanfar, M. A., & Prügel-Bennett, A. (2014). Leveraging clustering approaches to
solve the gray-sheep users problem in recommender systems. Expert Systems
with Applications, 41(7), 3261–3275.
GroupLens research. (2014). MovieLens. Available at: < />datasets/movielens/>.
Kaklauskas, A. et al. (2013). Recommender system to analyze student’s academic
performance. Expert Systems with Applications, 40(15), 6150–6165.
Lucas, J. P., Laurent, A., Moreno, M. N., & Teisseire, M. (2012). A fuzzy associative
classification approach for recommender systems. International Journal of
Uncertainty, Fuzziness and Knowledge-Based Systems, 20(04), 579–617.
Nadi, S., Saraee, M., Bagheri, A., & Davarpanh Jazi, M. (2011). FARS: Fuzzy ant based
recommender system for web users. International Journal of Computer Science
Issues, 8(1), 203–209.
Palanivel, K., & Siavkumar, R. (2010). Fuzzy multi-criteria decision-making
approach for collaborative recommender systems. International Journal of
Computer Theory and Engineering, 2(1), 57–63.
Park, D. H., Kim, H. K., Choi, I. Y., & Kim, J. K. (2012). A literature review and
classification of recommender systems research. Expert Systems with
Applications, 39(11), 10059–10072.
Park, H. S., Yoo, J. O., & Cho, S. B. (2006). A context-aware music recommendation
system using fuzzy bayesian networks with utility theory. In Fuzzy systems and
knowledge discovery (pp. 970–979). Berlin: Springer.


Porcel, C., & Herrera-Viedma, E. (2010). Dealing with incomplete information in a
fuzzy linguistic recommender system to disseminate information in university
digital libraries. Knowledge-Based Systems, 23(1), 32–39.
Porcel, C., López-Herrera, A. G., & Herrera-Viedma, E. (2009). A recommender
system for research resources based on fuzzy linguistic modelling. Expert
Systems with Applications, 36(3), 5173–5183.
Ricci, F., Rokach, L., & Shapira, B. (2011). Introduction to recommender systems
handbook. In Recommender systems handbook (pp. 1–35). US: Springer.
Romero, F. P., Ferreira-Satler, M., Olivas, J. A., Prieto-Mendez, M. E., & Menéndez-Domínguez, V. H. (2011). A fuzzy-based recommender approach for learning
objects management systems. Proceeding of the 2011 IEEE 11th international
conference on intelligent systems design and applications (ISDA), 984–989.
Serrano-Guerrero, J., Herrera-Viedma, E., Olivas, J. A., Cerezo, A., & Romero, F. P. (2011).
A Google wave-based fuzzy recommender system to disseminate information
in university digital libraries 2.0. Information Sciences, 181(9), 1503–1516.

Sevarac, Z., Devedzic, V., & Jovanovic, J. (2012). Adaptive neuro-fuzzy pedagogical
recommender. Expert Systems with Applications, 39(10), 9797–9806.
Shapira, B. (2011). Recommender systems handbook. US: Springer.
Shih, D. H., Yen, D. C., Lin, H. C., & Shih, M. H. (2011). An implementation and
evaluation of recommender systems for traveling abroad. Expert Systems with
Applications, 38(12), 15344–15355.
Son, L. H. & Thong, N. T. (Submitted for publication). Intuitionistic fuzzy
recommender systems: An effective tool for medical diagnosis. Fuzzy Sets and
Systems.
Son, L. H., Cuong, B. C., Lanzi, P. L., & Thong, N. T. (2012). A novel intuitionistic fuzzy
clustering method for geo-demographic analysis. Expert Systems with
Applications, 39(10), 9848–9859.
Son, L. H., Cuong, B. C., & Long, H. V. (2013a). Spatial interaction – modification
model and applications to geo-demographic analysis. Knowledge-Based Systems,
49, 152–170.
Song, Y., Zhang, L., & Giles, C. L. (2011). Automatic tag recommendation algorithms
for social recommender systems. ACM Transactions on the Web, 5(1), 4–39.
Son, L. H. (in press). Enhancing clustering quality of geo-demographic analysis using
context fuzzy clustering type-2 and particle swarm optimization. Applied Soft
Computing.
Son, L. H., Linh, N. D., & Long, H. V. (2014). A lossless DEM compression for fast
retrieval method using fuzzy clustering and MANFIS neural network.
Engineering Applications of Artificial Intelligence, 29, 33–42.
Son, L. H., Minh, N. T. H., Cuong, K. M., & Canh, N. V. (2013b). An application of fuzzy
geographically clustering for solving the cold-start problem in recommender
systems. Proceeding of fifth IEEE international conference of soft computing and
pattern recognition (SoCPaR 2013), 44–49.
Statto organisation. (2014). English premier league 2013–2014. Available at:
< />
Subasi, A. (2006). Automatic detection of epileptic seizure using dynamic fuzzy
neural networks. Expert Systems with Applications, 31(2), 320–328.

Terán, L., & Meier, A. (2010). A fuzzy recommender system for eelections. In
Electronic government and the information systems perspective (pp. 62–76).
Berlin, Heidelberg, Germany: Springer.
Yager, R. R. (2003). Fuzzy logic methods in recommender systems. Fuzzy Sets and
Systems, 136(2), 133–149.
Zenebe, A., & Norcio, A. F. (2009). Representation, similarity measures and
aggregation methods using fuzzy sets for content-based recommender
systems. Fuzzy Sets and Systems, 160(1), 76–94.
Zhang, Z., Lin, H., Liu, K., Wu, D., Zhang, G., & Lu, J. (2013). A hybrid fuzzy-based
personalized recommender system for telecom products/services. Information
Sciences, 235, 117–129.
Zheng, N., & Li, Q. (2011). A recommender system based on tag and time
information for social tagging systems. Expert Systems with Applications, 38(4),
4575–4587.
Ziegler, C. N., McNee, S. M., Konstan, J. A., & Lausen, G. (2005). Improving
recommendation lists through topic diversification. Proceedings of the 14th
ACM international conference on world wide web, 22–32.
