
Community learning in location based social networks 4


Rank  Profile            Group Size
1     Coffee/Tea lovers  795
2     Food lovers        679
3     Shoppers           599
4     Travellers         584
5     Fast food lovers   576
4.9 Top Five Communities in New York City
In this section, we show the top five communities detected in New York City. We
manually observe and derive the community profiles from the prominent venue
categories, tip topics and photo concepts of the communities.
Rank  Profile          Group Size
1     Food lovers      784
2     Clubbing lovers  698
3     Shoppers         643
4     Travellers       639
5     Tourists         589
4.10 Summary
In this chapter, we investigated the problem of community understanding in
LBSNs. We proposed a novel and unified framework which models the heterogeneous
entities and interactions by constructing a heterogeneous, non-uniform hypergraph.
We then formulated community detection as the problem of detecting dense
subgraphs over the hypergraph, where constraints were added to ensure the
interpretability of the detected communities. We then proposed an efficient
procedure to solve the optimization problem. Extensive experiments have been
performed both qualitatively and quantitatively to verify our proposed approach.
Meaningful and interpretable communities were detected in an optimal way, while
interesting cultural differences were revealed by analyzing the communities in
Singapore and New York City.
There are a few interesting aspects worth further exploration. First, users’
time-dependent behaviors allow interest communities to be detected and
understood in a timely manner. For example, it is interesting to mine and profile
different interest groups that are active during different time periods. Second,
users often participate in various social networks. The aggregation of user behaviors
across multiple sources is expected to lead to more accurate and timely communities
that enrich community understanding.
Chapter 5
Community Matching across Geographical Regions
People are usually active only within small geographical regions, as described in
the previous chapters. While it is easy to connect users when they visit similar
sets of venues in the same geographical regions, it is interesting and challenging to
investigate ways to correlate users across geographical regions based on their local
behaviors.
In this chapter, we study the problem of community matching in LBSNs
across geographical regions in the context of generating personalized
recommendations of locally interesting venues to tourists. To do so, we first
propose a Bayesian approach to extract the latent social dimensions for users
based on their local behaviors. We then match users’ preferences across
geographical regions based on latent global interest factors. In the experiments,
we validate both the quality of the extracted latent social dimensions and the
community matching across geographical regions in the recommendation frameworks.
The rest of the chapter is organized as follows. Section 5.1 reiterates the
motivation of community matching across geographical regions by describing a
real-world application scenario and the use of LBSNs as a solution, and discusses
the challenges of the problem. Section 5.2 reviews related works which have not
been covered in the literature review in Chapter 2, yet are related to the problem
and techniques described in this chapter. Section 5.3 gives an overview of the
proposed framework, while Section 5.4 formally defines the problem. Sections 5.5
and 5.6 detail social dimensions extraction and recommendations generation,
respectively. Section 5.7 reports the experimental results. Finally, Section 5.8
gives the concluding remarks.
5.1 Motivation and Challenges
When we travel to new places, in addition to sightseeing, we are often interested
in exploring local cultures that match our personal interests, such as sampling
local cuisines, understanding local customs, and visiting shops selling local
specialty items. However, there exists a large gap between what we want and what
is provided by the dominant tourism resources, such as Wikitravel, Lonely
Planet and the official tourism boards of certain countries, such as YourSingapore
and AustraliaTravel. The gap has two main causes. First, these sites mainly
provide information on famous attractions or popular local landmarks instead of
locally interesting places. However, many tourists may want to experience local
cultures that match their interests in terms of local food, events and shops. These
locally interesting places or activities may not be famous enough to be included
in these tourism resources. Second, they generate user-independent content, while
people usually have drastically different personal preferences in reality. For
example, people who love shopping may want to visit more popular local shops, while
food lovers are more interested in sampling different kinds of local foods, such as
the local foods in Shilin Night Market in Taipei.
On the other hand, rich location data at a fine-grained level is now available
from the recently emerging LBSNs. They are becoming more and more popular
thanks to the recent availability of open mobile platforms, which makes LBSNs
much more accessible to mobile users. These LBSNs are able to provide sufficient
resources to bridge the aforementioned gap. First, they allow users to voluntarily
annotate the real world with check-ins, which indicate the specific times that the
users were at particular locations. In addition, LBSNs provide “location-specific
data”, in which users may check in at nearly the same geographical coordinates but
at very different venues. For example, users can check in at a cinema or a restaurant
in the same shopping mall, where both venues share the same geographical
coordinates. In contrast, cell phone data provides coarse location accuracy and
cannot differentiate users’ presence across different floors in the same building.
The active participation of Foursquare users and the fine-grained venue annotations
make personalized recommendation of locally interesting venues possible.
Collaborative filtering (CF) based approaches [48, 58] seem to be plausible
solutions to this problem, as demonstrated by their great successes in commercial
applications, such as Amazon [69], Netflix [11], Tivo [2] and eBay [152], and in
research on point-of-interest (POI) recommendations [150, 24, 156, 151]. These
approaches automatically generate recommended items for a user using known
preferences of other users or known items preferred by the target user. However,
CF-based algorithms, whether memory or model based, require sufficient overlap
among users in terms of items rated so that the correspondences among users or
items can be readily identified. In LBSNs, however, users usually visit venues that
are within a small geographical distance from their homes [25, 27], which makes it
hard if not impossible to correlate users if they visit very different sets of venues
with little or no overlap. Let’s consider the user-venue matrix shown in Table 5.1,
where
Table 5.1: User-Venue Matrix (Values indicate number of visits).
      v1   v2   v3   v4   v5   v6   v7   v8   v9  v10
u1     0    0    0    0    0   10    5    3    0    1
u2     0    0    0    0    0   21   15    3    0   12
u3     0    0    0    0    0    1    0    3    3    1
u4     0    0    0    0    0   10    5    0    4    3
u5     1   11    5    3    1    0    0    0    0    0
u6     3    9    0    3    2    0    0    0    0    0
u7     7    1    1    0    1    0    0    0    0    0
u8     2    4    3    3    4    0    0    0    0    0
Table 5.2: “?” stands for preferences to be predicted.
      v1   v2   v3   v4   v5   v6   v7   v8   v9  v10
u1     ?    ?    ?    ?    ?   10    5    3    0    1
u2     ?    ?    ?    ?    ?   21   15    3    0   12
u3     ?    ?    ?    ?    ?    1    0    3    3    1
u4     ?    ?    ?    ?    ?   10    5    0    4    3
u5     1   11    5    3    1    ?    ?    ?    ?    ?
u6     3    9    0    3    2    ?    ?    ?    ?    ?
u7     7    1    1    0    1    ?    ?    ?    ?    ?
u8     2    4    3    3    4    ?    ?    ?    ?    ?
users $\{u_1, \cdots, u_4\}$ never visited venues $\{v_1, \cdots, v_5\}$ and users
$\{u_5, \cdots, u_8\}$ never visited venues $\{v_6, \cdots, v_{10}\}$. If we were to
use traditional CF techniques, the ratings marked with “?” in Table 5.2 would be
hard to estimate. In addition, most CF algorithms are based on static models in
which relations are assumed to be fixed over time. However, users’ visiting
behaviours often evolve over time [97] and exhibit strong temporal patterns, such as
daily/weekly patterns and periodicity [25]. For example, people perform more
check-ins at restaurants during meal times and visit shops mostly during weekends
and weekday evenings. Hence, an effective way to incorporate the temporal
information is required.
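
To make the overlap problem concrete, the following Python sketch computes pairwise user similarities on the visit counts of Table 5.1. Cosine similarity is only one common memory-based CF measure and is our assumption here; the helper name `cosine_user_similarity` is hypothetical and not part of the original work.

```python
import numpy as np

# Visit counts copied from Table 5.1 (rows u1..u8, columns v1..v10).
visits = np.array([
    [0, 0, 0, 0, 0, 10, 5, 3, 0, 1],
    [0, 0, 0, 0, 0, 21, 15, 3, 0, 12],
    [0, 0, 0, 0, 0, 1, 0, 3, 3, 1],
    [0, 0, 0, 0, 0, 10, 5, 0, 4, 3],
    [1, 11, 5, 3, 1, 0, 0, 0, 0, 0],
    [3, 9, 0, 3, 2, 0, 0, 0, 0, 0],
    [7, 1, 1, 0, 1, 0, 0, 0, 0, 0],
    [2, 4, 3, 3, 4, 0, 0, 0, 0, 0],
], dtype=float)


def cosine_user_similarity(matrix):
    """Pairwise cosine similarity between the rows (users) of a visit matrix."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    unit_rows = matrix / norms  # every user in Table 5.1 has at least one visit
    return unit_rows @ unit_rows.T


similarity = cosine_user_similarity(visits)
cross_region_block = similarity[:4, 4:]  # u1..u4 versus u5..u8
same_region_block = similarity[:4, :4]   # u1..u4 versus u1..u4
```

Every entry of `cross_region_block` is zero because the two user groups share no visited venue, so a neighbourhood-based recommender has no evidence for filling the “?” cells of Table 5.2, while users within the same group remain comparable.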
5.2 Related Work
The problem that we are investigating and the techniques we propose are related to
four research areas, namely mobility prediction, location recommendation, travel
recommendation and latent factor models.
5.2.1 Location Prediction
Location prediction based on cellular network traces has recently attracted a lot
of attention in mobile computing [49, 154, 113, 43, 138]. The various proposed
mobility models aim to provide an accurate prediction of an individual’s future
location, which is an essential requirement for various mobile applications, such as
home heating control [115], urban planning [105, 20], mobile advertising [8, 7] and
demographic prediction [16]. The basic concept in this research line is to compare
a current pattern with historical data and to extract similar patterns for predicting
the next location. Unlike location prediction, location recommendation, reviewed
in Section 5.2.2, aims to recommend new locations to users to widen their choices,
though the two adopt similar evaluation strategies. In addition, location semantics
are usually not readily available in mobile phone data, while venues in
user-generated data come with rich annotations in various aspects, such as
categories, comments and photos.

5.2.2 Location Recommendation
The recent boom of LBSNs has motivated emerging research on point-of-interest
(POI) or, more generally, location recommendations [150, 151, 156]. Location
recommendation aims to recommend a list of POIs or locations to a user based on the
user’s past visiting histories. These lines of work usually focus on general
recommendation tasks in a traditional CF framework. For example, Ye et al. compared
the influences of user similarity based on historical behavior, geographical
distance and friend network in the POI recommendation task [150]. Ying et al.
proposed to consider both user preferences and location properties in their
recommendation framework [151]. Recently, Zhou et al. studied and compared the
performances of different CF recommenders, including user-based, item-based and
probabilistic latent semantic analysis, in location recommendation [156], where
they reported that the probabilistic approach gives the best performance. There are
two main differences between our work and these related works: (1) we study a new
problem which aims to provide tourists with recommendations based on their local
visits; and (2) none of these works has studied the effects of simultaneously
considering time, social relations and venue similarities.
5.2.3 Travel Recommendation
In Web 2.0 communities, people often share their traveling experience in blogs,
forums and social networks in terms of travelogues, photos, etc. These
geo-referenced media resources contain rich information on tourism, which motivates
research on generating travel recommendations from this user-generated content.
Hao et al. proposed a location-topic model to model travelogue documents and
developed a tour destination recommendation method [57]. To obtain a destination
recommendation, a user needs to issue a query, and the system then utilizes the
topic model to select the destination with the highest matching score. Cheng et al.
leveraged community-contributed photos from Flickr to provide personalized travel
recommendation based on people’s attributes, such as gender, race and age, in a
probabilistic Bayesian learning framework [23]. More recently, Lucchese et al.
proposed an interactive random walk approach for personalized recommendations of
touristy places based on the knowledge mined from Flickr and Wikipedia [75]. While
these works all aim to provide personalized recommendations of touristy points
based on users’ past behaviors, our work focuses on recommending locally
interesting venues and aims to solve the problem of cross-region recommendation. In
addition, we utilize user-generated location content in LBSNs, which better
connects the physical world with the online virtual world.
5.2.4 Latent Factor Models
Latent factor models have been shown to be promising in recommendation tasks such
as the Netflix competition [10], results diversification [118], review helpfulness
prediction [85] and web site recommendations [76]. The underlying assumption of
using latent factor models is that the entities, such as the users and items
(venues, reviews, products, etc.), can be modeled by a set of latent
representations, which together determine the preferences of unknown items in a
probabilistic way. For example, [85] proposed a series of increasingly
sophisticated probabilistic graphical models based on tensor factorization and
showed their effectiveness in the prediction of review helpfulness. Recent work by
Cheng et al. has shown a positive influence of introducing social regularization in
POI recommendations performed on Gowalla [24]. Our proposed framework differs from
these efforts in two main aspects: (1) the framework considers temporal changes of
users’ preferences and heterogeneous intra/inter entity relations in a unified
manner; and (2) we derive a Bayesian treatment to sample latent factors, which
avoids both overfitting and tedious parameter tuning.
5.3 Overview
In this chapter, we investigate community matching across geographical
regions in LBSNs with the aim of providing tourists with personalized destination
recommendations by leveraging rich user-generated location content. We take
locally interesting venues to be those frequently visited by local people but
obscure to most foreigners. We make use of these digital footprints [46] to
understand collective local user behaviors and then provide venue recommendations
to tourists from a global understanding of cross-region community matching.
Figure 5.1 shows the overall framework, which consists of four components. First,
to tackle the sparseness problem and handle time-dependent varied behaviours, we
extract users’ latent social dimensions [127] to capture users’ preferences
according to their local check-ins at different times, social relations and
similarities among the visited venues. Social dimensions reflect the latent drives
of users’ social behaviors, and each dimension represents a plausible interest
community among users. To accomplish this subtask, we propose a novel framework
named Bayesian probabilistic tensor factorization with social and location
regularization (BPTFSLR) that puts users’ visiting behaviors, social relations and
venue similarities into a unified framework. Second, we mine local interest
communities in each geographical region using adaptive affinity propagation. Third,
we represent each local community using global properties, such as venue categories
and time of visits, according to the aggregated behaviors of community members.
Fourth, we correlate communities at different geographical regions to generate
personalized recommendations of locally interesting venues to tourists. By
conducting experiments on a representative real-world dataset, we demonstrate that
our proposed scheme is effective in generating personalized recommendations in the
local setting.
5.4 Problem Definition
In this section, we formally define the problem. It is worth mentioning
that the problem we study and the method we propose in this paper are applicable
to all LBSNs; we choose Foursquare as the testbed in this work. For clarity
and convenience, we list the variables and acronyms used in this section in Table 5.3.
[Figure 5.1 panels: Social Dimension Extraction, Local Community Detection, Local Community Profiling, Cross Region Community Matching]

Figure 5.1: Overall framework of locally interesting venue recommendations to
tourists. It includes four components: 1. Social dimensions extraction (Section 5.5),
2. Local interest communities detection (Section 5.6.1), 3. Community profiling and
representation (Section 5.6.1) and 4. Recommendation via cross region community
matching (Section 5.6.2). Venues of the same color are similar venues and dotted
arrows indicate foreign visits to be recommended to the users.
Table 5.3: List of notations and acronyms used in Chapter 5.
Notation                  Explanation
$U^g$                     the set of users in geographical region g
$V^g$                     the set of venues in geographical region g
$\mathcal{T}$             the set of location-independent time periods
$C^g$                     the set of check-ins in geographical region g
$u^g_i$                   the ith user in geographical region g
$v^g_i$                   the ith venue in geographical region g
$t_i$                     the ith location-independent time period
$(u^g_i, v^g_j, t_k)$     the ith user visits the jth venue in geographical region g during the kth time period
$G^g$                     the undirected social network graph in geographical region g
$E^g_1$                   the edge set representing the social relations between users in geographical region g
$\mathbf{R}^g$            the adjacency matrix representing the social relations between users in geographical region g
$H^g$                     the undirected venue relation graph in geographical region g
$E^g_2$                   the edge set representing the affinity relations between venues in geographical region g
$\mathbf{B}^g$            the adjacency matrix representing the affinity relations between venues in geographical region g
BPMF                      Bayesian probabilistic matrix factorization
$\mathbf{Q}$              user × venue matrix, where $Q_{ij}$ is the preference of the ith user towards the jth venue
$D$                       the latent dimension
$\mathbf{U}$              the collection of latent social dimensions of users
$\mathbf{u}_i$            the D-dimensional latent social dimensions of the ith user
$\mathbf{V}$              the latent venue feature matrix
$\mathbf{v}_i$            a D-dimensional latent feature vector for the ith venue
MCMC                      Markov chain Monte Carlo
$\mathcal{Q}$             user × venue × time tensor, where $Q^k_{ij}$ is the preference of the ith user towards the jth venue during the kth time period
$T$                       the number of different location-independent time periods
$\mathbf{T}$              the collection of latent feature vectors of time periods
$\mathbf{t}_i$            a D-dimensional latent feature vector for the ith time period
BPTF                      Bayesian probabilistic tensor factorization
$\alpha$                  the user similarity tradeoff parameter
$F_i$                     the friend set of the ith user
$\mathbf{w}_i$            the word-frequency vector of the ith venue
$Z$                       the size of the vocabulary in the tips’ corpus
$\mathbf{S}$              the auxiliary user factor feature matrix
$\mathbf{D}$              the auxiliary venue factor feature matrix
BPTFSLR                   Bayesian probabilistic tensor factorization with social and location regularization
AP                        affinity propagation
AAP                       adaptive affinity propagation
$C_i$                     the community representation at geographical region i
$\mathbf{A}$              the collection of latent community interest factors
$\mathbf{X}$              the sparse community representations
Problem Statement: Let $U^g = \{u^g_1, \cdots, u^g_{N^g}\}$ be a set of users and
$V^g = \{v^g_1, \cdots, v^g_{M^g}\}$ be a set of venues in geographical region g. Let
$\mathcal{T} = \{t_1, \cdots, t_T\}$ be a set of location-independent time periods.
We define a set of check-ins $C^g = \{c^g_1, \cdots, c^g_{q^g}\}$, where each
check-in is a tuple $(u^g_i, v^g_j, t_k)$ indicating that user $u^g_i$ visits venue
$v^g_j$ at time $t_k$ in region g. Let $G^g = (U^g, E^g_1)$ be the undirected social
network graph in region g, where $E^g_1$ represents the social relations between
users in region g. We then define the corresponding adjacency matrix
$\mathbf{R}^g \in \mathbb{R}^{N \times N}$, where $R^g_{ri}$ is the strength of the
social relation between users r and i in region g. Let $H^g = (V^g, E^g_2)$ be the
undirected venue relation graph in region g. We next define the corresponding
adjacency matrix $\mathbf{B}^g \in \mathbb{R}^{M \times M}$, where $B^g_{jl}$
represents the venue similarity between venues j and l in region g. Given $C^g$,
$G^g$, $H^g$ and $\mathcal{T}$, where $g = 1, 2, \cdots$, our aim is to recommend a
list of locally interesting venues $\{v^a_1, \cdots, v^a_{L^a}\}$ from region a to
users $\{u^b_1, \cdots, u^b_{N^b}\}$ from region b when they visit region a, where a
is geographically different from b, $L^a$ is the number of locally interesting
venues in a, and $N^b$ is the number of users in b.
5.5 Social Dimensions Extraction
In LBSNs, users exhibit heterogeneous visiting behaviors, which naturally classify
them into different interest groups, such as food lovers, shoppers, etc. In
addition, even within the same interest group, people exhibit different
preferences. For example, sports lovers may have different exercising preferences
in terms of venues and times: some prefer jogging in the morning in their
neighbourhoods; some like to exercise during weekends in nature parks; and others
may prefer to exercise in gyms after work. The inherently heterogeneous user
preferences make it hard to interpret the connections between people in social
networks. To gain insights into the underlying users’ interests, Tang et al.
formally defined social dimensions of each user, with each dimension representing a
latent affiliation among users, in order to approximate direct differentiating
connections [127]. In this section, we present a unified framework for effective
extraction of latent social dimensions for each user by simultaneously considering
temporal factors and various relations among different entities.
5.5.1 Matrix Factorization Model
A simple approach to extract the latent social dimensions is to use probabilistic
matrix factorization (PMF) [84], where the underlying assumption is that both
users and venues can be modeled by a set of latent representations. Let
$\mathbf{Q} \in \mathbb{R}^{N \times M}$ be the user × venue matrix, where $Q_{ij}$
is the preference of user i towards venue j and is computed based on the number of
times i visits j as follows:

$$Q_{ij} = \frac{c(i, j)}{\sum_{j'=1}^{M} c(i, j')}, \tag{5.1}$$
where $c(i, j)$ is the number of times user i checks in at venue j. Let
$\mathbf{U} \in \mathbb{R}^{D \times N}$ be the collection of latent social
dimensions of users, with each column $\mathbf{u}_i$ representing the D-dimensional
latent social dimensions of user i, and $\mathbf{V} \in \mathbb{R}^{D \times M}$ be
the latent venue feature matrix, with each column $\mathbf{v}_j$ representing a
D-dimensional latent feature vector for venue j. PMF then approximates $Q_{ij}$ by
the inner product of the corresponding latent features, i.e.
$Q_{ij} \approx \mathbf{u}_i^T \mathbf{v}_j$. The conditional probability of the
observed preferences is defined as:

$$p(\mathbf{Q} \mid \mathbf{U}, \mathbf{V}) = \prod_{i=1}^{N} \prod_{j=1}^{M} \left[ \mathcal{N}(Q_{ij} \mid \mathbf{u}_i^T \mathbf{v}_j, \tau_Q^{-1}) \right]^{I_{ij}}, \tag{5.2}$$
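where $\mathcal{N}(\cdot \mid \mu, \tau^{-1})$ denotes the Gaussian distribution
with mean $\mu$ and precision $\tau$. Here $I_{ij}$ is the indicator function that
equals 1 if user i has ever visited venue j and 0 otherwise. In addition, zero-mean
spherical Gaussian priors are imposed on $\mathbf{u}_i$ and $\mathbf{v}_j$ to
control the model complexity. Assuming $\mathbf{U}$ and $\mathbf{V}$ are
independently distributed, we have:

$$p(\mathbf{U}) = \prod_{i=1}^{N} \mathcal{N}(\mathbf{u}_i \mid \mathbf{0}, \sigma_U^2 \mathbf{I}), \quad p(\mathbf{V}) = \prod_{j=1}^{M} \mathcal{N}(\mathbf{v}_j \mid \mathbf{0}, \sigma_V^2 \mathbf{I}), \tag{5.3}$$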
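where $\mathbf{I}$ is the identity matrix of dimension D × D. We can maximize the
log-posterior over $\mathbf{U}$ and $\mathbf{V}$ as follows:

$$\mathbf{U}^*, \mathbf{V}^* = \arg\max_{\mathbf{U}, \mathbf{V}} p(\mathbf{U}, \mathbf{V} \mid \mathbf{Q}) = \arg\max_{\mathbf{U}, \mathbf{V}} p(\mathbf{U})\, p(\mathbf{V})\, p(\mathbf{Q} \mid \mathbf{U}, \mathbf{V}). \tag{5.4}$$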
It turns out that the learning procedure corresponds to the following weighted
regularized matrix factorization:

$$\mathbf{U}^*, \mathbf{V}^* = \arg\min_{\mathbf{U}, \mathbf{V}} \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{M} I_{ij} \left( Q_{ij} - \mathbf{u}_i^T \mathbf{v}_j \right)^2 + \frac{\lambda_U}{2} \sum_{i=1}^{N} \|\mathbf{u}_i\|_2^2 + \frac{\lambda_V}{2} \sum_{j=1}^{M} \|\mathbf{v}_j\|_2^2, \tag{5.5}$$

where $\lambda_U = (\tau_Q \sigma_U^2)^{-1}$ and $\lambda_V = (\tau_Q \sigma_V^2)^{-1}$.
A local minimum of this non-convex optimization problem can be efficiently found by
stochastic gradient descent [14]. Alternatively, to avoid parameter tuning and
achieve automatic control of model complexity, we can also apply a full Bayesian
treatment using Markov chain Monte Carlo (MCMC) to obtain the posterior probability
distribution of the user latent social dimensions [110]. However, PMF does not
consider temporal factors and assumes consistent user behavior across different
time periods.
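
As an illustration of the stochastic gradient descent route for Eq (5.5), the sketch below fits the factors on the u5..u8 × v1..v5 block of Table 5.1 after normalizing the counts as in Eq (5.1). It is a minimal sketch under our own assumptions, not the implementation used in this work: the function names, the learning rate, the number of epochs, the regularization weights and the latent dimension D = 2 are all hypothetical, and the regularization gradient is applied once per visited entry, which is a common simplification of the exact gradient.

```python
import numpy as np


def pmf_objective(Q, I, U, V, lam_u, lam_v):
    """Regularized squared error of Eq. (5.5), restricted to visited entries."""
    residual = I * (Q - U.T @ V)
    return (0.5 * np.sum(residual ** 2)
            + 0.5 * lam_u * np.sum(U ** 2)
            + 0.5 * lam_v * np.sum(V ** 2))


def pmf_sgd(Q, I, D=2, lam_u=0.01, lam_v=0.01, lr=0.05, epochs=300, seed=0):
    """Fit U (D x N) and V (D x M) by SGD over the visited (i, j) pairs."""
    rng = np.random.default_rng(seed)
    N, M = Q.shape
    U = 0.1 * rng.standard_normal((D, N))
    V = 0.1 * rng.standard_normal((D, M))
    observed = np.argwhere(I > 0)
    for _ in range(epochs):
        rng.shuffle(observed)  # visit the observed pairs in random order
        for i, j in observed:
            u_i = U[:, i].copy()
            err = Q[i, j] - u_i @ V[:, j]
            # Assumed simplification: regularizer applied per visited entry.
            U[:, i] += lr * (err * V[:, j] - lam_u * u_i)
            V[:, j] += lr * (err * u_i - lam_v * V[:, j])
    return U, V


# Counts of users u5..u8 at venues v1..v5, taken from Table 5.1.
counts = np.array([[1, 11, 5, 3, 1],
                   [3, 9, 0, 3, 2],
                   [7, 1, 1, 0, 1],
                   [2, 4, 3, 3, 4]], dtype=float)
Q = counts / counts.sum(axis=1, keepdims=True)  # Eq. (5.1)
I = (counts > 0).astype(float)                  # indicator I_ij
U, V = pmf_sgd(Q, I)
```

Comparing `pmf_objective(Q, I, U, V, 0.01, 0.01)` with its value for all-zero factors gives a quick check that the updates actually reduce the objective.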
5.5.2 Tensor Factorization Model
The previous approach assumes that visiting preferences are fixed over time.
However, time factors are strong drives which inherently direct users’
movements, and users’ visiting behaviors exhibit significantly different temporal
patterns in the real world [33, 25]. The visiting preferences are affected by two
temporal aspects. First, users visit different venues at different times of the
day. For example, people often visit food courts or restaurants during meal times
and watch movies in the evening on Fridays and weekends. Second, users exhibit
different lifestyles on weekdays and weekends. Noulas et al. reported drastic
differences among the types of venues visited on weekdays and weekends [97]. To
bring in the time factors, we employ probabilistic tensor factorization (PTF) to
model the time-evolving preferences [148]. With the introduction of time factors,
the user × venue two-dimensional matrix is converted into the
user × venue × time three-dimensional tensor. We split users’ visiting times into
eight periods: {morning (5am − 11am), afternoon (12pm − 6pm), evening
(7pm − 11pm), night (12am − 4am)} × {weekday, weekend}.
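
The eight-period split can be sketched as a small lookup. The following Python snippet is illustrative only and makes several assumptions: each range is taken to cover whole hours (for example, morning covers 5:00 to 11:59), Saturday and Sunday are treated as the weekend, and the index ordering and the name `time_period` are our own choices.

```python
from datetime import datetime

DAY_PARTS = ("morning", "afternoon", "evening", "night")
DAY_TYPES = ("weekday", "weekend")


def time_period(timestamp):
    """Return (index, label) of the period a check-in timestamp falls into."""
    hour = timestamp.hour
    if 5 <= hour <= 11:
        part = 0   # morning
    elif 12 <= hour <= 18:
        part = 1   # afternoon
    elif 19 <= hour <= 23:
        part = 2   # evening
    else:
        part = 3   # night, hours 0 to 4
    day_type = 1 if timestamp.weekday() >= 5 else 0  # Saturday=5, Sunday=6
    index = day_type * len(DAY_PARTS) + part         # 0..7
    return index, f"{DAY_TYPES[day_type]} {DAY_PARTS[part]}"
```

The returned index can serve as the time coordinate k when check-ins are accumulated into the user × venue × time tensor.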
Extending the relational data in the matrix factorization model, let
$\mathcal{Q} \in \mathbb{R}^{N \times M \times T}$ be the user × venue × time
tensor, where $Q^k_{ij}$ is the preference of user i towards venue j at time k and
can be computed based on the number of times i visits j at k as follows:

$$Q^k_{ij} = \frac{c^k(i, j)}{\sum_{j'=1}^{M} c^k(i, j')}, \quad k = 1, 2, \cdots, T, \tag{5.6}$$
where $c^k(i, j)$ is the number of times user i visits venue j at time k. Extending
the idea of PMF, we can approximate $Q^k_{ij}$ with the inner product of three
D-dimensional vectors:

$$Q^k_{ij} \approx \langle \mathbf{u}_i, \mathbf{v}_j, \mathbf{t}_k \rangle = \sum_{d=1}^{D} U_{di} V_{dj} T_{dk}, \tag{5.7}$$
where $\mathbf{t}_k$ is the additional latent feature vector for the kth time
factor. Intuitively, Eq (5.7) makes the visiting preferences depend not only on how
similar a user’s preferences and a venue’s preferences are, but also on how much
these preferences match the current crowd behaviors, which are reflected by the
time factors. We then extend the conditional probability of the observed
preferences as:

$$p(\mathcal{Q} \mid \mathbf{U}, \mathbf{V}, \mathbf{T}) = \prod_{i=1}^{N} \prod_{j=1}^{M} \prod_{k=1}^{T} \left[ \mathcal{N}(Q^k_{ij} \mid \langle \mathbf{u}_i, \mathbf{v}_j, \mathbf{t}_k \rangle, \tau_Q^{-1}) \right]^{I^k_{ij}}. \tag{5.8}$$
To avoid overfitting, we similarly impose zero-mean, independent Gaussian
priors on the user and venue latent vectors as before. Following [148], we assume
that the time factors change smoothly over time and that each depends only on its
immediate predecessor, i.e. the Markov property holds. Thus, the conditional prior
for $\mathbf{T}$ and the prior for the initial time feature vector $\mathbf{t}_0$
are defined as:

$$P(\mathbf{t}_k) = \mathcal{N}(\mathbf{t}_{k-1}, \sigma_T^2 \mathbf{I}), \quad P(\mathbf{t}_0) = \mathcal{N}(\boldsymbol{\mu}_T, \sigma_0^2 \mathbf{I}). \tag{5.9}$$
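We can maximize the log-posterior over $\mathbf{U}$, $\mathbf{V}$, $\mathbf{T}$ as
follows:

$$\mathbf{U}^*, \mathbf{V}^*, \mathbf{T}^* = \arg\max_{\mathbf{U}, \mathbf{V}, \mathbf{T}} p(\mathbf{U}, \mathbf{V}, \mathbf{T} \mid \mathcal{Q}) = \arg\max_{\mathbf{U}, \mathbf{V}, \mathbf{T}} p(\mathbf{U}, \mathbf{V}, \mathbf{T})\, p(\mathcal{Q} \mid \mathbf{U}, \mathbf{V}, \mathbf{T}). \tag{5.10}$$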
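With the independence assumption, after mathematical derivations, the optimization
problem becomes:

$$
\begin{aligned}
\mathbf{U}^*, \mathbf{V}^*, \mathbf{T}^* = \arg\min_{\mathbf{U}, \mathbf{V}, \mathbf{T}} \;
& \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{M} \sum_{k=1}^{T} I^k_{ij} \left( Q^k_{ij} - \langle \mathbf{u}_i, \mathbf{v}_j, \mathbf{t}_k \rangle \right)^2
+ \frac{\lambda_U}{2} \sum_{i=1}^{N} \|\mathbf{u}_i\|_2^2
+ \frac{\lambda_V}{2} \sum_{j=1}^{M} \|\mathbf{v}_j\|_2^2 \\
& + \frac{\lambda_T}{2} \sum_{k=1}^{T} \|\mathbf{t}_k - \mathbf{t}_{k-1}\|_2^2
+ \frac{\lambda_0}{2} \|\mathbf{t}_0 - \boldsymbol{\mu}_T\|_2^2,
\end{aligned} \tag{5.11}
$$

where $\lambda_U = (\tau_Q \sigma_U^2)^{-1}$, $\lambda_V = (\tau_Q \sigma_V^2)^{-1}$,
$\lambda_T = (\tau_Q \sigma_T^2)^{-1}$ and $\lambda_0 = (\tau_Q \sigma_0^2)^{-1}$. We
can adopt the same stochastic gradient descent approach to find local minima of
this non-convex optimization problem. Similarly, we can also adopt the Bayesian
treatment and use MCMC methods to obtain the posterior distribution of users’
latent social dimensions [148]. However, PTF does not take users’ social relations
and venue similarities into consideration.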
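
The two building blocks of the tensor model, the normalized preference tensor of Eq (5.6) and the three-way inner product of Eq (5.7), can be sketched with NumPy as follows. The function names are hypothetical, and setting the preference to zero when a user has no check-ins in a period (where Eq (5.6) would divide by zero) is our assumption.

```python
import numpy as np


def preference_tensor(counts):
    """Eq. (5.6): counts[i, j, k] is the number of check-ins of user i at
    venue j in period k; normalize over venues for each (user, period)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)  # sum over venues j'
    # Assumed convention: all-zero (user, period) slices stay zero.
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)


def predict_preferences(U, V, T):
    """Eq. (5.7): Q_hat[i, j, k] = sum_d U[d, i] * V[d, j] * T[d, k]."""
    return np.einsum("di,dj,dk->ijk", U, V, T)
```

Here `U`, `V` and `T` are stored column-wise (D × N, D × M and D × T), matching the definitions in the text.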
5.5.3 Regularized Tensor Factorization
The formulation in Section 5.5.2 has considered the temporal variations of users’
visiting behaviours. In this section, we further extend the previous formulation by
simultaneously considering the social ties and inter-venue similarities in LBSNs in
order to achieve more accurate extraction of users’ social dimensions.
5.5.3.1 Social Relation
Intuitively, “friends” tend to have similar behaviours and preferences. For example,
a group of friends may often visit the same restaurants for gatherings or hang out
to watch movies together. A user may also visit certain places which are
recommended by his/her friends. These suggest that it is useful to consider social
ties to bring “friends” closer to each other in the latent space. Following [150],
we consider two factors when relating users in LBSNs. First, friends who have more
common friends may have better trust in each other’s recommendations, thus we
consider the overlapping levels of their friend sets. Second, friends sharing more
check-ins should have more similar tastes, thus we consider the overlapping levels
of their check-in sets.
We define the user similarity as follows. Given the user set
$U = \{u_1, \cdots, u_N\}$, their friend sets $\{F_1, \cdots, F_N\}$ and their
check-in sets $\{V_1, \cdots, V_N\}$, we introduce $\alpha \in [0, 1]$ as a tuning
parameter and define the user similarity matrix
$\mathbf{R} \in \mathbb{R}^{N \times N}$, where $R_{ri}$ is computed as follows:

$$
R_{ri} =
\begin{cases}
\alpha \cdot \dfrac{|F_r \cap F_i|}{|F_r \cup F_i|} + (1 - \alpha) \cdot \dfrac{|V_r \cap V_i|}{|V_r \cup V_i|} & \text{if } u_r \in F_i, \\
0 & \text{otherwise.}
\end{cases} \tag{5.12}
$$
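
Eq (5.12) is a weighted sum of two Jaccard coefficients, which the following Python sketch spells out. The dictionary-based data layout, the function names and the convention that the Jaccard coefficient of two empty sets is zero are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard coefficient |a ∩ b| / |a ∪ b|; assumed 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def user_similarity(r, i, friends, checkins, alpha=0.5):
    """R_ri of Eq. (5.12); `friends` and `checkins` map a user id to a set."""
    if r not in friends[i]:
        return 0.0  # only befriended pairs receive a non-zero similarity
    return (alpha * jaccard(friends[r], friends[i])
            + (1 - alpha) * jaccard(checkins[r], checkins[i]))
```

With `alpha = 1` only the overlap of friend sets counts, and with `alpha = 0` only the overlap of visited venues counts.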
5.5.3.2 Venue Similarity
Venues have different social functions. Other than categories, venues are also
enriched with users’ comments about the activities, reviews and descriptions. In
Foursquare, users are free to write tips, which may cover a variety of topics
at venues. For example, a tip left at an art museum may recommend a special
exhibition or give positive/negative comments on the museum environment. We
argue that tips sometimes provide better evidence than categories to describe
venues. For example, during the examination reading weeks, venues such as
libraries, school canteens, study rooms and Starbucks in universities, though they
belong to different categories, tend to have similar social functions: places for
preparing for exams. Thus, we seek to model venue similarities using the associated
tips.
We aggregate all tips of a venue and perform the following steps to filter the
noise and reduce the feature space:
• We tokenize text descriptions and put them into lowercase.
• We remove all the non-alphanumeric characters.
• We remove rare terms (terms with frequency < 5).
Then, the text descriptions for each venue $v_j$ are represented as a
word-frequency vector $\mathbf{w}_j = [w_j(1) \cdots w_j(Z)]$, where $w_j(b)$
denotes the frequency of term b in the text descriptions of venue $v_j$ and Z is
the vocabulary size. We then define the venue similarity matrix
$\mathbf{B} \in \mathbb{R}^{M \times M}$, where
$B_{jl} = \dfrac{\mathbf{w}_j \cdot \mathbf{w}_l}{|\mathbf{w}_j| \cdot |\mathbf{w}_l|}$.
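
The preprocessing steps and the cosine similarity above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: non-alphanumeric characters are replaced by spaces (so punctuation acts as a token separator), term frequency is counted over the whole corpus when filtering rare terms, venues with an empty vector receive zero similarity, and the function names are hypothetical.

```python
import re
from collections import Counter

import numpy as np


def tokenize(text):
    """Lowercase the text and keep only runs of ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()


def venue_similarity(tips_per_venue, min_term_freq=5):
    """Return (B, vocab): the cosine similarity matrix of the venues'
    word-frequency vectors and the retained vocabulary."""
    docs = [Counter(tok for tip in tips for tok in tokenize(tip))
            for tips in tips_per_venue]
    corpus = Counter()
    for doc in docs:
        corpus.update(doc)
    # Drop rare terms; corpus-wide frequency is an assumed reading of the rule.
    vocab = sorted(term for term, freq in corpus.items() if freq >= min_term_freq)
    W = np.array([[doc[term] for term in vocab] for doc in docs], dtype=float)
    norms = np.linalg.norm(W, axis=1)
    denom = np.outer(norms, norms)
    B = np.divide(W @ W.T, denom, out=np.zeros_like(denom), where=denom > 0)
    return B, vocab
```

Each inner list of `tips_per_venue` holds the tips of one venue, so row j of `B` compares venue j with every other venue.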
5.5.3.3 The Complete Formulation
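With the introduction of user relations and venue similarities, we now present the
complete formulation. Let $\mathbf{S} \in \mathbb{R}^{D \times N}$ be the auxiliary
user factor feature matrix and $\mathbf{D} \in \mathbb{R}^{D \times M}$ be the
auxiliary venue factor feature matrix. We have the conditional distributions of
user and venue similarities as follows:

$$p(\mathbf{R} \mid \mathbf{S}, \mathbf{U}) = \prod_{r=1}^{N} \prod_{i=1}^{N} \left[ \mathcal{N}(R_{ri} \mid \mathbf{s}_r^T \mathbf{u}_i, \tau_R^{-1}) \right]^{I^R_{ri}}, \tag{5.13}$$

$$p(\mathbf{B} \mid \mathbf{V}, \mathbf{D}) = \prod_{j=1}^{M} \prod_{l=1}^{M} \left[ \mathcal{N}(B_{jl} \mid \mathbf{v}_j^T \mathbf{d}_l, \tau_B^{-1}) \right]^{I^B_{jl}}. \tag{5.14}$$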

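As before, we introduce zero-mean, independent Gaussian priors on the two
introduced feature matrices. Assuming user similarities, venue similarities and
user visiting preferences are independently distributed conditioned on the latent
factors, we may estimate $\mathbf{U}, \mathbf{V}, \mathbf{S}, \mathbf{D}, \mathbf{T}$
by maximizing the logarithm of the posterior distribution of the latent factors
given the observed similarities and preferences:

$$
\begin{aligned}
\mathbf{U}^*, \mathbf{V}^*, \mathbf{S}^*, \mathbf{D}^*, \mathbf{T}^*
&= \arg\max_{\mathbf{U}, \mathbf{V}, \mathbf{S}, \mathbf{D}, \mathbf{T}} p(\mathbf{U}, \mathbf{V}, \mathbf{S}, \mathbf{D}, \mathbf{T} \mid \mathbf{R}, \mathbf{B}, \mathcal{Q}) \\
&= \arg\max_{\mathbf{U}, \mathbf{V}, \mathbf{S}, \mathbf{D}, \mathbf{T}} p(\mathbf{U}, \mathbf{V}, \mathbf{S}, \mathbf{D}, \mathbf{T})\, p(\mathbf{R}, \mathbf{B}, \mathcal{Q} \mid \mathbf{U}, \mathbf{V}, \mathbf{S}, \mathbf{D}, \mathbf{T}).
\end{aligned} \tag{5.15}
$$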
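Maximizing the log posterior with respect to $\mathbf{U}, \mathbf{V}, \mathbf{S}, \mathbf{D}, \mathbf{T}$
is equivalent to minimizing the following objective function with quadratic
regularization terms:

$$
\begin{aligned}
L(\mathbf{U}, \mathbf{V}, \mathbf{S}, \mathbf{D}, \mathbf{T}) ={}
& \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{M} \sum_{k=1}^{T} I^k_{ij} \left[ Q^k_{ij} - \langle \mathbf{u}_i, \mathbf{v}_j, \mathbf{t}_k \rangle \right]^2 \\
& + \frac{\lambda_R}{2} \sum_{r=1}^{N} \sum_{i=1}^{N} I^R_{ri} \left[ R_{ri} - \mathbf{s}_r^T \mathbf{u}_i \right]^2
+ \frac{\lambda_B}{2} \sum_{j=1}^{M} \sum_{l=1}^{M} I^B_{jl} \left[ B_{jl} - \mathbf{v}_j^T \mathbf{d}_l \right]^2 \\
& + \frac{\lambda_D}{2} \sum_{l=1}^{M} \|\mathbf{d}_l\|_2^2
+ \frac{\lambda_T}{2} \sum_{k=1}^{T} \|\mathbf{t}_k - \mathbf{t}_{k-1}\|_2^2
+ \frac{\lambda_0}{2} \|\mathbf{t}_0 - \boldsymbol{\mu}_T\|_2^2 \\
& + \frac{\lambda_U}{2} \sum_{i=1}^{N} \|\mathbf{u}_i\|_2^2
+ \frac{\lambda_S}{2} \sum_{r=1}^{N} \|\mathbf{s}_r\|_2^2
+ \frac{\lambda_V}{2} \sum_{j=1}^{M} \|\mathbf{v}_j\|_2^2,
\end{aligned} \tag{5.16}
$$

where $\lambda_R = \tau_R / \tau_Q$, $\lambda_B = \tau_B / \tau_Q$,
$\lambda_D = (\tau_Q \sigma_D^2)^{-1}$, $\lambda_T = (\tau_Q \sigma_T^2)^{-1}$,
$\lambda_0 = (\tau_Q \sigma_0^2)^{-1}$, $\lambda_U = (\tau_Q \sigma_U^2)^{-1}$,
$\lambda_S = (\tau_Q \sigma_S^2)^{-1}$ and $\lambda_V = (\tau_Q \sigma_V^2)^{-1}$.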

The objective function is non-convex, and we may only be able to find a local
minimum by iteratively updating the latent feature vectors using methods such as
stochastic gradient descent. One issue with this approach is parameter tuning.
Since there are eight regularization parameters, the usual approach to parameter
selection, such as cross-validation, is infeasible even for a problem of moderate
size. Here, in the spirit of [148], we seek a full Bayesian treatment to average
out the hyperparameters in the model, which both helps to alleviate overfitting and
saves us from the painful problem of parameter tuning.
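
To show how the nine terms of Eq (5.16) and its eight regularization weights fit together, the following NumPy sketch evaluates the objective for given factors. It only evaluates the loss and is not a training procedure; the function name, the argument layout and the dictionary of weights keyed by "R", "B", "D", "T", "0", "U", "S" and "V" are hypothetical choices made for illustration.

```python
import numpy as np


def bptfslr_objective(Q, IQ, R, IR, B, IB, U, V, S, Dm, T, t0, mu_T, lam):
    """Evaluate Eq. (5.16).

    Q, IQ: (N, M, K) preference tensor and its indicator tensor.
    R, IR: (N, N) user similarities and indicators.
    B, IB: (M, M) venue similarities and indicators.
    U, S: (D, N); V, Dm: (D, M); T: (D, K) with columns t_1..t_K.
    t0, mu_T: (D,) initial time vector and its prior mean.
    lam: dict of the eight regularization weights.
    """
    pred = np.einsum("di,dj,dk->ijk", U, V, T)               # <u_i, v_j, t_k>
    loss = 0.5 * np.sum(IQ * (Q - pred) ** 2)
    loss += 0.5 * lam["R"] * np.sum(IR * (R - S.T @ U) ** 2)   # s_r^T u_i
    loss += 0.5 * lam["B"] * np.sum(IB * (B - V.T @ Dm) ** 2)  # v_j^T d_l
    loss += 0.5 * lam["D"] * np.sum(Dm ** 2)
    T_prev = np.concatenate([t0[:, None], T[:, :-1]], axis=1)  # t_0..t_{K-1}
    loss += 0.5 * lam["T"] * np.sum((T - T_prev) ** 2)
    loss += 0.5 * lam["0"] * np.sum((t0 - mu_T) ** 2)
    loss += 0.5 * lam["U"] * np.sum(U ** 2)
    loss += 0.5 * lam["S"] * np.sum(S ** 2)
    loss += 0.5 * lam["V"] * np.sum(V ** 2)
    return loss
```

Selecting the eight entries of `lam` by grid search would require evaluating an eight-dimensional grid, which illustrates why cross-validation quickly becomes impractical here.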
5.5.3.4 Learning By Markov Chain Monte Carlo
The fully Bayesian treatment integrates out all model parameters and
hyperparameters and arrives at a predictive distribution of future observations,
given the previously observed data. Since this predictive distribution is obtained
by averaging all models in the model space specified by the priors, it is less
likely to overfit the given set of observations. However, when integrating over
parameters, one often cannot obtain an analytical solution, thus we resort to
sampling-based approximation methods, in particular, MCMC [4].

Figure 5.2: The graphical model of probabilistic tensor factorization with user
regularization: R and location regularization: B (BPTFSLR).
To generate user similarity $R_{ri}$, venue similarity $B_{jl}$ and user visiting
preference $Q^k_{ij}$, we first sample $\mathbf{U}, \mathbf{V}, \mathbf{S}, \mathbf{D}$
using Eq (5.18), and then sample $\mathbf{T}$ using Eq (5.19). $R_{ri}$, $B_{jl}$ and
$Q^k_{ij}$ can then be generated according to Eq (5.13), Eq (5.14) and Eq (5.8)
respectively. Figure 5.2 shows the graphical model of the entire process.
The key ingredient of the fully Bayesian treatment is to view the
hyperparameters $\tau_Q$, $\tau_R$, $\tau_B$, $\Theta_U \equiv \{\mu_U, \Lambda_U\}$,
$\Theta_V \equiv \{\mu_V, \Lambda_V\}$, $\Theta_S \equiv \{\mu_S, \Lambda_S\}$,
$\Theta_D \equiv \{\mu_D, \Lambda_D\}$ and $\Theta_T \equiv \{\mu_T, \Lambda_T\}$ as
random variables, as shown in Figure 5.2. We choose the prior distributions for the
hyperparameters as follows:
