
12 DATA DISSEMINATION IN OPPORTUNISTIC NETWORKS
Chiara Boldrini and Andrea Passarella

ABSTRACT
Among the alternatives to pure general-purpose MANETs, one of the most promising approaches is that of opportunistic networks [2]. Unlike MANETs, opportunistic networks are designed to work properly even when the nodes of the network
move. More specifically, opportunistic networks reverse the approach of MANETs,
and what was before an accident to avoid (the mobility of nodes) now becomes an
opportunity for communications. In fact, in an opportunistic network messages are
exchanged between nodes when they come into contact, creating a multi-hop path
from the source to the destination of the message.
One of the most appealing applications to build upon an opportunistic network is
data dissemination. Conceptually, data dissemination systems can be seen as variations of the publish/subscribe paradigm: publisher nodes generate content items and
inject them into the network, and subscriber nodes declare their interest in receiving certain types of content (e.g., sports news, radio podcasts, blog entries) and strive to obtain it in some way. Nodes can usually be publishers and subscribers at the same
time. The main difference between message forwarding and content dissemination is
that the source and destination of a message are typically well known when routing
a message (and clearly listed in the header of the message itself), while, in content
dissemination, content generators and content consumers might well be unaware of
each other. Publish/subscribe systems have gained new momentum thanks to the Web 2.0 User Generated Content (UGC) paradigm, with users generating their own content
Mobile Ad Hoc Networking: Cutting Edge Directions, Second Edition. Edited by Stefano Basagni,
Marco Conti, Silvia Giordano, and Ivan Stojmenovic.
© 2013 by The Institute of Electrical and Electronics Engineers, Inc. Published 2013 by John Wiley & Sons, Inc.

and uploading it to popular platforms like Blogger, YouTube, or Flickr. The application of the UGC paradigm to opportunistic networks is particularly appealing. A future in which users generate content items on the fly while moving, and distribute this content to the users in their proximity, can realistically be envisioned for the coming years. In order to make this future a reality, new strategies for disseminating content
items must be designed, while at the same time accounting for a wise usage of network
resources, which can be easily saturated in this scenario.
In this chapter we discuss the challenges connected with content dissemination
in an opportunistic network and the solutions proposed in the literature. We classify
current proposals that address the problem of content dissemination into six main
categories, based on the specific problem targeted and the type of solution proposed.
Then, we present and discuss the work that we believe best summarizes the main
features of each category.

12.1 INTRODUCTION

Opportunistic networks represent one of the most interesting evolutions of traditional
Mobile Ad Hoc NETworks (MANET). The typical MANET scenario comprises mobile users with their wireless-enabled mobile devices that cooperate in an ad hoc
fashion to support communication without relying on any preexisting networking
infrastructure. Specifically, in MANETs the nodes of the network become active entities and act as substitutes for routers, switches, and so on, in forwarding
messages. Thus, messages are delivered following a multihop path over the nodes
of the MANET itself. Despite the huge research activity that they have generated,
MANETs were far from being widely adopted. The main drawback of MANET research was its lack of realism [1]. From a practical standpoint, real
small-scale implementations have long been disregarded, and real users have not been involved in MANET evaluations. From a research standpoint, MANET results were undermined by excessively unrealistic assumptions. The most significant among these is
the intolerance to temporary network partitions, which actually may be very common
in a network where users move and where communication devices are expected to
run out of battery, or to be out of reach, very often.
Among the alternatives to pure general-purpose MANETs, one of the most promising approaches is that of opportunistic networks [2]. Unlike MANETs,
opportunistic networks are designed to work properly even when the nodes of the
network move. More specifically, opportunistic networks reverse the approach of
MANETs, and what was before an accident to avoid (the mobility of nodes) now
becomes an opportunity for communications. In fact, in an opportunistic network,
messages are exchanged between nodes when they come into contact, creating a multihop path from the source to the destination of the message. The exploitation of direct
contacts between nodes for message forwarding introduces, as a side effect, additional
delays in the message delivery process. In fact, user mobility cannot be engineered:
Node contacts are usually neither controllable nor scheduled, and networking protocols can only wait for them to occur. For this reason, opportunistic networks fall
into the category of delay-tolerant networks [3]. For many common applications, this additional latency may be a tolerable price for the ubiquity provided by the opportunistic network. Web 2.0 content sharing services, for example, are already delay-tolerant
in their nature, because they rely on an asynchronous communication paradigm.
One of the most appealing applications to build on top of an opportunistic network
is data dissemination. Conceptually, data dissemination systems can be seen as variations of the publish/subscribe paradigm [4]: Publisher nodes generate content items and
inject them into the network, and subscriber nodes declare their interest in receiving
certain types of content (e.g., sports news, radio podcasts, blog entries) and strive to obtain it in some way. Nodes can usually be publishers and subscribers at the same
time. The main difference between message forwarding and content dissemination is
that the source and destination of a message are typically well known when routing it
(and clearly stated in the header of the message itself), while, in content dissemination, content generators and content consumers might well be unaware of each other.

Publish/subscribe systems have gained new momentum thanks to the Web 2.0 User
Generated Content (UGC) paradigm, with users generating their own content and uploading it to popular platforms like Blogger, YouTube, or Flickr. The application of the UGC paradigm to opportunistic networks is particularly appealing. A future in which users generate content items on the fly while moving, as well as distribute this content to the users in their proximity, can realistically be envisioned for the coming
years. In order to make this future a reality, new strategies for disseminating content
items must be designed, while at the same time accounting for a wise usage of network
resources, which can be easily saturated in this scenario.
12.1.1 Motivation and Taxonomy

In this chapter we discuss the challenges connected with content dissemination in
an opportunistic network and the solutions proposed in the literature. We classify
current proposals that address the problem of content dissemination into six main
categories, based on the specific problem targeted and the type of solution proposed.
For each category, we present and discuss the work that we believe best summarizes
the approach proposed.
We start in Section 12.2 by discussing the initial work in the area, which ignited the
research on this topic. To the best of our knowledge, the PodNet Project [5] was the
first initiative to explicitly address the problem of disseminating content in a network
made up of users’ mobile devices in an opportunistic fashion. Within the PodNet
project, heuristics were defined in order to drive the selection of content items to
be cached based on the popularity of the content itself. Such heuristics enforced cooperative caching among nodes and were shown to clearly outperform the simple
strategy in which each node only keeps the content it is directly interested in.
The second category of solutions is based on the exploitation of the social characteristics of user behavior (Section 12.3). In this case, heuristics are proposed that
take into account the social dimension of users—that is, the fact that people belonging to the same community tend to spend significant time together and to be willing to cooperate with each other. We take ContentPlace [6] as representative of this
approach because it was one of the first fully-fledged solutions to incorporate the idea

of communities with a systematic approach to data dissemination.



The third category of content dissemination approaches brings the ideas of publish/subscribe overlays into the realm of opportunistic networks (Section 12.4). Publish/subscribe systems are based on content-centric overlays in which broker nodes
bring together the needs of both content publishers and subscribers by matching the
content generated by publishers with the interests of subscribers and by delivering the
content to them. How the publish/subscribe ideas can be adapted to an opportunistic
environment is well exemplified by Yoneki et al. [7], in which the pub/sub overlay is built by exploiting knowledge of the social behavior of users.
Protocols belonging to the fourth category, discussed in Section 12.5, reverse the
approach of heuristic-based protocols: they depart from the local optimization problems at the basis of heuristic approaches in order to find a globally optimal solution to the content dissemination problem. Such a global solution, typically infeasible in real scenarios, is then approximated using a local, distributed strategy. To
the best of our knowledge, the work by Reich and Chaintreau [8] has been the first to
provide a comprehensive analysis of a global optimization problem applied to content
dissemination in opportunistic networks.
The fifth category (Section 12.6) is characterized by the exploitation of a broadband
wireless infrastructure in conjunction with the opportunistic network of user devices.
The idea here is to partially relieve the infrastructure of the burden of disseminating content by exploiting opportunistic content dissemination among users. We choose
the work by Whitbeck et al. [9] as representative of this category because of the tight
interaction that it proposes between the infrastructure and the opportunistic network.
Finally, the sixth category hosts proposals that tackle the dissemination problem
using an analogy with unstructured p2p systems (Section 12.7). To the best of our
knowledge, the work by Zhou et al. [10] is one of the most significant in this area,
which formulates the dissemination problem by means of p2p universal swarms and
provides solid theoretical results regarding the advantage of cooperative strategies over greedy approaches.
12.2 INITIAL IDEAS: PODNET
Initial research efforts on content dissemination in opportunistic networks were made
within the PodNet project [5]. The aim of the PodNet project is to develop a content distribution system built upon the mobile users that opportunistically participate in the network. The scenario considered by PodNet is that of one or more
Access Points that are able to retrieve content from the Internet and to send it to
those nodes that are in radio range. Coverage provided by the Access Points might be limited; thus, content items are also disseminated by the mobile users of the network
in an opportunistic fashion. In addition, content may also be generated by the users
themselves, according to the Web 2.0 User Generated Content paradigm.
12.2.1 Content Organization

PodNet borrows the representation of content as a set of channels from Web syndication [11,12]. Syndication provides a structured way for making content available



[Figure 12.1 PodNet message exchange: node A requests node B's discovery channel and receives it; A then requests entries of its subscribed channels, followed by entries to be stored in its public cache, and B returns the requested entries.]

on the Internet. Content items are organized into channels, based on the information
they carry. Each item is an entry of the channel. For example, a blog can be classified
as a channel, and each new post becomes an entry of the channel. Upon generation
of a new channel, the content producer generates a feed (which is an XML file) that
lists the available entries for the channel. Depending on the type of content that it
generates, each channel can have different requirements. For breaking news channels,
the freshness of the entries is most important. In contrast, music channel entries remain interesting for months or years after their original publication.
Entries are exchanged by nodes during pairwise contacts. Besides the user-defined
channels described above, a discovery channel is defined in the PodNet system as a
control channel that lists the channels cached by the node itself. Consider two nodes
that establish a contact (Figure 12.1). The one that is interested in retrieving new
content items asks the other for its discovery channel. From the information provided through the discovery channel, the node learns which entries are available on the peering node and decides which entries to download.
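The exchange described above can be sketched in a few lines. The `Node` class, the representation of a cache as channel-to-entry-set mappings, and the function names are hypothetical illustrations, not the actual PodNet data structures:

```python
class Node:
    """Hypothetical node: a set of subscribed channels plus a cache
    mapping channel name -> set of entry identifiers."""
    def __init__(self, subscriptions, cache):
        self.subscriptions = set(subscriptions)
        self.cache = {ch: set(entries) for ch, entries in cache.items()}

def discovery_channel(node):
    """The control channel: a listing of the channels the node caches."""
    return sorted(node.cache)

def pull_entries(requester, provider):
    """One pairwise contact: the requester reads the provider's discovery
    channel, then fetches entries of its subscribed channels it lacks."""
    fetched = {}
    for channel in discovery_channel(provider):
        if channel in requester.subscriptions:
            missing = provider.cache[channel] - requester.cache.get(channel, set())
            if missing:
                requester.cache.setdefault(channel, set()).update(missing)
                fetched[channel] = missing
    return fetched
```

In a full implementation a second round would follow for the public cache, governed by the dissemination strategies of the next section.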
12.2.2 Content-Centric Dissemination Strategies

In the simplest case, nodes only download and store those content items that are interesting to them. There is no intentional content dissemination in this case, but the
net result is that content items happen to travel across the network based on the interests of users. This can be considered a baseline, unintentional content dissemination
process. Lenders et al. [13] evaluate the performance improvement obtained when nodes not only download the entries they are interested in, but also cooperate with other nodes by making the unused portion of their cache available to entries that

might be of interest to other nodes. The cache is thus split into two spaces: a private space, reserved for content items of subscribed channels, and a public cache (Figure 12.2).



[Figure 12.2 Cache and content organization: the private cache holds entries of the subscribed channels (channels 1–3), while the public cache hosts entries of other channels (channels 5–7) on behalf of other nodes.]

More specifically, Lenders et al. investigate the problem of which entries should be
hosted in the public cache in order to best serve the other nodes.
The popularity of channels among mobile users is assumed to be heterogeneous: Some channels have more subscribers, others have fewer.
Statistics on the popularity of channels among encountered nodes are collected by
sharing personal interests when encountering other peers. The popularity of channels
is at the basis of the PodNet dissemination strategies. In fact, each policy in PodNet
is a heuristic that is a function of the channel popularity. Four strategies are defined.
With the Most Solicited strategy, entries of the most popular channels are requested

first. On the contrary, entries belonging to the least popular channels are requested
first when using the Least Solicited strategy. A probabilistic approach is used by the
Inverse proportional strategy, in which entries are requested with a probability that is
inversely proportional to the popularity of the channel they belong to. Finally, with
the Uniform strategy, items are requested at random and channel popularity is not
taken into consideration.
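A minimal sketch of the four policies, assuming popularity is tracked as a per-channel subscriber count; the function and strategy names are illustrative, not taken from the PodNet code:

```python
import random

def request_order(channels, popularity, strategy, seed=0):
    """Order in which a node requests entries from a peer, as a function
    of channel popularity (number of subscribers observed so far)."""
    rng = random.Random(seed)
    channels = list(channels)
    if strategy == "most_solicited":        # most popular channels first
        return sorted(channels, key=lambda c: -popularity[c])
    if strategy == "least_solicited":       # least popular channels first
        return sorted(channels, key=lambda c: popularity[c])
    if strategy == "uniform":               # popularity ignored entirely
        rng.shuffle(channels)
        return channels
    if strategy == "inverse_proportional":  # pick with prob. ~ 1/popularity
        order = []
        while channels:
            weights = [1.0 / popularity[c] for c in channels]
            pick = rng.choices(channels, weights=weights)[0]
            channels.remove(pick)
            order.append(pick)
        return order
    raise ValueError(f"unknown strategy: {strategy}")
```

For example, with popularity {news: 10, music: 3, blogs: 1}, Most Solicited yields [news, music, blogs] and Least Solicited the reverse.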
12.2.3 Performance Results

Lenders et al. [13] propose two metrics for the evaluation of the performance of
content dissemination. The freshness metric measures the age of entries at the time
a subscriber receives them. This metric is important, for example, when considering
the dissemination of news items. Freshness can be computed for individual channels,
as well as for the aggregate of all channels. The latter gives a global picture of the
overall capability of the protocol to deliver fresh items, whereas the former allows
the authors to evaluate the behavior of the policy depending on the specific channel
popularity. This is important because a good overall freshness could be obtained
by just disseminating those channels that are more popular, while letting the others
starve. Finally, dissemination policies are also ranked based on their fairness value.



Fairness is measured in terms of max–min fairness. A strategy is fair according to max–min fairness if the performance of a channel cannot be increased any further without decreasing the performance of a channel that already performs worse.
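One standard way to operationalize this definition (an illustration, not the paper's own procedure) is to compare the per-channel performance vectors sorted in ascending order: the lexicographically larger sorted vector belongs to the max–min-fairer allocation.

```python
def maxmin_fairer(perf_a, perf_b):
    """Compare two per-channel performance vectors under max-min fairness.
    The allocation whose ascending-sorted vector is lexicographically
    larger is the fairer one. Returns 'a', 'b', or 'tie'."""
    sa, sb = sorted(perf_a), sorted(perf_b)
    if sa > sb:
        return 'a'
    if sa < sb:
        return 'b'
    return 'tie'
```

With the same aggregate performance, an even split beats an allocation that starves one channel: [0.5, 0.5] is fairer than [0.9, 0.1].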
As far as simulations are concerned, the way user preferences are modeled is crucial. A Zipf-like distribution of requests for cached content has been highlighted
in the context of Web caching [14], and Internet RSS feeds do not seem to be an
exception [15]. Thus, the distribution of channel popularity is often assumed to follow
a Zipf’s law also in the context of opportunistic networks. When a Zipf’s law is used, the frequency fn at which the channel ranked nth in popularity is requested is given by fn ∼ 1/n^α [16]. The parameter α is positive by definition and allows for fine tuning of the Zipf distribution. More specifically, the greater the value of α, the more uneven the frequency of requests across channels with different popularity.
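The popularity model follows directly from the definition; the function names below are illustrative:

```python
import random

def zipf_frequencies(n_channels, alpha):
    """Normalized request frequencies f_n ~ 1 / n^alpha, where n is the
    popularity rank of the channel (1 = most popular)."""
    raw = [1.0 / rank ** alpha for rank in range(1, n_channels + 1)]
    total = sum(raw)
    return [f / total for f in raw]

def sample_request(freqs, rng=None):
    """Draw the rank of the channel targeted by one content request."""
    rng = rng or random.Random(0)
    return rng.choices(range(1, len(freqs) + 1), weights=freqs)[0]
```

Note how a larger α concentrates requests on the top-ranked channels, matching the remark above.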
Lenders et al. [17] evaluate the content dissemination policies by means of simulations using the random waypoint model to represent user mobility. This model
reproduces a very mixed network, where anybody can meet anybody. Despite
their simplicity, the heuristics defined for PodNet show the improvement provided
by cooperative caching. In fact, in all scenarios considered, the intentional dissemination strategies discussed above increase the freshness of the content seen by users
and, at the same time, are fairer than the baseline unintentional content dissemination. Among the proposed policies, the one that surprisingly performs best
overall is the Uniform policy. The reason is that the dissemination policies defined by
Lenders et al. affect only the public cache. The caching process for the private cache
is entirely driven by the subscriptions—that is, by the interests of the user owning
the device. Given that the most popular channels have more subscribers, the fraction of private caches allocated to the most popular channels is greater than the fraction occupied by the least popular ones. Thus, intuitively, the Uniform strategy for the public cache helps to increase the diversity of content items and gives a chance also to the least popular channels, eventually increasing fairness.
12.2.4 Take-Home Messages

PodNet has been the first work to tackle the problem of content dissemination in opportunistic networks. PodNet makes two main contributions. First, it has clearly shown the advantage of cooperative caching over a greedy behavior of the users. Second, borrowing the approach of Internet podcasting, PodNet has introduced a way of classifying items into channels, to which users express their interest through subscriptions.
The main limitation of PodNet is that the policies defined by Lenders et al. [13]
only focus on one of the two actors of content dissemination—that is, on the channels.
PodNet heuristics are totally content-centric: Each policy is exclusively a function of the popularity of the channel; individual user preferences and different user capabilities to disseminate messages are not considered at all. These strategies might work when users are well mixed and homogeneous, and items can easily travel from one side of the network to the other. However, when node movements are heterogeneous and
communities of nodes tend to cluster together, this approach can be quite limited [6].


12.3 SOCIAL-AWARE SCHEMES

The simple heuristics defined by PodNet highlighted the opportunities offered by
cooperative caching for disseminating public content items in opportunistic networks.
PodNet heuristics were totally content-centric: Each policy is exclusively a function of the popularity of the channel; individual user preferences and different user capabilities to disseminate messages were not considered at all. Thus, later works have
focused on the definition of more elaborate heuristics that could better exploit user
diversity in order to improve content dissemination. We refer to these heuristics as
user-centric, in contrast with PodNet content-centric strategies.
Opportunistic networks, especially as far as content dissemination is concerned,
are intrinsically networks of people, and sociality is what peculiarly distinguishes this kind of network from others. Thus, one of the principal
directions of user-centric heuristics is that of social awareness. People are social in

the sense that their movements are influenced by their relationships with other users
[18]. This suggests that it is possible, based on the analysis of social relations or
mobility patterns, to identify those nodes that are better fit to cache certain content
items. People are also social in the sense that their interests for specific content might
be correlated with the interests of those who are socially close to them [19], and also
in the sense that the degree to which they are willing to cooperate with others might
depend on the kind of social relationships that they share [20]. The class of social-aware dissemination protocols aims to exploit all these aspects in order to improve the performance of content dissemination.
The social-aware ContentPlace dissemination system [6,21] proposes a general
framework for designing content dissemination policies. The building block of this
framework is the utility function that is defined in order to quantify, using a heuristic
approach, the advantage of caching a certain content item or not. The utility function
is then used to solve, in a distributed way, an optimization problem. Thus, when two
nodes come into contact, they exchange a summary vector that lists the items in each
other’s cache, and then the items to be stored are selected among these listed items. If
the available memory on mobile devices were infinite, the best strategy would be to
cache whatever content is found. However, memory is a limited resource, and so is energy. Thus, the best dissemination strategy is the one that maximizes the benefit for
the system without breaking existing resource constraints. This is equivalent to solving
a multiconstrained knapsack problem like the one in equation (12.1), where k denotes
the kth item that the node can select, Uk its utility, cjk the percentage consumption
of resource j related to fetching and storing item k, m the number of considered
resources, and xk the problem’s variables (xk = 0 corresponds to not caching the
item, xk = 1 to caching it).


    max  Σk Uk xk
    s.t. Σk cjk xk ≤ 1,   j = 1, . . . , m
         xk ∈ {0, 1},     ∀k                                    (12.1)



When the number of managed resources (m) is small (which is quite reasonable), solving this problem is very fast from a computational standpoint [22]. Such a solution is therefore suitable for implementation on resource-constrained mobile devices.
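As a rough illustration of why this is tractable, a density-ordered greedy pass, one common cheap heuristic for multiconstrained knapsacks (reference [22] may use a different method), runs in O(n log n):

```python
def select_items(utilities, costs):
    """Greedy sketch for the knapsack of equation (12.1): take items in
    order of utility per unit aggregate cost, while every normalized
    resource budget (<= 1) still holds. Returns indices of cached items;
    a heuristic approximation, not the exact optimum."""
    m = len(costs[0])                  # number of managed resources
    used = [0.0] * m

    def density(k):
        return utilities[k] / (sum(costs[k]) or 1e-12)

    chosen = []
    for k in sorted(range(len(utilities)), key=density, reverse=True):
        if all(used[j] + costs[k][j] <= 1.0 for j in range(m)):
            for j in range(m):
                used[j] += costs[k][j]
            chosen.append(k)
    return chosen
```

For a single resource (buffer space) with utilities [10, 6, 4] and costs [0.6, 0.5, 0.4], the pass takes items 0 and 2 and skips item 1, which would overflow the buffer.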
12.3.1 Social-Aware Utility

The main strength of ContentPlace lies in the social-aware definition of the utility Uk
that it provides. ContentPlace builds upon a community detection algorithm (like the

one proposed by Hui et al. [23]) that is able to identify the social communities the user
belongs to. Users belonging to the same community have strong social relationships
with each other. In general, users can belong to more than one community (a working
community, a family community, etc.), each of which is a “home” community for that
user. Users can also have relationships outside their home communities (“acquainted”
communities). ContentPlace assumes that people movements are governed by their
social relationships and by the fact that communities are also bound to particular places
(i.e., the community of office colleagues is bound to the office location). Therefore,
users will spend their time in the places their home communities are bound to, and they
will also visit places of acquainted communities. Different communities will have,
in general, different interests (Figure 12.3a). Therefore, the utility of the same data
object will be different for different communities. Given that communities represent
the sets of nodes with which the user interacts most, intuitively, caching items that are
popular within these communities will increase the probability that such items will
be actually delivered to people that are interested in them (Figure 12.3b).
Once the communities have been identified, ContentPlace splits the utility function
into as many components as the number of communities the user belongs to. Thus,
dropping subscript k in equation (12.1), the utility can be written as follows:
    U = Σi ωi ui                                                (12.2)

[Figure 12.3 ContentPlace at a glance: (a) Nodes declare their interests. (b) Nodes remember the interests of their communities and, based on these interests, while moving around, they fetch data to be brought back to their communities.]



where ui is the utility component associated with the ith community, and ωi measures the user’s willingness to cooperate with the ith community. Thus, each component ui measures the gain that caching a certain content item provides to community i. The advantage of this approach is that, by tuning the parameter ωi, each user is able to cooperate with each community in a targeted manner, without wasting its resources. For
example, ωi could be taken as proportional to the social strength of the relationship
between the user and nodes in the ith community.
Following the approach of Web caching literature [24], for each community i the
utility ui is defined as a function of the access probability (pac,i ), the availability
(pav,i ), and the size s of a given content item [equation (12.3)].
    ui = pac,i · fc(pav,i) / s                                  (12.3)

The access probability pac,i is a measure of how many users of community i are
expected to be interested in a given content item, and thus to issue requests for it.
The availability pav,i , instead, quantifies the penetration of the content item in the
community and can be measured as the fraction of nodes that share a copy of the
item. The utility increases as the access probability increases, while it decreases with
the availability of the content (thus, fc must be a monotonically decreasing function).

In fact, when an item is already widely available in the network, the marginal gain of replicating it once more is low, and so is the utility that it provides to the system.
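Equations (12.2) and (12.3) combine into a few lines. The choice fc(p) = 1 − p below is only one admissible monotonically decreasing function, used here for illustration:

```python
def item_utility(size, stats, willingness, f_c=lambda p: 1.0 - p):
    """U = sum_i w_i * u_i, with u_i = p_ac,i * f_c(p_av,i) / s.
    `stats` maps community -> (access probability, availability);
    `willingness` maps community -> omega_i; `size` is s."""
    return sum(
        willingness[i] * p_ac * f_c(p_av) / size
        for i, (p_ac, p_av) in stats.items()
    )
```

A rarer item (lower availability) yields a higher utility for the same demand, matching the marginal-gain argument above.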
12.3.1.1 Parameter Estimation. ContentPlace estimates the access probability pac,i and the availability pav,i for each content item via online estimation. This estimation is based on the information collected during meetings between pairs of nodes. More specifically, when two nodes meet, they exchange a summary of the state of their buffers. From this summary, each node is able to keep track of how often a given content item has been seen in other nodes’ caches during a time period T and, based on this information, it computes a sample p̂av for the availability of that item. More specifically, each node keeps an estimate of the availability of a given content item for each community i it belongs to (p̂av,i). Keeping separate statistics for each community allows the node to make targeted decisions for each community. The estimate of the availability is then updated using the exponential weighted moving average method: pav,i ← α pav,i + (1 − α) p̂av,i. Similarly, a sample p̂ac,i of the access probability related to community i is obtained for each time period T by tracking the interests advertised by encountered nodes that belong to community i, and such a sample is then used to update the estimate pac,i as described above.
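The estimator is a few lines; the smoothing factor and the exact sample definition below are illustrative assumptions consistent with the description, not values from the paper:

```python
def availability_sample(times_seen, nodes_met):
    """One sample (p-hat_av) for a period T: the fraction of encountered
    nodes whose buffer summary contained the item."""
    return times_seen / nodes_met if nodes_met else 0.0

def ewma(estimate, sample, alpha=0.9):
    """Exponential weighted moving average update used for both
    p_av,i and p_ac,i: new <- alpha * old + (1 - alpha) * sample."""
    return alpha * estimate + (1 - alpha) * sample
```

With α = 0.9 and a period in which the item was seen on 8 of 10 encountered nodes, an estimate of 0.5 moves to 0.53: the estimate tracks the contact process slowly, smoothing out noisy periods.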

12.3.2 Social-Aware Dissemination Strategies

Based on the relation between the user and its communities, and based on the current
position of the user, ContentPlace defines and evaluates the following dissemination
policies.




Uniform Social. All communities the user gets in touch with are given an equal
weight (i.e., ωi = ω, ∀i).
Present. The community the user is currently roaming in is assigned weight 1,
while all other communities are assigned weight 0.
Most Frequently Visited. Each community is assigned a weight proportional to the time spent by the user in the community (i.e., ωi = ti / Σj tj).
Future. Like the Most Frequently Visited, but weight is set to zero for the community the user is currently roaming in.
Most Likely Next. The weight of each community is proportional to the probability that the user will move to that community, conditioned on the user’s current position. The weight for the current community is set to zero.
The main difference between these policies is the degree to which each of them
implements a “look-ahead” behavior. The look-ahead behavior refers to the ability
of each policy to act proactively with respect to future encounters. Thus, the Present
policy is the one with the least degree of look-ahead behavior, while the Most Likely
Next is the one with the highest degree.
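The five weight assignments can be sketched as one function. The names and the normalization used for Future are illustrative; the chapter states only the weights’ proportionality:

```python
def community_weights(policy, time_in, current, next_prob=None):
    """omega_i for each community. `time_in` maps community -> time the
    user spent there; `current` is the community the user is roaming in;
    `next_prob` (Most Likely Next only) maps community -> probability of
    moving there next, given the current position."""
    comms = list(time_in)
    if policy == "uniform_social":
        return {c: 1.0 for c in comms}
    if policy == "present":
        return {c: float(c == current) for c in comms}
    if policy == "most_frequently_visited":
        total = sum(time_in.values())
        return {c: time_in[c] / total for c in comms}
    if policy == "future":              # like MFV, but zero for `current`
        rest = sum(t for c, t in time_in.items() if c != current)
        return {c: time_in[c] / rest if c != current else 0.0 for c in comms}
    if policy == "most_likely_next":    # zero for `current`
        return {c: next_prob.get(c, 0.0) if c != current else 0.0
                for c in comms}
    raise ValueError(f"unknown policy: {policy}")
```

For a user who spent 6, 3, and 1 time units in communities office, home, and gym and is currently in the office, Future assigns weights 0, 0.75, and 0.25, respectively.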
12.3.3 Performance Results

The above social-aware policies are compared against two social-oblivious policies:
the Greedy policy, which corresponds to the baseline content dissemination in which
each user only stores items it is personally interested in, and the Uniform policy,
in which all channels are given the same priority and which was shown by Lenders et al. [13] to provide the best overall performance among content-centric policies (see Section 12.2.2). The scenario considered is that of three isolated communities, connected by a few nodes (travelers) that commute between them. The movements of nodes are generated using the HCMM mobility model [25]. Each
community is assumed to generate a different subset of content items. Thus, the only
way nodes can access content produced in a different community is with the help of
content dissemination strategies. The process of requests for content items is modeled as a Poisson process.

The authors find that the policies that perform best are those that drive the content
dissemination based on the prediction of future encounters. More specifically, the one
that achieves the best overall performance is the Future policy. Recall that according
to this policy, nodes do not cooperate with the nodes of the community in which
they are roaming, but instead they proactively select, from the neighboring nodes’
caches, those items that are interesting for nodes belonging to different communities.
The share of cache space that each of these communities gets is roughly proportional
to the average time spent in the past by the node in the community. This result may be surprising, as not cooperating with the roaming community performs better than cooperating with it. However, as shown by Ioannidis and Chaintreau [26], the social relationships that should be exploited most are those that provide the greatest diversity. Cooperation with the roaming community is less useful,


464

DATA DISSEMINATION IN OPPORTUNISTIC NETWORKS

because nodes in the same community see more or less the same content items and,
thus, the diversity is extremely limited. On the other hand, bringing content items
from one community to another one drastically contributes to diffuse heterogeneous
items and, thus, helps significantly the dissemination process.
12.3.4 Take-Home Messages

The main advantage of social-aware heuristics for content dissemination is that they
directly exploit the features of human interactions to improve the dissemination process. This approach has already been shown to be successful for routing in opportunistic networks [27–29]. Social-aware content dissemination generally results in quicker and fairer content dissemination than social-oblivious policies [6]. In addition, social-aware dissemination heuristics often (as in

the case of ContentPlace) do not make any assumption on the characteristics of the
underlying contact process or on the distribution of content popularity. They do not
need to: They directly learn this information using online estimation. This implies that
social-aware content dissemination strategies are expected to be resilient to mobility
and content popularity changes.
The main drawback of this class of heuristic-based content dissemination policies
lies in the overhead they introduce for collecting and managing the information on
which the heuristic reasoning is based. In the case of ContentPlace, statistics must be
collected per item and per community. While the number of communities a single user
is in touch with is not expected to become excessively large, the number of items might grow significantly, and keeping such per-item statistics might become infeasible.
Thus, solutions that improve the scalability of this approach are under study.

12.4 PUBLISH/SUBSCRIBE SCHEMES

An original approach in the literature brings concepts from publish/subscribe overlays into the opportunistic networking environment. The most relevant example of this
class of solutions is the one proposed by Yoneki et al. [7].
Conventional pub/sub solutions are designed as content-centric overlays, and they
are typically conceived to run on top of static networks. Nodes that generate content
are termed publishers, while nodes subscribing to content are termed subscribers.
The overlay consists of a network of brokers. Brokers receive subscriptions from
subscribers, and they are aware of publications of publishers. When a new content
item is published, brokers identify interested subscribers and take care of delivering
the content item to them. Pub/sub systems can be grouped according to the way they
perform the matching between the properties of the content items and the interests
of the subscribers. Typical solutions are topic-based, content-based, or type-based
pub/sub systems [4]. In topic-based systems, content is grouped in a set of predefined topics. Publishers decide which topic to associate publications with, while subscribers subscribe to topics (and receive the entire set of items of those topics). In content-based pub/sub, items are associated with properties or metadata that describe them.



Subscribers provide filters or patterns to brokers, which are matched to the properties of the publications. Items whose properties satisfy the filter are delivered to
subscribers. Finally, type-based pub/sub adds semantics similar to data types in programming languages to the description of the items and to the subscriptions, so that, for
example, it is possible to define topics and subtopics, as well as subscribe to topics at
different levels.
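The three matching styles can be contrasted with a small sketch. All function and field names here are illustrative, not taken from the chapter, and type-based matching is approximated with hierarchical topic paths:

```python
def topic_match(item_topic, subscribed_topics):
    """Topic-based: the item's single topic must be one of the subscribed topics."""
    return item_topic in subscribed_topics

def content_match(item_properties, subscriber_filter):
    """Content-based: every (attribute, value) constraint in the subscriber's
    filter must be satisfied by the item's metadata."""
    return all(item_properties.get(k) == v for k, v in subscriber_filter.items())

def type_match(item_topic_path, subscribed_topic):
    """Type-based, approximated with hierarchical topics: subscribing
    to 'sport' also matches the subtopic 'sport/football'."""
    return (item_topic_path == subscribed_topic
            or item_topic_path.startswith(subscribed_topic + "/"))
```

In each case the broker evaluates the predicate on behalf of its subscribers and delivers only the items for which it holds.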
One of the key concepts of pub/sub systems is to decouple the generation from
the consumption of content, by using an intermediate layer (the brokers). This concept is suitable to be exploited also in opportunistic networks. Nodes that generate
content and nodes that consume it are seldom connected to the network at the same
time. Therefore, storing content items at some rendezvous point is one of the typical
approaches used in this networking environment (see, e.g., the throwbox concept in
Zhao et al. [30]).
The pub/sub overlay proposed by Yoneki et al. [7] is built on the following idea.
In opportunistic networks, mobile users can be grouped in communities, which are
basically defined by their social behavior (this concept is also shared by the social-aware approaches discussed in Section 12.3). Specifically, the underlying assumption
is that members of the same social community spend significant time together and
are thus often in contact. Let us assume communities can be identified dynamically
through some online algorithm. Among the nodes of each community, one of the
nodes is selected as a broker. The broker is the one, within the community, which
can reach “most easily” the other nodes in the community (we will provide a precise
definition later on). Brokers collect subscriptions of other nodes in their community,
and advertise them to the other brokers. Therefore, when content items are generated,
brokers know to which “fellow” brokers items should be sent.
Figure 12.4 provides a conceptual representation of the idea proposed by Yoneki
et al. [7]. Brokers are those nodes which are more central in their communities

(i.e., which have more links) and form a conceptual overlay, implemented through
gossiping upon encounters. Gossiping is also used to circulate subscription information among brokers. Data (events) are first sent to the broker of the community where they are generated, and then they are forwarded—according to the recorded interests—to the brokers of the communities which include subscribed users.

Figure 12.4 Conceptual example of the pub/sub system in Yoneki et al. [7].
In general, the pub/sub system proposed by Yoneki et al. [7] is built on two key
ideas: (i) Assuming social communities are present in opportunistic networks, bridges
can be identified to enable inter-community communication, and (ii) among community nodes, the best bridge is the one that can most easily reach all the other nodes in
the community.
To implement the above idea, two building blocks are required. On the one hand,
an online community detection algorithm is required, which is able to group nodes in
communities such that each node knows which community it belongs to. On the other
hand, algorithms to implement the pub/sub mechanisms are required. We summarize
these building blocks in the following sections.
12.4.1 Community Detection

Two community detection algorithms are proposed in Yoneki et al. [7]: SIMPLE and k-CLIQUE. In both cases, a node keeps two local structures, namely the
Familiar Set F and the Local Community C. Conceptually, given a node i, the Familiar
Set contains other nodes that i meets very frequently, while the Local Community
contains, in addition to the members of the familiar set, other nodes whose Familiar
Set shares a sufficient number of nodes with the Local Community of i.

In both algorithms, when node i meets node j, it increases a counter storing the
total contact time with j. As soon as this time exceeds a threshold, j is included in the
Familiar Set of node i. Nodes also exchange their respective Familiar Set and Local
Community. SIMPLE and k-CLIQUE differ in the way they (i) admit nodes in their
local community and (ii) decide when two communities should be merged, because
they are indeed the same community. In the case of SIMPLE, node j is included in
the Local Community of node i if the fraction of common nodes between the Familiar
Set of node j and the Local Community of node i exceeds a given threshold, that is,
|Fj ∩ Ci| / |Fj| > λ.    (12.4)

On the other hand, two communities are merged when the overlapping members
of the two Local Communities (with respect to the total number of nodes in both
communities) are enough, that is,
|Ci ∩ Cj| / |Ci ∪ Cj| > γ.    (12.5)

The first criterion (to admit node j in the Local Community of node i) changes in
k-CLIQUE, and node j is admitted if its Familiar Set contains at least k − 1 nodes
that are in the Local Community of node i, that is,
|Fj ∩ Ci| ≥ k − 1.    (12.6)




Finally, node i checks whether each node in the Local Community of node j should
be part of its Local Community, as well. In particular, given a node l in the Local
Community of node j, l is added to the Local Community of i if at least k − 1 members
of the Familiar Set of l are in the Local Community of i, that is,

|Fl ∩ Ci| ≥ k − 1.    (12.7)

Note that to check the last criterion, it is necessary that node j sends to node i the
Familiar Sets of all nodes in its Local Community. To this end, each node keeps a
local estimate of the Familiar Sets of all the members of its Local Community, which
is updated upon encounters.
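The four admission and merge criteria of equations (12.4)–(12.7) map directly onto set operations. The following is a simplified sketch in which the Familiar Sets and Local Communities are plain Python sets and the thresholds are hypothetical inputs:

```python
def simple_admit(F_j, C_i, lam):
    """Eq. (12.4): SIMPLE admits j into i's Local Community if the fraction
    of j's Familiar Set already in C_i exceeds lambda."""
    return len(F_j & C_i) / len(F_j) > lam

def simple_merge(C_i, C_j, gamma):
    """Eq. (12.5): two communities are merged if their overlap, relative
    to their union, exceeds gamma."""
    return len(C_i & C_j) / len(C_i | C_j) > gamma

def kclique_admit(F_j, C_i, k):
    """Eq. (12.6): k-CLIQUE admits j if its Familiar Set shares at least
    k-1 nodes with i's Local Community."""
    return len(F_j & C_i) >= k - 1

def kclique_absorb(F_l, C_i, k):
    """Eq. (12.7): a member l of j's community is also absorbed if at least
    k-1 members of l's Familiar Set are in i's Local Community."""
    return len(F_l & C_i) >= k - 1
```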
Performance results presented in Yoneki et al. [7] show that both SIMPLE and
k-CLIQUE are able to discover most of the communities that can be identified
using offline, centralized methods. Specifically, up to 90% of the communities can
be found. Among the two, SIMPLE clearly requires less information and results in
lower communication overhead, although, in the tested configurations, k-CLIQUE
achieves slightly better performance in terms of community detection.
12.4.2 Overlay Operations

Overlay construction is implemented through the same gossiping mechanism used
to detect communities. Thanks to the community detection algorithm, each node
in a community can compute its closeness centrality value. Closeness centrality is

defined as the reciprocal of the distance (typically measured in terms of hop count) to
all nodes in the community. Specifically, if C(a) denotes nodes in the same community
of a while dab denotes the distance between nodes a and b, closeness centrality of
node a is
C(a) = 1 / Σb∈C(a) dab.    (12.8)
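Assuming the intra-community hop distances are known, closeness centrality and the resulting broker election can be sketched as follows. This is a simplified illustration; the distance table and all names are hypothetical, and the community set and the centrality (denoted with the same symbol C in the text) are kept as distinct arguments here:

```python
def closeness_centrality(node, community, dist):
    """Eq. (12.8): reciprocal of the summed hop distance to all other
    community members; dist[(a, b)] is the hop count from a to b."""
    total = sum(dist[(node, other)] for other in community if other != node)
    return 1.0 / total

def elect_broker(community, dist):
    """The broker is the member with the highest closeness centrality."""
    return max(community, key=lambda n: closeness_centrality(n, community, dist))
```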

By gossiping each other’s centrality value during contacts, nodes in a community elect
the broker. During gossiping operations, nodes also exchange information about each
other’s subscriptions, which eventually reaches the broker. Note that both subscription
and un-subscription messages can be sent by subscribers to make the broker know
dynamically about which content they are interested in. All in all, by using gossiping mechanisms:

• each node knows its broker;
• brokers know the subscriptions of all nodes in their community.
The broker of a community can change dynamically. When a node becomes the
new broker, it receives the community subscriptions from the old one. Brokers forward subscription information to the other “fellow” brokers. In general, this can be done through opportunistic forwarding mechanisms (such as BubbleRAP [27] or HiBOp [28]). Otherwise, when possible, brokers can use
infrastructure shortcuts (e.g., activate communications through cellular networks)
just for the purpose of exchanging subscription information.
The pub/sub model used in Yoneki et al. [7] is topic-based, although the same
mechanisms can be used to implement the other types of pub/sub systems, as well.
When a new item for a given topic is generated, it is sent to the broker of the community.
If some node is subscribed to that topic within the community, the item is broadcast.
It is also sent to other brokers if their community members are subscribed.
Finally, although this is not expanded in Yoneki et al. [7], the possibility of
having multiple brokers in a community is considered. For example, this can happen
if several nodes happen to have a similar closeness centrality value. In general,
this allows the overlay to balance the load between the brokers and rotate this functionality among multiple nodes. However, the drawback is the additional coordination
between the brokers of the same community. How to deal with these aspects is not
addressed in Yoneki et al. [7].
12.4.3 Performance Results

For the purpose of evaluation, the authors developed a custom simulator to replay some mobility traces available in the literature and run the community detection and overlay mechanisms. Specifically, three traces are selected based on their community structure—that is, the CAM, the MIT, and the UCSD traces. The CAM trace was
collected during the Haggle project [31,32] by using custom Bluetooth iMotes given
to first- and second-year undergraduate students in Cambridge. The MIT trace comes
from the Reality Mining project [33]. It also records Bluetooth contacts, measured
by 100 smartphones given to MIT students and staff over 9 months. The UCSD trace
comes from the UCSD Wireless Topology Discovery project [34], and collects WiFi
sightings of 300 devices over 11 weeks. Note that in the latter case colocation is
assumed for nodes that are connected to the same WiFi access point at the same time.
The CAM trace consists of very tightly connected users, among whom distinct communities can hardly be identified. At the opposite end, the UCSD trace consists of very independent users, among whom hardly any community can be found. In between, several communities can be identified in the MIT trace.
Simulation results highlight the following key findings. First of all, the socio-aware overlay works better when a community structure exists, such as in the MIT case. This was clearly expected. Moreover, dissemination using the socio-aware overlay is more efficient than flooding in terms of average hop count from publishers to subscribers. This is a side effect of the selection of brokers as the most central nodes in their respective communities. Third, the distribution of item dissemination time shows a power-law shape. This indicates that, although some items need significant time to be delivered, most of them are delivered fairly quickly (corresponding to the head of the distribution). Finally, in terms of delivery rate, results show that intra-community delivery is much more efficient than inter-community delivery. In the former case the delivery ratio is about 97%, while in the latter it ranges from 74% down to 42%. It is argued in Yoneki et al. [7] that this might not be too severe a problem, as users of the same community are expected to share common
interests, and therefore it is more important to achieve a high delivery ratio inside the
communities than between different communities. To the best of our knowledge, no
experimental results are available describing the distribution of interests in social communities. Therefore, this claim—although reasonable in some cases—still needs to be
validated.

12.4.4 Take-Home Messages

The main contribution of Yoneki et al. [7] is to explore an original approach to
data dissemination—that is, how to apply pub/sub mechanisms coming from the p2p community to opportunistic networks. The main rationale behind this idea is that
in both cases it is important to decouple generators (publishers) of content items

from consumers (subscribers). In opportunistic networks, this is important because
publishers and subscribers might seldom be connected at the same time through a
stable network.
Another interesting idea is the use of “socially central” nodes as brokers. Socially
central nodes are expected to get in touch frequently with most of the other nodes in
the community, and thus represent natural hubs for intracommunity communications
(which are the main focus of the paper).
The performance results show that this approach can actually work when networks
have a well-defined social structure. However, it does not bring significant advantage
with respect to epidemic dissemination when social structures are not so well defined.
Moreover, issues such as the rotation of the broker functionality among multiple nodes, and the associated overhead in terms of consistency, are not addressed in the paper and are an important point to consider. Finally, mechanisms to elect the broker and
collect subscription and unsubscription information might require too much overhead
depending on the dynamism of the network. This aspect is also not investigated in
the paper.

12.5 GLOBAL OPTIMIZATION
In Sections 12.2 and 12.3 we have discussed content dissemination schemes whose
main contribution is the definition of heuristic policies to be used to make local
decisions about whether or not to cache certain content items. An opposite approach is that of defining a global utility function and solving a global optimization problem as if nodes’ caches were a big, cumulative caching space. Typically, protocols belonging to this class focus less on the specific definition of the utility function, while devoting great attention to the global optimization problem and how to translate such a centralized optimization problem into a distributed one. In the representative work that we have
chosen, Reich and Chaintreau [8] focus on the problem of finding a global optimal
allocation for a set of content items assuming that users are impatient—that is, that
their interest for items monotonically decreases with the time they have to wait before
their request is fulfilled.



12.5.1 The System Model

Nodes are divided into two sets: the set S of server nodes, which participate in the
caching process and generate content items, and the set C of client nodes, which only
issue requests. These two sets may intersect or not, meaning that nodes can be at
the same time content producers and content consumers. The set of content items is
denoted by I. The global cache is defined as the union set of the cache space of all
servers, and its state is represented by matrix x = (xi,m ), where element xi,m is equal to
one if server m stores a copy of item i and zero otherwise. The content dissemination
problem is translated into the following global optimization problem:


max U(x)
s.t.  Σi∈I xi,m ≤ ρ,  ∀m ∈ S    (12.9)
      xi,m ∈ {0, 1},
where ρ denotes the cache space on each node. This formulation is similar to the
one used by ContentPlace [equation (12.1)]; but while the latter only considered
those items that were either in the local cache or in the peer’s cache, in this case

the global cache is considered. Thus, the strategy proposed by Reich and Chaintreau
aims at optimizing globally the caches of all nodes, in order to provide the best overall
allocation of content items.
Clients issue requests for different items at different rates. The aggregate rate at
which nodes demand item i is denoted with di. The relative likelihood that a generic node n issues a request for item i is denoted with πi,n. Simply, πi,n denotes the
fraction of requests for item i to which user n contributes. Thus, the rate at which
node n demands item i is given by di πi,n .
The contact process between any pair of nodes is modeled as an independent and
memoryless process. This implies that the time between consecutive contacts of the
same pair of nodes is assumed to follow an exponential distribution.
12.5.2 The Delay-Utility Function

Reich and Chaintreau use a per-item utility function h(t), called delay-utility function,
that is monotonically decreasing with time. More specifically, the interest that users
have in a specific content item is a function of the time they have to wait for it, or, in
other words, of the time it takes for the system to fulfill the request. Note that this utility
is not content-centric—that is, it does not prioritize the freshness of the content, like
in PodNet—but is user-centric, because it quantifies user preference of not having to
wait too much after a request has been issued before receiving the associated content
item. Different delay-utility functions are proposed depending on the time-sensitivity
of the user with respect to a particular content. As an example (Figure 12.5), the
utility function associated with time-critical information (e.g., advertisement for a
well-located and cheap apartment, which all users interested in renting a new flat
want to receive as soon as possible) can be well represented by an inverse power function hα : t → t^(1−α)/(α − 1), with α > 1. For cases in which not receiving the information promptly may damage users (e.g., when the information is a critical system update), a negative logarithm (h : t → −ln(t)) can be used to reproduce negative utility values.

Figure 12.5 Examples of delay-utility functions (inverse power and negative log; utility versus time).
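The two delay-utility functions can be sketched as follows. This is a minimal illustration; the default value of α is an arbitrary choice:

```python
import math

def inverse_power_utility(t, alpha=2.0):
    """Inverse power delay-utility h_alpha(t) = t**(1 - alpha) / (alpha - 1),
    suited to time-critical content; requires alpha > 1."""
    assert alpha > 1, "the inverse power form requires alpha > 1"
    return t ** (1 - alpha) / (alpha - 1)

def negative_log_utility(t):
    """Negative logarithm h(t) = -ln(t); becomes negative for large delays,
    modeling content whose late delivery damages the user."""
    return -math.log(t)
```

Both functions are monotonically decreasing in t, as the model requires.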
Utility is brought to the system if a node n is able to quickly access a given item
i after issuing a request for it. Variable Ui,n(x) measures how useful a certain allocation x is when node n demands item i. Such utility is defined as the expectation of
hi (Y ), where hi (·) is the delay-utility function associated with item i and Y is the time
needed to fulfill the request (which depends on the allocation x). Thus, intuitively, the
most useful allocation for node n demanding item i is the one that, on average, best satisfies the time-sensitivity of the user with respect to item i. A general expression
for Ui,n (x) can be found in Reich and Chaintreau [8]. The overall utility function
U(x), referred to as social welfare, is then defined as follows:
U(x) = Σi∈I di Σn∈C πi,n Ui,n(x).    (12.10)

The rationale behind equation (12.10) is that the system utility can be interpreted as

the sum of the utility gain provided to each item i by cache allocation x. Then, the
per-item utility gain Ui,n is weighted with the actual rate di πi,n at which requests for
that item are issued, in order to account for the expected number of requests per item.
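Equation (12.10) can be sketched as a direct summation. This is a simplified illustration; the input tables d, pi, and U are hypothetical, with U standing for the precomputed expected delay-utilities Ui,n(x):

```python
def social_welfare(items, clients, d, pi, U):
    """Eq. (12.10): per-item utilities U[(i, n)] weighted by the request
    rates d[i] * pi[(i, n)].
    d[i]: aggregate demand rate for item i.
    pi[(i, n)]: fraction of those requests issued by node n.
    U[(i, n)]: expected delay-utility of the allocation for node n and item i."""
    return sum(d[i] * pi[(i, n)] * U[(i, n)]
               for i in items for n in clients)
```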
12.5.3 Optimal Cache Allocation

Reich and Chaintreau prove an important property of the utility function Ui,n they
define, known as submodularity: the utility always increases when a new copy is generated (because, intuitively, each new copy potentially improves the delay experienced by users), but the marginal utility, i.e., the added utility of generating a new copy given that there are already c in the system, decreases as c increases.
Submodularity is illustrated in Figure 12.6. A consequence of submodularity is that


Figure 12.6 The submodularity property: overall and marginal utility versus number of copies.


the global optimization problem can be solved using a greedy algorithm [35] with a
good approximation.
Assuming that contact processes are homogeneous—that is, all node pairs meet
at the same rate—the authors derive an even stronger result. In this case, in fact, the
greedy algorithm is able to find the optimal solution (not an approximation of it) in a
finite number of steps. In addition, in this case the social welfare only depends on the
number of copies for item i, not on the actual nodes that store them. In order to find
the optimal solution to the optimization problem, a greedy algorithm that works as
follows can be used. At each time step, a copy of the item that increases the utility of
the system the most is added. This initially implies that items that are requested with
greater frequency are those that are replicated more. However, this effect diminishes as the number of copies grows, because the marginal gain decreases with every additional copy. Thus, after a while, less popular items will be selected.
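The greedy allocation for the homogeneous case can be sketched as follows. This is a simplified illustration: marginal_gain is a hypothetical submodular gain function supplied by the caller, and the per-item cap of at most one copy per server is ignored for brevity:

```python
def greedy_allocate(items, num_servers, rho, marginal_gain):
    """Fill the global cache (num_servers * rho slots) one copy at a time,
    always adding a copy of the item with the largest marginal gain.
    marginal_gain(i, c) is the extra utility of a (c+1)-th copy of item i."""
    copies = {i: 0 for i in items}
    for _ in range(num_servers * rho):
        best = max(items, key=lambda i: marginal_gain(i, copies[i]))
        copies[best] += 1
    return copies
```

With a popularity-driven gain, the most requested items are replicated first, and less popular items are selected once the marginal gain of further copies has decayed, mirroring the behavior described above.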
12.5.4 From Global to Local Decisions

Global optimization problems like the one discussed above cannot be implemented
directly in an opportunistic network. In fact, in real scenarios, nodes are not aware of
the rate at which other nodes issue requests [corresponding to the di , πi,n components
in equation (12.9)]. Moreover, nodes can only make decisions about which items to
store when they meet other nodes and thus become aware of the content of their cache.
For all these reasons, research approaches that start from a global, centralized optimization problem must inevitably turn it into a local, distributed one.
The local, distributed dissemination strategy proposed by Reich and Chaintreau is the
Query Counting Replication (QCR) scheme. The main idea of QCR is to tune the number of replicas based on how many encounters a node experienced between issuing a request and receiving a copy of the requested item. As an example, consider that at time t node A has
issued a request for a certain item i. Let us assume that node A met n other nodes
before finding someone who has a copy of item i. According to QCR, after receiving
item i, node A will replicate it n times, i.e., node A will send a replica to each of the
next n encounters. Note that, differently from the global optimization strategy in Section 12.5.3, QCR only exploits local knowledge on the number of peers encountered
before finding item i.
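The QCR behavior of node A can be sketched as follows. This is a minimal illustration; the class and attribute names are hypothetical, and a peer's cache is modeled as a plain set of item identifiers:

```python
class QcrNode:
    """Requester that counts misses before a hit, then replicates as often."""

    def __init__(self, item):
        self.item = item          # item this node has requested
        self.have = False         # whether the item has been received yet
        self.misses = 0           # encounters seen before finding the item
        self.replicas_left = 0    # replicas still to hand out

    def on_encounter(self, peer_cache):
        if not self.have:
            if self.item in peer_cache:
                self.have = True
                self.replicas_left = self.misses   # replicate n times
            else:
                self.misses += 1                   # one more empty encounter
        elif self.replicas_left > 0:
            peer_cache.add(self.item)              # hand one replica over
            self.replicas_left -= 1
```

Intuitively, the number of empty encounters is a local, implicit signal of how rare the item currently is, so rarer items get replicated more aggressively.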

12.5.5 Performance Results

The performance of the greedy algorithm described in Section 12.5.3 (hereafter denoted as OPT) and of the distributed QCR strategy is evaluated against four reference
protocols. It is useful to recall here that the OPT policy is optimal when contact rates
are homogeneous and approximate when contact rates are heterogeneous. The four
reference schemes are the UNI, PROP, SQRT, and DOM algorithms. With the UNI
algorithm (roughly corresponding to the PodNet Uniform strategy), all items get the
same share of the global cache. With the PROP scheme, memory is allocated to items
proportionally to their popularity, while with the SQRT scheme memory is allocated
proportionally to the square root of their popularity. Finally, using the DOM policy,
the cache space is allocated to the ρ most popular items. Performances are compared
measuring the distance between the social welfare achieved by QCR, PROP, SQRT,
and DOM and the social welfare provided by the OPT policy.
Three scenarios are considered. The first one is obtained simulating 50 content
items and 50 nodes with homogeneous contact rates equal to 0.05. The other two are
obtained from real mobility datasets. The first one is a subset of 50 nodes from the
Infocom’06 conference dataset [36], the second one comprises contacts between 50
cabs selected from the traces of the Cabspotting project [37].
In the homogeneous scenario, SQRT achieves the best overall performance. QCR’s

loss of utility with respect to the OPT policy remains below 5% when the delay-utility function h(t) is a step function and below 60% when h(t) takes the form of an
inverse power. When using the Infocom’06 trace, the performance of SQRT drops, and
DOM and PROP perform best. This time QCR remains within 15% of OPT. Overall,
considering that QCR does not make use of a priori knowledge on content popularity
while DOM, PROP, and SQRT do, the performance of QCR can be considered fair.
Similar results are obtained for the cab mobility traces.
One of the most interesting results derived from the two real scenarios is that some
allocation policies can even improve the social welfare with respect to the optimal OPT
policy. This effect is a result of the fact that the optimal policy is computed assuming that contact processes are independent and memoryless, while this assumption may
not apply to real traces.

12.5.6 Take-Home Messages

The main contribution of the body of works focusing on global optimization is the
formalization, using a rigorous mathematical framework, of the content dissemination
process. Thanks to this approach, if a solution to the optimization problem can be
found, then such a solution is guaranteed to be either optimal or nearly optimal.



In contrast, with heuristic content dissemination schemes it is not clear to what extent they can be outperformed by other strategies or how far from optimal they are.
On the downside, there are two main drawbacks of this approach. First, global optimization requires global knowledge of the network and a priori information on how
users behave (e.g., their preferences for content and their movements) that in practice

is very unlikely to be available. In fact, opportunistic networks may be intrinsically
disconnected and unstable; thus it may take so long to distribute global information from one side of the network to the other that either the information never reaches all
nodes or, when it does, it is already obsolete. In addition to this, the overhead caused
by the exchange of such global information among the nodes of the network can be
very high. Second, the modeling techniques required to formalize and analytically solve the content dissemination problem usually imply reducing the complexity of the system under study. If simplifying assumptions are embedded into the content dissemination problem, it may happen that the proposed solution is not optimal when applied to real scenarios, as happens for the OPT algorithm when real
mobility traces are considered.

12.6 INFRASTRUCTURE-BASED APPROACHES
A very recent approach to data dissemination to mobile users looks at possible synergies between disseminating content through the opportunistic network that can be formed by the users’ devices and through a wireless broadband infrastructure (e.g., a last-generation cellular network) to which users are assumed to be subscribed.
The most interesting example of this trend, in our opinion, has been recently presented
by Whitbeck et al. [9].
Note that approaches assuming that some kind of infrastructure is present and
can be exploited in the opportunistic dissemination process are not new. For example,
works looking at global optimization policies (see Section 12.5) typically assume that
mobile nodes can once in a while make contact with some fixed infrastructure element,
such as a WiFi Access Point, which assists in the dissemination process. The original
angle of the work we present in this section is, instead, to explore a much tighter
integration between delivering content items through purely opportunistic contacts
and through “infrastructure-based” direct links between an operator and the users.
Conceptually, the approach is based on the idea of offloading part of the dissemination process from the operator infrastructure to the opportunistic network formed
by the user devices. The scenario is that of a very large number of users located
in a relatively small region (e.g., a campus, a city, the location of a very popular
event, etc.), who are interested in the same content items. According to a traditional
“operator-exclusive” approach, the content items are sent from the operator to each
individual user through the wireless infrastructure. Each user thus generates a load on

the operator infrastructure equal to the bandwidth required to download the content
items. It is argued that, due to the proliferation of high-end mobile devices and the
bandwidth requirements of multimedia services, this will not scale, and the capacity of the operator infrastructures will not keep pace. On the other hand, according to
a purely opportunistic approach, content items must be available at some user nodes
and then disseminated through one of the schemes described in the other sections of
this chapter. While such an approach is certainly valid for content items generated
by users themselves, an integration between the operator-exclusive and the purely
opportunistic approach brings significant advantages when content is produced by
some provider and then disseminated to a large set of users subscribed to an operator
network. In the rest of the section we describe in more detail the approach proposed
by Whitbeck et al. [9], highlighting why this is the case.
12.6.1 The Push-and-Track System

The Push-and-Track system assumes that a particular content item is generated at
some time t0 in some part of the Internet and must be delivered to a large set of
mobile users located in a given area (even the size of a whole city) by some deadline
t0 + T. Mobile users are assumed to be always connected to an operator network and, at the same time, to form an opportunistic network among themselves.
The content item is initially sent from a central controller to a very small subset
of the mobile users through the infrastructure. Then the opportunistic dissemination
process starts, using any of the purely opportunistic algorithms described in the other sections of this chapter. Nodes that receive the item send a short ACK message to
the controller, which therefore can keep track of the fraction of subscribed users
that have received the item. An “ideal dissemination plan” (objective function) is
also installed on the controller, which tells, at any point in time, the fraction of
subscribed nodes that should have received the item. Periodically during the interval [t0, t0 + T], the controller compares the objective function with the actual
fraction of users that have received the item. If the difference is too high, it selects a
certain subset of users and sends the item to them through the infrastructure. Finally,
a “panic zone” is defined close to t0 + T . When the dissemination process enters
into the panic zone, the controller sends the item to all the users that have not been
reached yet.
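The control loop described above can be illustrated with a toy simulation (all names, the random contact model, and the re-injection heuristic are our illustrative assumptions, not details from Whitbeck et al. [9]):

```python
import random

def run_push_and_track(n_nodes, objective, n_steps, panic_steps,
                       contact_prob=0.1, seed_count=1, rng=None):
    """Toy sketch of the Push-and-Track control loop.

    objective(x) is the target infection ratio for x in [0, 1];
    contacts between nodes are drawn at random at every step.
    """
    rng = rng or random.Random(0)
    nodes = set(range(n_nodes))
    # Initial push through the infrastructure to a small seed set.
    infected = set(rng.sample(sorted(nodes), seed_count))

    for step in range(1, n_steps + 1):
        # Opportunistic dissemination: each infected node may meet a random node.
        for u in list(infected):
            if rng.random() < contact_prob:
                infected.add(rng.choice(range(n_nodes)))
        x = step / n_steps                   # fraction of the deadline elapsed
        actual = len(infected) / n_nodes     # known to the controller via ACKs
        if n_steps - step <= panic_steps:
            # Panic zone: push the item to every node not yet reached.
            infected = set(nodes)
        elif actual < objective(x):
            # Behind the plan: re-inject on randomly chosen uninfected nodes.
            missing = int((objective(x) - actual) * n_nodes) + 1
            left = sorted(nodes - infected)
            infected.update(rng.sample(left, min(missing, len(left))))
    return infected

reached = run_push_and_track(100, lambda x: x, n_steps=50, panic_steps=5)
# Thanks to the panic zone, all 100 nodes are reached by the deadline.
```

Here random selection plays the role of the whom-strategy; any of the policies discussed later in this section could be plugged in instead.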
Figure 12.7 highlights the advantage of the Push-and-Track approach. Note that only a small portion of the overall data traffic needs to go through the wireless infrastructure, while most of it is offloaded to the opportunistic network. The infrastructure is used to track the dissemination process, which requires only lightweight traffic. Note that the Push-and-Track system manages to (i) guarantee 100% delivery under a strict deadline constraint and (ii) achieve this with a relatively low load on the infrastructure (this is shown by simulation in Whitbeck et al. [9]). Basically, jointly achieving these results is possible only through the integration of infrastructure-assisted dissemination (guaranteeing result i) and an opportunistic dissemination process (guaranteeing result ii).
While the overall concept of Push-and-Track is clear, several knobs exist to steer
its behavior in various directions. Specifically, policies should be determined for (i) defining how the dissemination plan should theoretically proceed (i.e., defining the objective function) and (ii) defining to whom the content item should be sent, both at the beginning and whenever the dissemination process diverges from the



Figure 12.7 The advantage of the offloading concept in Push-and-Track.


objective function. In Whitbeck et al. [9], the first class of policies is named when-strategies, while the second class is named whom-strategies.
When-strategies can be broadly divided into three classes: slow start, fast start, and
linear. If 0 ≤ x ≤ 1 denotes the fraction of time elapsed in the interval [t0 , t0 + T ],
the linear strategy defines the target infection ratio as a linear function y = x. Slow
start strategies are sublinear. In particular, four strategies are proposed in Whitbeck
et al. [9]. Single Copy and Ten Copies push one and ten copies, respectively, at the
beginning of the dissemination, and then wait until the panic zone without performing
any re-injection. Quadratic uses an objective function y = x². Finally, the Slow Linear strategy starts with an objective function y = x/2 for the first half of the interval, and then it switches to the more aggressive function y = (3/2)x − 1/2 for the second half. On the other hand, fast start strategies are superlinear. Square Root uses an objective function y = √x. Fast Linear uses the function y = (3/2)x for the first half and switches to y = x/2 + 1/2 for the second half. A scheme of the behavior of the strategies is
depicted in Figure 12.8 (adapted from Whitbeck et al. [9]).
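Written out as code, the objective functions above are straightforward (the function names are ours; each maps the fraction of elapsed time x in [0, 1] to a target infection ratio):

```python
import math

def linear(x):
    return x

def quadratic(x):          # slow start: y = x^2
    return x * x

def slow_linear(x):        # slow start: y = x/2, then y = (3/2)x - 1/2
    return x / 2 if x < 0.5 else 1.5 * x - 0.5

def square_root(x):        # fast start: y = sqrt(x)
    return math.sqrt(x)

def fast_linear(x):        # fast start: y = (3/2)x, then y = x/2 + 1/2
    return 1.5 * x if x < 0.5 else x / 2 + 0.5
```

Note that each piecewise function is continuous at x = 1/2 and every strategy satisfies y(1) = 1, so all of them target full coverage at the deadline.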
Whom-strategies can be grouped in four classes: Random, Entry-based, GPS-based, and Connectivity-based. Random selects nodes that have not yet received the item according to a uniform distribution. Entry-based policies select nodes according to when they have subscribed, assuming that the subscription time is a good predictor of the position of the nodes in the area where they move. In GPS-based policies, users are assumed to also report their updated coordinates to the controller, and the policy injects content according to the nodes' positions. For example, GPS-Density selects nodes located in the areas with the maximum density of uninfected nodes. In Connectivity-based policies, nodes are assumed to send to the controller their updated list of current neighbours. This allows the controller to have a rough view of the




Figure 12.8 Objective functions of Push-and-Track: target infection ratio (from 0 to 1) versus percentage of time elapsed (x), for the slow linear, quadratic, linear, square root, and fast linear strategies.

connectivity status of the whole network. The item is injected on nodes belonging to the largest uninfected connected component. Clearly, the different strategies generate different overhead on the infrastructure, due to the different sets of information they need to send to the controller. Simulation results presented in Whitbeck et al. [9] tell whether this additional overhead is worthwhile or not.
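As an illustration, the selection step of a Connectivity-based policy could be sketched as follows (a simplified version based on the neighbour lists reported to the controller; the function and variable names are our own):

```python
def largest_uninfected_component(neighbours, infected):
    """neighbours: dict mapping a node to the set of its current neighbours,
    as reported to the controller. Returns the largest connected component
    made exclusively of uninfected nodes."""
    uninfected = set(neighbours) - set(infected)
    seen, best = set(), set()
    for start in uninfected:
        if start in seen:
            continue
        # Graph search restricted to uninfected nodes.
        component, frontier = {start}, [start]
        while frontier:
            u = frontier.pop()
            for v in neighbours[u]:
                if v in uninfected and v not in component:
                    component.add(v)
                    frontier.append(v)
        seen |= component
        if len(component) > len(best):
            best = component
    return best

# Example: nodes 0-2 form one uninfected component, 3-4 another; 5 is infected.
graph = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3, 5}, 5: {4}}
largest_uninfected_component(graph, infected={5})   # → {0, 1, 2}
```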
12.6.2 Performance Results

Push-and-Track has been evaluated using a large vehicular trace, collected in the city
of Bologna by the iTetris EU project [38]. As explained in Whitbeck et al. [9], this
trace makes it possible to test the system in a realistic vehicular setting, which also includes a significant number of users dynamically entering and leaving the area involved in the dissemination process. The different policies described in Section 12.6.1 are compared against (i) an “operator-exclusive” policy, which disseminates only using the infrastructure, and (ii) an oracle-based policy. In the latter, at the beginning of the dissemination an edge is added between two nodes if they can be connected through a space–time path during the interval [t0, t0 + T]. Content items are then sent to a dominating set of this graph. Identifying a minimum such set is known to be NP-hard, so the oracle policy uses a greedy approximation, which provides a subset of cardinality at most a factor log K larger than the minimum dominating set, where K is the maximum degree of the graph.
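The greedy approximation repeatedly picks the node whose closed neighbourhood covers the most still-undominated nodes. A minimal sketch of this standard algorithm (our own transcription, not code from [9]):

```python
def greedy_dominating_set(adj):
    """adj: dict mapping a node to the set of its neighbours (here, the nodes
    reachable through a space-time path in [t0, t0 + T]). Returns an
    approximate dominating set, within a logarithmic factor of the optimum."""
    undominated = set(adj)
    chosen = set()
    while undominated:
        # Pick the node whose closed neighbourhood covers the most
        # still-undominated nodes.
        best = max(adj, key=lambda u: len(({u} | adj[u]) & undominated))
        chosen.add(best)
        undominated -= {best} | adj[best]
    return chosen

# Star graph: the centre alone dominates everything.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
greedy_dominating_set(star)   # → {0}
```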
Simulation results highlight several interesting features of the system. Simulations
have been run both in a case with a strict deadline of 1 minute and in one with a
more relaxed deadline of 10 minutes. The best policy among the when-strategies is Quadratic for the 1-minute case and Slow Linear for the 10-minute case. It is interesting to note that, in terms of whom-strategies, Connectivity-based policies are the best ones, but the Random policy performs quite close to them. Note that Connectivity-based policies require significant additional information with respect to Random, which is thus likely to be preferable in most practical cases. In Whitbeck

