
A game theoretical model for collaborative protocols in selfish, tariff free, multi hop wireless networks


A GAME THEORETICAL MODEL
FOR
COLLABORATIVE PROTOCOLS
IN
SELFISH, TARIFF-FREE, MULTI-HOP
WIRELESS NETWORKS
BY
NG SEE KEE
A THESIS SUBMITTED FOR THE DEGREE OF
MASTER OF ENGINEERING
DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING
NATIONAL UNIVERSITY OF SINGAPORE
(2005)

Acknowledgments
I would like to thank my supervisor, Dr. Winston K.G. Seah, for his assistance in
many ways.
Table of Contents
1 Introduction
1.1 Mobile Ad Hoc Networks
1.1.1 Network Routing
1.1.2 Medium Access Control
1.1.3 Quality of Service Provisioning
1.2 Game Theory
1.2.1 Strategic Games
1.2.2 Extensive Games
1.3 Our Contributions
2 Wireless Network Availability
2.1 Introduction
2.2 Incentive-Based Mechanisms
2.3 Punishment-Based Mechanisms
2.4 Summary
3 Punishments in Repeated Games
3.1 Introduction
3.2 Finitely Repeated Games
3.3 Infinitely Repeated Games
3.3.1 Repeated Prisoner's Dilemma
3.3.2 Folk Theorems
3.3.2.1 Nash folk theorem
3.3.2.2 Perfect folk theorem
3.4 Session-based Generous Tit-for-Tat (GTFT)
3.5 Do ut des strategy
3.6 Topology Dependent Analysis
3.7 Punishment, parole and rehabilitation
3.8 Self-Learning Repeated Game
3.9 Summary
4 Private Monitoring
4.1 Introduction
4.2 Aoyagi's Game for Dynamic Bertrand Oligopoly
4.2.1 Game Model
4.3 Summary
5 The Wireless Multi-hop Game
5.1 Introduction
5.2 Modelling Multi-hop Characteristics
5.3 Periodic Punishment Approach
5.4 Condition for Efficient Collusion
5.5 Summary
6 Playing in the Wireless Environment
6.1 Introduction
6.2 Modelling Private Observations
6.3 The Reporting Strategy
6.4 Proof of Assumption 1: Correlated Packet Arrival Signal
6.5 Proof of Assumption 2: Highest Unanimity at Collusion
6.6 Summary
7 The SRRR Protocol Framework
7.1 Introduction
7.2 Protocol Description
7.3 Secrets and Lies
7.4 Simulation Results
7.5 Summary
8 Conclusion
9 Bibliography
Summary
Traditional networks are built on the assumption that network entities cooperate based
on a mandatory network communication semantic to achieve desirable qualities such
as efficiency and scalability. With technological maturity and widespread technical
know-how, a different set of network problems has emerged - clever users that alter
network behavior in a way to benefit themselves at the expense of others. The
problem would be more pronounced in mobile ad hoc networks (MANET) where
network ownership can be shared among different entities.
Node misbehavior can occur in various degrees. At the extreme end, a malicious node
may eavesdrop on sensitive data or deliberately inject fabricated, replayed or tampered
packets into the network to disrupt network operations. The solution is, generally, to
enable network encryption and authentication. This thesis, on the other hand, focuses
on misbehaviors caused by selfish but rational users while keeping in mind the
dangers posed by malicious ones. In contrast to a malicious node, a rational node acts
only to obtain the outcome that he most prefers. In such a case, cooperation is still
achievable if the outcome of cooperation is in the best interest of the node.
MANETs, which are typically made up of wireless, battery-powered devices, will find
cooperation hard to maintain because it requires the consumption of scarce resources
such as bandwidth, computational power and battery power. The objective of this
thesis is to apply game theory to achieve collusive networking behavior in the
MANET operational environment. The scenarios for such behaviour lie in
the emerging 4th generation networks, where communications over multihop wireless
links, across nodes that may subscribe to different providers, are envisaged to occur.
Research in this area is still in its infancy and existing solutions lack technical
feasibility and theoretical consistency. These solutions fall into the category of pricing
or punishment. The pricing solution either requires a tamper-proof counter as a
reliable storage of a node's wealth, or an occasional connection to a central authority
where payments can be coordinated. Punishment methods are often designed based on
the well-established Repeated Game model and promiscuous listening may be relied
on for the monitoring of other players' actions. Promiscuous listening is, nevertheless,
unreliable and computationally demanding. In addition, the Repeated Game model
(perfect and public) fails to account for imperfection in the wireless monitoring device
(whether it is public or private) and proposed solutions also overlooked the need for
coordinated punishment. Most unforgivably, mass punishment of nodes creates a
vulnerability for Denial-of-Service (DoS) attacks, threatening even the feasibility of
the punishment mechanism as a solution for sustaining cooperation in MANETs. The
complexity of modeling MANETs and the suitability of available game models pose
a significant challenge to the realization of a theoretical model for collusive MANET
protocols.
In this work, pricing, promiscuous listening and mass punishments are avoided
altogether. Our model relies on a recent work in the field of Economics on the theory
of imperfect private monitoring for the dynamic Bertrand oligopoly, and adapts it to
the wireless multi-hop network. The model derives conditions for collusive packet
forwarding, truthful routing broadcasts and packet acknowledgments under a lossy,
wireless, multi-hop environment, thus capturing many important characteristics of the
network layer and link layer in one integrated analysis that has not been achieved in
previous works.
We provide a proof of the viability of the model under a theoretical wireless
environment. Based on the model, we propose the SRRR protocol to demonstrate
the application of our model to protocol design. Finally, we show by simulation that
the SRRR protocol is resilient against selfish users under several deception
scenarios.
Tables
Table 1 Prisoners' Dilemma
Table 2 Battle of the Sexes
Table 3 Strategic form of the extensive game
Table 4 Modified Prisoner's Dilemma
Table 5 Control Slot Information
Table 6 Example Control Slot Information During Bandwidth Reservation
Table 7 Cooperative Scenario without Packet Dropping
Table 8 Simple Packet Dropping
Table 9 Secret Packet Dropping with Acknowledgment Lies
Table 10 Secret Packet Dropping with Bandwidth Lies
Table 11 Honest Packet Dropping
Figures
Figure 1 Two player extensive game
Figure 2 Optimum Cutoff Reporting
Figure 3 Graphical Evaluation of Unanimous Probability
Figure 4 Graphical Evaluation of Unanimous Probability with Zoom
Figure 5 Unanimous Probability at Various Error Rates
Figure 6 Network Topology
Figure 7 Collusive Packet Forwarding
Figure 8 Upstream and Downstream Punishments during Simple Packet Dropping
Figure 9 Downstream Punishment for Secret Packet Dropping with Acknowledgment Lies
Figure 10 Upstream Punishment for Secret Packet Dropping with Acknowledgment Lies
Figure 11 Downstream Punishment for Secret Packet Dropping with Bandwidth Lies
Figure 12 Upstream Punishment for Secret Packet Dropping with Bandwidth Lies
Figure 13 Downstream Punishment for Source Deviations
Figure 14 Upstream Punishment for Source Deviations

1 Introduction
Traditional networks assume that network entities or nodes can be designed to
have well-defined behaviors and coordinate accordingly to ensure certain network
goals are met. These goals can be, for example, the optimized use of network
resources or the Quality of Service (QoS) provided to the end users, and they generally
arise from the interests of the network operator or of the network users at large. The
goals, however, may not be shared by an individual end user, who would
always prefer to have better network access, even at the expense of other users.
Such selfish behavior has been reported in rogue TCP sources that do not
respond to Explicit Congestion Notification (ECN) [46].
The increasingly popular wireless networks are much more vulnerable to node
misbehavior than the traditional wired networks. Wireless networks can be
classified into three categories – infrastructured, infrastructureless and hybrid. The
infrastructured wireless network has geographically fixed stations, interconnected
by a wired backbone, that serve as central points of coordination and network
access for wireless nodes. An example is the cellular network. The
infrastructureless wireless network does not depend on any wired backbone but
depends on members of the network to route packets for one another wirelessly,
possibly over multiple hops. Mobile Ad Hoc NETworks (MANETs) and sensor
networks are examples of infrastructureless wireless networks. The hybrid
wireless network, as the name implies, is a mixture of the two, and applies
infrastructureless networking to provide access to wired access points for nodes
that do not have direct access to these access points. An example of such a
network is the rooftop network [19].
This thesis focuses on the study of selfish routing behavior in infrastructureless
MANETs. In a network, or portion of the network, without infrastructure,
uncooperative behavior can be rampant and devastating. Relating to the medium
access layer, [57], [3] and [4] study competition for wireless transmission. In the
network layer, the assumption of cooperative relaying of packets among nodes to
reach destinations that are beyond the wireless transmission range is no longer
valid when nodes exhibit selfish behavior. The reason is that helping other nodes
consumes precious resources, such as battery power, which is costly and brings no
benefit to the helper. Without suitable incentives, most existing protocols that assume
cooperation are likely to fail.
Pioneering works on mitigating node misbehaviors in the routing layer ([27], [26],
[45], [43], [42] and [37]) highlighted the problem of selfishness and
recommended, basically, two approaches to solve the problem – pricing and
watchdog cum punishment. Later works do not deviate far from these approaches
but try to align with game theory.
Adopting pricing as a solution in [47], [23] and [13] gives rise to the reliance on a
central bank or a tamper-proof counter, which limits the practicability especially
for a purely infrastructureless network. Punishment methods based on repeated
games are proposed by [32], [57], [41], [2] and [50]. Promiscuous listening may
be relied on for monitoring transmission activities in the neighborhood. It may
require cross-layer integration, depending on the protocol layer of interest, and is
too costly for a computationally resource-limited machine to process all packets
overheard on a high data rate link. Furthermore, the unreliable nature of
promiscuous listening has not been studied and modeled sufficiently. On the other
hand, the problem of analyzing the protocol system in fragmented components has
been studied in [48]; we too recognize this problem and therefore take an
integrated approach in our analysis and try to capture as many network
characteristics as possible. In addition, the difficulty of coordinating punishment
in a multi-hop environment has been neglected. Without coordinated punishments
that divide time into collusive and punishment periods, punishments and
deviations are indistinguishable. The major drawback in many
punishment schemes, however, is the need for the whole or a large portion of the
network to participate in the punishment of one deviating node. Such a
punishment is too severe and inefficient, and opens a security hole for denial of service
(DoS) attacks.
Selfish and uncooperative behaviors can be analyzed with game theory. A
well-developed field of mathematics, game theory is a formal way of analyzing
outcomes of group behavior with the basic assumption that players are rational. A
rational player chooses the action that yields her most preferred outcome given her beliefs
about other players' preferences. The game analysis predicts the final outcome
when rational players play against rational players. Application of game theory has
already commenced in the wired domain in areas such as congestion control, flow
control and multicasting [6], [15], [7], [9] and [46]. Nevertheless, available
models do not sufficiently model the wireless multi-hop environment. This thesis
relies on the adaptation of Aoyagi's imperfect monitoring with communication for
the Bertrand oligopoly [31] to analyze collusive packet forwarding, packet
acknowledgments and truthful routing information dissemination in MANET.
1.1 Mobile Ad Hoc Networks

The Mobile Ad Hoc NETwork (MANET) was initially of interest mainly to the
military, police and rescue agencies. These organizations often need to operate
under disorganized conditions or hostile environments where either network
infrastructure is absent or difficult to construct. Thus MANET embodies
characteristics that are suited to these scenarios. It is often noted
that MANETs are self-creating, self-organizing and self-administering. In a
MANET, nodes dynamically create a wireless network among themselves without
the need of an infrastructure or the intervention of a centralized coordinator. This
is achieved through mutual cooperation and coordination. The dynamic and
distributed characteristics of MANET also make it fault resilient.
The challenges of MANET have been numerous. The differences between the
wireless and wired medium and the infrastructured and infrastructureless
operating environment prevent many existing solutions from being transplanted
onto MANET. In the initial phase, the research community confronted the
challenge of route formation for multi-hop communication. This requires the
enabling of routing function in every node, the formation of loop-free paths and
the reduction of communication overheads. Without a centralized control, channel
access, inevitably, has to be distributed, causing packet collisions. With the
infamous “hidden terminal” problem, traditional solutions become less effective.
Additionally, QoS, which is already a challenge in today's Internet, faces more
obstacles in MANET. These include the complexity of route selection, resource
reservation and, foremost, the maintenance of QoS performance under a
dramatically changing environment caused, for example, by node mobility and
link instability.
In recent years, however, there has been a growing interest in the application of
MANET in the home or small office networking environment. Nevertheless, the
exposure of MANET to the public domain introduces a new strain of users. Often
called selfish, rational, greedy or uncooperative, these users challenge the very
paradigm on which MANET has been designed and threaten its ability to
function in such an environment.
1.1.1 Network Routing
The advent of Defense Advanced Research Projects Agency (DARPA) packet
radio networks in the early 1970s stimulated the research of numerous routing
protocols for the MANET. These protocols must address the problems of MANET
such as limited battery life, low bandwidth and high error rates not found in the
well-researched wired counterparts. They generally fall within two categories [12]
– proactive (table driven) or reactive (source initiated).
A proactive routing protocol maintains an up-to-date table of routing information
to every other node in the network. This is accomplished by advertising itself
periodically throughout the network. Protocols within this category differ by the
information they advertise and tables they maintain. Examples of proactive
routing protocols are Fisheye State Routing (FSR), Optimized Link State Routing
Protocol (OLSR) and Topology Broadcast Based on Reverse Path Forwarding
(TBRPF).
Reactive routing protocols are designed on the principle that routes are
discovered only when they are needed. They commonly consist of two phases,
namely the route discovery phase, whereby the node desiring transmission
searches the network for a route, and the route maintenance phase, whereby dynamic
changes along the route are monitored and the route is updated. Examples of reactive
routing protocols are Ad Hoc On-Demand Distance Vector Routing (AODV) and
Dynamic Source Routing (DSR).
Among these protocols, AODV, OLSR and TBRPF are already part of the IETF
recommendations [10], [49] and [39]. For a survey and comparison, refer to [18]
and [52].
1.1.2 Medium Access Control
Medium Access Control (MAC) protocols are usually optimized for the
medium that they operate on, and a wireless medium is very different from a wired
one. To begin, the wireless medium has limited bandwidth due to spectrum scarcity
and hardware constraints. Optimized use of bandwidth is therefore of great
importance. In addition, path loss and signal fading degrade transmission
reliability, making error correction and recovery necessary while
creating more protocol overhead. Transmitting at high power to attain better
reception quality is rarely a viable option because wireless
devices are usually small and limited in battery life. The greatest difficulty faced by
wireless MAC is nevertheless access contention resolution. Because nodes share the same
wireless medium and are usually equipped only with a half-duplex transceiver,
simultaneous transmissions can interfere with each other. Coordinated transmission
is therefore a must, and the infamous hidden terminal problem makes coordination
harder.
For a more comprehensive survey of wireless MAC protocols see [1], [5] and
[20].
1.1.3 Quality of Service Provisioning
Quality of Service [44] is concerned with the provisioning of services meeting,
mainly, delay, jitter and bandwidth requirements. A set of QoS requirements is
meaningful to a flow, or a connection between the source and the destination. To
realize the QoS, the network must guarantee the availability of a set of resources
required by the flow. Thus routers have to be aware of the flows traversing
them and their respective resource requirements, which is generally
achieved with resource reservation techniques.
Before resources can be reserved, routes of adequate resources have to be chosen.
The availability of resources limits the QoS guarantees. If a set of QoS guarantees
can be maintained regardless of the topology updates in the network, the network
is said to be QoS robust. If QoS guarantees can be maintained between
consecutive topology updates, it is said to be QoS-preserving. The selection of a
route to meet the required QoS is the responsibility of QoS routing.
The accuracy of network state information determines the quality of QoS routing.
Local state information is maintained at each node and can be assumed to be
always available. The local state information contains the cost metrics of outgoing
links, such as queuing delay, propagation delay and available bandwidth. The
collection of local state information of all nodes in the network forms the global
state. Unlike local state information, global state information takes time to acquire
because it is obtained through the exchange of local state information. Its inaccuracy
degrades QoS performance.
There are generally three classes of QoS routing – source routing, distributed
routing and hierarchical routing. As the name implies, a feasible route is selected
by the source using locally stored global state information in source routing. In
distributed routing, other nodes in the network also play a part in determining the
next forwarding node. Hierarchical routing groups nodes into clusters and performs
source routing between clusters. To preserve QoS, a broken route can be repaired
or an alternate candidate route chosen. Redundant routes are used to reduce the
likelihood of QoS violation.
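As a rough illustration of source routing under a QoS constraint, the Python sketch below lets the source search its stored global state for the least-delay route whose links all offer enough available bandwidth. The function name, the link representation and the choice of delay as the cost to minimize are assumptions made for this example; they are not taken from any particular QoS routing protocol.

```python
import heapq

def qos_source_route(links, source, dest, min_bandwidth):
    """Least-delay route that uses only links with enough available bandwidth.

    links maps a node to a list of (neighbor, delay, available_bandwidth)
    tuples and stands in for the global state stored at the source.
    Returns (path, total_delay), or None if no feasible route exists.
    """
    best = {source: 0}
    prev = {}
    queue = [(0, source)]
    while queue:
        delay, node = heapq.heappop(queue)
        if node == dest:
            # Rebuild the route by walking the predecessor links backwards.
            path = [dest]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return path[::-1], delay
        if delay > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, link_delay, bandwidth in links.get(node, []):
            if bandwidth < min_bandwidth:
                continue  # this link cannot carry the flow
            new_delay = delay + link_delay
            if new_delay < best.get(neighbor, float("inf")):
                best[neighbor] = new_delay
                prev[neighbor] = node
                heapq.heappush(queue, (new_delay, neighbor))
    return None

# Hypothetical four-node topology: the route via A is faster but narrower.
LINKS = {
    "S": [("A", 1, 5), ("B", 2, 10)],
    "A": [("D", 1, 5)],
    "B": [("D", 2, 10)],
}
narrow_flow = qos_source_route(LINKS, "S", "D", min_bandwidth=4)
wide_flow = qos_source_route(LINKS, "S", "D", min_bandwidth=8)
too_wide = qos_source_route(LINKS, "S", "D", min_bandwidth=20)
print(narrow_flow, wide_flow, too_wide)
```

In this sketch a flow that needs little bandwidth takes the faster route through A, a wider flow is pushed onto the slower route through B, and a flow that no link can carry is rejected. Stale or inaccurate global state would make the chosen route infeasible in practice, which is the sensitivity the text describes.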
1.2 Game Theory
Game theory [51] uses mathematics to express the phenomena of decision making
among more than one agent. The earliest known formal game-theoretic analysis
was by Antoine Cournot in 1838, in which duopoly was studied. Emile Borel
suggested a formal theory of games in 1921, which was furthered by John von
Neumann in 1928. John von Neumann and Oskar Morgenstern together wrote the
monumental volume “Theory of Games and Economic Behavior”, which
established game theory as a field and provided essential terminology and problem
definitions that are still used today. In 1950, John Nash explored the concept of
non-cooperative games and demonstrated that finite games always have an equilibrium
(possibly in mixed strategies) at
which no player can choose an action that is better for them given their opponents'
choices. In 1994, John Nash, John Harsanyi and Reinhard Selten received the
Nobel Prize in economics for work in this area.

There are three main models or forms in the study of games – the strategic form,
the extensive form and the coalition form. The strategic form game or normal
form game models simultaneous decision making. The extensive form game
models sequential decision making. The extensive form game is further divided
into games with perfect information and imperfect information. In the case when
the players know all past moves, the game is said to have perfect information, and
when only partial information is available it is said to have imperfect information.
The strategic and extensive form games are often referred to as non-cooperative
games as decisions are taken autonomously by an individual player. In contrast,
coalition games or cooperative games model the tendency for players to form
coalitions to favor common interests. To limit the scope of our work, coalition
games will not be studied.
1.2.1 Strategic Games
In a strategic game of more than one player, each player is associated with a set of
actions. The combination of actions among players forms an action profile which
is also an outcome of the game. Each player may take more than one action,
forming a set of possible outcomes. Every player has a preference of an outcome
over the others. It is often, however, mathematically convenient to map the
player's preference to a numeric value referred to as the player's payoff or utility
from the outcome.
A simple finite two-person strategic game is usually denoted in a table format
(Table 1). One player's actions are listed in each cell of the first row of the table
and the other player's actions are similarly listed in the first column of the table.
The rest of the cells contain vectors of the two players' payoffs, with the first
element belonging to the row player.
                 Prisoner 2
                 C      D
Prisoner 1   C   2,2    0,3
             D   3,0    1,1
Table 1 Prisoners' Dilemma
The solutions assume that players are rational and that actions are simultaneous.
Hence each player knows her own and her opponents' available actions
and respective payoffs, but not the action that her opponents actually choose, until
the game ends. There are three main solution concepts – elimination of dominated
strategies, Nash equilibrium and mixed strategy equilibrium.
The elimination of dominated strategies is applicable to games where there exists
a strategy that is always superior to all other strategies regardless of the opponent's
strategies. The strategy is then said to strictly dominate the other strategies.
Rational players never play strictly dominated strategies. Using the Prisoners'
Dilemma as an example (Table 1), if prisoner 1 is going to defect (D), the other
player is better off playing D. If instead prisoner 1 is going to cooperate (C), it is
still better for the other player to defect. By symmetry, the same holds for prisoner 1.
Hence by elimination of the dominated strategy C, we obtain the outcome 〈D,D〉. In a
game where more actions are available, the process can be repeated and is referred
to as the iterated elimination of strictly dominated strategies.
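The elimination procedure can be sketched in a few lines of Python. The payoffs below assume the usual Prisoners' Dilemma ordering implied by the discussion (2,2 for mutual cooperation, 1,1 for mutual defection, and 3 for a lone defector against 0 for the cooperator). The function names are hypothetical and chosen only for this illustration.

```python
# Assumed payoffs: (row player's payoff, column player's payoff),
# keyed by (row action, column action).
PAYOFFS = {
    ("C", "C"): (2, 2), ("C", "D"): (0, 3),
    ("D", "C"): (3, 0), ("D", "D"): (1, 1),
}

def payoff(player, own, other):
    """Payoff to `player` (0 = row, 1 = column) from playing `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]

def strictly_dominated(player, action, own_actions, other_actions):
    """True if another remaining action beats `action` against every opponent action."""
    return any(
        all(payoff(player, alt, b) > payoff(player, action, b) for b in other_actions)
        for alt in own_actions if alt != action
    )

def iterated_elimination(row_actions, col_actions):
    """Remove strictly dominated actions for both players until none remain."""
    rows, cols = list(row_actions), list(col_actions)
    changed = True
    while changed:
        changed = False
        for a in list(rows):
            if strictly_dominated(0, a, rows, cols):
                rows.remove(a)
                changed = True
        for b in list(cols):
            if strictly_dominated(1, b, cols, rows):
                cols.remove(b)
                changed = True
    return rows, cols

surviving = iterated_elimination(["C", "D"], ["C", "D"])
print(surviving)
```

Under these assumed payoffs only D survives for each prisoner, which matches the outcome 〈D,D〉 derived in the text. In games where no action is strictly dominated the function returns the action sets unchanged, which is why this solution concept often gives no prediction.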
Elimination of dominated strategies is a weaker solution concept than the well-known
Nash equilibrium, as it does not provide a solution most of the time. A Nash
equilibrium can be applied to a much broader class of games and is defined as an
action profile from which no player can profitably deviate. Using the Prisoners'
Dilemma again as an example (Table 1), 〈C,C〉 is not a Nash equilibrium because
player 1 would tend to deviate to D. Neither is 〈D,C〉 an equilibrium because player 2
will tend to choose D. Following the same arguments for all other action profiles,
〈D,D〉 remains as the unique Nash equilibrium of the game. Note that a game can have
zero or more Nash equilibria in pure strategies. In the Battle of the Sexes game
(Table 2), two people wish to go out together but have conflicting interests. The
game has two Nash equilibria, 〈Football, Football〉 and 〈Opera, Opera〉.
                 Female
                 Football   Opera
Male  Football   2,1        0,0
      Opera      0,0        1,2
Table 2 Battle of the Sexes
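The definition of a Nash equilibrium can be checked mechanically by testing every action profile for profitable unilateral deviations. The Python sketch below does this for both example games. It assumes the male player is the row player of Table 2 and uses the usual Prisoners' Dilemma payoff ordering (2,2 for mutual cooperation, 1,1 for mutual defection). The function name is hypothetical.

```python
from itertools import product

def pure_nash_equilibria(payoffs, row_actions, col_actions):
    """Return all action profiles from which neither player can profitably deviate."""
    equilibria = []
    for r, c in product(row_actions, col_actions):
        u_row, u_col = payoffs[(r, c)]
        # The row player deviates along the column c, the column player along the row r.
        row_ok = all(payoffs[(alt, c)][0] <= u_row for alt in row_actions)
        col_ok = all(payoffs[(r, alt)][1] <= u_col for alt in col_actions)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

# Assumed payoffs: (row player's payoff, column player's payoff).
PRISONERS_DILEMMA = {
    ("C", "C"): (2, 2), ("C", "D"): (0, 3),
    ("D", "C"): (3, 0), ("D", "D"): (1, 1),
}
BATTLE_OF_SEXES = {
    ("Football", "Football"): (2, 1), ("Football", "Opera"): (0, 0),
    ("Opera", "Football"): (0, 0), ("Opera", "Opera"): (1, 2),
}

pd_equilibria = pure_nash_equilibria(PRISONERS_DILEMMA, ["C", "D"], ["C", "D"])
bos_equilibria = pure_nash_equilibria(
    BATTLE_OF_SEXES, ["Football", "Opera"], ["Football", "Opera"]
)
print(pd_equilibria)
print(bos_equilibria)
```

With these payoffs the check finds the single profile 〈D,D〉 for the Prisoners' Dilemma and the two profiles 〈Football, Football〉 and 〈Opera, Opera〉 for Battle of the Sexes.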
Finally, a mixed strategy models the steady state of a game in which each player's
decision is probabilistic. When mixed strategies are allowed, a finite strategic game
always has a Nash equilibrium. For the Battle of the Sexes game, the mixed
strategy equilibrium occurs when every action in a player's mixed equilibrium
strategy yields the same expected payoff. The resultant mixed strategy Nash
equilibrium assigns the probabilities 〈2/3, 1/3〉 to 〈Football, Opera〉 for the male
player and 〈1/3, 2/3〉 for the female player.
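The indifference condition behind the mixed strategy equilibrium can be worked through numerically. The Python sketch below solves it for a 2x2 game with exact fractions. It assumes the male player is the row player of Table 2 and that a fully mixed equilibrium exists (the denominators are nonzero); the function name is hypothetical.

```python
from fractions import Fraction

def mixed_equilibrium_2x2(payoffs, row_actions, col_actions):
    """Fully mixed equilibrium of a 2x2 game from the indifference conditions.

    Returns (p, q), where p is the probability that the row player plays
    row_actions[0] and q the probability that the column player plays
    col_actions[0]. Assumes an interior solution exists.
    """
    r0, r1 = row_actions
    c0, c1 = col_actions

    def a(r, c):  # row player's payoff
        return Fraction(payoffs[(r, c)][0])

    def b(r, c):  # column player's payoff
        return Fraction(payoffs[(r, c)][1])

    # p makes the column player indifferent between c0 and c1:
    # p*b(r0,c0) + (1-p)*b(r1,c0) = p*b(r0,c1) + (1-p)*b(r1,c1)
    p = (b(r1, c1) - b(r1, c0)) / (b(r0, c0) - b(r1, c0) - b(r0, c1) + b(r1, c1))
    # q makes the row player indifferent between r0 and r1:
    # q*a(r0,c0) + (1-q)*a(r0,c1) = q*a(r1,c0) + (1-q)*a(r1,c1)
    q = (a(r1, c1) - a(r0, c1)) / (a(r0, c0) - a(r0, c1) - a(r1, c0) + a(r1, c1))
    return p, q

# Assumed payoffs: (male payoff, female payoff), male as the row player.
BATTLE_OF_SEXES = {
    ("Football", "Football"): (2, 1), ("Football", "Opera"): (0, 0),
    ("Opera", "Football"): (0, 0), ("Opera", "Opera"): (1, 2),
}

p_male_football, q_female_football = mixed_equilibrium_2x2(
    BATTLE_OF_SEXES, ["Football", "Opera"], ["Football", "Opera"]
)
print(p_male_football, q_female_football)
```

Under these assumptions the male player chooses Football with probability 2/3 and the female player with probability 1/3, which agrees with the strategies 〈2/3, 1/3〉 and 〈1/3, 2/3〉 quoted in the text. Each player's mixing probability is determined by the opponent's payoffs, not by her own.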
Although strategic game models assume simultaneous decision making, they
need not be restricted to decisions that are executed at the same instant; they apply
whenever the time and order of events have no effect on the strategies and outcomes
of the game.
1.2.2 Extensive Games
In an extensive game, sequentiality of actions is important. A sequence of actions
taken by the players is defined as a history and different possible sequences of
actions form a set of histories. A history can be terminal or non-terminal. For a
non-terminal history, the player function defines the next player to act after that
history. For terminal histories, players' preferences are defined. An infinite history
is also considered terminal. As in the case of a strategic game, preferences over
terminal histories may also be mapped to a utility or payoff.
A convenient way to represent an extensive game is in a tree structure (Figure 1).
The small disc on the top represents the initial history. The number beside the disc
represents the player to make the move after that history. In this case, the first
player is player 1. The two lines extending from the initial history are the actions
available after that history and are labeled beside the lines. The lines lead to two
more discs with one indicating that the next player to move is player 2. The other
disc is a terminal history with the payoffs indicated below it.
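[Figure: game tree with player labels 1 and 2, action labels A, B, L and R, and terminal payoffs 0,0, 1,2 and 2,1]
Figure 1 Two player extensive game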
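The components of an extensive game described above (histories, the player function and payoffs at terminal histories) map naturally onto a small data structure. The Python sketch below encodes a hypothetical two-player tree of the kind described: player 1 chooses A or B, one choice ends the game and the other hands the move to player 2, who chooses L or R. Which action leads to which branch, and which payoff sits at which leaf, are assumptions made for illustration only.

```python
# Terminal histories and their payoffs (player 1, player 2).
# The assignment of payoffs to branches is hypothetical.
TERMINAL_PAYOFFS = {
    ("A", "L"): (1, 2),
    ("A", "R"): (2, 1),
    ("B",): (0, 0),
}

# Player function: who moves after each non-terminal history.
PLAYER_FUNCTION = {
    (): 1,       # initial history: player 1 moves
    ("A",): 2,   # after A: player 2 moves
}

def is_terminal(history):
    """A history is terminal when payoffs are defined for it."""
    return history in TERMINAL_PAYOFFS

def actions_after(history):
    """Actions available after a history, read off the terminal histories."""
    n = len(history)
    return sorted(
        {h[n] for h in TERMINAL_PAYOFFS if len(h) > n and h[:n] == history}
    )

def describe(history=()):
    """List every history reachable from `history` with its mover or payoffs."""
    if is_terminal(history):
        return [(history, "payoffs", TERMINAL_PAYOFFS[history])]
    lines = [(history, "player", PLAYER_FUNCTION[history])]
    for action in actions_after(history):
        lines.extend(describe(history + (action,)))
    return lines

for entry in describe():
    print(entry)
```

Representing histories as tuples of actions keeps the initial history as the empty tuple and makes the set of histories implicit in the terminal ones, so only the player function and the payoffs need to be stored.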