8. The Superframe as Single Stage Game

is larger than what is achieved through rational behavior, i.e., in Nash equilibria, are considered as being more efficient in the sense of total utility. It is said that the interacting players achieve an improved social outcome of the game with such action profiles. In other words, payoffs outside the Pareto boundary form a subset of the payoffs with improved social outcome, given by the sum of all individual outcomes.
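The two notions used here, the social outcome as the sum of all individual payoffs and an improvement in which no player loses, can be illustrated with a minimal sketch. The function names and the numeric payoff pairs are assumptions chosen for illustration only; they are not results of the analysis.

```python
from typing import Sequence


def social_outcome(payoffs: Sequence[float]) -> float:
    """Social outcome of one action profile: the sum of all individual payoffs."""
    return sum(payoffs)


def pareto_improves(new: Sequence[float], old: Sequence[float]) -> bool:
    """True if no player is worse off and at least one player is strictly better off."""
    no_one_loses = all(n >= o for n, o in zip(new, old))
    someone_gains = any(n > o for n, o in zip(new, old))
    return no_one_loses and someone_gains


# Hypothetical payoff pairs (player i, player -i), not taken from the thesis.
nash = (0.4, 0.4)
coordinated = (0.6, 0.5)

better_total = social_outcome(coordinated) > social_outcome(nash)
better_for_all = pareto_improves(coordinated, nash)
```

A profile such as (0.7, 0.3) would raise the social outcome relative to (0.4, 0.4) but would not pass `pareto_improves`, because the second player loses.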

8.3.2 Core Behaviors

As a result of the analysis of the SSG, the core behaviors are defined in this section. They capture all relevant aspects of the decision-making processes identified so far. They allow strategies to be defined and repeated SSGs to be applied, with dynamically changing behaviors as part of the strategies.

8.3.2.1 Simple Core Behavior “Persist” (BEH-P)

A player i that behaves according to BEH-P will always select what is required:

 

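$$
\begin{pmatrix}\Theta^{i}_{dem}\\ \Delta^{i}_{dem}\end{pmatrix} := \begin{pmatrix}\Theta^{i}_{req}\\ \Delta^{i}_{req}\end{pmatrix}
$$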

This behavior achieves the highest payoffs in the SSG if the opponent player -i does not require any radio resources, for example in an isolated QBSS. However, if all players behave according to BEH-P, the resulting payoffs are generally very low because the resource allocation attempts are uncoordinated.

8.3.2.2 Rational Core Behavior “BestResponse” (BEH-B)

A player i that behaves according to BEH-B will always select what achieves the highest payoff in the SSG, considering its expectations of the action of its opponent.

 

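$$
\begin{pmatrix}\Theta^{i}_{dem}\\ \Delta^{i}_{dem}\end{pmatrix} := \text{best response to } \left[\begin{pmatrix}\Theta^{i}_{req}\\ \Delta^{i}_{req}\end{pmatrix},\ \begin{pmatrix}\tilde{\Theta}^{-i}_{dem}\\ \tilde{\Delta}^{-i}_{dem}\end{pmatrix}\right]
$$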

This behavior is referred to as rational behavior. Player i achieves payoffs that can be sustained as a result of the SSG if the opponent player -i also behaves rationally. Depending on the requirements of the players, the resulting payoffs may not be Pareto efficient.

8.3 Pareto Efficiency Analysis, and Behaviors


8.3.2.3 Cooperative Core Behavior “Coop” (BEH-C)

A player i that behaves according to BEH-C attempts to cross the Pareto boundary by unilaterally deviating from the best response to an action that allows the opponent player -i to better meet its individual requirements, without actually knowing the requirements of the opponent player. If the opponent also deviates from its own best response, all players can gain from this coordinated deviation in games where the Nash equilibrium is not Pareto efficient.

In order to define the behavior BEH-C, simulation campaigns are used to analyze which types of deviation from BEH-B are beneficial for an opponent player and which have negative effects. A deviation can be an increase or a decrease of the demanded share of capacity $\Theta_{dem}$, or an increase or a decrease of the demanded resource allocation interval $\Delta_{dem}$.

The results of the analysis are given in Table 8.1 and Table 8.2. Table 8.1 shows the results of the analytical approximation, whereas Table 8.2 shows the results of stochastic simulation. Instead of detailed results, the tables give the relative changes in payoff: “+” if the payoff increases as a result of the deviation, “0” if it remains constant, and “–” if it decreases. Four cases are shown in four rows: player i has different requirements relative to the opponent player -i (rows 1 and 2 vs. rows 3 and 4), and player -i takes actions according to either BEH-P (rows 1 and 3) or BEH-B (rows 2 and 4).

From Table 8.1 it can be concluded that increasing the demanded resource allocation interval $\Delta_{dem}$, while keeping the demanded share of capacity $\Theta_{dem}$ constant, is not cooperative: in all four cases it leaves the payoff of player i unchanged and reduces the resulting payoff of the opponent player.

In contrast, reducing the demanded resource allocation interval $\Delta_{dem}$ is beneficial for the opponent player. Increasing the demanded share of capacity $\Theta_{dem}$, while keeping the demanded resource allocation interval $\Delta_{dem}$ constant, has negative implications for the resulting payoff of the opponent. On the other hand, decreasing the demanded share of capacity $\Theta_{dem}$ is clearly beneficial for the opponent player in every case.

These results are confirmed by the simulation results shown in Table 8.2, with some differences when player i reduces its demanded share of capacity (right column). In this case, the simulations indicate that player i itself may observe smaller payoffs when deviating. This is not captured by the analytical approximation in Table 8.1 because of its simplified model of collisions of resource allocation attempts.


As a result, cooperative behavior is defined as reducing the demanded share of capacity and, at the same time, reducing the demanded resource allocation interval. The two measures allow an opponent player to allocate resources more often at the demanded points in time. When player i cooperates, its resource allocations are shorter because of the smaller resource allocation interval. The opponent player -i therefore waits a shorter time for the channel to become idle whenever player i has allocated the resources first. This is clearly beneficial for player -i and can be referred to as cooperative behavior of player i. In the game context, cooperation means that the system of interacting players crosses the Pareto boundary from a Nash equilibrium towards an operation point that achieves higher payoffs for at least one player and no payoff reduction for any player. The following expression indicates that the demands selected in cooperation result from a deviation from the rational behavior BEH-B:

 

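$$
\begin{pmatrix}\Theta^{i*}_{dem}\\ \Delta^{i*}_{dem}\end{pmatrix} \rightarrow \begin{pmatrix}\Theta^{i}_{dem,C} \le \Theta^{i*}_{dem}\\ \Delta^{i}_{dem,C} \le \Delta^{i*}_{dem}\end{pmatrix},
$$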

 

where the index “C” indicates the demand selected by a player in cooperation. As a preliminary definition of the limits to which a player deviates when cooperating, $\Delta^{i}_{dem,C}$ and $\Theta^{i}_{dem,C}$ are defined in this thesis by

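$$
\Theta^{i}_{dem,C} = \min\left(\Theta^{i*}_{dem},\ \frac{1}{N}\right), \qquad \tilde{\Delta}^{-i}_{dem} \le \Delta^{i}_{dem,C} \le \Delta^{i*}_{dem},
$$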

 

 

N being the number of interacting players in the game. The expression indicates that the demand of the opponent player is estimated by each player; players therefore adapt to the estimate $\tilde{\Delta}^{-i}_{dem}$ instead of the actual $\Delta^{-i}_{dem}$. When cooperating, a player demands a maximum share of capacity of $\Theta^{i}_{dem,C} = 1/N = 0.5$, which can be interpreted as a fair share when two players interact. Further, the demanded resource allocation interval in cooperation, $\Delta^{i}_{dem,C}$, is decreased until it reaches the value of the opponent’s resource allocation interval. This simple definition implies that, in times of high offered traffic, players give their opponents a fair chance to allocate resources regularly. There are games where cooperation is not beneficial for the opponent, especially games where the opponent already achieves its maximum payoff by playing rationally, that is, when the Nash equilibrium is Pareto efficient. In such a game, when player i does not meet its requirements in Nash equilibrium, player i is referred to as being weaker than player -i.
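The cooperation limits can be sketched as a small function. This is a minimal illustration under stated assumptions: the function and parameter names are hypothetical, the player is assumed to reduce its interval all the way to the estimated opponent interval, and when that estimate exceeds the player's own best-response interval the interval is simply left unchanged (a case the definition above does not cover). The numeric inputs are illustrative only.

```python
from typing import Tuple


def cooperative_demand(theta_best: float,
                       delta_best: float,
                       delta_opp_est: float,
                       n_players: int = 2) -> Tuple[float, float]:
    """Return (Theta_dem_C, Delta_dem_C) for a cooperating player.

    theta_best, delta_best: best-response demand of the player (BEH-B).
    delta_opp_est: estimated resource allocation interval of the opponent.
    """
    if n_players < 1:
        raise ValueError("n_players must be at least 1")
    # Share of capacity: never more than the fair share 1/N.
    theta_c = min(theta_best, 1.0 / n_players)
    # Interval: reduced down to the opponent's estimated interval, never increased
    # (assumption: full reduction to the lower limit).
    delta_c = min(delta_best, delta_opp_est)
    return theta_c, delta_c


# Two players; hypothetical best response (0.7, 0.05), opponent estimate 0.02.
demand = cooperative_demand(0.7, 0.05, 0.02)  # (0.5, 0.02)
```

A player whose best-response share is already below 1/N keeps that share, so cooperation in this sketch only ever lowers a demand.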


Table 8.1: Deviating behaviors – resulting pairs of payoffs per players i, -i as taken from the analytic approximation. Player i deviates the demands from its requirements by increasing or decreasing ∆ or Θ. The opponent player -i plays BEH-P or BEH-B.

| case | relative requirements | behavior of pl. -i | pl. i increases ∆ | pl. i reduces ∆ | pl. i increases Θ | pl. i reduces Θ |
|---|---|---|---|---|---|---|
| 1 | < ∆req | BEH-P | 0 / – | 0 / + | 0 / – | 0 / + |
| 2 | < ∆req | BEH-B | 0 / – | 0 / + | 0 / – | 0 / + |
| 3 | > ∆req | BEH-P | 0 / – | + / 0 | 0 / – | 0 / + |
| 4 | > ∆req | BEH-B | 0 / – | + / + | 0 / – | 0 / + |

Explanation:

| entry | meaning | entry | meaning |
|---|---|---|---|
| 0 / … | pl. i observes the same payoff | … / 0 | opp. pl. -i observes the same payoff |
| – / … | pl. i observes smaller payoff | … / – | opp. pl. -i observes smaller payoff |
| + / … | pl. i observes increased payoff | … / + | opp. pl. -i observes increased payoff |

For example, “– / +” means “pl. i observes a smaller payoff than before and player -i observes a higher payoff”.

A weak player cannot gain from playing cooperatively. There are measures for a weak player i based on the behavior “defect” that will enable it to force the opponent player -i to cooperate, as part of dynamic strategies. The behavior “defect” is defined in the next section.

8.3.2.4 Punishing Core Behavior “Defect” (BEH-D)

A player i that behaves according to BEH-D will always select the demand that is most damaging to the payoffs of the opponent player -i, according to the analysis shown in Table 8.1 and Table 8.2:

 

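$$
\begin{pmatrix}\Theta^{i}_{dem}\\ \Delta^{i}_{dem}\end{pmatrix} := \begin{pmatrix}\Theta^{i}_{max}\\ \Delta^{i}_{max}\end{pmatrix} = \begin{pmatrix}1\\ 0.1\end{pmatrix}
$$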

 

This behavior is likely to destroy any attempt of the opponent player to achieve some payoff in the SSG.
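How the behaviors BEH-P, BEH-B, and BEH-D map requirements to demands can be summarized in a short sketch. The type and function names are hypothetical, and the best response is passed in as a callable because its computation depends on the payoff model, which is not reproduced here; `toy_best_response` is only a stand-in. BEH-C is left out because its limits are defined separately in Section 8.3.2.3.

```python
from enum import Enum
from typing import Callable, Tuple

Demand = Tuple[float, float]  # (Theta, Delta)


class Behavior(Enum):
    PERSIST = "BEH-P"
    BEST_RESPONSE = "BEH-B"
    DEFECT = "BEH-D"


# (Theta_max, Delta_max) = (1, 0.1), as given in the definition of BEH-D.
DEFECT_DEMAND: Demand = (1.0, 0.1)


def select_demand(behavior: Behavior,
                  requirement: Demand,
                  opponent_estimate: Demand,
                  best_response: Callable[[Demand, Demand], Demand]) -> Demand:
    """Pick the demand (Theta_dem, Delta_dem) of player i for one SSG."""
    if behavior is Behavior.PERSIST:
        # BEH-P: demand exactly what is required.
        return requirement
    if behavior is Behavior.BEST_RESPONSE:
        # BEH-B: payoff-maximizing demand given the expected opponent action.
        return best_response(requirement, opponent_estimate)
    if behavior is Behavior.DEFECT:
        # BEH-D: the demand most damaging to the opponent.
        return DEFECT_DEMAND
    raise ValueError(f"unknown behavior: {behavior}")


def toy_best_response(requirement: Demand, opponent_estimate: Demand) -> Demand:
    """Placeholder only (assumption): a real best response needs the payoff model."""
    return requirement
```

Keeping the best response injectable lets the same selection logic be reused with the analytic approximation or with a simulation-based payoff estimate.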

Table 8.2: Deviating behaviors – resulting utilities per players i, -i, now taken from stochastic simulation instead of the analytic approximations.

| case | relative requirements | behavior of pl. -i | pl. i increases ∆ | pl. i reduces ∆ | pl. i increases Θ | pl. i reduces Θ |
|---|---|---|---|---|---|---|
| 1 | < ∆req | BEH-P | 0 / – | 0 / + | 0 / 0 | 0 / 0 |
| 2 | < ∆req | BEH-B | 0 / – | 0 / 0 | 0 / – | 0 / 0 |
| 3 | > ∆req | BEH-P | 0 / – | 0 / + | 0 / – | – / + |
| 4 | > ∆req | BEH-B | 0 / – | 0 / + | 0 / – | – / + |

Explanation: see Table 8.1.
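The comparison of the two tables can be reproduced mechanically. The sketch below encodes the entries of Table 8.1 and Table 8.2 as data; the column keys and helper names are assumptions made for illustration, and an ASCII "-" stands in for the “–” used in the tables.

```python
from typing import Dict, List, Tuple

COLUMNS = ("increase_delta", "reduce_delta", "increase_theta", "reduce_theta")

# Entries are "payoff change of pl. i / payoff change of pl. -i", per case 1..4.
Table = Dict[int, Tuple[str, str, str, str]]

TABLE_8_1: Table = {  # analytic approximation
    1: ("0/-", "0/+", "0/-", "0/+"),
    2: ("0/-", "0/+", "0/-", "0/+"),
    3: ("0/-", "+/0", "0/-", "0/+"),
    4: ("0/-", "+/+", "0/-", "0/+"),
}

TABLE_8_2: Table = {  # stochastic simulation
    1: ("0/-", "0/+", "0/0", "0/0"),
    2: ("0/-", "0/0", "0/-", "0/0"),
    3: ("0/-", "0/+", "0/-", "-/+"),
    4: ("0/-", "0/+", "0/-", "-/+"),
}


def differing_cells(a: Table, b: Table) -> List[Tuple[int, str]]:
    """List (case, column) pairs where the two tables disagree."""
    return [(case, COLUMNS[k])
            for case in sorted(a)
            for k in range(len(COLUMNS))
            if a[case][k] != b[case][k]]


def never_hurts_opponent(table: Table, column: str) -> bool:
    """True if the opponent's payoff does not decrease in any case for this deviation."""
    k = COLUMNS.index(column)
    return all(entry[k].split("/")[1] != "-" for entry in table.values())


differences = differing_cells(TABLE_8_1, TABLE_8_2)
```

Listing `differences` shows where the simulation departs from the approximation, including the cases in which player i itself loses payoff when reducing Θ.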