Скачиваний:
17
Добавлен:
02.04.2015
Размер:
9.02 Mб
Скачать

206

9. Coexisting Wireless LANs as Multi Stage Game

Prob(delay>x) (CDF)

100

 

player 1 achieves

 

 

 

 

pl 1

 

 

 

minimum delays,

pl 2

 

 

 

player 2 achieves

EDCF

 

 

 

delays as in NE

 

 

 

 

 

before (best)

 

10−1

before

 

Θ1

= 0.50

 

 

 

obs

 

 

 

(persist)

 

Θ2

= 0.49

 

 

 

 

 

 

 

 

obs

 

 

 

 

 

EDCF share = 0.00

 

10−2

2

4

6

8

10

0

x = resource allocation delay (TXOP delay) [ms]

shares (thrp.) and payoffs per player

1 highest payoffs

in cooperation

0.8cooperation is not beneficial here

share pl.1 0.6 share pl.2

share EDCF payoff pl.1

payoff pl.2

0.4

both achieve the same thrp.

(pl.2 not visible)

0.2

no EDCF because of lower priority

0

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.1

req. Θ per player

Figure 9.12: Probabilities of delay of resource allocation attempts, TXOP delay (left), and throughput and payoff results for varying requirements (right).

9.3Dynamic Strategies

The analysis in the last section has shown that there are games in the CCHC coexistence scenario where one player requires the cooperation of the opponent player, to improve the achievable payoff for this player. Further, there are games in the CCHC coexistence scenario where both players benefit from cooperation. What has to be solved for such games are mechanisms to establish cooperation, once a player has concluded that cooperation will beneficial for all involved players. To establish cooperation, both players are required to cross the Pareto boundary. To motivate an opponent player to cooperate, punishments can be used as part of dynamic strategies. Dynamic strategies, where a player adaptively changes its behavior during the course of the MSG are discussed in the following. Dynamic strategies are also referred to as trigger strategies (Osborne and Rubinstein, 1994; Berlemann, 2003).

It is analyzed in the following under which condition deviating from cooperation will result in a higher payoff than not deviating, under the assumption that deviation will be punished as part of the dynamic strategy.

In the rest of this chapter, a scenario is assumed where cooperation is beneficial for all involved players.

9.3.1Cooperation and Punishment

Players adopt their behavior for different purposes. In an MSG, the rational behind the decisions taken at the stages about what behavior to select is not the payoff per stage, but the discounted MSG payoff for the current and any following stages. It depends on the discounting factor if a player considers future stages, or if it attempts to optimize the instantaneous payoff of this stage.

9.3 Dynamic Strategies

207

Any behavior other than cooperation (denoted as BEH-C) is in the following interpreted as a defection. Defection may be the selection of punishments by playing BEH-D, but also the selection of BEH-B, the best response behavior. Playing the best response when the opponent is cooperating is a deviation from cooperation that is maximizing the payoff, but at the cost of the observed payoff of the opponent. Hence, it is interpreted as defection in the following.

It was defined in Section 9.1.2 that the MSG payoff is calculated with the help of the discounting factor, given by

VMSGi = δnV i(n) . n =0

Any player i cooperates during the MSG if it expects to observe a higher payoff with cooperation than without cooperation:

VMSGi ( STRATC ) VMSGi ( any other strategy )

STRATC \i any other strategy, i Ν.

9.3.2Condition for Cooperation

Let VCCi be the payoff player i observes at any stage when both interacting players cooperate, and let VCDi be the payoff when player i cooperates and at the same time player -i defects. Further, let VDCi be the payoff when only the opponent player cooperates, and VDDi be the payoff when neither player cooperates.

At any stage n, player i can expect the following discounted MSG payoff if it continues to cooperate, assuming that the opponent player will not deviate from cooperation:

VMSGi = (δi )k VCCi .

k =n

This payoff gain must outweigh the payoff gain that results from a one-time defection, including the expected payoffs that follow when the opponent punishes this defection. As part of a dynamic strategy that seeks to establish cooperation, the payoff gain of defecting must be compensated through punishment over one or several following stages.

At any stage n, player i can expect the following discounted MSG payoff if it unilaterally deviates from cooperation to defection, assuming that the opponent player will punish this defection for the n’ following stages:

208

 

 

9. Coexisting Wireless LANs as Multi Stage Game

VMSGi

= VDCi

n +n' +1

(δi )k VCDi +

(δi )k VCCi .

+

 

one time

k=n +1

 

k =n +n' +1

 

 

defection

punishment

return to cooperation

Comparing the last two equations, it can be defined under which condition player i does not deviate from established cooperation. The condition is

(δi )k VCCi

> VDCi

n +n' +1

(δi )k VCDi

(δi )k VCCi ,

 

+

 

+

 

k=n

 

 

 

k=n +1

 

 

k=n +n' +1

 

 

which can be rewritten as

 

 

 

 

 

 

 

 

 

n +n' +1

(δi )k VCCi

 

 

VDCi

n +n' +1

(δi )k VCDi .

 

 

 

>

+

(9.6)

 

k =n

 

 

 

 

 

k =n +1

 

 

 

The worst-case punishment is a game wide punishment, i.e., n' → ∞, which would for example mean the end of cooperation because of the deviation. For a game wide punishment, Equation (9.6) can be solved to

V i

 

 

1

>

V i

+V i

 

 

δi

.

 

δi

 

δi

CC

1

 

DC

CD

1

 

With this condition, it is visible that it depends on the discounting factor if deviation from cooperation at one stage is preferred over cooperation, regardless of how long the punishment will take place. In conclusion, the condition that cooperation is preferred over defection is given by

V i

V i

 

δi >

CC

DC

.

(9.7)

 

 

V i

V i

 

 

CD

DC

 

9.3.3Experimental Results

In the following, a dynamic strategy that allows players to achieve cooperation under some conditions is discussed as an example. Let player -i play according to the dynamic strategy STRAT-CnP, which is illustrated as machine in Figure 9.13. Player i plays cooperates at any time (STRAT-C). At a stage n, it considers deviating to the best response for a single stage. The question to be answered is if it is worth to continue the cooperation or not.

Figure 9.14 shows the results obtained from the analysis as described here, as well as taken from simulation. A typical CCHC coexistence scenario is analyzed. Results are given for different discounting factors, i.e., different QoS requirements. The MSG outcomes of player i are shown in the figure.

9.3 Dynamic Strategies

209

otherwise

n=n0

 

COOPERATE:

 

 

PUNISH(1):

 

BEH-C

 

 

BEH-D

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

opponent

 

 

 

defects

for a number of stages, depending on discouning factor

any behavior

any behavior

of opponent

of opponent

PUNISH(n’):

BEH-D

Figure 9.13: Strategy STRAT-CnP as machine. A player initially cooperates (state “COOPERATE”), and punishes for a number of stages (states PUNISH(1)...PUNISH(n’)) upon defection of the opponent player.

All results are normalized to the cooperation payoff VCOOPi ( δi ) . The lines mark the expected payoffs, determined at stage n when the player’s decision which

behavior to select is taken. Player i deviates for a single stage and is consequently punished by the opponent player -i. Depending on the intensity of the punishment, i.e. the number of stages, the discounted MSG payoff for player -i from this single deviation is higher than the payoff that results when continuing cooperation, as can be seen in all four subfigures. If the cooperation payoff that is in

the figure denoted as VCOOPi ( δi ) , is higher than the deviation payoff, cooperation can be established.

For low values of δi , i.e., if the future is not very much considered compared to the current stage, the player i gives the short-term payoff gain through the deviation at this stage a higher value than the long-term payoff from cooperation, see Figure 9.14(d). In this case, cooperation cannot be established.

In this example, the player prefers cooperation to defection for discounting factors that are larger than 0.7 . This is indicated in Figure 9.14(a-c). The larger the discounting factor, the smaller the number of stages during which punishment takes place, until the single deviation from cooperation will be compensated.

9.3.4Conclusion

Cooperation is more likely to be established, and maintained, if players consider the outcomes of future stages, when taking the decision about what behaviors to select. This confirms the empirical study in Axelrod (1984). As a consequence for the CCHC coexistence, cooperation is indeed a realistic assumption likely to be achieved in scenarios where the offered traffic is high, because of the CCHC characteristics. CCHCs will operate with larger discounting factors when they attempt to support multiple different MACs.

210

9. Coexisting Wireless LANs as Multi Stage Game

 

1.5

 

 

 

 

 

 

δi

= 1

 

1.5

player i

1

 

 

 

 

 

 

VCOOPi

i=1)

player i

1

of

 

 

 

 

 

 

 

 

of

i MSG

 

 

 

 

 

 

 

 

i MSG

V

 

 

 

 

 

 

 

 

 

V

 

 

0.5

 

 

 

 

 

 

 

 

 

0.5

 

1

2

3

4

5

6

7

8

9

10

 

 

 

stages of punishment through player −i

 

 

 

 

 

 

 

 

 

 

δi

= 0.8

 

VCOOPi

i=0.8)

 

 

 

 

 

 

 

 

1

2

3

4

5

6

7

8

9

10

 

stages of punishment through player −i

 

(a) player i takes future stages into account

(b) player i takes future stages into some ac-

 

 

 

 

 

(δi =1)

 

 

 

 

 

 

 

 

 

count (δi = 0.8)

 

 

 

 

 

1.5

 

 

 

 

 

 

 

δi

= 0.75

 

 

1.5

 

 

 

 

 

 

 

δi

= 0.6

 

of player i

1

VCOOPi

i=0.75)

 

 

 

 

 

 

 

of player i

1

VCOOPi

i=0.6)

 

 

 

 

 

 

 

i MSG

 

 

 

 

 

 

 

i MSG

 

 

 

 

 

 

 

V

 

 

 

 

 

 

 

 

 

 

 

V

 

 

 

 

 

 

 

 

 

 

 

 

0.5

1

2

3

4

5

6

7

8

9

10

 

0.5

1

2

3

4

5

6

7

8

9

10

 

 

 

 

 

 

 

stages of punishment through player −i

 

 

 

 

stages of punishment through player −i

 

(c) future is less important for player i

(d) mainly the current, instantaneous QoS is

(δi = 0.75 )

important to player i (δi = 0.6 )

lines: analytical results; markers: YouShi simulation results

Figure 9.14: MSG payoffs of player i, normalized to the payoff in cooperation. Results are given for various discounting factors. Player i deviates for a single stage and is consequently punished by the opponent. If the MSG payoff in cooperation is larger than the MSG payoff when deviating, cooperation can be established.