The Prisoner's Dilemma


This post will review geometric series and game theory, and discuss the prisoner’s dilemma and its importance.

What Is Game Theory?

Game theory is the study of interdependent decisions. That is, “my optimal choice depends on what you choose, and your optimal choice depends on what I choose.” A stable state in which no participant can gain an advantage by changing strategy, assuming the other participants’ actions remain unchanged, is a Nash equilibrium. The Prisoner’s Dilemma is perhaps the most well-known game. It is used to describe situations where individuals pursuing their own self-interest make everybody worse off as a whole. Many game theorists have studied this dilemma to determine whether cooperation is ever possible. It is not possible in a single game, nor in repeated games with a known ending point. However, in infinitely repeated games, cooperation can emerge as a Nash equilibrium, given the right value of a discount factor.

A Review of Geometric Series

A geometric series is the sum of a sequence of numbers in which each term is the previous term multiplied by a common ratio, \(r\).

e.g. the sum of 2, 4, 8, 16, 32…

To find a formula for an infinite geometric series, let’s first look at the explicit formula of a geometric sequence.

\[ a_n = a_1 \cdot r^{n-1} \] So, writing \(a\) for the first term \(a_1\), the example above,

\[2+4+8+16+32+...\]

can be rewritten as

\[ a +ar+ar^2 + ar^3 + ar^4+... \]

Let’s assume that there is some finite end point, \(n\). Then, the sum, \(S\), would be:

\[ S = a + ar + ar^2 + ar^3 + ar^4 + ... + ar^n \]

We can multiply both sides of the equation by \(r\), to get:

\[ Sr = ar + ar^2 + ar^3 + ar^4 + ... + ar^n + ar^{n+1} \]

Then, we will subtract this second equation from the first equation to get:

\[ S - Sr = a - ar^{n+1} \]

We will factor to find \(S - Sr = S(1-r)\), and then use this to solve for \(S\),

\[ S = \frac{a - ar^{n+1}}{1-r} \]

This is the formula for the sum of a finite geometric series, valid whenever \(r \ne 1\). In an infinite geometric series, \(n \rightarrow \infty\), so we will take the limit:

\[ S = \lim_{n \rightarrow \infty} \frac{a - ar^{n+1}}{1-r} \]

where \(-1 \lt r \lt 1\). As \(n \rightarrow \infty\), \(r^{n+1} \rightarrow 0\), so our equation will be:

\[ S = \frac{a}{1-r} \]

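A geometric series converges, meaning it has a finite sum, only if \(-1\lt r \lt 1\). If \(r \le -1\) or \(r \ge 1\) (and \(a \ne 0\)), the series diverges: for \(r \ge 1\) the partial sums grow without bound, and for \(r \le -1\) they oscillate without settling on a value. Thus, this equation for the sum only holds if the series converges.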
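As a quick numerical check of the two formulas, here is a minimal Python sketch. The function names are made up for this post, and the values \(a = 1\), \(r = 0.5\) are chosen only for illustration.

```python
def finite_geometric_sum(a, r, n):
    """Closed-form sum a + a*r + ... + a*r**n (n + 1 terms)."""
    if r == 1:
        # The closed form divides by (1 - r), so handle r == 1 separately.
        return a * (n + 1)
    return (a - a * r ** (n + 1)) / (1 - r)


def infinite_geometric_sum(a, r):
    """Limit a / (1 - r), which only exists when -1 < r < 1."""
    if not -1 < r < 1:
        raise ValueError("the series diverges unless -1 < r < 1")
    return a / (1 - r)


# Direct summation of 2 + 4 + 8 + 16 + 32 agrees with the closed form.
direct = sum(2 * 2 ** k for k in range(5))
closed = finite_geometric_sum(2, 2, 4)

# For r = 0.5 the partial sums approach a / (1 - r) as n grows.
partial_sums = [finite_geometric_sum(1, 0.5, n) for n in (5, 20, 50)]
limit = infinite_geometric_sum(1, 0.5)
```

Calling `infinite_geometric_sum(2, 2)` raises an error, since the example series 2, 4, 8, … has \(r = 2\) and does not converge.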

The Prisoner’s Dilemma

Two criminals are caught attempting to rob a store. The cops try to get them to confess to the crime by interrogating them both separately. Without the confession, they only have them pinned for trespassing, but with a confession they can charge them with burglary. Let prisoner 1 be in blue and prisoner 2 in red. If one criminal turns their partner in, they will get leniency from the police on their charges. The jail time for each situation creates the 2 x 2 matrix below:
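In text form, that matrix can be sketched from the jail terms described in this section. Each cell is written as (prisoner 1, prisoner 2), and months of jail time are written as negative payoffs, a convention assumed here to match the payoffs used later in the post:

\[
\begin{array}{c|cc}
 & \text{Prisoner 2 lies} & \text{Prisoner 2 confesses} \\ \hline
\text{Prisoner 1 lies} & (-1,\,-1) & (-12,\,0) \\
\text{Prisoner 1 confesses} & (0,\,-12) & (-8,\,-8)
\end{array}
\]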

There are three possible outcomes.

  1. Both prisoners lie
  2. One criminal lies and the other confesses
  3. Both criminals confess

If both prisoners lie, they each only get 1 month of jail time for trespassing. If one criminal lies and the other confesses, the one who confessed walks free while the one who lied gets 12 months in jail. If both criminals confess, they each get 8 months in jail. When running this scenario a single time, it is always optimal for each prisoner to confess, since regardless of whether or not their partner confesses, they individually get less jail time for confessing: if the partner lies, confessing means walking free instead of serving 1 month, and if the partner confesses, it means serving 8 months instead of 12. Therefore, confessing is the strictly dominant strategy. This raises some interesting questions about whether or not cooperation is ever possible, since that would lead to the most mutually beneficial outcome. The short answer is yes. The long answer is actually explained by a geometric series.
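The dominance argument can be checked mechanically. The sketch below encodes the jail terms above as negative payoffs and compares the two actions against every choice the partner can make. The dictionary layout and the helper name `strictly_dominates` are illustrative choices, not part of the original scenario.

```python
LIE, CONFESS = "lie", "confess"

# Payoff to "me" in months of jail (negative), keyed by (my_action, partner_action).
PAYOFF = {
    (LIE, LIE): -1,
    (LIE, CONFESS): -12,
    (CONFESS, LIE): 0,
    (CONFESS, CONFESS): -8,
}


def strictly_dominates(action, other):
    """True if `action` pays strictly more than `other` against every partner action."""
    return all(
        PAYOFF[(action, partner)] > PAYOFF[(other, partner)]
        for partner in (LIE, CONFESS)
    )


confess_dominates = strictly_dominates(CONFESS, LIE)
lie_dominates = strictly_dominates(LIE, CONFESS)
```

Because the game is symmetric, the same table and the same conclusion apply to either prisoner.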

If we consider an infinite number of repetitions of this scenario, it is possible to achieve a Nash equilibrium of cooperation, in which both prisoners keep lying to the police. This can result in infinite cooperation (or a phase of cooperation followed by infinite defection).

Possible Strategies

There are many strategies the prisoners can adopt. One such strategy is known as grim trigger. Under grim trigger, a prisoner cooperates in the first round and keeps cooperating in every later round until either prisoner confesses (defects). From that point on, they confess forever.

Another common strategy is tit-for-tat. Under tit-for-tat, a prisoner cooperates in the first round and then copies the other prisoner’s last decision. We will assume the prisoners are using a grim trigger strategy for this tutorial.
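To make the difference between the two strategies concrete, here is a small Python sketch that plays each one against an opponent who defects exactly once. The opponent `defect_once`, the helper `play`, and the six-round horizon are all hypothetical and exist only for illustration.

```python
COOPERATE, DEFECT = "C", "D"


def grim_trigger(own_history, other_history):
    """Cooperate until either player has defected once, then defect forever."""
    if DEFECT in own_history or DEFECT in other_history:
        return DEFECT
    return COOPERATE


def tit_for_tat(own_history, other_history):
    """Cooperate in the first round, then copy the other player's last move."""
    if not other_history:
        return COOPERATE
    return other_history[-1]


def defect_once(round_to_defect):
    """Hypothetical opponent: cooperates except in one chosen round (0-based)."""
    def strategy(own_history, other_history):
        return DEFECT if len(own_history) == round_to_defect else COOPERATE
    return strategy


def play(strategy_1, strategy_2, rounds):
    """Run both strategies for a fixed number of rounds and return their move lists."""
    history_1, history_2 = [], []
    for _ in range(rounds):
        # Both moves are chosen from the histories before this round.
        move_1 = strategy_1(history_1, history_2)
        move_2 = strategy_2(history_2, history_1)
        history_1.append(move_1)
        history_2.append(move_2)
    return history_1, history_2


grim_moves, _ = play(grim_trigger, defect_once(2), rounds=6)
tft_moves, _ = play(tit_for_tat, defect_once(2), rounds=6)
```

After the single defection in the third round, grim trigger never cooperates again, while tit-for-tat retaliates once and then returns to cooperation.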

How Can Grim Trigger Work?

We can set up a formula to find each prisoner’s amount of jail time. Since jail time is written as a negative payoff, the optimal result is whatever is closest to zero. We will base the future values off of some discount factor, \(\delta\). A discount factor is used to “discount” the future values of a given payoff based on what they’re worth today. It ranges between 0 and 1, exclusive. A higher \(\delta\) represents more patience and a higher chance of making it into the next period. A payoff of \(x\) received in every round would then be written as:

\[ x + x\delta + x\delta^2 + x\delta^3 + ... \]

Using the infinite geometric series formula with \(a = x\) and \(r = \delta\), this can be rewritten as: \[ S = \frac{x}{1-\delta} \]

Here, \(x\) is the equilibrium payoff of \(-1\) for the cooperation phase, so our equation looks like:

\[ -1 -\delta -\delta^2 -\delta^3 - ... = \frac{-1}{1-\delta} \]
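As a worked instance, take \(\delta = \frac{1}{2}\), a value assumed here purely for illustration:

\[ -1 - \frac{1}{2} - \frac{1}{4} - \frac{1}{8} - ... = \frac{-1}{1-\frac{1}{2}} = -2 \]

So a prisoner who cooperates forever values the whole stream of future sentences at the equivalent of 2 months of jail time today.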

Now that we have the formula for cooperation, let’s look at what happens if a prisoner defects in the first round. The defector walks free in that round, but grim trigger then has both prisoners confess in every later round:

\[ 0 - 8\delta - 8\delta^2 - 8\delta^3 - ... = 0 + \frac{-8\delta}{1-\delta} = \frac{-8\delta}{1-\delta} \]

For cooperation to be possible, it must be beneficial for the prisoners to cooperate, so:

\[ \frac{-1}{1-\delta} \ge \frac{-8\delta}{1-\delta} \]

Since \(1-\delta \gt 0\), we can multiply both sides by \(1-\delta\) to get \(-1 \ge -8\delta\), or \(\delta \ge \frac{1}{8}\). So, given these conditions in an infinite prisoner’s dilemma, it is beneficial to cooperate whenever the discount factor is at least \(\frac{1}{8}\).
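The comparison can also be sketched in Python. It assumes that a prisoner who deviates walks free (payoff 0) in the round they confess and then serves 8 months in every later round, so the deviation stream sums to \(0 + \frac{-8\delta}{1-\delta}\). The names `REWARD`, `TEMPTATION`, and `PUNISHMENT` are labels chosen here for the payoffs \(-1\), \(0\), and \(-8\); they do not appear in the scenario itself.

```python
REWARD, TEMPTATION, PUNISHMENT = -1, 0, -8


def cooperate_value(reward, delta):
    """Discounted value of receiving `reward` in every round: reward / (1 - delta)."""
    return reward / (1 - delta)


def deviate_value(temptation, punishment, delta):
    """Temptation payoff now, then the punishment payoff in every later round."""
    return temptation + delta * punishment / (1 - delta)


def critical_delta(reward, temptation, punishment):
    """Smallest delta at which cooperating is at least as good as deviating once.

    Follows from reward / (1 - d) >= temptation + d * punishment / (1 - d),
    assuming temptation > reward > punishment.
    """
    return (temptation - reward) / (temptation - punishment)


def cooperation_sustainable(delta):
    """True if cooperating forever beats a one-time deviation under grim trigger."""
    return cooperate_value(REWARD, delta) >= deviate_value(TEMPTATION, PUNISHMENT, delta)


threshold = critical_delta(REWARD, TEMPTATION, PUNISHMENT)
patient = cooperation_sustainable(0.5)
impatient = cooperation_sustainable(0.05)
```

Under these assumptions the two streams are equal at \(\delta = \frac{1}{8}\): a more patient prisoner prefers to cooperate, and a less patient one prefers to defect.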

Conclusion

Whether cooperation is beneficial in a repeated prisoner’s dilemma depends on the discount factor, as well as on the payoffs and the strategy. It is also only possible in infinitely repeated situations, since with a known, finite number of rounds, it is always beneficial for prisoners to defect at some point. Grim trigger works in theoretical situations, but it is not always practical in real life. For example, imagine if foreign affairs were conducted using a grim trigger strategy. One misstep would prevent cooperation between nations for eternity.

You may be thinking, “now, this is all interesting, but why is it actually important?”

Well, the prisoner’s dilemma can be applied to real-life situations, such as the tragedy of the commons. The tragedy of the commons says that common-pool resources like forests and fisheries will inevitably end up completely exhausted, leaving everybody worse off with nothing to harvest, because everybody self-interestedly takes as much as they can before others can get the resources. By studying the prisoner’s dilemma, it is possible to apply solutions to these real-world problems.