What makes a game an instance of the PD is strictly and only its payoff structure. Thus we could have two Mother Teresa types here, both of whom care little for themselves and wish only to feed starving children. But suppose the original Mother Teresa wishes to feed the children of Calcutta while Mother Juanita wishes to feed the children of Bogotá. Our saints are in a PD here, though hardly selfish or unconcerned with the social good. Their conflicting goals must then be reflected in their utility functions, and hence in their payoffs.

But all this shows is that not every possible situation is a PD; it does not show that selfishness is among the assumptions of game theory. Agents who wish to avoid inefficient outcomes are best advised to prevent certain games from arising; the defender of the possibility of Kantian rationality is really proposing that they try to dig themselves out of such games by turning themselves into different kinds of agents.

In general, then, a game is partly defined by the payoffs assigned to the players. In any application, such assignments should be based on sound empirical evidence. Our last point above opens the way to a philosophical puzzle, one of several that still preoccupy those concerned with the logical foundations of game theory. It can be raised with respect to any number of examples, but we will borrow an elegant one from C. Bicchieri. Consider the following game: the NE outcome here is at the single leftmost node descending from node 8.

To see this, backward induct again. At node 10, I would play L for a payoff of 3, giving II a payoff of 1. II can do better than this by playing L at node 9, giving I a payoff of 0. I can do better than this by playing L at node 8; so that is what I does, and the game terminates without II getting to move. A puzzle is then raised by Bicchieri, along with other authors including Binmore and Pettit and Sugden, by way of the following reasoning.

Both players use backward induction to solve the game; backward induction requires that Player I know that Player II knows that Player I is economically rational; but Player II can solve the game only by using a backward induction argument that takes as a premise the failure of Player I to behave in accordance with economic rationality. If Player I is economically rational, Player II never gets a move; so if she does get one, the premise on which her own reasoning depends has been falsified. This is the paradox of backward induction.
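The mechanics of backward induction at nodes 8 through 10 can be sketched in a few lines of code. Since the original figure is not reproduced here, the terminal payoffs below are illustrative assumptions chosen to be consistent with the verbal walkthrough above (I gets 3 and II gets 1 when I plays L at node 10, II's L at node 9 gives I a payoff of 0, and I's L at node 8 ends the game), not the article's actual numbers.

```python
# Backward induction over the three-node game described above.
# Terminal payoff pairs are (Player I, Player II); the values marked
# "assumed" are illustrative, constrained only by the text's ordering.
tree = {
    8:  {"player": "I",  "L": (1, 4),  "R": 9},    # I ends the game or passes (payoffs assumed)
    9:  {"player": "II", "L": (0, 2),  "R": 10},   # II ends the game or passes (payoffs assumed)
    10: {"player": "I",  "L": (3, 1),  "R": (2, 3)},  # (3, 1) is from the text; (2, 3) assumed
}

def backward_induct(node):
    """Return the payoff pair reached from `node` under optimal play."""
    info = tree[node]
    outcomes = {}
    for move in ("L", "R"):
        nxt = info[move]
        outcomes[move] = backward_induct(nxt) if isinstance(nxt, int) else nxt
    idx = 0 if info["player"] == "I" else 1   # which payoff this mover maximizes
    best = max(outcomes, key=lambda m: outcomes[m][idx])
    return outcomes[best]

print(backward_induct(8))   # (1, 4): I plays L at node 8 and II never moves
```

Running the induction from node 8 reproduces the text's conclusion: the game terminates immediately, before Player II gets to move.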

One standard response to the paradox appeals to the possibility of "trembling hands". That is, a player might intend to take an action but then slip up in the execution and send the game down some other path instead. In our example, Player II could reason about what to do at node 9 conditional on the assumption that Player I chose L at node 8 but then slipped.


Gintis points out that the apparent paradox does not arise merely from our supposing that both players are economically rational. It rests crucially on the additional premise that each player must know, and reason on the basis of knowing, that the other player is economically rational. A player has reason to consider out-of-equilibrium possibilities if she either (i) believes that her opponent is economically rational but his hand may tremble, or (ii) attaches some nonzero probability to the possibility that he is not economically rational, or (iii) attaches some doubt to her conjecture about his utility function.

We will return to this issue in Section 7 below. The paradox of backward induction, like the puzzles raised by equilibrium refinement, is mainly a problem for those who view game theory as contributing to a normative theory of rationality; specifically, as contributing to that larger theory the theory of strategic rationality. The response available to the empirical modeler involves appeal to the empirical fact that actual agents, including people, must learn the equilibrium strategies of games they play, at least whenever the games are at all complicated. What it means to say that people must learn equilibrium strategies is that we must be a bit more sophisticated than was indicated earlier in constructing utility functions from behavior in application of Revealed Preference Theory.

Instead of constructing utility functions on the basis of single episodes, we must do so on the basis of observed runs of behavior once it has stabilized, signifying maturity of learning for the subjects in question and the game in question. As a result, when set into what is intended to be a one-shot PD in the experimental laboratory, people tend initially to play as if the game were a single round of a repeated PD. The repeated PD has many Nash equilibria that involve cooperation rather than defection.

Thus experimental subjects tend to cooperate at first in these circumstances, but learn after some number of rounds to defect. The experimenter cannot infer that she has successfully induced a one-shot PD with her experimental setup until she sees this behavior stabilize. If players of games realize that other players may need to learn game structures and equilibria from experience, this gives them reason to take account of what happens off the equilibrium paths of extensive-form games.

Of course, if a player fears that other players have not learned equilibrium, this may well remove her incentive to play an equilibrium strategy herself. This raises a set of deep problems about social learning, explored by Fudenberg and Levine. The crucial answer, in the case of applications of game theory to interactions among people, is that young people are socialized by growing up in networks of institutions, including cultural norms.


Most complex games that people play are already in progress among people who were socialized before them, that is, among people who have already learned game structures and equilibria (Ross). Novices need then only copy those whose play appears to be expected and understood by others. Institutions and norms are rich with reminders, including homilies and easily remembered rules of thumb, to help people remember what they are doing (Clark).

As noted in Section 2, given the complexity of many of the situations that social scientists study, we should not be surprised that mis-specification of models happens frequently. Applied game theorists must do lots of learning, just like their subjects. The paradox of backward induction is one of a family of paradoxes that arise if one builds possession and use of literally complete information into a concept of rationality. Consider, by analogy, the stock market paradox that arises if we suppose that economically rational investment incorporates literally rational expectations. Assume that no individual investor can beat the market in the long run, because the market always knows everything the investor knows; then no one has an incentive to gather knowledge about asset values; then no one will ever gather any such information; and so, from the assumption that the market knows everything, it follows that the market cannot know anything!

As we will see in detail in various discussions below, most applications of game theory explicitly incorporate uncertainty and prospects for learning by players. The extensive-form games with SPE that we looked at above are really conceptual tools to help us prepare concepts for application to situations where complete and perfect information is unusual. We cannot avoid the paradox if we think, as some philosophers and normative game theorists do, that one of the conceptual tools we want to use game theory to sharpen is a fully general idea of rationality itself.

But this is not a concern entertained by economists and other scientists who put game theory to use in empirical modeling. In real cases, unless players have experienced play at equilibrium with one another in the past, even if they are all economically rational and all believe this about one another, we should predict that they will attach some positive probability to the conjecture that understanding of game structures among some players is imperfect.

This then explains why people, even if they are economically rational agents, may often, or even usually, play as if they believe in trembling hands. Learning of equilibria may take various forms for different agents and for games of differing levels of complexity and risk. Incorporating it into game-theoretic models of interactions thus introduces an extensive new set of technicalities. For the most fully developed general theory, the reader is referred to Fudenberg and Levine, who also provide a non-technical overview of the issues elsewhere. A first important distinction is between learning specific parameters between rounds of a repeated game (see Section 4) with common players, and learning about general strategic expectations across different games.

The latter can include learning about players, if the learner is updating expectations based on her models of types of players she recurrently encounters. A major difficulty for both players and modelers is that screening moves might be misinterpreted if players are also incentivized to make moves to signal information to one another (see Section 4). Finally, the discussion so far has assumed that all possible learning in a game is about the structure of the game itself.

It was said above that people might usually play as if they believe in trembling hands. They must make and test conjectures about the reliability of other players from their social contexts. Sometimes, contexts are fixed by institutional rules. In other markets, a buyer might know that she is expected to haggle, and know the rules for that too. Given the unresolved complex relationship between learning theory and game theory, the reasoning above might seem to imply that game theory can never be applied to situations involving human players that are novel for them. Fortunately, however, we face no such impasse. In a pair of influential papers in the mid-to-late 1990s, McKelvey and Palfrey developed the solution concept of quantal response equilibrium (QRE).

QRE is not a refinement of NE, in the sense of being a philosophically motivated effort to strengthen NE by reference to normative standards of rationality. It is, rather, a method for calculating the equilibrium properties of choices made by players whose conjectures about possible errors in the choices of other players are uncertain. QRE is thus standard equipment in the toolkit of experimental economists who seek to estimate the distribution of utility functions in populations of real people placed in situations modeled as games. QRE would not have been practically serviceable in this way before the development of econometrics packages such as Stata allowed computation of QRE given adequately powerful observation records from interestingly complex games.
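As a concrete illustration, a logit QRE for a small game can be found by fixed-point iteration. The game and its payoff numbers below are assumptions for illustration only; the key idea is that the sharp best-response function of NE is replaced by a logistic ("quantal") response, so costlier errors are less likely but no error has zero probability.

```python
import math

# Logit quantal response equilibrium for a 2x2 game, found by damped
# fixed-point iteration. Payoff numbers are illustrative assumptions:
# a generalized matching-pennies game with no pure-strategy NE.
A = [[2, 0], [0, 1]]   # row player's payoffs
B = [[0, 1], [2, 0]]   # column player's payoffs

def logit_qre(lam, iters=5000):
    """lam is the rationality parameter: lam = 0 gives uniform play;
    as lam grows, play approaches a Nash equilibrium."""
    p = q = 0.5          # prob. that row / column plays their first strategy
    for _ in range(iters):
        u_row = [q * A[i][0] + (1 - q) * A[i][1] for i in range(2)]
        u_col = [p * B[0][j] + (1 - p) * B[1][j] for j in range(2)]
        # quantal response: better replies are more likely, never certain
        p_new = 1 / (1 + math.exp(-lam * (u_row[0] - u_row[1])))
        q_new = 1 / (1 + math.exp(-lam * (u_col[0] - u_col[1])))
        p, q = (p + p_new) / 2, (q + q_new) / 2   # damping aids convergence
    return p, q

for lam in (0.0, 1.0, 2.0):
    p, q = logit_qre(lam)
    print(f"lambda={lam}: p={p:.3f}, q={q:.3f}")
```

As the rationality parameter increases, the computed probabilities move from uniform randomization toward the game's mixed-strategy NE, which is the sense in which QRE nests NE as a limiting case.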

QRE is rarely utilized by behavioral economists, and is almost never used by psychologists, in analyzing laboratory data. But NE, though it is a minimalist solution concept in one sense because it abstracts away from much informational structure, is simultaneously a demanding empirical expectation if it is imposed categorically that is, if players are expected to play as if they are all certain that all others are playing NE strategies.

Predicting play consistent with QRE is consistent with—indeed, is motivated by—the view that NE captures the core general concept of a strategic equilibrium. NE defines a logical principle that is well adapted for disciplining thought and for conceiving new strategies for generic modeling of new classes of social phenomena.

For purposes of estimating real empirical data one needs to be able to define equilibrium statistically. QRE represents one way of doing this, consistently with the logic of NE. The idea is sufficiently rich that its depths remain an open domain of investigation by game theorists. We will see later that there is an alternative interpretation of mixing, not involving randomization at a particular information set; but we will start here from the coin-flipping interpretation and then build on it in Section 3.

Our river-crossing game from Section 1 exemplifies this. Symmetry of logical reasoning power on the part of the two players ensures that the fugitive can surprise the pursuer only if it is possible for him to surprise himself. Suppose that we ignore rocks and cobras for a moment, and imagine that the bridges are equally safe. The fugitive's best course is then to randomize among the three bridges, say by rolling a three-sided die on which each side represents one bridge. He must then pre-commit himself to using whichever bridge is selected by this randomizing device.

This fixes the odds of his survival regardless of what the pursuer does; but since the pursuer has no reason to prefer any available pure or mixed strategy, and since in any case we are presuming her epistemic situation to be symmetrical to that of the fugitive, we may suppose that she will roll a three-sided die of her own. Note that if one player is randomizing then the other does equally well on any mix of probabilities over bridges, so there are infinitely many combinations of best replies.

However, each player should worry that anything other than a random strategy might be coordinated with some factor the other player can detect and exploit. Since any non-random strategy is exploitable by another non-random strategy, in a zero-sum game such as our example, only the vector of randomized strategies is a NE. Now let us re-introduce the parametric factors, that is, the falling rocks at bridge 2 and the cobras at bridge 3. Suppose that Player 1, the fugitive, cares only about living or dying preferring life to death while the pursuer simply wishes to be able to report that the fugitive is dead, preferring this to having to report that he got away.

In other words, neither player cares about how the fugitive lives or dies. Suppose also for now that neither player gets any utility or disutility from taking more or less risk. In this case, the fugitive simply takes his original randomizing formula and weights it according to the different levels of parametric danger at the three bridges.

She will be using her NE strategy when she chooses the mix of probabilities over the three bridges that makes the fugitive indifferent among his possible pure strategies. Since the rocky bridge and the cobra bridge are each more dangerous for the fugitive than the safe bridge, he is indifferent between a pair of bridges only when the pursuer is correspondingly more likely to be waiting at the less hazardous one. The pursuer thus minimizes the fugitive's net survival rate across any pair of bridges by adjusting the probabilities p1 and p2 that she will wait at them so that his expected survival is equal whichever bridge he chooses.

Now let f1, f2, f3 represent the probabilities with which the fugitive chooses each respective bridge. The fugitive then finds his NE strategy by solving the symmetrical problem: choosing f1, f2, f3 so that the pursuer's expected payoff is the same whichever bridge she waits at. These two sets of NE probabilities tell each player how to weight his or her die before throwing it. Note the perhaps surprising result that the fugitive, though by hypothesis he gets no enjoyment from gambling, uses riskier bridges with higher probability. We were able to solve this game straightforwardly because we set the utility functions in such a way as to make it zero-sum, or strictly competitive. That is, every gain in expected utility by one player represents a precisely symmetrical loss by the other.
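The structure of this computation can be sketched even though the specific hazard ratios are not reproduced above. In the sketch below, the survival rates 1.0, 0.9, and 0.8 for the safe, rocky, and cobra bridges are assumptions for illustration; the fugitive is caught for certain if the pursuer waits at his bridge, and the NE conditions are exactly the two indifference conditions just described.

```python
# Mixed-strategy NE for the bridge-crossing game, with ILLUSTRATIVE
# survival rates (the text's specific hazard ratios are assumed here).
safe, rocks, cobras = 1.0, 0.9, 0.8   # prob. of surviving each bridge's hazards
s = [safe, rocks, cobras]

# Pursuer: choose waiting probabilities p[i] making the fugitive's
# survival chance s[i] * (1 - p[i]) equal at every bridge.
# Solving s_i * (1 - p_i) = c with the p_i summing to 1 gives:
c = 2 / sum(1 / si for si in s)
p = [1 - c / si for si in s]

# Fugitive: choose crossing probabilities f[i] making the pursuer
# indifferent, which requires f_i * s_i constant, i.e. f_i proportional to 1/s_i.
raw = [1 / si for si in s]
f = [r / sum(raw) for r in raw]

print("pursuer waits at (safe, rocks, cobras):", [round(x, 3) for x in p])
print("fugitive crosses (safe, rocks, cobras):", [round(x, 3) for x in f])
# Note the text's result: f rises as s falls, so the fugitive uses
# the riskier bridges with HIGHER probability.
```

Under these assumed numbers the pursuer waits most often at the safe bridge, and the fugitive's crossing probabilities increase with the bridge's parametric danger, reproducing the surprising result noted in the text.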

However, this condition may often not hold. Suppose now that the utility functions are more complicated. The pursuer most prefers an outcome in which she shoots the fugitive and so claims credit for his apprehension to one in which he dies of rockfall or snakebite; and she prefers this second outcome to his escape. The fugitive prefers a quick death by gunshot to the pain of being crushed or the terror of an encounter with a cobra. Most of all, of course, he prefers to escape. Suppose, plausibly, that the fugitive cares more strongly about surviving than he does about getting killed one way rather than another.

This is because, as we discussed in Section 2, utility does not denote a hidden psychological variable such as pleasure. How, then, can we model games in which cardinal information is relevant? Von Neumann and Morgenstern answered this question with an ingenious technique for building cardinal utility functions out of ordinal ones; here, we will provide a brief outline of it. It is emphasized that what follows is merely an outline, so as to make cardinal utility non-mysterious to you as a student who is interested in knowing about the philosophical foundations of game theory, and about the range of problems to which it can be applied.

Providing a manual you could follow in building your own cardinal utility functions would require many pages. Such manuals are available in many textbooks. Suppose that we now assign the river-crossing fugitive an ordinal utility function ranking the four outcomes, from best to worst: escape, death by shooting, death by rockfall, death by snakebite. We are supposing that his preference for escape over any form of death is stronger than his preferences between causes of death.

This should be reflected in his choice behaviour in the following way. In a situation such as the river-crossing game, he should be willing to run greater risks to increase the relative probability of escape over shooting than he is to increase the relative probability of shooting over snakebite. Suppose we asked the fugitive to pick, from the available set of outcomes, a best one and a worst one; call these W (escape) and L (death by snakebite) respectively.

Now imagine expanding the set of possible prizes so that it includes prizes that the agent values as intermediate between W and L. We find, for a set of outcomes containing such prizes, a lottery over them such that our agent is indifferent between that lottery and a lottery including only W and L. In our example, this is a lottery that includes being shot and being crushed by rocks.

Call this lottery T. What exactly have we done here? We have obtained a cardinal measure of the agent's preferences: each outcome can be assigned the probability of W in the lottery over W and L that the agent regards as exactly as good as that outcome for sure. Furthermore, two agents in one game, or one agent under different sorts of circumstances, may display varying attitudes to risk. Perhaps in the river-crossing game the pursuer, whose life is not at stake, will enjoy gambling with her glory while our fugitive is cautious. Both agents, after all, can find their NE strategies if they can estimate the probabilities each will assign to the actions of the other. We can now fill in the rest of the matrix for the bridge-crossing game that we started to draw in Section 2.
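To make the construction concrete, suppose (purely for illustration; these indifference points are not from the text) that the fugitive is indifferent between being shot for sure and a 0.75 chance of escape, and between being crushed for sure and a 0.6 chance of escape. His cardinal utilities, and the reduction of a lottery T over the intermediate prizes to an equivalent lottery over W and L alone, then follow mechanically:

```python
# von Neumann-Morgenstern construction, sketched with ASSUMED
# indifference probabilities (0.75 and 0.6 are illustrative).
# u(W) = 1 for the best outcome (escape); u(L) = 0 for the worst (snakebite).
# For each intermediate outcome x, u(x) is the probability q at which the
# agent is indifferent between x for sure and the lottery q*W + (1-q)*L.
u = {
    "escape": 1.0,
    "shot": 0.75,      # assumed indifference point
    "crushed": 0.6,    # assumed indifference point
    "snakebite": 0.0,
}

def expected_utility(lottery):
    """Expected utility of a lottery given as {outcome: probability}."""
    return sum(prob * u[out] for out, prob in lottery.items())

# A lottery T over the intermediate prizes (being shot, being crushed)
# has the same expected utility as some lottery over W and L alone:
T = {"shot": 0.5, "crushed": 0.5}
q = expected_utility(T)
equivalent = {"escape": q, "snakebite": 1 - q}
print(expected_utility(T), expected_utility(equivalent))  # both 0.675
```

The point of the exercise is just that once the indifference points are elicited from choice behaviour, every lottery can be compared on a single cardinal scale.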

If both players are risk-neutral and their revealed preferences respect ROCL, then we have enough information to be able to assign expected utilities, expressed by multiplying the original payoffs by the relevant probabilities, as outcomes in the matrix. Suppose that the hunter waits at the cobra bridge with probability x and at the rocky bridge with probability y. Then, continuing to assign the fugitive a payoff of 0 if he dies and 1 if he escapes, and the hunter the reverse payoffs, we can fill in the complete matrix of expected payoffs.

We can now read the following facts about the game directly from the matrix: no pair of pure strategies is a pair of best replies to the other, so any NE must involve mixed strategies. But in real interactive choice situations, agents must often rely on their subjective estimations or perceptions of probabilities. In one of the greatest contributions to twentieth-century behavioral and social science, Savage showed how to incorporate subjective probabilities, and their relationships to preferences over risk, within the framework of von Neumann-Morgenstern expected utility theory.

Then, just over a decade later, Harsanyi showed how to solve games involving maximizers of Savage expected utility. This is often taken to have marked the true maturity of game theory as a tool for application to behavioral and social science, and was recognized as such when Harsanyi joined Nash and Selten as a recipient of the first Nobel prize awarded to game theorists, in 1994. As we observed in considering the need for people playing games to learn trembling-hand equilibria and QRE, when we model the strategic interactions of people we must allow for the fact that people are typically uncertain about their models of one another.

This uncertainty is reflected in their choices of strategies. Consider the fourth of these NE. The structure of the game incentivizes efforts by Player I to supply Player III with information that would open up her closed information set. Player III should believe this information because the structure of the game shows that Player I has incentive to communicate it truthfully.

Theorists who think of game theory as part of a normative theory of general rationality, for example most philosophers, and refinement program enthusiasts among economists, have pursued a strategy that would identify this solution on general principles. The relevant beliefs here are not merely strategic, as before, since they are not just about what players will do given a set of payoffs and game structures, but about what understanding of conditional probability they should expect other players to operate with.

What beliefs about conditional probability is it reasonable for players to expect from each other? Consider again the NE (R, r2, r3). Suppose that Player III assigns pr(1) to her belief that if she gets a move she is at one particular node in her information set. The use of the consistency requirement in this example is somewhat trivial, so consider now a second case also taken from Kreps.

The idea of SE is hopefully now clear. We can apply it to the river-crossing game in a way that avoids the necessity for the pursuer to flip any coins if we modify the game a bit. This requirement is captured by supposing that all strategy profiles be strictly mixed, that is, that every action at every information set be taken with positive probability. You will see that this is just equivalent to supposing that all hands sometimes tremble, or alternatively that no expectations are quite certain.

A SE is said to be trembling-hand perfect if all strategies played at equilibrium are best replies to strategies that are strictly mixed. You should also not be surprised to be told that no weakly dominated strategy can be trembling-hand perfect, since the possibility of trembling hands gives players the most persuasive reason for avoiding such strategies. How can the non-psychological game theorist understand the concept of an NE that is an equilibrium in both actions and beliefs? Multiple kinds of informational channels typically link different agents with the incentive structures in their environments.
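The claim that no weakly dominated strategy can be trembling-hand perfect is easy to verify in a small example. In the illustrative 2x2 payoffs below (assumed, not from the text), B is weakly dominated by A: against any strictly mixed strategy of the column player, A earns strictly more, so B can never be a best reply once all hands tremble.

```python
# Row player's payoffs in a 2x2 game; strategy B is weakly dominated
# by A (never better, sometimes worse). Numbers are illustrative.
row_payoff = {("A", "L"): 2, ("A", "R"): 1,
              ("B", "L"): 2, ("B", "R"): 0}

def expected(strategy, q):
    """Row's expected payoff when column plays L with probability q."""
    return q * row_payoff[(strategy, "L")] + (1 - q) * row_payoff[(strategy, "R")]

# Against any strictly mixed column strategy (0 < q < 1), A strictly beats B:
for q in (0.01, 0.25, 0.5, 0.75, 0.99):
    assert expected("A", q) > expected("B", q)

# Only at the degenerate pure strategy q = 1 are the two strategies tied,
# which is exactly the case that trembling hands rule out:
assert expected("A", 1.0) == expected("B", 1.0)
print("B is never a best reply to a strictly mixed strategy")
```

The tie at q = 1 is what makes the dominance merely weak; requiring strictly mixed profiles removes that case and so eliminates B.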

Some agents may actually compute equilibria, with more or less error. Others may settle within error ranges that stochastically drift around equilibrium values through more or less myopic conditioned learning. Still others may select response patterns by copying the behavior of other agents, or by following rules of thumb that are embedded in cultural and institutional structures and represent historical collective learning.

Note that the issue here is specific to game theory, rather than merely being a reiteration of a more general point, which would apply to any behavioral science, that people behave noisily from the perspective of ideal theory. In a given game, whether it would be rational for even a trained, self-aware, computationally well resourced agent to play NE would depend on the frequency with which he or she expected others to do likewise. If she expects some other players to stray from NE play, this may give her a reason to stray herself. Instead of predicting that human players will reveal strict NE strategies, the experienced experimenter or modeler anticipates that there will be a relationship between their play and the expected costs of departures from NE.

Consequently, maximum likelihood estimation of observed actions typically identifies a QRE as providing a better fit than any NE. The modeler does not, however, treat her subjects' behavior as mere noise. Rather, she conjectures that they are agents, that is, that there is a systematic relationship between changes in statistical patterns in their behavior and some risk-weighted cardinal rankings of possible goal-states. If the agents are people, or institutionally structured groups of people that monitor one another and are incentivized to attempt to act collectively, these conjectures will often be regarded as reasonable by critics, or even as pragmatically beyond question, even if they are always defeasible given the non-zero possibility of bizarre unknown circumstances of the kind philosophers sometimes consider.

The analyst might assume that all of the agents respond to incentive changes in accordance with Savage expected-utility theory, particularly if the agents are firms that have learned response contingencies under normatively demanding conditions of market competition with many players.

All this is to say that use of game theory does not force a scientist to empirically apply a model that is likely to be too precise and narrow in its specifications to plausibly fit the messy complexities of real strategic interaction. A good applied game theorist should also be a well-schooled econometrician. So far, however, we have been treating each game as if it were played in isolation; games are often played with future games in mind, and this can significantly alter their outcomes and equilibrium strategies. Our topic in this section is repeated games, that is, games in which sets of players expect to face each other in similar situations on multiple occasions.

In a one-shot PD, mutual defection is the equilibrium outcome. This may no longer hold, however, if the players expect to meet each other again in future PDs. Imagine that four firms, all making widgets, agree to maintain high prices by jointly restricting supply. That is, they form a cartel. This will work only if each firm maintains its agreed production quota. Typically, each firm can maximize its profit by departing from its quota while the others observe theirs, since it then sells more units at the higher market price brought about by the almost-intact cartel.

In the one-shot case, all firms would share this incentive to defect and the cartel would immediately collapse. However, the firms expect to face each other in competition for a long period. In this case, each firm knows that if it breaks the cartel agreement, the others can punish it by underpricing it for a period long enough to more than eliminate its short-term gain.

Of course, the punishing firms will take short-term losses too during their period of underpricing. But these losses may be worth taking if they serve to reestablish the cartel and bring about maximum long-term prices. One simple and famous (but not, contrary to widespread myth, necessarily optimal) strategy for preserving cooperation in repeated PDs is called tit-for-tat. This strategy tells each player to behave as follows: cooperate in the first round, and in every subsequent round copy whatever the other player did in the previous round. A group of players all playing tit-for-tat will never see any defections. Since, in a population where others play tit-for-tat, tit-for-tat is the rational response for each player, everyone playing tit-for-tat is a NE.
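A minimal simulation makes these claims checkable. The payoff values below are the standard illustrative PD numbers (3 for mutual cooperation, 1 for mutual defection, 5 and 0 for unilateral defection), assumed here rather than taken from the text:

```python
# A minimal sketch of tit-for-tat in a repeated Prisoner's Dilemma.
# Payoffs (row, column) use standard illustrative PD values.
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return "D"

def play(strategy1, strategy2, rounds=10):
    """Run the repeated game and return cumulative scores."""
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = strategy1(h1, h2), strategy2(h2, h1)
        p1, p2 = PAYOFFS[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

# Two tit-for-tat players never see a defection:
print(play(tit_for_tat, tit_for_tat))     # (30, 30)
# Against an unconditional defector, tit-for-tat loses only the first round:
print(play(tit_for_tat, always_defect))   # (9, 14)
```

As the text notes, this stability is not the end of the story; the simulation assumes a fixed, known number of rounds, which is exactly the assumption the backward-induction argument below exploits.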

You may frequently hear people who know a little but not enough game theory talk as if this is the end of the story. It is not. There are two complications. First, the players must be uncertain as to when their interaction ends. Suppose the players know when the last round comes. In that round, it will be utility-maximizing for players to defect, since no punishment will be possible. Now consider the second-last round.

In this round, players also face no punishment for defection, since they expect to defect in the last round anyway. So they defect in the second-last round. But this means they face no threat of punishment in the third-last round, and defect there too. We can simply iterate this backwards through the game tree until we reach the first round. Since cooperation is not a NE strategy in that round, tit-for-tat is no longer a NE strategy in the repeated game, and we get the same outcome—mutual defection—as in the one-shot PD.

Therefore, cooperation is only possible in repeated PDs where the expected number of repetitions is indeterminate. Of course, this does apply to many real-life games. Note that in this context any amount of uncertainty in expectations, or possibility of trembling hands, will be conducive to cooperation, at least for a while. When people in experiments play repeated PDs with known end-points, they indeed tend to cooperate for a while, but learn to defect earlier as they gain experience.

Now we introduce a second complication. Consider our case of the widget cartel. Suppose the players observe a fall in the market price of widgets. Perhaps this is because a cartel member cheated. Or perhaps it has resulted from an exogenous drop in demand. If tit-for-tat players mistake the second case for the first, they will defect, thereby setting off a chain-reaction of mutual defections from which they can never recover, since every player will reply to the first encountered defection with defection, thereby begetting further defections, and so on.

If players know that such miscommunication is possible, they have incentive to resort to more sophisticated strategies. In particular, they may be prepared to sometimes risk following defections with cooperation in order to test their inferences. However, if they are too forgiving, then other players can exploit them through additional defections. In general, sophisticated strategies have a problem. Because they are more difficult for other players to infer, their use increases the probability of miscommunication.

But miscommunication is what causes repeated-game cooperative equilibria to unravel in the first place. The complexities surrounding information signaling, screening and inference in repeated PDs help to intuitively explain the folk theorem, so called because no one is sure who first recognized it: in repeated PDs, for any strategy S there exists a possible distribution of strategies among other players such that the vector of S and these other strategies is a NE.

Thus there is nothing special, after all, about tit-for-tat. Real, complex, social and political dramas are seldom straightforward instantiations of simple games such as PDs. Hardin offers an analysis of two tragically real political cases, the Yugoslavian civil war of 1991–95 and the Rwandan genocide, as PDs that were nested inside coordination games. A coordination game occurs whenever the utility of two or more players is maximized by their doing the same thing as one another, and where such correspondence is more important to them than whatever it is, in particular, that they both do.

In these circumstances, any strategy that is a best reply to any vector of mixed strategies available in NE is said to be rationalizable. That is, a player can find a set of systems of beliefs for the other players such that any history of the game along an equilibrium path is consistent with that set of systems.

Pure coordination games are characterized by non-unique vectors of rationalizable strategies. The Nobel laureate Thomas Schelling conjectured, and empirically demonstrated, that in such situations, players may try to predict equilibria by searching for focal points , that is, features of some strategies that they believe will be salient to other players, and that they believe other players will believe to be salient to them.

Coordination was, indeed, the first topic of game-theoretic application that came to the widespread attention of philosophers.


In 1969, the philosopher David Lewis published Convention, in which the conceptual framework of game theory was applied to one of the fundamental issues of twentieth-century epistemology, the nature and extent of conventions governing semantics and their relationship to the justification of propositional beliefs. The basic insight can be captured using a simple example.

This insight, of course, well preceded Lewis; but what he recognized is that this situation has the logical form of a coordination game. Thus, while particular conventions may be arbitrary, the interactive structures that stabilize and maintain them are not. Furthermore, the equilibria involved in coordinating on noun meanings appear to have an arbitrary element only because we cannot Pareto-rank them; but Millikan shows implicitly that in this respect they are atypical of linguistic coordinations.

In a city, drivers must coordinate on one of two NE with respect to their behaviour at traffic lights. Either all must follow the strategy of rushing to try to race through lights that turn yellow or amber and pausing before proceeding when red lights shift to green, or all must follow the strategy of slowing down on yellows and jumping immediately off on shifts to green. Both patterns are NE, in that once a community has coordinated on one of them then no individual has an incentive to deviate: those who slow down on yellows while others are rushing them will get rear-ended, while those who rush yellows in the other equilibrium will risk collision with those who jump off straightaway on greens.

However, the two equilibria are not Pareto-indifferent, since the second NE allows more cars to turn left on each cycle in a left-hand-drive jurisdiction, and right on each cycle in a right-hand jurisdiction, which reduces the main cause of bottlenecks in urban road networks and allows all drivers to expect greater efficiency in getting about.
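The two conventions can be modeled as a simple 2x2 coordination game. The payoff numbers below are illustrative assumptions, chosen only to reproduce the structure described above (two Pareto-ranked NE, with mismatches worst for everyone):

```python
# Hypothetical payoffs for the two traffic conventions described above:
# both drivers "rush" yellows, both "slow" on yellows, or a mismatch (collision risk).
PAYOFFS = {
    ("rush", "rush"): (2, 2),   # Pareto-inferior convention
    ("slow", "slow"): (3, 3),   # Pareto-superior convention
    ("rush", "slow"): (0, 0),   # mismatched conventions: accidents
    ("slow", "rush"): (0, 0),
}
STRATEGIES = ("rush", "slow")

def is_nash(profile):
    """A profile is a NE if no player gains by deviating unilaterally."""
    for player in (0, 1):
        for dev in STRATEGIES:
            deviant = list(profile)
            deviant[player] = dev
            if PAYOFFS[tuple(deviant)][player] > PAYOFFS[profile][player]:
                return False
    return True

print(is_nash(("rush", "rush")))  # True
print(is_nash(("slow", "slow")))  # True
print(is_nash(("rush", "slow")))  # False
```

Both conventions pass the NE test even though one is strictly better for everyone, which is exactly the inefficiency-trap structure at issue.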

Unfortunately, for reasons about which we can only speculate pending further empirical work and analysis, far more cities are locked onto the Pareto-inferior NE than onto the Pareto-superior one. Conditional game theory (see Section 5 below) provides promising resources for modeling cases such as this one, in which maintenance of coordination game equilibria likely must be supported by stable social norms, because players are anonymous and encounter regular opportunities to gain once-off advantages by defecting from supporting the prevailing equilibrium.

This work is currently ongoing. While various arrangements might be NE in the social game of science, as followers of Thomas Kuhn like to remind us, it is highly improbable that all of these lie on a single Pareto-indifference curve. These themes, strongly represented in contemporary epistemology, philosophy of science and philosophy of language, are all at least implicit applications of game theory. The reader can find a broad sample of applications, and references to the large literature, in Nozick. Most of the social and political coordination games played by people also have this feature.

Unfortunately for us all, inefficiency traps represented by Pareto-inferior NE are extremely common in them. And sometimes dynamics of this kind give rise to the most terrible of all recurrent human collective behaviors.

Take for example the game in the figure. Let's imagine the players initially agreed to play (R2, C2). Now both have serious reasons to deviate, as deviating unilaterally would profit either player.

However, in an additional step of reflection, both players may note that they risk ending up with nothing if they both deviate, particularly as the rational recommendation for each is to unilaterally deviate. Players may therefore prefer the relative security of sticking to the agreed-upon strategy. They can at least guarantee 2 utils for themselves, whatever the other player does, and this, in combination with the fact that they agreed on (R2, C2), may reassure them that their opponent will in fact play strategy 2.
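A hypothetical payoff matrix consistent with this description makes the reasoning easy to check. The numbers below are illustrative assumptions, not those of the original figure:

```python
# Illustrative payoffs: (R2, C2) gives the agreed (2, 2); unilateral deviation
# pays 3 to the deviator; joint deviation leaves both with nothing.
ROW = [[0, 3],
       [2, 2]]   # ROW[r][c] = row player's payoff at (R r+1, C c+1)
COL = [[0, 2],
       [3, 2]]   # COL[r][c] = column player's payoff

# Strategy R2 is the maximin ("secure") strategy: it guarantees 2 utils
# to the row player whatever the column player does.
maximin_row = max(min(ROW[r]) for r in range(2))
print(maximin_row)  # 2

# Yet (R2, C2) is not a Nash equilibrium: the row player gains by deviating to R1.
print(ROW[0][1] > ROW[1][1])  # True: 3 > 2
```

This is the sense in which (R2, C2) can be a self-enforcing agreement, secured by the maximin guarantee, without being a Nash equilibrium.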

So (R2, C2) may well be a self-enforcing agreement, but it nevertheless is not a Nash equilibrium. Last, the argument from self-enforcing agreements does not account for mixed strategies. Even though the mixed strategies might have constituted a self-enforcing agreement before the mechanism made its assignment, it is hard to see what argument a player should have to stick to her agreement after the assignment is made (Luce and Raiffa, Pearce).

Jacobsen formalizes this idea with the help of three assumptions. First, he assumes that a player imagines himself in the positions of both players. Second, he assumes that the player behaves rationally in both positions.

Third, he assumes that a player conceives of his opponent as similar to himself. If his opponent also holds such a Nash equilibrium conjecture (which she should, given the similarity assumption), then the game has a unique Nash equilibrium. This argument has met at least two criticisms. First, Jacobsen provides an argument for Nash equilibrium conjectures, not Nash equilibria.

If each player ends up with a multiplicity of Nash equilibrium conjectures, an additional coordination problem arises over and above the coordination of which Nash equilibrium to play: now first the conjectures have to be matched before the equilibria can be coordinated. Second, the argument requires that a player predict his own behavior. But then deliberation would be vacuous, since the outcome is already determined when the relevant parameters of the choice situation are available.

Concluding this section, it seems that there is no general justification for Nash equilibria in one-shot, simultaneous-move games. This does not mean that there is no justification to apply the Nash concept to any one-shot, simultaneous-move game — for example, games solvable by iterated dominance have a Nash equilibrium as their solution. Also, this conclusion does not mean that there are no exogenous reasons that could justify the Nash concept in these games.

However, the discussion here was concerned with endogenous reasons, and there the justification seems deficient. If people encounter an interactive situation sufficiently often, they can sometimes find their way to optimal solutions by trial-and-error adaptation. In a game-theoretic context, this means that players need not be endowed with the ability to play equilibrium, or with sufficient knowledge to do so, in order to get to equilibrium.

If they play the game repeatedly, they may gradually adjust their behavior over time until there is no further room for improvement. At that stage, they have achieved equilibrium. Kalai and Lehrer show that in an infinitely repeated game, subjective utility maximizers will converge arbitrarily close to playing Nash equilibrium. The only rationality assumption they make is that players maximize their expected utility, based on their individual beliefs. Knowledge assumptions are remarkably weak for this result: players only need to know their own payoff matrix and discount parameters.

Knowledge assumptions are thus much weaker for Nash equilibria arising from such adjustment processes than those required for one-shot game Nash solutions. Players converge to playing equilibrium because they learn by playing the game repeatedly. Learning, it should be remarked, is not a goal in itself but an implication of utility maximization in this situation. Each player starts out with subjective prior beliefs about the individual strategies used by each of her opponents. On the basis of these beliefs, they choose their own optimal strategy. Beliefs are adjusted by Bayesian updating: the prior belief is conditionalized on the newly available information.

On the basis of these assumptions, Kalai and Lehrer show that after sufficient repetitions, (i) the real probability distribution over the future play of the game is arbitrarily close to what each player believes the distribution to be, and (ii) the actual choices and beliefs of the players, when converged, are arbitrarily close to a Nash equilibrium.
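Kalai and Lehrer's construction is technical, but a much simpler adaptive process, fictitious play, in which each player best-responds to the empirical frequencies of the opponent's past actions, conveys the flavor of such convergence. The game and parameters below are illustrative assumptions:

```python
# Fictitious play in a 2x2 common-interest coordination game.
# A[i][j]: payoff to either player when playing i against an opponent playing j.
A = [[2, 0],
     [0, 1]]

def best_response(counts):
    """Best reply to the opponent's empirical action frequencies."""
    total = sum(counts)
    beliefs = [c / total for c in counts]
    expected = [sum(A[i][j] * beliefs[j] for j in range(2)) for i in range(2)]
    return max(range(2), key=lambda i: expected[i])

# Each player starts with a uniform "prior" over the opponent's actions.
counts_1, counts_2 = [1, 1], [1, 1]   # observations held by players 1 and 2
for _ in range(50):
    a1 = best_response(counts_1)
    a2 = best_response(counts_2)
    counts_1[a2] += 1   # player 1 observes player 2's action
    counts_2[a1] += 1   # player 2 observes player 1's action

print(a1, a2)  # both settle on action 0, a strict Nash equilibrium
```

Neither player needs to know the other's payoffs or reason about equilibrium; repeated play plus belief updating does the work.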

Nash equilibria in these situations are thus justified as potentially self-reproducing patterns of strategic expectations. It needs to be noted, however, that this argument depends on two conditions that not all games satisfy. First, players must have enough experience to learn how their opponents play. Depending on the kind of learning, this may take more time than a given interactive situation affords. Second, not all adjustment processes converge to a steady state (for an early counterexample, see Shapley). For these reasons, the justification of Nash equilibrium as the result of an adjustment process is sensitive to the game model, and therefore does not hold generally for all repeated games.

Backward induction is the most common Nash equilibrium refinement for non-simultaneous games. Backward induction depends on the assumption that rational players remain on the equilibrium path because of what they anticipate would happen if they were to deviate. Backward induction thus requires the players to consider out-of-equilibrium play. But out-of-equilibrium play occurs with zero probability if the players are rational. To treat out-of-equilibrium play properly, therefore, the theory needs to be expanded.

The problem of counterfactuals cuts deeper, however, than a call for mere theory expansion. Consider the centipede game. For reasons of representational convenience, the game is represented as progressing from left to right instead of from top to bottom as in the usual extensive-form games. Player 1 starts at the leftmost node, choosing to end the game by playing down or to continue the game (giving player 2 the choice) by playing right. The payoffs are such that at each node it is best for the player who has to move to stop the game if and only if she expects that in the event she continues, the game will end at the next stage, by the other player stopping the game or by termination of the game.

The two zigzags stand for the continuation of the payoffs along those lines. Backward induction now advises solving the game by starting at the last node z, asking what player 2 would have done had he ended up there.

Substituting the payoff of playing down for node z, one then moves backwards: what would player 1 have done had she ended up at node y? This line of argument then continues back to the first node.
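The procedure can be sketched for a small centipede game. The payoffs below are illustrative assumptions, not those of the original figure, but they have the required structure (each mover prefers stopping if the game would end at the next stage):

```python
# Backward induction in a four-node centipede game.
# At node k, the mover chooses "down" (ending the game at down_payoffs[k])
# or "across" (passing the move on); passing at the last node yields final_payoffs.
down_payoffs = [(1, 0), (0, 2), (3, 1), (2, 4)]  # node k is moved by player k % 2
final_payoffs = (5, 3)

def solve(k):
    """Return (outcome, choice) of the subgame starting at node k."""
    mover = k % 2
    across = final_payoffs if k == len(down_payoffs) - 1 else solve(k + 1)[0]
    if down_payoffs[k][mover] > across[mover]:
        return down_payoffs[k], "down"
    return across, "across"

outcome, first_move = solve(0)
print(first_move, outcome)  # down (1, 0): player 1 stops immediately
```

Working from the last node back, every mover prefers down, so player 1 ends the game at once, even though continuing would eventually make both players better off.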

For the centipede, backward induction therefore recommends that player 1 play down at the first node; all other recommendations are counterfactual in the sense that no rational player should ever reach those nodes. So what should player 2 do if he found himself at node x? Reaching x at all would mean that player 1 had not played rationally. But if player 1 is not rational, then player 2 may hope that she will not choose down at her next choice either, thus allowing for a later terminal node to be reached. This consideration becomes problematic for backward induction if it also affects the counterfactual reasoning.

Now the truth of the first counterfactual makes false the antecedent condition of the second: it can never be true both that player 2 finds himself at x and that player 1 is rational. Thus it seems that by engaging in these sorts of counterfactual considerations, the backward induction conclusion becomes conceptually impossible. This is an intensely discussed problem in game theory and philosophy. Here only two possible solutions can be sketched. The first answer insists that common knowledge of rationality implies backward induction in games of perfect information (Aumann). This position is correct in that it denies the connection between the indicative and the counterfactual conditional.

Players have common knowledge of rationality, and they are not going to lose it regardless of the counterfactual considerations they engage in. But common knowledge by definition is not revisable, so the argument instead has to assume common belief of rationality. If one looks more closely at the versions of the above argument (e.g. Pettit and Sugden), it becomes clear that they employ the notion of common belief, and not of common knowledge.

Another solution of the above problem obtains when one shows, as Bicchieri (chapter 4) does, that limited knowledge of rationality and of the structure of the game suffices for backward induction. All that is needed is that a player, at each of her information sets, knows what the next player to move knows. This condition does not get entangled in internal inconsistency, and backward induction is justifiable without conceptual problems.

Further, and in agreement with the above argument, she also shows that in a large majority of cases, this limited knowledge of rationality condition is also necessary for backward induction. If her argument is correct, those arguments that support the backward induction concept on the basis of common knowledge of rationality start with a flawed hypothesis, and need to be reconsidered.

In this section, I have discussed a number of possible justifications for some of the dominant game-theoretic solution concepts. Note that there are many more solution concepts that I have not mentioned at all (most of them based on the Nash concept). Note also that this is a very active field of research, with new justifications and new criticisms developed constantly. All I tried to do in this section was to give a feel for some of the major problems of justification that game-theoretic solution concepts encounter.

In the preceding section, the focus was on the justification of solution concepts. In this section, I discuss some problematic results that obtain when applying these concepts to specific games. In particular, I show that the solutions of two important games disagree with some relevant normative intuitions. Note that in both cases these intuitions go against results accepted in mainstream game theory; many game theorists, therefore, will categorically deny that there is any paradox here at all. From a philosophical point of view, as well as from that of some of the other social sciences, these intuitions seem much more plausible and therefore merit discussion.

Recall the story from section 1c: a chain store faces a sequence of possible small-business entrants in its monopolistic market. In each period, one potential entrant can choose to enter the market or to stay out. If he has entered the market, the chain store can choose to fight or to share the market with him. Fighting means engaging in predatory pricing, which will drive the small-business entrant out of the market, but will incur a loss (the difference between oligopolistic and predatory prices) for the chain store. Thus fighting is a weakly dominated strategy for the chain store, and its threat to fight the entrant is not credible.

Because there will only be a finite number of potential entrants, the sequential game will also be finite. When the chain store is faced with the last entrant, it will cooperate, knowing that there is no further entrant to be deterred. But since the last entrant cannot be deterred, it would be irrational for the chain store to fight the penultimate potential entrant.

Thus, by backward induction, the chain store will always cooperate and the small businesses will always decide to enter. Selten, who developed this example, concedes that backward induction may be a theoretically correct solution concept. However, for the chain-store example, and a whole class of similar games, Selten construes backward induction as an inadequate guide for practical deliberation.

Instead, he suggests that the chain store may accept the backward induction argument for the last x periods, but not for the time up to x. Then, following what Selten calls a deterrence theory, the chain store responds aggressively to entries before x, and cooperatively after that. He justifies this theory (which, after all, violates the backward induction argument, and possibly the dominance argument) by intuitions about the results:

> If I had to play a game in the role of [the chain store], I would follow the deterrence theory. I would be very surprised if it failed to work. From my discussion with friends and colleagues, I get the impression that most people share this inclination. In fact, up to now I met nobody who said that he would behave according to the [backwards] induction theory. My experience suggests that mathematically trained persons recognize the logical validity of the induction argument, but they refuse to accept it as a guide to practical behavior.

Various attempts have been made to explain the intuitive result of the deterrence theory on the basis of standard game theory. Some of these limitations are discussed under the heading of bounded rationality in Section 2f.

Finite repetitions, however, still yield the result (R2, C2) from backward induction. That case is structurally very similar to the chain store paradox, whose implausibility was discussed above. Gauthier has offered such a justification based on the concept of constrained maximization. As a consequence, morality is not to be seen as a separate sphere of human life but as an essential part of maximization. Gauthier envisions a world in which there are two types of players: constrained maximizers (CM) and straightforward maximizers (SM). An SM player plays according to standard solution concepts; a CM player commits herself to choose R1 or C1 whenever she is reasonably sure she is playing with another CM player, and chooses to defect otherwise.

The problem for CM players is how to verify this condition. In particular in one-shot games, how can they be reasonably sure that their opponent is also CM, and thus also committed to not exploit? And how can one be sure that opponents of type CM correctly identify oneself as a CM type?

With regard to these questions, Gauthier offers two scenarios which try to justify the choice to become a CM. In the case of translucency, players only have beliefs about their mutual types. First, they need to believe that there are at least some CMs in the population. Second, they need to believe that players have a good capacity to spot CMs, and third that they have a good capacity to spot SMs.

If most players are optimistic about these latter two beliefs, they will all choose CM, thus boosting the number of CMs, making it more likely that CMs spot each other. Hence they will find their beliefs corroborated. If most players are pessimistic about these beliefs, they will all choose SM and find their beliefs corroborated. Gauthier, however, does not provide a good argument of why players should be optimistic; so it remains a question whether CM can be justified on rationality considerations alone.
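Gauthier's translucency argument can be given a rough numerical sketch. The payoffs and detection probabilities below are illustrative assumptions (standard PD payoffs T=5, R=3, P=1, S=0), not Gauthier's own numbers:

```python
# Expected one-shot PD payoffs for constrained (CM) vs straightforward (SM)
# maximizers under "translucency". Assumed parameters:
#   f = fraction of CMs in the population
#   p = probability a CM correctly spots another CM
#   q = probability a CM spots an SM (and therefore defects defensively)
T, R, P, S = 5, 3, 1, 0   # temptation, reward, punishment, sucker payoffs

def u_cm(f, p, q):
    # vs CM: cooperate only on mutual recognition (prob p*p); vs SM: defect if spotted.
    return f * (p * p * R + (1 - p * p) * P) + (1 - f) * (q * P + (1 - q) * S)

def u_sm(f, q):
    # vs CM: an unspotted CM cooperates with prob (1 - q) and is exploited.
    return f * (q * P + (1 - q) * T) + (1 - f) * P

# Optimistic beliefs (good detection): being CM pays.
print(u_cm(0.5, 0.9, 0.9) > u_sm(0.5, 0.9))  # True
# Pessimistic beliefs (poor detection): being SM pays.
print(u_cm(0.5, 0.5, 0.5) < u_sm(0.5, 0.5))  # True
```

The sketch reproduces the dependence on optimism: with good detection CM outperforms SM, with poor detection the ranking reverses, which is why the justification of CM seems to need more than rationality considerations alone.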

Bounded rationality is a vast field with very tentative delineations. The fundamental idea is that the rationality which mainstream cognitive models propose is in some way inappropriate. Depending on whether rationality is judged inappropriate for the task of rational advice or for predictive purposes, two approaches can be distinguished.

For game theory, questions of this kind concern computational capacity and the complexity-optimality trade-off. The discussion here will be restricted to normative bounded rationality. The outermost bound of rationality is computational impossibility. Binmore discusses this topic by casting both players in a two-player game as Turing machines.

A Turing machine is a theoretical model that allows for specifying the notion of computability. Very roughly, if a Turing machine receives an input, performs a finite number of computational steps which may be very large , and gives an output then the problem is computable. If a Turing machine is caught in an infinite regress while computing a problem, however, then the problem is not computable. The question Binmore discusses is whether Turing machines can play and solve games. Roughly put, when machine 1 first calculates the output of machine 2 and then takes the best response to its action, and machine 2 simultaneously calculates the output of machine 1 and then takes the best response to its action, the calculations of both machines enter an infinite regress.
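The regress can be caricatured in a few lines: each "machine" tries to compute the other's choice before best-responding, so neither computation terminates. This is only a toy illustration of the circularity, not Binmore's formal construction:

```python
# A toy rendering of the mutual-simulation regress: machine 1 must finish
# computing machine 2's output before responding, and vice versa.
import sys
sys.setrecursionlimit(500)  # keep the inevitable failure quick

def best_response(opponent_choice):
    return 1 - opponent_choice  # mismatch the opponent, as in a guessing game

def machine_1():
    return best_response(machine_2())

def machine_2():
    return best_response(machine_1())

try:
    machine_1()
    regress = False
except RecursionError:
    regress = True
print(regress)  # True: the mutual simulation never bottoms out
```

Python's recursion limit stands in here for the fact that the computation has no base case at all; the regress is conceptual, not merely a resource limit.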

Computational impossibility, however, is very far removed from the realities of rational deliberation. Take for example the way people play chess. Zermelo long ago showed that chess has a solution. Despite this result, chess players cannot calculate the solution of the game and choose their strategies accordingly (compare Gigerenzer et al).

Rubinstein restricts the set of strategies to those that can be executed by finite machines. He then defines the complexity of a strategy as the number of states of the machine that implements it.

Rubinstein shows that the set of equilibria for complexity-sensitive games is much smaller than that of the regular repeated game.

We now turn from the use of game theory as a normative theory to its use as a scientific theory of human behavior. Game theory, as part of Rational Choice Theory, is an important social scientific method. There is, however, considerable controversy about the usefulness of Rational Choice Theory for the purposes of the social sciences. Some of this controversy arises along disciplinary boundaries: while Rational Choice Theory is considered mainstream in economics (to the extent that no one even bothers using this label), sociologists and political scientists are more divided.

In their book, Green and Shapiro make two major claims about the scientific usefulness of Rational Choice Theory. First, they argue that Rational Choice Theory is empirically empty: that it has produced virtually no new propositions about politics that have been carefully tested and not found wanting. Second, they argue that the perceived universality claim of Rational Choice Theory is misguided: that even if an empirically successful Rational Choice Theory were to emerge, it would not be any more universal than the middle-level theories that they advocate. These two claims have been challenged on various fronts.

First, it has been pointed out that Green and Shapiro employ inappropriate standards for testing Rational Choice Theory, standards that not even successful theories of the hard sciences would survive (Diermeier). In Section 1c, I argued that game theory is in fact not a universal theory of rationality, but rather offers a menu of tools to model specific situations.

At least with respect to game theory, therefore, they attack the wrong target: game theory is useful because it is a widely applicable method, which works well in certain circumstances, rather than a universal substantive theory of human behavior. Although game theory cannot be dismissed as not useful for prediction just because it is part of Rational Choice Theory, game theory has a number of problems of its own that need to be discussed in depth.

The first issue is to what extent the role of game theory as a theory of rationality is relevant here. I contrast this possibility with a brief sketch of evolutionary game theory, which abandons the rationality notion altogether. In the next section, I discuss the problems of specifying the payoffs in a game, and thus of giving a game model empirical content.

Last, I discuss whether game theory can be tested at all, and investigate a recent claim that game theory has indeed been tested, and refuted. Game theory may be useful in predicting human behavior for two distinct reasons. First, it may be the case that game theory is a good theory of rationality, that agents are rational, and that therefore game theory predicts their behavior well. If game theory were correct for this reason, it could reap the additional benefit of great stability. Many social theories are inherently unstable, because agents adjust their behavior in the light of its predictions.

Such a self-fulfilling theory would be more stable than a theory that predicts irrational behavior. Players who know that their opponents will behave irrationally (because a theory tells them so) can improve their results by deviating from what the theory predicts, while players who know that their opponents will behave rationally cannot. However, the prospects for game theory as a theory where prescription and prediction coincide are not very good; evidence from laboratory experiments, as well as from casual observation, often casts doubt on it.

Second, and independently of the question of whether game theory is a good theory of rationality, game theory may be a good theory because it offers the relevant tools to systematize and predict interactive behavior successfully. This distinction may make sense when separating our intuitions about how agents behave rationally from a systematic account of our observations of how agents behave. Aumann, for example, suggests that the equilibrium concept is conceptually simple and attractive, and mathematically easy to work with.

As a result, it has led to many important insights in the applications, and has illuminated and established relations between many different aspects of interactive decision situations. It is these applications and insights that lend it validity (Aumann). The philosophy of science discusses various ways in which approximate models can relate to real phenomena; each has its specific problems, which cannot be discussed here.

Evolutive approaches to game theory offer such an interpretation (Binmore proposed this term in order to distinguish it from the eductive approaches discussed in Section 2).

Its proponents claim that economic, social and biological evolutionary pressures direct human agents, who have no clear idea what is going on, to behavior that is in accord with the solution concepts of game theory. The evolutive interpretation seeks to apply techniques, results, and justifications of assumptions from evolutionary game theory to game theory as a predictive theory of human behavior.

Evolutionary game theory was developed in biology; it studies the appearance, robustness and stability of behavioral traits in animal populations. This article cannot do justice even to the basics of this very vibrant and expanding field (for a concise and formal introduction, see Maynard Smith and Weibull), but instead presents only some aspects relevant to two questions; namely (i), to what extent can standard game theory elements be based on evolutionary game theory?

And (ii), does this reinterpretation help in the prediction of human behavior? Evolutionary game theory studies games that are played over and over again by players drawn from a population. It is thus often said that the strategies themselves are the players. Success of a strategy is defined in terms of the number of replications that a strategy will leave of itself to play in games of future generations.

Rather than determining equilibrium as the consequence of strategic reasoning by rational players, evolutionary game theory determines the stability of a strategy distribution in a population either as the resistance to mutant invasions, or as the result of a dynamic process of natural selection.

Its equilibrium concept is thus much closer to the stable-state concept of the natural sciences, where different causal factors balance each other out, than the eductive interpretation is. Evolutionary game theory can be divided into a static and a dynamic approach. The static approach specifies strategies that are evolutionarily stable against a mutant invasion. Imagine a population of players programmed to play one (mixed or pure) strategy A.

A strategy A is an evolutionarily stable strategy (ESS) if, for every possible mutant strategy B different from A, the payoff of playing A against A is higher than the payoff of playing B against A; or, if both payoffs are equal, the payoff of playing A against B is higher than that of playing B against B. Note that ESS is a robustness test against only a single mutation at a time. It is assumed that the population that plays an ESS has time to adjust back to the status quo before the next mutant invasion begins.
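The condition just stated can be written as a small check. The sketch below tests pure strategies in a symmetric two-player game, using illustrative Prisoner's Dilemma payoffs (the numbers are assumptions for illustration):

```python
# ESS test for pure strategies in a symmetric game.
# E[i][j] is the payoff to a player using strategy i against strategy j.
# Strategy a is an ESS iff for every b != a:
#   E[a][a] > E[b][a], or E[a][a] == E[b][a] and E[a][b] > E[b][b].
def is_ess(E, a):
    for b in range(len(E)):
        if b == a:
            continue
        if E[a][a] < E[b][a]:
            return False
        if E[a][a] == E[b][a] and E[a][b] <= E[b][b]:
            return False
    return True

# Prisoner's Dilemma payoffs (illustrative): 0 = cooperate, 1 = defect.
E = [[3, 0],
     [5, 1]]
print(is_ess(E, 1))  # True: defection resists any cooperative mutant
print(is_ess(E, 0))  # False: cooperators are invaded by defectors
```

Here defection is an ESS, and cooperation, being exploitable by a defecting mutant, is not.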

It follows from this definition that every ESS is a strategy that is in Nash equilibrium with itself. However, not every strategy that is in Nash equilibrium with itself is an ESS. The dynamic approach of evolutionary game theory considers a selection mechanism that favors some strategies over others in a continuously evolving population. Imagine a population whose members are programmed to play different strategies. Pairs of players are drawn at random to play against each other. Their payoff consists in an increase or decrease in fitness, measured as the number of offspring per time unit.

Reproduction takes place continuously over time, with the birthrate depending on fitness, and the death rate being uniform for all players. Long continuations of tournaments between players then may lead to stable states in the population, depending on the initial population distribution. This notion of dynamic stability is wider than that of evolutionary stability: while all evolutionary stable strategies are also dynamically stable, not all dynamically stable strategies are evolutionary stable.

It has been shown that in the long run, all strictly dominated and all iteratively strictly dominated strategies are eliminated from the population. The relation between stable states and Nash equilibria is more complex, and would require specifications that go beyond the scope of this brief sketch.

Evolutionary game theory provides interesting concepts and techniques that are quite compatible with the solution concepts of standard game theory discussed in Section 1; however, it focuses mainly on two-person static games; dynamic games and game repetitions are less investigated.
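The elimination of strictly dominated strategies can be illustrated with a discrete-time replicator dynamic, a standard textbook selection mechanism; the payoff numbers below are illustrative PD payoffs, under which cooperation is strictly dominated:

```python
# Discrete-time replicator dynamic in a symmetric 2x2 game.
# A[i][j]: fitness of strategy i against strategy j; cooperation (row 0)
# is strictly dominated by defection (row 1) in this illustrative PD.
A = [[3, 0],
     [5, 1]]

x = [0.5, 0.5]  # initial population shares of the two strategies
for _ in range(500):
    fitness = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    avg = sum(x[i] * fitness[i] for i in range(2))
    # Each strategy's share grows in proportion to its relative fitness.
    x = [x[i] * fitness[i] / avg for i in range(2)]

print(round(x[0], 4))  # approximately 0: the dominated strategy is driven out
```

No individual reasons about the game; selection on relative fitness alone removes the dominated strategy from the population.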

Clearly, evolutionary game theory is more concerned with discovering conditions of stability and robustness of strategies in populations, than with finding the equilibria of a single game. The question that remains is whether it competes in its explanatory efforts with eductive game theory, or whether it deals instead with different although maybe related phenomena.

On one interpretation, strategies are memes: players are mere hosts to these memes, and their behavior is partly determined by them. Fitness is a property of the meme and its capacity to replicate itself to other players. Expected utility maximization is then interpreted as a result of evolutionary selection.

Once you regard several players as a single player, they are meant to cooperate, as they must act like a single player. Nash is very clear about this in his Annals paper:

> Von Neumann and Morgenstern have developed a very fruitful theory of two-person zero-sum games in their book Theory of Games and Economic Behavior. This theory is based on an analysis of the interrelationships of the various coalitions which can be formed by the players of the game.
>
> Our theory, in contradistinction, is based on the absence of coalitions in that it is assumed that each participant acts independently, without collaboration or communication with any of the others. The notion of an equilibrium point is the basic ingredient in our theory. This notion yields a generalization of the concept of the solution of a two-person zero-sum game. It turns out that the set of equilibrium points of a two-person zero-sum game is simply the set of all pairs of opposing "good strategies."
>
> We shall also introduce the notions of solvability and strong solvability of a non-cooperative game and prove a theorem on the geometrical structure of the set of equilibrium points of a solvable game.

This answer overlaps with other answers, but I think another restatement may be helpful because the situation is slightly confusing. However, the key point is that it's important to ask the right question. We now understand, thanks to Nash, that a basic necessary condition for a set of strategies to be "optimal" is for them to form a Nash equilibrium, but von Neumann and Morgenstern did not hit on this concept.

So Nash didn't just answer the obvious question; the right question wasn't obvious, but he found it anyway, and answered it. The second innovative aspect of Nash's work is technical: the two-person zero-sum result was based on the theory of linear programming and minimax, while proving the existence of a Nash equilibrium requires different, fixed-point techniques. So the naive approach to generalization, namely staring at the existing result and trying to figure out how to use the same ideas to prove something more general, does not lead to Nash's key insight.

Myerson gives a good history of the theory: Nash equilibrium and the history of economic theory. Thus von Neumann argued that virtually any competitive game can be modeled by a mathematical game with the following simple structure: There is a set of players, each player has a set of strategies, each player has a payoff function from the Cartesian product of these strategy sets into the real numbers, and each player must choose his strategy independently of the other players.

Von Neumann did not consistently apply this principle of strategic independence, however. In his analysis of games with more than two players, von Neumann assumed that players would not simply choose their strategies independently, but would coordinate their strategies in coalitions. Furthermore, by his emphasis on max-min values, von Neumann was implicitly assuming that any strategy choice for a player or coalition should be evaluated against the other players' rational response, as if the others could plan their response after observing this strategy choice.

Before Nash, however, no one seems to have noticed that these assumptions were inconsistent with von Neumann's own argument for strategic independence of the players in the normal form. Von Neumann also added two restrictions to his normal form that severely limited its claim to be a general model of social interaction for all the social sciences: He assumed that payoff is transferable, and that all games are zero-sum. In contrast, Nash provided a way to deal with the more general problem of non-transferable utility and non-zero-sum games.

But the most important new contribution of Nash (fully as important as the general definition and the existence proof of Nash b) was his argument that this noncooperative equilibrium concept, together with von Neumann's normal form, gives us a complete general methodology for analyzing all games: von Neumann's normal form is our general model for all games, and Nash's equilibrium is our general solution concept.