About a year ago, when I was making my first serious inroads into the technical underbelly of my field, following my utilitarian opponents into a futurist landscape of backwards Es and upside-down As, Robert Paul Wolff (who seems to be entirely incapable of stopping writing, bless his soul) ran a tutorial on formal methods in political philosophy which I found very useful, especially its introduction to Arrow's Impossibility Theorem in social choice theory. (The tutorial is archived along with his autobiography, which I heartily recommend, and some other bits and pieces here.) However, I thought that Wolff was rather harsh on the prospects of the prisoner's dilemma as a topic for serious philosophy, largely because the suppositions behind the case are so badly disconnected from the conditions of the actual world. Contrary to Wolff's evaluation, I think there is something to be said for the prisoner's dilemma. Heaven knows a lot of people make ridiculous claims regarding it (for instance, I was told once that it shows that ethics is impossible), but there are at least two reasons to take it seriously, if only as an analytical device. It might be a good tool for cutting to the heart of various hypotheses, even if we agree with what Wolff has said about its limitations (as we should).
If I may remind the reader of what the prisoner's dilemma (hereafter, PD) is, I'll quote the description from the Stanford Encyclopedia of Philosophy entry:
Tanya and Cinque have been arrested for robbing the Hibernia Savings Bank and placed in separate isolation cells. Both care much more about their personal freedom than about the welfare of their accomplice. A clever prosecutor makes the following offer to each. “You may choose to confess or remain silent. If you confess and your accomplice remains silent I will drop all charges against you and use your testimony to ensure that your accomplice does serious time. Likewise, if your accomplice confesses while you remain silent, they will go free while you do the time. If you both confess I get two convictions, but I'll see to it that you both get early parole. If you both remain silent, I'll have to settle for token sentences on firearms possession charges. If you wish to confess, you must leave a note with the jailer before my return tomorrow morning.”
We can represent the options in a little pay-off matrix, mapping out all the possibilities:
|  | Cinque stays silent | Cinque confesses |
| --- | --- | --- |
| Tanya stays silent | Tanya is jailed very briefly / Cinque is jailed very briefly | Tanya is jailed for a long time / Cinque goes free |
| Tanya confesses | Tanya goes free / Cinque is jailed for a long time | Tanya is jailed for a short time / Cinque is jailed for a short time |
Each of the pair can see that, whatever the other does, they get less time in jail if they confess and hang the other prisoner out to dry. But that would lead to both of them confessing, and so to a situation that is worse for both than if they had stayed silent. That is the prisoner's dilemma.
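The reasoning in that paragraph can be checked mechanically. What follows is only a sketch: the story fixes the ordering of the sentences, not their length, so the numbers of years below are assumptions of mine, as are the names `YEARS` and `best_reply_for_tanya`.

```python
# Years in jail as (Tanya, Cinque). The figures are hypothetical;
# only their ordering is taken from the prosecutor's offer.
YEARS = {
    ("silent", "silent"): (1, 1),     # token sentences for both
    ("silent", "confess"): (10, 0),   # Tanya does the time, Cinque goes free
    ("confess", "silent"): (0, 10),   # Tanya goes free, Cinque does the time
    ("confess", "confess"): (5, 5),   # two convictions with early parole
}
OPTIONS = ("silent", "confess")


def best_reply_for_tanya(cinque_choice):
    """Return the option that gives Tanya the least jail time, given Cinque's choice."""
    return min(OPTIONS, key=lambda mine: YEARS[(mine, cinque_choice)][0])


for cinque_choice in OPTIONS:
    reply = best_reply_for_tanya(cinque_choice)
    print(f"If Cinque chooses {cinque_choice!r}, Tanya does best to choose {reply!r}")

both_confess = YEARS[("confess", "confess")]
both_silent = YEARS[("silent", "silent")]
print("Both confess:", both_confess, "versus both silent:", both_silent)
```

With these assumed numbers, confessing is Tanya's best reply to either choice, and since the table is symmetric the same holds for Cinque. Yet five years each is worse for both than one year each.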
Wolff complains that discussing the PD in terms of this story distorts our understanding of the picture, because there are a number of assumptions about how players in such a game would act which map very badly onto how actual people in real situations act. For instance, it is very hard indeed to imagine someone caring only about how little time they spend in jail, with no regard for the other's welfare or for any other concern. Wolff's point is well-taken, but I want to say that even if we accept what he says, we can mine some interesting results from using this scenario as an analytic tool: in particular, in seeing what it tells us about the types of reasoning decision-theorists and the like would like us to do. (Perhaps I am better disposed towards the PD than Wolff is because I didn't need to wade through the thousand-odd journal articles written on this subject in the 60s and 70s, when nobody could shut up about this thing.) To that end, it is useful to consider the PD in its most general form, as the game described by the following pay-off matrix (one agent choosing a row, the other choosing a column, as Tanya and Cinque did above), where the options are either to co-operate with the other agent (staying silent, in the prisoner's case) or to defect (confessing to the crime and selling the other prisoner up the river):
|  | Co-operate | Defect |
| --- | --- | --- |
| Co-operate | Good / Good | Worst / Best |
| Defect | Best / Worst | Bad / Bad |
Any situation which has a pay-off matrix like this in it can be analysed in terms of the prisoner's dilemma.
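One way to make "a pay-off matrix like this" precise is to test for two features: defecting is the better choice for each agent whatever the other does, and both agents nonetheless prefer mutual co-operation to mutual defection. The sketch below assumes numeric pay-offs where higher is better. The function name `has_pd_structure` and the example numbers are mine, chosen for illustration.

```python
def has_pd_structure(payoffs):
    """Check a 2x2 game for the prisoner's dilemma pattern.

    payoffs[(row_move, col_move)] = (row_payoff, col_payoff), with moves
    "C" (co-operate) and "D" (defect); higher numbers are better.
    """
    def row(r, c):
        return payoffs[(r, c)][0]

    def col(r, c):
        return payoffs[(r, c)][1]

    # Defecting beats co-operating for each agent, whatever the other does.
    row_always_defects = all(row("D", c) > row("C", c) for c in "CD")
    col_always_defects = all(col(r, "D") > col(r, "C") for r in "CD")
    # Yet both would rather end up at mutual co-operation than mutual defection.
    both_prefer_cooperation = (
        row("C", "C") > row("D", "D") and col("C", "C") > col("D", "D")
    )
    return row_always_defects and col_always_defects and both_prefer_cooperation


# Hypothetical numbers standing in for Best=5, Good=3, Bad=1, Worst=0.
pd_game = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
# A contrasting game in which co-operating is the best reply to co-operation.
other_game = {("C", "C"): (4, 4), ("C", "D"): (0, 3),
              ("D", "C"): (3, 0), ("D", "D"): (2, 2)}

print(has_pd_structure(pd_game))     # True
print(has_pd_structure(other_game))  # False
```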
Having done the throat-clearing, let me now present the two reasons why I think we should pay attention to the PD. The second is far and away the more important, but the first helps to lead us there.
The first reason is that there are simply so many theoretically interesting cases which can be modelled as some variation of the PD, that is, where situations arise with the pay-off matrix I described above. There is the traveller's dilemma, the centipede game, the ultimatum game, and so on. I'll leave it up to the reader to investigate these cases, and their link to the PD, on their own. But note that understanding any situation which can be modelled in this way is going to necessitate understanding the implications of the PD (which includes, as Wolff stresses, knowing what it doesn't entail).
Secondly, the most important reason to look at the PD (which I was surprised to see get no mention at all in the tutorial) is that it gives a very embarrassing and problematic result for the mass of people who believe that decision theory, etc., provide the gold standard for human reasoning. That is, the PD shows that utility maximisation doesn't lead to Pareto-optimal situations (which was a bit of a surprise, since under similar suppositions the free market, which is driven entirely by utility-maximisation, does lead to Pareto-optimal distributions of resources – a bit more on that later in this paragraph). Utility maximisation is the procedure whereby at each point you need to make a decision you take whatever course of action has the best prospects for getting you what you want (after taking into consideration all the likely future effects of your actions). One situation is Pareto-superior to another if every person involved finds the first to be at least as good as the second and at least one person finds it strictly better; a situation is Pareto-optimal if no available alternative is Pareto-superior to it. In non-wonk terms, the PD demonstrates that if everybody tries at every step to take the action with consequences they'd most prefer, they are quite likely to end up in a situation they find less preferable than one they would have reached had they acted differently. It in fact does even more: mutual defection is worse for both prisoners than mutual co-operation, whereas a situation already counts as Pareto-suboptimal if just one person could be made better off without anyone else being made worse off. This is embarrassing and problematic to the decision theorist, because Pareto-optimality is a very low bar indeed. There is a range of terrible situations that are Pareto-optimal – for instance, a fiefdom with its range of landlords and impoverished serfs is a Pareto-optimal distribution of land, since to give any land to a serf you need to take it away from a landlord, which means that changing the distribution of land would always be against the preferences of at least one person.
If utility-maximisation can't even ensure reaching situations with that low level of goodness, then the decision theorist has reason to worry.
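The claim can be checked by brute force over the four outcomes of the general matrix. The sketch below makes two assumptions. The pay-off numbers are hypothetical stand-ins for Best, Good, Bad and Worst. And "what utility-maximisers end up with" is read as an outcome where neither agent could do better by changing their own move alone (`is_stable` is my name for that test).

```python
MOVES = ("C", "D")  # co-operate, defect

# Hypothetical pay-offs as (row agent, column agent); higher is better.
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def pareto_superior(a, b):
    """True if pay-off pair a is at least as good as b for everyone
    and strictly better for at least one person."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def is_pareto_optimal(outcome):
    """True if no other outcome is Pareto-superior to this one."""
    return not any(pareto_superior(other, PAYOFFS[outcome])
                   for other in PAYOFFS.values())


def is_stable(outcome):
    """True if neither agent gains by unilaterally switching their own move."""
    r, c = outcome
    row_content = all(PAYOFFS[(r, c)][0] >= PAYOFFS[(alt, c)][0] for alt in MOVES)
    col_content = all(PAYOFFS[(r, c)][1] >= PAYOFFS[(r, alt)][1] for alt in MOVES)
    return row_content and col_content


for outcome in PAYOFFS:
    print(outcome,
          "Pareto-optimal:", is_pareto_optimal(outcome),
          "stable under utility-maximisation:", is_stable(outcome))
```

On these numbers, mutual defection is the only outcome that passes the stability test, and it is also the only one of the four that fails to be Pareto-optimal.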
It's this feature of the PD which gives force to the tragedy of the commons (as Garrett Hardin described it in 1968, though only later was this analysed as a PD). Each member of a community who tends sheep and has access to the common pasture always has the incentive to put one more sheep in the field: though this lowers the total productivity of the commons through overloading, the individual's gain from the extra sheep outweighs the small loss in value on each of their own sheep. But if everybody follows this incentive (as utility-maximisation demands) then the commons will soon be exhausted and every farmer will be worse off in the end. The lesson to be learned here isn't that co-operation in such situations is impossible (as some people bizarrely claim, showing off a staggering confusion about the structure of human purposive action) but that utility-maximisation – the hard-nosed pragmatism which makes the prisoner defect every time – is untenable as a general guide to action. In scenarios with PD pay-offs (and the insights of the countless writers on this topic indicate just how many there might be) utility-maximisation turns out to lead us by the nose to our downfall. And that is what we should learn from the prisoner's dilemma.
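For readers who want to see the sheep arithmetic, here is a toy model of the commons. Everything in it is an assumption made for illustration and none of it comes from Hardin: five farmers, and a sheep whose value falls by one unit for every sheep on the pasture, starting from 120.

```python
# Toy commons; every number and the linear formula are illustrative assumptions.
CAPACITY = 120  # one sheep is worth CAPACITY minus the total flock, never below 0
FARMERS = 5


def payoff(own_sheep, others_sheep):
    """Value of one farmer's flock, given how many sheep everyone else grazes."""
    value_per_sheep = max(0, CAPACITY - (own_sheep + others_sheep))
    return own_sheep * value_per_sheep


def best_response(others_sheep):
    """Flock size that maximises one farmer's pay-off, holding the others fixed."""
    return max(range(CAPACITY + 1), key=lambda own: payoff(own, others_sheep))


restrained = 12  # 60 sheep in total, which maximises the commons' total value
others = (FARMERS - 1) * restrained

print("Everyone restrained:", payoff(restrained, others))
print("I add one sheep:", payoff(restrained + 1, others))
print("A restrained neighbour after that:", payoff(restrained, others + 1))

greedy = 20  # the flock size at which nobody gains by changing alone
print("Best reply when the others graze 20 each:",
      best_response((FARMERS - 1) * greedy))
print("Everyone greedy:", payoff(greedy, (FARMERS - 1) * greedy))
```

In this model, adding a thirteenth sheep raises the farmer's own pay-off from 720 to 767 while lowering each neighbour's to 708. Once everyone has followed that incentive to the point where no one gains by changing alone, each farmer is left with 400.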