Humans are a remarkably cooperative bunch of organisms. This is a remarkable fact because cooperation can open the door wide to all manner of costly exploitation. While it can be a profitable strategy for all involved parties, cooperation requires a certain degree of vigilance and, at times, the credible threat of punishment in order to maintain its existence. Figuring out how people manage to solve these cooperative problems has provided us with no shortage of research and theorizing, some of which is altogether more plausible than the rest. Though I haven’t quite figured out the appeal yet, there are many thoughtful people who favor the group selection accounts for explaining why people cooperate. They suggest that people will often cooperate in spite of its personal fitness costs because it serves to better the overall condition of the group to which they belong. While there haven’t been any useful predictions that appear to have fallen out of such a model, there are those who are fairly certain it can at least account for some known, but ostensibly strange findings.
One human trait purported to require a group selection explanation is altruistic punishment and cooperation, especially in one-shot anonymous economic games. The basic logic goes as follows: in a prisoner’s dilemma game, so long as that game is a non-repeated event, there is really only one strategy, and that’s defection. This is because if you defect when your partner defects, you’re better off than if you cooperated; if your partner cooperated, on the other hand, you’re still better off if you defect. Economists might thus call the strategy of “always defect” a “rational” one. Further, punishing a defector in such conditions is similarly considered irrational behavior, as it only results in a lower payment for the punisher than they would have otherwise had. As we know from decades of research using these games, however, people don’t always behave “rationally”: sometimes they’ll cooperate with other people they’re playing with, and sometimes they’ll give up some of their own payment in order to punish someone who has either wronged them or, more importantly, wronged a stranger. This pattern of behavior – paying to be nice to people who are nice, and paying to punish those who are not – has been dubbed “strong reciprocity” (Fehr, Fischbacher, & Gachter, 2002).
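For readers who like to see the dominance argument spelled out, here is a minimal sketch in Python. The payoff numbers are my own illustrative assumptions (any values with temptation > reward > punishment > sucker would do), and the name `best_reply` is hypothetical; nothing here comes from Fehr et al.

```python
# Hypothetical payoffs to "me" in a one-shot prisoner's dilemma,
# keyed by (my_move, partner_move). "C" = cooperate, "D" = defect.
# The numbers are illustrative assumptions, not data from any study.
PAYOFF = {
    ("C", "C"): 3,  # reward for mutual cooperation
    ("C", "D"): 0,  # sucker's payoff
    ("D", "C"): 5,  # temptation to defect
    ("D", "D"): 1,  # mutual defection
}


def best_reply(partner_move):
    """Return the move that maximizes my payoff against a fixed partner move."""
    return max(("C", "D"), key=lambda my_move: PAYOFF[(my_move, partner_move)])


# Defection is the best reply no matter what the partner does.
replies = {partner: best_reply(partner) for partner in ("C", "D")}
print(replies)  # {'C': 'D', 'D': 'D'}
```

Whatever the partner plays, the best reply is to defect, which is all that “rational” means in this narrow sense.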
The general raison d’etre of strong reciprocity seems to be that groups of people which had lots of individuals playing that strategy managed to out-compete other groups of people without them. Even though strong reciprocity is costly on the individual level, the society at large reaps larger overall benefits, as cooperation has the highest overall payoff, relative to any kind of defection. Strong reciprocity, then, helps to force cooperation by altering the costs and benefits to cooperation and defection on the individual level. There is a certain kind of unfairness inherent in this argument, though; a conceptual hypocrisy that can be summed up by the ever-popular phrase, “having one’s cake and eating it too”. To consider why, we need to understand the reason people engage in punishment in the first place. The likely, possibly-obvious candidate explanation just advanced is that punishment serves a deterrence function: by inflicting costs on those who engage in the punished behavior, those who engage in the behavior fail to benefit from it and thus stop behaving in that manner. This function, however, rests on a seemingly innocuous assumption: actors estimate the costs and benefits to acting, and only act when the expected benefits are sufficiently large, relative to the costs.
The conceptual hypocrisy is that this kind of cost-benefit estimation is something that strong reciprocators are thought not to engage in. Specifically, they are punishing and cooperating regardless of the personal costs involved. We might say that a strong reciprocator’s behavior is inflexible with respect to their own payments. This example is a bit like playing the game of “chicken”, where two cars face each other from a distance and start driving at one another in a straight line. The first driver to turn away loses the match. However, if both cars continue on their path, the end result is a much greater cost to both drivers than is suffered if either one turns. If a player in this game were to adopt an inflexible strategy, then, by doing something like disabling their car’s ability to steer, they can force the other player to make a certain choice. Faced with a driver who cannot turn, you really only have one choice to make: continue going straight and suffer a huge cost, or turn and suffer a smaller one. If you’re a “rational” being, then, you can be beaten by an “irrational” strategy.
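The chicken logic can be sketched the same way. Again, the payoff values and the names `CHICKEN` and `rational_reply` are assumptions of mine for illustration; the only thing that matters is that a crash costs far more than losing the match.

```python
# Hypothetical payoffs to "me" in chicken, keyed by (my_move, their_move).
# The values are illustrative assumptions; a crash is simply much worse
# than losing the match.
CHICKEN = {
    ("turn", "turn"): 0,            # nobody wins, nobody crashes
    ("turn", "straight"): -1,       # I turn first and lose the match
    ("straight", "turn"): 1,        # they turn first and I win
    ("straight", "straight"): -10,  # crash
}


def rational_reply(their_move):
    """Best move for a payoff-sensitive driver against a fixed opponent move."""
    return max(("turn", "straight"), key=lambda move: CHICKEN[(move, their_move)])


# A driver who has disabled their steering is committed to "straight".
committed = "straight"
reply = rational_reply(committed)              # the flexible driver's choice
committed_payoff = CHICKEN[(committed, reply)]
rational_payoff = CHICKEN[(reply, committed)]
print(reply, committed_payoff, rational_payoff)  # turn 1 -1

# Two committed drivers cannot respond to each other at all: both crash.
crash = (CHICKEN[("straight", "straight")], CHICKEN[("straight", "straight")])
```

The inflexible driver wins against a flexible one, but only because the opponent is responsive to costs. Put two inflexible drivers on the road and both take the worst payoff on the table.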
So what would be the outcome if other individuals started playing the ever-present “always defect” strategy in a similarly inflexible fashion? We’ll call those people “strong defectors” for the sake of contrast. No matter what their partner does in these interactions, the strong defectors will always play defect, regardless of the personal costs and benefits. By doing so, these strong defectors might manage to place themselves beyond the reach of punishment from strong reciprocators. Why? Well, any amount of costly punishment directed towards a strong defector would be a net fitness loss from the group’s perspective, as costly punishment is a fitness-reducing behavior: it reduces the fitness of the person engaging in it (in the form of whatever cost they suffer to deliver the punishment) and it reduces the fitness of the target of the punishment. Further, the costs to punishing the defectors could have been directed towards benefiting other people instead – which are net fitness gains for the group – so there are opportunity costs to engaging in punishment as well. These fitness costs would need to be made up for elsewhere, from the group selection perspective.
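A rough bit of bookkeeping makes the group-level accounting concrete. This is a toy sketch under assumptions I am supplying myself: the function name, its parameters, and the numbers are all hypothetical, and it ignores the opportunity costs mentioned above.

```python
def group_fitness_change(punisher_cost, target_damage,
                         gain_if_deterred, target_is_flexible):
    """Net change in summed group fitness from one act of costly punishment.

    All quantities are hypothetical fitness units. The punishment only
    pays off for the group if the target changes behavior afterwards.
    """
    loss = punisher_cost + target_damage  # both parties lose fitness
    gain = gain_if_deterred if target_is_flexible else 0
    return gain - loss


# Punishing a defector who responds to costs can be a net gain...
print(group_fitness_change(1, 3, 10, target_is_flexible=True))   # 6
# ...but punishing a strong defector is pure loss for the group.
print(group_fitness_change(1, 3, 10, target_is_flexible=False))  # -4
```

With a flexible target, the later gains from cooperation can outweigh what the punishment cost; with a strong defector, there is nothing on the other side of the ledger.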
The problem is that, because the strong defectors are playing an inflexible strategy, the costs cannot be made up for elsewhere; no behavioral change can be effected. Extending this game of chicken analogy to the group level, let’s say that turning away is the “cooperative” option, and dilemmas like these were at least fairly regular. They might not have involved cars, but they did involve a similar kind of payoff matrix: there’s only one benefit available, but there are potential costs in attempting to achieve it. Keeping in line with the metaphor, it would be in the interests of the larger population if no one crashed. It follows that between-group selective pressures favor turning every time, since the costs are guaranteed to be smaller for the wider population, but the sum of the benefits doesn’t change; only who achieves them does. In order to force the cooperative option, a strong reciprocator might disable their ability to turn so as to alter the costs and benefits to others.
The strong reciprocators shouldn’t be expected to be unaffected by costs and benefits, however; they ought to be affected by such considerations, just on the group level, rather than the individual one. Their strategy should be just as “rational” as any other, just with regard to a different variable. Accordingly, it can be beaten by other seemingly irrational strategies – like strong defection – that can’t be affected by the threats of costs. Strong defectors that refuse to turn will either force a behavioral change in the strong reciprocators or result in many serious crashes. In either case, the strong reciprocator strategy doesn’t seem to lead to benefits in that regard.
Now perhaps this example sounds a bit flawed. Specifically, one might wonder how appreciable portions of the population might come to develop an inflexible “always defect” strategy in the first place. This is because the strategy appears to be costly to maintain at times: there are benefits to cooperation and being able to alter one’s behavior in response to costs imposed through punishment, and people would be expected to be selected to achieve and avoid them, respectively. On top of that, there is also the distinct concern that repeated attempts at defection or exploitation can result in punishment severe enough to kill the defector. In other words, it seems that there are certain contexts in which strong defectors would be at a selective disadvantage, becoming less prevalent in the population over time. Indeed, such a criticism would be very reasonable, and that’s precisely because the always-defect population behaves without regard to their personal payoff. Of course, such a criticism applies with just as much force to the strong reciprocators, and that’s the entire point: using a limited budget to affect the lives of others regardless of its effects on you isn’t the best way to make the most money.
The idea of strong defectors seems perverse precisely because they act without regard to what we might consider their own rational interests. Were we to replace “rational” with “fitness”, the evolutionary disadvantage to a strategy that functions as if behaving in such a manner seems remarkably clear. The point is that the idea of a strong reciprocator type of strategy should be just as perverse. Those who attempt to put forth a strong reciprocator type of strategy as a plausible account for cooperation and punishment attempt to create a context that allows them to have their irrational-agent cake and eat it as well: strong reciprocators need not behave within their fitness interests, but all the other agents are expected to. This assumption needs to be at least implicit within the models, or else they make no sense. They don’t seem to make very much sense in general, though, so perhaps that assumption is the least of their problems.
References: Fehr, E., Fischbacher, U., & Gachter, S. (2002). Strong reciprocity, human cooperation, and the enforcement of social norms. Human Nature, 13, 1-25. DOI: 10.1007/s12110-002-1012-7