Science By Funeral

“A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.”

As the above quote by Max Planck suggests, science is a very human affair. While, in an idealized form, the scientific process is a very useful tool for discovering truth, the reality of using the process in the world can be substantially messier. One of the primary culprits of this messiness is that being a good scientist per se – as defined by one who rigorously and consistently applies the scientific method – is not necessarily any indication that one is particularly bright or worthy of social esteem. It is perfectly possible to apply the scientific method to the testing of any number of inane or incorrect hypotheses. Instead, social status (and its associated rewards) tends to be provided to people who discover something that is novel, interesting, and true. Well, sort of; the discovery itself need not be exactly true as much as people need to perceive the idea as being true. So long as people perceive my ideas to be true, I can reap those social benefits; I can even do so if my big idea was actually quite wrong.

Sure; it looks plenty bright, but it’s mostly just full of hot air

Just as there are benefits to being known as the person with the big idea, there are also benefits to being friends with the person with the big idea, as access to those social (and material) resources tends to diffuse to the academic superstar’s close associates. Importantly, these benefits can still flow to those associates even if they lack the same skill set that made the superstar famous. To put this all into a simple example, getting a professor position at Harvard likely carries social and material benefits to the professor; those who study under the professor and get a degree from Harvard can also benefit by riding the coattails of the professor, even if they aren’t particularly smart or talented themselves. One possible result of this process is that certain ideas can become entrenched in a field, even if the ideas are not necessarily the best: as the originator of the idea has a vested interest in keeping it the order of the day in his field, and his academic progeny have a similar interest in upholding the originator’s status (as their status depends on his), new ideas may be – formally or informally – barred from entry and resisted, even if they more closely resemble the truth. As Planck quipped, then, science begins to move forward as the old guard die out and can no longer defend their status effectively; not because they relinquish their status in the face of new, contradictory evidence.

With this in mind, I wanted to discuss the findings of one of the most interesting papers I’ve seen in some time. The paper (Azoulay, Fons-Rosen, & Zivin, 2015) examined what happens to a field of research in the life sciences following the untimely death of one of its superstar members. Azoulay et al. (2015) began by identifying their sample of approximately 13,000 superstars, 452 of whom died prematurely (which, in this case, corresponded to an average age at death of 61). The term “superstar” certainly describes those who died well, at least in terms of their output: by the time of their death, the median superstar had authored 138 papers, received 8,347 citations, and been awarded over $16 million in government funding. These superstars were then linked to the various subfields in which they published, their collaborators and non-collaborators within those subfields were identified, and a number of other variables that I won’t go into were also collected.

The question of interest, then, is what happens to these fields following the death of a superstar? In terms of the raw number of publications within a subfield, there was a very slight increase of about 2% following the death. That number does not give much of a sense for the interesting things that were happening, however. The first of these things is that the superstar’s collaborators saw a rather steep decline in their research output; a decline of about 40% over time. This was an effect that remained (though it was somewhat reduced) even when the analysis excluded papers on which the superstar was an author (which makes sense: if one of your authors dies, of course you will produce fewer papers; there was just more to the decline than that). However, this drop in the collaborators’ productivity was more than offset by an 8% increase in output by non-collaborators. The decline in collaborator output would be consistent with a healthy degree of coattail-riding likely taking place prior to death. Further, there were no hints of these trends prior to the death, suggesting that the death in question was doing the causing when it came to changes in research output.

Figure 2: How much better-off your death made other people

The possible “whys” as to these effects were examined in the rest of the paper. A number of hints as to what is going on follow. First, there is the effect of death on citation counts, with non-collaborators producing more high-impact – but not low-impact – papers after the superstar’s passing. Second, these non-collaborators were producing papers in the very same subfields that the superstar had previously been in. Third, this new work did not appear to be building on the work of the superstar; the non-collaborators tended to cite the superstar less and newer work more. Fourth, the newer authors were largely not competitors of the superstar while the superstar was alive, opting instead to become active in the field following the death. The picture being painted by the data seems to be one in which the superstars initially dominate publishing within their subfields. While new faces might have some interest in researching these same topics, they fail to enter the field while the superstar is alive, instead providing their new ideas – not those already established – only after a hole has opened in the social fabric of the field. In other words, there might be barriers to entry keeping newcomers out, and those barriers relax somewhat following the death of a prominent member.

Accordingly, Azoulay et al (2015) turn their attention to what kinds of barriers might exist. The first barrier they posit is one they call “Goliath’s Shadow”, where newcomers are simply deterred by the prospect of having to challenge existing, high-status figures. Evidence consistent with this prospect was reported: the importance of the superstar – as defined by the fraction of papers in the field produced by them – seemed to have a noticeable effect, with more important figures creating a larger void to fill. By contrast, the involvement of the superstar – as defined by what percentage of their papers were published in a given field – did not seem to have an effect. The more a superstar published (and received grant money), the less room other people seemed to see for themselves. 
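The two measures above are easy to conflate, so here is a minimal Python sketch of how each could be computed. The function names and the paper counts are hypothetical, chosen only for illustration; Azoulay et al. may well operationalize these measures differently.

```python
from fractions import Fraction


def importance(star_papers_in_field, all_papers_in_field):
    """Share of the field's papers that the superstar produced."""
    return Fraction(star_papers_in_field, all_papers_in_field)


def involvement(star_papers_in_field, all_star_papers):
    """Share of the superstar's own papers that fall in this field."""
    return Fraction(star_papers_in_field, all_star_papers)


# Hypothetical counts: the superstar wrote 30 of the field's 200 papers,
# and those 30 are drawn from 150 papers written over a whole career.
print(float(importance(30, 200)))   # share of the field
print(float(involvement(30, 150)))  # share of the career
```

The two numbers share a numerator but differ in what they are divided by, which is why a superstar can loom large in a field that makes up only a small part of their own work.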

Two other possible barriers to entry concern the intellectual and social closure of a field: the former refers to the degree that most of the researchers within a field – not just the superstar – agree on what methods to use and what questions to ask; the latter refers to how tightly the researchers within a field work together, coauthoring papers and such. Evidence for both of these came up positive: fields in which the superstar trained many of the researchers in it and fields in which people worked very closely did not show the major effects of superstar death. Finally, a related possibility is that the associates of the superstar might indirectly control access to the field by denying resources to newcomers who might challenge the older set of ideas. In this instance, the authors reported that the deaths of those superstars who had more collaborators on editorial and funding boards tended to have less of an impact, which could be a sign of trouble. 

The influence of these superstars on generating barriers to entry, then, was often quite indirect. It’s not that the superstars were preventing newcomers themselves; it is unlikely they had the power to do so, even if they were trying. Instead, these barriers were created indirectly, either through the superstar receiving a healthy portion of the existing funding and publication slots, or through the collaborators of the superstar forming a relatively tight-knit community that could more effectively wield influence over what ideas got to see the light of day.

“We have your ideas. We don’t know who you are, and now no one else will either”

While it’s easy (and sometimes fun) to conjure up a picture of some old professor and their intellectual clique keeping out plucky, young, and insightful prospects with the power of discrimination, it is important to not leap to that conclusion immediately. While the faces and ideas within a field might change following the deaths of important figures, that does not necessarily mean the new ideas are closer to that all-important, capital-T, Truth that we (sometimes) value. The same social pressures, costs, and benefits that applied to the now-dead old guard apply in turn to the new researchers, and new status within a field will not be reaped by rehashing the ideas of the past, even if they’re correct. Old-but-true ideas might be cast aside for the sake of novelty, just as new-but-false ideas might be promulgated. Regardless of the truth value of these ideas, however, the present data does lend a good deal of credence to the notion that science tends to move one funeral at a time. While truth may eventually win out by a gradual process of erosion, it’s important to always bear in mind that the people doing science are still only human, subject to the same biases and social pressures we all are.

References: Azoulay, P., Fons-Rosen, C., & Zivin, J. (2015). Does science advance one funeral at a time? The National Bureau of Economic Research, DOI: 10.3386/w21788


When Intuitions Meet Reality

Let’s talk research ethics for a moment.

Would you rather have someone actually take $20 from your payment for taking part in a research project, or would you rather be told – incorrectly – that someone had taken $20, only to later (almost immediately, in fact) find out that your money is safely intact and that the other person who supposedly took it doesn’t actually exist? I have no data on that question, but I suspect most people would prefer the second option; after all, not losing money tends to be preferable to losing money, and the lie is relatively benign. To use a pop culture example, Jimmy Kimmel has aired a segment where parents lie to their children about having eaten all their Halloween candy. The children are naturally upset for a moment and their reactions are captured so people can laugh at them, only to later have their candy returned and the lie exposed (I would hope). Would it be more ethical, then, for parents to actually eat their children’s candy so as to avoid lying to their children? Would children prefer that outcome?

“I wasn’t actually going to eat your candy, but I wanted to be ethical”

I happen to think that answer is, “no; it’s better to lie about eating the candy than to actually do it” if you are primarily looking out for the children’s welfare (there is obviously the argument to be made that it’s neither OK to eat the candy nor to lie about it, but that’s a separate discussion). That sounds simple enough, but according to some arguments I have heard, it is unethical to design research that, basically, mimics the lying outcome. The costs being suffered by participants need to be real in order for research on suffering costs to be ethically acceptable. Well, sort of; more precisely, what I’ve been told is that it’s OK to lie to my subjects (deceive them) about little matters, but only in the context of using participants drawn from undergraduate research pools. By contrast, it’s wrong for me to deceive participants I’ve recruited from online crowd-sourcing sites, like MTurk. Why is that the case? Because, as the logic continues, many researchers rely on MTurk for their participants, and my deception is bad for those researchers because it means participants may not take future research seriously. If I lied to them, perhaps other researchers would too, and I have poisoned the well, so to speak. In comparison, lying to undergraduates is acceptable because, once I’m done with them, they probably won’t be taking part in many future experiments, so their trust in future research is less relevant (at least they won’t take part in many research projects once they get out of the introductory courses that require them to do so. Forcing undergraduates to take part in research for the sake of their grade is, of course, perfectly ethical).

This scenario, it seems, creates a rather interesting ethical tension. What I think is happening here is that a conflict has been created between looking out for the welfare of research participants (in common research pools; not undergraduates) and looking out for the welfare of researchers. On the one hand, it’s probably better for participants’ welfare to briefly think they lost money, rather than to let them actually lose money; at least I’m fairly confident that is the option subjects would select if given the choice. On the other hand, it’s better for researchers if those participants actually lose money, rather than briefly hold the false belief that they did, so participants continue to take their other projects seriously. An ethical dilemma indeed, balancing the interests of the participants against those of the researchers.

I am sympathetic to the concerns here; don’t get me wrong. I find it plausible to suggest that if, say, 80% of researchers outright deceived their participants about something important, people taking part in this kind of research over and over again would likely come to assume some parts of it were unlikely to be true. Would this affect the answers participants provide to these surveys in any consistent manner? Possibly, but I can’t say with any confidence if or how it would. There also seem to be workarounds for this poisoning-the-well problem; perhaps honest researchers could write in big, bold letters, “the following research does not contain the use of deception” and research that did use deception would be prohibited from attaching that bit by the various institutional review boards that need to approve these projects. Barring the use of deception across the board would, of course, create its own set of problems too. For instance, many participants taking part in research are likely curious as to what the goals of the project are. If researchers were required to be honest and transparent about their purposes upfront so as to allow their participants to make informed decisions regarding their desire to participate (e.g., “I am studying X…”), this could lead to all sorts of interesting results being due to demand characteristics – where participants behave in unusual manners as a result of their knowledge about the purpose of the experiment – rather than the natural responses of the subjects to the experimental materials. One could argue (and many have) that not telling participants about the real purpose of the study is fine, since it’s not a lie as much as an omission. Other consequences of barring explicit deception exist as well, though, including the lack of control over experimental stimuli during interactions between participants and the inability to feasibly even test some hypotheses (such as whether people prefer the tastes of identical foods, contingent on whether they’re labeled in non-identical ways).

Something tells me this one might be a knock off

Now this debate is all well and good to have in the abstract sense, but it’s important to bring some evidence to the matter if you want to move the discussion forward. After all, it’s not terribly difficult for people to come up with plausible-sounding, but ultimately incorrect, lines of reasoning as to why some research practice is possibly (un)ethical. For example, some review boards have raised concerns about psychologists asking people to take surveys on “sensitive topics”, under the fear that answering questions about things like sexual histories might send students into an abyss of anxiety. As it turns out, such concerns were ultimately empirically unfounded, but that does not always prevent them from holding up otherwise interesting or valuable research. So let’s take a quick break from thinking about how deception might be harmful in the abstract to see what effects it has (or doesn’t have) empirically.

Drawn by the debate between economists (who tend to think deception is bad) and social scientists (who tend to think it’s fine), Barrera & Simpson (2012) conducted two experiments to examine how deceiving participants affected their future behavior. The first of these studies tested the direct effects of deception: did deceiving a participant make them behave differently in a subsequent experiment? In this study, participants were recruited as part of a two-phase experiment from introductory undergraduate courses (so as to minimize their previous exposure to research deception, the story goes; it just so happens they’re likely also the easiest sample to get). In the first phase of this experiment, 150 participants played a prisoner’s dilemma game which involved cooperating with or defecting on another player; a decision which would affect both players’ payments. Once the decisions had been made, half the participants were told (correctly) that they had been interacting with another real person in the other room; the other half were told they had been deceived, and that no other player was actually present. Everyone was paid and sent home.

Two to three weeks later, 140 of these participants returned for phase two. Here, they played 4 rounds of similar economic games: two rounds of dictator games and two rounds of trust games. In the dictator games, subjects could divide $20 between themselves and their partner; in the trust games, subjects could send some portion of $10 to the other player, this amount would be multiplied by three, and that player could then keep it all or send some of it back. The question of interest, then, is whether the previously-deceived subjects would behave any differently, contingent on their doubts as to whether they were being deceived again. The thinking here is that if you don’t believe you’re interacting with another real person, then you might as well be more selfish than you otherwise would. The results showed that while the previously-deceived participants were more likely than the non-deceived participants to believe that social science researchers used deception somewhat regularly, their behavior was actually no different. Not only were the amounts of money sent to others no different (participants gave $5.75 on average in the dictator condition and trusted $3.29 when they were not previously deceived, and gave $5.52 and trusted $3.92 when they had been), but the behavior was no more erratic either. The deceived participants behaved just like the non-deceived ones.
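For readers unfamiliar with these games, the payoff rules described above can be sketched in a few lines of Python. This is only an illustration: the function names are my own, and I am assuming the second player in the trust game starts with no money of their own, which the description above does not specify.

```python
def dictator_payoffs(given, endowment=20):
    """Dictator game: the first player splits the endowment as they see fit.

    Returns (dictator, recipient) payoffs in dollars.
    """
    if not 0 <= given <= endowment:
        raise ValueError("cannot give more than the endowment")
    return endowment - given, given


def trust_payoffs(sent, returned, endowment=10, multiplier=3):
    """Trust game: the amount sent is tripled, then some may be sent back.

    Returns (sender, receiver) payoffs in dollars. Assumes the receiver
    has no endowment of their own (an assumption made for this sketch).
    """
    if not 0 <= sent <= endowment:
        raise ValueError("cannot send more than the endowment")
    pot = sent * multiplier
    if not 0 <= returned <= pot:
        raise ValueError("cannot return more than was received")
    return endowment - sent + returned, pot - returned


# A sender who trusts $4 and gets $6 back ends up ahead of where they began;
# one who trusts $4 and gets nothing back ends up behind.
print(trust_payoffs(sent=4, returned=6))
print(trust_payoffs(sent=4, returned=0))
```

The second call shows why doubt about the partner matters: sending money only pays off if something on the other end sends some of it back.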

In the second study the indirect effects of deception were examined. One-hundred-six participants first completed the same dictator and trust games as above. They were then assigned to read about an experiment that either did or did not make use of deception; a deception which included the simulation of non-existent participants. They then played another round of dictator and trust games immediately afterwards to see if their behavior would differ, contingent on knowing about how researchers might deceive them. As in the first study, no behavioral differences emerged. Neither directly deceiving participants about the presence of others in the experiment nor providing them with information that deception does take place in such research seemed to have any noticeable effects on subsequent behavior.

“Fool me once, shame on me; Fool me twice? Sure, go ahead”

Now it is possible that the lack of any effect in the present research had to do with the fact that participants were only deceived once. It is certainly possible that repeated exposures to deception, if frequent enough, will begin to have an effect, that the effect will be a lasting one, and that it will not be limited to the researcher employing the deception. In essence, it is possible that some spillover between experimenters over time might occur. However, this is something that needs to be demonstrated; not just assumed. Ironically, as Barrera & Simpson (2012) note, demonstrating such a spillover effect can be difficult in some instances, as designing non-deceptive control conditions to test against the deceptive ones is not always a straightforward task. In other words, as I mentioned before, some research is quite difficult – if not impossible – to conduct without being able to use deception. Accordingly, some control conditions might require that you deceive participants about deceiving them, which is awfully meta. Barrera & Simpson (2012) also mention some research findings that report that, even when no deception is used, participants who repeatedly take part in these kinds of economic experiments tend to get less cooperative over time. If that finding holds true, then the effects of repeated deception need to be filtered out from the effects of repeated participation in general. In any case, there does not appear to be any good evidence that minor deceptions are doing harm to participants or other researchers. They might still be doing harm, but I’d like to see it demonstrated before I accept that they do.

References: Barrera, D. & Simpson, B. (2012). Much ado about deception: Consequences of deceiving research participants in the social sciences. Sociological Methods & Research, 41, 383-413.

Preferences For Equality?

People are social creatures. This is a statement that surprises no one, seeming trivial to the same degree it is widely recognized (which is to say, “very”). That many people will recognize such a statement in the abstract and nod their head in agreement when they hear it does not mean they will always apply it to their thinking in particular cases, though. Let’s start with a context in which people will readily apply this idea to their thinking about the world: a video in which pairs of friends watch porn together while being filmed by others who have the intention to put the video online to be viewed by (at the time of writing) about 5,700,000 people worldwide. The video is designed to get people’s reactions to an awkward situation, but what precisely is it about that situation which causes the awkward reactions? As many of you will no doubt agree, I suspect that answer has to do with the aforementioned point that people are social creatures. Because we are social creatures, others in our environment will be relatively inclined (or disinclined) to associate with us contingent on, among other things, our preferences. If some preferences make us seem like a bad associate to others – such as, say, our preferences concerning what kind of pornography arouses us, or our interest in pornography more generally – we might try to conceal those preferences from public view. As people are trying to conceal their preferences, we likely observe a different pattern of reactions to – and searches for – pornography in the linked video, compared to what we might expect if those actors were in the comfort and privacy of their own home.

Or, in a pinch, in the privacy of an Apple store or Public Library 

Basically, we would be wrong to think we get a good sense for these people’s pornography preferences from their viewing habits in the video, as people’s behavior will not necessarily match their desires. With that in mind, we can turn to a rather social human behavior: punishment. Now, punishment might not be the first example of social behavior that pops into people’s heads when they think about social things, but make no mistake about it; punishment is quite social. A healthy degree of human gossip centers around what we believe ought and ought not to be punished; a fact which, much to my dismay, seems to take up a majority of my social media feeds at times. More gossip still concerns details of who was punished, how much they were punished, why they were punished, and, sometimes, this information will lead to other people joining in the punishment themselves or trying to defend someone else from it. From this analysis, we can conclude a few things, chief among which are that, (a) some portion of our value as an associate to others (what I would call our association value) will be determined by the perception of our punishment preferences, and (b) punishment can be made more or less costly, contingent on the degree of social support our punishment receives from others.

This large social component of punishment means that observing the results of people’s punishment decisions does not necessarily inform you as to their preferences for punishment; sometimes people might punish others more or less than they would prefer to, were it not for these public variables being a factor. With that in mind, I wanted to review two pieces of research to see what we can learn about human punishment preferences from people’s behavior. The first piece claims that human punishment mechanisms have – to some extent – evolved to seek equal outcomes between the punisher and the target of their punishment. In short, if someone does some harm to you, you will only desire to punish them to the extent that it will make you two “even” again. An eye for an eye, as the saying goes; not an eye for a head. The second piece makes a much different claim: that human punishment mechanisms are not designed for fairness at all, seeking instead to inflict large costs on others who harm you, so as to deter future exploitation. Though neither of these papers assesses punishment in a social context, I think they have something to tell us about that all the same. Before getting to that point, though, let’s start by considering the research in question.

The first of these papers is from Bone & Raihani (2015). Without getting too bogged down in the details, the general methods of the paper go as follows: two players enter into a game together. Player A begins the game with $1.10 while player B begins with a payment ranging from $0.60 up to that same $1.10. Player B is then given a chance to “steal” some of player A’s money for himself. The important part about this stealing is that it would either leave player B (a) still worse off than A, (b) with an equal payment to A, or (c) with a better payment than A. After the stealing phase, player A has the chance to respond by “punishing” player B. This punishment was either efficient – where for each cent player A spent, player B would lose three – or inefficient – where for each cent player A spent, player B would only lose one. The results of this study turned up the following findings of interest: first, player As who were stolen from tended to punish the player Bs more, relative to when the As were not stolen from. Second, player As who had access to the more efficient punishment option tended to spend more on punishment than those who had access to the less efficient option. Third, those player As who had access to the efficient punishment option also punished player Bs more in cases where B ended up better off than them. Finally, when participants in that last case were punishing the player Bs, the most common amount of punishment they enacted was the amount which would leave both player A and B with the same payment. From these findings, Bone & Raihani (2015) conclude that:

Although many of our results support the idea that punishment was motivated primarily by a desire for revenge, we report two findings that support the hypothesis that punishment is motivated by a desire for equality (with an associated fitness-leveling function…)

In other words, the authors believe they have observed the output of two distinct preferences: one for punishing those who harm you (revenge), and one for creating equality (fitness leveling). But were people really that concerned with “being even” with their agent of harm? I take issue with that claim, and I don’t believe we can conclude that from the data. 
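To make the “same payment” outcome in that design concrete, here is a rough Python sketch of the arithmetic. Amounts are in cents, the function names are mine, and the starting and stolen amounts in the example are hypothetical values picked for illustration rather than the exact ones used by Bone & Raihani (2015). The sketch also ignores any cap on how much player A could spend.

```python
def after_stealing(a_start, b_start, stolen):
    """Payoffs (in cents) once player B has taken `stolen` from player A."""
    return a_start - stolen, b_start + stolen


def apply_punishment(a, b, spend, ratio):
    """Player A pays `spend` cents; player B loses `ratio` cents per cent spent."""
    return a - spend, b - ratio * spend


def equalizing_spend(a, b, ratio):
    """Cents A must spend to end up level with B, or None if no spend does it.

    Each cent spent closes the gap by (ratio - 1) cents, so at a 1-to-1
    ratio the gap between the players never changes.
    """
    gap = b - a
    if gap == 0:
        return 0.0
    if gap < 0 or ratio <= 1:
        return None
    return gap / (ratio - 1)


# Hypothetical round: A starts with 110, B with 70, and B steals 30.
a, b = after_stealing(110, 70, 30)
spend = equalizing_spend(a, b, ratio=3)
print(a, b, spend, apply_punishment(a, b, spend, ratio=3))
print(equalizing_spend(a, b, ratio=1))
```

Note that under the inefficient 1-to-1 option every cent spent costs both players the same amount, so punishing cannot bring the two payments level when B is ahead. Any equality-restoring pattern can therefore only show up in the efficient condition.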

We’re working on preventing exploitation; not building a frame.

To see why I take issue with that claim, I want to consider an earlier paper by Houser & Xiao (2010). This study involves a slightly different setup. Again, two players are involved in a game: player A begins the game by receiving $8. Player A could then transfer some amount of that money (either $0, $2, $4, $6, or $8) to player B, and then keep whatever remained for himself (another condition existed in which this transfer amount was randomly determined). Following that transfer, both players received $2. Finally, player B was given the following option: to pay $1 for the option to reduce player A’s payment by as much as they wanted. The results showed the following pattern: first, when the allocations were random, player B rarely punished at all (under 20%) and, when they did punish, they tended to punish the other player irrespective of inequality. That is, they were equally likely to deduct at all, no matter the monetary difference, and the amount they deducted did not appear to be aimed at achieving equality. By contrast, of the player Bs that received $0 or $2 intentionally, 54% opted to punish player A and, when they did punish, were most likely to deduct so much from player A that they ended up better off than him (that outcome obtained between 66-73% of the time). When given free rein over the desired punishment amount, then, punishers did not appear to be seeking equality as an outcome. This finding, the authors conclude, is inconsistent with the idea that people are motivated to achieve equality per se.
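The payoffs in this design are simple enough to write down, which helps show what “ending up better off than him” requires. The following Python sketch follows the description above; the function names are my own, and I am assuming a deduction cannot push player A’s payment below zero, which the description does not state.

```python
ALLOWED_TRANSFERS = (0, 2, 4, 6, 8)


def payoffs(transfer, deduction=None):
    """Return (player A, player B) payoffs in dollars.

    `deduction=None` means player B declined to punish. Otherwise B pays
    the flat $1 fee and A's payment is reduced by `deduction`.
    """
    if transfer not in ALLOWED_TRANSFERS:
        raise ValueError("transfer must be 0, 2, 4, 6, or 8")
    a = 8 - transfer + 2  # what A kept, plus the $2 both players receive
    b = transfer + 2      # what B was sent, plus the same $2
    if deduction is not None:
        # Assumed cap: A's payment cannot go negative.
        if not 0 <= deduction <= a:
            raise ValueError("deduction must be between 0 and A's payment")
        a -= deduction
        b -= 1
    return a, b


def equalizing_deduction(transfer):
    """Deduction that would leave both players with the same payment."""
    a, b = payoffs(transfer)
    return a - (b - 1)


# After a $0 transfer, B needs to deduct more than this to come out ahead.
print(payoffs(0), equalizing_deduction(0), payoffs(0, deduction=9))
```

Because the fee is flat, going past the equalizing amount costs player B nothing extra, which is part of what makes this design a useful test of whether equality is the goal.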

What both of these studies do, then, is vary the cost of punishment. In the first, punishment is either inefficient (1-to-1 ratio) or quite efficient (3-to-1 ratio); in the second, punishment is unrestricted in its efficiency (X-to-1 ratio). In all cases, as punishment becomes more efficient and less costly, we observe people engaging in more of it. What we learn about people’s preferences for punishment, then, is that they seem to be based, in some part, on how costly punishment is to enact. With those results, I can now turn to the matter of what they tell us about punishment in a social context. As I mentioned before, the costs of engaging in punishment can be augmented or reduced to the extent that other people join in your disputes. If your course of punishment is widely supported by others, this means it’s easier to enact it; if your punishment is opposed by others, not only is it costlier to enact, but you might in turn get punished for engaging in your excessive punishment. This idea is fairly easy to wrap one’s mind around: stealing a piece of candy from a corner store does not usually warrant the death penalty, and people would likely oppose (or attack) the store owner or some government agency if they attempted to hand down such a draconian punishment for the offense.

Now many of you might be thinking that third parties were not present in the studies I mentioned, so it would make no sense for people to be thinking about how these non-existent third parties might feel about their punishment decisions. Such an intuition, I feel, would be a mistake. This brings me back to the matter of pornography briefly. As I’ve written before, people’s minds tend to generate physiological arousal to pornography despite there being no current adaptive reason for that arousal. Instead, our minds – or, more precisely, specific cognitive modules – attend to particular proximate cues when generating arousal that historically correlated with opportunities to increase our genetic fitness. In modern environments, where that link between cue and fitness benefit is broken by digital media providing similar proximate cues, the result is maladaptive outputs: people get aroused by an image, which makes about as much adaptive sense as getting aroused by one’s chair.

The same logic can likely be applied to punishment here as well, I feel: the cognitive modules in our mind responsible for punishment decisions evolved in a world of social punishment. Not only would your punishment decisions become known to others, but those others might join in the conflict on your side or opposing you. As such, proximate cues that historically correlated with the degree of third party support are likely still being utilized by our brains in these modern experimental contexts where that link is being intentionally broken and interactions are anonymous and dyadic. What is likely being observed in these studies, then, is not an aversion to inequality as much as an aversion to the costs of punishment or, more specifically, the estimated social and personal costs of engaging in punishment in a world that other people exist in.

“We’re here about our concerns with your harsh punishment lately”

When punishment is rather cheap to enact for the individual in question – as it was in Houser & Xiao (2010) – the social factor probably plays less of a role in determining the amount of punishment enacted. You can think of that condition as one in which a king is punishing a subject who stole from him: while the king is still sensitive to the social costs of punishment (punish too harshly and the rabble will rise up and crush you…probably), he is free to punish someone who wronged him to a much greater degree than your average peasant on the street. By contrast, in Bone & Raihani (2015), the punisher is substantially less powerful and, accordingly, more interested in the (estimated) social support factors. You can think of those conditions as ones in which a knight or a peasant is trying to punish another peasant. This could well yield inequality-seeking punishment in the former study and equality-seeking punishment in the latter, as different groups require different levels of social support, and so scale their punishment accordingly. Now the matter of why third parties might be interested in inequality between the disputants is a different matter entirely, but recognition of the existence of that factor is important for understanding why inequality matters to second parties at all.

References: Bone, J. & Raihani, N. (2015). Human punishment is motivated both by a desire for revenge and a desire for equality. Evolution & Human Behavior, 36, 323-330.

Houser, D., & Xiao, E. (2010). Inequality-seeking punishment. Economics Letters, 109, 20-23.