Money For Nothing, But The Chicks Aren’t Free

When people see young, attractive women in relationships with older and/or unattractive men, the usual perception that comes to mind is that the relationship revolves around money. This perception is usual because it tends to be accurate: women do, in fact, tend to prefer men who both have access to financial resources and who are willing to share them. What is rather notable is that the reverse isn’t quite as common: a young, attractive man shacking up with an older, rich woman just doesn’t call too many examples to mind. Women seem to have a much more pronounced preference for men with wealth than men have for women with wealth. While examples of such preferences playing themselves out in real life exist anecdotally, it’s always good to try and showcase their existence empirically.

Early attempts were made by Dr. West, but replications are required

This brings me to a new paper by Arnocky et al. (2016) that examined how altruism affects mating success in humans (as this is still psychology research, “humans” translates roughly as “undergraduate psychology majors”, but such is the nature of convenience samples). The researchers first sought (a) to document that more altruistic people really were preferred as mating partners (spoilers: they are), and then (b) to try and explain why we might expect them to be. Let’s begin with what they found, as that much is fairly straightforward. In their first study, Arnocky et al. (2016) recruited 192 women and 105 men from a Canadian university and asked them to complete a few self-report measures: an altruism scale (used to measure general dispositions towards providing aid to others when reciprocation is unlikely), a mating success scale (measuring perceptions of how desirable one tends to be to the opposite sex), their numbers of lifetime sexual partners, as well as the number of those that were short-term, the number of times over the last month they had sex with their current partner (if they had one, which about 40% did), and a measure of their personality more generally.

These measures were then entered into a regression (controlling for personality). When it came to predicting perceived mating success, reported altruism was a significant predictor (β = .25), but neither sex nor the altruism-sex interaction was. This suggests that both men and women tend to become more attractive to the opposite sex if they behave more altruistically (or, conversely, that people who are more selfish are less desirable, which sounds quite plausible). However, what it means for one to be successful in the mating domain varies by sex: for men, having more sexual partners usually implies a greater level of success, whereas the same does not hold true for women as often (as gametes are easy to obtain for women, but investment is difficult). In accordance with this point, it was also found that altruism predicted the number of lifetime sexual partners overall (β = .16), but this effect was specific to men: more altruistic men had more sexual partners (and more casual ones), whereas more altruistic women did not. Finally, within the context of existing relationships, altruism also (sort of) predicted the number of times someone had sex with their partner in the last month (β = .27); while there was not a significant interaction with sex, a visual inspection of the provided graphs suggests that if this effect existed, it was being predominately carried by altruistic women having more sex within a relationship, not by the men.

Now that’s all well and good, but the authors wanted to go a little further. In their second study, rather than just asking participants about how altruistic they were, they offered participants the opportunity to be altruistic: after completing the survey, participants could indicate how much (if any) of their earnings they wanted to donate to a charity of their choice. That way, you get what might be a less-biased measure of one’s actual altruism (rather than their own perception of it). Another 335 women and 189 men were recruited for this second phase and, broadly, the results follow the same general pattern, but there were some notable differences. In terms of mating success, actual altruistic donations (categorized as either making a donation or not, rather than the amount donated) were not a good predictor (β = -.07). In terms of number of lifetime dating and sexual partners, however, the donation-by-sex interaction was significant, indicating that more charitable men – but not women – had a greater number of relationships and sexual partners (perhaps suggesting that charitable men tend to have more, but shorter, relationships, which isn’t necessarily a good thing for the women involved). Donations also failed to predict the amount of sex participants had been having in their relationship in the last month.

Guess the blood drive just isn’t a huge turn on after all

With these results in mind, there are two main points I wanted to draw attention to. The first of these concerns the measures of altruism in general: effectively charitable behaviors to strangers. While such a behavior might be a more “pure” form of altruistic tendencies as compared with, say, helping a friend move or giving money to your child, it does pose some complications for the present topic. Specifically, when looking for a desirable mate, people might not want someone who is just generally altruistic. After all, it doesn’t always do me much good if my committed partner is spending time and investing resources in other people. I would probably prefer that resources be preferentially directed at me and those I care about, rather than strangers, and I might especially dislike it if altruism directed towards strangers came at my expense (as the same resources can’t be invested in me and someone else most of the time). While it is possible that such investments in strangers could return to me later in the form of them reciprocating such aid to my partner, it seems unlikely that the deficit would be entirely and consistently made up, let alone surpassed.

To make the point concrete, if someone were equally altruistic towards all people, there would be little point in forming any kind of special relationship with that person (friendship or otherwise) because you’d get the same benefits from them regardless of how much you invested in them (even if that amount was nothing).

This brings me to the second point I wanted to discuss: the matter of why people like the company of altruists. There are two explanations that come to mind. The first explanation is simple: people like access to resources, and altruists tend to provide them. This explanation should hardly require much in the way of testing given its truth is plainly obvious. The second explanation is more complex, and it’s one the authors favor: altruism honestly signals some positive, yet difficult-to-observe quality about the altruist. For instance, if I were to donate blood, or my time to clean up a park, this would tell you something about my underlying genetic qualities, as an individual in worse condition couldn’t shoulder the costs of altruism effectively. In this sense, altruism functions in a comparable manner to a peacock’s tail feathers; it’s a biologically-honest signal because it’s costly.

While it does have some plausibility, this signaling explanation runs into some complications. First, as the authors note, women donated more than men did (70% to 57%), despite donating predicting sexual behavior better for men. If women were donating to signal some positive qualities in the mating domain, it’s not at all clear it was working. Further, patterns of charitable donations in the US show a U-shaped distribution, whereby those with access to the most and the fewest financial resources tend to donate more than those in the middle. This seems like a pattern the signaling explanation should not predict if altruism is meaningfully and consistently tied to important, but difficult-to-observe biological characteristics. Finally, while the argument could be made that altruism directed towards friends, sexual partners, and kin is not necessarily indicative of someone’s willingness to donate to strangers (i.e., how altruistic they are dispositionally might not predict how nepotistic they are), well, that’s kind of a problem for the altruism-as-signaling model. If donations towards strangers are fairly unpredictive of altruism towards closer relations, then they don’t really tell you what you want to know. Specifically, if you want to know how good of a friend or dating partner someone would be for you, a better cue is how much altruism they direct towards their friends and romantic partners, not how much they direct to strangers.

“My boyfriend is so altruistic, buying drinks for other women like that”

Last, we can consider the matter of why people behave altruistically, with respect to the mating domain. (Very) broadly speaking, there are two primary challenges people need to overcome: attracting a mate and retaining them. Matters get tricky here, as altruism can be used for both of these tasks. As such, a man who is generally altruistic towards a lot of people might be using altruism as a means of attracting the attention of prospective mates without necessarily intending to keep them around. Indeed, the previous point about how altruistic men report having more relationships and sexual partners could be interpreted in just such a light. There are other explanations, of course, such as the prospect that generally selfish people simply don’t have many relationships at all, but these need to be separated out. In either case, in terms of how much altruism we provide to others, I suspect that the amount provided to strangers and charitable organizations only makes up a small fraction; we give much more towards friends, family, and lovers regularly. If that’s the case, measuring someone’s willingness to donate in those fairly uncommon contexts might not capture their desirability as a partner as well as we would like.

References: Arnocky, S., Piche, T., Albert, G., Ouellette, D., & Barclay, P. (2016). Altruism predicts mating success in humans. British Journal of Psychology, DOI:10.1111/bjop.12208

 

Homophobia Isn’t Repressed Homosexuality

In the wake of the Orlando shooting at the Pulse nightclub, there were quite a number of speculations floating around my social media that the shooter himself had been harboring homosexual urges that he had been trying to repress. Repression – being the odd thing that it apparently is – in this case involved his visiting gay nightclubs and using gay dating apps to communicate – and presumably have sex – with other gay men; he might have even been doing all those things while telling himself he had no interest in such activities, that they were morally wrong, or at the very least while trying to keep it secret from other people in his life. The shooting resulted, then, at least in part from this unsuccessful repression of his homosexual urges; an inward loathing directed outwards at others. Or so the story went, anyway. Subsequent official investigations into Omar Mateen’s life revealed no evidence of such behavior: no gay dating apps, no credible homosexual partners, and no gay pornography. Perhaps he was just very good at covering his tracks, but a more parsimonious explanation jumps out at me: he probably wasn’t grappling with homosexual urges.

“Keep grappling with those urges! Don’t stop! You’re almost there…”

The underlying idea in that case – that some degree of homophobia is actually explained by the homophobes in question trying to deny their own homosexual urges – remains a somewhat popular speculation. It has roots as far back as Freud, and I’ve already discussed one piece of more modern research on the idea from the mid-90s. This homosexuality repression hypothesis is even a subplot in one of my favorite movies, American Beauty. For an idea with such a long history, it does seem rather peculiar that more empirical research on the topic doesn’t seem to exist. Perhaps the most obvious guess as to why such research doesn’t exist is that it’s not exactly the easiest thing in the world to measure someone’s implicit sexual attraction (provided such a thing can even be said to exist at all). If the subjects themselves aren’t even aware of it, a failure to uncover any evidence of its existence might not mean it’s not there; it might just mean that you don’t know how to uncover it. Designing the proper experiments and accurately interpreting the data resulting from them thus becomes troublesome.

Before considering some new research on the hypothesis, then, I wanted to take a step back and consider why, on a theoretical level, we shouldn’t expect implicit or repressed homosexual urges to predict homophobic attitudes particularly well. The starting point is to note that explicit homosexuality is rare in humans (about 1-3%). This should be expected, as homosexuality does not appear to be adaptive; same-sex attraction just isn’t a good way to reproduce one’s genes directly or indirectly (whether through kin or alliance formation). Further, open homosexuals don’t tend to be particularly homophobic; at least not as far as I know. Given that rarity, then, if something around even 20% of the population is homophobic, then there is either a lot of homophobia unrelated to homosexuality, or repressed homosexuality is very, very common. In other words, one of two statements follows, neither of which bodes well for the homophobia-as-repressed-attraction hypothesis: (a) lots of people who are homophobic harbor no homosexual urges or (b) many of those who are homophobic harbor such urges.

If the first idea is true, then very little homophobia could even be explained in principle by homosexual urges. Most people who were homophobic just wouldn’t have homosexual urges, and an absent variable can’t explain a present trait.

If the second idea is true, however, then the repression-via-homophobia strategy would be fairly ineffective. In order to understand why, we need to start with the following point: people are only repressing homosexual urges to convince others that they are not gay. From an adaptive point of view, an organism does not need to deceive itself about its desires. False beliefs, in that sense, just don’t do anything functionally useful, and there is no “self” to be deceived in the first place, given the modular nature of the mind. Taking that as a given for the moment, if you’re trying to convince others that you don’t have a desire, you will only be successful to the extent you engage in behaviors that someone with that desire would usually not. Placed into a simple example, if you’re trying to convince others that you’re not hungry, you turn down food. Eating a lot isn’t a particularly good way to do that, as people who aren’t hungry don’t normally eat a lot. So, if lots of people who do have homosexual urges were homophobic, then adopting a homophobic stance should actually be expected to positively signal that one is a homosexual, as being homophobic is something lots of (closeted) homosexual people actually do.

Thus the dilemma of the homophobia-as-repression hypothesis is highlighted: if only a few homophobes are meaningfully homosexual, then homosexuality can’t explain much; if many homophobes actually are homosexual, then homophobia will be ineffective at persuading others one is straight.
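For those who like to see the arithmetic, here is a minimal sketch of that dilemma using Bayes' rule. Every number in it is a hypothetical I picked for illustration (apart from the rough 3% and 20% figures mentioned above), and the function name is my own invention; none of it comes from any study.

```python
def share_attracted_among_homophobes(base_rate, p_phobic_if_attracted, p_phobic_if_not):
    """Bayes' rule: P(same-sex attracted | homophobic).

    All three inputs are hypothetical probabilities, not estimates from data.
    """
    attracted_and_phobic = base_rate * p_phobic_if_attracted
    not_attracted_and_phobic = (1 - base_rate) * p_phobic_if_not
    return attracted_and_phobic / (attracted_and_phobic + not_attracted_and_phobic)


# Horn (a): urges are rare (3%) and homophobia is equally common (20%) either way.
horn_a = share_attracted_among_homophobes(0.03, 0.20, 0.20)

# Horn (b): urges are common (assumed 15%) and most who have them (assumed 80%)
# are homophobic; the rate among everyone else is chosen so that 20% of the
# whole population is homophobic.
horn_b = share_attracted_among_homophobes(0.15, 0.80, 0.08 / 0.85)

print(f"Horn (a): {horn_a:.0%} of homophobes have such urges")
print(f"Horn (b): {horn_b:.0%} of homophobes have such urges")
```

Under the first set of assumptions, only about 3% of homophobes have any urges to repress, so repression can't explain much. Under the second, an observer who sees homophobia should raise their estimate that the person is same-sex attracted from the assumed 15% base rate to about 60%, which is precisely the opposite of what the homophobe was supposedly trying to achieve.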

“They’re trying to signal they’re gay so much that they must be straight!”

As such, it should come as little surprise that some recent research finds no evidence for this homophobia-as-repressed-homosexuality hypothesis. MacInnis & Hodson (2013) sought to examine whether any link exists between a measure of implicit sexual attraction and explicit homophobia in heterosexuals. In order to do this, the authors used an implicit association task (IAT) adapted to sexual attraction: a task in which participants have to categorize pictures as male/female and words as sexually attractive/unattractive, and the speed at which they do so should tell you something about the cognitive association between the two. I’m wary of the interpretations of IATs for a number of reasons, but I’ll assume for the time being that such a test does indeed kind of measure what they hope. Participants were also asked about their explicit sexual attractions to men and women, and their attitudes towards gay/lesbian and heterosexual populations. In total, their sample represented 237 Canadian undergraduates (85 men).

As I would expect, the IAT results only correlated modestly with explicit measures of sexual attraction (r = .37 for men, r = .15 for women). The correlation between those IAT measures and negative, explicit evaluations of homosexuals was r = -.06 for men and r = -.24 for women. In other words, not only were such correlations quite small, but they nominally went in the opposite direction of the repression account: as people showed more implicit attraction to the same sex, they also showed less explicit negativity. On a similar note, men’s explicit attractions to the same sex negatively correlated with their homophobia as well (r = -.31), meaning that as men reported more conscious attraction to other men, they were also more positive towards homosexuals. People tend to be more positive towards those that resemble them – for good reason – so this isn’t terribly shocking.

The researchers tried additional analyses as well to address other interpretations of the repression-to-attraction account. First, they divided the data such that those who showed positive homosexual implicit attraction were compared to those who were on the negative side. The male sample, it’s worth noting, could not be analyzed here as only 4 of the 85 men had such a score (perhaps there’s just not much implicit attraction floating around?); for women, the same finding as before emerged: those showing more implicit attraction were less negative towards homosexuals. Next, the authors tried to examine only those in the upper half of homophobia scores, and then those at the more extreme ends. However, the implicit attraction scores did not differ between those high and low in prejudice for men or women. The repression hypothesis wasn’t even supported when the authors tried to isolate those participants whose explicit and implicit attraction scores were maximally different from one another (the authors frame this as participants overstating their heterosexuality on an explicit level, but I suspect the actual interpretation is that the IAT isn’t too great of a tool).

Directions for future research: invasive mind-reading technology

With all the dividing of their sample, MacInnis & Hodson (2013) gave their data every possible advantage to find something – even some spurious relationship – but essentially nothing arose. They broke the data down by men and women; attitudes towards gays, lesbians, and homosexuals in general; those high or low in prejudice; those whose implicit and explicit attractions diverged. No matter how it was sliced, support was not found for the repression idea. When relationships did exist between implicit attraction and explicit attitudes, they usually ran in the opposite direction of the repression hypothesis: those who showed implicit attraction were less negative towards homosexuals (albeit quite modestly). I don’t suspect this will lead those who fancy the repression hypothesis to abandon it – likely because they value it for reasons beyond its established truth value, which is currently dubious at best – but it is a possible starting point for that journey.

References: MacInnis, C. & Hodson, G. (2013). Is homophobia associated with an implicit same-sex attraction? Journal of Sex Research, 50, 777-785.

The Fight Against Self-Improvement

In the abstract, most everyone wants to be the best version of themselves they can. More attractive bodies, developing and improving useful skills, a good education, achieving career success; who doesn’t want those things? In practice, lots of people, apparently. While people might like the idea of improving various parts of their life, self-improvement takes time, energy, dedication, and restraint; it involves doing things that might not be pleasant in the short-term with the hope that long-term rewards will follow. Those rewards are by no means guaranteed, though, either in terms of their happening at all or the degree to which they do. While people can usually improve various parts of their life, not everyone can achieve the levels of success they might prefer no matter how much time they devote to their crafts. All of those are common reasons people will sometimes avoid improving themselves (it’s difficult and contains opportunity costs), but they do not straightforwardly explain why people sometimes fight against others improving.

“How dare they try to make a better life for themselves!”

I was recently reading an article about the appeal of Trump and came across this passage concerning this fight against the self-improvement of others:

“Nearly everyone in my family who has achieved some financial success for themselves, from Mamaw to me, has been told that they’ve become “too big for their britches.”  I don’t think this value is all bad.  It forces us to stay grounded, reminds us that money and education are no substitute for common sense and humility. But, it does create a lot of pressure not to make a better life for yourself…”

At first blush, this seems like a rather strange idea: if people in your community – your friends and family – are struggling (or have yet to build a future for themselves), why would anyone object to the prospect of their achieving success and bettering their lot in life? Part of the answer is found a little further down:

“A lot of these [poor, struggling] people know nothing but judgment and condescension from those with financial and political power, and the thought of their children acquiring that same hostility is noxious.”

I wanted to explore this idea in a bit more depth to help explain why these feelings might rear their head when faced with the social or financial success of others, be they close or distant relations.

Understanding these feelings requires drawing on a concept my theory of morality leaned heavily on: association value. Association value refers to the abstract value that others in the social world have for each other; essentially, it asks the question, “how desirable of a friend would this person make for me (and vice versa)?” This value comes in two parts: first, there is the matter of how much value someone could add to your life. As an easy example, someone with a lot of money is more capable of adding value to your life than someone with less money; someone who is physically stronger tends to be able to provide benefits a weaker individual could not; the same goes for individuals who are more physically attractive or intelligent. It is for this reason that most people wish they could improve on some or all of these dimensions if doing so were possible and easy: you end up as a more desirable social asset to others.

The second part of that association value is a bit trickier, however, reflecting the crux of the problem: how willing someone is to add value to your life. Those who are unwilling to help me have a lower value than those willing to make the investment. Reliable friends are better than flaky ones, and charitable friends are better than stingy ones. As such, even if someone has a great potential value they could add to my life, they still might be unattractive as associates if they are not going to turn that potential into reality. An unachieved potential is effectively the same thing as having no potential value at all. Conversely, those who are very willing to add to my life but cannot actually do so in meaningful ways don’t make attractive options either. Simply put, eager but incompetent individuals wouldn’t make good hires for a job, but neither would competent yet absent ones.

“I could help you pay down your crippling debt. Won’t do it, though”

With this understanding of association value, there is only one piece left to add to the equation: the zero-sum nature of friendship. Friendship is a relative term; it means that someone values me more than they value others. If someone is a better friend to me, it means they are a worse friend to others; they would value my welfare over the welfare of others and, if a choice had to be made, would aid me rather than someone else. Having friends is also useful in the adaptive sense of the word: they help provide access to desirable mates, protection, provisioning, and can even help you exploit others if you’re on the aggressive side of things. Putting all these pieces together, we end up with the following idea: people generally want access to the best friends possible. What makes a good friend is a combination of their ability and willingness to invest in you over others. However, their willingness to do so depends in turn on your association value to them: how willing and able you are to add things to their lives. If you aren’t able to help them out – now or in the future – why would they want to invest resources into benefiting you when they could instead put those resources into others who could?

Now we can finally return to the matter of self-improvement. By increasing your association value through various forms of self-improvement (e.g., making yourself more physically attractive and stronger through exercise, improving your income by moving forward in your career, learning new things, etc.) you make yourself a more appealing friend to others. Crucially, this includes both existing friends and higher-status individuals who might not have been willing to invest in you prior to your ability to add value to their life materializing. In other words, as your value as an associate rises, unless the value of your existing associates rises in turn, it is quite possible that you can now do better than them socially, so to speak. If you have more appealing social prospects, then, you might begin to neglect or break off existing contacts in favor of newer, more-profitable friendships or mates. It is likely that your existing contacts understand this – implicitly or otherwise – and might seek to discourage you from improving your life, or preemptively break off contact with you if you do, under the assumption that you will do likewise to them in the future. After all, if you’re moving on eventually, they would be better off building new connections sooner rather than later. They don’t want to invest in failing relationships any more than you do.

In turn, those who are thinking about self-improvement might actually decide against pursuing their goals not necessarily because they wouldn’t be able to achieve them, but because they’re afraid that their existing friends might abandon them, or even that they themselves might be the ones who do the abandoning. Ironically, improving yourself can sometimes make you look like a worse social prospect.

To put that in a simple example, we could consider the world of fitness. The classic trope of the weak high-schooler being bullied by the strong, jock type has been ingrained in many stories in our culture. For those doing the bullying, their targets don’t offer them much socially (their association value to others is low, while the bully’s is high) and they are unable to effectively defend themselves, making exploitation appear as an attractive option. In turn, those who are the targets of this bullying are, in some sense, wary of adopting some of the self-improvement behaviors that the jocks engage in, such as working out, because they either don’t feel they can effectively compete against the jocks in that realm (e.g., they wouldn’t be able to get as strong, so why bother getting stronger) or because they worry that improving their association value by working out will lead to them adopting a similar pattern of behavior to those they already dislike, resulting in their losing value to their current friends (usually those of similar, but relatively-low association value). The movie Mean Girls is an example of this dynamic struggle in a different domain.

So many years later, and “Fetch” still never happened…

This line of thought has, as far as I can tell, also been leveraged (again, consciously or otherwise) by one brand within the fitness community: Planet Fitness. Last I heard an advertisement for their company on the radio, their slogan appeared to be, “we’re not a gym; we’re planet fitness.” An odd statement to be sure, because they are a gym, so what are we to make of it? Presumably that they are in some important respects different from their competition. How are they different from other gyms? The “About” section on their website lays their differences out in true, ironic form:

“Make yourself comfy. Because we’re Judgement Free…you deserve a little cred just for being here. We believe no one should ever feel Gymtimidated by Lunky behavior and that everyone should feel at ease in our gyms, no matter what his or her workout goals are…We’re fiercely protective of our Planet and the rights of our members to feel like they belong. So we create an environment where you can relax, go at your own pace and just do your own thing without ever having to worry about being judged.”

This marketing is fairly transparent pandering to those who currently do not feel they can compete with those who are very fit or are worried about becoming a “lunk” themselves (they even have an alarm in the gym designed to be set off if someone is making too much noise while lifting, or wearing the wrong outfit). However, in doing so, they devalue those who are successful or passionate in their pursuits of self-improvement. While I have never seen a gym more obsessed with judging their would-be members than Planet Fitness, so long as that judgment is pointed at the right targets, they try to appeal (presumably effectively) to certain portions of the population untapped by other gyms. Planet Fitness wants to be your friend; not the friend of those jerks who make you feel bad.

There is value in not letting success go to one’s head; no one wants a fair-weather friend who will leave the moment it’s expedient. Such an attitude undermines loyalty. The converse, however, is that using that as an excuse to avoid (or condemn) self-improvement will make you and others worse-off in the long term. A better solution to this dilemma is to improve yourself so you can improve those who matter the most to you, hoping they reciprocate in turn (or improve together for even better success).

Skepticism Surrounding Sex

It’s a basic truth of the human condition that everybody lies; the only variable is about what

One of my favorite shows from years ago was House, a show centered around a brilliant but troubled doctor who frequently discovers the causes of his patients’ ailments through discerning what they – or others – are lying about. This outlook on people appears to be correct, at least in spirit. Because it is sometimes beneficial for us that other people are made to believe things that are false, communication is often less than honest. This dishonesty entails things like outright lies, lies by omission, or stretching the truth in various directions and placing it in different lights. Of course, people don’t just lie because deceiving others is usually beneficial. Deception – much like honesty – is only adaptive to the extent that people do reproductively-relevant things with it. Convincing your spouse that you had an affair when you didn’t is dishonest for sure, but probably not a very useful thing to do; deceiving someone about what you had for breakfast is probably fairly neutral (minus the costs you might incur from coming to be known as a liar). As such, we wouldn’t expect selection to have shaped our psychology to lie about all topics with equal frequency. Instead, we should expect that people tend to preferentially lie about particular topics in predictable ways.

Lies like, “This college degree will open so many doors for you in life”

The corollary idea to that point concerns skepticism. Distrusting the honesty of communications can protect against harmful deceptions, but it also runs the risk of failing to act on accurate and beneficial information. There are costs and benefits to skepticism as there are to deception. Just as we shouldn’t expect people to be dishonest about all topics equally often, then, we shouldn’t expect people to be equally skeptical of all the information they receive either. This is a point I’ve talked about before with regards to our reasoning abilities, whereby information agreeable to our particular interests tends to be accepted less critically, while disagreeable information is scrutinized much more intensely.

This line of thought was recently applied to the mating domain in a paper by Walsh, Millar, & Westfall (2016). Humans face a number of challenges when it comes to attracting sexual partners, typically centered around obtaining the highest quality of partner(s) one can (metaphorically) afford, relative to what one offers to others. What determines the quality of partners, however, is frequently context specific: what makes a good short-term partner might differ from what makes a good long-term partner and – critically, as far as the current research is concerned – the traits that make good male partners for women are not the same as those that make good female partners for men. Because women and men face some different adaptive challenges when it comes to mating, we should expect that they would also preferentially lie (or exaggerate) to the opposite sex about those traits that the other sex values the most. In turn, we should also expect that each sex is skeptical of different claims, as this skepticism should reflect the costs associated with making poor reproductive decisions on the basis of bad information.

In case that sounds too abstract, consider a simple example: women face a greater obligate cost when it comes to pregnancy than men do. As far as men are concerned, their role in reproduction could end at ejaculation (which it does, for many species). By contrast, women would be burdened with months of gestation (during which they cannot get pregnant again), as well as years of breastfeeding prior to modern advancements (during which they also usually can’t get pregnant). Each child could take years of a woman’s already limited reproductive lifespan, whereas the man has lost a few minutes. In order to ease those burdens, women often seek male partners who will stick around and invest in them and their children. Men who are willing to invest in children should thus prove to be more attractive long-term partners for women than those who are unwilling. However, a man’s willingness to stick around needs to be assessed by a woman in advance of knowing what his behavior will actually be. This might lead to men exaggerating or lying about their willingness to invest, so as to encourage women to mate with them. Women, in turn, should be preferentially skeptical of such claims, as being wrong about a man’s willingness to invest is costly indeed. The situation should be reversed for traits that men value in their partners more than women.

Figure 1: What men most often value in a woman

Three such traits for both men and women were examined by Walsh et al (2016). In their study, eight scenarios depicting a hypothetical email exchange between a man and woman who had never met were displayed to approximately 230 (mostly female; 165) heterosexual undergraduate students. For the women, these emails depicted a man messaging a woman; for men, it was a woman messaging a man. The purpose of these emails was described as the person sending them looking to begin a long-term intimate relationship with the recipient. Each of these emails described various facets of the sender, which could be broadly classified as either relevant primarily to female mating interests, relevant to male interests, or neutral. In terms of female interests, the sender described their luxurious lifestyle (cuing wealth), their desire to settle down (commitment), or how much they enjoy interacting with children (child investment). In terms of male interests, the sender talked about having a toned body (cuing physical attractiveness), their openness sexually (availability/receptivity), or their youth (fertility and mate value). In the two neutral scenarios, the sender either described their interest in stargazing or board games.

Finally, the participants were asked to rate (on a 1-5 scale) how deceitful they thought the sender was, whether they believed the sender or not, and how skeptical they were of the claims in the message. These three scores were summed for each participant to create a composite score of believability for each of the messages (the lower the score, the less believable it was rated as being). Those scores were then averaged across the female-relevant items (wealth, commitment, and childcare), the male-relevant items (attractiveness, youth, and availability), and the control conditions. (Participants also answered questions about whether the recipient should respond and how much they personally liked the sender. No statistical analyses are reported on those measures, however, so I’m going to assume nothing of note turned up.)
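
For anyone who wants the scoring spelled out, here is a minimal sketch of how such a composite could be computed. The ratings are invented for a single hypothetical participant, the item names are my own labels, and I am assuming all three ratings have already been coded so that higher numbers mean "more believable" (the paper's exact coding is not described above).

```python
from statistics import mean

# Hypothetical labels for the eight messages, grouped as described in the text.
CATEGORIES = {
    "female_relevant": ("wealth", "commitment", "childcare"),
    "male_relevant": ("attractiveness", "availability", "youth"),
    "control": ("stargazing", "board_games"),
}

# Invented ratings for one participant: three 1-5 ratings per message,
# assumed to be coded so that higher = more believable.
ratings = {
    "wealth": (3, 3, 2),
    "commitment": (3, 4, 3),
    "childcare": (3, 3, 3),
    "attractiveness": (4, 3, 3),
    "availability": (3, 3, 4),
    "youth": (4, 4, 3),
    "stargazing": (4, 4, 4),
    "board_games": (4, 3, 4),
}


def composite(item_ratings):
    """Sum three 1-5 ratings into a believability composite (range 3-15)."""
    if len(item_ratings) != 3 or any(not 1 <= r <= 5 for r in item_ratings):
        raise ValueError("expected exactly three ratings between 1 and 5")
    return sum(item_ratings)


def category_means(all_ratings):
    """Average the per-message composites within each category."""
    return {
        category: mean(composite(all_ratings[item]) for item in items)
        for category, items in CATEGORIES.items()
    }


if __name__ == "__main__":
    for category, score in category_means(ratings).items():
        print(f"{category}: {score:.2f}")
```

Since each composite runs from 3 to 15, the means reported below (roughly 9 to 11) sit a little above the midpoint of the scale.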

The results showed that, as expected, the control items were believed more readily (M = 11.20) than the male-relevant (M = 9.85) or female-relevant (M = 9.6) items. This makes sense, inasmuch as believing lies about stargazing or an interest in board games isn’t particularly costly for either sex in most cases, so there’s little reason to lie about them (and thus little reason to doubt them); by contrast, messages about one’s desirability as a partner have real payoffs, and so are treated more cautiously. However, an important interaction with the sex of the participant was uncovered as well: female participants were more skeptical on the female-relevant items (M = about 9.2) than males were (M = 10.6); similarly, males were more skeptical in the male-relevant conditions (M = 9.5) than females were (M = 10). Further, the scores for the individual items all showed evidence of the same kinds of sex differences in skepticism. No sex difference emerged for the control condition, also as expected.

In sum, then – while these differences were relatively small in magnitude – men tended to be more skeptical of claims that, if falsely believed, were costlier for them than for women, and women tended to be more skeptical of claims that, if falsely believed, were costlier for them than for men. This is a similar pattern to that found in the reasoning domain, where evidence that agrees with one’s position is accepted more readily than evidence that disagrees with it.

“How could it possibly be true if it disagrees with my opinion?”

The authors make a very interesting point towards the end of their paper about how their results could be viewed as inconsistent with the hypothesis that men have a bias to over-perceive women’s sexual interest. After all, if men are over-perceiving such interest in the first place, why would they be skeptical about claims of sexual receptivity? It is possible, of course, that men tend to over-perceive such availability in general and are also skeptical of claims about its degree (e.g., they could still be manipulated by signals intentionally sent by females and so are skeptical, but still over-perceive ambiguous or less-overt cues), but another explanation jumps out at me that is consistent with the theme of this research: perhaps when asked to self-report about their own sexual interest, women aren’t being entirely accurate (consciously or otherwise). This explanation would fit well with the fact that men and women tend to perceive a similar level of sexual interest in other women. Then again, perhaps I only see that evidence as consistent because I don’t think men, as a group, should be expected to have such a bias, and that’s biasing my skepticism in turn.

References: Walsh, M., Millar, M., & Westfall, S. (2016). The effects of gender and cost on suspicion in initial courtship communications. Evolutionary Psychological Science, DOI 10.1007/s40806-016-0062-8

Musings About Police Violence

I was going to write about something else today (the finding from a meta-analysis that artificial surveillance cues do not appear to appreciably increase generosity; the effects fail to reliably replicate), but I decided to switch topics up to something more topical: police violence. My goal today is not to provide answers to this on-going public debate – I certainly don’t know enough about the topic to consider myself an expert – but rather to try and add some clarity to certain features of the discussions surrounding the matter, and hopefully help people think about it in somewhat unusual ways. If you expect me to take a specific stance on the issue, be that one that agrees or disagrees with your own, I’m going to disappoint you. That alone may upset some people who take anything other than definite agreement as a sign of aggression against them, but there isn’t much to do about that. That said, the discussion about police violence itself is a large and complex one, the scope of which far exceeds the length constraints of my usual posts. Accordingly, I wanted to limit my thoughts on the matter to two main domains: important questions worth answering, and addressing the matter of why many people find the “Black Lives Matter” hashtag needlessly divisive.

Which I’m sure will receive a warm, measured response

First, let’s jump into the matter of important questions. One of the questions I’ve never seen explicitly raised in the context of these discussions – let alone answered – is the following: How many people should we expect to get killed by police each year? There is a gut response that many would no doubt have to that question: zero. Surely someone getting killed is a tragedy that we should seek to avoid at all times, regardless of the situation; at best, it’s a regrettable state of affairs that sometimes occurs because the alternative is worse. While zero might be the ideal world outcome, this question is asking more about the world that we find ourselves in now. Even if you don’t particularly like the expectation that police will kill people from time to time, we need to have some expectation of just how often it will happen to put the violence in context. These killings, of course, include a variety of scenarios: there are those in which the police justifiably kill someone (usually in defense of themselves or others), those cases where the police mistakenly kill someone (usually when an error of judgment occurs regarding the need for defense, such as when someone has a toy gun), and those cases where police maliciously kill someone (the killing is aggressive, rather than defensive, in nature).

How are we to go about generating these expectations? One popular method seems to be comparisons of police shootings cross-nationally. The picture that results from such analyses appears to suggest that US police shoot people much more frequently than police from other modern countries. For instance, The Guardian claims that Canadian police shoot and kill about 25 people a year, as compared with approximately 1,000 such shootings in the US in 2015. Assuming those numbers are correct, once we correct for population size (the US has about ten times the population of Canada), we can see that US police shoot and kill about four times as many people per capita. That sure seems like a lot, probably because it is a lot. We want to do more than note that there is a difference, however; we want to see whether that difference violates our expectations, and to do that, we need to be clear about how our expectations were generated. If, for example, police in the US face threatening situations more often than Canadian police, this is a relevant piece of information.

To begin engaging with that idea, we might consider how many police die each year in the line of duty, cross-nationally as well. In Canada, the number for 2015 looks to be three; adjusting for population size again, we would generate an expectation of 30 US police officer deaths if all else were equal. All else is apparently not equal, however, as the actual number for 2015 in the US is about 130. Not only are the US police killing about four times as often as their Canadian counterparts, then, but they’re also dying at a similarly elevated rate (a bit over four times the population-adjusted expectation). That said, those numbers include factors other than homicides, and so that too should be taken into account when generating our expectations (in Canada, the number of police shot was 2 in 2015, compared to 40 in the US, which is still twice as high as one would expect from population size. There are also other methods of killing police, such as the 50 US police killed by bombs or cars; 0 for Canada). Given the prevalence of firearm ownership in the US, it might not be too surprising that the rates of violence between police and citizens – as well as between citizens and other citizens – look substantially different than in other countries.

There are other facts which might adjust our expectations up or down. For instance, while the US has 10 times the population of Canada, the number of police per 100,000 people (376) is different than that of Canada (202). How we should adjust the numbers to make a comparison based on population differences, then, is a matter worth thinking about (should we expect the ratio of police officers to citizens per se to increase the number of them that are shot, or is population the better metric?). Also worth mentioning is that the general homicide rate per 100,000 people is quite a bit higher in the US (3.9) than in Canada (1.4). While this list of considerations is very clearly not exhaustive, I hope it generates some thoughts regarding the importance of figuring out what our expectations are, as well as why. The numbers of shootings alone are going to be useless without good context.
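
To make the arithmetic behind those comparisons explicit, here is a small sketch using only the figures quoted above (taken at face value, not independently checked). The function names are my own, and the last step, scaling by the number of officers rather than by population, is just one illustrative alternative denominator, not a claim about which adjustment is correct.

```python
# Rough figures as quoted in the text for 2015; treat them as approximate.
US_TO_CANADA_POPULATION = 10          # "about ten times"
POLICE_PER_100K = {"US": 376, "Canada": 202}

FIGURES_2015 = {
    # label: (US count, Canadian count)
    "people shot and killed by police": (1000, 25),
    "police deaths in the line of duty": (130, 3),
    "police shot": (40, 2),
}


def expected_us_count(canada_count, scale=US_TO_CANADA_POPULATION):
    """US count we would expect if the Canadian rate applied, given a scale factor."""
    return canada_count * scale


def ratio_to_expectation(us_count, canada_count, scale=US_TO_CANADA_POPULATION):
    """How many times larger the observed US count is than the scaled expectation."""
    return us_count / expected_us_count(canada_count, scale)


# Alternative scale: relative number of officers instead of relative population.
OFFICER_SCALE = (
    US_TO_CANADA_POPULATION * POLICE_PER_100K["US"] / POLICE_PER_100K["Canada"]
)

if __name__ == "__main__":
    for label, (us, canada) in FIGURES_2015.items():
        by_population = ratio_to_expectation(us, canada)
        by_officers = ratio_to_expectation(us, canada, OFFICER_SCALE)
        print(f"{label}: {by_population:.1f}x by population, "
              f"{by_officers:.1f}x by officer count")
```

The same observed counts yield quite different ratios depending on what you divide by, which is the point: the expectation has to be stated before a number can be said to violate it.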

Factor 10: Perceived silliness of uniforms

The second question concerns bias within these shootings in the US. In addition to our expectations for the number of people being killed each year by police, we also want to generate some expectations for the demographics of those who are shot: what should we expect the demographics of those being killed by police to be? Before we can claim there is a bias in the shooting data, we need to have a sense for what our expectations in that regard are and why they are such; only then can we look at how those expectations are violated. The obvious benchmark that many people would begin with would be the demographics of the US as a whole. We might expect, for instance, that the victims of police violence in the US are 63% white, 12% black, about 50% male, and so on, mirroring the population of the country. Some data I’ve come across suggests that this is not the case, however, with approximately 50% of the victims being white and 26% being black. Now that we know the demographics don’t match up as we’d expect from population alone, we want to know why. One tempting answer that many people fall back on is that police are racially motivated: after all, if black people make up 12% of the population but represent 26% of police killings, this might mean police specifically target black suspects. Then again, males make up about 50% of the population but represent about 96% of police killings. While one could similarly posit that police have a wide-spread hatred of men and seek to harm them, that seems unlikely. A better explanation for more of the variation is that men are behaving differently than women: less compliant, more aggressive, or something along those lines. After all, the only reasons you’d expect police shootings to match population demographics perfectly would be either if police shot people at random (they don’t) or police shot people based on some nonrandom factors that did not differ between groups of people (which also seems unlikely).

One such factor that we might use to adjust our expectations would be crime rates in general; perhaps violent crime in particular, as that class likely generates a greater need for officers to defend themselves. In that respect, men tend to commit much more crime than women, which likely begins to explain why men are also shot by police more often. Along those lines, there are also rather stark differences between racial groups when it comes to involvement in criminal activity: while 12% of the US population is black, approximately 40% of the prison population is, suggesting differences in patterns of offending. While some might claim that prison percentage too is due to racial discrimination against blacks, the arrest records tend to agree with victim reports, suggesting a real differential involvement in criminal activity. That said, criminal activity per se shouldn’t get one shot by police. When generating our expectations, we also might want to consider factors such as whether people resist arrest or otherwise threaten the officers in some way. In testing theories of racial biases, we would want to consider whether officers of different races are more or less likely to shoot citizens of various demographics (that is to ask whether, say, black officers are any more or less likely to shoot black civilians than white officers are. I could have sworn I’ve seen data on that before but cannot seem to locate it at this time. What I did find, however, was a case-matched study of NYPD officers, reporting that black officers were about three times as likely to discharge their weapon as white officers at the scene, spanning 106 shootings and about 300 officers; Ridgeway, 2016). Again, while this is not a comprehensive list of things to think about, factors like these should help us generate our expectations about what the demographics of police shooting victims should look like, and it is only from there that we can begin to make claims about racial biases in the data.
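
The benchmark logic above boils down to a single ratio: a group's share of police killings divided by its share of whatever benchmark you have chosen. Here is a minimal sketch using the approximate percentages quoted above with the general population as the benchmark; the function name is my own, and nothing here says which benchmark is the right one.

```python
def representation_ratio(observed_share, benchmark_share):
    """Observed share divided by the share the chosen benchmark would predict.

    A value of 1.0 means the group appears exactly as often as the benchmark
    predicts; above 1.0 means more often, below 1.0 means less often.
    """
    return observed_share / benchmark_share


# Approximate shares quoted in the text (general US population as benchmark).
POPULATION_SHARE = {"white": 0.63, "black": 0.12, "male": 0.50}
SHARE_OF_POLICE_KILLINGS = {"white": 0.50, "black": 0.26, "male": 0.96}

ratios = {
    group: representation_ratio(
        SHARE_OF_POLICE_KILLINGS[group], POPULATION_SHARE[group]
    )
    for group in POPULATION_SHARE
}

if __name__ == "__main__":
    for group, ratio in ratios.items():
        print(f"{group}: {ratio:.2f}x the population benchmark")
```

Against the population benchmark, men are over-represented by nearly as much as black victims are, which is why the sex comparison is a useful check on what a departure from population shares does and does not tell us. Passing a different benchmark to the same function changes the ratio, and which benchmark is appropriate is exactly the question that needs answering first.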

It’s hard to be surprised at the outcomes sometimes

Regardless of where you settled on your answer to the above expectations, I suspect that many people would nonetheless want to reduce those numbers, if possible. Fewer people getting killed by police is a good thing most of the time. So how do we want to go about seeing that outcome achieved? Some have harnessed the “Black Lives Matter” (BLM) hashtag and suggest that police (and other) violence should be addressed via a focus on, and reductions in, explicit, and presumably implicit, racism (I think; finding an outline of the goals of the movement proves a bit difficult).

One common response to this hashtag has been the notion that BLM is needlessly divisive, suggesting instead that “All Lives Matter” (ALM) be used as a more appropriate description. In turn, the reply to ALM by BLM is that the lack of focus on black people is an attempt to turn a blind eye to problems viewed as disproportionately affecting black populations. The ALM idea was recently criticized by the writer Maddox, who compared the ALM expression to a person who, when confronted with the idea of “supporting the troops,” suggests that we should support all people (the latter being a notion that receives quite a bit of support, in fact). This line of argument is not unique to Maddox, of course, and I wanted to address that thought briefly to show why I don’t think it works particularly well here.

First, I would agree that the “support the troops” slogan is met with a much lower degree of resistance than “black lives matter,” at least as far as I’ve seen. So why this differential response? As I see it, the reason this comparison breaks down involves the zero-sum nature of each issue: if you spend $5 to buy a “support the troops” ribbon magnet to attach to your car, that money is usually intended to be designated towards military-related causes. Now, importantly, money that is spent relieving the problems in the military domain cannot be spent elsewhere. That $5 cannot be given to both military causes and also given to cancer research and also given to teachers and also used to repave roads, and so on. There need to be trade-offs in whom you support in that case.

However, if you want to address the problem of police violence against civilians, it seems that tactics which effectively reduce violence against black populations should also be able to reduce violence against non-black populations, such as use-of-force training or body cameras. The problems, essentially, have a very high degree of overlap and, in terms of the raw numbers, many more non-black people are killed by police than black ones. If we can alleviate both at the same time with the same methods, focusing on one group seems needless. It is only those killings of civilians that affect black populations (24% of the shootings) and are also driven predominately or wholly by racism (an unknown percent of that 24%) that could be effectively addressed by a myopic focus on the race of the person being killed per se. I suspect that many people have independently figured that out – consciously or otherwise – and so dislike the specific attention drawn to race. While a focus on race might be useful for virtue signaling, I don’t think it will be very productive in actually reducing police violence.

“Look at how high my horse is!”

To summarize, to meaningfully talk about police violence, we need to articulate our expectations about how much of it we should see, as well as its shape. It makes no sense to talk about how violence is biased against one group or another until those benchmarks have been established (this logic applies to all discussions of bias in data, regardless of topic). None of this is intended to be me telling you how much or what kind of violence to expect; I’m by no means in possession of the necessary expertise. Regardless, if one wants to reduce police violence, inclusive solutions are likely going to be superior to exclusive ones, as a large degree of overlap in causes likely exists between cases, and solving the problems of one group will help solve the problems of another. There is merit to addressing specific problems as well – as that overlap is certainly less than 100% – but in doing so, it is important to not lose sight of the commonalities and distance those who might otherwise be your allies.

References: Ridgeway, G. (2016). Officer risk factors associated with police shootings: a matched case-control study. Statistics & Public Policy, 3, 1-6.

Why Women Are More Depressed Than Men

Women are more likely to be depressed than men; about twice as likely here in the US, as I have been told. It’s an interesting finding, to be sure, and making sense of it poses a fun little mystery (as making sense of many things tends to). We don’t just want to know that women are more depressed than men; we also want to know why women are more depressed. So what are the causes of this difference? The Mayo Clinic floats a few explanations, noting that this sex difference appears to emerge around puberty. As such, many of the explanations they put forth center around the problems that women (but not men) might face when undergoing that transitional period in their life. These include things like increased pressure to achieve in school, conflict with parents, gender confusion, PMS, and pregnancy-related factors. They also include ever-popular suggestions such as societal biases that harm women. Now I suspect these are quite consistent with the answers you would get if you queried your average Joe or Jane on the street as to why they think women are more depressed. People recognize that depression often appears to follow negative life events and stressors, and so they look for proximate conditions that they believe (accurately or not) disproportionately affect women.

Boys don’t have to figure out how to use tampons; therefore less depression

While that seems to be a reasonable strategy, it produces results that aren’t entirely satisfying. First, it seems unlikely that women face that much more stress and negative life events than men do (twice as much?) and, secondly, it doesn’t do much to help us understand individual variation. Lots of people face negative life events, but lots of them also don’t end up spiraling into depression. As I noted above, our understanding of the facts related to depression can be bolstered by answering the why questions. In this case, the focus many people have is on answering the proximate whys rather than the ultimate ones. Specifically, we want to know why people respond to these negative life events with depression in the first place; what adaptive function depression might have. Though depression reactions appear completely normal to most, perhaps owing to their regularity, we need to make that normality strange. If, for example, you imagine a new mouse mother facing the stresses of caring for her young in a hostile world, a postpartum depression on her part might seem counterproductive: faced with the challenges of surviving and caring for her offspring, what adaptive value would depressive symptoms have? How would low energy, a lack of interest in important everyday activities, and perhaps even suicidal ideation help make her situation better? If anything, they would seem to disincline her from taking care of these important tasks, leaving her and her dependent offspring worse off. This strangeness, of course, wouldn’t just exist in mice; it should be just as strange when we see it in humans.

The most compelling adaptive account of depression I’ve read (Hagen, 2003) suggests that the ultimate why of depression focuses on social bargaining. I’ve written about it before, but the gist of the idea is as follows: if I’m facing adversity that I am unlikely to be able to solve alone, one strategy for overcoming that problem is to recruit others in the world to help me. However, those other people aren’t always forthcoming with the investment I desire. If others aren’t responding to my needs adequately, it would behoove me to try and alter their behavior so as to encourage them to increase their investment in me. Depression, in this view, is adapted to do just that. The psychological mechanisms governing depression work to, essentially, place the depressed individual on a social strike. When workers are unable to effectively encourage an increased investment from their employers (perhaps in the form of pay or benefits), they will occasionally refuse to work at all until their conditions improve. While this is indeed costly for the workers, it is also costly for the employer, and it might be beneficial for the employer to cave to the demands rather than continue to face the costs of not having people work. Depression shows a number of parallels to this kind of behavior, where people withdraw from the social world – taking with them the benefits they provided to others – until other people increase their investment in the depressed individual to help see them through a tough period.

Going on strike (or, more generally, withdrawing from cooperative relationships), of course, is only one means of getting other people to increase their investment in you; another potential strategy is violence. If someone is enacting behaviors that show they don’t value me enough, I might respond with aggressive behaviors to get them to alter that valuation. Two classic examples of this could be shooting someone in self-defense or a loan-shark breaking a delinquent client’s legs. Indeed, this is precisely the type of function that Sell et al (2009) proposed that anger has: if others aren’t giving me my due, anger motivates me to take actions that could recalibrate their concern for my welfare. This leaves us with two strategies – depression and anger – that can both solve the same type of problem. The question arises, then, as to which strategy will be the most effective for a given individual and their particular circumstances. This raises a rather interesting possibility: it is possible that the sex difference in depression exists because the anger strategy is more effective for men, whereas the depression strategy is more effective for women (rather than, say, because women face more adversity than men). This would be consistent with the sex difference in depression arising around puberty as well, since this is when sex differences in strength also begin to emerge. In other words, both men and women have to solve similar social problems; they just go about it in different ways. 

“An answer that doesn’t depend on wide-spread sexism? How boring…”

Crucially, this explanation should also be able to account for within-sex differences as well: while men are more able to successfully enact physical aggression than women, not all men will be successful in that regard since not all men possess the necessary formidability. The male who is 5’5″ and 130 pounds soaking wet likely won’t win against his taller, heavier, and stronger counterparts in a fight. As such, men who are relatively weak might preferentially make use of the depression strategy, since picking fights they probably won’t win is a bad idea, while those who are on the stronger side might instead make use of anger more readily. Thankfully, a new paper by Hagen & Rosenstrom (2016) examines this very issue; at least part of it. The researchers sought to test whether upper-body strength would negatively predict depression scores, controlling for a number of other, related variables.

To do so, they accessed data from the National Health and Nutrition Examination Survey (NHANES), netting a little over 4,000 subjects ranging in age from 18-60. As a proxy for upper-body strength, the authors made use of the measures subjects had provided of their hand-grip strength. The participants had also filled out questions concerning their depression, height and weight, socioeconomic status, white blood cell count (to proxy health), and physical disabilities. The researchers predicted that: (1) depression should negatively correlate with grip-strength, controlling for age and sex, (2) that relationship should be stronger for men than women, and (3) that the relationship would persist after controlling for physical health. About 9% of the sample qualified as depressed and, as expected, women were more likely to report depression than men by about 1.7 times. Sex, on its own, was a good predictor of depression (in their regression, β = 0.74).

When grip-strength was added into the statistical model, however, the effect of sex dropped into the non-significant range (β = 0.03), while strength possessed good predictive value (β = -1.04). In support of the first hypothesis, then, increased upper-body strength did indeed negatively correlate with depression scores, removing the effect of sex almost entirely. In fact, once grip strength was controlled for, men were actually slightly more likely to report depression than women (though this didn’t appear to be significant). Prediction 2 was not supported, however, with there being no significant interaction between sex and grip-strength on measures of depression. This effect persisted even when controlling for socioeconomic status, age, anthropometric, and hormonal variables. However, physical disability did attenuate the relationship between strength and depression quite a bit, which is understandable in light of the fact that physically-disabled individuals likely have their formidability compromised, even if they have stronger upper bodies (an example being a man in a wheelchair having good grip strength, but still not being much use in a fight). It is worth mentioning that the relationship between strength and depression appeared to grow larger with age; the authors suggest this might have something to do with older individuals having more opportunities to test their strength against others, which sounds plausible enough.
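
If the idea of a sex effect "disappearing" once strength is in the model sounds mysterious, a toy simulation might help. To be clear about what this is: the data below are entirely made up, every parameter is an arbitrary assumption of mine, and I use a plain linear regression for simplicity rather than the authors' actual model or the NHANES data. In this invented world depression depends only on grip strength, and the sexes differ only in average strength.

```python
import random


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def slope_alone(x, y):
    """Least-squares slope of y on a single predictor x."""
    return cov(x, y) / cov(x, x)


def slopes_joint(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 fitted together (with intercept)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Simulated sample: all numbers below are arbitrary illustration values.
rng = random.Random(0)
sex, strength, depression = [], [], []
for i in range(4000):
    female = i % 2                                 # 0 = male, 1 = female
    grip = rng.gauss(45 - 15 * female, 7)          # women weaker on average
    score = 10 - 0.1 * grip + rng.gauss(0, 2)      # depends on strength only
    sex.append(female)
    strength.append(grip)
    depression.append(score)

sex_alone = slope_alone(sex, depression)
sex_adjusted, strength_adjusted = slopes_joint(sex, strength, depression)

if __name__ == "__main__":
    print(f"sex alone:               {sex_alone:.2f}")
    print(f"sex, strength in model:  {sex_adjusted:.2f}")
    print(f"strength, sex in model:  {strength_adjusted:.2f}")
```

With sex as the only predictor, it picks up the whole difference that strength is actually producing; once strength is entered alongside it, the sex coefficient shrinks towards zero. The real result matches that pattern, though a simulation like this can only show the pattern is possible, not that strength is what is doing the causal work in the real data.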

Also worth noting is that when depression scores were replaced with suicidal ideation, the predicted sex-by-strength interaction did emerge, such that men with greater strength reported being less suicidal, while women with greater strength reported being more suicidal (the latter portion of which is curious and not predicted). Given that men succeed at committing suicide more often than women, this relationship is probably worth further examination.  

“Not today, crippling existential dread”

Taken together with findings from Sell et al (2009) – where men, but not women, who possessed greater strength reported being quicker to anger and more successful in physical conflicts – the emerging picture is one in which women tend to (not consciously) “use” depression as a means of social bargaining because it tends to work better for them than anger, whereas the reverse holds true for men. To be clear, both anger and depression are triggered by adversity, but those events interact with an individual’s condition and their social environment in determining the precise response. As the authors note, the picture is likely to be a dynamic one; not one that’s as simple as “more strength = less depression” across the board. Of course, other factors that co-vary with physical strength and health – like attractiveness – could also be playing a role in the relationship with depression, but since such matters aren’t spoken to directly by the data, the extent and nature of those other factors is speculative.

What I find very persuasive about this adaptive hypothesis, however – in addition to the reported data – is that many existing theories of depression would not make the predictions tested by Hagen & Rosenstrom (2016) in the first place. For example, those who claim something like, “depressed people perceive the world more accurately” would be at a bit of a loss to explain why those who perceive the world more accurately also seem to have lower upper-body strength (they might also want to explain why depressed people don’t perceive the world more accurately, either). A plausible adaptive hypothesis, on the other hand, is useful for guiding our search for, and understanding of, the proximate causes of depression.

References: Hagen, E.H. (2003). The bargaining model of depression. In: Genetic and Cultural Evolution of Cooperation, P. Hammerstein (ed.). MIT Press, 95-123

Hagen, E. & Rosenstrom, T. (2016). Explaining the sex difference in depression with a unified bargaining model of anger and depression. Evolution, Medicine, & Public Health, 117-132.

Sell, A., Tooby, J., & Cosmides, L. (2009). Formidability and the logic of human anger. Proceedings of the National Academy of Sciences, 106, 15073-78.

Chivalry Isn’t Dead, But Men Are

In the somewhat-recent past, there was a vote in the Senate held on the matter of whether women in the US should be required to sign up for the selective service – the military draft – when they turn 18. Already accepted, of course, was the idea that men should be required to sign up; what appears to be a relatively less controversial idea. This represents yet another erosion of male privilege in modern society; in this case, the privilege of being expected to fight and die in armed combat, should the need arise. Now whether any conscription is likely to happen in the foreseeable future (hopefully not) is a somewhat different matter than whether women would be among the first drafted if that happened (probably not), but the question remains as to how to explain this state of affairs. The issue, it seems, is not simply one of whether men or women are better able to shoulder the physical demands of combat, however; it extends beyond military service into intuitions about real and hypothetical harm befalling men and women in everyday life. When it comes to harm, people seem to generally care less about it happening to men.

Meh

One anecdotal example of these intuitions I’ve encountered during my own writing is when an editor at Psychology Today removed an image in one of my posts of a woman undergoing bodyguard training in China by having a bottle smashed over her head (which can be seen here; it’s by no means graphic). There was a concern expressed that the image was in some way inappropriate, despite my posting of other pictures of men being assaulted or otherwise harmed. As a research-minded individual, however, I want to go beyond simple anecdotes from my own life that confirm my intuitions into the empirical world where other people publish results that confirm my intuitions. While I’ve already written about this issue a number of times, it never hurts to pile on a little more. Recently, I came upon a paper by FeldmanHall et al (2016) that examined these intuitions about harm directed towards men and women across a number of studies that can help me do just that.

The first of the studies in the paper was a straightforward task: fifty participants were recruited from Mturk to respond to a classic morality problem called the footbridge dilemma. Here, the lives of five people can be saved from a train by pushing one person in front of it. When these participants were asked whether they would push a man or woman to their death (assuming, I think, that they were going to push one of them), 88% of participants opted for killing the man. Their second study expanded a bit on that finding using the same dilemma, but asking instead how willing they would be (on a 1-10 scale) to push either a man, woman, or a person of unspecified gender without other options existing. The findings here with regard to gender were a bit less dramatic and clear-cut: participants were slightly more likely to indicate that they would push a man (M = 3.3) than a woman (M = 3.0), though female participants were nominally less likely to push a woman (roughly M = 2.3) than men were (roughly M = 3.8), perhaps counter to what might be predicted. That said, the sample size for this second study was fairly small (only about 25 per group), so that difference might not be worth making much of until more data is collected.

When faced with a direct and unavoidable trade-off between the welfare of men and women, then, the results overwhelmingly showed that the women were being favored; however, when it came to cases where men or women could be harmed alone, there didn’t seem to be a marked difference between the two. That said, that moral dilemma alone can only take us so far in understanding people’s interest in the welfare of others, in no small part because of its life-and-death nature potentially introducing ceiling effects (man or woman, very few people are willing to throw someone else in front of a train). In other instances where the degree of harm is lowered – such as, say, male vs female genital cutting – differences might begin to emerge. Thankfully, FeldmanHall et al (2016) included an additional experiment that brought these intuitions out of the hypothetical and into reality while lowering the degree of harm. You can’t kill people to conduct psychological research, after all.

Yet…

In the next experiment, 57 participants were recruited and given £20. At the end of the experiment, any money they had would be multiplied by ten, meaning participants could leave with a total of £200 (which is awfully generous as far as these things go). As with most psychology research, however, there was a catch: the participants would be taking part in 20 trials where £1 was at stake. A target individual – either a man or a woman – would be receiving a painful electric shock, and the participants could give up some of that £1 to reduce its intensity, with the full £1 removing the shock entirely. To make the task a little less abstract, the participants were also forced to view videos of the target receiving the shocks (which, I think, were prerecorded videos of real shocks – rather than shocks in real time – but I’m not sure from my reading of the paper if that’s a completely accurate description).
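To make the payout structure concrete, here is a minimal sketch of the arithmetic as I read it from the description above. The function name and the validation rules are my own inventions for illustration, and I'm assuming the £20 endowment is simply the 20 trials at £1 each; the paper may have handled the bookkeeping differently.

```python
# Hypothetical sketch of the task's payout arithmetic (not the authors' code).
STARTING_POT = 20.0      # pounds given to each participant
TRIALS = 20              # number of trials
STAKE_PER_TRIAL = 1.0    # pounds at stake on each trial
MULTIPLIER = 10          # remaining money is multiplied by ten at the end


def final_payout(spent_per_trial):
    """Return take-home pay given the amount given up on each trial."""
    if len(spent_per_trial) != TRIALS:
        raise ValueError("expected one amount per trial")
    if any(s < 0 or s > STAKE_PER_TRIAL for s in spent_per_trial):
        raise ValueError("can give up between 0 and the full stake per trial")
    kept = STARTING_POT - sum(spent_per_trial)
    return kept * MULTIPLIER


# Never paying to reduce a shock keeps the full pot; always paying keeps nothing.
selfish = final_payout([0.0] * TRIALS)
generous = final_payout([1.0] * TRIALS)
```

Under these assumptions, every pound spent reducing a shock costs the participant ten pounds of take-home pay, which is what makes the altruism in this task "costly".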

In this study, another large difference emerged: as expected, participants interacting with female targets ended up keeping less money by the end (M = £8.76) than those interacting with male targets (M = £12.54; d = .82). In other words, the main finding of interest was that participants were willing to give up substantially more money to prevent women from receiving painful shocks than they were to help men. Interestingly, this was the case in spite of the facts that (a) the male target in the videos was rated more positively overall than the female target, and (b) in a follow-up study where participants provided emotional reactions to thinking about being a participant in the former study, the amount of reported aversion to letting the target suffer shocks was similar regardless of the target’s gender. As the authors conclude:

While it is equally emotionally aversive to hurt any individual—regardless of their gender—that society perceives harming women as more morally unacceptable, suggests that gender bias and harm considerations play a large role in shaping moral action.

So, even though people find harming others – or letting them suffer harm for a personal gain – to generally be an uncomfortable experience regardless of their gender, they are more willing to help/avoid harming women than they are men, sometimes by a rather substantial margin.
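For a rough sense of what that effect size means, one can back out the spread implied by the reported means. This is only a back-of-the-envelope sketch: it assumes d was computed as the usual difference in means divided by a pooled standard deviation, which I have not verified against the paper, and the variable names are mine.

```python
# Back-of-the-envelope: implied pooled SD from the reported means and d.
# Assumes the standard form d = (M_male - M_female) / SD_pooled.
mean_kept_male_target = 12.54    # pounds kept, male target (reported)
mean_kept_female_target = 8.76   # pounds kept, female target (reported)
reported_d = 0.82                # reported effect size

mean_difference = mean_kept_male_target - mean_kept_female_target
implied_pooled_sd = mean_difference / reported_d

print(f"difference in money kept: £{mean_difference:.2f}")
print(f"implied pooled SD: £{implied_pooled_sd:.2f}")
```

If that assumption holds, participants kept about £3.78 less when the target was a woman, against a typical spread of somewhere around £4.60 between participants.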

Now onto the fun part: explaining these findings. It doesn’t go nearly far enough as an explanation to note that “society condones harming men more than women,” as that just restates the finding; likewise, we only get so far by mentioning that people perceive men to have a higher pain tolerance than women (because they do), as that only pushes the question back a step to the matter of why men tolerate more pain than women. As for my thoughts, first, I think these findings highlight the importance of a modular understanding of psychological systems: our altruistic and moral systems are made up of a number of component pieces, each with a distinct function, and the piece that is calculating how much harm is generated is, it would seem, not the same piece deciding whether or not to do something about it. The obvious reason for this distinction is that alleviating harm to others isn’t always adaptive to the same extent: it does me more adaptive good to help kin relative to non-kin, friends relative to strangers, and allies relative to enemies, all else being equal. 

“Just stay out of it; he’s bigger than you”

Second, it might well be the case that helping men, on average, tends to pay off less than helping women. Part of the reason for that state of affairs is that female reproductive potential cannot be replaced quite as easily as male potential; male reproductive success is constrained by the number of available women much more than female potential is by male availability (as Chris Rock put it, “any money spent on dick is a bad investment”). As such, men might become particularly inclined to invest in alleviating women’s pain as a form of mating effort. The story clearly doesn’t end there, however, or else we would predict men being uniquely likely to benefit women, rather than both sexes doing similarly. This raises two additional possibilities to me: one of these is that, if men value women highly as a form of mating effort, that increased social value could also make women more valuable to other women in turn. To place that in a Game of Thrones example, if a powerful house values their own children highly, non-relatives may come to value those same children highly as well in the hopes of ingratiating themselves to – or avoiding the wrath of – the child’s family.

The other idea that comes to mind is that men are less willing to reciprocate aid that alleviated their pain because to do so would be an admission of a degree of weakness; a signal that they honestly needed the help (and might in the future as well), which could lower their relative status. If men are less willing to reciprocate aid, that would make men worse investments for both sexes, all else being equal; better to help out the person who would experience more gratitude for your assistance and repay you in turn. While these explanations might or might not adequately explain these preferential altruistic behaviors directed towards women, I feel they’re worthwhile starting points.

References: FeldmanHall, O., Dalgleish, T., Evans, D., Navrady, L., Tedeschi, E., & Mobbs, D. (2016). Moral chivalry: Gender and harm sensitivity predict costly altruism. Social Psychological & Personality Science, DOI: 10.1177/1948550616647448

Sexism, Testing, And “Academic Ability”

When I was teaching my undergraduate course on evolutionary psychology, my approach to testing and assessment was unique. You can read about that philosophy in more detail here, but the gist of my method was specifically avoiding multiple-choice formats in favor of short-essay questions with unlimited revision ability on the part of the students. I favored this exam format for a number of reasons, chief among which was that (a) I didn’t feel multiple choice tests were very good at assessing how well students understood the material (memorization and good guessing do not equal understanding), and (b) I didn’t really care about grading my students as much as I cared about getting them to learn the material. If they didn’t grasp it properly on their first try (and very few students do), I wanted them to have the ability and motivation to continue engaging with it until they did get it right (which most eventually did; the class average for each exam began around a 70 and rose to a 90). For the purposes of today’s discussion, the important point here is that my exams were a bit more cognitively challenging than is usual and, according to a new paper, that means I had unintentionally biased my exams in ways that disfavor “historically underserved groups” like women and the poor.

Oops…

What caught my eye about this particular paper, however, was the initial press release that accompanied it. Specifically, the authors were quoted as saying something I found, well, a bit queer:

“At first glance, one might assume the differences in exam performance are based on academic ability. However, we controlled for this in our study by including the students’ incoming grade point averages in our analysis,”

So the authors appear to believe that a gap in performance on academic tests arises independent of academic abilities (whatever those entail). This raised the immediate question in my mind of how one knows that abilities are the same unless one has a method of testing them. It seems a bit strange to say that abilities are the same on the basis of one set of tests (those that provided incoming GPAs), but then to continue to suggest that abilities are the same when a different set of tests provides a contrary result. In the interests of settling my curiosity, I tracked the paper down to see what was actually reported; after all, these little news blurbs frequently get the details wrong. Unfortunately, this one appeared to capture the authors’ views accurately.

So let’s start by briefly reviewing what the authors were looking at. The paper, by Wright et al (2016), is based on data collected from three years’ worth of three introductory biology courses spanning 26 different instructors, approximately 5,000 students, and 87 different exams. Without going into too much unnecessary detail, the tests were assessed by independent raters for their format and for how cognitively challenging they were, and the students were classified according to their gender and socio-economic status (SES; as measured by whether they qualified for a financial aid program). In an attempt to control for academic ability, Wright et al (2016) also looked at the freshman-year GPA of the students coming into the biology classes (based on approximately 45 credits, we are told). Because the authors controlled for incoming GPA, they hope to persuade the reader of the following:

This implies that, by at least one measure, these students have equal academic ability, and if they have differential outcomes on exams, then factors other than ability are likely influencing their performance.

Now one could argue that there’s more to academic ability than is captured by a GPA – which is precisely why I will do so in a minute – but let’s continue on with what the authors found first.

Cognitively challenging tests were indeed, well, more challenging. A statistically-average male student, for instance, would be expected to do about 12% worse on the most challenging test in their sample, relative to the easiest one. This effect was not the same between genders, however. Again, using statistically-average men and women, when the tests were the least cognitively challenging, there was effectively no performance gap (about a 1.7% expected difference favoring men); however, when the tests were the most cognitively challenging, that expected gap rose to an astonishing expected…3.2% difference. So, while the gender difference just about nominally doubled, in terms of really mattering in any practical sense of the word, its size was such that it likely wouldn’t be noticed unless one was really looking for it. A similar pattern was discovered for SES: when the tests were easy, there was effectively no difference between those low or high in SES (1.3% favoring those higher); however, when the tests were about maximally challenging, this expected difference rose to about 3.5%.

Useful for both spotting statistical blips and burning insects

There’s a lot to say about these results and how they’re framed within the paper. First, as I mentioned, they truly are minor differences; there are very few cases where a 1-3% difference in test scores is going to make-or-break a student, so I don’t think there’s any real reason to be concerned or to adjust the tests; not practically, anyway.
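To put the relative and absolute sizes of those gaps side by side, here is a small sketch using only the four figures reported above. The dictionary layout and function name are just my own scaffolding, and I'm treating the reported percentages as points on a 100-point exam, which is an assumption on my part.

```python
# Reported expected gaps (in percentage points) on the least and most
# cognitively challenging exams, as described in the text above.
reported_gaps = {
    "gender": {"easiest": 1.7, "hardest": 3.2},
    "ses": {"easiest": 1.3, "hardest": 3.5},
}


def summarize_gap(easiest, hardest):
    """Return how much the gap grew in relative and absolute terms."""
    return {
        "ratio": round(hardest / easiest, 2),         # "the gap doubled"
        "added_points": round(hardest - easiest, 1),  # what a student feels
    }


summary = {name: summarize_gap(**gap) for name, gap in reported_gaps.items()}
for name, stats in summary.items():
    print(name, stats)
```

The ratio is the figure that sounds dramatic; the added points are what a student would actually see on a graded exam, and under this reading they amount to a point or two out of a hundred.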

However, there are larger, theoretical issues looming in the paper. One of these is that the authors use the phrase “controlled for academic ability” so often that a reader might actually come to believe that’s what they did from simple repetition. The problem here, of course, is that the authors did not control for that; they controlled for GPA. Unfortunately for Wright et al’s (2016) presentation, those two things are not synonyms. As I said before, it is strange to say that academic ability is the same because one set of tests (incoming GPA) says they are while another set does not. The former set of tests appears to be privileged for no sound reason. Because of that unwarranted interpretation, the authors lose (or rather, purposefully remove) the ability to talk about how these gaps might be due to some performance difference. This is a useful rhetorical move if one is interested in doing advocacy – as it implies the gap is unfair and ought to be fixed somehow – but not if one is seeking the truth of the matter.

Another rather large issue in the paper is that, as far as I could tell, the authors predicted they would find these effects without ever really providing an explanation as to how or why that prediction arose. That is, what drove their expectation that men would outperform women and the rich outperform the poor? This ends up being something of a problem because, at the end of the paper, the authors do float a few possible (untested) explanations for their findings. The first of these is stereotype threat: the idea that certain groups of people will do poorly on tests because of some negative stereotype about their performance. This is a poor fit for the data for two reasons: first, while Wright et al (2016) claim that stereotype threat is “well-documented”, it actually fails to replicate (on top of not making much theoretical sense). Second, even if it were a real thing, stereotype threat, as it is typically studied, requires that one’s sex be made salient prior to the test. As I encountered a total of zero tests during my entire college experience that made my gender salient, much less my SES, I can only assume that the tests in question didn’t do it either. In order for stereotype threat to work as an explanation, then, women and the poor would need to be under relatively constant stereotype threat. In turn, this would make documenting and studying stereotype threat in the first place rather difficult, as you could never have a condition where your subjects were not experiencing it. In short, then, stereotype threat seems like a bad fit.

The other explanations that are put forth for this gender difference are the possibility that women and poor students have more fixed views of intelligence instead of growth mindsets, so they withdraw from the material when challenged rather than improve (i.e., “we need to change their mindsets to close this daunting 2% gap”), or the possibility that the test questions themselves are written in ways that subtly bias people’s ability to think about them (the example the authors raise is that a question written about applying some concept to sports might favor men, relative to women, as men tend to enjoy sports more). Given that the authors did have access to the test questions, it seems that they could have examined that latter possibility in at least some detail (minimally, perhaps, by looking at whether tests written by female instructors resulted in different outcomes than those written by male ones, or by examining the content of the questions themselves to see if women did worse on gendered ones). Why they didn’t conduct such analyses, I can’t say.

Maybe it was too much work and they lacked a growth mindset

In summary, these very minor average differences that were uncovered could easily be chalked up – very simply – to GPA not being a full measure of a student’s academic ability. In fact, if the tests determining freshman GPA aren’t the most cognitively challenging (as one might well expect, given that students would have been taking mostly general introductory courses with large class sizes), then this might make the students appear to be more similar in ability than they actually were. The matter can be thought of using this stereotypically-male example (that will assuredly hinder women’s ability to think about it): imagine I tested people in a room with weights ranging from 1-15 pounds and asked them to curl each one time. This would give me a poor sense for any underlying differences in strength because the range of ability tested was restricted. If I were to ask them to do the same with weights ranging from 1-100 pounds the next week, I might conclude that it’s something about the weights – and not people’s abilities – when it came to figuring out why differences suddenly emerged (since I mistakenly believe I already controlled for their abilities the first time).
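The weight-room example can be put into a toy sketch. Everything here is invented for illustration: the two groups, their strengths, and the scoring rule (heaviest weight curled once) are hypothetical stand-ins, not data or methods from Wright et al (2016).

```python
# Toy illustration of range restriction: a test that is too easy hides
# real differences in ability. All numbers below are made up.

def heaviest_curl(strength, rack):
    """Heaviest weight on the rack this person can curl once (0 if none)."""
    liftable = [weight for weight in rack if weight <= strength]
    return max(liftable, default=0)


def mean(values):
    return sum(values) / len(values)


light_rack = range(1, 16)    # week one: weights from 1 to 15 pounds
heavy_rack = range(1, 101)   # week two: weights from 1 to 100 pounds

# Hypothetical one-rep curling strengths, in pounds.
group_a = [25, 40, 55, 70]
group_b = [20, 30, 40, 50]


def group_gap(rack):
    """Difference in mean test score between the two groups on a given rack."""
    score_a = mean([heaviest_curl(s, rack) for s in group_a])
    score_b = mean([heaviest_curl(s, rack) for s in group_b])
    return score_a - score_b


gap_on_light_rack = group_gap(light_rack)
gap_on_heavy_rack = group_gap(heavy_rack)
```

On the light rack everyone tops out at 15 pounds, so the measured gap is zero even though the groups differ in strength; the gap only shows up once the rack is heavy enough to separate people. Nothing about the heavier weights is "biased"; the first test simply couldn't detect the difference.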

Now I don’t know if something like that is actually responsible, but if the tests determining freshman GPA were tapping the same kinds of abilities to the same degrees as those in the biology courses studied, then controlling for GPA should have taken care of that potential issue. Since controlling for GPA did not, I feel safe assuming there is some difference in the tests in terms of what abilities they’re measuring.

References: Wright, C., Eddy, S., Wenderoth, M., Abshire, E., Blankenbiller, M., & Brownell, S. (2016). Cognitive difficulty and format of exams predicts gender and socioeconomic gaps in exam performance of students in introductory biology courses. CBE Life Sciences Education, 15.

Smoking Hot

If the view counts on previous posts have been any indication, people really do enjoy reading about, understanding, and – perhaps more importantly – overcoming the obstacles found on the dating terrain; understandably so, given its greater personal relevance to their lives. In the interests of adding some value to the lives of others, then, today I wanted to discuss some research examining the connection between recreational drug use and sexual behavior in order to see if any practical behavioral advice can be derived from it. The first order of business will be to try and understand the relationship between recreational drugs and mating from an evolutionary perspective; the second will be to take a more direct look at whether drug use has positive or negative effects when it comes to attracting a partner, and in what contexts those effects might exist. In short, will things like drinking and smoking make you smoking hot to others?

So far selling out has been unsuccessful, so let’s try talking sex

We can begin by considering why people care so much about recreational drug use in general: from historical prohibitions on alcohol to modern laws prohibiting the possession, use, and sale of drugs, many people express a deep concern over who gets to put what into their body at what times and for what reasons. The ostensibly obvious reason for this concern that most people will raise immediately is that such laws are designed to save people from themselves: drugs can cause a great degree of harm to users and people are, essentially, too stupid to figure out what’s really good for them. While perceptions of harm to drug users themselves no doubt play a role in these intuitions, they are unlikely to actually be the whole story for a number of reasons, chief among which is that they would have a hard time explaining the connection between sexual strategies and drug use (and that putting people in jail probably isn’t all that good for them either, but that’s another matter). Sexual strategies, in this case, refer roughly to an individual’s degree of promiscuity: some people preferentially enjoy engaging in one or more short-term sexual relationships (where investment is often funneled to mating efforts), while others are more inclined towards single, long-term ones (where investment is funneled to parental efforts). While people do engage in varying degrees of both at times, the distinction captures the general idea well enough. Now, if one is the type who prefers long-term relationships, it might benefit you to condemn behaviors that encourage promiscuity; it doesn’t help your relationship stability to have lots of people around who might try to lure your mate away or reduce the confidence of a man’s paternity in his children. To the extent that recreational drug use does that (e.g., those who go out drinking in the hopes of hooking up with others owing to their reduced inhibitions), it will be condemned by the more long-term maters in turn. Conversely, those who favor promiscuity should be more permissive towards drug use as it makes enacting their preferred strategy easier.

This is precisely the pattern of results that Quintelier et al (2013) report: in a cross-cultural sample of Belgian (N = 476), Dutch (N = 298), and Japanese (N = 296) college students who did not have children, even after controlling for age, sex, personality variables, political ideology, and religiosity, attitudes towards drug use were still reliably predicted by participants’ sexual attitudes: the more sexually permissive one was, the more they tended to approve of drug use. In fact, sexual attitudes were the best predictors of people’s feelings about recreational drugs both before and after the controls were added (findings which replicated a previous US sample). By contrast, while the non-sexual variables were sometimes significant predictors of drug views after controlling for sexual attitudes, they were not as reliable and their effects were not as large. This pattern of results, then, should yield some useful predictions about how drug use affects your attractiveness to other people: those who are looking for short-term sexual encounters might find drug use more appealing (or at least less off-putting), relative to those looking for long-term relationships.

“I pronounce you man and wife. Now it’s time to all get high”

Thankfully, I happen to have a paper on hand that speaks to the matter somewhat more directly. Vincke (2016) sought to examine how attractive brief behavioral descriptions of men were rated as being by women for either short- or long-term relationships. Of interest, these descriptions included the fact that the man in question either (a) never, (b) occasionally, or (c) frequently smoked cigarettes or drank alcohol. A sample of 240 Dutch women was recruited and asked to rate these profiles with respect to how attractive the men in question would be for either a casual or committed relationship and whether they thought the men themselves were more likely to be interested in short/long-term relationships.

Taking these in reverse order, the women rated the men who never smoked as somewhat less sexually permissive (M = 4.31, scale from 1 to 7) than those who either occasionally or frequently did (Ms = 4.83 and 4.98, respectively; these two values did not significantly differ). By contrast, those who never drank or occasionally did were rated as being comparably less permissive (Ms = 4.04) than the men who drank frequently (M = 5.17). Drug use, then, did affect women’s perceptions of men’s sexual interests (and those perceptions happen to match reality, as a second study with men confirmed). If you’re interested in managing what other people think your relationship intentions are, then, managing your drug use accordingly can make something of a difference. Whether that ended up making the men more attractive is a different matter, however.

As it turns out, smoking and drinking appear to look distinct in that regard: in general, smoking tended to make men look less attractive, regardless of whether the mating context was short- or long-term, and frequent smoking was worse than occasional smoking. However, the decline in attractiveness from smoking was not as large in short-term contexts. (Oddly, Vincke (2016) frames smoking as being an attractiveness benefit in short-term contexts within her discussion when it’s really just less of a cost. The slight bump seen in the data is neither statistically nor practically significant.) This pattern can be seen in the left half of the author’s graph. By contrast – on the right side – occasional drinkers were generally rated as more attractive than men who never or frequently drank across both short- and long-term relationships. However, in the context of short-term mating, frequent drinking was rated as being more attractive than never drinking, whereas this pattern reversed itself for long-term relationships. As such, if you’re looking to attract someone for a serious relationship, you probably won’t be impressing them much with your ability to do keg stands of liquor, but if you’re looking for someone to hook up with that night it might be better to show that off than sip on water all evening.

Cigarettes and alcohol look different from one another in the attractiveness domain even though both might be considered recreational drug use. It is probable that what differentiates them here is their effects on encouraging promiscuity, as previously discussed. While people are often motivated to go out drinking in order to get intoxicated, lose their inhibitions, and have sex, the same cannot usually be said about smoking cigarettes. Singles don’t usually congregate at smoking bars to meet people and start relationships, short-term or otherwise (forgoing for the moment that smoking bars aren’t usually things, unless you count the rare hookah lounges). Smoking might thus make men appear to be more interested in casual encounters because it cues a more general interest in short-term rewards, rather than anything specifically sexual; in this case, if one is willing to risk the adverse health effects in the future for the pleasure cigarettes provide today, then it is unlikely that someone would be risk averse in other areas of their life.

If you want to examine sex specifically, you might have picked the wrong smoke

There are some limitations here, namely that this study did not separate women in terms of what they were personally seeking in terms of relationships or their own interests/behaviors when it comes to engaging in recreational drug use. Perhaps these results would look different if you were to account for women’s smoking/drinking habits. Even if frequent drinking is a bad thing for long-term attractiveness in general, a mismatch with the particular person you’re looking to date might be worse. It is also possible that a different pattern might emerge if men were assessing women’s attractiveness, but what those differences would be is speculative. It is unfortunate that the intuitions of the other gender didn’t appear to be assessed. I think this is a function of Vincke (2016) looking for confirmatory evidence for her hypothesis that recreational drug use is attractive to women in short-term contexts because it entails risk, and women value risk-taking more in short-term male partners than long-term ones. (There is a point to make about that theory as well: while some risky activities might indeed be more attractive to women in short-term contexts, I suspect those activities are not preferred because they’re risky per se, but rather because the risks send some important cue about the mate quality of the risk taker. Also, I suspect the risks need to have some kind of payoff; I don’t think women prefer men who take risks and fail. Anyone can smoke, and smoking itself doesn’t seem to send any honest signal of quality on the part of the smoker.)

In sum, the usefulness of these results for making any decisions in the dating world is probably at its peak when you don’t really know much about the person you’re about to meet. If you’re a man and you’re meeting a woman who you know almost nothing about, this information might come in handy; on the other hand, if you have information about that woman’s preferences as an individual, it’s probably better to use that instead of the overall trends. 

References: Quintelier, K., Ishii, K., Weeden, J., Kurzban, R., & Braeckman, J. (2013). Individual differences in reproductive strategy are related to views about recreational drug use in Belgium, the Netherlands, and Japan. Human Nature, 24, 196-217.

Vincke, E. (2016). The young male cigarette and alcohol syndrome: Smoking and drinking as a short-term mating strategy. Evolutionary Psychology, 1-13.

Count The Hits; Not The Misses

At various points in our lives, we have all read or been told anecdotes about how someone turned a bit of their life around. Some of these (or at least variations of them) likely sound familiar: “I cut out bread from my diet and all of a sudden felt so much better”; “Amy made a fortune working from home selling diet pills online”; “After the doctors couldn’t figure out what was wrong with me, I started drinking this tea and my infection suddenly cleared up”. The whole point of such stories is to try and draw a causal link; in these cases: (1) eating bread makes you feel sick, (2) selling diet pills is a good way to make money, and (3) tea is useful for combating infections. Some or all of these statements may well be true, but the real problem with these stories is the paucity of data upon which they are based. If you wanted to be more certain about those statements, you would want more information. Sure; you might have felt better after drinking that tea, but what about the other 10 people who drank similar tea and saw no results? How about all the other people selling diet pills who were in the financial hole from day one and never crawled out of it because it’s actually a scam? If you want to get closer to understanding the truth value of those statements, you need to consider the data as a whole; both stories of success and stories of failure. However, stories of someone not getting rich from selling diet pills aren’t quite as moving, and so don’t see the light of day; at least not initially. This facet of anecdotes was made light of by The Onion several years ago (and Clickhole had their own take more recently).

“At first he failed, but with some positive thinking he continued to fail over and over again”

These anecdotes often try and throw the spotlight on successful cases (hits) while ignoring the unsuccessful ones (misses), resulting in a biased picture of how things will work out. They don’t get us much closer to the truth. Most people who create and consume psychology research would like to think that psychologists go beyond these kinds of anecdotes and generate useful insights into how the mind works, but there have been a lot of concerns raised lately about precisely how much further they go on average, largely owing to the results of the reproducibility project. There have been numerous issues raised about the way psychology research is conducted: either in the form of advocacy for particular political and social positions (which distorts experimental designs and statistical interpretations) or the selective ways in which data is manipulated or reported to draw attention to successful data without acknowledging failed predictions. The result has been quite a number of false positives and overstated real ones cropping up in the literature.

While these concerns are warranted, it is difficult to quantify the extent of the problems. After all, very few researchers are going to come out and say they manipulated their experiments or data to find the results they wanted because (a) it would only hurt their careers and (b) in some cases, they aren’t even aware that they’re doing it, or that what they’re doing is wrong. Further, because most psychological research isn’t preregistered and null findings aren’t usually published, figuring out what researchers hoped to find (but did not) becomes a difficult undertaking just by reading the literature. Thankfully, a new paper from Franco et al (2016) brings some data to bear on the matter of how much underreporting is going on. While these data will not be the final word on the subject by any means (largely owing to the small sample size), they do provide some of the first steps in the right direction.

Franco et al (2016) report on a group of psychology experiments whose questionnaires and data were made publicly available. Specifically, these come from the Time-sharing Experiments for the Social Sciences (TESS), an NSF program in which online experiments are embedded in nationally-representative population surveys. Those researchers making use of TESS face strict limits on the number of questions they can ask, we are told, meaning that we ought to expect they would restrict their questions to the most theoretically-meaningful ones. In other words, we can be fairly confident that the researchers had some specific predictions they hoped to test for each experimental condition and outcome measure, and that these predictions were made in advance of actually getting the data. Franco et al (2016) were then able to track the TESS studies through to the eventual published versions of the papers to see what experimental manipulations and results were and were not reported. This provided the authors with a set of 32 semi-preregistered psychology experiments to examine for reporting biases.

A small sample I will recklessly generalize to all of psychology research

The first step was to compare the number of experimental conditions and outcome variables that were present in the TESS studies to the number that ultimately turned up in published manuscripts (i.e. are the authors reporting what they did and what they measured?). Overall, 41% of the TESS studies failed to report at least one of their experimental conditions; while there were an average of 2.5 experimental conditions in the studies, the published papers only mentioned an average of 1.8. In addition, 72% of the papers failed to report all their outcome variables; while there were an average of 15.4 outcome variables in the questionnaires, the published reports only mentioned 10.4. Taken together, only about 1-in-4 of the experiments reported all of what they did and what they measured. Unsurprisingly, this pattern extended to the size of the reported effects as well. In terms of statistical significance, the median reported p-value was significant (.02), while the median unreported p-value was not (.32); two-thirds of the reported tests were significant, while only one-fourth of the unreported tests were. Finally, published effect sizes were approximately twice as large as unreported ones.

Taken together, the pattern that emerged is that psychology research tends to underreport failed experimental manipulations, measures that didn’t pan out, and smaller effects. This should come as no surprise to almost anyone who has spent much time around psychology researchers or the researchers themselves who have tried to publish null findings (or, in fact, have tried to publish almost anything). Data is often messy and uncooperative, and people are less interested in reading about the things that didn’t work out (unless they’re placed in the proper contexts, where failures to find effects can actually be considered meaningful, such as when you’re trying to provide evidence against a theory). Nevertheless, the result of such selective reporting on what appears to be a fairly large scale is that the overall trustworthiness of reported psychology research dips ever lower, one false-positive at a time.
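To see why counting only the hits inflates the apparent size of an effect, here is a minimal toy simulation. It is not the analysis Franco et al (2016) ran, and every number in it is an assumption chosen for illustration: a small true effect of 0.2 standard deviations, 50 participants per group, 2,000 hypothetical studies, and a rule that only results passing the usual two-tailed .05 cutoff get "reported". The function names are likewise made up for this sketch.

```python
import random
import statistics


def simulate_study(true_effect, n_per_group, rng):
    """Run one hypothetical two-group study.

    Returns the observed mean difference and whether it clears the
    two-tailed .05 threshold (|z| > 1.96), treating the population
    standard deviation as known and equal to 1 for simplicity.
    """
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
    diff = statistics.fmean(treated) - statistics.fmean(control)
    standard_error = (2.0 / n_per_group) ** 0.5
    return diff, abs(diff / standard_error) > 1.96


def selective_reporting_demo(true_effect=0.2, n_per_group=50,
                             n_studies=2000, seed=1):
    """Compare average effects among 'reported' and 'unreported' studies."""
    rng = random.Random(seed)
    results = [simulate_study(true_effect, n_per_group, rng)
               for _ in range(n_studies)]
    reported = [diff for diff, significant in results if significant]
    unreported = [diff for diff, significant in results if not significant]
    return {
        "all": statistics.fmean([diff for diff, _ in results]),
        "reported": statistics.fmean(reported),
        "unreported": statistics.fmean(unreported),
        "share_reported": len(reported) / n_studies,
    }


if __name__ == "__main__":
    for name, value in selective_reporting_demo().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the average across all studies sits near the true effect, while the average among the significant-only subset is noticeably larger and the average among the rest is smaller. A reader who only sees the reported pile would come away overestimating the effect, even though no individual study was falsified.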

So what can be done about this issue? One suggestion that is often tossed around is the prospect that researchers should register their work in advance, making it clear what analyses they will be conducting and what predictions they have made. This was (sort of) the case in the present data, and Franco et al (2016) endorse this option. It allows people to assess research as more of a whole than just relying on the published accounts of it. While that’s a fine suggestion, it only goes so far toward improving the state of the literature. Specifically, it doesn’t really help the problem of journals not publishing null findings in the first place, nor does it necessarily stop researchers from doing post-hoc analyses of their data and turning up additional false positives. What is perhaps a more ambitious way of alleviating these problems that comes to mind would be to collectively change the way journals accept papers for publication. In this alternate system, researchers would submit an outline of their article to a journal before the research is conducted, making clear (a) what their manipulations will be, (b) what their outcome measures will be, and (c) what statistical analyses they will undertake. Then, and this is important, before either the researcher or the journals know what the results will be, the decision will be made to publish the paper or not. This would allow null results to make their way into mainstream journals while also allowing the researchers to build up their own resumes if things don’t work out well. In essence, it removes some of the incentives for researchers to cheat statistically. The assessment of the journals will then be based not on whether interesting results emerged, but rather on whether a sufficiently important research question had been asked.

Which is good, considering how often real, strong results seem to show up

There are some downsides to that suggestion, however. For one, the plan would take some time to enact even if everyone was on board. Journals would need to accept a paper for publication weeks or months in advance of the paper itself actually being completed. This would pose some additional complications for journals inasmuch as researchers will occasionally fail to complete the research at all, fail to complete it in a timely manner, or submit sub-par papers not worthy of print quite yet, leaving possible publication gaps. Further, it will sometimes mean that an issue of a journal goes out without containing any major advancements to the field of psychological research (no one happened to find anything this time), which might negatively affect the impact factor of the journals in question. Indeed, that last part is probably the biggest impediment to making major overhauls to the publication system that’s currently in place: most psychology research probably won’t work out all that well, and that will probably mean fewer people ultimately interested in reading about and citing it. While it is possible, I suppose, that null findings would actually be cited at similar rates to positive ones, that remains to be seen, and in the absence of that information I don’t foresee journals being terribly interested in changing their policies and taking that risk.

References: Franco, A., Malhotra, N., & Simonovits, G. (2016). Underreporting in psychology experiments: Evidence from a study registry. Social Psychological & Personality Science, 7, 8-12.