Chivalry Isn’t Dead, But Men Are

In the somewhat-recent past, there was a vote in the Senate held on the matter of whether women in the US should be required to sign up for the selective service – the military draft – when they turn 18. Already accepted, of course, was the idea that men should be required to sign up; what appears to be a relatively less controversial idea. This represents yet another erosion of male privilege in modern society; in this case, the privilege of being expected to fight and die in armed combat, should the need arise. Now whether any conscription is likely to happen in the foreseeable future (hopefully not) is a somewhat different matter than whether women would be among the first drafted if that happened (probably not), but the question remains as to how to explain this state of affairs. The issue, it seems, is not simply one of whether men or women are better able to shoulder the physical demands of combat, however; it extends beyond military service into intuitions about real and hypothetical harm befalling men and women in everyday life. When it comes to harm, people seem to generally care less about it happening to men.

Meh

One anecdotal example of these intuitions I’ve encountered during my own writing is when an editor at Psychology Today removed an image in one of my posts of a woman undergoing bodyguard training in China by having a bottle smashed over her head (which can be seen here; it’s by no means graphic). There was a concern expressed that the image was in some way inappropriate, despite my posting of other pictures of men being assaulted or otherwise harmed. As a research-minded individual, however, I want to go beyond simple anecdotes from my own life that confirm my intuitions into the empirical world where other people publish results that confirm my intuitions. While I’ve already written about this issue a number of times, it never hurts to pile on a little more. Recently, I came upon a paper by FeldmanHall et al (2016) that examined these intuitions about harm directed towards men and women across a number of studies that can help me do just that.

The first of the studies in the paper was a straightforward task: fifty participants were recruited from Mturk to respond to a classic morality problem called the footbridge dilemma. Here, the life of five people can be saved from a train by pushing one person in front of it. When these participants were asked whether they would push a man or woman to their death (assuming, I think, that they were going to push one of them), 88% of participants opted for killing the man. Their second study expanded a bit on that finding using the same dilemma, but asking instead how willing they would be (on a 1-10 scale) to push either a man, woman, or a person of unspecified gender without other options existing. The findings here with regard to gender were a bit less dramatic and clear-cut: participants were slightly more likely to indicate that they would push a man (M = 3.3) than a woman (M = 3.0), though female participants were nominally less likely to push a woman (roughly M = 2.3) than men were (roughly M = 3.8), perhaps counter to what might be predicted. That said, the sample size for this second study was fairly small (only about 25 per group), so that difference might not be worth making much over until more data is collected.

When faced with a direct and unavoidable trade-off between the welfare of men and women, then, the results overwhelmingly showed that the women were being favored; however, when it came to cases where men or women could be harmed alone, there didn’t seem to be a marked difference between the two. That said, that moral dilemma alone can only take us so far in understanding people’s interests about the welfare of others in no small part because of their life-and-death nature potentially introducing ceiling effects (man or woman, very few people are willing to throw someone else in front of a train). In other instances where the degree of harm is lowered – such as, say, male vs female genital cutting – differences might begin to emerge. Thankfully, FeldmanHall et al (2016) included an additional experiment that brought these intuitions out of the hypothetical and into reality while lowering the degree of harm. You can’t kill people to conduct psychological research, after all.

Yet…

In the next experiment, 57 participants were recruited and given £20. At the end of the experiment, any money they had would be multiplied by ten, meaning participants could leave with a total of £200 (which is awfully generous as far as these things go). As with most psychology research, however, there was a catch: the participants would be taking part in 20 trials where £1 was at stake. A target individual – either a man or a woman – would be receiving a painful electric shock, and the participants could give up some of that £1 to reduce its intensity, with the full £1 removing the shock entirely. To make the task a little less abstract, the participants were also forced to view videos of the target receiving the shocks (which, I think, were prerecorded videos of real shocks – rather than shocks in real time – but I’m not sure from my reading of the paper if that’s a completely accurate description).

In this study, another large difference emerged: as expected, participants interacting with female targets ended up keeping less money by the end (M = £8.76) than those interacting with male targets (M = £12.54; d = .82). In other words, the main finding of interest was that participants were willing to give up substantially more money to prevent women from receiving painful shocks than they were to help men. Interestingly, this was the case in spite of the facts that (a) the male target in the videos was rated more positively overall than the female target, and (b) in a follow-up study where participants provided emotional reactions to thinking about being a participant in the former study, the amount of reported aversion to letting the target suffer shocks was similar regardless of the target’s gender. As the authors conclude:

While it is equally emotionally aversive to hurt any individual—regardless of their gender—that society perceives harming women as more morally unacceptable, suggests that gender bias and harm considerations play a large role in shaping moral action.

So, even though people find harming others – or letting them suffer harm for a personal gain – to generally be an uncomfortable experience regardless of their gender, they are more willing to help/avoid harming women than they are men, sometimes by a rather substantial margin.

Now onto the fun part: explaining these findings. It doesn’t go nearly far enough as an explanation to note that “society condones harming men more than women,” as that just restates the finding; likewise, we only get so far by mentioning that people perceive men to have a higher pain tolerance than women (because they do), as that only pushes the question back a step to the matter of why men tolerate more pain than women. As for my thoughts, first, I think these findings highlight the importance of a modular understanding of psychological systems: our altruistic and moral systems are made up of a number of component pieces, each with a distinct function, and the piece that is calculating how much harm is generated is, it would seem, not the same piece deciding whether or not to do something about it. The obvious reason for this distinction is that alleviating harm to others isn’t always adaptive to the same extent: it does me more adaptive good to help kin relative to non-kin, friends relative to strangers, and allies relative to enemies, all else being equal. 

“Just stay out of it; he’s bigger than you”

Second, it might well be the case that helping men, on average, tends to pay off less than helping women. Part of the reason for that state of affairs is that female reproductive potential cannot be replaced quite as easily as male potential; male reproductive success is constrained by the number of available women much more than female potential is by male availability (as Chris Rock put it, “any money spent on dick is a bad investment”). As such, men might become particularly inclined to invest in alleviating women’s pain as a form of mating effort. The story clearly doesn’t end there, however, or else we would predict men being uniquely likely to benefit women, rather than both sexes doing similarly. This raises two additional possibilities to me: one of these is that, if men value women highly as a form of mating effort, that increased social value could also make women more valuable to other women in turn. To place that in a Game of Thrones example, if a powerful house values their own children highly, non-relatives may come to value those same children highly as well in the hopes of ingratiating themselves to – or avoiding the wrath of – the child’s family.

The other idea that comes to mind is that men are less willing to reciprocate aid that alleviated their pain because to do so would be an admission of a degree of weakness; a signal that they honestly needed the help (and might in the future as well), which could lower their relative status. If men are less willing to reciprocate aid, that would make men worse investments for both sexes, all else being equal; better to help out the person who would experience more gratitude for your assistance and repay you in turn. While these explanations might or might not adequately explain these preferential altruistic behaviors directed towards women, I feel they’re worthwhile starting points.

References: FeldmanHall, O., Dalgleish, T., Evans, D., Navrady, L., Tedeschi, E., & Mobbs, D. (2016). Moral chivalry: Gender and harm sensitivity predict costly altruism. Social Psychological & Personality Science, DOI: 10.1177/1948550616647448

Sexism, Testing, And “Academic Ability”

When I was teaching my undergraduate course on evolutionary psychology, my approach to testing and assessment was unique. You can read about that philosophy in more detail here, but the gist of my method was specifically avoiding multiple-choice formats in favor of short-essay questions with unlimited revision ability on the part of the students. I favored this exam format for a number of reasons, chief among which was that (a) I didn’t feel multiple choice tests were very good at assessing how well students understood the material (memorization and good guessing do not equal understanding), and (b) I didn’t really care about grading my students as much as I cared about getting them to learn the material. If they didn’t grasp it properly on their first try (and very few students do), I wanted them to have the ability and motivation to continue engaging with it until they did get it right (which most eventually did; the class average for each exam began around a 70 and rose to a 90). For the purposes of today’s discussion, the important point here is that my exams were a bit more cognitively challenging than is usual and, according to a new paper, that means I had unintentionally biased my exams in ways that disfavor “historically underserved groups” like women and the poor.

Oops…

What caught my eye about this particular paper, however, was the initial press release that accompanied it. Specifically, the authors were quoted as saying something I found, well, a bit queer:

“At first glance, one might assume the differences in exam performance are based on academic ability. However, we controlled for this in our study by including the students’ incoming grade point averages in our analysis,”

So the authors appear to believe that a gap in performance on academic tests arises independent of academic abilities (whatever those entail). This raised the immediate question in my mind of how one knows that abilities are the same unless one has a method of testing them. It seems a bit strange to say that abilities are the same on the basis of one set of tests (those that provided incoming GPAs), but then to continue to suggest that abilities are the same when a different set of tests provides a contrary result. In the interests of settling my curiosity, I tracked the paper down to see what was actually reported; after all, these little news blurbs frequently get the details wrong. Unfortunately, this one appeared to capture the authors’ views accurately.

So let’s start by briefly reviewing what the authors were looking at. The paper, by Wright et al (2016), is based on data collected from three years’ worth of three introductory biology courses spanning 26 different instructors, approximately 5,000 students, and 87 different exams. Without going into too much unnecessary detail, the tests were assessed by independent raters for how cognitively challenging they were and for their format, and the students were classified according to their gender and socio-economic status (SES; as measured by whether they qualified for a financial aid program). In order to attempt to control for academic ability, Wright et al (2016) also looked at the freshman-year GPA of the students coming into the biology classes (based on approximately 45 credits, we are told). Because the authors controlled for incoming GPA, they hope to persuade the reader of the following:

This implies that, by at least one measure, these students have equal academic ability, and if they have differential outcomes on exams, then factors other than ability are likely influencing their performance.

Now one could argue that there’s more to academic ability than is captured by a GPA – which is precisely why I will do so in a minute – but let’s continue on with what the authors found first.

Cognitively challenging tests were indeed, well, more challenging. A statistically-average male student, for instance, would be expected to do about 12% worse on the most challenging test in their sample, relative to the easiest one. This effect was not the same between genders, however. Again, using statistically-average men and women, when the tests were the least cognitively challenging, there was effectively no performance gap (about a 1.7% expected difference favoring men); however, when the tests were the most cognitively challenging, that expected gap rose to an astonishing expected…3.2% difference. So, while the gender difference just about nominally doubled, in terms of really mattering in any practical sense of the word, its size was such that it likely wouldn’t be noticed unless one was really looking for it. A similar pattern was discovered for SES: when the tests were easy, there was effectively no difference between those low or high in SES (1.3% favoring those higher); however, when the tests were about maximally challenging, this expected difference rose to about 3.5%.

Useful for both spotting statistical blips and burning insects

There’s a lot to say about these results and how they’re framed within the paper. First, as I mentioned, they truly are minor differences; there are very few cases where a 1-3% difference in test scores is going to make-or-break a student, so I don’t think there’s any real reason to be concerned or to adjust the tests; not practically, anyway.

However, there are larger, theoretical issues looming in the paper. One of these is that the authors use the phrase “controlled for academic ability” so often that a reader might actually come to believe that’s what they did from simple repetition. The problem here, of course, is that the authors did not control for that; they controlled for GPA. Unfortunately for Wright et al’s (2016) presentation, those two things are not synonyms. As I said before, it is strange to say that academic ability is the same because one set of tests (incoming GPA) says they are while another set does not. The former set of tests appear to be privileged for no sound reason. Because of that unwarranted interpretation, the authors lose (or rather, purposefully remove) the ability to talk about how these gaps might be due to some performance difference. This is a useful rhetorical move if one is interested in doing advocacy – as it implies the gap is unfair and ought to be fixed somehow – but not if one is seeking the truth of the matter.

Another rather large issue in the paper is that, as far as I could tell, the authors predicted they would find these effects without ever really providing an explanation as to how or why that prediction arose. That is, what drove their expectation that men would outperform women and the rich outperform the poor? This ends up being something of a problem because, at the end of the paper, the authors do float a few possible (untested) explanations for their findings. The first of these is stereotype threat: the idea that certain groups of people will do poorly on tests because of some negative stereotype about their performance. This is a poor fit for the data for two reasons: first, while Wright et al (2016) claim that stereotype threat is “well-documented”, it actually fails to replicate (on top of not making much theoretical sense). Second, even if it was a real thing, stereotype threat, as it is typically studied, requires that one’s sex be made salient prior to the test. As I encountered a total of zero tests during my entire college experience that made my gender salient, much less my SES, I can only assume that the tests in question didn’t do it either. In order for stereotype threat to work as an explanation, then, women and the poor would need to be under relatively constant stereotype threat. In turn, this would make documenting and studying stereotype threat in the first place rather difficult, as you could never have a condition where your subjects were not experiencing it. In short, then, stereotype threat seems like a bad fit.

The other explanations that are put forth for this gender difference are the possibility that women and poor students have more fixed views of intelligence instead of growth mindsets, so they withdraw from the material when challenged rather than improve (i.e., “we need to change their mindsets to close this daunting 2% gap”), or the possibility that the test questions themselves are written in ways that subtly bias people’s ability to think about them (the example the authors raise is that a question written about applying some concept to sports might favor men, relative to women, as men tend to enjoy sports more). Given that the authors did have access to the test questions, it seems that they could have examined that latter possibility in at least some detail (minimally, perhaps, by looking at whether tests written by female instructors resulted in different outcomes than those written by male ones, or by examining the content of the questions themselves to see if women did worse on gendered ones). Why they didn’t conduct such analyses, I can’t say.

Maybe it was too much work and they lacked a growth mindset

In summary, these very minor average differences that were uncovered could easily be chalked up – very simply – to GPA not being a full measure of a student’s academic ability. In fact, if the tests determining freshman GPA aren’t the most cognitively challenging (as one might well expect, given that students would have been taking mostly general introductory courses with large class sizes), then this might make the students appear to be more similar in ability than they actually were. The matter can be thought of using this stereotypically-male example (that will assuredly hinder women’s ability to think about it): imagine I tested people in a room with weights ranging from 1-15 pounds and asked them to curl each one time. This would give me a poor sense for any underlying differences in strength because the range of ability tested was restricted. Provided I were to ask them to do the same with weights ranging from 1-100 pounds the next week, I might conclude that it’s something about the weights – and not people’s abilities – when it came to figuring out why differences suddenly emerged (since I mistakenly believe I already controlled for their abilities the first time).

Now I don’t know if something like that is actually responsible, but if the tests determining freshman GPA were tapping the same kinds of abilities to the same degrees as those in the biology courses studied, then controlling for GPA should have taken care of that potential issue. Since controlling for GPA did not, I feel safe assuming there being some difference in the tests in terms of what abilities they’re measuring.
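To make the weight-room example concrete, here is a minimal sketch in Python. Every number in it is made up for illustration: the two groups, their "true" strengths, and the helper names (`heaviest_curl`, `measured_gap`) are my own assumptions and have nothing to do with the data in Wright et al (2016). The point is only that a test with a low ceiling can make groups of differing ability look identical.

```python
def heaviest_curl(strength_lbs, rack):
    """Heaviest weight on the rack that a person of the given strength can curl."""
    liftable = [w for w in rack if w <= strength_lbs]
    return max(liftable) if liftable else 0


def mean(values):
    return sum(values) / len(values)


def measured_gap(group_a, group_b, rack):
    """Difference in average measured performance between two groups on one rack."""
    score_a = mean([heaviest_curl(s, rack) for s in group_a])
    score_b = mean([heaviest_curl(s, rack) for s in group_b])
    return score_a - score_b


# Hypothetical true strengths in pounds (illustrative assumptions, not data).
group_a = [40, 55, 70, 85]
group_b = [25, 35, 45, 60]

light_rack = list(range(1, 16))   # weights from 1 to 15 pounds
full_rack = list(range(1, 101))   # weights from 1 to 100 pounds

print("Gap on the 1-15 lb rack: ", measured_gap(group_a, group_b, light_rack))
print("Gap on the 1-100 lb rack:", measured_gap(group_a, group_b, full_rack))
```

On the light rack everyone tops out at 15 pounds, so the measured gap is zero; on the full rack the underlying difference shows up. Someone who "controlled for ability" using only the light rack would be tempted to blame the heavier weights for the gap.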

References: Wright, C., Eddy, S., Wenderoth, M., Abshire, E., Blankenbiller, M., & Brownell, S. (2016). Cognitive difficulty and format of exams predicts gender and socioeconomic gaps in exam performance of students in introductory biology courses. Life Science Education, 15.

Psychology Research And Advocacy

I get the sense that many people get a degree in psychology because they’re looking to help others (since most clearly aren’t doing it for the pay). For those who get a degree in the clinical side of the field, this observation seems easy to make; at the very least, I don’t know of any counselors or therapists who seek to make their clients feel worse about the state their life is in and keep them there. For those who become involved in the research end of psychology, I believe this desire to help others is still a major motivator. Rather than trying to help specific clients, however, many psychological researchers are driven by a motivation to help particular groups in society: women, certain racial groups, the sexually promiscuous, the outliers, the politically liberal, or any group that the researcher believes to be unfairly marginalized, undervalued, or maligned. Their work is driven by a desire to show that the particular group in question has been misjudged by others, with those doing the misjudging being biased and, importantly, wrong. In other words, their role as a researcher is often driven by their role as an advocate, and the quality of their work and thinking can often take a back seat to their social goals.

When megaphones fail, try using research to make yourself louder

Two such examples are highlighted in a recent paper by Eagly (2016), both of which can broadly be considered to focus on the topic of diversity in the workplace. I want to summarize them quickly before turning to some of the other facets of the paper I find noteworthy. The first case concerns the prospect that having more women on corporate boards tends to increase their profitability, a point driven by a finding that Fortune 500 companies in the top quarter of female representation on boards of directors performed better than those in the bottom quarter of representation. Eagly (2016) rightly notes that such a basic data set would be all but unpublishable in academia for failing to do a lot of important things. Indeed, when more sophisticated research was considered in a meta-analysis of 140 studies, the gender diversity of the board of directors had about as close to no effect as possible on financial outcomes: the average correlations across all the studies ranged from about r = .01 all the way up to r = .05 depending on what measures were considered. Gender diversity per se seemed to have no meaningful effect despite a variety of advocacy sources claiming that increasing female representation would provide financial benefits. Rather than considering the full scope of the research, the advocates tended to cite only the most simplistic analyses that provided the conclusion they wanted (others) to hear.

The second area of research concerned how demographic diversity in work groups can affect performance. The general assumption that is often made about diversity is that it is a positive force for improving outcomes, given that a more cognitively-varied group of people can bring a greater number of skills and perspectives to bear on solving tasks than more homogeneous groups can. As it turns out, however, another meta-analysis of 146 studies concluded that demographic diversity (both in terms of gender and racial makeup) had effectively no impact on performance outcomes: the correlation for gender was r = -.01 and was r = -.05 for racial diversity. By contrast, differences in skill sets and knowledge had a positive, but still very small effect (r = .05). In summary, findings like these would suggest that groups don’t get better at solving problems just because they’re made up of enough [men/women/Blacks/Whites/Asians/etc]. Diversity in demographics per se, unsurprisingly, doesn’t help to magically solve complex problems.
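For a rough sense of how small those correlations are, squaring r gives the share of variance in the outcome that the predictor accounts for. The short Python sketch below does that arithmetic for the three values quoted above; the function name `variance_explained` and the labels are my own, and this is only a back-of-the-envelope reading of the reported correlations, not a reanalysis of the meta-analysis.

```python
def variance_explained(r):
    """Proportion of outcome variance accounted for by a correlation of size r."""
    return r ** 2


# Correlations as quoted in the text above.
reported = {
    "gender diversity": -0.01,
    "racial diversity": -0.05,
    "skill/knowledge diversity": 0.05,
}

for label, r in reported.items():
    print(f"{label}: r = {r:+.2f}, variance explained = {variance_explained(r):.2%}")
```

Even the largest of these, at |r| = .05, corresponds to a quarter of one percent of the variance in performance.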

While Eagly (2016) appears to generally be condemning the role of advocacy in research when it comes to getting things right (a laudable position), there were some passages in the paper that caught my eye. The first of these concerns what advocates for causes should do when the research, taken as a whole, doesn’t exactly agree with their preferred stance. In this case, Eagly (2016) focuses on the diversity research that did not show good evidence for diverse groups leading to positive outcomes. The first route one might take is to simply misrepresent the state of the research, which is obviously a bad idea. Instead, Eagly suggests advocates take one of two alternative routes: first, she recommends that researchers might conduct research into more specific conditions under which diversity (or whatever one’s preferred topic is) might be a good thing. This is an interesting suggestion to evaluate: on the one hand, people would often be inclined to say it’s a good idea; in some particular contexts diversity might be a good thing, even if it’s not always, or even generally, useful. This wouldn’t be the first time effects in psychology are found to be context-dependent. On the other hand, this suggestion also runs some serious risks of inflating type 1 errors. Specifically, if you keep slicing up data and looking at the issue in a number of different contexts, you will eventually uncover positive results even if they’re just due to chance. Repeated subgroup or subcontext analysis doesn’t sound much different from the questionable statistical practices currently being blamed for psychology’s replication problem: just keep conducting research and only report the parts of it that happened to work, or keep massaging the data until the right conclusion falls out.    
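The type 1 error concern can be put in numbers. As a simplified sketch, assume each context examined is an independent test of an effect that does not actually exist, run at the conventional .05 threshold (real subgroup analyses are rarely fully independent, so treat this as an idealized case rather than an exact figure). The function name `familywise_error_rate` is my own.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** num_tests


for k in (1, 5, 10, 20):
    rate = familywise_error_rate(k)
    print(f"{k:>2} contexts examined: {rate:.0%} chance of at least one spurious 'effect'")
```

Under those assumptions, looking in ten different contexts gives about a 40% chance of turning up at least one "significant" result by luck alone, and twenty contexts pushes that to roughly 64%.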

“…the rest goes in the dumpster out back”

Eagly’s second suggestion I find a bit more worrisome: arguing that relevant factors – like increases in profits, productivity, or finding better solutions – aren’t actually all that relevant when it comes to justifying why companies should increase diversity. What I find odd about this is that it seems to suggest that the advocates begin with their conclusion (in this case, that diversity in the work force ought to be increased) and then just keep looking for ways to justify it in spite of previous failures to do so. Again, while it is possible that there are benefits to diversity which aren’t yet being considered in the literature, bad research would likely result from a process where someone starts their analysis with the conclusion and keeps going until they justify it to others, no matter how often it requires shifting the goal posts. A major problematic implication with that suggestion mirrors other aspects of the questionable psychology research practices I mentioned before: when a researcher finds the conclusion they’re looking for, they stop looking. They only collect data up until the point it is useful, which rigs the system in favor of finding positive results where there are none. That could well mean, then, that there will be negative consequences to these diversity policies which are not being considered. 
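The "stop looking once you find it" problem can be illustrated with a small simulation. This is a sketch under assumptions of my own choosing: the data are pure noise (so any "effect" found is false), the test is a simple two-sided z-test with a known standard deviation of 1, and the researcher either tests once at 100 observations or peeks after every 10 and stops at the first p < .05. None of the names or settings come from Eagly (2016).

```python
import random
from statistics import NormalDist


def z_test_p_value(sample):
    """Two-sided p-value for 'mean equals zero', assuming a known SD of 1."""
    n = len(sample)
    z = (sum(sample) / n) * n ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def finds_effect(rng, peek_every, max_n, alpha=0.05):
    """Collect pure noise in batches; stop and declare success at the first p < alpha."""
    sample = []
    while len(sample) < max_n:
        sample.extend(rng.gauss(0, 1) for _ in range(peek_every))
        if z_test_p_value(sample) < alpha:
            return True
    return False


def false_positive_rate(peek_every, max_n=100, runs=2000, seed=1):
    """Share of simulated studies that report an effect when none exists."""
    rng = random.Random(seed)
    hits = sum(finds_effect(rng, peek_every, max_n) for _ in range(runs))
    return hits / runs


fixed = false_positive_rate(peek_every=100)   # a single test at n = 100
peeking = false_positive_rate(peek_every=10)  # test after every 10 observations

print(f"Test once at the end: {fixed:.1%} false positives")
print(f"Peek and stop early:  {peeking:.1%} false positives")
```

With a single planned test, the false positive rate should sit near the nominal 5%; with repeated peeking it comes out several times higher, even though nothing about the underlying data has changed.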

What I think is a good example of this justification problem leading to shoddy research practices/interpretation follows shortly thereafter. In talking about some of these alternative benefits that more female hires might have, Eagly (2016) notes that women tend to be more compassionate and egalitarian than men; as such, hiring more women should be expected to increase less-considered benefits, such as a reduction in the laying-off of employees during economic downturns (referred to as labor hoarding), or more favorable policies towards time off for family care. Now something like this should be expected: if you have different people making the decisions, different decisions will be made. Forgoing for the moment the question of whether those different policies are better, in some objective sense of the word, if one is interested in encouraging those outcomes (that is, they’re preferred by the advocate) then one might wish to address those issues directly, rather than by proxy. That is to say, if you are looking to make the leadership of some company more compassionate, then it makes sense to test for and hire more compassionate people, not to hire more women under the assumption you will be increasing compassion.

This is an important matter because people are not perfect statistical representations of the groups to which they belong. On average, women may be more compassionate than men; the type of woman who is interested in actively pursuing a CEO position in a Fortune 500 company might not be as compassionate as your average woman, however, and, in fact, might even be less compassionate than a particular male candidate. What Eagly (2016) has ended up reaching, then, is not a justification for hiring more women; it’s a justification for hiring compassionate or egalitarian people. What is conspicuously absent from this section is a call for more research to be conducted on contexts in which men might be more compassionate than women; once the conclusion that hiring women is a good thing has been justified (in the advocate’s mind, anyway), the concerns for more information seem to sputter out. It should go without saying, but such a course of action wouldn’t be expected to lead to the most accurate scientific understanding of our world.

The solution to that problem being more diversity, of course..

To place this point in another quick example, if you’re looking to assemble a group of tall people, it would be better to use people’s height when making that decision rather than their sex, even if men do tend to be taller than women. Some advocates might suggest that being male is a good enough proxy for height, so you should favor male candidates; others would suggest that you shouldn’t be trying to assemble a group of tall people in the first place, as short people offer benefits that tall ones don’t; others still will argue that it doesn’t matter if short people don’t offer benefits as they should be preferentially selected to combat negative attitudes towards the short regardless (at the expense of selecting tall candidates). For what it’s worth, I find the attitude of “keep doing research until you justify your predetermined conclusion” to be unproductive and indicative of why the relationship between advocates and researchers ought not be a close one. Advocacy can only serve as a cognitive constraint that decreases research quality as the goal of advocacy is decidedly not truth. Advocates should update their conclusions in light of the research; not vice versa.

References: Eagly, A. (2016). When passionate advocates meet research on diversity, does the honest broker stand a chance? Journal of Social Issues, 72, 199-222.

Men Are Better At Selling Things On eBay

When it comes to gender politics, never take the title of the piece at face value; or the conclusions for that matter.

In my last post, I mentioned how I find some phrases and topics act as red flags regarding the quality of research one is liable to encounter. Today, the topic is gender equality – specifically some perceived (and, indeed, some rather peculiar) discrimination against women – which is an area not renowned for its clear-thinking or reasonable conclusions. As usual, the news articles circulating this piece of research made some outlandish claim that lacks even remote face validity. In this case, the research in question concludes that people, collectively, try to figure out the gender of the people selling things on eBay so as to pay women substantially less than men for similar goods. Those who found such a conclusion agreeable to their personal biases spread it to others across social media as yet another example of how the world is an evil, unfair place. So here I am again, taking a couple recreational shots at some nonsense story of sexism.

Just two more of these posts and I get a free smoothie

The piece in question today is an article from Kricheli-Katz & Regev (2016) that examined data from about 1.1 million eBay auctions. The stated goals of the authors involve examining gender inequality in online product markets, so at least we can be sure they’re going into this without an agenda. Kricheli-Katz & Regev (2016) open their piece by talking about how gender inequality is a big problem, launching their discussion almost immediately with a rehashing of that misleading 20% pay gap statistic that’s been floating around forever. As that claim has been dissected so many times at this point, there’s not much more to say about it other than (a) when controlling for important factors, it drops to single digits and (b) when you see it, it’s time to buckle in for what will surely be an unpleasant ideological experience. Thankfully, the paper does not disappoint in that regard, promptly suggesting that women are discriminated against in online markets like eBay.

So let’s start by considering what the authors did, and what they found. First, Kricheli-Katz & Regev (2016) present us with their analysis of eBay data. They restricted their research to auctions only, where sellers will post an item and any subsequent interaction occurs between bidders alone, rather than between bidders and sellers. On average, they found that the women had about 10 fewer months of experience than men, though the accounts of both sexes had existed for over nine years, and women also had very-slightly better reputations, as measured by customer feedback. Women also tended to set slightly higher initial prices than men for their auctions, controlling for the product being sold. As such, women also tended to receive slightly fewer bids on their items, and ultimately less money per sale when the auctions ended.

However, when the interaction between sex and product type (new or used) was examined, the headline-grabbing result appeared: while women netted a mere 3% less on average for used products than men, they netted a more-impressive 20% less for new products (where, naturally, one expects products to be the same). Kricheli-Katz & Regev (2016) claim that the discrepancy in the new-product case is due to beliefs about gender. Whatever these unspecified beliefs are, they cause people to pay women about 20% less for the same item. Taking that idea at face value for a moment, why does that gap all but evaporate in the used category of sales? The authors attribute that lack of a real difference to an increased trust people have in women’s descriptions of the condition of their products. So men trust women more when it comes to used goods, but pay them less for new ones when trust is less relevant. Both these conclusions, as far as I can see from the paper, have been pulled directly out of thin air. There is literally no evidence presented to support them: no data; no citations; no anything.

I might have found the source of their interpretations

By this point, anyone familiar with how eBay works is likely a bit confused. After all, the sex of the seller is at no point readily apparent in almost any listings. Without that crucial piece of information, people would have a very difficult time discriminating on the basis of it. Never fear, though; Kricheli-Katz & Regev (2016) report the results of a second study where they pulled 100 random sellers from their sample and asked about 400 participants to try and determine the sex of the sellers in question. Each participant offered their guesses about five profiles, for a total of 2000 attempts. About 55% of the time, participants got the sex right, 9% of the time they got it wrong, and the remaining 36% of the time, they said they didn’t know (which, since they don’t know, also means they got it wrong). In short, people couldn’t determine the sex reliably about half the time. The authors do mention that the guesses got better as participants viewed more items that the seller had posted, however.

So here’s the story they’re trying to sell: When people log onto eBay, they seek out a product they’re looking to buy. When they find a seller listing the product, they examine the seller’s username, the listing in question, and their other listings in their store to attempt to discern the sex of the seller. Buyers subsequently lower their willingness to pay for an item by quite a bit if they see it is being sold by a woman, but only if it’s new. In fact, since women made 20% less, the actual reduction in willingness to pay must be larger than that, as sex can only be determined about half of the time reliably when people are trying. Buyers do all this despite even trusting female sellers more. Also, I do want to emphasize the word they, as this would need to be a pretty collective action. If it wasn’t a fairly universal response among buyers, the prices of female-sold items would eventually even out with the male price, as those who discriminated less against women would be drawn towards the cheaper prices and bump them back up.
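To put rough numbers on that dilution point, here is a back-of-the-envelope sketch in Python. It rests on a simplification of my own, not anything taken from the paper: the observed gap is treated as the share of buyers who identify the seller as a woman multiplied by the discount those buyers apply, ignoring how competing bidders actually set auction prices. The function name is hypothetical, and using the 55% hit rate from the second study as the share of identifying buyers is likewise an assumption.

```python
def implied_discount(observed_gap, share_identifying):
    """Discount identifying buyers would need to apply to produce observed_gap.

    Assumes the observed gap is a simple average: buyers who identify the
    seller as a woman pay (1 - discount) and everyone else pays full price.
    This is an illustrative simplification, not the paper's model.
    """
    if not 0 < share_identifying <= 1:
        raise ValueError("share_identifying must be in (0, 1]")
    return observed_gap / share_identifying


# 20% observed gap on new products, under two assumed identification rates:
# everyone identifies the seller's sex (1.0) versus the 55% hit rate (0.55).
for share in (1.0, 0.55):
    discount = implied_discount(0.20, share)
    print(f"share identifying = {share:.2f} -> implied discount = {discount:.1%}")
```

Under those assumptions, a 20% average gap with only 55% of buyers identifying the seller would require those buyers to knock roughly 36% off what they are willing to pay.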

Not only do I not buy this story – not even a little – but I wouldn’t pay the authors less for it because they happen to be women if I was looking to make a purchase. While people might be able to determine the sex of the seller on eBay sometimes, when they’re specifically asked to do so, that does not mean people engage in this sort of behavior naturally.

Finally, Kricheli-Katz & Regev (2016) report the results of a third study, asking 100 participants how much they value a $100 gift card being sold by either an Alison or a Brad. Sure enough, people were willing to pay Alison less for the card: she got a mere $83 to Brad’s $87; a 5% difference. I’d say someone should call the presses, but it looks like they already did, judging from the coverage this piece has received. Now this looks like discrimination – because it is – but I don’t think it’s based on sex per se. I say that because, earlier in the paper, Kricheli-Katz & Regev (2016) also report that women, as buyers on eBay, tended to pay about 3% more than men for comparable goods. To the extent that the $4 difference in valuation is meaningful here, there are two things to say about it. First, it may well represent the fact that women aren’t as willing to negotiate prices in their favor. Indeed, while women were 23% of the sellers on eBay, they only represented 16% of the auctions with a negotiation component. If that’s the case, people are likely willing to pay less to women because they perceive (correctly) some population differences in their ability to get a good deal. I suspect if you gave them individuating information about the seller’s abilities, sex would stop mattering even 5%. Second, that slight, 5% difference would by no means account for the 20% gap the authors report finding with respect to new product sales; not even close.

But maybe your next big idea will work out better…

Instead, my guess is that in spite of the authors’ use of the phrase “equally qualified” when referring to the men and women in their seller sample, there were some important differences in listings the buyers noticed; the type of differences that you can’t account for when you’re looking at over a million of them and rough control measures aren’t effective. Kricheli-Katz & Regev (2016) never seemed to consider – and I mean really consider – the possibility that something about these listings, something they didn’t control for, might have been driving sale price differences. While they do control for factors like the seller’s reputation, experience, number of pictures, year of the sale, and some of the sentiments expressed by words in the listing (how positive or negative it is), there’s more to making a good listing than that. A more likely story is that differences in sale prices reflect different behaviors on the part of male and female sellers (as we already know other differences exist in the sample), as the alternative story attempting to be championed would require a level of obsession with gender-based discrimination in the population so wide and deep that we wouldn’t need to research it; it would be plainly obvious to everyone already.

Then again, perhaps it’s time I make my way over to eBay to pick up a new tinfoil hat.

References: Kricheli-Katz, T. & Regev, T. (2016). How many cents on the dollar? Women and men in product markets. Science Advances, 2, DOI: 10.1126/sciadv.1500599

Thoughtful Suggestions For Communicating Sex Differences

Having spent quite a bit of time around the psychological literature – both academic and lay pieces alike – there are some words or phrases I can no longer read without an immediate, knee-jerk sense of skepticism arising in me, as if they taint everything that follows and precedes them. Included in this list are terms like bias, stereotype, discrimination, and, for the present purposes, fallacy. These words elicit such skepticism on my end because of the repeated failure of people using them to consistently produce high-quality work or convincing lines of reasoning. This is almost surely due to the perceived social stakes when such terms are being used: if you can make members of a particular group appear uniquely talented, victimized, or otherwise valuable, you can subsequently direct social support towards and away from various ends. When the goal of argumentation becomes persuasion, truth is not a necessary component and can be pushed aside. Importantly, the people engaged in such persuasive endeavors do not usually recognize they are treating information or arguments differently, contingent on how it suits their ends.

“Of course I’m being fair about this”

There are few areas of research that seem to engender as much conflict – philosophically and socially – as sex differences, and it is here those words appear regularly. As there are social reasons people might wish to emphasize or downplay sex differences, it has steadily become impossible for me to approach most of the writing I see on the topic with the assumption it is at least sort of unbiased. That’s not to say every paper is hopelessly mired in a particular worldview, rejecting all contrary data, mind you; just that I don’t expect them to reflect earnest examinations of the capital-T truth. Speaking of which, a new paper by Maney (2016) recently crossed my desk; a paper that concerns itself with how sex differences get reported and how they ought to be discussed. Maney (2016) appears to take a dim view of the research on sex differences in general and attempts to highlight some perceived fallacies of people’s understandings of them. Unfortunately, for someone trying to educate people about issues surrounding the sex difference literature, the paper does not come off as one written by someone possessing a uniquely deep knowledge of the topic.

The first fallacy Maney (2016) seeks to highlight is the idea that sexes form discrete groups. Her logic for explaining why this is not the case revolves around the idea that while the sexes do indeed differ to some degree on a number of traits, they also often overlap a great deal on them. Instead, Maney (2016) argues that we ought to not be asking whether the sexes differ on a given trait, but rather by how much they do. Indeed, she even puts the word ‘differences’ in quotes, suggesting that these ‘differences’ between sexes aren’t, in many cases, real. I like this brief section, as it highlights well why I have grown to distrust words like fallacy. Taking her points in reverse order, if one is interested in how much groups (in this case, sexes) differ, then one must have, at least implicitly, already answered the question as to whether or not they do. After all, if the sexes did not differ, it would be pointless to talk about the extent of those non-differences; there simply wouldn’t be variation. Second, I know of zero researchers whose primary interest resides in answering the question of whether the sexes differ to the exclusion of the extent of those differences. As far as I’m aware, Maney (2016) seems to be condemning a strange class of imaginary researchers who are content to find that a difference exists and then never look into it further or provide more details. Finally, I see little value in noting that the sexes often overlap a great deal when it comes to explaining the areas in which they do not. In much the same way, if you were interested in understanding the differences between humans and chimpanzees, you are unlikely to get very far by noting that we share a great deal of genes in common. Simply put, you can’t explain differences with similarities. If one’s goal is to minimize the perception of differences, though, this would be a helpful move.

The second fallacy that Maney (2016) seeks to tackle is the idea that the cause of a sex difference in behavior can be attributed to differing brain structures. Her argument on this front is that it is logically invalid to do the following: (1) note that some brain structure differs between men and women, (2) note that this brain structure is related to a given behavior on which they also differ, and so (3) conclude that a sex difference in brain structure between men and women is responsible for that different behavior. Now while this argument is true within the rules of formal logic, it is clear that differences in brain structure will result in differences in behavior; the only way that idea could be false would be if brain structure was not connected to behavior, and I don’t know of anyone crazy enough to try and make that argument. The researchers engaging in the fallacy thus might not get the specifics right all the time, but their underlying approach is fine: if a difference exists in behavior (between sexes, species, or individuals), there will exist some corresponding structural differences in the brain. The tools we have for studying the matter are a far cry from perfect, making inquiry difficult, but that’s a different issue. Relatedly, then, noting that some formal bit of logic is invalid is assuredly not the same thing as demonstrating that a conclusion is incorrect or the general approach misguided. (Also worth noting is that the above validity issue stops being a problem when conclusions are probabilistic, rather than definitive.)

“Sorry, but it’s not logical to conclude his muscles might determine his strength”

The third fallacy Maney (2016) addresses is the idea that sex differences in the brain must be preprogrammed or fixed, attempting to dispel the notion that sex differences are rooted in biology and thus impervious to experience. In short, she is arguing against the idea of hard genetic determinism. Oddly enough, I have never met a single genetic determinist in person; in fact, I’ve never even read an article that advanced such an argument (though maybe I’ve just been unusually lucky…). As every writer on the subject I have come across has emphasized – often in great detail – the interactive nature of genes and environments in determining the direction of development, it again seems like Maney (2016) is attacking philosophical enemies that are more imagined than real. She could have, for instance, quoted researchers who made claims along the lines of, “trait X is biologically-determined and impervious to environmental inputs during development”; instead, it looks like everyone she cites for this fallacy is making a similar criticism of others, rather than anyone making the claims being criticized (though I did not check those references myself, so I’m not 100% sure on that point). Curiously, Maney (2016) doesn’t seem to be at all concerned about the people who, more-or-less, disregard the role of genetics or biology in understanding human behavior; at the very least she doesn’t devote any portion of her paper to addressing that particular fallacy. That rather glaring omission – coupled with what she does present – could leave one with the impression that she isn’t really trying to present a balanced view of the issue.

With those ostensible fallacies out of the way, there are a few other claims worth mentioning in the paper. The first is that Maney (2016) seems to have a hard time reconciling the idea of sexual dimorphisms – traits that occur in one form typical of males and one typical of females – with the idea that the sexes overlap to varying degrees on many of them, such as height. While it’s true enough that you can’t tell someone’s sex for certain if you only know their height, that doesn’t mean you can’t make some good guesses that are liable to be right a lot more often than they’re wrong. Indeed, the only dimorphisms she mentions are the presence of sex chromosomes, external genitalia, and gonads and then continues to write as if these were of little to no consequence. Much like height, however, there couldn’t be selection for any physical sex differences if the sexes did not behave differently. Since behavior is controlled by the brain, physical differences between the sexes, like height and genitalia, are usually also indicative of some structural differences in the brain. This is the case whether the dimorphism is one of degree (like height) or kind (like chromosomes).

Returning to the main point, outside of these all-or-none traits, it is unclear what Maney (2016) would consider a genuine difference, much less any clear justification for that standard. For example, she notes some research that found a 90% overlap in interhemispheric connectivity between the male and female distributions, but then seems to imply that the corresponding 10% non-overlap does not reflect a ‘real’ sex difference. We would surely notice a 10% difference in other traits, like height, IQ, or number of fingers but, I suppose in the realm of the brain, 10% just doesn’t cut it.

Maney (2016) also seems to take an odd stance when it comes to explanations for these differences. In one instance, she writes about a study on multitasking that found a sex difference favoring men; a difference which, we are told, was explained by a ‘much larger difference in video game experience,’ rather than sex per se. Great, but what are we to make of that ‘much larger’ sex difference in video game experience? It would seem that that finding too requires an explanation, and one is not present. Perhaps video game experience is explained more by, I don’t know, competitiveness than sex, but then what are we to explain competitiveness with? These kinds of explanations usually end up going nowhere in a hurry unless they eventually land on some kind of adaptive endpoint, as once a trait’s reproductive value is explained, you don’t need to go any further. Unfortunately, Maney (2016) seems to oppose evolutionary explanations for sex differences, scolding those who propose ‘questionable’ functional or evolutionary explanations for sex differences for being genetic determinists who see no role for sociocultural influences. In her rush to condemn those genetic determinists (who, again, I have never met or read, apparently), Maney’s (2016) piece appears to fall victim to the warning laid out by Tinbergen (1963) several decades ago: rather than seeking to improve the shape and direction of evolutionary, functional analyses, Maney (2016) instead recommends that people simply avoid them altogether.

“Don’t ask people to think about these things; you’ll only hurt their unisex brains”

This is a real shame, as evolutionary theory is the only tool available for providing a deeper understanding of these sex differences (as well as our physical and psychological form more generally). Just as species will differ in morphology and behavior to the extent they have faced different adaptive problems, so too will the sexes within a species. By understanding the different challenges faced by the sexes historically, one can get a much clearer sense as to where psychological and physical differences will – and will not – be expected to exist, as well as why (this extra level of ‘why’ is important, as it allows you to better figure out where an analysis has gone wrong if the predictions don’t work). Maney (2016), it would seem, even missed a golden opportunity within her paper to explain to her readers that evolutionary explanations complement, rather than supplant, more proximate explanations when quoting an abstract that seemed to contrast the two. I suspect this opportunity was missed because she is either legitimately unaware of that point, or does not understand it (judging from the tone of her paper), believing (incorrectly) instead that evolutionary means genetic, and therefore immutable. If that is the case, it would be rather ironic for someone who does not seem to have much understanding of the evolutionary literature to be lecturing others on how it ought to be reported.

References: Maney, D. (2016). Perils and pitfalls of reporting sex differences. Philosophical Transactions B, 371, 1-11.

Tinbergen, N. (1963). On aims and methods of ethology. Zeitschrift für Tierpsychologie, 20, 410-433.

 

Stereotyping Stereotypes

I’ve attended a number of talks on stereotypes; I’ve read many more papers in which the word was used; I’ve seen still more instances where the term has been used outside of academic settings in discussions or articles. Though I have no data on hand, I would wager that the weight of this academic and non-academic literature leans heavily towards the idea that stereotypes are, by and large, inaccurate. In fact, I would go a bit further than that: the notion that stereotypes are inaccurate seems to be so common that people often see little need in ensuring any checks were put into place to test for their accuracy in the first place. Indeed, one of my major complaints about the talks on stereotypes I’ve attended is just that: speakers never mentioning the possibility that people’s beliefs about other groups happen to, on the whole, match up to reality fairly well in many cases (sometimes they have mentioned this point as an afterthought but, from what I’ve seen, that rarely translates into later going out and testing for accuracy). To use a non-controversial example, I expect that many people believe men are taller than women, on average, because men do, in fact, happen to be taller.

Pictured above: not a perceptual bias or an illusory correlation

This naturally raises the question of how accurate stereotypes – when defined as beliefs about social groups – tend to be. It should go without saying that there will not be a single answer to that question: accuracy is not an either/or type of matter. If I happen to think it’s about 75 degrees out when the temperature is actually 80, I’m more accurate in my belief than if the temperature was 90. Similarly, the degree of that accuracy should be expected to vary on the intended nature of the stereotype in question; a matter to which I’ll return later. That said, as I mentioned before, quite a bit of the exposure I’ve had to the subject of stereotypes suggests rather strongly and frequently that they’re inaccurate. Much of the writing about stereotypes I’ve encountered focuses on notions like “tearing them down”, “busting myths”, or about how people are unfairly discriminated against because of them; comparatively little of that work has focused on instances in which they’re accurate which, one would think, would represent the first step in attempting to understand them.

According to some research reviewed by Jussim et al (2009), however, that latter point is rather unfortunate, as stereotypes often seem to be quite accurate, at least by the standards set by other research in psychology. In order to test for the accuracy of stereotypes, Jussim et al (2009) report on some empirical studies that met two key criteria: first, the research had to compare people’s beliefs about a group to what that group was actually like; that much is a fairly basic requirement. Second, the research had to use an appropriate sample to determine what that group was actually like. For example, if someone was interested in people’s beliefs about some difference between men and women in general, but only tested these beliefs against data from a convenience sample (like men and women attending the local college), this could pose something of a problem to the extent that the convenience sample differs from the reference group of people holding the stereotypes. If people, by and large, have accurate stereotypes, researchers would never know if they make use of a non-representative reference group.

Within the realm of racial stereotypes, Jussim et al (2009) summarized the results of 4 papers that met these criteria. The majority of the results fell within what the authors consider the “accurate” range (as defined by being 0-10% off from the criteria values) or near-misses (those between 10-20% off). Indeed, the average correlations between the stereotypes and criteria measures ranged from .53 to .93, which are very high, relative to the average correlation uncovered by psychological research. Even the personal stereotypes, while not as high, were appreciably accurate, ranging from .36 to .69. Further, while people weren’t perfectly accurate in their beliefs, those who overestimated differences between racial groups tended to be balanced out by those who underestimated those differences in most instances. Interestingly enough, people’s stereotypes about group differences tended to be a bit more accurate than their within group stereotypes.

“Ha! Look at all that inaccurate shooting. Didn’t even come close”

The same procedure was used to review research on gender stereotypes as well, yielding 7 papers with larger sample sizes. A similar set of results emerged: the average stereotype was rather accurate, with correlations ranging from .34 to .98, most of which hovered in the range of .7. Individual stereotypes were again less accurate, but most were still heading in the right direction. To put those numbers in perspective, Jussim et al (2009) summarized a meta-analysis examining the average correlation found in psychological research. According to that data, only 24% of social psychology effects represent correlations larger than .3 and a mere 5% exceeded a correlation of .5; the corresponding numbers for averaged stereotypes were 100% of the reviewed work meeting the .3 threshold, and about 89% of the correlations exceeding the .5 threshold (personal stereotypes at 81% and 36%, respectively).

Now neither Jussim et al (2009) nor I would claim that all stereotypes are accurate (or at least reasonably close); no one I’m aware of has. This brings us to the matter of when we should expect stereotypes to be accurate and when we should expect them to fall shorter of that point. As an initial note, we should always expect some degree of inaccuracy in stereotypes – indeed, in all beliefs about the world – to the extent that gathering information takes time and improving accuracy is not always worth that investment in the adaptive sense. To use a non-biological example, spending an extra three hours studying to improve one’s grade on a test from a 70 to a 90 might seem worth it, but the same amount of time used to improve from a 90 to a 92 might not. Similarly, if one lacks access to reliable information about the behavior of others in the first place, stereotypes should also tend to be relatively inaccurate. For this reason, Jussim et al (2009) note that cross-cultural stereotypes in national personalities tend to be among the most inaccurate, as people from, say, India, might have relatively little exposure to information about people from South Africa, and vice versa.

The second point to make on accuracy is that, to the extent that beliefs guide behavior and that behavior carries costs or benefits, we should expect beliefs to tend towards accuracy (again, regardless of whether they’re about social groups or the world more generally). If you believe, incorrectly, that group A is as likely to assault you as group B (the example that Jussim et al (2009) use involves biker gang members and ballerinas), you’ll either end up avoiding one group more than you need to, not being wary enough around one, or miss in both directions, all of which involve social and physical costs. One of the only cases in which being wrong might reliably carry benefits is the context in which one’s inaccurate beliefs modify the behavior of other people. In other words, stereotypes can be expected to be inaccurate in the realm of persuasion. Jussim et al (2009) make nods toward this possibility, noting that political stereotypes are among the least accurate ones out there, and that certain stereotypes might have been crafted specifically with the intent of maligning a particular group.

For instance…

While I do suspect that some stereotypes exist specifically to malign a particular group, that possibility does raise another interesting question: namely, why would anyone, let alone large groups of people, be persuaded to accept inaccurate stereotypes? For the same reason that people should prefer accurate information over inaccurate information when guiding their own behaviors, they should also be relatively resistant to adopting stereotypes which are inaccurate, just as they should be when it comes to applying them to individuals when they don’t fit. To the extent that a stereotype is of this sort (inaccurate), then, we should expect that it not be widely held, except in a few particular contexts.

Indeed, Jussim et al (2009) also review evidence that suggests people do not inflexibly make use of stereotypes, preferring individuating information when it’s available: according to the meta-analyses reviewed, the average influence of stereotypes on judgments hangs around r = .1 (which does not, in many instances, have anything to say about the accuracy of the stereotype; just the extent of its effect); by contrast, individuating information had an average effect of about .7 which, again, is much larger than the average psychology effect. Once individuating information is controlled for, stereotypes tend to have next to zero impact on people’s judgments of others. People appear to rely on personal information to a much higher degree than stereotypes, and often jettison ill-fitting stereotypes in favor of personal information. In other words, the knowledge that men tend to be taller than women does not have much of an influence on whether I think a particular woman is taller than a particular man.

When should we expect that people will make the greatest use of stereotypes, then? Likely when they have access to the least amount of individuating information. This has been the case in a lot of the previous research on gender bias where very little information is provided about the target individual beyond their sex (see here for an example). In these cases, stereotypes represent an individual doing the best they can with limited information. In some cases, however, people express moral opposition to making use of that limited information, contingent on the group(s) it benefits or disadvantages. It is in such cases that, ironically, stereotypes might be stereotyped as inaccurate (or at least insufficiently accurate) to the greatest degree.

References: Jussim, L., Cain, T., Crawford, J., Harber, K., & Cohen, F. (2009). The unbearable accuracy of stereotypes. In Nelson, T. The Handbook of Prejudice, Stereotyping, and Discrimination (199-227). NY: Psychology Press.

Should We Expect Cross-Cultural Perceptual Errors?

There was a rather interesting paper that crossed my social media feeds recently concerning stereotypes about women in science fields; a topic about which I have been writing lately. I’m going to do something I don’t usually do and talk about it briefly despite having just read the abstract and discussion section. The paper, by Miller, Eagly, and Linn (2014), reported on people’s implicit gender stereotypes about science, which associated science more readily with men, relative to women. As it turns out, across a number of different cultures, people’s implicit stereotypes corresponded fairly well to the actual representation of men and women in those fields. In other words, people’s perceptions, or at least their responses, tended to be accurate: if more men were associated with science psychologically, it seemed to be because more men also happened to work in science fields. In general, this is how we should expect the mind to work. While our minds might imperfectly gather information about the world, they should do their best to be accurate. The reasons for this accuracy, I suspect, have a lot to do with being right resulting in useful modifications of behaviors.

   Being wrong about skateboarding skill, for instance, has some consequences

Whenever people propose psychological hypotheses that have to do with people being wrong, then, we should be a bit skeptical. A psychology designed in such a way so as to be wrong about the world consistently will, on the whole, tend to direct behavior in more maladaptive ways than a more accurate mind would. If one is positing that people are wrong about the world in some regard, it would require either that (a) there are no consequences for being wrong in that particular way or (b) there are some consequences, but the negative consequences are outweighed by the benefits. Most hypotheses for holding incorrect beliefs I have encountered tend towards the latter route, suggesting that some incorrect beliefs might outperform true beliefs in some fitness-relevant way(s).

One such hypothesis that I’ve written about before concerns error management theory. To recap, error management theory recognizes that some errors are costlier to make than others. To use an example in the context of the current paper I’m about to discuss, consider a case in which a man desires to have sex with a woman. The woman in question might or might not be interested in the prospect; the man might also perceive that she is interested or not interested. If the woman is interested and the man makes the mistake of thinking she isn’t, he has missed out on a potentially important opportunity to increase his reproductive output. On the other hand, if the woman isn’t interested and the man makes the mistake of thinking she is, he might waste some time and energy pursuing her unsuccessfully. These two mistakes do not carry equivalent costs: one could make the argument that a missed encounter is costlier on average, from a fitness standpoint, than an unsuccessful pursuit (depending, of course, on how much time and energy is invested in the pursuit).

Accordingly, it has been hypothesized that male psychology might be designed in such a way so as to over-perceive women’s sexual interest in them, minimizing the costs associated with making mistakes, multiplied by their frequency, rather than minimizing the number of mistakes one makes in total. While that sounds plausible at first glance, there is a rather important point worth bearing in mind when evaluating it: incorrect beliefs are not the only way to go about solving this problem. A man could believe, correctly, that a woman is not all that interested in him, but simply use a lower threshold for acceptable pursuits. Putting that into numbers, let’s say a woman has a 5% chance of having sex with the man in question: the man might not pursue any chance below 10%, and so could bias his belief upward to think he actually has a 10% chance; alternatively, he might believe she has about a 5% chance of having sex with him and decide to go after her anyway. It seems that the second route solves this problem more effectively, as a biased probability of success with a woman might have downstream effects on other pursuits.

Like on the important task of watching the road
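To make the comparison between those two routes a bit more concrete, here is a minimal sketch in Python. The 5% and 10% figures come from the example above; everything else (the function names, the payoff numbers, and the second "downstream" decision) is a hypothetical assumption chosen purely for illustration, not something taken from the error management literature.

```python
def pursues(believed_chance, threshold):
    """Pursue whenever the believed chance of success meets the threshold."""
    return believed_chance >= threshold


def expected_payoff(chance, benefit, cost):
    """Expected value of an investment that only pays off if the pursuit succeeds."""
    return chance * benefit - cost


TRUE_CHANCE = 0.05  # from the example above: a 5% chance of success

# Route 1: bias the belief upward to 10% and keep the 10% threshold.
route1_pursues = pursues(believed_chance=0.10, threshold=0.10)

# Route 2: keep the accurate 5% belief and lower the threshold to 5%.
route2_pursues = pursues(believed_chance=TRUE_CHANCE, threshold=0.05)

# Both routes produce the same pursuit decision.
print(route1_pursues, route2_pursues)

# Hypothetical downstream decision that reuses the same belief: an extra
# investment with an assumed benefit of 100 units and an assumed cost of 8.
BENEFIT, COST = 100, 8
perceived_by_biased_mind = expected_payoff(0.10, BENEFIT, COST)      # looks worthwhile
actual_payoff = expected_payoff(TRUE_CHANCE, BENEFIT, COST)          # is not worthwhile

print(perceived_by_biased_mind > 0, actual_payoff > 0)
```

Under these made-up payoffs, both routes lead to the pursuit, but only the biased belief makes the additional investment look worthwhile when it is actually a net loss. That is the sense in which a lower threshold with an accurate belief avoids the downstream costs.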

Now in that last post I mentioned, it seems that the evidence that men over-perceive women’s sexual interest might instead be better explained by the hypothesis that women are underreporting their intentions. After all, we have no data on the probability of a woman having sex with someone given that she did something like hold his hand or buy him a present, so concluding that men over-perceive requires assuming that women report accurately (the previous evidence would also require that pretty much everyone else but the woman is wrong about her behavior, male or female). Some new evidence puts the hypothesis of male over-perception into even hotter water. A recent paper by Perilloux et al (2015) sought to test this over-perception bias cross-culturally, as most of the data bearing on it happens to have been derived from American samples. If men possess some adaptation designed for over-perception of sexual interest, we should expect to see it cross-culturally; it ought to be a human universal (as I’ve noted before, this doesn’t mean we should expect invariance in its expression, but we should at least find its presence).

Perilloux et al (2015) collected data from participants in Spain, Chile, and France, representing a total sample size of approximately 400 subjects. Men and women were given a list of 15 behaviors. They were asked to imagine they had been out on a few dates with a member of the opposite sex, and then to estimate how likely that person would be to have sex with them, given that this opposite-sex individual engaged in those behaviors (from -3 being “extremely unlikely” to 3 being “extremely likely”). The results showed an overall sex difference in each country, with men tending to perceive more sexual interest than women. While this might appear to support the idea that over-perception is a universal feature of male psychology, a closer examination of the data casts some doubt on that idea.

In the US sample, men perceived more sexual interest than women in 12 of the 15 items; in Spain, that number was 5, in Chile it was 2, and in France it was 1. It seemed that the question concerning whether someone bought jewelry was enough to drive this sex difference in both the French and Chilean samples. Rather than men over-perceiving women’s reported interests in general across a wide range of behaviors, it seemed that the cross-cultural sample’s differences were being driven by only a few behaviors; behaviors which are, apparently, also rather atypical for relationships in those countries (inasmuch as women don’t usually buy men jewelry). As for why there’s a greater correspondence between French and Chilean men and women’s reported likelihoods, I can’t say. However, that men from France and Chile seem to be rather accurate in their perceptions of female sexual intent would cast doubt on the idea that male psychology contains some mechanisms for sexual over-perception.

I’ll bet US men still lead in shooting accuracy, though

This paper helps make two very good points that, at first, might seem like they oppose each other, despite their complementary nature. The first point is the obvious importance of cross-cultural research; one cannot simply take it for granted that a given effect will appear in other cultures. Many sex differences – like height and willingness to engage in casual sex – do, but some will not. The second point, however, is that hypotheses about function can be developed and even tested (albeit incompletely) in the absence of data about their universality. Hypotheses about function are distinct from hypotheses about proximate form or development, though these different levels of analysis can often be used to inform others. Indeed, that’s what happened in the current paper, with Perilloux et al (2015) drawing the implicit hypothesis about universality from the hypothesis about ultimate functioning, using data about the former to inform their posterior beliefs about the latter. While different levels of analysis inform each other, they are nonetheless distinct, and that’s always worth repeating.

References: Perilloux, C., Munoz-Reyes, J., Turiegano, E., Kurzban, R., & Pita, M. (2015). Do (non-American) men overestimate women’s sexual intentions? Evolutionary Psychological Science, DOI 10.1007/s40806-015-0017-5

Miller, D., Eagly, A., & Linn, M. (2014). Women’s representation in science predicts national gender-science stereotypes: Evidence from 66 nations. Journal of Educational Psychology, http://dx.doi.org/10.1037/edu0000005

I Reject Your Fantasy And Substitute My Own

I don’t think it’s a stretch to make the following generalization: people want to feel good about themselves. Unfortunately for all of us, our value to other people tends to be based on what we offer them and, since our happiness as a social species tends to be tethered to how valuable we are perceived to be by others, being happy can be more of a chore than we would prefer. These valuable things need not be material; we could offer things like friendship or physical attractiveness, pretty much anything that helps fill a preference or need others have. Adding to the list of misfortunes we must suffer in the pursuit of happiness, other people in the world also offer valuable things to the people we hope to impress. This means that, in order to be valuable to others, we need to be particularly good at offering things to other people: either through being better at providing something than many people are, or through being able to provide something relatively unique that others typically don’t. If we cannot match the contributions of others, then people will not like to spend time with us and we will become sad; a terrible fate indeed. One way to avoid that undesirable outcome, then, is to increase your level of competition to become more valuable to other people; make yourself into the type of person others find valuable. Another popular route, which is compatible with the first, is to condemn other people who are successful or promote the images of successful people. If there’s less competition around, then our relative ability becomes more valuable. On that note, Barbie is back in the news again.

“Finally; a new doll for my old one to tease for not meeting her standards!”

The Lammily doll has been making the rounds on various social media sites, marketed as the average Barbie, with the tag line: “average is beautiful”. Lammily is supposed to be proportioned so as to represent the average body of a 19-year-old woman. She also comes complete with stickers for young girls to attach to her body in order to give her acne, scars, cellulite, and stretch marks. The idea here seems to be that if young girls see a more average-looking doll, they will compare themselves less negatively to it and, hopefully, end up feeling better about their body. Future incarnations of the doll are hoped to include diverse body types, races, and I presume other features upon which people vary (just in case the average doll ends up being too alienating or high-achieving, I think). If this doll is preferred by girls to Barbie, then by all means I’m not going to tell them they shouldn’t enjoy it. I certainly don’t discourage the making of this doll or others like it. I just get the sense that the doll will end up primarily making parents feel better by giving them the sense they’re accomplishing something they aren’t, rather than affecting their children’s perceptions.

As an initial note, I will say that I find it rather strange that the creator of the doll stated: “By making a doll real I feel attention is taken away from the body and to what the doll actually does.” The reason I find that strange is because the doll does not, as far as I can see, come with a number of different accessories that make it do different things. In fact, if Lammily does anything, I’m not sure what that anything is, as it’s never mentioned. The only accessories I see are the aforementioned stickers to make her look different. Indeed, the whole marketing of the doll focuses on how it looks; not what it does. For a doll ostensibly attempting to take attention away from the body, its body seems to be its only selling point.

The main idea, rather, as far as I can tell, is to try and remove the possible intrasexual competition over appearance that women might feel when confronted with a skinny, attractive, makeup-clad figure. So, by making the doll less attractive with scar stickers, girls will feel less competition to look better. There are a number of facets of the marketing of the doll that would support this interpretation: one such point is the tag line. Saying that “average is beautiful” is, from a statistical standpoint, kind of strange; it’s a bit like saying “average is tall” or “average is smart”. These descriptors are all relative terms – typically ones that apply to upper-ends of some distribution – so applying them to more people would imply that people don’t differ as much on the trait in question. The second point to make about the tag line is that I’m fairly certain, if you asked him, the creator of the Lammily doll – Nickolay Lamm – would not tell you he meant to imply that women who are above or below some average are not beautiful; instead, you’d probably get some sentiment to the effect that everyone is attractive and unique in their own special way, further obscuring the usefulness of the label. Finally, if the idea is to “take attention away from the body”, then selling the doll under the label of its natural beauty is kind of strange.

So does Barbie have a lot to answer for culturally, and is Lammily that answer? Let’s consider some evidence examining whether Barbie dolls are actually doing harm to young girls in the first place and, if they are, whether that harm might be mitigated via the introduction of more-proportionate figures.

“If only she wasn’t as thin, this never would have happened”

One 2006 paper (Dittmar, Halliwell, & Ive, 2006) concludes that the answer is “yes” to both those questions, though I have my doubts. In their paper, the researchers exposed 162 girls between the ages of 5 and 8 to one of three picture books. These books contained a few images of Barbie (who would be a US dress size 2) or Emme (a size 16) dolls engaged in some clothing shopping; there was also a control book that did not draw attention to bodies. The girls were then asked questions about how they looked, how they wanted to look, and how they hoped to look when they grew up. After 15 minutes of exposure to these books, there were some changes in these girls’ apparent satisfaction with their bodies. In general, the girls exposed to the Barbies tended to want to be thinner than those exposed to the Emme dolls. By contrast, those exposed to Emme didn’t want to be thinner than those exposed to no body images at all. In order to get a sense for what was going on, however, those effects require some qualifications.

For starters, when measuring the difference between one’s perception of her current body and her current ideal body, exposure to Barbie only made the younger children want to be thinner. This includes the girls in the 5 – 7.5 age range, but not the girls in the 7.5 – 8.5 range. Further, when examining what the girls’ ideal adult bodies would be, Barbie had no effect on the youngest girls (5 – 6.5) or the oldest ones (7.5 – 8.5). In fact, for the older girls, exposure to the Emme doll seemed to make them want to be thinner as adults (the authors suggesting this to be the case as Emme might represent a real, potential outcome the girls are seeking to avoid). So these effects are kind of all over the place, and it is worth noting that they, like many effects in psychology, are modest in size. Barbie exposure, for instance, reduced the girls’ “body esteem” (a summed measure of six questions about how the girls felt about their bodies that each got a 1 to 3 response, with 1 being bad, 2 neutral, and 3 being good) from a mean of 14.96 in the control condition to 14.45. To put that in perspective, exposure to Barbie led to girls, on average, moving one response out of six half a point on a small scale, compared to the control group.
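For anyone who wants to check that arithmetic, here is a short sketch in Python. The two means and the structure of the scale (six items, each scored 1 to 3) come from the description above; the variable names are just illustrative, and expressing the difference as a share of the scale's range is only one rough way to gauge size. It is not a standardized effect size, which would require standard deviations that are not reported here.

```python
N_ITEMS = 6                   # six body-esteem questions
ITEM_MIN, ITEM_MAX = 1, 3     # 1 = bad, 2 = neutral, 3 = good

CONTROL_MEAN = 14.96          # mean summed score, control condition
BARBIE_MEAN = 14.45           # mean summed score, Barbie condition

# Lowest and highest possible summed scores.
scale_min = N_ITEMS * ITEM_MIN
scale_max = N_ITEMS * ITEM_MAX
scale_range = scale_max - scale_min

# Raw difference between conditions, and that difference as a share of the range.
difference = CONTROL_MEAN - BARBIE_MEAN
share_of_range = difference / scale_range

print(f"Possible scores: {scale_min} to {scale_max}")
print(f"Difference between conditions: {difference:.2f} points")
print(f"Share of the possible range: {share_of_range:.1%}")
```

On this scale, the summed score can run from 6 to 18, so a difference of about half a point amounts to a little over 4% of the possible range.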

Taking these effects at face value, though, my larger concerns with the paper involve a number of things it does not do. First, it doesn’t show that these effects are Barbie-specific. By that I don’t mean that they didn’t compare Barbie against another doll – they did – but rather that they didn’t compare Barbie against, say, attractive (or thin) adult human women. The authors credit Barbie with some kind of iconic status that is likely playing an important role in determining girls’ later ideals of beauty (as opposed to Barbie temporarily, but not lastingly, modifying their satisfaction), but they don’t demonstrate it. On that point, it’s important to note what the authors are suggesting about Barbie’s effects: that Barbies lead to lasting changes in perceptions and ideals, and that the older girls weren’t being affected by exposures to Barbies because they have already “…internalized [a thin body ideal] as part of their developing self-concept” by that point.

At least you got all that self-deprecation out of the way early

An interesting idea, to be sure. However, it should make the following prediction: adult women exposed to thin or attractive members of the same sex shouldn’t have their body satisfaction affected, as they have already “internalized a thin ideal”. Yet this is not what one of the meta-analysis papers cited by the authors themselves finds (Groesz, Levine, & Murnen, 2002). Instead, adult women faced with thin models feel less satisfied with their bodies relative to when they view average or above-average weight models. This is inconsistent with the idea that some thin beauty standard has been internalized by age 8. Both sets of data, however, are consistent with the idea that exposure to an attractive competitor might reduce body satisfaction temporarily, as the competitor will be perceived to be more attractive by other people. In much the same way, I might feel bad about my skill at playing music when I see someone much better at the task than I am. I would be dissatisfied because, as I mentioned initially, my value to others depends on who else happens to offer what I do: if they’re better at it, my relative value decreases. A little dissatisfaction, then, either pushes me to improve my skill or to find a new domain in which I can compete more effectively. The disappointment might be painful to experience, but it is useful for guiding behavior. If the older girls just stopped viewing Barbie as competition, perhaps, because they have moved onto new stages in their development, this would explain why Barbie had no effect on them as well. The older girls might simply have grown out of competing with Barbie.

Another issue with the paper is that the experiment used line drawings of body shapes, rather than pictures of actual human bodies, to determine which body girls think they have and which body they want, both now and in the future. This could be an issue, as previous research (Tovee & Cornelissen, 2001) failed to replicate the “girls want to be skinnier than men would prefer” effects – which were found using line drawings – when using actual pictures of human bodies. One potential reason for that difference in findings is that a number of features besides thinness might unintentionally co-vary in these line drawings. So some of the desire to be skinny that the girls were expressing in the 2006 experiment might have just been an artifact of the stimulus materials being used.

Additionally, Dittmar, Halliwell, & Ive (2006), somewhat confusingly, didn’t ask the girls about whether or not they owned Barbies or how much exposure they had to them (though they do note that it probably would have been a useful bit of information to have). There are a number of predictions we might make about such a variable. For instance, girls exposed to Barbie more often should be expected to have a greater desire for thinness, if the authors’ account is true. Further still, we might also predict that, among girls who have lots of experience with Barbies, a temporary exposure to pictures of Barbie shouldn’t be expected to affect their perception of their ideal body much, if at all. After all, if they’re constantly around the doll, they should have, as the authors put it, already “…internalized [a thin body ideal] as part of their developing self-concept”, meaning that additional exposure might be redundant (as it was with the older girls). Since there’s no data on the matter, I can’t say much more about it.

A match made in unrealistic heaven.

So would a parent have a lasting impact on their daughter’s perception of beauty by buying her a Barbie? Probably not. The current research doesn’t demonstrate any particularly unique, important, or lasting role for Barbie in the development of children’s feelings about their bodies (though it does assume them). You probably won’t do any damage to your child by buying them an Emme or a Lammily either. It is unlikely that these dolls are the ones socializing children and building their expectations of the world; that’s a job larger than one doll could ever hope to accomplish. It’s more probable that features of these dolls reflect (in some cases exaggerated) aspects of our psychology concerning what is attractive, rather than creating them.

A point of greater interest I wanted to end with, though, is why people felt that the problem which needed to be addressed when it came to Barbie was that she was disproportionate. What I have in mind is that Barbie has a long history of prestigious careers; over 150 of them, most of them decidedly above-average. If you want a doll that focuses on what the character does, Barbie seems to be doing fine in that regard. If we want Barbie to be an average girl, sure, she won’t be as thin, but then chances are that she doesn’t even have her Bachelor’s degree either, which would preclude her from a number of the professions she has held. She’s also unlikely to be a world class athlete or performer. Now, yes, it is possible for people to hold those professions while it is impossible for anyone to be proportioned as Barbie is, but it’s certainly not the average. Why is the concern over what Barbie looks like, rather than what unrealistic career expectations she generates? My speculation is that the focus arises because, in the real world, women compete with each other more over their looks than their careers in the mating market, but I don’t have time to expand on that much more here.

It just seems peculiar to focus on one particular non-average facet of reality obsessively only to state that it doesn’t matter. If the debate over Barbie can teach us anything, it’s that physical appearance does matter; quite a bit, in fact. To try and teach people – girls or boys – otherwise might help them avoid some temporary discomfort (“Looks don’t matter; hooray!”), but it won’t give them an accurate impression of how the wider world will react to them (“Yeah, about that whole looks thing…”); a rather dangerous consequence, if you ask me.

References: Dittmar, H., Halliwell, E., & Ive, S. (2006). Does Barbie make girls want to be thin? The effect of experimental exposure to images of dolls on the body image of 5- to 8-year-old girls. Developmental Psychology, 42, 283-292.

Groesz, L., Levine, M., & Murnen, S. (2002). The effect of experimental presentation of thin media images on body satisfaction: A metaanalytic review. International Journal of Eating Disorders, 31, 1–16.

Tovee, M. & Cornelissen, P. (2001). Female and male perceptions of physical attractiveness in front-view and profile. British Journal of Psychology, 92, 391-402.

Keeping It Topical: That Catcalling Video

Viral fame is an interesting thing. It can come out of nowhere and disappear just as quickly; not unlike a firework. It can also be rather difficult to predict, due to the fact that eventual popularity can often be determined largely by preexisting popularity. This week, one such story that appears to have been caught up in a popularity spiral has been the subject of catcalling: specifically, a video of a woman documenting around 100 instances of unsolicited comments she accumulated while wandering the streets of New York City for 10 hours (which is about one such comment every 6 minutes). At the time of writing, the video has around 33 million views, not counting the various clone videos (which is around 6 million such views a day, making for such pleasant numerical symmetry). Unsurprisingly, there’s been a lot of talk about the video; a pile which I’m about to add to. Perhaps the most common conversations have been had concerning whether it’s appropriate to call some of the unsolicited comments the woman received “harassment” (for example, “Have a nice evening”, said in passing, or the various comments suggesting she is “beautiful”).

  Can’t a girl be dating a guy for two years and not get bombarded with harassing proposals?

On that front, there are some natural barriers in perspective that might make consensus hard to reach, owing to what these propositions are thought to represent: solicitations for casual sex. Men, for instance, would likely find such solicitations or comments generally pleasant when receiving them from women, whereas women tend to have precisely the opposite reaction (Clark & Hatfield, 1989). Given the perceptual flavor that such comments often have, men might tend to see them as less of a big deal than women (though sex is hardly all there is to it; such an effect would also be influenced by one’s mating strategy – whether they prefer long- or short-term sexual relationships – as well as other such interacting variables). A second barrier to consensus on the matter is the concentrated nature of such comments: whereas the woman in the video might have received over 100 comments that she views as annoying, they are also coming from over 100 different men. If individual comments aren’t viewed as a problem, but an aggregate of them is (kind of like pollution), discussions over whether they should be condemned might hit some snags in attempting to reach agreement.

A second discussion that has been had about the video concerns the racial component. In the viral video, the majority of the men on the street making these comments are non-white. Subsequent analysis of the video led to the conclusion that around 60% of the comments in question were received on a single street in Harlem. Whether this location was specifically selected in order to solicit more comments, whether certain comments from other people in other areas were edited out, or whether the comments were simply received primarily from the people in that area are unknown, but it does leave a lot to be desired in terms of research methods. It’s important to bear in mind that this video was not a research project for the sake of gathering new information: it was a video designed to go viral that ends with a donation link. Any video which failed to generate appropriate reactions from people on the street would be unlikely to be used, as I can’t imagine video of someone walking around the street without incident encourages people to empty their wallets effectively.

In the interests of furthering that discussion, it’s also worth considering a reported cross-cultural replication attempt of this study. Psychological research has often been criticized for relying on WEIRD samples, and reliance on a single person (with an agenda) from largely a single street should not be taken to be representative of people’s experiences more generally (either in that city or abroad). So, when a woman in New Zealand apparently tried the same thing – wandering the streets of a city for, I presume, 10 hours – it’s worth noting that the video reports her receiving a total of two comments, one of which was a man asking for directions. Assuming the walking time was the same, that’s the difference between a comment every 6 minutes and a comment, with different content, every 300. As seems to be the case in psychology research, flashy, attention-grabbing results don’t always replicate, leaving one wondering what caused the initial set of results to be generated in the first place. Statistical variance? Experimental demand characteristics? Improper sampling?

  Divine intervention, perhaps?
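The comparison between the two videos is simple enough to verify directly. The sketch below, in Python, uses the counts reported above (around 100 comments and 2 comments); the 10-hour walking time for the New Zealand video is the same presumption made in the text rather than a confirmed figure, and the function name is merely illustrative.

```python
def minutes_per_comment(hours_walked, comments):
    """Average number of minutes that pass between comments."""
    return hours_walked * 60 / comments


# New York video: around 100 comments over 10 hours.
new_york = minutes_per_comment(hours_walked=10, comments=100)

# New Zealand video: 2 comments; the 10 hours here is presumed, not confirmed.
new_zealand = minutes_per_comment(hours_walked=10, comments=2)

print(new_york)                # 6.0 minutes per comment
print(new_zealand)             # 300.0 minutes per comment
print(new_zealand / new_york)  # 50.0 times longer between comments
```

If the presumed walking time holds, that works out to a fifty-fold difference in the rate of comments between the two videos.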

It’s difficult to say precisely what caused the difference in men’s behavior between videos, as well as why most of the comments were made in one specific area in the first one. The default answer most people would likely fall back on would, I imagine, be “cultural differences”, but that answer is sufficiently vague to not actually be one. This is the part where I need to be disappointing and say that I don’t actually have an answer to the questions. However, I would like to begin some speculation as to the psychology underlying the sending of these unsolicited comments and, from there, we might be able to figure out some variables which are doing some of the proverbial lifting here.

One possibility is that these comments are used by men specifically to intimidate women, or make them feel otherwise uncomfortable and unwelcome. As some might suggest, these comments are just an extension of a male culture that hates women because they’re women and will take about every chance it gets to ruin their day (variants of this hypothesis abound). I find such an explanation implausible for a number of reasons, chief among which is that calling someone beautiful is unlikely to be the most effective way of expressing contempt for them. When black people in America were marching for civil rights, they were not met with protesters telling them to “have a great day” or admiring their bodies with a suggestive “damn”. Such an explanation likely mistakes an outcome of an event for its motivating cause: because some women feel uncomfortable with these comments, some people think the comments are made in order to make women uncomfortable. This conclusion is likely the result of people wishing to condemn such comments and, in order to do so, they paint the perpetrators in the worst possible light.

However, it’s worth noting that, as far as I can tell, there are some women who express either (a) flattery at these comments or (b) sadness that they are not the targets of such comments, taking the lack of comments to say something negative about their attractiveness (which might not be inaccurate). While such sentiments may or may not be in the minority (I have no formal data speaking to the issue), they paint a much different picture of the matter. Typically, people experiencing violence, oppression, and/or hatred, do not, I think, need to be assured that they aren’t actually being complimented; the two are quite easy to tell apart most of the time. In fact, in the original video, at least one of the men is explicit about the notion that he is complimenting the girl (though admittedly he does go about it in a less than desirable fashion), while another man asks whether the reason the woman isn’t talking to him is that he’s ugly. If, as these anecdotes might suggest, catcalling is tied to factors like whether she is attractive or unattractive, or the response to it tied to the man’s desirability, it would be difficult to tie these factors in with misogyny or intentional harassment more generally.

“Why does my friend always get the harassment? Is it my hair?”

There is, of course, also the other end of this issue: men getting catcalled. While, again, I have no data on the issue, the misogyny explanation would be hard to reconcile with gay men or women making such comments towards men (even if such comments are likely less common owing to the historical costs and benefits of short-term sexual encounters for each sex). The simpler explanation would seem to be that such unsolicited comments, while not necessarily desired by the recipient, are earnest – if clumsy – attempts to start conversations or lead to a sexual encounter. Given that similar comments tend to be made in first messages on dating websites, this alternative seems reasonable (women who complain about receiving too many one-line messages online should see the parallels immediately). The problem with such attempts is unlikely to be with any particular one being deplorable so much as it is their sheer volume.

Now it is quite unlikely that these comments ever yield successful encounters, as I mentioned above. This could be one reason they are often considered to be something other than friendly or sexual in nature (i.e., “since this behavior rarely results in sex, it can’t be about getting sex”; the same kind of error I mentioned earlier). The rarity of sexual encounters resulting from them is also likely why the proportion of men making them is really very low even though they’re rather cheap – in terms of time and energy – to make. While 100 comments in 10 hours might seem like a lot, one also needs to consider how many men the woman in question passed in that time, in one of the largest cities in the world, who said nothing. For every comment there were likely several dozen (or hundred) men who made no attempt to talk to the actress. Any explanation for these comments, then, would need to pinpoint some differences between those who do and do not make them; general aspersions against an entire gender or culture won’t do when it comes to predictive accuracy. For what it’s worth, I think a healthy portion of that variance will be accounted for by one’s sexual strategy, one’s current relationship status, the attractiveness of the person in question, and whether the target is sending any signals correlated with sexual receptivity.

What predictions can be drawn from alternative perspectives I leave up to you.

He’s Got Your Eyes…Right?

Last post, I was discussing paternal investment in children. The point of that post was to draw attention to the fact that there are often rather good biological reasons for why we might expect men and women to be differentially interested in investing their time and energy in raising children versus doing other adaptive things. This is not to say that we shouldn’t expect men to be interested in investing in children, of course; just that we shouldn’t expect such things to be indiscriminate or motivated by the same factors as women’s altruism. I wanted to expand on one of those ideas a bit more today: specifically, the idea that men lack assurance in their paternity when fertilization takes place inside the female, whereas women can be 100% certain the child they give birth to is theirs. “Certain”, in the latter context, refers to the notion that women were unlikely to have needed a solution to the adaptive problem of maternity certainty, as giving birth to a child was an honest and reliable signal that the child was related to the mother genetically.

“Little does she know it’s not her child” – No one, ever

The first study I wanted to draw attention to concerns the resemblance of a child to their parents. Naturally, as children inherit half of their genes from each parent, we should expect that children tend to resemble their parents with respect to a number of external and internal features. That much is pretty noncontroversial. However, since fathers cannot be assured of their paternity, we might expect men to attend to certain similarities between them and their would-be children when calculating the likelihood that a given child is actually theirs. If an Asian woman gives birth to a Black child, her Asian husband can likely be fairly assured that the child in question is not, in fact, his, and there might have been some infidelity involved somewhere along the line. One might – and, indeed, some have – make the corresponding argument that we might expect children to physically resemble their father more than their mother. The logic would go roughly as follows: if the child resembles their father more, they might receive more parental investment from the father, as he can be more certain the child is his. So, since that tends to be a good thing for the survival and reproductive prospects of the child, we might expect children to bias their resemblance towards their fathers (forgoing for the moment the precise mechanisms through which that might be achieved).

There’s a major issue that such expectations run into, however: reality. In a 1999 paper, Berdart & French collected photos of parents and their children from 28 families. The children’s photographs were taken when they were approximately 1, 3, and 5 years old; the parents’ pictures were all from when the child was about 1 year old. During each trial, the participants (180 undergraduates) were presented with a single picture of a child alongside three women or three men and asked to try and identify the child’s parent. This process was repeated 28 times for each subject. If children tend to resemble their fathers more than their mothers, we should also expect people to be better at matching the child to their father than to their mother. This effect failed to materialize, however: for one-year-old children, the average correct matching to the father (13/28) was not different from correct matching to the mother (12/28); similar findings obtained for the three-year-olds (13/28 and 13/28, respectively) and five-year-olds (14/28 and 14/28, respectively).
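For readers who like to see the arithmetic, here is a minimal sketch of what those numbers imply. It assumes that guessing among the three candidate parents gives a 1-in-3 chance of a correct match on each trial; the variable and function names are my own, and this is only bookkeeping on the averages quoted above, not the analysis the authors ran.

```python
from fractions import Fraction

TRIALS = 28        # matching trials per participant, as described above
ALTERNATIVES = 3   # each trial shows three candidate parents (assumed 1/3 guess rate)

# Expected number of correct matches from pure guessing
chance_correct = Fraction(TRIALS, ALTERNATIVES)

# Average correct matches quoted above: age -> (father, mother)
reported = {1: (13, 12), 3: (13, 13), 5: (14, 14)}


def summarize(scores, trials=TRIALS):
    """Convert the reported averages into hit rates and father-minus-mother gaps."""
    rows = {}
    for age, (father, mother) in scores.items():
        rows[age] = {
            "father_rate": father / trials,
            "mother_rate": mother / trials,
            "father_minus_mother": father - mother,
        }
    return rows


summary = summarize(reported)
print(f"chance: about {float(chance_correct):.2f} correct out of {TRIALS}")
for age, row in summary.items():
    print(f"age {age}: {row}")
```

Under that assumption, every reported average sits above the guessing baseline of roughly 9 out of 28, while the gap between father and mother matching never exceeds a single trial.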

So while children did certainly tend to resemble their parents consistently (though clearly less than perfectly), they failed to resemble their fathers more than their mothers. Berdart & French (1999) suggested that there might be a rationale for this lack of distinct resemblance: if fathers were good at figuring out which children were theirs, they would presumably also be good at figuring out which children were not theirs, and withhold investment from the latter group. I don’t want to spend too much time on this point other than to note that it’s not a particularly strong one, as it would be good for the ostensible fathers if they had such a skill, and the ill effects on the unrelated child shouldn’t be expected to have an impact on its adaptive nature. Nevertheless, the important point here is that children do not appear to actually resemble their father any more than their mother. Reality need not always get in the way of perceptions, though.

Fitness is 99% instagram filters

In general, having perceptions that match up to reality is a good thing. If you think you can succeed where you will fail, you’re more likely to waste your time; if you think you will fail when you can succeed, you’ll miss opportunities. Things like that. The exception to that rule concerns contexts of persuasion. There are beliefs I might prefer you to have because they would make me better off, rather than because they are true. So, if some part of my brain that deals with persuading others holds an incorrect belief, that’s not necessarily a problem. As the last post touched on, if people can convince others certain sex differences are due to sexism, that might have some useful implications for certain groups of people to the extent that people are trying to avoid being perceived as sexist and are willing to take steps to remediate the situation. For the present purposes, however, if women are trying to extract investment from men for their children, it would be in those women’s best interest if the man in question believes the child in question is actually related to him, as men are more likely to invest in a child on that basis alone. Accordingly, we might predict that women will be more likely to try and convince men that their children resemble them.

Enter some research by Daly & Wilson (1982) that examined the spontaneous utterances of people following the birth of a child. Their first sample consisted of 111 births which had been taped; the fathers were present in 42 of them. From those 111 tapes, approximately 70 comments about the baby’s appearance were recorded. When it was the mother speaking, she remarked on the baby’s resemblance to the father in 16 instances, and the baby’s resemblance to herself 4 times. By contrast, when the father was speaking, he remarked on how the baby resembled the mother 4 times, and himself only once. Further, every time the resemblance to the mother was commented on, the utterance was singular; when the father’s resemblance was being discussed by the mother, however, six of the comments contained repetitions (e.g. “He looks like you…He’s got your eyes”). Immediately following the birth of a child, then, the resemblance to the father appeared to be focused on specifically by the mother. While the mother was nominally more likely to comment on the resemblance to the father in his presence (75%) relative to his absence (47%), this difference didn’t reach significance with the small sample size.
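To get a rough feel for how lopsided those counts are, here is a small sketch of an exact two-sided binomial test against an even 50/50 split. To be clear about the assumptions: this treats each comment as an independent observation (which the tapes may well violate, since one parent can make several comments), the function name is my own, and this is not the analysis Daly & Wilson reported.

```python
from math import comb


def binom_two_sided_p(k, n):
    """Exact two-sided p-value for k 'father' comments out of n under a 50/50 null.

    With p = 0.5 the binomial distribution is symmetric, so the two-sided
    p-value is twice the probability of the smaller tail (capped at 1).
    """
    tail = min(k, n - k)
    one_tail = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * one_tail)


# Mothers: 16 comments on resemblance to the father, 4 to herself
mother_p = binom_two_sided_p(16, 20)

# Fathers: 1 comment on resemblance to himself, 4 to the mother
father_p = binom_two_sided_p(1, 5)

print(f"mothers: p = {mother_p:.4f}")
print(f"fathers: p = {father_p:.4f}")
```

Under those assumptions, the mothers’ 16-to-4 split would be unlikely if they were equally inclined to mention either parent, whereas the fathers’ five comments are too few to say much in either direction.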

A follow-up study surveyed the responses of mothers, fathers, and the relatives of both concerning the child’s resemblance. Responses came back from about 230 parents and 150 relatives. In all cases, each group suggested the child looks more like the father than the mother by a ratio of at least 2:1. This is in slight opposition to the previous results insomuch as both mothers and fathers said the child looked more like the father. This may have something to do with sampling bias, though, as only about 1/5 of the sample returned any surveys. It seems plausible, as the authors note, that “...fathers rankled by any serious suspicion of nonpaternity would be unlikely to find the questionnaire an amusing diversion”. It is possible, then, that fathers might be overstating their physical resemblance to the child in surveys as signals of their unwillingness to abandon investment in the child or relationship, but that’s just speculation on my part. The videos, by contrast, might have proven to be more of an unbiased sample, freer from demand characteristics. Though it’s difficult to say, it’s worth noting that around 25% of the survey respondents reported that “everyone” in their life said the baby looked like the father, as compared with no one reporting comparable utterances about the mother. This is in spite of the finding that children don’t particularly resemble the father over the mother in a matching task, suggesting that such comments might represent social politeness, rather than accurate perceptions.

“She looks just like both of you…”

To repeat the major point here, there can be benefits to perceiving the world in inaccurate ways when you are trying to convince other people of things. Whether that thing is the resemblance of the child to a parent or whether sex differences are due to sexism is quite irrelevant. It is likely, in many of these cases, that the part of the brain doing the talking legitimately believes those perceptions so as to better convince others, while different parts of the brain might disagree. Now, in this case, we happen to have data to suggest that the perceptions – or at least what people say about their perceptions – are incorrect; we also have a relatively straightforward theory for explaining why we might expect this to be the case. In many other cases we are not so fortunate.

References: Berdart, S. & French, R. (1999). Do babies resemble their fathers more than their mothers? A failure to replicate Christenfeld & Hill (1995). Evolution & Human Behavior, 20, 129-135.

Daly, M. & Wilson, M. (1982). Whom are newborn babies said to resemble? Ethology & Sociobiology, 3, 69-78.