Research Tip: Ask About What You Want To Measure

Recently I served as a reviewer for a research article that had been submitted to a journal for publication. Without going into too much detail as to why, the authors of this paper wanted to control for people’s attitudes towards casual sex when conducting their analysis. They thought it possible that people who were more sexually permissive when it comes to infidelity might respond to certain scenarios differently than those who were less permissive. If you were the sensible type of researcher, then, you might ask your participants to indicate on some scale how acceptable or unacceptable they think sexual infidelity is. The authors of this particular paper opted for a different, altogether stranger route: they noted that people’s attitudes towards infidelity correlate (imperfectly) with their political ideology (i.e., whether they consider themselves to be liberals or conservatives). So, rather than ask participants directly how acceptable they find infidelity (what they actually wanted to know), they asked participants about their political ideology and used that as a control instead.

“People who exercise get tired, so we measured how much people napped to assess physical fitness”

This example is by no means unique; psychology researchers frequently ask questions about topic X in the hopes of understanding something about topic Y. This can be acceptable at times, specifically when topic Y is unusually difficult – but not impossible – to study directly. After all, if topic Y is impossible to study directly, then one obviously cannot say with much confidence that studying topic X tells you something about Y, as you would have no way of assessing the relationship between X and Y to begin with. Assuming that the relationship between X and Y has been established, that it is sufficiently strong, and that Y is unusually difficult to study directly, there’s a good, practical case to be made for using X instead. When that is done, however, it should always be remembered that you aren’t actually studying what you’d like to study, so it’s important not to get carried away with the interpretation of your results.
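To make that abstract advice concrete, here is a minimal sketch (in Python, with invented data and variable names of my own) of what establishing the X–Y relationship might look like before using X as a stand-in: collect both measures from a pilot sample and check that the correlation is actually strong enough to justify the substitution.

```python
import numpy as np

# Hypothetical pilot data: a direct measure of Y (e.g., attitudes towards
# infidelity) and the proposed proxy X (e.g., political ideology).
rng = np.random.default_rng(0)
y_direct = rng.normal(size=200)                   # the thing we actually care about
x_proxy = 0.3 * y_direct + rng.normal(size=200)   # an imperfect correlate of it

r = np.corrcoef(x_proxy, y_direct)[0, 1]
print(f"Proxy-criterion correlation: r = {r:.2f}")
print(f"Variance in Y captured by X: {r**2:.1%}")
# If r is modest (say, around .3), X captures under 10% of the variance in Y,
# and "controlling for X" controls for very little of what was intended.
```

With a correlation in the neighborhood of what ideology and infidelity attitudes plausibly share, the proxy captures only a small fraction of the construct of interest, which is exactly the problem with the approach I was reviewing.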

This brings us nicely to the topic of research on sexism. When people hear the word “sexism,” a couple of things come to mind: someone who believes one sex is (or should be) – socially, morally, legally, psychologically, etc. – inferior to the other, or worth less; someone who wouldn’t want to hire a member of one sex for a job (or intentionally pays them less if they did) strictly because of that variable, regardless of their qualifications; someone who inherently dislikes members of one sex. While this list is by no means exhaustive, I suspect things like these are probably the prototypical examples of sexism: some kind of explicit, negative attitude about people because of their sex per se that directly translates into behavior. Despite this, people who research sexism don’t usually ask about such matters directly, as far as I’ve seen. To be clear, they easily could ask questions assessing such attitudes in a straightforward manner (in fact, they used to do just that with measures like the “Attitudes Towards Women Scale” in the 1970s), but they do not. As I understand it, the justification for not asking about such matters directly is that it has become more difficult to find people who actually express such views (Loo & Thorpe, 1998). As attitudes had already become markedly less sexist from 1972 to 1998, one can only guess at how much more change has occurred from then to now. In short, blatant sexists are becoming rare, especially if you’re asking college students.

Many researchers interpret that difficulty as the result of people still holding sexist attitudes but either (a) being unwilling to express them publicly for fear of condemnation, or (b) not being consciously aware that they hold such views. As such, researchers like to ask questions about “Modern Sexism” or “Ambivalent Sexism”; they maintain the word “sexism” in their scales, but they begin to ask about things which are not what people first think of when they hear the term. They no longer ask about explicitly sexist attitudes. Therein lies something of a problem, though: if what you really want to know is whether people hold particular sexist beliefs or attitudes, you need some way of assessing those attitudes directly in order to determine whether other questions – ones that don’t ask about sexism directly – accurately reflect them. However, if such a method of assessing those beliefs accurately, directly, and easily does exist, then it seems altogether preferable to use that method instead. In short, just ask about the things you want to ask about.

“We wanted to measure sugar content, so we assessed how much fruit the recipe called for”

If you continue using an alternative measure – like the Ambivalent Sexism Inventory (ASI), rather than the Attitudes Towards Women Scale – then you really should restrict your interpretations to the things you’re actually asking about. As a quick example, let’s consider the ASI, which is made up of a hostile and a benevolent sexism component. Zell et al (2016) summarize the scale as follows:

“Hostile sexism is an adversarial view of gender relations in which women are perceived as seeking control over men. Benevolent sexism is a subjectively positive view of gender relations in which women are perceived as pure creatures who ought to be protected, supported, and adored; as necessary companions to make a man complete; but as weak and therefore best relegated to traditional gender roles (e.g., homemaker).”

In other words, the benevolent scale measures the extent to which women are viewed as children: incapable of making their own decisions and, as such, in need of protection and provisioning by men. The hostile scale measures the extent to which men distrust women and view them as adversaries. Glick & Fiske (1996) claim that “...hostile and benevolent sexism…combine notions of the exploited group’s lack of competence to exercise structural power with self-serving ‘benevolent’ justifications.” However, not a single item on either the hostile or benevolent sexism scale actually asks about women’s competence or whether women ought to be restricted socially.

To make this explicit, let’s consider the questions Zell et al (2016) used to assess both components. In terms of hostile sexism, participants were asked to indicate their agreement with the following three statements:

  • Women seek power by gaining control over men
  • Women seek special favors under the guise of equality
  • Women exaggerate their problems at work

There are a few points to make about these questions. First, they are all clearly true to some extent. I say that because these are behaviors that all kinds of people engage in. If these behaviors are not specific to one sex – if both men and women exaggerate their problems at work – then agreement with the idea that women do so does not stop me from believing men do this as well and, accordingly, does not necessarily track any kind of sexist belief (the alternative, I suppose, is to believe that women never exaggerate problems, which seems unlikely). If the questions are meant to be interpreted as relative statements (e.g., “women exaggerate their problems at work more than men do”), then each statement needs to first be assessed empirically as true or false before you can say that endorsing it represents sexism. If women actually do tend to exaggerate problems at work more (a matter that is quite difficult to determine objectively, given what the term “exaggerate” means), then agreement with the statement just means you accurately perceive reality; not that you’re a sexist.

More to the point, however, none of these items asks about what the researchers interpret them to mean: women seeking special favors does not imply they are incompetent or unfit to hold positions outside the home, nor does it imply that one views gender relations as primarily adversarial. If those views are really what a researcher is trying to get at, then they ought to just ask about them directly. A similar story emerges for the benevolent questions:

  • Women have a quality of purity few men possess
  • Men should sacrifice to provide for women
  • Despite accomplishment, men are incomplete without women

Again, I see no mention of women’s competence, ability, or intelligence, nor of anyone’s endorsement of strict gender roles. Saying that men ought to behave altruistically towards women in no way implies that women can’t manage without men’s help. When a man offers to pay for an anniversary dinner (a behavior I have seen labeled sexist before), he is usually not doing so because he feels his partner is incapable of paying, any more than my helping a friend move suggests I view them as a helpless child.

“Our saving you from this fire implies you’re unfit to hold public office”

The argument can, of course, be made that scores on the ASI are related to the things these researchers actually want to measure. Indeed, Glick & Fiske (1996) made that very argument: they report that hostile sexism scores (controlling for the benevolent scores) did correlate with “Old-Fashioned Sexism” and “Attitudes Towards Women” scores (rs = .43 and .60, respectively – bearing in mind that was almost 20 years ago, and these attitudes are changing). However, the correlations between benevolent sexism scores and these sexist attitudes were effectively zero (rs = -.03 and .04, respectively). In other words, it appears that people endorse these statements for reasons that have nothing at all to do with whether they view women as weak, or stupid, or any other pejorative you might throw out there, and their responses may tell you nothing at all about their opinions concerning gender roles. If you want to know about those matters, then ask about them. In general, it’s fine to speculate about what your results might mean – how they can best be interpreted – but an altogether easier path is to simply ask about such matters directly and reduce the need for pointless speculation.
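For readers unfamiliar with what “controlling for the benevolent scores” involves, the sketch below shows one standard way to compute such a partial correlation – using simulated data of my own invention, not Glick & Fiske’s: residualize both variables on the covariate, then correlate the residuals.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    zc = np.column_stack([z, np.ones_like(z)])
    x_resid = x - zc @ np.linalg.lstsq(zc, x, rcond=None)[0]
    y_resid = y - zc @ np.linalg.lstsq(zc, y, rcond=None)[0]
    return np.corrcoef(x_resid, y_resid)[0, 1]

rng = np.random.default_rng(1)
benevolent = rng.normal(size=300)
hostile = 0.4 * benevolent + rng.normal(size=300)      # the two subscales correlate
old_fashioned = 0.5 * hostile + rng.normal(size=300)   # a direct sexism measure

print(partial_corr(hostile, old_fashioned, benevolent))  # stays substantial
print(partial_corr(benevolent, old_fashioned, hostile))  # near zero
```

Under this toy generative model, the hostile scale retains its relationship to the direct measure once benevolence is partialed out, while the benevolent scale does not – the same pattern Glick & Fiske reported.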

References: Glick, P. & Fiske, S. (1996). The ambivalent sexism inventory: Differentiating hostile and benevolent sexism. Journal of Personality & Social Psychology, 70, 491-512.

Loo, R. & Thorpe, K. (1998). Attitudes towards women’s roles in society: A replication after 20 years. Sex Roles, 39, 903-912.

Zell, E., Strickhouser, J., Lane, T., & Teeter, S. (2016). Mars, Venus, or Earth? Sexism and the exaggeration of psychological gender differences. Sex Roles, 75, 287-300.

Chivalry Isn’t Dead, But Men Are

In the somewhat-recent past, the Senate held a vote on whether women in the US should be required to sign up for the Selective Service – the military draft – when they turn 18. Already accepted, of course, was the idea that men should be required to sign up; an apparently less controversial proposition. This represents yet another erosion of male privilege in modern society; in this case, the privilege of being expected to fight and die in armed combat, should the need arise. Whether any conscription is likely to happen in the foreseeable future (hopefully not) is a somewhat different matter from whether women would be among the first drafted if it did (probably not), but the question remains as to how to explain this state of affairs. The issue, it seems, is not simply one of whether men or women are better able to shoulder the physical demands of combat; it extends beyond military service into intuitions about real and hypothetical harm befalling men and women in everyday life. When it comes to harm, people seem to generally care less about it happening to men.

Meh

One anecdotal example of these intuitions I’ve encountered in my own writing is when an editor at Psychology Today removed an image from one of my posts of a woman undergoing bodyguard training in China by having a bottle smashed over her head (which can be seen here; it’s by no means graphic). There was a concern expressed that the image was in some way inappropriate, despite my posting of other pictures of men being assaulted or otherwise harmed. As a research-minded individual, however, I want to go beyond simple anecdotes from my own life that confirm my intuitions into the empirical world, where other people publish results that confirm my intuitions. While I’ve already written about this issue a number of times, it never hurts to pile on a little more. Recently, I came upon a paper by FeldmanHall et al (2016) that examined these intuitions about harm directed towards men and women across a number of studies, which can help me do just that.

The first of the studies in the paper was a straightforward task: fifty participants were recruited from Mturk to respond to a classic morality problem called the footbridge dilemma. Here, the lives of five people can be saved from a train by pushing one person in front of it. When these participants were asked whether they would push a man or a woman to their death (assuming, I think, that they were going to push one of them), 88% opted for killing the man. The second study expanded a bit on that finding using the same dilemma, but asking instead how willing participants would be (on a 1-10 scale) to push either a man, a woman, or a person of unspecified gender, with no other options available. The findings here with regard to gender were a bit less dramatic and clear-cut: participants were slightly more willing to push a man (M = 3.3) than a woman (M = 3.0), though female participants were nominally less willing to push a woman (roughly M = 2.3) than male participants were (roughly M = 3.8), perhaps counter to what might be predicted. That said, the sample size for this second study was fairly small (only about 25 per group), so that difference might not be worth making much of until more data are collected.
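To see why that caution is warranted, here’s a back-of-the-envelope calculation in Python. The within-group standard deviation is my assumption (the paper’s figure isn’t reproduced here), so treat the numbers purely as an illustration of how noisy a 25-per-group comparison is.

```python
import math

# Hypothetical within-group SD on a 1-10 willingness scale; illustration only.
sd, n = 2.5, 25
se_diff = math.sqrt(2 * sd**2 / n)   # standard error of a difference in means
print(f"SE of the difference: {se_diff:.2f}")
print(f"Rough 95% margin of error: +/- {1.96 * se_diff:.2f}")
# With SD = 2.5, the margin is about +/- 1.4 points, comparable in size to
# the observed 1.5-point gap between female (2.3) and male (3.8) raters.
```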

When faced with a direct and unavoidable trade-off between the welfare of men and women, then, the results overwhelmingly showed that women were favored; however, when it came to cases where a man or a woman could be harmed alone, there didn’t seem to be a marked difference between the two. That said, that moral dilemma alone can only take us so far in understanding people’s concern for the welfare of others, in no small part because its life-and-death nature potentially introduces ceiling effects (man or woman, very few people are willing to throw someone else in front of a train). In other instances where the degree of harm is lowered – such as, say, male vs female genital cutting – differences might begin to emerge. Thankfully, FeldmanHall et al (2016) included an additional experiment that brought these intuitions out of the hypothetical and into reality while lowering the degree of harm. You can’t kill people to conduct psychological research, after all.

Yet…

In the next experiment, 57 participants were recruited and given £20. At the end of the experiment, any money they had would be multiplied by ten, meaning participants could leave with a total of £200 (which is awfully generous as far as these things go). As with most psychology research, however, there was a catch: the participants would be taking part in 20 trials where £1 was at stake. A target individual – either a man or a woman – would be receiving a painful electric shock, and the participants could give up some of that £1 to reduce its intensity, with the full £1 removing the shock entirely. To make the task a little less abstract, the participants were also forced to view videos of the target receiving the shocks (which, I think, were prerecorded videos of real shocks – rather than shocks in real time – but I’m not sure from my reading of the paper if that’s a completely accurate description).

In this study, another large difference emerged: as expected, participants interacting with female targets ended up keeping less money by the end (M = £8.76) than those interacting with male targets (M = £12.54; d = .82). In other words, the main finding of interest was that participants were willing to give up substantially more money to prevent women from receiving painful shocks than they were to help men. Interestingly, this was the case in spite of the facts that (a) the male target in the videos was rated more positively overall than the female target, and (b) in a follow-up study where participants provided emotional reactions to thinking about being a participant in the former study, the amount of reported aversion to letting the target suffer shocks was similar regardless of the target’s gender. As the authors conclude:

While it is equally emotionally aversive to hurt any individual—regardless of their gender—that society perceives harming women as more morally unacceptable suggests that gender bias and harm considerations play a large role in shaping moral action.

So, even though people find harming others – or letting them suffer harm for a personal gain – to generally be an uncomfortable experience regardless of their gender, they are more willing to help/avoid harming women than they are men, sometimes by a rather substantial margin.

Now onto the fun part: explaining these findings. It doesn’t go nearly far enough as an explanation to note that “society condones harming men more than women,” as that just restates the finding; likewise, we only get so far by mentioning that people perceive men to have a higher pain tolerance than women (because they do), as that only pushes the question back a step to the matter of why men tolerate more pain than women. As for my thoughts, first, I think these findings highlight the importance of a modular understanding of psychological systems: our altruistic and moral systems are made up of a number of component pieces, each with a distinct function, and the piece that is calculating how much harm is generated is, it would seem, not the same piece deciding whether or not to do something about it. The obvious reason for this distinction is that alleviating harm to others isn’t always adaptive to the same extent: it does me more adaptive good to help kin relative to non-kin, friends relative to strangers, and allies relative to enemies, all else being equal. 

“Just stay out of it; he’s bigger than you”

Second, it might well be the case that helping men, on average, tends to pay off less than helping women. Part of the reason for that state of affairs is that female reproductive potential cannot be replaced quite as easily as male potential: male reproductive success is constrained by the number of available women much more than female potential is by male availability (as Chris Rock put it, “any money spent on dick is a bad investment“). As such, men might become particularly inclined to invest in alleviating women’s pain as a form of mating effort. The story clearly doesn’t end there, however, or else we would predict men being uniquely likely to benefit women, rather than both sexes doing so to a similar degree. This suggests two additional possibilities to me: the first is that, if men value women highly as a form of mating effort, that increased social value could make women more valuable to other women in turn. To place that in a Game of Thrones example, if a powerful house values their own children highly, non-relatives may come to value those same children highly as well in the hopes of ingratiating themselves to – or avoiding the wrath of – the child’s family.

The other idea that comes to mind is that men are less willing to reciprocate aid that alleviates their pain, because to do so would be an admission of a degree of weakness: a signal that they honestly needed the help (and might in the future as well), which could lower their relative status. If men are less willing to reciprocate aid, that would make them worse investments for both sexes, all else being equal; better to help out the person who would experience more gratitude for your assistance and repay you in turn. While these explanations might or might not adequately explain the preferential altruism directed towards women, I feel they’re worthwhile starting points.

References: FeldmanHall, O., Dalgleish, T., Evans, D., Navrady, L., Tedeschi, E., & Mobbs, D. (2016). Moral chivalry: Gender and harm sensitivity predict costly altruism. Social Psychological & Personality Science, DOI: 10.1177/1948550616647448

Sexism, Testing, And “Academic Ability”

When I was teaching my undergraduate course on evolutionary psychology, my approach to testing and assessment was unique. You can read about that philosophy in more detail here, but the gist of my method was specifically avoiding multiple-choice formats in favor of short-essay questions with unlimited revision ability on the part of the students. I favored this exam format for a number of reasons, chief among which were that (a) I didn’t feel multiple-choice tests were very good at assessing how well students understood the material (memorization and good guessing do not equal understanding), and (b) I didn’t really care about grading my students as much as I cared about getting them to learn the material. If they didn’t grasp it properly on their first try (and very few students do), I wanted them to have the ability and motivation to continue engaging with it until they did get it right (which most eventually did; the class average for each exam began around a 70 and rose to a 90). For the purposes of today’s discussion, the important point is that my exams were a bit more cognitively challenging than is usual and, according to a new paper, that means I had unintentionally biased my exams in ways that disfavor “historically underserved groups” like women and the poor.

Oops…

What caught my eye about this particular paper, however, was the initial press release that accompanied it. Specifically, the authors were quoted as saying something I found, well, a bit queer:

“At first glance, one might assume the differences in exam performance are based on academic ability. However, we controlled for this in our study by including the students’ incoming grade point averages in our analysis,”

So the authors appear to believe that a gap in performance on academic tests arises independent of academic ability (whatever that entails). This raised the immediate question in my mind of how one knows that abilities are the same unless one has a method of testing them. It seems a bit strange to say that abilities are the same on the basis of one set of tests (those that provided incoming GPAs), but then to continue to suggest that abilities are the same when a different set of tests provides a contrary result. In the interests of settling my curiosity, I tracked the paper down to see what was actually reported; after all, these little news blurbs frequently get the details wrong. Unfortunately, this one appeared to capture the authors’ views accurately.

So let’s start by briefly reviewing what the authors were looking at. The paper, by Wright et al (2016), is based on data collected from three years’ worth of three introductory biology courses, spanning 26 different instructors, approximately 5,000 students, and 87 different exams. Without going into too much unnecessary detail, the tests were assessed by independent raters for how cognitively challenging they were and for their format, and the students were classified according to their gender and socioeconomic status (SES; as measured by whether they qualified for a financial aid program). In an attempt to control for academic ability, Wright et al (2016) also looked at the freshman-year GPA of the students coming into the biology classes (based on approximately 45 credits, we are told). Because the authors controlled for incoming GPA, they hoped to persuade the reader of the following:

This implies that, by at least one measure, these students have equal academic ability, and if they have differential outcomes on exams, then factors other than ability are likely influencing their performance.

Now one could argue that there’s more to academic ability than is captured by a GPA – which is precisely why I will do so in a minute – but let’s continue on with what the authors found first.

Cognitively challenging tests were indeed, well, more challenging. A statistically average male student, for instance, would be expected to do about 12% worse on the most challenging test in the sample relative to the easiest one. This effect was not the same between genders, however. Again, using statistically average men and women: when the tests were the least cognitively challenging, there was effectively no performance gap (about a 1.7% expected difference favoring men); however, when the tests were the most cognitively challenging, that expected gap rose to an astonishing expected…3.2% difference. So, while the gender difference nominally doubled, its size was such that it likely wouldn’t be noticed unless one was really looking for it. A similar pattern was discovered for SES: when the tests were easy, there was effectively no difference between those low or high in SES (1.3% favoring those higher); however, when the tests were maximally challenging, this expected difference rose to about 3.5%.
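For those curious what estimating that kind of interaction looks like in practice, here’s a sketch with simulated data. The coefficients are reverse-engineered from the percentages above, and this is emphatically not Wright et al’s actual model, which included other covariates:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 1000
df = pd.DataFrame({
    "difficulty": rng.uniform(0, 1, n),   # rated cognitive challenge, rescaled 0-1
    "male": rng.integers(0, 2, n),        # 1 = male, 0 = female
})
# Simulate scores with a small gender-by-difficulty interaction: a ~1.7%
# gap on the easiest tests growing to ~3.2% on the hardest.
df["score"] = (80 - 12 * df["difficulty"] + 1.7 * df["male"]
               + 1.5 * df["male"] * df["difficulty"] + rng.normal(0, 8, n))

model = smf.ols("score ~ difficulty * male", data=df).fit()
print(model.params)   # the 'difficulty:male' term is the interaction of interest
```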

Useful for both spotting statistical blips and burning insects

There’s a lot to say about these results and how they’re framed within the paper. First, as I mentioned, they truly are minor differences; there are very few cases where a 1-3% difference in test scores is going to make or break a student, so I don’t think there’s any real reason to be concerned or to adjust the tests; not practically, anyway.

However, there are larger, theoretical issues looming in the paper. One of these is that the authors use the phrase “controlled for academic ability” so often that a reader might actually come to believe that’s what they did from simple repetition. The problem here, of course, is that the authors did not control for that; they controlled for GPA. Unfortunately for Wright et al’s (2016) presentation, those two things are not synonyms. As I said before, it is strange to say that academic ability is the same because one set of tests (incoming GPA) says they are while another set does not. The former set of tests appear to be privileged for no sound reason. Because of that unwarranted interpretation, the authors lose (or rather, purposefully remove) the ability to talk about how these gaps might be due to some performance difference. This is a useful rhetorical move if one is interested in doing advocacy – as it implies the gap is unfair and ought to be fixed somehow – but not if one is seeking the truth of the matter.

Another rather large issue in the paper is that, as far as I could tell, the authors predicted they would find these effects without ever really providing an explanation for how or why that prediction arose. That is, what drove their expectation that men would outperform women, and the rich outperform the poor? This ends up being something of a problem because, at the end of the paper, the authors do float a few possible (untested) explanations for their findings. The first of these is stereotype threat: the idea that certain groups of people will do poorly on tests because of some negative stereotype about their performance. This is a poor fit for the data for two reasons: first, while Wright et al (2016) claim that stereotype threat is “well-documented,” it actually fails to replicate (on top of not making much theoretical sense). Second, even if it were a real thing, stereotype threat, as it is typically studied, requires that one’s sex be made salient prior to the test. As I encountered a total of zero tests during my entire college experience that made my gender salient, much less my SES, I can only assume that the tests in question didn’t do so either. For stereotype threat to work as an explanation, then, women and the poor would need to be under relatively constant stereotype threat. In turn, this would make documenting and studying stereotype threat in the first place rather difficult, as you could never have a condition in which your subjects were not experiencing it. In short, stereotype threat seems like a bad fit.

The other explanations put forth for this gender difference are the possibility that women and poor students hold more fixed views of intelligence instead of growth mindsets, so they withdraw from the material when challenged rather than improve (i.e., “we need to change their mindsets to close this daunting 2% gap”), or the possibility that the test questions themselves are written in ways that subtly bias people’s ability to think about them (the example the authors raise is that a question written about applying some concept to sports might favor men relative to women, as men tend to enjoy sports more). Given that the authors did have access to the test questions, it seems they could have examined that latter possibility in at least some detail (minimally, perhaps, by looking at whether tests written by female instructors resulted in different outcomes than those written by male ones, or by examining the content of the questions themselves to see if women did worse on gendered ones). Why they didn’t conduct such analyses, I can’t say.

 Maybe it was too much work and they lacked a growth mindset

In summary, these very minor average differences could easily be chalked up – very simply – to GPA not being a full measure of a student’s academic ability. In fact, if the tests determining freshman GPA aren’t especially cognitively challenging (as one might well expect, given that students would have been taking mostly general introductory courses with large class sizes), then this might make students appear more similar in ability than they actually were. The matter can be thought of using this stereotypically-male example (which will assuredly hinder women’s ability to think about it): imagine I tested people in a room with weights ranging from 1-15 pounds and asked them to curl each one once. This would give me a poor sense of any underlying differences in strength, because the range of ability tested was restricted. Were I to ask them to do the same with weights ranging from 1-100 pounds the next week, I might mistakenly conclude that it’s something about the weights – and not people’s abilities – that explains why differences suddenly emerged (since I believed I had already controlled for their abilities the first time).

Now I don’t know if something like that is actually responsible, but if the tests determining freshman GPA were tapping the same kinds of abilities to the same degree as those in the biology courses studied, then controlling for GPA should have taken care of that potential issue. Since controlling for GPA did not, I feel safe assuming there is some difference between the tests in terms of what abilities they’re measuring.
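Here’s a minimal simulation of that weight-room logic, with all numbers invented for illustration: an easy test that can’t distinguish ability above a low ceiling compresses real group differences, which then reappear on a harder test.

```python
import numpy as np

rng = np.random.default_rng(3)
ability_a = rng.normal(0.0, 1.0, 5000)   # one group of students
ability_b = rng.normal(0.3, 1.0, 5000)   # a second group, slightly stronger on average

def test_score(ability, ceiling):
    """Score on a test that can't distinguish ability above its ceiling."""
    return np.minimum(ability, ceiling) + rng.normal(0, 0.2, ability.size)

# An easy test (low ceiling) compresses both groups toward the top...
easy_gap = test_score(ability_b, 0.0).mean() - test_score(ability_a, 0.0).mean()
# ...while a hard test (high ceiling) lets the underlying gap show through.
hard_gap = test_score(ability_b, 3.0).mean() - test_score(ability_a, 3.0).mean()

print(f"Gap on the easy test: {easy_gap:.2f}")
print(f"Gap on the hard test: {hard_gap:.2f}")
# Matching groups on the easy test ("controlling for GPA") would not
# guarantee equal ability on the hard one.
```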

References: Wright, C., Eddy, S., Wenderoth, M., Abshire, E., Blankenbiller, M., & Brownell, S. (2016). Cognitive difficulty and format of exams predicts gender and socioeconomic gaps in exam performance of students in introductory biology courses. CBE—Life Sciences Education, 15.

Psychology Research And Advocacy

I get the sense that many people get a degree in psychology because they’re looking to help others (since most clearly aren’t doing it for the pay). For those who get a degree in the clinical side of the field, this observation seems easy to make; at the very least, I don’t know of any counselors or therapists who seek to make their clients feel worse about the state their life is in and keep them there. For those who become involved in the research end of psychology, I believe this desire to help others is still a major motivator. Rather than trying to help specific clients, however, many psychological researchers are driven by a motivation to help particular groups in society: women, certain racial groups, the sexually promiscuous, the outliers, the politically liberal, or any group that the researcher believes to be unfairly marginalized, undervalued, or maligned. Their work is driven by a desire to show that the particular group in question has been misjudged by others, with those doing the misjudging being biased and, importantly, wrong. In other words, their role as a researcher is often driven by their role as an advocate, and the quality of their work and thinking can often take a back seat to their social goals.

When megaphones fail, try using research to make yourself louder

Two such examples are highlighted in a recent paper by Eagly (2016), both of which can broadly be considered to focus on the topic of diversity in the workplace. I want to summarize them quickly before turning to some of the other facets of the paper I find noteworthy. The first case concerns the claim that having more women on corporate boards tends to increase profitability, a claim driven by a finding that Fortune 500 companies in the top quarter of female representation on boards of directors performed better than those in the bottom quarter. Eagly (2016) rightly notes that such a basic data set would be all but unpublishable in academia, as it fails to account for a number of important factors. Indeed, when more sophisticated research was considered in a meta-analysis of 140 studies, the gender diversity of the board of directors had about as close to no effect as possible on financial outcomes: the average correlations across all the studies ranged from about r = .01 all the way up to r = .05, depending on what measures were considered. Gender diversity per se seemed to have no meaningful effect, despite a variety of advocacy sources claiming that increasing female representation would provide financial benefits. Rather than considering the full scope of the research, the advocates tended to cite only the most simplistic analyses that provided the conclusion they wanted (others) to hear.

The second area of research concerned how demographic diversity in work groups can affect performance. The general assumption that is often made about diversity is that it is a positive force for improving outcomes, given that a more cognitively-varied group of people can bring a greater number of skills and perspectives to bear on solving tasks than more homogeneous groups can. As it turns out, however, another meta-analysis of 146 studies concluded that demographic diversity (both in terms of gender and racial makeup) had effectively no impact on performance outcomes: the correlation for gender was r = -.01 and was r = -.05 for racial diversity. By contrast, differences in skill sets and knowledge had a positive, but still very small effect (r = .05). In summary, findings like these would suggest that groups don’t get better at solving problems just because they’re made up of enough [men/women/Blacks/Whites/Asians/etc]. Diversity in demographics per se, unsurprisingly, doesn’t help to magically solve complex problems.

While Eagly (2016) appears to generally be condemning the role of advocacy in research when it comes to getting things right (a laudable position), there were some passages in the paper that caught my eye. The first of these concerns what advocates for causes should do when the research, taken as a whole, doesn’t exactly agree with their preferred stance. In this case, Eagly (2016) focuses on the diversity research that did not show good evidence for diverse groups leading to positive outcomes. The first route one might take is to simply misrepresent the state of the research, which is obviously a bad idea. Instead, Eagly suggests advocates take one of two alternative routes. First, she recommends that researchers investigate more specific conditions under which diversity (or whatever one’s preferred topic is) might be beneficial. This is an interesting suggestion to evaluate: on the one hand, people would often be inclined to say it’s a good idea; in some particular contexts diversity might be useful, even if it’s not always, or even generally, so. This wouldn’t be the first time effects in psychology were found to be context-dependent. On the other hand, this suggestion also runs a serious risk of inflating Type 1 errors. Specifically, if you keep slicing up data and looking at the issue in a number of different contexts, you will eventually uncover positive results even if they’re due only to chance. Repeated subgroup or subcontext analysis doesn’t sound much different from the questionable statistical practices currently being blamed for psychology’s replication problem: just keep conducting research and only report the parts of it that happened to work, or keep massaging the data until the right conclusion falls out.
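That inflation risk is easy to demonstrate. The simulation below (hypothetical numbers throughout) tests a true-null diversity effect in 20 separate subgroups; at an alpha of .05, roughly one spurious “context where diversity works” is expected even though no effect exists anywhere.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n_subgroups, n_per_group = 20, 50

false_positives = 0
for _ in range(n_subgroups):
    # No true effect: both conditions drawn from the same distribution.
    diverse = rng.normal(0, 1, n_per_group)
    homogeneous = rng.normal(0, 1, n_per_group)
    _, p = stats.ttest_ind(diverse, homogeneous)
    false_positives += p < 0.05

print(f"{false_positives} of {n_subgroups} null subgroups tested 'significant'")
```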

“…the rest goes in the dumpster out back”

Eagly’s second suggestion I find a bit more worrisome: arguing that relevant factors – like increases in profits, productivity, or finding better solutions – aren’t actually all that relevant when it comes to justifying why companies should increase diversity. What I find odd about this is that it seems to suggest that advocates begin with their conclusion (in this case, that diversity in the workforce ought to be increased) and then just keep looking for ways to justify it in spite of previous failures to do so. Again, while it is possible that there are benefits to diversity that aren’t yet being considered in the literature, bad research would likely result from a process where someone starts their analysis with the conclusion and keeps going until they justify it to others, no matter how often that requires shifting the goalposts. A major problematic implication of that suggestion mirrors other aspects of the questionable psychology research practices I mentioned before: when researchers find the conclusion they’re looking for, they stop looking. They only collect data up until the point it is useful, which rigs the system in favor of finding positive results where there are none. That could well mean, then, that there will be negative consequences to these diversity policies which are not being considered.

What I think is a good example of this justification problem leading to shoddy research practices/interpretation follows shortly thereafter. In talking about some of the alternative benefits that more female hires might have, Eagly (2016) notes that women tend to be more compassionate and egalitarian than men; as such, hiring more women should be expected to increase less-considered benefits, such as a reduction in the laying-off of employees during economic downturns (referred to as labor hoarding), or more favorable policies towards time off for family care. Now something like this should be expected: if you have different people making the decisions, different decisions will be made. Setting aside for the moment the question of whether those different policies are better in some objective sense of the word, if one is interested in encouraging those outcomes (that is, they’re preferred by the advocate), then one might wish to address those issues directly, rather than by proxy. That is to say, if you are looking to make the leadership of some company more compassionate, it makes sense to test for and hire more compassionate people, rather than to hire more women under the assumption that you will thereby be increasing compassion.

This is an important matter because people are not perfect statistical representations of the groups to which they belong. On average, women may be more compassionate than men; the type of woman who is interested in actively pursuing a CEO position in a Fortune 500 company might not be as compassionate as your average woman, however, and, in fact, might even be less compassionate than a particular male candidate. What Eagly (2016) has ended up reaching, then, is not a justification for hiring more women; it’s a justification for hiring compassionate or egalitarian people. What is conspicuously absent from this section is a call for more research to be conducted on contexts in which men might be more compassionate than women; once the conclusion that hiring women is a good thing has been justified (in the advocate’s mind, anyway), the concerns for more information seem to sputter out. It should go without saying, but such a course of action wouldn’t be expected to lead to the most accurate scientific understanding of our world.

The solution to that problem being more diversity, of course…

To place this point in another quick example, if you’re looking to assemble a group of tall people, it would be better to use people’s height when making that decision rather than their sex, even if men do tend to be taller than women. Some advocates might suggest that being male is a good enough proxy for height, so you should favor male candidates; others would suggest that you shouldn’t be trying to assemble a group of tall people in the first place, as short people offer benefits that tall ones don’t; others still will argue that it doesn’t matter if short people don’t offer benefits, as they should be preferentially selected to combat negative attitudes towards the short regardless (at the expense of selecting tall candidates). For what it’s worth, I find the attitude of “keep doing research until you justify your predetermined conclusion” to be unproductive and indicative of why the relationship between advocates and researchers ought not be a close one. Advocacy can only serve as a cognitive constraint that decreases research quality, as the goal of advocacy is decidedly not truth. Advocates should update their conclusions in light of the research; not vice versa.
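The height example at the start of this paragraph can be made concrete with a short simulation (all parameters invented): selecting on the proxy (sex) yields a shorter team than selecting on the trait you actually care about (height), even though the average group difference is real.

```python
import numpy as np

rng = np.random.default_rng(5)
# Rough, illustrative height distributions in inches; the group means
# differ, but the within-group spread is large.
men = rng.normal(70, 3, 10000)
women = rng.normal(64.5, 3, 10000)
everyone = np.concatenate([men, women])

team_size = 100
# Strategy 1: use sex as a proxy for height -- take random men.
proxy_team = rng.choice(men, team_size, replace=False)
# Strategy 2: select directly on the trait from the whole pool.
direct_team = np.sort(everyone)[-team_size:]

print(f"Proxy-selected team mean height:    {proxy_team.mean():.1f} in")
print(f"Directly selected team mean height: {direct_team.mean():.1f} in")
# The proxy team averages about 70 inches (it includes plenty of short men);
# selecting on height itself yields a much taller team, and would admit any
# woman tall enough to qualify, which the proxy rule forecloses.
```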

References: Eagly, A. (2016). When passionate advocates meet research on diversity, does the honest broker stand a chance? Journal of Social Issues, 72, 199-222.

Men Are Better At Selling Things On eBay

When it comes to gender politics, never take the title of the piece at face value – or the conclusions, for that matter.

In my last post, I mentioned how I find some phrases and topics act as red flags regarding the quality of research one is liable to encounter. Today, the topic is gender equality – specifically some perceived (and, indeed, rather peculiar) discrimination against women – an area not renowned for its clear thinking or reasonable conclusions. As usual, the news articles covering this piece of research made some outlandish claim that lacks even remote face validity. In this case, the research in question concludes that people, collectively, try to figure out the gender of the people selling things on eBay so as to pay women substantially less than men for similar goods. Those who found such a conclusion agreeable to their personal biases spread it to others across social media as yet another example of how the world is an evil, unfair place. So here I am again, taking a couple of recreational shots at some nonsense story of sexism.

Just two more of these posts and I get a free smoothie

The piece in question today is an article from Kricheli-Katz & Regev (2016) that examined data from about 1.1 million eBay auctions. The stated goals of the authors involve examining gender inequality in online product markets, so at least we can be sure they’re going into this without an agenda. Kricheli-Katz & Regev (2016) open their piece by talking about how gender inequality is a big problem, launching their discussion almost immediately with a rehashing of that misleading 20% pay gap statistic that’s been floating around forever. As that claim has been dissected so many times at this point, there’s not much more to say about it other than (a) when controlling for important factors, it drops to single digits, and (b) when you see it, it’s time to buckle in for what will surely be an unpleasant ideological experience. Thankfully, the paper does not disappoint in that regard, promptly suggesting that women are discriminated against in online markets like eBay.

So let’s start by considering what the authors did and what they found. First, Kricheli-Katz & Regev (2016) present us with their analysis of eBay data. They restricted their research to auctions only, where sellers post an item and any subsequent interaction occurs between bidders alone, rather than between bidders and sellers. On average, they found that the women had about 10 fewer months of experience than the men, though the accounts of both sexes had existed for over nine years, and women also had very slightly better reputations, as measured by customer feedback. Women also tended to set slightly higher initial prices than men for their auctions, controlling for the product being sold. As such, women also tended to receive slightly fewer bids on their items, and ultimately less money per sale.

However, when the interaction between sex and product type (new or used) was examined, the headline-grabbing result appeared: while women netted a mere 3% less on average than men for used products, they netted a more impressive 20% less for new products (where, naturally, one expects the products to be identical). Kricheli-Katz & Regev (2016) claim that the discrepancy in the new-product case is due to beliefs about gender. Whatever these unspecified beliefs are, they cause people to pay women about 20% less for the same item. Taking that idea at face value for a moment, why does that gap all but evaporate in the used category of sales? The authors attribute that lack of a real difference to an increased trust people have in women’s descriptions of the condition of their products. So buyers trust women more when it comes to used goods, but pay them less for new ones, when trust is less relevant. Both these conclusions, as far as I can see from the paper, have been pulled directly out of thin air. There is literally no evidence presented to support them: no data, no citations, no anything.

I might have found the source of their interpretations

By this point, anyone familiar with how eBay works is likely a bit confused. After all, the sex of the seller is at no point readily apparent in almost any listing. Without that crucial piece of information, people would have a very difficult time discriminating on the basis of it. Never fear, though: Kricheli-Katz & Regev (2016) report the results of a second study in which they pulled 100 random sellers from their sample and asked about 400 participants to try to determine the sex of the sellers in question. Each participant offered guesses about five profiles, for a total of 2,000 attempts. About 55% of the time, participants got the sex right; 9% of the time they got it wrong; and the remaining 36% of the time, they said they didn’t know (which, since they don’t know, also means they failed to identify it). In short, people couldn’t reliably determine the seller’s sex about half the time. The authors do mention that the guesses got better as participants viewed more items the seller had posted, however.

So here’s the story they’re trying to sell: when people log onto eBay, they seek out a product they’re looking to buy. When they find a seller listing the product, they examine the seller’s username, the listing in question, and the other listings in the seller’s store to attempt to discern the seller’s sex. Buyers subsequently lower their willingness to pay for an item by quite a bit if they see it is being sold by a woman, but only if it’s new. In fact, since women made 20% less, the actual reduction in willingness to pay must be larger than that, as sex can only be determined reliably about half of the time even when people are trying. Buyers do all this despite trusting female sellers more. Also, I do want to emphasize the word they, as this would need to be a fairly collective action: if it weren’t a near-universal response among buyers, the prices of female-sold items would eventually even out with the male prices, as those who discriminated less against women would be drawn towards the cheaper listings and bid them back up.
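That last point is just standard auction logic, and a toy simulation illustrates it (all parameters hypothetical): as long as a decent share of bidders don’t discriminate, competition among them sets the closing price, and the realized gap shrinks far below the discount any individual discriminator applies.

```python
import numpy as np

rng = np.random.default_rng(6)

def auction_price(n_bidders, frac_discriminators, discount=0.20):
    """Second-price auction: sale price is the second-highest willingness to pay."""
    valuations = rng.uniform(80, 120, n_bidders)      # private values for the item
    is_discriminator = rng.random(n_bidders) < frac_discriminators
    valuations[is_discriminator] *= (1 - discount)    # some buyers pay women less
    return np.sort(valuations)[-2]

male_prices = [auction_price(10, 0.0) for _ in range(2000)]    # no one discounts men
female_prices = [auction_price(10, 0.5) for _ in range(2000)]  # half discount women

gap = 1 - np.mean(female_prices) / np.mean(male_prices)
print(f"Average price gap: {gap:.1%}")
# Even with half the bidders applying a full 20% discount, the realized gap
# is far smaller, because non-discriminating bidders set the closing price.
```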

Not only do I not buy this story – not even a little – but I wouldn’t pay the authors less for it because they happen to be women if I were looking to make a purchase. While people might be able to determine the sex of a seller on eBay sometimes, when they’re specifically asked to do so, that does not mean people engage in this sort of behavior naturally.

Finally, Kricheli-Katz & Regev (2016) report the results of a third study, asking 100 participants how much they would value a $100 gift card being sold by either an Alison or a Brad. Sure enough, people were willing to pay Alison less for the card: she got a mere $83 to Brad’s $87; about a 5% difference. I’d say someone should call the presses, but it looks like they already did, judging from the coverage this piece has received. Now this looks like discrimination – because it is – but I don’t think it’s based on sex per se. I say that because, earlier in the paper, Kricheli-Katz & Regev (2016) also report that women, as buyers on eBay, tended to pay about 3% more than men for comparable goods. To the extent that the $4 difference in valuation is meaningful here, there are two things to say about it. First, it may well reflect the fact that women aren’t as willing to negotiate prices in their favor. Indeed, while women were 23% of the sellers on eBay, they represented only 16% of the auctions with a negotiation component. If that’s the case, people are likely willing to pay less to women because they perceive (correctly) some population differences in their ability to get a good deal. I suspect that if you gave buyers individuating information about the seller’s abilities, sex would stop mattering even that 5%. Second, that slight 5% difference would by no means account for the 20% gap the authors report with respect to new-product sales; not even close.

But maybe your next big idea will work out better…

Instead, my guess is that, in spite of the authors’ describing the men and women in their seller sample as “equally qualified,” there were some important differences in the listings that buyers noticed; the type of differences you can’t account for when you’re looking at over a million listings and your control measures are rough. Kricheli-Katz & Regev (2016) never seemed to consider – and I mean really consider – the possibility that something about these listings, something they didn’t control for, might have been driving the differences in sale price. While they do control for factors like the seller’s reputation, experience, number of pictures, year of the sale, and some of the sentiments expressed by words in the listing (how positive or negative it is), there’s more to making a good listing than that. A more likely story is that differences in sale prices reflect different behaviors on the part of male and female sellers (as we already know other differences exist in the sample), as the alternative story being championed would require a level of obsession with gender-based discrimination in the population so wide and deep that we wouldn’t need to research it; it would be plainly obvious to everyone already.

Then again, perhaps it’s time I make my way over to eBay to pick up a new tinfoil hat.

References: Kricheli-Katz, T. & Regev, T. (2016). How many cents on the dollar? Women and men in product markets. Science Advances, 2, DOI: 10.1126/sciadv.1500599

Sexism: One More Time With Feeling

For whatever reason, a lot of sexism-related pieces have been crossing my desk lately. It’s not that I particularly mind; writing about these papers is quite engaging, and many people – no matter which side of the issue they tend to find themselves falling on – seem to share a similar perspective when it comes to reading about them (known more colloquially as the Howard Stern Effect). Now, as I’ve said on several of the occasions I’ve written about them, the interpretations of the research on sexism – or sometimes the research itself – feel rather weak. The main reason I’ve found this research so wanting centers on the rather transparent and socially-relevant persuasive messages that reside in such papers: when people have some vested interest in the outcome of the research – perhaps because it might lend legitimacy to their causes, or because it paints a socially-flattering picture of their group – this opens the door for research designs and interpretations of data that can get rather selective. Basically, I have a difficult time trusting that truth will fall out of sexism research for the same reason I wouldn’t take a drug company’s report about the safety of their product at face value; there’s just too much on the line socially not to be skeptical.

“50% of the time it worked 100% of the time. Most of the rats didn’t even die!”

Up for consideration today is a paper examining how men and women perceive the quality of sexism research, contingent on its results (Handley et al, 2015). Before getting into the meat of this paper, I want to quote a passage from its introduction, both to applaud the brilliant tactical move the authors make and to give you a sense for why I experience a certain degree of distrust concerning sexism research. When discussing how some previous research published by one of the authors was greeted with skepticism predominantly by men – at least according to an informal analysis of online comments replying to coverage of it – the authors have this to say:

“…men might find the results reported by Moss-Racusin et al. threatening, because remedying the gender bias in STEM fields could translate into favoring women over men, especially if one takes a zero-sum-gain perspective. Therefore, relative to women, men may devalue such evidence in an unintentional implicit effort to retain their status as the majority group in STEM fields.”

This is just a fantastic passage for a few reasons. First, it subtly affirms the truth of the previous research; after all, if there were no real gender bias, there would be nothing in need of remedying, so the finding must therefore reflect reality. Second, the passage provides a natural defense against future criticism of their work: anyone who questions the soundness of their research, or their interpretation of the results, is probably just biased against seeing the plainly-obvious truth they have stumbled upon, because they’re male and trying to maintain their status in the world. For context, it’s worth noting that I have touched upon the piece in question before, writing, “Off the top of my head, I see nothing glaringly wrong with this study, so I’m fine with accepting the results…“. While I think the study in question seemed fine, I nevertheless questioned how well its results mesh with other findings (I happen to think there are some inconsistencies that would require a rather strange kind of discrimination to be at play in the real world), and I was not overly taken with their interpretation of what they found.

With that context in mind, the three studies in the paper followed the same general method: an abstract of some research was provided to men and women (the first two studies used an abstract from one of the authors; the third used a different one). The subjects were asked to evaluate, on a 1-6 scale, whether they agreed with the authors’ interpretation of the results, whether the research was important, whether the abstract was well written, and what their overall evaluation of the research was. These scores were then averaged into a single measure for each subject. In the third experiment, the abstract itself was modified to suggest either that a bias favoring men and disfavoring women in STEM fields was uncovered by the research, or that no bias was found (why no condition existed in which the bias favored women I can’t say, but I think it would have been a nice addition to the paper). Just as with the previous paper, I see nothing glaringly wrong with their methods (beyond that omission), so let’s consider the results.

The first sample comprised 205 Mturk participants, and found that men were somewhat less favorable towards the research that found evidence of sexism in STEM fields (M = 4.25) relative to women (M = 4.66). The second sample was made up of 205 academics from an unnamed research university, and the same pattern was observed: overall, male faculty assessed the research somewhat less favorably (M = 4.21) than female faculty (M = 4.65). However, an important interaction emerged: the difference in this second sample was due to male-female differences within STEM fields. Male STEM faculty were substantially less positive about the study (M = 4.02) than their female counterparts (M = 4.80); non-STEM faculty did not differ in this respect, both falling right in between those two points (Ms = 4.55). Now, it is worth mentioning that the difference between the STEM and non-STEM male faculty was statistically significant, while the difference between the female STEM and non-STEM faculty was not. Handley et al (2015) infer from that result that “…men in STEM displayed harsher judgments of Moss-Racusin et al.’s research, not that women in STEM exhibited more positive evaluations of it“. This is where I’m going to be sexist and disagree with the authors’ interpretation, as I feel it’s also worth noting that the sample of male STEM faculty (n = 66) was almost twice as large as the female sample (n = 38), which likely contributed to that asymmetry in statistical significance. Descriptively speaking, STEM men were less accepting of the research and STEM women were more accepting of it, relative to the academics for whom this finding would be less immediately relevant.
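To illustrate why unequal sample sizes contribute to that kind of asymmetry, here’s a deliberately simplified one-sample comparison against the non-STEM mean, using an assumed standard deviation (the paper’s actual SDs aren’t reproduced here, so treat this as illustration only):

```python
import math

def one_sample_t(mean, mu, sd, n):
    """t-statistic for a group mean against a reference value."""
    return (mean - mu) / (sd / math.sqrt(n))

sd_assumed = 1.0   # hypothetical; not taken from the paper
reference = 4.55   # the non-STEM faculty mean

print(f"STEM men   (n=66): t = {one_sample_t(4.02, reference, sd_assumed, 66):.2f}")
print(f"STEM women (n=38): t = {one_sample_t(4.80, reference, sd_assumed, 38):.2f}")
# Deviations of -0.53 and +0.25 with these ns give t-values of roughly -4.3
# and 1.5: the men's larger gap AND larger sample make their difference far
# easier to call "significant", even if both groups shifted from baseline.
```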

“The interpretation of this research determines who deserves a raise, so please be honest.”

The third experiment – the one that modified the abstract to report either sexism against women or no sexism – used another Mturk sample of 303 people, rather than faculty. The same basic pattern was found here: when the research reported a bias against women, men were less favorable towards it (M = 3.65) than if it found no bias (M = 3.83); women showed the opposite pattern (Ms = 3.86 and 3.59, respectively). So – taken together – there’s some neat evidence here that the relevance of a research finding affects how that finding is perceived. Those who have something to gain by the research finding sexism (women, particularly those in STEM) tended to be slightly more favorable towards research that found it, whereas those who had something to lose (men, particularly those in STEM) tended to be slightly unfavorable towards research finding sexism. This isn’t exactly new – research on the idea dates back at least two decades – but it fits well with what we know about how motivated reasoning works.

I want to give credit where credit is due: Handley et al (2015) do write that they cannot conclude that one gender is more biased than the other; just that gender appears to – sometimes – bias how sexism research is perceived to some degree. Now that tentative conclusion would be all well and good were it a consistent theme throughout their paper. However, the examples raised in the write-up universally center on how men might find findings of sexism threatening and how women are known to be disadvantaged by it; not on how women might be strategically inclined towards such research because it suits their goals (as, to remedy anti-female bias, female-benefiting plans may well have to be enacted). Even a quick reading of the paper should demonstrate that the authors are clearly of the view that sexism is a rather large problem for STEM fields, writing about how female participation needs to be increased and encouraged. That would seem to imply that anyone who denies the importance of the research reporting sexism is the one with the problematic bias, and that is a much less tentative way to think about the results. In the spirit of furthering their own interests, the authors also note how these biases could be a real problem for people publishing sexism research, as many of the people reviewing research articles are likely to be men and, accordingly, not necessarily inclined towards it (which makes it harder for such researchers to publish in good journals and get tenure).

Handley et al’s (2015) review of the literature also comes off as rather one-sided, never explicitly discussing other findings that run counter to the idea that women experience a constant stream of sexist discrimination in academia (like this finding: qualified women are almost universally preferred to qualified men by hiring committees, often by a large margin). Funnily enough, the authors transition from writing about how the evidence of sexism against women in STEM is “mounting” in the introduction to how the evidence is “copious” by the discussion. This one-sided treatment can be seen again around the very end of their discussion (in the “limitations and future directions” section), when Handley et al (2015) note that they failed to find an effect they were looking for: abstracts ostensibly written by women were not rated any differently than abstracts presented as being written by men (they had hoped to find the female abstracts rated as lower quality). For whatever reason, however, they neglected to report this failure in their results section, where it belonged; indeed, they failed to mention in the main paper at all that this was a prediction they were making, even though it was clearly something they were looking to find (else why would they include that factor and analyze the data in the first place?). Not mentioning upfront a prediction that didn’t work out strikes me as somewhat less than honest.

“Yeah; I probably should have mentioned I was drunk before right now. Oops”

Taking these results at face value, we can say that people who are motivated to interpret results in a particular way are going to be less than objective about that work, relative to someone with less to gain or lose. With that in mind, I would be inherently skeptical of the way sexist biases are presented in the literature more broadly and how they’re discussed in the current paper: the authors clearly have a vested interest in their research uncovering particular patterns of sexism, and in their interpretations of their data being accepted by the general and academic populations. That doesn’t make them unique (you could describe almost all academic researchers that way), nor does it make their results incorrect, but it does make their presentation of these impactful issues seem painfully one-sided. This is especially concerning because these are matters which many feel carry important social implications. Bear in mind, I am not taking issue with the methods or the data presented in the current paper – those seem fine – what I take issue with is their interpretation and presentation. Then again, perhaps these only seem like issues to me because I’m a male STEM major…

References: Handley, I., Brown, E., Moss-Racusin, C., & Smith, J. (2015). Quality of evidence revealing subtle gender biases in science is in the eye of the beholder. Proceedings of the National Academy of Sciences, 112, 13201-13206.

The Very Strange World Of Sexism Research

Just from reading that title, many of you are likely already experiencing a host of emotions concerning the topic of sexism. It’s one of those topics that lights more than the usual number of metaphorical fires under people’s metaphorical asses, as well it should: it’s one of the labels tethered to people’s value as associates in the social world. Being branded a sexist is bad for business, socially, professionally, and otherwise. Conversely, being able to label others as sexist can be helpful for achieving your social goals (as others might acquiesce to your demands to avoid the label), whereas being thought of as someone who throws around the label inappropriately can lead to condemnation of its own. Because there is so much on the line socially when it comes to sexism, the topic tends to be one that migrates away from the realm of truth to the realm of persuasion; a place where truth might or might not be present, but is beside the point anyway. It also yields some truly strange papers with some even stranger claims.

“I’d like to introduce you to my co-authors…”

Some of these strange claims – such as the Ambivalent Sexism Inventory’s (ASI) interpretations of sexism – I’ve written about before. Specifically, I found it to be a rather odd scale for assessing sexism; perhaps more suited for assessing whether someone is likely to identify as a feminist (which, to head off any comments to the contrary, is not the same thing). For instance, one question on the ASI concerns whether “most women interpret innocent remarks or acts as being sexist”, which is a handy means of building into your scale a way of denigrating people who think the scale misinterprets certain remarks or acts as indicating sexism. While it’s open to question whether the scale measures what it claims to measure, it’s also an open question as to how well the answers to the inventory relate to actual sexist behaviors. Luckily, the study I wanted to discuss today sought to examine just that very thing, which is a happy little coincidence. Unfortunately, just as the ASI’s assessment of sexist attitudes is open to question, the paper’s classification of sexist behavior is rather open to interpretation as well, as I will soon discuss. Also unfortunately, the study sought to develop an implicit association task (IAT) to measure these sexism scores, and my thoughts on IATs have historically been less than positive.

The paper in question (de Oliveira Laux, Ksenofontov, & Becker, 2015) begins with a discussion of two types of sexism (against women) assessed by the ASI: benevolent and hostile sexism. The former refers to attitudes which hold women in high regard and to the prospect that men ought to behave altruistically towards them; the latter refers largely to attitudes concerning whether women seek social advantages by overstating complaints and making unreasonable demands. At least that’s my interpretation of what the inventory is measuring when looking at the questions it asks; if you asked the authors of the current paper, they would tell you that the hostile sexism inventory measures “antipathy towards non-traditional women who are perceived as challenging male power and as posing a threat for men” and that the benevolent sexism inventory measures a “subjectively positive but patronizing view of women who conform to traditional roles“. These definitions will be important later, so keep them in mind.

In either case, the researchers wondered whether people’s explicit responses to these questions might be hiding their true levels of sexism, as hostile sexism is socially condemned. Accordingly, their first goal was to try to create an IAT that measured implicit hostile and benevolent sexism. They sought to develop this implicit measure despite their (surely a priori) expectation that it would be less predictive of sexist behavior than the explicit measures, which is one of those stranger aspects of this research I mentioned before: they were seeking to create an implicit measure that does worse at predicting behavior than existing, explicit ones. Undeterred by that expectation, the researchers recruited 126 males to take their sexism IATs and fill out the ASI. The benevolent sexism IAT portion had participants view 10 comics in which either the man or the woman was taking the active role. More precisely, a man/woman was either: (1) protecting the other with a gun, (2) proposing, (3) carrying their spouse through a door, post-marriage, (4) protecting the other with what looks like a stick, or (5) putting a coat on the other. The hostile sexism portion had words – not pictures – referring to “traditional” women (housewife/mother) or “non-traditional” women (feminist/women’s rights activists). Participants were supposed to sort these pictures/words into pleasant and unpleasant categories, I think; the section concerning the methods is less than specific about what the instructions behind the task were.

“Precise reporting is a tool of patriarchy”

Now the study already has a problem here, in that it’s unclear what precisely participants are responding to when they see the pictures in the benevolent IAT: might they find the active woman or the man cowering behind her the unpleasant part of the picture they’re categorizing? That concern aside, there were indeed correlations between the IATs and their explicit counterpart measures on the ASI: those who were higher in benevolent sexism were quicker to pair women in the protector role with negative words, and those higher in hostile sexism were quicker to pair feminism with negative words. Sure; both of these correlations were about r = .2, but they were not statistically zero. Further, the IAT measures of benevolent and hostile sexism did not correlate with each other (r = -.12), even though the explicit measures on the ASI did (r = .54). Naturally, the authors interpreted this as providing “strong support” for the validity of these IAT measures.
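For the curious, here is roughly what “not statistically zero” cashes out to at this sample size. This is just the textbook t-test for a correlation coefficient; the only inputs from the study are r = .2 and n = 126.

```python
# Significance of r = .2 with n = 126, via the standard t transformation
# t = r * sqrt(n - 2) / sqrt(1 - r^2), with df = n - 2.
import math
from scipy import stats

r, n = 0.2, 126
t = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)
p = 2 * stats.t.sf(abs(t), df=n - 2)            # two-tailed p-value
print(f"t = {t:.2f}, p = {p:.3f}")              # roughly t = 2.27, p = .02
print(f"variance explained: r^2 = {r**2:.2f}")  # a mere 4%
```

So the correlations clear the conventional significance bar while leaving about 96% of the variance unexplained, which is worth keeping in mind when evaluating claims of “strong support”.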

As a quick aside, I find this method a bit peculiar. The authors believe that hostile sexism might be consciously suppressed, meaning that the explicit measures of it might not be particularly good at measuring people’s actual attitudes. However, they’re trying to validate their implicit measures by correlating them with the explicit ones which they just suggested might not be accurate reflections of attitudes. That makes things rather difficult to interpret if you want to know which measure – explicit or implicit – taps into the construct better. Moving on…

In the second phase of the study, 83 of the original participants were brought back to assess their sexist behavior. What kind of behaviors were being assessed as sexist? Funny you should ask: in the benevolent sexism condition, participants were paired with a female confederate and asked to do a bit of role playing across three scenarios. During these role playing scenarios, the participants could choose between a pre-selected “sexist” action (like paying for the meal on their anniversary, expressing concern over their sister’s safety were she to take an internship counseling rapists, or asking their female partner to create a shopping list for baking a cake while allocating himself the job of creating a shopping list for heavy tools) or non-sexist ones (like simply expressing a concern that his sister would be disappointed by the rapist-counseling internship; not that she might be endangered by it, as that would be sexist).

Assessing the hostile sexist behaviors involved pairing the men with other male confederates. The job of this male-male pair was to review and recommend jokes. Each was given 9 cards that contained either a sexist joke and a neutral one, or two neutral ones. They were asked to take turns choosing which joke they liked more and to indicate whether they would recommend it to others. If both agreed it should be recommended, it would be passed on to the next group completing the task. Here’s an example of a neutral joke:

“Who invented the Triathlon? – The Polish. They walk to the swimming pool, swim one round and return home on a bike.”

If you can make sense of it, please let me know in the comments, because I certainly can’t parse what’s supposed to be funny about it, or even what it’s supposed to mean. We can also consider the example of a joke tapping hostile sexism:

“Why does a woman have one brain cell more than a horse? So that she doesn’t drink from the bucket while washing the stairs.”

While that joke does indeed sound mean, I have some reservations as to whether it counts as hostile sexism the way the authors define it: as an antipathy towards non-traditional women who challenge male power structures. In that joke, the woman is not engaged in a non-traditional task, nor is she challenging male power, as far as I can tell. While the joke might correspond to what people think when they hear the words “hostile sexism” – i.e., being mean to women because of their sex – it does not correspond well to the definition the authors use. It seems there are better examples of jokes that reflect the hostile sexism the authors hope to tap into (though these jokes no doubt tap many other things as well).

Like this one, for instance.

Skipping over one other role-playing task in the interest of length, the final part of the hostile sexist behavior assessment examined one last sexist behavior: whether the participant would sign a petition for a men’s rights organization that the male confederate showed him. Signing the petition was counted as a sexist behavior, while not signing was counted as non-sexist. Take from that what you will.

As for the results of this second portion, the participants’ behavioral sexism scores did not correlate with their IAT measures of benevolent sexism at all, whether that behavior was supposed to count as benevolent or hostile. The IAT measure of hostile sexism did, for whatever reason, correlate with both benevolent and hostile behaviors, but correlated more strongly with the benevolent ones (rs = .33 and .21, respectively), which, as far as I can tell, was not predicted. Perhaps the evidence in favor of the validity of these IAT measures was not quite as strong as the authors had claimed earlier. Also, as apparently expected, the implicit measures correlated less well with behavior than the explicit measures in all cases anyway (the correlations between explicit answers and behavior were both about .6), making one wonder why they were developed.

Interpreting these results generously, we might conclude that explicit attitudes predict behaviors – a finding that many would not consider particularly unique – and that implicit associations predict behaviors less well or not at all. Interpreting these results less charitably, we might conclude that we don’t really learn much about sexism or attitudes, but learn instead that the authors likely identify as feminists and, perhaps, feel that those who disagree with them ought to be labeled as sexists, as they’re willing to stretch the definition of sexism far beyond its normal meaning while only studying the behavior of men. If you lean towards that second interpretation, however, it probably means you’re sexist.

References: de Oliveira Laux, S., Ksenofontov, I., & Becker, J. (2015). Explicit but not implicit sexist beliefs predict benevolent and hostile sexist behavior. European Journal of Social Psychology, 45, 702-715.

Tilting At The Windmills Of Stereotype Threat

If I had the power to reach inside your mind and affect your behavior, this would be quite the adaptive skill for me. Imagine being able to effortlessly make your direct competitors less effective than you, those who you find appealing more interested in associating with you, and, perhaps, even reaching inside your own mind, improving your performance to levels you couldn’t previously reach. While it would be good for me to possess these powers, it would be decidedly worse for other people if I did. Why? Simply put, because my adaptive best interests and theirs do not overlap 100%. Improving my standing in the evolutionary race will often come at their expense, and being able to manipulate them effectively would do just that. This means that they would be better off if they possessed the capacity to resist my fictitious mind-control powers. To bring this idea back down to reality, we could consider the relationship between parasites and hosts: parasites often make their living at their host’s expense, and the hosts, in turn, evolve defense mechanisms – like immune systems – to fight off the parasites.

Now with 10% more Autism!

This might seem rather straightforward: avoiding manipulative exploitation is a valuable skill. However, the same kind of magical thinking present in the former paragraph seems to be present in psychological research from time to time; the line of reasoning that goes, “people have this ability to reach into the minds of others and change their behavior to suit their own ends”. Admittedly, the reasoning is a lot more subtle and requires some digging to pick up on, as very few psychologists would ever say that humans possess such magical powers (with Daryl Bem being one notable exception). Instead, the line of thinking seems to go something like this: if I hold certain beliefs about you, you will begin to conform to those beliefs; indeed, even if such beliefs exist in your culture more generally, you will bend your behavior to meet them. If I happen to believe you’re smart, for example, you will become smarter; if I happen to believe you are a warm, friendly person, you will become warmer. This, of course, is expected to work in the opposite direction as well: if I believe you’re stupid, you will subsequently get dumber; if I believe you’re hostile, you will in turn become more hostile. This is a bit of an oversimplification, perhaps, but it captures the heart of these ideas well.

The problem with this line of thinking is precisely the same as the problem I outlined initially: there is a less than perfect (often far less than perfect) overlap between the reproductive best interests of the believers and the targets. If I allowed your beliefs about me to influence my behavior, I could be pushed and pulled in all sorts of directions I would rather not go in. Those who would rather not see me succeed could believe that I will fail, which would, generally, have negative implications for my future prospects (unless, of course, other people could fight that belief by believing I would succeed, leading to an exciting psychic battle). It would be better for me if I ignored their beliefs and simply proceeded forward on my own. In light of that, it would be rather strange to expect that humans possess cognitive mechanisms which use the beliefs of others as inputs for deciding our own behavior in a conformist fashion. Not only are the beliefs of others hard to accurately assess directly, but conforming to them is not always a wise idea even if they’re inferred correctly.

This hasn’t stopped some psychologists from suggesting that we do basically that, however. One such line of research that I wanted to discuss today is known as “stereotype threat”. Pulling a quick definition from reducingstereotypethreat.org: “Stereotype threat refers to being at risk of confirming, as self-characteristic, a negative stereotype about one’s group”. From the numerous examples they list, a typical research paradigm involves some variant of the following: (1) get two groups together to take a test, where (2) the groups happen to differ with respect to cultural stereotypes about who will do well, and then (3) make their group membership salient in some way. The expected result is that the group on the negative end of the stereotype will perform worse when they’re aware of their group membership. To turn that into an easy example, men are believed to be better at math than women, so if you remind women about their gender prior to a math test, they ought to do worse than women not so reminded. The stereotype of women doing poorly on math actually makes women perform worse.

The psychological equivalent of getting Nancy Kerrigan’d

In the interests of understanding more about stereotype threat – specifically, its developmental trajectory with regard to how children of different ages might be vulnerable to it – Ganley et al (2013) ran three stereotype threat experiments with 931 male and female students, ranging from 4th to 12th grade. In their introduction, Ganley et al (2013) noted that some researchers regularly talk about the conditions under which stereotype threat is likely to have its negative impact: perhaps on hard questions, relative to easy ones; on math-identified girls but not non-identified ones; on those in mixed-sex groups but not single-sex groups, and so on. While some psychological phenomena are indeed contextually specific, one could also view all that talk of the rather specific contexts required for stereotype threat to obtain as a post-hoc justification for some sketchy data analysis (didn’t find the result you wanted? Try breaking the data into different groups until you do find it). Nevertheless, Ganley et al (2013) set up their experiments with these ideas in mind, doing their best to find the effect: they selected high-performing boys and girls who scored above the mid-point of math identification, used evaluative testing scenarios, and used difficult math questions.

Ganley et al (2013) even used some rather explicit stereotype threat inductions: rather than just asking students to check off their gender (or not do so), their stereotype-threat conditions often outright told the participants who were about to take the test that boys outperform girls. It doesn’t get much more threatening than that. Their first study had 212 middle school students who were told either that boys showed more brain activation associated with math ability and, accordingly, performed better than girls, or that both sexes performed equally well. In this first experiment, there was no effect of condition: the girls who were told that boys do better on math tests did not under-perform, relative to the girls who were told that both sexes do equally well. In fact, the data went in the opposite direction, with girls in the stereotype threat condition performing slightly, though not significantly, better. Their next experiment had 224 seventh-graders and 117 eighth-graders. In the stereotype threat condition, students were asked to indicate their gender on a test before they began it, ostensibly because boys tended to outperform girls on these measures (this wasn’t mentioned in the control condition). Again, the results found no stereotype threat at either grade and, again, the data went in the opposite direction, with stereotype threat groups performing better.

Finally, their third study contained 68 fourth-graders, 105 eighth-graders, and 145 twelfth-graders. In the stereotype threat condition, students first solved an easy math problem concerning many more boys being on the math team than girls before taking their test (the control condition’s problem did not contain the sex manipulation). They also tried to make the test seem more evaluative in the stereotype threat condition (referring to it as a “test”, rather than “some problems”). Yet again, no stereotype threat effects emerged at any grade level, with two of the three means going in the wrong direction. No matter how they sliced it, no stereotype threat effects fell out; their data wasn’t even consistently in the direction of stereotype threat being a negative thing. Ganley et al (2013) took their analysis just a little further in the discussion section, noting that published studies of such effects found some significant effect 80% of the time. However, these effects were also reported among other, non-significant findings; in other words, they were likely found after cutting the data up in different ways. By contrast, the three unpublished dissertations on stereotype threat all found nothing, suggesting that both data cheating and publication bias were likely at work in the literature (and they’re not the only ones).
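To see why that “slice the data until something turns up” worry matters, consider a toy simulation with entirely hypothetical numbers (this is not the Ganley et al data): even when no true effect exists anywhere, testing several subgroups of the same sample gives good odds of finding at least one “significant” gender gap.

```python
# False-positive inflation from subgroup fishing under a true null effect.
# All parameters below are invented for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_per_group, n_subgroups, trials = 50, 6, 5000

fished = 0
for _ in range(trials):
    # Boys' and girls' scores drawn from the SAME distribution: no real effect
    boys = rng.normal(0, 1, (n_subgroups, n_per_group))
    girls = rng.normal(0, 1, (n_subgroups, n_per_group))
    pvals = [stats.ttest_ind(b, g).pvalue for b, g in zip(boys, girls)]
    if min(pvals) < .05:                # report the best-looking slice
        fished += 1

print(f"Studies with at least one 'significant' subgroup: {fished/trials:.2f}")
# With 6 independent looks, expect roughly 1 - 0.95**6, or about 0.26 --
# over five times the nominal 5% false-positive rate.
```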

“Gone fishing for P-values”

The current findings appear to build upon the trend of the frequently non-replicable nature of psychological research. More importantly, however, the type of thinking that inspired this research doesn’t seem to make much sense in the first place, though that part doesn’t seem to be discussed at all. There are good reasons not to let the beliefs of others affect your performance; an argument needs to be made as to why we would be sensitive to such things, especially when they’re hypothesized to make us worse, and no such argument is present. To make that point crystal clear, try to apply stereotype threat thinking to any non-human species and see how plausible it sounds. By contrast, a real theory, like kin selection, applies with just as much force to humans as it does to other mammals, birds, insects, and even single-celled organisms. If there’s no solid (and plausible) adaptive reasoning in which to ground one’s work – as there isn’t with stereotype threat – it should come as no surprise that effects flicker in and out of existence.

References: Ganley, C., Mingle, L., Ryan, A., Ryan, K., Vasilyeva, M., & Perry, M. (2013). An examination of stereotype threat effects on girls’ mathematical performance. Developmental Psychology, 49, 1886-1897.

Examining The Performance-Gender Link In Video Games

Like many people around my age or younger, I’m a big fan of video games. I’ve been interested in these kinds of games for as long as I can remember, and they’ve been the most consistent form of entertainment in my life, often winning out over the company of other people and, occasionally, food. As I – or pretty much anyone who has spent time within the gaming community – can attest, the experience of playing these games with others can frequently lead to, shall we say, less-than-pleasant interactions with those who are upset by losses. Whether being derided for your own poor performance, good performance, good luck, or tactics of choice, negative comments are a frequent occurrence in the competitive online gaming environment. There are some people, however, who believe that simply being a woman in such environments yields a negative reception from a predominately-male community. Indeed, some evidence consistent with this possibility was recently published by Kasumovic & Kuznekoff (2015) but, as you will soon see, the picture of hostile behavior towards women that emerges is much more nuanced than it is often credited as being.

Aggression, video games, and gender relations; what more could you want to read about?

As an aside, it is worth mentioning that some topics – sexism being among them – tend to evade clear thinking because people have some kind of vested social interest in what they have to say about the association value of particular groups. If, for instance, people who play video games are perceived negatively, I would likely suffer socially by extension, since I enjoy video games myself (so there’s my bias). Accordingly, people might report or interpret evidence in ways that aren’t quite accurate so as to paint certain pictures. This issue seems to rear its head in the current paper on more than one occasion. For example, one claim made by Kasumovic & Kuznekoff (2015) is that “…men and women are equally likely to play competitive video games”. The citation for this claim is listed as “Essential facts about the computer and video game industry (2014)“. However, in that document, the word “competitive” does not appear at all, let alone a gender breakdown of competitive game play. Confusingly, the authors subsequently claim that competitive games are frequently dominated by males in terms of who plays them, directly contradicting the former idea. Another claim made by Kasumovic & Kuznekoff (2015) is that women are “more often depicted as damsels in distress”, though the paper they cite in support of that claim does not appear to contain any breakdown of women’s actual representation in video games as characters, instead measuring people’s perceptions of women’s representation in them. While such a claim may indeed be true – women may be depicted as in need of rescue more often than they’re depicted in other roles and/or relative to men’s depictions – it’s worth noting that the citation they use does not contain the data they imply it does.

Despite these inaccuracies, Kasumovic & Kuznekoff (2015) take a step in the right direction by considering how the reproductive benefits of competition have shaped male and female psychologies when approaching the women-in-competitive-video-games question. For men, one’s place in a dominance hierarchy was quite relevant for determining eventual reproductive success, leading to more overt strategies of social hierarchy navigation. These overt strategies include the development of larger, more muscular upper-bodies in men, suited for direct physical contests. By contrast, women’s reproductive fitness was often less affected by their status within the social hierarchy, especially with respect to direct physical competitions. As men and women begin to compete in the same venues where differences in physical strength no longer determine the winner – as is the case in online video games – this could lead to some unpleasant situations for particular men who have the most to lose by having their status threatened by female competition.

In the interests of being more explicit about why female involvement in typically male-style competitions might be a problem for some men, let’s employ some Bayesian reasoning. In terms of physical contests, larger men tend to dominate smaller ones; this is why most fighting sports are separated into different classes based on the weight of the combatants. So what are we to infer when a smaller fighter consistently beats a larger one? Though these aren’t mutually exclusive, we could infer either that the smaller fighter is very skilled or that the larger fighter is particularly unskilled. Indeed, if the larger fighter is losing both to people of his own weight class and of a weight class below him, the latter interpretation becomes more likely. It doesn’t take much of a jump to replace size with sex in this example: because men tend to be stronger than women, our Bayesian priors should lead us to expect that men will win in direct physical competition over women, on average. A man who performs poorly against both men and women in physical competition is going to suffer a major blow to his social status and reputation as a fighter.

It’ll be embarrassing for him to see that replayed five times from three angles.
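To make that updating concrete, here is a toy Bayesian sketch of the fighter example, using Bayes’ rule with entirely made-up probabilities: start with a prior that the larger fighter is unskilled, and update it after each loss to a smaller opponent.

```python
# Bayesian updating on "the larger fighter is unskilled".
# All probabilities below are invented for illustration.
prior_unskilled = 0.2        # initial belief the larger fighter is unskilled
p_loss_if_unskilled = 0.6    # chance an unskilled larger fighter loses to a smaller one
p_loss_if_skilled = 0.1      # chance a skilled larger fighter does

posterior = prior_unskilled
for fight in range(1, 4):    # three straight losses to smaller opponents
    numerator = posterior * p_loss_if_unskilled
    denominator = numerator + (1 - posterior) * p_loss_if_skilled
    posterior = numerator / denominator
    print(f"After loss {fight}: P(unskilled) = {posterior:.2f}")
# Belief climbs from 0.60 to 0.90 to 0.98: repeated losses to "weaker"
# opponents quickly make the unflattering explanation the probable one.
```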

While winning in competitive video games does not rely on physical strength, a similar type of logic applies there as well: if men tend to be the ones overwhelmingly dominating a video game in terms of their performance, then a man who performs poorly has the most to lose from women becoming involved in the game, as he now might compare poorly both to the standard reference group and to the disfavored minority group. By contrast, men who are high performers in these games would not be bothered by women joining in, as they aren’t terribly concerned about losing to them and having their status threatened. This yields some interesting predictions about what kind of men are going to become hostile towards women. By comparison, other social and lay theories (which are often hard to separate) do not tend to yield such predictions, instead suggesting that both high- and low-performing men might be hostile towards women in order to remove them from a type of male-only space; what one might consider a more general sexist discrimination.

To test these hypotheses, Kasumovic & Kuznekoff (2015) reported on some data collected while they were playing Halo 3, during which time all matches and conversations within the game were recorded. During these games, the authors had approximately a dozen neutral phrases prerecorded with either a male or female voice that they would play at appropriate times in the match. These phrases served to cue the other players as to the ostensible gender of the researcher. The matches themselves were 4 vs 4 games in which the objective for each team is to kill more members of the enemy team than they kill of yours. All in-game conversations were transcribed, with two coders examining the transcripts for comments directed towards the researcher playing the game and classifying them as positive, negative, or neutral. The performance of the players making these comments was also recorded with respect to whether the game was won or lost, that player’s overall skill level, and the number of their kills and deaths in the match, so as to get a sense for the type of player making them.

The data represented 163 games of Halo, during which 189 players directed comments towards the researcher across 102 of the games. Of those 189 players who made comments, all of them were males. Only the 147 commenters who were on the researcher’s team were retained for analysis. In total, then, 82 players directed comments towards the female-voiced player, whereas 65 directed comments towards the male-voiced player.

A few interesting findings emerged with respect to the gender manipulation. While I won’t mention all of them, I wanted to highlight a few. First, when the researcher used the female voice, higher-skill male players tended to direct significantly more positive comments towards them, relative to low-skill players (β = -.31); no such trend was observed for the male-voiced character. Additionally, as the difference between the female-voiced researcher and the commenting player grew larger (specifically, as the person making the comment was of progressively higher rank than the female-voiced player), the number of positive comments tended to increase. Similarly, high-skill male players tended to direct fewer negative comments towards the female-voiced researcher as well (β = -.18). Finally, in terms of their kills during the match, poor-performing males directed more negative comments towards female-voiced characters, relative to high-performing men (β = .35); no such trend was evident for the male-voiced condition.

“I’m bad at this game and it’s your fault people know it!”

Taken together, the results seem to point in a pretty consistent direction: low-performing men tended to be less welcoming of women in their competitive game of choice, perhaps because it highlighted their poor performance to a greater degree. By contrast, high-performing males were relatively less troubled by the ostensible presence of women, dipping over into being quite welcoming of them. After all, a man being good at the game might well be an attractive quality to women who also enjoy the world of Esports, and what better way to kick off a potential relationship than with a shared hobby? As a final point, it is worth noting that the truly sexist types might present a different pattern of data, relative to people who were just making positive or negative comments: only 11 of the players (out of 83 who made negative comments and 189 who made any comments) were classified as making comments considered to be “hostile sexism”, which did not yield a large enough sample for a proper analysis. The good news, then, seems to be that such comments are at least relatively rare.

References: Kasumovic, M. & Kuznekoff, J. (2015). Insights into sexism: Male status and performance moderates female-directed hostile and amicable behavior. PLoS One, 10: e0131613. doi:10.1371/journal.pone.0131613

Are Video Games Making People Sexist?

If the warnings of certain pop-culture critics are correct, there’s a harm being perpetuated against women in the form of video games, where women are portrayed as lacking agency, sexualized, or prizes to be won by male characters. The harm comes from the downstream effects of playing these games, as it would lead players – male and female – to develop beliefs about the roles and capabilities of men and women from their depictions, entrenching sexist attitudes against women and, presumably, killing women’s aspirations to be more than mere ornaments for men as readily as one kills the waves of enemies that run directly into one’s crosshairs in any modern shooter. It’s a very blank slate type of view of human personality; one which suggests that there’s really not a whole lot inside our heads but a mound of person-clay, waiting to be shaped by the first set of media representations we come across. This blank slate view also happens to be a wildly implausible one, lacking much in the way of empirical support.

Which would explain why my Stepford wife collection was so hard to build

The blank slate view of the human mind, or at least one of its many varieties, has apparently found itself a new name lately: cultivation theory. In the proud tradition of coming up with psychological theories that are not actually theories, cultivation theory restates an intuition: that the more one is exposed to or uses a certain type of media, the more one’s views will come to resemble what gets depicted in that medium. So, if one plays too many violent video games, say, they should be expected to turn into more violent people over time. This hasn’t happened yet, and violent content per se doesn’t seem to be the culprit behind anger or aggression anyway, but that hasn’t stopped people from trying to push the idea that it could, will, or is currently happening. A similar idea mentioned in the introduction would suggest that if people are playing games in which women are depicted in certain ways – or not depicted at all – they will develop negative attitudes towards women over time as they play more of these games.

What’s remarkable about these intuitions is how widely they appear to be held, or at least entertained seriously, in the absence of any real evidence that this cultivation of attitudes actually happens. Recently, the first longitudinal test of this cultivation idea was reported by Breuer et al (2015). Drawing on some data from German gamers, the researchers were able to examine how video game use and sexist attitudes changed from 2011 to 2013 among men and women. If there’s any cultivation going on, a few years ought to be long enough to detect at least some of it. The study ended up reporting on data from 824 participants (360 female), ages 14-85 (M = 38) concerning their sex, education level, frequency of game use, preferred game genre, and sexist attitudes. The latter measure was derived from agreement on a scale from 1 to 5 concerning three questions: whether men should be responsible for major decisions in the family, whether men should take on leadership roles in mixed-sex groups, and whether women should take care of the home, even if both partners are wage earners.

Before getting into the relationships between video game use and sexist attitudes, I would like to note at the outset a bit of news which should be good for almost everyone: sexist attitudes were quite low, with each question garnering an average agreement of about 1.8. As the scale is anchored from “strongly disagree” to “agree completely”, these scores would indicate that the sexist statements were met with rather palpable disagreement on the whole. There was a modest negative correlation between education and acceptance of those views, as well as a small, and male-specific, negative correlation with age. In other words, those who disagreed with those statements the least tended to be modestly less educated and, if they were male, younger. The questions of the day, though, are whether those people who play more video games are more accepting of such attitudes and whether that relationship grows larger over time.

Damn you, Call of Duty! This is all your fault!

As it turns out, no; they are not. In 2011, the regression coefficients for video game use and sexist attitudes were .04 and .06 for women and men, respectively (in 2013, these numbers were -.08 and -.07). Over time, not much changed: the female association between video game use in 2011 and sexist attitudes in 2013 was .12, while the male association was -.08. If video games were making people more accepting of sexism, it wasn’t showing up here. The analysis was attempted again, this time taking into account specific genres of gaming, including role-playing, action, and first-person shooters; genres in which women are thought to be particularly underrepresented or represented in sexist fashions (full disclosure: I don’t know what a sexist depiction of a woman in a game is supposed to look like, though it seems to be an umbrella term for a lot of different things from presence vs absence, to sexualization, to having women get kidnapped, none of which strike me as sexist, in the strict sense of the word. Instead, it seems to be a term that stands in for some personal distaste on the part of the person doing the assessment). However, considerations of specific genres yielded no notable associations between gaming and endorsement of the sexist statements either, which would seem to leave the cultivation theory dead in the water.

Breuer et al (2015) note that their results appear inconsistent with previous work by Stermer & Burkley (2012) that suggested a correlation exists between sexist video game exposure and endorsement of “benevolent sexism”. In that study, 61 men and 114 women were asked about the three games they played the most, rated each on a 1-7 scale concerning how much sexism was present in them (again, this term doesn’t seem to be defined in any clear fashion), and then completed the ambivalent sexism scale; a dubious measure I have touched upon before. The results reported by Stermer & Burkley (2012) found participants reporting a very small amount of perceived sexism in their favorite games (M = 1.87 for men and 1.54 for women) and, replicating past work, also found no difference in endorsement of benevolent sexism between men and women on average, nor among those who played games they perceived to be sexist and those who did not, though men who perceived more sexism in their games endorsed the benevolent items relatively more (β = 0.21). Finally, it’s worth noting there was no connection between the hostile sexism score and video game playing. One issue one might raise about this design concerns asking people explicitly about whether their leisure-time activities are sexist and then immediately asking them about how much they value women and feel they should be protected. People might be right to begin thinking about how experimental demand characteristics could be affecting the results at that point.

Tell me about how much you hate women and why that’s due to video games

So is there much room to worry about when it comes to video games turning people into sexists? According to the present results, I would say probably not. Not only was the connection between sexism and video game playing small to the point of nonexistence in the larger, longitudinal sample, but the overall endorsement and perception of sexism in these samples is close to a floor effect. Rather than shaping our psychology in appreciable ways, a more likely hypothesis is that various types of media – from video games to movies and beyond – reflect aspects of it. To use a simple example, men aren’t drawn to being soldiers because of video games, but video games reflect the fact that most soldiers are men. For whatever reason, this hypothesis appears to receive considerably less attention (perhaps because it makes for a less exciting moral panic?). When it comes to video games, certain features of our psychology might be easier to translate into compelling game play, leading to certain aspects more typical of men’s psychology being more heavily represented. In that sense, it would be rather strange to say that women are underrepresented in gaming, as one needs a reference point for what appropriate representation would mean and, as far as I can tell, that part is largely absent; kind of like how most research on stereotypes begins by assuming that they’re entirely false.

References: Breuer, J., Kowert, R., Festl, R., & Quandt, T. (2015). Sexist games = sexist gamers? A longitudinal study on the relationship between video game use and sexist attitudes. Cyberpsychology, Behavior, & Social Networking, 18, 1-6.

Stermer, P. & Burkley, M. (2012). SeX-Box: Exposure to sexist video games predicts benevolent sexism. Psychology of Popular Media Culture, 4, 47-56.