About Jesse Marczyk

An Evolutionary-Minded Psychologist, of All Things

Practice, Hard Work, And Giving Up

There’s no getting around it: if you want to get better at something – anything – you need to practice. I’ve spent the last several years writing rather continuously and have noticed that my original posts are of a much lower quality when I look back at them. If you want to be the best version of yourself that you can be, you’ll need to spend a lot of time working at your skills of choice. Nevertheless, people do vary widely in terms of how much practice they are willing to devote to a skill and how readily they abandon their efforts in the face of challenges, or simply over time. Some musicians will wake up and practice several hours a day, some only a few days a week, some a few times throughout the year, and some will stop playing entirely (in spite of almost none of them making anything resembling money from it). In a word, some musicians possess more grit than others.

Those of us who spend too much time at a computer acquire a different kind of grit

To give you a sense for what is meant by grit, consider the following description offered by Duckworth et al (2007):

The gritty individual approaches achievement as a marathon; his or her advantage is stamina. Whereas disappointment or boredom signals to others that it is time to change trajectory and cut losses, the gritty individual stays the course.

Grit, in this context, refers to those who continue to pursue their goals when faced with obstacles, major or minor. According to Duckworth et al (2007), this trait of grit is referenced regularly by people discussing the top performers in their field about as often as talent, even if they might not refer to it by that name.

The aim of the Duckworth et al (2007) paper, broadly speaking, was two-fold: to create a scale to measure grit (as one did not currently exist), and then use that scale to see how well grit predicted subsequent achievements. Without going too in depth into the details of the project, the grit scale eventually landed on 12 questions. Six of those dealt with how consistent one’s interests are (like, “my interests change from year to year”) and the other six with perseverance of effort (like, “I have overcome setbacks to conquer an important challenge”). While this measure of grit was highly correlated with the personality trait of conscientiousness (r = .77), the two were apparently different enough to warrant separate categorization, as the grit score still predicted some outcomes after controlling for personality.

When the new scale was directed at student populations, grit was also found to relate to educational achievement, controlling for measures of general intelligence: in this case, college GPA controlling for SAT scores in a sample of about 1,400 UPenn undergraduates. The relationship between grit and GPA was modest (r = .25), though it got somewhat larger after controlling for SAT scores (r = .34). In a follow-up study, the grit scale was also used to predict which cadets at a military academy completed their summer training. Though about 94% of the cadets completed this training, the grittiest individuals were the least likely to drop out, as one might expect. However, unlike in the UPenn sample, grit was not a good predictor of subsequent cadet GPA in that sample (r = .06), raising some questions about the previous result (which I’ll get to in a minute).

This is time not spent studying for that engineering test

With that brief summary of grit in mind – hopefully enough to give you a general sense for the term – I wanted to discuss some of the theoretical aspects of the idea. Specifically, I want to consider when grit might be a good thing and when it might be better to persevere a little less or find new interests.

One big complication stopping people from being gritty is the simple matter of opportunity costs. For every task I decide to devote years of dedicated, consistent practice to, there are other tasks I do not get to accomplish. Time spent writing this post is time I don’t get to spend pursuing other hobbies (which I have been taking intermittent breaks to pursue, for the record). This is, in fact, why I have begun writing a post every two weeks or so, down from every week: there are simply other things in life I want to spend my time on. Being gritty about writing means I don’t get to be equally gritty about other things. In fact, if I were particularly gritty about writing I might not get to be gritty about anything at all. Not unless I wanted to stop being gritty about sleep, but even then I could just devote that sleeping time to writing as well.

This is a problem when it comes to grit being useful, because of a second issue: diminishing returns on practice. That first week, month, or year you spend learning a skill typically yields a more appreciable return than the second, third, or so on. Putting that into a quick example, if I started studying chess (a game I almost never play), I would see substantial improvements to my win rate in the first month. Let’s just say 10% to put a number on it. The next month of practice still increases my win rate, but not by quite as much, as there are fewer obvious mistakes left for me to fix. I go up another 5%. As this process continues, I might eventually spend a month of practice to increase my win rate by mere fractions of a percent. While this dedicated practice does, on paper, make me better, the size of those rewards relative to the time investment I need to make to get them gets progressively smaller. At a certain point, it doesn’t make much more sense to commit that time to chess when I could be learning to speak Spanish or even just spend that time with friends.
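To put some rough numbers on that chess example, here is a minimal sketch in Python. It assumes, purely for illustration, that each month's gain is half of the previous month's (10 points, then 5, and so on); the halving rate and the function name monthly_gains are my own hypothetical choices, not anything measured.

```python
def monthly_gains(first_gain=10.0, decay=0.5, months=8):
    """Win-rate gain (in percentage points) for each month of practice.

    Assumption for illustration only: every month yields `decay` times
    the gain of the month before it.
    """
    return [first_gain * decay ** m for m in range(months)]


gains = monthly_gains()
total = 0.0
for month, gain in enumerate(gains, start=1):
    total += gain
    # Each line shows the marginal gain next to the running total.
    print(f"month {month}: +{gain:.2f} points (running total {total:.2f})")
```

Under that assumption the gain falls below a single point by the fifth month, and the running total can never pass 20 points no matter how many more months are added. The exact numbers don't matter; the shrinking return per month of practice is the point.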

This brings us nicely to the next point: the rate of improvement, both in terms of how quickly you learn and how far additional practice can push you, ought to depend on one’s biological potential (for lack of a better term). No matter how much time I spend practicing guitar, for instance, there are certain ceilings on performance I will not be able to break: perhaps it becomes physically impossible to play any faster while maintaining accuracy; perhaps some memory constraints come into play and I cannot remember everything I’ve tried to learn. We should expect grit to interact with potential in a certain way: if you don’t have the ability to achieve a particular task, being gritty about pursuing it is going to be time spent effectively banging your head against a brick wall. By contrast, the individual who possesses a greater potential for the task in question has a much higher chance of grit paying off. They can simply get more from practice.

Some people just have nicer ceilings than others

This is, of course, assuming the task is actually one that can be accomplished. If you’re very gritty about finding the treasure buried in your backyard that doesn’t actually exist, you’ll spend a lot of time digging and none getting rich. Being gritty about achieving the impossible is a bad idea. But who’s to say what’s impossible? We usually don’t have access to enough information to say something cannot (or at least will not) be achieved, but we can often make some fairly educated guesses. Let’s just stick to the music example for now: say you want to accomplish the task of becoming a world-famous rockstar. You have the potential to perform and you’re very gritty about pursuing it. You spend years practicing, forming bands, writing songs, finding gigs, and so on. One problem you’re liable to encounter in this case is simply that many other people who are similarly qualified are doing likewise, and there’s only so much room at the top. Even if you are all approximately as talented and gritty, there are some ceiling effects at play where being even grittier and more talented does not, by any means, guarantee more success. As I have mentioned before, the popularity of cultural products can be a fickle thing. It’s not just about the products you produce or what you can do.

We see this playing out in the world of academia today. As many have lamented, there seem to be too few academic jobs for all the PhDs getting minted across the country. Being gritty about pursuing that degree – all the time, energy, and money spent earning it – turned out to not be a great idea for many who have done so. Sure, you can bet that just about everyone who achieved their dream job as a professor making a decent salary was pretty gritty about things. You have to be if you’re going to spend 10 or more years invested in higher education with little payoff and many challenges along the way. It’s just that lots of people who were about as gritty as those who got a job failed to do anything with their degree after they earned it. As this example shows, not only does the task need to be achievable, but the rewards for achieving it need to be both valuable and likely if grit is to pay off. If the rewards aren’t valuable (e.g., a job as an adjunct teaching 5 courses a semester for about as much as you’d make working minimum wage, all things considered), then pursuing them is a bad idea. If the rewards are valuable but unlikely (e.g., becoming a top-selling pop artist), then pursuing them is similarly a bad idea for just about everyone. There are better things to do with your time.

The closest most people will come to being a rockstar

This yields the following summary: for grit to be potentially useful, a task needs to be capable of being accomplished, you need the potential to accomplish it given enough time, the rewards of achieving it need to be large enough relative to the investment you put in, and the probability of achieving those rewards needs to be comparably high. While that does leave many tasks for which passionate persistence and practice might pay off (and many for which it will not), this utility always exists in the context of other people doing likewise. For that reason, beyond a certain ceiling of effort, more is not necessarily much of a guarantee of success. You can think of grit as – in many cases – something of a prerequisite for success rather than a great determinant. Finally, all of that needs to be weighed against the other things you could be doing with your time. Time spent being gritty about sports is time not spent being gritty about academics, which is time not spent being gritty about music, and so on.
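One way to make that checklist concrete is to treat it as a rough expected-value comparison. The sketch below is only an illustration under simplifying assumptions: rewards, investments, and alternatives are all squeezed onto one made-up scale, and the function names and every number in the examples are hypothetical rather than drawn from any study.

```python
def expected_net_payoff(p_success, reward, investment):
    """Chance of success times the reward, minus what you put in."""
    return p_success * reward - investment


def worth_being_gritty(p_success, reward, investment, best_alternative):
    """True when persisting beats the best other use of the same time."""
    return expected_net_payoff(p_success, reward, investment) > best_alternative


# All values are invented, in arbitrary "payoff units".
# Valuable but unlikely reward (the would-be rockstar):
rockstar = worth_being_gritty(
    p_success=0.001, reward=1000, investment=10, best_alternative=5)
# Likely but low-value reward (the poorly paid adjunct job):
adjunct = worth_being_gritty(
    p_success=0.9, reward=8, investment=10, best_alternative=5)
# Reasonably likely and reasonably valuable reward:
modest_goal = worth_being_gritty(
    p_success=0.8, reward=30, investment=10, best_alternative=5)

print(rockstar, adjunct, modest_goal)
```

With these invented numbers only the third case comes out ahead. The best_alternative argument is what carries the opportunity-cost point: even a positive expected payoff is not worth chasing if the same time would pay better somewhere else.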

If you want to reach your potential within a domain, there’s really no other option. You’ll need to invest lots of time and effort. Figuring out where that effort should go is the tricky part.

References: Duckworth, A., Peterson, C., Matthews, M., & Kelly, D. (2007). Grit: Perseverance and passion for long-term goals. Journal of Personality & Social Psychology, 92, 1087-1101.

Untitled Creativity Post

Creativity – much like intelligence – is a highly valued trait. It is also – much like intelligence – a term that encompasses multiple abilities applied across a broad number of domains, which can result in some confusion over precisely what one means when the word is used. Since I wanted to think a bit about creativity today, a good starting point for this discussion would be to clarify what creativity refers to in terms of function and form. Being clear about these issues can help us avoid getting mired in topics related to creativity – like intelligence – but which are not creativity themselves. There’s nothing quite like definitional confusion when it comes to stagnating discussions; just ask gender.

These are all bathrooms now, and there still aren’t enough kinds

In terms of a good starting definition for thinking about creativity, I think we are lucky enough to have one available to us. Paraphrasing a bit from Plucker, Beghetto, and Dow (2004), creativity generally refers to the creation of something novel that manages to do something useful or appropriate. The former point is generally accepted in common usage: products or people that are viewed as creative are or do something that hasn’t been done quite that way before. Something new is being created (hence the term), rather than something being repeated or copied. The less-appreciated – but equally important – facet of the definition is the latter portion. There are a great many ways of creating something new without it being creative. You might, for instance, write a bunch of nouns on pieces of paper, mix them in a bag, then pull two at random and create a new product with them. Say you pulled out a piece that said “clock” and another that said “fish” and so designed a clock with a dead fish nailed to the middle of it. While that design would be novel – at least I haven’t seen many clocks with attached fish – it wouldn’t be appropriate or useful in most senses of the word. There’s thus a difference between being creative and just being different, or even random. A quick examination of the lyrics to any Mars Volta song should highlight the importance of appropriateness when considering whether novelty is creative or just nonsense. Anyone can string words together in new ways, but that does not always (or even usually) make for a creative song.

Which brings me to another important point about problems and their solutions more generally: there is no such thing as a general-purpose problem and, accordingly, no such thing as a general-purpose solution. To place that into a quick example, if I asked you to design me a tool that “does useful things,” you would likely find that request a bit underspecified. What kind of useful things should it do? This is an important question to answer because tools that are designed to do one task well often do others poorly, if at all. A hammer might be good at driving a nail into wood, less good at applying paint to a wall, worse still at holding water, and entirely incapable of transporting you from point A to B. The shape of a problem determines the shape of the solution, and as all problems have different shapes, so too must each solution.

There are several implications that flow from this idea as it pertains to creativity. The first is that the difference between novelty and creativity can be more readily understood. If I told you I wanted a device to hold water, there are an infinite number of possible devices you could give me that don’t currently exist. However, very few of that infinite set would do the job well (a hammer or sieve would not) and, of those that do accomplish the task, fewer still would be an improvement on existing solutions. This is why “novel” alone does not translate into “creative.”

As seen on TV, since no store would ever stock it

Yet another implication is that – just like humans (or any other species I’m aware of) don’t appear to possess general-purpose learning mechanisms, equally capable of learning anything – so too should we expect that creativity is not any singular mechanism within the mind that gets applied equally well to any problem. Those who are considered creative with respect to painting may not be expected to evidence that same degree of creativity when it comes to math or biology. It’s not likely that there’s a way to make people more creative across every domain they might encounter. After all, if creativity refers to the generation of more efficient and appropriate solutions to problems, asking that someone become more creative in general is like asking that they become better at effectively solving all types of problems or making connections between all areas of their brains. In keeping with the tool example from above, it would also be like asking that your water-holding device get better at solving all problems related to holding liquid (small and large quantities or varying types for varying lengths, etc), which doesn’t work well in practice; if it did, we wouldn’t need oil drums and cooking pots and measuring cups. We could just use one device for all of those tasks. Good luck using a 40-gallon drum to measure out a quarter cup of water effectively, though.

This expectation has been demonstrated empirically as well. Baer (1996) examined what effects training poetry-relevant creativity skills would have on writing both poetry and short stories, an ostensibly related domain. In this case, approximately 75 students were trained up on divergent-thinking skills relevant to poetry including thinking of words that sound the same as a target, have the same sound, work as a metaphor, or inventing words that are suggestive of other things. Another 75 students did not receive this training and served as a control group. All the students then wrote poetry and short stories that would be evaluated by independent judges for creativity on a 1 to 5 scale. As it turned out, the poems written by the trained students did end up more creative (3 vs 2.2), yielding a gain of about 0.8 points. By contrast, the short stories in the trained group saw a substantially smaller gain of 0.3 points (2.8 vs 2.5). Creativity training did not appear to have an equivalent effect across domains, even though the domains were, in many respects, closely related.

The final implication I wanted to cover right now when it comes to creativity concerns the purpose of solutions in the first place. We seek solutions to problems, in large part, because solutions are time savers. Once you have learned how to complete a task, you don’t need to relearn how to complete it each time you attempt it. Once I learned how to commute to work, I don’t need to figure out how to get there every day, which saves me time. Chefs working in kitchens don’t need to relearn how to make dishes (or even what dishes they will be making) each and every time they come into work, allowing them to complete their tasks with greater ease in shorter amounts of time. By contrast, creativity can be a time-consuming process, where new candidate solutions need to be developed and tested against existing alternatives, then learned and mastered. In other words, creativity is costly both in terms of the time and energy it takes to develop something new and in the sense that all the time you spend creating is time spent not applying existing solutions to a problem. The probability of your creative endeavors paying off at all in terms of improving outputs, as well as the degree to which they improve upon existing alternatives, needs to be weighed against the time it takes to develop them.

Thanks for all your hard work and effort. Next!

But what if your creative endeavors are successful? Well, first of all, good for you if they are. Achieving that much is no easy task. But assuming they are successful, you now have a new, even more efficient solution to a problem you were facing. What are you going to do now? Well, you could continue your creative search for an additional solution that’s even better than the one you came up with, or you could apply your new solution to the problem. Remember: solutions are time savers. If you spend all your time innovating and none of it actually applying what you came up with, then you haven’t really saved time. In fact, if you aren’t going to then apply that solution, searching for it seems rather pointless. The great irony here, then, is that an end goal of creativity is effectively to not have to be creative anymore, at least with respect to that problem.

The more empirical end of this suggestion is represented by the finding that creativity appears to decrease with education, at least among engineering undergraduate students. Sola et al (2017) examined a sample of approximately 60 introductory and senior engineering college students. Creativity was assessed through a thinking-drawing procedure, where participants were presented with an incomplete picture and asked to complete it in any manner they wished. These drawings were subsequently assessed across 15 factors, ultimately finding higher creativity scores among the freshmen, on average, in several of the domains.

Nothing quite like the tried and true

To be clear, then, some people will generally be more creative than others, just like some people will generally be more intelligent than others. In that sense, you could consider some people creative. That does not mean their creativity will extend to all domains of life, however, or even that their creativity will extend throughout the same domains across their life. When you have a solution to a problem, the need to seek out a new solution is relatively lower, and so creativity should decline.

An implication of this framework would seem to be that if you want to keep creative output high, you need to constantly be facing problems that are perceived to be notably different from those already encountered (and the solutions to those problems need to be meaningful to find; people likely won’t be too motivated to be creative if finding a new solution will only yield minimal benefits). That said, there is also a risk in making a constant stream of problems seem novel: it suggests that the creative solutions you develop to a problem are not liable to serve you well in the future, as the problems you will face tomorrow are not the same ones you are facing today. If the solutions are not perceived to be useful in the future, creative efforts may be scaled back accordingly. Striking that balance between novelty and predictability may prove key in determining subsequent creative efforts.

References: Baer, J. (1996). The effect of task-specific divergent thinking training. The Journal of Creative Behavior, 30, 183-187.

Sola, E., Hoekstra, R., Fiore, S., & McCauley, P. (2017). An investigation of the state of creativity and critical thinking in engineering undergraduates. Creative Education, 8, 1495-1522.

No Sexism In SCRABBLE

My last couple of posts have focused primarily on the topic of group differences and on understanding how they might come to exist. Some of the most commonly advanced explanations for these differences concern discrimination – explicit or implicit – that serves to keep otherwise interested and qualified people out of arenas they would like to compete in. For example, few men might want to be nurses because male nurses aren’t considered for positions even if they’re qualified because of a social stigma against men in that area. If that were the explanation for these group differences, it would represent a wealth of untapped social value achievable by reducing or removing those discriminatory boundaries. On the other hand, if discrimination is not the cause of those differences, a lot of time and energy could be invested into chasing down a boogeyman without yielding much in the way of value for anyone.

Unfortunately, as we saw last time (and other times), research seeking to test these explanations can be designed or interpreted in ways that make them resilient to falsification. If the hypothesized effect attributable to discrimination is observed, it is counted as evidence consistent with the explanation; when the effect isn’t observed, however, it is not counted as evidence against the proposal. The researchers are sure the discrimination is there; they just didn’t dig deep enough to find it. This practice can be maintained effectively in many domains because of the fuzzy nature of performance within them. That is, it’s not always clear which person would make a better manager or professor when it comes time to make a hiring decision or assess performance, so different rates of hiring or promotions cannot be clearly related to different behavior.

And if the quality of your work can’t be assessed, it also means you can never be said to fail at your job

One way of working that fuzziness out of the equation is to turn towards domains where more objective measures of performance can be obtained. While it might be difficult to say for certain that one person would make a superior manager to another – especially when they are closely matched in skills – it is quite a lot easier to see if they can complete a task with objective performance criteria, such as winning in a video game or performing pull-ups. In realms of objective performance, it doesn’t matter if people like you or not; your abilities are being tested against reality. Accordingly, domains with more objective performance criteria make for appealing research tools when it comes to assessing and understanding group differences.

On that note, Moxley, Ericsson, & Tuffiash (2017) report some interesting information concerning the board game SCRABBLE. For the handful of you who might not know what SCRABBLE is, it’s a game where each player randomly selects a number of tiles with letters on them, then uses those tiles to spell words: the larger the word or the harder the letters are to utilize, the more points the player receives. The player with the most points after the tiles have been used up wins. As it turns out, men tend to be over-represented in the upper tiers of SCRABBLE performance. Within the highest-performing competitive SCRABBLE divisions, 86% of the players are male, while only 31% of the players in the lowest-performing divisions are. This pattern holds even though most of the competitive SCRABBLE players are women. Indeed, when regular people are asked about whether they would expect more male or female SCRABBLE champions, the intuition seems to be that women should be more common (despite, for context, all 10 of the last world champions having been male).

How is that sex difference in performance to be explained? In this instance, discrimination looks to be an odd explanation: competitive SCRABBLE tournaments do not present clear barriers to entry and women appear to be at least as interested in SCRABBLE as men are – if not more so – as inferred from participation rates. Moreover, people even seem to expect women would do better than men in that field, so an explanation along the lines of stereotype threat doesn’t work well either. According to the research of Moxley, Ericsson, & Tuffiash (2017), the explanation for most of that sex difference in performance does, in fact, relate to varying male and female interests, but perhaps not those directed at playing SCRABBLE itself. While I won’t discuss every part of the studies they undertook, I wanted to highlight some general points of this research because of how well it can highlight the difficulty and nuance in understanding sex differences and their relation to performance within a given field.

Even this vicious field of battle

The general methodology employed by the researchers involved surveying participants at National SCRABBLE competitions in 2004 and 2008 about their overall level of practice each year, both in terms of time spent studying alone and practicing seriously with others. These responses were then examined in the context of the player’s competitive SCRABBLE rating. The first study turned up several noteworthy relationships. As expected, women tended to have lower ratings than men (d = -0.74). However, it was also found that different types of SCRABBLE practice had varying impacts on player ratings. In this case, studying vocabulary had a negative impact on performance, while time spent analyzing past games and doing anagrams had a positive impact. This means that just asking people about how much they practiced SCRABBLE is not enough of a fine-tuned question for good predictive accuracy concerning performance. In this case, the practice questions asked about were unable to account for the entirety of the gender difference in performance, but they did reduce it somewhat.

This led the researchers to ask more detailed questions about SCRABBLE players’ practice in their second study. As before, women tended to have lower ratings than men (d = -0.69), but once the more refined questions about practice and experience were accounted for, there was no longer a direct effect of gender on rating. This would suggest that the performance advantage men had in SCRABBLE can be largely attributed to their spending more time engaged in solitary practice that benefits performance, while women tended to spend more time playing SCRABBLE with others, a behavior that did not yield comparable performance benefits.

The final step in this analysis was to figure out why men and women spent different amounts of time engaged in the types of practice they did. To do so, the players’ responses about how relevant, enjoyable, and effortful various types of practice felt were assessed. In order, the players felt tournament experience was the most important for improving their skills, then playing SCRABBLE itself, followed last by other types of word games. On that front, perceptions weren’t quite accurate. A similar pattern emerged in terms of which activities were rated as most enjoyable. However, there was a sex difference in that women rated playing SCRABBLE outside of tournaments as more enjoyable than men, and men rated SCRABBLE-specific practice (like anagrams) as more enjoyable than women.

Taken together, men tended to find the most-effective practice methods more enjoyable than women, and so engaged in them more. This differential involvement in effective practice in turn explained the sex difference in player rankings. Nothing too shocking, but reality often isn’t.

Published in the journal of, “I’m sorry; did you say something?”

What we see in this research is an appreciable sex difference in performance resulting from varying male and female interests, but those interests themselves are not necessarily the most obvious targets for investigation. If you were to just ask men and women whether they were interested in SCRABBLE, you might find that women had a higher average interest. If you were to just ask about how much time they spent practicing, you might not observe a sex difference capable of explaining the differences in performance. It wouldn’t be until you asked specifically about their interests in particular types of practice and understood how those related to eventual performance that you would end up with a better picture of that performance gap. In this case, the sex difference seems to be largely the product of men being more interested in specific types of practice that are ultimately more productive when it comes to improving performance. The corollary is that if you were trying to reduce the male-female performance gap in SCRABBLE, and your explanation for that gap was that women are being discriminated against, then seeking to reduce discrimination in the field would probably do nothing to help even out the scores (though you might achieve some social maligning).

Thankfully, this kind of analysis can be reasonably undertaken in a realm where performance can be objectively assessed. If you were to try this same analysis with respect to, say, the relative distribution of men and women in STEM fields, you would be in for a much rockier experience, as it’s not clear how certain interests relate to ultimate performance there.

References: Moxley, J., Ericsson, A., & Tuffiash, M. (2017). Gender differences in SCRABBLE performance and associated engagement in purposeful practice activities. Psychological Research, DOI 10.1007/s00426-017-0905-3

Imagine If The Results Went The Other Way

One day, three young children are talking about what they want to be when they get older. The first friend says, “I love animals, so I want to become a veterinarian.” The second says, “I love computers, so I want to become a programmer.” The third says, “I love making people laugh, so I want to become a psychology researcher.” Luckily for all these children, they all end up living a life that affords them the opportunity to pursue their desires, and each ends up working happily in the career of their choice for their entire adult life.

The first question I’d like to consider is whether any of those children made choices that were problematic. For instance, should the first child have decided to help animals, or perhaps should they have put their own interests aside and pursued another line of work because of their sex and the current sex-ratio of men and women in that field? Would your answer change if you found out the sex of each of the children in question? Answer as if the second child was a boy, then think about whether your answer would change if you found out she was a girl.

Well if you wanted to be a vet, you should have been born a boy

This hypothetical example should, hopefully, highlight a fact that some people seem to lose track of from time to time: broad demographic groups are not entities themselves; they are only made up of their individual members. Once one starts talking about how gender inequality in professions ought to be reduced – such that you see a greater representation of 50/50 men and women across a greater number of fields – you are, by default, talking about how some people need to start making choices less in line with their interests, skills, and desires to reach that parity. This can end up yielding strange outcomes, such as a gender studies major telling a literature major she should have gone into math instead.

Speaking of which, a paper I wanted to examine today (Riegle-Crumb, King, & Moore, 2016) begins laying on the idea of gender inequality across majors rather thick. Unless I misread their meaning, they seem to think that gender segregation in college majors ought to be disrupted and, accordingly, sought to understand what happens to men and women who make non-normative choices in selecting a college major, relative to their more normative peers. Specifically, they set out to examine what happens to men who major both in male- and female-dominated fields: are they likely to persist in their chosen field of study in the same or different percentages? The same question was asked of women as well. Putting that into a quick example, you might consider how likely a man who initially majors in nursing is to switch or stay in his program, relative to one who majors in computer science. Similarly, you might think about the fate of a woman who majors in physics, compared to one who majors in psychology.

The authors expected that women would be more likely to drop out of male-dominated fields because they encounter a “chilly” social climate there and face stereotype threat, compared to their peers in female-dominated fields. By contrast, men were expected to drop out of female-dominated fields more often as they begin to confront the prospect of earning less money in the future and/or lose social status on account of emasculation brought on by their major (whether perceived or real).

To test these predictions, Riegle-Crumb, King, & Moore (2016) examined a nationally-representative sample of approximately 3,700 college students who had completed their degree. These students had been studied longitudinally, interviewed at the end of their first year of college in 2004, then again in 2006 and 2009. A gender atypical major was coded as one in which the opposite sex comprised 70% or more of the major. In the sample being examined, 14% of the men selected a gender-atypical field, while 4% of women did likewise. While this isn’t noted explicitly, I suspect some of that difference might have to do with the relative size of certain majors. For instance, psychology is one of the most popular majors in the US, but also happened to fall under the female-dominated category. That would naturally yield more men than women choosing a gender atypical major if the pattern continued into other fields.

Can’t beat that kind of ratio in the dating pool, though

Moving on to what was found, the researchers were trying to predict whether people would switch majors or not. The initial analysis found that men in male-typical majors switched about 39% of the time, compared to the 63% of men who switched from atypical majors. So the men in atypical fields were more likely to switch. There was a different story for the women, however: those in female-typical majors switched 46% of the time, compared to 41% who switched in atypical fields. The latter difference was neither statistically nor practically significant. Unsurprisingly, for both men and women, those most likely to switch had lower GPAs than those who stayed, suggesting switching was due, in part, to performance.

When formally examined with a number of control variables (for social background and academic performance) included in the model, men in gender atypical fields were about 2.6 times as likely to switch majors, relative to those in male-dominated ones. The same analysis run for women found that those in atypical majors were about 0.8 times as likely to switch majors as those in female-dominated ones. Again, this difference wasn’t statistically significant. Nominally, however, women in atypical fields were more likely to stay put.
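To give a rough sense of where figures like these come from, the raw switching rates mentioned earlier can be converted into unadjusted odds ratios. This is only a sketch under two assumptions of mine: that the "times as likely" figures are odds ratios (the description here doesn't say), and that ignoring the control variables is acceptable for illustration. The function name is hypothetical, and the result should approximate rather than reproduce the authors' estimates.

```python
def odds_ratio(p_group: float, p_reference: float) -> float:
    """Unadjusted odds ratio of switching: group odds divided by reference odds."""
    odds_group = p_group / (1 - p_group)
    odds_reference = p_reference / (1 - p_reference)
    return odds_group / odds_reference

# Raw switching rates as reported in the post (no controls applied).
men_or = odds_ratio(0.63, 0.39)    # atypical vs. male-typical majors
women_or = odds_ratio(0.41, 0.46)  # atypical vs. female-typical majors

print(f"men: {men_or:.2f}, women: {women_or:.2f}")
```

On the raw rates, the ratios come out at roughly 2.7 for men and 0.8 for women, which is in the same neighborhood as the adjusted figures.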

What do the authors make of this finding? Though they note correctly that their analysis says nothing of the reasons for the switch, they view the greater male-atypical pattern of switching as consistent with their expectations. I think this is probably close to the truth: as a greater proportion of a man’s future success is determined by his ability to provision mates and his social status, we might expect that men tend to migrate from majors with a lower future financial payoff to those that have a larger one. Placing that into a personal example, I might have wanted to be a musician, but the odds of landing a job as a respected rockstar seemed slim indeed. Better that I got a degree in something capable of paying the bills consistently if I care about money.

By contrast, the authors also correctly note that they don’t find evidence consistent with their prediction that women in gender-atypical fields would switch more often. This does not, however, cause them to abandon the justifications for their prediction. As far as I can tell, they still believe that factors like a chilly climate and stereotype threat are pushing women out of those majors; they just supplement that expectation by adding on that a number of factors (like the aforementioned financial ones) might be keeping them in, and the latter factors are either more common or influential (though that certainly makes you wonder why women tend to choose lower-paying fields in greater numbers in the first place).

Certainly worth a 20-year career in a field you hate

This strikes me as kind of a fool-proof strategy for maintaining a belief in the prospect of nefarious social forces doing women harm. To demonstrate why, I’d like to take this moment to think about what people’s reactions to these findings might have been if the patterns for men and women were reversed. If it turned out that women in male-dominated majors were more likely to switch than their peers in female-dominated majors, would there have been calls to address the clear sexism causally responsible for that pattern? I suspect that answer is yes, judging from reactions I’ve seen in the past. So, if that result was found, the authors could point a finger at the assumed culprits. However, even when that result was not found, they can just tack on other assumptions (women remain in this major for the money) that allow the initial hypothesis of discrimination to be maintained in full force. Indeed, they end their paper by claiming, “Gender segregation in fields of study and related occupations severely constrains the life choices and chances of both women and men,” demonstrating a full commitment to being unfazed by their results.

In other words, there doesn’t seem to be a pattern of data that could have been observed capable of falsifying the initial reasons these expectations were formed. Even nominally contradictory data appears to have been assimilated into their view immediately. Now I’m not going to say it’s impossible that there are large, sexist forces at work trying to push women out of gender atypical fields that are being outweighed by other forces pulling in the opposite direction; that is something that could, in theory, be happening. What I will say is that granting that possibility makes the current work a poor test of the original hypotheses, since no data could prove it wrong. If you aren’t conducting research capable of falsifying your ideas – asking yourself, “what data could prove me wrong?” – then you aren’t engaged in rigorous science. 

References: Riegle-Crumb, C., King, B., & Moore, C. (2016). Do they stay or do they go? The switching decisions of individuals who enter gender atypical college majors. Sex Roles, 74, 436-449.

Diversity: A Follow-Up

My last post focused on the business case for demographic diversity. Summarizing briefly, an attempted replication of a paper claiming that companies with greater gender and racial diversity outperformed those with less diversity failed to reach the same conclusion. Instead, these measures of diversity were effectively unrelated to business performance once you controlled for a few variables. This should make plenty of intuitive sense, as demographic variables per se aren’t related to job performance. While they might prove to be rough proxies if you have no information (men or women might be better at tasks X or Y, for instance), once you can assess skills, competencies, and interests, the demographic variables cease to be good predictors of much else. Being a man or a woman, African or Chinese, does not itself make you competent or interested in any particular domain. Today, I wanted to tackle the matter of diversity itself on more of a philosophical level. With any luck, we might be able to understand some of the issues that can cloud discussions on the topic.

And if I’m unlucky, well…

Let’s start with the justifications for concerns with demographic diversity. As far as I’ve seen, there are two routes people take with this. The first – and perhaps most common – has been the moral justification for increasing diversity of race and gender in certain professions. The argument here is that certain groups of people have been historically denied access to particular positions, institutions, and roles, and so they need to be proactively included in such endeavors as a means of reparation to make up for past wrongs. While that’s an interesting discussion in its own right, I have not found many people who claim that, say, more women should be brought into a profession no matter the impact. That is, no one has said, “So what if bringing in more women would mess everything up? Bring them in anyway.” This brings us to the second justification for increasing demographic diversity that usually accompanies the first: the focus on the benefits of cognitive diversity. The general idea here is not only that people from all different groups will perform at least as well in such roles, but that having a wider mix of people from different demographic groups will actually result in benefits. The larger your metaphorical cognitive toolkit, the more likely you will successfully meet and overcome the challenges of the world. Kind of like having a Swiss Army knife with many different attachments, just with brains.

This idea is appealing on its face but, as we saw last time, diversity wasn’t found to yield any noticeable benefits. There are a few reasons why we might expect that outcome. The first is that cognitive diversity itself is not always going to be useful. If you’re on a camping trip and you need to saw through a piece of wood, the saw attachment on your Swiss Army knife would work well; the scissors, toothpick, and can opener will all prove ineffective at solving your problem. Even the non-serrated knife will prove inefficient at the task. The solutions to problems in the world are not general-purpose in nature. They require specialized equipment to solve. Expanding that metaphor into the cognitive domain, if you’re trying to extract bitumen from tar sands, you don’t want a team of cognitively diverse individuals including a history major, a psychology PhD, and a computer scientist, along with a middle-school student. Their diverse set of skills and knowledge won’t help you solve your problem. You might do better if you hired a cognitively non-diverse group of petroleum engineers.

This is why companies hiring for positions regularly list rather specific qualification requirements. They understand – as we all should – that cognitive diversity isn’t always (or even usually) useful when it comes to solving particular tasks efficiently. Cognitive specialization does that. Returning this point back to demographic diversity, the problem should be clear enough: whatever cognitive diversity exists between men and women, or between different racial groups, it needs to be task relevant in order for it to even potentially improve performance outcomes. Even if the differences are relevant, in order for diversity to improve outcomes, the different demographic groups in question need to complement the skill sets of the other. If, say, women are better at programming than men, then diversity of men and women wouldn’t improve programming outcomes; the non-diverse outcome of hiring women instead of men would.

Just like you don’t improve your track team’s relay time by including diverse species

Now it’s not impossible that such complementary cognitive demographic differences exist, at least in theory, even though the former restrictions are already onerous. However, the next question that arises is whether such cognitive differences would actually exist in practice by the time hiring decisions were made. There’s reason to expect they would not, as people do not specialize in skills or bodies of knowledge at random. While there might be an appreciable amount of cognitive diversity between groups like men and women, or between racial groups, in the entire population (indeed, meaningful differences would need to exist in order for the beneficial diversity argument to make any sense in the first place), people do not get randomly sorted into groups like professions or college majors.

Most people probably aren’t that interested in art history, or computer science, or psychology, or math to the extent they would pursue it at the expense of everything else they could do. As such, the people who are sufficiently interested in psychology are probably more similar to one another than they are to people who major in engineering. Those who are interested in plumbing are likely more similar to other plumbers than they are to nurses.

As such, whatever differences exist between demographics on the population level may be reduced in part or in whole once people begin to self-select into different groups based on skills, interests, and aptitudes. Even if men and women possess some cognitive differences in general, male and female nurses, or psychologists, or engineers, might not differ in those same regards. The narrower the skill set you’re looking for when it comes to solving a task, the more similar we might expect people who possess those skills to be. Just to use my profession, psychologists might be more similar than non-psychologists; those with a PhD might be more similar than those with just a BA; those who do research may differ from those who enter into the clinical field, and so on.
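The shrinking of group differences under self-selection can be sketched with a small simulation. Everything here is hypothetical: the two groups, the size of the population gap, and the idea that entering a field amounts to clearing a hard cutoff on a single trait are assumptions made purely for illustration, not data from any study.

```python
import random
from statistics import fmean

random.seed(1)  # fixed seed so the sketch is repeatable

N = 50_000         # people per group (hypothetical)
GROUP_GAP = 0.5    # assumed population difference, in SD units
THRESHOLD = 1.5    # assumed cutoff for entering the field

# Trait scores for two hypothetical groups differing only in their mean.
group_a = [random.gauss(0.0, 1.0) for _ in range(N)]
group_b = [random.gauss(GROUP_GAP, 1.0) for _ in range(N)]

# Self-selection: only those above the cutoff enter the field.
field_a = [x for x in group_a if x > THRESHOLD]
field_b = [x for x in group_b if x > THRESHOLD]

gap_population = fmean(group_b) - fmean(group_a)
gap_in_field = fmean(field_b) - fmean(field_a)

print(f"population gap: {gap_population:.2f}; gap within field: {gap_in_field:.2f}")
print(f"entrants: group A = {len(field_a)}, group B = {len(field_b)}")
```

Under these assumptions the two groups end up unevenly represented in the field, yet the people who do enter it differ from one another far less than the groups do in the population at large.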

I think these latter points are where a lot of people get tripped up when thinking about the possible benefits of demographic diversity to task performance. They notice appreciable and real differences between demographic groups on a number of cognitive dimensions, but fail to appreciate that these population differences might (a) not be large once enough self-selection by skills and interests has taken place, (b) not be particularly task relevant, and (c) might not be complementary.

Ironically, one of the larger benefits to cognitive diversity might be the kind that people typically want to see the least: the ability of differing perspectives to help check the personal biases we possess. As people become less reliant on those in their immediate vicinity and increasingly able to self-segregate into similar-thinking social and political groups around the world, they may begin to likewise pursue policies and ideas that are increasingly self-serving and less likely to benefit the population on the whole. Key assumptions may go unchallenged and the welfare of others may be taken into account less frequently, resulting in everyone being worse off. Groups like the Heterodox Academy have been set up to try and counteract this problem, though the extent of their success is debatable.

A noble attempt to hold back the oncoming flood all the same

Condensing this post a little, the basic idea is this: men and women (to use just one group), on average, are likely to show a greater degree of between-group cognitive diversity than are male and female computer science majors. Or male and female literature majors. Any group you can imagine. Once people are segregating themselves into different groups on the basis of shared abilities and interests, those within the groups should be much more similar to one another than you’d expect on the basis of their demographics. If much of the cognitive diversity between these groups is getting removed through self-selection, then there isn’t much reason to expect that demographic diversity within those groups will have as much of an effect one way or the other. If male and female programmers already know the same sets of skills and have fairly similar personalities, making those groups look more male or more female won’t have much of an overall effect on their performance.

For it to even be possible that such diversity might help, we need to grant that meaningful, task-relevant differences between demographic groups exist, are retained throughout a long process of self-selection, and that these differences complement each other, rather than one group being superior. Further, these differences would need to create more benefits than conflicts. While there might be plenty of cognitive diversity in, say, the US Congress in terms of ideology, that doesn’t necessarily mean it helps people achieve useful outcomes all the time once you account for all the dispute-related costs and lack of shared goals.

If qualified and interested individuals are being kept out of a profession simply because of their race or gender, that obviously carries costs and should be stopped. There would be many valuable resources going untapped. If, however, people left to their own devices are simply making choices they feel suit them better – creating some natural demographic imbalances – then just changing their representation in this field or that shouldn’t impact much.

Does Diversity Per Se Pay?

In one of the most interesting short reports I read recently, some research was conducted in Australia examining what the effect of blind reviews would be on hiring. The premise of the research, as far as I can surmise, was that a fear existed of conscious or unconscious bias against women and minority groups when it came to getting hired. This bias would naturally make it harder for those groups to find employment, ultimately yielding a less diverse workforce. In the interests of avoiding that bias, the research team compared what happened when candidates were assessed on either standard resumes or de-identified ones. The latter resumes were identical to the former, except they had group-relevant information (like gender and race) removed. If reviewers don’t have information about race or gender available, then they couldn’t possibly assess the candidates on the basis of them, whether consciously or unconsciously. That seems straightforward enough. The aim was to compare the results from the blind assessments to those of the standard resumes. As it turned out, there were indeed hints of bias; relatively small in size sometimes, but present nonetheless. However, the bias did not go in the direction that had been feared.

Shocking that the headline wasn’t “Blind review processes are biased”

Specifically, when the participants assessing the resumes had information about gender, they were about 3% more likely to select women, and 3% less likely to select men. Further, minorities were more likely to be selected as well when the information was available (about 6% for males and 9% for females). While there’s more to the picture than that, the primary result seemed to be that, when given the option, these reviewers discriminated in favor of women and minority groups simply because of their group membership. If these results had run in the opposite direction (against women and minorities) there would have no doubt been calls for increasing blind reviews. However, because blind reviews seemed to disfavor women and minorities, the authors had a different suggestion:

Overall, the results indicate the need for caution when moving towards ‘blind’ recruitment processes in the Australian Public Service, as de-identification may frustrate efforts aimed at promoting diversity

It’s hard to interpret that statement as anything other than “we should hire more women and minorities, regardless of qualifications.” Even if sex and race ought to be irrelevant to the demands of the job and candidates should be assessed on their merit, people should also apparently be cautious when removing those irrelevant pieces from the application process. The authors seemed to favor discrimination based on sex or race so long as it benefited the right groups. Such discriminatory practices have led to negative reactions on the part of others, as one might expect.

This brings me to another question: why should we value diversity when it comes to hiring decisions? To be clear, the diversity being sought is often strictly demographic in nature (many organizations tout diversity in race, for instance, but not in perspective. I don’t recall the draw of many positions being that you will meet a variety of people who hold fundamental disagreements with your view on the world). It’s also usually the kind of diversity that benefits women and minorities (I’ve never come across calls to get more white males into certain fields dominated by women or other races. Perhaps they exist; I just haven’t seen them). But are there real economic benefits to increasing diversity per se? Could it be the case that more diverse organizations just do better? On the face of it, I would assume the answer is “no” if the diversity in question is simply demographic in nature. What matters when it comes to job performance is not the color of one’s skin or what sex chromosomes they possess, but rather the skills and competencies they bring with them. While some of those skills and competencies might be very roughly approximated by race and gender if you have no additional information about your applicants, we thankfully don’t need to rely on those indirect measures. Rather than asking about gender or race, one could just ask directly about skill sets and interests. When you can do that, the additional value of knowing one’s group membership is likely close to nil. Why bother using a predictor of a variable when you can just use the variable itself?

Do you really love roundabouts that much?

Nevertheless, it has apparently been reported before that demographic diversity predicts the relative success of companies (Herring, 2009). A business case was made for diversity, such that diverse companies were found to generally do better than less diverse ones across a number of different metrics. Not that those in favor of increasing diversity really seemed to need a financial justification, but having one certainly wouldn’t hurt their case. As this paper was apparently popular within the literature (for what I assume is that reason), a replication was attempted (Stojmenovska et al, 2017), beginning in a graduate course as an assignment to help students “learn from the best.” Since it seems “psychology research” and “replications” mix about as well as oil and water as of late, the results turned out a bit worse than hoped. The student wasn’t even trying to look for problems; they just stumbled upon them.  

In this instance, the replication attempt failed to find the published result, instead catching two primary mistakes made in the original paper (as opposed to anything malicious): there were a number of coding mistakes within the data, and the sample data itself was skewed. Without going too deeply into why this is a problem, it should suffice to say that coding mistakes are bad for all the obvious reasons. Fixing the coding mistakes by deleting missing data resulted in a substantial reduction in sample size (25-50% smaller). As for the issue of skew, having a skewed sample can result in an underestimation of the relationship between predictors and outcomes. In brief, there were confounding relationships between predictor variables and the outcomes that were not adequately controlled for in the original paper. To correct for the skew issue, a log transformation on the data was carried out, resulting in a dramatic increase in the relationship between particular variables.

In order to provide a concrete sense for that increase, in the original report the correlation between company size and racial diversity was .14; after the log transformation was carried out, that correlation increased to .41. This means that larger companies tended to be more racially diverse than smaller ones, but that relationship was not fully accounted for in the original paper examining how diversity impacted success. The same issue held for gender diversity and establishment size.
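For readers wondering how a log transformation can raise a correlation that much, here is a minimal simulation. The data are entirely made up: I assume right-skewed establishment sizes and a diversity score that tracks the log of size plus noise. None of the numbers come from Herring (2009) or the replication, and the helper function is my own.

```python
import math
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: right-skewed establishment sizes, with a made-up
# diversity score that depends on the *log* of size plus noise.
sizes = [random.lognormvariate(3.0, 1.5) for _ in range(2000)]
diversity = [0.4 * math.log(s) + random.gauss(0, 1) for s in sizes]

r_raw = pearson_r(sizes, diversity)
r_log = pearson_r([math.log(s) for s in sizes], diversity)

print(f"raw size: r = {r_raw:.2f}; logged size: r = {r_log:.2f}")
```

A handful of very large establishments dominate the raw-size correlation, so the linear association looks weaker than it is; logging the size variable pulls those extreme values in and lets the underlying relationship show.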

Once these two issues – coding errors and skewed data – were addressed, the new results showed that gender and racial diversity were effectively unrelated to company performance. The only remaining relationship was a small one between gender diversity and the logged number of customers. While seven of the original eight hypotheses were supported in the first paper, the replication attempt correcting these errors only found one of the eight to be statistically supported. As most of the effects no longer existed, and the one that did exist was small in size, the business justification for increasing racial and gender diversity failed to receive any real support.

Very colorful, but they ultimately all taste the same

As I initially mentioned, I don’t see a very good reason to expect that a more demographically diverse group of employees should yield better outcomes. They don’t yield worse outcomes either. However, the study from Australia suggests that the benefits of diversity (or the lack thereof) are basically beside the point in many instances. That is, not only would I imagine this failure to replicate won’t have a substantial impact on many people’s views on whether or not diversity should be increased, but I don’t think it would even if diversity was found to be a bad thing, financially speaking. This is because I don’t suspect many views of whether increasing diversity should be done are based on the foundation that it’s good for people economically in the first place. Increasing diversity isn’t viewed as a tricky empirical matter as much as it seems to be a moral one; one in which certain groups of people are viewed as owing or deserving various things.

This is only looking at the outcomes of adding diversity, of course. The causes of such diverse levels of diversity across different walks of life are another beast entirely.

References: Stojmenovska, D., Bol, T., & Leopold, T. (2017). Does diversity pay? Replication of Herring (2009). American Sociological Review, 82, 857-867.

Herring, C. (2009). Does diversity pay? Race, gender, and the business case for diversity. American Sociological Review, 74, 208–224.

If You Got It, Think Hard About Flaunting It

I’ve attended the Gay Pride Parade in New York on more than one occasion. The event itself holds a special significance for many people who have been close to me and I’m always happy to see them happy, even if parades normally aren’t my cup of tea. That said, I have found certain aspects of the event a little peculiar, at least with regard to its execution. I had this to say about it some years ago:

One could be left wondering what a straight pride parade would even look like anyway, and admittedly, I have no idea. Of course, if I didn’t already know what gay pride parades do look like, I don’t know why I would assume they would be populated with mostly naked men and rainbows, especially if the goal is fostering acceptance and rejection of bigotry. The two don’t seem to have any real connection, as evidenced by black civil rights activists not marching mostly naked for the rights afforded to whites, and suffragettes not holding any marches while clad in assless leather chaps.

Colorful exaggerations aside, there’s something very noteworthy to think about here. While it might seem normal for gay pride events to be rather flamboyant affairs, there need not be any displays of promiscuous sexuality inherent to the event. That is, if people were celebrating a straight, monogamous relationship style with a parade, I don’t think we’d see many people dressing down or, in some cases, going without clothing at all. I imagine the event would be substantially more modest as, well, most other parts of life tend to be.

“From: Straight Pride Boat Ride, 2016”

The relevance of this point comes when one begins to consider what types of people in the world are most opposed to homosexual lifestyles and, accordingly, pose the largest obstacles to things like marriage and adoption rights for the gay community. When considering who those people are, the most common idea that will no doubt spring to many minds is the conservative, religious type (likely because that would be the correct answer). But why are such people most likely to condemn homosexuality on a moral level? A tempting answer would be to make reference to some religious texts condemning homosexuality, but that’s a rather circular explanation: religious people condemn homosexuality because they believe in a doctrine that condemns homosexuality. It’s also not entirely complete, as many parts of the doctrine are only selectively followed in other contexts. We’re also left wondering why those doctrines condemned homosexuality in the first place, placing us back at square one.

A more detailed picture begins to emerge when you consider what predicts religiosity in the first place; what type of person is most drawn to such groups. As it turns out, one of the better predictors of who ends up associating themselves with religious groups and who does not is sexual strategy. Those who are more inclined to monogamy (or, more precisely, opposed to promiscuity) tend to be more religious, and this holds across cultures and religions. By contrast, religiosity is not well predicted by general cooperative morals or behavior. It would be remarkable if religions from all parts of the world ended up stumbling upon a common distaste for promiscuity if it was not inherently tied to religious belief. Something about sexual behavior is uniquely predictive of religiosity, which ought to be strange when you consider that one’s sexual behavior should have little bearing on whether a deity (or several deities) exist. It has even been proposed that religious groups themselves function to support particular kinds of relatively monogamous mating arrangements. In that light, religious groups can be viewed as a support structure for monogamous couples who plan on having many children.

With that perspective in mind, the religious opposition to promiscuity becomes substantially clearer: promiscuity makes monogamous arrangements more difficult to sustain, and vice versa. If you plan on having a lot of children, men face risks of cuckoldry (raising a child that was unknowingly sired by another man) while women face risks of abandonment (if their husband runs off with another woman, leaving her to care for the children alone). As such, having lots of promiscuous men and women around who might lure your partner away or stop them from investing in you in the first place does the monogamous type no favors. In order to support their more monogamous lifestyle, then, these people begin to punish those who engage in promiscuous behaviors to make such strategies more costly to engage in and, accordingly, more rare.

The first punishment for promiscuity – spankings – didn’t have the intended effect

While homosexual individuals themselves don’t exactly pose direct risks to heterosexual, long-term mating couples, they may nevertheless be condemned to the extent that the gay community is viewed as promiscuous. There are a few possible reasons for that outcome to obtain. Perhaps homosexuals are viewed as supporting and encouraging promiscuity, and to let that go unpunished would start other people down a path towards promiscuity (similar to how recreational drug use is also condemned by the long-term maters). Perhaps all sorts of non-traditional sexual behaviors are condemned by the conservative groups and homosexuality just ends up condemned as a byproduct. Whatever the explanation for this condemnation, however, a key prediction falls out of this framework: moral condemnation of homosexuality ought to increase to the extent gay individuals are viewed as promiscuous and decrease to the extent they are viewed as monogamous. As homosexual groups (particularly men) are viewed as more promiscuous than their heterosexual counterparts (because they are, from every data set I’ve seen), this might help explain the condemnation and, in turn, suggest ways to do something about it.

This is exactly what a new paper by Pinsof & Haselton (2017) sought to test. The pair recruited approximately 1,000 participants online. The participants read either an article that reported gay men had more partners than straight men, or an article that reported gay and straight men had the same number of partners. Participants were also asked about their own perceptions of how promiscuous gay men are, their stance on gay rights, and their own mating orientation (whether they thought short-term sexual encounters were acceptable or not).

As expected, there was an appreciable relationship between one’s mating orientation and one’s support of gay rights: the more long-term their mating strategy, the less supportive of gay rights they were (r = -0.4). That said, despite men being more accepting of promiscuity than women, there was no relationship between gender and support for gay rights. Crucially, an interaction was observed between experimental condition and mating orientation when it came to predicting support for gay rights: Those who were particularly accepting of short-term mating arrangements opposed gay rights very little regardless of which article they had read regarding gay men’s sexual behavior (Ms = approximately 2.25 in both groups, on a scale from 1-7). However, among those who were relatively less accepting of short-term mating, there was a significant difference between the two conditions: when reading an article about how gay men were more promiscuous, opposition to gay rights was higher (M = 4.25) than it was in the condition where they read about how gay men were equally promiscuous (M = 3.5).

Acceptable

By manipulating perceptions of whether gay men were promiscuous, the researchers were also able to manipulate opposition to gay rights. So, if one is interested in achieving greater support for the homosexual community, that’s important information to bear in mind. It also brings me back to the initial point I mentioned about the Gay Pride events I have attended. While I was there, I couldn’t help but wonder whether the atmosphere of sexual promiscuity surrounding the parade would be off-putting to a substantial percentage of the population (even within the gay community), and it seems that intuition was borne out by the present data. The Gay Pride events go beyond a simple celebration and acceptance of homosexuality at points, as they are frequently coupled with sexual promiscuity. It seems that many people might have less of a problem with the former issue if the latter one wasn’t tagging along.

Then again, perhaps promiscuity will be a bit more closely linked with the homosexual community in general, given that children do not result from such unions (making them less costly to engage in) and because heterosexual men are usually only as promiscuous as women allow them to be. If women were just as interested in casual sex as men, there would likely be a lot more casual sex going on. When men are attracted to other men, however, the barriers that usually hold promiscuity in check (children and women’s desires) are much weaker. That does raise the interesting question of whether a different pattern holds for lesbian relationships (which are less promiscuous than gay ones), and it’s certainly one worth pursuing.

References: Pinsof, D. & Haselton, M. (2017). The effect of the promiscuity stereotype on opposition to gay rights. PLoS ONE 12(7): e0178534. https://doi.org/10.1371/journal.pone.0178534

Not-So-Leaky Pipelines

There’s an interesting perspective many people take when trying to understand the distribution of jobs in the world, specifically with respect to men and women: they look at the percentage of men and women in a population (usually in terms of country-wide percentages, but sometimes more localized), make note of any deviations from those percentages in terms of representation in a job, and then use those deviations to suggest that certain desirable fields (but not usually undesirable ones) are biased against women. So, for instance, if women make up 50% of the population but only represent 30% of lawyers, there are some who would conclude this means the profession (and associated organizations) is likely biased against women, usually because of some implicit sexism (as evidence of explicit and systematic sexism in training or hiring practices is exceptionally hard to come by). Similar methods have been used when substituting race for gender as well.

Just another gap, no doubt caused by sexism

Most of the ostensible demonstrations of this sexism issue are wanting, and I’ve covered a number of these examples before (see here, here, here, and here). Simply put, there are a lot of factors in the world that determine where people ultimately end up working (or whether they’re working at all). Finding a consistent gap between groups tells you something is different, just not what. As such, you don’t just get to assume that the cause of the difference is sexism and call it a day. My go-to example in that regard has long been plumbing. As a profession, it is almost entirely male dominated: something like 99% of the plumbers in the US are men. That’s as large of a gender gap as you could ask for, yet I have never once seen a campaign to get more women into plumbing or complaints about sexism in the profession keeping otherwise-interested women out. Similarly, men make up about 96% of the people shot by police, but the focus on police violence has never been on getting officers to shoot fewer men per se. In those cases, most people seem to recognize that factors other than sex are the primary determinants of the observed sex differences. Correlation isn’t causation, and maybe women aren’t as interested in digging around through human waste or committing violent felonies as men are. Not to say that many men are interested, just that more of those who are end up being men.

If that was the case and these sex differences aren’t caused by sexism, any efforts that sought to “fix” the gap by focusing on sexism would ultimately be unsuccessful. At the risk of saying something too obvious, you change outcomes by changing their causes; not unrelated issues. If we have the wrong idea as to what is causing an outcome, we end up wasting time and money (which often does not belong to us) trying to change it and accomplishing very little in the process (outside of getting people annoyed at us for wasting their time and money).

Today I wanted to add to that pile of questionable claims of sexism concerning an academic neighbor to psychology: philosophy. Though I was unaware of this debate, there is apparently some contention within the field concerning the perceived under-representation of women. As is typical, the apparent under-representation of women in this field has been chalked up to sexist biases keeping women discouraged and out of a job. To be clear about things, some people are looking at the percentage of men and women in the field of philosophy, noting that it differs from their expectations (whatever those are and however they were derived), calling it under-representation because of those expectations, and then further assuming a culprit in the form of sexism. As it turns out, the data has something to say about that.

It also has some great jokes about Polish people if you’re a racist.

The data in question come from a paper by Allen-Hermanson (2017), which examined sex differences in tenure-track hiring and academic publishing in philosophy departments. The reasoning behind this line of research was that if insidious forces are at work against women in philosophy departments, we ought to expect something of a leaky pipeline: women should not be as successful as men at landing desirable, tenure-track jobs, relative to the rates at which each sex earns philosophy degrees. So, if women earned, say, 40% of the philosophy PhDs during the last year, we might expect that they get 40% of the tenure-track jobs in the next, all else being equal. Across the 10-year period examined (2005-2014), there were three years in which women were hired very slightly below their relative percentage into the tenure-track jobs (and by “very slightly” I’m talking in the range of about 1-2%), one year in which it was dead even, and six years in which women were hired above the expected rate by much more substantial margins (in the range of 5-10%).

Putting some rough numbers to that, women earned about 28% of the PhDs and received about 36% of the jobs in the most recent hiring seasons. It seems, then, women tended to be over-represented in those positions, on average. Other data discussed in the paper correspond to those findings, again suggesting that women had about a 25% advantage over men in finding desirable positions (in terms of less desirable positions, men and women were hired in about equal numbers).
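For anyone who wants to see the arithmetic behind "over-represented relative to what expectation," here is a minimal sketch. It simply divides a group's share of the jobs by its share of the degrees, using the rough 28% and 36% figures above. The function name is my own, the men's shares are assumed to be the complements of the women's, and this is not the calculation used in the paper itself.

```python
def representation_ratio(share_of_jobs: float, share_of_degrees: float) -> float:
    """Share of hires divided by share of the qualified pool.

    1.0 means a group is hired exactly in proportion to its share of
    degree holders; above 1.0 means over-represented, below 1.0 under.
    """
    if share_of_degrees <= 0:
        raise ValueError("share_of_degrees must be positive")
    return share_of_jobs / share_of_degrees


# Rough figures from the text: women earned ~28% of PhDs, got ~36% of jobs.
women = representation_ratio(0.36, 0.28)

# Assumption for illustration: men's shares are simply the complements.
men = representation_ratio(1 - 0.36, 1 - 0.28)

print(f"women: {women:.2f}, men: {men:.2f}")
```

The point of writing it this way is that the baseline (share of degrees) is an explicit input. Swap in share of the general population instead and you get a very different answer to the same question.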

This finding is made all the stranger by Allen-Hermanson (2017) noting that male and female degree holders differed with respect to how often they published. On average, the new tenure-track female candidates who had never held such a position before had 0.77 publications. The comparable male number was 1.37. Of those who secured a job in 2012-2013, men averaged 2.4 publications to women’s 1.17. Not only are the men publishing about twice as much, then, but they’re also modestly less successful at landing a job (and this effect did not appear to be driven by particularly prolific publishers). While one could possibly make the case that female publications are in some sense higher quality, that remains to be seen. One could more easily make the case that female candidates were held to lower standards than male ones.

As the data currently stand, I can’t imagine many people will be making a fuss about them and crying sexism. Perhaps the men with the degrees went out to seek work elsewhere and that explains why women are over-represented. Perhaps there are other causes. The world is a complicated place, after all. The point here is that there won’t be talk about how philosophy departments are biased against men, just like there wasn’t much talk I saw last time research found a much larger academic bias in favor of women, holding candidate quality constant. I think that is largely because the data apparently favor women with respect to hiring. If the results had run in the opposite direction, I can imagine that a lot more noise would have been made about them and many people would be getting scolded right now about their tolerance of sexism. But that’s just an intuition.

“Now, if you’ll excuse me, I’m off to find bias against my group somewhere else”

When asking a question of under-representation, the most pressing matter should always be, “under-represented with respect to what expectation?” In order to say that a group is under-represented, you need to make it clear what the expected degree of representation is as well as why. We shouldn’t expect that men and women be killed by police in equal numbers unless we also expect that both groups behave more-or-less identically. We similarly shouldn’t expect that men and women enter into certain fields in the same proportion unless they have identical sets of interests. On the other hand, if the two groups are different with respect to some key factor that determines an outcome, such as interests, using sex itself is just a poor variable choice. Compared to interest in fixing toilets (and other such relevant factors), I imagine sex itself uniquely predicts very little about who ultimately ends up becoming a plumber. If we can use those better, more directly-relevant factors, we should. You don’t build your predictive model with irrelevant factors; not if accuracy is your goal, in any case.

References: Allen-Hermanson S. (2017). Leaking pipeline myths: In search of gender effects on the job market and early career publishing in philosophy. Frontiers in Psychology, 8, doi: 10.3389/fpsyg.2017.00953

Understanding Sex In Advertising

When people post videos on YouTube, one major point of interest for content creators and aggregators is to capture as much attention as possible. Your video is adrift in a sea of information and you’re trying to get as many eyes/clicks on your work as possible. In that realm, first impressions are all important: you want your video to have an attention-grabbing thumbnail image, as that will likely be the only thing viewers see before they actually click (or don’t) on it. So how do people go about capturing attention in that realm? One popular method is to ensure their thumbnail has a very emotive expression on it; a face of shock, embarrassment, stress, or any similar emotion. That’s certainly one way of attracting attention: trying to convince people there is something worth looking at, not unlike articles titled along the lines of five shocking tips for a better sex life (and number 3 will blow your mind!). Speaking of sex, that’s another popular method of grabbing attention: it’s fairly common for video thumbnails to feature people or body parts in various stages of undress. Not much will pull eyes towards a video like the promise of sex (and if you’re feeling an urge to click on that link, you’ll have experienced exactly what I’m talking about).

Case in point: most of that content is unrelated to the featured women

If sex happens to be attention grabbing, the natural question arises concerning what you might do with that attention once you have it. Much of the time, that answer will involve selling some good or service. In other words, sex is used as a form of advertising to try and sell things. “If you enjoyed that picture of a woman wearing a thong, you’ll surely love our reasonably-costed laptops!”. Something along those lines, anyway. Provided that’s your goal, lots of questions naturally start to crop up: How effective is sex at these goals? Does it capture attention well? Does it help people notice and remember your product or brand? Are those who viewed your sexy advert more likely to buy the product you’re selling? How do other factors – the sex of the person viewing the ad – contribute to your success in these realms?

These are some of the questions examined in a recent meta-analysis by Wirtz, Sparks, & Zimbres (2017). The researchers searched the literature and found about 80 studies, representing about 18,000 participants. They sought to find out what effects featuring sexually provocative material had, on average (defined in terms of style of dress, sexual behavior, innuendo, or sexual embeds, which is where hidden messages or images are placed within the ad, like the word “sex” added somewhere to the picture, which is something people apparently think is a good idea sometimes). These ads had to have been compared against a comparable, non-sexual ad for the same product to be included in the analysis to determine which was more effective.

The effectiveness of these ads was assessed across a number of domains as well, including ad recognition (in aided and unaided contexts), whether the brand being advertised in the ad could be recalled (i.e., were people paying attention to just the sex, or did they remember the product?), the positive or negative response people had to the ad, what people thought about the brand being advertised with sex, and whether the ad actually got them interested in purchasing the product (does sex sell?).

Finally, a number of potentially moderating factors that might influence these effects were considered. The first of these was gender: did these ads have different impacts on men and women? Other factors included the gender of the model used in the advertisement, the date the article was published (to see if attitudes shifted over time), the sample used (college students or not), and – most interestingly – product/ad congruity: did the type of product being advertised matter when it came to whether sex was effective? Perhaps sex might help sell a product like sun-tan lotion (as the beach might be a good place to pick up mates), but be much less effective for selling, say, laptops.

Maybe even political views

In terms of capturing attention, sex works. Of the 20 effects looking at the recall for ads, the average size was d = .38. Interestingly, this effect was slightly larger for the congruent ads (d = .45), but completely reversed for the incongruent ones (d = -.45). Sex was good at getting people to remember ads selling a sex-related product, but was not generally useful. That said, sexual appeals seemed better at getting people to remember just the ads. When the researchers turned to the matter of whether the brands within the ads were more likely to be recalled, the 31 effects looking at brand recognition turned out to barely break zero (d = .09). While sex might be attention-grabbing, it didn’t seem especially good at getting people to remember the objects being sold.

Regarding people’s attitudes towards the ads, sex seems like something of a wash (d = -.07). Digging a little deeper revealed a more nuanced picture of these reactions, though: while sexual ads seemed to be a modest hit with the men (d = .27), they had the opposite effect on women (d = -.38). Women seemed to dislike the ads modestly more than men liked them, as sexual strategies theory would suggest. For the record, the type of model being depicted didn’t make much of a difference: people liked male models the least (d = -.28), then female models (d = -.20), and couples were mildly positive (d = .08).

Curiously, both the men and women seemed to be in agreement regarding their stance towards brands that used sex to sell things: negative, on the whole (d = -.22). For women, this makes some intuitive sense: they didn’t seem to be fans of the sexual ads, so they weren’t exactly feeling too ingratiated towards the brand itself. But why were the men negatively inclined towards the brand if they were favorably inclined towards the ads? I can only speculate on that front, but I assume it would have something to do with their inevitable disappointment: either that the brands were promising sex the male customers likely knew they couldn’t deliver on, or perhaps the men simply wanted to enjoy the sex part and the brand itself ended up getting in their way. I can’t imagine men would be too happy with their porn time being interrupted by an ad for toilet paper or fruit snacks mid-video.

Finally, turning to the matter of purchase intentions – whether the ads encouraged people to want to buy the product or not – it seemed that sex didn’t really sell, but it didn’t really seem to hurt, either (d = .01). One interesting exception in that realm was that sex appeals were actually less likely to get people to buy a product when the product being sold was incongruent with the sexual appeal (d = -.24). Putting that into a simple example, the phrase “strip club buffet” probably doesn’t whet many appetites, and wouldn’t be a strong selling point for such a venue. Sex can be something of a disease vector, and associating your food with that might elicit more than a bit of disgust.

“Oh good, I was starving. This seems like as good a place as any”

As I’ve noted before, context matching matters in advertising. If you’re looking to sell people something that highlights their individuality, then doing so in a mating context works better than in a context of fear (as animals aren’t exactly aiming to look distinct when predators are nearby). The same seems to hold for using sex. While it might be useful for getting eyes on your advertisement, sex is by no means guaranteed to ensure that people like what they see once you have their attention. In that regard, sex – like any other advertising tool – needs to be used selectively, targeting the correct audience in the correct context if it’s going to succeed at increasing people’s interest in buying. Sex in general doesn’t sell. However, it might prove more effective for those with more promiscuous attitudes than those with more monogamous ones; it might prove useful if advertising a product related to sex or mating, but not useful for selling domain names (like the old GoDaddy commercials; coincidentally, GoDaddy was also the brand I used to register this site); it might work better if you associate your product with things that lead to sex (like status), rather than sex itself. These are all avenues worth pursuing further to see when, where, and why sex works or fails.

That said, it is still possible that sex might prove useful, even in some inappropriate contexts. Consider the following hypothetical example: people will consider buying a product only after they have seen an advertisement for it. Advertisement X isn’t sexual, but when paired with the product will increase people’s intentions to buy it by 10%. However, it will also not really get noticed by many people, as the content is bland. By contrast, advertisement Y is sexual, will decrease people’s intentions to buy a product by 10%, but will also get four times as many eyes on it. The latter ad might well be more successful, as it will capture the eye of more potential customers who may still buy the product despite the inappropriate use of sex. While targeting advertisements might be more effective, the attention model of advertising shouldn’t be ruled out entirely, especially if targeting advertising would prove too cumbersome.
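To put rough numbers on that hypothetical, here is a minimal sketch. The baseline audience of 1,000 viewers and the 5% baseline purchase rate are invented purely for illustration, and I am treating the "10%" as a relative change to that baseline rate. Only the plus or minus 10% and the four-to-one attention ratio come from the example above.

```python
def expected_buyers(viewers: int, baseline_intent: float, intent_change: float) -> float:
    """Expected number of buyers among the people who actually saw the ad.

    intent_change is a relative shift in purchase intent, e.g. +0.10
    means intent is 10% higher than baseline after seeing the ad.
    """
    return viewers * baseline_intent * (1 + intent_change)


# Hypothetical values, not taken from any study.
BASE_VIEWERS = 1_000      # people who notice the bland ad X
BASELINE_INTENT = 0.05    # share of viewers who would buy at baseline

# Ad X: non-sexual, raises intent by 10%, but few people notice it.
ad_x = expected_buyers(BASE_VIEWERS, BASELINE_INTENT, +0.10)

# Ad Y: sexual, lowers intent by 10%, but gets four times the eyes.
ad_y = expected_buyers(4 * BASE_VIEWERS, BASELINE_INTENT, -0.10)

print(f"ad X buyers: {ad_x:.0f}, ad Y buyers: {ad_y:.0f}")
print(f"Y relative to X: {ad_y / ad_x:.2f}x")
```

Under these assumptions the baseline values cancel out of the comparison: ad Y comes out ahead whenever its attention multiplier exceeds 1.1 / 0.9, or roughly 1.22. A four-fold attention advantage clears that bar easily, which is the whole point of the hypothetical.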

References: Wirtz, J., Sparks, J., & Zimbres, T. (2017). The effect of exposure to sexual appeals in advertisements on memory, attitude, and purchase intention: A meta-analytic review. International Journal of Advertising, https://doi.org/10.1080/02650487.2017.1334996

 

Divorced Dads And Their Daughters

Despite common assumptions, parents have less of an impact on their children’s future development than they’re often credited with. Twins reared apart usually aren’t much different than twins reared together, and adopted children don’t end up resembling their adoptive parents substantially more than strangers. While parents can indeed affect their children’s happiness profoundly, a healthy (and convincing) literature exists supporting the hypothesis that differences in parenting behaviors don’t do a whole lot of shaping in terms of children’s later personalities (at least when the child isn’t around the parent; Harris, 2009). This makes a good deal of theoretical sense, as children aren’t developing to be better children; they’re developing to become adults in their own right. What children learn works when it comes to interacting with their parents might not readily translate to the outside world. If you assume your boss will treat you the same way your parents would, you’re likely in for some unpleasant clashes with reality. 

“Who’s a good branch manager? That’s right! You are!”

Not that this has stopped researchers from seeking to find ways that parent-child interactions might shape children’s future personalities, mind you. Indeed, I came upon a very new paper purporting to do just that this last week. It suggested that the quality of a father’s investment in his daughters causes shifts in his daughter’s willingness to engage in risky sexual behavior (DelPriore, Schlomer, & Ellis, 2017). The analysis in the paper is admittedly a bit tough to follow, as the authors examine three- and even four-way interactions (which are difficult to keep straight in one’s mind: the importance of variable A changes contingent on the interaction between B, C, & D), so I don’t want to delve too deeply into the specific details. Instead, I want to discuss the broader themes and design of the paper.

Previous research looking at parenting effects on children’s development often suffers from the problem of relatedness, as genetic similarities between parents and children make it hard to tease apart the unique effects of parenting behaviors (how the parents treat their children) from natural resemblances (nice parents have nice children). In a simple example, parents who love and nurture their children tend to have children who grow up kinder and nicer, while parents who neglect their children tend to have children who grow up to be mean. However, it seems likely that parents who care for their children are different in some important regards than those who neglect them, and those tendencies are perfectly capable of being passed on through shared genes. So are the nice kids nice because of how their parents treated them or because of inheritance? The adoption studies I mentioned previously tend to support the latter interpretation. When you control for genetic factors, parenting effects tend to drop out.

What’s good about the present research is its innovative design to try and circumvent this issue of genetic similarities between children and parents. To accomplish this goal, the authors examined (among other things) how divorce might affect the development of different daughters within the same family. The reasoning for doing so seems to go roughly as follows: daughters should base their sexual developmental trajectory, in part, on the extent of paternal investment they’re exposed to during their early years. When daughters are regularly exposed to fathers that invest in them and monitor their behavior, they should come to expect that subsequent male parental investment will be forthcoming in future relationships and avoid peers who engage in risky sexual behavior. The net result is that such daughters will engage in less risky sexual behavior themselves. By contrast, when daughters lack proper exposure to an investing father, or have one who does not monitor their peer behavior as tightly (due to divorce), they should come to view future male investment as unlikely, associate with those who engage in riskier sexual behavior, and engage in such behavior themselves.

Accordingly, if a family with two daughters experiences a divorce, the younger daughter’s development might be affected differently than the older daughter’s, as they have different levels of exposure to their father’s investment. The larger this age gap between the daughters, the larger this effect should be. After recruiting 42 sister pairs from intact families and 59 sister pairs from divorced families and asking them some retrospective questions about what their life was like growing up, this is basically the result the authors found. Younger daughters tended to receive less monitoring than older daughters in families of divorce and, accordingly, tended to associate with more sexually-risky peers and engage in such behaviors themselves. This effect was not present in biologically intact families. Do we finally have some convincing evidence of parenting behaviors shaping children’s personalities outside the home?

Look at this data and tell me the first thing that comes to your mind

I don’t think so. The first concern I would raise regarding this research is the monitoring measure utilized. Monitoring, in this instance, represented a composite score of how much information the daughters reported their parents had about their lives (rated from (1) didn’t know anything, (2) knew a little, or (3) knew a lot) in five domains: who their friends were, how they spent their money, where they spent their time after school, where they were at night, and how they spent their free time. While one might conceptualize that as monitoring (i.e., parents taking an active interest in their children’s lives and seeking to learn about/control what they do), it seems that one could just as easily think of that measure as how often children independently shared information with their parents. After all, the measure doesn’t specify, “how often did your parents try to learn about your life and keep track of your behavior?” It just asked about how much they knew.

To put that point concretely, my close friends might know quite a bit about what I do, where I go, and so on, but it’s not because they’re actively monitoring me; it’s because I tell them about my day voluntarily. So, rather than talking about how a father’s monitoring of his daughter might have a causal effect on her sexual behavior, we could just as easily talk about how daughters who engage in risky behavior prefer not to tell their parents about what they’re doing, especially if their personal relationship is already strained by divorce.

My second concern involves divorce itself. Divorce can indeed affect the personal relationships of children with their parents. However, that’s not the only thing that happens after a divorce. There are other effects that extend beyond emotional closeness. An important example of these other factors is financial. If a father has been working while the mother took care of the children – or if both parents were working – divorce can result in massive financial hits for the children (as most end up living with their mother or in a joint custody arrangement). The results of entering additional economic problems into an already emotionally-upsetting divorce can entail not only additional resentment between children and parents (and, accordingly, less sharing of information between them; the reduced monitoring), but also major alterations to the living conditions of the children. These lifestyle shifts could include moving to a new home, upsetting existing peer relations, entering new social groups, and presenting children with new logistical problems to solve.

Any observed changes in a daughter’s sexual behavior in the years following a divorce, then, can be thought of as a composite of all the changes that take place post-divorce. While the quality and amount of the father-daughter relationship might indeed change during that time, there are additional and important factors that aren’t controlled for in the present paper.

Too bad the house didn’t split down the middle as nicely

The final concern I wanted to discuss was more of a theoretical one, and it’s slightly larger than the methodological points above. According to the theory proposed at the beginning of the paper:

“…the quality of fathering that daughters receive provides information about the availability and reliability of male investment in the local ecology, which girls use to calibrate their mating behavior and expectations for long-term investment from future mates.”

This strikes me as a questionable foundation for a few reasons. First, it would require that the relationship of a daughter’s parents is substantially predictive of the relationships she is likely to encounter in the world with regard to male investment. In other words, if your father didn’t invest in your mother (or you) that heavily (or at least during your childhood), that needs to mean that many other potential fathers are likely to do the same to you (if you’re a girl). This would further require, then, that male investment be appreciably uniform across time in the world. If male investment wasn’t stable between males and across time within a given male, then trying to predict the general availability of future male investment from your father’s seems like a losing formula for accuracy.

It seems unlikely the world is that stable. For similar reasons, I suggested that children probably can’t accurately gauge future food availability from their access to food at a young age. Making matters even worse in this regard is that, unlike food shortages, the presence or absence of male parental investment doesn’t seem like the kind of thing that will be relatively universal. Some men in a local environment might be perfectly willing to invest heavily in women while others are not. But that’s only considering the broad level: men who are willing to invest in general might be unwilling to invest in a particular woman, or might be willing or unwilling to invest in that woman at different stages in her life, contingent on her mate value shifting with age. Any kind of general predictive power that could be derived about men in a local ecology seems weak indeed, especially if you are basing that decision off a single relationship: the one between your parents. In short, if you want to know what men in your environment are generally like, one relationship should be as informative as another. There doesn’t seem to be a good reason to assume your parents will be particularly informative.

Matters get even worse for the predictive power of father-daughter relationships when one realizes the contradiction between that theory and the predictions of the authors. The point can be made crystal clear simply by considering the families examined in this very study. The sample of interest was composed of daughters from the same family who had different levels of exposure to paternal investment. That ought to mean, if I’m following the predictions properly, that the daughters – the older and younger one – should develop different expectations about future paternal investment in their local ecology. Strangely, however, these expectations would have been derived from the same father’s behavior. This would be a problem because both daughters cannot be right about the general willingness of males to invest if they hold different expectations. If the older daughter with more years of exposure to her father comes to believe male investment will be available and the younger daughter with fewer years of exposure comes to believe it will be unavailable, these are opposing expectations of the world.

However, if those different expectations are derived from the same father, that alone should cast doubt on the ability of a single parental relationship to predict broad trends about the world. It doesn’t even seem to be right within families, let alone between them (and it’s probably worth mentioning at this point that, if children are going to be right about the quality of male investment in their local ecology more generally, all the children in the same area should develop similar expectations, regardless of their parents’ behavior. It would be strange for literal neighbors to develop different expectations of general male behavior in their local environment just because the parents of one home got divorced while the other stayed together. Then again, it should be strange for daughters of the same home to develop different expectations, too).

Unless different ecologies have rather sharp borders

On both a methodological and theoretical level, then, there are some major concerns with this paper that render its interpretation suspect. Indeed, at the heart of the paper is a large contradiction: if you’re going to predict that two girls from the same family develop substantially different expectations about the wider world from the same father, then it seems impossible that the data from that father is very predictive of the world. In any case, the world doesn’t seem as stable as it would need to be for that single data point to be terribly useful. There ought not be anything special about the relationship of your parents (relative to other parents) if you’re looking to learn something about the world in general.

While I fully expect that children’s lives following their parents’ divorce will be different – and those differences can affect development, depending on when they occur – I’m not so sure that the personal relationship between fathers and daughters is the causal variable of primary interest.

References: DelPriore, D., Schlomer, G., & Ellis, B. (2017). Impact of Fathers on Parental Monitoring of Daughters and Their Affiliation With Sexually Promiscuous Peers: A Genetically and Environmentally Controlled Sibling Study. Developmental Psychology. Advance online publication. http://dx.doi.org/10.1037/dev0000327

Harris, J. (2009). The Nurture Assumption: Why Children Turn Out the Way They Do. Free Press, NY.