Diversity: A Follow-Up

My last post focused on the business case for demographic diversity. Summarizing briefly, an attempted replication of a paper claiming that companies with greater gender and racial diversity outperformed those with less diversity failed to reach the same conclusion. Instead, these measures of diversity were effectively unrelated to business performance once you controlled for a few variables. This should make plenty of intuitive sense, as demographic variables per se aren’t related to job performance. While they might prove to be rough proxies if you have no information (men or women might be better at tasks X or Y, for instance), once you can assess skills, competencies, and interests, the demographic variables cease to be good predictors of much else. Being a man or a woman, African or Chinese, does not itself make you competent or interested in any particular domain. Today, I wanted to tackle the matter of diversity itself on more of a philosophical level. With any luck, we might be able to understand some of the issues that can cloud discussions on the topic.

And if I’m unlucky, well…

Let’s start with the justifications for concerns with demographic diversity. As far as I’ve seen, there are two routes people take with this. The first – and perhaps most common – has been the moral justification for increasing diversity of race and gender in certain professions. The argument here is that certain groups of people have been historically denied access to particular positions, institutions, and roles, and so they need to be proactively included in such endeavors as a means of reparation to make up for past wrongs. While that’s an interesting discussion in its own right, I have not found many people who claim that, say, more women should be brought into a profession no matter the impact. That is, no one has said, “So what if bringing in more women would mess everything up? Bring them in anyway.” This brings us to the second justification for increasing demographic diversity that usually accompanies the first: the focus on the benefits of cognitive diversity. The general idea here is not only that people from all different groups will perform at least as well in such roles, but that having a wider mix of people from different demographic groups will actually result in benefits. The larger your metaphorical cognitive toolkit, the more likely you will successfully meet and overcome the challenges of the world. Kind of like having a Swiss Army knife with many different attachments, just with brains.

This idea is appealing on its face but, as we saw last time, diversity wasn’t found to yield any noticeable benefits. There are a few reasons why we might expect that outcome. The first is that cognitive diversity itself is not always going to be useful. If you’re on a camping trip and you need to saw through a piece of wood, the saw attachment on your Swiss Army knife would work well; the scissors, toothpick, and can opener will all prove ineffective at solving your problem. Even the non-serrated knife will prove inefficient at the task. The solutions to problems in the world are not general-purpose in nature. They require specialized equipment to solve. Expanding that metaphor into the cognitive domain, if you’re trying to extract bitumen from tar sands, you don’t want a team of cognitively diverse individuals including a history major, a psychology PhD, and a computer scientist, along with a middle-school student. Their diverse set of skills and knowledge won’t help you solve your problem. You might do better if you hired a cognitively non-diverse group of petroleum engineers.

This is why companies hiring for positions regularly list rather specific qualification requirements. They understand – as we all should – that cognitive diversity isn’t always (or even usually) useful when it comes to solving particular tasks efficiently. Cognitive specialization does that. Bringing this point back to demographic diversity, the problem should be clear enough: whatever cognitive diversity exists between men and women, or between different racial groups, it needs to be task relevant in order for it to even potentially improve performance outcomes. Even if the differences are relevant, in order for diversity to improve outcomes, the different demographic groups in question need to complement each other’s skill sets. If, say, women are better at programming than men, then diversity of men and women wouldn’t improve programming outcomes; the non-diverse outcome of hiring women instead of men would.

Just like you don’t improve your track team’s relay time by including diverse species

Now it’s not impossible that such complementary cognitive demographic differences exist, at least in theory, even though the former restrictions are already onerous. However, the next question that arises is whether such cognitive differences would actually exist in practice by the time hiring decisions were made. There’s reason to expect they would not, as people do not specialize in skills or bodies of knowledge at random. While there might be an appreciable amount of cognitive diversity between groups like men and women, or between racial groups, in the entire population (indeed, meaningful differences would need to exist in order for the beneficial diversity argument to make any sense in the first place), people do not get randomly sorted into groups like professions or college majors.

Most people probably aren’t that interested in art history, or computer science, or psychology, or math to the extent they would pursue it at the expense of everything else they could do. As such, the people who are sufficiently interested in psychology are probably more similar to one another than they are to people who major in engineering. Those who are interested in plumbing are likely more similar to other plumbers than they are to nurses.

As such, whatever differences exist between demographics on the population level may be reduced in part or in whole once people begin to self-select into different groups based on skills, interests, and aptitudes. Even if men and women possess some cognitive differences in general, male and female nurses, or psychologists, or engineers, might not differ in those same regards. The narrower the skill set you’re looking for when it comes to solving a task, the more similar we might expect people who possess those skills to be. Just to use my profession, psychologists might be more similar than non-psychologists; those with a PhD might be more similar than those with just a BA; those who do research may differ from those who enter into the clinical field, and so on.
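To make the self-selection point concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration rather than an estimate from any study: a single hypothetical "interest" trait, two groups whose population averages differ by half a standard deviation, and a field that only people above a fixed cutoff on that trait choose to enter.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate_self_selection(n=200_000, gap=0.5, cutoff=2.0, seed=1):
    """Compare the group gap in the whole population with the gap inside a field.

    All parameters are hypothetical: `gap` is the population difference in
    average interest (in SD units) and `cutoff` is the level of interest
    needed before someone enters the field.
    """
    rng = random.Random(seed)
    # Both groups are normally distributed with SD 1; group B's mean is shifted by `gap`.
    group_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    group_b = [rng.gauss(gap, 1.0) for _ in range(n)]

    # Self-selection: only people whose interest exceeds the cutoff enter the field.
    field_a = [x for x in group_a if x > cutoff]
    field_b = [x for x in group_b if x > cutoff]

    return {
        "population_gap": mean(group_b) - mean(group_a),
        "field_gap": mean(field_b) - mean(field_a),
        "share_b_in_field": len(field_b) / (len(field_a) + len(field_b)),
    }


result = simulate_self_selection()
print(result)
```

Under these assumptions, the gap between the two groups inside the field should come out at a small fraction of the population gap, even though the field itself ends up demographically lopsided. Real professions select on many traits at once, so treat this as a picture of the logic, not a measurement.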

I think these latter points are where a lot of people get tripped up when thinking about the possible benefits of demographic diversity to task performance. They notice appreciable and real differences between demographic groups on a number of cognitive dimensions, but fail to appreciate that these population differences might (a) not be large once enough self-selection by skills and interests has taken place, (b) not be particularly task relevant, and (c) not be complementary.

Ironically, one of the larger benefits to cognitive diversity might be the kind that people typically want to see the least: the ability of differing perspectives to help check the personal biases we possess. As people become less reliant on those in their immediate vicinity and increasingly able to self-segregate into similar-thinking social and political groups around the world, they may begin to likewise pursue policies and ideas that are increasingly self-serving and less likely to benefit the population on the whole. Key assumptions may go unchallenged and the welfare of others may be taken into account less frequently, resulting in everyone being worse off. Groups like the Heterodox Academy have been set up to try and counteract this problem, though the extent of their success is debatable.

A noble attempt to hold back the oncoming flood all the same

Condensing this post a little, the basic idea is this: men and women (to use just one example), on average, are likely to show a greater degree of between-group cognitive diversity than are male and female computer science majors. Or male and female literature majors. Any group you can imagine. Once people are segregating themselves into different groups on the basis of shared abilities and interests, those within the groups should be much more similar to one another than you’d expect on the basis of their demographics. If much of the cognitive diversity between these groups is getting removed through self-selection, then there isn’t much reason to expect that demographic diversity within those groups will have much of an effect one way or the other. If male and female programmers already have the same sets of skills and fairly similar personalities, making those groups look more male or more female won’t have much of an overall effect on their performance.

For it to even be possible that such diversity might help, we need to grant that meaningful, task-relevant differences between demographic groups exist, are retained throughout a long process of self-selection, and that these differences complement each other, rather than one group being superior. Further, these differences would need to create more benefits than conflicts. While there might be plenty of cognitive diversity in, say, the US Congress in terms of ideology, that doesn’t necessarily mean it helps people achieve useful outcomes all the time once you account for all the dispute-related costs and lack of shared goals.

If qualified and interested individuals are being kept out of a profession simply because of their race or gender, that obviously carries costs and should be stopped. There would be many valuable resources going untapped. If, however, people left to their own devices are simply making choices they feel suit them better – creating some natural demographic imbalances – then just changing their representation in this field or that shouldn’t impact much.

Does Diversity Per Se Pay?

One of the most interesting short reports I read recently described research conducted in Australia examining what effect blind reviews would have on hiring. The premise of the research, as far as I can surmise, was a fear of conscious or unconscious bias against women and minority groups when it came to getting hired. This bias would naturally make it harder for those groups to find employment, ultimately yielding a less diverse workforce. In the interests of avoiding that bias, the research team compared what happened when candidates were assessed on either standard resumes or de-identified ones. The latter resumes were identical to the former, except they had group-relevant information (like gender and race) removed. If reviewers don’t have information about race or gender available, then they couldn’t possibly assess the candidates on that basis, whether consciously or unconsciously. That seems straightforward enough. The aim was to compare the results from the blind assessments to those of the standard resumes. As it turned out, there were indeed hints of bias; relatively small in size sometimes, but present nonetheless. However, the bias did not go in the direction that had been feared.

Shocking that the headline wasn’t “Blind review processes are biased”

Specifically, when the participants assessing the resumes had information about gender, they were about 3% more likely to select women, and 3% less likely to select men. Further, minorities were more likely to be selected as well when the information was available (about 6% for males and 9% for females). While there’s more to the picture than that, the primary result seemed to be that, when given the option, these reviewers discriminated in favor of women and minority groups simply because of their group membership. If these results had run in the opposite direction (against women and minorities) there would have no doubt been calls for increasing blind reviews. However, because blind reviews seemed to disfavor women and minorities, the authors had a different suggestion:

Overall, the results indicate the need for caution when moving towards ‘blind’ recruitment processes in the Australian Public Service, as de-identification may frustrate efforts aimed at promoting diversity

It’s hard to interpret that statement as anything other than “we should hire more women and minorities, regardless of qualifications.” Even if sex and race ought to be irrelevant to the demands of the job and candidates should be assessed on their merit, people should also apparently be cautious when removing those irrelevant pieces from the application process. The authors seemed to favor discrimination based on sex or race so long as it benefited the right groups. Such discriminatory practices have led to negative reactions on the part of others, as one might expect.

This brings me to another question: why should we value diversity when it comes to hiring decisions? To be clear, the diversity being sought is often strictly demographic in nature (many organizations tout diversity in race, for instance, but not in perspective. I don’t recall the draw of many positions being that you will meet a variety of people who hold fundamental disagreements with your view on the world). It’s also usually the kind of diversity that benefits women and minorities (I’ve never come across calls to get more white males into certain fields dominated by women or other races. Perhaps they exist; I just haven’t seen them). But are there real economic benefits to increasing diversity per se? Could it be the case that more diverse organizations just do better? On the face of it, I would assume the answer is “no” if the diversity in question is simply demographic in nature. What matters when it comes to job performance is not the color of one’s skin or what sex chromosomes one possesses, but rather the skills and competencies one brings to the job. While some of those skills and competencies might be very roughly approximated by race and gender if you have no additional information about your applicants, we thankfully don’t need to rely on those indirect measures. Rather than asking about gender or race, one could just ask directly about skill sets and interests. When you can do that, the additional value of knowing one’s group membership is likely close to nil. Why bother using a predictor of a variable when you can just use the variable itself?
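The proxy-versus-variable point can be sketched with a small simulation. The setup is entirely hypothetical: a binary group label that shifts average skill a little, and job performance that depends only on skill plus noise. The question is how much the group label still tells you about performance once skill itself is measured, which a partial correlation answers.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(7)
n = 20_000

# Assumed data-generating process (illustrative only):
group = [rng.randint(0, 1) for _ in range(n)]           # binary demographic label
skill = [0.5 * g + rng.gauss(0, 1) for g in group]      # label is a rough proxy for skill
performance = [s + rng.gauss(0, 1) for s in skill]      # performance depends on skill alone

r_perf_group = pearson(performance, group)
r_perf_skill = pearson(performance, skill)
r_group_skill = pearson(group, skill)
r_perf_group_given_skill = partial_corr(r_perf_group, r_perf_skill, r_group_skill)

print(f"performance ~ group:               {r_perf_group:.3f}")
print(f"performance ~ skill:               {r_perf_skill:.3f}")
print(f"performance ~ group, given skill:  {r_perf_group_given_skill:.3f}")
```

In this toy world the group label correlates with performance when it is all you have, but the partial correlation should sit near zero once skill is in hand. That result is built into the assumptions, of course; the sketch only shows what "close to nil" looks like when the proxy carries no information beyond the thing it stands in for.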

Do you really love roundabouts that much?

Nevertheless, it has apparently been reported before that demographic diversity predicts the relative success of companies (Herring, 2009). A business case was made for diversity, such that diverse companies were found to generally do better than less diverse ones across a number of different metrics. Not that those in favor of increasing diversity really seemed to need a financial justification, but having one certainly wouldn’t hurt their case. As this paper was apparently popular within the literature (for what I assume is that reason), a replication was attempted (Stojmenovska et al., 2017), beginning in a graduate course as an assignment to help students “learn from the best.” Since it seems “psychology research” and “replications” mix about as well as oil and water as of late, the results turned out a bit worse than hoped. The student wasn’t even trying to look for problems; they just stumbled upon them.

In this instance, the replication attempt failed to find the published result, instead catching two primary mistakes made in the original paper (as opposed to anything malicious): there were a number of coding mistakes within the data, and the sample data itself was skewed. Without going too deeply into why this is a problem, it should suffice to say that coding mistakes are bad for all the obvious reasons. Fixing the coding mistakes by deleting missing data resulted in a substantial reduction in sample size (25-50% smaller). As for the issue of skew, having a skewed sample can result in an underestimation of the relationship between predictors and outcomes. In brief, there were confounding relationships between predictor variables and the outcomes that were not adequately controlled for in the original paper. To correct for the skew issue, a log transformation on the data was carried out, resulting in a dramatic increase in the relationship between particular variables.

In order to provide a concrete sense for that increase, in the original report the correlation between company size and racial diversity was .14; after the log transformation was carried out, that correlation increased to .41. This means that larger companies tended to be more racially diverse than smaller ones, but that relationship was not fully accounted for in the original paper examining how diversity impacted success. The same issue held for gender diversity and establishment size.
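Here is a rough sketch of why a log transformation can change a correlation that much. The numbers are invented and are not the replication's data: I assume company size is heavily right-skewed (log-normal) and that a made-up diversity score rises linearly with the log of size. The same underlying relationship is then measured twice, once against raw size and once against logged size.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)
n = 5_000

# Assumed: establishment size is log-normal, so a few very large companies
# dominate the raw scale.
size = [math.exp(rng.gauss(4.0, 1.5)) for _ in range(n)]
log_size = [math.log(s) for s in size]

# Assumed: the diversity score depends on the log of size, plus noise.
diversity = [0.1 * ls + rng.gauss(0.0, 0.33) for ls in log_size]

r_raw = pearson(size, diversity)
r_logged = pearson(log_size, diversity)

print(f"raw size vs. diversity:    r = {r_raw:.2f}")
print(f"logged size vs. diversity: r = {r_logged:.2f}")
```

Because a handful of huge values stretch the raw scale, the raw correlation understates a relationship that is plainly visible on the log scale. A control variable entered in its raw, skewed form would therefore do a poor job of soaking up variance it actually shares with the predictor of interest.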

Once these two issues – coding errors and skewed data – were addressed, the new results showed that gender and racial diversity were effectively unrelated to company performance. The only remaining relationship was a small one between gender diversity and the logged number of customers. While seven of the original eight hypotheses were supported in the first paper, the replication attempt correcting these errors only found one of the eight to be statistically supported. As most of the effects no longer existed, and the one that did exist was small in size, the business justification for increasing racial and gender diversity failed to receive any real support.

Very colorful, but they ultimately all taste the same

As I initially mentioned, I don’t see a very good reason to expect that a more demographically diverse group of employees should yield better outcomes. They don’t yield worse outcomes either. However, the study from Australia suggests that the benefits of diversity (or the lack thereof) are basically beside the point in many instances. That is, not only do I imagine this failure to replicate won’t have a substantial impact on many people’s views on whether or not diversity should be increased, but I don’t think it would even if diversity were found to be a bad thing, financially speaking. This is because I don’t suspect many views of whether increasing diversity should be done are based on the foundation that it’s good for people economically in the first place. Increasing diversity isn’t viewed as a tricky empirical matter as much as it seems to be a moral one; one in which certain groups of people are viewed as owing or deserving various things.

This is only looking at the outcomes of adding diversity, of course. The causes of such diverse levels of diversity across different walks of life are another beast entirely.

References: Stojmenovska, D., Bol, T., & Leopold, T. (2017). Does diversity pay? A replication of Herring (2009). American Sociological Review, 82, 857–867.

Herring, C. (2009). Does diversity pay? Race, gender, and the business case for diversity. American Sociological Review, 74, 208–224.