Privilege And The Nature Of Inequality

Recently, there’s been a new comic floating around my social news feeds claiming that it will forever change the way I think about something. It’s not as if such articles are ever absent from my feeds, really, but I decided this one would provide me with the opportunity to examine some research I’ve wanted to write about for some time. In the case of this mind-blowing comic, the concept of privilege is explained through a short story. The concept itself is not a hard one to understand: privilege here refers to cases in which an individual goes through life with certain advantages they did not earn. The comic in question looks at an economic privilege: two children are born, but one has parents with lots of money and social connections. As expected, the one with the privilege ends up doing fairly well for himself, as many burdens of life have been removed, while the one without ends up working a series of low-paying jobs, eventually in service to the privileged one. The privileged individual declares that nothing has ever been handed to him in life as he is literally being handed some food on a silver platter by the underprivileged individual, apparently oblivious to what his parents’ wealth and connections have brought him.

Stupid, rich baby…

In the interests of laying my cards on the table at the outset, I would count myself among those born into privilege. While my family is not rich or well-connected the way people typically think about those things, I have never wanted for any of life’s necessities; I have even had access to many additional luxuries that others have not. Having those burdens removed is something I am quite grateful for, and it has allowed me to invest my time in ways other people could not. I have the hard work and responsibility of my parents to thank for these advantages. These are not advantages I earned, but they are certainly not advantages which just fell from the sky; if my parents had made different choices, things likely would have worked out differently for me. I want to acknowledge my advantages without downplaying their efforts at all.

That last part raises a rather interesting question that pertains to the privilege debate, however. In the aforementioned comic, the implication seems to be – unless I’m misunderstanding it – that things likely would have turned out equally well for both children if they had been given access to the same advantages in life. Some of the differences each child starts with seem to be the result of their parents’ work, while other parts of that difference are the result of happenstance. The comic appears to suggest the differences in that case were just due to chance: both sets of parents love their children, but one set seems to have better jobs. Luck of the draw, I suppose. However, is that the case for life more generally; you know, the thing about which the comic intends to make a point?

For instance, if one set of parents happen to be more short-term oriented – interested in taking rewards now rather than foregoing them for possibly larger rewards in the future, i.e., not really savers – we could expect that their children will, to some extent, inherit those short-term psychological tendencies; they will also inherit a more meager amount of cash. Similarly, the child of the parents who are more long-term focused should inherit their proclivities as well, in addition to the benefits those psychologies eventually accrued.

Provided that happened to be the case, what would become of these two children if they both started life in the same position? Should we expect that they both end up in similar places? Putting the question another way, let’s imagine that, all of a sudden, the wealth of this world was evenly distributed among the population; no one had more or less than anyone else. In this imaginary world, how long would that state of relative equality last? I can’t say for certain, but my expectation is that it wouldn’t last very long at all. While the money might be equally distributed in the population, the psychological predispositions for spending, saving, earning, investing, and so on are unlikely to be. Over time, inequalities will again begin to assert themselves as those psychological differences – be they slight or large – accumulate from decision after decision.
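
To make that intuition a bit more concrete, here is a minimal toy simulation (in Python) of the thought experiment: everyone begins with identical wealth and income, but each person saves a different, stable fraction of what they earn. Every specific number in it – the incomes, the returns, the spread of saving rates – is an assumption invented purely for illustration, not an estimate of anything real.

```python
import random

# Toy model: identical starting wealth and income, differing saving propensities.
# All parameters are arbitrary assumptions chosen only to illustrate the dynamic.
random.seed(42)

N_PEOPLE = 1000
YEARS = 30
INCOME = 30_000       # identical annual income for everyone
RETURN_RATE = 0.05    # annual return on whatever has been saved

wealth = [10_000.0] * N_PEOPLE                                     # equal start
saving_rate = [random.uniform(0.0, 0.2) for _ in range(N_PEOPLE)]  # stable individual differences

def gini(values):
    """Crude Gini coefficient: 0 = perfect equality, higher = more inequality."""
    xs = sorted(values)
    n = len(xs)
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * cum) / (n * sum(xs)) - (n + 1) / n

print(f"Year  0: Gini = {gini(wealth):.3f}")
for year in range(1, YEARS + 1):
    for i in range(N_PEOPLE):
        wealth[i] = wealth[i] * (1 + RETURN_RATE) + INCOME * saving_rate[i]
    if year % 10 == 0:
        print(f"Year {year:2d}: Gini = {gini(wealth):.3f}")
```

Even with everyone facing the same income and the same returns, the dispersion in wealth widens year after year simply because the saving propensities differ; the perfectly equal starting point does not persist.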

Clearly, this isn’t an experiment that could be run in real life – people are quite attached to their money – but there are naturally occurring versions of it in everyday life. If you want to find a context in which people might randomly come into possession of a sum of money, look no further than the lottery. Winning the lottery – both whether one wins at all and how much one wins – is about as close to randomly determined as we’re going to get. If the differences between the families in the mind-blowing comic are due to chance factors, we would predict that people who win more money in the lottery should, subsequently, be doing better in life relative to those who won smaller amounts. By contrast, if chance factors are relatively unimportant, then the amount won should matter less: whether they win large or small amounts, winners might spend it (or waste it) at similar rates.

Nothing quite like a dose of privilege to turn your life around

This was precisely what was examined by Hankins et al (2010): the authors sought to assess the relationship between the amount of money won in a lottery and the probability of the winner filing for bankruptcy within a five-year period of their win. Rather than removing inequalities and seeing how things shake out, then, this research took the opposite approach: examining a process that generated inequalities and seeing how long it took for them to dissipate.

The primary sample for this research was the Fantasy 5 winners in Florida from April 1993 to November 2002 who had won $600 or more: approximately 35,000 of them after certain screening measures had been implemented. These lottery winners were grouped into those who won between $10,000 and $50,000, and those who won between $50,000 and $150,000 (subsequent analyses would examine those who won $10,000 or less as well, leading to small, medium, and large winner groups).

Of those 35,000 winners, about 2,000 were linked to a bankruptcy filing within five years of their win, meaning that a little more than 1% of winners were filing each year on average; a rate comparable to the broader Florida population. The first step was to examine whether the large winners were doing comparable amounts of bankruptcy filing prior to their win, relative to the smaller winners, which, thankfully, they were. In pretty much all respects, those who won a lot of money did not differ from those who won less before their win (including race, gender, marital status, educational attainment, and nine other demographic variables). That’s what one would expect from the lottery, after all.

Turning to what happened after their win: within the first two years, those who won larger sums of money were less likely to file for bankruptcy than smaller winners; however, in years three through five that pattern reversed itself, with larger winners becoming more likely to file. The end result of this shifting pattern was that, in five years’ time, large winners were equally likely to have filed for bankruptcy, relative to smaller winners. As Hankins et al (2010) put it, large cash payments did not prevent bankruptcy; they only postponed it. This result held up across a number of different analyses, suggesting that the finding is fairly robust. In fact, when the winners eventually did file for bankruptcy, the big winners didn’t have much more to show for it than the small winners: those who won between $25,000 and $150,000 only had about $8,000 more in assets than those who had won less than $1,500, and the two groups had comparable debts.

Not much of an ROI on making it rain these days, it seems

At least when it came to one of the most severe forms of financial distress, large sums of cash did not appear to stop people from falling back into poverty in the long term, suggesting that there’s more going on in the world than just poor luck and unearned privilege. Whatever this money was being spent on, it did not appear to be sound investments. Maybe people were making more of their luck than they realized.

It should be noted that this natural experiment does pose certain confounds, perhaps the most important of which is that not everyone plays the lottery. In fact, given that the lottery itself is quite a bad investment, we are likely looking at a non-random sample of people who choose to play it in the first place; people who already aren’t prone to making wise, long-term decisions. Perhaps these results would look different if everyone played the lottery but, as it stands, thinking about these results in the context of the initial comic about privilege, I would have to say that my mind remains un-blown. Unsurprisingly, deep truths about social life can be difficult to sum up in a short comic.

References: Hankins, S., Hoekstra, M., & Skiba, P. (2010). The ticket to easy street? The financial consequences of winning the lottery. Vanderbilt Law and Economics Research Paper, 10-12.

Relaxing With Some Silly Research

By all estimates, there is a lot of bad research out there in psychology. The poor quality of this research can be attributed to ideology-driven research agendas, research bias, demand characteristics, a lack of any real theory guiding the research itself, p-hacking, file-drawer effects, failures to replicate, small sample sizes, and reliance on undergraduate samples, among other factors. Arguably, there is more bad (or at least inaccurate) research than good research floating around as, in principle, there are many more ways of being wrong about the human mind than there are of being right about it (even given our familiarity with it); a problem made worse by the fact that being (or appearing) wrong or reporting null findings does not tend to garner one social status in the world of academia. If many of the incentives reside in finding particular kinds of results – and those kinds are not necessarily accurate – the predictable result is a lot of misleading papers. Determining which parts of the existing psychological literature are an accurate description of human psychology can be something of a burden, however, owing to the obscure nature of some of these issues: it’s not always readily apparent that a paper found a fluke result or that certain shady research practices have been employed. Thankfully, it doesn’t take a lot of effort to see why some particular pieces of psychological research are silly; criticizing that stuff can be as relaxing as a day off at the beach.

Kind of like this, but indoors and with fewer women

The last time I remember coming across some research that can easily be recognized as silly was when one brave set of researchers asked whether leaning to the left made the Eiffel tower look smaller. The theory behind that initial bit of research is called, I think, number line theory, though I’m not positive on that. Regardless of the name, the gist of the idea seems to be that people – and chickens, apparently – associate smaller numbers with a relative leftwardly direction and larger numbers with a rightwardly one. For humans, such a mental representation might make sense in light of our using certain systems of writing; for nonhumans, this finding would seem to make zero sense. To understand why it makes no sense, try to place it within a functional framework by asking (a) why might humans and chickens (and perhaps other animals as well) represent smaller quantities with their left, and (b) why might leaning to the left be expected to bias one’s estimate of size? Personally, I’m drawing a blank on the answers to those questions, especially because biasing one’s estimate of size on the basis of how one is leaning is unlikely to yield more accurate estimates. A decrease in accuracy seems like it could only carry costs in this case, not benefits. So, at best, we’re left calling those findings a developmental byproduct for humans and likely a fluke for the chickens. In all likelihood, the human finding is probably a fluke as well.

Thankfully, for the sake of entertainment, silly research is not to be deterred. One of the more recent tests of this number line hypothesis (Anelli et al, 2014) makes an even bolder prediction than the Eiffel tower paper: people will actually get better at performing certain mathematical operations when they’re traveling to the left or the right; specifically, going right will make you better at addition and going left better at subtraction. Why? Because smaller numbers are associated with the left? How does that make one better at subtraction? I don’t know, and the paper doesn’t really go into that part. On the face of it, this seems like a great example of what I have nicknamed “dire straits thinking”. Named after the band’s song “Money for Nothing”, this type of thinking leads people to hypothesize that others can get better (or worse) at tasks without any associated costs. The problem with this kind of thinking is that, if people did possess the cognitive capacities to be better at certain tasks, one might wonder why they ever perform worse than they could. This would lead me to pose questions like, “Why do I have to be traveling right to be better at addition; why not just be better all the time?” Some kind of trade-off needs to be referenced to explain that apparent detriment/bonus to performance, but none ever is in dire straits thinking.

In any case, let’s look at the details of the experiment, which was quite simple. Anelli et al (2014) had a total of 48 participants walk with an experimenter (one at a time; not all 48 at once). The pair would walk together for 20 seconds in a straight line, at which point the experimenter would call out a three-digit number, tell the participant to add or subtract 3 from it aloud for 22 seconds, give them a direction to turn (right or left), and tell them to begin. At that point, the participant would turn and start doing the math. Each participant completed four trials: two congruent (right/addition or left/subtraction) and two incongruent (right/subtraction or left/addition). The researchers hoped to uncover a congruency effect, such that more correct calculations would be performed in the congruent, relative to incongruent, trials.

Now put the data into the “I’m right” program and it’s ready to publish

Indeed, just such an effect was found: when participants were moving in a direction congruent with their mathematical operation, they performed more correct calculations on average (M = 10.1), relative to when they were traveling in an incongruent direction (M = 9.6). However, when this effect was broken down by direction, it turned out that the effect only existed when participants were doing addition (M = 11.1 when going right, 10.2 when going left); there was no difference for subtraction (M = 9.0 and 9.1, respectively). Why was there no effect for subtraction? Well, the authors postulate a number of possibilities – one being that perhaps participants needed to be walking backwards – though none of them include the possibility of the addition finding being a statistical fluke. It’s strange how infrequently that possibility gets mentioned in published work, especially in the face of inconsistent findings.
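
As a purely illustrative aside on the fluke question, here is a minimal sketch of one way to ask whether a mean difference of that size (roughly 10.1 versus 9.6 correct calculations across 48 participants) is distinguishable from chance: a paired sign-flip permutation test. The participant-level scores below are invented – their spread is my assumption, not the paper’s data – so the output says nothing about Anelli et al’s actual result; it only shows the mechanics of the test.

```python
import random

# Illustrative only: these participant-level scores are invented, not data from
# Anelli et al (2014). The sketch shows the mechanics of a paired sign-flip
# permutation test for a small mean difference between two conditions.
random.seed(1)

N = 48
congruent = [random.gauss(10.1, 3.0) for _ in range(N)]    # hypothetical scores
incongruent = [random.gauss(9.6, 3.0) for _ in range(N)]   # hypothetical scores
diffs = [c - i for c, i in zip(congruent, incongruent)]

observed = sum(diffs) / N

# Under the null of no congruency effect, each participant's difference is as
# likely to be positive as negative, so we flip signs at random and count how
# often the shuffled mean is at least as extreme as the observed one.
n_perms = 10_000
extreme = 0
for _ in range(n_perms):
    perm_mean = sum(d * random.choice((1, -1)) for d in diffs) / N
    if abs(perm_mean) >= abs(observed):
        extreme += 1

print(f"Observed mean difference: {observed:.2f}")
print(f"Permutation p-value: {extreme / n_perms:.3f}")
```

With real per-participant data in hand, the same few lines would give a direct sense of how surprising a congruency effect of that size actually is under the null.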

Now, one obvious criticism of this research is that the participants were never traveling right or left; they were walking straight ahead in all cases. Right and left, unlike east and west, depend on perspective. When I am facing my computer, I feel I am facing ahead; when I turn around to walk to the bathroom, I don’t feel like I’m walking behind me. The current research would thus rely on the effects of a momentary turn on participants’ math abilities persisting for about half a minute. Accordingly, participants shouldn’t even have needed to be walking; asking them to turn and stand in place should be expected to have precisely the same effect. If the researchers wanted to measure walking to the right or left, they should have had participants moving to the side by sliding, rather than turning and walking forward.

Other obvious criticisms of the research could include the small sample size, the small effect size, the inconsistency of the effect (it works for addition but not subtraction, and is inconsistent with other research they cite which was itself inconsistent – people being better at addition when going up in an elevator but not when walking up stairs, if I understand correctly), or the complete lack of anything resembling a real theory guiding the research. But let’s say for a moment that my impression of these results as silly is incorrect; let’s assume that these results accurately describe the workings of the human mind in some respect. What are the implications of that finding? What, in other words, happens to be at stake here? Why would this research be published, relative to the other submissions received by Frontiers in Psychology? Even if it’s a true effect – which already seems unlikely, given the aforementioned issues – it doesn’t seem particularly noteworthy. Should people be turning to the right and left while taking their GREs? Do people need to be doing jumping jacks to improve their multiplication skills so as to make their bodies look more like the multiplication symbol? If so, how could you manage to do them while you’re supposed to be sitting down quietly taking your GREs without getting kicked out of the testing site? Perhaps someone more informed on the topic could lend a suggestion, because I’m having trouble seeing the importance of it.

Maybe the insignificance of the results is supposed to make the reader feel more important

Without wanting to make a mountain out of a molehill, this paper was authored by five researchers and presumably made it past an editor and several reviewers before it saw publication. At a minimum, that’s probably about 8 to 10 people. That seems like a remarkable feat, given how strange the paper happens to look on its face. I’m not just mindlessly poking fun at the paper, though: I’m bringing attention to it because it seems to highlight a variety of problems in the world of psychological research. There are, of course, many suggestions as to how these problems might be ferreted out, though many of the ones I have seen focus more on statistical solutions or on combating researcher degrees of freedom. While such measures (like pre-registering studies) might reduce the quantity of bad research, they are unlikely to increase the absolute quality of good work (since one can pre-register silly ideas like this), which I think is an equally valuable goal. For my money, requiring some theoretical, functional grounding for research would be the strongest candidate for improving work in psychology. I imagine many people would find it harder to propose such an idea in the first place if they needed to include some kind of functional consideration as to why turning right makes you better at addition. Even if such a feat were accomplished, it seems those considerations would make the rationale for the paper even easier to pick apart by reviewers and readers.

Instead of asking for silly research to be conducted on larger, more diverse samples, it seems better to ask that silly research not be conducted at all.

References: Anelli, F., Lugli, L., Baroni, G., Borghi, A., & Nicoletti, R. (2014). Walking boosts your performance in making additions and subtractions. Frontiers in Psychology, 5. doi: 10.3389/fpsyg.2014.01459

(Some Of) My Teaching Philosophy

Over the course of my time at various public schools and universities I have encountered a great many teachers. Some of my teachers were quite good. I would credit my interest in evolutionary psychology to one particularly excellent teacher – Gordon Gallup. Not only was the material itself unlike anything I had previously been presented with in other psychology courses, but the way Gordon taught his classes was unparalleled. Each day he would show up and, without the aid of any PowerPoints or any apparent notes, just lecture. On occasion we would get some graphs or charts drawn on the board, but that was about it. What struck me about this teaching style is what it communicated about the speaker: this is someone who knows what he’s talking about. His command of the material was so impressive that I actually sat through his course again for no credit in the following years to transcribe his lectures (and the similarity from year to year was remarkable, given that lack of notes). It was just a pleasure listening to him do what he did best.

A feat I was recently recognized for

To say Gordon was outstanding is to say he was exceptional relative to his peers (even if many of those peers, mistakenly, believe they are exceptional as well). The converse to that praise, then, is that I have encountered many more professors who were either not particularly good at what they did or downright awful at it (subjectively speaking, of course). I’ve had some professors who acted, more or less, as an audio guide to the textbook and, when questioned, didn’t seem to really understand the material they were teaching; I’ve had another tell his class “now, we know this isn’t true, but maybe it’s useful” as he reviewed Maslow’s hierarchy of needs for what must have been the tenth time in my psychology education – a statement which promptly turned off my attention for the day. The examples I could provide likely outnumber my fingers and toes, so there’s no need to detail each one. In fact, just about everyone who has attended school has had experiences like this. Are these subjective evaluations of teachers that we have all made accurate representations of their teaching ability, though?

According to some research by Braga et al (2011), the answer is “yes”, but in a rather perverse sense: teacher evaluations tend to be negatively predictive of actual teaching effectiveness. In other words, at the end of a semester when a teacher receives evaluations from their students, the better those evaluations, the less effective the teacher tends to be. As someone who received fairly high evaluations from my own students, I should take this either as cause for some reflection as to my methods (since I am interested in my students learning, not just their being satisfied with my course) or as a prompt to hunt for why the research in question must be wrong so I can feel better about my good reviews. In the interests of prioritizing my self-esteem, let’s start by considering the research and seeing if any holes can be poked in it.

“Don’t worry; I’m sure those good reviews will still reflect well on you”

Braga et al (2011) analyzed data from a private Italian university offering programs in economics, business, and law in 1998/9. The students in these programs had to take a fixed course of classes with fixed sets of materials and the same examinations. Additionally, students were randomly assigned to professors, making this one of the most controlled academic settings for this kind of research I could imagine. At the end of the terms, students provided evaluations of their instructors, allowing their ratings – aggregated at the classroom level, as the evaluations were anonymous – to be correlated with how effective those instructors actually were.

Teaching effectiveness was measured by examining how students did in subsequent courses (controlling for a variety of non-teacher factors, like class size), the assumption being that students with better professors in the first course would do better in future courses, owing to their more proficient grasp of the material. These non-teacher factors accounted for about 57% of the variance in future course grades, leaving plenty of room for teacher effects. The effect of teachers was appreciable, with an increase of one standard deviation in effectiveness leading to a gain of about 0.17 standard deviations in future grades (about a 2.3% bump up). Given the standardized materials and the gulf which could exist between the best and worst teachers, it seems there’s plenty of room for teacher effectiveness to matter. Certainly no students want to end up at a disadvantage because of a poor teacher; I know I wouldn’t.

When it came to the main research question, the results showed that teachers who were the least effective in providing future success for their students tended to receive the highest evaluations. This effect was sizable as well: for each standard deviation increase in teaching effectiveness, student evaluation ratings dropped by about 40% of a standard deviation. Perhaps unsurprisingly, grades were correlated with teaching evaluations as well: the better grades the students received, the better the evaluations they tended to give the professors. Interestingly, this effect did not exist in classes comprised of 25% or more of the top students (as measured by their cognitive entrance exams); the evaluations of those classes were simply not predictive of effectiveness.

That last section is the part of the paper that most everyone will cite: the negative relationship between teacher evaluations and future performance. What fewer people seem to do when referencing that finding is consider why this relationship exists and then use that answer to inform their teaching styles (as I get the sense this information will quite often be cited to excuse otherwise lackluster evaluations, rather than to change anything). The authors of the paper posit two main possibilities for explaining this effect: (1) that some teachers make class time more entertaining at the expense of learning, and/or (2) that some teachers might “teach for the test”, even if they do so at the expense of “true learning”. While neither possibility is directly tested in the paper, the latter possibility strikes me as most plausible: students in the “teaching for the test” classes might simply focus on the particular chunks of information relevant for them at the moment, rather than engaging it as a whole and understanding the subject more broadly.

In other words, vague expectations encourage cramming with a greater scope

With that research in mind, I would like to present a section of my own philosophy when it comes to teaching and assessment. A question of interest that I have given much thought to is what, precisely, grades are aimed at achieving. For many professors – indeed, I’d say the bulk of them – grades serve the end of assessment. The grades are used to tell people – students and others – how well the students understood the material come test time. My answer to this question is a bit different, however: as an instructor, I had no particular interest in the assessment of students per se; my interest was in their learning. I only wanted to assess my students as a means of pushing them toward the end of learning. As a word of caution, my method of assessment demands substantially more effort from those doing the assessing, be it a teacher or an assistant, than is typical. It’s an investment of time many might be unwilling to make.

My assessments were all short-essay questions, asking students to apply theories they had learned about to novel questions we did not cover directly in class; there were no multiple choice questions. According to the speculations of Braga et al (2011), this would put me firmly in the “real teaching” camp, rather than the “teaching to the test” one. There are a few reasons for my decision: first, multiple choice questions don’t allow you to see what the students were thinking when answering the question. Just because someone gets an answer correct on a multiple choice exam, it doesn’t mean they got the correct answer for the right reasons. For my method to be effective, however, it does mean someone needs to read the exams in depth instead of just feeding them through a scantron machine, and that reading takes time. Second, essay exams force students to confront what they do and do not know. Having spent many years as a writer (and even more as a student), I’ve found that many ideas that seem crystal clear in my head do not always translate readily to text. The feeling of understanding can exist in the absence of actual understanding. If students find they cannot explain an idea as readily as they felt they understood it, that feeling might be effectively challenged, yielding a new round of engagement with the material.

After seeing where the students were going wrong, the essay format allowed me to make notes on their work and hand it back to them for revisions; something you can’t do very well with multiple choice questions either. Once the students had my comments on their work, they were free to revise it and hand it back in to me. The grade they got on their revisions would be their new grade: no averaging of the two or anything of the sort. The process would then begin again, with revisions being made on revisions, until the students were happy with their grade or stopped trying. In order for assessment to serve the end of learning, assessment needs to be ongoing if you expect learning to be. If assessment is not ongoing, students have little need to fix their mistakes; they’ll simply look at their grade and then toss their test in the trash, as many of them do. After all, why would they bother putting in the effort to figure out where they went wrong and how to go right if doing so successfully would have no impact whatsoever on the one thing they get from the class that people will see?

Make no mistake: they’re here for a grade. Educations are much cheaper than college.

I should also add that my students were allowed to use any resource they wanted for the exams, be that their notes, the textbook, outside sources, or even other students. I wanted them to engage with the material and think about it while they worked, and I didn’t expect them to have it all memorized already. In many ways, this format mirrors the way academics function in the world outside the classroom: when writing our papers, we are allowed to access our notes and references whenever we want; we are allowed to collaborate with others; we are allowed – and in many cases, required – to make revisions to our work. If academics were forced to do their job without access to these resources, I suspect the quality of it would drop precipitously. If these things all improve the quality of our work and help us learn and retain material, asking students to discard all of them come test time seems like a poor idea. It does require test questions to have some thought put into their construction, though, and that means another investment of time.

Some might worry that my method makes things too easy on the students: all that access to different materials means they could just get an easy “A”, and that’s why my evaluations were good. Perhaps that’s true but, just as my interest is not in assessment, my interest is also not in making a course “easy” or “challenging”; it’s in learning, and tests should be as easy or as hard as that requires. As I recall, the class average for each test started at about a 75; by the end of the revisions, the average for each test had risen to about a 90. You can decide from those numbers whether or not that means my exams were too easy.

Now, I don’t have the outcome measures that Braga et al (2011) did for my own teaching success. Perhaps my methods were a rousing failure when it came to getting students to learn, despite the high evaluations they earned me (in the Braga et al sample, the average teacher rating was 7 out of 10 with a standard deviation of 0.9; my average rating would be around a 9 on that scale, placing my evaluations about two standard deviations above the mean); perhaps this entire post reflects a defensiveness on my part when it comes to, ironically, having to justify my positive evaluations, just as I suspect people who cite this paper might use the results to justify relatively poor evaluations. With regard to the current results, I think both I and others have room for concern: just because I received good evaluations, it does not mean my teaching method was effective; however, just because you received poor evaluations, it does not mean your teaching method is effective either. Just as students can get the right answer for the wrong reason, they can also give a teacher a good or bad evaluation for the right or wrong reasons. Good reviews should not make teachers complacent, just as poor reviews should not be brushed aside. The important point is that we both think about how to improve our effectiveness as teachers.

References: Braga, M., Paccagnella, M., & Pellizzari, M. (2011). Evaluating students’ evaluations of professors. Economics of Education Review, 41, 71-88.  

Quid Pro Quo

Managing relationships is a task that most people perform fairly adeptly. That’s not to say we do so flawlessly – we certainly don’t – but we manage to avoid most major faux pas with regularity. Despite our ability to do so, many of us would not be able to provide compelling answers that help others understand why we do what we do. Here’s a frequently referenced example: if you invited your friend over for dinner, many of you would likely find it rather strange – perhaps even insulting – if after the meal your friend pulled out his wallet and asked how much he owed you for the food. Though we would find such behavior strange or rude, when asked to explain what is rude about it, most people would verbally stumble. It’s not that the exchange of money for food is strange; that part is really quite normal. We don’t expect to go into a restaurant, be served, eat, and then leave without paying. There are also other kinds of goods and services – such as sex and organs – that people often do see something wrong with exchanging money for, at least so long as the exchange is explicit; despite that, we often have less of a problem with people giving such resources away.

Alright; not quite implicit enough, but good try

This raises all sorts of interesting questions, such as why it is acceptable for people to give things away but not accept money for them. Why would it be unacceptable for a host to expect his guests to pay, or for the guests to offer? The most straightforward answer is that the nature of these relationships is different: two friends have different expectations of each other than two strangers, for instance. While such an answer is true enough, it doesn’t really deepen our understanding of the matter; it just notes the difference. One might go a bit further and begin to document some of the ways in which these relationships differ, but without a guiding functional analysis of why they differ we would be stuck at the level of just noting differences. We could learn not only that business associates treat each other differently than friends (which we knew already), but also some of the ways they do. While documenting such things does have value, it would be nice to place such facts in a broader framework. On that note, I’d like to briefly consider one such descriptive answer to the matter of why these relationships differ before moving on to the latter point: the distinction between what have been labeled exchange relationships and communal relationships.

Exchange relationships are said to be those in which one party provides a good or service to the other in the hopes of receiving a comparable benefit in return; the giving thus creates the obligation for reciprocity. This is the typical consumer relationship that we have with businesses as customers: I give you money, you give me groceries. Communal relationships, by contrast, do not carry similar expectations; instead, these are relationships in which each party cares about the welfare of the other, for lack of a better word, intrinsically. This is more typical of, say, mother-daughter relationships, where the mother provisions her daughter not in the hopes of her daughter one day provisioning her, but rather because she earnestly wishes to deliver those benefits to her daughter. On the descriptive level, then, this difference in expectations of quid pro quo is supposed to differentiate the two types of relationships. Friends offering to pay for dinner are viewed as odd because they’re treating a communal relationship as an exchange one.

Many other social disasters might arise from treating one type of social relationship as if it were another. One of the most notable examples in this regard is the ongoing dispute over “nice guys”, nice guys, and the women they seek to become intimate with. To oversimplify the details substantially, many men will lament that women do not seem to be interested in guys who care about their well-being, but rather seek men who offer resources or treat them as less valuable. The men feel they are offering a communal relationship, but the women opt for the exchange kind. Many women return the volley, suggesting instead that many of the “nice guys” are actually entitled creeps who think women are machines you put niceness coins into to get them to dispense sex. Now it’s the men seeking the exchange relationships (i.e., “I give you dinner dates and you give me affection”), whereas the women are looking for the communal ones. But are these two types of relationships – exchange and communal – really that different? Are communal relationships, especially those between friends and couples, free of the quid-pro-quo style of reciprocity? There are good reasons to think that they are not different in kind, but rather different with respect to the details of the quids and quos.

A subject our good friend Dr. Lecter is quite familiar with

To demonstrate this point, I would invite you to engage in a little thought experiment: imagine that your friend or your partner decided one day to behave as if you didn’t exist: they stopped returning your messages, they stopped caring about whether they saw you, they stopped coming to your aid when you needed them, and so on. Further, suppose this new-found cold and callous attitude wouldn’t change in the future. About how long would it take you to break off your relationship with them and move on to greener pastures? If your answer to that question was any amount of time whatsoever, then I think we have demonstrated that the quid-pro-quo style of exchange still holds in such relationships (and if you believe that no amount of that behavior on another’s part would ever change how much you care about that person, I congratulate you on the depths of your sunny optimism and your view of yourself as an altruist; it would also be great if you could prove it by buying me things I want for as long as you live while I ignore you). The difference, then, is not so much whether there are expectations of exchange in these relationships, but rather the details of precisely what is being exchanged for what, the time frame in which those exchanges take place, and the explicitness of those exchanges.

(As an aside, kin relationships can be free of expectations of reciprocity. This is because, owing to the genetic relatedness between the parties, helping them can be viewed – in the ultimate, fitness sense of the word – as helping yourself to some degree. The question is whether this distinction also holds for non-relatives.)

Taking those matters in order, what gets exchanged in communal relationships is, I think, something that many people would explicitly deny is getting exchanged: altruism for friendship. That is to say that people are using behavior typical of communal relationships as an ingratiation device (Batson, 1993): if I am kind to you today, you will repay me with [friendship/altruism/sex/etc] at some point in the future; not necessarily immediately or at some dedicated point. These types of exchange, as one can imagine, might get a little messy to the extent that the parties are interested in exchanging different resources. Returning to our initial dinner example, if your guest offers to compensate you for dinner explicitly, it could mean that he considers the debt between you paid in full and, accordingly, is not interested in exchanging the resource you would prefer to receive (perhaps gratitude, complete with the possibility that he will be inclined to benefit you later if need be). In terms of the men and women example from before, men often attempt to exchange kindness for sex, but instead receive non-sexual friendship, which was not the intended goal. Many women, by contrast, feel that the men should value the friendship…unless of course it’s their partner building a friendship with another woman, in which case it’s clearly not just about friendship between them.

But why aren’t these exchanges explicit? It seems that one could, at least in principle, tell other people that you will invite them over for dinner if they will be your friend, in much the same way that a bank might extend a loan to a person and ask that it be repaid over time. If the implicit nature of these exchanges were removed, it seems that lots of people could be saved a lot of headache. The reason such exchanges cannot be made explicit, I think, has to do with the signal value of the exchange. Consider two possible friends: one of those friends tells you they will be your friend and support you so long as you don’t need too much help; the other tells you they will support you no matter what. Assuming both are telling the truth, the latter individual would make the better friend for you because they have a greater vested interest in your well-being: they will be less likely to abandon you in times of need, less likely to take better social deals elsewhere, less likely to betray you, and the like. In turn, that fact should incline you to help the latter more than the former individual. After all, it’s better for you to have your very valuable allies alive and well-provisioned if you want them to be able to continue to help you to their fullest when you need it. The mere fact that you are valuable to them makes them valuable to you.

“Also, your leaving would literally kill me, so…motivation?”

This leaves people trying to walk a fine line between making friendships valuable in the exchange sense of the word (friendships need to return more than they cost, else they could not have been selected for), while publicly maintaining the representation that they are not grounded in explicit exchanges, so as to make themselves appear to be better partners. In turn, this creates the need for people to distinguish between what we might call “true friends” – those who have your interests in mind – and “fair-weather friends” – those who will only behave as your friend so long as it’s convenient for them. In that last example we assumed both parties were telling the truth about how much they value you; in reality we can’t ever be so sure. This strategic analysis of the problem leaves us with a better sense of why friendship relationships are different from exchange ones: while both involve exchanges, the exchanges do not serve the same signaling function, and so their form ends up looking different. People need to engage in proximately altruistic behaviors for which they don’t expect immediate or specific reciprocity in order to credibly signal their value as an ally. Without such credible signaling, I’d be left taking you at your word that you really have my interests at heart, and that system is far too open to manipulation.

Such considerations could help explain, in part, why people are opposed to things like selling organs or sex for money but have little problem with such things being given away for free. In the case of organ sales, for instance, there are a number of concerns which might crop up in people’s minds, one of the most prominent being that it puts an explicit dollar sign on human life. While we clearly need to do so implicitly (else we could, in principle, be willing to exhaust all worldly resources trying to prevent just one person from dying today), to make such an exchange explicit turns the relationship into an exchange one, sending a message along the lines of, “your life is not worth all that much to me”. Conversely, selling an organ could send a similar message: “my own life isn’t worth that much to me”. Both statements could have the effect of making one look like a worse social asset even if, practically, all such relationships are fundamentally based in exchanges; even if such a policy would have an overall positive effect on a group’s welfare.

References: Batson, C. (1993). Communal and exchange relationships: What is the difference? Personality & Social Psychology Bulletin, 19, 677-683.

DeScioli, P. & Kurzban, R. (2009). The alliance hypothesis for human friendship. PLoS ONE, 4(6): e5802. doi:10.1371/journal.pone.0005802

Some Thoughts On Side-Taking

Humans have a habit of inserting themselves into the disputes of other people. We often care deeply about matters concerning what other people do to each other and, occasionally, will even involve ourselves in disputes that previously had nothing to do with us; at least not directly. Though there are many examples of this kind of behavior, one of the most recent concerned the fatal shooting of a teen in Ferguson, Missouri, by a police officer. People from all over the country and, in some cases, other countries were quick to weigh in on the issue, noting who they thought was wrong, what they think happened, and what punishment, if any, should be doled out. Phenomena like that one are so commonplace in human interactions that the strangeness of the behavior often goes almost entirely unappreciated. What makes the behavior strange? Well, the fact that intervening in other people’s affairs and attempting to control their behavior or inflict costs on them for what they did tends to be costly. As it turns out, people aren’t exactly keen on having their behavior controlled by others and will, in many cases, aggressively resist those attempts.

Not unlike the free-spirited house cat

Let’s say, for instance, that you have a keen interest in killing someone. One day, you decide to translate that interest into action, attacking your target with a knife. If I were to attempt to intervene in that little dispute to try and help your target, there’s a very real possibility that some portion of your aggression might become directed at me instead. It seems as if I would be altogether safer if I minded my own business and let you get on with yours. In order for there to be selection for any psychological mechanisms that predispose me to become involved in other people’s disputes, then, there need to be some fitness benefits that outweigh the potential costs I might suffer. Alternatively, there might also be costs to me for not becoming involved. If the costs of non-involvement are greater than the costs of involvement, then there can also be selection for my side-taking mechanisms even if they are costly. So what might some of those benefits or costs be?

One obvious candidate is mutual self-interest. Though that term could cover a broad swath of meanings, I intend it in the proximate sense of the word at the moment. If you and I both desire that outcome X occurs, and someone else is going to prevent that outcome if either of us attempt to achieve it, then it would be in our interests to join forces – at least temporarily – to remove the obstacle in both of our paths. Translating this into a concrete example, you and I might be faced by an enemy who wishes to kill both of us, so by working together to kill him first, we can both achieve an end we desire. In another, less direct case, if my friend became involved in a bar fight, it would be in my best interests to avoid seeing my friend harmed, as an injured (or dead) friend is less effective at providing me benefits than a healthy one. In such cases, I might preferentially side with my friend so as to avoid seeing costs inflicted on him. In both cases, both the other party and I share a vested interest in the same outcome obtaining (in this case, the removal of a mutual threat).

Related to that last example is another candidate explanation: kin selection. As it is adaptive for copies of my genes to reproduce themselves regardless of which bodies they happen to be located in, assisting genetic relatives in disputes could similarly prove to be useful. A partially-overlapping set of genetic interests, then, could (and likely does) account for a certain degree of side-taking behavior, just as overlapping proximate interests might. By helping my kin, we are achieving a mutually-beneficial (ultimate-level) goal: the propagation of common genes.

A third possible explanation could be grounded in reciprocal altruism, or long-term alliances. If I take your side today to help you achieve your goals, this might prove beneficial in the long term to the extent that it encourages you to take my side in the future. This explanation would work even in the absence of overlapping proximate or genetic interests: maybe I want to build my house where others would prefer I did not, and maybe you want to get warning labels attached to ketchup bottles. You don’t really care about my problem and I don’t really care about yours but, so long as you’re willing to help me scratch my back on my problem, I might also be willing to help you scratch yours.

Also not unlike the free-spirited house cat

There is, however, another prominent reason we might take the side of another individual in a dispute: moral concerns. That is, people could take sides on the basis of whether they perceive someone did something “wrong”. This strategy, then, relies on using people’s behavior to take sides. In that domain, locating the benefits to involvement or the costs to non-involvement becomes a little trickier. Using behavior to pick sides can carry some costs: you will occasionally side against your interests, friends, and family by doing so (to the extent that those groups behave in immoral ways towards others). Nevertheless, the relative upsides to involvement in disputes on the basis of morality need to exist in some form for the mechanisms generating that behavior to have been selected for. As moral psychology likely serves the function of picking sides in disputes, we can consider how well the previous explanations for side-taking fare in explaining moral side-taking.

We can rule out the kin selection hypothesis immediately as an explanation for the relative benefits of moral side-taking, as taking someone’s side in a dispute will not increase your genetic relatedness to them. Further, a mechanism that took sides on the basis of kinship should primarily be using genetic relatedness as an input for side-taking behavior; a mechanism that uses moral perceptions should be relatively insensitive to kinship cues. Relatedness is out.

A mutualistic account of morality could certainly explain some of the variance we see in moral side-taking. If both you and I want to see a cost inflicted on an individual or group of people because their existence presents us with costs, then we might side against people who engage in behaviors that benefit them, representing such behavior as immoral. This type of argument has been leveraged to understand why people often oppose recreational drug use: the opposition might help people with long-term sexual strategies inflict costs on the more promiscuous members of a population. The complication that mutualism runs into, though, is that certain behaviors might be evaluated inconsistently in that respect. As an example, murder might be in my interests when in the service of removing my enemies or the enemies of my allies; however, murder is not in my interests when used against me or my allies. If you side against those who murder people, you might also end up siding against people who share your interests and murder people (who might, in fact, further your interests by murdering others who oppose them).

While one could make the argument that we also don’t want to be murdered ourselves – accounting for some or all of that moral representation of murder as wrong – something about that line doesn’t sit right with me: it seems to conceive of the mutual interest in an overly broad manner. Here’s an example of what I mean: let’s say that I don’t want to be murdered and you don’t want to be murdered. In some sense, we share an interest in common when it comes to preventing murder; it’s an outcome we both want to avoid. So let’s say one day I see you being attacked by someone who intends to murder you. If I were to come to your aid and prevent you from being killed, I have not necessarily achieved my goal (“I don’t want to be murdered”); I’ve just helped you achieve yours (“you don’t want to be murdered”). To use an even simpler example, if both you and I are hungry, we both share an interest in obtaining food; that doesn’t mean that my helping you get food fills my interests or my stomach. Thus, the interest in the above example is not necessarily a mutual one. As I noted previously, in the case of friends or kin it can be a mutual interest; it just doesn’t seem to be when thinking about the behavior per se. My preventing your murder is only useful (in the fitness sense of the word) to the extent that doing so helps me in some way in the future.

Another account of morality, which differs from the above positions, posits that side-taking on the basis of behavior could help reduce the costs of becoming involved in the disputes of others. Specifically, if all (or at least a sizable majority of) third parties took the same side in a dispute, one side would back down without the need for fights to be escalated to determine the winner (as more evenly-matched fights might require increased fighting costs to determine a winner, whereas lopsided ones often do not). This is something of a cost-reduction model. While the idea that morality functions as a coordination device – the same way, say, a traffic light does – raises an interesting possibility, it too comes with a number of complications. Chief among those complications is that coordination need not require a focus on the behavior of the disputants. In much the same way that the color of a traffic light bears no intrinsic relationship to driving behavior but is publicly observable, coordination in the moral domain need not bear any resemblance to the behavior of the disputants. Third parties could, for instance, coordinate around the flip of a coin, rather than the behavior of the disputants. If anything, coin flips might be better tools than disputants’ behavior as, unlike behavior, the outcome of a coin flip is easily observable. Most immoral behavior is notably not publicly observable, making coordination around it something of a hassle.

And also making trials a thing…

What about the alliance-building idea? At first blush, taking sides on the basis of behavior seems like a much different type of strategy than siding on the basis of existing friendships. With some deeper consideration, though, I think there’s a lot of merit to the idea. Might behavior work as a cue for who would make a good alliance partner for you? After all, friendships have to start somewhere, and someone who was just stolen from might have a sudden need for partial partners that you might fill by punishing the perpetrator. Need provides a catalyst for new relationships to form. On the reverse end, that friend of yours who happens to be killing other people is probably going to end up racking up more than a few enemies: both the ones he directly impacted and the new ones who are trying to help his victims. If these enemies take a keen interest in harming him, he’s a riskier investment as costs are likely coming his way. The friendship itself might even become a liability to the extent that the people he put off are interested in harming you because you’re helping him, even if your help is unrelated to his acts. At such a point, his behavior might be a good indication that his value as a friend has gone down and, accordingly, it might be time to dump your friend from your life to avoid those association costs; it might even pay to jump on the punishing bandwagon. Even though you’re seeking partial relationships, you need impartial moral mechanisms to manage that task effectively.

This could explain why strangers become involved in disputes (they’re trying to build friendships and taking advantage of a temporary state of need to do so) and why side-taking on the basis of behavior rather than identity is useful at times (your friends might generate more hassle than they’re worth due to their behavior, especially since all the people they’re harming look like good social investments to others). It’s certainly an idea that deserves more thought.

Are Consequences Of No Consequence?

Some recent events have led me back to considering the topic of moral nonconsequentialism. I’ve touched on the topic a few times before (here and here). Here’s a quick summary of the idea: we perceive the behaviors of others along some kind of moral dimension, ranging from morally condemnable (wrong) to neutral to virtuous (praiseworthy). To translate those into everyday examples, we might have murder, painting, and jumping on a bomb to save the lives of others. The question of interest is what factors our minds use as inputs to move our perceptions along that moral spectrum; what things make an act appear more condemnable or praiseworthy? According to a consequentialist view, what moves our moral perceptions should be the results (or consequences) an act brings about. Is lying morally wrong? Well, that depends on what things happened because you lied. By contrast, the nonconsequentialist view suggests that some acts are wrong due to their intrinsic properties, no matter what consequences arise from them.

  “Since it’d be wrong to lie, the guy you’re trying to kill went that way”

Now, at first glance, both views seem unsatisfactory. Consequentialism’s weakness can be seen in people’s responses to what is known as the footbridge dilemma: in this dilemma, the lives of five people can be saved from a train by pushing another person in front of it. Around 90% of the time, people judge the pushing to be immoral and impermissible, even though there’s a net welfare benefit that arises from the pushing (+4 net lives). Just because more people are better off, it doesn’t mean an act will be viewed as moral. On the other hand, nonconsequentialism doesn’t prove wholly satisfying either. For starters, it doesn’t convincingly outline what kind of thing(s) make an act immoral and why they might do so; just that it’s not all in the consequences. Referencing the “intrinsic wrongness” of an act to explain why it is wrong doesn’t get us very far, so we’d need further specification. Further, consequences clearly do matter when it comes to making moral judgments. If – as a Kantian categorical imperative might suggest – lying is wrong per se, then we should consider it immoral for a family in 1940s Germany to lie to the Nazis about hiding a Jewish family in their attic (and something tells me we don’t). Finally, we also tend to view acts not just as wrong or right, but as wrong to differing degrees. As far as I can tell, the nonconsequentialist view doesn’t tell us much about why, say, murder is viewed as worse than lying. As a theory of psychological functioning, nonconsequentialism doesn’t seem to make good predictions.

This tension between moral consequentialism and nonconsequentialism can be resolved, I think, so long as we are clear about what consequences we are discussing. The most typical type of consequentialism I have come across defines positive consequences in a rather specific way: the greatest amount of good (i.e., generating happiness, or minimizing suffering) for people (or other living things) on the whole. This kind of consequentialism clearly doesn’t describe how human moral psychology functions very well, as it would predict people would say that killing one person to save five is the moral thing to do; since we don’t tend to make such judgments, something must be wrong. If we jettison the view that increasing aggregate welfare is something our psychology was selected to do and replace it instead with the idea that our moral psychology functions to strategically increase the welfare of certain parties at the expense of others, then the problem largely dissolves. Explaining that last part requires more space than I have here (which I will happily make public once my paper is accepted for publication), but I can at least provide an empirical example of what I’m talking about now.

This example will make use of the act of lying. If I have understood the Kantian version of nonconsequentialism correctly, then lying should be immoral regardless of why it was done. Phrased in terms of a research hypothesis concerning human psychology, people should rate lying as immoral, regardless of what consequences accrued from the lie. If we’re trying to derive predictions from the welfare maximization type of consequentialism, we should predict that people will rate lying as immoral only when the negative consequences of lying outweigh the positive ones. At this point, I imagine you can all already think of cases where both of those predictions won’t work out, so I’m probably not spoiling much by telling you that they don’t seem to work out in the current paper either.

Spoiler alert: you probably don’t need that spoiler

The paper, by Brown, Trafimow, & Gregory (2005), contained three experiments, though I’m only going to focus on the two involving lying for the sake of consistency. In the first of these experiments, 52 participants read about a person – Joe – who had engaged in a dishonest behavior for one of five reasons: (1) for fun, (2) to gain $1,000,000, (3) to avoid losing $1,000,000, (4) to save his own life, or (5) to save someone else’s life. The subjects were then asked to, among other things, rate Joe on how moral they thought he was, from -3 (extremely immoral) to +3 (extremely moral). Now, a benefit of $1,000,000 should, under the consequentialist view, make lying more acceptable than when it was done just for fun, as there is a benefit to the liar to take into account; the nonconsequentialist account, however, suggests that people should discount the million when making their judgments of morality.

Round 1, in this case, went to the nonconsequentialists: when it came to lying just for fun, Joe was rated at a -1.33 on average; lying for money didn’t seem to budge the matter much, with a -1.73 rating for gaining a million and a -0.6 for avoiding the loss of a million. Statistical analysis found no significant differences between the two money conditions and no difference between the combined money conditions and the “for fun” category. Round 2 went to the consequentialists, however: when it came to the life-saving categories, lying to save one’s own life was rated as slightly morally positive (M = 0.81), as was lying to save someone else’s (M = 1.36). While the difference between the two life-saving groups was not significant, both were different from the “for fun” group. That last finding required a little bit of qualification, though, as the situation being posed to the subjects was too vague. Specifically, the question had read “Joe was dishonest to a friend to save his life”, which could be interpreted as suggesting that either Joe was saving his own life or his friend’s life. The wording was amended in the next experiment to read that “…was dishonest to a friend to save his own life”. The “for fun” condition was also removed, leaving the dishonest behavior without any qualification in the control group.

With the new wording, 96 participants were recruited and given one of three contexts: George being dishonest for no stated reason, to save his own life, or to save his friend’s life. This time, when participants were asked about the morality of George’s behavior, a new result popped up: being dishonest for no reason was rated somewhat negatively (M = -0.5), as before, but this time, being dishonest to save one’s own life was similarly negative (M = -0.4). Now, saving a life is arguably more of a positive consequence than being dishonest is negative when considered in a vacuum, so the consequentialist account doesn’t seem to be faring so well. However, when George was being dishonest to save his friend’s life, the positive assessments returned (M = 1.03). So while there was no statistical difference between George lying for no reason and lying to save his own life, both conditions were different from George lying to save the life of another. Framed in terms of the Nazi analogy, I don’t see many people condemning the family for hiding Anne Frank.

 The jury is still out on publishing her private diary without permission though…

So what’s going on here? One possibility that immediately comes to mind from looking at these results is that consequences matter, but not in the average-welfare-maximization sense. In both of these experiments, lying was deemed to be OK so long as someone other than the liar was benefiting. When someone was lying to benefit himself – even when that benefit was large – it was deemed unacceptable. So it’s not just that the consequences, in the absolute sense, matter; their distribution appears to be important. Why should we expect this pattern of results? My suggestion is that it has to do with the signal that is sent by the behavior in question regarding one’s value as a social asset. Lying to benefit yourself demonstrates a willingness to trade off the welfare of others for your own, which we want to minimize in our social allies; lying to benefit others sends a different signal.
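To see that pattern at a glance, the ratings quoted above can be regrouped by who benefits from the lie rather than by experiment. The grouping below is my own way of organizing the reported means (I’ve left out the ambiguously worded “save his life” condition from the first experiment and the no-reason control), so treat it as a summary rather than a reanalysis.

```python
# Mean morality ratings reported above (-3 = extremely immoral, +3 = extremely moral),
# grouped by who benefits from the dishonesty.
ratings = {
    "liar benefits": {
        "for fun (exp. 1)": -1.33,
        "to gain $1,000,000 (exp. 1)": -1.73,
        "to avoid losing $1,000,000 (exp. 1)": -0.60,
        "to save his own life (exp. 2)": -0.40,
    },
    "someone else benefits": {
        "to save someone else's life (exp. 1)": 1.36,
        "to save a friend's life (exp. 2)": 1.03,
    },
}

for beneficiary, conditions in ratings.items():
    print(beneficiary)
    for condition, mean_rating in conditions.items():
        print(f"  {condition}: {mean_rating:+.2f}")
```

Every condition in which the liar is the beneficiary sits below zero; every condition in which someone else benefits sits well above it.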

Of course, it’s not just that benefiting others is morally acceptable or praiseworthy: lying to benefit a socially undesirable party is unlikely to see much moral leniency. There’s a reason the example people use for thinking about the morality of lying involves hiding Jews from the Nazis, rather than lying to Jews to benefit the Nazis. Perhaps the lesson here is that trying to universalize morality doesn’t do us much good when it comes to understanding it, despite our natural inclination to view morality as something other than a matter of personal preferences.

References: Brown, J., Trafimow, D., & Gregory, W. (2005). The generality of negativity hierarchically restrictive behaviors. British Journal of Social Psychology, 44, 3-13.

Not-So-Shocking Results About People Shocking Themselves

“‘I’m bored’ is a useless thing to say. I mean, you live in a great, big, vast world that you’ve seen none percent of. Even the inside of your own mind is endless; it goes on forever, inwardly, do you understand? The fact that you’re alive is amazing, so you don’t get to say ‘I’m bored’.” – Louis CK

One of the most vivid – and strangely so – memories of my childhood involved stand-up comedy. I used to always have the TV on in the background of my room when I was younger, usually tuned to Comedy Central or cartoons. Young me would somewhat-passively absorb all that comedic material and then regurgitate it around people I knew without context; a strategy that made me seem a bit weirder than I already was as a child. The joke in particular that I remember so vividly came from Jim Gaffigan: in it, he expressed surprise that male seahorses are the ones who give birth, suggesting that they should just call the one that gives birth the female; he also postulated that the reason this wasn’t the case was that a stubborn scientist had made a mistake. One reason this joke stood out to me in particular was that, as my education progressed, it served as perhaps the best example of how people who don’t know much about the subject they’re discussing can be surprised by ostensible quirks of it that actually make a great deal of sense to more knowledgeable individuals.

In fact, many things about the world are quite shocking to them.

In the case of seahorses, the detail Jim was missing is that biological sex in that species, as well as many others, is defined by which sex produces the larger gametes (eggs vs. sperm). In seahorses, the females produce the eggs, but the males provide much of the parental investment, carrying the fertilized eggs in a pouch until they hatch. In species where the burden of parental care is shouldered more heavily by the males, we also tend to see reversals of mating preferences, with the males becoming more selective about sexual partners relative to the females. So not only was the labeling of the sexes no mistake, but there are lots of neat insights about psychology to be drawn from that knowledge. Admittedly, this knowledge does also ruin the joke, but here at Popsych we take the utmost care to favor being a buzzkill over being inaccurate (because we have integrity and very few friends). In the interests of continuing that proud tradition, I would like to explain why the second part of the initial Louis CK joke – the part about us not being allowed to be bored because we ought to just be able to think ourselves to entertainment and pleasure – is, at best, misguided.

That part of the joke contains an intuition shared by more than Louis, of course. In the somewhat-recent past, an article was making its way around the popular psychological press about how surprising it was that people tended to find sitting with their own thoughts rather unpleasant. The paper, by Wilson et al (2014), contains 11 studies. Given that number, along with the general repetitiveness of the designs and the lack of details presented in the paper itself, I’ll run through them in the briefest form possible before getting to the meat of the discussion. The first six studies involved around 400 undergrads being brought into a “sparsely-furnished room” after having given up any forms of entertainment they were carrying, like their cell phones and writing implements. They were asked to sit in a chair and entertain themselves with only their thoughts for 6-15 minutes without falling asleep. Around half the participants rated the experience as negative, and a majority reported difficulty concentrating or their minds wandering. The next study repeated this design with 169 subjects asked to sit alone at home without any distractions and just think. The undergraduates found the experience about as thrilling at home as they did in the lab, the only major difference being that now around a third of the participants reported “cheating” by doing things like going online or listening to music. Similar results were obtained in a community sample of about 60 people, a full half of whom reported cheating during the period.

Finally, we reach the part of the study that made the headlines. Fifty-five undergrads were again brought into the lab. Their task began with rating the pleasantness of various stimuli, one of which was a mild electric shock (designed to be unpleasant, but not too painful). After doing this, they were given the sit-alone-and-think task, but were told they could, if they wanted, shock themselves again during their thinking period via an ankle bracelet they were wearing. Despite participants knowing that the shock was unpleasant and that shocking themselves was entirely optional, around 70% of men and 25% of women opted to deliver at least one shock to themselves during the thinking period when prompted with the option. Even among the subjects who said they would pay $5 to avoid being shocked again, 64% of the men and 15% of the women shocked themselves anyway. From this, Wilson et al (2014) concluded that thinking was so aversive that people would rather shock themselves than think if given the option, even if they didn’t like the shock.

“The increased risk of death sure beats thinking!”

The authors of the paper posited two reasons as to why people might dislike doing nothing but sitting around thinking, neither of which makes much sense to me. Their first explanation was that people might ruminate more about their own shortcomings when they don’t have anything else to do. Why people’s minds would be designed to do such a thing is a bit beyond me and, in any case, the results didn’t show that people defaulted to thinking they were failures. The second explanation was that people might find it unpleasant to be alone with their own thoughts because they had to be both the “script writer” and the “experiencer” of them. Why that would be unpleasant is also a bit beyond me and, again, that wasn’t the case either: participants did not find having someone else prompt the focus of their thoughts any more pleasant.

Missing from this paper, as from many papers in psychology, is an evolutionary-level, functional consideration of what’s going on here: not explored or mentioned is the idea that thinking itself doesn’t really do anything. By that, I mean evolution, as a process, cannot “see” (i.e., select for) what organisms think or feel directly. The only thing evolution can “see” is what an organism does; how it behaves. That is to say the following: if you had one organism that had a series of incredibly pleasant thoughts but never did anything because of them, and another that never had any thoughts whatsoever but actually behaved in reproductively-useful ways, the latter would win the evolutionary race every single time.

To further drive this point home, imagine for a moment an individual member of a species which could simply think itself into happiness; blissful happiness, in fact. What would be the likely result for the genes of that individual? In all probability, they would fare less well than their counterparts who were not so inclined since, as we just reviewed, feeling good per se does not do anything reproductively useful. If those positive feelings derived from just thinking happy thoughts motivated any kind of behavior (which feelings frequently do) and were not designed to be tied to some useful fitness outcomes (which they wouldn’t be, in this case), it is likely that the person thinking himself to bliss would end up doing fewer useful things along with many maladaptive ones. The logic here is that there are many things they could do, but only a small subset of those things are actually worth doing. So, if organisms selected what to do on the basis of their emotions, but those emotions were being generated for reasons unrelated to what they were doing, they would select poor behavioral options more often than not. Importantly, we could make a similar argument for an individual that thought himself into despair frequently: to the extent that feelings motivate behaviors, and to the extent that those feelings are divorced from their fitness outcomes, we should expect bad fitness results.
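Here is a deliberately crude sketch of that logic (the numbers and the three-option setup are invented; this is a caricature, not a model of any actual organism): one agent picks whichever action its pleasure signal rates highest when that signal tracks the action’s fitness payoff, and another does the same when the signal is generated independently of payoffs.

```python
import random

def average_payoff(pleasure_tracks_fitness: bool, n_choices: int = 1000, seed: int = 7) -> float:
    """Average fitness payoff for an agent that always picks the action its
    pleasure signal rates highest. Payoffs are random numbers in [-1, 1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_choices):
        payoffs = [rng.uniform(-1, 1) for _ in range(3)]      # three candidate actions
        if pleasure_tracks_fitness:
            pleasure = payoffs                                  # feelings track consequences
        else:
            pleasure = [rng.uniform(-1, 1) for _ in range(3)]   # blissfully decoupled
        chosen = pleasure.index(max(pleasure))
        total += payoffs[chosen]
    return total / n_choices

print("pleasure tied to payoffs:     ", round(average_payoff(True), 2))
print("pleasure decoupled from them: ", round(average_payoff(False), 2))
```

The first agent averages a clearly positive payoff per choice; the second hovers around zero, which is the evolutionary problem with emotions that can be generated independently of what the organism is actually doing.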

Accordingly, we ought to also expect that thinking per se is not what people found aversive in this experiment. There’s no reason for the part of the brain doing the thinking (rather loosely conceived) here to be hooked up to the pleasure or pain centers of the brain. Rather, what the subjects likely found aversive was the fact that they weren’t doing anything even potentially useful or fun. The participants in these studies were asked to do more than just think about something; they were also asked to forego other activities, like browsing the internet, reading, exercising, or really anything at all. So not only were the subjects asked to sit around and do absolutely nothing, they were also asked not to do the other fun, useful things they might have otherwise spent their time on.

“We can’t figure out why he doesn’t like being in jail despite all that thinking time…”

Now, sure, at first glance it might seem a bit weird that people would shock themselves instead of just sitting there and thinking. However, I think that strangeness largely evaporates once two factors are considered: first, there are probably some pretty serious demand characteristics at work here. When people know they’re in a psychology experiment and you only prompt them to do one thing (“but don’t worry, it’s totally optional. We’ll just be in the other room watching you…”), many of them might do it because they think that’s what the point of the experiment is (which, I might add, they would be completely correct about in this instance). There did not appear to be any control group to see how often people independently shocked themselves when not prompted to do so, or when it wasn’t their only option. I suspect few people would under those circumstances.

The second thing to consider is that most organisms would likely start behaving very strangely after a time if you locked them in an empty room; not just humans. This is because, I would imagine, the minds of organisms are not designed to function in environments where there is nothing to do. Our brains have evolved to solve a variety of environmentally-recurrent problems and, in this case, there seems to be no way to solve the problem of what to do with one’s time. The cognitive algorithms in their mind would be running through a series of “if-then” statements and not finding a suitable “then”. The result is that their mind could potentially start generating relatively-random outputs. In a strange situation, the mind defaults to strange behaviors. To make the point simply, computers stop working well if you’re using them in the shower, but, then again, they were never meant to go in the shower in the first place.

To return to Louis CK, I don’t think I get bored because I’m not thinking about anything, nor do I think that thinking about things is what people found aversive here. After all, we are all thinking about things – many things – constantly. Even when we are “distracted”, that doesn’t mean we are thinking about nothing; just that our attention is on something we might prefer it wasn’t. If thinking is what was aversive here, we should be feeling horrible pretty much all the time, which we don’t. Then again, maybe animals in captivity really do start behaving weird because they don’t want to be the “script writer” and “experiencer” of their own thoughts…

References: Wilson, T., Reinhard, D., Westgate, E., Gilbert, D., Ellerbeck, N., Hahn, C., Brown, C., & Shaked, A. (2014). Just think: The challenges of the disengaged mind. Science, 345, 75-77.

Keepin’ It Topical: The Big Facebook Study

I happen to have an iPhone because, as many of you know, I think differently (not to be confused with the oddly-phrased “Think Different”® slogan of Apple, the company that makes the thing), and nothing expresses those characteristics of intelligence and individuality about me better than my ownership of one of the most popular phones on the market. While the iPhone itself is a rather functional piece of technology, there is something about it (OK; related to it) that has consistently bothered me: the Facebook app I can download for it. The reason this app has been bugging me is that, at least as far as my recent memory allows, it seems to have an unhealthy obsession with showing me the always-useless “top stories” news feed as my default, rather than the “most recent” feed I actually want to see. In fact, I recall that the last update to the app actually made it more of a hassle to get to the most recent feed, rather than making it more easily accessible. I had always wondered why there didn’t seem to be a simple way to change my default, as this seems like a fairly basic design fix. Not to get too conspiratorial about the whole thing, but this past week, I think I might have found part of the answer.

Which brings us to the matter of the Illuminati…

It’s my suspicion that the “top stories” feed has uses beyond simply trying to figure out which content I might want to see; that would be just as well, because if its function were to figure out what I want to see, it’s pretty bad at the task. The “top stories” feed might also be used for the sinister purpose of conducting research (then again, the “most recent” feed can probably do that as well; I just really enjoy complaining about the “top stories” one). Since this news story (or is it a “top story”?) about Facebook conducting research on its users has been making the rounds in the media lately, I figured I would add my two cents to the incredibly tall stack of pennies the internet has collectively made in honor of the matter. Don’t get it twisted, though: I’m certainly not doing this in the interest of click-bait to capitalize on a flavor-of-the-week topic. If I were, I would have titled this post “These three things about the Facebook study will blow your mind; the fourth will make you cry” and put it up on Buzzfeed. Such behavior is beneath me because, as I said initially, I think different(ly)…

Anyway, onto the paper itself. Kramer et al (2014) set out to study whether manipulating the emotional content people were exposed to in others’ Facebook status updates had an effect on the emotional content of those people’s own later status updates. The authors believe such an effect would obtain owing to “emotional contagion”, which is the idea that people can “…transfer positive and negative moods and emotions to others”. As an initial semantic note, I think that such phrasing – the use of contagion as a metaphor – only serves to lead one to think incorrectly about what’s going on here. Emotions and moods are not the kind of things that can be contagious the way pathogens are: pathogens can be physically transferred from one host to another, while moods and emotions cannot. Instead, moods and emotions are things generated by our minds from particular sets of inputs.

To understand that distinction quickly, consider two examples: in the first case, you and I are friends. You are sad and I see you being sad. This, in turn, makes me feel sad. Have your emotions “infected” me? Probably not; consider what would happen if you and I were enemies instead: since I’m a bastard and I like to see people I don’t like fail, your sadness might make me feel happy instead. So it doesn’t seem to be your emotion per se that’s contagious; it might just be the case that I happen to generate similar emotions under certain circumstances. While this might seem to be a relatively minor issue, similar thinking about the topic of ideas – specifically, that ideas themselves can be contagious – has led to a lot of rather unproductive thinking and discussion about “memes”. By talking about ideas or moods independently of the minds that create them, we end up with a rather dim view of how our psychology works, I feel.

Which is just what the Illuminati want…

Moving past that issue, however, the study itself is rather simple: for a week in 2012, approximately 700,000 Facebook users had some of their news feed content hidden from them some of the time. Each time one of the subjects viewed their feed, depending on the condition they were in, each post containing certain negative or positive words had a certain probability (between 10% and 90%) of being omitted. Unfortunately, the way the paper is written, it’s a bit difficult to get a sense as to precisely how much content was, on average, omitted. However, as the authors note, this was done on a per-viewing basis, so content that was hidden during one viewing might well have shown up were the page to be refreshed (and sitting there refreshing Facebook minute after minute is something many people might actually do). The content was also only hidden on the news feed: if the subject visited a friend’s page directly or sent or received any messages, all the content was available. So, for a week, some of the time, some of the content was omitted, but only on a per-view basis, and only in one particular form (the news feed); not exactly the strongest manipulation I could think of.
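A quick bit of arithmetic shows why the per-viewing nature of the manipulation weakens it. Treating the 10%–90% range quoted above as the per-viewing omission probability, the chance of a given post never being seen falls off quickly with repeated visits (the refresh counts below are made up purely for illustration):

```python
# Chance of seeing a given post at least once across k feed refreshes,
# if it has probability p of being omitted on any single viewing.
for p in (0.1, 0.5, 0.9):
    for k in (1, 5, 10):
        seen = 1 - p ** k
        print(f"omission prob. {p:.0%}, {k:2d} refreshes: {seen:.0%} chance of seeing the post")
```

Even at the heaviest omission rate, a post a user cares about has decent odds of surfacing eventually, which is part of why the manipulation is so weak.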

The effect of that manipulation was seen by examining what percentage of positive or negative words the subjects themselves used when posting their status updates during the experimental period. Those subjects who saw more positive words in their feed tended to post more positive words themselves, and vice versa for negative words. Sort of, anyway. OK; just barely. In the condition where subjects had access to fewer negative words, the average subject’s status was made up of about 5.3% positive words and 1.7% negative words; when the subjects had access to fewer positive words, these percentages plummeted/jumped to…5.15% and 1.75%, respectively. Compared to the control groups, then, these changes amount to increases or decreases of between 0.0001 and 0.02 standard deviations of emotional word usage or, as we might say in precise statistical terms, effects so astonishingly small they might as well not be said to exist.
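To put those numbers in more concrete terms, here’s a small back-of-the-envelope calculation using only the figures quoted above (the paper’s exact standard deviations aren’t reproduced here, so the percentile conversion assumes a roughly normal distribution of emotional word use):

```python
import math

def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Effect sizes in standard deviation units, as quoted above.
for d in (0.0001, 0.02):
    percentile = normal_cdf(d) * 100
    print(f"d = {d}: the average 'treated' user sits at the {percentile:.2f}th percentile of controls")

# The raw shift in positive-word use: 5.3% vs. 5.15% of words written.
fewer_per_thousand = (5.3 - 5.15) / 100 * 1000
print(f"That works out to roughly {fewer_per_thousand:.1f} fewer positive words per 1,000 words posted")
```

In other words, the largest of the reported effects nudges the average user from the 50th to about the 51st percentile.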

“Can’t you see it? The effect is right there; plain as day!”

What we have here, in sum, is an exceedingly weak and probabilistic manipulation that had about as close to no net effect as one could reasonably get, based on an at-least-partially (if only metaphorically) deficient view of how the mind works. The discussion about the ethical issues people perceived with the research appears to have vastly overshadowed the fact that the research itself wasn’t really very strong or interesting. So, for all of you outraged over this study for fear that people were harmed: don’t worry. I would say the evidence is good that no appreciable harm came of it.

I would also say that other ethical criticisms of the study are a bit lacking. I’ve seen people raise concerns that Facebook had no business seeing whether bad moods could be induced by showing people a disproportionate number of negative status updates; I’ve also seen concerns that the people posting those negative updates might not have received the support they needed if other people were blocked from seeing them. The first thing to note is that Facebook did not increase the absolute number of positive or negative posts; it only (kind of) hid some of them from appearing (some of the time, to some people, in one particular venue). The second is that, given those two criticisms, Facebook is in a no-win situation: reducing or failing to reduce the number of negative stories each leads to criticism, either for failing to get people the help they need or for bumming us out by disproportionately exposing us to people who need help. Finally, I would add that if anyone did miss a major life event of a friend – positive or negative – because Facebook might have probabilistically omitted a status update on a given visit, then they were likely not very good friends with that person anyway, and probably didn’t have a close enough relationship to realistically lend help or take much pleasure from the event.

References: Kramer, A., Guillory, J., & Hancock, J. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences of the United States of America, doi: 10.1073/pnas.1320040111

Classic Theory In Evolution: The Big Four Questions

Explanations for things appear to be a truly vexing issue for people in many instances. Admittedly, that might sound a little strange; after all, we seem to explain things all the time without much apparent effort. We could consider a number of examples for explanations of behavior: people punch walls because they’re angry; people have sex because it feels good; people eat certain foods because they prefer those flavors, and so on. Explanations like these seem to come automatically to us; one might even say naturally. The trouble that people appear to have with explanations concerns the following issue: there are multiple, distinct, and complementary ways of explaining the same thing. Now, by that I don’t mean that, for instance, someone punched a wall because they were angry and drunk, but rather that there are qualitatively different ways to explain the same thing. For instance, if you ask me what an object is, I could tell you it’s a metallic box that appears to run on electricity and contains a heating element that can be adjusted via knobs; I could also tell you it’s a toaster. The former explanation tells you about various features of the object, while the latter tells you (roughly) what it functions to do (or, at least, what it was initially designed to do).

…And might have saved you that trip to the ER.

More precisely, the two issues people seem to run into when it comes to these different kinds of explanations are that they (a) don’t view these explanations as complementary, but rather as mutually exclusive, or (b) don’t realize that there are distinct classes of explanations that require different considerations from one another. It is on the second point that I want to focus today. Let’s start by considering the questions found in the first paragraph in what is perhaps their most basic form: “what causes that behavior?” or, alternatively, “what preceding events contributed to the occurrence of the behavior?” We can use the man punching the wall as our example to guide us through the different classes of explanations, of which there are four generally agreed-upon categories (Tinbergen, 1963).

The first two of these classes of explanations can be considered proximate – or immediate – causes of the behavior. The standard explanation many people might give for why the man punched the wall would be to reference the aforementioned anger. This would correspond to Tinbergen’s (1963) category of causation which, roughly, can be captured by considerations of how the cognitive systems which are responsible for generating the emotional outputs of anger and corresponding wall-punching work on a mechanical level: what inputs do they use, how are these inputs operated upon to generate outputs, what outputs are generated, what structures in the brain become activated, and so on. It is on this proximate level of causation that most psychological research focuses, and with good reason: the hypothesized proximate causes for behaviors are generally the most open to direct observation. Now that’s certainly not to say that they are easy to observe and distinguish in practice (as we need to determine what cognitive or behavioral units we’re talking about, and how they might be distinct from others), but the potential is there.

The second type of explanation one might offer is also a proximate-type of explanation: an ontogenetic explanation. Ontogeny refers to changes to the underlying proximate mechanisms that take place during the course of the development, growth, and aging of an organism. Tinbergen (1963) is explicit about what this does not refer to: behavioral changes that correspond to environmental changes. For instance, a predator might evidence feeding behavior in the presence of prey, but not evidence that behavior in the absence of prey. This is not good evidence that anything has changed in the underlying mechanisms that generate the behavior in question; it’s more likely that they exist in the form they did moments prior, but have now been provided with novel inputs. More specifically, then, ontogeny refers, roughly, to considerations of what internal or external inputs are responsible for shaping the underlying mechanisms as they develop (i.e., how the mechanism is shaped as you grow from a single cell into an adult organism). For instance, if you raise certain organisms in total darkness, parts of their eyes may fail to process visual information later in life; light, then, is a necessary developmental input for portions of the visual system. To continue with the wall-punching example, ontogenetic explanations for why the man punched the wall would reference what inputs are responsible for the development of the underlying mechanisms that would produce the eventual behavior.

Like their father’s fear of commitment…

The next two classes of explanations refer to ultimate – or distal – causal explanations. The first of these is what Tinbergen calls evolution, though it could be more accurately referred to as a phylogenetic explanation. Species tend to resemble each other to varying degrees because of shared ancestry. Accordingly, the presence of certain traits and mechanisms can be explained by homology (common descent). The more recently two species diverged from one another in their evolutionary history, the more traits we might expect the two to share in common. In other words, all the great apes might have eyes because they all share a common ancestor who had eyes, rather than because they all independently evolved the trait. Continuing on with our example, the act of wall-punching might be explained phylogenetically by noting that the cognitive mechanisms we possess related to, say, aggression, are to some degree shared with a variety of species.

Finally, this brings us to my personal favorite: survival value. Survival value explanations for traits involve (necessarily speculative, but perfectly testable) considerations of what evolutionary function a given trait might have (i.e., what reproductively relevant problem, if any, is “solved” by the mechanism in question). Considerations of function help inform some of the “why” questions at the proximate levels, such as “why are these particular inputs used by the mechanism?”, “why do these mechanisms generate the outputs they do?”, or “why does this trait develop in the manner that it does?”. To return to the punching example, we might say that the man punched the wall because aggressive responses to particular frustrations might have solved some adaptive problem (like convincing others to give you a needed resource rather than face the costs of your aggression). Considerations of function also manage to inform the evolution, or phylogeny, level, allowing us to answer questions along the lines of “why was this trait maintained in certain species but not others?”. For instance, even if cave-dwelling and non-cave-dwelling species share a common ancestor that had working eyes, that’s no guarantee that functional eyes will persist in both populations. Homology might explain why the cave-dweller develops eyes at all, but it would not itself explain why those eyes no longer work. Similarly, merely noting that people punch walls when they are angry does not explain why we do so.
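To lay the four classes of explanation side by side, here is the wall-punching example organized as a small Python dictionary; the one-line answers are my own compressions of the paragraphs above, offered only as a summary.

```python
# Tinbergen's four questions, applied to "why did the man punch the wall?"
explanations = {
    "causation (proximate mechanism)":
        "Cognitive systems took certain inputs and generated anger and a punch as outputs.",
    "ontogeny (development)":
        "Particular developmental inputs shaped those mechanisms as he grew from a single cell to an adult.",
    "phylogeny (evolutionary history)":
        "Mechanisms related to aggression are shared, by common descent, with a variety of species.",
    "survival value (function)":
        "Aggressive responses to particular frustrations may have solved an adaptive problem.",
}

for question, answer in explanations.items():
    print(f"{question}:\n  {answer}")
```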

All four types of explanations answer the question “what causes this behavior?”, but in distinct ways. This distinction between questions of function and questions of causation, ontogeny, and phylogeny, for instance, can be summed up quite well by a quote from Tinbergen (1963):

No physiologist applying the term “eye” to a vertebrate lens eye as well as a compound Arthropod eye is in danger of assuming that the mechanism of the two is the same; he just knows that the word “eye” characterizes achievement, and nothing more.

Using the word “eye” to refer to a functional outcome of a mechanism (processing particular classes of light-related information) allows us to speak of the “eyes” of different species, despite their making use of different proximate mechanisms and cues, developing in unique fashions over the span of an organism’s life, and having distinct evolutionary histories. If the functional level of analysis were not distinct, in some sense, from analyses concerning development, proximate functioning, and evolutionary history, then we would not be able to even discuss these different types of “eyes” as being types of the same underlying thing; we would fail to recognize a rather useful similarity.

“I’m going to need about 10,000 contact lenses”

To get a complete (for lack of a better word) understanding of a trait, all four of these questions need to be considered jointly. Thankfully, each level of analysis can, in some ways, help inform the other levels: understanding the ultimate function of a trait can help inform research into how that trait functions proximately; homologous traits might well serve similar functions in different species; what variables a trait is sensitive towards during development might inform us as to its function, and so on. That said, each of these levels of analysis remains distinct, and one can potentially speculate about the function of a trait without knowing much about how it develops, just as one could research the proximate mechanisms of a trait without knowing much about its evolutionary history.

Unfortunately, there has been and, sadly, continues to be, some hostility and misunderstanding with respect to certain levels of analysis. Tinbergen (1963) had this to say:

It was a reaction against the habit of making uncritical guesses about the survival value, the function, of life processes and structures. This reaction, of course healthy in itself, did not (as one might expect) result in an attempt to improve methods of studying survival value; rather it deteriorated into lack of interest in the problem – one of the most deplorable things that can happen in science. Worse, it even developed into an attitude of intolerance: even wondering about survival value was considered unscientific.

That these same kinds of criticisms continue to exist over 50 years later (and they weren’t novel when Tinbergen was writing either) might suggest that some deeper, psychological issue exists surrounding our understanding of explanations. Ironically enough, the proximate functioning of the mechanisms that generate these criticisms might even give us some insight into their ultimate function. Then again, we don’t want to just end up telling stories and making assumptions about why traits work, do we?

References: Tinbergen, N. (1963). On aims and methods of Ethology. Zeitschrift für Tierpsychologie, 20, 410-433.

What Percent Of Professors Are Bad Teachers?

Let’s make a few assumptions about teaching ability. The first of these is that the ability to be an effective teacher (a broad trait, to be sure, comprised of many different sub-traits) – as measured by your ability to, roughly, put knowledge into people’s heads in such a manner that they can recall it later – is not an ability that is evenly distributed throughout the human population. Put simply, some people will make better teachers than others, all else being equal. The second assumption is that teaching ability is approximately normally distributed: a few people are outstanding teachers, a few people are horrible, and most are a little above or below average. This may or may not be true, but let’s just assume that it is to make things easy for us. Given these two assumptions, we might wonder how many of those truly outstanding, tail-end teachers end up being instructors at the college level. The answer to that question depends, of course, on the answer to another: on what basis are teachers being hired?

Glasses AND a sweater vest? Seems legitimate enough for me.

Now, having never served on any hiring committees myself, I can offer little data or direct insight on that matter. Thankfully, I can offer anecdotes. From what I’ve been told, many colleges seem to look at two things when considering how to make their initial cut of the dozens or hundreds of resumes they receive for the single job they are offering: publications in academic journals (more publications in “better” journals is a good thing) and grant funding (the more money you have, the better you look, for obvious reasons). Of course, those two factors aren’t everything when it comes to who gets hired, but they at least get your foot in the door for consideration or an interview. The importance of those two factors doesn’t end post-hiring, either, as far as I’ve been told, later becoming relevant for such minor issues as “promotions” and “tenure”. Again, this is all gossip, so take it with a grain of salt.

However, to the extent that this resembles the truth of the matter, it would seem to tilt the incentive system away from investing time and effort in becoming a “good” teacher, as such investments in teaching (as well as the teaching itself) would be more of a “distraction” from other, more important matters. How does this bear on our initial question? Well, if college professors are being hired primarily on their ability to do things other than teach, we ought to expect that the proportion of professors drawn from the upper tail of that distribution of teaching ability might end up being lower than we would prefer (that is, unless teaching ability correlates pretty well with one’s ability to do research and get grants, which is certainly an empirical matter). I’m sure many of you can relate to that issue, having both had teachers who inspired you to pursue an entirely new path in life, as well as teachers who inspired you to get an extra hour of sleep instead of showing up to their class. The difference between a good teacher (and you’ll know them when you see them, just like porn) and a mediocre or poor one can be massive.
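A toy simulation can make that correlation caveat concrete. All of the numbers below are invented for illustration: teaching ability and research output are treated as standard normal traits with some correlation r, departments hire the top 5% of applicants on research output alone, and we then ask what fraction of hires fall in the top 10% of teaching ability.

```python
import math
import random

def top_teacher_share(r: float, n: int = 100_000, hire_frac: float = 0.05,
                      top_teach_frac: float = 0.10, seed: int = 42) -> float:
    """Fraction of hires (selected purely on research output) who also land in
    the top `top_teach_frac` of teaching ability, given correlation r between
    the two (made-up, standard normal) traits."""
    rng = random.Random(seed)
    teaching = [rng.gauss(0, 1) for _ in range(n)]
    research = [r * t + math.sqrt(1 - r * r) * rng.gauss(0, 1) for t in teaching]

    teach_cutoff = sorted(teaching)[int(n * (1 - top_teach_frac))]
    hire_cutoff = sorted(research)[int(n * (1 - hire_frac))]

    hired_teaching = [t for t, res in zip(teaching, research) if res >= hire_cutoff]
    return sum(t >= teach_cutoff for t in hired_teaching) / len(hired_teaching)

for r in (0.0, 0.3, 0.7):
    print(f"r = {r}: {top_teacher_share(r):.0%} of hires are top-decile teachers")
```

When r = 0, hires land in the top teaching decile at chance levels (about 10%); the share only climbs substantially as the correlation gets fairly strong, which is exactly the empirical question the parenthetical above is pointing at.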

So why ask this question about teaching ability? It has to do with a recent meta-analysis by Freeman et al (2014) examining what the empirical research has to say about the improvements in educational outcomes that active learning classes have over traditional lecture teaching in STEM fields. For those of you not in the know, “active learning” is a rather broad umbrella term for a variety of classroom setups and teaching styles that go beyond strictly lecturing. As the authors put it, the term “...included approaches as diverse as occasional group problem-solving, worksheets or tutorials completed during class, use of personal response systems with or without peer instruction, and studio or workshop course designs“. Freeman et al (2014) wanted to see which instruction style had better outcomes for both (1) standardized tests and (2) failure/withdrawal rates in the classes.

“Don’t lecture him, dear; just let the active learning happen”

The results found that, despite this exceedingly broad definition of active learning, the method seemed to yield a marked improvement in learning outcomes, relative to lecture classes. With respect to the standardized test scores, the average effect size was 0.47, meaning that, on the whole, students in active learning classes tended to score about half a standard deviation higher than students in lecture-based classes. In simpler terms, this means that students in the active learning classes would be expected to earn about a B on a standardized test, relative to the lecture students’ B-. While that might seem neat, if not terribly dramatic, the effect on the failure rate was substantially more noteworthy: specifically, students in lecture-only classes were 1.5 times more likely to fail than students in active learning classes (roughly a 34% failure rate in lecture classes, relative to active learning’s 22%). These effects were larger in small classes, relative to large ones, but held regardless of class size or subject matter. Active learning seemed to be better.
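For anyone who likes to check the arithmetic, here is a quick sketch of what those two headline numbers mean; the 10-point standard deviation used to translate the effect size into exam points is an assumption for illustration, not a figure from the paper:

```python
d = 0.47              # standardized mean difference on exam scores (from the meta-analysis)
fail_lecture = 0.34   # approximate failure rate in lecture-only classes
fail_active = 0.22    # approximate failure rate in active learning classes

# Translating the effect size into exam points, assuming an SD of ~10 points
# on a 100-point scale (an illustrative assumption only).
assumed_sd_points = 10
print(f"Score bump: ~{d * assumed_sd_points:.1f} points, assuming SD = {assumed_sd_points}")

# Relative risk of failing in lecture vs. active learning classes.
print(f"Relative risk of failure: {fail_lecture / fail_active:.2f}")
```

The failure-rate ratio works out to roughly 1.5, matching the figure quoted above, and a bump of about five points is in line with the B-minus-to-B comparison.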

The question of why active learning seems to have these benefits is certainly an interesting one, especially given the diversity of methods that fall under the term. As the authors note, “active learning” could refer both to a class that spent 10% of its time on “clicker” questions (real-time multiple-choice questions) and to a class that was entirely lecture-free. One potential explanation is that active learning per se doesn’t actually have much of a benefit; instead, the results might be due to the “good” professors being more likely to volunteer for research on the topic of teaching, or more likely to adopt the method. This explanation, while it might contain some truth, seems to be undercut by the data reported by Freeman et al (2014), which suggest that the active learning effect isn’t diminished even when it’s the same professor doing the teaching in both kinds of courses.

We might also consider that there’s a lot to be said for learning by doing. When students have practice answering (with feedback) questions similar to those which might appear on tests – either of the professor’s making or the standardized varieties – we might also expect them to do better on those tasks when it counts. After all, there’s a big difference between reading a lot of books about how to paint and actually being able to create a painting that bears a resemblance to what you hoped it would look like. Similarly, answering questions about your subject matter before a test might be good practice for answering questions on the test. Simple enough. While this is an exceedingly plausible-sounding explanation, the extent to which active learning facilitates learning in this manner is unknown. In the current study, as previously mentioned, active learning could involve something as brief as a few quick questions or an entire class without lecture; the duration or type of active learning wasn’t controlled for. Learning by doing seems to help, but past a certain point it might simply be overkill.

Which is good news for all you metalhead professors out there

Another potential explanation that occurs to me returns to our initial question. If we assume that many professors do not receive their jobs on the basis of their teaching ability – at least not primarily – and if increasing one’s skill at teaching isn’t often or thoroughly incentivized, then it’s quite possible that many people placed in teaching positions are not particularly outstanding when it comes to their teaching ability. If student learning is in some way tied to teaching ability (likely), then we shouldn’t necessarily expect the best learning outcomes when the teacher is the only source of information. What that might mean is that students could learn better when they are able to rely on something that isn’t their teacher to achieve that end. As the current study might hint, what that “something” is might not even need to be very specific; almost anything might be preferable to a teacher reading PowerPoint slides they didn’t make and that restate the textbook verbatim, as seems to be popular among many instructors who currently lecture. If some professors view teaching as more of a chore than a pleasure, we might see similar issues. Before calling the lecture itself a worse format, I would like to see more discussion of how it might be improved and whether there are specific variables that separate “good” lectures from “bad” ones. Perhaps all lectures will turn out to be equally poor, and teaching ability has nothing at all to do with students’ performance in those classes. I would just like to see that evidence before coming to any strong conclusions about their effectiveness.

References: Freeman, S., Eddy, S., McDonough, M., Smith, M., Okoroafor, N., Jordt, H., & Wenderoth, M. (2014). Active learning increases student performance in science, engineering, and mathematics. Proceedings of the National Academy of Sciences, doi: 10.1073/pnas.1319030111.