Unusual Names In Learning Research

Learning new skills and bodies of knowledge takes time, repetition, and sustained effort. It’s a rare thing indeed for people to learn even simple skills or bodies of knowledge fluently from a single exposure, even when they’re properly motivated. Given how important learning is to success in life, a healthy body of literature in psychology examines people’s ability to learn and remember information. This literature covers both how we learn successfully and the contexts in which we fail. Good research in this realm will often leverage considerations of adaptive function to understand why we learn what we do. It is unfortunate that this theoretical foundation appears to be lacking in much of psychological research in general, with learning and memory research being no exception. In the course I taught on the topic last semester, for instance, I’m not entirely sure the word “relevance” appeared even once in the textbook I was using to help the reader understand our memory mechanisms. There were, however, a number of parts of that book which caught my attention, though not for the best reasons.

You have my attention, but no longer have a working car.

Recently, for instance, I came across a reference in this textbook to a phenomenon called the labor-in-vain effect. The effect was summarized as follows:

Here’s the basic methodology. Nelson and Leonesio (1988) asked participants to study words paired with nonsense syllables (e.g., monkey–DAX). Participants made judgments of learning in an initial stage. Then, when given a chance to study the items again, each participant could choose the amount of time to study for each item. Finally, in a cued recall test, participants were given the English word and asked to recall the nonsense syllable….Even though they spent most of their time studying the difficult items, they were still better at remembering the easy ones. For this reason, Nelson and Leonesio labeled the effect labor in vain because their experiment showed that participants were unable to compensate for the difficulty of those items

As I like to be thorough when preparing the materials for my course, I did what every self-respecting teacher should do (even though not all of them will): I tracked down and read the primary literature upon which this passage was based. Professors (or anyone who wants to talk about these findings) ought to read the source material themselves for two reasons: first, because you want to be an expert in the material you’re teaching your students (why else would they be listening to you?) and, second, because textbooks – really, secondary sources in general – have a bad habit of getting details wrong. What I found in this case was not only that the textbook mischaracterized the effect and failed to provide crucial details about the research, but also that the original study itself was a bit ambitious in its naming and assessment of the phenomenon. Let’s take those points in order.

First, to see why the textbook’s description wasn’t on point, let’s consider the research itself (Nelson & Leonesio, 1988). The general procedure in their experiments was as follows: participants (i.e., undergraduate students looking for extra credit) were given lists of items to study. In the first experiment these were trigrams (like BUG or DAX); in the second, words paired with trigrams (like monkey–DAX); and in the third, general-information questions they had previously failed to answer correctly (like, “What is the capital of Chile?”). In each experiment, the participants were broken up into groups that emphasized either speed or accuracy in learning. Both groups were told they could study the target information at their own pace and that the goal was to remember as much of it as possible, but the speed groups were also told their study time would count against their eventual score. Following the study phase, participants were given a recall task after a brief delay to see how successful their study time had been.

As one might expect, the speed-emphasis groups studied the information for less time than the accuracy-emphasis groups. Crucially, in two of the three experiments, the extra study time invested by the participants did not yield statistically significant gains in their ability to subsequently recall the information (in the third experiment, the difference was significant). This was dubbed the labor-in-vain effect because participants were putting in extra labor for little to no gain.

We can see from this summary that the textbook’s description of the labor-in-vain effect isn’t quite accurate. The labor-in-vain effect does not refer to the fact that participants were unable to make up the difference between the easy and hard items (which they actually did in one of the three studies); instead, it refers to the idea that the participants were not gaining anything at all from their extra study time. To quote the original paper:

We refer to this finding of substantial extra study time yielding little or no gain in recall as the labor-in-vain effect. Although we had anticipated that extra study time might yield diminishing (i.e., negatively accelerated) gains in recall, the present findings are quite extreme in showing not even a reliable gain in recall after more than twice as much extra study time.

This mischaracterization might seem like a minor error, one that merely speaks to the meticulousness of the textbook’s author, but it’s not the only problem with the book’s presentation of the information. Specifically, the textbook provided no sense of the exact methodological details, the associated data, or whether the interpretation of these findings was accurate. So let’s turn to those now.

If the labor will all be in vain, why bother laboring at all?

The general summary of the research I just provided is broadly true, but it leaves out very important details that help contextualize it. The first of these involves how the study phase of the experiments was conducted. Let’s just consider the first experiment, as the methods are broadly similar across the three. In the study phase, the participants had 27 trigrams to commit to memory. The participants were seated at a computer, and the trigrams would appear on the screen one at a time. After the participants felt they had studied an item enough, they would hit the enter key to advance to the next one, but they could not go back to previous items once they did. This meant there was no opportunity to restudy items or to test oneself in advance of the formal test. To be frank, this method of study resembles nothing I know humans to naturally engage in. Since the context of studying in the experiment is so strange, I would be hesitant to say that it tells us much about how learning occurs in the real world, but the problems get worse than that.

As I mentioned before, these are undergraduate participants trying to earn extra credit. With that mental picture of the sample in mind, we might expect the participants to be a little less than motivated to deliver a flawless performance. If they’re anything like the undergraduates I’ve known, they likely just want to get the experiment over and done with so they can go back to doing things they actually want to do. Learning nonsense syllables isn’t high on the list of college students’ interests; in fact, I don’t think that task is high on anybody’s list. The practical information value of what they’re learning is nonexistent, and very little is riding on their success. It might come as no surprise, then, that the participants dedicated effectively no time to studying these items. Bear in mind, there were 27 of these trigrams to learn. In the speed group, the average study time was 1.9 seconds per trigram. Two whole seconds of learning per bit of nonsense. In the accuracy group, this study time skyrocketed to a substantial…5.4 seconds.

An increase of 3.5 seconds per item does not strike me as anything I’d refer to as labor, even if the amount of study time was nominally over twice as long. A similar pattern emerged in the other two experiments: the speed/accuracy study times were 4.8 and 15.2 seconds per item in the second study, and 1.2 and 8.4 in the third. Putting this together, we have (likely unmotivated) undergraduate participants studying useless information in unnatural ways for very brief periods of time. Given that, why on Earth would anyone expect to find large differences in later recall performance?

Speaking of eventual performance, though, let’s finally consider how well each group performed on the recall task: how much of that laboring was actually done in vain. In the first experiment, the speed group recalled 43% of the trigrams; the accuracy group got 49% correct. That extra study time of about three and a half seconds per item yielded a six-percentage-point improvement in performance. The difference wasn’t statistically significant but, again, exactly how large of an improvement should have been expected, given the context? In the second study, these percentages were 49% and 57%, respectively (a gain of eight points); in the third, they were 75% and 83% (another eight-point difference, and one that actually was statistically significant given the larger sample size of experiment 3). So, across three studies, we do not see evidence of people laboring in vain; not really. Instead, what we see is that very small amounts of extra time devoted to studying nonsense in unusual ways, by people who would rather be doing other things, yield correspondingly small – but consistent – gains in recall performance. It’s not that this labor was in vain; it’s that not much labor was invested in the first place, so the gains were minimal.
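To make those numbers concrete, here is a quick back-of-the-envelope tabulation: a minimal sketch in Python, where the study times and recall rates are the figures quoted above from Nelson and Leonesio’s three experiments, while the “points gained per extra second” framing is simply my own illustration, not anything from the original paper.

```python
# Back-of-the-envelope arithmetic using the figures quoted above from
# Nelson & Leonesio (1988). The numbers are theirs; the "recall gained
# per extra second of study" framing is only an illustration.

experiments = {
    # experiment: (speed study s/item, accuracy study s/item,
    #              speed recall %, accuracy recall %)
    1: (1.9, 5.4, 43, 49),
    2: (4.8, 15.2, 49, 57),
    3: (1.2, 8.4, 75, 83),
}

for exp, (t_speed, t_acc, r_speed, r_acc) in experiments.items():
    extra_time = t_acc - t_speed   # extra seconds of study per item
    gain = r_acc - r_speed         # percentage-point gain in recall
    print(f"Experiment {exp}: +{extra_time:.1f} s/item of study "
          f"-> +{gain} points recall "
          f"(~{gain / extra_time:.1f} points per extra second)")
```

Framed that way, each extra second of study per item bought participants somewhere in the neighborhood of a percentage point of recall across the three experiments: a small return on a small investment, which is not the same thing as laboring in vain.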

If you want to make serious gains, you’ll need more than baby weight

On a theoretical level, it sure would be strange if people spent substantial extra time laboring in study only to make effectively no gains. Why waste all that valuable time and energy doing something that has no probability of paying off? That’s not something anyone should posit a brain would do if they were using evolutionary theory to guide their thinking. It would be strange to truly observe a labor-in-vain effect in the biological sense of the term. However, given a fuller picture of the methods of the research and the data it uncovered, it doesn’t seem like the name of that effect is particularly apt. The authors of the original paper seem to have tried to make these results sound more exciting than they are (through their naming of the effect, their use of phrases like “substantial extra study time,” their description of differences in study time as “highly significant,” and an exclamation point here and there). That the primary literature is a little ambitious is one thing, but we also saw that the secondary summary of the research in my textbook was less than thorough or accurate. Anyone reading the textbook would not leave with a good sense of what this research found. It’s not hard to imagine how this example could extend further: a student summarizing the summary they read to someone else, at which point all the information to be gained from the original study is effectively gone.

The key point to take away from this is that textbooks (indeed, secondary sources in general) should certainly not be used as an end point for research; they should be used as a tentative beginning to help track down the primary literature. However, that primary literature is not always to be taken at face value either. Even assuming the original study was well designed and interpreted properly, it would still only represent a single island of information in the academic ocean. Obtaining true and useful information from that ocean takes time and effort which, unfortunately, you often cannot trust others to do on your behalf. To truly understand the literature, you need to dive into it yourself.

References: Nelson, T. O., & Leonesio, R. J. (1988). Allocation of self-paced study time and the “labor-in-vain effect”. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 676-686.
