Rejection can be a painful process for almost anyone (unless you’re English). For many, rejection is what happens when a (perhaps overly-bloated) ego ends up facing the reality that it really isn’t as good as it likes to tell people it is. For others, rejection is what happens when the person in charge of making the decision doesn’t possess the accuracy of assessment that they think they do (or wish they did), and failed to recognize your genius. One of the most notable examples of the latter is The Beatles’ Decca audition in 1962, during which the band was told they had no future in show business. Well over 250 million certified sales later, “oops” kind of fails to cut it with respect to how large a blunder that decision was. This is by no means a phenomenon unique to The Beatles either: plenty of notable celebrities have been discouraged or rejected from their eventual profession by others. So we have a bit of error management going on here: record labels want to do things like (a) avoid signing artists that are unlikely to go anywhere while (b) avoiding failures to sign the best-selling band of all time. As they can’t do either of those things with perfect accuracy, they’re bound to make some mistakes.
“Yet again, our talents have gone unnoticed despite our sick riffs”
Part of the problem facing companies that put out products such as albums, books, movies, and the rest, is that popularity can be a terribly finicky thing, since it often snowballs on itself. It’s not necessarily the objective properties of a song or book that make it popular; a healthy portion of popularity depends on who else likes it (which might sound circular, but it’s not). This tends to make the former problem of weeding out the bad artists easier than finding the superstars: in most cases, people who can’t sing well won’t sell, but just because someone can sing well doesn’t mean they’re going to be a hit. As we’re about to see, these problems are shared not only by people who put out products like music or movies; they’re also shared by people who publish (or fail to publish) scientific research. A recent paper by Siler, Lee, & Bero (2014) sought to examine how good the peer review process – the process through which journal editors and reviewers decide what gets published and what does not – is at catching good papers and filtering out bad ones.
The data examined by the authors focused on approximately 1,000 papers that had been submitted to three of the top medical journals between 2003 and 2004: Annals of Internal Medicine, British Medical Journal, and The Lancet. Of the 1,008 manuscripts, 946 – or about 94% of them – were rejected. The vast majority of those rejections – about 80% – were desk rejections, which is when an article is not sent out for review before the journal decides to not publish it. From that statistic alone, we can already see that these journals are getting way more submissions than they could conceivably publish or review and, accordingly, lots of people are going to be unhappy with their decision letters. Thankfully, publication isn’t a one-time effort; authors can, and frequently do, resubmit their papers to other journals for publication. In fact, 757 of the rejected papers were found to have been subsequently published in other journals (more might have been published after being modified substantially, which would make them more difficult to track). This allowed Siler, Lee, & Bero (2014) the opportunity to compare the articles that were accepted to those which were rejected in terms of their quality and importance.
Now determining an article’s importance is a rather subjective task, so the authors decided to focus instead on the paper’s citation counts – how often other papers had referenced them – as of April 2014. While by no means a perfect metric, it’s certainly a reasonable one, as most citations tend to be positive in nature. First, let’s consider the rejected articles. Of the articles that had been desk rejected by one of the three major journals but eventually published in other outlets, the average citation count was 69.8 per article; somewhat lower than the articles which had been sent out for review before they had been rejected (M = 94.65). This overstates the “average” difference by a bit, however, as citation count is not distributed normally. In the academic world, some superstar papers receive hundreds or thousands of citations, whereas many others hardly receive any. To help account for this, the authors also examined the log-transformed number of citations. When they did so, the mean citation count for the desk rejected papers was 3.44, and 3.92 for the reviewed-then-rejected ones. So we have some evidence that editors’ decisions about which papers to send out for review work as advertised: the less popular papers (popularity here serving as our proxy for quality) were rejected more readily, on average.
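To see why the log transform matters here, consider what one blockbuster paper does to a raw average. The citation counts below are made up purely for illustration – they aren’t from the study’s data – but they mimic the skewed distribution described above:

```python
import math

# Hypothetical citation counts: many modestly-cited papers plus one
# superstar, mimicking the long right tail typical of citation data.
citations = [2, 5, 8, 12, 15, 20, 30, 45, 900]

# The raw mean is dominated by the single 900-citation outlier.
raw_mean = sum(citations) / len(citations)

# Taking the natural log first compresses the tail, so the mean
# better reflects a "typical" paper rather than the outlier.
log_mean = sum(math.log(c) for c in citations) / len(citations)

print(round(raw_mean, 2))  # far above what most papers in the list got
print(round(log_mean, 2))  # a more representative central tendency
```

Note that the raw mean lands above every value in the list except the outlier, which is exactly the distortion the authors’ log-transformed figures (3.44 vs. 3.92) are meant to avoid.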
“I just don’t think there’s room for you on the team this season…”
There’s also evidence that, if the paper gets sent out to reviewers, the peer reviewers are able to assess a paper’s quality with some accuracy. When reviewers send their reviews back to the journal, they suggest that the paper be published as is, with minor/major revisions, or rejected. If those suggestions are coded as numerical values, each paper’s mean reviewer score can be calculated (e.g., fewer recommendations to reject = better paper). As it turns out, these scores correlated weakly – but positively – with an article’s subsequent citation count (r = 0.28 and 0.21 with citation and logged citation counts, respectively), so it seems the reviewers have at least some grasp on the paper’s importance and quality as well. That said, the number of times an article was revised prior to acceptance had no noticeable effect on its citation count. While reviewers might be able to discern the good papers from the bad at better-than-chance rates, the revisions they suggested did not appear to have a noticeable impact on later popularity.
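The coding-and-correlating step above can be sketched in a few lines. The numeric coding scheme and all the data here are illustrative assumptions of mine, not the authors’ actual coding or values:

```python
# Hypothetical per-paper mean reviewer scores, using an assumed coding of
# 3 = accept, 2 = minor revisions, 1 = major revisions, 0 = reject,
# averaged across each paper's reviewers, alongside made-up citation counts.
mean_scores = [2.5, 1.0, 3.0, 1.5, 2.0, 0.5, 2.5, 1.0]
citations   = [80,  20,  150, 40,  60,  10,  90,  35]

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r = pearson_r(mean_scores, citations)
print(round(r, 2))  # positive for this toy data: better scores, more citations
```

In the real study the correlations were far weaker (r ≈ 0.2–0.3) than this contrived example produces, which is part of the paper’s point: reviewers carry some signal about eventual impact, but not much.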
What about the lucky papers that managed to get accepted by these prestigious journals? As they had all gone out for peer review, the reviewers’ scores were again compared against citation count, revealing a similarly small but positive correlation (0.21 and 0.26 with citation and logged citation counts). Additionally, the published articles that did not receive any recommendations to reject from the reviewers received higher citation counts on average (162.8 and 4.72) relative to those with at least one recommendation to reject (115.24 and 4.33). Comparing these numbers to the citation counts of the rejected articles, we can see a rather large difference: articles accepted by the high-end journals tended to garner substantially more citations than the ones that were rejected, whether before or after peer review.
That said, there’s a complication present in all this: papers rejected from the most prestigious journals tend to subsequently get published in less-prestigious outlets, which fewer people tend to read. As fewer eyes tend to see papers published in less-cited journals, this might mean that even good articles published in worse journals receive less attention. Indeed, the impact factor of the journal (the average citation count of the recent articles published in it) in which an article was published correlated 0.54 with citation and 0.42 with logged citation counts. To help get around that issue, the authors compared the accepted papers to the rejected-then-published papers within journals with an impact factor of 8 or greater. When they did so, the authors found, interestingly, that the rejected articles were actually cited more than the accepted ones (212.77 vs. 143.22 citations; 4.77 vs. 4.53 logged citations). While such an analysis might bias the number of “mistaken” rejections upwards (as it doesn’t count the papers that were “correctly” bumped down into lower journals), it’s a worthwhile point to bear in mind. It suggests that, above a certain threshold of quality, the acceptance or rejection by a journal might reflect chance differences more than meaningful ones.
But what about the superstar papers? Of the 15 most cited papers, 12 (80%) had been desk rejected. As the authors put it, “This finding suggests that in our case study, articles that would eventually become highly cited were roughly equally likely to be desk-rejected as a random submission”. Of the remaining three papers, two had been rejected after review (one of which had been rejected by two of the top 3 journals in question). While it was generally the case, then, that peer review appears to help weed out the “worst” papers, the process does not seem to be particularly good at recognizing the “best” work. Much like The Beatles’ Decca audition, then, rockstar papers are not often recognized as such immediately. Towards the end of the paper, the authors make reference to some other notable cases of important papers being rejected (one of which was rejected twice for being trivial, then a third time for being too novel).
“Your blindingly-obvious finding is just too novel”
It is worth bearing in mind that academic journals are looking to do more than just publish papers that will have the highest citation count down the line: sometimes good articles are rejected because they don’t fit the scope of the journal; others are rejected simply because the journals don’t have the space to publish them. When that happens, they thankfully tend to get published elsewhere relatively soon after; though “soon” can be a relative term for academics, it’s often within about half a year.
There are also cases where papers will be rejected because of some personal biases on the part of the reviewers, though, and those are the cases most people agree we want to avoid. It is then that the gatekeepers of scientific thought can do the most damage in hindering new and useful ideas because they find them personally unpalatable. If a particularly good idea ends up published in a particularly bad journal, so much the worse for the scientific community. Unfortunately, most of those biases remain hidden and hard to definitively demonstrate in any given instance, so I don’t know how much there is to do about reducing them. It’s a matter worth thinking about.
References: Siler, K., Lee, K., & Bero, L. (2014). Measuring the effectiveness of scientific gatekeeping. Proceedings of the National Academy of Sciences, doi:10.1073/pnas.1418218112