Jeff Huang recently published a list of papers from several major conferences that won Best Paper awards. It’s a nice collection of papers, highlighted in a way that is difficult to obtain from the ACM Digital Library. (Why that should be the case is a different story.)
Clearly, winning a Best Paper award is a significant achievement, and authors of such papers should be proud of their work. But does this merit translate into impact? For example, do papers that win Best Paper awards get cited more frequently than other papers from the same conference?
I wish I had access to the ACM metadata for the papers in question, as it would make a larger-scale analysis of citation rates straightforward. As it was, I resigned myself (again) to some manual analysis: I recorded the within-ACM citation counts for all full papers published in the SIGIR 2001-2005 conferences, as shown in Figure 1. (The data itself is here.)
First, the general shape is not surprising: a few papers receive a lot of citations, and the rest trail off. Even though the number of papers accepted varied from year to year (46, 44, 46, 58, 71), the range of citations was quite comparable, so I didn’t bother normalizing the data.
So where do the best papers lie? For the years 2001-2005, the best paper winners were cited 39, 68, 41, 39, and 36 times, respectively. This corresponds to ranks 17.5, 3, 13, 9, and 9. This is summarized in the following table:
|Year|Papers|Rank of best paper|% of rank|# citations|% of max|
This is not enough data to draw strong conclusions, but it seems that the award winners achieve, on average, about 1/3 the citation rate of the best-cited paper in a conference, and rank, on average, in the top quintile.
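The "top quintile" figure can be checked from the numbers quoted above. Here is a minimal sketch, assuming that "% of rank" means the winner's rank divided by the number of papers published that year; the function and variable names are mine, not part of the original data set.

```python
# Figures quoted in the post for SIGIR 2001-2005.
years = [2001, 2002, 2003, 2004, 2005]
papers = [46, 44, 46, 58, 71]      # full papers accepted each year
best_rank = [17.5, 3, 13, 9, 9]    # citation rank of the Best Paper winner


def rank_percent(rank, n_papers):
    """Rank as a percentage of that year's papers (assumed definition)."""
    return 100.0 * rank / n_papers


percents = [rank_percent(r, n) for r, n in zip(best_rank, papers)]
mean_percent = sum(percents) / len(percents)

for year, pct in zip(years, percents):
    print(f"{year}: winner ranks {pct:.1f}% of the way down the list")
print(f"mean: {mean_percent:.1f}%")
```

Under that assumption the mean comes out at roughly 20%, which sits right at the edge of the top quintile. The spread is wide, though: the 2002 winner is in the top 7%, while the 2001 winner is closer to the 38% mark.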
In another attempt to visualize the data, I subtracted the citation count of the best paper in each year from all the other papers published that year, to produce Figure 2.
In this chart, we can see that in 2002 the award was a much better predictor of citations than in the other years (most of that year's papers were cited less than the winner), but the other years were largely indistinguishable.
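For anyone who wants to repeat this on another conference, the two steps (ranking papers by citations and offsetting each count by the winner's count) can be sketched as follows. The citation counts below are made up for illustration, and averaging the positions of tied papers is my assumption about how a rank like 17.5 arises.

```python
def average_ranks(counts):
    """Rank counts in descending order; tied counts share the mean position."""
    order = sorted(range(len(counts)), key=lambda i: -counts[i])
    ranks = [0.0] * len(counts)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next paper has the same citation count.
        while end + 1 < len(order) and counts[order[end + 1]] == counts[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def offsets_from_best(counts, best_index):
    """Citation count of every other paper minus the award winner's count."""
    best = counts[best_index]
    return [c - best for i, c in enumerate(counts) if i != best_index]


# Hypothetical citation counts for one year; index 1 is the award winner.
citations = [120, 68, 68, 40, 12, 3]
best_index = 1

ranks = average_ranks(citations)
offsets = offsets_from_best(citations, best_index)
print(ranks)
print(offsets)
```

Positive offsets are papers cited more often than the winner and negative offsets are papers cited less, so a year in which the award predicted citations well shows mostly negative values.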
I suppose there are many reasons that a Best Paper award winner does not get high citation rates:
- some of this is due to chance (perhaps the highly cited papers were also considered for the award)
- some may be due to innovation (ground-breaking work may not get as much recognition as solid work in an established area)
- some may be due to selection bias within the nominating committee that may consider certain papers to be flawed or otherwise unacceptable, while the public at large may recognize something important in the paper anyway
- some may be due to exposure (how well-attended a paper presentation is may make a difference in generating buzz about the work)
The upshot, however, is that it’s quite difficult to predict the real impact of a paper at the time it is published. It might be interesting for specific communities to hold reviews of papers published five to ten years ago to assess the impact retrospectively. This should, of course, involve more than citation counts: it is important to look at the reasons papers get cited. A measurement that aggregates the degree of importance of a paper to the papers that cite it might approximate this longer-term impact, although such a measure might be quite subjective.
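To make that last idea a little more concrete, here is one possible shape for such a measure. It is purely a sketch: the `Citation` record, the 0-to-1 importance scale, and the simple sum are all hypothetical choices, and the importance scores would have to come from human judgment.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    citing_paper: str
    importance: float  # hypothetical scale: 0.0 = passing mention, 1.0 = builds directly on the work


def weighted_impact(citations):
    """Sum of per-citation importance scores (one possible aggregate)."""
    return sum(c.importance for c in citations)


# Two made-up papers with the same raw citation count.
paper_a = [Citation("X", 1.0), Citation("Y", 0.5), Citation("Z", 0.25)]
paper_b = [Citation("P", 0.25), Citation("Q", 0.25), Citation("R", 0.25)]

print(len(paper_a), weighted_impact(paper_a))
print(len(paper_b), weighted_impact(paper_b))
```

Both papers have three citations, but the weighted score separates the one that other work actually builds on from the one that is only mentioned in passing. The subjectivity noted above lives entirely in how the importance values are assigned.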