Miles Efron recently posted his take on the progress of the IR field, responding to a question Andrew Dillon raised at the last ASIST conference. Miles argued that progress is indeed being made, for two reasons: the SIGIR conference has become more competitive over the years, and the diversity of corpora under the TREC umbrella has increased. Unfortunately, I wasn’t there to hear the question or the subsequent discussion, but my guess is that Andrew Dillon was asking not about statistical significance but about magnitude.
Every year we see incremental improvements in Mean Average Precision (MAP) scores reported in SIGIR (and in CIKM, and in other venues) for some narrow conceptions of the search task. The gains are real, but they may not matter. Similarly, Google recently reported (thanks Jeremy, thanks Greg) that a change in latency from 100 msec to 400 msec reduced the number of queries people ran by about 0.5%. Statistically significant, yes. Important? Maybe not.
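For readers who don’t work with MAP every day, here is a minimal sketch of how the score is typically computed, assuming binary relevance judgments. The function names, document IDs, and rankings are all made up for illustration; they don’t come from any TREC collection or published result.

```python
from typing import Hashable, Sequence, Set, Tuple


def average_precision(ranking: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision of one ranked list, given the set of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant document appears.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs: Sequence[Tuple[Sequence[Hashable], Set[Hashable]]]) -> float:
    """Mean of the per-query average precision over (ranking, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(ranking, rel) for ranking, rel in runs) / len(runs)


# Hypothetical single-query example: two systems rank the same four documents.
relevant_docs = {"d1", "d3"}
baseline = ["d1", "d2", "d3", "d4"]  # relevant documents at ranks 1 and 3
variant = ["d1", "d3", "d2", "d4"]   # relevant documents at ranks 1 and 2

baseline_ap = average_precision(baseline, relevant_docs)
variant_ap = average_precision(variant, relevant_docs)
```

Swapping two adjacent documents in one list is enough to move the score, so a small, consistent gain averaged over many queries can be statistically significant without changing much about what a searcher actually experiences.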
The scientists among us like to measure things. That’s how we (and others) know we did something interesting. But it seems that what we really want to measure is difficult to observe, and so we settle on some plausible proxy. And so begins the slippery slope.
Ongoing improvement in indexing and retrieval algorithms is certainly a good thing. But in some ways this line of research has become a victim of its own success and, like commercial agriculture, now produces decent commodity goods at ridiculously low cost. To continue the analogy, we need to diversify our notion of information retrieval to include not only the supermarket (where at any time of day you can find exactly the same product you’ve always bought, but without any real ability to understand or control what’s in the box) but also the farmers’ market, where you can find more variety, more surprises, and more interaction with the people who grow the food you will be eating.
So there is still room for progress in the field of information retrieval, but the low-hanging fruit of precision-oriented search has been harvested. We now need to look to more difficult tasks: to exploratory search, to interaction, to collaboration. Looking beyond the ranked list is not only a pragmatic strategy for innovation, it’s also good science.