It is reasonably well known that people who examine search results often don’t go past the first few hits, perhaps stopping at the “fold” or at the end of the first page. It’s a habit we’ve acquired because precision-oriented information needs so often get high-quality results. Google has trained us well.
But this habit may not serve us well when we are confronted with uncommon, recall-oriented information needs, that is, when doing research. Looking only at the top few documents places too much trust in the ranking algorithm. In our SIGIR 2013 paper, we investigated what happens when a lightweight preview mechanism gives searchers a glimpse at the distribution of documents (new, re-retrieved but not seen, and seen) in the query they are about to execute.
The preview divides the top 100 documents retrieved by a query into 10 bins, and builds a stacked bar chart that represents the three categories of documents. Each category is represented by a color: new documents are shown in teal, re-retrieved but unseen ones in light blue, and documents the searcher has already seen in dark blue. The figures below show some examples:
Thus the searcher can tell at a glance whether the query being composed will simply re-rank already-retrieved documents, or whether it will introduce new documents, and how those newly found documents will be distributed through the query ranks. The computation is simple if you keep track of which documents have already been retrieved in the current session, and compare the IDs of the documents returned for the preview query against that set. We found that the Lucene search engine running on Linux, wrapped in a RESTful API, can evaluate typical queries over a 1.7-million-document collection in around 100 msec without any tuning, making it quite suitable for interactive use. In fact, in real use, the bar chart displays shown above resemble the spectrum display of an audio amplifier, moving up and down to reflect the changes in the query as the searcher types.
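To make the bookkeeping concrete, here is a minimal sketch in Python of how the bar heights could be computed. It is an illustration, not the code from our system: the names `classify` and `preview_bins`, the category labels, and the choice to leave trailing bins empty when a query returns fewer than 100 documents are all assumptions made for the example.

```python
from typing import Dict, List, Set

# Category labels are illustrative; the preview only needs three distinct classes.
CATEGORIES = ("new", "retrieved", "seen")


def classify(doc_id: str, retrieved: Set[str], seen: Set[str]) -> str:
    """Label a document relative to the current session's history."""
    if doc_id in seen:
        return "seen"        # the searcher has already looked at it
    if doc_id in retrieved:
        return "retrieved"   # returned by an earlier query, but never looked at
    return "new"             # not returned by any earlier query in this session


def preview_bins(
    ranked_ids: List[str],
    retrieved: Set[str],
    seen: Set[str],
    top_k: int = 100,
    num_bins: int = 10,
) -> List[Dict[str, int]]:
    """Count each category per rank bin (ranks 1-10, 11-20, ... by default)."""
    bin_size = top_k // num_bins
    bins = [dict.fromkeys(CATEGORIES, 0) for _ in range(num_bins)]
    for rank, doc_id in enumerate(ranked_ids[:top_k]):
        bins[rank // bin_size][classify(doc_id, retrieved, seen)] += 1
    return bins


# Example session state: 15 documents retrieved so far, 5 of them looked at.
ranked = [f"d{i}" for i in range(100)]
retrieved_so_far = {f"d{i}" for i in range(15)}
seen_so_far = {f"d{i}" for i in range(5)}
bars = preview_bins(ranked, retrieved_so_far, seen_so_far)
```

Each dictionary in `bars` holds the segment heights for one stacked bar. Because the work is a handful of set lookups per document, the preview cost is dominated by the query itself.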
We evaluated the use of this system for patentability searches. We found differences in attention, behavior, and performance that can reasonably be attributed to the preview control. People spent about 1 second per query actually looking at the preview, and ran fewer queries in the preview condition. They examined search results more deeply, and saved useful documents throughout the middle ranks of queries. Finally, they found more unique relevant documents in the preview condition (measured using residual precision and recall) even though regular recall and precision numbers did not differ between conditions. We used residual precision and recall to avoid double-counting previously retrieved documents. When a relevant document was retrieved, it was counted toward recall and precision for the first query that retrieved it. It was not counted as relevant when it was retrieved by subsequent queries in the same session. We believe that emphasizing document newness (or lack thereof) encouraged people to formulate more diverse queries.
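The residual measures can be sketched in a few lines of Python. This is a hedged illustration of the counting rule described above, not our evaluation code. The function name `residual_scores` is hypothetical, and the denominators are assumptions: precision is taken over all documents the query returned, and recall over the full set of relevant documents for the topic.

```python
from typing import List, Set, Tuple


def residual_scores(
    session_queries: List[List[str]], relevant: Set[str]
) -> List[Tuple[float, float]]:
    """Return (residual precision, residual recall) for each query in a session.

    A relevant document is credited only to the first query that retrieves it.
    """
    retrieved_so_far: Set[str] = set()
    scores = []
    for results in session_queries:
        # Relevant documents that no earlier query in the session returned.
        credited = {
            d for d in results if d in relevant and d not in retrieved_so_far
        }
        # Assumed denominators: all returned documents, and all relevant documents.
        precision = len(credited) / len(results) if results else 0.0
        recall = len(credited) / len(relevant) if relevant else 0.0
        scores.append((precision, recall))
        retrieved_so_far.update(results)
    return scores


# Two queries in one session; "a" and "b" are re-retrieved by the second query.
relevant_docs = {"a", "b", "c", "d"}
session = [["a", "x", "b", "y"], ["a", "b", "c", "z"]]
scores = residual_scores(session, relevant_docs)
```

In this example the second query returns three relevant documents, but only "c" is credited to it, because "a" and "b" were already credited to the first query.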
In short, we conclude that the way in which interaction with search results is structured can have significant effects on people’s ability to find information without any changes to the underlying ranking algorithms.