I’ve been re-reading a paper by Joho et al. that compared the effectiveness of several strategies for collaborative search. The paper finds that
…looking at the top 20 documents in more queries was more effective than looking at the top, say, 100 documents in one fifth the number of queries.
This finding, supported by some of Vakkari’s observations, suggests that encouraging users (working individually or collaboratively) to issue multiple queries, and supporting them in subsequent sense-making activities, should improve the overall effectiveness of the search process.
So what’s a good way to offer this support? How can we help users make sense of search results collected over time?
We’ve built a couple of systems that offer this kind of support, but this work only scratches the surface. Our Cerchiamo system, for example, offered a shared display that represented session activity. For each query, it showed where useful and not-useful key frames occurred in the timeline, and which key frames had been seen. While the system aggregated that information to produce the ranking for the RSVP user (the miner), the aggregation itself was not visualized directly.
In the more recent Querium system, we’re looking at more explicit and more interactive representations of a session’s retrieval history. The histograms represent document-centric views of the session, and the fusion list gives the searchers a sense of the retrieval history of a group of documents.
The colored bars to the left of the results show the density and rank of retrieval of each document, and the colors show that more than one person was involved in this search session. Clicking on any bar pivots the display to the corresponding query, showing the selected document in its retrieval context.
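To make the fusion-list idea concrete, here is a minimal sketch of how per-query rankings from one session could be aggregated into a single list while keeping each document’s retrieval history available for display. The scoring rule (reciprocal rank fusion) and the document ids are assumptions for illustration; the post does not specify how Querium actually fuses ranks.

```python
from collections import defaultdict

def fuse_session_ranks(runs, k=60):
    """Combine per-query rankings from a session into one fused list.

    runs: list of ranked lists of document ids, one per query.
    Uses reciprocal rank fusion (RRF) as the scoring rule; this is
    an assumption, not necessarily Querium's method.
    Returns the fused ranking plus each document's retrieval
    history: (query_index, rank) pairs, the raw material for the
    per-document bars in the interface.
    """
    scores = defaultdict(float)
    history = defaultdict(list)  # doc id -> [(query_index, rank), ...]
    for qi, run in enumerate(runs):
        for rank, doc in enumerate(run, start=1):
            scores[doc] += 1.0 / (k + rank)
            history[doc].append((qi, rank))
    fused = sorted(scores, key=scores.get, reverse=True)
    return fused, history

# Three queries from one session, each returning a short ranked list.
runs = [["d1", "d2", "d3"], ["d2", "d1", "d4"], ["d2", "d5"]]
fused, history = fuse_session_ranks(runs)
print(fused[0])       # d2: retrieved by all three queries
print(history["d2"])  # [(0, 2), (1, 1), (2, 1)]
```

A document that keeps reappearing near the top across queries (like `d2` here) rises in the fused list, and its history records exactly which queries retrieved it and at what rank, which is what the pivot-on-click behavior described above would draw on.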
The next challenge will be to design experiments to evaluate these kinds of interfaces in a meaningful way. One difficulty lies in establishing how easy it is for people to learn how to use these more complex interfaces, particularly if the search tasks are constrained to achieve experimental control. Another variable worth paying attention to is task difficulty: we found a much more pronounced advantage for collaboration when topics had few relevant documents rather than many.