A discussion among commenters on a post about PubMed search strategies raised the issue of how people need to make sense of the results that a search engine provides. For precision-oriented searches a “black box” approach may make sense: as long as the system manages to identify a useful document, it doesn’t matter much how it does so. For exploratory search, which may be more recall-oriented, having a comprehensible representation of the system’s computations is important for assessing the coverage of your results. This suggests the need to foster useful mental models, rather than relying on the system to divine your intent and magically produce the “right” result.
Sarah Vogel raises this issue as a professional searcher who needs to explain her results to her clients, but I believe this capability is important for anyone involved in exploratory search. There are at least two questions that need to be asked:
- how can the system explain why it retrieved the documents it did retrieve?
- what other similar documents did it choose not to show, and why?
Having answers to the first question will help people make sense of the results and will give some confidence that the system is responding in a predictable (and thus ultimately comprehensible) manner. Jeremy Pickens wrote about this concept, which he dubbed Explanatory Search, and Daniel Tunkelang has written about transparency in information retrieval. I would argue that having an actionable mental model is sufficient — the user doesn’t have to know what’s actually going on in the system as long as the system fosters a mental model that the user can use to make useful predictions. Newtonian mechanics is a good example: we (now) know that it’s not quite right, but it’s good enough (and simple enough) to use instead of more accurate models.
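To make the idea of explanation concrete, here is a minimal sketch of one way a system might answer “why was this document retrieved?” — by breaking a simple term-frequency score into per-term contributions. The function name and the scoring scheme are my own illustrative choices, not a description of any particular engine.

```python
from collections import Counter

def explain_match(query_terms, doc_text):
    """Break a simple term-frequency score into per-term contributions,
    so the user can see which query terms matched and how much each
    contributed to the document's score."""
    counts = Counter(doc_text.lower().split())
    contributions = {t: counts[t] for t in query_terms}
    return contributions, sum(contributions.values())

terms = ["search", "recall"]
doc = "Exploratory search is often recall oriented; search sessions iterate."
per_term, score = explain_match(terms, doc)
# per_term shows "search" matched twice and "recall" once
```

Even this toy breakdown supports a mental model: the user can predict that adding a term that appears often in a document will raise its score, which is exactly the kind of actionable prediction the paragraph above argues for.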
What else was there?
Having answers to the second question will help the searcher concerned with recall understand how the set of documents retrieved relates to the much larger set of documents that were not retrieved. It can help inform subsequent query (search strategy) construction and iteration, and can make it clear when a threshold (probabilistic, proximity, etc.) has been applied, and what the effects might be of modifying these constraints. eBay, for example, does this for queries that return no results by suggesting which query terms could be deleted to produce matches. It also indicates how many results will be returned by each keyword combination.
Another way to make data more comprehensible is to make the querying process more interactive, in the information visualization sense of the word. The lesson of Ben Shneiderman’s Dynamic Queries project was that giving people real-time feedback on query results helps them understand the organization of the data better. In my MS Thesis work, I applied this idea to proximity searches by allowing users to dynamically adjust the range of the proximity operator to see its effect on search results in real-time. In cognitive dimensions terms, this ability to manipulate thresholds and other constraints dynamically makes the interfaces less viscous and avoids premature commitment. The result is a more exploratory style of interaction that allows the searcher to get a better feel for the collection.
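A dynamically adjustable proximity operator of the kind described above might look like the following sketch: a predicate for “terms a and b within N tokens of each other,” re-evaluated as the user drags a slider. The names and the toy collection are my own; the thesis work itself is not being reproduced here.

```python
def within_proximity(doc_tokens, a, b, window):
    """True if terms a and b occur within `window` tokens of each other."""
    pos_a = [i for i, t in enumerate(doc_tokens) if t == a]
    pos_b = [i for i, t in enumerate(doc_tokens) if t == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

docs = [
    "query feedback drives exploratory search".split(),
    "search results give feedback on the query".split(),
]
# Sweep the proximity window, as a slider would, and report the match count
# at each setting so the user sees the effect of the constraint in real time.
for window in (1, 3, 5):
    n = sum(within_proximity(d, "query", "feedback", window) for d in docs)
    print(window, n)
```

Because the predicate is cheap to re-evaluate, the result count can update continuously as the window changes — which is what makes the interaction feel exploratory rather than committal.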
Similar interaction is possible with faceted search interfaces because the cardinality of information for various facet intersections is available to the retrieval engine. For example, Anselm Spoerri’s InfoCrystal offered a way to visualize all possible combinations of facets, which could readily tell the user how results were distributed. Anselm has continued this theme of information visualization for exploration in searchCrystal. Of course the notion of similarity can be different in faceted search compared with full-text search, but it is interesting that some guidelines (see here and here, for example) developed for faceted search are designed specifically to help the user make sense of the search result space.
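The cardinalities that make this possible are straightforward to precompute. As a hedged illustration (the data and function are hypothetical, and real engines use far more efficient structures), here is one way to count items for every single facet value and every pairwise facet intersection:

```python
from collections import Counter
from itertools import combinations

def facet_counts(items):
    """Count items for every single facet value and every pairwise
    intersection, so the UI can show result counts before the user clicks."""
    counts = Counter()
    for item in items:
        values = sorted(item.items())  # canonical order for stable keys
        for k in (1, 2):
            for combo in combinations(values, k):
                counts[combo] += 1
    return counts

items = [
    {"color": "red", "size": "small"},
    {"color": "red", "size": "large"},
    {"color": "blue", "size": "small"},
]
counts = facet_counts(items)
```

Displaying these counts next to each facet value is the faceted-search analogue of dynamic queries: the user can see how the collection is distributed across the result space before narrowing it.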
The upshot is that while precision-oriented web search has reduced the search process to magic (which is by implication unknowable), a better approach would be to foster mental models that can inform effective query construction. Having the means to understand search results makes it easier to assess the scope of what you’ve found, makes it easier to form new queries, and makes it easier to recover from mistakes.